Dr John Rumble



Science and Data: How the Information Revolution has changed their relationship

Data are the mechanisms by which science expresses its understanding of nature quantitatively. Data are the results science obtains through measurements, calculations, and observations. Data take many forms and provide an unambiguous method for determining the agreement among two or more scientific results. The Information Revolution of the last 60 years has dramatically changed the way data are collected, managed, disseminated, and exploited. In fact, virtually every area of science today is assembling mammoth data collections, in many cases attempting to bring together every relevant bit of data ever generated. Additionally, these large data collections are becoming new sources of discovery. The concepts of data mining and automated discovery are part of today’s scientific vocabulary. This profusion of data is both a source of new science and a generator of new challenges. A terabyte of new data can contain exciting new information, but how in the world can a single scientist work with a terabyte of data? Even the fastest reader cannot hope to look at each individual data point. In my talk I will show how the Information Revolution has fundamentally changed the conduct of science. I will describe the new roles that data collections play in science – from generating results to generating ideas. I will trace the role that data collections have played in science over the last 5000+ years and how the Information Revolution has opened new roles. The challenges associated with those new roles will be enumerated, and approaches to overcoming those challenges will be suggested. Examples will be given from biology, astronomy, chemistry, physics, and the earth sciences. Topics to be discussed include automated discovery, data set integration, data volume, data quality, and data accessibility.

About the speaker

For over four decades, Dr. John Rumble has explored the subject of how the Information Revolution can impact the practice of science. Beginning as a computational chemist and physicist, Dr. Rumble early on migrated into the world of scientific data and has worked on some of the most interesting challenges in that area, including developing one of the first online factual data systems, some of the first PC-based databases, and major web-based scientific data systems.

Today, Dr. Rumble is Executive Vice President of Information International Associates, an information management and technology company in Oak Ridge, Tennessee. From 1980 through 2004, he worked for the National Institute of Standards and Technology (NIST – formerly the National Bureau of Standards), first as a Program Manager and then Director of the NIST Standard Reference Data Program. In later years at NIST, Dr. Rumble was responsible for all of the NIST measurement services programs.

Dr. Rumble was among the first to build online, PC, and Internet/web-based factual databases, first in physics, next in materials science, and then in other fields. In 1982, he began international efforts to build and deliver comprehensive materials databases, working with government agencies, professional societies, industry, and foreign institutions. To start this effort, he organized over 20 user needs workshops covering every aspect of computerized materials data: discipline (wear, fracture, corrosion), user community (energy, automotive, aerospace), information vendors (publishers, professional societies, IT vendors), metadata standards, and management (federal agencies, non-governmental organizations, foreign institutions). As a direct result of these efforts, he authored, with others, a well-known series of workshop reports and articles that summarized user needs and defined needed system characteristics. In later years, this activity was extended to other disciplines.

Dr. Rumble was responsible for founding and leading two of the most important metadata standards committees for materials and other physical sciences, one under the auspices of ASTM and the other under ISO. In his ISO work, he was responsible for the initial release of the ISO 10303 Standard for the Exchange of Product Data, a multi-volume international standard for metadata related to every aspect of computer-assisted engineering. Dr. Rumble personally wrote many ASTM metadata standards as well as portions of ISO 10303. While a NIST Program Manager, he led the development of many PC databases that contained critically evaluated data and were sold to the public. In the 1990s, he oversaw the development of NIST online data systems.

Dr. Rumble has written or edited three books and many articles on the design and building of scientific and technical databases. He has developed many collaborations with organizations in the United States and abroad on collecting, evaluating, and disseminating S&T data, which have resulted in numerous computerized data collections now being available. He also managed NIST’s activities in large-scale data evaluation programs in materials science.

Dr. Rumble is a Fellow of several professional societies, served as President of CODATA, the ICSU Committee on Data for Science and Technology, and was recently awarded the CODATA 2006 Prize for outstanding achievements in S&T data. He also presently serves as editor of the CODATA Data Science Journal.