Illinois Institute of Technology
       
 
 

Vol. 8, No. 2, January 1989
"Irreproducibility in the Scientific Literature: How Often Do Scientists Tell the Whole Truth And Nothing But the Truth?"
Robert G. Bergman, Professor of Chemistry, University of California, Berkeley

Much has been written recently about scientific fraud. Articles on this subject are noteworthy not simply because they appear, but also because they are often written with a sense of shock. Our society has accommodated itself to regular tales of criminal activity among the general public as well as occasional stories of misconduct among professionals. Outraged by medical and legal malpractice, most of us nevertheless fully expect such incidents to occur and agree that there should be formal mechanisms for dealing with them. Scientific "malpractice," however, still surprises us.

There are several types of scientific misconduct, but the one most frequently discussed is data fabrication. The first suspicion that this type of misconduct has occurred often comes from a breakdown in scientific reproducibility: the inability of a scientist to reproduce a result obtained by a different individual (usually in another laboratory) and published in the scientific literature. This arises from the general expectation that a published experiment, measurement, or calculation contains information sufficient to allow a second investigator to repeat it and obtain results identical to those obtained by the initial experimenter (within the inherent error of the measurements involved). Most scientists also assume that data recorded in an experiment are objective rather than subjective; that is, that the recorded observations are independent of the investigator making them and will not change because a different individual makes the measurement.

How realistic are these assumptions? Clearly scientific misconduct exists, and a number of cases have come to light recently. However, even in the absence of explicit data manipulation, more subtle problems inherent in the way scientists carry out their research make the scientific literature much less reproducible than most people, including some scientists, take for granted.

It is difficult for any one individual to address the question of reproducibility for all scientific disciplines. I will therefore try to provide some information about it in an area close to my own, that of synthetic chemistry, the activity of making complex molecules from smaller (usually commercially available) compounds. It is common for an investigator to start a new project by repeating (or attempting to repeat) a preparation of a compound whose synthesis has been published in the literature, so that the material may be utilized in a new chemical transformation. Research proceeds in a similar way in biology and biochemistry, where the availability and characterization of a previously discovered bacterial strain or other organism can be crucial to the development of a new project.

The startling fact is that almost half of the literature's synthetic procedures we attempt to repeat initially fail in one way or another. A reasonably large fraction of these "recipes" can be reproduced after modification or discussions with the author. Some, however, cannot be repeated in our hands no matter what we do.

Is this troubling experience with reproducibility a general phenomenon? Fortunately, in chemistry we have two unusual journals that provide information about this question. They are called Organic Syntheses and Inorganic Syntheses. These journals differ from nearly all others. They were established specifically to publish only synthetic articles that had been deliberately checked in a laboratory different from the one in which they were devised. The names of the checkers appear along with the names of the authors when the articles are finally published.

Discussions with the editors of these journals, who do essentially all the checking of preparations in their own laboratories, are enlightening. Even though a scientist who submits an article knows it will be checked immediately, the experience of checkers is similar to that of the scientists who try to repeat syntheses from the open literature: nearly half of the preparations submitted cannot be repeated in just the way they were described by the submitting authors. In some cases the problem is relatively minor, such as when the correct product is obtained but its isolated yield is lower than that recorded by the submitter. In other cases, the product cannot be obtained at all.

However, when such a difficulty occurs in a preparation submitted to Organic Syntheses or Inorganic Syntheses, a control mechanism comes into play: at the recommendation of the journal, the two individuals involved establish direct communication and attempt to resolve the problem so that the preparation can be reproduced in the checker's laboratory. The paper will be accepted for publication only after sufficient details have been communicated so that the synthesis is workable, with comparable results, in both laboratories.

Normally this results in a solution to the problem, but sometimes it does not. The experiments in three of the thirty articles I know of (ten percent) could never be repeated even after extensive communication with the authors. Was this due to fraud? Perhaps the following anecdote will provide some insight into this question.

One checker spent several weeks trying to duplicate a synthesis that seemed to proceed well in the laboratory of a submitter who was a well-established, careful investigator. After weeks of work and numerous telephone conversations, it was discovered that a procedure for evaporation of solvent from the product was being carried out for only fifteen seconds in the submitter's laboratory. This was not stated in the written description of the experiment because it had become automatic. In the checker's laboratory, on the other hand, solvent was evaporated under vacuum for a longer period of time. Because the product of the reaction in question was relatively volatile, it was being lost in the procedure.

The overriding problem in such cases is that the researchers fail to describe exactly what they did in carrying out an experiment. It may seem incredible that this should happen to professional scientists. But it is easy, too easy, in experimental work to fail to write down everything you did, especially when some procedures become automatic in one's laboratory. When this happens, it takes insight, experience, and intelligence to identify the problem.

Let us turn now to the question of data objectivity. The complete fabrication of an experiment from start to finish is probably rare. On the other hand, "massaging" data (tidying up results, fudging the statistics a little, finding reasons for reporting only favorable data) could well be just as common as scientific gossip assumes it to be. There are undoubtedly many cases, for example, in which straight lines have been drawn through data that, with more experimentation or lower error, would have been clearly demonstrated to represent nonlinear relationships.

The reason for this is that all scientists have expectations about how their experiments will turn out and therefore have a tendency to see what they want to see and ignore what goes against their preconceived ideas. Responsible scientists must consciously force themselves to be suspicious of their own results, especially if they agree with expectations. We must continually ask, "Could this really nice result be wrong?" We should not trust any result we have measured only once. If a result comes out a certain way, we should try to find a way to get the answer in a different way or from a different perspective. How hard it is to convince research students of this: it seems such a great waste of time!

The most important tool we have available to deal with the problems I have discussed is education. Scientists focus so tightly on communicating to students the technical details of our profession that discussion of ethical and psychological issues often falls by the wayside. Sadly, the bulk of discussion of ethical issues that does occur in most scientific laboratories too often takes the form of gossip. Surely we can do better. We can consciously discuss with our students and colleagues situations in which ethical problems arise and try to encourage them to think about how to handle them. In the psychological area, it is important for us to consider more explicitly how we make observations and report them, how scientific breakthroughs occur, how old ideas persist when they are no longer valid, and how new ideas are generated and eventually take hold.

If our experiences in research have taught us anything about the nature of the investigative process itself, it is that there is a tendency in research to look for things that support our initial hypotheses. We must therefore convince ourselves and our students not just to double-check things that appear to be wrong, but to be even more suspicious of things that appear to be right.
