Fundamental deficiencies in the megatrial methodology
Abstract
The fundamental methodological deficiency of megatrials is deliberate reduction of
experimental control in order to maximize recruitment and compliance of subjects.
Hence, typical megatrials recruit pathologically and prognostically heterogeneous
subjects, and protocols typically fail to exclude significant confounders. Therefore,
most megatrials do not test a scientific hypothesis, nor are they informative about
individual patients. The proper function of a megatrial is precise measurement of
effect size for a therapeutic intervention. Valid megatrials can be designed only
when simplification can be achieved without significantly affecting experimental control.
Megatrials should be conducted only at the end of a long process of therapeutic development,
and must always be designed and interpreted in the context of relevant scientific
and clinical information.
Keywords: epidemiology; history; megatrial; methodology; randomized trial
Introduction
Megatrials are very large randomized controlled trials (RCTs) - usually recruiting
thousands of subjects and usually multicentred - and their methodological hallmark
is that recruitment criteria are highly inclusive, protocols are maximally simplified,
and end points are unambiguous (eg mortality). Megatrials have been put forward -
especially by the 'evidence-based-medicine movement' - as the criterion reference
source of evidence, superior to any other method for measuring the effectiveness or
effect size of medical interventions.
This aggrandizement of megatrials to a position of superiority is an error. I explore
how such a transparently ludicrous idea has gained such wide currency, and explicate
some of the fundamental deficiencies of the megatrial methodology which mean that -
in most cases - megatrials are highly prone to mislead. Properly understood, the results
of large, simplified, randomized trials are interpretable only against a background of a great deal of other information, especially information
derived from more scientifically rigorous research methods.
Reasons for the supposed superiority of megatrials
How did the illusion of the superiority of megatrials come about? There are probably
three main reasons - historical, managerial, and methodological.
1. Historical
When large randomized controlled trials emerged in the mid-1960s, they did so as a
methodology intended to come at the end of a long process of drug development [1]. For instance, tricyclic and monoamine-oxidase-inhibitor antidepressants were synthesized
in the 1950s, and their toxicity, dosage, clinical properties, and side effects were
elucidated almost wholly by means of clinical observation, animal studies, 'open'
uncontrolled studies, and small, highly controlled trials [2]. Only after about a decade of worldwide clinical use was a large (by contemporary standards), placebo-controlled, comparative, randomized trial executed
by the UK Medical Research Council (MRC), in 1965 - and even then, the dose of the
monoamine-oxidase inhibitor chosen was too low. So, a great deal was already known
about antidepressants before a large RCT was planned. It was already known that antidepressants worked - and the
function of the trial was merely to estimate the size of the effect.
Nowadays, because of the widespread overvaluation of megatrials, the process of drug
development has almost been turned upon its head. Instead of megatrials coming at
the end of a long process of drug development, after a great deal of scientific information
and clinical experience has accumulated, it is sometimes argued that drugs should
not even be made available to patients until after megatrials have been completed. For instance, 1999 saw the National Institute for
Clinical Excellence (NICE) delay the introduction of the anti-influenza agent Relenza® (zanamivir) with the excuse that there had been insufficient evidence from RCTs to
justify clinical use, thus preventing the kind of detailed, practical, clinical evaluation
that is actually a prerequisite to rigorous trial design.
It is not sufficiently appreciated that one cannot design an appropriate megatrial
until one already knows a great deal about the drug. This prior knowledge is required
to be able to select the right subjects, choose an optimal dose, and create a protocol
that controls for distorting variables. If a megatrial is executed without such knowledge,
then it will simplify where it ought to be controlling: eg patients will be recruited
who are actually unsuitable for treatment, they will be given the trial drug in incorrect
doses, patients taking interfering drugs will not be excluded, etc. Consequently,
such premature megatrials will usually tend systematically to underestimate the effect
size of a new drug.
2. Managerial - changes in research personnel
Before megatrials could become so widely and profoundly misunderstood, it was necessary
that the statistical aspects of research should become wildly overvalued. Properly,
statistics is a means to the end of scientific understanding [3] - and when studying medical interventions, the nature of scientific understanding
could be termed 'clinical science' - an enterprise for which the qualifications would
include knowledge of disease and experience of patients [1]. People with such qualifications would provide the basis for a leadership role in
research into the effectiveness of drugs and other technologies.
Instead, recent decades have seen biostatisticians and epidemiologists rise to a position
of primacy in the organization, funding, and refereeing of medical research - in other
words, people whose knowledge of disease and patients in relation to any particular
medical treatment is second-hand at best and nonexistent at worst.
The reason for this hegemony of the number-crunchers is not, of course, anything to
do with their possessing scientific superiority, nor even a track record of achievement;
but has a great deal to do with the needs of managerialism - a topic that lies beyond
the scope of this essay [4].
3. Methodological - masking of clinical inapplicability by statistical precision
There are also methodological reasons behind the aggrandizement of megatrials. As
therapy has advanced, clinicians have come to expect incremental, quantitative improvements
in already effective interventions, rather than qualitative 'breakthroughs' and the
development of wholly new treatment methods. This has led to demands for ever-increasing
precision in the measurement of therapeutic effectiveness, as the concern has been
expressed that the modest benefits of a new treatment could be obscured by random error.
Furthermore, when expected effect sizes are relatively small, it becomes increasingly
difficult to disentangle primary therapeutic effects from confounding factors. Of
course, where confounders (such as age, sex, severity of illness) are known, they
can be controlled by selective recruitment. But selective recruitment tends to make
trials small.
Megatrials appear to offer the ability to deal with these problems. Instead of controlling
confounders by rigorous selection of subjects and tight protocols, confounding is
dealt with by randomly allocating subjects between the comparison groups, and using
sufficiently large numbers of subjects so that any confounders (including unknown
ones) may be expected to balance each other out [5]. The large numbers of subjects also offer unprecedented discriminative power to
obtain statistically precise measurements of the outcomes of treatment [6]. Even modest, stepwise increments of therapeutic progress could, in principle, be
resolved by sufficiently large studies.
Resolving power, in a strictly statistical sense, is apparently limited only by the
numbers of subjects in the trial - and very large numbers of patients can be recruited
by using simple protocols in multiple research centres [6]. Analysis of megatrials requires comparison of the average outcome in each allocation
group (ie by 'intention to treat') rather than by treatment received. This is necessitated
by the absolute dependence upon randomization rather than rigorous protocols to deal
with confounding [5]. So, in pursuit of precision, randomized trials have grown ever larger and simpler.
More recently, there has been a fashion for pooling data from such trials to expand
the number of subjects still further in a process called meta-analysis [7] - this can be considered an extension of the megatrial idea, with all its problems
multiplied [8]. For instance, the results of meta-analyses differ among themselves and in relation to
the underlying RCT evidence, and may diverge from scientific and clinical knowledge of pharmacology
and physiology [9].
The problem is that 'simplification' of protocol translates into scientific terms
as deliberate reduction in the level of experimental control. This is employed with good intentions - in order to increase recruitment, consistency,
and compliance [5], and is vital to the creation of huge databases from randomized subjects. However,
as I have argued elsewhere, the strategy of expanding size by diminishing control
is a methodological mistake [10]. Reduced experimental control inevitably means less informational content in a trial.
At the absurd extreme, the ultimate megatrial would recruit an unselected population
of anybody at all, and randomize subjects to a protocol that would not, however, necessarily
bear any relation to what actually happened to the subject from then on. So long as
the outcomes were analysed according to the protocol to which the subject had originally
been randomized, then this would be statistically acceptable. The apparent basis for
the mistake of deliberately reducing experimental rigour in megatrials seems to be
an imagined, but unreal, tradeoff between rigour and size - perhaps resulting from
the observation that small, rigorous trials and large, simple trials may have similar
'confidence interval' statistics [10]. Yet these methodologies are not equivalent: in science the protocol defines the
experiment, and different protocols imply different studies examining different questions
in different populations [5].
Assumptions behind the megatrial methodology
Megatrials could be defined as RCTs in which recruitment is the primary methodological
imperative. The common assumption has been that with the advent of megatrials, clinicians
now have an instrument that can provide estimates and comparisons of therapeutic effectiveness
that are both clinically applicable and statistically precise. Widespread adoption
of megatrials has been based upon the assumption that their results could be extrapolated
beyond the immediate circumstances of the trial and used to determine, or at least
substantially influence, clinical practice.
However, this question of generalizing from the average result of megatrials to individual
patients has never been satisfactorily resolved. Many clinicians are aware of serious
problems [11,12], and yet these problems have been largely ignored by the advocates of a trial-led
approach to practice.
Extrapolation from megatrials to practice has been justified on the basis of several
assertions. It has been assumed (if not argued) that high levels of experimental rigour
are not important in RCTs because the randomization of large numbers of subjects compensates
(in some undefined way) for lower levels of control. This is a mistaken argument based
on a statistical confusion: large, poorly controlled trials may have a similar confidence
interval to that in a small, well controlled trial (a large scatter divided by the
square root of large numbers may be numerically equal to a smaller scatter divided
by the square root of smaller numbers) - but this does not mean that the studies are
equivalent [5]. The smaller, better-controlled study is superior. Different protocols mean a different
experiment, and low control means less information. After all, if poor control were
better than good control, scientists would never need to do experiments - control
is of the essence of experiment.
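To make the parenthetical arithmetic concrete, here is a minimal sketch in Python, using invented figures: a small, rigorously controlled trial with a small scatter of outcomes and a far larger, loosely controlled trial with a much wider scatter can report confidence intervals of exactly the same width.

```python
import math

def ci_half_width(sd, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a mean:
    z multiplied by the standard deviation divided by the square root of n."""
    return z * sd / math.sqrt(n)

# Invented figures, chosen only to illustrate the arithmetic:
small_tight = ci_half_width(sd=10.0, n=100)      # tight protocol, 100 subjects
large_loose = ci_half_width(sd=100.0, n=10_000)  # loose protocol, 10,000 subjects

print(small_tight)  # 1.96
print(large_loose)  # 1.96 - identical 'precision', far less experimental control
```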
Furthermore, it is routinely assumed that the average effect measured among the many
thousands of patients in a megatrial group is also a measure of the probability of
an intervention producing this same effect in an individual patient. In other words,
it is assumed that the megatrial result and its confidence interval can serve as an
estimate of the probability of a given outcome in an individual patient to whom the
trial result might be applied.
This is not the case. Even when a megatrial population is representative of a clinical
population (something very rarely achieved), if the trial population is heterogeneous
then average outcomes do not necessarily reflect probabilities in individuals. To take
a fictional example: suppose that a drug called 'Fluzap' shortens an illness by 5 days
if that illness is influenza and if patients actually take the drug. Then suppose that the trial population also contains
patients who do not have influenza (because of non-rigorous recruitment criteria) and also patients who
(despite being randomized to 'Fluzap') do not take the drug - suppose that in such subjects, the drug 'Fluzap' has no effect. Then
the average effect size for 'Fluzap' according to intention-to-treat analysis would be a value
intermediate between zero and five - eg that 'Fluzap' shortened the episode of influenza
by about a day. This trial result may be statistically acceptable, but it does not
apply to any individual patient. The value of such a randomized trial as a guide to
treatment is therefore somewhat questionable, and the mass dissemination of such a
summary statistic through the professional and lay press would seem to be politically,
rather than scientifically, motivated.
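The dilution in this fictional example can be written out as a few lines of arithmetic. The proportions below are invented purely for illustration and are not drawn from any real trial.

```python
# Fictional 'Fluzap' arithmetic: the drug shortens the illness by 5 days, but
# only in subjects who actually have influenza AND who actually take the drug.
true_effect_days = 5.0    # benefit in genuinely responsive, compliant patients
prop_with_flu = 0.40      # loose recruitment: many subjects do not have influenza
prop_compliant = 0.60     # some subjects randomized to 'Fluzap' never take it

# Intention-to-treat analysis averages over everyone randomized to 'Fluzap',
# so the measured effect is diluted towards zero:
itt_effect = true_effect_days * prop_with_flu * prop_compliant
print(f"ITT average benefit: {itt_effect:.1f} days")  # 1.2 days

# No individual experiences a 1.2-day benefit: responsive, compliant patients
# gain 5 days; everyone else gains nothing.
```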
Confidence intervals - confidence trick?
The decline in scientific rigour associated with the megatrial methodology has been
disguised by the standard statistical displays used to express the outcome of megatrials.
Megatrials typically quote the statistic called the 'confidence interval' (CI) as
their summary estimate of therapeutic outcome; or else quote the average outcome for
each protocol and a measure of the 'statistical significance' of any measured difference
between averages.
But although the confidence interval has been promoted as an improvement on significance
tests [13], it has serious problems when used for clinical purposes, and is not a useful summary
statistic for determining practical applications of a trial. The confidence interval
describes the limits within which the 'true' mean of a therapeutic trial can be considered to lie - with a quoted degree of probability
and given certain rigorous (and seldom-met) statistical assumptions [14].
Clinicians need measures of outcome among individual patients in a trial, especially the nature and degree of variation in the outcome. The confidence
interval simply does not tell the clinician what he or she needs to know in order
to decide how useful the results of a megatrial would be for implementation in clinical
practice. Average results and confidence intervals from megatrials conceal an enormous
diversity among the results for individual subjects - for example, an average effect
size for a drug is uninformative when there is huge variation between individuals.
When used to summarize large data sets, the confidence-interval statistic gives no
readily apprehended indication of the scatter of patient outcomes, because it includes
the square root of the number of patients as a denominator (the confidence interval is derived
from the standard error of the mean, which equals the standard deviation divided by the square root of n) [15]. This creates the misleading impression that big studies are better, because simply
increasing the number of patients will increase the divisor of the fraction, which
will powerfully tend to reduce the size of the confidence interval when trials become
'mega' in size.
Consequently, the confidence interval will usually reduce as studies
enlarge, although the scatter of outcomes (eg the standard deviation) may remain the
same, or more probably will increase as a result of simplified protocols and poorer
control.
The exceptionally narrow 'confidence intervals' generated by megatrials (and even
more so by meta-analyses) are often misunderstood to mean that doctors can be very
'confident' that the trial estimates of therapeutic effectiveness are valid and accurate.
This is untrue both in narrowly statistical and broadly clinical senses. In fact,
the confidence interval per se gives no indication whatsoever of the precision of
an estimate with regard to the individual subjects in a trial. Furthermore, the narrowness
of a confidence interval does not have any necessary relation to the reality of a
proposed causal relation, nor does it give any indication of the applicability of
a trial result to another population. Indeed, since the confidence interval gives
no guide to the equivalence of the populations under comparison, differences between
trial results may be due to bias rather than causation [16].
So, narrow, nonoverlapping confidence intervals, which discriminate sharply between
protocols in a statistical sense, may nevertheless be associated with qualitative
variation between subjects such that a minority of patients are probably actively
harmed by a treatment that benefits the majority [17].
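A small simulation, using wholly invented outcome figures, illustrates how this can happen: a trial can report a narrow and unambiguously 'positive' confidence interval even though a substantial minority of subjects are made worse by the treatment.

```python
import math
import random
import statistics

random.seed(0)

# Invented outcome model: 80% of treated subjects gain 3 units of benefit,
# 20% are harmed by 5 units, plus a little random noise per subject.
outcomes = [
    (3.0 if random.random() < 0.8 else -5.0) + random.gauss(0, 1)
    for _ in range(10_000)
]

mean = statistics.fmean(outcomes)
sd = statistics.stdev(outcomes)
sem = sd / math.sqrt(len(outcomes))

print(f"mean benefit   : {mean:.2f}")
print(f"95% CI         : {mean - 1.96 * sem:.2f} to {mean + 1.96 * sem:.2f}")  # narrow
print(f"SD of outcomes : {sd:.2f}")  # wide scatter, invisible in the CI
print(f"subjects harmed: {sum(o < 0 for o in outcomes)} of {len(outcomes)}")
```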
Measures of scatter needed for clinical interpretation
It would be more useful to the clinician if randomized trials were to display their
results in terms of the scatter of patient outcomes, rather than averages. This may
be approximated by a scattergram display of trial results, with each individual patient
outcome represented as a dot. Such a display allows an estimate of experimental control
as well as statistical precision, since poorly controlled studies will have very wide
scatters of results with substantial overlaps between alternative protocols. The fact
that such displays are almost never seen for megatrials suggests that they would be
highly revealing of the scientifically slipshod methods routinely employed by such
studies.
If this graphic display of all results is too unwieldy even for modern computerized
graphics, a reasonable numerical approximation that gives the average outcome with
a measure of scatter is also useful - for example, the mean and standard deviation,
or the median with interquartile range [14]. These types of presentation allow the clinician to see at a glance, or at least
swiftly calculate, what range of outcomes followed a given intervention in the trial,
and therefore (all else being equal, and when proper standards of rigour and representativeness
apply) the probability of a given outcome in an individual patient.
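As a sketch of the kind of summary proposed here, both the mean with standard deviation and the median with interquartile range can be computed directly from the individual results; the patient outcomes below are invented for illustration.

```python
import statistics

# Invented individual patient outcomes (eg days of illness) in one trial arm:
outcomes = [2, 3, 3, 4, 5, 5, 6, 7, 9, 14, 21, 28]

mean = statistics.fmean(outcomes)
sd = statistics.stdev(outcomes)
q1, median, q3 = statistics.quantiles(outcomes, n=4)  # quartile cut points

print(f"mean +/- SD  : {mean:.1f} +/- {sd:.1f}")
print(f"median (IQR) : {median:.1f} ({q1:.1f} to {q3:.1f})")
# Either summary shows at a glance the range of outcomes an individual patient
# might plausibly expect - which a bare confidence interval conceals.
```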
While the confidence-interval statistic will usually give a misleadingly clear-cut
impression of any difference between the averages of two interventions being compared,
a mean and standard deviation reveal the degree of overlap in results. When the confidence
interval relates to an interval scale, it may indeed be possible to use the confidence
interval to generate an approximate standard-deviation statistic. This is done on
the basis that the 95% CI is (roughly) two 'standard-error-of-the-mean' (SEM) values
above and below the mean [15]. The SEM is the standard deviation divided by the square root of n. Therefore, if the difference between the mean and the confidence limit is halved
to give the SEM, and if the SEM is multiplied by the square root of n, this will yield the approximate standard deviation. The above calculation may be
a worthwhile exercise, because it is often surprising to discover the enormous scatter
of outcomes that lie hidden within a tight-looking confidence interval. However, most
megatrials use proportional measures of outcome (eg percentage mortality rate, or
5-year survival), and these measures cannot be converted to standard deviations by
the above method, or by any other convenient means.
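The back-calculation described above can be written out explicitly. It is only a rough approximation, it applies only to interval-scale outcomes, and the reported mean, confidence limit, and sample size used below are invented.

```python
import math

def approx_sd_from_ci(mean, ci_upper, n, z=2.0):
    """Rough standard deviation recovered from a reported mean, its upper 95%
    confidence limit, and the sample size: SEM = (limit - mean) / z, then
    SD = SEM * sqrt(n). The text's 'roughly two SEM' gives z = 2 (1.96 exactly)."""
    sem = (ci_upper - mean) / z
    return sem * math.sqrt(n)

# Invented illustration: a tight-looking result of 1.2 days (95% CI 1.0 to 1.4)
# from 4000 subjects conceals a very wide scatter of individual outcomes.
print(approx_sd_from_ci(mean=1.2, ci_upper=1.4, n=4000))  # roughly 6.3 days
```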
Confidence intervals therefore have no readily comprehensible relation to confidence
concerning outcomes - which is the variable of interest to clinicians. What is required
instead of confidence intervals is a display, or numerical measure, of scatter that
assists the practitioner in deciding the clinical importance that should be attached
to 'statistically significant' differences between average results.
A false hierarchy of research methods leads to an uncritical attitude to RCTs
There is a widespread perception that RCTs are the 'gold standard' of clinical research
(a hackneyed phrase). It is routinely stated that randomized trials are 'the best'
evidence, followed by cohort studies, case-control studies, surveys, case series,
and finally single case studies (quoted by Olkin [7]). This hierarchy of methods seems to have attained the status of unquestioned dogma.
In other words, the belief is that RCTs are intrinsically superior to other forms of epidemiological or scientific study, and therefore offer
results of greater validity than the alternatives.
To anyone with a scientific background, this idea of a hierarchy of methods is amazing nonsense, and belief in such a hierarchy constitutes conclusive evidence
of scientific illiteracy. The validity of a piece of science is not determined by
its method - as if gene sequencing were 'better than' electron microscopy! For example,
contrary to the hierarchical dogma, individual case studies are not intrinsically
inferior to group studies - they merely have different uses [18]. The great physiologist Claude Bernard pointed out many years ago that the averaging
involved in group studies is a potentially misleading procedure that must be justified
in each specific instance [19]. When case studies are performed as qualitative tests of a pre-existing explicit
and detailed hypothetical model, they exemplify the highest standards of scientific
rigour - each case serving as an independent test of the hypothesis [20,21]. Individual human case studies are frequently published in top scientific journals
such as Nature and Science.
Validity is conferred not by the application of a method or technique, nor by the
size of a study, nor even by the difficulty and expense of the study, but only by
the degree of rigour (ie the level of experimental control) with which a given study
is able to test a research question. Since megatrials deliberately reduce the level
of experimental control in order to maximize recruitment, their results invariably
require very careful interpretation.
NNT - not necessarily true
The assumption that megatrial averages apply directly to individual patients is embodied
in that cherished evidence-based medicine (EBM) tool, the comparison of two interventions
in terms of the 'number needed to treat', or NNT [22]. The NNT expresses the difference between the outcomes of two rival trial protocols
in terms of how many patients must be treated for how long in order to prevent one
adverse event. For instance, comparing beta-blocker with placebo in hypertension may
yield an NNT of 13 patients treated for 5 years to prevent one stroke.
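The arithmetic behind such an NNT is simple, and it shows how directly the figure depends on the baseline risk of the population being treated. The event rates below are invented purely to reproduce the NNT of 13 quoted above.

```python
# NNT = 1 / absolute risk reduction (ARR).
control_stroke_risk_5y = 0.200  # invented 5-year stroke risk on placebo
treated_stroke_risk_5y = 0.123  # invented 5-year stroke risk on the beta-blocker

arr = control_stroke_risk_5y - treated_stroke_risk_5y
print(f"ARR = {arr:.3f}, NNT = {1 / arr:.0f}")  # NNT = 13

# The same drug applied to an invented lower-risk (or less responsive) target
# population yields a quite different NNT - the figure does not travel well.
arr_low_risk = 0.050 - 0.040
print(f"NNT in a lower-risk population = {1 / arr_low_risk:.0f}")  # about 100
```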
However, the apparent simplicity and clarity of this information depends upon the
clinical target population having the same risk-benefit profile as the randomized
trial population. When trial and target populations differ and the trial population
is unrepresentative of the target population, the NNT will be an inaccurate estimate
of effect size for the actual patients whose treatment is being considered. For instance,
an elderly population may be more vulnerable to the adverse effects of a drug and
less responsive to its therapeutic effect, to the point where an intervention that
produces an average benefit to the young may be harmful in the old.
On top of this, the patients in a megatrial population are always prognostically heterogeneous, because the methodology uses deliberately simplified
protocols designed to optimize recruitment rather than control - and meta-analyses
are even more heterogeneous [3,8]. In a megatrial that shows an overall benefit, it is very probable that while the
outcome for some patients will be improved by treatment, other patients will be made
worse, and others will be unaffected. What this means is that even a representative
megatrial (and such trials are exceedingly uncommon) cannot provide a risk estimate
of what will happen to individual patients who are allocated the same protocol. Trials
on unrepresentative populations may, of course, be actively misleading. The NNT generated
by a megatrial does not in itself, therefore, provide guidance for clinical management.
The NNT is Not Necessarily True! [22].
Conclusion
Megatrials, like other kinds of epidemiological study, should be considered primarily
as methods for precise measurement rather than as scientific methods for generating or testing hypotheses [10]. Precise measurements of the effect size of medical interventions such as drugs
should be attempted only when a great deal is known about the drug and its clinical actions. When megatrials
are conducted without sufficient background scientific and clinical knowledge, they
will be measuring mainly artefacts. Unless - for instance - a trial is performed on
pathologically and prognostically homogeneous populations, and uses well controlled
management protocols, the apparent precision of the result is more spurious than real.
Megatrials have become an unassailable 'gold standard' in some quarters. And this
situation has become self-perpetuating, since the results of megatrials have become
de facto untestable. Since megatrials do not test hypotheses - they merely measure
the magnitude of an effect - the result of a megatrial is not itself a hypothesis,
and cannot be tested using other methods. A megatrial of, say, an
antihypertensive drug measures the comparative effect of that drug under the circumstances
of the trial. Assuming that no calculation mistakes have been made, this result of
a megatrial is neither right nor wrong: it is just a measurement.
People often talk of megatrials as if they proved or disproved the hypothesis that
a drug 'works'. But far from being the final word on the effectiveness of a therapy,
a megatrial is inherently incapable of answering that question.
But once the error has been made of assuming that a statistical measurement can test
a hypothesis, the mistake becomes uncorrectable, because the level of statistical
precision in a megatrial is greater than that attainable by other methods.
In such an environment of compounded error, it should not really be a source of surprise
that statistical considerations utterly overwhelm scientific knowledge and clinical
understanding, and we end up with the lunacy of regarding statisticians and epidemiologists
as the final arbiters of medical decision-making. Health care becomes merely a matter
of managers providing systems to 'implement' whatever the number-crunching technocrats
tell them is supported by 'the best evidence' [4]. The methodological deficiencies of megatrials make them ideally suited to providing
an intellectual underpinning for that world of join-the-dots medicine which seems
just around the corner.
References
1. Charlton BG: Clinical research methods for the new millennium. J Eval Clin Pract 1999, 5:251-263.
2. Healy D: The Antidepressant Era. Cambridge, MA: Harvard University Press, 1998.
3. Charlton BG: Statistical malpractice. J Roy Coll Physicians London 1996, 30:112-114.
4. Charlton BG: The new management of scientific knowledge: a change in direction with profound implications. In NICE, CHI and the NHS Reforms: Enabling Excellence or Imposing Control? Edited by Miles A, Hampton JR, Hurwitz B. London: Aesculapius Medical Press, 2000, 13-32.
5. Charlton BG: Mega-trials: methodological issues and clinical implications. J Roy Coll Physicians London 2000, 29:96-100.
6. Yusuf S, Collins R, Peto R: Why do we need some large, simple randomized trials? Statistics Med 1984, 3:409-420.
7. Olkin I: Meta-analysis: reconciling the results of independent studies. Statistics Med 1995, 14:457-472.
8. Charlton BG: The uses and abuses of meta-analysis. Fam Pract 1996, 13:397-401.
9. Robertson JIS: Which antihypertensive classes have been shown to be beneficial? What are their benefits? A critique of hypertension treatment trials. Cardiovasc Drugs Ther 14:357-366.
10. Charlton BG: Megatrials are based on a methodological mistake. Brit J Gen Pract 1996, 46:429-431.
11. Julian D: Trials and tribulations. Cardiovasc Res 1994, 28:598-603.
12. Hampton JR: Evidence-based medicine, practice variations and clinical freedom. J Eval Clin Pract 1997, 3:123-131.
13. Gardner MJ: Statistics with Confidence: Confidence Intervals and Statistical Guidelines. London: British Medical Association, 1989.
14. Bradford Hill AB, Hill ID: Bradford Hill's Principles of Medical Statistics. London: Edward Arnold, 1991.
15. Kirkwood BR: Essentials of Medical Statistics. Oxford: Blackwell, 1988.
16. Charlton BG: The scope and nature of epidemiology. J Clin Epidemiol 1996, 49:623-626.
17. Horwitz RI, Singer BH, Makuch, Viscoli CM: Can treatment that is helpful on average be harmful to some patients? A study of the conflicting information-needs of clinical inquiry and drug regulation. J Clin Epidemiol 1996, 49:395-400.
18. Charlton BG, Walston F: Individual case studies in clinical research. J Eval Clin Pract 1998, 4:147-155.
19. Bernard C: An Introduction to the Study of Experimental Medicine (1865). New York: Dover, 1957.
20. Marshall JC, Newcombe F: Putative problems and pure progress in neuropsychological single-case studies. J Clin Neuropsychol 1984, 6:65-70.
21. Shallice T: From Neuropsychology to Mental Structure. Cambridge: Cambridge University Press, 1988.
22. Charlton BG: The future of clinical research: from megatrials towards methodological rigour and representative sampling. J Eval Clin Pract 1996, 2:159-169.
7 comments:
A pal of mine refers to large randomized controlled trials as "gold standard". It's not a usage I care for - it's the language of the huckster not the scientist. Anyway, how "gold" your instrument is depends on how well suited it is to the task, and how skilfully you use it.
@d - Indeed. To refer to RCTs as the Gold Standard of evidence for effectiveness makes as much sense as referring to the microscope as the Gold Standard of biology, or the telescope for Physics. i.e. no sense at all!
"Evidence based medicine" -- what pathetic a joke. Question-begging statistical significance trials are a "gold standard"; actual experience of actual patients by the millions, mere anecdote.
Ultimately there will be Hell to pay. Literally.
I'll be writing up John Ioannidis one of these days. If you're not aware of his contributions to getting this right, you should be. He's cleaning the Augean stables.
@scott - I know his work, but I'm not convinced he understands science.
I'd love to hear your thoughts on his work. I haven't made a deep study of what he has done, but the mere fact that he is doing *something* is cheering to me. It is obvious on inspection that many medical "facts" are anything but, and that they're using inappropriate statistical models plagued with sample bias to get the results they do. Maybe Ioannidis is doing something wrong, but at least he is doing *something!*
@SL - Ion's most famous paper was about which publications were correct and which were wrong, based on looking at subsequent publications. This is a false and distorted view of how science works (or used to work), and created the impression that one could second-guess the correctness of research by ramping-up the 'rigour' of peer review - it fitted into the 'statistical agenda' of medical research.
However, in a world where science has been replaced by research and researchers are not even trying to discover or communicate the truth and peer review is the mode of evaluation - then none of this matters either way...
When real science has evolved into a maelstrom of dishonest money- and status-grubbing, and is merely an extension of the metastatic bureaucracy... well, analysis of replicated versus unreplicated papers merely reports on the evolution of consensus, and tells us less than nothing about the real world (except to maintain the delusion that real science continues to operate in the midst of vast volumes of high status and highly funded nonsense and lies).