On Being Detectably Sane in Insane Places:
Base Rates and Psychodiagnosis

Douglas A. Davis*
Haverford College

In Rosenhan's pseudopatient study, presumably sane persons were admitted to mental hospitals with a diagnosis of schizophrenia in 11 of 12 cases after appearing for an admissions interview and reporting hearing voices, persuading many that psychodiagnosis is either inherently weak or very poorly performed. The central problem of the diagnostician, however, is likely to be the selection of appropriate base rates. The Bayes' theorem probability of schizophrenia given hallucination varies from below 1% to over 50% depending on whether a nonpatient or a hospitalized psychotic group is taken as the reference population. The hospital diagnosticians sampled behaved as if they used already admitted patients as a reference population, and given this assumption schizophrenia was the most probable condition. The improvement of diagnostic accuracy requires application of condition rates among persons appearing for diagnostic interviews, and these data should be a priority research target.

If sanity and insanity exist, how likely are we to know them? Rosenhan (1973) described a study in which presumably sane adults gained admission as patients to 12 mental hospitals. The technique was simple:

Each of the eight pseudopatients made an appointment and arrived at the admission office of the hospital complaining of hearing a like-sexed, sometimes unclear voice saying "empty," "hollow," and "thud." The details of the ensuing interviews with admissions staff are not reported, but we are told that "beyond alleging the symptoms and falsifying name, vocation, and employment, no further alterations of person, history, or circumstances were made" (Rosenhan, 1973, p. 251). Not only were pseudopatients hospitalized on all 12 occasions, but in 11 cases (corresponding to the 11 public hospitals in the sample) they were given a diagnosis of "schizophrenia." The pseudopatients then "ceased simulating any symptoms of abnormality" (p. 251) and remained hospitalized for periods of from 7 to 59 days (M = 19) apparently without actively seeking release. During their hospitalization they kept copious notes on ward life and the behavior of hospital staff, and indeed the bulk of the article concerns the settings observed and the behavior of staff in them.

Even this cursory treatment of the pseudopatient study has probably been unnecessary for most readers: In the time since its publication in Science, the Rosenhan (1973) article has been among the most widely cited and reprinted contributions to the clinical psychological literature. It now appears in several of the most widely used source books in psychopathology, is summarized in most new texts, and is a source of animated student discussion whenever it is assigned. Recently a series of articles in this journal (Crown, 1975; Farber, 1975; Millon, 1975; Rosenhan, 1975; Spitzer, 1975; Weiner, 1975) has addressed most aspects of the study and raised a number of criticisms of its methods and implications. The issues raised by the pseudopatient study hence appear to be timely and important ones for psychology. I am convinced, however, that because several of these issues have remained confounded in most discussions of the study, the postpublication debate has brought more heat than light to the question of whether and how diagnoses should be made in the hospital or any other established clinical setting. I therefore propose to distinguish several important aspects of the pseudopatient study by which emotions are aroused, to offer a Bayesian characterization of the diagnostic process observed by Rosenhan and his fellow pseudopatients, and to suggest a direction in which research on these important issues might now proceed.

The impact, and the controversy, produced by the pseudopatient study seem to stem from four apparent findings:

  1. Pseudopatients interviewed by admissions staff at 12 geographically and qualitatively disparate mental hospitals were admitted as patients.
  2. Having reported an unusual auditory experience as their only symptom, 11 of 12 pseudopatients were diagnosed schizophrenic (the other being called manic-depressive).
  3. While hospitalized for periods of from 1 to over 7 weeks, the pseudopatients experienced little apparent therapeutic interaction between hospital staff and patients.
  4. On being released from the hospital, each of the 11 pseudopatients who had been diagnosed schizophrenic was classed as schizophrenia "in remission."

Each of these outcomes seems to have occasioned shock, even outrage, among many readers of the pseudopatient study; and each merits serious consideration. The present discussion, however, is concerned primarily with the first two outcomes: that all 12 pseudopatients were admitted to their respective hospitals and that in the 11 public facilities they were labeled schizophrenic. Reader surprise and discomfort with these outcomes of the study seem in turn to be associated with holding one or both of the following opinions:

(a) that a decision to admit someone to a mental hospital with a diagnosis of schizophrenia is a matter of great seriousness, and hence hardly justified by the symptoms supplied, and/or (b) that the diagnosis (label) schizophrenia is prognostically useless and therapeutically detrimental anyway. The latter view is rather widely held, but its merits have little to do with the pseudopatient data. My own concern is with the former opinion, which seems at least worth debating in light of such evidence as exists and which provides an ideal occasion to restate a point regarding Bayes' theorem.

BASE RATES AND SIGN EFFICACY

In several papers, Meehl and his co-workers (Dawes, 1962; Meehl, 1956; Meehl & Rosen, 1955; Rosen, 1954) attempted to persuade psychodiagnosticians that the rate of occurrence (base rate) of a condition or "disease" in an appropriate reference population is essential to the assessment of the probability that a patient or testee presumed drawn from that population has the condition, given the presence of a particular diagnostic sign. Although several of the articles cited have been around for 20 years, it may be useful to restate Bayes' theorem as applied to the situation in which we have a single dichotomous condition or diagnosis (X) and a single diagnostic sign (x).

$$p(X/x) = \frac{p(X)\,p(x/X)}{p(x)} \tag{1}$$

That is, the probability that condition X exists given the presence of sign x is equal to the product of the base rate for the occurrence of X and the conditional probability of the sign given the condition, divided by the base rate for the occurrence of the sign. If, in this dichotomous case, we designate the absence of X by Y [i.e., p(Y) = 1 - p(X)], the equation becomes as follows:

$$p(X/x) = \frac{p(X)\,p(x/X)}{p(X)\,p(x/X) + p(Y)\,p(x/Y)} \tag{2}$$

Where differential diagnosis is at issue, our interest is in the circumstances under which the probability that an individual belongs in diagnostic category X is greater than the probability he belongs in Y, given the presence of valid sign x which has been empirically associated with diagnosis X.1 The inequality of interest, p(X/x) > p(Y/x), becomes Inequality 1 on substitution from Equation 1:

$$\frac{p(X)\,p(x/X)}{p(x)} > \frac{p(Y)\,p(x/Y)}{p(x)} \tag{Inequality 1}$$

With appropriate transposition and substitution we have Inequality 2 as follows:

$$\frac{p(X)}{p(Y)} > \frac{p(x/Y)}{p(x/X)} \tag{Inequality 2}$$

Note that the truth of these inequalities implies that p(X/x) has some value greater than .50. Inequality 2 is that presented by Meehl and Rosen (1955) using different symbols. The base rate of condition X is p(X), p(Y) is the base rate of condition Y (the absence of X), p(x/Y) is the proportion of persons in condition Y misidentified by sign x (false positives), and p(x/X) is the proportion of persons in condition X who are correctly so identified by sign x (true positives). A positive diagnosis regarding condition X based on sign x is more often right than wrong (i.e., its probability exceeds .50) if and only if the ratio of the positive to the negative base rate exceeds in value the ratio of the false positive to the true positive rate.
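
The relation between the posterior probability and the base-rate inequality can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the function names and the rates in the loop are assumptions chosen only to show that the inequality holds exactly when the posterior exceeds .50.

```python
def posterior(base_rate, true_positive_rate, false_positive_rate):
    """p(X/x) for a dichotomous condition, by Bayes' theorem.

    base_rate           -- p(X)
    true_positive_rate  -- p(x/X)
    false_positive_rate -- p(x/Y), where Y is the absence of X
    """
    hit = base_rate * true_positive_rate
    false_alarm = (1.0 - base_rate) * false_positive_rate
    return hit / (hit + false_alarm)


def sign_favors_diagnosis(base_rate, true_positive_rate, false_positive_rate):
    """True when p(X)/p(Y) exceeds p(x/Y)/p(x/X)."""
    base_rate_ratio = base_rate / (1.0 - base_rate)
    error_ratio = false_positive_rate / true_positive_rate
    return base_rate_ratio > error_ratio


# Illustrative rates only (hypothetical, not taken from any study).
for base_rate in (0.05, 0.20, 0.50):
    p = posterior(base_rate, 0.60, 0.20)
    favored = sign_favors_diagnosis(base_rate, 0.60, 0.20)
    print(f"p(X) = {base_rate:.2f}  p(X/x) = {p:.3f}  diagnosis favored: {favored}")
```

With these made-up rates the sign tips the balance only at the highest of the three base rates, which is the general point: the same sign can support or fail to support a diagnosis depending on the reference population.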

DIAGNOSIS OF THE PSEUDOPATIENTS

What is the likelihood that someone appearing at a mental hospital and describing an auditory hallucination is a schizophrenic? In order to give a Bayesian answer to this question we need to know, or have a plausible estimate of, the incidence of schizophrenia in the population from which the alleged hallucinator came. We also need an estimate of the proportion of schizophrenics and nonschizophrenics (or of persons at large) who hallucinate. For the rate of hallucination and a variety of other symptoms by type of diagnosis, a widely cited source is a study by Zigler and Phillips (1961), who reported the incidence of 35 symptoms among patients admitted to the Worcester State Hospital during the period of 1945 to 1957. This study is cited by Rosenhan as indicating that "there is enormous overlap in the symptoms presented by patients who have been variously diagnosed" (Rosenhan, 1973, p. 254). With respect to hallucinations, however, this criticism is less appropriate: Of 287 persons diagnosed schizophrenic in the Zigler and Phillips sample, 35% manifested hallucinations, while for the three other diagnoses included (manic-depressive, psychoneurotic, and character disorder) the incidence was 10% (Zigler & Phillips, 1961, p. 71). The ratio of "false positives" to "true positives" [p(x/Y)/p(x/X)] for the Zigler and Phillips sample is thus .10/.35 = .29. If these were the data with which a Bayesian diagnostician were constructing his assessment of the likelihood that a given hallucination-reporting interviewee is schizophrenic, the odds of his being "right" (i.e., using the term schizophrenia in a manner consistent with its usage at Worcester State Hospital) can only be better than even if p(X)/p(Y) exceeds .29, and this in turn can be the case [since p(Y) = 1 - p(X)] only if p(X) exceeds .225.
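
The step from the ratio of .29 to the critical base rate of .225 can be written out as follows. This is only the algebra implied by the text; the symbol r is introduced here as shorthand for the ratio of false positives to true positives.

$$\begin{aligned}
\frac{p(X)}{1 - p(X)} &> r, \qquad r = \frac{p(x/Y)}{p(x/X)} = \frac{.10}{.35} \approx .29 \\
p(X) &> r\,[1 - p(X)] \\
p(X)\,(1 + r) &> r \\
p(X) &> \frac{r}{1 + r} = \frac{.29}{1.29} \approx .225
\end{aligned}$$

Carrying the unrounded ratio .10/.35 through the same steps gives a cutoff of about .222, so the .225 figure reflects rounding the ratio to .29 first.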

Using the Zigler and Phillips data in this way, however, implies that schizophrenics in hospital (having, presumably, been admitted under a variety of circumstances) are similar to schizophrenics appearing initially of their own volition in the matter of hallucination frequency. Furthermore, the frequency of hallucination among nonschizophrenic patients is taken as an estimate of the hallucination rate among all nonschizophrenics. The first of these assumptions may be plausible; the second is almost certainly not. Note that if, as seems likely, the rate of hallucination in the nonschizophrenic hospital population (or the nonschizophrenic hospital-admission-seeking population) is below 10%, the critical value of p(X) is correspondingly reduced. Lacking data on the frequency of the diagnostic sign in question among that subset of persons who appear at a hospital, the diagnostician is thus in the curious position of having to assume admission in order to assess the probabilities of various conditions.

Picking a "base rate" estimate for p(X) in this case presents a similar problem. Disregarding the schizophrenic sign (hallucination), what is the likelihood that the person who has appeared at the hospital for an admission interview is a schizophrenic? The incidence of schizophrenia in the American population has been estimated as falling in the range of 50 to 250 cases per 100,000 population per year (Lemkau & Crocetti, 1958). Taking .25% as an upper bound, the ratio p(X)/p(Y) for the general population becomes .0025/.9975 = .002506. The inequality is far from satisfied on the assumption that each new person appearing at a hospital admissions office for an interview is a random sample of the American population. Suppose, however, that the pseudopatients were being compared not with the population at large but with the population of persons already admitted to mental hospitals in the United States. The proportions of persons diagnosed schizophrenic among patients in various broader categories for 1965 admissions to public and private facilities in the United States (U.S. National Institute of Mental Health, 1966,1967) are given in Table 1.

TABLE 1
First Admissions to U.S. Inpatient Psychiatric Facilities during 1965

                                         Public mental hospitals      Private mental hospitals
                                                   % diagnosed                  % diagnosed
Variable                                       n   schizophrenic            n   schizophrenic

All diagnoses                            135,476       17.6            38,744       18.7
Total without chronic brain syndromes    104,524       22.8*           35,433       20.5
Total psychotic                           32,972       72.4*           14,715       49.2*
Total schizophrenic                       23,861       -----            7,247       -----

* p(X) exceeds critical value of .225

If a diagnostician were searching among these proportions for an estimate of p(X) (i.e., if a tentative or implicit decision had been made to try to fit the interviewee to the set of diagnoses of those already admitted to the hospital), he would note two (asterisked in Table 1) instances in the case of a public, or one in the case of a private, hospital in which p(X) exceeds the critical value of .225. Clearly our diagnostician would be more likely wrong than right in calling this allegedly hallucinating interviewee schizophrenic unless the range of possibilities can be narrowed somewhat first. But suppose, in quickly perusing the list of broad symptom categories in the National Institute of Mental Health statistics, our diagnostician had touched on and eliminated the chronic brain syndromes as possibilities. Most of these, especially in public hospitals, are presumed due to changes of advanced age. In any case the implied insidious onset contradicts the reports of the pseudopatients. With the narrowing of possibilities to non-chronic organic conditions and psychopathologies, the base rate for schizophrenia diagnoses would already be high enough for the population of public hospitals (.228) to squeak by the decision point (p(X)/p(Y) = .295), given the additional datum hallucination.2 The actual probability, p(X/x), computable from Equation 2, is .51.

Note that on this latter set of assumptions about base rates for sign and condition, the 11 public mental hospital diagnosticians all were on the side of higher odds (while for private hospitals the probability of schizophrenia given hallucination, once chronic brain syndromes have been ruled out, is .47). If our diagnostician were willing to make the admittedly improbable prior assumption that the admission-interview-seeking person is psychotic simply on the strength of appearance at a hospital, the presence of hallucinations would raise the likelihood of schizophrenia from .72 and .49 to .90 and .77 for public and private hospitals, respectively. Therefore, if (a) one is entitled to apply base rate statistics figured on the population of persons already hospitalized to a new self-referred interviewee, and (b) one concludes from the interview that this person is experiencing hallucinations,3 and (c) one is simply interested in making the highest probability diagnosis among those already in common usage, then one ought to have diagnosed Rosenhan's pseudopatients schizophrenic if one were seeing them in a public mental hospital.
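
The probabilities quoted above can be checked with a short script. This is a sketch under the same assumptions the text makes: the Zigler and Phillips (1961) rates of .35 and .10 are applied unchanged to every reference population, including the general population, which the discussion itself treats as doubtful. The dictionary labels and variable names are illustrative choices.

```python
# Symptom rates quoted in the text from Zigler and Phillips (1961).
P_SIGN_GIVEN_X = 0.35  # p(x/X): hallucination among diagnosed schizophrenics
P_SIGN_GIVEN_Y = 0.10  # p(x/Y): hallucination among the other diagnoses


def posterior(base_rate):
    """p(X/x) by Bayes' theorem for a dichotomous condition."""
    hit = base_rate * P_SIGN_GIVEN_X
    false_alarm = (1.0 - base_rate) * P_SIGN_GIVEN_Y
    return hit / (hit + false_alarm)


# Base rates p(X) for each candidate reference population, as quoted in
# the text (general population upper bound) and in Table 1.
REFERENCE_POPULATIONS = {
    "general population (upper bound)": 0.0025,
    "public, without chronic brain syndromes": 0.228,
    "private, without chronic brain syndromes": 0.205,
    "public, psychotic": 0.724,
    "private, psychotic": 0.492,
}

posteriors = {name: posterior(rate) for name, rate in REFERENCE_POPULATIONS.items()}

for name, value in posteriors.items():
    print(f"{name:42s} p(X/x) = {value:.3f}")
```

The output spans the range described in the abstract: under 1% when the general population is the reference group, and .51, .47, .90, and .77 for the four hospital populations.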

At this point I hear persistent voices muttering "three big ifs." Yes indeed, and that is the point of this exercise. Each of the conditional assertions listed might be the subject of cogent controversy. First, while it seems absurd and even unethical to assume that a person who has come to a mental hospital seeking help is a randomly drawn sample of the American population (viz., to ignore that persons seeking or seeming to seek admission to mental hospitals are at least statistically quite abnormal), it is also questionable whether the group of persons already admitted is the appropriate reference group. Rather, we would like data on the subset of self-referred admittees and on the total population of admissions-interview appointment makers from which they were drawn. The representation of schizophrenics among this group is likely to be appreciably less than among inpatients as a whole, though far greater than in the population at large.

The proportion of self-referred schizophrenics who report hallucinations, p(x/X), may also differ substantially from the Zigler and Phillips (1961) figure. Such data are not likely to be in the memory or the reprint file of the average diagnostician, and in fact broad parameter studies on this topic have not to my knowledge been done. In the absence of such figures, a diagnostician must choose between two unlikely assumptions about condition base rates and must rather arbitrarily assume that the relation of sign to symptom in the self-referred population is as among inpatients. It seems reasonable to say of the diagnosticians who saw the pseudopatients that they behaved as if they had made a Bayesian computation with recent admittees as the source of p(X) and the Zigler and Phillips data as the source of p(x/X). Surely such behavior may serve to compound past mistakes, but I question whether it should be traduced until we have better numbers to plug into the formula.

Second, although Rosenhan reports selecting the pseudopatient's symptom in part for its implausibility (1973, p. 251; cf. Spitzer, 1975, pp. 445-446), hallucinations are strong candidates for inclusion in an odds-setting diagnostic process because of their relative discreteness and reliability of identification. The other symptoms whose presence singly produces better than even odds of schizophrenia on the Zigler and Phillips data require interpretation of behaviors that are subject to inter-judge disagreement. Although a disclaimer of hallucinatory experience in a presumed schizophrenic would be of uncertain diagnostic value at best (cf. Meehl, 1973, p. 231), an affirmation of such experience is likely to be believed.4

On the question of the readiness to assign a psychotic label to persons with "normal" developmental histories, I think it important to take issue with the pseudopatient study, although as Rosenhan (1973, p. 254) notes, the overlap of personal histories between schizophrenic and nonschizophrenic patients is sufficiently great to make distinction solely on this basis a highly dubious business. To cite an example, Schofield and Balian (1959) found such similarity in the personal histories of schizophrenic and nonpsychiatric patients that accurate diagnosis on the basis of background variables alone would be almost impossible. Such presumed "schizophrenogenic" events as maternal overprotection and dominance, while more frequent among schizophrenic than nonpsychiatric patients, were noted in less than one fourth of the mothers of schizophrenics. It is probably in the evaluation of reported personal background data that the hospital context exerts its greatest effect, as both Rosenhan (1973, 1975) and several of his critics have suggested (Farber, 1975; Spitzer, 1975). An interviewee's statements about family and other items of personal history should probably be categorized by the interviewer with less certainty than is assigned a report of hearing voices. But, like reports of recent perceptual experience, recounted personal background data are of strikingly different diagnostic usefulness depending on the context which allows one to set prior odds for the diagnosis in question. Hence, the very dependence on context and setting which lessens the likelihood that the sane would be detected as such in the hospital setting also improves the likelihood that the insane would be detected, if the signs used in the detection have validity [i.e., if p(x/X) > p(x/Y)]. When the base rates for schizophrenia get high enough, even an interviewee's report of having had a very protective mother may improve diagnostic accuracy, although as Meehl has suggested, one gets most out of such a sign when base rates are so high that one does not "need it much" (1973, p. 232).

The third point is the most difficult and the most important. The willingness of a diagnostician to assign a diagnostic label ought to be a function not of simply picking the most probable category, but of utilities attached to the two types of error (or of correctness). The Bayesian process illustrated above pertains to the artificial case in which the benefits and/or costs of each possible decision outcome (true positive, false positive, true negative, false negative) are set equal. And here, of course, is where much of the heat produced by the pseudopatient study originates: Readers hold widely disparate views of the consequences (costs) of either turning away a schizophrenic from the hospital or of admitting a sane person. It is likely that, as Rosenhan (1973, p. 252) suggests, the bias of hospital staff is toward Type II error, calling a sane person schizophrenic, while many who were shocked by the study would prefer to reduce such errors even at the cost of increased Type I error because they are convinced either of the nonexistence of schizophrenia or of the harmfulness of hospitals. For some, the stickiness of the schizophrenia label is itself the problem: Even if the syndrome exists and its identification might facilitate treatment, one should make all such diagnoses subject to automatic removal or renewal in light of future evidence. This last view may be a desirable goal, yet, as Spitzer (1975) points out, it is the phrase "schizophrenia in remission" which most succinctly conveys to the psychiatric community that the individual so labeled was the subject of two diagnostic decisions, separated in time, to the effect that schizophrenia was the most plausible diagnosis at the former point and that the manifest characteristics on which the schizophrenia diagnosis was based had ceased or abated by the latter. The primary danger in such a labeling process is probably the misuse to which the information may be put if known outside the clinical setting, rather than the existence of the label.
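
One way to see how unequal utilities would change the decision is the standard expected-cost rule: assign the label when the expected cost of a missed case exceeds the expected cost of a false label. The sketch below assumes that correct decisions cost nothing, and the cost figures are hypothetical placeholders; neither the function names nor the costs come from the pseudopatient study or from any data.

```python
def posterior_threshold(cost_false_positive, cost_false_negative):
    """Posterior p(X/x) above which labeling X has the lower expected cost.

    Labeling X costs (1 - p) * cost_false_positive in expectation;
    withholding the label costs p * cost_false_negative. The two are
    equal at p = cost_fp / (cost_fp + cost_fn). Correct decisions are
    assumed to cost nothing.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def should_diagnose(posterior, cost_false_positive, cost_false_negative):
    """True when the posterior exceeds the cost-based threshold."""
    return posterior > posterior_threshold(cost_false_positive, cost_false_negative)


# Equal costs reproduce the "more likely than not" rule (threshold .50).
print(posterior_threshold(1.0, 1.0))

# Hypothetical case: a false label is judged three times as costly as a
# missed case, so the threshold rises to .75.
print(should_diagnose(0.51, 3.0, 1.0))  # posterior of .51 no longer suffices
print(should_diagnose(0.90, 3.0, 1.0))  # posterior of .90 still does
```

With equal costs the threshold is .50, the case treated in the computations above; a reader who weighs wrongful hospitalization more heavily is in effect demanding a higher posterior before the label is applied.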

The major reason for raising these issues, however, is to suggest that the pseudopatient data under discussion are hardly an adequate basis for their resolution. Psychological diagnosis itself is under attack, and the Rosenhan study seems to have escalated the conflict. Some benefit will doubtless result. Hospital staff, chastened to learn that pseudopatients received an average of only 6.8 minutes per day of staff contact, are perhaps now stopping to chat with patients formerly overlooked. On the diagnostic side, however, I think the yield is mixed. Surely we can change diagnosticians' subjective prior odds, and hence the outcome of their decisions. Publicity surrounding the pseudopatient study has probably already achieved this result in many settings. Since those odds and their effects on decisions are seldom articulated, however, none but the militantly anti-nosological should take comfort. The same categories may now be applied with greater likelihood of error. Schizophrenics presenting themselves at hospitals may now be more likely either to be denied admission or to receive another diagnosis. To assert that this is all to the good because hospitals do not treat and diagnosis makes no difference is to beg the question.

SUGGESTIONS FOR RESEARCH

It should be noted that neither the pseudopatient study nor this discussion of it may be expected to provide anything remotely like an adequate model or description of the entire process by which diagnoses are made. An adequate study or simulation of that process would involve the array of categories and signs available together with the great variety of cognitive processes by which categories and signs are brought into relation with each other.

A number of attempts along these lines have been made (e.g., Ledley & Lusted, 1959). The questions addressed here, however, concern the explanation of the empirical finding that persons behaving as the pseudopatients behaved had a high likelihood of being hospitalized and diagnosed schizophrenic. It is assumed that, given the settings in which data were collected, whether to hospitalize and whether to diagnose as schizophrenic were plausible questions for diagnosticians to ask. Given the near unanimity of the diagnostic outcomes, these two questions would also appear to be the best starting point for study of the psychodiagnostic process in settings like those sampled by Rosenhan. We need more data on the characteristics of persons presenting themselves at mental health settings, both as regards relative incidence of symptoms and "true" diagnostic condition. It is also important that more be known about the consequences of diagnostic errors. What happens to a hallucinating schizophrenic refused admission to a hospital? How upsetting, and how costly of limited resources, is the temporary hospitalization of a sane person?

Finally, the subjective odds used implicitly by diagnosticians in their day-to-day work ought to be made explicit. We ought to know what the probabilities of various signs and diagnoses in various settings are believed to be by those doing the diagnosing. Then we can move to assess whether these odds make Bayesian or therapeutic sense. Questions about the utility of diagnosis generally and the schizophrenia concept specifically also ought to be discussed, but the determination of adequate estimates of conditional probabilities for the signs and syndromes under evaluation is essential to assessment of the costs (in human and economic terms) of the possible errors. Schizophrenia may finally prove not to exist. It may exist but its identification produces such terror that we dare not speak its name. Yet if the world contains person-treatment combinations whose utility varies, we shall not match persons and treatments well until we know our "Ps" and "Xs."

REFERENCES

Crown, S. "On being sane in insane places": A comment from England. Journal of Abnormal Psychology, 1975, 84, 453-455.

Dawes, R. M. A note on base rates and psychometric efficiency. Journal of Consulting Psychology, 1962, 26, 422-424.

Farber, I. E. Sane and insane: Constructions and misconstructions. Journal of Abnormal Psychology, 1975, 84, 589-620.

Ledley, R. S., & Lusted, L. B. Reasoning foundations of medical diagnosis. Science, 1959, 130, 9-21.

Lemkau, P. V., & Crocetti, G. M. Vital statistics of schizophrenia. In L. Bellak & P. K. Benedict (Eds.), Schizophrenia: A review of the syndrome. New York: Logos Press, 1958.

Meehl, P. E. Wanted-a good cookbook. American Psychologist, 1956, 11, 263-272.

Meehl, P. E. Why I do not attend case conferences. In P. E. Meehl (Ed.), Psychodiagnosis: Selected papers. Minneapolis: University of Minnesota Press, 1973.

Meehl, P. E., & Dawes, R. M. Mixed-group validation: A method for determining the validity of diagnostic signs without using criterion groups. Psychological Bulletin, 1966, 66, 63-67.

Meehl, P. E., & Rosen, A. Antecedent probability and the efficiency of psychometric signs, patterns, or cutting scores. Psychological Bulletin, 1955, 52, 194-216.

Millon, T. Reflections on Rosenhan's "On being sane in insane places." Journal of Abnormal Psychology, 1975, 84, 456-461.

Rosen, A. Detection of suicidal patients: An example of some limitations in the prediction of infrequent events. Journal of Consulting Psychology, 1954, 18, 397-403.

Rosenhan, D. L. On being sane in insane places. Science, 1973, 179, 250-258.

Rosenhan, D. L. The contextual nature of psychodiagnosis. Journal of Abnormal Psychology, 1975, 84, 462-474.

Schofield, W., & Balian, L. A comparative study of the personal histories of schizophrenic and nonpsychiatric patients. Journal of Abnormal and Social Psychology, 1959, 59, 216-225.

Spitzer, R. L. On pseudoscience in science, logic in remission, and psychiatric diagnosis: A critique of Rosenhan's "On being sane in insane places." Journal of Abnormal Psychology, 1975, 84, 442-452.

U.S. National Institute of Mental Health. Patients in mental institutions, 1964, Part 3: Private mental hospitals and general hospitals with psychiatric facilities (Public Health Service Publication No. 1452). Washington, D.C.: Government Printing Office, 1966.

U.S. National Institute of Mental Health. Patients in mental hospitals (Public Health Service Publication No. 1597). Washington, D.C.: Government Printing Office, 1967.

Weiner, B. "On being sane in insane places": A process (attributional) analysis and critique. Journal of Abnormal Psychology, 1975, 84, 433-441.

Zigler, E., & Phillips, L. Psychiatric diagnosis and symptomatology. Journal of Abnormal and Social Psychology, 1961, 63, 69-75.


A version of this paper appeared in the Journal of Abnormal Psychology, 1976, 85, 416-422. I am indebted to Jonathan Baron, Douglas Heath, Clark McCauley, and Sidney Perloe for comments on an earlier draft. Copyright (c) Douglas A. Davis, 1976. All rights reserved.

1In conditional probability terms, the establishment of "validity" for a particular sign involves the demonstration, at an appropriate level of statistical significance, that p(x/X) > p(x/Y). For a treatment of such sign validation issues when the criterion group membership of individual subjects is unknown, see Meehl and Dawes (1966).

2For an alternate, nonalgebraic account of how this narrowing of possibilities might have been achieved, see Spitzer (1975, p. 446).

3Throughout this discussion hallucinations have been treated as a generic category, disregarding the usual distinction between auditory and visual or other sensory modalities. In this, the example of Zigler and Phillips (1961) has been followed. From Kraepelin and Bleuler on, most writers have considered auditory hallucinations most characteristic of schizophrenia. Were comparable data on the relative incidence of auditory hallucination across diagnoses employed, the ratio p(x/Y)/p(x/X) would thus probably have a lower value, permitting a somewhat lower p(X) estimate as the cutoff for diagnosing schizophrenia.

4The credibility of reported experiences is, of course, subject to wide and rapid change with variations of milieu and cognitive set, as indicated by Rosenhan's data on the number of "real" patients perceived as pseudopatients when the hospitals were (falsely) told pseudopatients would try to gain admission (Rosenhan, 1973, p. 252). Consider the likelihood that a simple report of auditory hallucination would be believed in a military combat hospital. The central questions for public mental health settings are whether dissimulation constitutes a sufficient problem to merit expensive steps against it and whether such institutional care should be available to anyone who seems to desire it.