## Notes on Haverford Class of '79 Admissions Data

Doug Davis
Psychology 205h

The Statistica file used in these lab exercises is in the Psych 206h folder on the Faculty Courseuse server. The file name is "admit79."

Please make a copy of this file on the PC you'll be using and do the following:

### Part 1

1. Run a mean and median for the SAT scores using Statistica's Nonparametric module.
Are there any striking features of these SAT data? Any anomalies that might affect how you would use them to predict performance at (or after?) Haverford? What is the effect of Haverford's selectivity with respect to SAT scores on the possible correlation of these tests with performance?
2. Reload the main Statistica module and run a correlation between the SAT-V and SAT-M scores, then make a scatterplot and fit a regression line.
Is the relationship between these variables strong enough that one might be substituted for the other? Are you especially curious about some points (individuals) on the scatterplot?
3. Run an analysis of variance on the relationship between the SAT scores and the new graduation variable, GRAD2.
Are entry SAT scores good predictors of graduation outcome?
4. Run a crosstabulation of the admissions academic rating and the GRAD2 dependent variable and find the value and significance of chi-square.
What percentage of entering students with each of the admissions ratings graduated with honors? Is this a surprising result? Is it important/useful/troubling?
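
If you would like to check Statistica's output by hand, the Part 1 statistics can be sketched in a few lines of Python. This is only a minimal sketch: the scores and counts below are invented for illustration and are not taken from admit79, and the function names are hypothetical helpers, not anything Statistica provides.

```python
from statistics import mean, median

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def regression_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def chi_square(table):
    """Pearson chi-square for a crosstab given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for row, rt in zip(table, row_totals):
        for observed, ct in zip(row, col_totals):
            expected = rt * ct / n  # count expected if rows and columns were independent
            total += (observed - expected) ** 2 / expected
    return total

# Invented scores for illustration only; not taken from admit79.
sat_v = [600, 650, 700, 550, 720, 680]
sat_m = [640, 660, 710, 600, 700, 730]
print("mean V:", mean(sat_v), "median V:", median(sat_v))
print("r(V, M):", round(pearson_r(sat_v, sat_m), 3))
print("slope, intercept:", regression_line(sat_v, sat_m))

# Invented counts: rows = academic rating, columns = GRAD2 outcome.
crosstab = [[10, 20], [20, 10]]
print("chi-square:", round(chi_square(crosstab), 3))
```

The sketch stops at the chi-square value itself. Its significance depends on the degrees of freedom, (rows - 1) x (columns - 1), which Statistica looks up for you.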

### Part 2

1. Read the Statistica Manual's brief explanation of multiple regression.
2. Run a multiple regression analysis of the Class of '79 data using Statistica's Regression module. Specify GRAD2 as the dependent variable and include all the predictor variables. Compute a multiple R and examine the various beta weights for the predictors, when other variables are controlled ("partialled out").
What is the best predictor of graduation outcome? How much better are all the variables than this one predictor? How predictable is this variable from the various data apparently used in its construction?
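
To see what "partialled out" means for a beta weight, it may help to work the two-predictor case by hand, where the standardized weights follow directly from the three pairwise correlations. The Python sketch below assumes exactly two predictors and uses invented numbers; the function and variable names are hypothetical, and Statistica's Regression module handles any number of predictors by a more general method.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def two_predictor_regression(y, x1, x2):
    """Standardized beta weights and multiple R for y regressed on x1 and x2."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    denom = 1 - r_12 ** 2  # zero if the predictors are perfectly correlated
    # Each beta is that predictor's correlation with y, adjusted for
    # what it shares with the other predictor.
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, sqrt(max(r_squared, 0.0))

# Invented values for illustration only; not taken from admit79.
outcome = [1, 2, 2, 3, 3, 4]
rating = [2, 3, 2, 4, 3, 5]
sat = [580, 640, 600, 620, 700, 690]
b1, b2, multiple_r = two_predictor_regression(outcome, rating, sat)
print("beta(rating):", round(b1, 3))
print("beta(sat):", round(b2, 3))
print("multiple R:", round(multiple_r, 3))
```

When the two predictors are uncorrelated, each beta weight equals the simple correlation with the outcome. The more the predictors overlap, the more the beta weights differ from the simple correlations.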

### The Variables Explained

The following notes are based on a conversation in May, 1985, with William Ambler, Haverford Director of Admissions at the time our data were collected.

High school rank is very important, but the top 2-5% are treated differently than the rest, since "grades predict grades."

SATs

The average math score is higher than the verbal, and this had been a steady trend at Haverford at the time, although both had dropped on average, due to national trends.

The interviewer is looking for seriousness as well as achievement, and is trying to predict the response to Haverford classes and teachers. Before expansion of the College (the Class of '61 was roughly 450 students, the Class of '75 roughly 750), there was an attempt to predict specific Haverford professors' responses to the student. The interview was more important when the pool was smaller. Generally, the interview is not a very good predictor. It's a first meeting, and the student is often very tense.

As an example of a "risky" applicant, Bill Ambler gave the case of someone with:

• very strong recommendations
• high achievement
• strong premedical interest, and
• SAT Math score in the low 500s

On the whole the process is "like a jigsaw puzzle," where the relation of the pieces to each other is especially important.

Extra-curricular Activities

The variable is the sum of participations in "leadership" roles and as athlete, musician, or actor.
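
As a rough illustration of how such a score is built, here is a minimal Python sketch. It assumes each kind of participation is recorded as a simple count; the actual coding used in admit79 is not documented here, and the function and argument names are hypothetical.

```python
def extracurricular_score(leadership_roles, athlete, musician, actor):
    """Sum of leadership participations plus athlete, musician, and actor participations."""
    return leadership_roles + athlete + musician + actor

# Invented example: two leadership roles, one sport, no music, one play.
print(extracurricular_score(2, 1, 0, 1))
```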

Size of Applicant Pool