How They Write the SAT
By David Owen
Standardized multiple-choice tests, such as the Scholastic Aptitude Test, are more than hurdles on the way to college. The tests have become a pervasive...
...Strictly speaking (too strictly probably), doesn't the phoenix symbolize death and rebirth rather than immortality...
...A variety of people write questions for the SAT: company employees, freelancers, even student interns...
...The phoenix item, an analogy problem, also drew a comment from Cruise...
...You get an answer sheet and some statistics with it...
...But all she was really saying was that she would have been more inclined to defend the item if fewer people had answered it correctly...
...Substituting "reject" for "decline" in the above could have made the item easier to answer, thus lowering its delta and throwing off the test specifications...
...It ain't often you see CONVOKE...
...If she had found the word nigger in one of the questions, presumably she would have scratched it out...
...Words are in dictionary, they have modern usage, and we test more specialized science vocab than this...
...He has to realize that the ETS answer will be something drab, humorless, and plodding— something very like (A), as indeed it is...
...And I said, the options here are verbs, and it appears that this is the only item with different parts of speech for the stem and the options...
...Willie M. May, chairman of the committee and assistant principal for personnel and programming at Wendell Phillips High School in Chicago, told me in no uncertain terms that preparing questions was not one of his committee's duties...
...Cruise made a similar comment in her review, but the item was not changed ("Sounds fine to me and is supported by dictionary," wrote Curley...
...Curley's most frequent remark is a mildly petulant "but OK as is," which is scribbled after most criticisms...
...(C) What good service...
...The reviewer simply answers each item, marking his choices on ordinary lined paper and handwriting comments on items he feels need improvement...
...noted JW of a word tested in one item...
...(D) Je voudrais une cuillere...
...An item that very few students get right might have a delta of 16.8...
...My French is vestigial at best...
...The company has always insisted that its work is too complex and too important to accommodate the scrutiny of outsiders...
...Sometimes they're very stubborn...
..."Minor revisions can be made in questions at this stage," says an ETS document, referring to a test-development stage prior to the one in which the committee is consulted, "but a major revision in a question makes it necessary to re-pretest the question in order to determine the effect of the changes on the statistical characteristics of a question." Such revisions are almost never made...
...It's really just a fancy way of expressing the percentage of students who consider a particular item but either omit it or get it wrong...
...Which test doesn't really matter...
...(B) La soupe est delicieuse...
...But I obtained a copy of the review materials for the SAT administered in May 1982, which were used as evidence in a court case...
...Think we can defend," he wrote...
...According to an ETS flier, "Each test is reviewed to ensure that the questions reflect the multicultural nature of our society and that appropriate, positive references are made to minorities and women...
...This might be an item the critics pick on...
...Actual minorities
Whether the SAT is culturally biased against minorities is a perennial concern at ETS...
...You review the questions to make sure that the question is legible and that there aren't any trick language aspects to it, and that it's clear, and that you come up with the answer that you think should be gotten...
...The minority reviewer, a company employee, simply counts the number of items that refer to each of five "population subgroups" and enters these numbers on a Test Sensitivity Review Report Form...
...She made no other remarks...
...By the time the committee receives a test, changing it is virtually impossible...
...I'm not fishing for anything," I said...
...ETS has little interest in their opinions...
...If correcting the wording of a question changes the way it performs on the test, then some of the people now getting it wrong—or right, as the case may be—are doing so only because the question is badly written...
...Delta can say only that a question was answered correctly by the exact percentage of people who answered it correctly...
...They many times try to dismiss it," she said...
...The tests have become a pervasive measure of worthiness in our society— even a status symbol, as in, "My boy scored double 700s...
...Now I'm using American Heritage Dictionary, which I feel is common access across the country...
...SAT items also often test the third, fourth, or fifth meanings of otherwise common words, which can create confusion...
...CONFIDENTIAL are stamped in red ink at the top of every page...
...Le garcon maladroit lui renverse le potage sur les genoux...
...Then it was filed and forgotten...
...In general, he said, he found the SAT impressive...
...What Cruise thought she was saying was that if the item had been more "difficult," and thus intended for "abler" students, the ambiguity in it would have been less objectionable...
...Noun schmoun
After its sensitivity review, every SAT is passed along to what the College Board (which hires ETS to write and administer its college admissions tests) describes as a committee of "prominent specialists in educational and psychological measurement...
...The assembler, Ed Curley, decided not to follow JW's suggestions, but the comment is revealing of the level at which ETS analyzes its tests...
...According to the College Board's one-paragraph "Charge to SAT committee," each member reviews, by mail, two tests a year...
...If you happen to be less competent in math, you still review it for language...
...Now, I don't want to spend much more time on this, because we do exactly what the College Board statement says, so I don't know what you're out fishing for...
...The question stayed in the test...
...ETS doesn't pursue the implications...
...According to the official mythology, the SAT committee ensures the integrity of the test by subjecting it to rigorous, independent, expert scrutiny... 'matriarchate' so item is fine...
...The item: 42...
...It needs Scotch tape...
...An ETS answer key is included with each test...
...Or, to put it another way, she would have thought it less ambiguous if it had been more ambiguous...
...He is an unbeliever, but he is broad-minded enough to decline the mysteries of religion without ---- them...
...A spoon, waiter, for the soup in my lap...
...these particular Asian Americans were Shang Dynasty Chinese, 1766-1122 B.C...
...In ETS test reviews, the emphasis is not always on whether keyed answers are good or absolutely correct, but on whether they can be defended in the event that someone later complains...
...Yes, we have that opportunity...
...In the following item from the same test, the word "decline" is used peculiarly: 17...
...Understanding how the test-makers think is one of the keys both to doing well on ETS tests and to penetrating the mystique in which the company cloaks its work...
...On the verbal SAT administered in May 1982, minority member Beverly Whittington found seven items that mentioned women, one that mentioned black Americans, two that mentioned Hispanic Americans, none that mentioned native Americans, four that mentioned Asian Americans (actually, she was stretching here...
...David Owen is a New York writer...
...In ETS analogies, students are given a pair of words and asked to select another pair "that best expresses a relationship similar to that expressed in the original pair...
...A question is hard if few people answer it correctly, easy if many do...
...We are in an advisory capacity," he said...
...All I've got is what most people would have...
...Sometimes they change, but I find that item writers are very pompous about their work, and they don't like you to say anything...
...Thus bright students sometimes have trouble on ETS tests, because they see possibilities that ETS's question-writers missed...
...Many of the test questions are ambiguous, arbitrary, and downright silly...
...Here's an item from a recent Achievement Test in French: 2. Un client est assis dans un restaurant chic...
...Curley did this and responded, "1st meaning of 'matriarchy' in Webs...
...And do you do that...
...When the second test reviewer, Pamela Cruise, wondered whether answering one difficult item required "outside knowledge," Ed Curley responded: "We must draw the line somewhere but I gave item to Sandy...
...easier"), at delta 13.2...
...1985 by David Owen...
...Now, (B), (C), and (D) strike me as nice, funny, sarcastic responses that come very close to being the sort of remark I would make in the situation described...
...Cruise had forgotten the real meaning of delta and fallen victim to her own circular logic...
...Sometimes the reviewer is nearly apologetic...
...ETS is very secretive about its methods...
...I asked her how ETS responded to her criticisms...
...This, of course, doesn't make any sense...
...(A) denouncing (B) understanding (C) praising (D) doubting (E) studying
My Webster's Seventh New Collegiate Dictionary gives the fourth meaning of decline as "to refuse to accept." This is more or less what ETS wants to say...
...Medicine freaks
Exactly how does ETS come to write questions like this...
...The easiest way to see this is to look at the tests themselves...
...But as should be apparent to anyone who has taken these tests, the white-coated image is just that...
...You see, they are going around my complaint." I called Hammett Worthington-Smith, an associate professor of English at Albright College in Reading, Pennsylvania, and asked him to describe his duties on the SAT committee...
...ETS made Whittington take a three-day training program in "test sensitivity" before permitting her to do all of this...
...ETS calls delta a measure of "difficulty," but this definition is circular...
...Okay, so that's enough for today...
...The customer exclaims: (A) You could not pay attention, no...
...she could not key—none of the terms were familiar to her...
...Making even a slight alteration in an item can necessitate a new pretest (or trial run in ungraded portions of existing tests), which is expensive... that many get right might have a delta of 6.3...
...ETS's test developers cloak their work in scientific hocus-pocus and end up deceiving not only us but themselves...
...The company's literature conjures up the image of a testing instrument endowed with the learning and precision of white-coated physicists measuring a rocket's lift-off power...
...I am saying something, though, because I feel that maybe 40 people are responsible for writing items, let's say, for the verbal area, and why should 40 people govern by chance what thousands of youngsters' opportunities might be...
...PHOENIX:IMMORTALITY:: (A) unicorn:cowardice (B) sphinx:mystery (C) salamander:speed (D) ogre:wisdom (E) chimera:stability
Cruise said she would "be more inclined to defend this item if it were a delta 15." The item had been rated somewhat lower (i.e...
...The clumsy waiter spills soup in his lap...
...She feels that if sentence is from a legit source, we could defend." "The legitimate source" tends to be either the American Heritage or Webster's dictionary, depending on which supports the answer ETS has selected...
...In other words, if the item were worded a little differently, more future physicians might be tempted to answer it incorrectly...
...The advice traditionally given to such students is to take the test quickly and without thinking too hard...
...would suggest that matriarchy is a social system & matriarchate a state (& a gov't system...
...An ETS test review doesn't take long...
...The words SECURE and E.T.S...
...JW concluded this comment by drawing a little smiling face...
...When her report was finished, it was stamped E.T...
...But, of course, in taking a test like this, the student has to suppress his sometimes powerful urge to respond according to his own sense of what is right...
...When Cruise described item 26 as a "weak question—trivial," Curley responded in the margin "Poop on you...
...Well—item OK—but this reminds me of the kind of thing we used to test but don't do much now—relates to outside knowledge—myth, lit., etc...
...These are tests, of course, that are made up by professional test-makers, so in a sense what you're doing is applying some kind of quality control...
...He has to remember that the "best" answer—which is what ETS always asks for, even on math and science tests—isn't necessarily a good answer, or even a correct one...
...And they said, Well, this word is also a verb, and it's tested as such in this item...
...of offers and invitations." This usage, and not ETS's, is the proper one...
...This article is adapted from his book, None of the Above: Behind the Myth of Scholastic Aptitude, to be published in May by Houghton Mifflin...
...Just to make sure, for the last few years it has used "an actual member of a minority" (as one ETS employee told me) to read every test before it is published...
...We don't do any tinkering at all." When I asked Worthington-Smith what test-reviewing involved, he said, "You know, normally what anyone else would do with this one particular test...
...(D) I would like a spoon...
...As soon as committee members have completed their busywork, the test is sent to the printer...
...Committee members tiptoe through questions and statistics they don't understand, flattered to have been asked to look at them in the first place, and then help spread the good word about ETS...
...students were supposed to select the lettered choice that is the nearest opposite in meaning to the word in capital letters: 4. BYPASS: (A) enlarge (B) advance (C) copy (D) throw away (E) go through The first reviewer, identified only as "JW," suggested substituting the word "clog" for one of the incorrect choices (called "distractors" in testing jargon), because "perhaps clog would tempt the medicine freaks...
...Check Webster...
...Test assemblers don't like being criticized by test reviewers...
...And he hung up on me...
...Each test item is reviewed to ensure that any word, phrase, or description that may be regarded as biased, sexist, or racist is removed." But the actual "sensitivity review" process is much more cursory and superficial than this description implies...
...Now, the one I got back recently was about a word that my dictionary said is a noun...
...To a criticism of another item he responded, "I had some pause over this, too, but right by dictionary." I'd always thought that ETS item-writers must depend heavily on dictionaries...
...But with the help of my wife I made this out as follows: 2. A customer is seated in a fancy restaurant...
...The company says it has proven statistically that the SAT is fair for all...
...Yes." But committee members don't write questions for the SAT...
...In reviewing item 44, JW wrote, "Looked fine to me but AH Dict...
...However it's tough enuf as is...
...I haven't got the Oxford English unabridged 30-volume thing...
...I tend to be more verbal." In the course of our conversation, Controvillas used the word "criteria" twice as a singular noun and told me that the committee reviews each test "physically" (he also seemed to be uncertain about the meaning of the word "legible...
...Key a bit off, but okay," Cruise wrote in regard to one item...
...I asked...
...Item's OK, really...
...The principal difference between the SAT and a test that cannot be graded by machine is that the SAT leaves no room for more than one correct answer...
...Nor does it distinguish between knowledge and good luck...
...Le client s'exclame: (A) Vous ne pourriez pas faire attention, non...
...The Educational Testing Service, which produces the SAT, encourages this attitude...
...A quality control function." The SAT committee's real duties have more to do with public relations than with test development...
...Our phone conversation had lasted exactly two-and-a-half minutes...
...Revisions are made only grudgingly, even if assembler and reviewer agree that something is wrong...
...The diction in SAT questions is sometimes slightly off in a way that suggests the item writers are testing words they don't actually use...
...Reviewing involves both verbal as well as the math," he said...
...Far fewer than 40 people are involved in writing an SAT, but no matter...
...Two items overlapped, so Whittington put a "12" in the box for Total Representational Items...
...They have no real power, and ETS generally ignores their suggestions...
...Curley didn't share Cruise's peculiar concern...
...She also commented "OK" on the exam's test specifications, "OK" on the subgroup reference items, and "OK" on item review...
...But in fact committee members are largely undistinguished in the measurement field...
...ETS is almost always reluctant to change the wording of test items, or even the order of distractors, because small changes can make big differences in statistics...
...But with some determined digging around, even an outsider can get an idea of what goes on inside ETS's test development office...
...For example, the fourth item in the first section was an antonym problem...
...They always hate to see my comments," says Margaret Fleming, one of the committee's ten members and a deputy superintendent in Cleveland's public school system...
...(B) The soup is delicious...
...Despite ETS's claims to the contrary, its tests are written by people who tend to think in certain predictable ways...
...Tough enuf
The most important statistic that ETS derives from pretests, in terms of building new SATs, is called "delta." Like virtually all ETS statistics, delta sounds more sophisticated than it is...
...JW commented on another: "At pretest I would have urged another compound word or unusual distractor...
...ETS and the board talk publicly about the SAT committee as though it were a sort of psychometric Supreme Court, sitting in thoughtful judgment on every question in the SAT...
...Aren't we willing to say that knowledge of these terms is related to success in college...
...(C) Quel beau service de table...
...But since delta refers to no standard beyond the item itself, it makes no distinction between one body of subject matter and another...
...Assemblers invest a great deal of ego in their tests, and they don't like to be challenged...
...We review exams," he said, "we prepare questions, we take the tests, and that type of thing." "You prepare questions in addition to reviewing tests...
...But the dictionary goes on to explain that decline in this sense "implies courteous refusal esp...
...The item was not changed...
...I asked committee member William Controvillas, a guidance counselor at Farmington High School in Farmington, Connecticut, what these reviews entailed...
...What ETS really wants here is a word like "reject...
...An "assembler" oversees the process, and once the test is completed, this person gives it to two or three colleagues for a review...
...(Or, as ETS inimitably describes it, delta "is the normal deviate of the point above which lies the proportion of the area under the curve equal to the proportion of correct responses to the item.") For all practical purposes, the SAT delta scale runs from about 5.0 to about 19.0...
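The quoted definition can be made concrete with a short sketch. It assumes the conventional scaling in which delta equals 13 plus 4 times the normal deviate; those two constants are an assumption here, since the article gives only the verbal definition and the approximate 5.0 to 19.0 range. The function names are hypothetical and are not anything ETS uses.

```python
from statistics import NormalDist

# Assumed constants (not stated in the article): the delta scale is
# commonly described as centered on 13 with 4 points per standard deviation.
DELTA_MEAN = 13.0
DELTA_SD = 4.0

_STD_NORMAL = NormalDist()  # standard normal curve: mean 0, sd 1


def delta_from_proportion_correct(p: float) -> float:
    """Return the delta of an item answered correctly by a fraction p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    # z is the point above which a fraction p of the normal curve lies,
    # i.e. the "normal deviate" in the quoted definition.
    z = _STD_NORMAL.inv_cdf(1.0 - p)
    return DELTA_MEAN + DELTA_SD * z


def proportion_correct_from_delta(delta: float) -> float:
    """Invert the scaling: recover the fraction correct from a delta."""
    z = (delta - DELTA_MEAN) / DELTA_SD
    return 1.0 - _STD_NORMAL.cdf(z)


if __name__ == "__main__":
    for d in (6.3, 13.2, 15.0, 16.8):
        share = proportion_correct_from_delta(d)
        print(f"delta {d:>4}: about {share:.0%} answer correctly")
```

Under these assumed constants, a delta of 16.8 would correspond to roughly 17 percent of students answering correctly and a delta of 6.3 to roughly 95 percent. The mapping runs in both directions without any other input, which is the article's point: delta restates the percentage correct and nothing more.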
...Now, we have had some showdowns about it...
...It takes a simple piece of known information and restates it in a way that makes it seem pregnant with new significance...
...It leaves no room, in other words, for people who don't see eye-to-eye with ETS...
...ETS's test reviews aren't meant to be seen by the public...
Vol. 17 • April 1985 • No. 3