Scoring System of the TOEFL

Computer Adaptation, Guessing, and the TOEFL iBT

Students taking the TOEFL iBT can optimize their scores if they understand a few important points about the test's administration. Unlike many of the standardized tests used by universities in English-speaking countries, the iBT is not computer adaptive. The difficulty of the questions can vary somewhat, and this variation can be a factor in the conversion of raw scores to sectional scores, but the exercises are selected before the test begins, and the answers given by an individual test-taker have no effect on which questions appear later. An incorrect answer therefore has no consequences beyond the question itself, and there is no reason for students to obsess over a single question. Test-takers should also remember that it is in their best interest to guess on multiple-choice questions when they are unsure of the answers, because the scores for these sections are based entirely on correct answers and no points are deducted for incorrect ones. Even a completely random guess has some chance of success.
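The arithmetic behind the advice to guess can be sketched in a few lines of Python. This is only an illustration: the four answer choices and the one-third penalty used for contrast are assumptions made for the example, and the function name is hypothetical. The iBT itself, as described above, deducts nothing for wrong answers.

```python
from fractions import Fraction


def expected_gain_per_guess(num_choices=4, penalty=Fraction(0)):
    """Expected raw points from one random guess on a one-point question.

    num_choices: number of answer options (4 is an assumption for illustration).
    penalty: points deducted for a wrong answer (0 when nothing is deducted).
    """
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    p_correct = Fraction(1, num_choices)
    # Gain one point with probability p_correct, lose `penalty` otherwise.
    return p_correct * 1 - (1 - p_correct) * penalty


# No deduction for wrong answers: a guess is worth more than a blank (0 points).
no_penalty = expected_gain_per_guess(4)

# Hypothetical test that deducts 1/3 point per wrong answer, shown for contrast.
with_penalty = expected_gain_per_guess(4, Fraction(1, 3))

print(no_penalty)    # 1/4
print(with_penalty)  # 0
```

With no deduction, the expected gain from a guess is always positive, so leaving a question blank can never be the better choice.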

Speaking Section Scoring Rubric

Two sections of the TOEFL iBT are evaluated by human graders: speaking and writing. Each of the speaking tasks is assessed according to an ETS scoring rubric, on a scale of 0 to 4. There are separate rubrics for the independent and integrated tasks, but the general description of each scoring level is very similar for both. If a test-taker receives the highest task rating of 4, his or her response "fulfills the demands of the task, with at most minor lapses of completeness." A score of 3 is associated with responses that "address the task appropriately, but may fall short of being fully developed." Responses that earn a score of 2 are "connected to the task," but they are "missing some relevant information" and "contain inaccuracies." Responses scored with a 1 are "limited in content or coherence" and are "only minimally connected to the task." A score of 0 is given only if the response is "unrelated to the topic" or if the student offers no response at all. ETS graders focus primarily on the effectiveness of responses, and students will not be significantly penalized for pronunciation mistakes or accents unless these affect the coherence of the responses. The average of all task ratings is converted to the final sectional score (0 to 30).
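The final step, averaging the task ratings and converting the result to the 0 to 30 scale, can be approximated as follows. This is a rough sketch, not the official method: the text above does not give the conversion, so the simple proportional scaling and rounding used here are assumptions, and the function name and the sample ratings are hypothetical.

```python
def estimate_speaking_score(task_ratings):
    """Estimate a 0-30 speaking score from task ratings on the 0-4 rubric.

    Assumption: a proportional conversion (average * 30 / 4), rounded to the
    nearest whole number. The official conversion may differ.
    """
    if not task_ratings:
        raise ValueError("at least one task rating is required")
    for rating in task_ratings:
        if not 0 <= rating <= 4:
            raise ValueError("each rating must be between 0 and 4")
    average = sum(task_ratings) / len(task_ratings)
    return round(average * 30 / 4)


# Hypothetical ratings for four speaking tasks.
print(estimate_speaking_score([3, 3, 4, 3]))  # 24
```

The estimate is useful for practice purposes only; an official score report is the sole authoritative source for a sectional score.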

Integrated Writing Scoring Rubric

Essays on the iBT's writing section are graded with distinct rubrics for the independent and integrated writing tasks, both on a scale of 0 to 5. Integrated essays, which require students to synthesize content from readings and lectures, will earn the top score of 5 if they "successfully select the important information" and "coherently and accurately present this information." An integrated essay that receives a score of 4 is "generally good" at meeting the above criteria, but it may have "minor omissions, vagueness, or imprecision of content." A level 3 essay contains "some important information" with "some relevant connections," but it leaves out key points and includes frequent errors. Level 2 essays "misrepresent" key lecture and reading points and/or "obscure" connections through poor use of language. An essay that provides "little or no meaningful or relevant content" is given a score of 1, and test-takers who "merely copy sentences from the reading" or demonstrate an extremely low level of language usage will receive a score of 0.

Independent Writing Scoring Rubric

The independent task on the iBT's writing section is based on the expression of personal opinions and preferences, and a different set of evaluation criteria is therefore necessary. The best essays will receive the highest score of 5, and they are characterized by "effectively addressing the topic and task," strong "development and organization," "unity, progression, and coherence," and "consistent facility in the use of language." A score of 4 is reserved for essays that meet most of the preceding conditions, but with "occasional redundancy, digression, or unclear connections" and "minor errors in structure, word form, or use of idiomatic language." A level 3 essay includes "somewhat developed explanations," but "connection of ideas may be occasionally obscured." A level 2 essay is characterized by "limited development," "inadequate organization," and "insufficient explanations," while a level 1 essay is "seriously flawed" by "disorganization," "underdevelopment," and "little or no detail." As with the rubric for integrated writing, a score of 0 is given to essays that are thoroughly non-responsive. On both iBT essays, human grading is supplemented by e-rater automated scoring technology, which evaluates certain linguistic features. All ratings for essays are averaged and then converted to the final sectional score (0 to 30).
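The pooling of essay ratings described above can be illustrated with a short sketch. It rests on several assumptions that go beyond the text: every rating (human or automated) is weighted equally, the conversion from the 0 to 5 rubric to the 0 to 30 scale is proportional, and the result is rounded to a whole number. The function name and the sample ratings are hypothetical.

```python
def estimate_writing_score(ratings_by_essay):
    """Estimate a 0-30 writing score from ratings on the 0-5 rubric.

    ratings_by_essay maps an essay label to the list of ratings it received.
    Assumptions: all ratings count equally, and the conversion is
    proportional (average * 30 / 5). The official conversion may differ.
    """
    all_ratings = [r for ratings in ratings_by_essay.values() for r in ratings]
    if not all_ratings:
        raise ValueError("at least one rating is required")
    if any(not 0 <= r <= 5 for r in all_ratings):
        raise ValueError("each rating must be between 0 and 5")
    average = sum(all_ratings) / len(all_ratings)
    return round(average * 30 / 5)


# Hypothetical ratings: two per essay (for example, one human and one automated).
sample = {"integrated": [4, 4], "independent": [3, 5]}
print(estimate_writing_score(sample))  # 24
```

As with any unofficial estimate, the figure is a study aid rather than a prediction of the reported score.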

Score Reporting for the Speaking and Writing Sections: Comments

TOEFL iBT score reports include general comments about the typical performance of test-takers at the appropriate score level. These comments are not specific to the essays or spoken responses provided by the individual student, and they merely indicate common strengths and weaknesses associated with the reported score.