Assessing sentence comprehension, as opposed to word reading, may reduce the frequency of invalid MMPI-2-RF protocols in forensic evaluations. This is the bottom line of a recently published article in the International Journal of Forensic Mental Health. Below is a summary of the research and findings as well as a translation of this research into practice.

Featured Article | International Journal of Forensic Mental Health | 2017, Vol. 16, No. 3, 239-248

Assessing Reading Ability for Psychological Testing in Forensic Assessments: An Investigation with the WRAT-4 and MMPI-2-RF


Kiera Himsl, Patton State Hospital
Danielle Burchett, Loma Linda University
Anthony M. Tarescavage, California State University
David M. Glassmire, Kent State University


This study examined the association between two measures of WRAT-4 reading ability, Word Reading and Sentence Comprehension, and two well-validated measures of inconsistent responding, MMPI-2-RF Variable Response Inconsistency (VRIN-r) and True Response Inconsistency (TRIN-r), among 136 forensic inpatients (90 men, 46 women). It was hypothesized that WRAT-4 Sentence Comprehension would demonstrate stronger associations with VRIN-r than WRAT-4 Word Reading. It was also hypothesized that there may be a minimal association between Sentence Comprehension and TRIN-r. Although WRAT-4 Word Reading was not significantly correlated with VRIN-r (rs = -.17, p = .07) or TRIN-r (rs = -.10, p = .31), Sentence Comprehension was significantly correlated with VRIN-r (rs = -.27, p = .01). A hierarchical regression predicting VRIN-r scores indicated that WRAT-4 Sentence Comprehension significantly accounted for an additional 5.4% of the variance in VRIN-r scores after accounting for self-reported education level and Word Reading (p = .03). However, Word Reading did not significantly account for any additional variance in VRIN-r after accounting for Education and Sentence Comprehension (incremental R² = .001, p = .74). These results suggest that Sentence Comprehension (rather than Word Reading) should be assessed prior to administering psychological testing, especially in forensic settings.


Keywords: MMPI-2-RF, WRAT-4, reading level, consistency, forensic assessment

Summary of the Research

“One of the reasons for the widespread use of the MMPI [Minnesota Multiphasic Personality Inventory] instruments in forensic evaluations is their utility in measuring an examinee’s test-taking styles. Burchett and Ben-Porath (2010) found that the validity (as measured by external correlates) of MMPI-2-RF substantive scales was lower among examinees instructed to feign than among examinees who took the test under standard instructions. Therefore, forensic clinicians can be more confident in test scores when the reliability and validity of the examinee’s responses have been evaluated formally as part of the assessment” (p. 239).

“Two of the MMPI-2-RF Validity Scales—Variable Response Inconsistency (VRIN-r) and True Response Inconsistency (TRIN-r)—assess for inconsistent responding, which is most relevant to the current investigation. VRIN-r examines variable (i.e., random) responding, whereas TRIN-r assesses fixed responding by indicating whether examinees engage in acquiescent (i.e., fixed true) or counter-acquiescent (i.e., fixed false) responding styles” (p. 239).

“Because the MMPI-2-RF uses a subset of items from the MMPI-2 item pool, issues related to item-level reading comprehension are similar for the two instruments. Schinka and Borum (1993) calculated the grade level reading equivalent of MMPI-2 items using the formula for Flesch-Kincaid grade level, which utilizes word and syllable counts. The Flesch-Kincaid grade level of the MMPI-2 was roughly at the fourth- to fifth-grade reading level, although some scales had grade level estimates at the sixth- to seventh-grade level” (p. 240).
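For readers unfamiliar with the Flesch-Kincaid grade level mentioned above, the following is a minimal sketch of the standard published formula, which combines average sentence length (words per sentence) with average word length (syllables per word). The function name and the example counts are hypothetical and chosen only for illustration; they are not taken from Schinka and Borum (1993) or from any MMPI-2 item.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Standard Flesch-Kincaid grade level from raw text counts.

    The counts are supplied by the caller; this sketch does not
    attempt to count syllables automatically.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts for a single short statement (not a real test item):
# 10 words, 1 sentence, 13 syllables.
grade = flesch_kincaid_grade(total_words=10, total_sentences=1, total_syllables=13)
print(f"Estimated grade level: {grade:.2f}")
```

Because the formula depends only on sentence length and syllable counts, it estimates surface readability and says nothing about whether a particular examinee actually understands the item.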

“Schinka and Borum cautioned that using an examinee’s completed grade level as a proxy for reading ability may result in overestimation of reading ability and concluded that individuals with less than an eighth-grade education may require additional assistance (e.g., synonyms for unrecognized words) due to the possibility of insufficient reading skills” (p. 240).

“Given these cautions, it is important to note that a large portion of adults in the United States have basic (“indicates skills necessary to perform simple and everyday literacy activities”) or below basic (“indicates no more than the most simple and concrete literacy skills”) literacy skills” (p. 240).

“MMPI-2-RF scales assessing important clinical constructs are likely to be less accurate among test protocols that were answered inconsistently and/or not pre-screened for consistency of responding” (p. 240).

Measures used: MMPI-2-RF and WRAT-4

“Archival MMPI-2 and MMPI-2-RF protocols were utilized from a larger data set of inpatients at a state-operated forensic psychiatric hospital […] All participants were primary English speakers” (p. 241).

“TRIN-r is first calculated by counting the raw number of TRIN-r True responses and TRIN-r False responses for each examinee. The raw count of False item pairs is subtracted from the raw count of True item pairs. Higher scores above 50T indicate an overall acquiescent response style, and lower scores indicate a counter-acquiescent response style” (p. 243).
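The raw-score step described in the quotation above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the counts are made up, and the actual TRIN-r item pairs and the conversion from raw difference to T scores come from the test manual and are not reproduced here.

```python
def trin_raw_difference(true_pair_count, false_pair_count):
    """Raw count of True item pairs minus raw count of False item pairs."""
    return true_pair_count - false_pair_count


def response_direction(raw_difference):
    """Label the direction implied by the sign of the raw difference.

    Assumption for illustration: a positive difference points toward
    acquiescent (fixed true) responding and a negative difference toward
    counter-acquiescent (fixed false) responding. Clinical cutoffs are
    applied to T scores, which this sketch does not compute.
    """
    if raw_difference > 0:
        return "acquiescent"
    if raw_difference < 0:
        return "counter-acquiescent"
    return "no fixed direction"


# Hypothetical examinee: 5 True pairs and 2 False pairs.
difference = trin_raw_difference(true_pair_count=5, false_pair_count=2)
print(difference, response_direction(difference))
```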

“To determine whether WRAT-4 performance predicted inconsistent and acquiescent/counter-acquiescent responding above and beyond variance accounted for by self-reported grade level, hierarchical regression analyses were conducted” (p. 243).

In addition, “[a] series of hierarchical regression analyses [was conducted] in which the predictor of self-reported education was entered into the first block, WRAT-4 Word Reading was entered into the second block, and Sentence Comprehension was entered into the third block. This series of analyses was also conducted with education in the first block, Sentence Comprehension in the second block, and Word Reading in the third block” (p. 244).

“In the second analysis in which self-reported education was entered into the first block, Sentence Comprehension into the second block, and Word Reading into the third block [Word Reading did not significantly account for additional variance in VRIN-r]” (p. 244).
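The block-entry logic of a hierarchical regression can be illustrated with a short sketch: predictors are added one block at a time, and the incremental R² of each block is the gain in explained variance over the blocks already entered. The code below is not the authors' analysis. It uses synthetic data generated on the spot, hypothetical variable names, plain (unadjusted) R², and no significance test for the increments.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors, y):
    """Unadjusted R^2 of an OLS fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Return (label, cumulative R^2, incremental R^2) per block, in entry order."""
    results, entered, previous = [], [], 0.0
    for label, column in blocks:
        entered.append(column)
        current = r_squared(entered, y)
        results.append((label, current, current - previous))
        previous = current
    return results


# Synthetic data only; these values have no connection to the study sample.
rng = random.Random(0)
n_cases = 120
education = [rng.gauss(11, 2) for _ in range(n_cases)]
word_reading = [0.5 * e + rng.gauss(0, 1) for e in education]
sentence_comp = [0.5 * e + rng.gauss(0, 1) for e in education]
vrin = [-0.6 * s + rng.gauss(0, 2) for s in sentence_comp]

# Ordering A: education, then Word Reading, then Sentence Comprehension.
order_a = hierarchical_r2(
    [("education", education), ("word_reading", word_reading),
     ("sentence_comprehension", sentence_comp)], vrin)
# Ordering B: education, then Sentence Comprehension, then Word Reading.
order_b = hierarchical_r2(
    [("education", education), ("sentence_comprehension", sentence_comp),
     ("word_reading", word_reading)], vrin)

for name, order in (("A", order_a), ("B", order_b)):
    for label, total, delta in order:
        print(f"order {name}: {label:<24} R2={total:.3f}  delta R2={delta:.3f}")
```

Both orderings end at the same total R² because the final model contains the same three predictors. Only the split of that total across blocks changes, which is why running both orderings shows what each reading measure adds beyond the other.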

“[T]he WRAT-4 Word Reading subtest was not significantly correlated with VRIN-r or final TRIN-r; however, WRAT-4 Word Reading was significantly correlated with TRIN-r True. As expected, the WRAT-4 Sentence Comprehension subtest was significantly correlated with VRIN-r and TRIN-r True, but was not significantly associated with TRIN-r False or final TRIN-r. Both the WRAT-4 Word Reading and Sentence Comprehension subtests were significantly correlated with self-reported education attainment” (p. 244).

“It is important to note that a greater number of participants had Word Reading data than Sentence Comprehension data, which impacted the power for some analyses. Self-reported education was not significantly correlated with VRIN-r, final TRIN-r, TRIN-r True, or TRIN-r False” (p. 244).

“The results indicated that WRAT-4 Word Reading was associated with fixed acquiescent responding, but not with variable responding, whereas Sentence Comprehension was associated with variable responding and fixed acquiescent responding” (p. 245).

Moreover, from the hierarchical regression analyses, “[t]he first block, which consisted of self-reported education, significantly accounted for 4.4% of the adjusted variance in VRIN-r scores. In the second block WRAT-4 Word Reading was entered, which did not significantly account for additional variance in VRIN-r; however, Sentence Comprehension significantly accounted for an additional 5.4% of the variance when entered into the final block” (p. 244).

“[W]ith Sentence Comprehension in the second block, rather than the third, Sentence Comprehension accounted for an additional 5.8% of the variance above and beyond self-reported education” (p. 244).

“The authors found that several MMPI-2-RF overreporting and underreporting scales were affected by random responding and acquiescent responding. Additionally, the authors found that MMPI-2-RF scales of overreporting produced fewer misclassifications when protocols were pre-screened with VRIN-r and TRIN-r before interpreting the overreporting scales” (p. 240).

Translating Research into Practice

“Sentence Comprehension should be assessed prior to administering psychological testing like the MMPI-2-RF, particularly in cases where reading ability is questionable, as it appears to generalize better than Word Reading to the task of completing a self-report inventory” (p. 245).

“Examiners should be mindful that although word reading tasks are substantially more expedient than sentence comprehension tasks, the results of the current study suggest that word reading does not generalize as well as sentence comprehension to the consistent completion of self-report psychological tests” (p. 245).

“[I]n many cases any time saved by using a shorter word reading task might then lead the examiner to conduct lengthy self-report psychological testing that may be invalid in cases where examinees with low reading levels were not adequately screened for reading comprehension” (p. 245).

“[R]eading comprehension issues can impact performance on validity indicators, such as VRIN-r. In cases where reading comprehension is poor, administering the standardized audio version of the MMPI-2-RF is recommended” (p. 246).

Other Interesting Tidbits for Researchers and Clinicians

“Unlike the MMPI instruments and other multi-scale inventories such as the Personality Assessment Inventory, several other psychological tests do not have validity scales to identify if an individual was responding consistently to test items” (p. 245).

In addition, “Limitations of this study include use of a relatively small sample and a lack of experimental control over the time between administration of the WRAT-4 and the MMPI-2-RF, due to the use of archival data” (p. 246).

“[Because] reading ability is generally considered a test of premorbid functioning that is stable across time for adults, the amount of variability in reading ability over time is likely limited” (p. 246).

“[I]t is important for future research to confirm if reading ability as assessed with the WRAT-4 is likely to remain stable for patients with psychotic spectrum diagnoses” (p. 246).

“It is recommended that future researchers consistently administer reading tests to all patients during the same testing session as the MMPI-2-RF. Future research with larger sample sizes would be useful for more fine-grained analyses of subgroups with different reading levels” (p. 246).

“[F]uture research [should] examine the effect of language comprehension issues on validity indicators. Additionally, many individuals may have limited reading abilities in English due to English not being an individual’s primary language, which highlights the importance of assessing reading ability in English prior to testing” (p. 246).

Authored by Ahyun Go

Ahyun Go graduated from John Jay College of Criminal Justice with a BA in Forensic Psychology and a minor in Police Studies. She plans to continue her studies in a forensic psychology MA program in the near future. Her main research interests include cognitive biases and crime investigation.