The Challenges of Basic Concept Assessment/Intervention

On October 18, 2011, Ann E. Boehm, PhD presented: The Challenges of Basic Concept Assessment/Intervention. A multi-step model for assessing and planning treatment for basic concepts was explored. Dr. Boehm presented research-based intervention strategies and checklists to monitor progress. The session also addressed the complexity of direction-following and ways to improve children’s performance.

DELV: Who is the Test for and How is it Useful?

On October 12, 2011, Jill de Villiers, PhD, Peter de Villiers, PhD, and Tom Roeper, PhD, presented: DELV: Who is the Test for and How is it Useful? While DELV addresses dialect issues in language testing, it is appropriate for mainstream English speakers as well. Subtests unique to DELV (e.g., wh-question asking, fast mapping, narrative, quantifiers) complement other assessments and are important indicators for SLPs designing interventions.

When to use CELF Preschool 2 or CELF-4


I am a speech pathologist currently working in a preschool/kindergarten building. I often use the CELF-Preschool 2 or the CELF-4 to evaluate my students’ communication skills. I would like this question directed to the authors of these assessment tools. Since both of these tests cover the 5-6 year age range, which test would they recommend we use at the kindergarten level?

Answer from Elisabeth H. Wiig, PhD:

In general, the CELF Preschool-2 is your best option for children in kindergarten, because the formats in the test are more supportive and child-friendly for young children. This is especially the case if a child is a young five-year-old (e.g., 5:0 through 5:5) who has had little preschool experience and has limited verbal ability. There is more in-depth content coverage for younger children in CELF Preschool-2 than you will find on CELF-4, which covers content for mostly older children (ages 5-8).

Keep in mind that if the children you are testing in Kindergarten are five years old, have enough preschool experience that they are comfortable and familiar with school types of tasks, and express themselves well in social situations, you will be able to obtain accurate test results using the CELF-4. Your choice of assessment really depends on the maturity of the child, previous preschool experiences, social verbal ability, and his or her experiences with standardized assessment tools.

Score Discrepancies on CELF-4


I have an 8-year-3-month-old 2nd grade boy whose overall profile falls between 5 and 6 standard scores, with Formulated Sentences at 8 and Expressive Vocabulary at 7 [on the CELF-4]. Working Memory subtest standard scores are as follows: Number Rep Fwd 6, Number Rep Backward 5, Familiar Sequences 10. This is a huge discrepancy. No inattentive behaviors noted. Any help?

-Beth M.

Dr. Elisabeth Wiig’s Answer:
To begin, take a look at page 121 of the Examiner’s Manual. As you will see, both the Number Repetition subtests and the Familiar Sequences subtest place a heavy demand on attention, concentration, and auditory or verbal working memory. If you examine the test items on the Record Form, you will see that the first 7 items in the Familiar Sequences subtest are relatively easy in comparison to items 8-12. The content includes “familiar sequences” such as the letters of the alphabet and the days of the week, not the long random sequences of numbers in the Number Repetition task. There is a great deal of automaticity in producing those sequences (and they are a closed set!) compared to the Number Repetition subtest. The score discrepancy this student exhibited is a red flag that this child may have working memory issues and that further assessment is warranted. Consult with your school psychologist, who can conduct a more thorough assessment of the student’s memory and attention skills.

You might want to administer the CELF-4 Rapid Automatic Naming subtest. It probes attention, visual working memory, and set shifting. If the boy takes significantly longer to name the color-form combinations, this can serve as validation, since color-form naming requires adequate bilateral temporal-parietal, subcortical, and hippocampal functioning. In other words, significantly impaired performance on that subtest can point to an underlying neuropsychological/neurological deficit involving the attention-working memory and cognitive

Answering Tough Questions About CELF-4 Interpretation

On April 12, 2011, researcher and Pearson author Elisabeth Wiig, Ph.D., answered your questions about CELF-4 interpretation. The recording and a PDF of the slides are available below.

You can download a PDF of the slides here: Answering Tough Questions About CELF-4 Interpretation.

***Please note: we are unable to provide CEUs for watching the recording of this webinar. CEUs were only offered for attending the live event.

How the SCAN-3 Tests Can Be Used

The original SCAN test published in the early 1980s was designed to be a screening test. It soon became clear that the test provided important diagnostic information, and with the subsequent revision it was published as a test of auditory processing disorders, i.e., a diagnostic test.

Standardized scores used in medicine, psychology, education, and speech-language pathology are used for diagnostic purposes. Determining a subject’s performance at a specific level and categorizing that performance as normal or not is precisely how such scores are used in fields such as medicine, where performance more than 2 SD below the mean is considered pathological.
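To make the -2 SD idea concrete, here is a minimal sketch in Python of how a standardized score can be expressed in SD units and compared against a cutoff. The default scale (mean 100, SD 15) and the function names are assumptions chosen for illustration only; they are not taken from the SCAN-3 manual, so always use the mean, SD, and cutoffs given in the manual for the test at hand.

```python
def z_score(score, mean=100.0, sd=15.0):
    """Express a standardized score as a distance from the mean in SD units.

    The defaults (mean 100, SD 15) are an assumed, commonly used scale;
    substitute the values from the relevant test manual.
    """
    return (score - mean) / sd


def below_cutoff(score, mean=100.0, sd=15.0, cutoff_sd=-2.0):
    """Return True when the score falls below the cutoff (here, -2 SD)."""
    return z_score(score, mean, sd) < cutoff_sd


# Hypothetical scores, for illustration only.
for score in (100, 85, 70, 69):
    print(score, z_score(score), below_cutoff(score))
```

On this assumed scale, a score of 70 sits exactly at -2 SD and so is not below the cutoff, while 69 is. The same functions work for subtest scaled scores if you pass that scale's mean and SD instead.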

The current SCAN-3 batteries contain the major tests recommended by position papers published by ASHA and AAA. There are small portions of the most recent versions that may be used as screening tools. Primarily, however, they are diagnostic in nature. While some might argue that tests of auditory processing disorders (APD) are not or should not be diagnostic in nature, the SCAN tests are designed to be so. Conversely, if the SCAN test batteries are not diagnostic, then what tests are available that have better normative data? Professionals familiar with the APD literature and available tests of auditory processing recognize that published norms are not available for a majority of tests currently used. When cut-scores are recommended in the literature, there often is little, if any, information available to the user on how those scores were obtained.

The most recent revisions of the test batteries, SCAN-3 for Children, Tests of Auditory Processing Disorders, and SCAN-3 for Adolescents and Adults, include:

  1. Three screening measures with criterion-referenced cut-off scores;
  2. Four tests of auditory processing used to develop the composite score; and
  3. Three optional tests of auditory processing, including two additional signal-to-noise ratios and a time-compressed sentence test.

In addition, the manual describes administering the Competing Words test under free recall and then directed ear conditions in order to assess higher-order memory/executive functions. The revised test batteries were completely renormed on 775 subjects.

It may be of interest to the readers of this note that Friberg & McNamara (2010) found that SCAN-C and SCAN-A have the highest level of sensitivity and specificity of any auditory processing test or battery.
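For readers less familiar with these two terms, the short Python sketch below shows how sensitivity and specificity are computed from classification counts. The function names and the counts are hypothetical and purely illustrative; they are not data from Friberg & McNamara (2010) or from the SCAN manuals.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of people WITH the disorder whom the test correctly identifies."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of people WITHOUT the disorder whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, for illustration only.
print(sensitivity(true_positives=45, false_negatives=5))
print(specificity(true_negatives=80, false_positives=20))
```

A test with high sensitivity misses few affected individuals, and a test with high specificity rarely flags unaffected ones, which is why both are reported together when diagnostic accuracy is evaluated.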


By Robert W. Keith, Ph.D.

Adjunct Professor
University of Cincinnati – College of Allied Health Sciences
Department of Communications Sciences and Disorders
Professor Emeritus
Department of Otolaryngology
University of Cincinnati College of Medicine



Friberg, J.C. & McNamara, T.L. (2010). Evaluating the reliability and validity of (Central) Auditory Processing Tests: A preliminary investigation. Journal of Educational Audiology, 2.

How to Report and Interpret Extreme Raw Scores

We recently received the following question about the CASL test:

When the Norms Book lists a standard score (SS) associated with a raw score of 0, but the manual guides interpretation differently, which reporting/interpretation strategy do you use?

Although a normative score equivalent is reported in norms tables for scores of 0, best practice would be to follow the recommendations in the manual. Page 73 of the CASL manual, for example, states the following: “If the examinee responds incorrectly to Items 1, 2 and 3, do not administer the test. No normative information can be derived. However, the examiner may wish to describe qualitatively in a report the examinee’s difficulty with the task.”

In addition, page 88 in the CASL manual deals with extreme raw scores. Essentially, raw scores that are 0 or “nearly perfect” should be interpreted with great caution.

From a psychometric perspective alone, it’s important to know that an associated SS is possible for raw scores of zero. In the CASL norms tables, zeros complete the range of possible raw scores. However, from an interpretive perspective, even though an associated score is mathematically and statistically possible, the examiner must consider the usefulness or meaningfulness of a score of zero. Caution is always recommended when attempting to interpret a score of zero on any assessment.

School districts may want to see a score, but if that score is meaningless, the examiner must consider the implication for the examinee of a misinterpretation or misuse of that score.

In short, we recommend that you follow the manual’s directive regarding raw scores of zero, and do not report the SS for a raw score of 0.

Other Than SLPs, Who Is Qualified To Administer Speech-Language Tests?

Question, from Virginia A.:
As a new SLP, I am curious as to which other professionals are qualified to administer speech-language tests. I recently came across a report in which a neuropsychologist had administered the CELF-4 and this surprised me. I asked CSHA about it and the basic reply was: “It depends on what it says in the test manual”. The CELF-4 manual says: “You should have experience or training in administering, scoring and interpreting results of standardized tests before attempting to administer or interpret the CELF-4. You should also have experience or training in testing children, adolescents, and young adults whose ages, linguistic abilities and cultural backgrounds, and clinical history are similar to those of the students you plan to assess with CELF-4.” Seems very wide open, to me. It also suggests that an SLP might then be considered qualified to administer and interpret results of a test like the Wechsler Intelligence Scale for Children or the Kaufman, depending on what those manuals say? Any clarity or direction you can provide would be appreciated.
Answer:
The primary users of CELF-4 are licensed/certified SLPs, just as the primary users of WISC-IV are licensed/certified psychologists.
That being said, there is a very small minority of researchers and/or practitioners in related fields who use assessments published by Pearson. Practitioners who might be interested in CELF-4 may include neuropsychologists, psychologists, linguists, educational diagnosticians, and special educators who conduct research in the area of language development and assessment. Any practitioner who is not a speech-language pathologist and wants to order any of the diagnostic language tests is referred to our qualifications team, which determines whether that person is qualified to purchase the test. CELF-4 is a B-level product, requiring a Master’s degree “in a field closely related to the intended use of the instrument, and formal training in the ethical administration, scoring, and interpretation of clinical assessments.” The qualifications team makes that determination based on the person’s background and training. Practitioners in approved “related fields” who are active members of their professional organization are bound by their profession’s code of ethics for determining if they are qualified to administer, score, and interpret the test.
Just as a neuropsychologist or psychologist would have to be “cleared to purchase” CELF-4 depending on their background and training, the Wechsler Memory Scale is approved for purchase by SLPs conducting research or working primarily with adult clients with memory issues. Qualifications, training, and test use are evaluated on an individual basis.

Practice Effects in Testing

A recent question came to us from a colleague in Pennsylvania.

Q: What is current thinking on practice effects with standardized testing? How often is it ok to repeat tests like the PPVT-4, the CASL, or the CELF-4?

A: It depends (sorry, we know it’s easier to have black and white answers!). Most tests should have information in the manual about a test-retest reliability study conducted during development, that is, a study of how stable an individual’s performance is over time. To help determine the risk of “practice effects” (i.e., low test-retest reliability), you need to consider the domain being measured, what research has been done to show the impact of practice effects between administrations, and the circumstances of your original administration.

As an example, in the PPVT-4 manual, pages 55-57, there is a description of the test-retest study completed during standardization. The window of time between administrations was a minimum of 14 days and averaged four weeks. In this 300+ person study, the reliability of the scores averaged a very high .93, which means that the PPVT-4 is quite resistant to practice effects given that window of time between administrations. Other tests will have different retest windows and should give guidance on a recommended “wait time” between administrations, with the usual caveats. Certainly, CELF-4 and CASL include this information in their manuals as well.
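As a rough illustration of what a test-retest coefficient summarizes, the Python sketch below computes a Pearson correlation between two administrations, along with the average score gain. The scores and function names are hypothetical; they are not PPVT-4 data, and a manual's reported coefficient may be computed or corrected differently.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between first and second administration scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    covariance = sum((a - mean_first) * (b - mean_second)
                     for a, b in zip(first, second))
    var_first = sum((a - mean_first) ** 2 for a in first)
    var_second = sum((b - mean_second) ** 2 for b in second)
    # Undefined (division by zero) if every score in one list is identical.
    return covariance / sqrt(var_first * var_second)


def mean_gain(first, second):
    """Average change from the first to the second administration."""
    return sum(b - a for a, b in zip(first, second)) / len(first)


# Hypothetical standard scores for five examinees, for illustration only.
first_scores = [85, 92, 100, 108, 115]
second_scores = [88, 95, 103, 111, 118]

print(round(pearson_r(first_scores, second_scores), 2))
print(mean_gain(first_scores, second_scores))
```

In this made-up example every examinee gains 3 points, so the correlation is 1.0 even though scores rose. It can therefore help to look at both the reliability coefficient and the average gain a manual reports when judging how much retesting might inflate a score.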

Another option to consider is the use of parallel forms, where available. In the case of PPVT-4 or EVT-2, for example, a second parallel form exists and you may choose to use the alternate form (i.e., a completely different but similar in difficulty item set) for your next test administration. This is one of the benefits of having two (or more) forms of a test.

Finally, for those tests without parallel forms for children, one set of guidelines might be that you allow enough time to elapse so that:

1. the examinee is now in the next norm group (e.g., a 3-, 6-, or 12-month interval, depending on the content and the norms),

2. the examinee no longer remembers the test items, OR

3. the examinee appears to have made progress (otherwise, why test?)

A final consideration is that if the examinee was sick or had other reasons for not participating in the original administration, you can probably feel confident testing again as soon as the individual feels better.

Hope this helps…as always, it’s a somewhat nuanced answer depending on the situation, examinee, and test. The best advice is to consult the test manual for direction. Feel free to continue the conversation with your comments below!

Using PLS-4 With Families Who Speak Multiple Languages

Question, from Terri W.:
Our SLPs have been having discussions about using the PLS-4 with families who speak more than one language. If a family is reporting that they speak both English and Arabic at home and that the child understands both languages, can you report the standard score and percentile or should you just be reporting the raw score? We have reviewed the manual and cannot find the information.
Answer:
3.5% of the PLS-4 standardization sample spoke a language in addition to English (see Table 6.13 in the Examiner’s Manual). The standardization sample included children who “could speak and understand English and were able to take the test in the standard fashion without modification.” [page 175, Examiner’s Manual]
If the child you are testing responds well in the test environment and understands and speaks English well enough to take the test in the standard fashion without modification, you can use the standard scores.
If the child you are testing is unfamiliar with and uncomfortable with participating in the test tasks with an unfamiliar adult or lacks proficiency in English, you should try alternative testing strategies (e.g., dynamic assessment; language sampling) or describe the skills the child was able and unable to do in the PLS-4 test session. Raw scores provide no information and should not be reported.