Thursday, September 21, 2017


What are age scores?

Age scores, also called age-equivalent scores, are supposed to help people understand how a person’s test score compares to other people of the same age. They are often provided to teachers and parents to show how children scored on achievement tests compared to their age peers. A common abbreviation for age-equivalent is AE.

Age scores are reported with a hyphen. The first number refers to age in years and the second number refers to additional months. A score of 8-4 is supposed to mean a test performance typical of children aged 8 years and 4 months.

The scores appear convenient and make a kind of common sense. An age score of 7-6 is supposed to mean that a child earned a test score similar to that of children aged 7 years and 6 months. But there are problems with the scores.

What tests report age scores?

Age scores are commonly reported with results of achievement tests. They are sometimes reported with results of intelligence tests. Old intelligence tests reported a mental-age score (MA).

What’s the problem with age-equivalent scores?

The scores create an inaccurate impression of performance for children who are much younger or older than the age comparison group. An age score of 7-6 obtained by a 5-year-old does not account for all of the knowledge or ability that is typical of a child aged 7-6. Similarly, a 13-year-old may have different skills than those represented by an age score of 7-6, which of course suggests a very low skill level for a 13-year-old.

The reliability of age scores for children whose actual age is much lower or higher than the reported age score differs from the reliability for children whose actual age is close to the age score. Reliability is not a stable characteristic of a score. When a child obtains a very high or very low score, retesting can yield very different results.

Age scores do not allow for an accurate comparison over time because the content of the tests and the abilities of children change dramatically as children age. Reading tests and reading abilities are very different for children aged 7 and those aged 14. Children learn very different math concepts at age 12 than at age 8. To say an 8-year-old has an age score of 12-1 in math is hardly accurate.




Age score differences do not provide parents and teachers with an accurate picture of delays and advancement. A child who is one year behind peers in math at age 8 is further behind than is a child who is one year behind at age 15. The gains children make in reading, math, and other skills are greater in the early years of life than in later years.

Age units are inaccurate compared to other scores. Age scores report differences in months, but child development is uneven, especially when it comes to mental abilities like reading comprehension, spelling, and visual memory. Comparisons do not make a lot of sense for children of the same chronological age who earn different age scores. Two children with the same chronological age of 8 years and 2 months but different reading comprehension age scores of 6-10 and 9-3 have different skill levels. But concluding they are 2 years and 5 months apart on reading comprehension is not reasonable because only a small sample of skills is assessed on tests. Even worse might be the perception that the score difference is somehow fixed. Differences this large will almost always change over time.
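The 2-years-and-5-months figure in the example above is plain month arithmetic on the two age scores. Here is a minimal sketch; the helper names (`age_score_to_months`, `months_to_age_score`) are hypothetical and do not come from any test manual.

```python
def age_score_to_months(score: str) -> int:
    """Convert an age score such as '8-4' (years-months) to total months."""
    years, months = score.split("-")
    return int(years) * 12 + int(months)


def months_to_age_score(total_months: int) -> str:
    """Convert a month count back to the years-months notation."""
    years, months = divmod(total_months, 12)
    return f"{years}-{months}"


# The two reading comprehension age scores from the example above.
lower = age_score_to_months("6-10")   # 82 months
higher = age_score_to_months("9-3")   # 111 months

gap = months_to_age_score(higher - lower)
print(gap)  # 2-5
```

The subtraction is easy, which is part of the problem: the result looks precise even though a month of growth does not mean the same thing at every age.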

There are problems with the samples of children at different ages when the age scores are very different from the chronological age. Consider two children with an actual age of 8-3: one obtains a test age score of 5-6 and the other a score of 11-3. A lot of 8-year-olds can take a test designed for 8-year-olds, but how many 5-year-olds or 11-year-olds take the same test? The problem is having an accurate sample group for comparison purposes.

What scores are better than age-equivalent scores?

Several scores are better than age-equivalent scores. Most children’s achievement tests report standard scores (SS) and national percentile rank (NPR) scores. These scores compare children to others of the same age group.

What about grade scores?

Grade scores have the same problems as do age scores. A grade score is reported as a grade number with a decimal and a second number referring to the month of a school year. A grade score of 5.6 means the sixth month of grade 5. 

You can see my glossary of test and statistical terms at this website:

Sutton, G. W. (2017). Applied statistics: Concepts for counselors. Springfield, MO: Sunflower. Amazon  Paperback ISBN-10: 1521783926, ISBN-13: 978-1521783924

Thinking about conducting a survey? Consider CREATING SURVEYS.



Connections and Links to Resources

My Page

My Books   AMAZON

FACEBOOK   Geoff W. Sutton

TWITTER  @Geoff.W.Sutton

LinkedIN Geoffrey Sutton  PhD

Publications (many free downloads)
     Academia   Geoff W Sutton   (PhD)

     ResearchGate   Geoffrey W Sutton   (PhD)

Wednesday, September 6, 2017

Measuring Spiritual Outcomes in Psychotherapy

The Theistic Spiritual Outcome Survey (TSOS) has potential as a useful outcome measure.

Recently, a group of us completed a study of clients who saw Christian counselors. We assessed their current well-being using two measures: the Schwartz Outcome Scale (SOS) and the Theistic Spiritual Outcome Survey (TSOS). (See references below.)

The TSOS was designed by Richards et al. (2005) as a measure of well-being for people associated with a theistic religion like Christianity, Judaism, or Islam. We used the 17-item version, which uses a 5-point response format from 1 = never to 5 = almost always to rate each item (e.g., “I felt spiritually alive.”).


The only reliability estimate we calculated was coefficient alpha, which was strong at .95.
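For readers who want to see how coefficient alpha is computed, here is a minimal sketch using the standard formula: the number of items divided by one less than that number, times one minus the ratio of summed item variances to the variance of total scores. The function name and the ratings are hypothetical and are not data from our study.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Coefficient alpha for rows of item ratings (one row per person)."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of ratings per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings on a 1-5 scale: 5 people x 4 items (not study data).
ratings = [
    [5, 4, 5, 4],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 1],
]
print(round(coefficient_alpha(ratings), 2))
```

Alpha rises when people who rate one item high also rate the other items high, which is why it is read as a measure of internal consistency.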


The TSOS was significantly correlated with ratings of satisfaction with Christian counseling (.65) and likelihood of returning to Christian counseling (.62).

It was significantly correlated with the SOS measure of general well-being (.84).

Other significant correlations were:

TIPI (a Big 5 measure; Gosling et al., 2003)
  Extraversion .34
  Agreeableness .50
  Neuroticism .51
  Conscientiousness .39

Snyder's Hope Scale .72 (Snyder et al., 1991)

Attachment to God Inventory (Beck & McDonald, 2004)
  Avoidant  -.55
  Anxious   -.40

Religious Practices Index  .41 (See Sutton et al., 2016)

Intratextual Fundamentalism Scale .56 (See Williamson et al., 2010)

Counselors, read more about reliability and validity of test scores in APPLIED STATISTICS: CONCEPTS FOR COUNSELORS

Resource Link:  A – Z Test Index


Beck, R., & McDonald, A. (2004). Attachment to God: The Attachment to God Inventory, tests of working model correspondence, and an exploration of faith group differences. Journal of Psychology and Theology, 32, 92-103. doi:10.1037/t46035-000

Gosling, S. D., Rentfrow, P. J., & Swann, Jr., W. B. (2003). A very brief measure of the big-five personality domains. Journal of Research in Personality, 37, 504-528. doi:10.1016/s0092-6566(03)00046-1

Richards, P. S., Smith, T. B., Schowalter, M., Richard, M., Berrett, M. E., & Hardman, R. K. (2005). Development and validation of the Theistic Spiritual Outcome Survey. Psychotherapy Research, 15, 457-469. doi:10.1080/10503300500091405

Snyder, C. R., Harris, C., Anderson, J. R., Holleran, S. A., Irving, L. M., Sigmon, S. T., Yoshinobu, L., Gibb, J., Langelle, C., & Harney, P. (1991). The will and the ways: Development and validation of an individual-differences measure of hope. Journal of Personality and Social Psychology, 60, 570-585. doi:10.1037/0022-3514.60.4.570

Sutton, G. W., Arnzen, C. A., & Kelly, H. L. (2016). Christian counseling and psychotherapy: Components of clinician spirituality that predict type of Christian intervention. Journal of Psychology and Christianity, 35, 204-214.

Sutton, G. W., Kelly, H., Worthington, E. L. Jr., Griffin, B. J., & Dinwiddie, C. (in press). Satisfaction with Christian psychotherapy and well-being: Contributions of hope, personality, and spirituality. Spirituality in Clinical Practice.

Williamson, W. P., Hood, R. W. Jr., Ahmad, A., Sadiq, M., & Hill, P. C. (2010). The Intratextual Fundamentalism Scale: Cross-cultural application, validity evidence, and relationship with religious orientation and the big 5 factor markers. Mental Health, Religion & Culture, 13, 721-747. doi:10.1080/13674670802643047

Read more about validity of surveys and tests in CREATING SURVEYS




Sunday, September 3, 2017

Reporting Mean or Median

Who would think that a simple statistic like a mean or a median would make a difference?

In large samples involving thousands of people, and when data are normally distributed (close to the shape of a bell curve), the mean and median will be nearly the same. In fact, in a theoretical distribution called the normal curve, the mean, median, and mode are identical and sit in the middle.

But many samples are not normal distributions. Instead, they often contain extreme scores called outliers, or a lot of scores bunched up at high or low levels (skewed). Sadly, even people who understand statistics continue to report the mean as if they are not thinking about their samples.

Suppose you work for a company where the top person earns $300,000 but most folks earn $30,000 to $60,000. Well that $300,000 is gonna skew results and the mean will look much higher than the median.

I ran some fictitious data on a sample of 10 people. Nine earn between $30K and $60K and one earns $300K. The Mean = $67K (standard deviation = $82.58K), but the Median is only $38.5K and the Range = $270K.
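You can try this yourself. Here is a minimal sketch with a hypothetical set of ten salaries that I picked to reproduce the mean, median, and range above. It is not the original data set, so the standard deviation comes out close to, but not exactly, the value reported.

```python
from statistics import mean, median, stdev

# Hypothetical salaries in thousands of dollars (not the original data set).
salaries = [30, 32, 35, 37, 38, 39, 45, 54, 60, 300]

print("Mean:  ", mean(salaries))                   # 67
print("Median:", median(salaries))                 # 38.5
print("Range: ", max(salaries) - min(salaries))    # 270
print("SD:    ", round(stdev(salaries), 2))

# Dropping the single top earner shows how much one outlier moves the mean.
without_top = salaries[:-1]
print("Mean without top earner:", round(mean(without_top), 1))
```

The median barely notices the $300K salary, while the mean is pulled well above what nine of the ten people earn.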

Now those results are fictitious and it is a small sample so it magnifies the differences. But you know some folks are earning over $1,000,000.00 in some companies and lots of folks aren't earning anywhere near that amount.

So who cares? Well salaries make a lot of difference if you are arguing for a raise, considering a change of jobs, voting on budgets in not-for-profit organizations, and more. How motivating is it to give a donation to a company that helps the poor where the CEO pulls down nearly a million bucks a year and you get by on $65K-- or less?

But there's more. Teacher evaluations are usually skewed -- most students give high ratings-- so the median and range are more appropriate than the mean.


Real estate prices can be out-of-whack if you look at the mean price in a city where a few multimillion dollar homes pull the mean to a high level compared to the median price.

I see research papers where the scientists report the average age of people in surveys is 19 and they tell you their sample was from a university. No problem with age 19, but when they report a Mean of 19 and a standard deviation of 5, there is a problem! If you understand standard deviations, you will know why they probably did not have a lot of 14-year-olds in their university!
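Here is a quick check of that reasoning. It is a minimal sketch that assumes, for illustration only, that ages followed a normal curve with the reported mean and standard deviation.

```python
from statistics import NormalDist

# Reported values from the example: mean age 19, standard deviation 5.
ages = NormalDist(mu=19, sigma=5)

# Share of the sample expected to be younger than 14 (one SD below the mean)
# if ages really followed a normal curve.
share_under_14 = ages.cdf(14)
print(f"{share_under_14:.1%}")  # 15.9%
```

About one in six students under age 14 is not believable for a university sample. The more likely story is that a few older students stretched the standard deviation, which is a sign that the ages are skewed and the median would describe them better.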

You can see that knowledgeable folks can play games with a simple statistic.

If you forgot about the meaning of some terms, here's a link to a free glossary.

A simple example

Counselors, teachers, and parents - think about test scores and how they are reported.  Test scores for students at school may be distorted by a few very high scoring or very low scoring students.

"Averages" can be deceiving.

