Wednesday, July 19, 2017

Understanding the Reliability of Educational and Psychological Tests

Why aren't tests reliable?

The reason tests are not reliable is that reliability is a property of the interpretation of scores, not of the tests themselves.

This isn't a matter of semantics.

Think about it this way.

Give all the students in one school an achievement test. The test items don't change, so they appear stable, consistent, and reliable. However, when publishers report reliability values, they calculate the statistics from scores, and scores vary from one administration to another. If you ever took a test twice and got a different score, you know what I mean. Individuals change from day to day and from year to year. Even a nationally representative sample of students can differ from one year to the next.

Every time we calculate a reliability statistic, the statistic is slightly different.

Reliability values vary with the sample.

Reliability values vary with the method of calculation.

You can get high reliability values using coefficient alpha with scores from a one-time administration. This method is common in research articles. But even within one article, the same research team will often report different alpha values for different samples.
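To make the calculation concrete, here is a minimal Python sketch of coefficient alpha from a single administration. The function name and both small sets of ratings are made up for illustration; they are not data from any study. The two samples take the same four items yet produce different alpha values, which is the point about samples.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """Coefficient (Cronbach's) alpha for one administration.

    item_scores: list of rows, one row per respondent, one column per item.
    """
    k = len(item_scores[0])                        # number of items
    items = list(zip(*item_scores))                # regroup ratings by item
    item_vars = [variance(col) for col in items]   # sample variance of each item
    total_var = variance([sum(row) for row in item_scores])  # variance of totals
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 respondents x 4 items, two different samples.
sample_a = [[5, 4, 5, 4], [3, 3, 4, 3], [4, 4, 4, 5], [2, 3, 2, 2], [5, 5, 4, 5]]
sample_b = [[4, 4, 5, 3], [3, 4, 3, 4], [5, 4, 4, 5], [2, 3, 4, 3], [4, 5, 4, 4]]

print(round(coefficient_alpha(sample_a), 2))
print(round(coefficient_alpha(sample_b), 2))
```

With real data you would want many more respondents than five; the tiny samples here only keep the arithmetic easy to follow.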

If we use a split-half method, which usually calculates reliability from the correlation between two halves of one test, then we can get a reliability value based on one administration. But that's only half a test! Researchers use the Spearman-Brown formula to correct for the shortened half-test, but the result is only an estimate of what the full test's reliability could be.
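The split-half idea can be sketched in a few lines of Python. This sketch assumes an odd-even split of the items, which is only one of many possible ways to halve a test, and the ratings are hypothetical. The Spearman-Brown step is the standard formula for doubling test length: 2r / (1 + r).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split (an assumption): correlate the halves, then step up."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    r_half = pearson_r(odd_half, even_half)
    return r_half, spearman_brown(r_half)


# Hypothetical ratings: 5 respondents x 4 items.
ratings = [[5, 4, 5, 4], [3, 3, 4, 3], [4, 4, 4, 5], [2, 3, 2, 2], [5, 5, 4, 5]]
r_half, r_full = split_half_reliability(ratings)
print(round(r_half, 2), round(r_full, 2))
```

A different split of the same items would usually give a different value, which is one more reason a test does not have a single reliability value.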

There's also a test-retest reliability method. Give a test one time, wait a while (maybe a week or several weeks), then retest. That gives you an estimate of stability. But if you have a good memory, you can score higher the second time on some tests, such as intelligence and achievement tests.
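As a rough illustration, test-retest reliability is usually estimated as the correlation between the two sets of scores. The Python sketch below uses hypothetical scores and a made-up function name. It also reports the average gain on retest, because a correlation alone does not show whether scores rose the second time.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retest_stability(time1, time2):
    """Return (stability coefficient, mean gain from first to second testing)."""
    return pearson_r(time1, time2), mean(time2) - mean(time1)


# Hypothetical scores for six people tested twice, a few weeks apart.
time1 = [98, 105, 110, 92, 120, 101]
time2 = [101, 109, 111, 97, 122, 106]

stability, gain = retest_stability(time1, time2)
print(round(stability, 2), round(gain, 1))
```

In this made-up example everyone scores a little higher on the retest, yet the stability coefficient stays high because people keep roughly the same rank order.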

By now you get the point. Any one test can be associated with many reliability values. The reliability problem is not just about tests. The problem is often a failure to understand that tests do not have one reliability value. As with many things in science, there are many variables to consider when answering a question.

Reputable test publishers include reliability values in their test manuals. Teachers, counselors, psychologists, and other test users ought to know about test score reliability.

Learn more about assessment and statistical concepts in

Applied Statistics: Concepts for Counselors 


Learn more about assessment and statistics at the Applied Statistics website

Learn more about Creating Surveys

Quick Notes on Test Reliability

Reliability is a property of scores, not tests.

Reliability may mean stability of scores over time.

Reliability may mean how consistently test questions measure whatever the test measures.

Test scores that are reliable in one culture will not necessarily be reliable in another culture.

Reliable test scores do not guarantee that the scores are valid, but reliability places a limit on validity.

Statistical concepts of reliability apply to tests, quizzes, polls, and surveys: any set of questions yielding numerical scores.

Note: This is a re-posting of a post to this new blog. 

Links to Connections

My Page


My Books  AMAZON          and             GOOGLE STORE


FOLLOW   FACEBOOK   Geoff W. Sutton   TWITTER  @Geoff.W.Sutton




Articles: Academia   Geoff W Sutton   ResearchGate   Geoffrey W Sutton 



Assessing Spiritual Practices

Portsmouth Cathedral, UK; April 2017, Geoff W. Sutton

In recent years, researchers have characterized religion and spirituality as ways people find meaning in their lives; that is, as meaning-making systems. It isn't surprising that researchers disagree on a way to define religion and spirituality that encompasses all similar activities. Meanwhile, researchers continue to study various dimensions of religion and spirituality. In this post I will introduce a short scale that might be helpful in research and possibly in a clinical setting. Although it was written to assess Christian practices, I will suggest how it could be modified.

One way to think about the components of faith is in three dimensions: beliefs, practices, and experiences. A few years ago, a group of us studied Christian counseling to discover what Christian counselors actually did that was different from other counselors. We wanted to get more specific about the identity of Christian counselors, beyond a simple checklist of their affiliation with some large group such as Catholic or Methodist.

As part of our plan to be more specific about spirituality, we created a few measures. One of the measures is a Spiritual Practices Index, which we referred to as Personal Christian Practices in the article (Sutton, Arnzen, & Kelly, 2016).

The wording of three of the four original items clearly applies to the Christian faith. But wording changes may make the index applicable to other faiths. First, I will provide the original index, and then I will add suggestions for alternate wording.

The full index used in the published article follows. Respondents rate each item on a 5-point scale from 1 = strongly disagree to 5 = strongly agree.

Please tell us a little about your spiritual practices.
  1. I study the Bible every week.
  2. I pray each day.
  3. I attend church almost every week.
  4. I regularly support Christian ministry.
Score the index by adding the four item ratings, which gives a total from 4 to 20. In our study, the mean was 17.51 and the standard deviation was 2.40. The skew was -1.10, which was within the acceptable range (+/- 1.50). Kurtosis was also acceptable (.81).
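The scoring rule is simple enough to sketch in Python. The function names below are hypothetical, and the input checks are my own assumption about how to handle bad data. The mean and standard deviation are the values reported above, used only to show where one respondent's total falls relative to that sample.

```python
STUDY_MEAN = 17.51  # sample mean reported in this post
STUDY_SD = 2.40     # sample standard deviation reported in this post


def score_index(ratings):
    """Add four item ratings (each 1 to 5) into a total from 4 to 20."""
    if len(ratings) != 4:
        raise ValueError("expected ratings for exactly 4 items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be a whole number from 1 to 5")
    return sum(ratings)


def z_score(total):
    """Distance of a total from the reported sample mean, in SD units."""
    return (total - STUDY_MEAN) / STUDY_SD


# Hypothetical respondent: ratings for items 1 through 4.
total = score_index([5, 5, 4, 4])
print(total, round(z_score(total), 2))
```

Because that sample scored near the top of the 4 to 20 range, z-scores based on it may not describe other groups well.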

Coefficient alpha for our study was adequate (.74). We have used this measure in other studies, where the alpha values were as follows: .86 (Studies 1 and 2; Sutton, Kelly, Griffin, Worthington, & Dinwiddie, 2018), .84 (Kelly et al., 2018), and .83 (Sutton & Kelly, 2018).

Validity data indicate significant positive correlations with other aspects of spirituality, which supports the use of the index as a measure of another dimension of the construct of spirituality. Following is a table showing the Pearson correlation coefficients for the relationship between the Spiritual Practices Index and four other measures of spirituality.

[Table: Pearson correlations of the Spiritual Practices Index with Christian Beliefs, Intratextual Fundamentalism Items, Christian Social Values, and the Christian Service Scale; coefficients not shown. * p < .01]


Spiritual Practices Index

A proposed rewording of the Spiritual Practices Index follows. This version has not been used in research, so if you use it as is or in a modified form, please share the results. The items in either index are free to use for research and teaching purposes. Contact me if you want to use it in a commercial product.

I suggest asking respondents to rate each item on a 5-point scale from 1 = strongly disagree to 5 = strongly agree.

Please tell us a little about your spiritual practices.

  1. I study sacred texts every week.
  2. I pray each day.
  3. I attend a place of worship almost every week.
  4. I regularly support ministries related to my faith.

Praying in Shibaozhai, China; 21 May 2015 Geoff W. Sutton

Resource Link:  A – Z Test Index


Kelly, H. L., Sutton, G. W., Hicks, L., Godfrey, A., & Gillihan, C. (2018). Factors predicting the moral appraisal of sexual behavior in Christians. Journal of Psychology and Christianity, 37(2), 162-177.

Sutton, G. W. (2017a). Applied statistics: Concepts for counselors. Springfield, MO: Sunflower. Amazon  Paperback ISBN-10: 1521783926, ISBN-13: 978-1521783924

Sutton, G. W. (2017b). Creating surveys: Evaluating programs and reading research. Springfield, MO: Sunflower. Amazon  Paperback ISBN-10: 1522012729  ISBN-13: 978-1522012726

Sutton, G. W., Arnzen, C., & Kelly, H. (2016). Christian counseling and psychotherapy: Components of clinician spirituality that predict type of Christian intervention. Journal of Psychology and Christianity, 35, 204-214.

Sutton, G. W., & Kelly, H. (2018, April 14). Christian Spirituality and Moral Judgments. Paper presented at the annual meeting of the Christian Association for Psychological Studies, International Conference, Roanoke, VA. [Manuscript in progress]

Sutton, G. W., Kelly, H., Worthington, E. L., Jr., Griffin, B. J., & Dinwiddie, C. (2018). Satisfaction with Christian psychotherapy and well-being: Contributions of hope, personality, and spirituality. Spirituality in Clinical Practice, 5(1), 8-24. doi: 10.1037/scp0000145

Williamson, W. P., Hood, R. W., Jr., Ahmad, A., Sadiq, M., & Hill, P. C. (2010). The intratextual fundamentalism scale: Cross-cultural application, validity evidence, and relationship with religious orientation and the Big 5 factor markers. Mental Health, Religion & Culture, 13, 721-747.

