Wednesday, July 19, 2017

Understanding the Reliability of Educational and Psychological Tests


Why aren't tests reliable?

Tests are not reliable because reliability is a property of the interpretation of scores, not of the tests themselves.
 

This isn't a matter of semantics.

Think about it this way.

Give all the students in one school an achievement test. The test items don't change, so they appear stable, consistent, and reliable. However, when publishers report reliability values, they calculate the statistics from scores, and scores vary from one administration to another. If you ever took a test twice and got a different score, you know what I mean. Individuals change from day to day and from year to year. Even a nationally representative sample of students can differ from one year to the next.


Every time we calculate a reliability statistic from a new set of scores, the value is likely to be slightly different.

Reliability values vary with the sample.

Reliability values vary with the method of calculation.

You can get high reliability values using coefficient alpha with scores from a one-time administration. This method is common in research articles. But the same research team will often report different alpha values for different samples within a single article.
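To make the point concrete, here is a minimal sketch of coefficient alpha calculated for two samples that answered the same four items. The function name `cronbach_alpha` and both data sets are made up for illustration; they do not come from any published test.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical scores: 5 respondents x 4 items, same items in both samples
sample_a = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
]
sample_b = [
    [3, 4, 3, 4],
    [2, 3, 3, 2],
    [4, 3, 4, 4],
    [3, 3, 2, 3],
    [4, 4, 3, 3],
]

alpha_a = cronbach_alpha(sample_a)
alpha_b = cronbach_alpha(sample_b)
print(f"alpha, sample A: {alpha_a:.2f}")
print(f"alpha, sample B: {alpha_b:.2f}")
```

The items are identical, yet the two samples produce different alpha values. Sample B has a narrower spread of total scores, which tends to lower alpha.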


If we use a split-half method, which usually calculates reliability from the correlation between two halves of one test, then we can get a reliability value based on one administration. But each half is only half a test! Researchers use the Spearman-Brown formula to correct for the shortened length, but the result is only an estimate of what the reliability of the full test would be.
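The split-half idea can be sketched in a few lines. This example assumes an odd-even split of the items, which is only one of many possible ways to divide a test. The helper names and the right/wrong item scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data: 5 respondents x 6 items scored 1 (right) or 0 (wrong)
rows = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1],
]

odd_half = [sum(r[0::2]) for r in rows]    # items 1, 3, 5
even_half = [sum(r[1::2]) for r in rows]   # items 2, 4, 6

r_half = pearson(odd_half, even_half)
r_full = spearman_brown(r_half)
print(f"half-test correlation: {r_half:.2f}")
print(f"Spearman-Brown estimate for the full test: {r_full:.2f}")
```

A different split, such as first half versus second half, would generally give a different correlation. That is one more reason a single test can carry several reliability values.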


There's also a test-retest reliability method. Give a test one time, wait a while (maybe a week or several weeks), then retest. That gives you an estimate of stability. But on some tests, such as intelligence and achievement tests, people with a good memory can score higher the second time simply because they remember the items.
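Here is a rough sketch of a test-retest calculation. The two lists of scores are invented for illustration, and the second administration was deliberately made a few points higher to mimic a practice effect.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical scores for the same six people, tested a few weeks apart
time_1 = [85, 92, 100, 108, 115, 97]
time_2 = [89, 95, 106, 110, 121, 99]

retest_r = pearson(time_1, time_2)
mean_gain = sum(b - a for a, b in zip(time_1, time_2)) / len(time_1)

print(f"test-retest correlation: {retest_r:.2f}")
print(f"average gain on retest: {mean_gain:.1f} points")
```

The correlation describes how well people keep their rank order from one administration to the next. It can be high even when everyone scores higher the second time, so it helps to look at the average gain as well.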


By now you get the point. Any one test can be associated with many reliability values. The reliability problem is not just about tests. The problem is often a failure to understand that tests do not have one reliability value. As with many things in science, there are many variables to consider when answering a question.

Reputable test publishers include reliability values in their test manuals. Teachers, counselors, psychologists, and other test users ought to know about test score reliability.





Quick Notes on Test Reliability

Reliability is a property of scores, not tests.

Reliability may mean stability of scores over time.

Reliability may mean how consistently test questions measure whatever the test measures.

Reliable test scores in one culture do not mean they will be reliable in another culture.

Reliable test scores do not guarantee that the scores are valid, but reliability places a limit on validity.

Reliability concepts apply to tests, quizzes, polls, and surveys: any set of questions that yields numerical scores.
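The note that reliability places a limit on validity can be written as a formula. In classical test theory, the correlation between test scores and a criterion cannot exceed the square root of the product of their reliabilities. The symbols here are my own labels: r_XY is the validity coefficient, r_XX' is the reliability of the test scores, and r_YY' is the reliability of the criterion scores.

```latex
r_{XY} \le \sqrt{r_{XX'} \, r_{YY'}}

% Example with a perfectly reliable criterion (r_{YY'} = 1):
% if r_{XX'} = 0.64, then r_{XY} \le \sqrt{0.64 \times 1} = 0.80

% Example with a less reliable criterion (r_{YY'} = 0.81):
% if r_{XX'} = 0.64, then r_{XY} \le \sqrt{0.64 \times 0.81} = 0.72
```

In other words, scores with a reliability of .64 cannot correlate higher than .80 with anything else, no matter how well the test was designed.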



Note: This is a re-posting of a post to this new blog. 
