Showing posts with label Standard Error of Measurement. Show all posts

Saturday, October 30, 2021

Average Intelligence


The concept of average intelligence can be difficult to appreciate because its two words, average and intelligence, are often left undefined.


To psychologists and counselors who administer intelligence tests, a person who scores at the 50th percentile has average intelligence, as defined by the number of correct answers to test tasks compared to others in the same age group.

Many tests set the middle score at 100. Thus, on those tests, a score of 100 = average intelligence.

All test scores vary from time to time, so a person may earn more or fewer points on another day. This fluctuation is estimated and can be, for example, plus or minus 3 to 5 IQ points, depending on the test and age group.

If you retake the test in a month or so, you may score better because of the “practice effect”: you’ve seen the items recently, so you will probably do better.

Examiners do not focus only on the obtained score but consider a broader average range. For example, some may consider 90 to 110 as average. Others use a statistic called the standard deviation, which is often 15 points on an IQ test. If a clinician uses a standard deviation of 15 points, then the average range of intelligence scores = 85 to 115 (that is, plus or minus 15 points from 100). Statistically, about 68% of people earn scores in this broad average range. Thus, most people in a given age group and the same population will have an IQ score in this broad average range.
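The 68% figure can be checked with a short calculation. The Python sketch below is only an illustration and assumes that scores follow a normal curve with a mean of 100 and a standard deviation of 15; the function name share_between is made up for this example.

```python
from statistics import NormalDist

# Assumption: IQ scores follow a normal curve with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)

def share_between(low, high, dist=iq):
    """Proportion of the population expected to score between low and high."""
    return dist.cdf(high) - dist.cdf(low)

broad_average = share_between(85, 115)   # plus or minus 1 SD
narrow_average = share_between(90, 110)  # the narrower range some examiners use

print(f"85 to 115: {broad_average:.1%}")
print(f"90 to 110: {narrow_average:.1%}")
```

Under these assumptions, the 85 to 115 range covers about 68% of people, while the narrower 90 to 110 range covers roughly half.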

By this definition, people of above average intelligence earn scores above 115. In the US, schools have often considered scores of 130 or higher as gifted, although other tests and reports are also considered. Likewise, people who score below 85 are considered below average in intelligence. Depending on their other abilities, they may need assistance with schoolwork or work tasks. People with high or low scores differ from one another, so broad statements can be misleading.


There are different theories of intelligence and tests have been constructed based on a few of the theories. Clinicians should be able to tell you basic facts about the test you or your child/loved one took. For the most part, the best tests ask examinees to answer a variety of questions and solve different types of problems. Thus, the best tests sample a variety of problem-solving tasks and average the scores for the different types of tasks.

For example, the ability to define words is one common measure of verbal intelligence. Over many years of testing, examiners have learned what people in different age groups typically know.

An example of performance intelligence is solving puzzles using blocks with different designs, which can be arranged to match pictures on a card. This ability increases considerably from preschool to adulthood.

There are other types of intelligence like emotional intelligence and social intelligence. Clinicians have developed tests to measure these skills too.

In a sense, intelligence is what is measured by intelligence tests. That definition is circular, but it does give a sense of what people know how to do compared to their age peers.

In addition, when abilities decline due to disease or head injury, knowing what is average for a person of a given age can be helpful in understanding the loss and marking recovery or further decline.

As a matter of context, clinicians usually administer other tests and conduct an interview to avoid interpreting test scores out of context.

Average intelligence is, therefore, a middle range of abilities compared to other people of the same age who have taken the same test.


Learn more about test and other statistics in

Applied Statistics for Counselors


Tuesday, April 21, 2020

Measurement Error Standard Error of Measurement

In testing, measurement error usually refers to the fact that the same people can obtain different scores on the same test at different times. In a broad sense, measurement error can also refer to how accurately a test identifies a condition, which is discussed as test validity.

Recall that test score reliability is a necessary but insufficient condition for test score validity.

Many tests in psychology, medicine, and education are useful. The reliability of the scores will vary depending on such factors as the properties of the test itself as well as how well the user follows standard procedures in administering the test, environmental factors that can affect the scores, and factors within the person taking the test.

The scores on many tests conform to the pattern called the normal curve or bell curve. In classical test theory, the scores people obtain on tests are simply called obtained scores (symbol X). Statisticians use the variation in obtained scores to estimate a "true score" (symbol T). Variations of obtained scores around the theoretical true score indicate error because a perfectly reliable test ought to yield the same score every time it is used. The deviations of the obtained scores from the true score are referred to as error (symbol E). In a formula, X = T + E.

Theoretically, the reliability of test scores is the ratio of the variance of the true scores to the variance of the obtained scores. A perfectly reliable test would yield a reliability value of rxx = 1.0. In reality, most of the better tests yield average reliability values above .90. Test publishers are obligated by professional ethics to include reliability values in their test manuals.
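A small simulation can make the variance ratio concrete. The Python sketch below is an illustration only: the number of people, the true score spread of 15 points, and the error spread of 4 points are all assumed values, and real true scores are never observed directly.

```python
import random
from statistics import variance

random.seed(42)  # fixed seed so the sketch is repeatable

N_PEOPLE = 10_000
TRUE_SD = 15   # assumed spread of true scores (T)
ERROR_SD = 4   # assumed spread of measurement error (E)

true_scores = [random.gauss(100, TRUE_SD) for _ in range(N_PEOPLE)]
errors = [random.gauss(0, ERROR_SD) for _ in range(N_PEOPLE)]
obtained_scores = [t + e for t, e in zip(true_scores, errors)]  # X = T + E

# Reliability as true-score variance divided by obtained-score variance.
reliability = variance(true_scores) / variance(obtained_scores)

# With independent error, var(X) = var(T) + var(E), which gives the expected ratio.
expected = TRUE_SD**2 / (TRUE_SD**2 + ERROR_SD**2)

print(f"simulated rxx = {reliability:.3f}, theoretical rxx = {expected:.3f}")
```

With these assumed values the theoretical ratio is 225 / 241, or about .93, and the simulated value should land close to it.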

Studies of score patterns allow statisticians to calculate the average variability of score error. Thus, for any given published test, there ought to be a statistic known as the Standard Error of Measurement, which is abbreviated as SEM.
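In classical test theory, the SEM is commonly computed from the standard deviation of the scores and the reliability value as SEM = SD × √(1 − rxx). The Python sketch below applies that formula; the SD of 15 and the reliability of .93 are hypothetical values chosen for illustration, not figures from any particular test manual.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - rxx), the classical test theory formula."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical test: SD of 15 points and score reliability of .93.
sem = standard_error_of_measurement(sd=15, reliability=0.93)
print(f"SEM = {sem:.2f} points")
```

Note how the formula behaves at the extremes: perfectly reliable scores (rxx = 1.0) give an SEM of 0, and lower reliability gives a larger SEM.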

Once the SEM for a test is known from large-scale studies, users can use that value to estimate how the scores of test takers might vary if they were to take the same test again under similar conditions. The estimates are based on the properties of the normal curve. Thus, the test must yield scores that conform to the normal score pattern to use an SEM based on this model.

For example, suppose a student obtains an IQ score of 100 on a test with an SEM of 4. On future administrations of the same test, the student would likely score between 96 and 104 (100 plus or minus 1 SEM) about 68% of the time.

The process of forming a range of values around the obtained score should remind users and test takers that scores are not fixed properties. Scores vary and they tend to vary in a "standard" pattern. In this theory, the error variance has been standardized. Clearly, a user who wanted to be careful could use 2 SEMs, which would then allow a range of plus and minus 8 points. In the example, the IQ could range between 92 and 108, a band that covers about 95% of the expected scores under the normal curve model.
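The score bands in this example can be reproduced with a few lines of Python. This is a minimal sketch under the normal curve model described above; the function names score_band and band_coverage are made up for this illustration.

```python
from statistics import NormalDist

def score_band(obtained, sem, n_sems=1):
    """Return (low, high) for a band of n_sems SEMs around an obtained score."""
    margin = n_sems * sem
    return obtained - margin, obtained + margin

def band_coverage(n_sems):
    """Share of a normal curve within n_sems standard units of its center."""
    z = NormalDist()  # standard normal curve: mean 0, SD 1
    return z.cdf(n_sems) - z.cdf(-n_sems)

# Obtained score of 100 and SEM of 4, as in the example above.
for n in (1, 2):
    low, high = score_band(obtained=100, sem=4, n_sems=n)
    print(f"{n} SEM: {low} to {high} (about {band_coverage(n):.0%})")
```

The wider band trades precision for confidence: 1 SEM gives 96 to 104 at about 68%, and 2 SEMs give 92 to 108 at about 95%.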

It is important to keep in mind that tests are neither reliable nor unreliable because reliability is a property of scores, not tests. Thus, it is incorrect to refer to a test as reliable or unreliable. We can speak about the degree of reliability of the scores.

There are other theories about testing and reliability.

For the concept of how accurately a test identifies a criterion, see the discussion of validity.

Cite this Blog Post

Sutton, G.W. (2020, April 21). Measurement error standard error of measurement. Assessment, Statistics, & Research. /2020/04/measurement-error-standard-error-of.html


Interfaith Spirituality Scale

  Assessment name:   Interfaith Spirituality Scale Scale overview: The Interfaith Spirituality Scale is a self-report rating scale that m...