Find tests, questionnaires, examples of charts, and tips on statistics related to psychology, counseling, social work and other behavioral science fields. I may earn from purchases of advertised or linked products.
Note: Most tests and questionnaires are for research purposes and not for personal assessment. Readers interested in personal evaluations should consult a psychologist or other qualified provider.
What makes a test valid?
“What makes a test valid?” is a tricky question.
One rather obnoxious response is “nothing.”
That is because validity is a property of test scores
rather than tests or, more accurately, of an interpretation of the scores.
But it is
important to take the question seriously when test-takers and users are
wondering how much confidence to place in a test score. As with many aspects of
science, the answers can be simply stated but there is a complicated backstory.
For many, the traditional views of test score validity will
be sufficient. Tests measure constructs. Scientific constructs are ideas with
measurable features, such as reading comprehension, dominance, short-term
memory, and verbal intelligence.
Construct validity is not a single entity but rather the current state of knowledge about how a
test instrument’s scores have functioned in many settings and in relation to
criteria. Construct validity primarily includes findings from studies of content
validity, convergent validity, and discriminant validity.
Content validity is based on judgments from
experts who mostly agree that the test items measure the construct (e.g., marital …).
The other types of validity are based on the concept of
correlations with a criterion. Researchers ask participants to take a specific
test X along with other tests Y and Z. Test X is the test of interest such as a
new math achievement test. Test Y represents other similar tests such as other
math tests. When test X and test Y yield similar scores, we have evidence of convergent
validity.
When test X and test Z yield dissimilar results, such as a weak
relationship between our test X math achievement and test Z vocabulary, we have
evidence of discriminant validity—a math test ought not to measure
vocabulary aside from the minimal vocabulary used in the instructions and word
problems. The relationship between the tests is expressed as a correlation called the
validity coefficient, which will vary whenever a group of people takes
two tests; even the very same people will get different scores on two different
occasions.
Criterion validity compares test scores to some criterion. The relationship between depression test
scores and another measure of depression taken today is called concurrent validity. The relationship between test scores today and
some future measurable performance is predictive
validity—for example, a pre-employment test may be correlated with
supervisor ratings after six months on the job.
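To make the idea of a validity coefficient concrete, here is a minimal sketch in Python. It assumes simulated scores only: the sample size, the noise levels, and the variable names (`test_x`, `test_y`, `test_z`, `later_rating`) are all made up for illustration and do not come from any real test. It computes a Pearson correlation for a convergent, a discriminant, and a predictive comparison.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Simulated (not real) data: every number here is an assumption for illustration.
rng = random.Random(42)
n_people = 500
math_ability = [rng.gauss(0, 1) for _ in range(n_people)]
verbal_ability = [rng.gauss(0, 1) for _ in range(n_people)]

# Test X: the new math test. Test Y: another math test. Test Z: a vocabulary test.
test_x = [a + rng.gauss(0, 0.5) for a in math_ability]
test_y = [a + rng.gauss(0, 0.5) for a in math_ability]
test_z = [v + rng.gauss(0, 0.5) for v in verbal_ability]

# Hypothetical later criterion, such as a rating collected months after testing.
later_rating = [a + rng.gauss(0, 1.0) for a in math_ability]

convergent_r = pearson_r(test_x, test_y)        # similar tests: expect a high r
discriminant_r = pearson_r(test_x, test_z)      # dissimilar tests: expect r near 0
predictive_r = pearson_r(test_x, later_rating)  # test today vs. later criterion

print(f"convergent r (X with Y):     {convergent_r:.2f}")
print(f"discriminant r (X with Z):   {discriminant_r:.2f}")
print(f"predictive r (X with later): {predictive_r:.2f}")
```

Changing the random seed or the sample size changes each coefficient a little, which is the point made above: a validity coefficient describes scores from a particular group on a particular occasion, not a fixed property of the test.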
Aside from content validity, most traditional studies are
looking at the strength of the relationship between one set of test scores and
another.
Factor analysis is a complex correlational procedure that examines the underlying relationships
among test items and how they relate to other test items. For example, a set of
vocabulary items may be correlated with answers to questions about general
knowledge, so the two sets of items may be
grouped as representing an underlying “verbal factor.” These abstract underlying
factors are sometimes called latent
variables or latent traits.
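The grouping idea can be sketched with simulated item scores. The example below is a simplified stand-in, not a full factor analysis: it extracts only the first principal component of the item correlation matrix using power iteration. All data, item counts, and names (`verbal`, `arithmetic`, `items`) are assumptions for illustration. Six "verbal" items (vocabulary and general knowledge) share one simulated latent trait, and two "arithmetic" items share a different one.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Simulated (not real) data. The latent traits are known here only because
# we generate them; in practice they are never observed directly.
rng = random.Random(7)
n_people = 400
verbal = [rng.gauss(0, 1) for _ in range(n_people)]
arithmetic = [rng.gauss(0, 1) for _ in range(n_people)]

# Items 0-2: vocabulary, items 3-5: general knowledge, items 6-7: arithmetic.
items = [[v + rng.gauss(0, 1) for v in verbal] for _ in range(6)]
items += [[a + rng.gauss(0, 1) for a in arithmetic] for _ in range(2)]

k = len(items)
corr = [[pearson_r(items[i], items[j]) for j in range(k)] for i in range(k)]

# Power iteration: repeatedly multiply a vector by the correlation matrix and
# renormalize; it converges to the eigenvector with the largest eigenvalue.
vec = [1.0] * k
for _ in range(200):
    product = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
    norm = math.sqrt(sum(x * x for x in product))
    vec = [x / norm for x in product]

# Largest eigenvalue via the Rayleigh quotient (vec has unit length).
eigenvalue = sum(
    vec[i] * sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)
)
# Component loadings: each item's correlation with the first component.
loadings = [x * math.sqrt(eigenvalue) for x in vec]

print(f"share of total item variance: {eigenvalue / k:.2f}")
for index, loading in enumerate(loadings):
    print(f"item {index}: loading {loading:.2f}")
```

In this sketch the vocabulary and general knowledge items should load together on the first component while the arithmetic items load weakly, which mirrors how items are grouped under a shared latent variable. Dedicated factor analysis software also estimates additional factors, rotations, and unique variances, none of which are shown here.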