Showing posts with label mode. Show all posts

Sunday, March 28, 2021

Skewed Distributions

 

Skewed Distributions*



Skewed distributions have one tail that is longer than the other, unlike the "normal" distribution, which is perfectly symmetrical. Skew affects the location of two central values, the mean and the median.

Positive Skew

Below is an image of positive skew, which is also called right skew. Skew is named for the "tail." If you took a statistics course, you may have heard a professor say, "the tail tells the tale." The tail is the extended part of the distribution close to the horizontal axis.

The large "hump" area to the left represents the location of most data. In behavioural science, the high part often refers to the location of most of the scores. Thus, in positively skewed distributions, most of the participants earned low scores and few obtained high scores as you can see by the low level of the curve, or the tail, to the right.



Negative Skew

As you might expect, negatively skewed distributions have the long tail on the left; thus, they are also called left-skewed distributions. A negatively skewed distribution of test scores, where most students earn high scores, illustrates an easy test--just what students want. Teachers used to talk about grading on a curve. You can see that such grading could be good or bad for students depending on what curve the teacher uses.



Skewed distributions are nonnormal by definition. 

Recall that in the normal curve, the mean, median, and mode are all at the same point in the middle of the distribution. The value of skew in a normal distribution is zero. 

In skewed distributions, the mode is at the high point and it represents the most frequent value or test score. The mean is pulled in the direction of the long tail and the median usually falls between the mode and the mean.

Common test questions ask what happens to the mean in skewed distributions. Keep in mind that the mean is "pulled" toward the tail. The mean is an average and, as such, it is most susceptible to extreme scores.
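Here is a small sketch of that "pull" using Python's standard library. The scores are made up for illustration (they are not from any real test): most are low and a few high scores form the right tail.

```python
from statistics import mean, median, mode

# Hypothetical test scores (illustration only): most scores are low,
# and a few high scores create a long right tail (positive skew).
scores = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]

score_mode = mode(scores)      # most frequent score
score_median = median(scores)  # middle score
score_mean = mean(scores)      # arithmetic average, pulled toward the tail

print(f"mode={score_mode}, median={score_median}, mean={score_mean}")
```

With these made-up scores, the mode (3) falls below the median (4), which falls below the mean (5.4). The two highest scores drag the mean to the right.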

Skew and Data Analysis

Most statisticians accept small deviations from normality when analysing data using procedures designed for a normal distribution, like the Pearson r, t tests, and the parametric ANOVAs.

The question of acceptable ranges of skew will yield different answers from different sources. A range of +1.5 to -1.5 is a common rule of thumb. An important consideration is the "true" nature of the measured variable. Scientists may argue for flexibility in analysing data from a sample if the variable is known to be normally distributed in the population.

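Skewed data can be adjusted, and strongly skewed data should be adjusted before using parametric tests. One method of adjustment for positively skewed scores is to convert all scores to logarithms and perform the data analysis on these transformed values. Logarithms are only defined for values above zero, so this method does not work directly on scores that include zero or negative numbers.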
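As a rough sketch of the log adjustment, the Python below computes skew before and after converting scores to base-10 logarithms and compares each value to the +1.5 to -1.5 rule of thumb mentioned above. The scores and the helper name `skewness` are made up for illustration. The helper uses the plain moment formula (dividing by n); statistical packages often apply a small-sample correction, so their values can differ a little.

```python
import math

def skewness(values):
    """Moment-based skew: third central moment / second central moment ** 1.5."""
    n = len(values)
    avg = sum(values) / n
    m2 = sum((x - avg) ** 2 for x in values) / n  # variance, dividing by n
    m3 = sum((x - avg) ** 3 for x in values) / n  # average cubed deviation
    return m3 / m2 ** 1.5

# Hypothetical positively skewed scores; all must be above zero to take logs.
raw = [1, 2, 2, 3, 4, 5, 8, 13, 40, 120]
logged = [math.log10(x) for x in raw]

raw_skew = skewness(raw)
logged_skew = skewness(logged)

print(f"skew before: {raw_skew:.2f}, within +/-1.5: {abs(raw_skew) <= 1.5}")
print(f"skew after:  {logged_skew:.2f}, within +/-1.5: {abs(logged_skew) <= 1.5}")
```

For these made-up scores, the raw skew is outside the rule-of-thumb range and the skew of the logged scores is inside it. Real data will not always cooperate, so check the transformed values rather than assuming the adjustment worked.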

If the data are too skewed and it is inappropriate to transform the data, then analysts should use nonparametric statistical methods.

Moments

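In statistics, the concept of moments is taken from physics. Moments describe the location and shape of a distribution. The first moment is the mean. When moments are calculated from deviations around the mean (central moments), the first central moment is always zero because the deviations above and below the mean cancel out.

The second central moment is the variance, which uses squared deviations from the mean.

Skew is based on the third central moment, which uses cubed deviations, and kurtosis is based on the fourth central moment, which uses deviations raised to the fourth power.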
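One common way to write these quantities is shown below. It assumes a sample of n scores with mean x-bar and uses the divide-by-n form; textbooks and software sometimes use n - 1 or other small-sample corrections, and some report kurtosis minus 3 ("excess" kurtosis), so treat this as one convention rather than the only one.

```latex
\begin{align*}
m_k &= \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^k
      && \text{($k$th moment about the mean)} \\
m_1 &= 0, \qquad m_2 = \text{variance} \\
\text{skew} &= \frac{m_3}{m_2^{3/2}}, \qquad
\text{kurtosis} = \frac{m_4}{m_2^{2}}
\end{align*}
```

Dividing by a power of the variance puts skew and kurtosis on a scale that does not depend on the units of the scores.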


Learn more about behavioural statistics in Applied Statistics Concepts for Counselors on AMAZON   or   GOOGLE








Learn More about analyzing data  in Creating Surveys on AMAZON or GOOGLE








Please check out my website   www.suttong.com

   and see my books on   AMAZON       or  GOOGLE STORE

Also, consider connecting with me on    FACEBOOK   Geoff W. Sutton    

   TWITTER  @Geoff.W.Sutton    

You can read many published articles at no charge:

  Academia   Geoff W Sutton     ResearchGate   Geoffrey W Sutton 


*Photo credit- From Bing images labeled "Free to share and use."

Sunday, September 3, 2017

Reporting Mean or Median

Who would think that a simple statistic like a mean or a median would make a difference?




In large samples involving thousands of people, and when data are normally distributed (close to the shape of a bell curve), the mean and median will be nearly the same. In fact, in a theoretical distribution called the normal curve, the mean, median, and mode all fall at the same point in the middle.

But many samples are not normal distributions. Instead, they often contain extreme scores called outliers or a lot of scores bunched up at high or low levels (skewed). Sadly, even people who understand statistics continue to report the mean as if they are not thinking about their samples.

Suppose you work for a company where the top person earns $300,000 but most folks earn $30,000 to $60,000. Well that $300,000 is gonna skew results and the mean will look much higher than the median.

I ran some fictitious data on a sample of 10 people. Nine earn between $30K and $60K and one earns $300K. The Mean = $67K (standard deviation = $82.58K), but the Median is only $38.5K and the Range = $270K.
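If you want to try this yourself, here is a short Python sketch. The ten salaries are my own hypothetical values, picked so that the mean, median, and range agree with the numbers above; they are not the original fictitious data set, so the standard deviation will come out close to, but not exactly, the one reported.

```python
from statistics import mean, median, stdev

# Hypothetical salaries in thousands of dollars (illustration only):
# nine people between $30K and $60K and one person at $300K.
salaries_k = [30, 32, 35, 36, 38, 39, 45, 55, 60, 300]

salary_mean = mean(salaries_k)
salary_median = median(salaries_k)
salary_range = max(salaries_k) - min(salaries_k)
salary_sd = stdev(salaries_k)  # sample standard deviation (n - 1)

print(f"Mean = ${salary_mean}K, Median = ${salary_median}K, "
      f"Range = ${salary_range}K, SD = ${salary_sd:.2f}K")

# Drop the top earner to see how much one salary moves the mean.
without_top = salaries_k[:-1]
print(f"Mean without top earner = ${mean(without_top):.1f}K, "
      f"Median without top earner = ${median(without_top)}K")
```

Removing the one $300K salary drops the mean by about $26K, while the median barely moves. That is why the median is the safer number to report for this kind of sample.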

Now those results are fictitious and it is a small sample so it magnifies the differences. But you know some folks are earning over $1,000,000.00 in some companies and lots of folks aren't earning anywhere near that amount.



So who cares? Well salaries make a lot of difference if you are arguing for a raise, considering a change of jobs, voting on budgets in not-for-profit organizations, and more. How motivating is it to give a donation to a company that helps the poor where the CEO pulls down nearly a million bucks a year and you get by on $65K-- or less?

But there's more. Teacher evaluations are usually skewed -- most students give high ratings-- so the median and range are more appropriate than the mean.









Real estate prices can be out-of-whack if you look at the mean price in a city where a few multimillion-dollar homes pull the mean to a high level compared to the median price.

I see research papers where the scientists report the average age of people in surveys is 19 and they tell you their sample was from a university. No problem with age 19, but when they report a Mean of 19 and a standard deviation of 5, there is a problem! If you understand standard deviations, you will know why they probably did not have a lot of 14-year-olds in their university!
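To see why, here is a quick check in Python. It assumes, for the sake of argument only, that ages really were normally distributed with a mean of 19 and a standard deviation of 5, and asks what share of the sample would be younger than 14 (one standard deviation below the mean).

```python
from statistics import NormalDist

# Assumption for illustration: ages follow a normal curve with the
# reported mean (19) and standard deviation (5).
ages = NormalDist(mu=19, sigma=5)

share_under_14 = ages.cdf(14)  # proportion more than 1 SD below the mean
share_under_9 = ages.cdf(9)    # proportion more than 2 SDs below the mean

print(f"Share younger than 14: {share_under_14:.1%}")
print(f"Share younger than 9:  {share_under_9:.1%}")
```

Under that assumption, roughly 16% of the students would be younger than 14, which is not believable for a university sample. The more likely story is that a few much older students stretched the distribution to the right, so the ages are positively skewed and the median would describe the sample better.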

You can see that knowledgeable folks can play games with a simple statistic.

If you forgot about the meaning of some terms, here's a link to a free glossary.


A simple example




















Counselors, teachers, and parents - think about test scores and how they are reported.  Test scores for students at school may be distorted by a few very high scoring or very low scoring students.

"Averages" can be deceiving.








