ED 602

Statistical Research for Behavioral Sciences

Brian G. Smith, Ph.D.

Lesson 7 - Normal Distribution, Probability and Standard Scores

You may also check your understanding of the material on the Wadsworth web site. Click on the Publisher Help Site button.

 

Homework - Lesson 7

Any student may do the assignments from any area. You may run through this work an unlimited number of times. If you make errors, you will be referred to the appropriate area of the book for re-study.

 


 

Assessment - Lesson 7

You will have two opportunities to take the quiz. If you fail to achieve 100% on the quiz, you will not be able to advance to the next lesson. If you fail on the second attempt, please email the instructor at ed602@mnstate.edu so remedial action can be taken.

Homework and Quizzes are on Desire 2 Learn. Click on the Desire 2 Learn link, log in, select the Homework/Quizzes icon and choose the appropriate homework or quiz.

Assignments and Information

 
Reading: Chapter 5
  Definition Page: Contains definitions arranged alphabetically.

Notes:
So far we have been plotting and analyzing distributions that come from actual data that we have collected. In the notes, the homework and the quizzes, we have real scores from real samples. These distributions are called empirical distributions. There is another set of distributions called theoretical distributions. These are hypothetical distributions that only appear under certain unusual circumstances that may or may not actually exist. This chapter, on the standard normal curve and standard scores, is based on the information you can get from these theoretical distributions.

The normal distribution

  • A symmetrical, continuous, and asymptotic bell-shaped distribution of scores
  • Scores bunch up around the mean
  • The mean, median, and mode are all the same number

The normal curve
  • A graph of the normal distribution
  • The curve never actually touches the x-axis
  • When graphed, it generally shows 6-8 standard deviations (3-4 on either side of the mean)
  • If we divide the distribution up into standard deviation units, a known proportion of scores lies within each portion of the curve.
  • Tables exist so that we can find the proportion of scores above and below any part of the curve, expressed in standard deviation units. Scores expressed in standard deviation units, as we will see shortly, are referred to as Z-scores.

Area under the normal curve

Looking at the figure above, we can see that 34.13% of the scores lie between the mean and 1 standard deviation above the mean. An equal proportion of scores (34.13%) lie between the mean and 1 standard deviation below the mean. We can also see that for a normally distributed variable, approximately two-thirds of the scores lie within one standard deviation of the mean (34.13% + 34.13% = 68.26%).

13.59% of the scores lie between one and two standard deviations above the mean, and between one and two standard deviations below the mean. We can also see that for a normally distributed variable, approximately 95% of the scores lie within two standard deviations of the mean (13.59% + 34.13% + 34.13% + 13.59% = 95.44%).

Finally, we can see that almost all of the scores are within three standard deviations of the mean (2.14% + 13.59% + 34.13% + 34.13% + 13.59% + 2.14% = 99.72%). We can also find the percentage of scores within three standard deviations of the mean by subtracting the two tails (0.13% + 0.13%) from 100%: 100.00% - (0.13% + 0.13%) = 99.74%. The difference between these totals, 99.72% and 99.74%, is due to rounding.
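
The percentages quoted in this section can be checked numerically. The sketch below is only an illustration, not part of the textbook's method: it assumes Python's standard `math.erf` function, and the helper names `normal_cdf` and `proportion_within` are made up for this example.

```python
import math

def normal_cdf(z: float) -> float:
    """Proportion of a standard normal distribution that lies below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_within(k: float) -> float:
    """Proportion of scores within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)

# Print the area within 1, 2 and 3 standard deviations as percentages.
for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k) * 100:.2f}%")
```

The printed values should agree with the figures above to within a few hundredths of a percent; small differences come from the rounding of the individual areas in the figure.
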
Standard normal distribution (aka, unit normal distribution)

  • Has a mean of 0
  • Has a standard deviation of 1
  • Scores from this distribution are called z scores

Standard scores (z scores)

  • Changing raw scores to z scores allows for comparisons across measurement tools that are not already equivalent
  • Any raw score can be changed to a z score, but the proportions in the standard normal table only apply if the raw scores come from a relatively normally distributed set of scores
  • Positive z scores indicate a score above the mean; negative z scores indicate a score below the mean. A z score of zero indicates a score at the mean.

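Standard score formula

$$z = \frac{X - \bar{X}}{S}$$

where X is the raw score, $\bar{X}$ is the mean of the distribution, and S is its standard deviation.
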
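The standard score transformation (raw score minus the mean, divided by the standard deviation) is simple enough to sketch in a few lines of Python. The function name `z_score` is an assumption made for this illustration; the sample values are the BASC mean (51.4), standard deviation (11.12) and the score of 40 used later in these notes.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Convert a raw score to a standard (z) score."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd

# A BASC score of 40 with mean 51.4 and standard deviation 11.12.
z = z_score(40, 51.4, 11.12)
print(round(z, 2))  # -1.03, a score below the mean
```

A score equal to the mean gives a z score of exactly zero, and scores above the mean give positive values.
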

Using the table of the standard normal distribution
Statistical tables exist that allow us to look up the exact proportion of scores between any z score and the mean. Such a table of values is found in your textbook as Appendix A, Table A.1, Proportions of Area under the Standard Normal Distribution. The first column of this table lists z scores from 0.00 on page 487 to 3.70 on page 490. The second column gives the area from the given z score to the mean, expressed as a proportion. To express this proportion as a percentage, multiply it by 100 (that is, move the decimal point two places to the right) and add the percent sign. The third column gives the area beyond a given z on the side opposite the mean. This table allows us to solve many problems involving scores that can be related to the normal curve, the most obvious being percentile ranks.
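
If the textbook is not at hand, the second and third columns of such a table can be approximated in Python. This is a rough sketch that assumes the standard `math.erf` function; the helper names `area_mean_to_z` and `area_beyond_z` are invented for this example and do not come from the textbook.

```python
import math

def area_mean_to_z(z: float) -> float:
    """Second column: proportion of area between the mean and z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))

def area_beyond_z(z: float) -> float:
    """Third column: proportion of area beyond z, away from the mean."""
    return 0.5 - area_mean_to_z(z)

# Reproduce table rows for two z scores that appear in these notes.
for z in (1.03, 1.28):
    print(f"{z:.2f}  {area_mean_to_z(z):.4f}  {area_beyond_z(z):.4f}")
```

For z scores of 1.03 and 1.28 the second column comes out as 0.3485 and 0.3997, the same proportions used in the worked examples in these notes. Because the curve is symmetrical, the same areas apply to negative z scores.
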

Properties of Discrete Probability Distribution

  • The standard normal curve is a continuous probability distribution, but the properties below carry over to it when probabilities are read as areas under the curve
  • The probability of any event is between 0.00 (will never occur) and 1.00 (will always occur)
  • The individual probabilities in a probability distribution will always add to 1.00
  • The probability of an event made up of mutually exclusive outcomes can be found by adding the probabilities of those outcomes
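
These properties can be illustrated with a very small discrete distribution. The example below is hypothetical and is not taken from the textbook: it assumes one roll of a fair six-sided die, and the name `event_probability` is made up for the sketch.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
# Each face is a mutually exclusive outcome with probability 1/6.
outcomes = {face: Fraction(1, 6) for face in range(1, 7)}

# Every individual probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in outcomes.values())

# The individual probabilities add to exactly 1.
assert sum(outcomes.values()) == 1

def event_probability(event_outcomes) -> Fraction:
    """Add the probabilities of the mutually exclusive outcomes in an event."""
    return sum((outcomes[o] for o in event_outcomes), Fraction(0))

# Event "roll an even number" is made up of the outcomes 2, 4 and 6.
print(event_probability({2, 4, 6}))  # 1/2
```
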

So how does all of this relate to our BASC study? We have already graphed the scores from the students and noted that the distribution is fairly symmetrical and unimodal. We can therefore convert our scores to z scores, and use those z scores when manipulating the data.

For example, if we know that students with BASC scores more than 1.5 standard deviations above the mean (so a z score more than 1.5) tend to need behavior interventions to succeed in school, we can take any score and look to see if that student will need help. Jessie has a score of 60 on the BASC. Does she need referral for support services? First we look back at past chapters to find that the mean for our data is 51.4 and the standard deviation is 11.12. Then we take the z score formula from the notes above

$$z = \frac{60 - 51.4}{11.12} = \frac{8.6}{11.12} = 0.77$$

Since Jessie’s z score is less than 1.5, she will not need added supports in school.

Now let’s find her percentile ranking using the z score and the z score table. In Table A.1 you can see that the proportion of the distribution that falls between the mean and a z score of 0.77 is 0.2794. Because Jessie’s z score is positive (her score is above the mean), we add the proportion below the mean (0.5000) and get a proportion of 0.7794. Multiplying this proportion by 100 tells us that 77.94% of the scores are below Jessie’s score. She is at the 77.94th percentile.

Next let’s assume that your school wants to offer supports to anyone who falls at or above the 90th percentile in this study. The 90th percentile is the same as a proportion of 0.9000. Subtracting the 0.5000 for the proportion of students below the mean, we are left with a proportion of 0.4000. Looking at the z score table on page 489, we see that there is no listing for a proportion of 0.4000 in the second column, so we need to find the number that is closest. 0.3997 is a good close estimate (and being slightly under our goal number will include anyone at 0.4000, which is better than being close but over). 0.3997 is paired with a z score of 1.28. We can take that z score back to the z score formula to find the cutoff score we are looking for.

$$1.28 = \frac{X - 51.4}{11.12}$$

Multiply both sides by 11.12 to get rid of the fraction.

Which gives us:

14.2336 = X - 51.4

Add 51.4 to both sides to isolate X, resulting in: 65.63 = X

So any score above 65 (that is, 66 or higher, since BASC scores are given in whole numbers) would warrant a referral for support.
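
The same cutoff can be worked backwards in Python as a check. This is a sketch under stated assumptions: it uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) to find the z score for the 90th percentile without rounding to a table entry, and the helper name `raw_from_z` is made up for this example.

```python
import math
from statistics import NormalDist

MEAN, SD = 51.4, 11.12  # BASC mean and standard deviation from these notes

def raw_from_z(z: float, mean: float, sd: float) -> float:
    """Solve the z score formula for the raw score X."""
    return mean + z * sd

# Cutoff using the table value z = 1.28, as in the notes.
table_cutoff = raw_from_z(1.28, MEAN, SD)
print(f"table cutoff: {table_cutoff:.2f}")  # 65.63

# Cutoff using the unrounded z score for a proportion of 0.9000.
exact_z = NormalDist().inv_cdf(0.90)
exact_cutoff = raw_from_z(exact_z, MEAN, SD)
print(f"exact z: {exact_z:.4f}, exact cutoff: {exact_cutoff:.2f}")

# BASC scores are whole numbers, so round the cutoff up.
first_referral_score = math.ceil(table_cutoff)
print(f"first whole-number score at or above the cutoff: {first_referral_score}")
```

The unrounded z score is slightly larger than 1.28, so the two cutoffs differ by a couple of hundredths of a point, which makes no difference once scores are restricted to whole numbers.
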

Just to give you practice in working with negative z scores, let’s find the percentile rank for a student with a BASC score of 40.

$$z = \frac{40 - 51.4}{11.12} = \frac{-11.4}{11.12} = -1.03$$

A z of –1.03 is paired with a proportion of 0.3485 between the score and the mean. Because we are below the mean, we subtract this number from 0.5000 (the proportion of scores below the mean) to get a proportion of 0.1515. A score of 40 is at the 15.15th percentile.
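
The percentile rank lookup can also be sketched in Python. This is an illustration under assumptions rather than the textbook's procedure: it uses `NormalDist` from the standard `statistics` module in place of the printed table, rounds z to two decimals to mimic a table lookup, and the function name `percentile_rank` is invented for this example.

```python
from statistics import NormalDist

def percentile_rank(raw: float, mean: float, sd: float) -> float:
    """Percentage of a normal distribution that falls below a raw score."""
    # Round z to two decimals, as you would before using the table.
    z = round((raw - mean) / sd, 2)
    # cdf(z) is the whole area below z, so it already handles
    # both positive and negative z scores.
    return NormalDist().cdf(z) * 100

# BASC score of 40 with mean 51.4 and standard deviation 11.12.
print(f"{percentile_rank(40, 51.4, 11.12):.2f}")  # 15.15
```

Because the function works with the total area below z, there is no need to decide whether to add to or subtract from 0.5000; that step is only needed when reading the printed table, which lists areas measured from the mean.
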

Vocabulary

Normal distribution – a theoretical mathematical distribution that specifies the relative frequency of a set of scores in a population.

Standard normal distribution – A normal distribution with a mean of 0 and a standard deviation of 1.

Outcome – each possible occurrence in a probability distribution

Event – the occurrence of a specific set of outcomes in a probability distribution

Probability of occurrence of an event – the number of outcomes comprising an event, divided by the total number of possible outcomes

Discrete outcomes – outcomes in a distribution that can take only a countable set of values

Mutually exclusive outcomes – outcomes that cannot occur at the same time

Theoretical probability distribution – a probability distribution found from the use of a theoretical probability model

Empirical probability distribution – a probability distribution found by counting actual occurrences of an event

Standard score – A score obtained by using the transformation $z = (X - \bar{X}) / S$