ED 602

Statistical Research for Behavioral Sciences

Brian G. Smith, Ph.D.

Lesson 9

You may also check your understanding of the material on the Wadsworth web site. Click on the Publisher Help Site button.

 

Homework - Lesson 9

Any student may do the assignments from any area. You may run through this work an unlimited number of times. If you make errors, you will be referred to the appropriate area of the book for re-study.

 


 

Assessment - Lesson 9

You will have two attempts at the quiz. If you fail to achieve 100% on the quiz, you will not be able to advance to the next lesson. If you fail the second attempt, please email the instructor at ed602@mnstate.edu so remedial action can be taken.

Homework and Quizzes are on Desire 2 Learn. Click on the Desire 2 Learn link, log in, select the Homework/Quizzes icon and choose the appropriate homework or quiz.

Assignments and Information

 
Reading: Chapter 14
  Definition Page: Contains definitions arranged alphabetically.

 

Regression in statistics is all about prediction. The regression line is used to calculate or predict a value on one variable when you know the value for the other variable. Using the regression equation gives you a regression line.

Relationship between regression and correlation

  • Similarities
    • Closely related
    • If there is a high correlation you can use r to find the slope of the regression line
  • Differences
    • Correlations are used to support theories
    • Regression is used to make predictions (applied)
    • Sample size – correlation is often done with small samples (30+) and regression is usually done with large samples (100s +)
    • Regression equations only work with interval or ratio scale data; regression has no counterpart to the rho equation that correlation has

Linear relationships

  • For each change of one unit in the X variable, there is a constant change in the Y variable
  • Changes can be positive or negative
  • Uses the formula Y = bX + a for predicting Y
  • The slope of the line is b, which can be found with the formula: b = r (sy/sx)
  • The Y intercept of the line is a, which can be found with the formula: a = Y bar – b(X bar)
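
The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the summary statistics are made up for the example and are not taken from the BASC data.

```python
def slope_from_r(r, s_y, s_x):
    """Slope b = r * (s_y / s_x)."""
    return r * (s_y / s_x)


def intercept(mean_y, mean_x, b):
    """Y intercept a = Y bar - b * (X bar)."""
    return mean_y - b * mean_x


# Hypothetical summary statistics, chosen only to keep the arithmetic easy.
r, s_x, s_y = 0.5, 2.0, 4.0
mean_x, mean_y = 10.0, 20.0

b = slope_from_r(r, s_y, s_x)      # 0.5 * (4.0 / 2.0) = 1.0
a = intercept(mean_y, mean_x, b)   # 20.0 - 1.0 * 10.0 = 10.0
print(f"Y = {b}X + {a}")           # prints "Y = 1.0X + 10.0"
```

Notice that the line always passes through the point (X bar, Y bar), because a is defined so that the equation holds at the two means.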

Linear regression line

  • The least squares regression line is the straight line that minimizes the sum of the squared differences between the actual Ys and the predicted Ys, or Σ(Y - Y′)²
  • This line also uses the formula Y = bX + a
  • The slope for a regression line is found with b = r (sy/sx) if you know r,
    or with b = Σ(X - X bar)(Y - Y bar) / Σ(X - X bar)² if you don’t know the Pearson bivariate correlation.

  • The Y intercept of the line is a, which can be found with the formula a = Y bar – b(X bar)
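
If you only have the raw scores, the slope and intercept can be computed directly from deviations around the means. The sketch below assumes the standard least squares form b = Σ(X - X bar)(Y - Y bar) / Σ(X - X bar)². The function name and the five data pairs are hypothetical and serve only as an illustration.

```python
def least_squares_line(xs, ys):
    """Return (b, a) for the least squares line Y = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from the two means.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations of X from its mean.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return b, a


# Hypothetical scores, not the BASC data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = least_squares_line(xs, ys)
print(round(b, 3), round(a, 3))  # slope 0.6, intercept 2.2
```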

Standard error of estimate

  • Similar to a standard deviation, but it measures how far the actual Y scores spread around the regression line rather than around the mean of a set of data
  • Can be found using the formula sy.x = √[Σ(Y - Y′)² / (N - 2)]
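
As a rough sketch, the standard error of estimate can be computed from the residuals like this. The code assumes the common textbook form that divides the sum of squared residuals by N - 2. The function name, the data, and the line Y = 0.6X + 2.2 are hypothetical values used only for illustration.

```python
import math


def standard_error_of_estimate(xs, ys, b, a):
    """Square root of the sum of squared residuals divided by N - 2."""
    ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ss_residual / (len(xs) - 2))


# Hypothetical scores and their least squares line Y = 0.6X + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_yx = standard_error_of_estimate(xs, ys, b=0.6, a=2.2)
print(round(s_yx, 3))  # 0.894
```

The smaller this value, the closer the actual scores sit to the regression line and the more accurate the predictions tend to be.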

Now we can apply this information to our BASC study. If we take the BASC scores and sociability scores from lesson 8 and do a few calculations, we get this handy table.

First we need to find the regression equation for these sets of data. We’ll find b first, using the formula:

b = Σ(X - X bar)(Y - Y bar) / Σ(X - X bar)² = -4203.6 / 2966 = -1.417

 

Next we find the Y intercept

a = Y bar – b(X bar) = 54.76 – (-1.417)(51.4) = 54.76 – (-72.8338) = 127.594

so we get a regression equation of

Y′ = -1.417X + 127.594

We can now use this equation to predict sociability scores for students who have taken the BASC. For example, if I did a BASC on Jenny and she came up with a score of 42, we could use our regression equation to predict that she would get about a 68 on the sociability measure.
Y′ = (-1.417)(42) + 127.594 = -59.514 + 127.594 = 68.08
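
The arithmetic in this worked example can be checked with a short script. It uses only the numbers given in this lesson. The variable names are my own, and b is rounded to three decimal places to match the hand calculation.

```python
# Values taken from the lesson's calculations.
numerator = -4203.6    # top of the fraction used to find b
denominator = 2966     # bottom of the fraction used to find b
mean_x = 51.4          # X bar (BASC)
mean_y = 54.76         # Y bar (sociability)

# Round b to three decimals, as in the hand calculation.
b = round(numerator / denominator, 3)
a = mean_y - b * mean_x


def predict(x):
    """Predicted sociability score for a BASC score of x."""
    return b * x + a


jenny = predict(42)
print(b, round(a, 3), round(jenny, 2))  # -1.417 127.594 68.08
```

If b is not rounded before finding a, the intercept and the prediction shift slightly in the decimal places, which is ordinary rounding error.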

The standard error for this estimate can be calculated if you remember from lesson 8 that r for these two data sets is –0.892.
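
One way to see why r helps here: the sum of squared residuals equals (1 - r²) times the sum of squared deviations of Y from its mean. The sketch below uses that identity with an assumed N - 2 divisor. The function name is hypothetical. The Y sum of squares and N for the BASC data come from the lesson 8 table, which is not repeated here, so only the 1 - r² part is worked out for r = -0.892.

```python
import math


def standard_error_from_r(r, ss_y, n):
    """Standard error of estimate from r, the Y sum of squares, and N."""
    return math.sqrt((1 - r ** 2) * ss_y / (n - 2))


# Share of the Y variability that the regression line does not account for.
r = -0.892
unexplained = 1 - r ** 2
print(round(unexplained, 3))  # 0.204

# Hypothetical check with made-up values: r = 6 / sqrt(60), SSy = 6, N = 5.
print(round(standard_error_from_r(6 / math.sqrt(60), 6, 5), 3))  # 0.894
```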

 

Regression Toward the Mean

Please read this information quoted from another text, Statistics for the Behavioral Sciences (Gravetter & Wallnau, 2004).

Regression toward the mean, or statistical regression, has an effect on our predicted Y scores. Our predicted Y will tend to be closer to the mean than the true score would be. So scores below the mean will be predicted a little too high, and scores above the mean will be predicted a little too low. The effect is stronger the farther the predicted score is from the mean.
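
The pull toward the mean is easiest to see in standard scores. Substituting b = r (sy/sx) and a = Y bar - b(X bar) into the regression equation gives a predicted z score for Y equal to r times the z score for X. Unless r is exactly 1 or -1, the prediction is therefore always fewer standard deviations from the mean than the X score was. A small sketch using the r from our BASC example follows; the function name is hypothetical.

```python
def predicted_z_y(r, z_x):
    """Predicted z score on Y for a given z score on X."""
    return r * z_x


r = -0.892  # correlation between BASC and sociability from lesson 8

for z_x in (0.5, 1.0, 2.0):
    z_y = predicted_z_y(r, z_x)
    # The predicted score is always closer to the mean than the X score.
    print(z_x, round(z_y, 3), abs(z_y) < abs(z_x))
```

A student two standard deviations above the BASC mean is predicted to be about 1.78 standard deviations below the sociability mean, and the gap between the two distances grows as the X score moves farther from its mean.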

 

SPSS Tips:

 

 

Vocabulary

Linear relation – A relation between two variables such that each time that variable X changes by 1 unit, variable Y changes by a constant amount.

General equation of a straight line – Y = bX + a, where Y is the score on the Y variable, X is the score on the X variable, b is the slope of the line, and a is a constant called the Y-intercept.

Slope of a straight line – The slope of a line is the change in the value of Y divided by the change in the value of X.
The equation is b = (Y2 - Y1) / (X2 - X1), where (X1, Y1) and (X2, Y2) are any two points on the line.

Y-intercept – The value of Y when X is equal to zero in the general equation of a straight line.

Y prime (Y′) – The value of Y predicted from using a linear regression equation.

Least-squares regression line – A straight line that minimizes the value of Σ(Y - Y′)².

Residual – The measure of the error between a measured Y and a predicted Y, i.e., the value of Y - Y′.

Sum of the squared residual (SSresidual) – The value of Σ(Y - Y′)². This number is not very useful in and of itself, but it is used to calculate the standard error of estimate.

Standard error of estimate (sy.x) – A measure of the accuracy of prediction, calculated by using the formula sy.x = √[SSresidual / (N - 2)].

 

Works Cited

Gravetter, F. J., & Wallnau, L. B. (2004). Statistics for the Behavioral Sciences (6th ed.). Belmont, CA: Wadsworth.