ED 602 Statistical Research for Behavioral Sciences Brian G. Smith, Ph.D. |
|
Definitions
a – The number of levels of a factor K. Archival records research - Answering scientific questions from information in existing records. If we were basing our measure of student behavior on information in their permanent school records, that would be archival records research. Meta-analysis and historical research are two popular forms of archival records research. Alpha (α) – The value of the significance level stated as a probability. In research we often want an alpha of 0.05 or less. Alternative hypothesis – A statement of what must be true if the null hypothesis is false. For our BASC study, the alternative hypothesis is that there is a significant difference between the scores of students who are Caucasian and the scores of students who are Hispanic, and the test is biased. Analysis of variance (ANOVA) – A statistical test used to analyze multilevel designs. Bar graph - A graph used to present a frequency distribution for qualitative data. Beta (β) – The probability of a Type II error. Between groups variance – The variance calculated using the variation of the group means about the grand mean. Between subjects design – An experiment in which two or more groups are created. Bivariate distribution – A distribution in which two scores are obtained from each subject. Central limit theorem – A mathematical theorem stating that, as the sample size increases, the sampling distribution of the mean approaches a normal distribution. Class interval - The range of score values into which the raw scores are grouped in a grouped frequency distribution. Cluster sampling – Randomly choosing intact groups, such as families, classes, or offices, rather than individuals. For simplicity, it would be possible to have intact groups take the survey. If you could make sure that none of the students in 7th hour English are Hispanic, and all the students in the 3rd hour English as a Learned Language class are, then you could give the survey to those two intact groups.
Coefficient of determination (r²) – The value of r², indicating the common variance of variables X and Y. Confidence interval – A range of score values expected to contain the value of μ with a certain level of confidence. The 95% confidence interval for our BASC scores is from 47.05 to 55.75 points. We are 95% certain that the true population mean is within this range of scores. Confidence limits – The lower and upper scores defining the confidence interval. The lower confidence limit for our 95% confidence interval is 47.05. The upper limit is 55.75. Confounded experiment – An experiment in which an extraneous variable is allowed to vary consistently with the independent variable. Constant - A value that does not change for the duration of a study. Gender is usually constant during a study. Continuous variable - A variable that can assume an infinite set of values between any two levels of the variable. A student's height can be measured with ever increasing accuracy (1.7 meters, 1.73 meters, 1.732783956 meters, etc.), so it is continuous. Convenience sampling – Obtaining participants from among people who are accessible or convenient to the researcher. Correlation coefficient – Any descriptive statistic that expresses the degree of relationship between two variables. Correlational method - A type of research in which two or more variables are measured but not manipulated, and the relationship between the variables is assessed. If we were looking for a relationship between BASC scores and absenteeism, we would be doing correlational research. Correlational studies – Studies in which two or more variables are measured to find the direction and degree to which they covary. Covariability – Two variables covary when a change in one variable is related to a change in the other variable. Critical difference – The minimum numerical difference between two treatment means that is statistically significant.
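The confidence limits quoted above can be reproduced from two values reported elsewhere in this glossary: the sample mean of 51.40 and the standard error of the mean of 2.22. The sketch below is only an illustration. It assumes the interval was built with the normal-curve critical value of 1.96, which is not stated in the text but does reproduce the reported limits; a t critical value for a sample of 25 would give a slightly wider interval.

```python
# Values reported in this glossary's running BASC example.
sample_mean = 51.40
standard_error = 2.22

# Assumption: the two-tailed critical value for 95% confidence is taken
# from the standard normal distribution (z = 1.96).
z_critical = 1.96

margin = z_critical * standard_error
lower_limit = sample_mean - margin
upper_limit = sample_mean + margin

print(f"95% CI: {lower_limit:.2f} to {upper_limit:.2f}")
# prints "95% CI: 47.05 to 55.75"
```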
Critical value – The specific numerical values that define the boundaries of the rejection region. Sum of products of X and Y (CP_XY) - The value of Σ(X − X̄)(Y − Ȳ) for variables X and Y. Cumulative frequency of a score (cf) - The frequency of occurrence of a score plus the sum of the frequencies of all the scores of lower value. Curvilinear relationship – A relationship between two variables that is not linear. It starts with a positive or negative trend, but at some midpoint changes direction, forming a U-shaped curve. Data - The scores or measurements obtained by a scientist doing research. This can be test scores, survey answers, observations made, or anything else recorded by the scientist for later analysis. Degrees of freedom – The number of scores free to vary when calculating a statistic. Dependent variable - The variable that is measured in an experiment. If we are testing the effects of high and low sugar diets on child behavior, then we watch to see if the groups will have different levels of misbehavior once the study begins. The misbehavior of each group in response to the new diets is the dependent variable. Discrete outcomes – Outcomes in a distribution that have a countable set of values. Discrete variable - A variable that can be measured only with a finite set of values. The number of students in a classroom is a discrete variable. Empirical data - Scores or measurements based on sensory experience or observation. Responses to the BASC survey are empirical, because we can see or observe them. Empirical probability distribution – A probability distribution found by counting actual occurrences of an event. Equivalent groups – Groups of participants that are not expected to differ in any consistent or systematic way prior to receiving the independent variable of the experiment. Error rate in an experiment (alpha level) – The probability of making at least one Type I error in the statistical comparisons conducted in an experiment.
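Two of the definitions above, the sum of products of X and Y and the cumulative frequency of a score, are easy to check by hand on a small data set. The following sketch is illustrative only; the function names and the example scores are made up for the demonstration and do not come from the BASC study.

```python
from collections import Counter

def sum_of_cross_products(x_scores, y_scores):
    """Return the sum of (X - mean of X) * (Y - mean of Y) over all pairs."""
    mean_x = sum(x_scores) / len(x_scores)
    mean_y = sum(y_scores) / len(y_scores)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(x_scores, y_scores))

def cumulative_frequencies(scores):
    """Map each score to its own frequency plus that of all lower scores."""
    counts = Counter(scores)
    running_total = 0
    cf = {}
    for value in sorted(counts):
        running_total += counts[value]
        cf[value] = running_total
    return cf

# Hypothetical paired scores (means are 3 and 4).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cp_xy = sum_of_cross_products(x, y)
print(cp_xy)  # 6.0

# Hypothetical raw scores for a frequency distribution.
cf_table = cumulative_frequencies([3, 5, 5, 7, 7, 7, 9])
print(cf_table)  # {3: 1, 5: 3, 7: 6, 9: 7}
```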
Estimated standard error of the difference between means – The standard error of the difference between means obtained by using s² to estimate σ². Event – The occurrence of a specific set of outcomes in a probability distribution. Experimental method - Manipulating one or more independent variables in a carefully controlled situation. Randomly assigning students to one of two groups and teaching one group with a new spelling method and the other group with the traditional method would be experimental research. Confounding variables – Any variables, other than the independent variable, that can affect the dependent variable in an experiment. Factor – An alternative name for independent variable. Frequency distribution - A table showing each score in a set of scores and the number of times it occurred.
Frequency polygon - A graph in which dots marking the frequency at the midpoint of each class interval are connected with straight lines. General equation of a straight line – Y = bX + a, where Y is the score on the Y variable, X is the score on the X variable, b is the slope of the line, and a is a constant called the Y-intercept. Grand mean – The mean of all scores in an experiment. Histogram - A bar graph in which the size of the class interval is represented by the width of the bar on the abscissa (x axis), and the frequency of scores in the class interval is given by the height of the bar. Independent variable - The variable that is manipulated in an experiment. If we are testing the effects of high and low sugar diets on child misbehavior, then the diet is what changes for each group of students being studied. Diet is the independent variable. Interval scale - Assigning numerical values to a variable using a scale with an arbitrary zero point. A zero on the BASC survey does not mean that the student has an absence of behavior, so it is measured on an interval scale. Least-squares regression line – A straight line that minimizes the value of Σ(Y − Y′)², where Y′ is the value of Y predicted by the line. Level of an independent variable – One value of the independent variable. To be a variable, an independent variable must take on at least two different levels. For example, there can be males and females as two levels, or there can be several levels of household income, etc. Linear relation – A relation between two variables such that each time variable X changes by 1 unit, variable Y changes by a constant amount. Linear relationship – A relationship between two variables that can be described by a straight line. Mean square - The name used for a variance in the analysis of variance. Measurement - Assigning numbers to variables following a set of rules. For example, the BASC comes with a strict set of rules for scoring, or measuring, student responses.
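To make the straight-line and least-squares entries above concrete, the sketch below fits Y′ = bX + a to a few made-up pairs of scores. It relies on the standard textbook formulas b = CP_XY / SS_X and a = Ȳ − bX̄, which are not spelled out in this glossary, so treat them as assumptions of the example; the function name and data are hypothetical.

```python
def least_squares_line(x_scores, y_scores):
    """Return (slope b, intercept a) of the least-squares line Y' = bX + a."""
    n = len(x_scores)
    mean_x = sum(x_scores) / n
    mean_y = sum(y_scores) / n
    # Sum of cross products of X and Y, and sum of squares of X.
    cp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_scores, y_scores))
    ss_x = sum((x - mean_x) ** 2 for x in x_scores)
    b = cp_xy / ss_x          # slope: change in Y' per one-unit change in X
    a = mean_y - b * mean_x   # Y-intercept: value of Y' when X is 0
    return b, a

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, a = least_squares_line(x, y)

# Predicted scores and the sum of squared residuals the line minimizes.
predicted = [b * xi + a for xi in x]
ss_residual = sum((yi - yp) ** 2 for yi, yp in zip(y, predicted))
print(round(b, 2), round(a, 2), round(ss_residual, 2))  # 0.6 2.2 2.4
```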
Measures of central tendency - Numbers that represent the average or typical score obtained from measurements of a sample. Mean, median, and mode are the three most common measures of central tendency. Measures of variability - Numbers that indicate how much scores differ from each other and from the measure of central tendency in a set of scores. These will be covered in detail in later chapters, but variance and standard deviation are two examples of measures of variability. Median - A score value in the distribution with an equal number of scores above and below it. The median is the 50th percentile in a distribution. Mode - The most frequently occurring score in a distribution.
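As a quick illustration of the three measures of central tendency defined above, the following sketch uses Python's standard statistics module on a small set of made-up scores (not BASC data).

```python
from statistics import mean, median, mode

# Hypothetical set of scores.
scores = [3, 5, 5, 7, 10]

sample_mean = mean(scores)      # sum of the scores divided by their number
sample_median = median(scores)  # middle score once the scores are ordered
sample_mode = mode(scores)      # most frequently occurring score

print(sample_mean, sample_median, sample_mode)  # 6 5 5
```

The mean (6) is pulled above the median (5) by the single high score of 10, which is the pattern a positively skewed distribution produces.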
Multiple comparison tests – Statistical tests used to make pairwise comparisons to find which means differ significantly from one another in a one-factor multilevel design. Mutually exclusive outcomes – Outcomes that cannot occur at the same time. n_A – Number of scores in a level of a one-factor design. Naturalistic observation - Observing behaviors occurring in natural settings without intruding into the situation. Portions of the BASC that are not being used in our study involve observing the students in the natural setting of their regular classroom. This is also referred to as ethnographic research. Near-zero relationship – A bivariate distribution that has no obvious relationship between variables. Negative relationship – A linear relationship between two variables in which, as the value of one variable increases, the value of the other variable tends to decrease. Nominal scale - Classification of a measured variable into different categories. We can classify our students as 0 for males and 1 for females. The numbers have no meaning other than to differentiate between the two genders. Nonparametric test – A statistical test involving hypotheses that do not state a relationship about a population parameter. Nonsignificant difference – The observed value of the test statistic does not fall into a rejection region and the null hypothesis is not rejected. Normal distribution – A theoretical mathematical distribution that specifies the relative frequency of a set of scores in a population. Null hypothesis – A statement of a condition that a scientist tentatively holds to be true about a population; it is the hypothesis that is tested by a statistical test. For our BASC study we hope to show that there is no difference between the scores of students who are Caucasian and the scores of students who are Hispanic. One-factor between subjects design – A research design in which one independent variable is manipulated and two or more groups are created.
One-factor multilevel design – An experiment with one independent variable and three or more levels of that independent variable. One-sample t test – A t test used to test the difference between a sample mean and a hypothesized population mean for statistical significance when σ is estimated by s. One-tailed test – A statistical test using a rejection region in only one tail of the sampling distribution of the test statistic. Also called a directional test. Operational definition - Specifies the procedures used to manipulate an independent variable or to measure a dependent variable. The operational definition for our study is that we are giving written copies of the BASC student self-report survey to our subjects in the controlled environment of the counselor's office. Ordinal scale - Arranging characteristics of a variable along an ordered continuum from largest to smallest. If we are also interested in the class ranking of our students, their ranking would be measured on an ordinal scale. Outcome – Each possible occurrence in a probability distribution. Pairwise comparisons – Statistical comparisons involving two means. Parametric test – A statistical test involving hypotheses that state a relationship about a population parameter. Parameter - A single number used to describe a characteristic of a population, often symbolized by a Greek letter. A parameter for 4th grade students might be that the average 4th grade student is 9 years old. Pearson correlation coefficient (r) – A statistic that indicates the degree of linear relationship between two variables that have been measured at either the interval or ratio level. Percentile - A score at or below which a specified percentage of the scores in a distribution fall. Percentile rank - The percentage of scores in a distribution that are equal to or less than a given score. Placebo control – A simulated treatment condition.
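The Pearson correlation coefficient defined above can be computed from quantities that appear elsewhere in this glossary. The sketch below assumes the common computational form r = CP_XY / √(SS_X · SS_Y), which the glossary itself does not state; the function name and the paired scores are hypothetical.

```python
from math import sqrt

def pearson_r(x_scores, y_scores):
    """Pearson r computed as CP_XY / sqrt(SS_X * SS_Y)."""
    n = len(x_scores)
    mean_x = sum(x_scores) / n
    mean_y = sum(y_scores) / n
    cp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_scores, y_scores))
    ss_x = sum((x - mean_x) ** 2 for x in x_scores)
    ss_y = sum((y - mean_y) ** 2 for y in y_scores)
    return cp_xy / sqrt(ss_x * ss_y)

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination: shared variance of X and Y

print(round(r, 3), round(r_squared, 2))  # 0.775 0.6
```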
Point estimation – Estimating the value of a parameter as a single point from the value of a statistic. We will be using point estimation to say that the mean we get for our sample is the same as the mean that we would get if we surveyed the entire population. Population - the complete set of people sharing the common characteristic specified by a researcher. So, if a study is on fourth grade students, then the population is ALL the fourth grade students in the world. If a study is on freshmen at MSUM, then the population is ALL the freshmen in MSUM. Population mean (μ) - The sum of all the scores in a population divided by the number of scores summed. This is the same as a sample mean, but using every score for every subject, rather than just a small sampling. Positive relationships – A linear relationship between two variables in which as the value of the first variable increases, the value of the second variable tends to increase as well. Post hoc comparisons – Statistical tests that make all possible pairwise comparisons after a statistically significant Fobs has occurred for the overall analysis of variance. Power – The probability of rejecting the null hypothesis when the null hypothesis is false and the alternative hypothesis is true. The power of a statistical test is given by 1-β. Probability of occurrence of an event – the number of outcomes comprising an event, divided by the total number of possible outcomes Qualitative data - Nominal measurements, which categorize the measured variable. Again, the gender of our students would be qualitative data. Quantitative data - Measurements that provide numerical information about the variable measured, such as the BASC scores. Quasi-Experimental method - A loosely defined type of research that is intended to be run as an experiment, but with limitations to the randomness of the sample. 
If we tested all the Hispanic students in one English as a Learned Language class and all the Caucasian students in a regular English class rather than drawing names from all the students in the school, we would have a quasi-experimental method. Random assignment – A method of assigning participants to treatment groups so that any individual selected for the experiment has an equal probability of assignment to any of the groups, and the assignment of one person to a group does not affect the assignment of any other individual to that same group. Random sampling – A sampling method in which individuals are selected so that each member of the population has an equal chance of being selected for the sample, and the selection of one member is independent of the selection of any other member of the population. Range – The numeric difference between the lowest and the highest scores in a distribution. Raw score - The scores of a subject exactly as collected and before they are analyzed statistically. Ratio scale - Assigning numerical values to a variable with a scale that possesses a physically real zero point. The students' heights, while irrelevant, are measured on a ratio scale. Rejection region – Values on the sampling distribution of the test statistic that have a probability equal to or less than α if the null hypothesis is true. If the test statistic falls into the rejection region, the null hypothesis is rejected. Research hypothesis - The predicted relationship between an independent and a dependent variable. If we are testing the effects of high and low sugar diets on child misbehavior, then we might have as our research hypothesis, "There will be no significant difference in child misbehavior between the low sugar and high sugar groups." Scientific method - A general approach used by a behavioral scientist to collect data. We will be using the survey method for collecting data on our two groups of students.
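One simple way to carry out the random assignment described above is to shuffle the list of participants and deal them out to the groups in turn. This is only a sketch of that idea: the function name, the participant labels, and the fixed seed (used so the example is repeatable) are all assumptions made for the illustration.

```python
import random

def randomly_assign(participants, n_groups, seed=None):
    """Shuffle the participants, then deal them into n_groups groups."""
    rng = random.Random(seed)      # seeded generator so results can be repeated
    shuffled = list(participants)  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    # Every n_groups-th person, starting at position i, goes to group i.
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical participant labels.
students = [f"student_{i}" for i in range(1, 11)]
groups = randomly_assign(students, n_groups=2, seed=602)

for number, group in enumerate(groups, start=1):
    print(f"Group {number}: {group}")
```

Because the shuffle gives every ordering the same chance, each student is equally likely to land in either group, and both groups end up the same size when the number of students divides evenly.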
Residual – The measure of the error between a measured Y and a predicted Y, i.e., the value of Y − Y′. Robustness – A term used to indicate that violating the assumptions of a statistical test has little effect on the probability of a Type I error. Sample - A subset of a population. A sample of our 4th grade population might be the 4th grade students in Mr. Bryant's classroom, or the 4th grade students in Moorhead schools. Sampling distribution – A theoretical probability distribution of values of a statistic resulting from selecting all possible samples of size N from a population. This is like a frequency distribution, but you graph statistics, like the mean or mode, rather than raw scores. Sampling distribution of the difference between means – The distribution of differences when all possible pairs of samples of size n are selected from a population and X̄₁ − X̄₂ is found for each pair of samples. Sampling distribution of the mean – The distribution of mean values when all possible samples of size N are selected from a population. Sampling error – The amount by which a sample mean differs from a population mean. Significance level – A probability value that provides the criterion for rejecting a null hypothesis in a statistical test. Standard error of the difference between means – The standard deviation of a theoretical sampling distribution of X̄₁ − X̄₂ values. Sample mean (X̄) - The sum of a set of scores divided by the number of scores summed. Adding all our BASC scores gives 1285; dividing by 25 (the number of scores) gives a sample mean of 51.40. Scatterplot – A graph of a bivariate distribution in which the X variable is plotted on the horizontal axis and the Y variable is plotted on the vertical axis.
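The sampling distribution of the mean described above can be listed in full for a very small population. The sketch below uses a made-up population of four scores and assumes samples of size 2 drawn without replacement; it shows that the mean of all possible sample means equals the population mean.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of four scores.
population = [2, 4, 6, 8]
sample_size = 2

# Assumption: samples are drawn without replacement, so every possible
# sample is one combination of two different members of the population.
all_samples = list(combinations(population, sample_size))
sample_means = [mean(sample) for sample in all_samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)

print(sample_means)                           # [3, 4, 5, 5, 6, 7]
print(population_mean, mean_of_sample_means)  # 5 5
```

Each individual sample mean differs from the population mean by its own sampling error, but the errors balance out across the whole sampling distribution.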
Simple random sampling – Selecting members from a population such that each member of the population has an equal chance of being selected for the sample and the selection of one member is independent of the selection of any other member of the population; also known as random sampling. If we put the names of all the students in our school in a hat and drew 25 names, that would be a simple random sample of our students. Skewed - When a distribution has scores clustered more at one end than at the other. Slope of a straight line – The slope of a line is the change in the value of Y divided by the change in the value of X. Spearman rank-order correlation coefficient (r_s) – A correlation coefficient used with ordinal measurements. Standard deviation – A measure of variability that represents an average of how much scores vary from the mean. Standard error of estimate (s_Y.X) – A measure of the accuracy of prediction, calculated as the square root of SS_residual / (N − 2), where N is the number of pairs of scores. Standard error of the mean (σ_X̄) – The standard deviation of the sampling distribution of the mean, estimated by dividing s by the square root of the size of the sample. The standard error for our BASC scores happens to be 2.22 points. Standard normal distribution – A normal distribution with a mean of 0 and a standard deviation of 1. Standard score or z-score – A score obtained by using the transformation z = (X − X̄) / S. Statistic - A single number used to describe data from a sample, often symbolized by a Roman letter. A statistic for Moorhead's 4th grade classes is that the average age is 9.5. (Note that a statistic is just like a parameter, except for the group of people being described.) Statistical hypothesis – A statement about a population parameter (for a parametric test). Statistical inference – Estimating population values from statistics obtained from a sample. Because we are giving the BASC survey to only a small sample of students, we will use statistical inference to see if the BASC survey is culturally biased against Hispanic teens. Statistically significant difference – The observed value of the test statistic falls into the rejection region and the null hypothesis is rejected. Stem and leaf display - A display of data in which the first digit of a score is the stem, and the last digit is the leaf. Stratified random sampling – A sampling method in which members of a population are categorized into homogeneous subgroupings called strata. Members of the population are randomly sampled from the strata in the proportion to which the strata occur in the population.
For our BASC study, we are placing our students into strata by cultural background before drawing names from the hat. Strength of effect (size) – The strength of an independent variable as measured by one of the strength of effect statistics. Sum of squares (SS) - A numerical value obtained by subtracting the mean of a distribution from each score in the distribution, squaring each difference, and then summing the squared differences. This number by itself means very little, but it is a key component of many statistical calculations. Sum of the squared residuals, or total squared error (SS_residual) – The value of Σ(Y − Y′)². This number is not very useful in and of itself, but it is used to calculate the standard error of estimate. Survey method - Obtaining data from oral or written interviews with people. Our study involves the student self-report survey portion of the BASC, so we are using the survey method. Symmetrical frequency distribution - A distribution in which one side is the mirror image of the other side. Systematic sampling – A procedure in which every nth person in line or on a list is chosen for the sample. If we don’t have a hat or a random number table handy, we can list all the names on a page and take every 10th person to fill our sample. Test statistic – A number calculated from the scores of the sample that allows testing a statistical null hypothesis. z and t are examples of test statistics. Two-tailed test – A statistical test using rejection regions in both tails of the sampling distribution of the test statistic. Type I error – The error in statistical decision making that occurs if the null hypothesis is rejected when it is actually true of the population. Type II error – The error in statistical decision making that occurs if the null hypothesis is not rejected when it is false and the alternative hypothesis is true. Unbiased estimator – A statistic with a mean value over an infinite number of random samples equal to the parameter it estimates.
The mean of our BASC scores is an unbiased estimator, because if we took a dozen samples of 25 students, the mean of the dozen sample means would be very close to the population mean. Sample variance, when calculated by dividing the sum of squares by N, is a biased estimator, because it consistently underestimates the variance of the population. Variability – How much scores differ from each other and from the measure of central tendency in a distribution. Variable - Any environmental condition or event, stimulus, personal characteristic or attribute, or behavior that can take on different values at different times. For example, scores on a test can change (think pretest/posttest), weight can change in response to diet or exercise, opinions can change, age can change in longitudinal studies, or teaching methods can be changed. Variance – The mean of the squared deviation scores. It is a measure of dispersion that produces results in terms of squared units, and for that reason it is rarely used on its own. Repeated measures – A research design in which one group of participants is exposed to and measured under each level of an independent variable. In a within-subjects design, each person receives each treatment condition. |
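Several of the entries above build on one another: the sum of squares leads to the variance, the variance to the standard deviation, and the standard deviation to z-scores and the standard error of the mean. The sketch below walks through that chain on made-up scores. It assumes that S in the z-score formula is the standard deviation obtained by dividing SS by N, and it also shows the N − 1 version commonly used to estimate the population variance; neither choice is specified in this glossary.

```python
from math import sqrt

# Hypothetical raw scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean_score = sum(scores) / n                              # 5.0

# Sum of squares: squared deviations from the mean, summed.
ss = sum((x - mean_score) ** 2 for x in scores)           # 32.0

# Variance as the mean of the squared deviations (divides by N).
variance_n = ss / n                                       # 4.0
# Variance estimate that divides by N - 1 (assumed unbiased version).
variance_n_minus_1 = ss / (n - 1)

standard_deviation = sqrt(variance_n)                     # 2.0

# z-score: how many standard deviations a raw score lies from the mean.
z_for_9 = (9 - mean_score) / standard_deviation           # 2.0

# Estimated standard error of the mean: s divided by the square root of N.
standard_error = sqrt(variance_n_minus_1) / sqrt(n)

print(ss, variance_n, standard_deviation, z_for_9)        # 32.0 4.0 2.0 2.0
print(round(variance_n_minus_1, 3), round(standard_error, 3))
```

Dividing by N gives the smaller of the two variance values, which is the sense in which that version tends to underestimate the population variance.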