Frequency Distributions
G&W Ch. 2
Parameters and Statistics
When describing data, it is necessary to distinguish whether the data come from a population or a sample. Typically, every population parameter has a corresponding sample statistic.
- Parameter: a value that describes a population
- Statistic: a value that describes a sample
Descriptive Statistics
- techniques used to summarize, organize, and simplify data
- we usually can't inspect every raw score, so these techniques give a quick, accurate impression of the data
Inferential Statistics
- techniques used to study samples and then make generalizations about the populations from which they were selected
Variables
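discrete - separate, indivisible categories. No values can exist between two neighboring categories (e.g., the outcome of rolling a die)
continuous - infinitely divisible. There are an infinite number of possible values between any two observed values (e.g., time)
- because a continuous variable can only be measured to limited precision, each recorded score corresponds to an interval of the scale
- the boundaries that separate these intervals are called real limits; they lie halfway between adjacent scores (e.g., a weight recorded as 150 lb has real limits of 149.5 and 150.5)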
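As a small illustration of real limits, the sketch below computes the interval that a recorded score stands for. The function name `real_limits` and the `unit` parameter are hypothetical, introduced only for this example; it assumes scores are rounded to the nearest `unit`.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`.

    Assumption: the limits sit half a measurement unit on either side of the score.
    """
    half = unit / 2
    return (score - half, score + half)


# A weight recorded to the nearest pound stands for a one-pound interval.
limits = real_limits(150)
print(limits)
```

With a finer measurement unit (say `unit=0.5`), the interval narrows accordingly.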
Frequency Distribution
Goal: simplify the organization and presentation of data
Definition: lists or displays graphically the number of individuals located in each category on the scale of measurement
- takes a disorganized set of scores and places them in order from highest to lowest, grouping together all individuals who have the same score
- lets you see the set of scores “at a glance”
- a frequency distribution can be structured either as a table or as a graph, but both display (1) all categories that make up the measurement scale and (2) the number of individuals in each category
Frequency Distribution Tables
data: 3, 1, 1, 2, 5, 4, 4, 5, 3, 5, 3, 2, 3, 4, 3, 3, 4, 3, 2, 3
- X: categories of the measurement scale (usually listed highest to lowest). To calculate the sum of the scores, you must use both the X and f columns (ΣX = ΣfX)
- f: frequency, the number of individuals in that category
- To obtain the total number of individuals in the data set, add up the frequencies (N = Σf)
- p: proportion, the proportion of the total number of responses that fall into this category (p = f/N)
- %: percentage, the proportion expressed out of 100 (% = p × 100)
- Cumulative frequencies (cf) show the number of individuals located at or below each score
- Cumulative percentages (c%) show the percentage of individuals accumulated as you move up the scale
X    f    p      %      cf    c%
5    3    .15    15%    20    100%
4    4    .20    20%    17    85%
3    8    .40    40%    13    65%
2    3    .15    15%    5     25%
1    2    .10    10%    2     10%
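The table can be rebuilt from the raw scores with a few lines of code. The following is a minimal sketch, not a prescribed method: the function name `frequency_table` is hypothetical, and it assumes the scores are whole numbers so that every category between the lowest and highest score can be listed.

```python
from collections import Counter

# Scores copied from the "data:" line in these notes.
scores = [3, 1, 1, 2, 5, 4, 4, 5, 3, 5, 3, 2, 3, 4, 3, 3, 4, 3, 2, 3]


def frequency_table(data):
    """Return rows (X, f, p, cf, c%) listed from the highest X to the lowest.

    Assumption: scores are integers, so range() enumerates every category.
    """
    counts = Counter(data)
    n = len(data)
    rows = []
    cf = 0
    # Accumulate from the lowest score upward so cf means "at or below X".
    for x in range(min(data), max(data) + 1):
        f = counts.get(x, 0)
        cf += f
        rows.append((x, f, f / n, cf, 100 * cf / n))
    rows.reverse()  # present the highest score first, as in the table
    return rows


table = frequency_table(scores)
for x, f, p, cf, c_pct in table:
    print(f"{x:>2} {f:>3} {p:>5.2f} {100 * p:>4.0f}% {cf:>4} {c_pct:>4.0f}%")

n_total = sum(row[1] for row in table)              # N = sum of f
sum_scores = sum(row[0] * row[1] for row in table)  # sum of X = sum of f * X
print("N =", n_total, " sum of X =", sum_scores)
```

Note that the sum of the scores needs both columns: each X is multiplied by its frequency before adding.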
Grouped frequency distribution tables: group the scores into intervals and list these intervals in the frequency distribution table.
The wider the interval, the more information is lost.
Aim for about 5-10 intervals, depending on the range of the data. The width of each interval should be a simple number (e.g., 5 or 10), and all intervals should be the same width.
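A grouped table can be sketched the same way. In the example below the exam scores are hypothetical, invented purely for illustration, and the function name `grouped_frequencies` is likewise an assumption. The sketch also assumes integer scores and starts each interval at a multiple of the width, which is one common convention rather than something these notes require.

```python
from collections import Counter

# Hypothetical exam scores, invented purely for illustration.
exam_scores = [52, 55, 61, 64, 67, 70, 71, 73, 75, 78, 80, 82, 85, 88, 91, 94]


def grouped_frequencies(data, width):
    """Count integer scores per equal-width interval.

    Returns [((low, high), f), ...] with the highest interval first.
    Assumption: each interval starts at a multiple of `width`.
    """
    counts = Counter((x // width) * width for x in data)
    lowest = (min(data) // width) * width
    highest = (max(data) // width) * width
    rows = []
    for low in range(highest, lowest - 1, -width):
        rows.append(((low, low + width - 1), counts.get(low, 0)))
    return rows


grouped = grouped_frequencies(exam_scores, width=10)
for (low, high), f in grouped:
    print(f"{low}-{high}: {f}")
```

With a width of 10 these scores fall into five intervals. A wider interval would give fewer rows but hide more detail about individual scores.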
Percentile rank (or rank): the percentage of individuals in the distribution with scores at or below the particular value.
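The definition translates directly into a count. The sketch below uses the 20 scores from the frequency distribution table earlier in these notes; the function name `percentile_rank` is hypothetical, and the sketch deliberately ignores real limits and interpolation.

```python
# Scores from the frequency distribution table example in these notes.
scores = [3, 1, 1, 2, 5, 4, 4, 5, 3, 5, 3, 2, 3, 4, 3, 3, 4, 3, 2, 3]


def percentile_rank(data, x):
    """Percentage of scores in `data` that are at or below `x`."""
    at_or_below = sum(1 for score in data if score <= x)
    return 100 * at_or_below / len(data)


# 13 of the 20 scores are at or below 3, matching the c% entry for X = 3.
rank_of_3 = percentile_rank(scores, 3)
print(rank_of_3)
```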
Frequency distribution graphs/charts
- x-axis (abscissa): lists the measurement scale categories
- y-axis (ordinate): lists the frequencies
histogram: a bar is drawn above each X value so that the height of the bar corresponds to the frequency of the score. If the data are from an interval or ratio scale, the bars are drawn so that adjacent bars touch each other. The touching bars produce a continuous figure, which emphasizes the continuity of the variable.
bar graph: like a histogram, a bar is drawn above each X value so that the height of the bar corresponds to the frequency of the score. If the data are from a nominal or ordinal scale, the graph is constructed with space between the bars.
frequency distribution polygon: used with interval or ratio scales instead of a histogram. A single dot is drawn above each score so that the height of the dot corresponds to the frequency, and the dots are then connected with straight lines.
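To get a feel for how frequencies become bar lengths, here is a rough text-only sketch using the 20 scores from the table example. It is rotated relative to a conventional histogram (scores run down the page and bars extend to the right), and the function name `text_histogram` is hypothetical. It assumes integer scores.

```python
from collections import Counter

# Scores from the frequency distribution table example in these notes.
scores = [3, 1, 1, 2, 5, 4, 4, 5, 3, 5, 3, 2, 3, 4, 3, 3, 4, 3, 2, 3]


def text_histogram(data):
    """Return one line per score value, with a bar of '#' whose length is f."""
    counts = Counter(data)
    lines = []
    for x in range(min(data), max(data) + 1):
        lines.append(f"{x} | " + "#" * counts.get(x, 0))
    return lines


for line in text_histogram(scores):
    print(line)
```

A plotting library would normally be used for a real figure; this sketch only shows the mapping from each frequency to the size of its bar.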
The Shape of a Frequency Distribution
Measures of Shape of a Distribution
1. Skewness
Measures the extent to which a distribution of values deviates from symmetry around the mean. A value of zero represents a symmetrical or balanced distribution. As a common rule of thumb, skewness values between -1.0 and +1.0 are considered excellent for most purposes, and values between -2.0 and +2.0 are also acceptable in many cases, especially for basic research.
Positive skew--The tail is in the positive direction. Most scores pile up at the low end of the scale, and a relatively small number of high scores stretch out the right-hand tail.
Negative skew--The tail is in the negative direction. Most scores pile up at the high end of the scale, and a relatively small number of low scores stretch out the left-hand tail.
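One simple way to put a number on skewness is the moment-based formula, the third central moment divided by the second central moment raised to the power 1.5. The sketch below assumes that formula with divide-by-N moments; statistical packages often apply small-sample corrections, so their output may differ slightly. The function name `skewness` is hypothetical.

```python
# Scores from the frequency distribution table example in these notes.
scores = [3, 1, 1, 2, 5, 4, 4, 5, 3, 5, 3, 2, 3, 4, 3, 3, 4, 3, 2, 3]


def skewness(data):
    """Moment-based skewness: m3 / m2**1.5, using divide-by-N moments.

    Assumption: the data are not all identical (otherwise m2 is zero).
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


skew = skewness(scores)
print(round(skew, 3))
```

For these scores the result is slightly negative and well inside the +/- 1.0 range, so the distribution is close to symmetrical.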
2. Kurtosis
Measures the "peakedness" or "flatness" of a distribution relative to a normal distribution. A value of zero indicates that the shape is close to that of a normal distribution (this assumes the excess-kurtosis scale, on which 3 has been subtracted so that the normal distribution scores zero). A positive value indicates a distribution that is more peaked than normal; a negative value indicates a distribution that is flatter than normal. As with skewness, a common rule of thumb treats kurtosis values between -1.0 and +1.0 as excellent for most purposes, and values between -2.0 and +2.0 as acceptable in many cases, especially for basic research.
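Kurtosis can be sketched with the analogous moment-based formula: the fourth central moment divided by the squared second central moment, minus 3 so that a normal distribution scores zero. As with any simple moment formula, this uses divide-by-N moments and omits the small-sample corrections that statistical packages often apply. The function name `excess_kurtosis` is hypothetical.

```python
# Scores from the frequency distribution table example in these notes.
scores = [3, 1, 1, 2, 5, 4, 4, 5, 3, 5, 3, 2, 3, 4, 3, 3, 4, 3, 2, 3]


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2**2 - 3, using divide-by-N moments.

    Assumption: the data are not all identical (otherwise m2 is zero).
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


kurt = excess_kurtosis(scores)
print(round(kurt, 3))
```

For these scores the value is negative, indicating a shape somewhat flatter than normal, but still within the +/- 1.0 range.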