INSIGHTS INTO DATA
GOALS
1. Represent data graphically.
2. Describe data numerically.
3. Describe the relationship between two variables.
4. Identify the degree of correlation between variables.
5. Describe a linear relationship with an equation of a straight line.
6. Design, conduct and analyze ways of gathering data: surveys, simulations, experiments.
7. Use random samples in gathering data.
8. Analyze representations of data.
9. Draw conclusions based on given data sets and representations of data.
10. Recognize possible bias in sample surveys.
11. Determine whether representations of data (numerical and visual) are appropriate.
12. Become aware of the questions that should be asked when analyzing data and representations of data.
1. Some conclusions you draw from a graph may be obvious, such as the presence of clusters of data or outliers.
2. Other conclusions may be more complex, such as a description of a typical data point.
3. Careful examination of a graph may raise new questions that require more research to answer.
1. A population is a group of people or a set of objects you want to gather information about.
2. When taking a sample, it’s important to do so randomly, so each member of the population has an equal chance of being selected.
3. You can also collect data by designing and running an experiment or simulation.
4. Sampling bias should be avoided. Possible causes of bias include choosing the sample incorrectly, neglecting to account for the people who do not respond, and letting interviewers select the people they want to interview.
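The idea of a random sample can be sketched in a few lines of Python. This is only an illustration: the population of 30 numbered students and the helper name `simple_random_sample` are made up for the example and do not come from these notes.

```python
import random

# Hypothetical population: 30 students identified by the numbers 1..30.
population = list(range(1, 31))

def simple_random_sample(members, size, seed=None):
    """Return `size` distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # passing a seed makes the draw repeatable
    return rng.sample(members, size)

sample = simple_random_sample(population, 5, seed=1)
print(sample)
```

Because the computer picks the members, nobody (an interviewer, for instance) gets to choose who is included, which removes one of the causes of bias listed above.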
SECTION C: INTERPRETING GRAPHS
1. Data must be reliable and presented appropriately in order to draw accurate conclusions.
2. Pictographs, line graphs, bar graphs, histograms, scatter plots, and box plots are only a few ways of presenting data.
3. Data may be misrepresented if one or more of the following occurs:
   a. The graph’s axes are scaled improperly.
   b. The origin is excluded from the graph.
   c. Three-dimensional pictures are used inappropriately.
   d. Numbers that should not be compared are compared.
   e. Pictures that do not fit the numbers are used.
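A small calculation shows why leaving the origin off a bar graph can mislead. The values 50 and 55 and the function name `apparent_ratio` are invented for this sketch.

```python
def apparent_ratio(small, large, axis_start=0):
    """Ratio of the drawn bar heights when the vertical axis begins at axis_start."""
    return (large - axis_start) / (small - axis_start)

# With the origin included, the taller bar is drawn 1.1 times as tall.
honest = apparent_ratio(50, 55)

# With the axis starting at 45, the same bars are drawn 5 and 10 units tall.
distorted = apparent_ratio(50, 55, axis_start=45)

print(honest, distorted)
```

The data have not changed, but the second graph makes a 10% difference look like a doubling.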
SECTION D: USING PLANT GROWTH DATA
1. Mean, median, and mode can numerically describe the “typical” or “normal” value of a data set.
2. A plot over time allows you to look for trends in the day-to-day change of the data.
3. A histogram is a general picture of the data, allowing you to see clusters and gaps in the data.
4. A box plot provides a summary of five key values: the minimum, the first quartile, the median, the third quartile, and the maximum. It shows the big picture, but you lose the small details.
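The numerical summaries above can be computed with Python's standard `statistics` module. The plant heights below are made up for illustration, and the quartile rule used (the median of each half, leaving out the middle value when the count is odd) is one common convention; other rules can give slightly different quartiles.

```python
import statistics

# Hypothetical plant heights in centimetres (invented for this sketch).
heights = [2, 3, 3, 4, 5, 5, 5, 6, 7, 8, 18]

mean = statistics.mean(heights)      # sum of the values divided by the count
median = statistics.median(heights)  # middle value of the sorted data
mode = statistics.mode(heights)      # most frequent value

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum).

    Assumes at least two values. Quartiles are the medians of the
    lower and upper halves of the sorted data.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the middle position
    upper = data[-half:]   # values above the middle position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

summary = five_number_summary(heights)
print(mean, median, mode)
print(summary)
```

In this made-up data set the single tall plant pulls the mean above the median, while the box plot values show it only as the maximum.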
1. Scatter plots show information about pairs of data and whether a relationship exists between them (look at the trend).
2. Points that lie close to a straight line show a strong correlation. Widely scattered points show a weak correlation.
3. A strong correlation does not mean there is a cause-and-effect relationship. Other tests are needed to see whether a change in one variable (the independent variable, on the x-axis) causes a change in the other (the dependent variable, on the y-axis).
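One common way to put a number on the degree of correlation is Pearson's correlation coefficient r, which ranges from -1 to 1. These notes do not name a specific measure, so treat the choice of r as an assumption; the data sets below are also invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired lists of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours of light per day and two sets of plant heights.
hours_of_light = [1, 2, 3, 4, 5]
height_close = [2.1, 3.9, 6.2, 7.8, 10.1]   # points nearly on a line
height_scattered = [5, 1, 6, 2, 4]          # points with no clear trend

r_close = pearson_r(hours_of_light, height_close)
r_scattered = pearson_r(hours_of_light, height_scattered)
print(round(r_close, 3), round(r_scattered, 3))
```

A value of r near 1 or -1 matches points lying close to a straight line, and a value near 0 matches scattered points. Even a value near 1 says nothing about whether one variable causes the other.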
SECTION F: LINES THAT SUMMARIZE DATA
1. If the relationship between two variables appears to be linear, a line can be found to describe it.
2. Straight lines can be used to predict unknown values, check existing values, and compare different values.
3. The slope of the line can be expressed in terms of the data, and can lead to statements such as ‘When the ___ increases by ___, the ___ increases or decreases by ___.’
4. A best-fit line drawn by hand is used to capture the trend of the data. The least squares linear regression line is based on the means of the data: it passes through the point (mean of x, mean of y). The median-fit line uses medians of groups of data points as representative points. A curve is used to describe data that does not change at a constant rate, such as growth.
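As a rough sketch of how a least squares line can be found and then used for prediction, the code below applies the standard slope and intercept formulas to made-up plant data. The names `least_squares_line`, `days`, and `heights_cm` are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # forces the line through the point of means
    return slope, intercept

# Hypothetical data: day number and plant height in centimetres.
days = [0, 2, 4, 6, 8]
heights_cm = [1.0, 2.1, 2.9, 4.2, 4.8]

slope, intercept = least_squares_line(days, heights_cm)

def predict(day):
    """Height predicted by the fitted line for a given day."""
    return slope * day + intercept

print(f"When the day increases by 1, the height increases by {slope:.2f} cm.")
print(f"Predicted height on day 10: {predict(10):.2f} cm")
```

The first printed line is the slope expressed in terms of the data, in the form of the statement in item 3. Predicting day 10 goes beyond the measured days, so it is only reasonable if the linear trend continues.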
MY NOTES
(You may want to make some notes here to remember how to generate random numbers, create a histogram, specifics on a box plot, or how to find the different lines of best fit.)