Data analysis with graphs
Statistics
Statistics is the gathering, organization, analysis, and presentation of numerical information
Raw Data
Raw data is the unprocessed information collected for a study
Variable
A variable is the quantity that is being measured
Frequency table
Gives an overview of the distribution of values of the variable and reveals trends in the data
Frequency diagram
Gives an overview of the distribution of values of the variable and reveals trends in the data
Classes
Bar graph
Relative frequency
Shows the frequency of a data group as a fraction or percent of the whole data set
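A frequency table with relative frequencies can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the function name relative_frequencies and the survey responses are made up for the example.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {value: (frequency, relative frequency)} for a list of observations."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, count / total) for value, count in counts.items()}

# Hypothetical survey responses, invented for illustration only.
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O"]
table = relative_frequencies(blood_types)

for value, (count, share) in sorted(table.items()):
    # Each relative frequency is the count divided by the size of the whole data set.
    print(f"{value:>2}  frequency={count}  relative={share:.1%}")
```

The relative frequencies always add up to 1 (or 100%), which is a quick check that the table is complete.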
Categorical data
Given labels rather than being measured numerically
Example: surveys of blood types, citizenship, favorite foods
Circle graphs
Often used instead of bar graphs to illustrate categorical data
Pictographs
Statistics Concept Map
Indices
Another way to summarize data and recognize trends
Index
Relates the value of a variable to a base level, which is often the value on a particular date
The base level is set so that the index produces numbers that are easy to understand and compare
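As a rough sketch of how an index relates values to a base level: the example below assumes the common convention of setting the base period to 100, which these notes do not state explicitly. The function name index_value and the basket prices are hypothetical.

```python
def index_value(value, base_value, base_level=100):
    """Express value relative to base_value, scaled so the base period equals base_level."""
    return value / base_value * base_level

# Made-up prices of a basket of goods by year, for illustration only.
prices = {2000: 80.0, 2005: 92.0, 2010: 100.0}
base_year = 2000  # assumption: the year 2000 is chosen as the base

index = {year: index_value(price, prices[base_year]) for year, price in prices.items()}

for year, level in index.items():
    print(f"{year}: {level:.1f}")
```

With the base set to 100, an index of 125 reads directly as "25% higher than in the base year", which is what makes the numbers easy to compare.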
Time series graph
Used to show how indices change over time
Consumer price index
The most widely reported of the economic indices
Inflation
An increase in prices, which corresponds to a decrease in the value of money
The consumer price index and the cost of living index are not the same
Readability index
The Gunning fog index is a measure of readability
Used to estimate the years of schooling required to read the material easily
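These notes do not give the formula, so the sketch below uses the commonly cited form of the Gunning fog index: 0.4 times the sum of the average sentence length and the percentage of complex words (usually taken to mean words of three or more syllables). The function name and the counts are hypothetical, and the counts are passed in directly rather than extracted from text.

```python
def gunning_fog(num_words, num_sentences, num_complex_words):
    """Estimate years of schooling needed, using the commonly cited Gunning fog formula."""
    average_sentence_length = num_words / num_sentences
    percent_complex_words = 100 * num_complex_words / num_words
    return 0.4 * (average_sentence_length + percent_complex_words)

# Made-up counts for a passage: 100 words, 5 sentences, 10 complex words.
fog = gunning_fog(num_words=100, num_sentences=5, num_complex_words=10)
print(f"Estimated years of schooling: {fog:.1f}")
```

Counting complex words reliably needs a syllable counter, which is left out here to keep the sketch short.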
Sampling Techniques
Population
Refers to all individuals who belong to a group being studied
Example: students in your school
Sampling frame
The group of individuals who actually have a chance of being selected
Simple Random Sample
Every member of the population has an equal chance of being selected, and the selection of any particular individual does not affect the chances of the others
Select members by drawing names at random, or by assigning each member a number and then using a random number generator to pick
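The random number generator approach might look like the sketch below, which uses random.sample from Python's standard library. The student list and the function name simple_random_sample are made up, and the seed parameter is only there to make the example repeatable.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each with an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Hypothetical population: 50 numbered students.
students = [f"student_{i}" for i in range(1, 51)]
sample = simple_random_sample(students, 5, seed=1)
print(sample)
```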
Systematic Sample
You go through the population sequentially and select members at regular intervals
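A possible sketch of systematic sampling follows. It assumes the interval is the population size divided by the sample size and that the starting point is chosen at random within the first interval; both are common conventions rather than something these notes specify. The names are hypothetical.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select members at a regular interval, starting from a random offset."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    interval = len(population) // sample_size
    start = random.Random(seed).randrange(interval)  # random start within the first interval
    return population[start::interval][:sample_size]

# Hypothetical population: 50 numbered students in list order.
students = [f"student_{i}" for i in range(1, 51)]
sample = systematic_sample(students, 5, seed=3)
print(sample)
```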
Stratified Sample
A stratum is a group of members who share common characteristics
Examples: gender, age, education level, etc.
A stratified sample has the same proportion of members from each stratum as the population
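Proportional allocation across strata could be sketched as follows. The function stratified_sample, the stratum_of callback, and the grade data are all hypothetical. Because each stratum's share is rounded to a whole number, the total can differ slightly from the requested size when the proportions do not divide evenly.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, sample_size, seed=None):
    """Sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)

    sample = []
    for group in strata.values():
        # Number drawn from this stratum is proportional to its size.
        k = round(sample_size * len(group) / len(members))
        sample.extend(rng.sample(group, k))
    return sample

# Made-up population: 60 students in grade 9 and 40 in grade 10.
students = [(f"s{i}", "grade 9") for i in range(60)]
students += [(f"s{i}", "grade 10") for i in range(60, 100)]

sample = stratified_sample(students, lambda s: s[1], 10, seed=42)
print(sample)
```

Here the population is 60% grade 9 and 40% grade 10, so a sample of 10 draws 6 and 4 members respectively.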
Cluster Sample
If certain groups are likely to be representative of the entire population, you can use a random selection of such groups as a cluster sample
Example: a fast food chain could save time and money by surveying all its employees at randomly selected locations instead of surveying randomly selected employees throughout the chain
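The fast food example can be sketched like this: whole locations are chosen at random, and then every employee at a chosen location is surveyed. The location names, employee names, and the function cluster_sample are invented for illustration.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical restaurant locations and their employees.
locations = {
    "Downtown": ["Ana", "Ben", "Cy"],
    "Airport": ["Dee", "Eli"],
    "Mall": ["Fay", "Gus", "Hal"],
    "Harbour": ["Ivy"],
}

surveyed = cluster_sample(locations, 2, seed=7)
print(surveyed)
```

The randomness applies only to which locations are picked; no further selection happens inside a location.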
Multi Stage Sample
Uses several levels of random sampling
Example: if the population consisted of all Ontario households, you could first randomly sample from all cities and townships, then from all subdivisions or blocks within them, and finally from all the houses within the selected subdivisions or blocks
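The Ontario example suggests three levels of random selection, which might be sketched as below. The nested dictionary of cities, blocks, and houses is fabricated for the example, and multi_stage_sample is a hypothetical helper name.

```python
import random

def multi_stage_sample(region, num_cities, num_blocks, num_houses, seed=None):
    """Randomly sample cities, then blocks within each city, then houses within each block."""
    rng = random.Random(seed)
    selected = []
    for city in rng.sample(sorted(region), num_cities):
        blocks = region[city]
        for block in rng.sample(sorted(blocks), num_blocks):
            selected.extend(rng.sample(blocks[block], num_houses))
    return selected

# Made-up region: 5 cities, each with 4 blocks of 10 houses.
region = {
    f"city_{c}": {
        f"block_{c}_{b}": [f"house_{c}_{b}_{h}" for h in range(10)]
        for b in range(4)
    }
    for c in range(5)
}

households = multi_stage_sample(region, num_cities=2, num_blocks=2, num_houses=3, seed=0)
print(households)
```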
Voluntary Response Sample
The researcher simply invites any member of the population to participate in the survey
Convenience Sample
Sample is selected because it is easily accessible
Bias in Surveys
Sample Bias
This occurs when the sampling frame does not reflect the characteristics of the population
Non Response Bias
This occurs when particular groups are underrepresented in a survey because they choose not to participate
Measurement Bias
This occurs when the data collection method consistently either underestimates or overestimates a characteristic of the population
Response Bias
This occurs when participants in a survey deliberately give false answers
Measures of Central Tendency
Used to summarize a set of data
Mean
Defined as the sum of the values of a variable divided by the number of values
The mean is also known as the average
Median
The middle value of the data when they are ranked from highest to lowest; with an even number of values, the midpoint between the two middle values
Mode
The value that occurs most frequently in a distribution
Some distributions do not have a mode, while others have several
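The three measures can be computed with Python's standard statistics module, as in the sketch below. The data set is made up. Note that statistics.multimode (Python 3.8+) returns every value tied for the highest count, so a data set in which each value appears once yields all values rather than "no mode".

```python
import statistics

# Hypothetical data set, invented for illustration.
data = [2, 3, 3, 5, 7, 7, 8]

mean = statistics.mean(data)        # sum of the values divided by the number of values
median = statistics.median(data)    # middle value once the data are ranked
modes = statistics.multimode(data)  # all values that occur most frequently

print(f"mean={mean}, median={median}, modes={modes}")
```

This data set has two modes, 3 and 7, since each occurs twice.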
Outliers
Values distant from the majority of the data
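One reason outliers matter is that they pull the mean much more than the median. The small sketch below illustrates this with made-up numbers; it is not a rule for detecting outliers.

```python
import statistics

# Made-up values, plus one value far from the rest.
typical = [30, 32, 35, 36, 38]
with_outlier = typical + [400]

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")
print(f"median: {median_before} -> {median_after}")
```

The single distant value nearly triples the mean, while the median moves only from 35 to 35.5.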