Data analysis with graphs

Statistics

Statistics is the gathering, organization, analysis, and presentation of numerical information

Raw Data

Raw data is the unprocessed information collected for a study

Variable

A variable is the quantity that is being measured

Frequency table

Gives an overview of the distribution of values of the variable and reveals trends in the data

Frequency diagram

Gives an overview of the distribution of values of the variable and reveals trends in the data

Classes

Bar graph

Relative frequency

Shows the frequency of a data group as a fraction or percent of the whole data set
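
As a rough illustration, a frequency table with relative frequencies can be sketched in Python. The function name and the marks below are made up for this example and are not from the notes.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {value: (frequency, relative frequency)} for a list of values."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, count / total) for value, count in counts.items()}

# Hypothetical quiz marks out of 10 (made-up data for illustration only)
marks = [7, 8, 7, 9, 10, 8, 7, 6, 8, 7]
table = relative_frequencies(marks)

# Print one row per distinct mark, with the relative frequency as a percent
for value, (frequency, relative) in sorted(table.items()):
    print(f"mark={value:>2}  frequency={frequency}  relative frequency={relative:.0%}")
```

The relative frequencies of all the data groups add up to 1 (or 100%).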

Categorical data

Given labels rather than being measured numerically

Example: surveys of blood types, citizenship, favorite foods

Circle graphs

Often used instead of bar graphs to illustrate categorical data

Pictographs

Statistics Concept Map

Indices

Another way to summarize data and recognize trends

Index

Relates the value of a variable to a base level, which is often the value on a particular date

The base level is set so that the index produces numbers that are easy to understand and compare
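
A minimal sketch of how an index could be computed, assuming the common convention that the base level is set to 100. The function name, the years, and the prices are hypothetical.

```python
def index_series(values, base_period):
    """Express each value as a percentage of the base-period value (base = 100)."""
    base_value = values[base_period]
    return {
        period: round(value / base_value * 100, 1)
        for period, value in values.items()
    }

# Hypothetical price of a basket of goods by year (made-up numbers)
prices = {2020: 50.0, 2021: 52.0, 2022: 55.0}
index = index_series(prices, base_period=2020)

for year, level in index.items():
    print(year, level)
```

With the base at 100, an index value of 110 reads directly as "10% above the base level", which is what makes the numbers easy to compare.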

Time series graph

Used to show how indices change over time

Consumer price index

The most widely reported of the economic indices

Inflation

An increase in prices, which corresponds to a decrease in the value of money
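
The link between rising prices and the falling value of money can be checked with a little arithmetic. This is only a sketch: the function name and the 25% figure are assumptions chosen for illustration.

```python
def purchasing_power_change(price_increase_percent):
    """Percent change in what one unit of money buys after prices rise."""
    # If prices are multiplied by (1 + r), each dollar buys 1 / (1 + r) as much
    ratio = 1 / (1 + price_increase_percent / 100)
    return (ratio - 1) * 100

# Hypothetical case: prices rise by 25%
change = purchasing_power_change(25)
print(f"Each dollar buys {abs(change):.0f}% less than before")
```

Note that a 25% rise in prices is a 20% drop in the value of money, not a 25% drop.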

The consumer price index and the cost of living index are not the same

Readability index

The Gunning fog index is a measure of readability

Used to estimate the years of schooling required to read the material easily
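
The notes do not give the formula. The sketch below uses the commonly cited version, 0.4 × (average words per sentence + percent of complex words), where "complex" usually means three or more syllables. The function name and the counts are hypothetical, and counting syllables is left to the reader.

```python
def gunning_fog(total_words, total_sentences, complex_words):
    """Commonly cited Gunning fog formula, computed from pre-counted totals."""
    average_sentence_length = total_words / total_sentences
    percent_complex = 100 * complex_words / total_words
    return 0.4 * (average_sentence_length + percent_complex)

# Hypothetical passage: 100 words, 5 sentences, 10 complex words
fog = gunning_fog(total_words=100, total_sentences=5, complex_words=10)
print(f"Estimated years of schooling needed: {fog:.1f}")
```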

Sampling Techniques

Population

Refers to all individuals who belong to a group being studied

Example: students in your school

Sampling frame

The group of individuals who actually have a chance of being selected

Simple Random Sample

Every member of the population has an equal chance of being selected, and the selection of any particular individual does not affect the chances of any other individual being selected

Select members by drawing names randomly, or by assigning each member a number and then using a random number generator to pick
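
One way to sketch the random number generator approach, using Python's standard `random` module. The student list, the function name, and the seed are assumptions made for the example.

```python
import random

def simple_random_sample(members, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(members, sample_size)

# Hypothetical population: 100 numbered students
students = [f"student_{number}" for number in range(1, 101)]
sample = simple_random_sample(students, 10, seed=1)
print(sample)
```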

Systematic Sample

You go through the population sequentially and select members at regular intervals
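
A minimal sketch of a systematic sample. Starting at a random position within the first interval is an assumption (a common practice, not stated in the notes), and the sketch assumes the sample size is no larger than the population. All names are hypothetical.

```python
import random

def systematic_sample(members, sample_size, seed=None):
    """Select every k-th member, where k = population size // sample size."""
    interval = len(members) // sample_size
    rng = random.Random(seed)
    start = rng.randrange(interval)  # assumed: random start in the first interval
    return members[start::interval][:sample_size]

# Hypothetical population: 100 numbered students, listed in order
students = list(range(1, 101))
sample = systematic_sample(students, 10, seed=1)
print(sample)
```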

Stratified Sample

A stratum is a group of members who share common characteristics

Example: gender, age, education level, etc.

A stratified sample has the same proportion of members from each stratum as the population
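
The proportional idea can be sketched as follows: take the same fraction from every stratum. The function name, the grade labels, and the 10% fraction are assumptions for illustration, and rounding may shift the counts slightly when a stratum size does not divide evenly.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, fraction, seed=None):
    """Randomly take the same fraction of members from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)
    sample = []
    for group in strata.values():
        count = round(len(group) * fraction)
        sample.extend(rng.sample(group, count))
    return sample

# Hypothetical school: 60 students in grade 9 and 40 in grade 10
students = [(f"s{n}", 9) for n in range(60)] + [(f"s{n}", 10) for n in range(60, 100)]
sample = stratified_sample(students, stratum_of=lambda s: s[1], fraction=0.1, seed=1)
print(sample)
```

Here the population is 60% grade 9 and 40% grade 10, and the sample of 10 keeps that 6 to 4 split.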

Cluster Sample

If certain groups are likely to be representative of the entire population, you can use a random selection of such groups as a cluster sample

Example: a fast food chain could save time and money by surveying all its employees at randomly selected locations instead of surveying randomly selected employees throughout the chain
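
The fast food example could be sketched like this: choose locations at random, then survey everyone at those locations. The location and employee names are invented, as is the function name.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose whole clusters and include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical chain: 5 locations with 4 employees each
locations = {
    f"location_{i}": [f"location_{i}_employee_{j}" for j in range(1, 5)]
    for i in range(1, 6)
}
sample = cluster_sample(locations, number_of_clusters=2, seed=1)
print(sample)
```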

Multi Stage Sample

Uses several levels of random sampling

Example: if the population consisted of all Ontario households, you could first take a random sample of cities and townships, then a random sample of subdivisions or blocks within them, and finally a random sample of houses within those blocks
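
A rough sketch of the three-stage idea (cities, then blocks, then households). The nested data, the function name, and the number of picks at each stage are all hypothetical.

```python
import random

def multi_stage_sample(region, picks_per_stage, seed=None):
    """region is a nested dict: city -> block -> list of households."""
    rng = random.Random(seed)
    city_picks, block_picks, household_picks = picks_per_stage
    sample = []
    for city in rng.sample(sorted(region), city_picks):
        for block in rng.sample(sorted(region[city]), block_picks):
            sample.extend(rng.sample(region[city][block], household_picks))
    return sample

# Hypothetical region: 4 cities, 3 blocks per city, 5 households per block
region = {
    f"city_{c}": {
        f"block_{b}": [f"city_{c}/block_{b}/house_{h}" for h in range(1, 6)]
        for b in range(1, 4)
    }
    for c in range(1, 5)
}
sample = multi_stage_sample(region, picks_per_stage=(2, 2, 3), seed=1)
print(len(sample))
```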

Voluntary Response Sample

The researcher simply invites any member of the population to participate in the survey

Convenience Sample

Sample is selected because it is easily accessible

Bias in Surveys

Sample Bias

This occurs when the sampling frame does not reflect the characteristics of the population

Non Response Bias

This occurs when particular groups are underrepresented in a survey because they choose not to participate

Measurement Bias

This occurs when the data collection method consistently either underestimates or overestimates a characteristic of the population

Response Bias

This occurs when participants in a survey deliberately give false answers

Measures of Central Tendency

Used to summarize a set of data

Mean

Defined as the sum of the values of a variable divided by the number of values

Mean is also known as average

Median

Middle value of the data when they are ranked from highest to lowest; with an even number of values, the midpoint of the two middle values

Mode

The value that occurs most frequently in a distribution

Some distributions do not have a mode while others have several
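
The three measures can be tried out with Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The marks are made-up data.

```python
import statistics

# Hypothetical test marks (made-up data for illustration only)
marks = [60, 70, 70, 75, 80, 80, 90]

mean = statistics.mean(marks)        # sum of the values / number of values
median = statistics.median(marks)    # middle value of the ranked data
modes = statistics.multimode(marks)  # every value tied for most frequent

print("mean:", mean)
print("median:", median)
print("modes:", modes)
```

This data set has two modes. Note that when every value occurs exactly once, `multimode` returns all of the values, whereas by the usual convention such a distribution is said to have no mode.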

Outliers

Values distant from the majority of the data
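
A small sketch of why outliers matter when summarizing data: one distant value can pull the mean a long way while the median barely moves. The salary figures are invented for the example.

```python
import statistics

# Hypothetical salaries in thousands of dollars (made-up data)
salaries = [30, 32, 35, 36, 37]
with_outlier = salaries + [250]  # one value far from the rest

mean_before = statistics.mean(salaries)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(salaries)
median_after = statistics.median(with_outlier)

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```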

Discrete variable

Can take only certain separate values

Example: number of students in each class

Continuous variable

Can take any value within a given range

Example: height of a student in your school

Intervals

When the number of values is large, data are usually grouped into classes or intervals

It is usually convenient to use from 5 to 20 equal intervals that cover the entire range from the smallest value to the largest
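
Grouping data into equal intervals could be sketched as below. The function name and the heights are hypothetical, the values are assumed not to be all identical, and placing the largest value in the last interval is a design choice made for this example.

```python
def group_into_intervals(values, number_of_intervals):
    """Count how many values fall into each of several equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / number_of_intervals
    counts = [0] * number_of_intervals
    for value in values:
        position = int((value - low) / width)
        # The largest value sits on the upper edge, so keep it in the last interval
        position = min(position, number_of_intervals - 1)
        counts[position] += 1
    return [
        (low + i * width, low + (i + 1) * width, counts[i])
        for i in range(number_of_intervals)
    ]

# Hypothetical student heights in centimetres (made-up data)
heights = [150, 152, 155, 158, 160, 161, 163, 165, 168, 170, 172, 175]
intervals = group_into_intervals(heights, number_of_intervals=5)

for lower, upper, frequency in intervals:
    print(f"{lower:.0f} to {upper:.0f}: {frequency}")
```

The resulting counts are the bar heights that a histogram of the same data would show.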

Histogram

Special form of bar graph. The bars in a histogram are connected and represent a continuous range of values

They are used for variables whose values can be arranged in numerical order

Example: weights, temperature or travel time

Frequency polygon

Illustrates the same information as a histogram or bar graph

Researchers, advertisers, professors and sport announcers all use statistics

Statistics are used to report on a wide variety of variables, including prices and wages, ultraviolet levels in sunlight, and even the readability of textbooks