Statistic2

Statistic

Testing a claim

confidence intervals and two-sided tests

a confidence interval cannot be used in place of a significance test for a one-sided test

z test for the population mean when SD is known

statistical significance

significance level

significance test

p value

test statistics

hypothesis

alternative

null
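
As an illustration of the nodes above, a one-sample z test for a population mean with known SD might be sketched as follows. The function name `z_test`, its parameters, and the sample numbers are made up for this example; the null hypothesis is H0: mu = mu0, and the P-value comes from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """One-sample z test for H0: mu = mu0 when the population SD is known."""
    # Test statistic: how many standard errors the sample mean is from mu0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # Ha: mu > mu0
        p_value = 1 - cdf
    elif alternative == "less":       # Ha: mu < mu0
        p_value = cdf
    else:                             # Ha: mu != mu0
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


# Hypothetical data: n = 100, sample mean 52, known sigma 10, H0: mu = 50.
z, p = z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(round(z, 2), round(p, 4))
```

With these made-up numbers z = 2.0 and the two-sided P-value is about 0.0455, so the result is statistically significant at the 0.05 level but not at the 0.01 level.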

Estimating with confidence

when population sd is known

independence

normality

srs

when population SD is unknown

t distribution, one sample

confidence level C

confidence interval: statistic ± margin of error
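
A minimal sketch of the "statistic ± margin of error" recipe for a population mean when the population SD is known. The function name `z_interval` and the numbers are assumptions for illustration only; the critical value z* is taken from the standard normal distribution for confidence level C.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for mu: sample mean +/- z* * sigma / sqrt(n)."""
    # z* leaves area (1 - C) / 2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical data: n = 100, sample mean 52, known sigma 10, C = 0.95.
low, high = z_interval(sample_mean=52.0, sigma=10.0, n=100, confidence=0.95)
print(round(low, 2), round(high, 2))
```

Here the interval is roughly 50.04 to 53.96. When the population SD is unknown, the same pattern applies with the sample SD s and a t* critical value with n - 1 degrees of freedom in place of z*; the Python standard library has no t distribution, so that case is not shown.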

Sampling distribution

bias and variability

variability described by spread

larger samples give smaller spread

determined by sampling design

unbiased if the mean of the sampling distribution equals the parameter (for the sample mean: μ_x̄ = μ)

sample proportions

distribution of values taken by the statistic in all possible samples of the same size from the same population

sample mean

sampling variability

statistic

parameters
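
The claim that larger samples give smaller spread can be checked with a small simulation. This is only a sketch: the population (normal with mean 50 and SD 10), the seed, and the number of repetitions are arbitrary choices made for the example.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)      # fixed seed so the run is repeatable
MU, SIGMA = 50.0, 10.0      # assumed population parameters


def sample_means(n, repetitions=2000):
    """Approximate the sampling distribution of the mean for samples of size n."""
    return [
        mean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(repetitions)
    ]


means_10 = sample_means(10)
means_100 = sample_means(100)

# Center: close to MU for both sample sizes (the sample mean is unbiased).
# Spread: close to SIGMA / sqrt(n), so it shrinks as n grows.
print(mean(means_10), stdev(means_10), SIGMA / sqrt(10))
print(mean(means_100), stdev(means_100), SIGMA / sqrt(100))
```

Both simulated distributions are centered near the parameter, while the spread for n = 100 is clearly smaller than for n = 10.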

Binomial & Geometric distribution

geometric distribution

mean and standard deviation

P(X = n) = (1 - p)^(n-1) p

binomial distribution

normal approximation

condition

representations

mean and standard deviation

formulas

B(n,p)

conditions

continuity correction

Bernoulli distribution

X = 1 for success, X = 0 for failure

two outcomes of interest

random phenomena
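
The binomial and geometric formulas in this branch can be tried out with a few lines of Python. This is an illustrative sketch: the helper names `binom_pmf` and `geom_pmf` and the values n = 50, p = 0.4 are assumptions, and the "np and n(1 - p) both at least 10" rule of thumb is one common version of the condition for the normal approximation.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geom_pmf(n, p):
    """P(X = n): the first success occurs on trial n."""
    return (1 - p) ** (n - 1) * p


n, p = 50, 0.4
binom_mean = n * p                      # mean of B(n, p)
binom_sd = sqrt(n * p * (1 - p))        # standard deviation of B(n, p)

geom_mean = 1 / p                       # mean of the geometric distribution
geom_sd = sqrt(1 - p) / p               # its standard deviation

# Exact P(X <= 20) versus the normal approximation with continuity
# correction (use 20.5 instead of 20). Here np = 20 and n(1 - p) = 30.
exact = sum(binom_pmf(k, n, p) for k in range(0, 21))
approx = NormalDist(binom_mean, binom_sd).cdf(20.5)
print(exact, approx)
```

For these values the exact and approximate probabilities agree to about two decimal places.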

More about relationships between two variables

relations in categorical data

Simpson's paradox: an association that holds for all groups can reverse when the groups are combined

two way table

conditional distribution

entry/row total

entry/column total

marginal distribution

column sums

row sums

establishing causation

criteria for causation

the alleged cause precedes the effect in time

strong association

larger values of x go with larger values of y (stronger doses give stronger responses)

the alleged cause is plausible

consistent association

confounding: z ~ y and x ? y (the effects of x and z on y cannot be told apart)

common response: z ~ x and z ~ y

causation: x ~ y, usually established by an experiment

transforming the variable

power law model

take the logarithm of both sides: ln y = ln a + p ln x

y = a x^p

exponential growth

increase by a fixed percent

ln y = ln a + x ln b

linear growth

increase by a fixed amount

y = a b^x
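
Both transformations turn a curved relationship into a straight line that least squares can fit. The sketch below uses made-up, noise-free data (y = 2 x^1.5 and y = 3 * 1.2^x) and a hand-written helper `fit_line`; these names and values are assumptions for illustration.

```python
from math import exp, log


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Power law y = a x^p: regress ln y on ln x; the slope is p, the intercept ln a.
power_ys = [2.0 * x**1.5 for x in xs]
ln_a, p = fit_line([log(x) for x in xs], [log(y) for y in power_ys])
a_power = exp(ln_a)

# Exponential growth y = a b^x: regress ln y on x; the slope is ln b.
exp_ys = [3.0 * 1.2**x for x in xs]
ln_a2, ln_b = fit_line(xs, [log(y) for y in exp_ys])
a_exp, b = exp(ln_a2), exp(ln_b)

print(a_power, p)   # recovers a = 2 and p = 1.5 up to rounding
print(a_exp, b)     # recovers a = 3 and b = 1.2 up to rounding
```

With real data the points will not fall exactly on a line after the transformation, so the recovered a, p, and b are estimates rather than exact values.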

Examining relationships

assessing model quality

residual plot

no obvious pattern

mean of residuals is always 0

assess how well the regression line fits the data

residuals plotted against x (or against the predicted values ŷ)

coefficient of determination

lurking variable: neither x nor y, but influences the interpretation of the relationship between x and y

regression line

extrapolation: predicting outside the range of x values; often not accurate

predict y

r^2: the proportion of the variation in y that is explained by the least-squares regression line relating y and x

line passes through (x̄, ȳ)

correlation

as r moves away from 0 toward ±1, the linear relationship gets stronger

r always lies between -1 and 1; r = ±1 means a perfect straight-line relationship

r: measures the direction and strength of a linear relationship

scatterplots

different colors or symbols for categories

overall pattern: direction (+ or - association), form (linear), strength; outliers

data

explanatory variable x, response variable y

categorical or quantitative
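
The correlation, the least-squares regression line, its residuals, and r^2 can all be computed from the same sums. The five data points below are invented for illustration, and the variable names are this sketch's own.

```python
from math import sqrt

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # explanatory variable (hypothetical data)
ys = [2.0, 4.0, 5.0, 4.0, 5.0]   # response variable (hypothetical data)

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Correlation: direction and strength of the linear relationship.
r = sxy / sqrt(sxx * syy)

# Least-squares line y_hat = intercept + slope * x.
# slope = sxy / sxx, which equals r * s_y / s_x.
slope = sxy / sxx
intercept = y_bar - slope * x_bar   # forces the line through (x_bar, y_bar)

# Residual = observed y minus predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Coefficient of determination: share of the variation in y explained by the line.
r_squared = 1 - sum(e**2 for e in residuals) / syy

print(slope, intercept, r, r_squared, sum(residuals))
```

For this data the line is y_hat = 2.2 + 0.6x, r is about 0.775, r^2 is 0.6, and the residuals sum to 0 up to rounding error.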

Describing location in a distribution

density curve

median: equal-areas point

mean: balance point

area under the curve

proportion

total is 1

normal distribution

standard normal distribution

N(0,1)

no shape change for linear transformation

probability density function

empirical rule, N(μ, σ)

99.7% fall within 3σ of μ

95% fall within 2σ of μ

68% fall within 1σ of μ

symmetric, unimodal, and bell-shaped

assessing normality

normal probability plot, linear/straight line

proportion of observation, empirical rule

graphical display-bell shaped

measures of relative standing

percentile (less than or equal to)

z-score

Chebyshev's inequality (the distribution may be skewed): at least 100(1 - 1/k^2)% of observations fall within k standard deviations of the mean
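
To make the comparison concrete, the sketch below computes a z-score, the proportion of a normal distribution within k standard deviations of the mean, and the Chebyshev lower bound that holds for any distribution. The helper names and the example values (an observation of 130 with mean 100 and SD 15) are assumptions made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal distribution N(0, 1)


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


def normal_within(k):
    """Proportion of a normal distribution within k SDs of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def chebyshev_bound(k):
    """Minimum proportion within k SDs of the mean for ANY distribution."""
    return 1 - 1 / k**2


print(z_score(130, 100, 15))            # 2.0
print(std_normal.cdf(2.0))              # percentile of that observation
for k in (1, 2, 3):
    print(k, normal_within(k), chebyshev_bound(k))
```

The normal proportions reproduce the 68-95-99.7 rule, while Chebyshev's bound is much weaker (0%, 75%, and about 88.9% for k = 1, 2, 3) because it makes no assumption about shape.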

Exploring Data

Comparing distribution

quantitative values

side by side boxplots

back to back stemplots

categorical data: side by side bar graph

Changing unit of measure

linear transformation of x: y = a + bx

IQR: |b| × IQR

standard deviation: |b|s

median: a + bM

mean: a + bx̄
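
A quick check of these rules, using a Celsius to Fahrenheit conversion (a = 32, b = 1.8) on made-up temperatures. Measures of center are shifted and scaled; measures of spread are only scaled.

```python
from statistics import mean, median, stdev

celsius = [12.0, 15.0, 18.0, 21.0, 30.0]   # hypothetical measurements
a, b = 32.0, 1.8                            # new value = a + b * old value
fahrenheit = [a + b * c for c in celsius]

# Center: new mean = a + b * old mean, new median = a + b * old median.
print(mean(fahrenheit), a + b * mean(celsius))
print(median(fahrenheit), a + b * median(celsius))

# Spread: new SD = |b| * old SD; the shift a has no effect on spread.
print(stdev(fahrenheit), abs(b) * stdev(celsius))
```

Each pair of printed values agrees up to floating-point rounding.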

Describing graphical displays

Mean and standard deviation (for symmetric distribution, free of outliers)

standard deviation

spread, outliers & skewness

always positive or 0

variance

Five number summary: box plot (for skewed distribution)

outliers: below Q1 - 1.5 IQR or above Q3 + 1.5 IQR

Minimum

Maximum

range, IQR

median

Q3: 75%

Q1: 25%

shape

bell shaped (inverted bell)

uniform

skewed

symmetric

mode,center,spread,clusters,gaps,outliers

Display

Time plot

time on the horizontal axis

variable on vertical axis

Quantitative data

ogive

cumulative frequency

histogram

relative frequency

frequency

stem plot

trimming

splitting stem

back to back

Categorical data

dotplot

bar chart

Pie chart
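
As a small worked example for the five-number summary and the 1.5 × IQR rule in this branch, the sketch below uses an invented data set. Quartile conventions differ between textbooks and software; this example relies on the default method of Python's `statistics.quantiles`, which is an assumption and may not match a hand calculation exactly for other data sets.

```python
from statistics import quantiles

data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 10, 11, 30])   # hypothetical data

# Quartiles: Q1 (25%), median (50%), Q3 (75%).
q1, med, q3 = quantiles(data, n=4)
five_number = (data[0], q1, med, q3, data[-1])

# 1.5 * IQR rule: flag values below Q1 - 1.5 IQR or above Q3 + 1.5 IQR.
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(five_number)
print(iqr, low_fence, high_fence, outliers)
```

For this data the quartiles are 4, 7, and 10, so the IQR is 6, the fences are -5 and 19, and the value 30 is flagged as an outlier.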

The text covers the statistical analysis of relationships between variables, focusing on correlation and regression. The correlation r measures the direction and strength of a linear relationship and ranges from -1 to +1: the sign gives the direction, and values closer to ±1 indicate a stronger linear relationship.