Statistics II & III
Chapter 8: The Binomial and Geometric distributions
Binomial Distributions
Binomial Formula
Binomial Setting
1) Each observation has only 2 outcomes, "success" and "failure".
2) There is a fixed number of observations, n.
3) The n observations are all independent.
4) The probability of success, p, is the same for each observation.
It is important to recognize the situations in which binomial distributions can and cannot be used.
Sampling Distribution of a count
Choose an SRS (simple random sample) of size n from a population with proportion p of successes. When the population is much larger than the sample, the count X of successes in the sample has approximately the binomial distribution with parameters n and p.
Binomial Probability
If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, ..., n.
P(X=k) = C(n,k) * p^k * (1-p)^(n-k)
C(n,k) is known as the binomial coefficient. It counts the number of ways in which k successes can be distributed among n observations.
Probability distribution function (pdf)
Given a discrete random variable X, the pdf assigns a probability to each value of X.
Calculator function: tistat.binomPdf(n,p,X)
Example
Binomial distribution: n = 10000 balls, p = 0.2 are white balls. Take an SRS of 10 balls. What is the probability there are exactly 2 white balls?
P(X=2) = C(10,2) * (0.2)^2 * (0.8)^8 = 0.30199
tistat.binomPdf(10,0.2,2) = 0.30199
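As a cross-check outside the calculator, the same value can be computed in Python (a minimal sketch, assuming scipy is available; tistat.binomPdf plays the same role as scipy.stats.binom.pmf):

from scipy.stats import binom

# P(X = 2) for n = 10 draws with success probability p = 0.2
print(binom.pmf(2, 10, 0.2))  # ~0.30199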
Cumulative distribution function (cdf)
Given a random variable X, the cdf of X calculates the sum of probabilities for 0, 1, 2, ... up to X. It gives the probability of obtaining at most X successes in n trials.
Calculator function: tistat.binomCdf(n,p,X)
Example
Binomial distribution: p = 0.06 are out of shape. Take an SRS of 20 bears. What is the probability there are more than 3 such bears?
P(X>3) = 1 - P(X=0) - P(X=1) - P(X=2) - P(X=3) = 0.028966
tistat.binomCdf(20,0.06,4,20) = 0.028966
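The same complement can be computed from the cdf (a sketch, again assuming scipy):

from scipy.stats import binom

# P(X > 3) = 1 - P(X <= 3) for n = 20, p = 0.06
print(1 - binom.cdf(3, 20, 0.06))  # ~0.028966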
Example
A large number of red and white balls; 25% are red. If balls are picked at random, what is the least number of balls to be picked so that the probability of getting at least 1 red ball is greater than 0.95?
Let X = number of red balls.
P(X≥1) = 1 - P(X=0) = 1 - (0.75)^n
1 - (0.75)^n > 0.95
(0.75)^n < 0.05
n > 10.4133, so n = 11
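The inequality can also be solved by brute force (a minimal sketch in plain Python):

# Find the smallest n with P(at least one red) > 0.95 when P(red) = 0.25
n = 1
while 1 - 0.75**n <= 0.95:
    n += 1
print(n)  # 11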
Binomial Mean and standard Deviation
µ = np
σ = √(np(1-p))
Normal Approximation to Binomial Distributions
When n is large, the distribution of X is approximately Normal.
Can be used when np ≥ 10 and n(1-p) ≥ 10.
Most accurate when p is close to 0.5; least accurate when p is near 0 or 1.
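A quick way to see the quality of the approximation (a sketch, assuming scipy): compare an exact binomial probability with its Normal counterpart.

from math import sqrt
from scipy.stats import binom, norm

n, p = 100, 0.5                     # np = n(1-p) = 50, both >= 10
mu, sigma = n * p, sqrt(n * p * (1 - p))
print(binom.cdf(55, n, p))          # exact: ~0.8644
print(norm.cdf(55, mu, sigma))      # approximation: ~0.8413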
Geometric Distributions
Conditions
1) Each observation has only 2 outcomes, "success" and "failure".
2) The observations are all independent.
3) The probability of success, p, is the same for each observation.
4) The variable of interest, X, is the number of trials required to obtain the first success.
The number of trials in a geometric setting is not fixed but is the variable of interest.
Calculating Geometric Probabilities
The probability that the first success occurs on the nth trial:
P(X=n) = (1-p)^(n-1) * p, where p = probability of success.
Calculator functions: tistat.geomPdf(p,n), tistat.geomCdf(p,n)
Geometric Mean and standard Deviation
Mean: µ = 1/p
Variance: σ^2 = (1-p)/p^2
Probability that it takes more than n trials to see the first success: P(X>n) = (1-p)^n
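These identities are easy to sanity-check numerically (a sketch, assuming scipy; scipy.stats.geom counts trials starting at 1, matching the definition above):

from scipy.stats import geom

p = 0.2
print(geom.pmf(3, p))      # (1-p)^2 * p = 0.128
print(geom.mean(p))        # 1/p = 5.0
print(1 - geom.cdf(4, p))  # P(X > 4) = 0.8^4 = 0.4096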
Chapter 9: Sampling Distributions
Parameter and statistic
A parameter is a number that describes the population: µ, p, σ.
A statistic is a number that can be computed from the sample data without the use of any unknown parameters: x-bar, p-hat, s.
Sampling
Sampling distribution
Distribution of values taken by the statistic in all possible samples of the same size from the same population.
Example
9.5, 9.6, 9.7 (pages 571-573)
When describing a histogram:
Center: the center of the distribution is very close to the true value of p.
Shape: the overall shape is roughly symmetric and approximately Normal.
Spread: the values of p-hat range from 0.2 to 0.55.
Since the shape is approximately Normal, we can use the standard deviation to describe the spread.
Sample proportion
We often take an SRS of size n and use p-hat to estimate the unknown parameter p.
Mean of the sampling distribution: p
Standard deviation of the sampling distribution: √((p(1-p))/n)
Conditions
The formula for the standard deviation of p-hat is used only when the population is at least 10 times as large as the sample.
The Normal approximation is used when np ≥ 10 and n(1-p) ≥ 10.
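A simulation makes the sampling distribution concrete (a sketch using numpy, which is an assumption, not part of the notes): draw many sample counts, convert to proportions, and compare the simulated standard deviation with √(p(1-p)/n).

import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 0.3
p_hats = rng.binomial(n, p, size=10_000) / n  # 10,000 simulated sample proportions
print(p_hats.mean())               # close to p = 0.3
print(p_hats.std())                # close to the formula value below
print(np.sqrt(p * (1 - p) / n))    # sqrt(0.3*0.7/100) ~ 0.0458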
Sample mean
Mean of x-bar: µ
Standard deviation of x-bar: σ/√n
Conditions
1. The formula for the standard deviation of x-bar is used only when the population is at least 10 times as large as the sample.
2. The facts above about the mean and standard deviation of x-bar are true no matter what the population distribution looks like.
3. The shape of the distribution of x-bar depends on the shape of the population distribution. In particular, if the population distribution is Normal, then the sampling distribution of x-bar is also Normal.
Bias and variability
Bias: how far the mean of the sampling distribution is from the true value of the parameter being estimated.
Variability: the spread of the sampling distribution. Larger samples give a smaller spread.
Central Limit Theorem
For large sample size (n > 30), the sampling distribution of x-bar is approximately Normal for any population with a finite standard deviation.
The mean is given by µ and the standard deviation by σ/√n.
The sample size n needed depends on the population: more observations are required if the shape is skewed.
Chapter 10: Estimating with confidence
Confidence intervals
A range of plausible values that is likely to contain the unknown population parameter.
Generated using a set of sample data.
Confidence level
The confidence level C gives the probability that the interval will capture the true parameter value in repeated samples.
Confidence interval for a population mean
Known σ
x.bar±z*σ/√n
Conditions
1. The sample must be an SRS from the population of interest.
2. The sampling distribution of the sample mean x-bar is at least approximately Normal. If the population distribution is not Normal, the central limit theorem tells us that it is approximately Normal if n is large.
3. Individual observations are independent.
4. The population size is at least 10 times as large as the sample size.
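Putting the formula to work (a minimal sketch, assuming scipy; the summary numbers are hypothetical):

from math import sqrt
from scipy.stats import norm

x_bar, sigma, n = 240.8, 5.0, 16       # hypothetical sample mean, known sigma, sample size
z_star = norm.ppf(0.975)               # critical value for 95% confidence
margin = z_star * sigma / sqrt(n)
print(x_bar - margin, x_bar + margin)  # ~ (238.35, 243.25)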
Procedure for inference with Confidence Intervals
1. State the parameter of interest.
2. Name the inference procedure and check conditions.
3. Calculate the confidence interval.
4. Interpret the results in the context of the problem.
Reducing Margin of error
The confidence level C decreases (z* gets smaller).
The population standard deviation decreases.
The sample size increases.
Unknown σ
x.bar ± t* s/√n, with df = n-1
Conditions
SRS: Data are an SRS of size n from the population of interest or come from a randomised experiment.
Normality: Observations from the population have a Normal distribution. It is enough that the distribution be symmetric and single-peaked.
Independence: Individual observations are independent. The population size should be at least 10 times the sample size.
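The same calculation with unknown σ swaps in s and a t critical value (a sketch, assuming scipy and numpy; the data are hypothetical):

import numpy as np
from scipy.stats import t

data = np.array([4.2, 5.1, 3.9, 4.8, 5.5, 4.4])  # hypothetical sample
n = len(data)
x_bar, s = data.mean(), data.std(ddof=1)         # ddof=1 gives the sample sd
t_star = t.ppf(0.975, df=n - 1)                  # 95% confidence, df = n-1
margin = t_star * s / np.sqrt(n)
print(x_bar - margin, x_bar + margin)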
t-distributions
When σ is unknown, we substitute the standard error s/√n for the standard deviation σ/√n of x-bar. The resulting distribution is not Normal; it is a t distribution.
There is a different t distribution for each sample size n. We specify a particular t distribution by giving its degrees of freedom (df).
The density curves of the t distributions are similar in shape to the standard Normal curve, but the spread of the t distributions is a bit greater than that of the standard Normal distribution.
As the degrees of freedom increase, the t(k) density curve approaches the N(0,1) curve ever more closely.
The t interval is exactly correct when the population distribution is Normal and approximately correct for large n in other cases.
Robustness of t procedures
Procedures that are not strongly affected by lack of Normality are called robust.
t procedures are not robust against outliers, but they are quite robust against non-Normality of the population when there are no outliers, even if the distribution is asymmetric.
Larger samples improve the accuracy of critical values from the t distributions when the population is not Normal, because of the central limit theorem.
Using the t procedures
Except in the case of small samples, the assumption that the data are an SRS from the population of interest is more important than the assumption that the population distribution is Normal.
Sample size less than 15: use t procedures if the data are close to Normal.
Sample size at least 15: t procedures can be used except in the presence of outliers or strong skewness.
Large samples: t procedures can be used even for clearly skewed distributions when the sample is large (central limit theorem).
Paired t procedures
Matched pairs design or before-and-after measurements on the same subjects
Estimating a population proportion
p.hat±z*√((p.hat(1-p.hat))/n)
Conditions
SRS: The data are an SRS from the population of interest.
Normality: For a confidence interval, n is so large that both n*p-hat and n(1-p-hat) are 10 or more.
Independence: Individual observations are independent. When sampling without replacement, the population is at least 10 times as large as the sample.
Choosing the sample size
The margin of error involves the sample proportion of successes, so we need to guess this value when choosing n. The guess is called p*.
Use a guess p* based on a pilot study or past experience with similar studies, or use p* = 0.5 as the guess, since the margin of error is largest when p-hat = 0.5.
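Solving m = z*√(p*(1-p*)/n) for n gives n = (z*/m)^2 p*(1-p*). A minimal sketch (scipy assumed; the target margin is hypothetical):

from math import ceil
from scipy.stats import norm

m, p_star = 0.03, 0.5                  # desired margin of error, conservative guess
z_star = norm.ppf(0.975)               # 95% confidence
n = (z_star / m) ** 2 * p_star * (1 - p_star)
print(ceil(n))                         # 1068: round up to guarantee the margin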
Chapter 11: Testing a claim
The Basics
An outcome that would rarely happen if a claim were true is good evidence that the claim is false.
The results of a test are expressed in terms of a probability that measures how well the data and the hypothesis agree.
P-value
Rule of thumb: α = 0.05 unless otherwise stated.
A result with a small P-value (less than α) is called statistically significant.
Large P-value
Large P-values fail to give evidence against H0.
Small P-value
Small P-values are evidence against H0 because they say that the observed result is unlikely to occur just by chance.
Hypotheses
We can have one-sided or two-sided alternative hypotheses.
null hypothesis
The null hypothesis is the statement that this effect is not present in the population.
alternative hypothesis
The alternative hypothesis states that it is present in the population.
Conditions for significance tests
SRS from the population of interest.
Normality: np ≥ 10 and n(1-p) ≥ 10.
Independent observations.
Test statistic
The test is based on a statistic that compares the value of the parameter as stated in the null hypothesis with an estimate of the parameter from the sample data.
Values of the estimate far from the parameter value in the direction specified by the alternative hypothesis give evidence against H0.
Standardise the estimate:
Test statistic = (estimate - hypothesised value) / standard deviation of the estimate
Carrying out significance tests
General procedure
1. Hypotheses: Identify the population of interest and the parameter you want to draw conclusions about.
2. Conditions: Choose the appropriate inference procedure. Verify the conditions for using it.
3. Calculations: Calculate the test statistic and the P-value.
4. Interpretation: Interpret your results in the context of the problem.
z-test for population mean
z=(x.bar - µ0)/(σ/√n)
P-value
µ > µ0: P(Z > z)
µ < µ0: P(Z < z)
µ ≠ µ0: 2P(Z ≥ |z|)
Interpretation
These P-values are exact if the population distribution is Normal and approximately correct for large n in other cases.
Failing to find evidence against H0 means only that the data are consistent with H0, not that we have clear evidence that H0 is true.
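Carrying the recipe out in code (a sketch, assuming scipy; the summary numbers are hypothetical):

from math import sqrt
from scipy.stats import norm

x_bar, mu0, sigma, n = 0.3, 0.0, 1.0, 25   # hypothetical data summary
z = (x_bar - mu0) / (sigma / sqrt(n))      # test statistic: 1.5
print(1 - norm.cdf(z))                     # one-sided P-value for Ha: mu > mu0, ~0.0668
print(2 * (1 - norm.cdf(abs(z))))          # two-sided P-value, ~0.1336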
Confidence intervals and two-sided tests
A level α two-sided significance test rejects a hypothesis exactly when the value µ0 falls outside a level 1-α confidence interval for µ.
The link between two-sided significance tests and confidence intervals is called duality.
For a two-sided hypothesis test for the mean, a significance test (level α) and a confidence interval (level C = 1-α) will yield the same conclusion.
Importance of Significance
Choosing a level of significance
There is no sharp border between "statistically significant" and "statistically insignificant" so giving the P-value allows each of us to decide individually if the evidence is sufficiently strong.
Statistical significance and practical importance
A statistically significant effect need not be practically important.
Use confidence intervals to estimate the actual values of parameters: confidence intervals estimate the size of an effect rather than simply asking whether it is too large to reasonably occur by chance alone.
Don't ignore lack of significance
There is a tendency to infer that there is no effect whenever a P-value fails to attain the usual 5% standard.
Lack of significance does not imply that H0 is true.
In some areas of research, small effects that are detectable only with large sample sizes can be of great practical significance.
Statistical inference is not valid for all sets of data
Badly designed surveys or experiments often produce invalid results.
Faulty data collection, outliers in the data, and testing a hypothesis on the same data that suggested it can invalidate a test.
Beware of multiple analyses: many tests run at once will probably produce some significant results by chance alone, even if all the null hypotheses are true.
Using inference to make decisions
Type I error
reject Ho when Ho is actually true.
Significance
The significance level α of any fixed-level test is the probability of a Type I error: α is the probability that the test will reject the null hypothesis when it is in fact true.
Type II error
fail to reject Ho when Ho is false.
Probability
1. Calculate the critical value of the statistic at which the test just stops rejecting Ho.
2. Standardise this critical value using the sampling distribution under the alternative hypothesis to find the probability.
A worked sketch follows the "Increasing power" list below.
Power
The probability that a fixed level α test will reject Ho when a particular alternative value of the parameter is true is called the power of the test against that alternative.
Increasing power
Increase alpha.
Consider a particular alternative that is further away from the mean.
Increase the sample size; this decreases the standard error.
Decrease σ.
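Following the two-step procedure above, a power computation for a one-sided z test (a sketch, assuming scipy; µ0, the alternative µa, σ, n and α are all hypothetical):

from math import sqrt
from scipy.stats import norm

mu0, mu_a, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05   # hypothetical setup
# Step 1: the test rejects Ho: mu = mu0 when x-bar exceeds this critical value
x_crit = mu0 + norm.ppf(1 - alpha) * sigma / sqrt(n)
# Step 2: standardise x_crit under the alternative mu = mu_a
power = 1 - norm.cdf((x_crit - mu_a) / (sigma / sqrt(n)))
print(power)   # ~0.80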
Chapter 12: Tests about a population mean
One-proportion z test
z = (p.hat - p0)/√((p0(1-p0))/n)
Conditions
Normality condition: np0 ≥ 10 and n(1-p0) ≥ 10
Alternative hypotheses
p > p0: P(Z > z)
p < p0: P(Z < z)
p ≠ p0: 2P(Z ≥ |z|)
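A one-proportion z test in code (a sketch, assuming scipy; the counts are hypothetical):

from math import sqrt
from scipy.stats import norm

successes, n, p0 = 58, 100, 0.5           # hypothetical data and null value
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
print(z)                                  # 1.6
print(2 * (1 - norm.cdf(abs(z))))         # two-sided P-value, ~0.11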
One-sample t test
t=(x.bar - µ0)/(s/√n)
Alternative hypotheses
µ > µ0: P(T > t)
µ < µ0: P(T < t)
µ ≠ µ0: 2P(T ≥ |t|)
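scipy bundles the whole one-sample t test into one call (a sketch; the data are hypothetical):

from scipy.stats import ttest_1samp

data = [4.2, 5.1, 3.9, 4.8, 5.5, 4.4]             # hypothetical sample
t_stat, p_value = ttest_1samp(data, popmean=4.0)  # two-sided by default
print(t_stat, p_value)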
Chapter 13: Comparing two population parameters
Conditions
SRS: We have two SRSs, from two distinct populations.
Independence: The samples are independent; that is, one sample has no influence on the other. When sampling without replacement, each population must be at least 10 times as large as the corresponding sample.
Normality: Both populations are Normally distributed.
Two-sample tests
Two-sample z statistic
z = ((x.bar_1 - x.bar_2) - (µ_1 - µ_2)) / √(σ_1^2/n_1 + σ_2^2/n_2)
Two-sample t procedure
(x.bar_1 - x.bar_2) ± t* √(s_1^2/n_1 + s_2^2/n_2)
Two-proportion z interval
(p.hat_1 - p.hat_2) ± z* √((p.hat_1(1-p.hat_1))/n_1 + (p.hat_2(1-p.hat_2))/n_2)
Two-proportion z test
z = (p.hat_1 - p.hat_2) / √(p.hat_c(1-p.hat_c)(1/n_1 + 1/n_2))
where p.hat_c is the combined (pooled) proportion of successes in the two samples.
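In code, the pooled proportion is the total successes over the total sample size (a sketch, assuming scipy; the counts are hypothetical):

from math import sqrt
from scipy.stats import norm

x1, n1, x2, n2 = 60, 100, 45, 90          # hypothetical success counts and sizes
p1, p2 = x1 / n1, x2 / n2
p_c = (x1 + x2) / (n1 + n2)               # pooled proportion under Ho: p1 = p2
z = (p1 - p2) / sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
print(z, 2 * (1 - norm.cdf(abs(z))))      # statistic and two-sided P-value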
Robustness
More robust than the one-sample t methods, particularly when the distributions are not symmetric.
Choose equal sample sizes if possible.
n_1 and n_2 must both be at least 5.
If n_1 + n_2 > 30, the two-sample t procedure can be used even for skewed distributions.
Chapter 14: Chi-square procedures
Chi-square test for goodness of fit
Hypotheses
Ho: The actual population proportions are equal to the hypothesised proportions.
Ha: At least one of the actual population proportions differs from the hypothesised proportions.
Calculations
x^2 = ∑(O-E)^2/E
df = k - 1
P-value = P(X^2 > x^2)
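The goodness-of-fit computation in code (a sketch, assuming scipy; the observed counts and hypothesised proportions are hypothetical):

from scipy.stats import chisquare

observed = [18, 22, 39, 21]                               # hypothetical counts, n = 100
expected = [0.2 * 100, 0.2 * 100, 0.4 * 100, 0.2 * 100]   # hypothesised proportions x n
stat, p_value = chisquare(observed, f_exp=expected)
print(stat, p_value)                                      # df = k - 1 = 3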
Chi-square distributions
The total area under a chi-square curve is equal to 1.
Each chi-square curve (except when degrees of freedom = 1) begins at 0 on the horizontal axis, increases to a peak, and approaches the horizontal axis asymptotically from above.
Each chi-square curve is skewed to the right. As the number of degrees of freedom increases, the curve becomes more symmetric and looks more like a Normal curve.
Chi-square test and the z test
We can compare two proportions using the z test.
The chi-square statistic is the square of the z statistic, and the P-value for chi-square is the same as the two-sided P-value for z.
Uses
The z test is preferred for comparing two proportions because it gives the choice of a one-sided test and is related to a confidence interval for p_1 - p_2.
Chi-square test for homogeneity of populations
Select an SRS from each of c populations. Each individual in a sample is classified according to a categorical response variable with r possible values. There are c different sets of proportions to be compared, one for each population.
Hypotheses
The null hypothesis is that the distribution of the response variable is the same in all c populations. The alternative hypothesis is that these c distributions are not all the same.
Implications
If Ho is true, the chi-square statistic has approximately a chi-square distribution with (r-1)(c-1) degrees of freedom.
Conditions
No more than 20% of the expected counts are less than 5, and all individual expected counts are at least 1.
All four expected counts in a 2 x 2 table should be at least 5.
Expected count = (row total x column total) / n
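scipy computes the statistic, df, and the table of expected counts in one call (a sketch; the two-way table is hypothetical):

from scipy.stats import chi2_contingency

table = [[30, 20, 10],    # hypothetical r x c table of observed counts
         [20, 25, 15]]
stat, p_value, df, expected = chi2_contingency(table)
print(stat, p_value, df)  # df = (r-1)(c-1) = 2
print(expected)           # check the expected-count conditions above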
Chi-square test of association & independence
Hypotheses
Ho: There is no association between the two categorical variables.
Ha: There is an association between the two categorical variables.
Uses
A two-way table from a single SRS, with each individual classified according to both categorical variables.
Chapter 15: Inference for regression
The slope b and intercept a of the least-squares regression line are statistics.
Conditions
Repeated responses y are independent of each other.
Scatterplot: the overall pattern is roughly linear; the residual plot shows a random pattern.
The standard deviation σ of y (σ is unknown) is the same for all values of x.
For any fixed value of x, the response y varies according to a Normal distribution.
Calculations
Degrees of freedom = n - 2
Residual = observed y - predicted y
Standard error: s = √(∑(y-y.hat)^2/(n-2))
Confidence interval for the slope: b ± t* s/√(∑(x-x.bar)^2)
Significance tests for regression slope
Hypotheses
Ho: β = 0
This Ho says that there is no true linear relationship between x and y, and also that there is no correlation between x and y.
Testing correlation makes sense only if the observations are a random sample.
t-statistics
t = b / SE_b, where SE_b = s/√(∑(x-x.bar)^2)
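scipy.stats.linregress reports the slope, its standard error, and the two-sided P-value for Ho: β = 0 (a sketch; x and y are hypothetical):

from scipy.stats import linregress

x = [1, 2, 3, 4, 5, 6]               # hypothetical explanatory values
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]  # hypothetical responses
res = linregress(x, y)
print(res.slope, res.stderr)         # b and SE_b
print(res.slope / res.stderr)        # the t statistic above, df = n - 2
print(res.pvalue)                    # two-sided P-value for Ho: beta = 0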
Poisson Distribution
Conditions
1) The events occur singly (one at a time).
2) The events occur uniformly (at a constant average rate).
3) The events occur independently.
4) The probability of two or more occurrences within a sufficiently small interval is negligible.
Mean & variance
If X ~ Po(λ), then E(X) = λ and Var(X) = λ.
λ = average number of occurrences (per interval).
Additive Property of the poisson distribution
If X and Y are independent Poisson random variables with X ~ Po(λ) and Y ~ Po(µ), then X + Y ~ Po(λ+µ).
Approximating Binomial distribution with poisson distribution
Given X ~ B(n,p) such that n is large (> 50) and np < 5 (normally p < 0.1), the binomial distribution can be approximated by the Poisson distribution with mean λ = np.
The approximation is more accurate when n gets larger and p gets smaller.
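Comparing the two pmfs shows how close the approximation is (a sketch, assuming scipy):

from scipy.stats import binom, poisson

n, p = 100, 0.02                  # n > 50 and np = 2 < 5
for k in range(5):
    print(k, binom.pmf(k, n, p), poisson.pmf(k, n * p))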
Chapter 1: Exploring data
Categorical data
Pie charts, dotplots, bar charts
Qualitative categories
Quantitative data
Numerical data
Histogram
Area represents the proportion of the data.
Relative frequencies on the vertical axis.
Stemplots
Cumulative frequency plot (Ogive)
Description of graphical display
Mode
Center
Mean
Median
Spread
Range
Percentile
Interquartile Range (IQR)
Box plots (five-number summary)
Variance
Standard deviation, s
Clusters
Gaps
Outliers
Shape
Symmetric
Skewed (one tail spreads far and thinly)
Uniform
Bell-shaped
Resistant measure
Median is not affected by outlier values
Changing units of measure
Linear transformations
Transformed variables
mean: a + b*x.bar
median: a + bM
standard deviation: |b|s
IQR: |b|*IQR
Comparing distributions
Side-by-side graphs
Back-to-back stemplots
Narrative comparisons
Chapter 2: Describing location in a distribution
Z-score
z=(x-x.bar)/s
Percentile
The pth percentile of a distribution is the value with p% of the observations less than or equal to it.
Chebyshev's inequality
The percentage of observations falling within k standard deviations of the mean is at least 100(1 - 1/k^2)%.
Density curves
Mathematical model for the distribution.
A curve that is always on or above the x-axis.
Area underneath it is always exactly 1.
Mean and median of a density curve
Median is the equal-areas point; mean is the balance point.
In a symmetric density curve the mean and median are the same.
Normal distribution
Probability density function given as:
f(x) = 1/(σ√(2π)) * e^(-(x-µ)^2/(2σ^2))
Empirical rule
68% fall within 1 standard deviation of the mean.
95% fall within 2 standard deviations of the mean.
99.7% fall within 3 standard deviations of the mean.
Standard Normal distribution
For any Normal distribution we can perform a linear transformation to obtain the standard Normal distribution.
If the variable x has Normal distribution N(µ, σ), then the standardised variable z = (x-µ)/σ has Normal distribution N(0,1).
The area under the standard Normal curve can be found from a standard Normal table or the GC.
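In code, standardising and looking up the area go together (a sketch, assuming scipy; the Normal parameters are hypothetical):

from scipy.stats import norm

mu, sigma, x = 64.5, 2.5, 68.0        # hypothetical N(mu, sigma) and value of interest
z = (x - mu) / sigma                  # standardise: z = 1.4
print(norm.cdf(z))                    # area to the left of z, ~0.9192
print(norm.cdf(x, mu, sigma))         # same area without standardising by hand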
Assessing Normality
On a Normal probability plot:
Close to a straight line - Normal
Systematic deviations - non-Normal
Chapter 3: Examining Relationships
Response and explanatory variables
Response (dependent) - measures the outcome of a study.
Explanatory (independent) - explains or influences changes in the response variable.
Scatterplot
A scatterplot shows the relationship between 2 quantitative variables.
Explanatory variable on the x-axis; response variable on the y-axis.
Interpreting a scatterplot
1. Look for the overall pattern and for striking deviations from that pattern.
2. Describe the pattern by the direction, form and strength of the relationship.
3. Look for outliers.
Associations
Positive - above-average values of one tend to accompany above-average values of the other, and vice versa.
Negative - above-average values of one tend to accompany below-average values of the other.
Correlation
Correlation measures the direction and strength of the linear relationship between two quantitative variables.
r = (1/(n-1)) ∑((x_i-x.bar)/s_x)((y_i-y.bar)/s_y)
Least Squares Regression Line
A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.
You can use a regression line to predict the value of y for any value of x.
y.hat = a + bx
Residuals & Residual plot
Residual = observed y - predicted y. The sum of the residuals is 0.
A residual plot should show no obvious pattern.
Standard deviation
To measure the size of a typical prediction error, compare the standard deviation of the residuals to the actual data values.
Coefficient of determination
r^2 = 1 - (∑(y-y.hat)^2)/(∑(y-y.bar)^2)
Outliers and influential observations
Outlier - an observation that lies outside the overall pattern of the other observations.
Influential - an observation is influential if removing it would markedly change the result of the calculation.
Lurking variable
A variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among these variables.
Chapter 4: Relationships between two variables
Transforming to achieve linearity
Transform the data so that the least-squares regression line can be applied.
Exponential growth model
Exponential growth increases by a fixed percent of the previous total in each equal time period.
y = ab^x
ln y = ln a + x ln b
Plot ln y against x to obtain a straight line with gradient ln b and intercept ln a.
Power law model
y = ax^p
ln y = ln a + p ln x
Plot ln y vs ln x to obtain a straight line with gradient p.
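Both transformations reduce to fitting a straight line to logged data (a sketch using numpy, which is an assumption; the data follow a hypothetical power law y = 2x^1.5):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x ** 1.5                             # hypothetical power-law data
p, ln_a = np.polyfit(np.log(x), np.log(y), 1)  # slope = p, intercept = ln a
print(p, np.exp(ln_a))                         # recovers p = 1.5 and a = 2.0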
Relationship between categorical variables
A two-way table organizes data about 2 categorical variables.
Row and column totals give the marginal distributions (marginal frequencies).
To find the conditional distribution of the row variable for one specific value of the column variable, look only at that one column in the table and express each entry as a percent of the column total.
Simpson's paradox
Simpson's paradox (or the Yule-Simpson effect) is a statistical paradox wherein the successes of groups seem reversed when the groups are combined.
Explaining association
Causation - a change in x causes a direct change in y.
Common response - the observed association between x and y can be explained by a lurking variable z.
Confounding - the effects of the variables cannot be distinguished from each other.
To establish causation, we need to conduct carefully designed experiments.
Chapter 5: Producing data
Observational study
Designing samples
Population - the entire group we want information about.
Sample - the part of the population that we examine.
Sampling - studying a part in order to gain information about the whole.
Census - attempts to contact every individual.
Voluntary response sample - people who choose themselves.
Convenience sampling - choosing individuals who are easiest to reach.
Simple Random Sample (SRS)
Consists of n individuals chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.
1. Label: assign a numerical label to every individual in the population.
2. Table: use the random number table to select labels at random.
3. Stopping rule: indicate when you should stop sampling.
4. Identify sample: use the labels to identify the subjects selected to be in the sample.
Other sampling methods
Probability sample - a sample chosen by chance.
Stratified random sampling - the population is first divided into strata, then an SRS is taken from each stratum.
Cluster sampling - divide the population into clusters, then randomly select some clusters.
Multi-stage sampling design - sampling carried out in several stages.
Cautions
Undercoverage - some groups in the population are left out.
Non-response - individuals do not respond or cooperate.
Response bias - e.g. lying.
Wording of questions - confusing and misleading questions.
Experiment
Definition
Deliberately impose some treatment on individuals in order to observe their responses.
Individuals - experimental units, or subjects (humans).
Treatment - the experimental condition applied.
Factors - the explanatory variables.
Control
An effort to minimise variability in the way experimental units are obtained and treated.
Helps reduce problems from confounding and lurking variables.
One group receives the treatment while the other group does not; compare the responses of the 2 groups.
Placebo
Check whether there is any placebo effect which could have affected the results.
Replication
Even with control, there is still natural variability.
Replication reduces the role of chance variation and increases the sensitivity of the experiment to differences between the treatments.
Randomisation
Ensures the treatment groups are essentially similar and there are no systematic differences between them.
Designs
Block design
Block - a group of experimental subjects that are known to be similar in some way that is expected to systematically affect the response to the treatments.
Blocks are a form of control.
Blocks are chosen to reduce variability, based on the likelihood that the blocking variable is related to the response.
Blocks should be formed based on the most important unavoidable sources of variability among the experimental units.
Blocking allows us to draw separate conclusions about each block.
Matched pairs design
An example of a block design.
Compares two treatments with the subjects matched in pairs.
Cautions
Double-blind experiment
Neither the subjects nor those who measure the response know which treatment a subject received.
Controls the placebo effect.
Lack of realism
We cannot duplicate the exact conditions that we want to study.
This limits our ability to apply conclusions to the settings of greater interest.
Statistical analysis cannot tell us how far the results will generalise to other settings.
Chapter 6: Permutation, combination and probability
Permutation and combination
Addition principle
If a single object can be selected from m objects of one kind or n objects of another kind, the number of ways of selecting it is m + n.
Multiplication principle
If you can do task 1 in m ways and task 2 in n ways, both tasks together can be done in m*n ways.
Permutation
The order of objects is important.If there are n distinct objects, we have n! ways of arranging all the objects in a row.
Identical objects
With p identical objects of one kind (or p of one kind and q of another) among a total of n objects, the number of ways to arrange all n objects in a row is:
n!/p! or n!/(p!q!)
Distinct
Selecting and arranging r objects from n distinct objects:
nPr = n!/(n-r)!
Combination
The unordered selection of objects from a set.
If there are n distinct objects, then we can select r objects in nCr = n!/(r!(n-r)!) ways.
Circular permutation
When objects are arranged in a circle, each object has the same neighbours after a rotation, so rotations are not counted as distinct arrangements.
We have (n-1)! ways to arrange n distinct objects in a circle.
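Python's math module covers these counts directly (a sketch; math.perm and math.comb need Python 3.8+):

from math import comb, factorial, perm

print(perm(5, 2))        # 5P2 = 5!/3! = 20 ordered selections
print(comb(5, 2))        # 5C2 = 10 unordered selections
print(factorial(5 - 1))  # (n-1)! = 24 circular arrangements of 5 objects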
Probability
Random - individual outcomes are uncertain but there is a regular distribution of outcomes in a large number of repetitions.
Probabilty models
Sample space of a random phenomenon - the set of all possible outcomes.
Event - any outcome or set of outcomes.
Probability model - a mathematical description of a random phenomenon.
Probability of event with equal likely outcomes
P(A) = n(A)/n(S)
where n(A) is the number of outcomes in event A and n(S) is the number of outcomes in the sample space S.
Independent events
Two events A and B are independent if the chance of one event happening or not happening does not change the probability that the other event occurs.
If A and B are independent, then P(A and B) = P(A) * P(B).
Probability Tree
A diagrammatic representation of the possible outcomes of a series of events.
A probability tree for the chances of flipping a coin and coming up heads three times in a row would have three levels. The first reflects the chances of throwing either heads or tails; the second reflects the chances of throwing heads or tails after each outcome of the first throw; the third shows the chances of throwing heads or tails after all the possible outcomes of the first two throws.
The probabilities along a series of branches can be multiplied to give the overall probability of that outcome occurring. The probabilities across all outcomes add up to 1.
Conditional probability
Conditional probability is the probability of some event A, given the occurrence of some other event B. It is written P(A|B) and read "the probability of A, given B".
P(A|B) = P(A ∩ B)/P(B)
If A and B are mutually exclusive, then P(A|B) = 0.
If A and B are independent, then P(A|B) = P(A).
Chapter 7: Random Variables
A variable whose value is a random numerical outcome.
Discrete random variable
The values that might be observed are restricted to a pre-defined list of possible values.
Conditions
All probabilities must add up to 1.
0 ≤ p_k ≤ 1
Probability Histogram
Probability distributions of real-valued random variables
Equations
µ_X = ∑ x_i p_i  (sum over i = 1 to k)
σ_X^2 = ∑ (x_i - µ_X)^2 p_i
Var(X) = E[(X - µ_X)^2]
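A direct check of these formulas from a probability table (a sketch using numpy, which is an assumption; the table is hypothetical):

import numpy as np

x = np.array([0, 1, 2, 3])          # hypothetical values of X
p = np.array([0.1, 0.4, 0.3, 0.2])  # their probabilities (sum to 1)
mu = np.sum(x * p)                  # mean: 1.6
var = np.sum((x - mu) ** 2 * p)     # variance: E[(X - mu)^2] = 0.84
print(mu, var)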
Continuous random variable
Takes all values in an interval of numbers.
For all continuous probability distributions, P(any individual outcome) = 0.
Probability distribution of X is described by a probability distribution function
Total area under the graph is 1.
f(x) ≥ 0
P(a ≤ X ≤ b) = ∫_a^b f(x) dx
Cumulative distribution function
F(x) = ∫_-∞^x f(t) dt
Properties for expectations and variances
E(a) = a
E(aX + b) = aE[X] + b
E(X + Y) = E[X] + E[Y]
Var(a) = 0
Var(aX + b) = a^2 Var(X)
Var(X + Y) = Var(X) + Var(Y)  (X and Y independent)
Var(X - Y) = Var(X) + Var(Y)  (X and Y independent)