Rutgers University
Course: STAT METHO 33:623:385
Subject: Statistics
Date: Dec 16, 2024
Uploaded by ProfessorSheepPerson1068
Conceptual Questions:
●The basis of statistical inference, as defined by Tufte and others, is comparison
●Linear regression is the most commonly used and abused form of analysis
●The most important question in data analysis is: How do you know that?

Continuous Distributions
Means and standard deviations
●At the center of a sampling distribution constructed for the mean of the population is the population mean
●Changing μ shifts the distribution left or right
●Changing σ increases or decreases the spread

Z-distribution
●Standardized normal distribution (Z) always has a mean of 0 and a standard deviation of 1.
○\( Z = \frac{X - \mu}{\sigma} \), where Z is the z-score: how many standard deviations X is from the mean
■X = data point in the normal distribution
■µ = population mean
■σ = standard deviation (\( \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} \), with N the population size, if not given or need to calculate it)
Z-score calculation example problem:
If X is distributed normally with a mean of $100 and a standard deviation of $50, find the Z value of X. (Space can be used to solve problem)
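The z-score formula above can be sketched in a few lines of Python. The mean and standard deviation come from the example problem; the data point `X_VALUE` is not stated in the problem, so the value used here is purely a hypothetical stand-in for illustration.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


MU = 100.0       # population mean from the example problem ($)
SIGMA = 50.0     # population standard deviation from the example problem ($)
X_VALUE = 200.0  # ASSUMPTION: hypothetical data point, not given in the problem

z = z_score(X_VALUE, MU, SIGMA)
print(f"Z = {z:.2f}")
```

With this assumed data point, X sits two standard deviations above the mean. A value below the mean would give a negative Z.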

Empirical rule and Normality
●A normal distribution is bell shaped (symmetrical) when:
○Mean = median
○Empirical rule applies to the normal distribution
○Interquartile range of a normal distribution is 1.33 (4/3) standard deviations

Central Limit Theorem
●CLT allows us to approximate the shape of a sampling distribution even when the population shape is unknown
●As n increases, the distribution of sample means narrows in on the population mean µ
●Large enough sample = disregard the actual population shape and treat the sampling distribution as if it's a normal distribution
●n > 30 is the usual rule of thumb for applying the CLT

Sampling distributions and Estimation
●Sampling Error (e) = x̄ − µ
●Bias = E(x̄) − µ

Confidence Intervals
Point and Interval Estimates
●Point estimate is a single number
●A confidence interval provides additional information about the variability (spread) of the estimate
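The Central Limit Theorem and sampling-error bullets earlier in these notes can be checked with a small simulation. This is only a sketch under assumptions that are not in the notes: the population is taken to be exponential with mean 1 and standard deviation 1 (a clearly non-normal shape), the sample size is 40, and the random seed is fixed so the run is repeatable.

```python
import random
import statistics

POP_MEAN = 1.0  # ASSUMPTION: Exponential(rate=1) population has mean 1
POP_SD = 1.0    # ASSUMPTION: Exponential(rate=1) population has sd 1


def simulate_sample_means(n: int, replicates: int, seed: int = 0) -> list:
    """Draw `replicates` samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means(n=40, replicates=2000)

center = statistics.fmean(means)      # should sit near the population mean
spread = statistics.stdev(means)      # should sit near POP_SD / sqrt(n)
estimated_bias = center - POP_MEAN    # estimate of E(x-bar) - mu
first_sampling_error = means[0] - POP_MEAN  # e = x-bar - mu for one sample

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f} (theory: {POP_SD / 40 ** 0.5:.3f})")
print(f"estimated bias:       {estimated_bias:.3f}")
print(f"one sampling error:   {first_sampling_error:.3f}")
```

Rerunning with a larger `n` should shrink the spread of the sample means, which is the "narrows in on the population mean" behavior described in the notes.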

Confidence Interval Formula:
●\( \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \)
○+ is for upper bound
○− is for lower bound
●\( Z_{\alpha/2} \) is found on the Z table from the confidence level
●Upper bound − lower bound = confidence interval width
●Z and σ are used only if the population st. dev is known
Example problem: A sample of 11 circuits from a large normal population has a mean resistance of 2.22 ohms. Population standard deviation is 0.35 ohms. Determine a 95% confidence interval for the true mean resistance of the population.
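One way to work the circuit example is sketched below. It uses the standard library's `NormalDist` in place of a printed Z table; the function name `z_confidence_interval` is just an illustrative choice, not something from the course.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper) for the mean when the population sd is known."""
    alpha = 1.0 - confidence
    # Critical value Z_{alpha/2}: the point with alpha/2 in the upper tail.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Values taken from the example problem.
lower, upper = z_confidence_interval(x_bar=2.22, sigma=0.35, n=11, confidence=0.95)
print(f"95% CI: ({lower:.4f}, {upper:.4f}) ohms")
```

Under these inputs the interval comes out to roughly 2.01 to 2.43 ohms, using Z ≈ 1.96 for the 95% level.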

Confidence interval for µ if population st dev is unknown
●Student's t distribution is used instead
Confidence Interval Estimate: \( \bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} \)
\( t_{\alpha/2} \) = critical value of t distr. w/ n − 1 degrees of freedom (df)
S = sample standard deviation
degrees of freedom (d.f.) = n − 1, e.g., if n = 10, df = 9

Determining Sample Size
●If needed to fit the mean: \( n = \frac{(Z_{\alpha/2})^2 \sigma^2}{e^2} \), with e (accepted sampling error) \( = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \)
●If needed to fit the confidence interval width and confidence level: \( n = \left( \frac{2 \, Z_{\alpha/2} \, \sigma}{\text{Confidence interval width}} \right)^2 \)
●To narrow confidence interval while keeping confidence level constant:
○Increase n
●To narrow c.i. while keeping sample size constant:
○Lower confidence level

Hypothesis Testing
A hypothesis is a claim (assertion) about a population parameter
Hypothesis always consists of:
1.The Null Hypothesis, H0
●Always about a population parameter, so µ is used instead of x̄
●Begin with assumption that the null hypothesis is true
●Always contains "=", "≥", or "≤"
●Null hypothesis may or may not be rejected
●Ex: H0: µ = 30
2.The Alternative Hypothesis, H1
○Opposite of null hypothesis
○Two hypotheses are mutually exclusive and collectively exhaustive
○Ex: H1: µ ≠ 30

Test statistic and Critical values
●If sample mean is close to stated population mean, H0 is not rejected
●If instead far, H0 is rejected

Critical Value Approach
●Convert x̄ to Z to get test statistic
●Determine critical values based on confidence level and table
●Decision rule: If test statistic falls in the rejection region, reject H0; otherwise do not reject H0.
●Ex: H0: µ = 30, H1: µ ≠ 30
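The critical value approach for H0: µ = 30 against H1: µ ≠ 30 can be sketched as follows. The notes give no sample data, so the sample mean, population standard deviation, and sample size below are hypothetical values chosen only to make the arithmetic easy to follow; the significance level α = 0.05 (a 95% confidence level) is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Return (z_stat, z_crit, reject) for a two-tailed Z test of the mean."""
    z_stat = (x_bar - mu_0) / (sigma / sqrt(n))       # convert x-bar to Z
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # stands in for the Z table
    # Rejection region: Z below -z_crit or above +z_crit.
    reject = z_stat < -z_crit or z_stat > z_crit
    return z_stat, z_crit, reject


MU_0 = 30.0    # hypothesized mean from the notes (H0: mu = 30)
X_BAR = 29.84  # ASSUMPTION: hypothetical sample mean
SIGMA = 0.8    # ASSUMPTION: hypothetical known population sd
N = 100        # ASSUMPTION: hypothetical sample size

z_stat, z_crit, reject = two_tailed_z_test(X_BAR, MU_0, SIGMA, N)
decision = "Reject H0" if reject else "Do not reject H0"
print(f"Zstat = {z_stat:.2f}, critical values = ±{z_crit:.2f}: {decision}")
```

With these assumed numbers the test statistic is about −2.0, which falls below −1.96, so H0 would be rejected.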
Two tailed test:
●If Zstat is negative:
○Zstat < −CritValue: Reject H0
○Zstat > −CritValue: Do not reject H0
●If Zstat is positive:
○Zstat < +CritValue: Do not reject H0
○Zstat > +CritValue: Reject H0

ANOVA
Hypothesis:
●H0: µ1 = µ2 = µ3 = … = µc
○All population means are equal
○i.e., no factor effect (no variation in means among groups)
●H1: Not all means are equal
○At least one mean differs from the rest
○Does not necessarily mean that all of the means differ from each other

Partitioning the Variation
Term | Definition | Formula
SST | Total sum of squares (Total variation) | SST = SSA + SSW
SSA | Sum of Squares Among/Between Groups (Among Group Variation) | Won't be calculated, will already be given on the test. If not given but SSW and SST are known: SSA = SST − SSW
SSW | Sum of Squares Within Groups (Within group variation) | Won't be calculated, will already be given on the test. If not given but SSA and SST are known: SSW = SST − SSA
MSA | Mean Square Among/Between, d.f1 = c − 1 | MSA = SSA / (c − 1)
MSW | Mean Square Within, d.f2 = n − c | MSW = SSW / (n − c)
MST | Mean Square Total, d.f = n − 1 | MST = SST / (n − 1)
Fstat | Ratio of the among estimate of variance to the within estimate of variance | Fstat = MSA / MSW
n = total number of values across all groups
c = number of groups
Find Fcrit using df1 = c − 1 and df2 = n − c with the F table
If Fstat > Fcrit: Reject null hypothesis
Ex:
Source of variation | SS | df | MS | F | P-Value | F-crit
Between Groups (A): 210.27780.064139
Within Groups (W): 148374.15
Total (T): 2113.833
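The partitioning formulas and the F decision rule can be put together in a short sketch. The numbers below are hypothetical and are not a reconstruction of the example table above: SST, SSW, n, and c are made-up inputs, and `F_CRIT` is the α = 0.05 value for df1 = 2, df2 = 12 as read from a standard F table.

```python
def one_way_anova_f(ssa, ssw, n, c):
    """Return (MSA, MSW, Fstat) from the sums of squares.

    n is the total number of values across all groups; c is the number of groups.
    """
    df_among = c - 1   # d.f1
    df_within = n - c  # d.f2
    msa = ssa / df_among
    msw = ssw / df_within
    return msa, msw, msa / msw


# ASSUMPTION: hypothetical inputs chosen for easy arithmetic.
SST = 100.0
SSW = 60.0
N_TOTAL = 15
C_GROUPS = 3
F_CRIT = 3.89  # ASSUMPTION: F table value for alpha = 0.05, df1 = 2, df2 = 12

ssa = SST - SSW  # SSA = SST - SSW when SSA is not given
msa, msw, f_stat = one_way_anova_f(ssa, SSW, N_TOTAL, C_GROUPS)
reject = f_stat > F_CRIT

print(f"SSA = {ssa}, MSA = {msa}, MSW = {msw}, Fstat = {f_stat}")
print("Reject H0" if reject else "Do not reject H0")
```

Here Fstat works out to 4.0, which exceeds the assumed critical value, so the null hypothesis of equal means would be rejected for these made-up inputs.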