Master Advanced Statistical Methods for Data Science Exam Success
School
Albertus Magnus College
Course
STAT 201
Subject
Statistics
Date
Dec 12, 2024
Pages
4
Uploaded by AdmiralFog15907
Exam Name: Advanced Statistical Methods in Data Science
Exam Time: 2 hours 30 minutes
Total Score: 100

INSTRUCTIONS:
1. This exam contains 25 questions. You have 2 hours and 30 minutes to complete the exam.
2. Please answer all questions in the space provided.
3. You may use a calculator, but no other electronic devices are allowed.
4. Show your work for all calculation problems.
5. Be sure to clearly label all graphs and charts.

Question 1: (2 points) Multiple Choice
Which of the following is a continuous probability distribution?
A. Normal distribution
B. Binomial distribution
C. Poisson distribution
D. Geometric distribution

Question 2: (2 points) Multiple Choice
Which of the following is a discrete probability distribution?
A. Normal distribution
B. Exponential distribution
C. Uniform distribution
D. Poisson distribution

Question 3: (2 points) Multiple Choice
What is the expected value of a binomial distribution with n = 10 and p = 0.5?
A. 2.5
B. 5
C. 10
D. 50

Question 4: (2 points) Multiple Choice
What is the variance of a Poisson distribution with parameter λ = 3?
A. 3
B. 9
C. 6
D. 1.5

Question 5: (2 points) Multiple Choice
Which of the following is a type of statistical inference?
A. Hypothesis testing
B. Regression analysis
C. Data visualization
D. Probability calculation

Question 6: (2 points) Multiple Choice
What is the purpose of a confidence interval?
A. To estimate the population mean
B. To test a hypothesis about a population parameter
C. To estimate the population standard deviation
D. To compare the means of two populations

Question 7: (3 points) Short Answer
Explain the concept of statistical significance and its importance in hypothesis testing.

Question 8: (3 points) Short Answer
Describe the difference between Type I and Type II errors in hypothesis testing.

Question 9: (3 points) Short Answer
What is the purpose of the Analysis of Variance (ANOVA) and how is it related to regression analysis?

Question 10: (3 points) Short Answer
Explain the concept of multicollinearity in regression analysis and its impact on the results.

Question 11: (4 points) Calculation
The average height of adult males in a certain population is 175 cm, with a standard deviation of 5 cm. Assuming the heights are normally distributed, what is the probability that a randomly selected male from this population will be taller than 180 cm?

Question 12: (4 points) Calculation
A binomial experiment consists of 15 trials, and the probability of success on each trial is 0.2. What is the probability of obtaining exactly 3 successes in the experiment?

Question 13: (4 points) Calculation
The average annual rainfall in a city is 80 cm, with a standard deviation of 10 cm. Assuming the rainfall amounts are normally distributed, what is the probability that the annual rainfall will be between 70 cm and 90 cm?

Question 14: (4 points) Calculation
A Poisson distribution has a mean of 5. What is the probability of observing 3 events in a given time period?

Question 15: (4 points) Calculation
The heights of adult males in a certain population are normally distributed with a mean of 175 cm and a standard deviation of 5 cm. Calculate the z-score for a male with a height of 185 cm.

Question 16: (5 points) Graphing
Construct a scatter plot of the following data points: (1, 2), (2, 3), (3, 4), (4, 5), (5, 6). Based on the scatter plot, determine if there is a linear relationship between the two variables.

Question 17: (5 points) Short Answer
Explain the concept of a p-value and its role in hypothesis testing.

Question 18: (5 points) Short Answer
Describe the difference between a parametric and nonparametric statistical test.

Question 19: (5 points) Short Answer
What is the purpose of a Bayesian network and how is it used in data analytics?

Question 20: (5 points) Short Answer
Explain the concept of time series forecasting and its importance in data science.

Question 21: (6 points) Calculation
A multiple regression model is given by Y = β0 + β1X1 + β2X2 + ε, where Y is the dependent variable, X1 and X2 are independent variables, β0, β1, and β2 are the regression coefficients, and ε is the error term. If β1 = 2, β2 = 3, and ε = 1, calculate the value of Y when X1 = 5 and X2 = 4.

Question 22: (6 points) Calculation
A chi-squared test is used to test the independence of two categorical variables. If the test statistic is 6.25 and the p-value is 0.04, at a significance level of 0.05, should the null hypothesis be rejected?

Question 23: (6 points) Calculation
The correlation coefficient between two variables is 0.8. If the standard deviation of one variable is 3, what is the standard deviation of the other variable?

Question 24: (7 points) Complex Analysis
A time series dataset consists of monthly sales data for a company over a 5-year period. The data shows a strong seasonal pattern, with higher sales in the summer months.
Describe how you would analyze this dataset to identify and model the seasonal component.

Question 25: (7 points) Complex Analysis
A Bayesian network is used to model the probability of a student passing an exam. The network includes three variables: studying hours, exam difficulty, and prior knowledge. Describe how you would construct the Bayesian network and calculate the probability of a student passing the exam given the values of the three variables.

End of Exam
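For self-checking, the numeric results of Questions 11 to 15 can be verified with a few lines of code. The following is a minimal sketch using only the Python standard library (Python 3.8 or later is assumed). The helper names `binomial_pmf` and `poisson_pmf` and the variables `q11` through `q15` are illustrative and are not part of the exam.

```python
from math import comb, exp, factorial
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


# Question 11: P(height > 180) with heights ~ Normal(175, 5)
heights = NormalDist(mu=175, sigma=5)
q11 = 1 - heights.cdf(180)

# Question 12: P(exactly 3 successes) in 15 trials with p = 0.2
q12 = binomial_pmf(3, n=15, p=0.2)

# Question 13: P(70 < rainfall < 90) with rainfall ~ Normal(80, 10)
rainfall = NormalDist(mu=80, sigma=10)
q13 = rainfall.cdf(90) - rainfall.cdf(70)

# Question 14: P(exactly 3 events) for a Poisson distribution with mean 5
q14 = poisson_pmf(3, lam=5)

# Question 15: z-score of 185 cm, i.e. (x - mean) / standard deviation
q15 = (185 - 175) / 5

for label, value in [("Q11", q11), ("Q12", q12), ("Q13", q13),
                     ("Q14", q14), ("Q15", q15)]:
    print(f"{label}: {value:.4f}")
```

Questions 11 and 13 both reduce to areas within one standard deviation of the mean, so their results can also be cross-checked against a standard normal table.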