msb14e_ppt_ch01 (1).pptx
School
University of Vermont
Course
ANTH 253
Subject
Statistics
Date
Jan 14, 2025
Pages
76
Uploaded by HighnessPartridgeMaster1172
Copyright © 2022, 2018, and 2014
Pearson Education, Inc.
Slide - 1
ALWAYS LEARNING
Chapter 1: Statistics, Data, and Statistical Thinking
Slide - 2
Chapter 1 - Contents
1. The Science of Statistics
2. Types of Statistical Applications in Business
3. Fundamental Elements of Statistics
4. Types of Data
5. Collecting Data: Sampling and Related Issues
6. Business Analytics: Critical Thinking with Statistics
Slide - 3
Where We’re Going
Slide - 4
1.1 The Science of Statistics
Slide - 5
What Is Statistics?
Statistics is the art and science of learning from data.
It involves:
• Collecting, organizing, summarizing, analyzing, and interpreting information, which may be:
  • quantitative (numeric) or
  • descriptive (words, like eye color).
• The objective is to answer a question of interest that can be answered using data.
Slide - 6
What business questions might we wish to answer?
• Which of several products is a consumer more likely to purchase?
• What is the estimated cost for replacement of products under warranty?
• How does the average lifetime for a new composite material used in the manufacture of an artificial hip compare with the material currently in use?
• What is the trend in hospital costs over the next 6 years?
Slide - 7
1.2 Types of Statistical Applications in Business
Slide - 8
Statistics: Two Key Processes
1) Describe data, typically using graphs, charts, and summary metrics (like averages and medians).
2) Draw conclusions (estimates, decisions, predictions, etc.) about the population of interest (the entire collection) using a sample (a subset).
Slide - 9
Statistical Methods
[Diagram: Statistical Methods branch into Descriptive Statistics and Inferential Statistics]
Inferential Statistics is not always needed. If the data represents the entire population of study, inferences are not needed.
Slide - 10
Simple Example
• Suppose you want to know the proportion of yellow M & M’s in the bag of M & M’s you just bought. (What is a proportion, by the way?)
• You would simply count the yellow M & M’s and divide by the total. No inference is needed.
• When might an inference be needed with regard to yellow M & M’s?
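As a small illustration of the count-and-divide calculation, the sketch below computes the proportion of yellow candies in a made-up bag. The bag contents and the function name `proportion` are assumptions for illustration only, not data from the slides.

```python
from collections import Counter


def proportion(items, target):
    """Return the fraction of items that are equal to target."""
    if not items:
        raise ValueError("cannot compute a proportion of an empty collection")
    return Counter(items)[target] / len(items)


# Hypothetical bag of 20 candies (made-up counts).
bag = ["yellow"] * 4 + ["red"] * 6 + ["blue"] * 5 + ["green"] * 5

yellow_share = proportion(bag, "yellow")
print(f"Proportion of yellow: {yellow_share:.2f}")
```

Because every candy in the bag is counted, this is a description of the whole population (the bag), so no inference is involved.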
Slide - 11
Descriptive Statistics
Descriptive statistics: To summarize the information in a data set, and to present the information in a convenient form.
• Involves graphs and charts, and computations of important metrics, like the average.
• Consider this: Does your manager want to see a 10,000-line report that captures each individual part manufactured in the last 6 months and which country it was shipped to? Wouldn’t a simple graph be much more useful?
Slide - 12
Inferential Statistics
Inferential statistics: Utilizes sample data to make:
• Estimates about population parameters (Parameter: a numeric value associated with the population, like the population average)
• Decisions, predictions, or other generalizations about the population.
• Ex: Whether two population proportions are equal; the proportion of votes Trump will receive vs. Harris. (Too close to call.)
Slide - 13
Summary of what we have discussed
• The field of Statistics relies on data. Without data we are not able to address the underlying question of interest using statistical methods.
• If the data represents the entire collection (the population), then no inferences are needed. We can simply summarize the data and report out.
• If the data represents a sample (a subset), then we will need inferential statistics to allow us to make educated guesses regarding the entire population of interest.
Slide - 14
1.3 Fundamental Elements of Statistics
Slide - 15
Fundamental Elements
• Experimental (or observational) unit
  • Object upon which we collect data (like a person, a company, or a country)
  • If we conduct an experiment, we have experimental units. If we simply observe (which includes surveys), then we have observational units.
• Population
  • The entire set of units (subjects) we are interested in studying
  • Note: In the world of Statistics, populations are not necessarily people.
Slide - 16
Fundamental Elements
• Variable
  • A characteristic of the unit of study.
  • Values change across units. Ex: height, gender
• Sample
  • Consists of units that are a subset of the population units
• Statistical Inference
  • To estimate, predict, or generalize about a characteristic of the population using information from a sample
  • Estimating, predicting, and generalizing carry some amount of uncertainty
Slide - 17
Example: Using a Sample to Draw an Inference for Average Age
• FOX News hypothesizes (makes an educated guess) that the average age of FOX viewers is greater than 60. To test the hypothesis, a sample of 200 FOX viewers is used and the age of each viewer is obtained.
a. Describe the population.
b. Describe the variable of interest.
c. Describe the sample.
d. Describe the inference.
Slide - 18
Example (cont)
Solution
a. The population: All FOX viewers.
b. The variable of interest: The age (in years) of each viewer.
Slide - 19
Example (cont)
c. The sample: The 200 FOX viewers selected for the study.
Slide - 20
Example (cont)
d. Describe the inference.
The inference: To generalize the information contained in the sample of 200 viewers to the population of all FOX viewers.
Based on the average age of the 200 viewers, the researcher will infer whether it is likely that the average age of the population (all FOX viewers) exceeds 60 years.
Slide - 21
Fundamental Elements
Measure of Reliability
• Statement (usually quantified) about the degree of uncertainty associated with a statistical inference.
• Polling results for elections are often reported in a form such as: Candidate X is projected to take 52% of the vote, with a margin of error of 0.5%.
• The margin of error captures the uncertainty in the projected amount of 52%.
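To give a rough feel for where a margin of error can come from, here is a minimal sketch using one common textbook approximation for a sample proportion. Everything in it is an assumption layered on top of the slide: the formula (normal approximation, simple random sample), the 95% multiplier of 1.96, and the poll size of 1,000, which is hypothetical. The slides do not say how the 0.5% figure was obtained.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    if not 0 <= p_hat <= 1:
        raise ValueError("p_hat must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 52% support among 1,000 respondents.
p_hat = 0.52
moe = margin_of_error(p_hat, 1000)
print(f"Projected share: {p_hat:.1%} +/- {moe:.1%}")
```

Under these assumptions the margin of error shrinks as the sample size grows, which is why larger polls are reported with tighter margins.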
Slide - 22
Four Elements of Descriptive Statistics
1. Identifying the population of interest and whether a sample will be utilized
2. Identifying one or more variables that are to be investigated
3. Tables, graphs, or numerical summary tools
4. Identification of patterns in the data
Note: If a sample was used, it is possible (and likely) to continue with inferential statistics.
Slide - 23
Five Elements of Inferential Statistics
1. Identifying the population of interest
2. Identifying one or more variables that are to be investigated
3. Identifying the sample
4. Making an inference about the population based on information contained in the sample
5. Establishing a measure of reliability for the inference (we often use the margin of error)
Slide - 24
Example
• A fast-food restaurant has 6,289 outlets with drive-throughs.
• Problem statement: Management wishes to attract more customers to its drive-through services and is considering giving a 50% discount to customers who wait more than a specified number of minutes between the time they place the order and the time they get it. (I wonder if they did a cost/benefit analysis on this before deciding to discount at 50%.)
• To help determine what the time limit should be, they have decided to estimate the average waiting time at their drive-through window in Dallas, Texas.
Slide - 25
Example
• For 7 consecutive days, times are recorded using digital clocks.
• At the end of the 7-day period, 2,109 orders had been timed.
Slide - 26
Example (cont)
a. Describe the process of interest at the Dallas restaurant.
b. Describe the variable of interest.
c. Describe the sample.
d. Describe the inference of interest.
e. Describe how the reliability of the inference could be measured.
Solution
a. The process of interest is the drive-through window at the selected restaurant. It is a process because it “produces,” or “generates,” meals over time; that is, it services customers over time.
Slide - 27
Example (cont)
b. Describe the variable of interest.
The variable is customer wait time.
c. Describe the sample.
The sample consists of the 2,109 orders that were processed through the drive-through during the 7-day period at a particular location (Dallas, TX).
Slide - 28
Example (cont)
d. Describe the inference of interest.
• The company’s immediate interest is in learning about the drive-through window in Dallas, i.e., to estimate the average waiting time at the Dallas facility using the sample average.
• They may also use this to estimate the average wait time at all their locations, although this would not be a good statistical strategy. It would not be wise to use only one of over 6,000 locations, even a randomly selected one, to make an inference about all of them.
Slide - 29
Example (cont)
e. Describe how the reliability of the inference could be measured.
At this point in the book we are not in a position to discuss how reliability is actually computed. Suppose we found that the average waiting time is 4.2 minutes, with a bound on the error of estimation of 0.5 minutes.
We could then be reasonably certain that the true average waiting time for the Dallas process is between 3.7 and 4.7 minutes. (Notice how we added and subtracted the error of estimation (0.5 minutes) from the average wait time of 4.2 minutes.)
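The add-and-subtract step in the drive-through example can be written out as a tiny sketch. The function name `interval_from_bound` is made up for illustration; the numbers 4.2 and 0.5 are the ones supposed in the example.

```python
def interval_from_bound(estimate, bound):
    """Return (low, high) by subtracting and adding the error bound."""
    if bound < 0:
        raise ValueError("the error bound must be non-negative")
    return estimate - bound, estimate + bound


# Values supposed in the example: 4.2 minutes, bound of 0.5 minutes.
low, high = interval_from_bound(4.2, 0.5)
print(f"Plausible range for the true average wait: {low:.1f} to {high:.1f} minutes")
```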
Slide - 30
Summary of the Fundamental Elements of Statistics
• Samples and populations play key roles in Statistics. You will work with either a sample or the entire population when conducting a statistical study.
• Every statistical question focuses on some population.
• The major areas of Statistics are:
  • Descriptive Statistics (summarize, organize, graph)
  • Inferential Statistics (Samples do not provide the complete picture. We must make inferences back to the population when working with a sample.)
• Variables are associated with a population or sample unit. Every statistical study involves variable(s).
• When making an inference, we include a level of reliability to provide insight on how far off our estimate might be.
Slide - 31
1.5 Types of Data
Slide - 32
Types of Data
Quantitative data:
• True numbers (not like social security numbers or phone numbers, but numbers that can be mathematically manipulated).
• Often measurements or counts.
Qualitative data:
• Descriptive data, like gender or hair color.
• This data is classified into one of a group of categories; for instance, defect type is a possible categorical variable.
Slide - 33
Types of Data: all data is either quantitative or qualitative
[Diagram: Types of Data branch into Quantitative Data and Qualitative Data]
Slide - 34
Quantitative Data Examples
Measured on a numerical scale.
1. Temperature: The temperature at which each piece of heat-resistant plastic begins to melt in a sample of 20
2. Unemployment rate: The current unemployment rate (measured as a percentage) for your state of residence
3. GMAT scores: The scores of a sample of 150 MBA applicants who took the GMAT exam
4. Female employee count: The number of female executives in each of a sample of 75 manufacturing companies
Slide - 35
Qualitative Data
Classified into categories.
1. Political party: The political party affiliation (Democrat, Republican, or Independent) in a sample of 50 CEOs
2. Defective status: The defective status (defective or not) of each of 100 Intel computer chips
3. Car size: The size of a car (subcompact, compact, midsize, or full-size) rented by each of a sample of 30 business travelers
4. A taste tester’s ranking: The ranking (best, worst, etc.) of four brands of barbecue sauce for a panel of 10 testers
Slide - 36
Qualitative Data
Is it possible for Car Size to become a numeric variable?
Yes. If we think of it in terms of weight or length, then the variable becomes numeric.
Slide - 37
Example
• Manufacturing plants sometimes discharge toxic-waste materials such as DDT into nearby rivers and streams. (My note: DDT was banned in the US in 1972; however, it is a persistent environmental pollutant and may linger in the environment.)
• These toxins can adversely affect plants and animals living in that area.
• A study of fish in the Tennessee River (in Alabama) and its three tributary creeks (Flint Creek, Limestone Creek, and Spring Creek) was conducted.
Slide - 38
Example
• A total of 144 fish were captured.
• What do you think is the research question?
• Are the 144 fish a sample or a population?
• What is the unit in this study?
The following variables were reported for each fish: (continued on next slide)
Slide - 39
Example (cont)
1. River/creek where each fish was captured
2. Species (channel catfish, largemouth bass, or smallmouth buffalo fish)
3. Length (centimeters)
4. Weight (grams)
5. DDT concentration (parts per million)
These data are saved in the DDT file. Classify each of the five variables measured as quantitative or qualitative.
Slide - 40
Example (cont)
Solution
Quantitative variables: length, weight, DDT concentration
Qualitative variables: river/creek, species
Possible research question: What is the average concentration of DDT by species in each river? Here the 144 fish represent a sample, and each fish is a unit.
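To make the classification and the possible research question concrete, here is a minimal sketch. The four records are made up for illustration and are NOT values from the DDT file; the field names and the helper `variable_type` are also hypothetical. Note that the type check is only a rough shortcut: a value stored as a number (such as an ID code) can still be qualitative, so the classification ultimately depends on meaning, not storage.

```python
from collections import defaultdict
from statistics import mean

# Made-up records for illustration only; not values from the DDT file.
fish = [
    {"river": "Flint Creek", "species": "channel catfish",
     "length_cm": 42.0, "weight_g": 730, "ddt_ppm": 10.0},
    {"river": "Flint Creek", "species": "channel catfish",
     "length_cm": 45.0, "weight_g": 810, "ddt_ppm": 20.0},
    {"river": "Tennessee River", "species": "largemouth bass",
     "length_cm": 30.0, "weight_g": 540, "ddt_ppm": 2.0},
    {"river": "Spring Creek", "species": "smallmouth buffalo fish",
     "length_cm": 38.0, "weight_g": 690, "ddt_ppm": 6.0},
]


def variable_type(values):
    """Label a variable quantitative if every value is a number, else qualitative."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "qualitative"


# Classify each of the five variables.
types = {name: variable_type([row[name] for row in fish]) for name in fish[0]}
for name, kind in types.items():
    print(f"{name}: {kind}")

# Average DDT concentration for each (river, species) combination.
groups = defaultdict(list)
for row in fish:
    groups[(row["river"], row["species"])].append(row["ddt_ppm"])
avg_ddt = {key: mean(vals) for key, vals in groups.items()}
for (river, species), avg in avg_ddt.items():
    print(f"{river} / {species}: {avg:.1f} ppm")
```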
Slide - 41
Summary of Data Types
Data comes in two flavors:
• Quantitative: numbers that can be mathematically manipulated
  • Think about it: there is no meaningful definition of an ‘average social security number’, but your average grade in this class does have meaning.
• Qualitative: consists of categories, like eye color, stages of a disease, or mood (happy/sad)
Slide - 42
1.6 Collecting Data: Data Collection Strategies
Slide - 43
How Do We Obtain Data?
1. Obtain data from a published source
2. Obtain data from running a designed experiment
3. Obtain data using a survey
4. Obtain data from conducting an observational study
Slide - 44
Obtaining Data
1. Published sources include: books, journals, newspapers, and Web sites
Some popular social survey and data web sites include:
• General Social Survey (GSS): Contains a core set of demographic, behavioral, and attitudinal questions, as well as topics of special interest. Many of the core questions have remained unchanged since 1972, which allows for time-trend studies and replication of earlier findings.
• Data.gov: Provides public access to machine-readable datasets generated by the Executive Branch of the Federal Government.
• Harvard Dataverse: A searchable and downloadable repository for research data on many subjects.
Slide - 45
Obtaining Data
• Pew Research Center: A popular platform for gathering data through polls and surveys, primarily focusing on politics, demographics, trends, and social issues.
• International Social Survey Programme (ISSP): A cross-national survey program that conducts annual surveys in a broad group of countries, asking questions on a variety of topics.
• National Center for Health Statistics (NCHS): Contains much data on various health indicators at both national and state levels, including public-use microdata from surveys such as the National Health Interview Survey.
Slide - 46
Obtaining Data
2. Designed experiment
Researcher applies a ‘treatment’ to units in the treatment group, and often uses a placebo with the control group. Randomization helps ensure that both groups are well balanced across all other variables. (What are they talking about??)
3. Survey
A group of people are surveyed and their responses are recorded.
4. Observational study
Units are observed in their natural setting and variables of interest are recorded. For instance, observing second graders in their classroom.
Slide - 47
Designed Experiment
In a designed experiment we typically have a group of experimental units that are assigned the treatment and an untreated (or control) group.
Slide - 48
Observational Study
An observational study is a data-collection method where the experimental units sampled are observed in their natural setting. No attempt is made to control the characteristics of the experimental units sampled. (Examples include opinion polls and surveys.)
Slide - 49
Summary of Data Collection Strategies
• Using existing data sources
• Conducting a designed experiment
• Conducting an observational study
• Utilizing a survey
• Bottom line: Any statistical study requires data; how you obtain it depends on:
  • The research question and what data is currently available
Slide - 50
Types of Samples
Before discussing the types of samples that are possible, we should ask:
“WHY DO WE NEED TO SAMPLE?”
• We sample when it is not possible to work with the entire population, often for one or more of the following reasons:
  • The population is too large
  • The costs are prohibitive
  • It is too time consuming
  • The entire population is not easily accessible
  • It would be too dangerous (consider a new drug or vaccine)
Slide - 51
Types of Samples
A representative sample exhibits characteristics typical of those possessed by the population of interest. A biased sample DOES NOT.
Our goal should always be to obtain a representative sample.
Consider a study of all COVID patients. You select a sample with only adults whose BMI is greater than 30.
This is clearly a biased sample, as there are many individuals in the population with BMIs less than 30.
Slide - 52
Types of Random Samples: Simple Random Sample
1. Simple random sample: A simple random sample of n experimental units (where n represents some positive integer value, like 500) is a sample selected from the population in such a way that:
• Every different sample of size n is equally likely to be selected.
• Note: This is the theoretical definition of a simple random sample. In practice, it is almost impossible to select the sample in a way that ensures this.
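The "every sample of size n is equally likely" idea is easiest to see on a toy population. The sketch below uses a made-up population of four units (the labels A to D are hypothetical) with n = 2, lists every possible sample, and then draws one. It assumes a complete list of the population exists, which is the hard part in real studies.

```python
import random
from itertools import combinations

population = ["A", "B", "C", "D"]  # hypothetical units
n = 2

# Every distinct sample of size n (order ignored).
all_samples = list(combinations(population, n))
chance_each = 1 / len(all_samples)

print(f"There are {len(all_samples)} possible samples of size {n}.")
print(f"Under simple random sampling, each has probability {chance_each:.4f}.")

# random.sample draws without replacement from the full list.
drawn = random.sample(population, n)
print("One draw:", sorted(drawn))
```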
Slide - 53
Random Number Generators can be useful when selecting simple random samples
• Researchers often rely on random number generators to generate the numbers that will be used to select the random sample.
• Most software packages and calculators have this feature.
• Notice this assumes you are able to obtain a list of the population that is sequentially numbered.
Slide - 54
Example
Problem statement: You wish to assess the feasibility of building a new high school and wish to gauge the opinions of people living close to the proposed building site.
The neighborhood adjacent to the site has 711 homes. Use a random number generator to select a simple random sample of 20 households from the neighborhood to participate in the study.
Slide - 55
Example (cont)
Solution
The population consists of the 711 households.
To obtain a simple random sample we can use Excel:
• Assign a number from 1 to 711 to each of the households in the population. These numbers are entered into an Excel worksheet.
• Next apply the random number generator of Excel or XLSTAT (statistical software package for Excel), requesting that 20 households be selected without replacement (meaning you cannot be given the same value twice). One possible set of random numbers generated is 40, 63, 108, . . . , 636 and these are the households to be included in your sample.
• The Excel command would be:
  =RANDARRAY(1,20,1,711,TRUE)
• Note: RANDARRAY on its own can return the same number more than once, so check the 20 values for duplicates and redraw if any appear.
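For comparison, the same selection can be sketched in Python, where the standard library's `random.sample` draws without replacement, so no household number can appear twice. The seed value is an arbitrary assumption used only to make the run repeatable; the numbers it produces will not match the ones quoted in the example.

```python
import random

N_HOUSEHOLDS = 711  # population size from the example
SAMPLE_SIZE = 20    # sample size from the example

rng = random.Random(2025)  # arbitrary seed, chosen only for repeatability

# Households are numbered 1..711; sample() never repeats a value.
household_ids = range(1, N_HOUSEHOLDS + 1)
selected = sorted(rng.sample(household_ids, SAMPLE_SIZE))

print("Households selected for the study:")
print(selected)
```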
Slide - 56
Importance of Selection
How well a sample is selected from a population is of vital importance in statistical inference, as the sample will be used to infer the characteristics of the associated population.
Suppose you selected the men’s basketball team at NCSU as your sample to make an inference of average male undergraduate height.
Slide - 57
Types of Random Samples: Stratified Random Sample
2. Stratified random sampling is used when:
• The experimental units can be separated into groups that are thought to respond differently to the research question
AND
• It is important that their proportion in the sample mirrors that in the population
Example: Testing a COVID vaccine in adults. The elderly often have a suppressed immune response, which may affect vaccine efficacy. If 25% of the population is “elderly”, we would like the sample to reflect that. Using a simple random sample may not achieve that target, but stratified sampling will.
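A minimal sketch of proportionate stratified sampling follows, using the 25% elderly figure from the example. The population of 1,000 numbered adults, the sample size of 40, the seed, and the function name `stratified_sample` are all assumptions for illustration.

```python
import random


def stratified_sample(strata, total_n, rng):
    """Draw a simple random sample within each stratum, sized in
    proportion to the stratum's share of the population.

    strata maps a stratum name to the list of units in it.
    Rounding can make the stratum sizes sum to slightly more or
    less than total_n when the shares do not divide evenly.
    """
    pop_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(total_n * len(units) / pop_size)
        sample[name] = rng.sample(units, k)
    return sample


# Hypothetical population: 1,000 adults, 25% of them elderly.
strata = {
    "elderly": list(range(1, 251)),
    "non-elderly": list(range(251, 1001)),
}

rng = random.Random(7)  # arbitrary seed for repeatability
picked = stratified_sample(strata, total_n=40, rng=rng)
for name, units in picked.items():
    print(f"{name}: {len(units)} units")
```

With these numbers the sample contains 10 elderly and 30 non-elderly adults, so the 25% share is reproduced by construction rather than left to chance.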
Slide - 58
Types of Random Samples: Stratified Random Sample (contd.)
• Key challenge: Obtaining a reasonable approximation regarding the proportion of each group within the population of interest.
Slide - 59
Types of Random Samples: Cluster Sample
3. Cluster sampling: Designed to save time and money, but the results will not be as accurate as a simple random sample or a stratified sample. This methodology can be used when:
• The population can be grouped into clusters, where each cluster resembles the underlying population.
• The researcher will randomly select one or more clusters and collect data from all experimental units within each cluster.
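The two steps of cluster sampling (pick clusters at random, then keep every unit inside them) can be sketched as below. The store names, unit labels, seed, and the function name `cluster_sample` are made up for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters clusters and keep ALL units in each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population grouped into four clusters.
clusters = {
    "store_1": ["s1_a", "s1_b", "s1_c"],
    "store_2": ["s2_a", "s2_b"],
    "store_3": ["s3_a", "s3_b", "s3_c", "s3_d"],
    "store_4": ["s4_a", "s4_b", "s4_c"],
}

rng = random.Random(11)  # arbitrary seed for repeatability
sampled = cluster_sample(clusters, n_clusters=2, rng=rng)
for name, units in sampled.items():
    print(f"{name}: {units}")
```

Unlike stratified sampling, the randomness here is only in which clusters are chosen; within a chosen cluster nothing is left out.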
Slide - 60
Types of Random Samples: Systematic Sample
4. Systematic sampling: systematically selects every kth experimental unit, typically from a list of all experimental units.
• This is often used when sampling from an assembly line. (Clearly there is no official list of all experimental units in this scenario.)
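Selecting every kth unit can be sketched in a few lines. The random starting point within the first k units is a common convention assumed here, not something stated on the slide; the 100 numbered units, k = 10, the seed, and the function name `systematic_sample` are likewise illustrative.

```python
import random


def systematic_sample(units, k, rng):
    """Pick a random start among the first k units, then every kth unit."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    start = rng.randrange(k)  # index 0 .. k-1
    return units[start::k]


# Hypothetical run of 100 items coming off an assembly line.
units = list(range(1, 101))

rng = random.Random(3)  # arbitrary seed for repeatability
inspected = systematic_sample(units, k=10, rng=rng)
print("Items inspected:", inspected)
```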
Slide - 61
Nonrandom Sample Errors
Selection bias results when a subset of the experimental units in the population is excluded so that these units have no chance of being selected for the sample.
Nonresponse bias results when the researchers conducting a survey or study are unable to obtain data on all experimental units selected for the sample but continue with the study anyway.
Measurement error refers to inaccuracies in the values of the data recorded. In surveys, the error may be due to ambiguous or leading questions and the interviewer’s effect on the respondent.
Slide - 62
Example
• Mobile Marketer hopes to find out what is the most popular device used by online shoppers.
• They hire the mobile video ad network AdColony to conduct a nationwide survey of 1,000 US online shoppers.
• The most popular device is a smartphone, used by 56% of the online shoppers. 28% used a desktop or laptop computer, and 16% used a tablet.
a. Identify the data-collection method.
b. Identify the target population.
c. Are the sample data representative of the population?
Slide - 63
Example (cont)
Solution
a. Identify the data-collection method.
The data-collection method is a survey: 1,000 US online shoppers participated in the study.
b. Identify the target population.
Presumably, Mobile Marketer is interested in the devices used by all US online shoppers. Consequently, the target population is all US consumers who use the Internet for online shopping.
Slide - 64
Example (cont)
c. Are the sample data representative of the population?
Because the 1,000 respondents clearly make up a subset of the target population, they do form a sample, but is it representative of the population?
It is not clear how the sample was obtained.
If the respondents were obtained using, say, random-digit telephone dialing, then the sample is likely to be representative because it is a random sample.
Slide - 65
Example (cont)
However, if the questionnaire was made available to anyone surfing the Internet, then the respondents are self-selected, also known as a volunteer sample.
Such a survey often suffers from nonresponse bias; i.e., those who chose not to respond or who never saw the questionnaire might have answered the questions differently, leading to a lower (or higher) sample percentage.
Slide - 66
Summary of Types of Random Samples
• Simple random sample: Facilitated through the use of a random number generator; each possible sample has the same chance of being selected.
• Cluster sample: When the population can be broken down into sub-groups (clusters) that are representative of the population. The researcher will select one or more clusters and collect data from each unit.
• Stratified sample: When the research question may be sensitive to the response of different subgroups within the population. Requires knowledge of the proportion associated with each sub-group.
• Systematic sample: Often used in manufacturing, where we sample every kth unit from a list or assembly line.
Slide - 67
Summary of Nonrandom Sample Errors
• Selection bias: Excluding some subset of the population from the sampling process. This subset will then have no representation in the sample.
• Nonresponse bias: Some of the units selected for the sample are unwilling to provide responses. The sample becomes biased.
• Measurement error: Inaccuracies in the values recorded. Could be related to how the question was worded, a simple transcription error, or a question the participant does not want to answer truthfully. This will result in bias in the sample. Very difficult to quantify in terms of its magnitude.
Slide - 68
1.7 Critical Thinking with Statistics
Slide - 69
Statistical Thinking
Business analytics refers to methodologies (e.g., statistical methods) that extract useful information from data in order to make better business decisions.
Statistical thinking involves applying rational thought and the science of statistics to critically assess data and inferences. Fundamental to the thought process is that variation exists in populations and process data.
Slide - 70
Statistics in Business Analytics
A good analyst must be able to reformulate the business problem into a statistical question.
Slide - 71
Key Ideas
Types of Statistical Applications: Descriptive and Inferential
Descriptive Statistics involves:
1. Identify population and sample (collection of experimental units)
2. Identify variable(s)
3. Collect data
4. Describe data
Slide - 72
Key Ideas
Types of Statistical Applications
Inferential Statistics involves:
1. Identify population (i.e., describe the collection of all experimental units of interest)
2. Identify variable(s)
3. Collect sample data (subset of population)
4. Inference about population based on sample
5. Measure of reliability for inference
Slide - 73
Key Ideas
Types of Data
1. Quantitative (numerical in nature)
2. Qualitative (categorical in nature)
Slide - 74
Key Ideas
Data-Collection Methods
1. Observational (e.g., survey)
2. Published source
3. Designed experiment
Slide - 75
Key Ideas
Types of Random Samples
1. Simple random sample
2. Stratified random sample
3. Cluster sample
4. Systematic sample
Slide - 76
Key Ideas
Problems with Nonrandom Samples
1. Selection bias: If you deliberately exclude a specific subset of the population from the sampling process, the sample is no longer random.
2. Nonresponse bias: If the manner in which the survey is made available can result in certain sub-groups having reduced participation, the sample is no longer random.
3. Measurement error: Results from multiple conditions, including misleading/ambiguous questions,