Chapter 1

Course: ECONOMIC 2152A (Statistics)
Describing Data: The Role of Graphical Statistics
Hossein Ghaderi
Western University
September 9, 2024

Introduction
- Graphical statistics are essential tools for predicting or forecasting variables such as:
  - Sales of a new product
  - Construction costs
  - Customer satisfaction levels
  - The weather
  - Election results
  - University enrollment figures
  - Grade point averages
  - Interest rates and currency exchange rates
- These variables have significant effects on our daily lives.

The Need for Data Interpretation
- Governments, businesses, and researchers spend billions collecting data.
- Once data are collected, the real challenge begins:
  - How do we interpret the data?
  - What insights can be derived from the data to support decision-making?
- This is where graphical statistics play a key role.

Graphical Statistics in Decision Making
- Graphical statistics help visualize large datasets in an easily interpretable form.
- They reveal patterns, trends, and outliers that may not be immediately apparent in raw data.
- By visualizing data, decision-makers can:
  - Identify trends and forecast future behavior
  - Understand variability and uncertainty
  - Make informed decisions based on insights

Understanding Data Through Statistics
- In our study of statistics, we learn many tools to help us:
  - Process, summarize, analyze, and interpret data
  - Make better decisions in an uncertain environment
- Understanding statistics allows us to make sense of the data we collect.

Tables and Graphs in Statistics
- Tables and graphs introduced in this chapter help us:
  - Gain a better understanding of data
  - Provide visual support for improved decision making
- Reports are enhanced by appropriate tables and graphs, including:
  - Frequency distributions, bar charts, pie charts, Pareto diagrams
  - Line charts, histograms, stem-and-leaf displays, ogives
- Visualization of data is important for effective communication.

Statistical Thinking in Decision Making
- Decisions are often made based on limited information. Examples include:
  - Accountants selecting records for auditing purposes
  - Financial investors understanding market fluctuations
  - Managers using surveys to assess customer satisfaction
  - Marketing executives gathering customer preferences or demographics
- Even without certainty, decisions need to be made, such as balancing portfolios while future market movements remain unknown.

Statistical Thinking in Practice
- In each situation, the process includes:
  - Defining the problem
  - Determining the data needed
  - Collecting and summarizing the data
  - Making inferences and decisions based on the data
- Statistical thinking is essential from problem definition to decision making, potentially leading to:
  - Reduced costs
  - Increased profits
  - Improved processes
  - Enhanced customer satisfaction

Population and Sample in Market Research
- Before bringing a new product to market, manufacturers assess likely demand by conducting market research surveys.
- They are interested in the entire population of potential buyers.
- However, analyzing large populations is often impractical due to cost or time constraints.
- Instead, manufacturers collect data from a sample (a subset of the population).
- A population is the complete set of all items of interest. Its size, denoted N, can be very large or even infinite.
- A sample is an observed subset of the population, with sample size denoted n.

Examples of Populations
- Examples of populations include:
  - All potential buyers of a new product
  - All stocks traded on the NYSE Euronext
  - All registered voters in a particular city or country
  - All accounts receivable for a corporation
- Our goal is to make statements about the population based on the sample.
- We need a representative sample, and randomness in sample selection is crucial.

Random Sampling
- Simple random sampling selects a sample of n objects such that:
  - Each member is chosen strictly by chance.
  - The selection of one member does not influence the selection of another.
  - Every possible sample of size n has the same chance of being selected.
- The term "random sample" often refers to simple random sampling.

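The conditions above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the sampling frame of 5,000 numbered buyers, the function name `simple_random_sample`, and the seed are all hypothetical and not part of the slides.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    return rng.sample(population, n)   # sampling without replacement


# Hypothetical sampling frame: ID numbers for N = 5000 potential buyers.
population = list(range(1, 5001))
sample = simple_random_sample(population, n=100, seed=42)

print(len(sample), "members drawn;", len(set(sample)), "are distinct")
```

Because the draw is without replacement, no member can appear twice, and no member's selection changes the chance that any other particular member is chosen.
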
Systematic Sampling
- In systematic sampling, every j-th item is selected from the population.
- The ratio j is calculated as j = N/n, where N is the population size and n is the desired sample size.
- Randomly select a number from 1 to j to get the first item, then select every j-th item afterward.
- Example: If N = 5000 and n = 100, then j = 50. If the first randomly selected number is 20, select items 20, 70, 120, and so on.

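The selection rule can be sketched in a few lines of Python. This is a minimal sketch, assuming N is an exact multiple of n (as in the slide's example); the function name `systematic_sample` and its parameters are hypothetical.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the positions (1-based) picked by systematic sampling.

    Assumes N is a multiple of n, so the step j = N / n is a whole number.
    """
    j = N // n
    if start is None:
        # Random starting point between 1 and j, inclusive.
        start = random.Random(seed).randint(1, j)
    return [start + i * j for i in range(n)]


# Slide example: N = 5000, n = 100, first selected number is 20.
picked = systematic_sample(N=5000, n=100, start=20)
print(picked[:3], "...", picked[-1])
```

With a start of 20 and j = 50, the positions run 20, 70, 120, and so on up to 4970, for a total of 100 items.
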
Systematic Sampling Considerations
- Systematic sampling assumes the population is in random order.
- If there is an unknown link between the ordering and the subject of study, bias could be introduced.
- Systematic samples provide a good representation of the population if there is no cyclical variation in the data.

Parameter and Statistic
- Suppose we want to know the average age of registered voters in the United States.
- The population is too large to analyze fully, so we might take a random sample, e.g., 500 voters, and calculate their average age.
- The average based on the sample data is called a **statistic**.
- If we could calculate the average age of the entire population, that would be called a **parameter**.

Parameter and Statistic: Definitions
- A **parameter** is a numerical measure that describes a specific characteristic of a population.
- A **statistic** is a numerical measure that describes a specific characteristic of a sample.
- Throughout this course, we will study how to make decisions about a population parameter based on a sample statistic.
- We must accept an element of uncertainty since we do not know the exact value of the parameter.

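A small simulation may make the distinction concrete. The sketch below is an assumption-laden illustration: the population of 100,000 voter ages is randomly generated (not real data), and the variable names are hypothetical. In practice the parameter is unknown, which is exactly why the statistic is needed.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: made-up ages for 100,000 registered voters.
population_ages = [rng.randint(18, 90) for _ in range(100_000)]

# Parameter: the mean of the whole population (normally unknown).
parameter = statistics.mean(population_ages)

# Statistic: the mean of a simple random sample of 500 voters.
sample_ages = rng.sample(population_ages, 500)
statistic = statistics.mean(sample_ages)

print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}")
print(f"difference (sampling error) = {statistic - parameter:.2f}")
```

The gap between the two numbers is the uncertainty we accept when we reason about the population from a sample.
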
Sampling and Nonsampling Errors
- **Sampling error** occurs because the information is only available for a subset of the population (the sample).
- **Nonsampling errors** can occur even in a complete census of the population.
- Examples of nonsampling errors include:
  - Sampling from the wrong population
  - Survey subjects giving inaccurate or dishonest answers
  - No response from survey subjects

Example of Nonsampling Error: Wrong Population Sampled
- In 1936, Literary Digest magazine predicted Alfred Landon would win the U.S. presidential election.
- The prediction was wrong because their sample was taken from telephone directories, magazine subscription lists, and car registrations.
- This sample underrepresented the poor, who were mostly Democrats and voted for Franklin Roosevelt.
- Conclusion: To make valid inferences, the sample must represent the correct population.

Example of Nonsampling Error: Inaccurate Responses
- Survey subjects may give inaccurate or dishonest answers. This could be due to:
  - Poorly worded questions that are hard to understand or bias the response
  - Sensitive questions leading to dishonest answers
- For example, a plant manager asking employees what they have stolen may not yield reliable answers.

Example of Nonsampling Error: Nonresponse
- Survey subjects may not respond at all or may omit answers to certain questions. This leads to:
  - Larger **sampling error**, because the sample size is smaller
  - **Nonsampling error**, if the respondents differ in important ways from the larger population
- Nonresponse can induce bias if the respondents are not representative of the population of interest.

Thinking Statistically: Problem Definition
- Thinking statistically begins with problem definition:
  1. What information is required?
  2. What is the relevant population?
  3. How should sample members be selected?
  4. How should information be obtained from the sample members?
- Once we have sample information, we use it to make decisions about the population.
- Finally, we draw conclusions about the population based on the sample.

Descriptive and Inferential Statistics
- **Descriptive statistics** focus on graphical and numerical methods to summarize and process data.
- **Inferential statistics** use data to make predictions, forecasts, and estimates to support decision-making.
- Both are essential tools for analyzing and interpreting data.

Variable Classification
- A **variable** is a specific characteristic of an individual or object, such as age or weight.
- Variables can be classified into two types:
  1. **Categorical variables**: Responses that fall into groups or categories.
  2. **Numerical variables**: Measured quantities that can be discrete or continuous.

Categorical Variables
- Categorical variables produce responses that belong to predefined groups or categories.
- Examples include:
  - Yes/No questions: "Are you a business major?" or "Do you own a car?"
  - Gender, marital status, or the type of errors in health care claims (procedural, diagnostic, etc.).
  - Faculty evaluation responses ranging from "strongly disagree" to "strongly agree."

Numerical Variables
- Numerical variables can be either **discrete** or **continuous**:
  - **Discrete numerical variables**: Often countable and finite. Examples: number of students enrolled in a class, number of university credits, number of stocks in an investor's portfolio.
  - **Continuous numerical variables**: Can take any value within a given range. Continuous variables are typically measured quantities, such as weight or height.

Continuous Numerical Variables
- A **continuous numerical variable** may take on any value within a given range of real numbers.
- It usually arises from a measurement process.
- Example: The same height might be recorded as 72.1 inches or 71.8 inches, depending on the accuracy of the measurement.
- Other examples: Weight of a cereal box, time to run a race, distance between cities, temperature.
- Continuous variables are often truncated in daily conversation and treated like discrete variables.

Measurement Levels
- Data can be classified as either **qualitative** or **quantitative**.
- **Qualitative data**: The difference between two numbers has no measurable meaning.
- **Quantitative data**: The difference between two numbers has a measurable meaning.
- Examples:
  - Qualitative: Football players' jersey numbers (7 vs. 10) do not imply skill differences.
  - Quantitative: Exam scores (90 vs. 45) have measurable meaning in terms of performance.

Nominal and Ordinal Levels of Measurement
- **Nominal data**: Responses to categorical questions with no implied ranking. Examples:
  - Gender (1 = Male, 2 = Female)
  - Car ownership (1 = Yes, 2 = No)
- **Ordinal data**: Rank ordering of items, but with no measurable meaning to the difference between ranks. Examples:
  - Product quality (1: poor, 2: average, 3: good)
  - Satisfaction rating (1: very dissatisfied to 5: very satisfied)
  - Consumer preference among soft drinks (1: most preferred, 2: second choice, 3: third choice)

Interval and Ratio Levels of Measurement
- **Interval data**: Provide rank and distance from an arbitrary zero.
  - Example: Temperature measured in Celsius or Fahrenheit.
  - The difference between 30°C and 10°C is 20°, but it is incorrect to say that 30°C is three times warmer than 10°C.
- **Ratio data**: Have a meaningful zero point and provide both rank and meaningful differences.
  - Example: Height or weight, where zero has a natural meaning (absence of the quantity).

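A quick calculation shows why ratios fail on an interval scale. The sketch below converts the slide's two Celsius readings to Kelvin, a scale with a natural zero; the Kelvin comparison is an added illustration, not something taken from the slides, and the helper name `celsius_to_kelvin` is hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Shift to the Kelvin scale, whose zero is a natural (absolute) zero."""
    return celsius + 273.15


# Differences are meaningful on an interval scale.
difference = 30 - 10                      # 20 degrees

# Ratios are not: the "3 times" figure depends on where zero was placed.
naive_ratio = 30 / 10                     # 3.0 on the Celsius scale
kelvin_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(10)

# Weight is ratio data, so this ratio is meaningful.
weight_ratio = 200 / 100                  # 200 lb is twice 100 lb

print(difference, naive_ratio, round(kelvin_ratio, 3), weight_ratio)
```

On the Kelvin scale the same two temperatures differ by a factor of only about 1.07, which is why "three times warmer" is not a valid statement about Celsius readings.
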
Examples: Nominal and Ordinal Data
- **Nominal data**:
  - Gender: Male/Female
  - Political affiliation
  - Car ownership
- **Ordinal data**:
  - Quality rating: (1: poor, 2: average, 3: good)
  - Preference rankings: First, second, third choice.

Examples: Interval and Ratio Data
- **Interval data**:
  - Temperature in Celsius/Fahrenheit
  - Years (e.g., 2023 vs. 1990)
- **Ratio data**:
  - Height, weight, or distance between cities.
  - Time to complete a task or run a race.

Ratio Data
- **Ratio data** indicate both rank and distance from a natural zero point.
- Ratios between two measures are meaningful.
- Example: A person weighing 200 pounds is twice as heavy as someone weighing 100 pounds.
- Another example: A person aged 40 is twice the age of someone who is 20 years old.
- Ratio data allow us to make meaningful comparisons and compute proportions.

Classifying Data by Type
- After collecting data, responses are classified as:
  - **Categorical** (e.g., gender, political affiliation)
  - **Numerical** (e.g., age, income, weight)
- Data can also be classified by the measurement scale:
  - Nominal, Ordinal, Interval, Ratio
- After classification, we assign an arbitrary ID or code to each response for easier data handling.

Graph Selection for Data Types
- Different types of graphs are used based on the data type:
  - **Categorical variables**: Bar charts, pie charts
  - **Numerical variables**: Histograms, line graphs, scatter plots
- Choosing the right graph helps visualize and interpret data more effectively.

Handling Missing Values
- Data files often contain **missing values**, especially in survey responses.
- Respondents may skip sensitive questions about gender, age, or income.
- Missing values require special codes in the data entry process to ensure accuracy.
- Failure to properly handle missing values can lead to erroneous results in analysis.
- Statistical software packages differ in how they handle missing values.

Example: Handling Missing Values
- Suppose a survey asks respondents about their income, but some people choose not to answer.
- In the data file, these missing values should be coded (e.g., -999 or NaN) to differentiate them from valid responses.
- Statistical software may exclude these missing values from calculations or, if they are not handled correctly, treat them as valid numbers such as zero.
- Correct handling of missing values is crucial for accurate statistical output.

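The effect of a mishandled missing-value code can be seen in a small sketch. The income figures below are invented for illustration, and the names `MISSING_CODE`, `raw_incomes`, and so on are hypothetical; only the -999 code and the NaN convention come from the slide.

```python
import math
import statistics

MISSING_CODE = -999  # sentinel code for a skipped answer, as on the slide

# Hypothetical survey incomes in dollars; two respondents did not answer.
raw_incomes = [52000, -999, 61000, 47000, -999, 58000]

# Recode the sentinel as NaN, then keep only the valid responses.
incomes = [math.nan if x == MISSING_CODE else x for x in raw_incomes]
valid = [x for x in incomes if not math.isnan(x)]

wrong_mean = statistics.mean(raw_incomes)   # treats -999 as a real income
correct_mean = statistics.mean(valid)       # excludes the missing answers
n_missing = len(incomes) - len(valid)

print(f"missing: {n_missing}, wrong mean: {wrong_mean:.2f}, correct mean: {correct_mean:.2f}")
```

Leaving the sentinel in the data drags the mean far below every real response, which is the kind of erroneous result the slide warns about.
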
Describing Categorical Variables
- Categorical variables can be described using:
  - **Frequency distribution tables**
  - **Bar charts**, **pie charts**, and **Pareto diagrams**
- These tools are commonly used by managers and marketing researchers to describe data collected from surveys and questionnaires.

Frequency Distribution
- A **frequency distribution** is a table used to organize data.
- The left column (classes or groups) includes all possible responses for a categorical variable.
- The right column lists the frequencies or number of observations for each class.
- A **relative frequency distribution** is obtained by dividing each frequency by the total number of observations and multiplying by 100 to express it as a percentage.

Tables and Charts for Categorical Data
- The classes used for constructing frequency distribution tables are the possible responses to a categorical variable.
- **Bar charts** and **pie charts** are commonly used to represent categorical data.
- A **bar chart** uses the height of rectangles to represent each frequency, and the bars do not need to touch.

Example 1.1: Activity Level (Frequency Distribution and Bar Chart)
- The U.S. Department of Agriculture (USDA) and National Center for Health Statistics (NCHS) conducted surveys to assess the health and nutrition of the U.S. population.
- One variable in the **Healthy Eating Index (HEI-2005)** study is a participant's activity level, coded as:
  - 1 = Sedentary
  - 2 = Active
  - 3 = Very active
- We set up a frequency distribution and bar chart of activity level for the HEI-2005 participants during their first interview.

Table 1.1: HEI-2005 Participants' Activity Level: First Interview

Participants    Frequency   Percent
Sedentary           2,183     48.9%
Active                757     17.0%
Very Active         1,520     34.1%
Total               4,460    100.0%

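The Percent column of Table 1.1 can be reproduced from the frequencies alone. The following is a minimal sketch; the dictionary name `counts` and the print layout are illustrative choices, while the three frequencies are taken from the table.

```python
# Frequencies from Table 1.1 (HEI-2005 participants, first interview).
counts = {"Sedentary": 2183, "Active": 757, "Very Active": 1520}

total = sum(counts.values())

# Relative frequency: each frequency divided by the total, times 100.
percents = {level: round(100 * f / total, 1) for level, f in counts.items()}

for level, f in counts.items():
    print(f"{level:<12}{f:>7,}{percents[level]:>8.1f}%")
print(f"{'Total':<12}{total:>7,}{100.0:>8.1f}%")
```

The same three numbers are the bar heights in the bar chart, so the table and the chart carry identical information.
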
Figure 1.1: Bar Chart of Activity Level
[Bar chart. x-axis: Activity Level (Sedentary, Active, Very Active); y-axis: Number of Participants; bar heights: 2,183, 757, and 1,520.]
HEI-2005 Participants' Activity Level: First Interview (Bar Chart)

Cross Tables
- A **cross table** (or crosstab) lists the number of observations for every combination of values for two categorical or ordinal variables.
- The combination of all possible intervals for the two variables defines the cells in a cross table.
- A cross table with r rows and c columns is referred to as an r × c cross table.

Example 1.2: Cross Tables and Bar Charts
- Cross tables are useful for describing relationships between categorical or ordinal variables.
- Example: Comparing participants' activity levels (sedentary, active, very active) with other categorical variables, such as age groups or educational levels.
- Component bar charts and cluster bar charts extend the simple bar chart for multiple variables.

Example 1.2: Activity Level and Gender (Component and Cluster Bar Charts)
- This example compares activity levels (sedentary, active, very active) with gender (male, female) using data from the **HEI-2005** study.
- We will use component (stacked) and cluster (side-by-side) bar charts to visualize the comparison.

Table 1.2: HEI-2005 Participants' Activity Level (First Interview) by Gender

Activity Level    Males   Females   Total
Sedentary           957     1,226   2,183
Active              340       417     757
Very Active         842       678   1,520
Total             2,139     2,321   4,460

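The margins of Table 1.2 follow directly from the six cell counts. The sketch below stores the 3 × 2 cross table as a nested dictionary and recomputes the row, column, and grand totals; the within-gender percentages at the end are an added illustration that does not appear in the slides, and all variable names are hypothetical.

```python
# Cell counts from Table 1.2: activity level (rows) by gender (columns).
cross = {
    "Sedentary":   {"Males": 957, "Females": 1226},
    "Active":      {"Males": 340, "Females": 417},
    "Very Active": {"Males": 842, "Females": 678},
}
genders = ("Males", "Females")

row_totals = {level: sum(cells.values()) for level, cells in cross.items()}
col_totals = {g: sum(cells[g] for cells in cross.values()) for g in genders}
grand_total = sum(row_totals.values())

# Added illustration: share of each gender found at each activity level.
col_pct = {
    g: {level: round(100 * cross[level][g] / col_totals[g], 1) for level in cross}
    for g in genders
}

print(row_totals, col_totals, grand_total)
print(col_pct)
```

Column percentages are one way to compare the two groups fairly when, as here, the numbers of males and females in the study differ.
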
Figure 1.2: Component (Stacked) Bar Chart
[Stacked bar chart. x-axis: Gender (Male, Female); y-axis: Number of Participants; segments: Sedentary (957; 1,226), Active (340; 417), Very Active (842; 678).]
HEI-2005 Participants' Activity Level by Gender (Component Bar Chart)

Figure 1.3: Cluster (Side-by-Side) Bar Chart
[Cluster bar chart. x-axis: Activity Level (Sedentary, Active, Very Active); y-axis: Number of Participants; Male bars: 957, 340, 842; Female bars: 1,226, 417, 678.]
HEI-2005 Participants' Activity Level by Gender (Cluster Bar Chart)

Conclusion
- Component (stacked) bar charts show the contribution of each activity level within a gender group.
- Cluster (side-by-side) bar charts compare male and female participants directly for each activity level.
- Both charts provide insights into the distribution of activity levels by gender for HEI-2005 participants.

Pie Charts and Their Use
- **Pie charts** are useful for showing proportions and shares of a whole.
- The circle represents the total, and the segments depict the shares of different categories.
- The area of each segment is proportional to the corresponding frequency or share.

Example 1.3: Browser Wars - Market Share (Pie Charts)
- In February 2011, browser market shares differed between Europe and North America.
- We will visualize the market shares for both regions using pie charts.

Table 1.3: Market Shares of Browsers in Europe and North America (February 2011)

Browser              European Market (%)   North American Market (%)
Firefox                            37.69                       26.24
Internet Explorer                  36.54                       48.16
Google Chrome                      16.03                       13.76
Safari                              4.90                       10.58
Opera                               4.26                        0.58
Others                              0.58                        0.68

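Because each pie segment is proportional to its share, a share of p percent corresponds to a central angle of p × 3.6 degrees. The sketch below applies this to the European column of Table 1.3; the angle calculation is a standard consequence of proportionality rather than something stated in the slides, and the variable names are hypothetical.

```python
# European market shares (%) from Table 1.3, February 2011.
europe = {
    "Firefox": 37.69,
    "Internet Explorer": 36.54,
    "Google Chrome": 16.03,
    "Safari": 4.90,
    "Opera": 4.26,
    "Others": 0.58,
}

total_share = sum(europe.values())   # should be 100 percent

# 100 percent of the circle is 360 degrees, so 1 percent is 3.6 degrees.
angles = {browser: share * 3.6 for browser, share in europe.items()}

for browser, angle in angles.items():
    print(f"{browser:<18}{europe[browser]:>6.2f}%{angle:>8.1f} degrees")
```

Checking that the shares total 100 percent before drawing is a cheap safeguard against a mistyped table entry.
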
Figure 1.4: European Market Share (Pie Chart)
[Pie chart. Segments: Firefox 37.69%, Internet Explorer 36.54%, Google Chrome 16.03%, Safari 4.90%, Opera 4.26%, Others 0.58%.]
Browser Market Share in Europe (February 2011)

Figure 1.5: North American Market Share (Pie Chart)
[Pie chart. Segments: Firefox 26.24%, Internet Explorer 48.16%, Google Chrome 13.76%, Safari 10.58%, Opera 0.58%, Others 0.68%.]
Browser Market Share in North America (February 2011)

Conclusion
- The European market was dominated by Firefox (37.69%) and Internet Explorer (36.54%) in February 2011.
- In North America, Internet Explorer held the largest share with 48.16%, while Firefox had 26.24%.
- Pie charts are effective tools for visualizing the distribution of market shares among competing categories.

Pareto Diagrams and Their Use
- **Pareto diagrams** are bar charts used to emphasize the most frequent causes of defects.
- The diagram is used to separate the "vital few" from the "trivial many."
- The Italian economist **Vilfredo Pareto** observed that a small number of factors are responsible for most of the problems, an observation commonly known as the **80–20 rule**.
- Bars are arranged from the most frequent cause to the least frequent cause from left to right.

Example 1.4: Health Care Claims Processing Errors
- A health insurance company set a goal to reduce errors by 50%.
- After auditing 1,000 claims, the team identified the most frequent errors in the claims processing system.
- The data are listed in Table 1.4, and the Pareto diagram helps identify the most significant factors contributing to these errors.

Table 1.4: Frequency of Health Care Claims Processing Errors

Error Type                          Frequency
Procedural and Diagnostic Codes            40
Contractual Applications                   37
Pricing Schedules                          17
Provider Information                        9
Provider Adjustments                        7
Patient Information                         6
Program and System Errors                   4

Figure 1.6: Pareto Diagram for Health Care Claims Processing Errors
[Bar chart, bars in descending order. x-axis: Error Type (Procedural Codes, Contractual Apps, Pricing Schedules, Provider Info, Provider Adj, Patient Info, Program Errors); y-axis: Frequency of Errors; bar heights: 40, 37, 17, 9, 7, 6, 4.]

Cumulative Percentages for Pareto Diagram
[Line chart. x-axis: Error Type; y-axis: Cumulative Percentage (0 to 100).]
Cumulative Percentage of Errors

Conclusion
- The Pareto diagram shows that **Procedural and Diagnostic Codes** and **Contractual Applications** account for over 60% of the errors.
- By addressing these "vital few" errors, the company can achieve significant reductions in total errors.
- This aligns with the **80–20 rule**, where most of the problems come from a few key factors.

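The "over 60%" figure can be checked by accumulating the frequencies of Table 1.4 in descending order. This is a minimal sketch using only the table's counts; the variable names are hypothetical.

```python
from itertools import accumulate

# Error frequencies from Table 1.4.
errors = {
    "Procedural and Diagnostic Codes": 40,
    "Contractual Applications": 37,
    "Pricing Schedules": 17,
    "Provider Information": 9,
    "Provider Adjustments": 7,
    "Patient Information": 6,
    "Program and System Errors": 4,
}

# A Pareto diagram orders the causes from most to least frequent.
ordered = sorted(errors.items(), key=lambda item: item[1], reverse=True)
total = sum(errors.values())

cumulative = list(accumulate(freq for _, freq in ordered))
cumulative_pct = [round(100 * c / total, 1) for c in cumulative]

for (cause, freq), pct in zip(ordered, cumulative_pct):
    print(f"{cause:<34}{freq:>4}{pct:>8.1f}%")
```

The first two causes together reach 77 of the 120 recorded errors, or about 64.2 percent, which is the basis for treating them as the "vital few."
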
Introduction to Time-Series Plots
- A **time-series plot** (also called a line chart) is used to display data points collected or measured at successive points in time.
- In a time series, the sequence of observations is important.
- Time-series plots are useful for identifying trends, cycles, or seasonal patterns in data.
- Examples of time-series data include GDP, currency exchange rates, stock prices, and corporate earnings.

Example 1.5: Gross Domestic Product (Time-Series Plot)
- The U.S. **Bureau of Economic Analysis (BEA)** provides annual GDP data from 1929 through 2009.
- We will plot the GDP data to identify long-term trends in economic growth.

Figure 1.7: Time-Series Plot of GDP (1929–2009)
[Line chart. x-axis: Year (1929 to 2009); y-axis: Billions of Real 2005 Dollars.]
Gross Domestic Product (GDP) from 1929–2009

Example 1.6: Currency Exchange Rates (Time-Series Plot)
- Exchange rates fluctuate over time and are important for investors, travelers, and businesses.
- We will plot the exchange rates between USD and EUR, and USD and GBP, for the 6-month period from **August 22, 2010** to **February 17, 2011**.

Figure 1.8: Time-Series Plot of USD to EUR (Aug 2010 - Feb 2011)
[Line chart. x-axis: Date (Aug 2010 - Feb 2011); y-axis: Exchange Rate (USD to EUR), 1.25 to 1.45.]
Currency Exchange Rates: USD to EUR

Figure 1.9: Time-Series Plot of USD to GBP (Aug 2010 - Feb 2011)
[Line chart. x-axis: Date (Aug 2010 - Feb 2011); y-axis: Exchange Rate (USD to GBP), 1.5 to 1.65.]
Currency Exchange Rates: USD to GBP

Conclusion
- **Time-series plots** allow us to observe trends over time for different variables.
- The **GDP time-series plot** shows steady growth from 1929 to 2009.
- The **currency exchange rate plots** reveal fluctuations between USD and both EUR and GBP from August 2010 to February 2011.
- Time-series analysis is essential for understanding trends, cycles, and seasonal patterns in economic data.

Introduction to Frequency Distributions
- A **frequency distribution** is a table that summarizes data by listing the classes and the number of observations in each class.
- For numerical data, we use formulas to determine the number of classes and the class width.
- A **cumulative frequency distribution** adds the frequencies of all classes up to the current class.
- These tools help summarize data and improve communication of results.

Table 1.6: Completion Times (in Seconds) for Employees

271  236  294  252  254  263  266  222  262  278
288  262  237  247  282  224  263  267  254  271
278  263  262  288  247  252  264  263  247  225
281  279  238  252  242  248  263  255  294  268
255  272  271  291  263  242  288  252  226  263
269  227  273  281  267  263  244  249  252  256
263  252  261  245  252  294  288  245  251  269
256  264  252  232  275  284  252  263  274  252
252  256  254  269  234  285  275  263  263  246
294  252  231  265  269  235  275  288  294  263
247  252  269  261  266  269  236  276  248  299

Calculating Class Width
- The number of classes, k, is chosen based on the size of the dataset. In this case, we have chosen k = 8 classes.
- The class width w is determined using the formula:
  $$w = \frac{\text{Largest Observation} - \text{Smallest Observation}}{\text{Number of Classes}}$$
- From the data in **Table 1.6**, the largest observation is **299** and the smallest observation is **222**.
- Substituting into the formula:
  $$w = \frac{299 - 222}{8} = 9.625$$
- Since class width must be rounded upward, the class width is **10**.
- The first class interval is from **220 to less than 230**, and subsequent classes are created by adding the class width.

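The class-width arithmetic can be written out as a short sketch. The function name `class_width` is hypothetical, and rounding up with `math.ceil` is a simplification: in practice the width is rounded up to a convenient number, which here happens to be the next integer. The starting point of 220 is the one chosen in the slides.

```python
import math


def class_width(largest, smallest, k):
    """Raw class width: the range of the data divided by the number of classes."""
    return (largest - smallest) / k


k = 8
raw_width = class_width(largest=299, smallest=222, k=k)   # (299 - 222) / 8
width = math.ceil(raw_width)                              # round upward

# Lower limits of the k classes, starting at 220 as in the slides.
lower_limits = [220 + i * width for i in range(k)]
classes = [(low, low + width) for low in lower_limits]    # "low to less than high"

print(raw_width, width)
print(classes)
```

Rounding upward guarantees that the k classes together cover the full range of the data; rounding down to 9 would leave the largest observations without a class.
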
Table 1.7: Frequency and Relative Frequency Distributions for Completion Times

Completion Times (in Seconds)   Frequency   Percent
220 less than 230                       5      4.5%
230 less than 240                       8      7.3%
240 less than 250                      13     11.8%
250 less than 260                      22     20.0%
260 less than 270                      32     29.1%
270 less than 280                      13     11.8%
280 less than 290                      10      9.1%
290 less than 300                       7      6.4%

Table 1.8: Cumulative Frequency and Relative Cumulative Frequency for Completion Times

Completion Times (in Seconds)   Cumulative Frequency
Less than 230                                      5
Less than 240                                     13
Less than 250                                     26
Less than 260                                     48
Less than 270                                     80
Less than 280                                     93
Less than 290                                    103
Less than 300                                    110

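Cumulative frequencies are running totals of the class frequencies, and relative cumulative frequencies divide those totals by the number of observations. The sketch below derives both from the frequencies in Table 1.7; the variable names are hypothetical.

```python
from itertools import accumulate

# Upper class limits and class frequencies from Table 1.7.
upper_limits = [230, 240, 250, 260, 270, 280, 290, 300]
frequencies = [5, 8, 13, 22, 32, 13, 10, 7]

cumulative = list(accumulate(frequencies))   # running totals
n = cumulative[-1]                           # total number of observations
cumulative_pct = [round(100 * c / n, 1) for c in cumulative]

for limit, c, pct in zip(upper_limits, cumulative, cumulative_pct):
    print(f"Less than {limit}{c:>6}{pct:>8.1f}%")
```

The row for "less than 270" shows 80 of the 110 employees, or 72.7 percent, which is the share of employees who met a 270-second goal.
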
Cumulative Frequency Distribution Plot
[Line chart. x-axis: Completion Times (in Seconds), 220 to 300; y-axis: Cumulative Frequency, 20 to 120.]
Cumulative Frequency Distribution for Completion Times

Conclusion
- A frequency distribution summarizes numerical data by grouping observations into classes.
- Cumulative frequency distributions help identify the total number of observations below certain thresholds.
- In the example, 72.7% of the employees (80 of 110) completed the task in less than the goal of 270 seconds.
- These tools provide a clearer picture of the data, allowing the supervisor to make informed decisions.

Figure 1.13: Histogram of Completion Times (Connected Bars)
[Histogram. x-axis: Completion Time (Seconds), 220 to 290; y-axis: Frequency, 0 to 35.]

Shape of a Distribution
- The shape of a distribution can be visually described using a histogram.
- **Symmetry**: A distribution is said to be symmetric if the observations are balanced or evenly distributed about the center.
- **Skewness**: A distribution is skewed if the observations are not evenly distributed.
  - A **skewed-right distribution** (positively skewed) has a longer tail on the right.
  - A **skewed-left distribution** (negatively skewed) has a longer tail on the left.
- The following slides illustrate the three shapes:
  - Symmetric distribution
  - Skewed-right distribution
  - Skewed-left distribution

Figure 1.15(a): Symmetric Distribution
[Histogram. x-axis: Value; y-axis: Frequency.]

Figure 1.15(b): Skewed-left Distribution
[Histogram. x-axis: Value; y-axis: Frequency.]

Figure 1.15(c): Skewed-right Distribution
[Histogram. x-axis: Value; y-axis: Frequency.]

Stem-and-Leaf Displays
- A **stem-and-leaf display** is an alternative to a histogram, grouping data based on their leading digits (**stems**) and arranging the final digits (**leaves**).
- Each stem represents a class of data values, while the leaves represent individual data points within each class.
- The leaves are displayed in ascending order after the corresponding stem.
- Stem-and-leaf displays help reveal the internal structure of the data, making it easy to identify patterns and outliers.

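Example 1.11: Stem-and-Leaf Display of Completion Times (in Seconds)
Built from the 110 completion times in Table 1.6; each row's leaf count matches the class frequency in Table 1.7.

Stem | Leaves
22   | 2 4 5 6 7
23   | 1 2 4 5 6 6 7 8
24   | 2 2 4 5 5 6 7 7 7 7 8 8 9
25   | 1 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4 5 5 6 6 6
26   | 1 1 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 5 6 6 7 7 8 9 9 9 9 9 9
27   | 1 1 1 2 3 4 5 5 5 6 8 8 9
28   | 1 1 2 4 5 8 8 8 8 8
29   | 1 4 4 4 4 4 9
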
Data: Accounting Final Exam Grades
- A random sample of 10 final exam grades for an introductory accounting class is as follows:
  88, 51, 63, 85, 79, 65, 79, 70, 73, 77
- We will use a stem-and-leaf display to describe the distribution of these grades.

Example 1.11: Grades on an Accounting Final Exam (Stem-and-Leaf Display)

Stem | Leaves
5    | 1
6    | 3 5
7    | 0 3 7 9 9
8    | 5 8

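The display for the accounting grades can be generated mechanically: the tens digit is the stem and the units digit is the leaf. The following is a minimal sketch for two-digit values only; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group values by leading digit(s) (stem) with sorted final digits (leaves)."""
    stems = defaultdict(list)
    for value in sorted(values):          # sorting first keeps leaves ascending
        stems[value // 10].append(value % 10)
    return dict(stems)


# The 10 accounting final exam grades from the slides.
grades = [88, 51, 63, 85, 79, 65, 79, 70, 73, 77]
display = stem_and_leaf(grades)

for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Unlike a histogram, the display keeps every original value recoverable: stem 7 with leaves 0, 3, 7, 9, 9 reads back as 70, 73, 77, 79, and 79.
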
Scatter Plot: Explanation
- A **scatter plot** is used to visualize the relationship between two numerical variables.
- In business and economics, scatter plots are often used to investigate the relationship between variables such as:
  - The effect of advertising on total profits
  - The change in quantity sold as a result of a change in price
- In this example, we will analyze the relationship between **SAT Math scores** and **GPA** for 11 students.
- The **SAT Math score** is the independent variable (labeled X) and the **GPA** is the dependent variable (labeled Y).

Figure 1.18: GPA vs. SAT Math Scores (Scatter Plot)
[Scatter plot. x-axis: SAT Math Score; y-axis: GPA.]

Conclusion
- Categorical variables can be effectively described using frequency distribution tables and bar charts.
- Cross tables provide insights into relationships between two categorical variables and are often used with component or cluster bar charts.
- Proper representation of categorical data helps in making informed decisions based on survey and questionnaire results.

Case Study: Impact on Business Decisions
- Example: A company launching a new product
- Use historical sales data and trends to:
  - Forecast product demand
  - Identify the right pricing strategy
- Graphical statistics help maximize profits and minimize risks.

Types of Graphical Statistics
- Common graphical methods include:
  - Histograms: Show distribution of data
  - Box plots: Highlight quartiles, median, and outliers
  - Scatter plots: Explore relationships between two variables
  - Line graphs: Track trends over time
  - Bar charts: Compare categorical data
- These methods provide a range of perspectives to understand data.

Case Study: Impact on Business Decisions
- Consider a company launching a new product:
  - Historical sales data and customer trends can be visualized to forecast demand.
  - Graphical statistics can help identify the right pricing strategy by analyzing customer satisfaction and market behavior.
- In this way, graphical statistics provide critical support in maximizing profits and minimizing risk.

Conclusion
- Graphical statistics are a powerful tool for interpreting large amounts of data.
- They play a crucial role in decision-making processes across industries.
- In an era of data-driven decision making, understanding and using graphical statistics is essential.