Understanding Continuous Probability Distributions in IE
Dec 11, 2024
IE 322: Probabilistic Models in IE
Lecture 19-Chapter 6: Some Continuous Probability Distributions
Prakash Chakraborty
The Pennsylvania State University
Fall 2024
Keywords: Uniform, Normal, Normal curve, Lognormal, Chi-squared, Exponential, Gamma
6.1 Uniform distribution with parameters A, B
One of the simplest continuous distributions. Its density is 'flat': it takes the same constant value everywhere on a closed interval, say [A,B].
Figure 1: pdf of Uniform[1,3]
Definition
The density function of the continuous uniform random variable X on the interval [A,B] is
f(x;A,B) = 1/(B−A), A≤x≤B,
0, elsewhere.
Example
Suppose that a large conference room at a certain company can be reserved for no more than 4 hours. Both long and short conferences occur quite often. In fact, it can be assumed that the length X of a conference has a uniform distribution on the interval [0,4].
(a) What is the probability density function?
(b) What is the probability that any given conference lasts at least 3 hours?
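The conference-room example can be checked with a short Python sketch. The helper names `uniform_pdf` and `uniform_cdf` are illustrative and not part of the lecture; the CDF expression (x−A)/(B−A) comes from integrating the constant density over [A,x].

```python
def uniform_pdf(x, a, b):
    """Density of Uniform[a, b]: 1/(b - a) inside the interval, 0 elsewhere."""
    if a <= x <= b:
        return 1.0 / (b - a)
    return 0.0


def uniform_cdf(x, a, b):
    """P(X <= x) for X ~ Uniform[a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


# (a) the density on [0, 4] is the constant 1/4
density_inside = uniform_pdf(2.0, 0.0, 4.0)

# (b) P(X >= 3) = 1 - P(X <= 3)
p_at_least_3 = 1.0 - uniform_cdf(3.0, 0.0, 4.0)

print(density_inside, p_at_least_3)
```

Both printed values are 1/4: the density is 1/4 on [0,4], and the interval [3,4] has length 1.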
6.1 Basic properties of Uniform distribution
A uniform distribution is typically used to model a random variable whose lower bound (A) and upper bound (B) are known, when we know little about which values in between are more likely.
Theorem
The mean and variance of the uniform distribution are
µ = (A+B)/2, and σ² = (B−A)²/12.
6.2 Normal distribution with parameters µ, σ
One of the most important probability distributions.
The graph of its pdf is called a normal curve.
The shape of a normal curve is determined by its mean µ and variance σ².
Definition
The density of the normal random variable X, with mean µ and variance σ², is
f(x;µ, σ) = 1/(√(2π)σ) e^(−(x−µ)²/(2σ²)), −∞<x<∞.
6.2 Normal distribution: properties
We write X∼N(µ, σ²) to denote that the random variable X follows a normal distribution with mean µ and variance σ².
The following are characteristics of a normal distribution.
1. The mode, the value of x where f(x) is at its maximum, occurs at x=µ.
2. The curve is symmetric around the mean, µ.
3. The normal curve approaches the horizontal axis as x approaches ∞ or −∞.
4. The total area under the curve is equal to 1.
5. [Important] If X is a normal random variable with mean µ and variance σ², then for constants a≠0 and b, aX+b is a normal random variable with mean aµ+b and variance a²σ².
6.2 Normal curves comparison
The value of µ determines where the center of the distribution is located, because a normal distribution is symmetric around its mean.
The value of σ determines how spread out the distribution is.
(a) Same σ, but µ₁≠µ₂
(b) Same µ, but σ₁< σ₂
(c) µ₁≠µ₂ and σ₁< σ₂
Figure 2: Normal Curves
6.3 Area under the normal curve
Suppose we would like to compute P(x₁<X<x₂), where X∼N(µ, σ²). Standardizing with Z = (X−µ)/σ gives
P(x₁<X<x₂) = P((x₁−µ)/σ < Z < (x₂−µ)/σ), where Z∼N(0,1).
(Table A.3 in the textbook gives P(Z<z) = F(z) for different values of z.)
Example
Given the standard normal table, find the area under the curve that lies
(a) to the right of z=1.84 and
(b) between z=−1.97 and z=0.86.
Also find k such that
(c) P(Z>k) = 0.3015 and
(d) P(k<Z<−0.18) = 0.4197.
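As a cross-check on the table lookups, the four parts can be sketched with Python's standard-library `statistics.NormalDist`, which plays the role of the standard normal table here. The variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# (a) area to the right of z = 1.84
area_right = 1 - Z.cdf(1.84)

# (b) area between z = -1.97 and z = 0.86
area_between = Z.cdf(0.86) - Z.cdf(-1.97)

# (c) P(Z > k) = 0.3015 is the same as P(Z < k) = 1 - 0.3015
k_c = Z.inv_cdf(1 - 0.3015)

# (d) P(k < Z < -0.18) = 0.4197 is the same as P(Z < k) = P(Z < -0.18) - 0.4197
k_d = Z.inv_cdf(Z.cdf(-0.18) - 0.4197)

print(round(area_right, 4), round(area_between, 4), round(k_c, 2), round(k_d, 2))
```

Up to table rounding, the results should be about 0.0329 and 0.7807 for the two areas, and k ≈ 0.52 and k ≈ −2.37 for the two cutoffs.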
Example
Given a random variable X having a normal distribution with µ=50 and σ=10, find the probability that X assumes a value between 45 and 62.
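A minimal sketch of this computation, assuming Python's `statistics.NormalDist` in place of the table. It evaluates the probability both directly and through the standardized bounds (45−50)/10 = −0.5 and (62−50)/10 = 1.2, which should agree.

```python
from statistics import NormalDist

mu, sigma = 50, 10
X = NormalDist(mu, sigma)
Z = NormalDist()

# Direct computation on the original scale
p_direct = X.cdf(62) - X.cdf(45)

# Same probability after standardizing the two bounds
z_low = (45 - mu) / sigma
z_high = (62 - mu) / sigma
p_standardized = Z.cdf(z_high) - Z.cdf(z_low)

print(round(p_direct, 4), round(p_standardized, 4))
```

Both routes give approximately 0.5764.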
6.3 Using the normal curve in reverse
Sometimes we are given a probability p and asked to find the x that satisfies P(X<x) = p. In this case we can also use the standard normal table. First find z such that P(Z<z) = p. Then, using the relationship between X and Z, p = P(Z<z) = P((X−µ)/σ < z) = P(X < σz+µ).
Therefore, x = σz+µ.
Example
Given a normal distribution with mean µ=40 and standard deviation σ=6, find the value of x that has
(a) 40.9% of the area to the left and
(b) 14% of the area to the right.
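The relation x = σz+µ can be sketched in Python, with `NormalDist().inv_cdf` standing in for the reverse table lookup. The function name `x_from_left_area` is hypothetical.

```python
from statistics import NormalDist


def x_from_left_area(p, mu, sigma):
    """Return x with P(X < x) = p for X ~ N(mu, sigma^2), using x = sigma*z + mu."""
    z = NormalDist().inv_cdf(p)
    return sigma * z + mu


# (a) 40.9% of the area to the left
x_a = x_from_left_area(0.409, mu=40, sigma=6)

# (b) 14% of the area to the right means 86% to the left
x_b = x_from_left_area(1 - 0.14, mu=40, sigma=6)

print(round(x_a, 2), round(x_b, 2))
```

With table values z = −0.23 and z = 1.08, hand calculation gives 38.62 and 46.48, which the sketch should reproduce up to rounding.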
6.3 Using the normal curve in reverse (2)
Sometimes you may not find the p you are looking for in the standard normal CDF table. In that case, find the table entries p₁ and p₂ closest to p such that p₁<p<p₂, along with the corresponding z₁ and z₂. Then compute the linearly interpolated z value:
z = z₁×(p₂−p)/(p₂−p₁) + z₂×(p−p₁)/(p₂−p₁)
For instance, suppose you need to find z such that P(Z<z) = 0.51. Then 0.5080<0.51<0.5120, and the corresponding z values are 0.02 and 0.03. Therefore,
z = 0.02×(0.5120−0.51)/(0.5120−0.5080) + 0.03×(0.51−0.5080)/(0.5120−0.5080) = 0.025.
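Linear interpolation between two table entries can be sketched as follows. The function name `interpolate_z` is illustrative. The weight on z₂ grows as p moves from p₁ toward p₂, so the result equals z₁ when p = p₁ and z₂ when p = p₂.

```python
def interpolate_z(p, p1, z1, p2, z2):
    """Linearly interpolate z for probability p, given table entries (p1, z1), (p2, z2)."""
    weight = (p - p1) / (p2 - p1)  # 0 at p1, 1 at p2
    return z1 + weight * (z2 - z1)


# Table entries: P(Z < 0.02) = 0.5080 and P(Z < 0.03) = 0.5120
z = interpolate_z(0.51, 0.5080, 0.02, 0.5120, 0.03)
print(round(z, 4))
```

Here p = 0.51 lies exactly halfway between the two table probabilities, so the interpolated value is the midpoint 0.025.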
6.4 Application of the normal distribution
A typical iPhone battery lasts 10 hours on average, with a standard deviation of 2 hours. Assuming the battery life is normally distributed, find the probability that a given battery will last less than 7.5 hours.
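A short sketch of the battery calculation, assuming `statistics.NormalDist` instead of the table. The standardized value is z = (7.5−10)/2 = −1.25.

```python
from statistics import NormalDist

battery_life = NormalDist(mu=10, sigma=2)  # hours

# Standardized cutoff, for comparison with the table lookup
z = (7.5 - 10) / 2

p_less_than_7_5 = battery_life.cdf(7.5)
print(z, round(p_less_than_7_5, 4))
```

The probability is approximately 0.1056, matching the table entry for z = −1.25.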
6.4 Example
A company pays its employees an average wage of $15.90 an hour with a standard deviation of $1.50. If the wages are approximately normally distributed, find the lowest hourly wage among the highest-paid 5% of employees.
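This is a "reverse" problem: the cutoff wage x satisfies P(X<x) = 0.95. A minimal sketch, again assuming `statistics.NormalDist` in place of the table:

```python
from statistics import NormalDist

mu, sigma = 15.90, 1.50

# Top 5% means 95% of the area lies to the left of the cutoff
z = NormalDist().inv_cdf(0.95)
cutoff_wage = sigma * z + mu

print(round(z, 3), round(cutoff_wage, 2))
```

With z ≈ 1.645, the cutoff is about $18.37 an hour.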
6.5 Normal approximation to the Binomial
Earlier we discussed how a Poisson distribution can approximate a binomial distribution with parameters n and p. In fact, a binomial distribution can also be approximated by a normal distribution when n is large.
Theorem
If X is a binomial random variable with mean µ=np and variance σ²=np(1−p), then the distribution of (X−np)/√(np(1−p)) becomes closer to that of the standard normal random variable Z∼N(0,1), as n increases.
This approximation works well when n is large and p is not too close to 0 or 1 (it works best when p=0.5).
When n is not too large, we apply a continuity correction by adding 0.5 to x:
P(X≤x) ≈ P(Z ≤ (x+0.5−µ)/σ)
where X∼Binomial(n,p), and µ=np, σ=√(np(1−p)).
Example
The probability that a patient recovers from a rare blood disease is 0.4. If 100 people are known to have contracted this disease, what is the probability that fewer than 30 survive?
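The following sketch applies the continuity-corrected approximation to this example and compares it with the exact binomial sum. "Fewer than 30" is read as X≤29, so the corrected cutoff is 29.5. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 100, 0.4
mu = n * p                           # 40
sigma = math.sqrt(n * p * (1 - p))   # sqrt(24)

# Normal approximation with continuity correction: P(X <= 29) ~ P(Z <= (29.5 - mu)/sigma)
z = (29 + 0.5 - mu) / sigma
approx = NormalDist().cdf(z)

# Exact binomial probability P(X <= 29), summed term by term
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(30))

print(round(z, 2), round(approx, 4), round(exact, 4))
```

The two probabilities should be close to each other (both are between 1% and 2%), which illustrates how well the approximation behaves for n = 100 and p = 0.4.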
6.6 Exponential distribution with parameter β
Plays a key role in reliability engineering and designing service systems.
Definition
The continuous random variable X has an exponential distribution, with parameter β, if its density function is given by
f(x;β) = (1/β)e^(−x/β), x>0,
0, elsewhere,
where β >0.
Basic Properties: the mean and variance of the exponential distribution are µ = β and σ² = β².
6.6 Connection with Poisson
The exponential distribution is closely related to the Poisson distribution.
Recall that the Poisson distribution represents the number of events occurring in a time window of length t, given that the average number of events per unit time is λ. The probability mass function is
p(x;λt) = e^(−λt)(λt)^x/x!.
The time between two consecutive events follows an exponential distribution with β=1/λ.
Example
Suppose that the time until the engine failure of a certain model of a car is represented by an exponential random variable T with mean time to failure β=5 years.
(a) What is the probability that an engine of the car model is functioning at the end of year 8?
(b) Suppose a company bought 5 of these models 8 years ago. What is the probability that at least 2 of them are functioning at the end of year 8?
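A sketch of both parts, under the added assumption that the five engines fail independently, so the number still functioning is binomial. Part (a) uses P(T>t) = e^(−t/β), obtained by integrating the density from t to ∞.

```python
import math

beta = 5.0  # mean time to failure, in years

# (a) probability that one engine is still functioning after 8 years
p_functioning = math.exp(-8 / beta)

# (b) number functioning among 5 engines ~ Binomial(5, p_functioning),
#     assuming the engines fail independently (an assumption of this sketch)
n_cars = 5
p_fewer_than_two = sum(
    math.comb(n_cars, k) * p_functioning**k * (1 - p_functioning) ** (n_cars - k)
    for k in range(2)
)
p_at_least_two = 1 - p_fewer_than_two

print(round(p_functioning, 4), round(p_at_least_two, 4))
```

Part (a) is about 0.2; rounding that value to 0.2 before doing part (b) by hand changes the second answer slightly in the third decimal place.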
6.6 The memoryless property
The exponential distribution has a distinctive property known as the memoryless property.
Suppose an electronic component has a lifetime X (in hours) that follows an exponential distribution with rate λ=1/β. The probability that the component lasts at least t hours is
P(X≥t) = ∫[t to ∞] λe^(−λx)dx = −e^(−λx)|[t to ∞] = e^(−λt).
Suppose the component has already been working for t₀ hours. Then the probability that it lasts an additional t hours is
P(X≥t+t₀|X≥t₀) = P(X≥t+t₀)/P(X≥t₀) = e^(−λ(t+t₀))/e^(−λt₀) = e^(−λt) = P(X≥t).
Thus the time the component has already survived does not affect the distribution of its remaining lifetime.
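The memoryless property can be checked numerically. The sketch below is parameterized by the mean β = 1/λ, and the function names and the specific numbers (β = 5, t = 3) are illustrative choices, not values from the lecture.

```python
import math


def exp_survival(t, beta):
    """P(X >= t) for an exponential lifetime with mean beta (rate 1/beta)."""
    return math.exp(-t / beta)


def conditional_survival(t, t0, beta):
    """P(X >= t + t0 | X >= t0), computed from the definition of conditional probability."""
    return exp_survival(t + t0, beta) / exp_survival(t0, beta)


beta = 5.0
t = 3.0
for t0 in (0.0, 1.0, 10.0, 50.0):
    # The conditional probability does not depend on the elapsed time t0.
    print(t0, conditional_survival(t, t0, beta), exp_survival(t, beta))
```

Each row prints the same two numbers (up to floating-point rounding), whatever the value of t₀.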
6.6 Gamma distribution with parameters α and β
The exponential distribution is a special case (α=1) of a distribution known as the gamma distribution. For integer α, the gamma distribution represents the waiting time until α Poisson events have occurred.
Definition
The continuous random variable X has a gamma distribution, with parameters α and β, if its density function is given by
f(x;α, β) = (1/(β^α Γ(α)))x^(α−1)e^(−x/β), x>0,
0, elsewhere,
where α >0, β >0, and the gamma function Γ(α) is given by
Γ(α) = ∫[0 to ∞] x^(α−1)e^(−x)dx, for α >0.
6.6 Basic properties of gamma distribution
The gamma distribution can be defined for any α >0 (integer condition for α is not necessary).
Some properties of the gamma function:
1. Γ(α) = (α−1)Γ(α−1) for α >1.
2. Γ(α) = (α−1)! for a positive integer α.
3. Γ(1) = 1
4. Γ(1/2) = √π.
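These properties can be spot-checked with `math.gamma` from the Python standard library. The test value α = 3.7 is an arbitrary choice, used to illustrate the standard fact that the recursion also holds for non-integer α > 1.

```python
import math

# Property 2: Gamma(n) = (n - 1)! for positive integers n
factorial_matches = all(
    math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 10)
)

# Property 1: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1), here for a non-integer alpha
alpha = 3.7
recursion_holds = math.isclose(math.gamma(alpha), (alpha - 1) * math.gamma(alpha - 1))

# Properties 3 and 4
gamma_one = math.gamma(1)
gamma_half = math.gamma(0.5)

print(factorial_matches, recursion_holds, gamma_one, gamma_half, math.sqrt(math.pi))
```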
Basic properties of the gamma distribution: the mean and variance are µ = αβ and σ² = αβ².
Example
Suppose that the number of calls a call center receives is a Poisson process with an average 5 calls a minute. What is the probability that up to a minute will elapse by the time 2 calls have come in to the call center?
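A sketch of this calculation: the waiting time X until the 2nd call is gamma with α=2 and β=1/5 minute. For integer α, the event X≤x is the same as "at least α Poisson events in [0,x]", which gives a finite sum for the CDF. The function name `erlang_cdf` is illustrative.

```python
import math


def erlang_cdf(x, k, beta):
    """P(X <= x) for a gamma distribution with integer shape k and scale beta.

    Uses P(X <= x) = 1 - P(fewer than k Poisson events in [0, x]),
    where the Poisson mean over [0, x] is x / beta.
    """
    if x <= 0:
        return 0.0
    mean_events = x / beta
    p_fewer_than_k = sum(
        math.exp(-mean_events) * mean_events**j / math.factorial(j) for j in range(k)
    )
    return 1 - p_fewer_than_k


# 5 calls per minute on average, so beta = 1/5 minute; wait for the 2nd call
p_two_calls_within_minute = erlang_cdf(1.0, k=2, beta=1 / 5)
print(round(p_two_calls_within_minute, 4))
```

The result is 1 − 6e^(−5), approximately 0.96.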
6.6 Gamma distribution: Memoryless?
The gamma distribution does not have the memoryless property. The only continuous distribution that has the memoryless property is the exponential distribution.
Although the gamma distribution arises from the Poisson process, it is also widely used in biomedical science and reliability engineering to model survival times.
Example
It is known, from the market data, that the length of time in months between customer complaints about a certain product follows a gamma distribution with α=2 and β=4. Changes were made to tighten quality control requirements. Following these changes, 20 months passed before the first complaint. Does it appear as if the quality control tightening was effective?
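One way to approach this question is to compute P(X≥20) under the old gamma model and see how unusual a 20-month gap would be. The sketch below integrates the gamma density numerically with the midpoint rule (a choice made for this illustration) and compares the result with the closed form 6e^(−5) that integration by parts gives for α=2, β=4.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))


def gamma_cdf(x, alpha, beta, steps=20000):
    """Approximate P(X <= x) with the midpoint rule on [0, x]."""
    if x <= 0:
        return 0.0
    width = x / steps
    return width * sum(gamma_pdf((i + 0.5) * width, alpha, beta) for i in range(steps))


p_at_least_20 = 1 - gamma_cdf(20, alpha=2, beta=4)
closed_form = 6 * math.exp(-5)  # e^(-20/4) * (1 + 20/4)

print(round(p_at_least_20, 4), round(closed_form, 4))
```

The probability is about 0.04, so a 20-month wait would be unlikely if the old complaint pattern still held. That is consistent with the tightening having been effective, although a single observation is limited evidence.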
6.9 Lognormal distribution with parameters µ, σ
The lognormal distribution has many applications in finance (e.g. return on financial investment).
Definition
The continuous random variable X has a lognormal distribution if the random variable Y = log(X), where log denotes the natural logarithm, has a normal distribution with mean µ and standard deviation σ. The resulting density function of X is
f(x;µ, σ) = 1/(√(2π)σx) exp(−(log(x)−µ)²/(2σ²)), x>0,
0, elsewhere.
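Basic Properties: the mean and variance of the lognormal distribution are E(X) = e^(µ+σ²/2) and Var(X) = e^(2µ+σ²)(e^(σ²)−1).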
Example
Concentrations of pollutants produced by chemical plants are subject to government regulations. Suppose the concentration of a pollutant in parts per million has a lognormal distribution with parameters µ=3.2 and σ=1. What is the probability that the concentration exceeds 8 parts per million?
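Because log(X) is normal, the event X>8 is the same as log(X)>log(8), which reduces the problem to a normal probability. A minimal sketch, assuming `statistics.NormalDist` in place of the table and the natural logarithm throughout:

```python
import math
from statistics import NormalDist

mu, sigma = 3.2, 1.0

# P(X > 8) = P(log X > log 8) = 1 - P(Z <= (log 8 - mu) / sigma)
z = (math.log(8) - mu) / sigma
p_exceeds_8 = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_exceeds_8, 4))
```

The standardized value is about −1.12, so the probability of exceeding 8 parts per million is roughly 0.87.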
6.7 Chi-squared distribution with parameter ν
The chi-squared distribution is another special case of the gamma distribution, obtained by setting α=ν/2 and β=2, where ν is a positive integer. Since β is fixed, the chi-squared distribution has only one parameter, ν, called the degrees of freedom.
Definition
The continuous random variable X has a chi-squared distribution, with ν degrees of freedom, if its density function is given by
f(x;ν) = (1/(2^(ν/2)Γ(ν/2)))x^(ν/2−1)e^(−x/2), x>0,
0, elsewhere,
where ν is a positive integer.
Basic Properties: the mean and variance of the chi-squared distribution are µ = ν and σ² = 2ν.
6.7 Relation to normal
The chi-squared distribution is closely related to the normal distribution. It is known that if X₁,X₂, . . . ,Xν are independent random variables, each with the N(0,1) distribution, then
X₁²+X₂²+. . .+Xν² ∼ χ²(ν),
where χ²(ν) represents the chi-squared distribution with ν degrees of freedom. This is why ν is called the degrees of freedom parameter.
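This relation can be illustrated by simulation. The sketch below uses ν=2 because the chi-squared density then reduces to the gamma density with α=1 and β=2, that is, an exponential density with mean 2, whose CDF 1−e^(−x/2) is easy to compare against. The sample size, seed, and function name are arbitrary choices for this illustration.

```python
import math
import random


def simulate_sum_of_squares(nu, n_samples, seed=0):
    """Draw n_samples values of X1^2 + ... + Xnu^2 with independent Xi ~ N(0, 1)."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu)) for _ in range(n_samples)
    ]


samples = simulate_sum_of_squares(nu=2, n_samples=20000)

# Empirical P(X <= 2) versus the chi-squared(2) value 1 - e^(-2/2)
empirical = sum(x <= 2.0 for x in samples) / len(samples)
theoretical = 1 - math.exp(-2.0 / 2.0)

print(round(empirical, 3), round(theoretical, 3))
```

The empirical fraction should land close to the theoretical value of about 0.632, with the gap shrinking as the number of samples grows.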