Probability Distribution

Understanding Probability Mass Function and Probability Density Function

Function describing a discrete probability distribution by stating the probability of each value.

Introduction

In the world of statistics, understanding how the values of a random variable are distributed is crucial. This is where the concepts of Probability Mass Function (PMF) and Probability Density Function (PDF) come into play. These two functions provide a mathematical description of the random variable's distribution.

Probability Mass Function (PMF)

The Probability Mass Function is a function that gives the probability that a discrete random variable is exactly equal to some value. The probability mass function is often the primary means of defining a discrete probability distribution, and such functions exist for either scalar or multivariate random variables whose domain is discrete.

A PMF must satisfy two conditions:

  1. The probability for each value must be between 0 and 1, inclusive.
  2. The sum of the probabilities over all possible values must equal 1.

For example, consider a fair six-sided die. The PMF of the outcome of a single roll is 1/6 for each of the faces (1, 2, 3, 4, 5, 6), and the sum of these probabilities is 1.
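The die example can be checked directly. The following is a minimal sketch, not a library API: the names `die_pmf` and `is_valid_pmf` are hypothetical and chosen only for illustration. Exact fractions are used so that the sum is exactly 1 rather than a rounded floating-point value.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face maps to probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_pmf(pmf):
    """Check the two PMF conditions: each probability in [0, 1], total equal to 1."""
    in_range = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return in_range and sums_to_one


# The probability of an event is the sum of the PMF over the values in it.
prob_even = sum(p for face, p in die_pmf.items() if face % 2 == 0)

print(is_valid_pmf(die_pmf))  # True
print(prob_even)              # 1/2
```

The same check rejects a table whose probabilities do not add up to 1, which is a quick way to catch mistakes when writing a PMF by hand.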

Probability Density Function (PDF)

The Probability Density Function is a function whose value at any given point in the sample space can be interpreted as the relative likelihood that the random variable takes a value near that point. The absolute probability that a continuous random variable equals any particular value is 0, because there are uncountably many possible values. The values of the PDF at two different points can still be compared: they indicate how much more likely the random variable is, in any particular draw, to fall close to one point than to the other. Probabilities themselves are obtained by integrating the PDF over a range of values.

A PDF must satisfy two conditions:

  1. The density at each value must be non-negative. Because it is a density rather than a probability, it may be greater than 1.
  2. The integral over the entire space must equal 1.
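A point that is easy to miss is that a density is not a probability, so it can be larger than 1 as long as the total area under the curve is 1. The sketch below assumes a uniform distribution on [0, 0.5], an illustrative choice that does not come from the text above. The helpers `uniform_pdf` and `integrate` are hypothetical names, and the integral is approximated with a simple midpoint sum rather than computed exactly.

```python
def uniform_pdf(x, a, b):
    """Density of a uniform distribution on [a, b]: 1 / (b - a) inside, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(f, lo, hi, steps=100_000):
    """Midpoint Riemann sum approximating the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


a, b = 0.0, 0.5  # assumed interval, for illustration only

density_inside = uniform_pdf(0.25, a, b)  # 2.0: greater than 1, yet still valid
total_area = integrate(lambda x: uniform_pdf(x, a, b), -1.0, 1.0)

print(density_inside)
print(round(total_area, 3))  # approximately 1
```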

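For example, the height of adult males in the U.S. is approximately normally distributed with a mean of about 70 inches. The PDF at 70 inches is not the probability that a man is exactly 70 inches tall, which is 0. Instead, it gives the relative likelihood of heights near 70 inches compared with heights near other values.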
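The height example can be made concrete with Python's standard-library `statistics.NormalDist`. This is only a sketch under stated assumptions: the mean of 70 inches comes from the text, while the standard deviation of 3 inches is an assumed value picked for illustration and should not be read as a measured figure.

```python
from statistics import NormalDist

# The mean of 70 inches comes from the text; the 3-inch standard deviation
# is an assumed value chosen only to make the example concrete.
heights = NormalDist(mu=70, sigma=3)

density_70 = heights.pdf(70)
density_76 = heights.pdf(76)

# Ratio of densities: how much more likely a height near 70 is than one near 76.
ratio = density_70 / density_76

# The probability of a range is the area under the PDF, computed here via the CDF.
prob_69_to_71 = heights.cdf(71) - heights.cdf(69)

print(f"density at 70: {density_70:.4f}")
print(f"density ratio, 70 vs 76: {ratio:.2f}")
print(f"P(69 <= height <= 71): {prob_69_to_71:.4f}")
```

Under these assumptions, 76 inches is two standard deviations above the mean, so the density ratio equals e^2, roughly 7.4. The density at 70 is a number on the density scale, while the last line is a genuine probability because it covers an interval.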

Differences between PMF and PDF

While both PMF and PDF describe the distribution of a random variable, they are used in different contexts. PMF is used for discrete random variables, whose possible outcomes are countable (finite or countably infinite), and its values are probabilities. PDF is used for continuous random variables, which take values in an uncountable set such as an interval of real numbers. Its values are densities, and probabilities are obtained by integrating it over a range.
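The contrast can be seen by asking the same question, "what is the probability that the variable lies between two bounds?", of a discrete and a continuous variable. The sketch below is illustrative only: the function names are hypothetical, the discrete variable is a fair die, and the continuous variable is assumed to be standard normal.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete case: a fair die, described by a PMF.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def discrete_interval_prob(pmf, lo, hi):
    """P(lo <= X <= hi) for a discrete variable: sum the PMF over values in range."""
    return sum((p for value, p in pmf.items() if lo <= value <= hi), Fraction(0))


# Continuous case: a standard normal variable (an assumed example), described by a PDF.
z = NormalDist(mu=0, sigma=1)


def continuous_interval_prob(dist, lo, hi):
    """P(lo <= X <= hi) for a continuous variable: area under the PDF, via the CDF."""
    return dist.cdf(hi) - dist.cdf(lo)


print(discrete_interval_prob(die_pmf, 3, 3))   # 1/6: a single value carries positive mass
print(continuous_interval_prob(z, 1.0, 1.0))   # 0.0: a single point has zero probability
print(discrete_interval_prob(die_pmf, 2, 4))   # 1/2
print(continuous_interval_prob(z, -1.0, 1.0))  # roughly 0.68
```

For the discrete variable a single value has positive probability, whereas for the continuous variable only intervals of non-zero width do.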

In conclusion, understanding PMF and PDF is fundamental to understanding statistics and probability. These functions provide a way to describe the distribution of random variables, which is crucial in making predictions, inferences, and decisions based on data.