Mathematical Relation Between Mean, Median, and Mode
Mean, median, and mode are three key measures of central tendency in statistics that summarize a set of data by identifying the central point within that set. These measures are often used to understand the distribution and central value of a dataset.
Mean: The mean, often referred to as the average, is calculated by summing all the values in a dataset and then dividing by the number of values. Mathematically, the mean is given by:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Where:
$\bar{x}$ = mean
$n$ = number of values in the dataset
$x_i$ = each individual value in the dataset
Median: The median is the middle value in a dataset when it is ordered in ascending or descending order. If the dataset has an odd number of observations, the median is the middle number. If the dataset has an even number of observations, the median is the average of the two middle numbers.
Mode: The mode is the value that appears most frequently in a dataset. A dataset may have one mode (unimodal), more than one mode (bimodal or multimodal), or no mode at all.
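The three definitions above can be sketched in Python using only the standard library. The function name `central_tendency` and the choice to report "no mode" as an empty list are assumptions made for this illustration.

```python
from collections import Counter
from statistics import mean, median


def central_tendency(values):
    """Return (mean, median, modes) for a non-empty list of numbers."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        # Every value appears exactly once: report no mode (assumed convention).
        modes = []
    else:
        # Keep every value tied for the highest frequency (one or several modes).
        modes = sorted(v for v, c in counts.items() if c == highest)
    return mean(values), median(values), modes


print(central_tendency([2, 4, 6, 8, 10]))     # no repeated value, so no mode
print(central_tendency([1, 2, 2, 3, 9]))      # one mode (unimodal)
print(central_tendency([1, 1, 2, 3, 3, 4]))   # two modes (bimodal), even length
```

The last call has an even number of observations, so the median returned is the average of the two middle values.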
Relationship Between Mean, Median, and Mode
For a moderately skewed distribution, there is an empirical relationship between the mean, median, and mode, known as the empirical formula:
$$\text{Mode} \approx 3 \times \text{Median} - 2 \times \text{Mean}$$
This relationship helps in understanding the skewness of the distribution:
- Symmetric Distribution: For a perfectly symmetric distribution with a single peak, the mean, median, and mode are all equal.
- Positively Skewed Distribution: In a positively skewed (or right-skewed) distribution, the mean is typically greater than the median, and the median is greater than the mode.
- Negatively Skewed Distribution: In a negatively skewed (or left-skewed) distribution, the mean is typically less than the median, and the median is less than the mode.
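The empirical formula and the three cases above can be expressed as two small helper functions. This is a minimal sketch: the names `empirical_mode` and `skew_direction` and the tolerance `tol` are assumptions for illustration, and comparing the mean with the median is only a rule of thumb for the direction of skew, not a formal test.

```python
def empirical_mode(mean_value, median_value):
    """Estimate the mode from Mode ≈ 3 × Median - 2 × Mean."""
    return 3 * median_value - 2 * mean_value


def skew_direction(mean_value, median_value, tol=1e-9):
    """Classify skew by comparing mean and median (rule of thumb only)."""
    gap = mean_value - median_value
    if abs(gap) <= tol:
        # Mean and median coincide, as they do in a symmetric distribution.
        return "symmetric"
    return "positive" if gap > 0 else "negative"


print(skew_direction(6, 6))
print(skew_direction(3.4, 2), empirical_mode(3.4, 2))
print(skew_direction(4.8, 5), empirical_mode(4.8, 5))
```

Note that equal mean and median are consistent with symmetry but do not prove it, so the "symmetric" label here should be read loosely.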
Examples
Example 1: Symmetric Distribution
Consider the dataset: 2, 4, 6, 8, 10
Mean = (2 + 4 + 6 + 8 + 10) / 5 = 30 / 5 = 6
Median = 6 (middle value)
Mode = No mode (all values appear only once)
Here, the mean and median are both 6 and the values are spaced evenly on either side of 6, so the distribution is symmetric.
Example 2: Positively Skewed Distribution
Consider the dataset: 1, 2, 2, 3, 9
Mean = (1 + 2 + 2 + 3 + 9) / 5 = 17 / 5 = 3.4
Median = 2 (middle value)
Mode = 2 (most frequent value)
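Here, the mode (2) equals the median (2), and both are less than the mean (3.4), indicating a positively skewed distribution: the single large value 9 pulls the mean to the right. Using the empirical relationship: Mode ≈ 3 × Median - 2 × Mean = 3 × 2 - 2 × 3.4 = 6 - 6.8 = -0.8. This estimate is far from the actual mode of 2, because the dataset is very small and strongly skewed by one extreme value, whereas the empirical relationship is intended for moderately skewed distributions.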
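The arithmetic in Example 2 can be reproduced with Python's standard `statistics` module. The variable names are illustrative only. The script shows that the empirical estimate of -0.8 lies about 2.8 below the actual mode of 2 for this small dataset.

```python
from statistics import mean, median, mode

data = [1, 2, 2, 3, 9]  # dataset from Example 2

data_mean = mean(data)
data_median = median(data)
data_mode = mode(data)

# Empirical relationship: Mode ≈ 3 × Median - 2 × Mean
estimated_mode = 3 * data_median - 2 * data_mean
gap = abs(estimated_mode - data_mode)

print(f"mean={data_mean}, median={data_median}, mode={data_mode}")
print(f"estimated mode={estimated_mode:.1f}, gap from actual mode={gap:.1f}")
```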
Example 3: Negatively Skewed Distribution
Consider the dataset: 3, 4, 5, 6, 6
Mean = (3 + 4 + 5 + 6 + 6) / 5 = 24 / 5 = 4.8
Median = 5 (middle value)
Mode = 6 (most frequent value)
Here, the mean (4.8) is less than the median (5), and the median is less than the mode (6), indicating a negatively skewed distribution. Using the empirical relationship: Mode ≈ 3 × Median - 2 × Mean = 3 × 5 - 2 × 4.8 = 15 - 9.6 = 5.4, which is reasonably close to the actual mode of 6.
This empirical relationship is useful for understanding the shape and skewness of a distribution, although it is most accurate for moderately skewed distributions.
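One way to see how the relationship behaves on moderately skewed data is to try it on a larger, mildly right-skewed sample. The frequency table below is hypothetical, invented purely for this illustration. For these 30 observations the mean is about 3.77, the median is 3.5, and the mode is 3, so the empirical estimate of about 2.97 lands close to the actual mode.

```python
from statistics import mean, median, mode

# Hypothetical frequency table (value -> count), invented for this illustration.
frequencies = {1: 2, 2: 5, 3: 8, 4: 6, 5: 4, 6: 3, 7: 1, 8: 1}

# Expand the table into a flat, sorted list of 30 observations.
data = [value for value, count in frequencies.items() for _ in range(count)]

data_mean = mean(data)
data_median = median(data)
data_mode = mode(data)

# Empirical relationship: Mode ≈ 3 × Median - 2 × Mean
estimated_mode = 3 * data_median - 2 * data_mean

print(f"mean={data_mean:.2f}, median={data_median}, mode={data_mode}")
print(f"estimated mode={estimated_mode:.2f}")
```

A single hand-picked sample does not validate the formula in general. It only shows the kind of data for which the approximation tends to be reasonable.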