Descriptive Statistics Calculator
Calculator Use
What are Descriptive Statistics?
Descriptive statistics summarize certain aspects of a data set or a population using numeric calculations. Examples of descriptive statistics include:
- mean, average
- midrange
- standard deviation
- quartiles
This calculator generates descriptive statistics for a data set. Enter data values separated by commas or spaces. You can also copy and paste data from spreadsheets or text documents. See allowable data formats in the table below.
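As an illustration of the input rule described here (values separated by commas or spaces), the following is a minimal Python sketch. The function name `parse_values` and the choice to convert every token to `float` are assumptions made for this example; the calculator's own parser is not documented in this text.

```python
import re


def parse_values(text):
    """Split text on runs of commas and/or whitespace and convert to floats.

    Hypothetical helper: empty tokens (from repeated or trailing commas)
    are skipped rather than treated as errors.
    """
    tokens = re.split(r"[,\s]+", text.strip())
    return [float(tok) for tok in tokens if tok]


data = parse_values("42, 54, 65, 47, 59, 40, 53")
```

Because any run of commas, spaces, or line breaks counts as one separator, pasted spreadsheet columns and mixed delimiters are handled by the same expression.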
Descriptive Statistics Formulas and Calculations
This calculator uses the formulas and methods below to find the statistical values listed.
Minimum
Ordering a data set x1 ≤ x2 ≤ x3 ≤ ... ≤ xn from lowest to highest value, the minimum is the smallest value x1.
\[ \text{Min} = x_1 = \text{min}(x_i)_{i=1}^{n} \]
Maximum
Ordering a data set x1 ≤ x2 ≤ x3 ≤ ... ≤ xn from lowest to highest value, the maximum is the largest value xn.
\[ \text{Max} = x_n = \text{max}(x_i)_{i=1}^{n} \]
Range
The range of a data set is the difference between the minimum and maximum.
\[ \text{Range} = x_n - x_1 \]
Sum
The sum is the total of all data values x1 + x2 + x3 + ... + xn.
\[ \text{Sum} = \sum_{i=1}^{n}x_i \]
Size, Count
Size or count is the number of data points in a data set.
\[ \text{Size} = n = \text{count}(x_i)_{i=1}^{n} \]
Mean
The mean of a data set is the sum of all of the data divided by the size. The mean is also known as the average.
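The quantities defined so far (minimum, maximum, range, sum, size, and mean) can be sketched in a few lines of Python. The variable names are chosen for this example only, and the data is the seven-value sample used elsewhere on this page.

```python
data = [42, 54, 65, 47, 59, 40, 53]

ordered = sorted(data)           # x1 <= x2 <= ... <= xn
minimum = ordered[0]             # x1
maximum = ordered[-1]            # xn
value_range = maximum - minimum  # xn - x1
total = sum(data)                # x1 + x2 + ... + xn
size = len(data)                 # n
mean = total / size              # same arithmetic for a population and a sample
```

For this data the sum is 360 and the size is 7, so the mean is 360 / 7, or about 51.43.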
For a Population
\[ \mu = \dfrac{\sum_{i=1}^{n}x_i}{n}\]
For a Sample
\[ \overline{x} = \dfrac{\sum_{i=1}^{n}x_i}{n}\]
Median
Ordering a data set x1 ≤ x2 ≤ x3 ≤ ... ≤ xn from lowest to highest value, the median is the numeric value separating the upper half of the ordered sample data from the lower half. If n is odd the median is the center value. If n is even the median is the average of the 2 center values.
If n is odd the median is the value at position p where
\[ p = \dfrac{n + 1}{2} \]
\[ \widetilde{x} = x_p \]
If n is even the median is the average of the values at positions p and p + 1 where
\[ p = \dfrac{n}{2} \]
\[ \widetilde{x} = \dfrac{x_{p} + x_{p+1}}{2} \]
Mode
The mode is the value or values that occur most frequently in the data set. A data set can have more than one mode, and it can also have no mode.
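The median and mode rules can be sketched in Python as follows. The function names are illustrative, and one convention here is an assumption rather than something this page specifies: when every value occurs exactly once, `modes` returns an empty list to represent "no mode". Both functions assume a non-empty data set.

```python
from collections import Counter


def median(values):
    """Median of a non-empty data set, following the odd/even rule."""
    ordered = sorted(values)
    n = len(ordered)
    p = n // 2
    if n % 2 == 1:
        # 0-based index p corresponds to 1-based position (n + 1) / 2.
        return ordered[p]
    # 0-based indexes p - 1 and p correspond to positions n/2 and n/2 + 1.
    return (ordered[p - 1] + ordered[p]) / 2


def modes(values):
    """All most-frequent values, or [] if no value repeats (assumed convention)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)
```

For the sample 42, 54, 65, 47, 59, 40, 53 the median is 53, and under the assumed convention the data set has no mode.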
Standard Deviation
Standard deviation measures the dispersion of data values from the mean. It is the square root of the sum of squared differences from the mean divided by the size of the data set n for a population, or divided by n - 1 for a sample.
For a Population
\[ \sigma = \sqrt{\dfrac{\sum_{i=1}^{n}(x_i - \mu)^{2}}{n}} \]
For a Sample
\[ s = \sqrt{\dfrac{\sum_{i=1}^{n}(x_i - \overline{x})^{2}}{n - 1}} \]
Variance
Variance measures the dispersion of data from the mean. It is the sum of squared differences from the mean divided by the size of the data set n for a population, or divided by n - 1 for a sample.
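The population and sample forms of variance and standard deviation differ only in the divisor. A minimal Python sketch, with the function names and the `sample` flag introduced purely for illustration:

```python
import math


def variance(values, sample=False):
    """Population variance by default; sample variance divides by n - 1."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)  # sum of squared differences
    return ss / (n - 1) if sample else ss / n


def std_dev(values, sample=False):
    """Standard deviation is the square root of the matching variance."""
    return math.sqrt(variance(values, sample))
```

For the data set 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the sum of squared differences is 32, so the population variance is 32 / 8 = 4 and the population standard deviation is 2. The sample form needs at least two values.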
For a Population
\[ \sigma^{2} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{2}}{n} \]
For a Sample
\[ s^{2} = \dfrac{\sum_{i=1}^{n}(x_i - \overline{x})^{2}}{n - 1} \]
Midrange
The midrange of a data set is the average of the minimum and maximum values.
\[ \text{MR} = \dfrac{x_{min} + x_{max}}{2} \]
Quartiles
Quartiles separate a data set into four sections. The median is the second quartile Q2. It divides the ordered data set into higher and lower halves. The first quartile, Q1, is the median of the lower half not including Q2. The third quartile, Q3, is the median of the higher half not including Q2. This is one of several methods for calculating quartiles.[1]
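The quartile method described in this section (Q1 and Q3 as medians of the lower and upper halves, excluding Q2) can be sketched in Python as below. The function names are illustrative, and the sketch assumes at least two data values. Other quartile methods give different results for the same data.

```python
def _median_of_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using medians of the halves, excluding Q2."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n the middle value is Q2 itself and is left out of both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (
        _median_of_sorted(lower),
        _median_of_sorted(ordered),
        _median_of_sorted(upper),
    )
```

For the sample 42, 54, 65, 47, 59, 40, 53 the ordered halves are 40, 42, 47 and 54, 59, 65, so the sketch returns Q1 = 42, Q2 = 53, and Q3 = 59.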
Interquartile Range
The range from Q1 to Q3 is the interquartile range (IQR).
\[ IQR = Q_3 - Q_1 \]
Outliers
Potential outliers are values that lie above the Upper Fence or below the Lower Fence of the sample set.
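A short Python sketch of the fence rule, assuming Q1 and Q3 have already been computed. The function names and the parameter `k` (the 1.5 multiplier) are introduced for this example.

```python
def fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) from the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def potential_outliers(values, q1, q3):
    """Values strictly below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]
```

With Q1 = 42 and Q3 = 59 the IQR is 17, giving a lower fence of 16.5 and an upper fence of 84.5. None of the values 42, 54, 65, 47, 59, 40, 53 falls outside that interval.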
\[ \text{Upper Fence} = Q_3 + 1.5 \times IQR \]
\[ \text{Lower Fence} = Q_1 - 1.5 \times IQR \]
Sum of Squares
The sum of squares is the sum of the squared differences between data values and the mean.
For a Population
\[ SS = \sum_{i=1}^{n}(x_i - \mu)^{2} \]
For a Sample
\[ SS = \sum_{i=1}^{n}(x_i - \overline{x})^{2} \]
Mean Absolute Deviation
Mean absolute deviation[2] is the sum of the absolute value of the differences between data values and the mean, divided by the sample size.
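Sum of squares and mean absolute deviation both start from the differences between each value and the mean. A minimal Python sketch with illustrative function names:

```python
def sum_of_squares(values):
    """Sum of squared differences from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def mean_absolute_deviation(values):
    """Average absolute difference from the mean (divides by n in both cases)."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)
```

For 2, 4, 4, 4, 5, 5, 7, 9 (mean 5) the absolute differences add up to 12, so the mean absolute deviation is 12 / 8 = 1.5, and the sum of squares is 32.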
For a Population
\[ MAD = \dfrac{\sum_{i=1}^{n}|x_i - \mu|}{n} \]
For a Sample
\[ MAD = \dfrac{\sum_{i=1}^{n}|x_i - \overline{x}|}{n} \]
Root Mean Square
The root mean square describes the magnitude of a set of numbers. The formula for root mean square is the square root of the sum of the squared data values divided by n.
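As a sketch, the root mean square is a one-line computation in Python. The function name is chosen for this example.

```python
import math


def root_mean_square(values):
    """Square root of the mean of the squared data values."""
    return math.sqrt(sum(x * x for x in values) / len(values))
```

For the two values 1 and 7 the squares sum to 50, the mean square is 25, and the root mean square is 5. Because values are squared first, the sign of each value does not matter.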
\[ RMS = \sqrt{\dfrac{\sum_{i=1}^{n}x_i^{2}}{n}} \]
Standard Error of the Mean
Standard error of the mean is calculated as the standard deviation divided by the square root of the count n.
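One way to sketch this in Python is to lean on the standard library's `statistics.stdev` (sample) and `statistics.pstdev` (population). The wrapper name `standard_error` and its `sample` flag are assumptions for this example.

```python
import math
import statistics


def standard_error(values, sample=True):
    """Standard deviation divided by the square root of the count n."""
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return sd / math.sqrt(len(values))
```

The sample form requires at least two values, since `statistics.stdev` divides by n - 1.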
For a Population
\[ {SE}_{\mu} = \dfrac{\sigma}{\sqrt{n}} \]
For a Sample
\[ {SE}_{\overline{x}} = \dfrac{s}{\sqrt{n}} \]
Skewness
Skewness[3] describes how far to the left or right a data set distribution is distorted from a symmetrical bell curve. A distribution with a long left tail is left-skewed, or negatively-skewed. A distribution with a long right tail is right-skewed, or positively-skewed.
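The population and sample skewness formulas given in this section can be sketched in Python as below. The function name and `sample` flag are illustrative. The sample form needs at least three values, and both forms fail with a division by zero if all values are identical.

```python
import math


def skewness(values, sample=False):
    """Population skewness by default; adjusted sample skewness if sample=True."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    if sample:
        s = math.sqrt(ss / (n - 1))  # sample standard deviation
        z3 = sum(((x - mean) / s) ** 3 for x in values)
        return n / ((n - 1) * (n - 2)) * z3
    sigma = math.sqrt(ss / n)  # population standard deviation
    return sum((x - mean) ** 3 for x in values) / (n * sigma ** 3)
```

A symmetric data set such as 1, 2, 3, 4, 5 gives a skewness of zero (up to floating-point rounding), while a data set with a long right tail such as 1, 1, 1, 10 gives a positive value.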
For a Population
\[ \gamma_{1} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{3}}{n\sigma^{3}} \]
For a Sample
\[ \gamma_{1} = \dfrac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left(\dfrac{x_i - \overline{x}}{s}\right)^{3} \]
Kurtosis
Kurtosis[3] describes the extremeness of the tails of a population distribution and is an indicator of data outliers. High kurtosis means that a data set has tail data that is more extreme than a normal distribution. Low kurtosis means the tail data is less extreme than a normal distribution.
For a Population
\[ \beta_{2} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{4}}{n\sigma^{4}} \]
For a Sample
\[ \beta_{2} = \dfrac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\dfrac{x_i - \overline{x}}{s}\right)^{4} \]
Kurtosis Excess
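Excess kurtosis is kurtosis measured relative to a normal distribution. A normal distribution has a kurtosis of 3, so its excess kurtosis is 0. Positive excess kurtosis means the tails are heavier than those of a normal distribution and outliers are more frequent. Negative excess kurtosis means the tails are lighter.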
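The kurtosis and excess kurtosis formulas on this page, for both a population and a sample, can be combined into one Python sketch. The function name and the `sample` and `excess` flags are assumptions for this example. The sample form needs at least four values because of the (n - 3) factor.

```python
import math


def kurtosis(values, sample=False, excess=False):
    """Kurtosis following this page's population and sample formulas."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    if sample:
        s = math.sqrt(ss / (n - 1))  # sample standard deviation
        z4 = sum(((x - mean) / s) ** 4 for x in values)
        result = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
        correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    else:
        sigma = math.sqrt(ss / n)  # population standard deviation
        result = sum((x - mean) ** 4 for x in values) / (n * sigma ** 4)
        correction = 3
    return result - correction if excess else result
```

For the values 1, 2, 3, 4, 5 the population kurtosis is 34 / 20 = 1.7 and the population excess kurtosis is -1.3. The sample excess kurtosis for the same values is -1.2.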
For a Population
\[ \alpha_{4} = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^{4}}{n\sigma^{4}} - 3 \]
For a Sample (this is the value returned by the KURT function in MS Excel and Google Sheets)
\[ \alpha_{4} = \dfrac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\dfrac{x_i - \overline{x}}{s}\right)^{4} - \dfrac{3(n-1)^{2}}{(n-2)(n-3)} \]
Coefficient of Variation
The coefficient of variation describes dispersion of data around the mean. It is calculated as the standard deviation divided by the mean.
For a Population
\[ CV = \dfrac{\sigma}{\mu} \]
For a Sample
\[ CV = \dfrac{s}{\overline{x}} \]
Relative Standard Deviation
Relative standard deviation is the coefficient of variation expressed as a percentage. It is calculated as the standard deviation times 100 divided by the mean.
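Coefficient of variation and relative standard deviation are the same ratio on two scales. A minimal Python sketch using the standard library's `statistics` module, with function names and the `sample` flag introduced for illustration:

```python
import statistics


def coefficient_of_variation(values, sample=True):
    """Standard deviation divided by the mean (undefined when the mean is 0)."""
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return sd / statistics.mean(values)


def relative_standard_deviation(values, sample=True):
    """Coefficient of variation expressed as a percentage."""
    return 100 * coefficient_of_variation(values, sample)
```

For 2, 4, 4, 4, 5, 5, 7, 9 treated as a population, the standard deviation is 2 and the mean is 5, so the coefficient of variation is 0.4 and the relative standard deviation is 40%.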
For a Population
\[ RSD = \left[ \dfrac{100 \times \sigma}{\mu} \right] \% \]
For a Sample
\[ RSD = \left[ \dfrac{100 \times s}{\overline{x}} \right] \% \]
Frequency
Frequency is the number of occurrences for each data value in the data set. Frequency is used to find the mode of a data set.
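A frequency table can be sketched in Python with `collections.Counter`. The function name `frequency_table` is chosen for this example.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())
```

The mode is then any value whose count equals the largest count in the table.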
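Allowable Data Formats
One value per line:
54
65
47
59
40
53
Comma-separated, one value per line or all on one line:
54,
65,
47,
59,
40,
53,
or
42, 54, 65, 47, 59, 40, 53
Space-separated, on several lines or on one line:
65 47
59 40
53
or
42 54 65 47 59 40 53
Mixed delimiters:
54 65,,, 47,,59,
40 53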
References
[1] Wikipedia contributors. "Quartile." Wikipedia, The Free Encyclopedia. Last visited 28 May, 2020.
[2] Weisstein, Eric W. "Mean Deviation." From MathWorld--A Wolfram Web Resource. Last visited 28 May, 2020.
[3] Information Technology Lab, National Institute of Standards and Technology. Section 1.3.5.11 Measures of Skewness and Kurtosis. From the Engineering Statistics Handbook. Last visited 28 May, 2020.