Do you like pears?
A farmer took a sample of 11 pears and measured their weights in order to compare this year's produce to last year's:
140, 153, 154, 155, 155, 157, 158, 158, 159, 160, 177
We can see that there are a couple of outliers: 140 and 177 are significantly different from most of the data.
What could be a useful measure to use in such a scenario?
We could use quartiles to see which values the middle 50% of the data lies between!
Let's recall that quartiles divide the data into quarters:
1st quartile (Q1) = 25th percentile, i.e. the value that 25% (one quarter) of the data are less than or equal to
2nd quartile (Q2) = 50th percentile, i.e. the value that 50% (two quarters) of the data are less than or equal to, so it is the middle value, i.e. the median
3rd quartile (Q3) = 75th percentile, i.e. the value that 75% (three quarters) of the data are less than or equal to
We can find all quartiles quite easily using the median.
The median is the middle value and we can find it by either:
1) Crossing off one number from each end at a time (when the numbers are in ascending order) until we reach the middle number:
140, 153, 154, 155, 155, 157, 158, 158, 159, 160, 177
or
2) Finding the position of the median by adding 1 to the total number of numbers and dividing by 2:
There are 11 numbers so the median is the (11 + 1) ÷ 2 = 6th number, i.e. 157!
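The position rule above can be sketched in a few lines of Python. This is only an illustration: the function name `median_by_position` is made up for this example, and the branch for an even count of numbers (averaging the two middle values) is a standard convention that this introduction does not cover.

```python
weights = [140, 153, 154, 155, 155, 157, 158, 158, 159, 160, 177]

def median_by_position(values):
    """Find the median using the (n + 1) / 2 position rule."""
    ordered = sorted(values)           # the rule needs ascending order
    n = len(ordered)
    position = (n + 1) / 2             # 1-based position of the median
    if position.is_integer():
        # Odd count: the position points at a single middle number.
        return ordered[int(position) - 1]
    # Even count (assumption, not covered in the text above):
    # the position falls between two numbers, so average them.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

median_weight = median_by_position(weights)
print(median_weight)
```

With the 11 pear weights, the position is (11 + 1) / 2 = 6, so the function returns the 6th number in the sorted list.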
The median divides the data into two halves:
140, 153, 154, 155, 155 | 157 | 158, 158, 159, 160, 177
Since the lower quartile (Q1) is one quarter of the way through the data, it is the median of the lower half:
140, 153, [154], 155, 155 | 157 | 158, 158, 159, 160, 177
And indeed, if we verify this using the position, Q1 is the (11 + 1) ÷ 4 = 3rd number, i.e. the 154 we found!
Similarly, the upper quartile (Q3) is the 3 × (11 + 1) ÷ 4 = 9th number (because it's three quarters of the way through the data), i.e. the median of the upper half:
140, 153, 154, 155, 155 | 157 | 158, 158, [159], 160, 177
So we can see that the middle 50% of the data lies between 154 (Q1) and 159 (Q3)!
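The "median of each half" method can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and the sketch assumes the convention used above: when the count of numbers is odd, the median itself is left out of both halves.

```python
from statistics import median

weights = [140, 153, 154, 155, 155, 157, 158, 158, 159, 160, 177]

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) by taking the median of each half."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For an odd count, skip the middle number so it is in neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

q1, q2, q3 = quartiles_by_halves(weights)
print(q1, q2, q3)
```

For the pear weights, the lower half is 140, 153, 154, 155, 155 and the upper half is 158, 158, 159, 160, 177, so the three results match the quartiles found by hand.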
We can use these to find the interquartile range (IQR), i.e. the distance over which the middle 50% of the data is spread out:
IQR = Q3 - Q1 = 159 - 154 = 5
So the middle 50% of the data is spread out over a distance of 5!
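As a cross-check, the IQR can also be computed with Python's standard library. This is a sketch, not the only way to do it: `statistics.quantiles` with its default `method="exclusive"` uses the same (n + 1) position rule as above, whereas other tools may use a different quartile convention and give slightly different values.

```python
from statistics import quantiles

weights = [140, 153, 154, 155, 155, 157, 158, 158, 159, 160, 177]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(weights, n=4)

iqr = q3 - q1                               # spread of the middle 50%
full_range = max(weights) - min(weights)    # spread of all the data

print("Q1:", q1, "Q3:", q3)
print("IQR:", iqr)
print("Range:", full_range)
```

Comparing the two spreads shows why the IQR is useful here: the full range is stretched by 140 and 177, while the IQR only looks at the middle 50% of the weights.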
This is not an easy topic to get your head round, so don't worry if this all seems a bit daunting.
Let's work through some questions together and you can always look back at this introduction by clicking on the red help button that will appear on the screen as you start the questions.