Influence of outliers

The CensusAtSchool site holds data collected from a large number of students.

The data include students' belly button heights, measured from the ground to the belly button. A random sample generated from the data may include belly button heights of 2 cm. It is possible that the students who recorded 2 cm measured across the belly button rather than from the floor to the belly button. Such values are erroneous measurements and are considered outliers. They should be removed from the data set before any measures of centre are calculated.
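The cleaning step described above can be sketched in a few lines of Python. This is only an illustration: the sample values are made up (they are not real CensusAtSchool data), and the cutoff MIN_PLAUSIBLE_CM is an assumed threshold chosen for the example, not a rule from the source.

```python
from statistics import mean

# Hypothetical sample of belly button heights in cm (not real CensusAtSchool data).
heights_cm = [98, 102, 2, 95, 104, 2, 100, 99]

# Assumed cutoff: a floor-to-belly-button height below 30 cm is treated as a
# measurement error. The value 30 is illustrative only.
MIN_PLAUSIBLE_CM = 30

# Separate the plausible measurements from the erroneous ones.
cleaned = [h for h in heights_cm if h >= MIN_PLAUSIBLE_CM]
removed = [h for h in heights_cm if h < MIN_PLAUSIBLE_CM]

print("Removed as erroneous:", removed)
print("Mean before cleaning:", mean(heights_cm))
print("Mean after cleaning:", round(mean(cleaned), 1))
```

With these made-up values, the two 2 cm entries pull the mean well below every genuine measurement, which is why they are removed before the measures of centre are calculated.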

Other belly button heights that appear shorter or taller than expected should not be discarded simply because they look unusual. They should be removed only after considering, in the context of the data, whether each individual measurement is possible.

Students should be given the opportunity to plot data sets with and without outliers and to observe how the outliers influence the mean. This helps them develop an understanding of how to make good decisions about the treatment of outliers.
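Alongside plotting, students could compare the mean and median of the same data set with and without an outlier. The following Python sketch shows one way to do this. The data values and the helper function summarise are hypothetical, introduced only for this example.

```python
from statistics import mean, median

# Hypothetical belly button heights in cm. The value 150 is unusually tall
# and is used here as an example of a possible outlier.
typical = [95, 98, 99, 100, 102, 104]
with_outlier = typical + [150]


def summarise(label, values):
    """Print the mean and median of values with a label, and return both."""
    m, md = mean(values), median(values)
    print(f"{label}: mean = {m:.1f} cm, median = {md:.1f} cm")
    return m, md


mean_without, median_without = summarise("Without outlier", typical)
mean_with, median_with = summarise("With outlier", with_outlier)

# How far did each measure of centre move when the outlier was added?
print(f"Change in mean: {mean_with - mean_without:.1f} cm")
print(f"Change in median: {median_with - median_without:.1f} cm")
```

In this made-up sample, adding the single large value moves the mean by several centimetres, while the median moves by only half a centimetre. Whether such a value should be kept still depends on whether it is possible in context.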

Awareness of outliers

In considering measures of centre, particularly the mean, it is important to be aware of the influence of outliers.

Curriculum links

Year 8: Investigate the effect of individual data values, including outliers, on the mean and median