An outlier is an observation that appears to differ markedly from the other values in the data set. It may be noticed in a table of values or in a graph such as a dot plot or box plot.
Outliers should be removed if they are errors. Possible causes of error include inaccurate measurement, incorrect recording, or a subject that does not qualify for the sample. Outliers can also be identified by a formula. Outliers identified in this way are sometimes not errors but a legitimate part of the data set.
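The text does not say which formula is meant. One commonly used rule flags any value lying more than 1.5 times the interquartile range (IQR) below the lower quartile or above the upper quartile. The sketch below assumes that rule. The function name iqr_outliers, the multiplier k, and the sample values are all made up for illustration and are not taken from the data set discussed here. Quartile conventions also vary between textbooks and software, so borderline values may be classified differently elsewhere.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Assumption: the 1.5 * IQR rule with "inclusive" quartiles. The source
    text does not name a specific formula.
    """
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical sample values, for illustration only
sample = [120, 340, 410, 480, 560, 610, 700, 6500]
print(iqr_outliers(sample))  # prints [6500]
```

A value flagged by this rule is only a candidate for closer inspection. Whether it is an error or a legitimate observation still has to be checked against how the data were collected.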
The decision of whether to keep or delete outliers depends on the question being asked about the data set. In the above plot, the outlier of 6500 was found to be a legitimate value (soy sauce). In cases such as this, the value may be judged so extreme that it is irrelevant for further comparison with the other products.