Originally Posted by mathman
Originally Posted by Youper
That is really part of the question: which is more significant? I've been using the median, thinking it would better reflect what I want to know by giving less weight to a single bad shot.


Consider these numbers: 2,2,2,3,100,100,100

Three is the median, but does it "best represent" the "middle" of these numbers?
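
For anyone who wants to check the numbers, here is a minimal Python sketch using only the standard library's statistics module. The variable names are just for illustration; the data are the seven values quoted above.

```python
from statistics import mean, median

# The seven example values from the post above.
values = [2, 2, 2, 3, 100, 100, 100]

avg = mean(values)    # sum is 309, so this is 309 / 7
mid = median(values)  # middle (4th) value of the sorted list

print(f"mean   = {avg:.2f}")  # prints "mean   = 44.14"
print(f"median = {mid}")      # prints "median = 3"
```

Neither figure lands anywhere near an actual data point in both clusters: the median sits in the low cluster, and the mean falls in the empty gap between the two.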


Mathman,

I'm no statistics expert, but I've seen enough data to be very suspicious of the data you listed. It looks like two different data sets somehow got combined, or there was a measurement error or some other problem. Bottom line: I wouldn't trust the data until I could figure out why 3 of the 7 data points were nearly TWO orders of magnitude higher than the rest, and why the two "groups" within the data set were remarkably consistent in themselves but very different from each other. In fact, I would wonder whether the low values represented a "zero" reading (with some noise, or some other phantom amount such as nonlinearity at the bottom of a correlation curve, showing instead of zero) and the high values were beyond the measurement capability of the system.

I know you were just throwing out some numbers for an example, but my point is that blindly looking at a set of numbers is a trap that can lead to wrong conclusions. A lot more meaning can be derived from a data set when one knows something about the conditions under which the measurements were taken (e.g., gusty wind, a shot or two was pulled, etc.).

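If I remember correctly, the mean and the median converge as the number of points in the data set increases, but only when the underlying distribution is roughly symmetric; with skewed data they settle on different values. With a small number of data points, you have a strong possibility of running into "errors" with both the mean and the median.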
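
A quick way to see how the mean and median behave as the data set grows is to simulate it. The sketch below is only an illustration under assumptions of my own: the two distributions (a symmetric bell curve and a skewed exponential), the sample sizes, and the helper names are all made up for the example and have nothing to do with real shot data.

```python
import random
from statistics import fmean, median


def draw_symmetric(rng):
    """One draw from a symmetric distribution (standard normal)."""
    return rng.gauss(0.0, 1.0)


def draw_skewed(rng):
    """One draw from a right-skewed distribution (exponential, rate 1)."""
    return rng.expovariate(1.0)


def mean_median_gap(draw, n, rng):
    """Return mean minus median for a fresh sample of n draws."""
    sample = [draw(rng) for _ in range(n)]
    return fmean(sample) - median(sample)


rng = random.Random(0)  # fixed seed so reruns give the same numbers

# Compare a tiny sample with progressively larger ones.
for n in (7, 100, 100_000):
    sym = mean_median_gap(draw_symmetric, n, rng)
    skw = mean_median_gap(draw_skewed, n, rng)
    print(f"n={n:>7}  symmetric gap={sym:+.3f}  skewed gap={skw:+.3f}")
```

With the large sample, the symmetric gap should shrink toward zero, while the skewed gap should settle near 0.31 (the true mean of 1 minus the true median of ln 2). With only 7 points, both gaps can bounce around quite a bit from one sample to the next.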