Originally Posted by brydan
Originally Posted by denton
The fundamental issue is that all measures of dispersion are differences between data points. Data points behave as we have come to expect. Differences do not.

Originally Posted by denton
Things are simpler if you just worry about how far each shot is from the center of the group, but that is more work than anyone is going to do in the field. For groups with 5 shots, group size has practically all the statistical strength of standard deviation.

After reading through this thread a few times I think I'm getting a better handle on this particular concept. So if I'm understanding this correctly (I suck at statistics so my terminology/understanding is probably off), we can't treat group size measurements as independent variables because of the underlying dispersion that those numbers are based on. Since they're not independent variables, the CLT doesn't apply. Therefore if we try to treat those group size measurements as independent variables, the results are unlikely/less likely to accurately model real world behavior. Am I on the right track with that?

On the second part, if instead of using group size measurements, we use the location of each individual shot, those are independent variables, therefore the CLT applies, and we can assume a normal distribution and analysis. Something like that? If so, would I be correct to assume that if I want to get a more accurate model of how my system actually performs within say 20 shots, it would be more accurate to use mean radius than shooting four 5-shot groups?

I appreciate everyone's input in this thread. It's interesting stuff to think about.

Seems like you're getting the hang of it.

Let me try rephrasing a bit for more clarity:

For interval or ratio data (stuff you can measure with a ruler, meter, etc.), we use the T Test to see if two groups of data really have different means, vs. the difference being easily explained by random variation. The T Test tests a difference. Because of the CLT, the T Test is robust to non-normality as long as you have decent sample sizes.
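Here's a minimal sketch of that with SciPy, just for illustration: the "velocity" numbers below are made up, and Welch's version of the t-test is used so we don't have to assume equal variances.

```python
# Sketch: t-test for a difference in means between two samples (made-up numbers).
from scipy import stats

load_a = [2712, 2731, 2708, 2725, 2719, 2722, 2715, 2728]  # hypothetical velocities, load A
load_b = [2698, 2705, 2711, 2701, 2694, 2707, 2699, 2703]  # hypothetical velocities, load B

# Welch's t-test (equal_var=False) compares the means without assuming equal variances.
t_stat, p_value = stats.ttest_ind(load_a, load_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value means the difference in means is hard to explain by random variation alone.
```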

When you get to measures of dispersion, we use the F Test (or one of its cousins). The F Test tests the ratio of two variances (variance = standard deviation squared, a measure of dispersion). Because the CLT doesn't work here, the F Test is sensitive to non-normality.
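SciPy doesn't ship a ready-made two-sample variance F test, so here's a hand-rolled sketch built from the F distribution (the data are invented again):

```python
# Sketch: F test on the ratio of two sample variances (made-up numbers).
import numpy as np
from scipy import stats

a = np.array([2712, 2731, 2708, 2725, 2719, 2722, 2715, 2728], dtype=float)
b = np.array([2698, 2745, 2711, 2681, 2694, 2737, 2669, 2703], dtype=float)

f_ratio = np.var(a, ddof=1) / np.var(b, ddof=1)   # ratio of the two sample variances
dfn, dfd = len(a) - 1, len(b) - 1                 # degrees of freedom
p_value = 2 * min(stats.f.sf(f_ratio, dfn, dfd),  # two-sided p-value from the F distribution
                  stats.f.cdf(f_ratio, dfn, dfd))
print(f"F = {f_ratio:.2f}, p = {p_value:.4f}")
# Unlike the t-test, this leans hard on normality; for skewed data something like
# Levene's test (stats.levene) is the usual more robust fallback.
```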

Group size is a measure of dispersion. So, again, there is no CLT.

It's possible to simplify things by reducing the problem to one dimension. Think of the target in terms of r and theta rather than x and y. We really don't care about theta most of the time; we just care about how far the bullet missed. So just do stats on r: take a mean and a standard deviation, and the numbers are much better behaved. You can simply say that roughly 95% of shots will fall within two standard deviations of the mean radius, and that works.
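For example, here's a tiny sketch of doing the stats on r; the shot coordinates are hypothetical, in inches from the point of aim:

```python
# Sketch: collapse (x, y) impacts to r = distance from the group center, then do ordinary stats.
import numpy as np

x = np.array([0.10, -0.35, 0.52, -0.08, 0.27, -0.44, 0.15, 0.03, -0.21, 0.38])
y = np.array([-0.22, 0.41, 0.05, -0.50, 0.33, 0.12, -0.38, 0.47, -0.09, 0.20])

# Distance of each shot from the group center (theta is ignored on purpose).
r = np.hypot(x - x.mean(), y - y.mean())

mean_r, sd_r = r.mean(), r.std(ddof=1)
print(f"mean radius = {mean_r:.2f} in, SD = {sd_r:.2f} in")
print(f"~95% of shots expected within {mean_r + 2 * sd_r:.2f} in of center")
```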

I don't think many folks will do standard deviations in the field. Something simpler is needed.

Group size, mean distance from center, and all the rest contain the same information, wearing different shirts. There is no need for anything beyond group size and standard deviation. For 5-shot groups, group size is roughly 90% as statistically efficient as the standard deviation.
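One crude way to see that tradeoff is to simulate a pile of 5-shot groups and check how much each statistic bounces around relative to its average; the circular-normal assumption and the group count below are purely illustrative:

```python
# Sketch: Monte Carlo comparison of 5-shot group size vs. the SD of the radii.
import numpy as np

rng = np.random.default_rng(0)
n_groups, shots = 100_000, 5

x = rng.normal(size=(n_groups, shots))
y = rng.normal(size=(n_groups, shots))

# Group size = largest center-to-center distance within each simulated group.
dx = x[:, :, None] - x[:, None, :]
dy = y[:, :, None] - y[:, None, :]
group_size = np.sqrt(dx**2 + dy**2).max(axis=(1, 2))

# Standard deviation of the radii about each group's center.
r = np.hypot(x - x.mean(axis=1, keepdims=True), y - y.mean(axis=1, keepdims=True))
sd_r = r.std(axis=1, ddof=1)

# Relative spread (coefficient of variation) of each statistic: the closer the two numbers,
# the less information you give up by just measuring group size.
print("CV of group size:", group_size.std() / group_size.mean())
print("CV of radial SD: ", sd_r.std() / sd_r.mean())
```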

So for ranges (group sizes), you can pull out some exotic tools like ANOMR, or you can just punt and run a simulation. Then sort the resulting simulated group sizes, note the upper and lower 2.5% points, and you have your 95% prediction interval.
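Here's a bare-bones sketch of that simulation; the 1 MOA sigma and the circular-normal model are placeholders, not a claim about any particular rifle:

```python
# Sketch: brute-force 95% prediction interval for 5-shot group size.
import numpy as np

rng = np.random.default_rng(1)
n_groups, shots, sigma = 100_000, 5, 1.0   # sigma in MOA, purely illustrative

x = rng.normal(scale=sigma, size=(n_groups, shots))
y = rng.normal(scale=sigma, size=(n_groups, shots))

# Extreme spread (group size) of each simulated 5-shot group.
dx = x[:, :, None] - x[:, None, :]
dy = y[:, :, None] - y[:, None, :]
group_size = np.sqrt(dx**2 + dy**2).max(axis=(1, 2))

# Sort and read off the central 95% (np.percentile does the sorting for us).
lo, hi = np.percentile(group_size, [2.5, 97.5])
print(f"95% prediction interval for a 5-shot group: {lo:.2f} to {hi:.2f} MOA")
```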

It's been fun... not many are interested in this esoterica. Hope I have shed a little light on the subject.


Be not weary in well doing.