Dear Stevo,
I am definitely not a statistician, and you probably know most of the below already, but here is one very simplified visualisation of your problem. This is deduced (by me) from the ASTM manual on quality control of materials (ASTM E122 – 58). Similar maths will be found in all stats books; however, ASTM present it in a relatively ready-to-use form. This is no replacement for AS NUR’s proposal, just a quick way of getting an idea. I once used a similar approach for designing a sampling plan for container shipments of cartons (to estimate net weight).
In the crudest way, one can view the situation as single sampling of a large lot of primary units (the correction for a small lot comes later).
Formula is n = (3v/e)^2 (ie 3v/e, squared)
n = necessary size of sample, ie number of sampled objects
v = coefficient of variation in per cent = (100).(sigma)/(Xbar) = the advance estimate of the coefficient of variation of the (desired characteristic of the) material, expressed in percent
sigma = the advance estimate of the standard deviation of the lot
e = 100E/(Xbar) = the allowable sampling error expressed as a per cent of Xbar
Xbar = the expected value of the characteristic being measured (ie the mean)
E = the maximum allowable difference between the estimate to be made and the result of testing (by same method) all the units in the universe
3 = a factor corresponding to a probability of about 3 parts in 1000 that the difference between the sample estimate and the result of testing (by the same method) all the units in the universe is greater than E. The choice of the factor 3 is recommended for general use.
Note – the advantage of using “v” is that it “standardises” the deviation with respect to the average quality characteristic
Note – don’t worry about strange terms like “universe”; it just means the whole lot (population) being sampled, and I think the term is used for statistical thoroughness.
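For anyone who prefers to see the formula as a calculation, here is a minimal sketch in Python. The function name and the numbers (mean 50, st.dev 2.5, allowable error 5) are made up purely for illustration, and rounding n up to the next whole unit is my own assumption rather than something taken from the ASTM text.

```python
import math

def sample_size(sigma, xbar, E, factor=3.0):
    """Return (unrounded n, n rounded up) for n = (factor * v / e)^2.

    sigma  = advance estimate of the standard deviation of the lot
    xbar   = expected (mean) value of the characteristic
    E      = maximum allowable difference, in the same units as xbar
    factor = 3 by default, as recommended for general use
    """
    v = 100.0 * sigma / xbar   # coefficient of variation, per cent
    e = 100.0 * E / xbar       # allowable sampling error, per cent of Xbar
    n = (factor * v / e) ** 2
    # Rounding up to a whole number of units is an assumption of this sketch.
    return n, math.ceil(n)

# Illustrative (invented) inputs: v works out to 5 per cent, e to 10 per cent.
n_exact, n_units = sample_size(sigma=2.5, xbar=50.0, E=5.0)
print(n_exact, n_units)
```

With these invented inputs v is 5 percent and e is 10 percent, so the formula gives n = 2.25, ie 3 units once rounded up.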
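Example (my own) - you are seeking a maximum value of 3 for n
e is maybe +/-10 percent
rearranging the formula gives v = e.sq.root(n)/3; plugging the two values in gives v = 10 x sq.root(3)/3 = sq.root(33.3) = approx 5.8, ie the material needs a “v” of no more than about 5 to 6 percent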
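The same example can be checked by turning the formula around to solve for v. This is only a sketch; the function name is my own invention, and the inputs are the n = 3 and e = 10 percent used in the example.

```python
import math

def max_cv(n, e, factor=3.0):
    """Largest coefficient of variation v (per cent) that a sample of
    n units can tolerate for an allowable error e (per cent),
    from n = (factor * v / e)^2 rearranged to v = e * sqrt(n) / factor."""
    return e * math.sqrt(n) / factor

v_max = max_cv(n=3, e=10.0)
print(round(v_max, 2))

# Round trip: putting v_max back into the original formula should return n = 3.
n_check = (3.0 * v_max / 10.0) ** 2
print(round(n_check, 6))
```

So with only 3 samples and a tolerance of +/-10 percent, the material would need a “v” somewhere between 5 and 6 percent.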
I seem to remember reading somewhere that many typical powder mixing devices can achieve “v” numbers of around 3-6 percent with respect to composition uniformity, but you probably have a better idea than me for that. Whether the (“true”) bacterial plate count has a similar “v” variation will of course depend on factors like whether the materials all come from the same source, etc. Frankly, I have no idea.
There are all sorts of approximations in the above but it gives you an idea.
One more comment – in the total absence of detailed statistical knowledge, one can look at the historical smallest and largest values of the characteristic and estimate sigma from (largest – smallest) / 6 (assumes normal curve). The least optimistic book value (ie largest st.dev) is for a hypothesised rectangular distribution, where the formula is (largest – smallest) / 3.5 (3.5 being approx sq.root(12)).
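A rough sketch of that range-based estimate, again in Python. The function name and the historical values (40 and 64) are invented for illustration only; the divisors 6 and 3.5 are the ones quoted above.

```python
def sigma_from_range(smallest, largest, shape="normal"):
    """Advance estimate of sigma from the historical range.

    shape="normal"      -> range / 6
    shape="rectangular" -> range / 3.5 (the least optimistic case)
    """
    divisors = {"normal": 6.0, "rectangular": 3.5}
    return (largest - smallest) / divisors[shape]

# Invented historical extremes, purely for illustration.
sigma_normal = sigma_from_range(40.0, 64.0, shape="normal")
sigma_rect = sigma_from_range(40.0, 64.0, shape="rectangular")
print(sigma_normal, round(sigma_rect, 2))
```

The rectangular assumption always gives the larger sigma, and therefore the larger “v” and the larger sample size, so it is the cautious choice when nothing is known about the shape of the distribution.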
[added - If you have a play with the data in the earlier post, you can see the result is "v" values in the range 20-30 percent; the high value simply reflects their apparent variation. This may simply mean that the (absolute) bacteriological quality substantially varies from lot to lot (eg genuinely different raw materials), or that the testing precision is low, or other reasons. Separating these causes is the potential benefit of ANOVA, but I'm sure that will require replicate testing etc to proceed along that route. Alternatively, one can perhaps substantially increase the value of "e" in my above formula to acknowledge the acceptable tolerance required.]
For a small lot, the formula is (nL) = (N/(N+n)).n
nL = necessary sample size for finite lot
N = total lot size
n = value from first equation
(ie the necessary sample size decreases if the sample would be a significant part of the lot)
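As a quick illustration of the small-lot correction, here is a sketch in Python. The function name and the numbers (n = 30, lots of 60 and 1,000,000 units) are invented for the example.

```python
def finite_lot_sample_size(n, N):
    """Small-lot correction: nL = (N / (N + n)) * n.

    n = sample size from the large-lot formula
    N = total lot size
    """
    return N * n / (N + n)

# Invented numbers: the large-lot formula asks for 30 units.
nL_small = finite_lot_sample_size(n=30, N=60)         # small lot
nL_large = finite_lot_sample_size(n=30, N=1_000_000)  # very large lot
print(nL_small, round(nL_large, 3))
```

For the lot of 60 the requirement drops from 30 to 20 units, whereas for the very large lot the correction is negligible and nL stays at practically 30.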
Rgds / Charles.C
PS - anyone with more knowledge on this topic is quite welcome to point out any errors in the above. I'm sure there will be some.