Precision, Accuracy, and Resolution

Dave Tutelman  --  December 23, 2007

Too many people use the terms "precision" and "accuracy" interchangeably. They shouldn't. Precision and accuracy are completely different concepts. Let's explore what they really mean, and how to tell the difference. While we're at it, we will also throw in "resolution", which is also too-often confused with precision.

This article first distinguishes between resolution and precision, then between precision and accuracy. In each case, we will start with an example chosen to make the point clear, then take one or two examples from clubmaking measurement to show why it's important to clubmakers.

Precision vs Resolution

First the definitions:
  • Resolution is the fineness to which an instrument can be read.
  • Precision is the fineness to which an instrument can be read repeatably and reliably.
There is a difference. Let's see it with an example.

Here are two stopwatches. One is analog and the other is digital. Both are manually actuated; this is an important point in the distinction.

First, let's look at the resolution of the two stopwatches:
  • The analog stopwatch has to be viewed on its dial. If you look closely, you can relate the big hand to the smallest tick-mark on the big dial. That tick-mark is a tenth of a second. The best a good eye can do is resolve a reading to 1/10 second, which is therefore the resolution of the stopwatch.
  • The digital stopwatch has two digits beyond the seconds, so it subdivides time in hundredths of a second. Since it is easy to read to 1/100 of a second, that is its resolution.
So there is a substantial difference between the watches in resolution -- a factor of ten, from 1/10 to 1/100 second.

What about the precision? Precision is reliable, repeatable measurement. The total measurement system includes the human that activates the watch in either case. And experiments have shown that a human takes about 1/10 of a second to react to a stimulus and turn it into a button press. So...
  • The analog stopwatch has a precision of about 1/10 second. Both the resolution and the stimulus-response time of the human are 1/10 second.
  • The digital stopwatch also has a precision of 1/10 of a second. This is a surprise! After all, the watch has a resolution of 1/100 second. But, because of the human reaction time, the hundredths digit is not reliable. If you measured precisely-known elapsed times with this arrangement, you would find the last digit's value to be almost random. There is a spread of about 1/10 of a second in the measured times due to the human factor. So it is repeatable to only 1/10 second.
[Figures: Analog stopwatch; Digital stopwatch]
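The stopwatch argument can be checked with a small simulation. This is only an illustrative sketch: it assumes the human error is spread uniformly over a 1/10-second window (the text only says the spread is about 1/10 second), and the function names and the 10.03-second test interval are invented for the example.

```python
import random

def simulate_stopwatch(true_time, resolution, human_spread=0.1, trials=1000, seed=1):
    """Return simulated readings of one known interval on a hand-operated stopwatch.

    Assumption: the human adds an error uniformly distributed over
    human_spread seconds; the watch then rounds to its resolution.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    readings = []
    for _ in range(trials):
        timed = true_time + rng.uniform(-human_spread / 2, human_spread / 2)
        ticks = round(timed / resolution)   # the nearest value the display can show
        readings.append(ticks * resolution)
    return readings

def spread(readings):
    """Total spread (max minus min) of a set of readings."""
    return max(readings) - min(readings)

analog = simulate_stopwatch(10.03, resolution=0.1)
digital = simulate_stopwatch(10.03, resolution=0.01)

print(f"analog  spread: {spread(analog):.2f} s")
print(f"digital spread: {spread(digital):.2f} s")
```

Under these assumptions both watches should show a spread of roughly 1/10 second, even though the digital display resolves ten times finer.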

This raises an important point. The advent of digital instrumentation gave rise to a mindset that equated resolution with precision. Digital readouts make it very easy to see what the resolution of an instrument is. Most people simply assume, "Hey, the guys who designed this made it to read to five digits, so it must be good to five digits!" Whatever "good" means. Resolution? Yes. Precision? Well, maybe.

How about a few real-life examples from clubmakers' instruments?

Digital Scales

In my article about testing digital scales, I warn about measuring the same weight twice in a row. That is because some manufacturers of digital scales realized that their precision was not up to their resolution. For instance, the resolution might be 1 gram; but they were not able, without significant added expense, to get the precision below 3 grams. Instead of cutting the resolution to 3 grams -- which would be honest, but potentially confusing to customers -- they just left the resolution at one gram.

But they realized that customers might be annoyed by discovering this unfortunate fact of life. So they came up with a "cheater circuit" that recognizes if the load being weighed is within 3 grams (or maybe 5 grams, just to be safe) of the last thing weighed. If so, then just display the previous answer, instead of what you actually got this time. So, if you are trying to determine the repeatability of your scale, be sure to "cleanse the palate" with a completely different weight between weighings.
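To make the "cheater circuit" idea concrete, here is a toy model of the behavior described above. It is a guess at the logic, not any manufacturer's actual design; the class name, the 3-gram hold band, and the raw readings are all made up for illustration.

```python
class CheaterScale:
    """Toy model of a scale that hides its own scatter (hypothetical design)."""

    def __init__(self, hold_band=3.0):
        self.hold_band = hold_band      # grams; "close enough to the last thing weighed"
        self.last_display = None

    def show(self, raw_grams):
        """Return what the display shows for a raw internal reading."""
        if (self.last_display is not None
                and abs(raw_grams - self.last_display) <= self.hold_band):
            return self.last_display    # repeat the previous answer
        self.last_display = round(raw_grams)
        return self.last_display

# Hypothetical raw readings of the same clubhead (made-up numbers).
raw = [201.4, 198.2, 202.7, 199.1]

# Weighing the clubhead four times in a row.
scale = CheaterScale()
back_to_back = [scale.show(r) for r in raw]

# Weighing it four times with a very different weight in between.
scale = CheaterScale()
cleansed = []
for r in raw:
    cleansed.append(scale.show(r))
    scale.show(50.0)                    # "cleanse the palate"

print("back to back:", back_to_back)
print("cleansed:    ", cleansed)
```

In this model the back-to-back weighings all display the same number, while the weighings separated by a different load expose the scatter that was there all along.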

Frequency Meter

Typically, a clubmaker's frequency meter has a resolution of 1cpm. But John Kaufman has made a version of his very successful Club Scout frequency meter, with a resolution of 1/10 of a cpm (0.1cpm). It is reasonable to ask, what is its precision?

John has assured us that he does indeed get repeatable readings to 0.1cpm, and I believe him. It isn't hard to build electronics to do this. But I also believe him when he says that technique and setup must be watched when you're trying to attain this precision. Think about things like:
  • The stability of the clamp, and the bench to which it is mounted.
  • The repeatability of the clamp to the same pressure each time a shaft is clamped.
  • The repeatability of your technique for pulling and releasing the shaft.
  • The security with which the tip weight (or clubhead) is attached to the shaft.
Any of these could be perfectly good for a precision of 1cpm, but might introduce fluctuating readings at a resolution of 0.1cpm. In order to achieve precision that matches the resolution John has built into the electronics, your clamp, bench, and technique must be good for 0.1cpm repeatability.
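One simple way to check whether your setup lives up to the meter's resolution is to re-clamp and re-measure the same shaft several times and look at the spread. The sketch below shows the arithmetic; the readings are hypothetical numbers invented for the example, not data from any real meter.

```python
import statistics

def repeatability(readings):
    """Return (total spread, sample standard deviation) of repeated readings."""
    return max(readings) - min(readings), statistics.stdev(readings)

# Hypothetical readings (cpm) of one shaft, re-clamped before each reading.
readings_cpm = [255.3, 255.9, 254.8, 255.6, 255.1]
resolution_cpm = 0.1

total_spread, std_dev = repeatability(readings_cpm)
print(f"spread = {total_spread:.1f} cpm, std dev = {std_dev:.2f} cpm")
if total_spread > resolution_cpm:
    print("Setup and technique are not yet repeatable to the meter's resolution.")
```

If the spread is several times the resolution, the extra digit on the display is telling you about your clamp and your technique, not about the shaft.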

Accuracy vs Precision

It's not hard to wrap your mind around repeatability, which is the difference between resolution and precision. The difference between precision and accuracy is correctness -- and that is sometimes a little harder to cope with. To make it easier, we'll use a very graphic example [1].

[Figure: Low precision, low accuracy]
Imagine you have a rifle with a telescopic sight. When you shoot with it, you get a pattern like the one at the left. Not very good.

So you decide there is something wrong with your telescopic sight. You get better optics -- sharper, and greater magnification. Does that solve the problem?
[Figure: High precision, low accuracy]
No, it does not! You now have a much tighter distribution. But, on average, you're just as far from the bull's eye. The real problem was not that the scope did not show the target well enough; the scope was aligned wrong.

One way of expressing it is, "You have greatly improved the precision, but the accuracy did not get better." That is:
  • The repeatability from shot to shot (precision) is much better, but...
  • The "correctness" (accuracy) of the shots -- their distance from the bull's eye -- did not improve at all.
[Figure: Low precision, high accuracy]
OK, so we can improve precision without improving accuracy. Does it work the other way, too? Can we improve accuracy without improving precision?

We can, as this picture shows. If, instead of working on the optics of the sighting scope, we had just aligned it properly, here's the pattern we would have gotten. No improvement in precision, but plenty of improved accuracy.
[Figure: High precision, high accuracy]
Finally, just to complete the picture, here's high accuracy with high precision. This would result from working on both the alignment and the optics.
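The two qualities can be put into numbers. One reasonable way (an assumption of this sketch, not the only possible definition) is to take the distance from the bull's eye to the center of the group as the accuracy error, and the RMS scatter of the shots about that center as the precision. The shot coordinates below are made up.

```python
import math

def accuracy_and_precision(shots, bullseye=(0.0, 0.0)):
    """Return (offset of the group's center from the bull's eye,
    RMS scatter of the shots about that center)."""
    n = len(shots)
    cx = sum(x for x, _ in shots) / n
    cy = sum(y for _, y in shots) / n
    offset = math.hypot(cx - bullseye[0], cy - bullseye[1])
    scatter = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in shots) / n)
    return offset, scatter

# Made-up shot coordinates (inches from the bull's eye).
tight_but_off = [(3.0, 2.0), (3.2, 2.1), (2.9, 1.8), (3.1, 2.1)]
loose_but_centered = [(2.0, 1.0), (-2.0, -1.0), (1.0, -2.0), (-1.0, 2.0)]

for name, shots in [("tight but off", tight_but_off),
                    ("loose but centered", loose_but_centered)]:
    offset, scatter = accuracy_and_precision(shots)
    print(f"{name}: offset = {offset:.2f}, scatter = {scatter:.2f}")
```

A small offset means high accuracy; a small scatter means high precision. The first pattern has small scatter but a large offset, and the second is the reverse.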

On to the promised examples from clubmaking instrumentation.

Digital Scales, but not just digital scales

Let's look at the sort of errors that affect the accuracy of instruments, as opposed to precision or resolution. In this section, our model instrument will be a digital scale, but we can apply the information to any instrument, even analog instruments.

Let's start by assuming that the resolution and precision of the instrument are easily good enough for the job you have. Your concern now is that these precise results reflect reality: that is, they are accurate.

Example: You have a digital scale that reads to 1 gram -- both resolution and precision. You can weigh a 100-gram standard weight and it reads the same value every time to within a gram. But that's just precision. If that consistent, precise value you read is 105 grams, then the accuracy is 5 grams, five times coarser than the precision.

So what would accuracy errors look like?

The first and perhaps most important error in any instrument is scaling error.

In the graph, a perfect instrument would be the heavy black line, a straight line at 45°. That is, the measured value y -- the reading -- would be the same as the actual value x. For instance, if the actual clubhead being weighed is 198 grams, then the reading of the scale is also 198 grams.

The blue line shows what happens to the measurement if the digital scale has a scaling error. The reading is different from the actual value, by an amount proportional to the actual value. There are a few other ways to say this with precision:
  • The reading of the scale is related to reality by the equation y=Kx. For perfect accuracy, K=1.000. For any other value of K there is a scaling error.
  • The reading of the scale is off by a constant percentage. Not a constant amount -- we'll get to that later -- but a constant percentage or proportion. For instance, if K=1.02 instead of 1.000, then the reading is always 2% high. This gives a 2-gram error when weighing 100 grams, a 4-gram error when weighing 200 grams, etc.
How does this sort of error occur? It is usually a calibration error. This can affect either analog or digital scales. For instance:
  • Analog - Analog scales generally depend on a spring for measurement. The spring stretches or compresses in response to a force (a weight); the change in the length of the spring is measured and read as a force from markings on the scale. In order to do the conversion correctly, the scale manufacturer has to know the "spring constant" of the spring -- the ratio of force to elongation.
    It is well known that standard springs have a 10% tolerance on their spring constant. 10% is too big an error for any kind of scale, so the scale manufacturer will pay extra for a lower-tolerance spring. How much more money, for how much lower tolerance? That is what will determine the scaling error for the analog scale.
  • Digital - Many digital scales today have a "calibration" feature that works like this: You put a known accurate weight (specified and often supplied by the manufacturer) on the scale and hit a "calibration" button. It then determines the "K" of the scale, and adjusts the readout so the scale behaves as if K=1.000. A calibrated scale is only as accurate as the weight you use to calibrate it.
    Horrible example: An acquaintance of mine saw that his scale required (according to the manufacturer's instructions) a 10-kilogram weight for calibration. He didn't have an exact 10-kilogram weight in the shop. But he knew that 10 kilograms is about 20 pounds, so he borrowed a [very accurate, he was assured] 20-pound weight from the fitness store next door, and used that to calibrate the scale. Well, 10 kilograms is actually 22.05 pounds. Yes, that's "about 20 pounds" -- with a 10% difference. So his "calibrated" (actually miscalibrated) scale now had a 10% error. This high-resolution, very precise digital scale had an accuracy problem of 10%. He weighed a 198g clubhead and got a reading of 218 grams -- and he wondered why.
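The arithmetic behind both the 2% example and the miscalibration story follows directly from y=Kx. In this sketch the function names are invented for illustration, and it assumes the calibration step simply sets K so that the calibration weight displays as the value the scale expects.

```python
LB_TO_KG = 0.45359237  # definition of the avoirdupois pound in kilograms

def scale_factor(assumed_weight, actual_weight):
    """K for a scale calibrated with a weight that is not what the scale assumes."""
    return assumed_weight / actual_weight

def reading(actual, k):
    """Reading of a scale with pure scaling error: y = K * x."""
    return k * actual

# The 2% example from the text: error grows in proportion to the load.
print(reading(100, 1.02), reading(200, 1.02))

# The miscalibration story: the scale assumes 10 kg, but is shown 20 lb.
k = scale_factor(assumed_weight=10.0, actual_weight=20 * LB_TO_KG)
print(f"K = {k:.3f}; a 198 g clubhead reads {reading(198, k):.0f} g")
```

The 20-pound weight is only about 9.07 kilograms, so K comes out near 1.10, which is where the 218-gram reading for a 198-gram clubhead comes from.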
 
The next kind of error we will look at is offset. This occurs when every reading is high (or low) by a constant amount, or constant offset.

Offset errors are very easy to eliminate in digital instruments (or electronic instruments in general). Most electronic instruments have some sort of "zero adjustment"; you provide the instrument with a zero input (e.g.- no weight on the scale) and tell it "this is zero". Examples:
  • Digital scales usually have a "Tare/Zero" button for exactly this purpose.
  • Many analog meters have a "zero adjust" knob or screw. With a zero input to the instrument, turn the knob so that the output is zero. Now the instrument has no offset error.

Zero adjustments assure that there is no output with zero input; that is, the instrument is perfectly accurate at zero. When this is adjusted properly, then any accuracy problems are something other than offset.

But offset errors can still creep in if we are not careful. In particular, it is sometimes hard to identify a zero (or any other arbitrary) "standard" to use as a zero adjust or tare. Case in point: A clubmaker added a Wixey to his loft/lie machine. (A Wixey is a digital angle gauge that can measure its own orientation to 0.1°.) He concluded that he could now measure lie angle to 0.1°. The problem was that, without the Wixey, there was no way to tell lie angle to better than a half degree with his L/L machine. So there was no way to position the Wixey on the L/L machine oriented within 0.1°.

For instance, suppose we have a standard club that we know is 60°, and use that to set the Wixey so it reads 60.0°. That solves the problem, right? Well, maybe. Consider... how do we know that the standard club is 60°? Because we measured it in another machine. OK then, how accurate was this other machine? Was it a full 0.1° accuracy machine? If not, our 60° standard club might actually be 60.3°. If we use it to orient the Wixey on our machine, we have an instrument with an offset error of 0.3°. It measures lie differences to 0.1° accuracy, but it will measure absolute lie with the same 0.3° error every time.
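The claim that an offset error spoils absolute readings but not differences is easy to verify. This is a minimal sketch of a gauge with pure offset error; the function names and the two clubs' true lie angles are hypothetical.

```python
def zero_against_standard(displayed_for_standard, true_standard):
    """Offset that results from setting the gauge on a standard club."""
    return displayed_for_standard - true_standard

def gauge_reading(true_angle, offset):
    """Reading of a gauge with pure offset error: y = x + offset."""
    return true_angle + offset

# The gauge is set to read 60.0 on a club that is really 60.3 degrees.
offset = zero_against_standard(60.0, 60.3)

club_a, club_b = 61.0, 62.5  # hypothetical true lie angles, degrees
read_a = gauge_reading(club_a, offset)
read_b = gauge_reading(club_b, offset)

print(f"absolute error on each club: {read_a - club_a:+.1f} degrees")
print(f"measured difference: {read_b - read_a:.1f} (true difference {club_b - club_a:.1f})")
```

Every absolute reading carries the same 0.3-degree error, but the error cancels when one reading is subtracted from another.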
 
The final common accuracy error is linearity error. It is often the hardest to avoid building into real-world instruments.

In the graph, the red curve matches the black line for zero input; so this instrument has no offset error. It also matches the black line near the top-right of the graph. So this instrument does not have any percentage error at that point; we can't accuse it of scaling error.

If the actual response is perfectly accurate for at least two [widely-separated] input values, but inaccurate for other values, then the instrument's response curve cannot be a straight line. That's just geometry. The perfect response curve y=x is a straight line. Two points determine a straight line. So, if the actual response curve matched at two points and was a straight line, it would be the same straight line as the perfect response curve. Q.E.D.

If the response curve is not a straight line, then it is nonlinear, as mathematicians and engineers would say. Inaccuracies of this type are referred to as nonlinearities.
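The two-point argument suggests a simple test for nonlinearity: draw the straight line through the readings at the two ends of the range, and see how far the readings in between stray from it. The sketch below does that; the function name and the calibration check values are hypothetical.

```python
def max_nonlinearity(points):
    """Largest deviation of the readings from the straight line through
    the first and last points.

    points: list of (actual, reading) pairs sorted by actual value.
    """
    (x0, y0), (x1, y1) = points[0], points[-1]
    slope = (y1 - y0) / (x1 - x0)
    return max(abs(y - (y0 + slope * (x - x0))) for x, y in points)

# Hypothetical check weights (grams): exact at 0 and 500, reading high in between.
checks = [(0, 0.0), (100, 101.5), (250, 253.0), (400, 401.8), (500, 500.0)]
print(f"worst nonlinearity: {max_nonlinearity(checks):.1f} g")
```

An instrument with only offset and scaling error would score zero on this test, because its response is still a straight line.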

We have already pointed out that analog scales frequently have scaling errors because of tolerances on the spring constant. They also often have nonlinear errors. To prove to yourself where these errors might come from, take a fairly flexible coil spring and start stretching it. For a while the length increases in proportion to the force you apply. You can see and feel that. But, at some point, the coils are less flat and more angled. The spring is straightening out. Now it takes a greater increase of force to get the same increase of length. Eventually the spring is mostly straight, and you can apply a lot more force with almost no additional increase in length.

This results in a nonlinear response curve like the one in the graph. As the spring stretches, its rate of length increase becomes less, and the curve gets "flatter" as shown.
 
A digital scale has a different kind of linearity problem. The problem stems from the necessity to convert a weight (an analog quantity) into a digital number. This conversion process, called "quantization", is a necessary function of most digital instruments.

Example: a 500-gram scale with a 0.1-gram resolution must quantize its input to one of 5000 values -- 0.0g through 499.9g.

The circuitry to do the quantization must be manufactured very accurately. This graph reveals a quantization circuit (A/D converter) with an inaccurate electronic component converting one of the bits. (It happens to be the second-most-significant bit, for you binary number fans.) The error shown in the graph is unusually large, but smaller errors of this kind are not uncommon. That is why my article on digital scale testing stresses looking for nonlinearity errors.
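For the binary number fans, here is a toy model of how one bad bit produces this kind of error. It assumes a 4-bit successive-approximation converter, which is one common architecture; the text does not say which kind of converter the scale uses, and the size of the error is exaggerated so it shows up in a 16-value table.

```python
def sar_convert(value, actual_weights):
    """Toy successive-approximation converter.

    actual_weights: what each bit is really worth inside the converter,
    most significant bit first. The returned code is interpreted with the
    nominal binary weights (8, 4, 2, 1 for four bits).
    """
    total = 0.0
    code = 0
    for weight in actual_weights:
        code <<= 1
        if total + weight <= value:   # keep the bit if we have not overshot
            total += weight
            code |= 1
    return code

ideal = [8, 4, 2, 1]
flawed = [8, 4.5, 2, 1]   # second-most-significant bit is 0.5 too heavy (exaggerated)

print("input  ideal  flawed")
for x in range(16):
    print(f"{x:5d}  {sar_convert(x, ideal):5d}  {sar_convert(x, flawed):6d}")
```

In this toy model the flawed converter is correct whenever the bad bit is not needed and reads one count low whenever it is. That gives a response with jumps in it rather than a smooth bend, but it is a nonlinearity all the same.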
 
Let me repeat that all these errors can apply to perfectly precise instruments. They are about accuracy problems, not precision problems.

Spine Finding

I have long objected to bearing-based spine finders because, unlike FLO-based systems, they find not the direction of the spine but an unpredictable mix of spine and residual bend. (These are often referred to as "feel finders".) More and more people are coming around to my point of view. But...

I recently corresponded with a clubmaker who said he agrees that FLO is the right way to find spine. Then he wrote,
"I have built a spine finder like JB's spine tool. But I have added an extra feature to it. The 3rd bearing, at the tip, is attached to a small scale (like a small fishing scale) which in turn is attached to a plate that is glued to my workbench. That way I can chart and mark every spine I find down to the gram. The tool works fine as a feelfinder but I believe the scale adds a little bit more measurability and science to it."

"Added measurability and science" implies an increase in accuracy. But is it? The accuracy problem with the feel-finder is the direction that it finds, which is only occasionally the true spine. His instrument will probably find the same direction every time with the same shaft -- making it precise. But, if that precise direction is not that of the spine or NBP, then the instrument has an accuracy problem.

Adding a scale does not improve the direction one iota. The instrument still has an error. It's the same error as before. But now we know some data about this wrong result to within a gram. Is someone prepared to argue that is actually helping?

I might add that there are quite a few instruments around that advertise this same [erroneous] spine-finding feature. This includes the Auditor, the FlexMaster, and the NeuFinder[2]. In each case, they glorify a feel-finder with a meter which gives the impression of greater accuracy. In fact, it is attributing useless precision to a highly inaccurate measurement.

Real-world Example

Before closing, let me cite an example from the real world -- my bathroom -- that illustrates perfectly the difference between resolution, precision, and accuracy. I have a digital bathroom scale on which I weigh myself every morning. It reads to a half pound. Recently I saw the identical scale at a yard sale. The price was low enough that I bought it to compare the two scales, if for no other reason. There were identical calibration certificates on both machines; here is what they looked like:
*** CALIBRATION CERTIFICATE ***

Linearity: PASSED
Hysteresis: PASSED
Resolution: 0.5 lbs.
Quality Test: PASSED
Model: MS --- 7
Does this mean that the scale is accurate to 0.5 pounds? Let's look at this for the three measurement qualities:
  • Resolution: This says that the resolution is 0.5 pounds. My experience confirms this. Not only is the display capable of it, but I have seen readings only 0.5 pounds apart.
  • Precision: In my experience the readings are repeatable to the 0.5-pound resolution. That either means the precision is 0.5 pounds or there is a "cheater circuit" that makes my test look good. I tested for a cheater circuit, and found that there probably is one. (Note: a cheater circuit works by imposing memory -- "hysteresis" -- on the readings. The fact that the scale passed the hysteresis test might suggest that they test to see if the cheater circuit is working.)
  • Accuracy: The certificate says nothing about accuracy. (Well, it does include linearity, which by now we know is a component of accuracy.) But I put the scales side by side and weighed the same standard load -- me -- on both. The two scales differed by 2.5 pounds.
So:
  • The resolution is undoubtedly 0.5 pounds as advertised and tested.
  • The precision is hard to determine in the presence of the cheater circuit, but seems to be no worse than 1.0 pounds.
  • The accuracy is 2.5 pounds at best. (Another scale of the same model may be off by even more.)

Conclusion

As an instrumentation engineer going back many years, I am very aware of the distinction among resolution, precision, and accuracy. I react -- perhaps over-react[3] -- to statements that reflect disregard for this distinction. I have written this article so I have something to refer people to when I see such statements.


Notes:

  1. I Googled to see if my idea of accuracy vs precision is the general consensus. Not only was there general agreement on the concept, but 3 of the first 5 sites I checked used the same example I do to demonstrate it -- the scatter of shots at a target. At first I was taken aback; I thought and even hoped my analogy was original. But then, in the spirit of "great minds think alike", I realized I must have it right if everybody uses the same example.
  2. This applies to the NeuFinder 2, and even the NeuFinder 4 if you use it naively. The NeuFinder 4 also supports a differential deflection mode, which gives accurate spine finding at the expense of considerable extra work.
  3. OK, let me trot out an anecdote from my past -- mostly to demonstrate how long this issue has concerned me. When I took my PhD qualifying exam in 1969, the format was a written exam on a Monday, and later in the week an oral grilling from five professors -- who had seen what I wrote on Monday. Obviously, the first thing they were going to quiz me about was the areas where my written answers were shaky.
        One of the examining professors (I knew who my inquisitors would be well in advance) had a thing about the difference between precision and accuracy. I didn't quite agree with him on the distinction. I had sufficient confidence that I knew the difference -- I had been designing instruments for a decade at that point -- that I took a risk. When I saw that question on the written exam, I deliberately wrote an answer that challenged his point of view. I had therefore set the agenda for the half hour that he had to question me in the orals. He did indeed rise to the bait. We argued the point for fifteen minutes, half his allotted time, after which he agreed that I was probably right. He strongly recommended passing me on the test.


Last modified 10/3/2009