I recently taught a class on testing fundamentals. In it I commented that, by my estimate, there are over 8,000 lines of code in MATLAB dedicated to the simple (or at least simple-seeming) test
A == B
Why? Why is testing for equality so hard? Let’s break it down.
- Data types and numerical precision: depending on the selected data type, the resolution needed to determine “equal” may not be present. You can end up with both false positives and false negatives.
- Tolerances: you can take the data type into account by adding tolerances to the comparison.
  - Absolute tolerance: abs(A-B) < tol
  - Relative tolerance: abs(A-B) < per_tol * abs(A)
  - But what about zero: a percentage tolerance is good, but what do you do when the value is zero? The right-hand side collapses to zero and nothing passes. One option is to scale by the whole signal rather than by the individual sample.
    - Relative tolerance (mean): abs(A-B) < per_tol * mean(abs(A))
    - Relative tolerance (max): abs(A-B) < per_tol * max(abs(A))
    - Relative tolerance (moving average): abs(A(i)-B(i)) < per_tol * mean(abs(A(i-N:i+N)))
- What about noise: for measured data, how do you handle the “junk data”?
- What about missing data: much like junk data, what do you do with missing data points?
- What about data shifts (temporal or other): it is fairly common for comparisons to take place on “shifted” data, where one signal is offset from the other by some fixed amount in time.
- What about non-standard data formats: how do you handle the comparison of a structure of data? Do all elements in the structure have to match to “pass”? Do you apply the same standard of tolerances to all elements?
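To make the precision and tolerance bullets concrete, here is a minimal sketch. It is written in Python purely for illustration, not in MATLAB, and the helper names (`abs_close`, `rel_close`, `rel_close_scaled`) and default tolerances are assumptions made up for this example. It shows exact equality failing on a rounding error, a pure relative tolerance failing when the reference value is zero, and a signal-scaled tolerance handling that case.

```python
# Illustrative helpers only; names and default tolerances are assumptions.

def abs_close(a, b, tol=1e-9):
    """Absolute tolerance: abs(A-B) < tol."""
    return abs(a - b) < tol

def rel_close(a, b, per_tol=1e-6):
    """Relative tolerance: abs(A-B) < per_tol * abs(A)."""
    return abs(a - b) < per_tol * abs(a)

def rel_close_scaled(a_signal, b_signal, per_tol=1e-6):
    """Relative tolerance scaled by the largest magnitude in the reference.

    Assumes both signals have the same length.
    """
    scale = max(abs(a) for a in a_signal)
    return all(abs(a - b) < per_tol * scale
               for a, b in zip(a_signal, b_signal))

# Exact equality trips over rounding error in double precision.
exact_equal = (0.1 + 0.2 == 0.3)             # False
tolerant_equal = abs_close(0.1 + 0.2, 0.3)   # True

# A pure relative tolerance cannot pass when the reference is zero,
# because the right-hand side of the comparison is 0.
zero_case = rel_close(0.0, 1e-12)            # False

# Scaling by the signal's maximum magnitude lets the zero sample pass.
reference = [0.0, 1.0, 2.0]
measured = [1e-12, 1.0, 2.0]
scaled_case = rel_close_scaled(reference, measured)  # True
```

Even this small sketch leaves noise, missing samples, time shifts, and structured data unaddressed, which is where the line count starts to grow.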
You can quickly see where my estimate of 8K lines of code comes from. Why then do I mention this? Two reasons:
- Start thinking about the complexity in “simple” tests
- Stop creating test operations when they already exist
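As one illustration of the second point (again in Python rather than MATLAB), the standard library already ships `math.isclose`, which combines a relative and an absolute tolerance. The tolerance values below are assumptions chosen for the example, not recommendations.

```python
import math

# Relative tolerance alone works fine away from zero.
near_one = math.isclose(1.0, 1.0 + 1e-12, rel_tol=1e-9)        # True

# Relative tolerance alone fails at zero, because abs_tol defaults to 0.0.
near_zero_rel = math.isclose(0.0, 1e-12, rel_tol=1e-9)          # False

# Adding an absolute tolerance covers the zero case.
near_zero_abs = math.isclose(0.0, 1e-12,
                             rel_tol=1e-9, abs_tol=1e-9)        # True

# Missing data represented as NaN is never "close", even to itself.
missing = math.isclose(float("nan"), float("nan"))              # False
```

The zero and NaN cases are already decided and documented by the existing function, which is exactly the kind of work you do not want to redo by hand.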
This is written in the context of testing. Any sort of algorithmic or logical code will, of course, use comparison operations. For those cases keep two simple rules in mind:
- Do not use floating-point equality operations; compare against a tolerance instead.
- Take into account the “else” conditions: decide explicitly what happens when the comparison does not hold.
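The two rules above can be sketched together as follows. This is a hypothetical example in Python: the function `classify`, its return labels, and the default tolerance are all assumptions made for illustration.

```python
import math

def classify(measured, target, abs_tol=1e-6):
    """Return 'on_target', 'below', or 'above'; raise on non-finite input."""
    if not (math.isfinite(measured) and math.isfinite(target)):
        # The "else" nobody planned for: NaN or Inf on either input.
        raise ValueError("measured and target must be finite")
    # Rule 1: tolerance-based comparison instead of measured == target.
    if math.isclose(measured, target, rel_tol=0.0, abs_tol=abs_tol):
        return "on_target"
    # Rule 2: every remaining case is handled explicitly.
    elif measured < target:
        return "below"
    else:
        return "above"

result_equal = classify(0.1 + 0.2, 0.3)   # "on_target"
result_below = classify(0.29, 0.3)        # "below"
result_above = classify(0.31, 0.3)        # "above"
```

With a plain `==`, the first call would fall through to the “above” branch because of a rounding error on the order of 1e-17.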