If you are chopping up vegetables for a salad (carrots, peppers, onions…), the order in which you chop and add these to the bowl does not matter. On the other hand, if you are sautéing those same vegetables, the order in which they are added to the pan matters a great deal.(1)
X = A + B
X = B + A
are the same;(2) however, if those were lines of code, a basic differencing tool would flag this as a change. The same problem holds true when differencing a model.
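As a rough illustration of that point, the sketch below runs the two lines through Python's standard `difflib` and then evaluates both right-hand sides. The `evaluate` helper and the values chosen for `A` and `B` are assumptions made for this example only, not part of any differencing tool.

```python
import difflib

old = ["X = A + B"]
new = ["X = B + A"]

# A line-based diff compares characters, so the swapped operands show up as a change.
text_diff = list(difflib.unified_diff(old, new, lineterm=""))


def evaluate(line, env):
    """Evaluate the right-hand side of a hypothetical 'X = <expr>' line.

    eval() is used only to keep the illustration short; builtins are disabled.
    """
    _, expr = line.split("=", 1)
    return eval(expr, {"__builtins__": {}}, dict(env))


env = {"A": 2, "B": 3}  # arbitrary example values
same_value = evaluate(old[0], env) == evaluate(new[0], env)

print("text diff reports a change:", bool(text_diff))
print("both lines compute the same value:", same_value)
```

The text comparison reports a difference even though both lines compute the same result, which is the salad case: a structural change with no functional effect.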
When creating models, the transformations (throwing into a bowl or into a hot pan) applied to the data (your vegetables) matter when determining whether two things are equivalent.
Why automated differencing fails
Automated differencing detects structural changes; in some instances the change is significant,(3) but in others, as in our salad example, it is trivial. Differencing tools lack the ability to determine context. So when and why should we perform differencing?
Differencing to debug
Debugging in 4 steps
- Detect an issue: Ideally the issue is caught through the use of regression testing.
- Determine if it is a bug: As you evolve your algorithm, outputs may change. Determine if the change represents a deviation from your requirements. (5)
- Determine the portion of the model responsible (tracking): In general this means tracking down where the variable in question is calculated and then backtracking to the change.
- Implement the patch: Once you have determined the source of the problem, implement the solution. (6)
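The first step, detecting an issue through regression testing, can be sketched as a comparison of current simulation outputs against a stored baseline. This is a minimal sketch under assumptions: the signal names, the sample values, and the tolerances are all hypothetical, and `compare_to_baseline` is a helper written for this example rather than a function from any testing framework.

```python
import math


def compare_to_baseline(baseline, current, rel_tol=1e-6, abs_tol=1e-9):
    """Return the names of signals whose values deviate from the baseline."""
    changed = []
    for name, expected in baseline.items():
        actual = current.get(name)
        # A missing signal or a different number of samples counts as a change.
        if actual is None or len(actual) != len(expected):
            changed.append(name)
            continue
        within_tolerance = all(
            math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
            for a, e in zip(actual, expected)
        )
        if not within_tolerance:
            changed.append(name)
    return changed


# Hypothetical logged outputs from two versions of a model.
baseline = {"speed": [0.0, 1.0, 2.0], "torque": [5.0, 5.0, 5.0]}
current = {"speed": [0.0, 1.0, 2.0], "torque": [5.0, 5.2, 5.0]}

changed = compare_to_baseline(baseline, current)
print("signals that changed:", changed)
```

A non-empty result only says that an output changed. Whether that change is a bug still has to be judged against the requirements, and the list of changed signals gives a starting point for the tracking step.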
I use differencing as a tool when I am debugging a problem; this is the tracking stage of debugging. For complex models this is a time saver in determining what the possible changes are that created the bug.
While there are many types of filters, they all perform a separation function: letting what you want through and catching the rest. When setting up your differencing tool, turn on the options that ignore formatting changes.
For a text differencing tool, formatting changes include things like spaces, tabs, and line breaks. For a graphical differencing tool, they include things like block positions or names. In other words, you are filtering out non-transformative changes.
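For the text case, the filtering idea can be sketched with Python's standard `difflib`. The `normalize` and `filtered_diff` helpers are assumptions written for this example. They collapse runs of spaces and tabs and drop blank lines before comparing; they do not handle a statement that has been wrapped across several lines, and they do not treat `X=A` and `X = A` as equal.

```python
import difflib


def normalize(text):
    """Collapse whitespace runs and drop blank lines so only content remains."""
    lines = (" ".join(line.split()) for line in text.splitlines())
    return [line for line in lines if line]


def filtered_diff(old, new):
    """Return only added or removed lines after formatting has been filtered out."""
    diff = difflib.ndiff(normalize(old), normalize(new))
    return [entry for entry in diff if entry[:2] in ("- ", "+ ")]


original = "X = A + B\nY = X * 2\n"
reformatted = "X  =  A + B\n\n\tY = X * 2\n"  # spacing, tab, and blank line only
modified = "X = A + B\nY = X * 3\n"           # the gain actually changed

print(filtered_diff(original, reformatted))
print(filtered_diff(original, modified))
```

The reformatted text produces no differences, while the modified text still reports the changed line, so the filter removes noise without hiding a change that alters the result.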
Final thoughts: differencing for reviewing?
In general, I do not find model differencing a useful tool during model reviews. It distracts the reviewers from understanding what changed functionally by having them focus on what changed structurally. Unless you are performing an architectural review, I recommend reviewing functional simulation results rather than the model diagram.(7)
1. Carrots or onions first? It depends on what you want to achieve. If you are going for caramelized onions, then those go in first, but if you want the carrots to caramelize, then they need to go in first. Either way, the peppers will go in near the end.
2. This is the commutative property.
3. In this example the second image has a flower in it. This is significant as it gives the bees a source of food. This allows them to produce honey, which in turn leads to the bears bee-ing (4) happier (hence the two holding hands).
4. Of course, the bees may not be happier with the bears taking their honey.
5. There are generally two types of tests: requirements-based and baseline (regression). In requirements-based tests, a failure should always indicate a problem (unless you wrote the test with tolerances that were too tight). With baseline tests, it is possible that a change in the output does not impact the requirements.
6. Note: if the change was made by someone other than yourself (e.g. a developer of another module), consult with them to understand the reason for the change. It may be that the requirements have shifted.
7. Google Image search returns some wonderful and sometimes odd results. In this case the search term was “structural versus functional changes.”