First a definition:
A software bug is an error, flaw, failure or fault
in a computer program or system that causes it to
produce an incorrect or unexpected result,
or to behave in unintended ways.
This is in contrast to incomplete development where the program is not yet performing the intended function.
There are three types of bugs:
- Change induced: these are bugs that arise when part or all of the program is changed. They can often be resolved by comparing against the earlier version of the program.
- Corner case bugs: these bugs are due to missed behavioral cases; for instance not accounting for overflow in a button counter.
- Incorrect library/function usage: these bugs arise from the incorrect use of library functions; for instance, passing a double to a function that expects an integer.
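To make the corner case concrete, here is a minimal sketch of the button counter example. The 8-bit width and the function names are my assumptions for illustration; the point is only that an unchecked counter wraps silently while a saturating one handles the missed case.

```python
# Hypothetical 8-bit button counter; the width and names are assumptions.
UINT8_MAX = 255


def count_press_wrapping(count: int) -> int:
    """Increment like an unchecked uint8: 255 + 1 silently wraps to 0."""
    return (count + 1) & 0xFF


def count_press_saturating(count: int) -> int:
    """Increment but hold at the maximum, so the corner case is handled."""
    return min(count + 1, UINT8_MAX)
```

Whether wrapping or saturating is the right behavior depends on the requirement; the bug is not having decided.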
DIF: Detect, Isolate, Fix
In debugging, the first step is to trace the issue to its root. In Simulink this is normally a subsystem; in Stateflow, a set of state transitions. In either case, the issue could be due to changes in parameterization, so…
- Review/compare parameter data: inspect the parameter data that specifies the behavior of the system. Try reverting to earlier versions of the data.
- Introduce data logging: the simplest level of debugging is the introduction of intermediary data logging points. If this is a change induced bug this is often enough to determine the problem.
- Simulate by unit: where possible decompose the full model into components and simulate them in isolation to determine the subsystem behavior.
- Introduce breakpoints: both Simulink and Stateflow allow for the introduction of breakpoints. Conditional breakpoints, where the simulation halts for a given configuration, add additional debugging power.
- Use formal methods: use of formal method tools such as Simulink Design Verifier to detect dead logic and overflows/underflows can automatically determine the location of some bugs.
- Second eyes: Bring another person in to talk about your model, what you expect and what it is doing.
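As a rough illustration of the “review/compare parameter data” step, the sketch below compares two versions of a parameter set and reports what changed. It assumes the parameters have already been loaded into plain dictionaries of numeric values; the function name and the tolerance are hypothetical and this is not a Simulink API.

```python
import math


def diff_parameters(old: dict, new: dict, rel_tol: float = 1e-9) -> dict:
    """Return {name: (old_value, new_value)} for added, removed, or changed parameters.

    A value of None in the result means the parameter is missing on that side.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a is None or b is None:
            changes[name] = (a, b)  # parameter added or removed
        elif not math.isclose(a, b, rel_tol=rel_tol):
            changes[name] = (a, b)  # parameter value changed
    return changes
```

For a change induced bug, a short list of changed parameters is often a faster starting point than stepping through the model.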
Common “bugs” and simple fixes
The following are common development bugs:
- Misaligned ports: verify that the inputs to a Model or Library correctly map to the calling subsystem. This issue arises when the referenced model/library is changed.
- Never reached: dead code due to logic that is never activated. This is found using the coverage report or through SLDV.
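- NaN: NaN, or not a number, occurs when an operation has no defined result, most commonly 0/0 (a nonzero value divided by zero gives Inf instead). To detect this, enable the Simulink diagnostic for this condition, for example the “Inf or NaN block output” diagnostic.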
- Interpolation data and tables: by default, lookup table blocks in Simulink will extrapolate outside of the specified range. This can cause problems if:
- The data is of integer type and the result is a float
- The data is not valid outside of the specified range
- Saturation/limiters: frequently people put limit blocks into systems during development. These blocks can “prevent” issues but also introduce errors. Inspect the data going into and out of limit blocks (and the limits on integrators).
- Synchronization: in some instances, the behavior depends on synchronized data. If the signals are out of alignment, due to either the introduction of unit delays or the sample rates of the system, the behavior will be wrong. Look for cases where transitions are dependent on the status of two highly transient variables at the same time.
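The lookup table issue is easy to show outside of Simulink. Below is a minimal sketch of a 1-D linear table lookup with and without extrapolation; the function and the table values are made up for illustration and this is not how the Simulink block is implemented.

```python
from bisect import bisect_right


def lookup(x, xs, ys, extrapolate=True):
    """1-D linear table lookup; xs must be strictly increasing."""
    if not extrapolate:
        x = min(max(x, xs[0]), xs[-1])  # clip the input to the table range
    # Pick the segment; the end segments are reused outside the range.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (x - xs[i])


# Hypothetical table: sensor volts -> throttle position in percent.
VOLTS = [0.5, 2.5, 4.5]
PERCENT = [0.0, 50.0, 100.0]
```

With this table, an input of 5.0 volts extrapolates to a throttle position above 100 percent, which is exactly the “data is not valid outside of the specified range” case; with clipping the result stays at the last table value.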
I would love to hear about your common bugs and debugging techniques.
I recently had a conversation with a client about how they instantiated constants for their model. Their approach was to group common parameters together into a structure. What I pictured was something like this
In this instance, we have a small structure, one layer deep with 4 elements. This would be enough information to perform the calculations required to transform a throttle sensor voltage into a throttle position.
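A structure of that size could be sketched as follows. The four field names and their default values are my own placeholders, not the client’s data; the point is only that one flat group of four parameters is enough for the voltage to position calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThrottleParams:
    """Hypothetical one-layer parameter structure with 4 elements."""
    volt_min: float = 0.5   # sensor voltage at closed throttle (assumed)
    volt_max: float = 4.5   # sensor voltage at wide open throttle (assumed)
    pos_min: float = 0.0    # throttle position in percent at volt_min
    pos_max: float = 100.0  # throttle position in percent at volt_max


def throttle_position(volts: float, p: ThrottleParams = ThrottleParams()) -> float:
    """Map a sensor voltage onto a throttle position, limited to the valid range."""
    v = min(max(volts, p.volt_min), p.volt_max)
    fraction = (v - p.volt_min) / (p.volt_max - p.volt_min)
    return p.pos_min + fraction * (p.pos_max - p.pos_min)
```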
However, what they showed me was something quite different. In their instance, they had a hierarchical structure that was, in some places, 7 layers deep. The single structure contained not only all the parameters required for a single component but for multiple models.
This isn’t the first time I have seen a structure like this; in general, they grow organically. As multiple people work on a project they want an “easy” way to share data and, at first, when it is small, the method works well. However, as the structure grows in size several problems start to emerge.
- Where is my data: the first problem with large data structures is finding the data. Even the most logically organized structure becomes hard to navigate once it is several layers deep.
- Waste of space: deep structures inevitably end up with unused data.
- Repeated data: Going along with the “where is my data” problem is the “repeated data” problem. People will add the same data to multiple locations.
- Customization: with a large data structure you have to configure the whole data structure as one.
The argument against flattening structures was “If we break them up we will have thousands of parameters”. While that was factually correct, it missed the fact that they already had thousands of parameters, just in a deep structure. The advantages of the flat format are
- Ability to easily find parameters
- Ability to customize on a parameter by parameter basis
- Only used parameters are in the generated code
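To show what flattening buys you, here is a small sketch that turns a nested structure into flat, dotted parameter names and then reports leaf names that appear in more than one place. It assumes the structure has been loaded as nested dictionaries; the function names are hypothetical.

```python
from collections import defaultdict


def flatten(struct: dict, prefix: str = "") -> dict:
    """Flatten nested dictionaries into {"a.b.c": value} form."""
    flat = {}
    for key, value in struct.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def repeated_leaf_names(flat: dict) -> dict:
    """Group flattened paths by their last name and keep only the repeats."""
    by_leaf = defaultdict(list)
    for path in flat:
        by_leaf[path.rsplit(".", 1)[-1]].append(path)
    return {leaf: paths for leaf, paths in by_leaf.items() if len(paths) > 1}
```

A repeated leaf name is only a candidate for “repeated data”; someone still has to confirm that the two entries mean the same thing.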
There are some disadvantages, related to how the parameters are stored in files; a single structure can be stored easily in a single file. With multiple parameters, a storage format needs to be determined. Standard approaches include
- Use of MATLAB files
- Use of .mat files
- Use of Simulink Data Dictionary
- Use of an external database
Any of these approaches can be used to organize the data.
For those of you reading in the distant future, i.e. more than a month from now, let me set the stage. It is July 2018 and, as an American would say, World Cup Soccer is in full swing. Now if I were anyplace else in the world it would just be “The World Cup”. However, with either name you know what I am talking about; this is what is called a “one-to-one” mapping.
In natural (i.e. spoken) languages, these “one-to-one” mappings (or near mappings) are common and can often be understood through the context of the conversation. However, in software these mappings are problematic.
A rose by any other name… may still have thorns
Multiple names come into existence for multiple reasons.
- Multiple developers: This is the most common reason. Multiple developers working in separate models/files each “need” to use a variable, so each creates a name for it.
- Units: For physical quantities, such as vehicle speed, it is common to see VehSpeedMPH and VehSpeedKPH. While this may seem like a good idea, since the units are known, this is perhaps the most problematic duplicate, as you will often see the two instances out of sync with each other.
- Reusable functions: Here the same function is used in multiple places. The key is to have meaningful generic names inside the function.
- Common usage: For a subset of cases the reuse of names should be encouraged. This is the scoped data with common usage. For example “cnt” can be used for a counter in multiple models/functions.
The reconciliation project
First, determine if it is worth doing. While consistency is important, code that is fully developed and validated may not warrant the work required to update the code. Once the decision has been made to update the code/model base the following steps should be taken.
- Identification: find where the same data concept is used with different names.
- Select a common name: Use of a descriptive name is important in these cases to smooth the transition process.
- Check for “unit” issues: If the different names existed due to desired units, validate that the change to the common units is accounted for downstream.
- Update documentation/test cases: (You do have those, right?) Documentation and test cases will often reference the “old” names. Update them to reflect the “new” name.
- Validate the units: After the updates have been performed the full regression test suite should be run to validate the behavior of the models.
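As a starting point for the identification and unit check steps, the sketch below groups names that differ only by a unit suffix and provides a single conversion so that only one value needs to be stored. The suffix list and function names are assumptions for illustration; a real project would need its own naming rules.

```python
from collections import defaultdict

# Assumed unit suffixes; extend to match your own naming convention.
UNIT_SUFFIXES = ("MPH", "KPH")

KPH_PER_MPH = 1.609344  # one mile is exactly 1.609344 km


def group_by_concept(names):
    """Group names that share a base once a known unit suffix is removed."""
    groups = defaultdict(list)
    for name in names:
        base = name
        for suffix in UNIT_SUFFIXES:
            if name.endswith(suffix):
                base = name[: -len(suffix)]
                break
        groups[base].append(name)
    # Only bases with more than one name are reconciliation candidates.
    return {base: found for base, found in groups.items() if len(found) > 1}


def mph_to_kph(mph: float) -> float:
    """Derive the second unit from the stored one instead of storing both."""
    return mph * KPH_PER_MPH
```

Deriving one unit from the other on demand is what keeps the two values from drifting out of sync.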
As the past coordinator for the MAAB Style Guidelines, I have spent a fruitful number of hours thinking about guidelines for the Model-Based Design and Safety Critical environments. In a recent discussion, I was challenged by the question “Why have a guideline that you cannot check”?
Now any guideline can be validated through a manual review process. In this instance, the query was specifically asking about automatic validation. Further, they were working in an environment where the majority of users were new to the Model-Based Design environment. So here is my, evolved, answer.
Why can’t it be enforced?
Some guidelines cannot be enforced because they depend on human judgment; things like “meaningful names” or “readable diagrams” are, by their very nature, subjective. (Though I have an idea for a neural network solution for the meaningful names issue.) Since such a guideline can’t be enforced, is it worth throwing out? Generally no; it becomes a “best practice”, and perhaps a subset of the rule could be enforced (e.g. limit the number of blocks per level of the model, minimum name lengths…).
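The enforceable subset can be very simple. The sketch below checks the two examples just mentioned against a model that is assumed to have been exported as a dictionary of level name to block names; the data layout, the limits, and the function name are all hypothetical, and this is not a Model Advisor check.

```python
def check_model(levels: dict, min_name_len: int = 3, max_blocks: int = 25) -> list:
    """Return a list of findings for the automatically checkable rules."""
    findings = []
    for level, blocks in levels.items():
        if len(blocks) > max_blocks:
            findings.append(
                f"{level}: {len(blocks)} blocks exceeds limit of {max_blocks}"
            )
        for name in blocks:
            if len(name) < min_name_len:
                findings.append(
                    f"{level}: name '{name}' is shorter than {min_name_len} characters"
                )
    return findings
```

A name that passes the length check can still be meaningless, which is why the subjective part stays with the reviewer.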
What do you do with the non-enforceable?
As a general best practice when guidelines are rolled out there should be an education seminar to explain the guidelines and their purpose. Special emphasis needs to be placed on those guidelines that cannot be automatically enforced. Explain the
- Rationale: why the guideline benefits the end user
- Effort: how hard will it be for the user to follow
- Benefit: what does the end user and the team get out of following the guideline
In the end, these guidelines should be thought of as recommendations. Some, but not all, violations will be caught during reviews with co-workers and by test engineers. It should be expected that they will not always be followed, but if you never provide the guidance they can never be followed. If you keep them to a minimum, say no more than 6 to 10 highly impactful guidelines, then eventually people will follow them without thinking.
The term “glue code” is a colloquial term for “a thin layer of software that connects software”. In general, it is used to connect two software components that were not designed to interact with each other. This is a common problem and, when sensibly done, glue code is a fine approach. However, it is possible to develop “super glue” solutions which are, in the end, fragile and difficult to maintain.
The following are examples of standard “glue code” functions. Correctly implemented they are a thin layer between modules
- Data format translations: repackaging data between two different formats, such as an XML to CSV translation function.
- Communication port: adding a data transmission function between two pieces of code.
- Registry/read functions: these are functions that map data from one source (such as requirements) onto an object (such as a model)
- Error catching: these functions, generally, work in concert with other glue code ensuring that the data exchanged between the modules is correctly formatted.
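For a sense of how thin this layer should be, here is a minimal sketch of the XML to CSV translation using only the Python standard library. The record layout (child elements of the root, one sub-element per field) and the function name are my assumptions for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text: str, fields: list) -> str:
    """Translate each child of the XML root into one CSV row.

    Missing fields are written as empty cells rather than raising an error.
    """
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # header row
    for record in root:
        writer.writerow([record.findtext(field, default="") for field in fields])
    return out.getvalue()
```

Note that the function does one thing, uses one language, and works through the libraries’ own APIs rather than through intermediate files.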
How do I tell if I have crazy glue? There are 4 basic warning signs
- Use of multiple languages: connecting two software components written in different languages is a common task. However, if you find your glue code uses more than one language, chances are you are doing something too convoluted.
- Use of files for the interface: ideally glue code is written by leveraging existing APIs in the software components. If the function interface is through a file and a “polling function”, the data exchange will be difficult to maintain.
- The growth of “special cases”: when the number of special cases the glue code has to handle gets above 10 to 15, chances are the data exchange format is not well defined.
- Size: there is no hard and fast rule for how large glue code can be, however at some point in time it stops being glue and becomes its own software component.
What to do with crazy glue?
In the ideal world, the glue code would be replaced with improved APIs in the source software components. Often this is not possible due to ownership issues. When this is the case, basic software best practices come into play
- Break the glue into components
- Look for ways for one component to encapsulate the other
- Refactor the code to remove special cases
- Consider using different software components