Best practices for model cleanup

In this blog I have written a lot about “mushroom” and “spaghetti” code; today I’m going to write about best practices for updating and cleaning up those models.

Should I update?

Before you start, you should ask yourself three questions:

  1. Beyond cleanup, are there additional modifications needed to the model? (No)
  2. Is the model, as written, performing its intended function? (Yes)
  3. Do I have test cases that cover the full range of behavior of the model? (Yes)

If you answered as indicated (no, yes, yes), then stop. Spend time on another part of your code that does not meet those criteria (1). Otherwise, let’s start…

Baselining the model

The first step in cleaning up code is baselining the model. This activity consists of four steps:

  1. Back up the model’s current state: ideally this is already handled by your version control software, but…
  2. Generate baseline test vectors: to the degree possible, create baseline tests; these could be auto-generated.
  3. Generate baseline metrics: generate the baseline metrics for the model: RAM/ROM usage, execution time, model coverage…
  4. Create the “difference harness”: the difference harness compares the original model to the updated model by passing in the baseline test vectors and comparing the outputs.
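The difference-harness step is easiest to see in code. Here is a minimal, tool-agnostic sketch in Python; the callables `run_original` and `run_updated` are placeholders for however you actually simulate the two versions of the model (in Simulink this would typically be a test harness driving both models):

```python
def difference_harness(run_original, run_updated, test_vectors, tol=1e-9):
    """Compare two versions of a model over the baseline test vectors.

    run_original / run_updated: callables mapping one input vector to an
    output sequence (placeholders for your simulation entry points).
    Returns a list of (vector index, original output, updated output)
    mismatches; an empty list means the update is behaviorally identical
    within the tolerance.
    """
    mismatches = []
    for i, vector in enumerate(test_vectors):
        out_a = run_original(vector)
        out_b = run_updated(vector)
        for a, b in zip(out_a, out_b):
            if abs(a - b) > tol:
                mismatches.append((i, a, b))
                break  # one mismatch per vector is enough to flag it
    return mismatches
```

A floating-point tolerance rather than exact equality is deliberate: refactoring can legitimately reorder arithmetic, and bit-exact comparison would flag harmless differences.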

What is different about today?

The next question to ask in your refactoring is “do I refactor or do I redo”? Depending on the state of the model, there are times when simply redoing the model from scratch is the better choice. This is often the case when the model was created before requirements existed and, as a result, does not meet them; that would make for a very short article, though, so let us assume that you are refactoring. First, figure out what needs to change and what should change. To do that, ask the following questions.

  • Review the requirements: which parts of the requirements are met, which are incorrect, and which are missing?
    • Prioritize missing and incorrect requirements.
  • Is it possible to decompose the model into sub-components? In most cases the answer is no, or yes but it is tangled. It wouldn’t be mushroom code if you could.
    • Create partitioning to enable step-based modifications.
  • Identify global data and complex routing: minimizing global data should be an objective of the update; complex routing is an indication that the model is not “conceptually” well decomposed.
    • Move sections of the model to minimize signal routing and use of global data.
  • Identify the “problem” portions of the model: which sections of the model most frequently have bugs?
    • Squash them.

Once you have asked these questions, you will understand your priorities in updating the model.

Begin modification

First, understand the intent of the section of the model, either through inspection or through review of the requirements. Once you understand the intent, you can start to simplify and clarify.

  • Simplifying logical statements / state charts
    • Run a tool such as Simulink Design Verifier to check for dead branches; trim or fix them
    • Look for redundant logical checks (multiple transitions all using the same “root” condition check)
    • Look for redundant states (multiple states exist all with the same entry and exit conditions)
  • Mathematical equations
    • Did they create blocks to replicate built-in blocks? (tables, sine, transfer functions)
      • Replace them with built-in blocks
    • Are complex equations being modeled as Simulink blocks?
      • Replace them with a MATLAB function
  • Size (too big or too small)
  • Partitioning rationale
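The redundant-check cleanup above is easiest to illustrate outside the modeling environment. Here is a hypothetical before/after sketch in Python; the state names and the shared “root” condition (`enabled`) are invented for illustration, but the pattern — many transitions all repeating the same guard — is exactly what you look for in a state chart:

```python
def next_state_verbose(state, enabled, request):
    # Before: every transition repeats the same "root" check (enabled)
    if state == "IDLE" and enabled and request == "start":
        return "RUNNING"
    if state == "RUNNING" and enabled and request == "stop":
        return "IDLE"
    if state == "RUNNING" and enabled and request == "pause":
        return "PAUSED"
    return state


def next_state_refactored(state, enabled, request):
    # After: the shared condition is checked once, guarding all transitions
    if not enabled:
        return state
    transitions = {
        ("IDLE", "start"): "RUNNING",
        ("RUNNING", "stop"): "IDLE",
        ("RUNNING", "pause"): "PAUSED",
    }
    return transitions.get((state, request), state)
```

In a state chart the equivalent refactoring is hoisting the common condition into an enclosing state or a guarding transition, so it is expressed (and maintained) in one place.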

Footnotes

  1. With mushroom code it is highly unlikely that you have test cases that cover the full range of behavior of the model; model (or code) coverage should not be confused with full behavioral coverage, since it is possible to auto-generate test cases that cover the full model without ever understanding what that coverage means.
  2. One advantage of having this blog for 3+ years is that I can mine back articles for information. Hopefully you will as well. What I mine is yours: nuggets of MBD wisdom.

Interface control documents and data dictionaries

Interface control documents (ICD) and data dictionaries are two parts of a mature MBD infrastructure. The question I often hear is “what is the boundary between the two artifacts”? First a high-level refresher:

  • The Data Dictionary: an artifact used to share a set of common data definitions external to the model and codebase.
    • Objective: provide common and consistent data definition between developers
  • The ICD: an artifact used to share interface information between components external to the model and codebase; often derived from or part of the requirements document set.
    • Objective: provide a common interface definition to simplify the integration of components when multiple people are working on a project.

An example of an ICD spec:

  Function name:              myIncredibleFunction
  Function prototype:         (double mintGum, single *thought, something *else)
  Call rate:                  event-driven
  Multi-thread interruptible: yes

  Function information:

  Variable  | Type      | Dimension | Pass by
  mintGum   | double    | 1         | value
  thought   | single    | 4         | reference
  something | structure | 10        | reference

A function specification

And here is where the boundary question comes up. In specifying the data type and dimension in the ICD, I am duplicating information that exists in the data dictionary, violating the single-source-of-truth objective.

Duplication can be dangerous

So what is the flow of information here? I would suggest something like this…

  • The ICD document is created as part of the initial requirement specifications
  • The data interface request is used to inform the initial creation of data in the data dictionary
  • Once created, the data is owned by the data dictionary

Infrastructure: making your artifacts work for you

Data dictionaries serve an obvious purpose: they are a repository for your data. Interface control documents, on the other hand, can seem like burdensome overhead, which they will be without proper supporting infrastructure. If you remember the objective of the ICD, to simplify integration, then the need for tool support becomes obvious. When a developer checks in a new component, it should be

  • Checked against its own ICD
  • Checked against the ICD for functions it calls and is called by
  • Its ICD should be checked against the data dictionary to validate the interface definition

With those three checks in place, invalid interfaces will be detected early and integration issues can easily be avoided.
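The three check-in validations above could be scripted in a pre-commit or CI hook. Here is a simplified, tool-agnostic Python sketch; the dictionary-based data layout is an assumption made for illustration, not a MathWorks API:

```python
def validate_component(component, icds, dictionary):
    """Run the three check-in validations on one component.

    component: {"name": str,
                "interface": {var: {"type": ..., "dimension": ...}},
                "calls": [names of functions it calls]}
    icds: {function name: {"arguments": {var: {"type": ..., "dimension": ...}}}}
    dictionary: {var: {"type": ..., "dimension": ...}}
    Returns a list of human-readable errors; an empty list means clean.
    """
    errors = []
    own = icds.get(component["name"])

    # Check 1: the component against its own ICD
    if own is None:
        errors.append(f"no ICD found for {component['name']}")
    elif component["interface"] != own["arguments"]:
        errors.append(f"{component['name']} does not match its ICD")

    # Check 2: ICDs must exist for the functions it calls
    for callee in component["calls"]:
        if callee not in icds:
            errors.append(f"missing ICD for callee {callee}")

    # Check 3: the component's ICD against the data dictionary
    if own is not None:
        for var, spec in own["arguments"].items():
            if dictionary.get(var) != spec:
                errors.append(f"{var} disagrees with the data dictionary")
    return errors
```

Returning a list of errors rather than failing on the first one matters in practice: a developer fixing an integration break wants the full picture in one run, not one complaint per check-in.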

ICDs and the MATLAB / Simulink environment

Recently MathWorks released the System Composer tool. While I have not had a chance to try it out yet, it offers some of the functionality described above. I would be interested to learn of anyone’s experience with the tool.

Demos: Are you making college chili?

When I was in college I would, after swim meets (H2Okies), make up a batch of chili. It had all the right ingredients (1) and got the job done (2), but it was tuned for a very narrow audience. It wasn’t until I started cooking for and with my wife Deborah that I really learned what it means to create a meal (3) for a wide group of people (4).

Michael’s Chili Recipe (College and Adult versions)

College (Demo)

Ingredients:
1 lb ground beef
32 ounces canned kidney beans
32 ounces canned tomatoes
1 onion
garlic powder
pepper
salt
red pepper (a lot of it)
cayenne pepper (a lot of it) (5)
brown sugar (2 tbsp)

Preparation:
Brown the ground beef in a pot.
“Cook” diced onions in the beef fat.
Throw in the rest, cover with water.
Stir it every once in a while.
Add water to keep it from scalding.

Adult (Production)

Ingredients:
1 lb ground beef (96% lean)
1 lb dry kidney beans, sorted and soaked
16 ounces canned tomatoes
3 ~ 4 fresh tomatoes
4 ounces tomato paste
1 red onion
4 stalks celery
Spices (to taste and freshness): fresh garlic, pepper, salt, paprika, cumin

Preparation:
Sauté the onions, garlic, and celery in olive oil.
Brown the beef with the onions, garlic, and celery; drain excess fat.
On low heat add fresh diced tomatoes; let sweat for 5 minutes.
Add in the spices.
Add in the kidney beans and canned tomatoes.
Add water to cover the beans.
Simmer at low heat for 2 hours.

An underspecified recipe

Demo versus production: What is the difference?

There are three differences. First, the dish is no longer dominated by a single note: heat (6). Second, time: the college recipe was great for someone who needed something fast, e.g. throw it in and walk away; the adult version requires an investment for a greater return. Finally, reliability: hidden in the simple phrase “spices: to taste and freshness” is a decade of lessons learned.

Should you (I) make a demo? Or a prototype?

When the winter months roll around and the desire for a good hearty soup returns, I can generalize my knowledge to a new soup. I don’t need a demo because I learned from past experiences. When I am creating a new software item, I first look to see if there is something I can reuse, and whether it is a known domain. If there is, I don’t create a demo. If I need to learn something, or I need to prove to a group that it can be done, then I create a demo.

If the “demo” is something that I think I will be able to use in the future, then it becomes a “prototype.” If I am prototyping, I put more time into the demo’s architecture, the creation of test cases, and the creation of supporting infrastructure. It may not be the final product, but it will be drawn upon.

The false lessons of demos

One last comment on demos: they can teach you false lessons. When you are doing things “fast and dirty” you have problems that you do not have when you follow a mature process. When I was in college my chili was in a constant state of “it could burn”; it could burn because I was using a bad pot and a cheap stove with poor heat control. I haven’t burned a chili in 20+ years.

The same issues can happen with software development. When you are in rapid prototyping mode it is easy to end up with spaghetti code (7). This should be viewed as a failure of the development process, not of software in general.

Footnotes:

  1. It’s hard to mess up beans, but, in all honesty, the ground beef could have been of higher quality.
  2. The job, in this case, being twofold: first, feeding very hungry college-age athletes, and second, burning the roof of your mouth out.
  3. In college the meal was chili and corn bread; BYOB. I know I made salads, but I don’t think I ever served one.
  4. Of course, when it is just Deborah and me, the meal is perfectly tuned to us; which is another sort of perfected meal.
  5. It is a good thing that I did not know about ghost peppers back then. I would have used them, and I would have used way too many of them.
  6. Heat in chili is, now, an aftermarket feature. If you want it hot you can add many different condiments to add heat. I would recommend this compote.
  7. Mushroom code and spaghetti code are similar in that they develop due to a lack of planning. Spaghetti code is characterized by convoluted calling structures; mushroom code is the accumulation of code on top of code.