Ask the right question: Architectural size

Recently a client posed a question I have heard a number of times: “How many states can I have in my model before there are problems?”  On the surface this seems like a reasonable question; however, when we dig in a little we see the assumptions inherent in the question.

Size matters, after a fashion

As a basic metric, the number of states in a model is meaningless; it is akin to asking “how many lines of code before there are problems?”  If someone said they had a program with one function and 100,000 lines of code, you would assume it was problematic in its complexity.  On the other hand, if they said the program had 100 functions, you would think it was well architected.  Going to the other extreme, if the program had 1,000 functions you might think they have created architectural problems of increased complexity.

No one builds a house with a Swiss army knife

Models are tools; they perform functions in response to inputs.  It is possible to build a single model that performs 1,000 different functions, but that is rarely the correct way to go.

Rather, each model should be viewed as a specialized tool that performs a function or set of related functions.  Again this relates to the “100 or 1,000” functions for 100,000 lines of code.  I generally consider something a “related function” if it:

  • Uses the same inputs:  E.g. the function does not need to import additional data
  • Is used at the same time: E.g. the information is used in the same larger problem you are trying to solve.

For example, calculating the wheel speed and wheel torque in the same ABS braking function makes sense as they use the same input data (generally a PWM encoder) and are used at the same time (to determine the brake pulse width).  However, calculating mileage in that function, which can be derived from the wheel speed, does not make sense as it is not part of the same problem you are trying to solve.
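As a purely illustrative sketch of this grouping, the two related calculations below share the same input and are used at the same time.  The function name, encoder resolution, and inertia value are my assumptions, not values from the example above.

    % Two related calculations that share the same input (encoder counts)
    % and are used at the same time.  All parameter values are illustrative.
    function [wheelSpeed, wheelTorque] = wheelSensing(encoderCounts, dt)
        countsPerRev = 48;      % assumed encoder resolution
        wheelInertia = 0.8;     % assumed wheel inertia, kg*m^2
        persistent lastSpeed
        if isempty(lastSpeed), lastSpeed = 0; end

        % Wheel speed (rad/s) from counts accumulated over the sample period
        wheelSpeed = (encoderCounts / countsPerRev) * 2 * pi / dt;

        % Wheel torque estimate from the change in speed (T = J * alpha)
        wheelTorque = wheelInertia * (wheelSpeed - lastSpeed) / dt;
        lastSpeed   = wheelSpeed;
    end

Mileage, by contrast, would live in a separate function fed by the wheel speed output.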

Keeping it in memory…

In this instance, I am talking about the developer’s memory.  Above a given size and complexity it becomes difficult for a developer to remember how all the parts of a function operate.

As a general rule of thumb, I try to stick to a “depth of 3” limit: no subsystems or nested states more than three levels deep.  If there is a need for greater depth, I look for a way to decompose the model or chart into referenced models and charts.  One note: when measuring “depth,” the count stops when a referenced model or chart is encountered, as these are assumed to be atomic systems developed independently from the parent.
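As a rough illustration of checking the rule, the sketch below counts subsystem nesting depth with find_system.  The model name is an assumption, and block names containing a literal “/” would need extra handling.

    % Minimal sketch: flag subsystems nested more than three levels deep.
    mdl = 'myController';            % illustrative model name
    load_system(mdl);

    subs = find_system(mdl, 'FollowLinks', 'off', 'BlockType', 'SubSystem');

    % Depth = number of path separators below the model root
    depth = cellfun(@(p) sum(p == '/'), subs);

    if any(depth > 3)
        disp('Subsystems deeper than three levels:');
        disp(subs(depth > 3));
    end

Referenced models are separate models, so find_system does not descend into them, matching the counting rule above.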

Benefits of decomposition

The following benefits are realized through the decomposition of models:

  1. Simplified testing: large functions have a large number of inputs, outputs, and possible responses.  Smaller models have reduced testing criteria.
  2. Simplified requirements linking: Generally, a well-decomposed model aligns with the requirements by not clumping disparate functionality together.
  3. Improved reusability: Smaller functions are more likely to be generic or easily customizable.
  4. Improved readability: A smaller model can be reviewed and analyzed more quickly than a larger model.

What is the correct question?

There are two questions I would ask:

  1. How do I make the model functionally correct?
  2. How do I make the model readable?

For guidelines on that topic, you can read my Stateflow Best Practices document.

 

 

The Model-Based Design Workflow…

The following is an idealized Model-Based Design workflow, from initial requirements to product release.  The workflow assumes a multi-person team with resources to support multiple roles.

It all starts with requirements…

Ideally, the process starts with foundational tools and processes in place.  These consist of

  • Modeling guidelines:  Covers model architecture, data usage, and model clarity
  • Testing guidelines: How the system will be validated
  • Requirements tracking: A system for tracking the compliance and changes to requirements
  • Bug reporting: A system for tracking issues as they are discovered; this is tightly coupled to the requirements tracking.
  • Support infrastructure: The supporting infrastructure includes tools such as version control and CI systems.
  • Project timeline: The project timeline provides the objectives for the completion of the project and the resource allocation (e.g. people)

wf_p1

From the initial high-level requirements, derived requirements and derived tests are defined.  These documents, along with the modeling and testing guidelines, are used to create the initial system- and component-level models and their associated test cases.  These models and tests are entered into the requirements tracking system and assigned to the appropriate teams.

Project development

During the development phase, both the models and test points are elaborated by adding in details from the requirements and testing documents.  As these features are added to the system, the models are validated against the test cases.

wf_p2

As issues are uncovered, whether ways in which the design is lacking or outright bugs, information is added to the bug and requirements tracking systems.  This workflow goes on in parallel for each developer and model.

 

This development goes on in parallel for both the system under development and any supporting models such as plant and environmental models.

System integration and release

As individual components mature, the rigor of system-level integration and validation is increased.  At this stage of development, testing of the derived objects, e.g. generated code, increases in frequency and rigor.

wf_p3

Wrap up

After the release of the project, the components from the development should be examined for reusability.  Beyond components, the processes and guidelines should be reviewed for updates to the best practices.

wf_p4

Statistical variations in plant models

Plant models, whether based on first-principles physics or on regression models built from real-world data, are a cornerstone of the Model-Based Design controls development process.  During the initial development of the control algorithms a “static” physical model is sufficient; however, when the development moves into the diagnostic and release phases, physical models that demonstrate real-world variation are required.

Variations, not noise…

In a previous blog, I wrote about the importance of noise in testing.  Variations are different from noise in that they are a constant offset.  For instance, my height will always be 6’3″ while my wife Deborah’s will be 5’10”.  If we design an airbag system assuming everyone is 5’10”, then there could be issues when the first 6’3″ person is in the car during an accident.

Working with variations

If we continue the “body variations” example and think of all the variables associated with the body (height, weight, leg length, arm length…), we will observe two things:

  1. There is a correlation between some variables:  In general leg length increases as height increases, as does weight.
  2. There are outliers:  While there are general correlations between properties there are still outliers which cannot be ignored.

So given these two considerations how do we proceed?

Data at the boundaries, data in the center

Test data should be defined that includes both data at the boundaries and in the “center” of the test space.  Data at the boundaries exercises the edge cases, while the data in the center is used to validate the mainline behavior.  When considering which boundary conditions to include, consider the following issues.

  1. For discrete variations:  In instances where the variations are discrete, e.g. on/off or flow/no-flow, all discrete instances should be included
  2. For continuous variations: In the example of height, values at the endpoints should be selected along with a set of points within the range.  (The total number should be a function of what a nominal unit is in the range.  For instance, if we took a height range from 4’10” to 6’6″ and assumed a nominal unit of 1″, then perhaps a spacing of 6″ would be reasonable; a short sketch of this follows the list.)
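As a minimal sketch of this sampling, assuming the same height range and spacing and a single discrete flow/no-flow variation (all values illustrative):

    % Boundary and center test points for one continuous and one discrete variation
    heightMin = 58;   % 4'10" in inches
    heightMax = 78;   % 6'6" in inches
    spacing   = 6;    % coarser than the 1 inch nominal unit

    heights    = unique([heightMin, heightMin:spacing:heightMax, heightMax]);
    flowStates = [0 1];          % discrete variation: no-flow / flow

    % Full combination of the continuous and discrete variations,
    % one row per test case
    [H, F]     = ndgrid(heights, flowStates);
    testPoints = [H(:), F(:)];

Each row of testPoints then drives one simulation run.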

Variations and variations… working with multiple variations

In any real-world system, there are multiple parameters that will vary.  Selecting which combinations of variations (outliers and central points) to test needs to be done in a rigorous fashion.  In an upcoming post, I will cover how Six Sigma-style approaches can be used to determine which points should be selected.

 

ICDs and collaborative verification

Interface Control Documents (ICD) are a method for specifying the functional interface of a component or system.  Used correctly they prevent integration errors and promote formal development practices.

What is in the document?

At a minimum, the ICD consists of the following information

  • Input specification
    • Name
    • Data type
    • Dimension
    • Reference/value
  • Output specification
    • Name
    • Data type
    • Dimension
    • Reference/value
  • Global data used
  • Calling method
    • Periodic or event-driven
    • Reusable / non-reusable

The I/O and global data are generally well understood.  Specification of the calling method is required to understand how time-dependent functions such as integrators or transfer functions will behave.

Additional data may include information such as signal range, update rate, units…  All of this information should be derived from the requirement specification.  (Note: the ICD is sometimes viewed as a derived requirement document)

How is the ICD used?

The ICD provides a baseline for the interface to the software component.  Once the ICD has been defined, the initial model can be created.  This is sometimes called a “shell model.”  The shell model has all of the inputs and outputs defined by the ICD.  The shell model can then be integrated into the system-level model (or another integration model) to lock down the interface.  This integration model provides the first level of testing of the interface; if the interface of the shell model changes, the integration model will break.
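One way to make this concrete is a minimal sketch that builds a shell model from an ICD captured as a MATLAB struct.  The model name, signal names, data types, and struct layout are illustrative assumptions, not a prescribed format.

    % Illustrative ICD captured as a struct array
    icd.inputs  = struct('name', {'WheelSpeed', 'BrakeCmd'}, ...
                         'dataType', {'single', 'boolean'}, ...
                         'dimension', {'1', '1'});
    icd.outputs = struct('name', {'BrakeTorque'}, ...
                         'dataType', {'single'}, ...
                         'dimension', {'1'});

    % Create the shell model with one Inport/Outport per ICD entry
    mdl = 'shellModel';
    new_system(mdl);

    for k = 1:numel(icd.inputs)
        blk = [mdl '/' icd.inputs(k).name];
        add_block('simulink/Sources/In1', blk);
        set_param(blk, 'OutDataTypeStr', icd.inputs(k).dataType, ...
                       'PortDimensions', icd.inputs(k).dimension);
    end

    for k = 1:numel(icd.outputs)
        blk = [mdl '/' icd.outputs(k).name];
        add_block('simulink/Sinks/Out1', blk);
        set_param(blk, 'OutDataTypeStr', icd.outputs(k).dataType, ...
                       'PortDimensions', icd.outputs(k).dimension);
    end

    save_system(mdl);

The same struct can later be used to check that the component’s interface still matches the ICD.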

ICD_WF

Test utility pattern: Simulink Test

As I have written about in previous posts, I recommend the use of reusable test utilities.  When working in the text-based MATLAB environment, how to create reusable utilities is easily understood; they are simply MATLAB functions.  However, within the Simulink Test graphical environment, it may not be as clear.

Libraries and Functions

Fortunately, there is a solution; if there wasn’t, there would be no post today.  Within the Simulink Test environment, calls can be made to functions.  The functions can either return a value (or values) or directly set an assert or verify flag.

testCalls

The functions are imported from a Simulink Library and can be constructed from MATLAB or Simulink Function blocks.
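As a sketch of what such a library function might look like, here is a small, hypothetical tolerance check written as a MATLAB function; the name and the absolute-tolerance approach are my assumptions, not a Simulink Test requirement.

    % Reusable check: returns true when every sample of the signal is
    % within an absolute tolerance of the expected value.
    function pass = checkWithinTolerance(actual, expected, tol)
        pass = all(abs(actual(:) - expected(:)) <= tol);
    end

Placed in a library as a MATLAB Function (or Simulink Function) block, the returned flag can then feed the assert or verify step described above.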

testLibrarySetup

In the case of MATLAB functions, they are placed in a Stateflow chart with the chart’s export functions option selected.

exportFunction

So there you have it, a simple solution to reusable test utilities within the Simulink Test environment.

 

Model-Based Design: Return On Investment (ROI)

One of the rationales for adopting Model-Based Design is an expected Return On Investment (ROI).  This raises three very natural questions:

  1. What is the expected ROI?
  2. What is the timeframe for realizing the ROI?
  3. What is necessary to realize the ROI?

Unpacking the ROI questions

The first thing to recognize is that the ROI will depend on the “level” of adoption of Model-Based Design.  The more Model-Based Design processes that are used, the greater the ROI; however, there is a corresponding delay in the realization of the ROI (see reference 1).

Further, the ROI is dependent on having a defined implementation plan.  A full MBD process includes multiple tools and tasks; without a well-defined implementation plan, the dependencies between these tasks will become muddled.

ROI Timeline

Assuming a well-defined implementation plan, most companies will start to see a return on investment after 9 months to 1 year.  The majority of the ROI is generally realized after 3 years.

 

Hidden or “Negative” ROI

One aspect of Model-Based Design makes measuring ROI difficult: model-based approaches allow for the development of systems that are impossible (or at least extremely difficult) to develop using traditional approaches.  In cases where MBD is used to create systems of high complexity, the measured ROI may be lower than the actual ROI due to the inherent complexity of the system.

Expected ROI

Finally, what is the expected ROI?  From industry examples, ROIs as high as 80% are known to be possible (see reference 2), with ROIs of 30-40% considered common.  Again, these results are dependent on having a good implementation plan.  Hopefully, this blog, or MathWorks, will help you develop that plan.

 


References:

  1. What is the benefit of a model-based design of embedded software systems in the car industry? by Manfred Broy, Technical University of Munich, Germany
  2. Measuring Return on Investment of Model-Based Design, by Joy Lin, MathWorks
  3. Model-Based Design in Practice: A Survey of Outcomes for Engineers and Business Leaders, by Dr. Jerry Krasner, Chief Analyst at Embedded Market Forecasters

 

Why you need noise in your tests

Short answer: the real world is noisy.  If you write tests that assume clean input data you are not exercising the system in a real environment.  So let us talk about noise.

Types of noise and sources of noise

For this article, I will define noise as signal data entering the system from outside the system.  Sources of noise include

  1. Resolution limits: all measuring devices have a limit to their resolution.  A ruler with 1/8th-inch markings cannot accurately measure 1/16th-inch resolution.
  2. External interference: Frequently there are secondary effects that change the measurement.  For example, when measuring a voltage it is common to have noise in the signal from other wires running nearby.  (Which is why for some sensitive measurements shielded cables are used)
  3. Dynamic property: In some instances, the value of the property being measured is changing rapidly; any given measurement may be an outlier.
  4. Human error: For devices that have human operators, well, we make mistakes in how we enter information…

Types of noise, generally, map onto the sources of noise.

  1. Quantization (resolution): Characterized by “jumps” in the value.  Dynamic systems must be tolerant of the jumps.  For static data (e.g. post-run analysis) the jumps can be “smoothed” using filtering functions.
  2. White (external): Characterized by random values around the “actual” signal.  Generally can be filtered using standard transfer functions.
  3. Outlier (dynamic): Characterized by occasional values outside the trending values.  If the “standard” range is known then these outlier values can be ignored.
  4. Systematic (human): Characterized by systems being executed in a non-standard order.  Systems need to be made recoverable from non-standard execution order.

 

Testing the wild-noise

The basic strategy for testing with noise is to “inject noise” into the system under test.  How we inject can again be mapped back to our 4 types of noise; a short sketch of the first three injection methods follows the list.

  1. Floor functions (quantization): Use a floor function to resolve signals to the nearest value of the input’s resolution.
  2. White noise generator (white): White noise generators are common functions.  One important note, if the same “seed” is used for the white noise for all runs then this test has an inherent flaw.
  3. White noise generator (outlier): There is a special case of the white noise generator where signals are more episodic and, generally, of a larger value.   In these cases, a statistical model of the outlier signals is helpful in creating this white noise generator.
  4. Decision tree analysis (human): Creating test cases for human error can be the most difficult.  For state logic, it is possible to analyze the system to determine all possible paths.
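A minimal sketch of the first three injection methods follows; the signal, resolution, noise amplitude, and spike rate are all illustrative assumptions.

    % Noise injection on a clean test signal (all values illustrative)
    t     = (0:0.01:10)';
    clean = sin(2*pi*0.5*t);

    % 1. Quantization: floor the signal to the measurement resolution
    resolution = 0.05;
    quantized  = floor(clean / resolution) * resolution;

    % 2. White noise: reseed per run so every run uses different noise
    rng('shuffle');
    noisy = clean + 0.02 * randn(size(clean));

    % 3. Outliers: occasional larger spikes on top of the white noise
    spikeIdx        = rand(size(clean)) < 0.01;    % ~1% of samples
    spiky           = noisy;
    spiky(spikeIdx) = spiky(spikeIdx) + 0.5 * sign(randn(nnz(spikeIdx), 1));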

In the end, including noise in your tests will result in more robust systems.

Global signal data in Simulink Models

Unlike many of my posts, this one is Simulink-centric and deals with the question of global signal data within Simulink models.  So first, what is “signal data?”  Broadly speaking, within a Simulink model data elements are broken into parameters (fixed) and signals (things that change).  Signals are either calculated or come in from the root level.

signalsAndParameters

Within the model, the signal data is “scoped” to the line it is attached to, or in the case of a Stateflow chart or MATLAB function block, the scope of the chart/function.

The exception

Within Simulink, the exception to the rule is the Data Store.  With Data Store Read and Write blocks, data can be shared in different parts of a model without the use of connecting signal lines.  Further, the data stores can be shared with Stateflow charts and MATLAB functions.

In addition to acting as global data, Data Stores have the unique ability to be written to in multiple locations within a Simulink diagram.  Because of this ability, they must be fully defined with the data type, dimensions, and complexity when they are first created.
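As a minimal sketch of wiring this up programmatically, assuming an error-flag data store with two writers and one reader (the model and block names are illustrative):

    % Create a data store that can be written from multiple locations
    mdl = 'dataStoreDemo';
    new_system(mdl);

    add_block('simulink/Signal Routing/Data Store Memory', [mdl '/ErrorFlagMem']);
    set_param([mdl '/ErrorFlagMem'], 'DataStoreName', 'ErrorFlag');
    % (the data type, dimensions, and complexity would also be set here)

    % Two writers (e.g. two separate fault checks) and one reader
    add_block('simulink/Signal Routing/Data Store Write', [mdl '/WriteFaultA']);
    add_block('simulink/Signal Routing/Data Store Write', [mdl '/WriteFaultB']);
    add_block('simulink/Signal Routing/Data Store Read',  [mdl '/ReadErrorFlag']);

    set_param([mdl '/WriteFaultA'],   'DataStoreName', 'ErrorFlag');
    set_param([mdl '/WriteFaultB'],   'DataStoreName', 'ErrorFlag');
    set_param([mdl '/ReadErrorFlag'], 'DataStoreName', 'ErrorFlag');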

Global data bad……

Global data is easy to work with; it allows you to quickly share information between functions and to reduce interfaces.  At the same time, it makes debugging code more difficult (where was X set?) and reduces reusability of code by expanding the dependencies of a function.  But… there are times when global data is the correct solution.

When to use global data

So, with these downsides, when should global data be used?  As a general rule of thumb, I advocate for 3 uses:

  1. Error/Fault detection:  By their nature error flags can be set by multiple causes.  Because of this, the ability to write to an error flag in multiple locations is a valid rationale.  Additionally, since the error flags may be needed in multiple places in the model (more so than normal data) the ability to pass this without routing is important.
  2. Mode data: A system should respond to mode changes all within the same execution step.  Like error flags, Mode Data is shared across the full scope of a model.
  3. Reset flags: Reset flags are used to reset state behavior of integrators and transfer functions.


Generated code

As a final note, the global property of data in Simulink models should not be confused with the scope of the data in the generated code.  The scope of the data in the generated code (for both parameters and signals) can either be determined automatically by Embedded Coder or controlled through Data Objects.  This will be covered in a future post.

Threading…

At some point in the software development cycle, the question of a single- or multi-threading environment will come up.  With multi-core processors now more common in embedded devices, this is a more frequent issue.  Let’s take a look at some of the trade-offs between single- and multi-threaded environments.  For additional information, I recommend the following links.

Single threaded

It just works: the program runs from start to finish in a set order, and you know when each operation happens relative to everything else.  However, it may be slower than it needs to be if some of the operations could take place in parallel.  If you do not have timing constraints, this is a fine option to take.


Multi-threading

If single threading can be described as “just working,” then multi-threading needs to be characterized in a different fashion.  We will start with a basic understanding of threads.  A thread is the smallest unit of execution that an OS can schedule; threads are either event-based or periodic (temporal).  Threaded operating systems can be either preemptive (a running thread can be interrupted by the scheduler) or non-preemptive.


Packaging your threads

Each thread should exhibit a high degree of independence from other threads, meaning the operations of “Thread A” should have minimal dependence on the data from “Thread B.”  The key word here, of course, is “should.”  In the end the threads will need to exchange data, and that is one of the complications of multi-threaded environments.

Data locking and synchronization

In a multi-threaded environment, a lock (or mutex) is a method for ensuring that a memory resource is not in use by multiple threads at the same time.  E.g. if you have a shared memory space, you do not want two threads writing to it at the same time (or one reading while the other is writing).

Locks provide a way of synchronizing data between threads; however, they slow down the process since a thread cannot continue until the data is unlocked.  In some instances, when the operation of one thread is dependent on the outputs from another, a race condition can occur if the locking and data synchronization are not handled correctly.

Debugging multithreaded environments

Bugs in multithreaded programs generally occur when the actual order of execution does not match the intended order of execution.  This can be due to

  • A thread failing to start
  • A data synchronization failing
  • A thread taking longer than expected and preventing another thread from running


Use of a debugger to “walk through” the code is often required to get to the root cause of the issue.  However, if the bug is due to an overrun issue, then using the debugger may not catch the error because in debugging mode you are not subject to the timing limitations.  In this case, either a trace log or even an oscilloscope can be employed.

For more information on debugging multithreaded environments, I suggest these links