One of the key metrics for test suite development is code coverage: MC/DC, range, and address coverage. There are two questions that need to be asked when considering coverage. First, is it mathematically and logically possible for the full range to be reached? Second, do your test cases cover the full range?
Dead logic
“Dead logic” refers to paths within your model that can never be reached, either because the “if/else” conditions are always true or always false, or because a while or for loop construct never terminates. When dead logic is detected it should be the first priority for examination, as it may be due to a design error.
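As a minimal, hypothetical sketch (the function and thresholds are invented for illustration), the `elif` below is dead logic: the first condition already captures every non-negative speed, so the second test can never be true.

```python
def clamp_speed(speed_mph):
    """Hypothetical limiter illustrating dead logic."""
    if speed_mph >= 0:
        return min(speed_mph, 100)
    elif speed_mph > 200:  # dead logic: only reachable when speed_mph < 0,
        return 100         # so this condition can never be true
    else:
        return 0
```

A coverage report will show the dead branch as permanently uncovered, which is the trigger to go look for the design error.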
You can get there, but do you?
Assuming you can demonstrate that the code is fully reachable, the next step is to look at your code coverage with your requirement-based tests. Your requirements should dictate what the algorithm does; therefore, if you are validating all of your requirements, you should see 85% or greater code coverage. Note that there is always some “helper” code that may not be exercised by the requirements-based tests, hence the 85% floor.
If your coverage metric is lower than 85%, you need to determine whether you:
Are not fully testing the requirements: add additional test cases to cover the missing parts.
Have introduced functionality beyond the requirements: either repartition the code or add the missing requirements and requirement tests to the module.
The process of validating your control algorithm requires the creation of a virtual reality, a reality that must be good enough to convince your controller that it is operating in the real world. How do you know if your simulation is good enough to “fool” the controller?
How do you know if you are living in a simulation?
Designing the simulated environment for your control algorithm starts by defining the operating conditions in which the program will run. You need to include all of the operating conditions that the system may encounter while excluding the unreachable domains. For example, with a wind turbine you would want to model wind gusts up to 45 mph but exclude higher rates as the turbine would be shut down for safety reasons for speeds above that.
The next question is accuracy: at what level of change in the input values is there a perceptible change in the real-world effects? In some cases a 1-degree change in temperature is trivial (e.g., a combustion engine), while in others it is significant (a home heating and cooling system). Select the resolution to match the requirements.
Next up is consistency. If we go back to the temperature resolution, think about the other environmental variables associated with it, e.g., pressure: PV = nRT. If the resolution of the pressure does not match the resolution of the temperature, there can be negative impacts on the control algorithm.
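To make the consistency point concrete, here is a small sketch (the values are illustrative) of deriving the pressure resolution implied by a temperature resolution from PV = nRT, holding n and V fixed:

```python
R = 8.314  # J/(mol*K), ideal gas constant

def pressure_step_for_temp_step(delta_T_K, n_mol, volume_m3):
    """Pressure resolution implied by a temperature resolution when
    PV = nRT with n and V held fixed: delta_P = n * R * delta_T / V."""
    return n_mol * R * delta_T_K / volume_m3

# 1 K steps for 1 mol in 0.1 m^3 imply ~83 Pa pressure steps; a pressure
# signal quantized at 1 kPa would be too coarse to stay consistent.
```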
Finally, update rate: how often are the simulated environment variables updated? The data associated with the simulated variables should be updated with a rate that is consistent with the behavior in the real world.
The condition
The one condition here: if a general-use simulation is being created, then the design needs to take into account the operating needs of all the target systems. For ease of testing, cross-controller development should be limited to ensure the highest quality simulation and testing environment.
Small, smart habits in modeling add up; while a local efficiency may not seem like much, when compounded over the full design it results in surprisingly large improvements.
Data types, bit field, enums & #defines
Selecting a data type for your variable directly impacts the amount of memory required by your program. In general you should select the smallest data type that covers the range of the variable. But that is just the first step in how to save memory.
When you have multiple bits of On/Off state data, packing the data into a single variable will save memory. For example, if you have 5 On/Off variables you could either use 5 Booleans (5 × 8 bits = 40 bits, or 5 bytes) or pack all 5 flags into a single unsigned 8-bit integer (1 byte).
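A sketch of the packing in Python (the flag names are made up for the example); the same masking pattern carries over directly to an unsigned 8-bit integer in C:

```python
# Pack five hypothetical on/off flags into one byte instead of five booleans.
FAN_ON, PUMP_ON, HEATER_ON, LIGHT_ON, ALARM_ON = (1 << i for i in range(5))

def pack_flags(fan, pump, heater, light, alarm):
    """Fold the five states into a single unsigned 8-bit value."""
    bits = 0
    for mask, state in ((FAN_ON, fan), (PUMP_ON, pump), (HEATER_ON, heater),
                        (LIGHT_ON, light), (ALARM_ON, alarm)):
        if state:
            bits |= mask
    return bits

def flag_is_set(bits, mask):
    """Test a single packed flag with a bit mask."""
    return bool(bits & mask)
```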
The final tip has to do with constant values. If the data needs to be tunable then it needs to be declared as a parameter. However, for non-tunable constants, #defines or enumerated data types should be used; they provide the same “non-literal” implementation while being inlined in the generated code.
Exit early
When creating if-elseif-else logic, order the branches by likelihood of occurrence (e.g., put the most likely first). This avoids unneeded comparisons the majority of the time.
Additionally, consider inlining calculations into the if/elseif logic if they are only used by a subset of the comparison.
resOf = highlyComplexMatheMaticalFunction();
if (lightIsOn)
    ...
elseif (lightIsOff)
    ...
elseif (resOf)
    ...
end
The “highlyComplexMatheMaticalFunction” in this example is only used in one place, so calculating it on every path is wasted effort. In contrast, if you have a complex calculation that is used by multiple branches, you should consider pre-calculating the result.
The final suggestion here is to provide “early exits.” If you have if/else logic or for/while loops, consider adding a “break” in the loop / logic when the desired result is reached prior to the completion of the loop.
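The three habits above, ordering by likelihood, keeping expensive calculations inside the one branch that needs them, and exiting early, can be sketched as follows (all names are hypothetical):

```python
def expensive_score(sample):
    """Stand-in for a costly calculation."""
    return sum(sample) / len(sample)

def classify(sample, light_is_on, light_is_off):
    """Order branches by likelihood and defer the expensive call until
    the one rare branch that actually needs its result."""
    if light_is_on:                      # most likely case first
        return "on"
    elif light_is_off:
        return "off"
    elif expensive_score(sample) > 0.5:  # computed only on this rare path
        return "uncertain"
    return "unknown"

def first_over_limit(readings, limit):
    """Early exit: stop scanning as soon as the answer is known."""
    for i, value in enumerate(readings):
        if value > limit:
            return i  # exit early; no need to scan the rest
    return -1
```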
Resample your data
Real-world data is messy, often sampled at inconsistent intervals and both over and under-sampled at some domain points. For table data, resample your data into uniform intervals that cover your full range while maintaining the required accuracy from the data. If sections of the data are “flat” consider increasing the intervals in that section while decreasing the intervals on sections with rapid changes.
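One way to sketch the resampling step with the standard library only (a table-lookup or scientific library would normally do this for you):

```python
from bisect import bisect_right

def resample_uniform(xs, ys, step):
    """Linearly interpolate irregularly sampled (xs, ys) onto a uniform
    grid covering the same range. xs must be sorted ascending."""
    out_x, out_y = [], []
    x = xs[0]
    while x <= xs[-1] + 1e-9:
        i = min(bisect_right(xs, x), len(xs) - 1)
        if i == 0:
            y = ys[0]
        else:
            x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
            t = 0.0 if x1 == x0 else (x - x0) / (x1 - x0)
            y = y0 + t * (y1 - y0)
        out_x.append(x)
        out_y.append(y)
        x += step
    return out_x, out_y
```

The same helper can be run twice with different `step` values, coarse over the “flat” sections and fine over the rapidly changing ones, and the results concatenated.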
Error detection is not a halting problem,(1) which is to say that we can halt before crashing.(2) The objective of error detection is to determine operating regimes where
Continued operation presents
a risk to the operator
a risk to those in the environment
a risk to the device
When those conditions are detected the device should provide feedback to the user and/or migrate into a different operating mode.
How do you know when faults are possible?
There are three primary ways in which errors can be predicted:
Known “dangerous” operating conditions: (fever condition) In some cases the fault condition is known; when your temperature goes above X you have a fever and should get treatment. In the same fashion for most devices, there are known conditions where continued operations are a risk.
Interpolated error: (crystal ball) In some cases, the current conditions combined with forward prediction indicate that the device will, at some near point in time, enter a known dangerous condition. In this case, a combination of how far off the condition is and how long you have been trending toward it should inform when the alert is raised.
Regression/ML/AI error: (history book) A statistical approach to predictive maintenance and error detection can be taken, e.g., when conditions that historically have led to a fault are detected, the alert can be raised. This differs from the crystal ball in that the root cause may not be understood.
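The “crystal ball” method can be sketched as a simple trend extrapolation (a real implementation would filter the signal and fit over more than two samples):

```python
def time_to_threshold(history, threshold, dt=1.0):
    """Crystal-ball sketch: take the recent slope (last two samples) and
    extrapolate when the signal will cross the threshold. Returns 0.0 if
    the threshold is already crossed, and None when the signal is not
    trending toward the threshold."""
    if len(history) < 2:
        return None
    slope = (history[-1] - history[-2]) / dt
    if slope <= 0 or history[-1] >= threshold:
        return 0.0 if history[-1] >= threshold else None
    return (threshold - history[-1]) / slope
```

How small the returned time is, combined with how long the trend has persisted, is what should drive the decision to raise the alert.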
I know but(3)
How to respond
How you respond should depend on the fault severity and the timeline of the fault. The higher the severity of the fault, the stronger the response; the closer the fault is to “activating,” the faster the response.
Low Alert
The lowest level of alert is the diagnostic message. For a low impact, delayed incident, this is acceptable. However, the end user may ignore this so…
Medium Alert
For a mid-level response, the device may reduce the performance. For example, adaptive cruise control could apply braking if the vehicle in front is too close.
High Alert
At the high end of things (high alert), the device should deploy / respond / act in such a way as to protect the users and the people in the environment.
Better late than never; better early than late
As the section above shows, as the response level increases, the recovery becomes more significant. By the time you hit the high alert, repairing the device is a greater expense than a simple “check engine” light.(4) Use the three prediction methods and you(5) can prevent faults.
Footnotes
I was surprised that there are no good engineering cartoons on the halting problem; perhaps they haven’t been finished yet.
When it comes to error detection and fault handling, many engineers treat the perfect as the enemy of the good. It is always better to error out gracefully than to continue operation at risk.
I know this is a fault screen for the small backhoe of a BobCat, but I like to imagine this is a nature documentary following around a wildcat (i.e., some of you may remember the early ’90’s commercials featuring a live bobcat).
The migration from Adaptive cruise control to air bags is clear; if you brake, you don’t crash.
In this case the “you” is you and your whole engineering team.
In the movie version of the book “Charlotte’s Web” there is a song, “Zuckerman’s Famous Pig,” which features the lyrics
Fine swine wish he was mine
What if he’s not so big
Seeing this cartoon while in graduate school(1) and while taking a numerical methods course led to the parody song
Fine Spline, coefficients are prime
What if it grows too big?
Interpolation: When to use it
There are three general categories where interpolation is used:
Sampled real-world data does not cover the full range (regression case): If the interpolation covers points inside the data set, this is generally a “safe” scenario, i.e., the interpolated data will be smooth within the range. If the interpolation goes outside the data set, then the values predicted by the interpolation should be checked against real-world expectations. In general, 10~15% beyond the sampled range (for smooth data) is reasonable to extrapolate.
To reduce calculation cost (speed or memory) Some calculations are memory or FLOPs intensive; interpolations (especially polynomial interpolation) can show a significant reduction in the total number of operations.(2)
Handle discontinuities (piecewise interpolation) For equations with discontinuities, an interpolation can be used to provide a non-infinite transition between the operating realms.
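The 10~15% rule of thumb for going beyond the sampled range can be encoded as a guard (the margin is the rule of thumb from above, not a hard limit):

```python
def safe_extrapolation_limits(xs, margin=0.15):
    """For smooth sampled data, ~10-15% beyond the sampled range is a
    reasonable extrapolation limit (margin is a fraction of the span)."""
    span = max(xs) - min(xs)
    return min(xs) - margin * span, max(xs) + margin * span

def query_ok(x, xs, margin=0.15):
    """True if a query point is inside the sampled range plus the margin."""
    lo, hi = safe_extrapolation_limits(xs, margin)
    return lo <= x <= hi
```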
Interpolation: When not to use it
The “when not to use” is the mirror image of the “when to use.”
Mr. Toad’s Wild Ride:(3) (sampled data and discontinuities): In some instances, the curvature of the equations is so severe that interpolation cannot accurately capture the data.
Flip(4) side: The real thing is cheaper: Depending on the equation and your target processor, the real calculation may be less intensive. In general, when I hit a polynomial of order 6 or greater I start to question the value of interpolating (much as a Taylor series is usually truncated after 3 terms).
Integer data: Gear 1.3: The classic interpolation failure is when integer data is interpolated to floating-point values. My first exposure to this was when I interpolated a non-CVT vehicle into the 1.3rd gear.(5)
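The “Gear 1.3” failure is easy to reproduce: linearly interpolating an integer-valued table produces fractional gears, while a rounded (or previous-breakpoint) lookup keeps the output integral. The breakpoints below are hypothetical:

```python
def interp(x, xs, ys):
    """Plain linear interpolation over sorted breakpoints."""
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]

# Gear number versus vehicle speed (made-up breakpoints).
speeds = [0, 20, 40, 60]
gears  = [1,  2,  3,  4]

def gear_interpolated(mph):
    return interp(mph, speeds, gears)     # wrong for integer data

def gear_stepped(mph):
    return round(gear_interpolated(mph))  # or a previous-breakpoint hold
```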
Follow these tips and you will know if you can “Pig out or Pig In” with your interpolation.
Footnotes
Back at this time, campus television had a limited number of channels. I would estimate that about 50% of my classmates, like me, had it on in the background the day before.
When performing polynomial interpolation save your powers, e.g. x2 = x*x; x3 = x2 * x; x4 = x2 * x2;
Continuing with the children’s story theme
I hope these puns don’t get you off on the wrong foot with my Flip-FLOPS.
Interestingly enough it was seeing that (a good decade before CVT’s were common) that I understood the impact that a CVT could have on fuel economy. If you are interested in fuel economy take a look at this series I’m writing.
Gravity: uphill or down, it has a way of changing your speed
Using these three forces acting on the vehicle, we can (when we add in losses) calculate the energy needed to get from point A to point B. (And if you are curious about how the data for A to B is collected check out this previous post)
Simple models
Our simple model uses the Lat / Lon / Elevation and Speed data we downloaded as part of the last blog post for our points A to B.
Counting your losses
In a frictionless, lossless world, my car with regenerative braking(1) could reach 100% efficiency. However, your car is “car-not”(2) able to do this in the real world. Our first pass of “driving the route” will make the following assumptions:
We hit every stoplight(3)
We drive at the speed limit(4)
There is no traffic(5)
Standard profile acceleration and deceleration between speed zones(6)
20% energy recapture on braking.
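Under those assumptions, the per-segment energy bookkeeping reduces to the kinetic energy spent reaching cruise speed minus the regen fraction recovered at the stop. A sketch (the vehicle mass is a placeholder; drag and rolling losses are ignored in this first pass):

```python
MPH_TO_MPS = 0.44704  # miles-per-hour to meters-per-second

def segment_energy_j(mass_kg, cruise_mph, regen_fraction=0.2):
    """Net energy (joules) for one stop-to-stop segment under the
    assumptions above: accelerate from rest to cruise speed, then brake
    to a stop, recapturing regen_fraction of the kinetic energy."""
    v = cruise_mph * MPH_TO_MPS
    ke = 0.5 * mass_kg * v * v            # kinetic energy at cruise
    return ke - regen_fraction * ke       # net cost per stop-start cycle
```

Summing this over the stop-to-stop segments of a route gives a first-order comparison between a highway commute (few stops) and a city commute (many stops).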
The first route: To the Tech Center!
My first working commute was from Farmington Hills Michigan to GM’s Warren(7) Tech Center. If we break it down by distance and stops we get the following table.(8)
| Point | Distance (miles) | Speed (MPH) |
| --- | --- | --- |
| 0 | 0.265 | 15 |
| 1 | 0.530 | 35 |
| 2 | 0.08 | 45 |
| 3 | 22.72 | 65 |
| 4 | 0.5 | 45 |
| 5 | 1.5 | 45 |
Farmington Hills to Warren Tech Center
Between each point, there is a deceleration to stop(9) followed by an acceleration to the target speed. If we put this information into the Simulink model we get the following energy usage profile. There is an interesting modeling point between points 1, 2 and 3; it is a short section of road where the car does not have time to get up to speed before you have to slow down. I’ve included the Stateflow chart that I created to solve this look ahead in the footnotes.(10)
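The look-ahead for that short stretch can be sketched with basic kinematics: the peak speed is limited either by the posted limit or by the distance needed to accelerate and then brake down to the next segment’s speed. The accel/decel defaults below roughly match the 10 and 20 mph/sec first-pass rates, converted to m/s²:

```python
from math import sqrt

def peak_speed(v_start, v_end, limit, length_m, accel=4.5, decel=9.0):
    """Highest speed reachable on a segment of length_m (meters), given
    the distance needed to accelerate from v_start and still brake down
    to v_end (speeds in m/s, accel/decel in m/s^2). Derived from
    (vp^2 - v_start^2)/(2a) + (vp^2 - v_end^2)/(2d) = length."""
    vp = sqrt((2 * accel * decel * length_m
               + decel * v_start ** 2 + accel * v_end ** 2) / (accel + decel))
    return min(vp, limit)  # the posted limit wins on long segments
```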
Full Trip
Regen Effects
Because this trip was mainly highway, there was very little chance for regenerative braking;(11) in contrast, my ADI (Applied Dynamics International) commute had many more start-stop moments with more regenerative braking events.
ADI Route
City Commute
Reviewing the data from these two routes reinforces some basic knowledge:
Total distance is only one factor in energy usage
Energy usage goes up with rate of travel
Start-stop events (with acceleration) have a large impact on energy use (e.g. steady speed is better)
Yellow is an odd default choice for plotting color.
In next week’s post we will add in a “human” driver model to improve the accel and decel behavior of these models.
Footnote
For a short introduction to the efficiency of regenerative braking I recommend this link. In short, there are two limitations to capturing energy from regenerative braking. First, a portion of the brake force is applied through conventional brake pads. Second, the torque/speed of the wheels at braking cannot be tuned for optimal energy capture. As a first pass approximation, we will assume that 20% of the energy is re-captured during brake implementation.
To instructors of Thermodynamics courses, please feel free to use this joke under a GPL Open Source License.
It only ever seems this way when you are driving.
Generally this is true with the exception of 25 MPH zones which always seem way slower than anyone drives.
Ok, so that never happens, but one could dream.
The first pass approximation of this is 10 mph/sec on acceleration, and 20 mph/sec on deceleration.
GM Technical Center is a large complex with tunnels connecting all the buildings; I often thought that I was working in a “rabbit’s warren.”
For the first pass, we are assuming a “flat” drive. In southeast Michigan, this is generally true.
The stops require a “look ahead” model, e.g., we have to know when to start stopping.
I implemented this as a Stateflow chart with the intention that additional logic will be added to account for the driver behavior model in subsequent updates. For now, it is a simple accel / decel / hold calculation.
And because this was in 1995 there was zero chance for regenerative braking. GM had the EV1 then but I did not own one.
Getting from point A to point B sounds simple enough. Pull out of your driveway, turn left, turn right, turn left…(1) Yet as many a traveling salesperson has taught us, finding the best route is not a simple task. When you add in multiple competing factors, evaluating the cost of the route and determining the “best” path is not an easy task. Today we are going to take the first step in getting those routes by extracting map data.
Map data: What do we care about?
For any given route(2) what data do we care about and how do we characterize it? The primary categories are physical data (distance, elevation) and road characteristics (speed limits, stop signs, turns). Additional data such as traffic patterns will be modeled in a future human factors post.
| Characteristic | Measurement | Notes |
| --- | --- | --- |
| Distance | Linear distance of each segment | Does not include lane changes or turning radius |
| Elevation | Running elevation at each point in the route | Can also be expressed as the slope at each point |
| Speed limit | Posted speed limit | The actual average speed will be determined in later stages |
| Traffic lights | Periodicity of the lights and the red/green ratio | Timing of the lights can have a dramatic effect on MPG |
| Turns | Right and left turns | Turning slows down the vehicle and impacts MPG |
Example routes
To start this off, I use my first professional commute from Farmington Hills, MI to Warren, MI at the General Motors Technical Center.(3) This is primarily a highway drive and, because of that, the easiest to get data on. So how are we going to get this data?
The early days: GM Tech Center
MAP APIs
There are multiple options for downloading map data. Depending on your objective the source could be(4)
USGS: A United States Government service that contains a wealth of information
Google Maps: My source here for the route and elevation data(5)
Roads: Used to get both the speed limits and the signage
From tables to algorithms
For my starting point I used a static query, i.e., I downloaded the full route in one batch. The result is a table of 10,000+ points at 2-meter resolution.(6)
| LAT | Long | delta X (meters) | elev (meters) | speed (mps) | Ahead |
| --- | --- | --- | --- | --- | --- |
| 42.4758 | -83.4136 | 0 | 262.128 | 6.7056 | 0 |
| 42.4759 | -83.4127 | 2 | 262.128 | 6.7056 | 0 |
| 42.4760 | -83.4121 | 2.1 | 262.130 | 6.7056 | 0 |
What is “Ahead”?
For my eventual model I added a category of “Ahead.” Ahead is calculated based on the position of a stop sign or turn, the posted speed limit, and a human factor of “driver aggression.” For safe drivers the “Ahead” flag will toggle early, leading to a reasonable deceleration.
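A sketch of that toggle (the comfort-deceleration range mapped from “driver aggression” is an assumption for illustration, not the model’s actual tuning):

```python
def ahead_flag(dist_to_stop_m, speed_mps, aggression=0.5):
    """Hypothetical 'Ahead' toggle: flip to 1 when the remaining distance
    drops below the braking distance at the driver's comfort deceleration.
    Lower aggression means a gentler decel, so the flag toggles earlier."""
    decel = 2.0 + 6.0 * aggression            # m/s^2, assumed comfort range
    braking_distance = speed_mps ** 2 / (2 * decel)
    return 1 if dist_to_stop_m <= braking_distance else 0
```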
More routes different data
Several years later, my wife and I moved to Ann Arbor, MI, a small college town (home of the University of Michigan). During our Ann Arbor days I had two commutes: first, an “in-town” route to ADI (Applied Dynamics International); the other, a mix of highway and surface streets to Ford Research. The in-town transit is of interest for its high number of starts and stops, while the highway commute to Ford Research, where I worked on full vehicle simulations to predict fuel economy, will act as a validation point for the earlier GM commute.
ADI
Ford Research Labs
Next post
In the next post I will build up the physics model and then run it in a traffic free version of these maps.
Footnotes
When my wife and I lived in the Boston area we joked that directions between any two points could be given just by referencing Dunkin Donuts. E.g. turn right at the first DD, turn left at the 3rd DD on your left; this was not far from the truth.
We will cover multiple routes in a future post.
I worked on my first full vehicle simulation H.I.L. system for GM on a project known as SimuCar. This is also the time period when I met my wife, Deborah, a great start to a wonderful life.
This post is centered around the US; there are other download sites for the rest of the world.
I selected Google Maps due to familiarity with the API; the others listed could be used with equal success. For other examples, these demos will give you a good start.
I selected 2-meter resolution for processing speed, based on a 1/2 car length estimate. In later posts I will examine how the map resolution affects the MPG estimates.
Machine Learning (ML), Artificial Intelligence (AI), and Deep Learning (DL) have moved from the research end of embedded systems development into real-world applications. These methodologies are in use to support the solution to problems that traditional control algorithms either cannot solve or solve at a high computational cost. Adopting a new technology means understanding the advantages and disadvantages of the technology. Today I want to talk about the novel input problem and how it can result in critical system faults.
Name that tune
ML/AI/DL systems have made huge strides in their ability to classify inputs into reasoned guesses. Recently Google announced a “name that tune” feature: users provide a snippet and Google will try to name that tune. In my 5 test cases it got 4 out of 5 right. The one it got wrong was John Cage’s 4’33”. It got it wrong because it is a piece with no sound, something that falls outside all of the classification categories.
Conceptually understanding a black box
ML/AI/DL systems are notoriously black-box systems, i.e., what goes on inside is not known or, in a functional sense, knowable. But there is a way of thinking about them that can help with the design of the complete controls system: ML/AI/DL systems can be thought of as “self-taught” expert systems.
They “ask” themselves a series of questions, and each question guides them down through a set of probabilities to the final “answer.”
The ideal system works like this:
I know what this is: It is a hot dog!
I know the type of thing it is: It is a type of sausage
I know the general class of things: it is food
I don’t know what it is: but I think it is safe
I don’t know what it is: it may not be safe
Depending on the type of system you are designing you may need to reject the results for anything that comes back 3 or lower in the list.
I don’t know: it may not be safe
Crying wolf, that is to say alerting when there is no good reason to alert, is the curse of all ML/AI/DL systems. At the same time, since ML/AI/DL systems are often used in safety-critical systems, they need to take a better-safe-than-sorry approach to input response. The first answer? A graduated response.
In most cases the warning can be responded to in stages. First, “slow down” the rate of the event happening; use that time to re-evaluate the alert. If the conditions continue to merit it, then “stop.”
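That staged policy can be sketched as a tiny decision function (the confidence and persistence thresholds are placeholders, not tuned values):

```python
def graduated_response(confidence, persistence_s):
    """Sketch of a graduated response: 'slow down' on a fresh alert to
    buy time to re-evaluate, and 'stop' only if the condition persists."""
    if confidence < 0.5:
        return "continue"        # not enough evidence to act
    if persistence_s < 2.0:
        return "slow down"       # buy time to re-evaluate the alert
    return "stop"                # condition persisted: fail safe
```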
The second approach? Stop. If you have a safety critical system, “fail safe” should always be prioritized over continued operation. Ideally the data gathered can be used as part of a DevOps workflow so that in the future the “correct” approach can be followed.
I learned to use a slide rule in 11th grade AP physics. To be clear, at the time there was no need to learn how to use a slide rule as digital calculators were common enough and not too expensive. So why did I learn how to use one?
Clarity of ideas
Having learned how to do multiplication and division with a slide rule I never forgot the fundamental properties of logarithms. The tool encoded an idea. This is often how we incorporate fundamental concepts, through using them. This is the argument for working with “primitive” tools; they can ground us in topics so we can use them going forward.
You are grounded!
But there is a “tipping point.” Somewhere along the line, you will hit a point where further use of the “basic tools” has a diminishing return on investment. I never learned how to solve roots with a slide rule and I don’t think I would have benefited from it. How do you determine when you have “tipped”?
How to tip
Unlike a restaurant, there is no set “tip” (15% for basic service in the US). Moreover you can easily miss it if you are not going into the exercise with the correct attitude. To identify the tipping point you need to:
Actively think about “what am I learning (Y) as I do X?”
Think about how what you learn could be applied to other situations.
Recognize when you are just “getting better at X and not learning about Y.”
Follow those guidelines and you will slide into new areas with ease; after you slide for a while you will get the feel for when you are ready to dive in.