Getting from point A to point B sounds simple enough. Pull out of your driveway, turn left, turn right, turn left…(1) Yet as many a traveling salesperson has taught us, finding the best route is not a simple task. When you add in multiple competing factors, even evaluating the cost of a route, let alone determining the “best” path, becomes difficult. Today we are going to take the first step toward finding those routes by extracting map data.
Map data: What do we care about?
For any given route(2) what data do we care about and how do we characterize it? The primary categories are physical data (distance, elevation) and road characteristics (speed limits, stop signs, turns). Additional data such as traffic patterns will be modeled in a future human factors post.
Linear distance of each segment
Does not include lane changes or turning radius
Running elevation at each point in the route
Can also be expressed as the slope at each point
Posted speed limit
The actual average speed will be determined in later stages
What is the periodicity of the lights and the red/green ratio?
Timing of the lights can have a dramatic effect on MPG
Right and left turns
Turning slows down the vehicle and impacts MPG.
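The attributes listed above can be collected into one record per route point. The sketch below is only an illustration: the `RoutePoint` class, its field names, and the `slopes` helper are hypothetical and are not taken from any map API. It also shows how the running elevation can be re-expressed as a slope between consecutive points.

```python
from dataclasses import dataclass


@dataclass
class RoutePoint:
    """One sample along a route (field names are illustrative assumptions)."""
    distance_m: float        # cumulative linear distance from the start
    elevation_m: float       # running elevation at this point
    speed_limit_mps: float   # posted speed limit, in meters per second
    has_stop: bool = False   # stop sign or traffic light at this point
    is_turn: bool = False    # right or left turn at this point


def slopes(points):
    """Return the rise-over-run slope between each pair of consecutive points."""
    result = []
    for prev, cur in zip(points, points[1:]):
        run = cur.distance_m - prev.distance_m
        rise = cur.elevation_m - prev.elevation_m
        # Guard against duplicate points, which would give a zero run.
        result.append(rise / run if run > 0 else 0.0)
    return result


route = [
    RoutePoint(0.0, 100.0, 25.0),
    RoutePoint(2.0, 100.1, 25.0),
    RoutePoint(4.0, 100.0, 25.0, has_stop=True),
]
grade = slopes(route)  # one slope value per segment
```

Keeping distance cumulative rather than per-segment makes it easy to look up "how far until the next stop" later on.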
To start this off I use my first professional commute, from Farmington Hills, MI to the General Motors Technical Center in Warren, MI.(3) This is primarily a highway drive and, because of that, the easiest to get data on. So how are we going to get this data?
There are multiple options for downloading map data. Depending on your objective the source could be(4)
USGS: A United States Government service that contains a wealth of information
Roads: Used to get both the speed limits and the signage
From tables to algorithms
For my starting point I used a static query, i.e., I downloaded the full route in one batch. The result is a table of 10,000+ points at 2-meter resolution.(6)
delta X (meters)
What is “Ahead”?
For my eventual model I added in a category of “ahead.” Ahead is calculated based on the position of a stop sign or turn, the posted speed limit, and a human factor of “driver aggression.” For safe drivers the “Ahead” will toggle early, leading to a reasonable deceleration.
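One way to picture the “Ahead” flag is as a braking-distance check. The sketch below is an assumption for illustration, not the model used in this series: the function name, the 1.5 to 4.0 m/s² deceleration range, and the mapping from “driver aggression” to deceleration are all hypothetical. It assumes the vehicle is traveling at the posted speed limit and must slow to a target speed at the stop sign or turn.

```python
def ahead_flag(distance_to_event_m, speed_mps, target_speed_mps=0.0, aggression=0.5):
    """Return True once the driver should start decelerating for an event ahead.

    aggression is 0.0 (very cautious) to 1.0 (very aggressive); the
    deceleration limits below are assumed values, not measured data.
    """
    aggression = min(max(aggression, 0.0), 1.0)
    cautious_decel = 1.5    # m/s^2, assumed gentle braking
    aggressive_decel = 4.0  # m/s^2, assumed hard braking
    decel = cautious_decel + aggression * (aggressive_decel - cautious_decel)

    if speed_mps <= target_speed_mps:
        return False  # already slow enough, nothing to do

    # Constant-deceleration kinematics: d = (v^2 - v_target^2) / (2 * a)
    braking_distance_m = (speed_mps ** 2 - target_speed_mps ** 2) / (2.0 * decel)
    return distance_to_event_m <= braking_distance_m


# At 20 m/s, 100 m before a stop sign:
safe_driver = ahead_flag(100.0, 20.0, aggression=0.0)        # flag is already set
aggressive_driver = ahead_flag(100.0, 20.0, aggression=1.0)  # flag not yet set
```

A lower aggression value means a gentler assumed deceleration, so the flag toggles farther from the event, which matches the “toggle early” behavior described for safe drivers.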
More routes, different data
Several years later, my wife and I moved to Ann Arbor, MI, a small college town (home of the University of Michigan). During our Ann Arbor days, I had 2 commutes; first, an “in-town” route to ADI (Applied Dynamics International), the other a mix of highway and surface streets to Ford Research. The in-town transit will be of interest for the high number of “starts and stops” while the highway commute to Ford Research, where I worked on full vehicle simulations to predict fuel economy, will act as a validation point of the earlier GM commute.
In the next post I will build up the physics model and then run it in a traffic free version of these maps.
When my wife and I lived in the Boston area we joked that directions between any two points could be given just by referencing Dunkin Donuts. E.g. turn right at the first DD, turn left at the 3rd DD on your left; this was not far from the truth.
We will cover multiple routes in a future post.
I worked on my first full vehicle simulation H.I.L. system for GM on a project known as SimuCar. This is also the time period when I met my wife, Deborah, a great start to a wonderful life.
This post is centered around the US; there are other download sites for the rest of the world.
I selected Google Maps due to familiarity with the API; the others listed could be used with equal success. For other examples, these demos will give you a good start.
I selected a 2-meter resolution for processing speed, based on a 1/2 car length estimate. In later posts I will examine how the map resolution affects the MPG estimates.
Machine Learning (ML), Artificial Intelligence (AI), and Deep Learning (DL) have moved from the research end of embedded systems development into real-world applications. These methodologies are in use to support the solution to problems that traditional control algorithms either cannot solve or solve at a high computational cost. Adopting a new technology means understanding the advantages and disadvantages of the technology. Today I want to talk about the novel input problem and how it can result in critical system faults.
Name that tune
ML/AI/DL systems have made huge strides in their ability to classify inputs into reasoned guesses. Recently Google announced a “name that tune” feature that lets users provide a snippet, and Google will try to name that tune. It got 4 of my 5 test cases right. The one it got wrong was John Cage’s 4’33”. It got it wrong because it is a song with no sound, something that falls outside all of the classification categories.
Conceptually understanding a black box
ML/AI/DL systems are notoriously black box systems, i.e., what goes on inside is not known or, in a functional sense, knowable. But there is a way of thinking of them that can help with the design of the complete controls system. ML/AI/DL systems can be thought of as “self taught” expert systems.
They “ask” themselves a set of questions, and each question guides them through a set of probabilities toward the final “answer.”
The ideal system works like this:
I know what this is: It is a hot dog!
I know the type of thing it is: It is a type of sausage
I know the general class of things: it is food
I don’t know what it is: but I think it is safe
I don’t know what it is: it may not be safe
Depending on the type of system you are designing you may need to reject the results for anything that comes back 3 or lower in the list.
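The five-level list above, and the rule of rejecting anything at level 3 or lower, can be sketched as a small enumeration plus a threshold check. The names below are hypothetical and only illustrate the idea; a real classifier would have to map its own outputs onto levels like these.

```python
from enum import IntEnum


class Recognition(IntEnum):
    """Levels from the list above; 1 is the most specific answer."""
    EXACT = 1                 # "It is a hot dog!"
    TYPE = 2                  # "It is a type of sausage"
    GENERAL_CLASS = 3         # "It is food"
    UNKNOWN_LIKELY_SAFE = 4   # "I don't know what it is, but I think it is safe"
    UNKNOWN_MAYBE_UNSAFE = 5  # "I don't know what it is, it may not be safe"


def accept(level, reject_from=Recognition.GENERAL_CLASS):
    """Accept a result only if it is more specific than the rejection level."""
    return level < reject_from


usable = [level.name for level in Recognition if accept(level)]
```

Raising or lowering `reject_from` is the design knob: a stricter system rejects more results, a more permissive one accepts coarser classifications.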
I don’t know: it may not be safe
Crying wolf, that is to say alerting when there is not a good reason to alert, is the curse of all ML/AI/DL systems. At the same time, as ML/AI/DL systems are often used in safety critical systems, they need to take a better-safe-than-sorry approach to input response. The first answer? A graduated response.
In most cases the warning can be responded to in stages. First “slow down” the rate of the event happening; use that time to re-evaluate the alert. If the conditions continue to merit, then “stop.”
The second approach? Stop. If you have a safety critical system, “fail safe” should always be prioritized over continued operation. Ideally the data gathered can be used as part of a DevOps workflow so that in the future the “correct” approach can be followed.
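Both approaches can be sketched with one small function. This is a minimal illustration under assumed rules, not a safety-rated design: the function name, the action labels, and the `confirm_steps` parameter are hypothetical. An alert first slows the system; if the alert persists for `confirm_steps` consecutive samples, the system stops and stays stopped. Setting `confirm_steps=1` gives the immediate “fail safe” stop.

```python
def graduated_response(alerts, confirm_steps=3):
    """Map a sequence of alert flags to actions: 'run', 'slow', or 'stop'.

    The stop is latched: once reached, the system stays stopped.
    """
    actions = []
    consecutive = 0
    stopped = False
    for alert in alerts:
        if stopped:
            actions.append("stop")
            continue
        # Count how long the alert has persisted; reset when it clears.
        consecutive = consecutive + 1 if alert else 0
        if consecutive == 0:
            actions.append("run")
        elif consecutive < confirm_steps:
            actions.append("slow")  # buy time to re-evaluate the alert
        else:
            stopped = True
            actions.append("stop")
    return actions


staged = graduated_response([False, True, False, True, True, True, False])
fail_safe = graduated_response([True, False], confirm_steps=1)
```

In the first call a single transient alert only causes a brief slowdown, while a persistent alert leads to a stop. Logging the alert samples alongside the actions would provide the data for the DevOps feedback loop mentioned above.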
I learned to use a slide rule in 11th grade AP physics. To be clear, at the time there was no need to learn how to use a slide rule as digital calculators were common enough and not too expensive. So why did I learn how to use one?
Clarity of ideas
Having learned how to do multiplication and division with a slide rule I never forgot the fundamental properties of logarithms. The tool encoded an idea. This is often how we incorporate fundamental concepts, through using them. This is the argument for working with “primitive” tools; they can ground us in topics so we can use them going forward.
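The idea the slide rule encodes is that adding lengths on a logarithmic scale multiplies the numbers: log(a·b) = log(a) + log(b), and log(a/b) = log(a) − log(b). A minimal sketch of that property (the function names are just for illustration):

```python
import math


def slide_rule_multiply(a, b):
    """Multiply two positive numbers by adding their base-10 logarithms."""
    return 10 ** (math.log10(a) + math.log10(b))


def slide_rule_divide(a, b):
    """Divide two positive numbers by subtracting their base-10 logarithms."""
    return 10 ** (math.log10(a) - math.log10(b))


product = slide_rule_multiply(2.0, 3.0)   # approximately 6, up to rounding
quotient = slide_rule_divide(9.0, 4.0)    # approximately 2.25, up to rounding
```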
You are grounded!
But there is a “tipping point.” Somewhere along the line, you will hit a point where further use of the “basic tools” has a diminishing return on investment. I never learned how to solve roots with a slide rule and I don’t think I would have benefited from it. How do you determine when you have “tipped”?
How to tip
Unlike a restaurant, there is no set “tip” (15% for basic service in the US). Moreover you can easily miss it if you are not going into the exercise with the correct attitude. To identify the tipping point you need to:
Actively think about “what am I learning (Y) as I do X?”
Think about how what you learn could be applied to other situations.
Recognize when you are just “getting better at X and not learning about Y.”
Follow those guidelines and you will slide into new areas with ease; after you slide for a while you will get the feel for when you are ready to dive in.
Every engineering project has a starting point where you map out what you want to realize and how you will enact your vision. For complicated projects there are iterations at each stage as the initial design process is honed in on and leads toward the final design. With this series of blog posts I am going to walk you through my design process for optimizing a set of my “historical daily commutes”(1) for fuel-efficiency.
The cost function
Optimization requires a function (or set of functions) to optimize against. Our first task then is to outline what will go into our “cost function.”(2)
Vehicle efficiency: engine & driveline
Vehicle aerodynamics: the effect of speed on MPG
Route topology: the ups and downs that make your MPG go up or down
Weather: The rain, the sleet, and the snow that changes traction and heating you know…
Driver behavior: lead-footed devil or soft-coasting angel?
Traffic conditions: what are the other drivers like?
Vehicle physics: This can be done with basic “table lookup” models. Past experience shows that there are diminishing returns on higher fidelity models.
Environmental factors: The environment directly impacts the vehicle physics; modeling this can be done using real-world data (maps) and historical data (weather). These models will give us the chance to explore data-based modeling.
Human factors: I will be drawing on network theory models for traffic flow and human/traffic interactions.
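As a placeholder for what such a cost function might look like, the sketch below combines one cost term per model category into a weighted sum and picks the cheapest candidate. Everything here is an assumption for illustration: the weighted-sum form, the term names, the weights, and the numbers are hypothetical, and the real cost function in this series is nonlinear and still to be built.

```python
def route_cost(terms, weights):
    """Weighted sum of cost terms; lower is better."""
    return sum(weights[name] * value for name, value in terms.items())


def best_candidate(candidates, weights):
    """Return the name of the candidate with the lowest total cost."""
    return min(candidates, key=lambda name: route_cost(candidates[name], weights))


# Hypothetical weights for the three model categories.
weights = {"vehicle": 1.0, "environment": 0.5, "human": 2.0}

# Hypothetical cost terms for two driving styles on the same route.
candidates = {
    "lead_foot": {"vehicle": 3.0, "environment": 1.0, "human": 0.5},
    "soft_coast": {"vehicle": 2.0, "environment": 1.0, "human": 0.8},
}

winner = best_candidate(candidates, weights)
```

The value of writing it this way is that each term can later be replaced by the output of a vehicle, environment, or human-factors model without changing the comparison logic.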
Optimizing the cost function
This is where things become interesting; this cost function is highly nonlinear. How we navigate (optimize) the cost function is an open question. I will try and consider which of the following is the best approach:
Segmentation: decompose the optimization into sub-optimization problems and run integration analysis?
Neural network: global optimization of the system?
Model linearization: create a linearized version of the models enabling linear optimizations?
I will be selecting commuting routes from my early years of work, when I was part of the auto industry: travel to the GM Tech Center in Warren, Ford’s Scientific Research Labs, and the Milford Proving Grounds. Each route will be analyzed individually and then the goodness-of-fit of the algorithm will be compared between the results.
The objective with a cost function is to minimize the cost of the tasks. I’ve also seen this formulated as a “satisfaction quotient” where the objective is to maximize satisfaction. While I like the latter concept more, the minimization algorithms are simpler to implement so we will be using those.
When driving in rush hour it can seem like you are the only human on the road until someone is nice and lets you merge over to that lane you’ve needed to for the past 10 minutes.
Over the next few months, I will be running a series of blog posts that I am calling “Every Day Model-Based Design.” In them, I will be creating physical and controls models to understand and explore everyday events. The first will be…
Driving style MPG Optimization: Taking a look at my daily commutes throughout my career to determine an optimal driving style. Featuring:
Big data analysis (e.g. traffic analysis)
Physical modeling (e.g. road and vehicle behavior)
Human modeling (e.g. how do I drive)
TBD: What the world shows me…
I plan on having fun with these topics, going wide and deep as I explore how mathematics and systems engineering can describe the world around us.
Part of my objective in writing this blog is to take issues that are seen as complex and render them down to the “actual complexity of the system”; to simplify without sacrificing accuracy. Recently, when looking up information on transmissions, I came across this video from Chevy from 1936, explaining how a transmission works. While modern transmissions have additional components not covered in this video (such as hydraulic clutches), the fundamentals are there.
With that in mind, today I want to take this opportunity to highlight another excellent work that explains a fundamental engineering concept.
Selecting an ECU (electronic control unit) for your project is an investment in time and financial resources. Once the selection has been made for a given product line, that ECU will be used on average for 5 years. This means that even if the ECU meets all of your needs now it may not in 3 years if you don’t plan ahead.
Types of growth
There are three types of growth that need to be accounted for:
Increases in functions: as new features are added new functions are added. These functions will take up additional processing time.
Increases in memory: Hand-in-hand with the new functions’ time needs are memory needs.
Increases in I/O: The trickiest of the lot. Sometimes it is just additional channels of existing I/O types but in some cases it is a need for new I/O types.
As a general rule of thumb, 80% to 85% memory and processor utilization at initial release provides a safe margin. For hardware, two spare channels of each type is generally safe. In the case where new I/O types may be required, there are two options. The first is to select a hardware device that has product family members with additional I/O types. The second is to select a board that supports external I/O expansion slots.
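The rule of thumb can be written down as a quick screening check. The sketch below is illustrative only: the function name, the argument layout, and the example channel types are hypothetical, and the 85% limit and two-spare-channel minimum are simply the upper bound and channel count quoted above.

```python
def has_growth_margin(cpu_util, mem_util, spare_channels, max_util=0.85, min_spare=2):
    """Check an ECU candidate against the rule-of-thumb growth margins.

    cpu_util and mem_util are fractions (0.0 to 1.0) at initial release;
    spare_channels maps each I/O type to its number of unused channels.
    """
    util_ok = cpu_util <= max_util and mem_util <= max_util
    io_ok = all(count >= min_spare for count in spare_channels.values())
    return util_ok and io_ok


# Hypothetical candidate: fine on processing and memory, short on CAN channels.
candidate_ok = has_growth_margin(
    cpu_util=0.78,
    mem_util=0.82,
    spare_channels={"analog_in": 3, "pwm_out": 2, "can": 1},
)
```

A check like this does not cover the need for entirely new I/O types; that still has to be handled through the product family or expansion options described above.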
Growth in the times of DevOps
Traditionally the updates to ECU software only happened when a new product was released and it happened “in the factory.” With the growth in “over-the-air” updates (one of the driving features of DevOps) the starting metrics need to change. The rule of thumb will need to take into account the anticipated features to be released and determine which of those will be pushed for update. The type of features to be pushed will be heavily dependent on the product type, with some products receiving very few updates (e.g., medical devices with high integrity workflows) while others, such as consumer devices, may receive frequent updates.
Simulink is a graphical control design environment. However, it has the ability to include MATLAB (text-based) algorithms as part of the model. For some design problems, this is the best approach. In today’s post, I’m going to review the three primary methods for including MATLAB code in your Simulink Model.
Function / Class / System Object
There are three methods for including MATLAB code in Simulink Models: a MATLAB function, a MATLAB class, or a MATLAB System Object.
Generates C code
Supports calls to external files
Object Oriented Code
Built-in I/O validation routines
Built-in state save / restore methods
The Class and System Objects provide functionality that is not built into the Function method. However, both require additional knowledge of how to program. A MATLAB function can be written as a simple set of equations, while the Object Oriented methods require some base level programming knowledge.
And state data: memory
For models that are targeting Code Generation there is an additional consideration: memory usage. State data, or dWork data in the generated code, is used for any variable that is required to be held in memory between time steps. With a MATLAB function the user can explicitly define the variables that are State Data by using the “persistent” keyword.
With the MATLAB class or MATLAB System Object, any data that is defined as a property will be stored as a dWork variable regardless of the need for a state variable. The number of state variables can be decreased in the Class implementation; however in doing so, much of the benefit of the class based approach is lost.
Co-Simulation is when two or more modeling tools run concurrently, exchanging data between the tools. Co-Simulation is desirable when a single tool cannot achieve either the fidelity or execution speed required to model a given element.
Types of co-simulation
There are two primary types of co-simulation: imported and networked.
Imported: The “primary” tool incorporates the “secondary” models into its framework. The primary tool is responsible for the execution and timing of the secondary tool.
Networked: In the networked case, the tools execute independently with a secondary program providing the data exchange layer between the tools. The data exchange layer is responsible for matching the time stamps for the tools.
If your primary tool has a method for importing third party executables, like S-Functions in Simulink, then this is generally the easiest method for performing co-simulation. Once the functional interface is defined, the simulation engine of the primary tool provides the full execution context.
The downside of this approach is that it’s generally less accurate than the networked option. This is most acute when either of the tools requires a variable step solution as the incorporated tools are most often run at a fixed step or a variable step set by the primary tool.
In contrast, the networked approach allows you to run each tool with the optimal step size for the model; this results in high accuracy but is, more often than not, much slower. The second issue with this approach is how to synchronize the tools.
In general a 2nd or 3rd order spline should be used to match the data points between the tools for the different time steps. This means that the integration tool may need to store large amounts of data and perform significant calculations at each data exchange.
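To make the data-matching step concrete, the sketch below estimates a signal from one tool at a time stamp requested by the other. It is a simplification and an assumption for illustration: it uses a local 2nd-order (Lagrange) polynomial through the three nearest stored samples rather than a true spline with continuity conditions, and the function name is hypothetical.

```python
def quadratic_interp(times, values, t):
    """Estimate the signal at time t from the three stored samples nearest t.

    times must be strictly increasing and hold at least three samples.
    """
    if len(times) < 3 or len(times) != len(values):
        raise ValueError("need at least three matching time/value samples")

    # Choose the three-sample window whose middle sample is closest to t.
    start = min(range(len(times) - 2), key=lambda k: abs(times[k + 1] - t))
    t0, t1, t2 = times[start:start + 3]
    v0, v1, v2 = values[start:start + 3]

    # Lagrange basis polynomials for the three samples.
    l0 = (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
    l1 = (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
    l2 = (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1))
    return v0 * l0 + v1 * l1 + v2 * l2


# Tool A stores samples every 0.1 s; tool B asks for a value at t = 0.15 s.
tool_a_times = [0.0, 0.1, 0.2, 0.3]
tool_a_values = [t * t for t in tool_a_times]  # stand-in signal
value_for_tool_b = quadratic_interp(tool_a_times, tool_a_values, 0.15)
```

Even this simplified version shows the cost noted above: the exchange layer has to keep a history of samples for every shared signal and do arithmetic on them at every exchange.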
There are some texts that serve as a foundation stone for a field or technology: Kernighan & Ritchie’s “The C Programming Language” for C; Smith, Prabhu and Friedman’s “Establishing A Model-Based Design Culture” for MBD; and, for UML, Martin Fowler’s “UML Distilled.” While the fields have moved beyond these three texts, they all act as the common starting point of the discussion. With that in mind I want to talk about why reading UML Distilled will provide a significant boost to your system level modeling abilities.(1)
The third, and latest, edition of UML Distilled was written in 2003. The usage of UML and the associated tool chains have evolved since then, but the core principles of the book hold up.
The difficulty in talking about UML is that it is an open standard; as a result, different tools and different groups have variants on the implementation of the language. That is why this book is so valuable, it lays out the core nature of the major types of UML diagrams.
Perhaps most importantly the book lays out the cases of when to use each type of UML diagram, e.g. “Class Diagrams: When to use” & “Sequence Diagrams: When to use” and… While in my view the book recommends the usage of some diagrams when I do not think they are appropriate (specifically some of the recommendations for State Machine Diagrams and Communication Diagrams), it is admirable that he provides the trade-offs between different types of UML diagrams.
Object Oriented Classes and UML
One of the virtues of UML is the ability to graphically design Object Oriented models using Class Diagrams. Combined with Sequence Diagrams, the basics of a systems level modeling tool can be defined. However, what is often missed is that the Class and Sequence diagrams need to be combined with Package and Deployment diagrams to fully implement a system level model. Often the first two are used in system design and a less efficient implementation is created.(2)
The primary issue surrounding UML is “when to stop.” UML diagrams are not intended for the final design of software but it is tempting to keep putting more information into the UML diagram. Fowler lays out a good case for how far to go.
For smaller systems, Class and Sequence diagrams are sufficient; however, for larger system-of-systems the Package and Deployment diagrams are needed.