System engineering for dummies

I’ve been getting a lot of suggestions about the brand new UPSide project recently. One of them nudged me into bringing a piece of implicit knowledge to the surface of my mind. Having made it conscious, I can now share it.


I’ve said before that, on the unusual occasions I get to do it, I greatly enjoy whole-systems engineering – problems where hardware and software design inform each other and the whole is situated in an economic and human-factors context that really matters.


I don’t kid myself that I’m among the best at this, not in the way that I know I’m (say) an A-list systems programmer or exceptionally good at a couple of other specific things like DSLs. But one of the advantages of having been around the track a lot of times is that you see a lot of failures, and a lot of successes, and after a while your brain starts to extract patterns. You begin to know, without actually knowing that you know, until a challenge elicits that knowledge.


Here is a thing I know: A lot of whole-systems design has a serious drunk-under-the-streetlamp problem in its cost and complexity estimations. Smart system engineers counter-bias against this, and I’m going to tell you at least one important way to do that.



You know the joke. A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost them here, and the drunk replies no, he lost them near his car three blocks away. The policeman asks why he is searching here, and the drunk replies, “This is where the light is.”


When we’re trying to estimate costs and time-to-completion for a whole system, we have a tendency to over-focus on the costs we can easily list and pin down, as opposed to the ones that are more difficult to estimate.


In general, hardware costs – the BOM (Bill of Materials) – are easy to estimate. If your estimate is off, time is on your side; buying parts will generally be cheaper six months from now than it is now. Other things that are much harder to estimate are software development costs and the time value of rapid completion. Delay is not your ally there; software development does not necessarily get cheaper inside your planning horizon, and being later to complete is a bad thing.


The streetlight effect means, therefore, that when doing cost and development-complexity analysis for a whole system, and trying to optimize out costs, we’re going to have a strong tendency to chisel away at BOM while neglecting attempts to lower the software-dev costs. We’re likely to end up writing a procurement strategy that trades small gains in the former for large losses in the latter, simply because we’re not allocating our attention as we should.


What makes this worse is that the zero-sum conflict is not just for the attention of the planner’s brain. The easy- and hard-to-estimate costs can affect each other. Going cheap on the hardware often increases the software-development friction and lengthens the product timeline.
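To make that interaction concrete, here is a toy sketch in Python. Every number in it is invented purely for illustration (unit prices, production run, developer cost, effort ranges), and the two options are hypothetical stand-ins, not measurements from any real project. The point is only the shape of the arithmetic: the BOM term is a single known number, while the software term is a wide range, and the cheap-hardware option tends to widen it.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """A hypothetical hardware choice; all figures are made-up assumptions."""
    name: str
    unit_bom: int         # hardware cost per unit (easy to estimate)
    dev_months_low: int   # optimistic software effort (hard to estimate)
    dev_months_high: int  # pessimistic software effort


def total_cost(opt: Option, units: int, cost_per_dev_month: int, dev_months: int) -> int:
    """Total cost = BOM for the whole run + software development cost."""
    return opt.unit_bom * units + dev_months * cost_per_dev_month


def cost_range(opt: Option, units: int, cost_per_dev_month: int) -> tuple[int, int]:
    """Best-case and worst-case total cost across the software-effort range."""
    low = total_cost(opt, units, cost_per_dev_month, opt.dev_months_low)
    high = total_cost(opt, units, cost_per_dev_month, opt.dev_months_high)
    return low, high


# Assumed inputs, chosen only to illustrate the trade-off.
UNITS = 500
COST_PER_DEV_MONTH = 10_000

cheap = Option("cheap board, custom firmware", unit_bom=5,
               dev_months_low=4, dev_months_high=12)
pricey = Option("pricier board, stock OS", unit_bom=35,
                dev_months_low=2, dev_months_high=4)

for opt in (cheap, pricey):
    low, high = cost_range(opt, UNITS, COST_PER_DEV_MONTH)
    print(f"{opt.name}: BOM {opt.unit_bom * UNITS}, total {low} to {high}")
```

Under these assumed figures the cheap board saves a fixed, visible amount on BOM, while its worst-case total is far higher because the software term dominates. Different assumptions (a much larger production run, say) can flip the result, which is exactly why the software range deserves as much attention as the BOM.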


In the specific case that nudged me into consciousness, when I had to choose a main controller for the UPSide I reached for an SBC running Unix on board rather than an Arduino-class microcontroller that requires custom firmware all the way down. Various EE types complained that my choice was overkill. I knew in my gut that they were wrong, but I had to think about it to realize why.


The smart whole-systems engineer counter-biases against the streetlight effect. One way to do that is to plan on the assumption that any software development costs you have no clear idea how to estimate are likely to blow up on you horribly. If there are hedges you can buy against that by taking a hit somewhere else (in the BOM, or even your raw revenues), it’s probably smart to go for at least some of them.


Twenty years ago I was the first to observe that making the software of your whole system open-source is an effective way to spread your development costs and mitigate the effects of your own experts moving on to other things. That’s one kind of hedge against large risks that are difficult to estimate – you’re trading away the expected benefits of collecting rent on the software’s secrecy, but (with rare exceptions) these were doubtful to begin with. Way easy to overestimate.


Using a Unix engine instead of a no-OS microcontroller or PIC in your embedded deployment is another long bet. You’ll pay for it where you can see, but the benefits in reduced costs and risks are future indefinite.


A smart systems engineer knows that he should counter-bias against the streetlight effect by making some of those long bets anyway. Sometimes this will succeed, sometimes it will fail. The only thing you know for sure is that the “safe” strategy of never long-betting at all is suboptimal, exactly because the streetlight effect messes with your judgment.

Published on February 17, 2018 17:26


Eric S. Raymond's Blog

Eric S. Raymond