If you’ve talked to me at certain times, I may have talked your ear off about some transportation accident, usually one that involved loss of life. I’ve probably told you about news I’ve read, maybe ad nauseam. If it was an older accident, I may have sent you a link to an NTSB report. And no, it’s not really for this reason.
One factor in my obsession is certainly the loss of life. These kinds of engineering, systems engineering, management and human factors failures cause the most pain to the people closest to those who died. As a secondary effect, consider the engineers and technicians whose work, or whose failure to do work, may have been a contributing factor; those people may carry guilt about an accident for the rest of their lives.
Bridge Fusion Systems has done work that supports rail transportation. Some of the work we’ve done, like the RTP-110 switch controller for the Toronto Transit Commission, gets very close to being safety critical.
Fortunately, none of the work we’ve done has ever been a factor in any transportation accident. But every time there is one of these accidents, I am compelled to understand the failures in the process that led to it so that we can learn from it.
We tell debugging stories. Some people might find them boring. The point of telling them is to pass along important information about how defects manifested themselves. Sometimes the point is how our own blindness to what the system was trying to tell us made it harder to fix the problem. In all cases, the purpose of these stories is to keep others from repeating our mistakes, to let them gain experience second-hand.
Following the news reporting or the NTSB report on an accident is like hearing a very large, usually sad debugging story. The stories sometimes cover more than just technical details: management practices, behavior and communication among technicians, and other human factors such as distraction are also included.
From these, I want to have clear ideas in my head about how things go wrong: technical faults, leadership failures, lapses of focus, distraction and confusion. As with the debugging stories, I want these ideas in my head, and in the heads of the people who work for me, of how what we do might lead to failures that could be serious. I want to be able to recognize these paths to failure at the circuit, software, system and leadership levels.
Thinking this way has affected the way we build our products, and the way we recommend products be built for our customers, even when they’re not safety critical. For instance, data logging within the product, logging everything that’s practical, has helped us find subtle bugs before they turned into customer and end-user complaints that would have been hard to track down.
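To make the data logging idea a little more concrete, here is a minimal sketch of a bounded, in-product event log, written in Python for brevity. It is not how any of our products actually do it; the `EventLog` class, its method names and the record fields are all hypothetical, chosen only to illustrate the habit of recording what happened, and when, so it can be read back later.

```python
import json
import time
from collections import deque


class EventLog:
    """Hypothetical bounded event log: keeps only the newest `capacity` records."""

    def __init__(self, capacity=1000, clock=time.time):
        # A deque with maxlen silently discards the oldest record when full.
        self._records = deque(maxlen=capacity)
        self._clock = clock  # injectable so the timestamps can be controlled

    def record(self, source, event, **fields):
        # Store a timestamp, who reported it, what happened, and any details.
        self._records.append(
            {"t": self._clock(), "source": source, "event": event, "fields": fields}
        )

    def dump(self):
        # One JSON document per record, oldest first, ready to write to a file.
        return [json.dumps(r, sort_keys=True) for r in self._records]


if __name__ == "__main__":
    log = EventLog(capacity=3)
    # The sources and events below are made up for illustration.
    log.record("input", "command_received", value=1)
    log.record("output", "state_changed", old="idle", new="active")
    for line in log.dump():
        print(line)
```

The fixed capacity is a deliberate choice in this sketch: on a small device the log cannot grow forever, so the oldest records are the ones given up.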
If you want to read more about some of the incidents that have affected our thinking, you could start here:
WMATA Collision, 2009
Amtrak Train 188, Philadelphia, 2015
Amtrak Train 501, DuPont, WA (Preliminary), 2017
EDIT: Apr 16, 2019: I’ve fixed a badly phrased sentence above and added below some events that I’ve heard have influenced people I know and respect.