Looking through some of my favorite articles of all time, I came across this jewel from 2005 - Wired News’s 10 Worst Bugs in History. I remember at the time I felt like their list was incomplete, and it has always bugged me a little bit (yes, pun intended). So I decided to do something about it. After taking a look at 20 or so of the worst software failures in history, I have compiled my own, updated, top ten list…
# 10 - Mars Climate Orbiter Crashes (1998)
A subcontractor who designed the navigation system on the orbiter used imperial units of measurement instead of the metric system that was specified by NASA.
Result - The $125 million spacecraft attempted to stabilize its orbit too low in the Martian atmosphere, and crashed into the red planet.
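A mix-up like this is easy to reproduce whenever quantities travel between systems as bare numbers. The sketch below is only an illustration, not NASA's or the subcontractor's code: the `Impulse` type, the function names, and the choice of impulse as the example quantity are all assumptions made for this example. It shows how tagging a value with its unit at the boundary keeps an imperial figure from being silently read as a metric one.

```python
from dataclasses import dataclass

# Exact conversion: one pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical wrapper that always stores the value in metric units."""
    newton_seconds: float

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert once, at the boundary, so downstream code never sees imperial values.
        return cls(value * LBF_S_TO_N_S)


def naive_total(readings):
    """Adds bare floats; nothing stops imperial and metric values from mixing."""
    return sum(readings)


def typed_total(readings):
    """Adds Impulse objects, which are metric by construction."""
    return Impulse(sum(r.newton_seconds for r in readings))


if __name__ == "__main__":
    # One reading is metric, the other was produced in pound-force seconds.
    print(naive_total([1.0, 1.0]))  # treats both as the same unit
    print(typed_total([Impulse(1.0), Impulse.from_pound_force_seconds(1.0)]))
```

With bare floats the two readings simply add up as if they shared a unit, which understates the imperial one by a factor of roughly 4.45. The typed version makes the conversion impossible to forget.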
# 9 - Mariner I Space Probe (1962)
While transcribing a handwritten formula into navigation computer code, a programmer missed a single superscript bar. This single omission caused the navigation computer to treat normal variations as serious errors, causing it to wildly overreact with corrections during launch. To be fair to the programmer, the original formula was written in pencil on a single piece of notebook paper - not exactly the best system for transcribing mission-critical information. Then again, this was 1962…
Result - 237 seconds into the mission that was supposed to send Mariner I to Venus, the spacecraft was so far off course that Mission Control had to destroy it over the Atlantic. The cost of the spacecraft was $18.2 million in 1962.
# 8 - Ariane 5 Flight 501 (1996)
NASA certainly isn’t alone in its spacecraft-destroying software bugs though. In 1996, Europe’s newest unmanned satellite-launching rocket, the Ariane 5, reused working software from its predecessor, the Ariane 4. Unfortunately, the Ariane 5’s faster engines exposed a bug that had never surfaced in previous models. In essence, the software tried to cram a 64-bit number into a 16-bit space. The resulting overflow conditions crashed both the primary and backup computers (which were both running the exact same software).
Result - 36.7 seconds into its maiden launch, the self-destruct safety mechanism was activated due to the computer failures, and the spacecraft disintegrated in a spectacular fireball. The Ariane 5 had cost nearly $8 billion to develop, and was carrying a $500 million satellite payload when it exploded.
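To make the "64-bit number into a 16-bit space" problem concrete, here is a minimal sketch in Python. It is not the Ariane flight code; the function names are hypothetical, and the wraparound shown is just one common way an unchecked narrowing conversion can go wrong. The point is that a value that always fit on the older rocket can quietly stop fitting when the inputs get bigger, unless the conversion checks its range.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Mimic a raw narrowing conversion: keep only the low 16 bits (two's complement)."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw


def to_int16_checked(value: float) -> int:
    """Refuse values that do not fit instead of silently corrupting them."""
    number = int(value)
    if not INT16_MIN <= number <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return number


if __name__ == "__main__":
    print(to_int16_unchecked(1234.9))    # small values survive the conversion
    print(to_int16_unchecked(40000.0))   # larger values come back as garbage
    try:
        to_int16_checked(40000.0)
    except OverflowError as err:
        print("rejected:", err)
```

A range check alone is not a full fix, since something still has to handle the rejected value sensibly, but it turns silent corruption into an error that can be dealt with.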
# 7 - EDS Fails Child Support (2004)
In 2004, software giant EDS introduced a large, complex IT system to the U.K.’s Child Support Agency (CSA). At the exact same time, the Department for Work and Pensions (DWP) decided to restructure the entire agency. The restructure and the new software were completely incompatible, and irreversible errors were introduced as a result. With over
still reported as open in the new system, the clash of the two events has crippled the CSA’s network.
Result - The system somehow managed to overpay 1.9 million people and underpay another 700,000. It left $7 billion in uncollected child support payments, a backlog of 239,000 cases, and 36,000 new cases “stuck” in the system, and it has cost UK taxpayers over $1 billion to date.
# 6 - Soviet Gas Pipeline Explosion (1982)
When the CIA (allegedly) discovered that the Soviet Union was (allegedly) trying to steal sensitive U.S. technology to operate its trans-Siberian pipeline, CIA operatives (allegedly) introduced a bug into the Canadian-built system that would pass Soviet inspection but fail when in operation.
Result - The largest non-nuclear explosion in the planet’s history. And a new-found respect (fear?) of the CIA.
# 5 - Black Monday (1987)
On October 19, 1987, a long-running bull market was halted by a rash of SEC investigations of insider trading. At the time, computer trading models were (and still are) common in the trading market, and most had triggers in place to sell stocks if their value dropped to a certain point. As investors began to dump stocks affected by the investigations, prices dropped, causing the computer triggers to kick in. The flood of computer-issued sell orders, coupled with investor liquidation, overwhelmed the market and caused multiple systems to crash. This in turn triggered even more automated sell orders, and panic quickly set in. Investors were selling blind worldwide, stocks were virtually liquidated, and market values plummeted.
Result - Technically beginning in Hong Kong (where markets opened first), the crash had worldwide implications. The impact in the US was devastating. The Dow Jones Industrial Average plummeted 508 points, losing 22.6% of its total value. The S&P 500 dropped 20.4%. This was the greatest loss Wall Street ever suffered in a single day.
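The feedback loop behind automated sell triggers can be shown with a toy model. This is a deliberately simplified sketch, not a reconstruction of any real trading system: the function name, the thresholds, and the fixed price impact per sale are all made-up assumptions. Each automated seller has a trigger price, and every sale pushes the price down a little further, which may set off the next seller.

```python
def simulate_cascade(price, triggers, impact_per_sale, initial_shock):
    """Return (final_price, sellers_fired) for a toy sell-trigger cascade.

    triggers: trigger prices, one per hypothetical automated seller.
    impact_per_sale: assumed price drop caused by each automated sale.
    initial_shock: the first price drop, caused by human selling.
    """
    price -= initial_shock
    pending = sorted(triggers, reverse=True)  # highest trigger fires first
    fired = 0
    while pending and price <= pending[0]:
        pending.pop(0)            # this seller's trigger has been hit
        fired += 1
        price -= impact_per_sale  # the sale itself pushes the price lower
    return price, fired


if __name__ == "__main__":
    triggers = [98, 96, 94, 80]
    # A small dip stays above every trigger and nothing happens.
    print(simulate_cascade(100, triggers, impact_per_sale=2.0, initial_shock=1.0))
    # A slightly bigger dip hits the first trigger, and each sale sets off the next.
    print(simulate_cascade(100, triggers, impact_per_sale=2.0, initial_shock=2.0))
```

In this toy run, one extra point of initial selling is the difference between no automated sales at all and a chain of three, which is the kind of amplification the triggers produced once prices started to slide.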
# 4 - Therac-25 Medical Accelerator (1985)
The Therac-25 was a radiation therapy device built by Atomic Energy of Canada Limited (AECL) and CGR of France. It could deliver two different kinds of radiation therapy: either a low-power electron beam (beta particles) or X-rays. Unfortunately, the operating system used by the Therac-25 was designed and built by a programmer who had no formal training. The OS contained a subtle race condition, and because of it a technician could accidentally configure the Therac-25 so the electron beam would fire in high-power mode without the proper patient shielding.
Result - In at least six incidents (with more suspected), patients were accidentally administered lethal or near-lethal doses of radiation - approximately 100 times the intended dose. At least five deaths are directly attributed to it, with others seriously injured.
# 3 - Multidata Systems (2000)
Another medical system makes the list, this time at the National Cancer Institute in Panama City. This one is a combination of a software bug and user error. A U.S. firm, Multidata Systems International, created therapy planning software that was designed to calculate the proper dosage of radiation for patients undergoing radiation therapy. The software allowed a radiation specialist to draw on their screen where they would be placing metal shields (called “blocks”) on the patient during treatment. These blocks protect healthy tissue from the radiation. The software itself only allowed the placement of four blocks, but the Panamanian doctors normally used five. To get past the limitation, the doctors decided to trick the software by drawing all five blocks as a single block with a hole in the middle. Unfortunately, a bug in the Multidata software caused it to give different results depending on how the hole was drawn. Draw it one way and the dosage was correct. Draw it in the other direction and the software recommended twice the correct dosage.
Result - At least eight patients died, while another 20 received overdoses likely to cause significant health problems. The physicians, who were legally required to double-check the computer’s calculations by hand, were indicted for murder.
# 2 - Patriot Missile Bug (1991)
During the first Gulf War, an American Patriot Missile system was deployed to protect U.S. troops, allies, and Saudi and Israeli civilians from Iraqi SCUD missile attacks. A software rounding error in one of the early versions of the system incorrectly calculated the time, causing it to ignore some of the incoming targets.
Result - A Patriot Missile Battery in Saudi Arabia failed to intercept an incoming Iraqi SCUD. The missile destroyed an American Army barracks, killing 28 soldiers and injuring around 100 other people.
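A tiny rounding error in a clock only matters because it is added up over and over. The sketch below follows the commonly cited analysis of this bug rather than anything stated above, so treat its details as assumptions: time counted in tenths of a second, and the value 0.1 chopped to 23 binary fraction bits. The function names are hypothetical. One tenth has no exact binary representation, so every tick is short by a sliver, and the shortfall grows with uptime.

```python
from fractions import Fraction

FRACTION_BITS = 23  # assumption: fraction bits available to store 0.1


def truncated_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """The value 0.1 chopped (not rounded) to the given number of binary fraction bits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours_up: int, bits: int = FRACTION_BITS) -> float:
    """Accumulated timing error after counting tenth-of-a-second ticks for hours_up hours."""
    ticks = hours_up * 3600 * 10
    error_per_tick = Fraction(1, 10) - truncated_tenth(bits)
    return float(ticks * error_per_tick)


if __name__ == "__main__":
    for hours in (1, 8, 100):
        print(f"{hours:>3} h uptime -> clock off by {clock_drift_seconds(hours):.4f} s")
```

Under these assumptions the clock is off by about a third of a second after 100 hours of continuous operation. That sounds harmless, but for a target moving well over a kilometer per second it is enough to look in the wrong patch of sky.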
# 1 - World War III… Almost (1983)
Have you ever seen the movie WarGames? Nobody knew at the time how closely this movie mimicked a real-life near-disaster in the same year. In 1983, Soviet early warning satellites picked up sunlight reflections off cloud tops and mistakenly interpreted them as missile launches from the United States. Software was in place to filter out false missile detections of this very nature, but a bug in the software let the alerts through anyway. The Soviet system instantly sent priority messages up the chain saying that the United States had launched five ballistic missiles. Protocol in such an event was to respond decisively, launching the entire Soviet nuclear arsenal before any U.S. missile detonations could disable their response capability. The duty officer for the system, one Lt. Col. Stanislav Petrov, intercepted the messages and flagged them as faulty, stopping the near-apocalypse. He said that he had a “funny feeling in my gut” about the attack, and reasoned that if the U.S. were really attacking, it would launch more than five missiles.
Result - Thankfully, nothing. However, the world was literally minutes away from “Global Thermonuclear War”. Any retaliatory missile launched by the Soviets would have triggered a like response from the U.S., eventually leading to a total launch of all systems from both sides. (Like W.O.P.R., I would have much preferred a nice game of chess…)
# Honorable Mention #1 - LA Airport Flights Grounded (2007)
A single faulty piece of embedded software, on a network card, sends out faulty data on the United States Customs and Border Protection network, bringing the entire system to a halt. Nobody is able to leave or enter the U.S. from the LA Airport for over eight hours.
Result - Over 17,000 passengers stranded for the duration of the outage.
# Honorable Mention #2 - The Ping of Death (1995)
A lack of error handling in the IP fragmentation reassembly code makes it possible to crash many Windows, Macintosh, and Unix operating systems by sending a malformed “ping” packet from anywhere on the Internet.
Result - The blue screen of death and giggling teenage hackers all over the nation.
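The missing check is simple to state: an IPv4 packet can be at most 65,535 bytes, because its total-length field is 16 bits, yet a fragment's offset plus its payload can point past that limit. The sketch below is a minimal illustration, not code from any real network stack. The function names are hypothetical, and it assumes a 20-byte header with no options. Fragment offsets are expressed in 8-byte units, as they are in the IPv4 header.

```python
MAX_DATAGRAM = 65535  # largest size the 16-bit IPv4 total-length field allows
HEADER_LEN = 20       # assumption: IPv4 header without options


def reassembled_size(fragments):
    """Size of the packet once reassembled.

    fragments: iterable of (offset_units, payload_len) pairs,
    where offset_units is the fragment offset in 8-byte units.
    """
    return HEADER_LEN + max(off * 8 + length for off, length in fragments)


def safe_reassemble(fragments):
    """Reject fragment sets that would overrun a maximum-size reassembly buffer."""
    size = reassembled_size(fragments)
    if size > MAX_DATAGRAM:
        raise ValueError(f"reassembled packet would be {size} bytes, limit is {MAX_DATAGRAM}")
    return size


if __name__ == "__main__":
    print(safe_reassemble([(0, 1480), (185, 1480)]))  # an ordinary two-fragment packet
    try:
        # The last fragment starts near the end of the legal range and runs past it.
        safe_reassemble([(0, 1480), (8189, 1480)])
    except ValueError as err:
        print("dropped:", err)
```

Code that skips this comparison and copies each fragment into a fixed 65,535-byte buffer writes past the end of that buffer, which is how one oversized ping could take a whole machine down.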