bitHound Blog

The Bug Blog: Calculation and Code Crashes from the 90s

If you look up the definition of a software bug, you will find that it goes by many names: error, mistake, flaw, failure, anomaly, exception, crash, defect, incident and side effect. Let’s take a look at a few of the crashes and mistakes that made headlines in the 90s.

1. 1990 - The AT&T Network Crash

On January 15th, 1990, 60,000 AT&T customers lost telephone service completely. During the nine long hours of frantic effort that it took to restore service, 75 million telephone calls went uncompleted, costing AT&T over $60 million in lost call revenue. This does not take into account the revenue lost by the airline reservation desks, car rental agencies, hotels, theaters, restaurants and other businesses that relied on AT&T’s network. In the end, the crash was traced to a single line of code written in C. The error was a break statement (line 10 in the pseudocode below) located within an if clause that was itself nested within a switch clause. In C, break leaves the nearest enclosing switch or loop, not the enclosing if, so this break jumped out of the whole switch and skipped the message processing on line 11.

In pseudocode, the program reads as follows:

1  while (ring receive buffer not empty 
          and side buffer not empty) DO
2    Initialize pointer to first message in side buffer
      or ring receive buffer
3    get copy of buffer
4    switch (message)
5       case (incoming_message):
6             if (sending switch is out of service) DO
7                 if (ring write buffer is empty) DO
8                     send "in service" to status map
9                 else
10                    break // caused the error
                  END IF
11           process incoming message, set up pointers to
             optional parameters
12           break
        END SWITCH
13   do optional parameter work
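
The control flow above can be reproduced in a few lines of C. This is only a minimal sketch for illustration, not AT&T's code: the enum, the function name handle_message and its boolean parameters are all hypothetical stand-ins for the conditions named in the pseudocode.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical message kinds, invented for this illustration. */
enum message_kind { INCOMING_MESSAGE, OTHER_MESSAGE };

/* Returns true if the "process incoming message" step was reached. */
bool handle_message(enum message_kind kind,
                    bool sender_out_of_service,
                    bool write_buffer_empty)
{
    bool processed = false;

    switch (kind) {
    case INCOMING_MESSAGE:
        if (sender_out_of_service) {
            if (write_buffer_empty) {
                puts("send \"in service\" to status map");
            } else {
                /* break binds to the nearest enclosing switch or loop,
                   never to an if, so this leaves the whole switch. */
                break;
            }
        }
        /* Skipped whenever the break above is taken. */
        processed = true;
        break;
    default:
        break;
    }

    return processed;
}
```

Calling handle_message(INCOMING_MESSAGE, true, false) returns false: the break exits the switch before the processing step runs, which mirrors the path through lines 6 to 10 of the pseudocode.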

Source: The New York Times

2. 1994 - The Intel Pentium FDIV bug

Because of a flaw in the division logic of its floating-point unit, Intel’s Pentium processor returned incorrect results for certain floating-point divisions, a problem for the precise calculations needed in fields like math and science. Thomas Nicely, a math professor, discovered the bug and made it public. It then emerged that Intel already knew about the flaw and planned to replace the problem chips only on request, for users who could prove they were affected. This made widespread news and upset a number of users, who demanded a refund or replacement. It is said that the incident cost Intel over $400 million.

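The bug can be checked manually by performing the following division in nearly any application. A flawed Pentium returns a result that is wrong from the fifth significant digit onward, while unaffected hardware, including my Android phone’s calculator, returns the correct value: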

4195835.0/3145727.0 = 1.333 820 449 136 241 000  (Correct value)

4195835.0/3145727.0 = 1.333 739 068 902 037 589  (Flawed Pentium)

4195835.0/3145727.0 = 1.333 820 45 (Android phone)
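
The same check can be written as a short C sketch. The function names here are hypothetical, chosen for this example. The residual x - (x / y) * y should be essentially zero on correct hardware; going by the two values quoted above, a flawed Pentium would be off by roughly 256.

```c
#include <stdbool.h>

/* Performs the well-known test division at run time. */
double fdiv_quotient(void)
{
    /* volatile keeps the compiler from folding the division at compile
       time, so the divide is executed by the processor under test. */
    volatile double x = 4195835.0;
    volatile double y = 3145727.0;
    return x / y;
}

/* Absolute value of x - (x / y) * y for the same operands. */
double fdiv_residual(void)
{
    volatile double x = 4195835.0;
    volatile double y = 3145727.0;
    double r = x - (x / y) * y;
    return r < 0.0 ? -r : r;
}

/* True if the residual is far larger than ordinary rounding error. */
bool looks_like_flawed_fdiv(void)
{
    return fdiv_residual() > 1.0;
}
```

The threshold of 1.0 is an arbitrary choice for this sketch: it sits well above normal double-precision rounding error and well below the error a flawed chip would produce.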

Source: Wikipedia

3. 1996 - The Ariane 5 Explosion

On June 4th, 1996, the Ariane 5 rocket lifted off and exploded 39 seconds later. The explosion cost the space agency over 10 years of research, over 7 billion dollars in development and half a billion dollars of cargo. The explosion was caused by a computer error. The rocket's computer was converting a 64-bit floating point number into a 16-bit signed integer. Approximately 36.7 seconds into the flight, it encountered a number larger than 32,767, which is the largest value a 16-bit signed integer can hold. This caused the conversion to fail, leading to the explosion a few seconds later.
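
To make the failure mode concrete, here is a minimal sketch in C of a range-checked conversion. It is an illustration only, not the flight software, and the function names are hypothetical. In C, converting a floating point value that does not fit into the target integer type is undefined behavior, so the range has to be tested before the cast.

```c
#include <stdbool.h>
#include <stdint.h>

/* Converts value to int16_t only if it is within range.
   Returns false, leaving *out untouched, when it does not fit. */
bool to_int16_checked(double value, int16_t *out)
{
    /* Written as a negated range test so that NaN is rejected too:
       every comparison involving NaN is false. The upper bound is
       conservative, so 32767.5 is rejected as well. */
    if (!(value >= INT16_MIN && value <= INT16_MAX)) {
        return false;
    }
    *out = (int16_t)value; /* truncates toward zero */
    return true;
}

/* Alternative policy: clamp out-of-range values to the nearest limit. */
int16_t to_int16_saturating(double value)
{
    if (value != value) {
        return 0; /* NaN: pick a defined fallback */
    }
    if (value > INT16_MAX) {
        return INT16_MAX;
    }
    if (value < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)value;
}
```

Whether to reject the value or to saturate it is a design decision that depends on what the caller can safely do next. The point of the sketch is only that the out-of-range case is handled explicitly.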

Source: Wikipedia
