Thursday, February 19, 2004

Feb 19 2004

Okay, everyone, gather around! It’s story time.

The simulator where I work has to recognize NavAids, which (from what I understand) are basically transmitters tucked into the ground at various points around the globe. When flying, the plane monitors the distance to each of these NavAids, and because it knows the actual location of those NavAids, it can triangulate with them to find out where the plane actually is.

Except that sometimes the NavAids don’t show up when the simulator is running. Thus, our intrepid programmer Chris was brought in.

Chris found that our simulator has a list of NavAids. When the simulator starts up, it reads the NavAids list from a file into memory, then as it flies, it uses that list to find the NavAids around it. Or, it’s supposed to; clearly, that wasn’t happening quite right.

Chris looked at the file which contained the NavAids, and discovered that they were not sorted. Or, at least, it seemed that way. After a bit of digging, he found out that the file’s structure had never been documented, so some poor soul had had to figure out how to generate those files essentially blind. And that soul had never figured out what order the NavAids were supposed to be in. Dead end.

So Chris looked at the code which searches the NavAids list for a NavAid. To find the NavAid, it takes the NavAid’s frequency (f) and the number of NavAids in the list (n) and performs the following calculation to get the NavAid’s number:

(f - n * (f/n)) * 3

Don’t run screaming into the night; I don’t like algebra either, but this is very straightforward.

Okay. Let’s look at the middle of this calculation, which is the part calculated first:

n * (f/n)

Any number is the same as that number divided by one, so the above expression is the same as…

(n/1) * (f/n)

…which is the same as…

(n * f) / n

n divided by n is 1, so we can safely eliminate it, to get…

f

Now, let’s plug that back into the full formula:

(f - f) * 3

Any number minus itself equals? Zero! 5-5=0, 27-27=0, etc. So, this complicated formula always evaluates to 0. Well, 0 times 3. Which is always zero.

But wait. It gets better.

Chris figured that maybe this was all a mistake in coding the formula. This C code was originally written in FORTRAN, so he looked at the original FORTRAN code and found the exact same formula. Weeeeird.

After banging his head against this problem for a while, Chris started talking about it to some co-workers, one of whom, John, had been working with FORTRAN for years. John looked at the code and almost immediately said, “I know why it’s doing that.”

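It turns out that FORTRAN’s division has a property that matters here: when both numbers are integers, the result is truncated to a whole number, and the fractional part is thrown away. Multiply that truncated result back by n, subtract it from f, and what’s left over is the remainder of the division, not zero. This formula was set up in exactly that way.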

Suddenly, everything came together in Chris’ mind. The original FORTRAN programmers had stored the NavAids in the list using the remainder of the division of the NavAid’s frequency by the number of items in the list. Then they’d used this hack to calculate that number and retrieve the appropriate NavAid from the list. Without explaining this anywhere.
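
To make that scheme a little more concrete, here’s a rough sketch in C of a list laid out by remainder. Everything named here (the `NavAid` struct, `navaid_slot`, `navaid_store`, `navaid_find`) is made up for illustration; the real simulator’s data layout was never documented. The sketch also ignores what happens when two frequencies land in the same slot, and it leaves out the “times 3” from the formula, since the story doesn’t say what that was for.

```c
#include <stddef.h>

/* Hypothetical record; the real file format is unknown. */
typedef struct {
    int frequency;      /* assumed to be a positive integer */
    const char *name;
} NavAid;

/* Slot = remainder of frequency divided by the number of slots. */
static size_t navaid_slot(int frequency, int count) {
    return (size_t)(frequency % count);
}

/* Put an entry into the slot its frequency maps to.
   Collisions are not handled in this sketch. */
static void navaid_store(NavAid *table, int count, NavAid entry) {
    table[navaid_slot(entry.frequency, count)] = entry;
}

/* Look in the slot the frequency maps to; return NULL if the
   entry stored there has a different frequency. */
static const NavAid *navaid_find(const NavAid *table, int count, int frequency) {
    const NavAid *candidate = &table[navaid_slot(frequency, count)];
    return candidate->frequency == frequency ? candidate : NULL;
}
```

The point is that the lookup only works if every entry was stored in the slot the same formula computes. Fill the list in any other order and `navaid_find` looks in the wrong place.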

When the code was migrated from FORTRAN to C, the formula was copied over exactly. And it compiled and worked perfectly. But the C compiler happily calculated the actual quotient, not the remainder. The hack no longer worked, so the formula always equalled zero.
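
For the curious, here’s a minimal C sketch of the difference. It assumes the ported code ended up doing the division on floating-point values; the story doesn’t say how the C version declared its variables, and with plain `int` operands C truncates too. The function names are invented for this example.

```c
/* The formula with integer operands: f / n is truncated, so
   f - n * (f / n) is the remainder of f divided by n. */
static int slot_with_integer_division(int f, int n) {
    return (f - n * (f / n)) * 3;
}

/* The same formula with floating-point operands: f / n keeps its
   fractional part, so n * (f / n) is f again (up to rounding) and
   the whole thing collapses to roughly zero. */
static double slot_with_real_division(double f, double n) {
    return (f - n * (f / n)) * 3.0;
}
```

With a made-up frequency of 1173 and a list of 100 entries, the integer version gives (1173 - 100 * 11) * 3 = 219, while the floating-point version gives zero or something vanishingly close to it. In C, the remainder can also be written directly as `f % n`, which says what it means.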

Meanwhile, because the file was now being generated by a program that didn’t know how to order the NavAids, they weren’t being put into the list in the order that the formula expected them to be.

So finally Chris understood the problem, and went back to fix the code.

The moral of this story? Ask for help. Your co-workers can often save you days’ or weeks’ worth of pain and suffering.

And never use hacks.


I work for Amazon. The content on this site is my own and doesn’t necessarily represent Amazon’s position.