We're all missing the point on computer security

Computer security is a weird thing. Most people worry about it, at least a little. A lot of companies make whole heaping piles of money peddling software and hardware that purports to finally make all of your data completely and impregnably secure. The government is busy producing new laws that provide a legal framework for the guarantee of security in a networked environment.

But I can't help but feel that somehow everybody is missing the point. Consumers insist on encryption for the actual transfer of a credit card number but turn around and hand their password (with the associated access to the credit card data for the account) to the first person who instant-messages them claiming to be an AOL employee who needs to verify their password. Businesses spend a lot of money on expensive firewalls and sophisticated monitoring software, but then forget to change default passwords on third-party software that they install. Software vendors provide products that are capable of assisting in securing a network, but then make the default configuration as open to penetration as any '60s free-love flower child. The left hand hasn't the vaguest notion of what the left wrist is doing, much less the right hand.

It doesn't matter though, because all of them are worrying about entirely the wrong things. I don't care if a box gets rooted and my company's webpage screams Fuck Authority for a few hours. We have a solid enough backup system that fixing that would require nothing more than scrubbing the compromised disks and then running dd to copy a known-good image. It doesn't matter if you open an email with a nasty virus that blows away all of your software. Yeah, it's a pain to have to re-install all of your applications, but you won't even remember it in a year. Who gives a shit if some high-profile company gets hit with a denial-of-service attack, even if they're down for a whole day? Let's be honest: these Internet behemoths have a lot to do to prove that they have viable businesses and, in the end, no single 24-hour period is going to make the decision.

No, what scares me is that the only applications where any thought is being given to security belong to the class of user-visible applications. But most of the applications that need to be secured belong to the class of decision-support applications. These are the invisible applications that today's corporations, and tomorrow's individuals, use to provide them with the fundamental information upon which they make decisions about how to interact with the world.

These applications are tremendously complex, and the vulnerabilities in them are often exceedingly subtle, but they represent a far greater threat than any vulnerability seen yet in any email program. Perhaps an example is in order.

I worked for an Internet company that, internally, took this whole information revolution thing very seriously. We had a much higher ratio of certifiable net.fanatics to normal people than any place I've ever seen in my entire life.¹ Needless to say, we had a lot of software that had been developed internally that we used to make all sorts of important business decisions--the kind with consequences measured in the millions of dollars--every day.

We trusted these systems. We wrote them ourselves, and all of them were actually extensively reviewed by many engineers at every phase of the design and implementation. We were careful; we knew that if these systems went haywire it could cause real problems. If the wrong set of systems decided to go bad, it could literally kill the company.

At least one of the systems went haywire anyway. In spite of the very best efforts of dozens of seriously hard-working programmers, bugs crept in. They always do; it's inevitable in any real-world environment. Databases drift--columns are added; tables are added; as years pass, different applications begin to think that one field means different things. Source skews--work-arounds are made for system-specific deficiencies, but aren't documented as such and are thus preserved through ports to new systems; implicit assumptions creep in; bizarre dependencies begin to appear.

In our case, it was a system that helped control the amount of certain products that were ordered from our suppliers based on the way that different customers were using rebates. We called the system Fred (it was an inside joke among some of the engineers). We wanted Fred to make sure that we were adequately stocked to cover whatever response we might see to our various marketing initiatives. Of course, it fed that data into several other analytic systems that were used to drive all sorts of marketing activities and purchasing decisions.

Fred worked fine at first, and actually helped make a significant difference in the company's financial results the first quarter it was available. After that, it performed above expectations for a long time before finally settling down to perform at or just below the rest of our systems--or at least it did according to our metric software, also developed internally.

In reality, Fred worked great for about 18 months. Then it broke when another application stopped writing the particular piece of data it was dependent upon to the database. Apparently the fact that that column had been written had just been a side effect of the way that the other application handled a particular part of each customer-record update. When that application was updated, Fred could no longer find the piece of data it needed.

But, Fred did the Right Thing. The system promptly reported via email to several people that something was wrong and what it was. The next day the engineer who now owned that part of Fred (Fred having grown large enough that it took several engineers to care for him) promptly investigated, quickly ascertained the cause of the problem, and immediately fixed Fred to simply look the data up elsewhere. Because he actually cut several dozen lines of code that had done the previous look-up (there was a more recent database API available that offered easier access) he felt good that he had reduced the overall complexity of the system by just a little bit. And hey, incremental gain is better than no gain at all.

Fred now appeared to work again. A few months passed, the programmer who had made the fix decided to retire on his stock-option millions, a new programmer took over that part of Fred, and life went on. But Fred was now very quietly doing something very horrible.

A particular counter was being used by both Fred and another customer analysis program. One of them pushed the counter up while the other pushed the counter down. Unfortunately, when Fred had been updated, one of the things that had changed was the exact timing of the counter increment.

Another part of Fred assumed that the counter had already been updated and scribbled all over several internal data structures, preparing for the next update. But under some inputs this wasn't the case. Under some inputs the counter was never updated.

The result wasn't massive breakage--no, everything looked like it worked just fine. But every time Fred ran, some essentially random segment of his data store was corrupted. Not in a major way, but in a way that made a lot of the statistics and reports that Fred produced a little wrong.
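To make that failure mode concrete, here is a minimal Python sketch of the general idea, not Fred's actual code. Every name in it (store, prepare_next_update, StaleCounterError) is hypothetical, and it assumes the caller knows what value the shared counter ought to have. The point is only that an assumption about ordering can be checked explicitly, so that a violated assumption stops the run instead of quietly scribbling over good data.

```python
class StaleCounterError(RuntimeError):
    """Raised when the shared counter is not where the caller assumed."""


def prepare_next_update(store, expected_counter):
    # Hypothetical guard: verify the ordering assumption instead of
    # trusting that the counter update has already happened.
    if store["counter"] != expected_counter:
        raise StaleCounterError(
            f"counter is {store['counter']}, expected {expected_counter}; "
            "refusing to reset working data"
        )
    # Only clear the working data once the counter has really moved.
    store["working"] = {}


# Made-up state: the counter update was skipped, so it is still 7.
store = {"counter": 7, "working": {"sku-1": 40}}
try:
    prepare_next_update(store, expected_counter=8)
except StaleCounterError as err:
    print("halted:", err)
```

A check like this does nothing about the skipped update itself; it just turns silent corruption into a loud failure that somebody has to look at.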

These statistics and reports, of course, were then used as input to other decision control systems. The reports produced by these systems were, in turn, fed into still more systems.

We finally discovered the problem when some of our business people came to the IT people to say that the computer's purchase recommendations weren't making any sense. Apparently they hadn't been making sense for a very long time, but the mid-level marketing people who were using Fred trusted his data more than they trusted what they read elsewhere in the media, more than what focus groups said, even more than what they themselves thought was going to happen in our industry. More than once I heard someone say, "I thought for sure that Fred would say to spend more marketing in FOO than in BAR, but then he came back and said, 'Nope, BAR is better.'" It should have tipped me off, but it didn't--I trusted Fred, too.

We didn't believe there was a problem, of course, but we were more than willing to take a look. Over the next day or two we hacked up some more metrics that fetched their data in a completely different way and performed their calculations completely differently. That's the only way to debug software like this: you have to compare it to something that is known to be good. We didn't have anything comparable, so we rewrote the reporting (but not the analysis) portions of Fred.
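The comparison itself can be very simple. What follows is a rough Python sketch under stated assumptions, not what we actually ran: the function name find_disagreements, the one-percent tolerance, and the sample figures are all invented for illustration. It takes two reports computed by independent paths and lists the entries where they disagree by more than the tolerance, or where one report has an entry the other lacks.

```python
import math


def find_disagreements(primary, independent, rel_tol=0.01):
    """Return the sorted keys where the two reports disagree.

    A key is suspect if it is missing from either report or if the two
    values differ by more than rel_tol (relative tolerance).
    """
    suspect = []
    for key in sorted(set(primary) | set(independent)):
        if key not in primary or key not in independent:
            suspect.append(key)
        elif not math.isclose(primary[key], independent[key], rel_tol=rel_tol):
            suspect.append(key)
    return suspect


# Invented numbers: one report from the existing system, one recomputed
# by a separate code path that fetches its data differently.
fred_report = {"FOO": 1200.0, "BAR": 4100.0, "BAZ": 300.0}
recomputed = {"FOO": 1200.0, "BAR": 2650.0, "BAZ": 301.0}

print(find_disagreements(fred_report, recomputed))  # prints ['BAR']
```

A check like this is only as good as the independence of the second path. If both computations share the same corrupted data store, they will happily agree with each other.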

It didn't take long for it to become crystal clear that something was very wrong. We suspended usage of Fred and of every system that depended upon Fred. This was tremendously painful; people had to be laid off because we didn't have any software for them to use. We set about trying to count just how much this had cost us.

In the end it wasn't pretty. We had directly lost nearly a half-billion dollars as inventory that we'd never be able to sell had to be written off. We couldn't afford to lose a half-billion dollars; we didn't have it.

And that was only the direct cost. We never bothered to count how much we incurred in indirect costs from the systems using Fred as input data. We were obviously out of business by then.

The real bitch of it, though, is that I think we may have been attacked. I think that our company--1,000 people paying their bills, taking care of their kids, and just trying to live their lives--could have been murdered by our biggest competitor.

One of the first things we discovered when we began investigating was that the bug was dependent upon data supplied at our website by our users. In other words, it was certifiable, grade-A untrustworthy data. And we did all of the right things with it; we were careful not to overflow our buffers; we didn't execute anything; we just loaded it into a database.

But if someone had a copy of our source code they'd have been able to exploit the bug. It would have been trivial to put together a little bot using Perl to make an arbitrary number of requests per day into our systems with bad data.

We had gotten used to our competitors looking at our site continuously, day in and day out. We even thought it was a good idea, and we looked at them, too. About a year ago, our biggest competitor had vastly stepped up the average number of requests per day into our system. This corresponded to a period during which they rolled out several dozen features that we had that they didn't yet. We just assumed that the increased requests were the result of them reverse-engineering our features. We played with them for a while, blocking their requests based first on domain, then on IP block, but we weren't serious about it and they kept looking at us.

Now I wonder, though. The increase in activity corresponds to when Fred first started to go out of control. I don't think it would have been too hard to come up with a copy of our source code--like most rapidly growing companies we had a tremendous inflow and outflow of contractor programmers and DBAs. If our competitor had gotten a copy of our source they could have masked the attack in the greater volume of requests that they were making for the reverse engineering.

Anyway, that doesn't matter now. What does matter is that our security concerns are completely out of whack. We spend too much time worrying about superficial system security and interface security and not enough time worrying about the security that matters. And in the near future we'll begin adopting personal decision support devices.

I doubt we'll ever learn.

¹ This is called corporate culture and is actually very fascinating, but not in an article about computer security.

endless thanks to sirnonya for proofreading and editing help.
