29 August 2007
The Electronic Frontier Foundation has obtained documents on the FBI's DCS-3000 system. They're nicely summarized in a Wired story. In addition, Matt Blaze has written about technical weaknesses in the wiretap technology used. I won't repeat what they've done so well.
I'm concerned about a longer-term issue: I don't think the FBI really understands computer security. More precisely, while parts of the organization seem to, the overall design of the DCS-3000 system shows that when it comes to building and operating secure systems, they just don't get it.
The most obvious example is the account management scheme described in the DCS-3000 documents: there are no unprivileged userids. In fact, there are no individual userids; rather, there are two privileged accounts. Each has different powers; however, as the documents themselves note, each can change the other's permissions to restore the missing abilities. Where is the per-user accountability? Why should ordinary users run in privileged mode at all? The answers are simple and dismaying.
Instead of personal userids, the FBI relies on log sheets. This may provide sufficient accountability if everyone follows the rules. It provides no protection against rule-breakers. It is worth noting that Robert Hanssen obtained much of the information he sold to the Soviets by exploiting weak permission mechanisms in the FBI's Automated Case System. The DCS-3000 system doesn't have proper password security mechanisms, either, which brings up another point: why does a high-security system use passwords at all? We've known for years how weak they are. Why not use smart cards for authentication?
We can't even rely on just the log sheets: the systems support remote access, via unencrypted telnet.
Any security specialist will tell you that this design is a recipe for disaster. Indeed, the FBI's own security audit, as documented in the released documents, makes some of these very points. The problem is that the system was misdesigned in the first place.
There's another side to the problem, though: worries about threats that aren't particularly serious. The CI-100 component — a so-called "data diode" for moving data between different classification levels — is built from two Windows machines that are required to have anti-virus software. Why? These machines are forwarding data at the packet level. They are not receiving email, browsing the web, serving users, etc. Where will virus infections come from? It's not that it hurts to have anti-virus software, but requiring it makes me wonder how good the threat analysis is. And if viruses are a threat, why are Windows boxes used? It's not that other systems are necessarily more secure (though I can make a good case for that); however, viruses simply aren't a real-world threat. Furthermore, generic Windows machines are notoriously hard to lock down. Yes, there's some benefit to using a familiar platform, but this is a very specialized need.
My biggest concern, though, lies in the words of one of the FBI's own security evaluations: the biggest threat is from insiders. The network is properly encrypted for protection against outside attackers. The defenses against insiders — yes, rogue FBI agents or employees — are far too weak.
To sum up: we have a system that accesses very sensitive data, with few technical protections against inside attacks, and generic defenses that don't seem to fit the threat model.
Update: The Department of Justice's Office of the Inspector General has released a new report on the FBI's vulnerability to espionage from within. The report points out continuing serious problems with the Bureau's Automated Case Support (ACS) system, and calls for (among other things) "a third-party audit program to detect and give notice of unauthorized access to sensitive cases on a real-time basis". You can't do that with manual log sheets.
28 August 2007
It seems that the press reports I mentioned about the Amtrak outage were not correct. A later report asserts that the problem was with a circuit breaker panel, and that the delay in restoring service was because of how long it took to get the new part.
The new article confirms, though, that Amtrak did have plans for passenger service during such an outage. That's excellent, even if some station agents were unaware of the backup procedure.
26 August 2007
While I was getting ready to leave on a trip this morning, my wife heard a brief radio report: Amtrak was having problems with its ticketing system; as a result, there were long lines at train stations. We were unable to get any more information, so we left the house 30 minutes early to get to the station.
As it turned out, by that point the ticketing system had been down for almost 24 hours. From what the ticket clerk told me, it had failed early Saturday morning, came back a few hours later, then failed hard around 1:30 PM EDT. He had no information on what the problem was or when the system would be back.
The lines came about because at many stations, agents were hand-writing tickets. I was spared that: I was told to board the train without a ticket, and simply give the conductor my reservation number. Passengers who didn't know theirs were told to call up to get it; given how crowded the phone lines were, it's not clear to me that that would have worked very well. In any event, I had mine. (Amtrak reservation numbers are six hexadecimal digits…) The conductor went through the train asking people in my situation to write down their name and reservation number; presumably, they'll follow up with me somehow.
The lack of communication by Amtrak was quite frustrating. There were no notices on amtrak.com; all you knew was that some functions weren't working that well. The same was true of the automated phone system. There was virtually no coverage by the mainstream media, even though Amtrak ridership is up significantly. Rumors spread. One fellow rider told me she heard the problem was caused by a lightning strike.
Ultimately, it was determined to be a software issue: a system upgrade didn't work properly. Apparently, diagnosing the problem took close to 12 hours; repairing it — that is, deleting the "upgrade" and reinstalling the old software — took another 12-15 hours.
It's tempting to blame Amtrak for the entire fiasco. Certainly, they should have communicated better with their passengers. I think their failures in that are inexcusable. But it's a fact of life that software upgrades often break things. Perhaps Amtrak didn't test the new code adequately; it will take a detailed investigation to find out. That said, even the best testing is often not good enough. More worrisome is how long it took to revert to the old system. That may have been a case of poor planning by Amtrak; however, in practice it turns out to be a surprisingly difficult thing to do on complex systems. Just as they're not the only ones to have been victimized by bad upgrades, they're not the only ones who had trouble backing them out.
There are, then, three lessons.
- Communicate with your customers
- Test new systems
- Plan and prepare for failure
I look forward to seeing the investigation report on this incident.
24 August 2007
Traditionally, computer security has been about defending your computer against outsiders. Sometimes, though, the owner is seen as the threat, either by the manufacturer or by application or content providers. They have a much harder job.
The case that's in the news right now is the apparent unlocking of the Apple iPhone by a New Jersey teenager. His attack involves a temporary hardware mod to the phone to confuse the boot ROM. According to his blog entry, the boot ROM checks certain memory locations. If they're all 0xFFFFFFFF, the code assumes that new memory has been installed, and skips its usual checks. Otherwise, it insists that the code be digitally signed. That should be adequate protection, because those memory locations can't be overwritten. The hardware mod, though, changes a bit on the address bus, making the check look at locations that can be changed.
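The decision logic he describes can be sketched as follows. The sentinel value comes from his blog entry, but the checked addresses, the particular flipped address-bus bit, and the Python modeling are my own assumptions for illustration, not Apple's actual code.

```python
# Sketch of the boot-ROM decision described above. SENTINEL matches the
# blog entry; the checked addresses and the flipped address-bus bit are
# invented for illustration.
SENTINEL = 0xFFFFFFFF
CHECK_ADDRS = (0x0, 0x4, 0x8, 0xC)      # assumed locations the ROM reads

def rom_allows_boot(read_word, signature_valid):
    """Model of the check: all-0xFFFFFFFF means "new memory", skip signing."""
    if all(read_word(a) == SENTINEL for a in CHECK_ADDRS):
        return True                      # usual checks skipped
    return signature_valid               # otherwise code must be signed

# Unmodified phone: the protected words hold real data, so unsigned
# firmware is refused.
protected = {0x0: 0x1234, 0x4: 0x5678, 0x8: 0x9ABC, 0xC: 0xDEF0}
assert not rom_allows_boot(lambda a: protected[a], signature_valid=False)

# Hardware mod: an address line is forced high, so the same reads land in
# a writable region the attacker has filled with the sentinel value.
FLIPPED_BIT = 1 << 17
writable = {a | FLIPPED_BIT: SENTINEL for a in CHECK_ADDRS}
assert rom_allows_boot(lambda a: writable[a | FLIPPED_BIT], signature_valid=False)
```

The point of the sketch: the check itself is sound only if the addresses it reads really are unwritable, and the hardware mod silently changes which addresses those are.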
There are claims of a software-only attack, and a forthcoming commercial unlock service. No details have been released, but the claims are not implausible, especially if they've found another root exploit.
There's another recent case that has drawn much less attention but makes the same point, albeit more subtly. In this case, someone has been sued for making it easy for people to circumvent protection on downloadable coupons. Software that consumers can download from coupons.com lets them print their own coupons; however, this software limits how many coupons any one user can print. It does this by assigning "each user's computer a unique identifier, which the company uses to track and control the consumer's coupon-printing practices, usually limiting each user to two coupons per product. Each printed coupon has its own unique serial code."
The offending behavior, then, consisted of deleting files or registry keys. In other words, coupons.com is claiming that if done for improper purposes, users are not allowed to modify a disk drive on a computer that they own. Who owns the machine? (I note that the license agreement does not seem to prohibit circumventing their protections.)
Clearly, this protection scheme is rather easy to bypass, the company's claims notwithstanding:
When consumers first print a coupon from our systems, the Coupon Printer is installed on their computer. It is an industry-standard browser plug-in that enables the security features required to print real coupons. Unlike cookie-based controls, removing and reinstalling the Coupon Printer does not affect its security settings. A coupon never appears on the consumer's screen but prints directly to the printer.

(We'll ignore for now the security implications of teaching consumers to install software offered by random web sites…) But how strong is this protection? Not very.
The obvious thing to do is to remove the offending files or registry keys. Doing that requires knowing which those files or keys are, which most consumers won't know. On the other hand, ordinary roll-back software — common on many PCs — will do the job quite nicely. But virtualization makes it easier still.
If you run virtual machines, it's really easy to discard changes. The virtual machine's "disks" are typically ordinary files on the host computer. Boot a VM, print the coupons you want, exit, and restore the disk from the copy of it you made beforehand. It's even easier with, say, VMware, which features "undoable disks": when you shut down the VM, all changes you've made are discarded.
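To make the weakness concrete, here is a minimal sketch of an identifier-based print limit of the kind described; the file name, JSON layout, and two-coupon limit are my assumptions, not coupons.com's actual design. Deleting the state file is the whole "attack."

```python
# Hypothetical identifier-based coupon limit. File name, format, and
# limit are invented for illustration; only the general scheme -- a
# per-machine identifier that caps printing -- comes from the article.
import json, os, tempfile, uuid

ID_FILE = os.path.join(tempfile.gettempdir(), "coupon_state.json")
LIMIT = 2    # "usually limiting each user to two coupons per product"

def print_coupon():
    """Print one coupon if this machine's identifier is under its limit."""
    if os.path.exists(ID_FILE):
        with open(ID_FILE) as f:
            state = json.load(f)
    else:   # first run: mint a fresh machine identifier
        state = {"machine_id": str(uuid.uuid4()), "printed": 0}
    if state["printed"] >= LIMIT:
        return False
    state["printed"] += 1
    with open(ID_FILE, "w") as f:
        json.dump(state, f)
    return True

if os.path.exists(ID_FILE):   # start the demo from a clean slate
    os.remove(ID_FILE)
assert print_coupon() and print_coupon()   # the two permitted coupons
assert not print_coupon()                  # a third is refused
os.remove(ID_FILE)                         # the "attack": delete the state
assert print_coupon()                      # fresh identifier, fresh limit
```

Any scheme whose entire state lives on a disk the user controls — whether in files, registry keys, or a VM image — can be reset this way.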
What is the lesson here? Leaving out the legal aspects — for once, I won't dissect a statute, though I find the notion of copyrighting a coupon serial number to be dubious in the extreme — it's really hard to defend against someone with unlimited access to the machine. Coupons.com did it rather poorly, but they had to work within the constraints of existing, commercial operating systems. Apple tried a lot harder, but it seems that even they failed.
Succeeding requires tamper-proof hardware. But no security professional will speak of tamper-proof devices, as opposed to tamper-resistant ones. Security is a matter of economics, and not just technology. How much will your attacker spend to defeat your security? Are you protecting something valuable enough that your enemy will resort to the three B's: burglary, bribery or blackmail? Protecting against determined adversaries is very hard; it's rarely wise to bet your business on it.
20 August 2007
Skype has finally released some details on its massive network outage. From what they've said, it appears to have been a self-propagating restart failure. We've seen these before.
The first part of the trap is a massive number of near-simultaneous client restarts. This is relatively easy to design for if you plan for it; I've been in more than one meeting where someone has asked something like "what happens if we power-cycle Chicago?" Not having sufficient capacity to handle very rare events isn't necessarily wrong. The events are very rare; in many circumstances, it's perfectly acceptable to shed load by denying service to some clients during the recovery phase.
What appears to have happened here, as best I can tell from the Skype statement, is more subtle. Suppose that the excess load causes a server to crash. All of the clients who were using that server will notice the problem and attempt to reconnect to a different server. That puts more load on the surviving servers, causing them to crash in turn.
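The feedback loop is easy to see in a toy simulation; the server counts, capacities, and loads below are illustrative numbers of my own, not Skype's.

```python
# Toy model of the cascade: each server carries some load and can hold at
# most `cap`; when a server exceeds cap it crashes, and its clients
# reconnect, spread evenly over the survivors. Numbers are illustrative.
def cascade(loads, cap):
    """Return how many servers are still up once the dust settles."""
    alive = dict(enumerate(loads))
    while True:
        failed = [i for i, load in alive.items() if load > cap]
        if not failed:
            return len(alive)
        orphans = sum(alive.pop(i) for i in failed)   # displaced clients
        if not alive:
            return 0                                  # total outage
        for i in alive:
            alive[i] += orphans / len(alive)          # reconnection surge

# Five servers running near capacity: one overload takes them all down.
assert cascade([90, 90, 90, 90, 101], cap=100) == 0
# The same overload with plenty of headroom is simply absorbed.
assert cascade([50, 50, 50, 50, 101], cap=100) == 4
```

The two runs differ only in headroom: the failure mode isn't the initial crash, it's operating the whole fleet close enough to capacity that any redistribution pushes the survivors over the edge.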
As I've noted, this sort of thing has happened before. Perhaps the best-known incident was the Martin Luther King Day meltdown of the AT&T long distance network. In that case, the problem was that if a phone switch crashed and restarted, the recovery message could crash its neighbor. That one would restart, generating messages that crashed its neighbors, including of course the one that crashed it.
The hardest problem, though, is that it's so difficult to test a load-sensitive failure. How many client machines do you have in your test lab? Do you really know what resource your servers are going to run out of first, especially if there's non-linear behavior?
14 August 2007
According to a news story, a 10-year-old boy locked himself into a gun safe at a store. He was released via an override code. The story, though, raises a number of interesting questions.
The first question is how the safe was opened in the first place. This is a gun safe, a device intended to keep children away from guns. But the article notes:
"My brother saw a safe and opened it somehow. He just pressed numbers," Daniel said.

As it turns out, there was a simple-to-guess default combination. The safe is also advertised as burglar-resistant; I'll return to that point below.
The next question is why there should be an override code. It was useful this time, of course, but what happened here is hardly a common occurrence. More likely, it's to prevent loss of access to the contents when the owner forgets the combination.
What appears to be the manufacturer's instruction page gives some hints. (Note: the incident happened at a Sam's Club. Its web site shows just one safe that appears to fit the description in the article; a simple search query found the safe's web page. The specifications on that page are identical to those given in the article. That said, I'm not 100% certain I found the right page, though I do think it likely.) The purpose is indeed error recovery:
Q. Who do I contact if I lose my combination?
A. In the event that you lose your combination, you will need to write a letter stating that you are the owner of the safe, including your serial number. The letter will have to be notarized and faxed to us at 817-xxx-yyyy. Please include in the letter how you would like your combination to be released and someone will contact you.
This system seems dubious, but for its intended purposes may provide adequate security. It's good enough against a burglar, who probably would not interrupt a break-in to get a letter notarized and faxed, let alone wait for a response. It will likely deter most children, who may run into skeptical adults if they try to get such a letter notarized. An adult confederate? Sure, that's possible, but the adult confederate could just as easily buy guns for the children. The biggest risk may be a clever teenager who could fake the notary seal; it's hard to tell that a faxed page has an embossed stamp.
From a security perspective, though, there may have been a failure. The article says that "Sams Club employees were able to obtain an override code from the manufacturer." How was the call to the manufacturer authenticated? Did someone call the phone number listed on the web page, claim to be the store manager, and explain the emergency? Did they put on someone who claimed to be the local fire chief? How would you authenticate such a call if you received it? (I won't even bother discussing email security….)
Perhaps the call went via the Sam's Club internal chain of command first, since this particular gun safe appears to be custom-made for Sam's Club. That only postpones the issue, even assuming that an internal call can be authenticated. A good answer would rely on a call from someone who's personally known to the recipient. "Hi, Pat. This is Chris at Sam's Club in Worcester; we have an emergency." If Chris knows Pat and recognizes Pat's voice, it's probably secure. It's even better if Chris looks up Pat's number in a personal directory and returns the call. Relying on things like CallerID would be dangerous, as would assuming that any caller from Sam's Club is legitimate. (Remember that many teenagers work as store clerks, and they're part of the threat model.) Also note that unless the safe manufacturer rep has access to the override codes, there's an internal authentication chain, too.
The best answer, of course, is if there was some pre-arranged emergency authentication code. Did they have that much foresight? It's possible but I tend to doubt it.
We don't know the details of how the authentication took place. I suspect that if I asked, I'd be told that for security reasons, they can't reveal that information. What is clear, though, is that most of the likely scenarios depend on people who are properly trained in security, and who will do the right thing and stick to the procedures even in an emergency. In other words, people are the weak link. In this case, there was a happy outcome. However, failure to protect the combination could easily end in tragedy. What is the proper balance?
Update: Matt Blaze, an expert on safes, notes that safes of this type with electronic locks rarely have override codes. (The factory-set combinations to mechanically-locked safes are typically recorded, however.) He suggests that perhaps the manufacturer simply supplied the default 1-2-3-4-5-6 combination, plus information on the 5-minute lock-out that occurs after several failed entry attempts. The article mentions troubles with it. There are clearly no security implications to supplying public data. However, the web page clearly states that some form of combination recovery is possible (perhaps only for mechanical locks); everything I wrote above would apply to that data.
10 August 2007
The Minnesota Supreme Court ordered the release of source code to an alcohol breath tester. The decision is heartening, but may not set a broad precedent.
The actual opinion makes it clear that the ruling is based on the facts of this specific case. In particular, the state's RFP for the devices required that Minnesota own the copyright to the source code:
All right, title, and interest in all copyrightable material which Contractor shall conceive or originate, either individually or jointly with others, and which arises out of the performance of this Contract, will be the property of the State and are by this Contract assigned to the State along with ownership of any and all copyrights in the copyrightable material[.] Contractor also agrees, upon the request of the State to execute all papers and perform all other acts necessary to assist the State to obtain and register copyrights on such materials. Where applicable, works of authorship created by Contractor for the State in performance of the Contract shall be considered "works for hire" as defined in the U.S. Copyright Act.

Other states' contracts may not have a similar provision.
What's necessary is recognition of a fundamental right to such access. Minnesota's RFP did recognize that, and required that the contractor provide
information * * * including statement of all non-disclosure/non-reproduction agreements required to obtain information, fees and deposits required, to be used by attorneys representing individuals charged with crimes in which a test with the proposed instrument is part of the evidence. This part of the contract to be activated with an order from the court with jurisdiction of the case and include a reduced fee schedule for defendants found by the court to be entitled to a publicly funded defense.

That was likely an administrative inclusion, rather than a legislative one or a broad constitutional holding; still, it's a good start.
I congratulate the Minnesota officials who added those two provisions, especially the one requiring access for defense attorneys. I hope that all other jurisdictions will follow suit, and for all similar devices.
Update: Thinking further, it would seem that the Minnesota Attorney-General could get an injunction compelling the vendor to release the code. This is not a matter of balancing two rights — the defendant's right to information necessary for the criminal case versus the company's right to protect its trade secrets — it's a question of enforcing a contract the company agreed to.
6 August 2007
Matt Blaze explains how companies should handle security problems.
3 August 2007
We often see news stories about one security flaw or another in some important package or system. Such stories are always depressing, but this week was worse than most for government systems.
The most publicized story, of course, was yet more reports of flaws in electronic voting machines. I and others have reported on that story. What's so sad is that it isn't a new problem, even conceptually. People have been warning about it for almost 20 years. Peter Neumann's 1993 paper set forth security criteria for computerized voting systems; also see the bibliography. Of particular interest is Ronnie Dugger's 1988 article in the New Yorker. Rebecca Mercuri and David Dill are two other early voices warning of the problem. But nothing much seems to have happened.
Weaknesses existed in all control areas and computing device types reviewed.

Predictably, DHS officials downplayed the risk: "the report raised many hypothetical problems and overstated others, because few outsiders can gain access to the system's computers." That completely ignores insider attacks, of course, but that isn't the whole problem; the investigators found that the system had inadequate physical controls and poor separation from other networks. On top of that, the cryptography was laughably poor: a single key was used for all traffic, certificates were not properly distributed, the same certificate was used for clients and servers, etc.
These weaknesses collectively increase the risk that unauthorized individuals could read, copy, delete, add, and modify sensitive information.
Other countries probably have similar problems. A German security researcher demonstrated that he could crash passport readers. New passports have pictures and fingerprints stored as JPEG files readable via an RFID chip; Lukas Grunwald cloned a legitimate chip but inserted a corrupted JPEG file. There is quite likely a penetration vulnerability, too; as he notes, "if you're able to crash something you are most likely able to exploit it."
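The actual flaw in the readers has not been disclosed; the sketch below merely illustrates the classic bug class Grunwald's remark alludes to — a parser that trusts a length field supplied by the (cloned) chip. The record format and buffer size here are invented.

```python
# Hypothetical naive parser: trusts an attacker-supplied length field.
# The record format (2-byte big-endian length + payload) and buffer size
# are invented; only the bug class comes from the article.
import struct

BUF_SIZE = 64  # assumed fixed-size image buffer in the reader

def naive_copy(record):
    """Copy a segment the way a careless reader might: trust the length."""
    (length,) = struct.unpack_from(">H", record, 0)
    buf = bytearray(BUF_SIZE)
    for i in range(length):
        buf[i] = record[2 + i]     # no bounds check on either buffer
    return bytes(buf[:length])

good = struct.pack(">H", 4) + b"\xff\xd8\xff\xe0"
assert naive_copy(good) == b"\xff\xd8\xff\xe0"

evil = struct.pack(">H", 0xFFFF) + b"\x00\x00"   # claims 65535 bytes, holds 2
try:
    naive_copy(evil)
    crashed = False
except IndexError:                 # Python's stand-in for the C crash
    crashed = True
assert crashed
```

In Python the mismatch raises an exception; in C, the same pattern walks off the end of a buffer — which is exactly why a reproducible crash so often signals an exploitable flaw.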
Want some more? The US Internal Revenue Service, it turns out, is vulnerable to social engineering attacks. In a recent official audit, 60% of employees tested changed their passwords as instructed by a caller who claimed to be from the help desk.
There is no one solution to the problems described above. Some solutions are obvious: get rid of C to deal with the (probable) buffer overflow problem in the passport reader, improve employee training about passwords (better yet, switch to two-factor authentication), etc. It's also clear that we need more research on the subject; I'll blog about that in the near future. But one point is worth stressing now: given the difficulty of writing correct, secure software, it can't be done cheaply. Low-bid systems will never be secure. We do know something about building reliable software: think how rarely we see critical failures in phone switches, avionics, etc. Note well: I am not saying that any of these systems are perfect. They're not, and the failures have been copiously reported in RISKS Digest. But they are a lot better than most of what we use. Similarly, voting machines, DHS border control systems, passport readers, etc., can be a lot better than they are today (though for voting machines it's unclear if they can be enough better than simpler, semi-manual alternatives). But we, as a society, have to want this badly enough to pay for it. Do we? Should we? Contemplating the consequences of any of these systems being compromised, I think the answer is obvious.
1 August 2007
Lots of other people have already commented on the California voting machine evaluation. See, for example, blog posts by Avi Rubin, Bruce Schneier, and Ed Felten (links to their blogs at the upper left of this page). I won't bother adding my two cents.
The responses to the evaluation by the vendors have been predictable. The study was unrealistic, it ignores process, an enemy wouldn't have full access to the source code, etc. But these responses ignore the first two questions that any security professional asks when doing an evaluation: what are you protecting, and against whom?
In the US, most elections are run by political appointees. Over the years, a variety of procedures have been adopted to try to prevent fraud; most of them involve some form of multi-party access. For example, ballot counting is overseen by representatives of all parties. That said, there is often a lot of opportunity for mischief by the party in control in some jurisdiction; if nothing else, they control what machines are purchased. This issue is finally drawing some much-needed scrutiny.
The security question, then, is this: are today's processes, designed for older generations of voting technology, sufficient to protect electronic voting machines? Put more bluntly, given the ease of replacing code, opening locks, and bypassing seals — as described quite vividly in every independent study I've seen — are electronic voting machines and the associated processes secure against attacks by insiders? Remember that these insiders have a lot of money and skill, and demonstrably have the motive. Do they have the means? The conclusions of the reports are quite damning with respect to the ease of certain attacks; the real question is whether or not would-be insider attackers have sufficient access. The attacks are fast and easy; I strongly suspect that they're quite practical, given just a bit of luck.
I should note that I do agree with the vendors and election officials that process is important. I told Avi Rubin that before he released his famous paper on Diebold voting machines; he recounts that story in his book Brave New Ballot. Do we have the right processes today? Look at these pictures, from the ITU and the BBC about the start of an election: the ballot boxes are shown to be empty before the voting starts. What are the high-assurance electronic equivalents? Remember that processes can be attacked with technology; see the "Stuffer's ballot box" for an old example. And remember that I said "high assurance"; what really happens when the buttons are pressed to show that a voting machine has been cleared?
Ironically, for all that I'm a security expert, my real concern with electronic voting machines is ordinary bugs in the code. These have demonstrably happened. One of the simplest cases to understand is the counter overflow problem: the voting machine used too small a field for the number of votes cast. The machine used binary arithmetic (virtually all modern computers do), so the critical number was 32,767 votes; the analogy is trying to count 10,000 votes if your counter only has 4 decimal digits. In that vein, the interesting election story from 2000 wasn't Florida, it was Bernalillo County, New Mexico; you can see a copy of the Wall Street Journal story about the problem here.
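The arithmetic is easy to demonstrate. A 16-bit signed field tops out at exactly the 32,767 mentioned above; the snippet below shows the wrap, though the actual machine's internal data layout is of course not public in this detail.

```python
# A tally kept in a 16-bit signed field wraps at 32,767, the number cited
# above; ctypes provides fixed-width machine integers to show the wrap.
import ctypes

tally = ctypes.c_int16(32767)   # counter at its maximum
tally.value += 1                # one more vote...
assert tally.value == -32768    # ...and the count goes negative

# The decimal analogy from the text: a 4-digit counter rolls over at 9,999.
assert (9999 + 1) % 10**4 == 0
```

This is the kind of bug no amount of cryptography or tamper sealing catches; only testing at realistic vote volumes, or a field wide enough that overflow is impossible, does.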
Our voting machines are badly broken. Fixing them means accepting the technological limitations and designing a system around them, not asserting that they do not exist. It also means fixing what technical problems we can.
Update: Matt Blaze's blog now discusses the review, too. (Matt was one of the reviewers and couldn't speak publicly until his report was released.)