Heartbleed: Don't Panic

11 April 2014

There's been a lot of ink and pixels spilled of late over the Heartbleed bug. Yes, it's serious. Yes, it potentially affects almost everyone. Yes, there are some precautions you should take. But there's good news, too: for many people, it's a non-event.

Heartbleed allows an attacker to recover a random memory area from a web or email server running certain versions of OpenSSL. The question is what's in that memory. It may be nothing, or it may contain user passwords (this has reportedly been seen on Yahoo's mail service), cryptographic keys, etc. From a theoretical perspective, the latter is the most serious; an attacker can impersonate the site, read old traffic that's been recorded, etc. (Besides, cryptographers take key leakage very personally; that keys won't leak is one of our core assumptions.) Is this a real risk, though? For many people, the answer is no.
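For readers who want to see the shape of the flaw, here is a minimal, self-contained sketch of the underlying pattern—not OpenSSL's actual code; the struct and function names are invented for illustration. The vulnerable version copies as many bytes as the request claims to contain; the fixed version copies no more than were actually received.

	#include <stdio.h>
	#include <string.h>

	/* Invented types, for illustration only; these are not OpenSSL's structures. */
	struct heartbeat_msg {
		unsigned short claimed_len;	/* length field supplied by the peer */
		unsigned char  payload[16];	/* what the peer actually sent */
	};

	/* Vulnerable pattern: trusts the peer's length field.  A claimed_len of,
	 * say, 65535 copies whatever happens to follow payload[] in memory. */
	static void build_reply_vulnerable(const struct heartbeat_msg *m,
					   unsigned char *reply)
	{
		memcpy(reply, m->payload, m->claimed_len);
	}

	/* Fixed pattern: never copy more than was actually present. */
	static size_t build_reply_fixed(const struct heartbeat_msg *m,
					unsigned char *reply, size_t actual_len)
	{
		size_t n = m->claimed_len;
		if (n > actual_len)
			n = actual_len;
		memcpy(reply, m->payload, n);
		return n;
	}

	int main(void)
	{
		struct heartbeat_msg m = { 6, "hello" };
		unsigned char reply[64];

		printf("copied %zu bytes\n", build_reply_fixed(&m, reply, 6));
		(void)build_reply_vulnerable;	/* shown for contrast, not called */
		return 0;
	}

The actual fix in OpenSSL amounts to the same bounds check: discard heartbeat requests whose claimed payload length exceeds what the record actually contains.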

In order to impersonate a site, an attacker has to redirect traffic you're sending towards that site. If you only use the Internet via well-controlled networks, you're probably safe. Yes, it's possible to redirect traffic on the Internet backbone, but it's rare and difficult. If a major intelligence agency is after you or that site, you're at risk; most of us aren't in that category. Cellular data networks are in the same hard-to-attack category: redirection can be done, but it's hard.

For most people, the weak link is their access network: their home, their workplace, the public or semi-public networks they use. It's much easier to redirect traffic on a WiFi network or an Ethernet, and doing so is well within the capabilities of ordinary cybercriminals. If untrusted individuals or hacked machines use the same networks as you do, you're at much greater risk. Your residence is probably safe if there are no hacked machines on its network and if you observe good security precautions on your WiFi network (WPA2 and a strong password). A small office might be safe; a large one is rather more dangerous. All public hotspots are quite exposed.

The other risk of Heartbleed is someone decrypting old traffic. That sounds serious, though again it's hard to capture traffic if you're not law enforcement or an intelligence agency. On exposed nets, hackers can certainly do it, but they're not likely to record traffic they'll never be able to decrypt. Law enforcement might do that, if they thought they could get assistance from the local spooks to break the crypto. They could also redirect traffic, with cooperation from the ISP. The question, though, is whether or not they would; most police forces don't have that kind of technical expertise.

It's important to realize that exposure isn't all or nothing. If you regularly use a public hotspot to visit a social networking site but only do your banking at home, your banking password is probably safe. That's also why your home network gear is probably safe: you don't access it over the Internet. (One caveat there: you should configure it so that you can't access it remotely, only from your home. Too much gear is shipped with that set incorrectly. If you have a router, make sure remote access to it is turned off.)

One more threat is worth mentioning: client software, such as browsers and mail programs, uses SSL; some of it uses OpenSSL and hence is vulnerable if you use it to connect to a hacked site. Fortunately, most major browsers and mailers are not affected, but to be safe, make sure you've installed all patches.

There's one password you should change nevertheless: your email password. It's generally used to reset all of your other accounts, and "probably safe" is not the same as "definitely safe". Accordingly, as soon as you know that your mail provider has patched its system (Google and Yahoo have, and Microsoft was never vulnerable), change it—and change it to something strong, using a password manager to save you from having to use the same new password everywhere.

Oh yes—if Martian Intelligence is after you (you know who you are), indeed you should be worried.

Open Source Quality Challenge Redux

9 April 2014

I don't have time to do a long blog post on Heartbleed, the new flaw in OpenSSL, but there's one notion going around that needs to be squashed. Specifically, some people are claiming that open source software is inherently more secure:

Because so many people are working on the software, that makes it so it's less susceptible to problems. For security it's more important in many ways, because often security is really hard to implement correctly. By having an open source movement around cryptography and SSL, people were able to ensure a lot of basic errors wouldn't creep into the products.
Not so. What matters is that people really look, and not just with their eyes, but with a variety of automated static and dynamic analysis tools.

Secure systems require more than that, though. They require a careful design process, careful coding, and careful review and testing. All of these need to be done by people who know how to build secure systems, and not just write code. Secure programming is different and harder; most programmers, however brilliant they may be, have never been taught security. And again, it's not just programming and it's not just debugging; design—humble design—matters a great deal.

I wrote about this problem in the open source community five years ago. I haven't seen nearly enough change. We need formal, structured processes, starting at the very beginning, before we'll see dramatic improvement.

Speculation About Goto Fail

24 February 2014

Following the logic in my previous post, I don't think that Apple's goto fail was a deliberate attack. Suppose it was, though. What can we learn about the attacker?

The first point is that it very clearly was not the NSA or another high-end intelligence agency. As I noted, this is too visible and too clumsy. Even if they didn't mind the clumsiness, covertly tinkering with Apple source code is a difficult and risky operation. If an investigation by Apple shows that this was an attack, they'll move heaven and earth to close the hole, whether it was technical, personnel, or procedural. In other words, using some covert access channel to install a back door effectively "spends" it; you may not get to reuse the channel. They definitely would not use such a one-shot on something that is so easily spotted. (What could they have done? There are lots of random numbers lying around in cryptographic protocols; leak the session key—more likely, a part of it—in one of those numbers. Use convoluted code to generate this "random" number, complete with misleading comments.)

The next question is how this capability can be used. The vulnerability requires a so-called "man-in-the-middle" (MitM) attack, where an attacker has to receive all messages in each direction, decrypt them, reencrypt them, and forward them to the proper destination. If you're intercepting a lot of traffic, that's a lot of work. There are a number of ways to do MitM attacks; for technical reasons, it's a lot easier on the client's or the server's LAN, or with the cooperation of either's ISP. There are certainly other ways to do it, such as DNS spoofing or routing attacks; that's why we need DNSSEC and BGPSEC. But ARP-spoofing from on-LAN is very, very easy.
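As a small illustration of why the on-LAN case is the easy one, here is a rough, Linux-only sketch—my own heuristic, not a standard tool—that scans /proc/net/arp and reports one common symptom of ARP spoofing: two different IP addresses claiming the same hardware address. (Some legitimate configurations do this too, so treat a report as a hint, not proof.)

	#include <stdio.h>
	#include <string.h>

	#define MAX_ENTRIES 256

	int main(void)
	{
		/* /proc/net/arp columns: IP address, HW type, Flags, HW address, Mask, Device */
		FILE *fp = fopen("/proc/net/arp", "r");
		char line[256], ip[MAX_ENTRIES][64], mac[MAX_ENTRIES][64];
		int n = 0;

		if (fp == NULL) {
			perror("/proc/net/arp");
			return 1;
		}
		if (fgets(line, sizeof line, fp) == NULL) {	/* skip the header line */
			fclose(fp);
			return 1;
		}
		while (n < MAX_ENTRIES && fgets(line, sizeof line, fp) != NULL) {
			if (sscanf(line, "%63s %*s %*s %63s", ip[n], mac[n]) == 2 &&
			    strcmp(mac[n], "00:00:00:00:00:00") != 0)	/* skip incomplete entries */
				n++;
		}
		fclose(fp);

		/* Two IPs sharing a MAC is a classic sign that someone is answering
		 * ARP requests on another host's behalf. */
		for (int i = 0; i < n; i++)
			for (int j = i + 1; j < n; j++)
				if (strcmp(mac[i], mac[j]) == 0)
					printf("note: %s and %s share MAC %s\n", ip[i], ip[j], mac[i]);
		return 0;
	}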

We can narrow it down still further. If the main thing of interest is email and in particular email passwords, the odds are that the victim will be using one of the big "cloud" providers: Google, Yahoo, or Microsoft. You don't want to tap those nets near the servers; apart from the technical difficulty (they're good at running their machines securely, and they don't invite random strangers onto their nets), and apart from the fact that you'd be trying to take a sip from a firehose, it's hard to figure out where to put the tap. Consider gmail. I checked its IP address from my office, my house, and from a server I sometimes use in Seattle. The IP addresses resolved to data centers near DC, New York City, and Seattle, respectively. Which one should you use if you wanted my traffic? Note in particular that though my office is in New York City and my house is not, I got a New York server when trying from home, but not when trying from my office.
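If you want to reproduce that little experiment, here is a minimal POSIX C sketch (the hostname is just an example) that prints the addresses a name resolves to from wherever you run it; compare the output from a few different networks.

	#include <stdio.h>
	#include <string.h>
	#include <netdb.h>
	#include <netinet/in.h>
	#include <sys/socket.h>
	#include <arpa/inet.h>

	int main(void)
	{
		struct addrinfo hints, *res, *p;
		char buf[INET6_ADDRSTRLEN];
		int rc;

		memset(&hints, 0, sizeof hints);
		hints.ai_socktype = SOCK_STREAM;	/* one entry per address */

		rc = getaddrinfo("gmail.com", "https", &hints, &res);
		if (rc != 0) {
			fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
			return 1;
		}
		for (p = res; p != NULL; p = p->ai_next) {
			const void *addr;

			if (p->ai_family == AF_INET)
				addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
			else if (p->ai_family == AF_INET6)
				addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
			else
				continue;
			if (inet_ntop(p->ai_family, addr, buf, sizeof buf) != NULL)
				printf("%s\n", buf);
		}
		freeaddrinfo(res);
		return 0;
	}

Run from a couple of vantage points, the addresses—and the data centers they map to—will usually differ, which is exactly what makes "where do I put the tap?" a hard question for an attacker near the servers.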

(There's another possible attack vector: software updates. Apple's updates are digitally signed, so they can't be tampered with by an attacker; however, that isn't true for all third-party packages. Tampering with an update is a way to get malicious code installed on someone's machine.)

The conclusion is simple: go after the client LAN. If it's a LAN the attacker and the target both have access to—say, an Internet cafe or the like—the problem is very simple. In fact, if you're on-LAN with the target, you can home in on the target's MAC address (which doesn't go off-LAN), making your task even easier. Alternatively, hack into the wireless router or seek cooperation from the hotspot owner or the ISP. (Home routers are easily hacked, too, of course. If the attacker goes that route, there's no need for MAC-spoofing.)

We can also speculate that the victim is using an iOS device rather than a Mac. If nothing else, Apple sells far more iPhones and iPads than it does Macs. There's another reason to think that, though: Macs are expensive, and Internet cafes are much more popular in poorer countries where fewer people have broadband access at home. Of course, if they're using an iPhone in cellular mode, there's no LAN to camp on, but governments have little trouble gaining access to the networks run by their own telephone companies. Either way, it sounds very much like a targeted attack, aimed at a very few individuals.

My reasoning is, of course, highly speculative. If it was an attack, though, I conclude (based on this tenuous set of deductions) that it was a moderately capable government going after a small set of victims, either on a public net or via the local mobile phone carrier. Most of us had nothing to worry about—until, of course, the patch came out. And why a patch for just iOS, with MacOS waiting until later? Apart from the system test difficulties I mentioned last time, might it be because Apple was warned—by someone!—that this hole was being exploited in the wild? Again, I'm speculating...


Update: Nicholas Weaver notes that putting the exploit station on the victim's LAN or router solves another problem: identifying which machines may be vulnerable to this exploit. It can look at other traffic from that MAC address to decide if it's an Apple product, what browser it's using, etc., and not tip off the victim by pulling this stunt against Firefox or Internet Explorer users.

In principle, this can be done from off-LAN, but it's harder, especially if the hosts use strong sequence numbers and randomize the IPid field. (Some open source operating systems do that. I haven't checked on Macs or Windows boxes in a long time.)

Goto Fail

23 February 2014

As you've probably heard by now, there's a serious bug in the TLS implementations in iOS (the iPhone and iPad operating system) and MacOS. I'll skip the details (but see Adam Langley's excellent blog post if you're interested); the effect is that under certain conditions, an attacker can sit in the middle of an encrypted connection and read all of the traffic.

Here's the code in question:

	if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
		goto fail;
	if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
		goto fail;
		goto fail;
	if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
		goto fail;
	...
Note the doubled
		goto fail;
		goto fail;
That's incorrect and is at the root of the problem.

The mystery is how that extra line was inserted. Was it malicious or was it an accident? If the latter, is there any conceivable way it could have happened? Well, yes.

Here's a scan from a 1971 paperback printing of Roger Zelazny's Lord of Light.

Note the duplicated paragraph... Today, something like that could easily happen via a cut-and-paste error. (The error in that book was not authorial emphasis; I once checked a hardcover copy that did not have the duplication.)

There's another reason to think it was an accident: it's not very subtle. That sequence would stick out like a sore thumb to any programmer who looked at it; there are no situations where two identical goto statements in a row make any sense whatsoever. In fact, it's a bit of a surprise that the compiler didn't flag it as an error; the ordinary process of optimization should have noticed that all of the lines after the second goto fail could never be reached. The attempted sabotage of the Linux kernel in 2003 was much harder to spot; you'd need to notice the difference between = and == in a context where any programmer "just knows" it should be ==. However...
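For readers who haven't seen it, here is a standalone sketch of that 2003-style trick; the names and values are invented for illustration, not the actual kernel code. A single = where == belongs quietly grants the privilege the line appears merely to be checking.

	#include <stdio.h>

	struct cred { int uid; };

	static int check_request(struct cred *current, int options, int wall_flag)
	{
		/* Looks like a validity check.  In fact it assigns 0 (root) to
		 * current->uid whenever options == wall_flag, and because the
		 * assignment evaluates to 0, the "error" branch is never taken. */
		if ((options == wall_flag) && (current->uid = 0))
			return -1;
		return 0;
	}

	int main(void)
	{
		struct cred me = { .uid = 1000 };

		check_request(&me, 4, 4);
		printf("uid after check: %d\n", me.uid);	/* prints 0 */
		return 0;
	}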

There are still a few suspicious items. Most notably, this bug is triggered if and only if something called "perfect forward secrecy" (PFS) has been offered as an encryption option for the session. Omitting the technical details of how it works, PFS is a problem for attackers—including government attackers—who have stolen or cryptanalyzed a site's private key. Without PFS, knowing the private key allows encrypted traffic to be read; with PFS, you can't do that. This flaw would allow an attacker who was capable of carrying out a "man-in-the-middle" (MitM) attack to read even PFS-protected traffic. While MitM attacks have been described in the academic literature at least as long ago as 1995, they're a lot less common than ordinary hacks and are more detectable. In other words, if this was a deliberate back door, it was probably done by a high-end attacker, someone who wants to carry out MitM attacks and has the ability to sabotage Apple's code base.

On the gripping hand, the error is noticeable by anyone poking at the file, and it's one of the pieces of source code that Apple publishes, which means it's not a great choice for covert action by the NSA or Unit 61398. With luck, Apple will investigate this and announce what they've found.

There are two other interesting questions here: why this wasn't caught during testing, and why the iOS fix was released before the MacOS fix.

The second one is easy: changes to MacOS require different "system tests". That is, testing just that one function's behavior in isolation is straightforward: set up a test case and see what happens. It's not enough just to test the failure case, of course; you also have to test all of the correct cases you can manage. Still, that's simple enough. The problem is "system test": making sure that all of the rest of the system behaves properly with the new code in. MacOS has very different applications than iOS does (to name just one, the mailer is very, very different); testing on one says little about what will happen on the other. For that matter, they have to create new tests on both platforms that will detect this case, and make sure that old code bases don't break on the new test case. (Software developers use something called "regression testing" to make sure that new changes don't break old code.) I'm not particularly surprised by the delay, though I suspect that Apple was caught by surprise by how rapidly the fix was reverse-engineered.
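As a sketch of the kind of test that matters here—hypothetical names throughout, not Apple's actual test code—a regression test has to exercise both directions: a good signature must verify, and a tampered one must not. The second check is the one that a doubled "goto fail" would cause to fail.

	#include <assert.h>
	#include <string.h>

	/* Stand-in for the routine under test; like the Apple code quoted above,
	 * it returns 0 on success and nonzero on failure. */
	static int verify_signature(const char *data, const char *sig)
	{
		(void)data;	/* a real implementation would hash and check 'data' */
		return strcmp(sig, "valid-signature-for-data") == 0 ? 0 : 1;
	}

	int main(void)
	{
		/* Correct case: a good signature must be accepted. */
		assert(verify_signature("data", "valid-signature-for-data") == 0);

		/* Failure case: a forged signature must be rejected.  A bug that
		 * skips the final verification step would make this assert fire. */
		assert(verify_signature("data", "forged") != 0);

		return 0;
	}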

The real question, though, is why they didn't catch the bug in the first place. It's a truism in the business that testing can only show the presence of bugs, not the absence. No matter how much you test, you can't possibly try all possible combinations of inputs in the hope of finding every failure; it's combinatorially impossible. I'm sure that Apple tested for this class of failure—talking to the wrong host—but they couldn't and didn't test for it in every possible variant of how the encryption takes place. The TLS protocol is exceedingly complex, with many different possibilities for how the encryption is set up; as noted, this particular flaw is only triggered if PFS is in use. There are many other possible paths through the code. Should this particular one have been tested more thoroughly? I suspect so, because it's a different way that a connection failure could occur; not having been on the inside of Apple's test team, though, I can't say for sure how they decided.

There's a more troubling part of the analysis, though. One very basic item for testers is something called a "code coverage" tool: it shows what parts of the system have or have not been executed during tests. While coverage alone doesn't suffice, it is a necessary condition; code that hasn't been executed at all during tests has never been tested. In this case, the code after the second goto was never executed. That could and should have been spotted. That it wasn't does not speak well of Apple's test teams. (Note to outraged testers: yes, I'm oversimplifying. There are things that are very hard to do in system test, like handling some hardware error conditions.) Of course, if you want to put on an extra-strength tinfoil hat, perhaps the same attackers who inserted that line of code deleted the test for the condition, but there is zero evidence for that.
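Here, as a hedged sketch, is roughly what a coverage run would have shown—assuming gcc and gcov; the file and function names are invented. The line after the doubled goto is never executed, and gcov marks it as such.

	/* coverage_demo.c -- build and check coverage with, e.g.:
	 *	gcc --coverage -O0 -o coverage_demo coverage_demo.c
	 *	./coverage_demo
	 *	gcov coverage_demo.c	(unexecuted lines are marked "#####")
	 */
	#include <stdio.h>

	static int check(int err)
	{
		if (err != 0)
			goto fail;
			goto fail;	/* the doubled goto, as in the code above */
		printf("this line is never reached\n");	/* gcov flags it */
		return 0;
	fail:
		return err;
	}

	int main(void)
	{
		check(0);	/* "success" path -- still ends up at fail */
		check(1);	/* failure path */
		return 0;
	}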

So what's the bottom line? There is a serious bug of unknown etiology in iOS and MacOS. As of this writing, Apple has fixed one but not the other. It may have been an accident; if it was enemy action, it was fairly clumsy. We can hope that Apple will announce the results of its investigation and review its test procedures.

Why the US Doesn't have Chip-and-PIN Credit Cards Yet

5 February 2014

In the wake of the Target security breach, there's been a fair amount of hand-wringing about why the US has lagged most of the rest of the world in deploying EMV (Europay, MasterCard and Visa)—chips—in credit cards. While I certainly think that American banks and card issuers should have moved sooner, they had their reasons for their decision. Arguably, they were even correct.

To understand the actual logic, it is necessary to remember three things:

1. For banks and card issuers, fraud prevention is strictly a financial decision: a countermeasure is worth deploying only if it costs less than the losses it prevents.

2. Security mechanisms are deployed to counter specific threats; if the threat isn't there, the mechanism buys you little.

3. The decision was made years ago, when the threat environment—and the available technology—were very different from today's.

In other words, they did a calculation, concluded that EMV did not make financial sense, and stuck with mag stripes.

Security mechanisms are not selected randomly. Rather, they're deployed to counter specific threats. If you don't see a threat that would be countered by EMV, there's no point to using it. One major source of loss—fraud on card applications—is not addressed at all by EMV. Forged cards have long been an issue, but on a small scale; this could be dealt with by things like holograms and quick revocation. Yes, databases of credit card numbers existed, but for the most part these weren't at risk; there was little, if any, network connectivity, and the criminal hacker community had not yet developed the tools to get at these databases.

(Those databases of card numbers turned out to be very important. Most people use a very few credit cards, often just one; that means that your credit card number is effectively your customer ID number. Your behavior can be (and is) tracked this way, especially if you buy both online and in a physical store.)

Quick revocation, implemented when merchants started deploying terminals a bit over 30 years ago, was very important. Before that, stores relied on books listing canceled card numbers. These books were issued at most weekly, and were cumbersome to use; as a result, they were generally consulted only for large transactions. (Exercise for the reader: at the conclusion of this blog post, explain why this behavior was quite rational.)

In other words, by around 1995, life was pretty good for American credit card accepters. There was a relatively cheap technology (mag stripes plus online verification), good databases for tracking, and decent law enforcement.

Life was different in Europe. Countries are much smaller, of course, which means that there's more cross-border travel; this in turn hinders law enforcement for cross-border crime. (Not very many Americans travel abroad, so there's not nearly as much of a cross-border issue affecting American banks.) For whatever reason, online verification terminals were not deployed as widely, but crime was increasing. (I've heard that different costs for telecommunications service and equipment played a big role, but I haven't verified that.) European banks also had a second-mover advantage: they hadn't invested as much in mag stripe technology, and in the meantime smart cards—chips in credit cards—had become feasible and affordable, which was not true circa 1980.

One element of the cost, then, is the infrastructure: the myriad terminals that merchants own, and the server complexes that accept and verify those transactions. None of that would work with EMV cards. Other costs, though, are more subtle. In one ironic example, Target itself tried deploying EMV ten years ago; Target was both an issuer and an accepter of credit cards. It turned out, though, that processing a transaction with an EMV card is slower, which meant long lines at cash registers—lines that their competitors didn't have, because almost no one else in the US was using EMV.

This, then, was the problem: high conversion costs, high operational costs, disadvantages for early adopters, and little consumer demand for the chips—American consumers aren't responsible for fraudulent use, and, as noted, few Americans travel abroad where they might need the chips. Combine this with the lack of a significant threat model, and the decision seemed obvious: the financial calculations indicated that it wasn't a profitable move. Yes, there would be some loss due to preventable fraud, but the cost of that prevention would be greater than the likely losses. As noted above, fraud prevention is strictly a financial decision.

What happened, of course, was that the threat changed dramatically. Hackers did learn to penetrate store server complexes and card processors. The conversion is now an urgent matter, but it will still be years before most transactions in the US will involve chip-enabled cards and terminals.

Alternate Universes: Academic Publishing in Computer Science vs. Law

6 December 2013

I (and my co-authors) have recently had two papers accepted to law reviews, "When enough is enough: Location tracking, mosaic theory, and machine learning", and "Lawful hacking: Using existing vulnerabilities for wiretapping on the Internet". In the process, I learned something about the world of legal academic publishing. It's about as different from my more familiar world of computer science publishing as can be. In the hope of amusing everyone, both on the legal and CS sides, I thought I'd make a list of differences...

Computer science: In CS (and I believe in all other STEM fields), multiple submission is strictly forbidden. That is, you cannot simultaneously submit substantially the same paper to more than one venue. If the program chairs discover that you have done so, it is grounds for immediate summary rejection of the paper from all such venues, with no appeals considered. I've seen it happen—and yes, chairs do check for this.

Law reviews: In legal academe, not only is multiple submission accepted, it is the normal, right, and proper way to do things. You're supposed to submit to many—dozens, perhaps—law reviews simultaneously. If your paper is accepted by one, you withdraw it from any lower-ranked journals and ask the editors of higher-ranked journals to expedite their reviews. ("Expedited" can mean "within a week", perhaps less, because that's how long you may have to accept the earlier offer.) If you receive another acceptance from a publication that is higher-ranked still, you repeat the process until you're either satisfied or you hit the deadline for accepting an offer to publish.
Computer science: Every submission is independent. If a paper is rejected from one venue, you look for the next, and use its submission system. Submissions, though, are free.

Law reviews: There are two centralized sites for law reviews. On these sites, you check the publications you're interested in; they do the rest. Partly to pay for this service, and partly to impose a limit on how many venues you submit to, there is a fee per publication. It's not a large fee, just a few dollars, but if you submit to 50 venues, it adds up. (Many law schools pay for blanket submission licenses. That's also foreign to CS; were there such charges, each prof would pay out of his or her grants.)
Computer science: CS and all other sciences worship peer review. That is, submissions are judged by professionals in the field: professors, high-end practitioners, etc. Being on a conference program committee is an honor, albeit a very time-consuming one. Some committees will have a student member or two; generally, this is reserved for very senior graduate students who have a good publication track record of their own. The rationale is simple: it takes a good researcher to know what is really new and interesting; students are presumed to have too little experience (and often too little historical context).

Law reviews: Most law reviews are edited by law students. Much of law school is about teaching critical thinking (a philosophy I'd like to see more of in CS...); good law students are presumed to be able to recognize sound arguments and good scholarship.
Computer science: CS conferences are scattered throughout the year; you can submit to whichever is convenient and appropriate for your topic. The major security conferences even arrange their notification and submission deadlines so that you (just barely) have time to revise a rejected paper in time to submit to the next conference.

Law reviews: Because law reviews are edited by students, there are two publication windows, August and February. Virtually all law reviews follow that calendar. Furthermore, there's no hard submission deadline; rather, journals accept papers until the issue is full.
Computer science: For most CS conferences (and certainly all of the major security conferences), submissions are anonymous. This is done to avoid bias, be it based on personal feelings, the reputation of the authors, or (and I've heard this stated explicitly by program chairs) the authors' gender.

Law reviews: Perhaps because law students are inexperienced, submissions are not only not anonymous, they're accompanied by biographies and even full CVs.
Computer science: Reviewers of CS papers explain their decisions, pro or con, and provide detailed feedback. You can't just reject a paper you don't like; rather, you have to explain exactly what's wrong. While some reviewers don't do a good job (see this set of instructions on how to do better), authors frequently receive very useful feedback. (Admittedly, sometimes the feedback is implicit: reviewer #2 either didn't read or didn't understand the paper....)

Law reviews: Law review rejections are terse: "thank you very much, but we're not interested". Amusingly enough, that means that CS rejections are more like legal opinions: reviewers have to justify their decisions. I suppose the law review model is more like the Supreme Court denying certiorari.
Computer science: Editing of CS papers is generally cursory. There may be a few comments about getting the paper reviewed by a native (English) speaker, or some serious problems with content that must be changed to the satisfaction of a "shepherd".

Law reviews: The level of editing and attention to detail by law review editors is nothing short of amazing. One of my papers literally had 4,917 changes made by the first-round editors: 1,943 insertions, 1,774 deletions, 2 moves, and 1,088 formatting changes. In addition, there were 110 comments.
Computer science: From the time of acceptance to the time that camera-ready copy is due, you have a long time to make changes: 8 weeks for last year's Usenix Security conference.

Law reviews: Those 4,900 changes and 110 comments? We had to turn them around in 8 days.
Computer science: Most computer scientists write their papers in LaTeX. Submissions are always in PDF.

Law reviews: Law reviews generally want Microsoft Word documents or perhaps something trivially convertible to Word, such as RTF. (Yes, you're welcome to use something like LibreOffice if you wish.) Some (though by no means all) will apparently take PDF. To me, this was one of the more painful parts of the process...
Computer science: One of the nice things about LaTeX is bibtex, a bibliographic citation formatter. If you have a good bibliographic database (my personal one has over 1,000 entries), you just write

\cite{cheswick.bellovin.ea:firewalls}

and the software does the rest. (If you've ever looked at my papers web page, you'll see that each entry has a bib link; clicking on it gives a bibtex database entry for that publication.) Journals generally have a preferred format, but this is a minor matter since bibtex handles that for you.

Law reviews: In legal writing, citations must be written in Blue Book format. This mirrors court practice; many courts require it in legal filings. However, not only is this format required, the specification is copyrighted, the copyright is enforced, and software developers are denied permission to write code to automate the task. (Very many of the ~5K edits were to fix our citation formatting.)
Computer science: While academic convention does require proper citations to the academic literature, most computer scientists are fairly restrained about this. Actual footnotes are comparatively rare; citations are inline.

Law reviews: Lawyers love footnotes. Apart from citations to other articles, laws, court rulings, etc. (and these are full of strange italicized words like supra and infra), footnotes are used for digressions, longer explanations of issues, etc. Every factual statement must have a footnote giving a source for it. I'd write here that a typical law review article is 25% footnotes by weight (contents may have settled during shipping), but that clause would require two footnotes, one for the assertion and one for the joke. By the way, make sure you give correct page numbers for the specific facts you're relying on; the editors will check them.
Computer science: In CS, most important papers are published in conference proceedings, not journals. (Other sciences differ.) These submissions have a length limit, typically about 15 pages or so.

Law reviews: Law reviews sometimes talk about limits, but these are lightly enforced, and the typical minimum size is far longer than CS's maximum. Let's put it like this: the shorter of my two accepted papers is about 50 pages; the other is considerably longer. Lawyers and law professors say they want papers to be shorter, but that doesn't seem to be happening in practice.
Computer science: Everyone wants their work to have impact. In CS, the h-index is commonly used; without going into details, it measures how often a paper is cited.

Law reviews: Lawyers also care about citations, but even a single citation can make a paper's (and hence its author's) reputation—if that citation is in an important court opinion.

Neither publishing model is perfect. Many computer scientists think that the stress on conferences is harmful; I've heard law professors debate the wisdom of student editors. Some things, like single versus multiple submissions, are matters of custom and taste. (If you're a lawyer, note that this difference is the one that most perplexes my colleagues. In fact, I'm not sure I believe it even now.)

Still, it's good to visit other planets occasionally, even if it's just academically.