Doing Crypto

22 April 2014

The recent discovery of the goto fail and Heartbleed bugs has prompted some public discussion of a very important topic: what advice should cryptologists give to implementors who need to use crypto? What should they do? There are three parts to the answer: don't invent things; use ordinary care; and take special care around crypto code.

Don't invent things
The oldest piece of advice on the subject is still sound; everyone who teaches or writes about crypto will repeat it. Never invent your own primitives or protocols. Cryptographic protocols are fiendishly difficult to get right; even pros often get them wrong. Encryption algorithms are even harder to design. It's certainly true that there have been very few known attacks on bad crypto by hackers not working for a major government. But "few" is not the same as "none"—think of WEP—and many commercial sites have been targeted by governments. Besides, many crypto attacks are silent; the victims may never know what happened.

Custom crypto: just say no.

Use ordinary care
Crypto code is code, and is therefore susceptible to all the natural shocks that code is heir to. All the usual advice—watch out for buffer overflows, avoid integer overflows, and so on—applies here; crypto code is almost by definition security-sensitive. There's no shortage of advice and tools; most of this is even correct.
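
To make that concrete, here is a minimal, hypothetical sketch (my own, not taken from any crypto library) of how an integer overflow in a size calculation turns into a buffer overflow, along with the usual guard:

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical: copy n records of reclen bytes each into a new buffer.
       If n * reclen wraps around, malloc() returns a small buffer but the
       loop still copies n records, writing far past the end of it. */
    void *copy_records_bad(const void *src, size_t n, size_t reclen)
    {
        unsigned char *buf = malloc(n * reclen);        /* may overflow */
        if (buf == NULL)
            return NULL;
        for (size_t i = 0; i < n; i++)
            memcpy(buf + i * reclen,
                   (const unsigned char *)src + i * reclen, reclen);
        return buf;
    }

    /* Safer: reject sizes that would overflow before allocating. */
    void *copy_records(const void *src, size_t n, size_t reclen)
    {
        if (reclen != 0 && n > SIZE_MAX / reclen)
            return NULL;                                /* would overflow */
        unsigned char *buf = malloc(n * reclen);
        if (buf == NULL)
            return NULL;
        memcpy(buf, src, n * reclen);
        return buf;
    }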

Special care
Crypto code, though, is special; there are precautions that need to be taken that are irrelevant anywhere else. Consider things like timing attacks: if you're using RSA but haven't implemented it with all due paranoia, an attacker can recover your private key just by seeing how long it takes you to respond to certain messages. There are cache timing attacks: if the attacker can run programs on the same computer as your crypto code (and this isn't a preposterous notion in a world of cloud computing), it's possible to figure out an AES key by watching what cache lines are busy during an encryption or decryption operation. Alternatively, consider how hard it is to implement obvious advice like zeroing out keys after they're used: if you write code to assign zeros to a variable you'll never use again, a modern compiler can optimize that code out of existence. The problems are subtle and there aren't widely-known sources of advice.
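
Two small illustrations of those last points, written by me as sketches rather than taken from any library: a zeroing routine that the optimizer can't discard, and a comparison that takes the same time whether a guess is almost right or completely wrong. (Where available, platform-specific functions such as explicit_bzero() or Windows's SecureZeroMemory() solve the first problem more directly.)

    #include <stddef.h>

    /* A plain memset() on a buffer that is never read again may be removed
       entirely by the optimizer ("dead store elimination").  Writing through
       a volatile pointer is one common way to force the stores to happen. */
    static void secure_zero(void *p, size_t n)
    {
        volatile unsigned char *vp = p;
        while (n--)
            *vp++ = 0;
    }

    /* memcmp() returns as soon as it finds a differing byte, so its running
       time can leak how much of a secret the attacker has guessed correctly.
       This version always examines every byte. */
    static int constant_time_equal(const unsigned char *a,
                                   const unsigned char *b, size_t n)
    {
        unsigned char diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= a[i] ^ b[i];
        return diff == 0;
    }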

Nine years ago, Dan Boneh commented to me that "It's amazing how tricky it is to implement Crypto correctly these days. I constantly tell my undergrad students that they should not try to implement existing systems themselves." It's even more true today.

Some of these issues can be dealt with by sticking with a well-implemented, standards-adhering crypto library. Yes, there have been problems of late with several different such libraries—but most programmers will make at least as many mistakes on their own. Are we putting all of our eggs in one basket and exacerbating the monoculture problem? In this case, I prefer to fall back on Pudd'nhead Wilson's advice here:

Behold, the fool saith, "Put not all thine eggs in the one basket"—which is but a manner of saying, "Scatter your money and your attention"; but the wise man saith, "Put all your eggs in the one basket and—WATCH THAT BASKET!"
Crypto is hard.

Heartbleed: Don't Panic

11 April 2014

There's been a lot of ink and pixels spilled of late over the Heartbleed bug. Yes, it's serious. Yes, it potentially affects almost everyone. Yes, there are some precautions you should take. But there's good news, too: for many people, it's a non-event.

Heartbleed allows an attacker to recover a random memory area from a web or email server running certain versions of OpenSSL. The question is what's in that memory. It may be nothing, or it may contain user passwords (this has reportedly been seen on Yahoo's mail service), cryptographic keys, etc. From a theoretical perspective, the latter is the most serious; an attacker can impersonate the site, read old traffic that's been recorded, etc. (Besides, cryptographers take key leakage very personally; that keys won't leak is one of our core assumptions.) Is this a real risk, though? For many people, the answer is no.
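
(For the technically inclined: the root cause is a very old bug class, trusting a length field supplied by the other end of the connection. The sketch below is my own illustration of that class, not OpenSSL's actual heartbeat code.)

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical echo handler.  "claimed_len" comes from the request
       itself; "actual_len" is how many payload bytes really arrived.
       Copying claimed_len bytes without checking it sends back whatever
       happens to follow the request in memory. */
    unsigned char *echo_bad(const unsigned char *payload, size_t claimed_len)
    {
        unsigned char *reply = malloc(claimed_len);
        if (reply != NULL)
            memcpy(reply, payload, claimed_len);  /* reads past the real payload */
        return reply;
    }

    unsigned char *echo_fixed(const unsigned char *payload,
                              size_t claimed_len, size_t actual_len)
    {
        if (claimed_len > actual_len)
            return NULL;                          /* reject inconsistent lengths */
        unsigned char *reply = malloc(claimed_len);
        if (reply != NULL)
            memcpy(reply, payload, claimed_len);
        return reply;
    }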

In order to impersonate a site, an attacker has to redirect traffic you're sending towards that site. If you only use the Internet via well-controlled networks, you're probably safe. Yes, it's possible to redirect traffic on the Internet backbone, but it's rare and difficult. If a major intelligence agency is after you or that site, you're at risk; most of us aren't in that category. Cellular data networks are also in that category: it can be done, but it's hard.

For most people, the weak link is their access network: their home, their workplace, the public or semi-public networks they use. It's much easier to redirect traffic on a WiFi network or an Ethernet, and well within the capabilities of ordinary cybercriminals. If untrusted individuals or hacked machines use the same networks as you do, you're at much more risk. Your residence is probably safe if there are no hacked machines on it and if you observe good security precautions on your WiFi network (WPA2 and a strong password). A small office might be safe; a large one is rather more dangerous. All public hotspots are quite exposed.

The other risk of Heartbleed is someone decrypting old traffic. That sounds serious, though again it's hard to capture traffic if you're not law enforcement or an intelligence agency. On exposed nets, hackers can certainly do it, but they're not likely to record traffic they'll never be able to decrypt. Law enforcement might do that, if they thought they could get assistance from the local spooks to break the crypto. They could also redirect traffic, with cooperation from the ISP. The question, though, is whether or not they would; most police forces don't have that kind of technical expertise.

It's important to realize that exposure isn't all or nothing. If you regularly use a public hotspot to visit a social networking site but only do your banking at home, your banking password is probably safe. That's also why your home network gear is probably safe: you don't access it over the Internet. (One caveat there: you should configure it so that you can't access it remotely, only from your home. Too much gear is shipped with that set incorrectly. If you have a router, make sure remote access to it is turned off.)

One more threat is worth mentioning: client software, such as browsers and mail programs, uses SSL; some of these programs use OpenSSL and hence are vulnerable if you use them to connect to a hacked site. Fortunately, most major browsers and mailers are not affected, but to be safe, make sure you've installed all patches.

There's one password you should change nevertheless: your email password. It's generally used to reset all of your other accounts. "Probably safe" is not the same as "definitely". Accordingly, as soon as you know that your mail provider has patched its system (Google and Yahoo have, and Microsoft was never vulnerable), change it—and change it to something strong and use a password manager to save you from having to use the same new password everywhere.

Oh yes—if Martian Intelligence is after you (you know who you are), indeed you should be worried.

Open Source Quality Challenge Redux

9 April 2014

I don't have time to do a long blog post on Heartbleed, the new flaw in OpenSSL, but there's one notion going around that needs to be squashed. Specifically, some people are claiming that open source software is inherently more secure:

Because so many people are working on the software, that makes it so it's less susceptible to problems. For security it's more important in many ways, because often security is really hard to implement correctly. By having an open source movement around cryptography and SSL, people were able to ensure a lot of basic errors wouldn't creep into the products.
Not so. What matters is that people really look, and not just with their eyes, but with a variety of automated static and dynamic analysis tools.
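
As a small example of what such tools buy you (my own toy code, not OpenSSL's): the off-by-one below is easy to skim past, but building with clang or gcc's -fsanitize=address option makes the out-of-bounds read fail loudly the first time the code runs, and most static analyzers flag it without running it at all.

    #include <stdio.h>

    int main(void)
    {
        char buf[16] = "0123456789abcde";   /* exactly fills the buffer */

        /* Off-by-one: "<=" walks one byte past the end of buf.  It looks
           harmless but is undefined behavior; AddressSanitizer reports a
           stack-buffer-overflow here at run time. */
        for (size_t i = 0; i <= sizeof(buf); i++)
            putchar(buf[i]);
        putchar('\n');
        return 0;
    }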

Secure systems require more than that, though. They require a careful design process, careful coding, and careful review and testing. All of these need to be done by people who know how to build secure systems, and not just write code. Secure programming is different and harder; most programmers, however brilliant they may be, have never been taught security. And again, it's not just programming and it's not just debugging; design—humble design—matters a great deal.

I wrote about this problem in the open source community five years ago. I haven't seen nearly enough change. We need formal, structured processes, starting at the very beginning, before we'll see dramatic improvement.

Speculation About Goto Fail

24 February 2014

Following the logic in my previous post, I don't think that Apple's goto fail was a deliberate attack. Suppose it was, though. What can we learn about the attacker?

The first point is that it very clearly was not the NSA or other high-end intelligence agency. As I noted, this is too visible and too clumsy. While they may not object to that, covertly tinkering with Apple source code is a difficult and risky operation. If an investigation by Apple shows that this was an attack, they'll move heaven and earth to close the hole, whether it was technical, personnel, or procedural. In other words, using some covert access channel to install a back door effectively "spends" it; you may not get to reuse the channel. In that case, they definitely would not use this one-shot on something that is so easily spotted. (What could they have done? There are lots of random numbers lying around in cryptographic protocols; leak the session key—more likely, a part of it—in one of those numbers. Use convoluted code to generate this "random" number, complete with misleading comments.)

The next question is how this capability can be used. The vulnerability requires a so-called "man-in-the-middle" (MitM) attack, where an attacker has to receive all messages in each direction, decrypt them, reencrypt them, and forward them to the proper destination. If you're intercepting a lot of traffic, that's a lot of work. There are a number of ways to do MitM attacks; for technical reasons, it's a lot easier on the client or the server's LAN, or with the cooperation of either's ISP. There are certainly other ways to do it, such as DNS spoofing or routing attacks; that's why we need DNSSEC and BGPSEC. But ARP-spoofing from on-LAN is very, very easy.

We can narrow it down still further. If the main thing of interest is email and in particular email passwords, the odds are that the victim will be using one of the big "cloud" providers: Google, Yahoo, or Microsoft. You don't want to tap those nets near the servers; apart from the technical difficulty (they're good at running their machines securely, and they don't invite random strangers onto their nets), and apart from the fact that you'd be trying to take a sip from a firehose, it's hard to figure out where to put the tap. Consider gmail. I checked its IP address from my office, my house, and from a server I sometimes use in Seattle. The IP addresses resolved to data centers near DC, New York City, and Seattle, respectively. Which one should you use if you wanted my traffic? Note in particular that though my office is in New York City and my house is not, I got a New York server when trying from home, but not when trying from my office.
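
(That check is easy to repeat. Here's a small sketch of mine using the standard getaddrinfo() call, with "gmail.com" as the example name; run it from different networks and compare what the local resolver hands back.)

    #include <stdio.h>
    #include <string.h>
    #include <netdb.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
        struct addrinfo hints, *res, *p;
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;        /* report both IPv4 and IPv6 */
        hints.ai_socktype = SOCK_STREAM;

        int rc = getaddrinfo("gmail.com", "https", &hints, &res);
        if (rc != 0) {
            fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
            return 1;
        }
        for (p = res; p != NULL; p = p->ai_next) {
            char text[INET6_ADDRSTRLEN];
            const void *addr;
            if (p->ai_family == AF_INET)
                addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
            else
                addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
            if (inet_ntop(p->ai_family, addr, text, sizeof(text)) != NULL)
                printf("%s\n", text);       /* one line per resolved address */
        }
        freeaddrinfo(res);
        return 0;
    }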

(There's another possible attack vector: software updates. Apple's updates are digitally signed, so they can't be tampered with by an attacker; however, that isn't true for all third-party packages. Tampering with an update is a way to get malicious code installed on someone's machine.)

The conclusion is simple: go after the client LAN. If it's a LAN the attacker and the target both have access to—say, an Internet cafe or the like—the problem is very simple. In fact, if you're on-LAN with the target, you can home in on the target's MAC address (which doesn't go off-LAN), making your life very simple. Alternatively, hack into the wireless router or seek cooperation from the hotspot owner or the ISP. (Home routers are easily hacked, too, of course. If the attacker goes that route, there's no need for MAC-spoofing.)

We can also speculate that the victim is using an iOS device rather than a Mac. If nothing else, Apple sells far more iPhones and iPads than it does Macs. There's another reason to think that, though: Macs are expensive, and Internet cafes are much more popular in poorer countries where fewer people have broadband access at home. Of course, if they're using an iPhone in cellular mode, there's no LAN to camp on, but governments have little trouble gaining access to the networks run by their own telephone companies. Either way, it sounds very much like a targeted attack, aimed at a very few individuals.

My reasoning is, of course, highly speculative. If it was an attack, though, I conclude (based on this tenuous set of deductions) that it was a moderately capable government going after a small set of victims, either on a public net or via the local mobile phone carrier. Most of us had nothing to worry about—until, of course, the patch came out. And why a patch for just iOS, with MacOS waiting until later? Apart from the system test difficulties I mentioned last time, might it be because Apple was warned—by someone!—that this hole was being exploited in the wild? Again, I'm speculating...


Update: Nicholas Weaver notes that putting the exploit station on the victim's LAN or router solves another problem: identifying which machines may be vulnerable to this exploit. It can look at other traffic from that MAC address to decide if it's an Apple product, what browser it's using, etc., and not tip off the victim by pulling this stunt against Firefox or Internet Explorer users.

In principle, this can be done from off-LAN, but it's harder, especially if the hosts use strong sequence numbers and randomize the IPid field. (Some open source operating systems do that. I haven't checked on Macs or Windows boxes in a long time.)

Goto Fail

23 February 2014

As you've probably heard by now, there's a serious bug in the TLS implementations in iOS (the iPhone and iPad operating system) and MacOS. I'll skip the details (but see Adam Langley's excellent blog post if you're interested); the effect is that under certain conditions, an attacker can sit in the middle of an encrypted connection and read all of the traffic.

Here's the code in question:

	if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
		goto fail;
	if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
		goto fail;
		goto fail;
	if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
		goto fail;
	...
Note the doubled
		goto fail;
		goto fail;
That's incorrect and is at the root of the problem.
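
For reference, here is roughly what the sequence is supposed to look like. This is a sketch of the intended logic, not Apple's actual patch; note that a house style requiring braces on every if would also have confined a stray duplicated line to the error path:

	if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) {
		goto fail;
	}
	if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
		goto fail;
	}
	if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) {
		goto fail;
	}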

The mystery is how that extra line was inserted. Was it malicious or was it an accident? If the latter, how could it possibly have happened? Is there any conceivable way? Well, yes.

Here's a scan from a 1971 paperback printing of Roger Zelazny's Lord of Light.

Note the duplicated paragraph... Today, something like that could easily happen via a cut-and-paste error. (The error in that book was not authorial emphasis; I once checked a hardcover copy that did not have the duplication.)

There's another reason to think it was an accident: it's not very subtle. That sequence would stick out like a sore thumb to any programmer who looked at it; there are no situations where two goto statements in a row make any sense whatsoever. In fact, it's a bit of a surprise that the compiler didn't flag it as an error; the ordinary process of optimization should have noticed that all of the lines after the second goto fail could never be reached. The attempted sabotage of the Linux kernel in 2003 was much harder to spot; you'd need to notice the difference between = and == in a context where any programmer "just knows" it should be ==. However...

There are still a few suspicious items. Most notably, this bug is triggered if and only if something called "perfect forward secrecy" (PFS) has been offered as an encryption option for the session. Omitting the technical details of how it works, PFS is a problem for attackers—including government attackers—who have stolen or cryptanalyzed a site's private key. Without PFS, knowing the private key allows encrypted traffic to be read; with PFS, you can't do that. This flaw would allow an attacker who was capable of carrying out a "man-in-the-middle" (MitM) attack to read even PFS-protected traffic. While MitM attacks have been described in the academic literature at least as long ago as 1995, they're a lot less common than ordinary hacks and are more detectable. In other words, if this was a deliberate back door, it was probably done by a high-end attacker, someone who wants to carry out MitM attacks and has the ability to sabotage Apple's code base.

On the gripping hand, the error is noticeable by anyone poking at the file, and it's one of the pieces of source code that Apple publishes, which means it's not a great choice for covert action by the NSA or Unit 61398. With luck, Apple will investigate this and announce what they've found.

There are two other interesting questions here: why this wasn't caught during testing, and why the iOS fix was released before the MacOS fix.

The second one is easy: changes to MacOS require different "system tests". That is, testing just that one function's behavior in isolation is straightforward: set up a test case and see what happens. It's not enough just to test the failure case, of course; you also have to test all of the correct cases you can manage. Still, that's simple enough. The problem is "system test": making sure that all of the rest of the system behaves properly with the new code in. MacOS has very different applications than iOS does (to name just one, the mailer is very, very different); testing on one says little about what will happen on the other. For that matter, they have to create new tests on both platforms that will detect this case, and make sure that old code bases don't break on the new test case. (Software developers use something called "regression testing" to make sure that new changes don't break old code.) I'm not particularly surprised by the delay, though I suspect that Apple was caught by surprise by how rapidly the fix was reverse-engineered.

The real question, though, is why they didn't catch the bug in the first place. It's a truism in the business that testing can only show the presence of bugs, not the absence. No matter how much you test, you can't possibly try every combination of inputs that might trigger a failure; it's combinatorially impossible. I'm sure that Apple tested for this class of failure—talking to the wrong host—but they couldn't and didn't test for it in every possible variant of how the encryption takes place. The TLS protocol is exceedingly complex, with many different possibilities for how the encryption is set up; as noted, this particular flaw is only triggered if PFS is in use. There are many other possible paths through the code. Should this particular one have been tested more thoroughly? I suspect so, because it's a different way that a connection failure could occur; not having been on the inside of Apple's test team, though, I can't say for sure how they decided.

There's a more troubling part of the analysis, though. One very basic item for testers is something called a "code coverage" tool: it shows what parts of the system have or have not been executed during tests. While coverage alone doesn't suffice, it is a necessary condition; code that hasn't been executed at all during tests has never been tested. In this case, the code after the second goto was never executed. That could and should have been spotted. That it wasn't does not speak well of Apple's test teams. (Note to outraged testers: yes, I'm oversimplifying. There are things that are very hard to do in system test, like handling some hardware error conditions.) Of course, if you want to put on an extra-strength tinfoil hat, perhaps the same attackers who inserted that line of code deleted the test for the condition, but there is zero evidence for that.
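
(For readers unfamiliar with coverage tools, the idea is simply to record which lines ran during the test suite. A toy sketch of mine, not Apple's test code: build the file below with gcc's --coverage option, run it, then run gcov on the source; the annotated listing marks the never-executed error path with "#####".)

    #include <stdio.h>

    /* The error path below is never exercised by the "test" in main(),
       so a coverage report flags those lines as unexecuted. */
    int checked_divide(int a, int b, int *result)
    {
        if (b == 0)
            return -1;                        /* never reached by this run */
        *result = a / b;
        return 0;
    }

    int main(void)
    {
        int r;
        if (checked_divide(10, 2, &r) == 0)   /* only the success path */
            printf("10 / 2 = %d\n", r);
        return 0;
    }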

So what's the bottom line? There is a serious bug of unknown etiology in iOS and MacOS. As of this writing, Apple has fixed one but not the other. It may have been an accident; if it was enemy action, it was fairly clumsy. We can hope that Apple will announce the results of its investigation and review its test procedures.

Why the US Doesn't have Chip-and-PIN Credit Cards Yet

5 February 2014

In the wake of the Target security breach, there's been a fair amount of hand-wringing about why the US has lagged most of the rest of the world in deploying EMV (Europay, MasterCard and Visa)—chips—in credit cards. While I certainly think that American banks and card issuers should have moved sooner, they had their reasons for their decision. Arguably, they were even correct.

To understand the actual logic, it is necessary to remember three things:

In other words, they did a calculation, concluded that EMV did not make financial sense, and stuck with mag stripes.

Security mechanisms are not selected randomly. Rather, they're deployed to counter specific threats. If you don't see a threat that would be countered by EMV, there's no point to using it. One major source of loss—fraud on card applications—is not addressed at all by EMV. Forged cards have long been an issue, but on a small scale; this could be dealt with by things like holograms and quick revocation. Yes, databases of credit card numbers existed, but for the most part these weren't at risk; there was little, if any, network connectivity, and the criminal hacker community had not yet developed the tools to get at these databases.

(Those databases of card numbers turned out to be very important. Most people use a very few credit cards, often just one; that means that your credit card number is effectively your customer ID number. Your behavior can be (and is) tracked this way, especially if you buy both online and in a physical store.)

Quick revocation, implemented when merchants started deploying terminals a bit over 30 years ago, was very important. Before that, stores relied on books listing canceled card numbers. These books were issued at most weekly, and were cumbersome to use; as a result, they were generally consulted only for large transactions. (Exercise for the reader: at the conclusion of this blog post, explain why this behavior was quite rational.)

In other words, by around 1995, life was pretty good for American credit card accepters. There was a relatively cheap technology (mag stripes plus online verification), good databases for tracking, and decent law enforcement.

Life was different in Europe. European countries are much smaller, of course, which means that there's more cross-border travel; this in turn hinders law enforcement for cross-border crime. (Not very many Americans travel abroad, so there's not nearly as much of a cross-border issue affecting American banks.) For whatever reason, online verification terminals were not deployed as widely, but crime was increasing. (I've heard that different costs for telecommunications service and equipment played a big role, but I haven't verified that.) European banks also had a second-mover advantage: they hadn't invested as much in mag stripe technology, and in the meantime smart cards—chips in credit cards—had become feasible and affordable, which was not true circa 1980.

One element of the cost, then, is the infrastructure: the myriad terminals that merchants own, and the server complexes that accept and verify those transactions. None of that would work with EMV cards. Other costs, though, are more subtle. In one ironic example, Target itself tried deploying EMV ten years ago: Target was both an issuer and accepter of credit cards. It turned out, though, that processing a transaction with an EMV card is slower, which meant long lines at cash registers—lines that their competitors didn't have, because almost no one else in the US was using EMV.

This, then, was the problem: high conversion costs, high operational costs, disadvantages for early adopters, and little consumer demand for the chips—American consumers aren't responsible for fraudulent use, and as noted few Americans travel abroad where they might need the chips. Combine this with the lack of a significant threat model, and the decision seemed obvious: the financial calculations indicated that it wasn't a profitable move. Yes, there would be some loss due to preventable fraud, but the cost of that prevention would be greater than the likely losses. As noted above, fraud prevention is strictly a financial decision.

What happened, of course, was that the threat changed dramatically. Hackers did learn to penetrate store server complexes and card processors. The conversion is now an urgent matter, but it will still be years before most transactions in the US will involve chip-enabled cards and terminals.