Side-channel attacks in web browsers: practical, low-cost, and highly scalable
In a paper presented this week at the ACM Conference on Computer and Communications Security, four computer scientists from Columbia University—Yossi Oren, Vasileios P. Kemerlis, Simha Sethumadhavan, and Angelos D. Keromytis—demonstrate that it's possible to spy on activities of a computer user from a web browser, even in some cases determining what website(s) a user is visiting. This type of attack, dubbed spy-in-the-sandbox, works by observing activity in the CPU cache on Intel microprocessors. It affects close to 80% of PCs, and it represents an escalation and scaling up of what's possible with side-channel attacks, requiring no special software or close proximity to the victim. Fortunately the fix is easy and Web browser vendors, alerted to the problem, are updating their code bases to prevent such attacks. One other upside: the spy-in-the-sandbox attack may serve as a primitive for secure communications.
In a side-channel attack, an attacker is able to glean crucial information by analyzing physical emissions (power, radiation, heat, vibrations) produced during an otherwise secure computation. Side-channel attacks are not new; Cold-War examples abound, from aiming a laser beam at a window to pick up vibrations from conversations inside, or installing microphones in typewriters to identify letters being typed. On computers, side-channel-attacks often work by inferring information from how much time or battery power is required to process an input or execute an operation. Given precise side-channel measurements, an attacker can work backward to reconstruct the input.
Side channel attacks can be particularly insidious because they circumvent security mechanisms. Traditionally they are directed against targeted individuals and assume proximity and special software installed on the victim's computer. However, those assumptions may have to be rethought after four computer scientists from Columbia University (Yossef Oren, Vasileios P. Kemerlis, Simha Sethumadhavan, and Angelos D. Keromytis) demonstrated for the first time that it is possible to launch a side channel attack from within a web browser. The method is detailed in their paper The Spy in the Sandbox — Practical Cache Attacks in JavaScript and Their Implications, which was presented this week (October 12) at the ACM Conference on Computer and Communications Security.
The attack, dubbed spy-in-the-sandbox by the researchers, does not steal passwords or extract encryption keys. Instead, it shows that the privacy of computer users can be compromised from code running inside the highly restricted (sandboxed) environment of a web browser. The researchers were able to tell for instance whether a user was sitting at the computer and hitting keys or moving the mouse; more worrisome from a privacy perspective, the researchers could determine with 80% accuracy whether the victim was visiting certain websites.
More may be possible. As Yossef Oren, a postdoctoral researcher who worked on the project (now an Assistant Professor at the Department of Information Systems Engineering in Ben-Gurion University) puts it, "Attacks always become worse."
In one sense at least, spy-in-the-sandbox attacks are more dangerous than other side-channel attacks because they can scale up to attack 1,000, 10,000, or even a million users at once. Nor are only a few users vulnerable; the attack works against users running an HTML5-capable browser on a PC with an Intel CPU based on the Sandy Bridge, Ivy Bridge, Haswell, or Broadwell micro-architectures, which account for approximately 80% of PCs sold after 2011.
How it was done
Neither proximity or special software is required; the one assumption is that the victim can be lured to a website controlled by the attacker and leaves open the browser window.
What's running in that open browser window is JavaScript code capable of viewing and recording the flow of data in and out of the computer's cache, specifically the L3, or last-level, cache. (A cache is extra-fast memory close to the CPU to hold data currently in use; caching data saves the time it would take to fetch data from regular memory.)
That an attacker can launch a side-channel attack from a web browser is somewhat surprising. Websites running on a computer operate within a tightly contained environment (the sandbox) that restricts what the website's JavaScript can do.
However, the sandbox does not prevent JavaScript running in an open browser window from observing activity in the L3 cache, where websites interact with other processes running on the computer, even those processes protected by higher-level security mechanisms like virtual memory, privilege rings, hypervisors, and sandboxing.
The attack is possible because memory location information leaks out by timing cache events. If a needed element is not in the cache (a cache miss event), for instance, it takes longer to retrieve the data element. This allows the researchers to know what data is currently being used by the computer. To add a new data element to the cache, the CPU will need to evict data elements to make room. The data element is evicted not only from the L3 cache but from lower-level caches as well. To check whether data residing at a certain physical address are present in the L3 cache as well, the CPU calculates which part of the cache (cache set) is responsible for the address, then only checks the certain lines within the cache that correspond to this set, allowing the researchers to associate cache lines with physical memory.
In timing events, researchers were able to infer which instruction sets are active and which are not, and what areas in memory are active when data is being fetched. "It's remarkable that such a wealth of information about the system is available to an unprivileged webpage," says Oren.
"While previous studies have been able to see some of the same behavior, they relied on specially written software that had to be installed on the victim's machine. What's remarkable here is that we see some of the same information using only a browser," says Vasileios Kemerlis, a PhD student who worked on the project (now an Assistant Professor in the Computer Science Department at Brown University).
By selecting a group of cache sets and repeatedly measuring their access latencies over time, the researchers were able to construct a very detailed picture, or memorygram, of the real-time activity of the cache.
A memorygram of L3 cache activity: Vertical line segments indicate multiple adjacent cache sets are active during the same time period. Since consecutive cache sets (within the same page frame) correspond to consecutive addresses in physical memory, it may indicate the execution of a function call spanning more than 64 bytes of assembler instructions. The white horizontal line indicates a variable constantly being accessed during measurements, and probably belongs to the measurement code or to the underlying JavaScript runtime.

Such a detailed picture is possible only because many web browsers recently upgraded the precision of their timers, making it possible to time events with microsecond precision. If memorygrams were fuzzier and less detailed, it would not be possible to capture such small events as a cache miss. (Different browsers implement this new feature with different precisions.) High-resolution timers have recently been added to browsers as a way to give developers, especially game developers, sufficient fine-grained detail to know what processes might be slowing performance. Of course, the more information developers have, the more information an attacker can access also.
Different processes have different memorygrams, and the same is true for different websites; their memorygrams will differ depending on the data the site is using, how the site is structured, how many images it contains and the size of those images. These various parts of the website end up in different locations in memory, and need to be called and cached, giving each website its own distinctive signature.
The researchers visited 10 sites and recorded multiple memorygrams in each case to build a classifier that could, with 80% accuracy, determine if a website open on a victim's machine matched one of the 10 pre-selected sites. (The same website viewed on different browsers will exhibit slight differences; it's this noise that prevents 100% accuracy when matching memorygrams.)
Future work
As pernicious as is the side-channel attack, especially considering how practical, scalable, and low-cost it is, avoiding it is surprisingly easy: run only a single web browser window at a time. An across-the-board fix to prevent the attacks is easy also; have browsers return to using less precise timers (or alert users of the high-precision timers that there exist possible security vulnerabilities).
And in this story of data privacy at least there is a happy ending. In March 2015, the researchers shared their research with all major browser vendors; by September 2015, Apple, Google, and Mozilla had released updated versions of their browsers to close the identified security hole.
The researchers are not yet done examining the potential of web-based side-channel attacks. They will continue looking at the problem (on old versions of browsers) to test the attack at larger scale. They are also considering a more interesting question; can memorygrams be used for good purposes?
A pre-set memorygrams might be placed in memory to be viewed by a trusted party to convey information. One memorygram might represent a 1 bit, another a 0 bit. The process of communicating in this fashion would be slow, but it would be extremely difficult for an attacker to even figure out that explicit communication is occurring between two parties. Memorygrams might thus serve as a primitive in securely conveying information, and what was once a threat to security may serve to enhance it.
-Linda Crane
Posted 10/15/2015