June 2021
Where Did "Data Shadow" Come From? (26 June 2021)

Where Did "Data Shadow" Come From?

26 June 2021

Anyone who works in privacy is familiar with the term "data shadow": the digital record created by our transactions, our travels, our online activities. But where did the phase come from? Who used it first?

A number of authors have attributed it to Alan Westin, whose seminal book Privacy and Freedom (largely a report on the work of the Committee on Science and Law of the Association of the Bar of the City of New York) set the stage for most modern discussions of privacy. I followed the lead of some other authors, e.g., Kevin Keenan’s 2005 book Invasion of Privacy: A Reference Handbook in saying that the phrase appeared in Privacy and Freedom. Simson Garfinkel’s 2000 book Database Nation is more circumspect, attributing it to Westin in the 1960s, but not citing a particular work. However, repeated searches of Westin’s works do not turn up that phrase.

Other authors point to different origins. Harvey Choldin says that the term is used by German sociologists, and cites an interview. Can we push it back further?

Google Scholar turned up a 1983 reference in Politics on a Microchip by Edwin Black. A Google Ngram search by Garfinkel shows nothing before 1976. But a more complete Google search turned up a 1974 usage by Kirstein Aner [sic] in New Scientist. They quote her as saying

Every single person will be closely followed all his life by his own data shadow where everything he has ever done, learnt, bought, achieved, or failed to achieve is expressed in binary numbers for eternity. His shadow will often be taken for himself and we will have to pay for all the inaccuracies as if they were his own; he will never know who is looking at is shadow or why.
Kerstin Anér was a member of the Swedish legislature.

At that point, I asked a Swedish friend for help. He found some useful references to "dataskugga". Page 231 of Kretslopp Av Data says, via Google Translate:

By the term "data shadow" Kerstin Anér meant "the two-dimensional image of the individual that is evoked through one or more data registers" or "the series of crosses in different boxes that a living person is transformed into in the data registers and which tends by those in power to be treated as more real and more interesting than the living man himself.
The footnote for the quote points to Anér’s 1975 book Datamakt (Computer Power) and says (again via Google Translate):
Anér referred to in the article [unsigned], “Data law must protect privacy. But we are still sold as issues ”, Dagens Nyheter, 13 July 1973; Ten years later, the term was still in use: Calle Hård, "The story of the data shadow 600412– 5890", Aftonbladet, 11 March 1982.

There was one more tantalizing reference. Google Books turned up a reference to "data shadow" in OECD Informatics with a possible date of 1971. When I finally obtained access to Volume 12, I found that it was from 1978—in an article by Anér.

That’s as far back as I’ve been able to trace it: Kerstin Anér, probably in 1972 or 1973. Does anyone have any earlier uses, in any language?

Acknowledgments: Patrik Fältström did the Swedish research. Tarah Wheeler and Matthew Nuding tracked down the OECD volume. Simson Garfinkel also did assorted web searches.

https://www.cs.columbia.edu/~smb/blog/2021-06/2021-06-26.html