2 July 2011
There's been a fair amount of controversy of late about plans to
increase the number of top-level domains. With a bit of effort,
though — and with little disruption to the
infrastructure — we could abolish the issue
entirely. Any string whatsoever could be used, and
it would all Just Work. That is, it would Just Work
in a narrow technical sense; it would hurt innovation
and it would likely have serious economic failure modes.
The idea is to hash the name. Take an arbitrary string and run it
through a hash function; the result is a 40-hex-digit value such as
a46af6931d9dace2200617548fab3274549e308f.
Add a dot after every pair of hex digits, tack on a suffix like
.arb (for "arbitrary", since .hash might be seen
as having other connotations), and you get
a4.6a.f6.93.1d.9d.ac.e2.20.06.17.54.8f.ab.32.74.54.9e.30.8f.arb
which looks like a domain name, albeit a weird one. It not only looks
like one, it is; that string could be added to the DNS today
with no changes in code. We could even distribute the servers;
at every level, there are 256 easy-to-split subtrees. So what's wrong?
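The mapping just described can be sketched in a few lines of Python. The hash function is an assumption here: SHA-1 is used only because it yields 40 hex digits, like the value above. The function names `dotted_name` and `arb_name` are illustrative, not part of any standard or proposal.

```python
import hashlib


def dotted_name(hex_digest: str, suffix: str = "arb") -> str:
    """Put a dot after every pair of hex digits, then append the suffix."""
    pairs = [hex_digest[i:i + 2] for i in range(0, len(hex_digest), 2)]
    return ".".join(pairs + [suffix])


def arb_name(text: str) -> str:
    """Hash an arbitrary string and map it to a name under .arb.

    Assumption: SHA-1 over the UTF-8 bytes of the string, with no
    canonicalization at all.
    """
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return dotted_name(digest)


if __name__ == "__main__":
    # The digest quoted in the text, turned into a DNS-style name.
    print(dotted_name("a46af6931d9dace2200617548fab3274549e308f"))
    # Any string whatsoever can be mapped the same way.
    print(arb_name("Any string whatsoever"))
```

Each label is two hex digits, so every level of the tree has exactly 256 possible children, which is what makes the subtrees easy to split.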
The technical limitation is that every endpoint would have to be upgraded to do the hashing. Yes, that's a problem, but we've been through it before; supporting internationalized domain names required the same thing. And it works.
But how do endpoints know to do the hashing in this scheme? Something in the URL bar of a web browser? There are lots of things on the net that aren't web browsers; how will they know what to do? You can't necessarily tell from a string if it should be used literally or via this hashing scheme; "Amazon.com" appears to be the legal name of the corporation.
There's another problem: canonicalization. Similar strings will produce very different hash values. Here's an example:
| String | Hash value |
|---|---|
| New York Times | 7e145e463809ea5e7c28f2ddf103499f942c9ea3 |
| The New York Times | 1950c50c10f288dd6e9190361c968e1b8c4a3775 |
We could no doubt define some set of rules that would handle many common cases. Equally certainly, we'd miss many more. Companies could think of their own rules, but if they missed some we'd be back to cybersquatting and typosquatting. This would be worse than today, though, because near-identical names hash to unrelated parts of the tree.
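To make the canonicalization problem concrete, here is a minimal sketch of the kind of rule set one might try. Everything in it is an assumption for illustration: the rules themselves (lowercasing, dropping punctuation and a leading "the"), the use of SHA-1, and the names `canonicalize` and `name_hash`.

```python
import hashlib
import re


def canonicalize(name: str) -> str:
    """Apply a few illustrative rules before hashing (ASCII-only sketch)."""
    s = name.lower()
    s = re.sub(r"[^a-z0-9\s]", "", s)    # drop punctuation
    s = re.sub(r"\s+", " ", s).strip()   # collapse runs of whitespace
    s = re.sub(r"^the ", "", s)          # drop a leading article
    return s


def name_hash(name: str) -> str:
    """Hash the canonical form (SHA-1 is an assumed choice)."""
    return hashlib.sha1(canonicalize(name).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # These two now collide, as intended.
    print(name_hash("New York Times") == name_hash("The New York Times"))
    # This common variant is still missed by the rules above.
    print(name_hash("NY Times") == name_hash("New York Times"))
```

The rules catch the leading article but not the abbreviation, and every further rule invites another missed case.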
The real issue, though, is economic: who would run the different pieces of .arb? There are currently about 100M names in .com. Let's allow for growth and assume 1,000,000,000 names. To handle canonicalization, assume another factor of 10, for about 10B names. Does that work? To a first approximation, sure; we can delegate at each period in the name, and there are 256 values at each level. That means that going down just two levels, we could have 65,536 different registries, each handling about 150K names. That's easy to do, but a given registry could handle more than one zone. Let's assume that 1.5M names is a good size (which is somewhat challenging, though it's clearly possible since it works today). At about ten zones apiece, that means we'd need about 6,600 registries. But they have no way to do marketing; there's no way to target any particular business segment, since names are mapped to more or less random parts of the name tree. If a registry failed, an unpredictable portion of the net would suddenly be unreachable.
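The back-of-the-envelope numbers above can be reproduced directly. The inputs are the figures from the text; the function name `registry_estimate` and the dictionary keys are just labels for this sketch.

```python
def registry_estimate(total_names: int = 10_000_000_000,
                      fanout: int = 256,
                      levels: int = 2,
                      names_per_registry: int = 1_500_000) -> dict:
    """Reproduce the registry arithmetic from the paragraph above."""
    zones = fanout ** levels                      # delegate two levels down
    names_per_zone = total_names / zones          # roughly 150K
    # Using the rounded 150K figure, one registry holds about ten zones.
    zones_per_registry = names_per_registry // 150_000
    registries = zones / zones_per_registry
    return {
        "zones": zones,
        "names_per_zone": names_per_zone,
        "zones_per_registry": zones_per_registry,
        "registries": registries,
    }


if __name__ == "__main__":
    for key, value in registry_estimate().items():
        print(f"{key}: {value:,.0f}")
```

Dividing 10B names by 1.5M names per registry directly gives a slightly higher figure, about 6,700; either way the count is in the thousands.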
Most of us never see registries; when we want to create a new domain, we do business with a registrar. But every registrar would need to do business with every registry! The number of relationships would get ungainly, and again, there's no way to do targeted marketing. The registrars for, say, .museum can target museums, while ignoring, say, banks. With this scheme, everyone is doing business with everyone. It's great to have a global market; it's also very expensive.
18 July 2011
Circles are Google's answer to Facebook's friends, but they can do more. (The choice of word has also given rise to an endless debate: what is the verb form equivalent to "to friend"? To circle? To encircle? To circumscribe? (We won't go into the question of whether or not one should "befriend" people on Facebook instead of friending them…)) In their simplest form, circles serve two purposes: access control (who can see your posts?) and following à la Twitter: whose posts do you see by default? This is the first danger: the concept is overloaded. Just because I want to hear what someone else says doesn't mean that I want them to hear what I say. The problem can be avoided by proper assignment of people to different circles, but I'm very skeptical that people will get that right; they don't on Facebook.
The problem is worse, though: circles can be used for many more things. There are already lists of creative ways to use them, but such circles are also both access control and following lists. Google+ is still a very geeky place, and was geekier still early on, but I saw a lot of confusion from people I know to be ubergeeks. Once you get used to circles, they're great, but of course the current population is asking for more power still, such as Venn diagram operations on circles. Wonderful — until you get something wrong.
There are many good things here. I especially like that you're asked, explicitly, with whom any new post should be shared. On the other hand, you get no such choice if you post a comment to someone else's thread; indeed, you can't even tell with whom the original poster decided that it should be shared. But I fear that the overloading will lead to very big trouble.