How the Wisdom of the Crowds Manifests Itself in Language and on the World-Wide Web

I had an “AHA!” insight this morning and feel like running it past you here. :)

Let me start off by backtracking to the print era, in order to bring forth an example that will provide a basis for how this works online (on the world-wide web). Here’s an excerpt from the entry for “Language” (“Sprache”) in “Philosophisches Wörterbuch, Band 2” (Leipzig: VEB Verlag Enzyklopädie Leipzig, 1964, 1974, 12. Auflage, page 1161 ff.). The first sentence, translated here from the German, reads:

Language: a system of verbal signs that has arisen from the needs of social life, in particular from productive activity, and that is constantly developing; it serves the formulation of thoughts (→ Thinking) in the process of people’s → cognition of objective reality, and it makes possible the exchange of their thoughts and emotional experiences as well as the recording and preservation of acquired knowledge.

I don’t care to agree or disagree with this definition, but I am quite sure that if you grew up and/or have lived in “The West” in the past century, you will probably not have read such a definition of language before. The fact that this reference work has probably vanished from most libraries the world over (you need to understand that I actually collect such idiosyncratic reference works ;) ) means that very few people have access to this information resource. This is how the Wisdom of the Crowds applies to language: the more widely a dictionary is subscribed to (e.g. the Oxford English Dictionary, aka OED, or Webster’s Dictionary), the more valid its definitions are taken to be.

Note how this applies to the validity of the dictionary, and that the validity of the definition is actually derived from the scope of the dictionary. Whereas people in the United States are more likely to seek out Webster’s Dictionary, people in other English-speaking countries would probably be more likely to turn to the Oxford English Dictionary.

Now let’s turn to the world-wide web. In the early days of the web, people used to think that a website like was true/valid because the site had many users. But maybe that was a mistake. In my opinion, that would be the equivalent of many people using a particular word, rather than using a particular dictionary (on the Internet, the counterparts of dictionaries are referred to as “top-level domains” [TLDs]). The American equivalent to Webster’s Dictionary was (first and foremost) .com (“dot com”). For years and years I would hear people say “if you want to register a dot com” and then I would just roll my eyes. The fact that many millions of people have registered dot com domain names means that the stakeholders supporting this registry are widely distributed across the world.

What seems odd about this is that, unlike the entries in Webster’s dictionary or the OED, the registrant of example-1.tld-1 seems to have little or no influence over the content of example-2.tld-1. Also, there is very good reason to believe that the registrants of example-1.tld-1, example-1.tld-2 and example-1.tld-3 have more in common than the registrants of example-1.tld-1, example-2.tld-1 and example-3.tld-1 (see also “Wisdom of the Language”). There are some isolated cases where top-level domains have been set up by organizations which control the content on those domains (for example: .MUSEUM, .TRAVEL, .JOBS, etc.), but most (if not all) of these have been miserable failures (not sure about .GOV and/or .MIL). If ICANN’s plans come to fruition, then the number of these special / controlled top-level domains will increase from maybe about a dozen or so to several thousand.
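To make the two groupings above a little more concrete, here is a minimal Python sketch. It only uses the placeholder names from this post (example-1.tld-1 and so on); the helper names `split_domain` and `group_domains` are made up for illustration, and the split assumes simple two-part names of the form label.tld, which real domain names do not always follow.

```python
from collections import defaultdict


def split_domain(name):
    """Split a two-part name 'label.tld' into (second-level label, TLD).

    Assumption: the name has exactly the form label.tld; anything
    more complicated is outside the scope of this sketch.
    """
    label, _, tld = name.lower().rpartition(".")
    return label, tld


def group_domains(names):
    """Group the same names two ways: by second-level label and by TLD."""
    by_label = defaultdict(list)
    by_tld = defaultdict(list)
    for name in names:
        label, tld = split_domain(name)
        by_label[label].append(name)
        by_tld[tld].append(name)
    return dict(by_label), dict(by_tld)


# Placeholder names taken from the paragraph above.
names = [
    "example-1.tld-1", "example-1.tld-2", "example-1.tld-3",
    "example-2.tld-1", "example-3.tld-1",
]
by_label, by_tld = group_domains(names)

# Same word, different "dictionaries":
print(by_label["example-1"])
# Same "dictionary", different words:
print(by_tld["tld-1"])
```

The first group collects everyone who registered the same word under different top-level domains; the second collects everyone who happens to share a registry. My hunch, as stated above, is that the first group has more in common than the second, but the code only shows the grouping and says nothing about whether that hunch is true.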

In my opinion, simple mathematics should suffice to figure out that none of these will be “mainstream” the way .COM, .NET, .ORG, etc. are. If I meet someone on the street and they tell me about, or I will expect the corresponding website to be more “accepted”, in much the same way the definition of “free” in the traditional dictionaries is accepted when those dictionaries are on the bookshelves of many or even most homes. The same second-level domain (“free”) under such top-level domains as .JOBS, .TRAVEL or maybe someday .GOOG or .MSFT will be more like the definition of “language” I cited above from the philosophical dictionary published in the German Democratic Republic: quirky!

