First of all, I know Clay Shirky, and he’s a good fellow. But he’s simply wrong in his claim that "tagging" (of the flavor that is appearing on del.icio.us, which I call "social tagging") is inherently better than the use of formal ontologies. Clay favors the tagging approach because it is bottom-up and emergent in nature, and he argues against ontologies because pre-specification cannot anticipate the future. But this is a simplistic view of both approaches. One could just as easily argue against tagging systems because they don’t anticipate the future either: they are shortsighted, now-oriented systems that fail to capture the "big picture" or to optimally organize resources for the long term. Their saving grace is that over time they do (hopefully) self-organize and prune out the chaff, but that depends on both the level of participation and the quality of that participation.
Tagging is certainly useful, and collaborative authoring, editing and
filtering are indeed powerful paradigms. But folksonomies (at least
present-day ones) suffer from having too little formal structure:
tagging systems easily result in "metadata soup." Ontologies are on
the other end of the spectrum. They are particularly useful for
accurately modeling the actual structure of the world, or of conceptual
domains, but admittedly in some cases their formal structure can be
overly rigid and specific. The benefit of tagging is primarily the
adaptive nature of the resulting taxonomies. The benefit of ontologies
is the rich, and unambiguous, semantics they define. Tagging systems
are useful when all that is needed is the ability to link items to
topics; ontologies are useful when what is needed is to rigorously
define or understand what is meant, or not meant, by particular
classes, fields and relationships — something that is essential for
good machine-processing of data.
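To make the contrast concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the item IDs, the `Concept` class and the two helper functions are my own inventions, not part of del.icio.us or of any ontology toolkit. It shows a bare tag that merely links items to a topic, next to a typed annotation that also says what kind of thing is meant.

```python
from dataclasses import dataclass

# Tag-style metadata: each item is linked to bare strings with no further meaning.
tags = {
    "photo1": {"apple"},
    "doc7": {"apple"},
}


@dataclass(frozen=True)
class Concept:
    """A typed annotation (hypothetical): a label plus the class it belongs to."""
    label: str
    kind: str  # e.g. "Fruit" or "Company"


# Ontology-style metadata: the same label, but the class removes the ambiguity.
typed = {
    "photo1": {Concept("apple", "Fruit")},
    "doc7": {Concept("apple", "Company")},
}


def items_tagged(label, annotations):
    """Items linked to a bare tag, whatever the tagger meant by it."""
    return sorted(item for item, labels in annotations.items() if label in labels)


def items_about(kind, annotations):
    """Items annotated with at least one concept of the given class."""
    return sorted(
        item
        for item, concepts in annotations.items()
        if any(concept.kind == kind for concept in concepts)
    )


print(items_tagged("apple", tags))    # ['doc7', 'photo1']
print(items_about("Company", typed))  # ['doc7']
```

The tag lookup cannot tell the fruit from the company, which is fine if all you want is "things about apple." The typed lookup can, which is the sort of distinction a machine needs before it can process the data reliably.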
One point that Clay makes, which I think is very interesting, is his
view that perhaps the world is moving from a graph-theory information
model (ontologies) to a set-theory model (folksonomies) — but in
fact, under the surface this argument falls apart. OWL is nothing other
than a language for enabling extremely sophisticated set-theoretic
operations on information. In fact, if you actually look at the OWL
language itself, it is primarily composed of set-theoretic statements
(class union, intersection, complement and so on).
I don’t really view graph-theory and set-theory as mutually exclusive
— in fact, they are highly connected, if not equivalent at a deep
level. But expressing information in graph form or set form does have
different benefits for certain types of information processing. In
particular, graphs can be beneficial when associative reasoning is
important — for example, when traversing links or networks between
nodes is key. Sets on the other hand are useful when relevance or
mutual membership are most important.
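A small Python sketch of the two views, under loud assumptions: the tag names, post IDs, concept names and the `ancestors` helper are all made up for illustration. The set view answers a mutual-membership question with one intersection; the graph view answers an associative question by walking links.

```python
from collections import deque

# Set view: each (hypothetical) tag is simply the set of items carrying it.
tagged = {
    "semanticweb": {"post1", "post2", "post4"},
    "tagging": {"post2", "post3", "post4"},
}

# Mutual membership is a plain set intersection.
both = tagged["semanticweb"] & tagged["tagging"]

# Graph view: hypothetical "broader concept" links between concepts.
broader = {
    "Folksonomy": ["Taxonomy"],
    "Taxonomy": ["KnowledgeOrganizationSystem"],
    "Ontology": ["KnowledgeOrganizationSystem"],
    "KnowledgeOrganizationSystem": [],
}


def ancestors(node, edges):
    """Return every concept reachable from node by following edges, breadth-first."""
    seen = set()
    queue = deque(edges.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(edges.get(current, []))
    return seen


print(sorted(both))                               # ['post2', 'post4']
print(sorted(ancestors("Folksonomy", broader)))   # ['KnowledgeOrganizationSystem', 'Taxonomy']
```

Note that the same data could be stored either way: the edge list is just a set of pairs, and each tag's set could be drawn as edges from the tag to its items. That is the sense in which I don't see the two models as mutually exclusive.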
Clay discounts ontologies for many reasons. Most of his arguments have
some merit, but they fall short of convincing me (or anyone in the
field of knowledge representation). Indeed, tagging systems are just
special, highly simplistic cases of ontologies: namely, ontologies with
extremely basic semantics and almost no constraints. They are even
lower on the spectrum than taxonomies.
In fact, we could graph the spectrum of knowledge management as follows:
<------------------------------------------------------------->
Tags      Folders      Taxonomies      Databases      Ontologies
One of Clay’s early arguments against ontologies was that they are
merely systems for syllogistic logic, but that is simply not the case.
While the formal semantics of OWL does support logical inferencing and
reasoning, that is not the only value of ontologies. In fact, I think a
much more important benefit of ontologies is simply that they make the
semantics of data structures explicit, which makes it much easier both
to process information and to integrate information across different
applications and representations. Ontologies are, in
my opinion, simply the next evolution of database schemas. Surely, Clay
would not argue that database schemas have no place in the world!
Another way of looking at ontologies and the semantic web is that
they do for the meaning of data what other markup languages have done
for the layout and structure of data. HTML provided a way to mark up the
formatting of content. XML provided a way to mark up the structure of
content. RDF and OWL provide a way to mark up the meaning of
information. This is a logical progression, and it is something that
will really make the Web, desktop and enterprise easier to cope with.
Ontologies are not panaceas, but they are incredibly powerful when
used appropriately. And "appropriately" is the operative word: they are
not for everything. Indeed, in cases where social tagging is sufficient,
ontologies may simply be overkill. But there are many, many cases where
social tagging simply does not, and cannot, have the semantic rigour
that is needed.
So what’s next? I think that ultimately we will see a synthesis of
these two approaches emerge. Imagine a folksonomy combined with an
ontology — a "folktology." In a folktology, users could
instantly propose or modify ontological classes and properties in the
same manner that they do with tags in tagging systems. The most popular
ontological constructs (the most-instantiated classes, or slots on
classes, for example) would "rise to the top" and self-amplify, while
the less-instantiated ones would "fall to the bottom" over time. In
this way a self-organizing, self-pruning ontology could emerge within a
community. Such a system would have the ease and adaptability of a
folksonomy plus the semantic richness and formal structure of an
ontology. I think ultimately a folktology approach will be better than
either folksonomies or ontologies on their own.
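Here is a rough Python sketch of how the "rise to the top" mechanic might work. It is only a toy under my own assumptions: the `Folktology` class, its method names and the pruning threshold are hypothetical, and a real system would also need to track properties, users and time. Popularity is measured purely by how often a class is instantiated.

```python
from collections import Counter


class Folktology:
    """Toy model: anyone can propose a class, and usage decides its ranking."""

    def __init__(self):
        # Maps class name -> number of instances created by the community.
        self.instances = Counter()

    def instantiate(self, class_name):
        # Using a class that does not exist yet implicitly proposes it,
        # in the same way that typing a new tag creates that tag.
        self.instances[class_name] += 1

    def ranked(self):
        # Most-instantiated classes first; ties are broken alphabetically.
        return sorted(self.instances.items(), key=lambda pair: (-pair[1], pair[0]))

    def prune(self, min_instances):
        # Drop classes that never gained traction (the self-pruning step).
        self.instances = Counter(
            {name: count for name, count in self.instances.items() if count >= min_instances}
        )


community = Folktology()
for name in ["Person", "Book", "Person", "Blogg", "Person", "Book"]:
    community.instantiate(name)

print(community.ranked())  # [('Person', 3), ('Book', 2), ('Blogg', 1)]
community.prune(min_instances=2)
print(community.ranked())  # [('Person', 3), ('Book', 2)]
```

The misspelled "Blogg" class is proposed as easily as a tag would be, but it sinks and is eventually pruned because nobody else uses it.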