From the Metaweb to the Semantic Web: A Roadmap
December 10th, 2003
In previous articles on this Weblog I have suggested that we name the new evolution of the Web that is emerging from the confluence of Weblogging and RSS “The Metaweb.” The Metaweb is a metadata-driven Web of microcontent. We can see it emerging and chart its growth by looking at Technorati and Daypop, for example. The Metaweb is happening today; it is real. You are browsing it now by reading this page.
I believe that the Metaweb is the first step in the evolution of the coming Semantic Web. The Semantic Web is a Web of ontologically-defined information. Ontologies are formal systems of concepts that can be used to rigorously define what things mean and how they relate. So for example, an ontology about cameras would define basic concepts about cameras such as “lens,” “viewfinder,” “film,” “tripod,” “zoom lens,” “shutter speed,” etc.
By linking content about cameras to the appropriate definitions in the camera ontology, it then becomes possible for software to do a better job of understanding what the content means. That’s the first goal of the Semantic Web — simply adding more semantics to information so that it can be understood better by machines (and people). This can be done today.
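To make the camera example concrete, here is a minimal sketch in Python. It is only an illustration: the `Concept` class, the `CAMERA_ONTOLOGY` table, the one-line definitions, and the `annotate` function are all assumptions of mine, and the naive substring matching stands in for whatever real linking mechanism a Semantic Web tool would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """One concept in a toy ontology (hypothetical structure)."""
    name: str
    definition: str
    broader: Optional[str] = None  # name of a more general concept, if any


# Illustrative definitions only; a real ontology would be far more rigorous.
CAMERA_ONTOLOGY = {
    c.name: c
    for c in [
        Concept("lens", "Optical element that focuses light."),
        Concept("zoom lens", "A lens with variable focal length.", broader="lens"),
        Concept("viewfinder", "Window used to frame a shot."),
        Concept("film", "Light-sensitive recording medium."),
        Concept("tripod", "Three-legged camera support."),
        Concept("shutter speed", "How long the shutter stays open."),
    ]
}


def annotate(text, ontology):
    """Return the names of ontology concepts mentioned in the text.

    Uses plain case-insensitive substring matching, which is an
    assumption made for brevity, not a recommended technique.
    """
    lowered = text.lower()
    return sorted(name for name in ontology if name in lowered)
```

Calling `annotate("This zoom lens has a bright viewfinder.", CAMERA_ONTOLOGY)` returns `["lens", "viewfinder", "zoom lens"]`. Software that receives those links can look up each definition instead of guessing what the words mean.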
The second goal of the Semantic Web is to enable software to think more intelligently about information by providing a formal means to express and derive abstract logical relationships, inferences and proofs, and arbitrary formal statements about information. This can be done today too, but to do it well requires artificial intelligence. The first goal of the Semantic Web, semantic metadata, is near-term; the second goal, intelligent information processing, is long-term. The point of this article is that the Metaweb is the first step towards achieving both of these goals.
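As a very small taste of what deriving a logical relationship can mean, the sketch below infers new "is a kind of" facts from stated ones by computing a transitive closure. The function name `infer_is_a`, the `FACTS` set, and the "camera part" concept are hypothetical, and real Semantic Web reasoners do far more than this.

```python
def infer_is_a(facts):
    """Return the transitive closure of (child, parent) 'is a kind of' pairs."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        # Snapshot the set so we can add to it while looping.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Stated facts (illustrative): a zoom lens is a lens, a lens is a camera part.
FACTS = {("zoom lens", "lens"), ("lens", "camera part")}
```

Nobody stated that a zoom lens is a camera part, yet `infer_is_a(FACTS)` contains the pair `("zoom lens", "camera part")`. That derived fact is the kind of inference the second goal is about, in miniature.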
It all starts with RSS, in my opinion.
RSS is a metadata format for publishing and subscribing to microcontent objects, the units of the Metaweb. RSS (in various flavors and soon in Atom, a new open standard based on RSS) is already in wide use on Weblogs and content syndication sites today. Numerous large and small organizations and content providers publish and subscribe to RSS as a means to exchange and track ideas.
The first step in the process of evolving the Semantic Web is to bring about widespread adoption of the Metaweb: of weblogs, RSS, Atom and other emerging microcontent media. As microcontent begins to play an increasingly important role on the Web, and in our personal and work lives, it will set the stage for the gradual introduction of ever-richer microcontent formats and protocols, eventually leading to full Semantic Web microcontent.
Existing microcontent standards such as RSS are extremely barebones, and the emerging Atom spec looks to be no less lightweight so far. There are many problems with RSS and Atom. Chief among them, in my opinion, are two: first, while the formats are extensible, there is no easy way to make use of extensions; second, they do not provide semantically defined metatags. Anyone is free to extend such formats with whatever custom metatags they want, but currently there is no way to make those metatags instantly useful in applications that were not written specifically to recognize them, nor is there a way to define the meaning of those tags semantically so that software can interpret them without human intervention.
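The extension problem is easy to see in code. The sketch below parses a made-up RSS 2.0 item that carries a custom namespaced tag. The feed contents, the `cam` namespace URI, the `KNOWN_TAGS` set, and the `read_items` function are all invented for illustration; only Python's standard `xml.etree.ElementTree` module is real.

```python
import xml.etree.ElementTree as ET

# A made-up feed with one custom extension element (cam:shutterSpeed).
FEED = """<rss version="2.0" xmlns:cam="http://example.org/camera#">
  <channel>
    <title>Example feed</title>
    <item>
      <title>New zoom lens review</title>
      <link>http://example.org/posts/1</link>
      <cam:shutterSpeed>1/500</cam:shutterSpeed>
    </item>
  </channel>
</rss>"""

# Core item tags a generic reader was written to recognize.
KNOWN_TAGS = {"title", "link", "description", "pubDate", "guid"}


def read_items(feed_xml):
    """Split each item's children into recognized and unrecognized tags."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        known, unknown = {}, {}
        for child in item:
            # Namespaced tags appear as "{namespace-uri}localname".
            target = known if child.tag in KNOWN_TAGS else unknown
            target[child.tag] = child.text
        items.append((known, unknown))
    return items
```

The reader can carry the extension along as `{http://example.org/camera#}shutterSpeed` with the value `1/500`, but nothing in the feed tells it what a shutter speed is or what to do with one. That gap is exactly what semantically defined metatags would fill.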
So the next step after the widespread adoption of microcontent standards like RSS and Atom is to add support for extending the formats ubiquitously and for putting more semantics into microcontent. We are working on these problems at Radar Networks.
In order to add rich semantics to microcontent (or any content for that matter), there needs to be a formally defined semantics in the first place. This brings us to the subject of ontologies. Ontologies, as I have explained earlier, are formal conceptual models. They define systems of concepts.
There are a number of ontologies in existence today. For the most part, however, they are either too high-level and abstract or too vertical and specific to be of much use to the average Web surfer. For example, the SUMO ontology provides a good Upper Ontology that defines abstract concepts such as various units of measurement, various types of common entities, and common relationships among them. OpenCyc is another ontology that focuses mainly on “common sense knowledge,” such as concepts related to shopping, social relationships, places, etc. Other ontologies are more vertical: for instance, DARPA has funded the development of a number of ontologies that provide knowledge related to warfare, and the NIH has funded work on ontologies related to medicine. But there is no ontology that provides good semantic definitions of the kinds of things that ordinary consumers and knowledge workers deal with.
At Radar Networks we have been working to define this ontology, which we call “The Infoworker Ontology,” with a goal of eventually contributing it to a standards body. The Infoworker Ontology is a mid-level horizontal ontology that defines the semantics of common entities and relationships in the domain of knowledge work: things like documents, events, projects, tasks, people, groups, etc. The development and adoption of an open, extensible, and widely-used Infoworker Ontology is a necessary step towards making the Semantic Web useful to ordinary mortals (as opposed to academic researchers).
By connecting microcontent objects to the Infoworker Ontology a new generation of semantic-microcontent (what we call “metacontent”) is enabled. With the right tools even non-technical consumers will be able to author and use metacontent.
It is at this point that the Metaweb begins to evolve into the Semantic Web: The moment when someone adds semantics to microcontent in a manner that everyone can use. This is what we have done at Radar Networks. But to do it right is non-trivial: a number of incredibly complex and subtle issues must be solved.
After 3 years of working on this problem we are confident that we have the right approach. In future months I will begin to describe our approach on this Weblog. Stay tuned!