What's After the Real-Time Web?

In typical Web-industry style we’re all focused minutely on the leading trend-of-the-year, the real-time Web. But in this obsession we have become a bit myopic. The real-time Web, or what some of us call “The Stream,” is not an end in itself; it’s a means to an end. So what will it enable, where is it headed, and what’s it going to look like when we look back at this trend in 10 or 20 years?

In the next 10 years, The Stream is going to go through two big phases, focused on two problems, as it evolves:

  1. Web Attention Deficit Disorder. The first problem with the real-time Web that is becoming increasingly evident is that it has a bad case of ADD. There is so much information streaming in from so many places at once that it’s simply impossible to focus on anything for very long, and a lot of important things are missed in the chaos. The first generation of tools for the Stream are going to need to address this problem.
  2. Web Intention Deficit Disorder. The second problem with the real-time Web will emerge after we have made some real headway in solving Web attention deficit disorder. This second problem is about how to get large numbers of people to focus their intention, not just their attention. It’s difficult to get people to notice something; it’s even more difficult to get them to do something. Attending to something is simply noticing it. Intending to do something means actually taking action, expending some energy or effort. Intending is a lot more expensive, cognitively speaking, than merely attending. The power of collective intention is literally what changes the world, but we don’t have the tools to direct it yet.

The Stream is not the only big trend taking place right now. In fact, it’s just a strand that is being braided together with several other trends, as part of a larger pattern. Here are some of the other strands I’m tracking:

  • Messaging. The real-time Web, aka The Stream, is in essence about messaging. It’s a subset of the global trend towards building a better messaging layer for the Web. Multiple forms of messaging are emerging, from the publish-and-subscribe nature of Twitter and RSS, to things like Google Wave and PubSubHubbub, to broadcast-style messaging or multicasting via screencasts, conferencing, media streaming, and events in virtual worlds. The effect of these tools is that the speed and interactivity of the Web are increasing: the Web is getting faster. Information spreads more virally and more rapidly. In other words, “memes” (which we can think of as collective thoughts) are getting more sophisticated and gaining more mobility.
  • Semantics. The Web becomes more like a database. The resolution of search, ad targeting, and publishing increases. In other words, it’s a higher-resolution Web. Search will be able to target not just keywords but specific meaning. For example, you will be able to search precisely for products or content that meet certain constraints. Multiple approaches from natural language search to the metadata of the Semantic Web will contribute to increased semantic understanding and representation of the Web.
  • Attenuation. As information moves faster, and our networks get broader, information overload gets worse in multiple dimensions. This creates a need for tools to help people filter the firehose. Filtering in its essence is a process of attenuation — a way to focus attention more efficiently on signal versus noise. Broadly speaking there are many forms of filtering from automated filtering, to social filtering, to personalization, but they all come down to helping someone focus their finite attention more efficiently on the things they care about most.
  • The WebOS. As cloud computing resources, mashups, open linked data, and open APIs proliferate, a new level of aggregator is emerging. These aggregators may focus on one of these areas or may cut across them. Ultimately they are the beginning of true cross-service WebOSes. I predict this is going to be a big trend in the future. For example, instead of writing Web apps directly to various data sources and APIs in dozens of places, developers could write to a single WebOS aggregator that acts as middleware between the app and all these choices, which is much less complicated for them. The winning WebOS is probably not going to come from Google, Microsoft or Amazon; rather, it will probably come from someone neutral, with the best interests of developers as the primary goal.
  • Decentralization. As the semantics of the Web get richer, and the WebOS really emerges it will finally be possible for applications to leverage federated, Web-scale computing. This is when intelligent agents will actually emerge and be practical. By this time the Web will be far too vast and complex and rapidly changing for any centralized system to index and search it. Only massively federated swarms of intelligent agents, or extremely dynamic distributed computing tools, that can spread around the Web as they work, will be able to keep up with the Web.
  • Socialization. Our interactions and activities on the Web are increasingly socially networked, whether individual, group or involving large networks or crowds. Content is both shared and discovered socially through our circles of friends and contacts. In addition, new technologies like Google Social Search enable search results to be filtered by social distance or social relevancy. In other words, things that people you follow like get higher visibility in your search results. Socialization is a trend towards making previously non-social activities more social, and towards making already-social activities more efficient and broader. Ultimately this process leads to wider collaboration and higher levels of collective intelligence.
  • Augmentation. Increasingly we will see a trend towards augmenting things with other things. For example, augmenting a Web page or data set with links or notes from another Web page or data set. Or augmenting reality by superimposing video and data onto a live video image on a mobile phone. Or augmenting our bodies with direct connections to computers and the Web.

If these are all strands in a larger pattern, then what is the megatrend they are all contributing to? I think ultimately it’s collective intelligence — not just of humans, but also our computing systems, working in concert.

Collective Intelligence

I think that these trends are all combining, and going real-time. Effectively what we’re seeing is the evolution of a global collective mind, a theme I keep coming back to again and again. This collective mind is not just comprised of humans, but also of software and computers and information, all interlinked into one unimaginably complex system: A system that senses the universe and itself, that thinks, feels, and does things, on a planetary scale. And as humanity spreads out around the solar system and eventually the galaxy, this system will spread as well, and at times splinter and reproduce.

But that’s in the very distant future still. In the nearer term — the next 100 years or so — we’re going to go through some enormous changes. As the world becomes increasingly networked and social the way collective thinking and decision making take place is going to be radically restructured.

Social Evolution

Existing and established social, political and economic structures are going to either evolve or be overturned and replaced. Everything from the way news and entertainment are created and consumed, to how companies, cities and governments are managed will change radically. Top-down bureaucratic control systems are simply not going to be able to keep up or function effectively in this new world of distributed, omnidirectional collective intelligence.

Physical Evolution

As humanity and our Web of information and computations begin to function as a single organism, we will literally evolve into a new species: whatever comes after Homo sapiens. The environment we will live in will be a constantly changing sea of collective thought in which nothing and nobody will be isolated. We will be more interdependent than ever before. Interdependence leads to symbiosis, and eventually to the loss of generality and increasing specialization. As each of us is able to draw on the collective mind, the global brain, there may be less pressure on us to do things on our own that used to be solitary. What changes to our bodies, minds and organizations may result from these selective evolutionary pressures? I think we’ll see several, over multi-thousand-year timescales, or perhaps faster if we start to genetically engineer ourselves:

  • Individual brains will get less good at things like memorization and recall, calculation, reasoning, and long-term planning and action.
  • Individual brains will get better at multi-tasking, information filtering, trend detection, and social communication. The parts of the nervous system involved in processing live information will increase disproportionately to other parts.
  • Our bodies may actually improve in certain areas. We will become more, not less, mobile, as computation and the Web become increasingly embedded into our surroundings, and into augmented views of our environments. This may cause our bodies to get into better health and shape since we will be less sedentary, less at our desks, less in front of TVs. We’ll be moving around in the world, connected to everything and everyone no matter where we are. Physical strength will probably decrease overall as we will need to do less manual labor of any kind.

These are just some of the changes that are likely to occur as a result of the things we’re working on today. The Web and the emerging Real-Time Web are just a prelude of things to come.

Video: My Talk on the Evolution of the Global Brain at the Singularity Summit

If you are interested in collective intelligence, consciousness, the global brain and the evolution of artificial intelligence and superhuman intelligence, you may want to see my talk at the 2008 Singularity Summit. The videos from the Summit have just come online.

(Many thanks to Hrafn Thorisson who worked with me as my research assistant for this talk).

Twine's Explosive Growth

Twine has been growing at 50% per month since launch in October. We've been keeping that quiet while we wait to see if it holds. VentureBeat just noticed and did an article about it. It turns out our January numbers are higher than Compete.com estimates and February is looking strong too. We have a slew of cool viral features coming out in the next few months too as we start to integrate with other social networks. Should be an interesting season.

A Few Predictions for the Near Future

This is a five minute video in which I was asked to make some predictions for the next decade about the Semantic Web, search and artificial intelligence. It was done at the NextWeb conference and was a fun interview.


Learning from the Future with Nova Spivack from Maarten on Vimeo.

My Visit to DERI — World's Premier Semantic Web Research Institute

Earlier this month I had the opportunity to visit, and speak at, the Digital Enterprise Research Institute (DERI), located in Galway, Ireland. My hosts were Stefan Decker, the director of the lab, and John Breslin who is heading the SIOC project.

DERI has become the world’s premier research institute for the Semantic Web. Everyone working in the field should know about them, and if you can, you should visit the lab to see what’s happening there.

DERI is part of the National University of Ireland, Galway. With over 100 researchers focused solely on the Semantic Web, and very significant financial backing, DERI has, to my knowledge, the highest concentration of Semantic Web expertise on the planet today. Needless to say, I was very impressed with what I saw there. Here is a brief synopsis of some of the projects that I was introduced to:

  • Semantic Web Search Engine (SWSE) and YARS, a massively scalable triplestore.  These projects are concerned with crawling and indexing the information on the Semantic Web so that end-users can find it. They have done good work on consolidating data and also on building a highly scalable triplestore architecture.
  • Sindice — An API and search infrastructure for the Semantic Web. This project is focused on providing a rapid indexing API that apps can use to get their semantic content indexed, and that can also be used by apps to do semantic searches and retrieve semantic content from the rest of the Semantic Web. Sindice provides Web-scale semantic search capabilities to any semantic application or service.
  • SIOC — Semantically Interlinked Online Communities. This is an ontology for linking and sharing data across online communities in an open manner, that is getting a lot of traction. SIOC is on its way to becoming a standard and may play a big role in enabling portability and interoperability of social Web data.
  • JeromeDL is developing technology for semantically enabled digital libraries. I was impressed with the powerful faceted navigation and search capabilities they demonstrated.
  • notitio.us is a project for personal knowledge management of bookmarks and unstructured data.
  • SCOT, OpenTagging and Int.ere.st. These projects are focused on making tags more interoperable, and on generating social networks and communities from tags. They provide a richer tag ontology and framework for representing, connecting and sharing tags across applications.
  • Semantic Web Services.  One of the big opportunities for the Semantic Web that is often overlooked by the media is Web services. Semantics can be used to describe Web services so they can find one another and connect, and even to compose and orchestrate transactions and other solutions across networks of Web services, using rules and reasoning capabilities. Think of this as dynamic semantic middleware, with reasoning built-in.
  • eLite. I was introduced to the eLite project, a large e-learning initiative that is applying the Semantic Web.
  • Nepomuk.  Nepomuk is a large effort supported by many big industry players. They are making a social semantic desktop and a set of developer tools and libraries for semantic applications that are being shipped in the Linux KDE distribution. This is a big step for the Semantic Web!
  • Semantic Reality. Last but not least, and perhaps one of the most eye-opening demos I saw at DERI, is the Semantic Reality project. They are using semantics to integrate sensors with the real world. They are creating an infrastructure that can scale to handle trillions of sensors eventually. Among other things I saw, you can ask things like "where are my keys?" and the system will search a network of sensors and show you a live image of your keys on the desk where you left them, and even give you a map showing the exact location. The service can also email you or phone you when things happen in the real world that you care about — for example, if someone opens the door to your office, or a file cabinet, or your car, etc. Very groundbreaking research that could seed an entire new industry.

In summary, my visit to DERI was really eye-opening and impressive. I recommend that major organizations that want to really see the potential of the Semantic Web, and get involved on a research and development level, should consider a relationship with DERI — they are clearly the leader in the space.

Insightful Article About Twine

Carla Thompson, an analyst for Guidewire Group, has written what I think is a very insightful article about her experience participating in the early-access wave of the Twine beta.

We are now starting to let the press in, and next week we will begin letting in people from our wait list of over 30,000 users. We will be letting people into the beta in waves every week going forward.

As Carla notes, Twine is a work in progress and we are mainly focused on learning from our users now. We have lots more to do, but we’re very excited about the direction Twine is headed in, and it’s really great to see Twine getting so much active use.


How about Web 3G?

I’m here at the BlogTalk conference in Cork, Ireland with a range of bloggers and technologists discussing the emerging social Web. Besides myself, Ian Davis and Paul Miller from Talis, there are a bunch of other Semantic Web folks here, including Dan Brickley and a group from DERI Galway.

Over dinner a few of us were discussing the terms “Semantic Web” versus “Web 3.0” and we all felt a better term was needed. After some thinking, Ian Davis suggested “Web 3G.” I like this term better than Web 3.0 because it loses the “version number” aspect that so many objected to. It has a familiar ring to it as well, reminding me of the 3G wireless phone initiative. It also suggests Tim Berners-Lee’s “Giant Global Graph” or GGG — a synonym for the Semantic Web. Ian stayed up late and put together a nice blog post about the term, echoing many of my own sentiments about how this term should apply to a decade (the third decade of the Web), rather than to a particular technology.

Powerpoint Deck: Making Sense of the Semantic Web, and Twine

Now that I have been asked by several dozen people for the slides from my talk on "Making Sense of the Semantic Web," I guess it’s time to put them online. So here they are, under the Creative Commons Attribution License (you can share them with attribution to this site).

You can download the Powerpoint file at the link below:

Download nova_spivack_semantic_web_talk.ppt


Or you can view it right here:

Enjoy! And I look forward to your thoughts and comments.

Quick Video Preview of Twine

The New Scientist just posted a quick video preview of Twine to YouTube. It only shows a tiny bit of the functionality, but it’s a sneak peek.

We’ve been letting early beta testers into Twine and we’re learning a lot from all the great feedback, and also starting to see some cool new uses of Twine. There are around 20,000 people on the wait-list already, and more joining every day. We’re letting testers in slowly, focusing mainly on people who can really help us beta test the software at this early stage, as we go through iterations on the app. We’re getting some very helpful user feedback to make Twine better before we open it up to the world.

For now, here’s a quick video preview:

True Knowledge is Cool

The most interesting and exciting new app I’ve seen this month (other than Twine of course!) is a new semantic search engine called True Knowledge. Go to their site and watch their screencast to see what the next generation of search is really going to look like.

True Knowledge is doing something very different from Twine — whereas Twine is about helping individuals, groups and teams manage their private and shared knowledge, True Knowledge is about making a better public knowledgebase on the Web — in a sense they are a better search engine combined with a better Wikipedia. They seem to overlap more with what is being done by natural language search companies like Powerset and companies working on public databases, such as Metaweb and Wikia.

I don’t yet know whether True Knowledge is supporting W3C open-standards for the Semantic Web, but if they do, they will be well-positioned to become a very central service in the next phase of the Web. If they don’t they will just be yet another silo of data — but a very useful one at least. I personally hope they provide SPARQL API access at the very least. Congratulations to the team at True Knowledge! This is a very impressive piece of work.

The Next Big Thing: User-Contributed Metadata

Dan Farber has an interesting piece today about how user-contributed metadata will revolutionize online advertising. He mentions Facebook, Metaweb and Twine as examples. I agree, of course, with Dan’s thoughts on this, since these are some of the underlying motivations of Twine. The rich user-generated metadata in Twine is not just about users however, it’s about everything — products, companies, events, places, web pages, etc. The "semantic graph" we are building is far richer than a graph that is just about people. I’ll be blogging more about this in the future.

A Video and an Audio Cast About Twine

Last night I saw that the video of my presentation of Twine at the Web 2.0 Summit is online. My session, "The Semantic Edge," featured Danny Hillis of Metaweb demoing Freebase, Barney Pell demoing Powerset, and me demoing Twine (in that order), followed by a brief panel discussion with Tim O’Reilly. It’s a good panel and I recommend the video. However, the folks at Web 2.0 only filmed the presenters; they didn’t capture what we were showing on our screens, so you have to use your imagination as we describe our demos.

An audio cast of one of my presentations about Twine to a reporter was also put online recently, for a more in-depth description.

Radar Networks Announces Twine.com

My company, Radar Networks, has just come out of stealth. We’ve announced what we’ve been working on all these years: It’s called Twine.com. We’re going to be showing Twine publicly for the first time at the Web 2.0 Summit tomorrow. There’s lots of press coming out where you can read about what we’re doing in more detail. The team is extremely psyched and we’re all working really hard right now so I’ll be brief for now. I’ll write a lot more about this later.


Radar Networks Coming Out of Stealth – Friday, October 19

News Flash!

My company, Radar Networks, is coming out of stealth this Friday, October 19, 2007 at the Web 2.0 Summit, in San Francisco. I’ll be speaking on "The Semantic Edge Panel" at 4:10 PM, and publicly showing our Semantic Web online service for the first time. If you are planning to come to Web 2.0, I hope to see you at my panel.

Here’s the official Media Alert below:

               

(PRWEB)
October 15, 2007 — At the Web2.0 Summit on October 19th, Radar
Networks will announce a revolutionary new service that uses the power
of the emerging Semantic Web to enable a smarter way of sharing,
organizing and finding information. Founder and CEO Nova Spivack will
also give the first public preview of Radar’s application, which is one
of the first examples of “Web 3.0” – the next-generation of the Web, in
which the Web begins to function more like a database, and software
grows more intelligent and helpful.

Join Nova as he participates in “The Semantic Edge” panel discussion
with esteemed colleagues including Powerset’s Barney Pell and Metaweb’s
Daniel Hillis, moderated by Tim O’Reilly.

Who:   
Radar Networks Founder and CEO Nova Spivack

When:   
Friday, October 19, 2007
4:10 – 4:55 p.m.
   
Where: 
Web2.0 Summit
Palace Hotel
Grand Ballroom
2 New Montgomery Street
San Francisco,  California  94105
   

The Semantic Web, Collective Intelligence and Hyperdata

I’m posting this in response to a recent post by Tim O’Reilly which focused on disambiguating what the Semantic Web is and is not, as well as the subject of Collective Intelligence. I generally agree with Tim’s post, but I do have some points I would add by way of clarification. In particular, in my opinion,  the Semantic Web is all about collective intelligence, on several levels. I would also suggest that the term "hyperdata" is a possibly useful way to express what the Semantic Web is really all about.

What Makes Something a Semantic Web Application?

I agree with Tim that the term "Semantic Web" refers to the use of a particular set of emerging W3C open standards. These standards include RDF, OWL, SPARQL, and GRDDL. A key requirement for an application to have "Semantic Web inside," so to speak, is that it makes use of, or is compatible with, at the very least, basic RDF. An alternative definition is that for an application to be "Semantic Web" it must make at least some use of an ontology, using a W3C standard for doing so.

Semantic Versus Semantic Web

Many applications and services claim to be "semantic" in one manner or another, but that does not mean they are "Semantic Web." Semantic applications include any applications that can make sense of meaning, particularly in language such as unstructured text, or structured data in some cases. By this definition, all search engines today are somewhat "semantic" but few would qualify as "Semantic Web" apps.

The Difference Between "Data On the Web" and a "Web of Data"

The Semantic Web is principally about working with  data in a new and hopefully better way, and making that data available on the Web if desired in an open fashion such that other applications can understand and reuse it more easily. We call this idea "The Data Web" — the notion is that we are transforming the Web from a distributed file server into something that is more like a distributed database.

Instead of the basic objects being web pages, they are actually pieces of data (triples) and records formed from them (sets, trees, graphs or objects comprised of triples). There can be any number of triples within a Web page, and there can also be triples on the Web that do not exist within Web pages at all — they can come directly from databases for example.

One might respond to this by noting that there is already a lot of data on the Web, in XML and other formats — how is the Semantic Web different from that? What is the difference between "Data on the Web" and the idea of "The Data Web?"

The best answer to this question that I have heard was something that Dean Allemang said at a recent Semantic Web SIG in Palo Alto. Dean said, "Sure there is data on the Web, but it’s not actually a web of data."  The difference is that in the Semantic Web paradigm, the data can be linked to other data in other places, it’s a web of data, not just data on the Web.

I call this concept of interconnected data, "Hyperdata." It does for data what hypertext did for text. I’m probably not the originator of this term, but I think it is a very useful term and analogy for explaining the value of the Semantic Web.

Another way to think of it is that the current Web is a big graph of interconnected nodes, where the nodes are usually HTML documents, but in the Semantic Web we are talking about a graph of interconnected data statements that can be as general or specific as you want. A data record is a set of data statements about the same subject, and they don’t have to live in one place on the network — they could be spread over many locations around the Web.

A statement to the effect of "Sue lives in Palo Alto" could exist on site A, refer to a URI for a statement defining Sue on site B, a URI for a statement that defines "lives in" on site C, and a URI for a statement defining "Palo Alto" on site D. That’s a web of data. What’s cool is that anyone can potentially add statements to this web of data, it can be completely emergent.
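To make the "Sue lives in Palo Alto" example a bit more concrete, here is a minimal sketch in Python. It assumes triples are plain tuples of URI strings and that each site simply publishes a list of them. The `site-*.example` URIs and the `merge_sites` and `describe` helpers are hypothetical names made up for illustration; only the `rdfs:label` URI is a real W3C term. A real application would more likely use an RDF library and a triplestore.

```python
from collections import defaultdict

# Hypothetical URIs; each term is defined on a different site.
SUE = "http://site-b.example/people#sue"
LIVES_IN = "http://site-c.example/terms#livesIn"
PALO_ALTO = "http://site-d.example/places#palo-alto"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Each site publishes its own (subject, predicate, object) statements.
sites = {
    "site A": [(SUE, LIVES_IN, PALO_ALTO)],
    "site B": [(SUE, LABEL, "Sue")],
    "site C": [(LIVES_IN, LABEL, "lives in")],
    "site D": [(PALO_ALTO, LABEL, "Palo Alto")],
}

def merge_sites(sites):
    """Union the statements from every site into one graph (a set of triples)."""
    graph = set()
    for triples in sites.values():
        graph.update(triples)
    return graph

def describe(graph, subject):
    """Collect the record for one subject: every predicate/object pair about it."""
    record = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            record[p].append(o)
    return dict(record)

graph = merge_sites(sites)
sue_record = describe(graph, SUE)
```

The record for Sue is assembled from statements that live on two different sites, and nothing stops a fifth site from adding more statements about her later.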

The Semantic Web is Built by and for Collective Intelligence

This is where I think Tim and others who think about the Semantic Web may be missing an essential point. The Semantic Web is in fact highly conducive to "collective intelligence." It doesn’t require that machines add all the statements using fancy AI. In fact, in a next-generation folksonomy, when tags are created by human users, manually, they can easily be encoded as RDF statements. And by doing this you get lots of new capabilities, like being able to link tags to concepts that define their meaning, and to other related tags.

Humans can add tags that become semantic web content. They can do this manually or software can help them. Humans can also fill out forms that generate RDF behind the scenes, just as filling out a blog posting form generates HTML, XML, ATOM etc. Humans don’t actually write all that code, software does it for them, yet blogging and wikis for example are considered to be collective intelligence tools.
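As a rough sketch of what "generating RDF behind the scenes" could look like, the Python below turns one manual tagging action into a handful of triples and prints them in the N-Triples line format. Everything named here is an assumption for illustration: the `tags.example` vocabulary, the predicate names, and the `tag_to_triples` and `to_ntriples` helpers are invented, and telling URIs from literals by their prefix is a deliberate simplification.

```python
# Hypothetical vocabulary for illustration; a real application would likely
# reuse a published tag ontology instead.
TAG_NS = "http://tags.example/vocab#"

def tag_to_triples(user_uri, resource_uri, tag_label, meaning_uri=None):
    """Turn one manual tagging action into (subject, predicate, object) triples."""
    tag_uri = "http://tags.example/tag/" + tag_label.lower().replace(" ", "-")
    triples = [
        (resource_uri, TAG_NS + "taggedWith", tag_uri),
        (tag_uri, TAG_NS + "label", tag_label),
        (tag_uri, TAG_NS + "createdBy", user_uri),
    ]
    if meaning_uri is not None:
        # Link the tag to a concept that defines its meaning.
        triples.append((tag_uri, TAG_NS + "means", meaning_uri))
    return triples

def to_ntriples(triples):
    """Serialize triples as N-Triples lines; non-URI objects become plain literals."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http://") or o.startswith("https://"):
            obj = "<" + o + ">"
        else:
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append("<" + s + "> <" + p + "> " + obj + " .")
    return "\n".join(lines)

triples = tag_to_triples(
    "http://site-b.example/people#sue",
    "http://site-a.example/page1",
    "Semantic Web",
    meaning_uri="http://concepts.example/semantic-web",
)
print(to_ntriples(triples))
```

The user only typed a tag into a form; the software produced the statements, much as a blogging tool produces the HTML.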

So the concepts of folksonomy and tagging are truly orthogonal to the Semantic Web. They are not mutually exclusive at all. In fact the Semantic Web, or at least "Semantic Web Lite" (RDF + only basic use of OWL + basic SPARQL), is capable of modelling and publishing any data in the world in a more open way.

Any application that uses data could do everything it does using these technologies. Every single form of social, user-generated content and community could, and probably will, be implemented using RDF in one manner or another within the next decade or so. And in particular, RDF and OWL + SPARQL are ideal for social networking services — the data model is a much better match for the structure of the data and the network of users and the kinds of queries that need to be done.

Folktologies

This notion that somehow the Semantic Web is not about folksonomy needs to be corrected. For example, take Metaweb’s Freebase. Freebase is what I call a "folktology": an emergent, community-generated ontology. Users collaborate to add to the ontology and the knowledge base that is populated within it. That’s a wonderful example of collective intelligence, user-generated content, and semantics. (Technically, to my knowledge, they are not using RDF for this, but from what I can see their data model is functionally equivalent, and I would expect at least a SPARQL interface from them eventually.)

But that’s not all — check out TagCommons and this Tag Ontology discussion, and also the SKOS ontology — all of which are working on semantic ways of characterizing simple tags in order to enrich folksonomies and enable better collective intelligence.

There are at least two other places where the Semantic Web naturally leverages and supports collective intelligence. The first is the fact that people and software can generate triples (people could do it by hand, but generally they will do it by filling out Web forms or answering questions or dialog boxes etc.) and these triples can live all over the Web, yet interconnect or intersect (when they are about the same subjects or objects).

I can create data about a piece of data you created, for example to state that I agree with it, or that I know something else about it. You can create data about my data. Thus a data-set can be generated in a distributed way — it’s not unlike a wiki for example. It doesn’t have to work this way, but at least it can if people do this.
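Here is one way the "data about data" idea could be sketched in Python. It assumes each statement is given its own identifier so that other statements can point at it, which is loosely in the spirit of RDF reification but much simplified. The URIs, the `agreesWith` predicate, and the `who_agrees` helper are all hypothetical.

```python
# Each statement gets a hypothetical identifier so that other statements
# can refer to it. All URIs here are invented for illustration.
statements = {
    "http://site-a.example/stmt/1": (
        "http://site-b.example/people#sue",
        "http://site-c.example/terms#livesIn",
        "http://site-d.example/places#palo-alto",
    ),
}

AGREES_WITH = "http://site-e.example/terms#agreesWith"

# Statements published elsewhere whose object is another statement.
annotations = [
    ("http://site-e.example/people#alice", AGREES_WITH, "http://site-a.example/stmt/1"),
    ("http://site-f.example/people#bob", AGREES_WITH, "http://site-a.example/stmt/1"),
]

def who_agrees(statement_id, annotations):
    """Return, in sorted order, the subjects asserting agreement with a statement."""
    return sorted(
        s for s, p, o in annotations
        if p == AGREES_WITH and o == statement_id
    )

supporters = who_agrees("http://site-a.example/stmt/1", annotations)
```

Because the annotations only refer to the original statement by its identifier, they can be created and stored by anyone, anywhere, without touching the site that published it.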

The second point is that OWL, the ontology language, is designed to support an infinite number of ontologies — there doesn’t have to be just one big ontology to "rule them all." Anyone can make a simple or complex ontology and start to then make data statements that refer to it. Ontologies can link to or include other ontologies, or pieces of them, to create bigger distributed ontologies that cover more things.

This is kind of like mashing up not only the data but also the schemas. Both of these are examples of collective intelligence. In the case of ontologies, this is already happening: for example, many ontologies already make use of other ontologies such as Dublin Core and FOAF.
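The schema mash-up can be illustrated with a toy model in Python. This is only a sketch under simple assumptions: an ontology is a dictionary holding its own terms plus the names of the ontologies it includes, and `all_terms` is a hypothetical helper. The `myapp` ontology is invented; the Dublin Core and FOAF term names are real, but only a small sample is listed.

```python
# Toy ontologies: each declares its own terms and names the ontologies it
# includes. "myapp" is hypothetical; only a few DC and FOAF terms are shown.
ontologies = {
    "dc": {"imports": [], "terms": {"dc:title", "dc:creator"}},
    "foaf": {"imports": [], "terms": {"foaf:Person", "foaf:name", "foaf:knows"}},
    "myapp": {"imports": ["dc", "foaf"], "terms": {"myapp:Bookmark"}},
}

def all_terms(name, ontologies, seen=None):
    """Collect an ontology's terms plus those of everything it includes."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against circular includes
        return set()
    seen.add(name)
    terms = set(ontologies[name]["terms"])
    for imported in ontologies[name]["imports"]:
        terms |= all_terms(imported, ontologies, seen)
    return terms

vocabulary = all_terms("myapp", ontologies)
```

The small `myapp` ontology defines one term of its own and borrows the rest, which is the same pattern as building a bigger distributed ontology out of existing pieces.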

The point here is that there is in fact a natural and very beneficial fit between the technologies of the Semantic Web and what Tim O’Reilly defines Web 2.0 to be about (essentially collective intelligence). In fact the designers of the underlying standards of the Semantic Web specifically had "collective intelligence" in mind when they came up with these ideas. They were specifically trying to rectify several problems in the closed, data-silo world of old fashioned databases. The big motivation was to make data more integrated, to enable applications to share data more easily, and to be able to build data with other data, and to build schemas with other schemas. It’s all about enabling connections and network effects.

Now, whether people end up using these technologies to do interesting things that enable human-level collective intelligence (as opposed to just software level collective intelligence) is an open question. At least some companies such as my own Radar Networks and Metaweb, and Talis (thanks, Danny), are directly focused on this, and I think it is safe to say this will be a big emerging trend. RDF is a great fit for social and folksonomy-based applications.

Web 3.0 and the concept of "Hyperdata"

Where Tim defines Web 2.0 as being about collective intelligence generally, I would define Web 3.0 as being about "connective intelligence." It’s about connecting data, concepts, applications and ultimately people. The real essence of what makes the Web great is that it enables a global hypertext medium in which collective intelligence can emerge. In the case of Web 3.0, which begins with the Data Web and will evolve into the full-blown Semantic Web over a decade or more, the key is that it enables a global hyperdata medium (not just hypertext).

As I mentioned above, hyperdata is to data what hypertext is to text. Hyperdata is a great word — it is so simple and yet makes a big point. It’s about data that links to other data. It does for data what hypertext does for text. That’s what RDF and the Semantic Web are really all about. Reasoning is NOT the main point (but is a nice future side-effect…). The main point is about growing a web of data.
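
Here is a rough sketch of what "data that links to other data" can look like, written in Python. Everything in it is an assumption made for illustration: the `example.org` URIs, the property names, and the `follow` helper are hypothetical, and a real hyperdata application would fetch each record from the Web rather than from a local dictionary.

```python
# Hypothetical in-memory "web of data": each record is keyed by a URI, and
# property values may themselves be URIs that point at other records.
DATA = {
    "http://example.org/people/alice": {
        "name": "Alice",
        "worksFor": "http://example.org/orgs/acme",
    },
    "http://example.org/orgs/acme": {
        "name": "Acme",
        "locatedIn": "http://example.org/places/sf",
    },
    "http://example.org/places/sf": {"name": "San Francisco"},
}


def follow(start, path, data=DATA):
    """Follow a chain of property names from a starting URI, hop by hop."""
    current = start
    for prop in path:
        current = data[current][prop]
    return current


# Hop from a person, to their employer, to the employer's city.
city = follow("http://example.org/people/alice", ["worksFor", "locatedIn", "name"])
print(city)
```

Just as a reader clicks from page to page, an application can hop from record to record. No reasoning is involved, only links.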

Just as the Web enabled a huge outpouring of collective intelligence via an open global hypertext medium, the Semantic Web is going to enable a similarly huge outpouring of collective knowledge and cognition via a global hyperdata medium. It’s the Web, only better.

Radar Networks Progress Update

I’m sitting in the Dynasty Lounge in Taipei, en route to Singapore, where I will be addressing ministers of the government there on the potential of the Semantic Web. Singapore is a very forward-looking country and they have some very exciting new initiatives in the works there. After that I hope to have a little time for a vacation and then I’m heading back to San Francisco, returning on August 1.

I should have email for all or most of the time here, so that is the best way to reach me directly. And of course you can comment on this blog too.

As for the company — lots of good news here at Radar Networks.

First of all, the team has gotten the next version of our alpha up (our hosted Web service for the Semantic Web) and it’s getting awesome! We’re on track for an invite-only launch in the fall timeframe as planned.

We also chose a brand for our product, with help from the mad geniuses at Igor International. The new brand is secret until launch but we love it. We’ll be announcing the brand close to launch.

If you want to be invited to our launch and be one of the first to see how useful the Semantic Web really can be — sign up for our mailing list at http://www.radarnetworks.com/ — and feel free to invite your friends to sign up too. Only people who sign up will get on our waiting list. We already have around 2000 bloggers and other influencers pre-registered, and more are coming every day, so don’t wait — it will be on a first-come, first-served basis. We’ll be letting people into the service in waves.

Another exciting development: Several of the world’s big media empires have started approaching me to see how they can get involved in the network we are building here at Radar Networks. They are interested in the potential of the Semantic Web for adding new capabilities to their content and new services for their audiences. That’s an exciting direction to explore for us. If you have large collections of interesting, useful, content of value to particular audiences, or if you have large audiences that need a better way to do stuff on the Web, feel free to drop me a line and we can discuss how you might be able to get involved with the Semantic Web in partnership with us.

In other news, I am still inundated with hundreds of emails from interesting people who read the articles about us in this month’s Business 2.0 and BusinessWeek. It’s been very interesting to connect with so many other thinkers and businesses. Forgive me in advance if it takes me a while to write back — I promise I will.

I can’t wait to come back to San Francisco and start playing with our alpha — it’s really getting there. All the credit should go to our awesome development team. They’ve been writing tons of code and it’s starting to really pay off.

Enriching the Connections of the Web — Making the Web Smarter

Web 3.0 — aka The Semantic Web — is about enriching the connections of the Web. By enriching the connections within the Web, the entire Web may become smarter.

I believe that collective intelligence primarily comes from connections — this is certainly the case in the brain where the number of connections between neurons far outnumbers the number of neurons; certainly there is more "intelligence" encoded in the brain’s connections than in the neurons alone. There are several kinds of connections on the Web:

  1. Connections between information (such as links)
  2. Connections between people (such as opt-in social relationships, buddy lists, etc.)
  3. Connections between applications (web services, mashups, client server sessions, etc.)
  4. Connections between information and people (personal data collections, blogs, social bookmarking, search results, etc.)
  5. Connections between information and applications (databases and data sets stored or accessible by particular apps)
  6. Connections between people and applications (user accounts, preferences, cookies, etc.)

Are there other kinds of connections that I haven’t listed? If so, please let me know!

I believe that the Semantic Web can actually enrich all of these types of connections, adding more semantics not only to the things being connected (such as representations of information or people or apps) but also to the connections themselves.

In the Semantic Web approach, connections are represented with statements of the form (subject, predicate, object) where the elements have URIs that connect them to various ontologies where their precise intended meaning can be defined. These simple statements are sometimes called "triples" because they have three elements. In fact, many of us are working with statements that have more than three elements ("tuples"), so that we can represent not only the subject, predicate and object of statements, but also things like provenance (where did the data for the statement come from?), timestamp (when was the statement made?), and other attributes. There really is no limit to what kind of metadata can be stored in these statements. It’s a very simple, yet very flexible and extensible data model that can represent any kind of data structure.
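
As a rough illustration of this data model, here is a small Python sketch. The `Statement` type, its field names, the `match` helper and the `example.org` URIs are all hypothetical and chosen only for this example. It is not how any particular RDF store is implemented. It just shows a triple extended with optional provenance and timestamp fields, plus a simple pattern match.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    """A (subject, predicate, object) triple with optional extra elements."""
    subject: str
    predicate: str
    obj: str
    source: Optional[str] = None     # provenance: where the statement came from
    timestamp: Optional[str] = None  # when the statement was made (ISO 8601)


EX = "http://example.org/"  # hypothetical namespace

statements = [
    # A plain triple, with no extra metadata.
    Statement(EX + "alice", EX + "friendOf", EX + "bob"),
    # The same shape of statement, extended with provenance and a timestamp.
    Statement(EX + "alice", EX + "employeeOf", EX + "acme",
              source=EX + "hr-feed", timestamp="2007-07-01T09:00:00Z"),
    Statement(EX + "bob", EX + "employeeOf", EX + "acme",
              source=EX + "blog-post", timestamp="2007-07-02T12:30:00Z"),
]


def match(stmts, subject=None, predicate=None, obj=None):
    """Return statements matching the pattern; None acts as a wildcard."""
    return [
        s for s in stmts
        if (subject is None or s.subject == subject)
        and (predicate is None or s.predicate == predicate)
        and (obj is None or s.obj == obj)
    ]


# Who is an employee of Acme, and where did those statements come from?
employees = [s.subject for s in match(statements, predicate=EX + "employeeOf", obj=EX + "acme")]
sources = {s.source for s in match(statements, predicate=EX + "employeeOf")}
print(employees, sources)
```

Adding another attribute, such as a confidence score, would only mean adding another optional field. The statements that do not carry it are unaffected.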

The important point for this article, however, is that this data model does not limit us to a single type of connection. The present Web basically provides just the HREF hotlink, which simply means "A and B are linked" and may carry minimal metadata in some cases. The Semantic Web, by contrast, enables an infinite range of arbitrarily defined connections to be used. The meaning of these connections can be very specific or very general.

For example one might define a type of connection called "friend of" or a type of connection called "employee of" — these have very different meanings (different semantics) which can be made explicit and also machine-readable using OWL. By linking a page about a person with the "employee of" link to another page about a different person, we can express that one of them employs the other. That is a statement that any application which can read OWL is able to see and correctly interpret, by referencing the underlying definition of "employee of" which is defined in some ontology and might for example specify that an "employee of" relation connects a person to a person or organization who is their employer. In other words, rather than just linking things with the generic "hotlink" we are all used to, they can now be linked with specific kinds of links that have very particular and unambiguous meaning and logical implications.
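
The "employee of" example can be sketched in a few lines of Python. This is a simplification under stated assumptions: the `ONTOLOGY` and `TYPES` dictionaries, the names in them, and both helper functions are hypothetical, and the sketch treats the definition as a validation rule. RDFS and OWL actually use domain and range declarations to infer the types of things rather than to reject data, so take this as an illustration of links carrying defined meaning, not as OWL semantics.

```python
# Hypothetical ontology: each link type declares what it may connect.
ONTOLOGY = {
    "employeeOf": {"domain": {"Person"}, "range": {"Person", "Organization"}},
    "friendOf": {"domain": {"Person"}, "range": {"Person"}},
}

# Hypothetical type assignments for the things being linked.
TYPES = {
    "alice": "Person",
    "bob": "Person",
    "acme": "Organization",
}


def is_consistent(subject, predicate, obj):
    """Check a typed link against the ontology's definition of its predicate."""
    definition = ONTOLOGY[predicate]
    return (TYPES[subject] in definition["domain"]
            and TYPES[obj] in definition["range"])


def employers_of(links, person):
    """Read the 'employeeOf' links to find who employs a person."""
    return [o for (s, p, o) in links if s == person and p == "employeeOf"]


links = [("alice", "employeeOf", "acme"), ("alice", "friendOf", "bob")]

print(is_consistent("alice", "employeeOf", "acme"))  # a person employed by an organization
print(is_consistent("alice", "friendOf", "acme"))    # an organization is not a Person here
print(employers_of(links, "alice"))
```

A generic hotlink between the same two pages could not answer the `employers_of` question at all, because it would not say which kind of relationship the link stands for.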

This has the potential at least to dramatically enrich the information-carrying capacity of connections (links) on the Web. It means that connections can carry more meaning, on their own. It’s a new place to put meaning in fact — you can put meaning between things to express their relationships. And since connections (links) far outnumber objects (information, people or applications) on the Web, this means we can radically improve the semantics of the structure of the Web as a whole — the Web can become more meaningful, literally. This makes a difference, even if all we do is just enrich connections between gross-level objects (in other words, connections between Web pages or data records, as opposed to connections between concepts expressed within them, such as for example, people and companies mentioned within a single document).

Even if the granularity of this improvement in connection technology is relatively gross level it could still be a major improvement to the Web. The long-term implications of this have hardly been imagined let alone understood — it is analogous to upgrading the dendrites in the human brain; it could be a catalyst for new levels of computation and intelligence to emerge.

It is important to note that, as illustrated above, there are many types of connections that involve people. In other words the Semantic Web, and Web 3.0, are just as much about people as they are about other things. Rather than excluding people, they actually enrich their relationships to other things. The Semantic Web should, among other things, enable dramatically better social networking and collaboration to take place on the Web. It is not only about enriching content.

Now where will all these rich semantic connections come from? That’s the billion dollar question. Personally I think they will come from many places: from end-users as they find things, author content, bookmark content, share content and comment on content (just as hotlinks come from people today), as well as from applications which mine the Web and automatically create them. Note that even when mining the Web a lot of the data actually still comes from people — for example, mining Wikipedia or a social network yields lots of great data that was ultimately extracted from user contributions. So mining and artificial intelligence do not always imply "replacing people" — far from it! In fact, mining is often best applied as a means to effectively leverage the collective intelligence of millions of people.

These are subtle points that are very hard for non-specialists to see — without actually working with the underlying technologies such as RDF and OWL they are basically impossible to see right now. But soon there will be a range of Semantically-powered end-user-facing apps that will demonstrate this quite obviously. Stay tuned!

Of course these are just my opinions from years of hands-on experience with this stuff, but you are free to disagree or add to what I’m saying. I think there is something big happening though. Upgrading the connections of the Web is bound to have a significant effect on how the Web functions. It may take a while for all this to unfold however. I think we need to think in decades about big changes of this nature.

Web 3.0 — Next-Step for Web?

The Business 2.0 Article on Radar Networks and the Semantic Web just came online. It’s a huge article. In many ways it’s one of the best popular articles written about the Semantic Web in the mainstream press. It also goes into a lot of detail about what Radar Networks is working on.

One point of clarification, just in case anyone is wondering…

Web 3.0 is not just about machines — it’s actually all about humans — it leverages social networks, folksonomies, communities and social filtering AS WELL AS the Semantic Web, data mining, and artificial intelligence. The combination of the two is more powerful than either one on its own. Web 3.0 is Web 2.0 + 1. It’s NOT Web 2.0 – people. The "+ 1" is the addition of software and metadata that help people and other applications organize and make better sense of the Web. That new layer of semantics — often called "The Semantic Web" — will add to and build on the existing value provided by social networks, folksonomies, and collaborative filtering that are already on the Web.

So at least here at Radar Networks, we are focusing much of our effort on helping people help themselves, and each other, make sense of the Web. We leverage the amazing intelligence of the human brain, and we augment that using the Semantic Web, data mining, and artificial intelligence. We really believe that the next generation of collective intelligence is about creating systems of experts, not expert systems.

Business 2.0 and BusinessWeek Articles About Radar Networks

It’s been an interesting month for news about Radar Networks. Two significant articles came out recently:

Business 2.0 Magazine published a feature article about Radar Networks in their July 2007 issue. This article is perhaps the most comprehensive article to date about what we are working on at Radar Networks. It’s also one of the better articulations of the value proposition of the Semantic Web in general. It’s a fun read, with gorgeous illustrations, and I highly recommend reading it.

BusinessWeek posted an article about Radar Networks on the Web. The article covers some of the background that led to my interests in collective intelligence and the creation of the company. It’s a good article and covers some of the bigger issues related to the Semantic Web as a paradigm shift. I would add one or two points of clarification in addition to what was stated in the article: Radar Networks is not relying solely on software to organize the Internet — in fact, the service we will be launching combines human intelligence and machine intelligence to start making sense of information, and helping people search and collaborate around interests more productively. One other minor point related to the article — it mentions the story of EarthWeb, the Internet company that I co-founded in the early 1990s: EarthWeb’s content business actually was sold after the bubble burst, and the remaining lines of business were taken private under the name Dice.com. Dice is the leading job board for techies and was one of our properties. Dice has been highly profitable all along and recently filed for a $100M IPO.

A Bunch of New Press About Radar Networks

We had a bunch of press hits today for my startup, Radar Networks:

PC World Article on Web 3.0 and Radar Networks

Entrepreneur Magazine interview

We’re also proud to announce that Jim Hendler, one of the founding gurus of the Semantic Web, has joined our technical advisory board.

Metaweb and Radar Networks

This is just a brief post because I am actually slammed with VC meetings right now. But I wanted to congratulate our friends at Metaweb for their pre-launch announcement. My company, Radar Networks, is the only other major venture-funded play working on the Semantic Web for consumers so we are thrilled to see more action in this sector.

Metaweb and Radar Networks are working on two very different applications (fortunately!). Metaweb is essentially making the Wikipedia of the Semantic Web. Here at Radar Networks we are making something else — but equally big — and in a different category. Just as Metaweb is making a semantic analogue to something that exists and is big, so are we: but we’re more focused on the social web — we’re building something that everyone will use. But we are still in stealth so that’s all I can say for now.

This is now an exciting two-horse space. We look forward to others joining the excitement too. Web 3.0 is really taking off this year.

An interesting side note: Danny Hillis (founder of Metaweb), Lew Tucker (CTO of Radar Networks) and I (founder of Radar Networks) all worked together at Thinking Machines (an early AI massively parallel computer company). It’s fascinating that we’ve all somehow come to think that the only practical way to move machine intelligence forward is by having us humans and applications start to employ real semantics in what we record in the digital world.

Web 3.0 Roundup: Radar Networks, Powerset, Metaweb and Others…

It’s been a while since I posted about what my stealth venture, Radar Networks, is working on. Lately I’ve been seeing growing buzz in the industry around the "semantics" meme — for example at the recent DEMO conference, several companies used the word "semantics" in their pitches. And of course there have been some fundings in this area in the last year, including Radar Networks and other companies.

Clearly the "semantic" sector is starting to heat up. As a result, I’ve been getting a lot of questions from reporters and VCs about how what we are doing compares to other companies such as Powerset, Textdigger, and Metaweb. There was even a rumor that we had already closed our series B round! (That rumor is not true; in fact the round hasn’t started yet, although I am getting very strong VC interest and we will start the round pretty soon).

In light of all this I thought it might be helpful to clarify what we are doing, how we understand what other leading players in this space are doing, and how we look at this sector.

Indexing the Decades of the Web

First of all, before we get started, there is one thing to clear up. The Semantic Web is part of what is being called "Web 3.0" by some, but it is in my opinion really just one of several converging technologies and trends that will define this coming era of the Web. I’ve written here about a proposed definition of Web 3.0, in more detail.

For those of you who don’t like terms like Web 2.0, and Web 3.0, I also want to mention that I agree — we all want to avoid a rapid series of such labels or an arms-race of companies claiming to be > x.0. So I have a practical proposal: Let’s use these terms to index decades since the Web began. This is objective — we can all agree on when decades begin and end, and if we look at history each decade is characterized by various trends.

I think this is a reasonable proposal and actually useful (and also avoids endless new x.0’s being announced every year). Web 1.0 was therefore the first decade of the Web: 1990 – 2000. Web 2.0 is the second decade, 2000 – 2010. Web 3.0 is the coming third decade, 2010 – 2020 and so on. Each of these decades is (or will be) characterized by particular technology movements, themes and trends, and these indices, 1.0, 2.0, etc. are just a convenient way of referencing them. This is a useful way to discuss history, and it’s not without precedent. For example, various dynasties and historical periods are also given names and this provides a shorthand way of referring to those periods and their unique flavors. To see my timeline of these decades, click here.
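
The indexing rule is simple enough to write down as arithmetic. Here is a tiny Python sketch of it. The function name `web_version` is hypothetical, and the sketch assumes each decade starts on its first year and stops just before the next one, so that a boundary year like 2000 or 2010 counts toward the later label. The proposal itself lists each boundary year in both decades, so that choice is mine, made only to keep the function unambiguous.

```python
def web_version(year: int) -> str:
    """Map a calendar year to its "Web x.0" decade label.

    Assumes decades are counted from 1990 and that each boundary year
    (2000, 2010, ...) belongs to the decade it starts.
    """
    if year < 1990:
        raise ValueError("the Web decades are indexed from 1990")
    return f"Web {(year - 1990) // 10 + 1}.0"


for year in (1995, 2007, 2010):
    print(year, web_version(year))
```

Under that assumption, 1995 falls in Web 1.0, 2007 in Web 2.0, and 2010 is the first year of Web 3.0.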

So with that said, what is Radar Networks actually working on? First of all, Radar Networks is still in stealth, although we are planning to go beta in 2007. Until we get closer to launch what I can say without an NDA is still limited. But at least I can give some helpful hints for those who are interested. This article provides some hints, as well as what I hope is a helpful tutorial about natural language search and the Semantic Web, and how they differ. I’ll also discuss how Radar Networks compares to some of the key startup ventures working with semantics in various ways today (there are many other companies in this sector — if you know of any interesting ones, please let me know in the comments; I’m starting to compile a list).

 

How the WebOS Evolves?

Here is my timeline of the past, present and future of the Web. Feel free to put this meme on your own site, but please link back to the master image at this site (the URL that the thumbnail below points to) because I’ll be updating the image from time to time.

This slide illustrates my current thinking here at Radar Networks about where the Web (and we) are heading. It shows a timeline of technology leading from the prehistoric desktop era to the possible future of the WebOS…

Note that as well as mapping a possible future of the Web, here I am also proposing that the Web x.0 terminology be used to index the decades of the Web since 1990. Thus we are now in the tail end of Web 2.0 and are starting to lay the groundwork for Web 3.0, which fully arrives in 2010.

This makes sense to me. Web 2.0 was really about upgrading the “front-end” and user-experience of the Web. Much of the innovation taking place today is about starting to upgrade the “backend” of the Web and I think that will be the focus of Web 3.0 (the front-end will probably not be that different from Web 2.0, but the underlying technologies will advance significantly enabling new capabilities and features).

See also: This article I wrote redefining what the term “Web 3.0” means.

See also: A Visual Graph of the Future of Productivity

Please note: This is a work in progress and is not perfect yet. I’ve been tweaking the positions to get the technologies and dates right. Part of the challenge is fitting the text into the available spaces. If anyone out there has suggestions regarding where I’ve placed things on the timeline, or if I’ve left anything out that should be there, please let me know in the comments on this post and I’ll try to readjust and update the image from time to time. If you would like to produce a better version of this image, please do so and send it to me for inclusion here, with the same Creative Commons license, ideally.

What is the Semantic Web, Actually?

I’ve read several blog posts reacting to John Markoff’s article today. There seem to be some misconceptions in those posts about what the Semantic Web is and is not. Here I will try to succinctly correct a few of the larger misconceptions I’ve run into:

  • The Semantic Web is not just a single Web. There won’t be one Semantic Web, there will be thousands or even millions of them, each in their own area. They will all be part of one Semantic Web in that they will use the same open-standard languages and their data will be universally accessible, but they won’t all be run by any single company. They will connect together over time, forming a tapestry. But nobody will own this or run this as a single service. It will be just as decentralized as the Web already is.
  • The Semantic Web is not separate from the existing Web. The Semantic Web won’t be a new Web apart from the Web we already have. It simply adds new metadata and data to the existing Web. It merges right into the existing HTML Web just like XML does, except this new metadata is in RDF (since RDF can in fact be expressed in XML).
  • The Semantic Web is not just about unstructured data. In fact, the Semantic Web is really about structured data: it provides a means (RDF) to turn any content or data into structured data that other software can make use of. This is really what RDF enables.
  • The Semantic Web does not require complex ontologies. Even without making use of OWL and more sophisticated ontologies, powerful data-sharing and data-integration can be enabled on the existing Web using even just RDF alone.
  • The Semantic Web does not only exist on Web pages. RDF works inside of applications and databases, not just on Web pages. Calling it a "Web" is a misnomer of sorts — it’s not just about the Web, it’s about all information, data and applications.
  • The Semantic Web is not only about AI, and doesn’t require it. There are huge benefits from the Semantic Web without ever using a single line of artificial intelligence code. While the next-generation of AI will certainly be enabled by richer semantics, AI is not the only benefit of RDF. Making data available in RDF makes it more accessible, integratable, and reusable — regardless of any AI. The long-term future of the Semantic Web is AI for sure — but to get immediate benefits from RDF no AI is necessary.
  • The Semantic Web is not only about mining, search engines and spidering. Application developers and content providers, and end-users, can benefit from using the Semantic Web (RDF) within their own services, regardless of whether they expose that RDF metadata to outside parties. RDF is useful without doing any data-mining — it can be baked right into content within authoring tools and created transparently when information is published. RDF makes content more manageable and frees developers and content providers from having to look at relational data models. It also gives end-users better ways to collect and manage content they find.
  • The Semantic Web is not just research. It’s already in use and starting to reach the market. The government uses it of course. But so do companies like Adobe, and more recently Yahoo (Yahoo Food has started to use some Semantic Web technologies now). And one flavor of RSS is defined with RDF. Oracle has released native RDF support in their products. The list goes on…
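
To illustrate the points above about data integration without ontologies or AI, here is a minimal Python sketch. It is an assumption-laden toy: the two "sites", the `example.org` URIs, the predicate names and the helper functions are all hypothetical, and plain Python sets of tuples stand in for real RDF graphs. What it shows is that when two sources reuse the same URIs, combining their data is just a set union.

```python
EX = "http://example.org/"  # hypothetical namespace shared by both sources

# Two hypothetical sources that never coordinated, but reuse the same URIs.
recipes_site = {
    (EX + "recipe/42", EX + "title", "Tomato Soup"),
    (EX + "recipe/42", EX + "author", EX + "people/alice"),
}
profiles_site = {
    (EX + "people/alice", EX + "name", "Alice"),
    (EX + "recipe/42", EX + "title", "Tomato Soup"),  # duplicate statement
}

# Integration is a set union: identical statements collapse into one.
merged = recipes_site | profiles_site


def objects(triples, subject, predicate):
    """Return the sorted objects of all statements with this subject and predicate."""
    return sorted(o for (s, p, o) in triples if s == subject and p == predicate)


def author_names(triples, recipe):
    """Join across sources: recipe -> author URI -> author name."""
    names = []
    for author in objects(triples, recipe, EX + "author"):
        names.extend(objects(triples, author, EX + "name"))
    return names


print(author_names(recipes_site, EX + "recipe/42"))  # the recipe site alone has no names
print(author_names(merged, EX + "recipe/42"))        # the merged data can answer
```

Neither source could answer the question on its own, and no schema mapping, mining or inference was needed to combine them. The shared identifiers do all the work.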

New York Times Article About the Emerging Semantic Web

A New York Times article came out today about the Semantic Web — in which I was quoted, speaking about my company Radar Networks. Here’s an excerpt:

Referred to as Web 3.0, the effort is in its infancy, and the very idea has given rise to skeptics who have called it an unobtainable vision. But the underlying technologies are rapidly gaining adherents, at big companies like I.B.M. and Google as well as small ones. Their projects often center on simple, practical uses, from producing vacation recommendations to predicting the next hit song.

But in the future, more powerful systems could act as personal advisers in areas as diverse as financial planning, with an intelligent system mapping out a retirement plan for a couple, for instance, or educational consulting, with the Web helping a high school student identify the right college.

The projects aimed at creating Web 3.0 all take advantage of increasingly powerful computers that can quickly and completely scour the Web.

“I call it the World Wide Database,” said Nova Spivack, the founder of a start-up firm whose technology detects relationships between nuggets of information mining the World Wide Web. “We are going from a Web of connected documents to a Web of connected data.”

Web 2.0, which describes the ability to seamlessly connect applications (like geographical mapping) and services (like photo-sharing) over the Internet, has in recent months become the focus of dot-com-style hype in Silicon Valley. But commercial interest in Web 3.0 — or the “semantic Web,” for the idea of adding meaning — is only now emerging.