What's After the Real Time Web?

In typical Web-industry style we’re all focused minutely on the leading trend-of-the-year, the real-time Web. But in this obsession we have become a bit myopic. The real-time Web, or what some of us call “The Stream,” is not an end in itself, it’s a means to an end. So what will it enable, where is it headed, and what’s it going to look like when we look back at this trend in 10 or 20 years?

In the next 10 years, The Stream is going to go through two big phases, focused on two problems, as it evolves:

  1. Web Attention Deficit Disorder. The first problem with the real-time Web that is becoming increasingly evident is that it has a bad case of ADD. There is so much information streaming in from so many places at once that it’s simply impossible to focus on anything for very long, and a lot of important things are missed in the chaos. The first generation of tools for the Stream are going to need to address this problem.
  2. Web Intention Deficit Disorder. The second problem with the real-time Web will emerge after we have made some real headway in solving Web attention deficit disorder. This second problem is about how to get large numbers of people to focus their intention not just their attention. It’s not just difficult to get people to notice something, it’s even more difficult to get them to do something. Attending to something is simply noticing it. Intending to do something is actually taking action, expending some energy or effort to do something. Intending is a lot more expensive, cognitively speaking, than merely attending. The power of collective intention is literally what changes the world, but we don’t have the tools to direct it yet.

The Stream is not the only big trend taking place right now. In fact, it’s just a strand that is being braided together with several other trends, as part of a larger pattern. Here are some of the other strands I’m tracking:

  • Messaging. The real-time Web, aka The Stream, is in essence about messaging. It’s a subset of the global trend towards building a better messaging layer for the Web. Multiple forms of messaging are emerging, from the publish-and-subscribe nature of Twitter and RSS, to things like Google Wave and PubSubHubbub, to broadcast-style messaging or multicasting via screencasts, conferencing, media streaming and events in virtual worlds. The effect of these tools is that the speed and interactivity of the Web are increasing: the Web is getting faster. Information spreads more virally and more rapidly. In other words, “memes” (which we can think of as collective thoughts) are getting more sophisticated and gaining more mobility.
  • Semantics. The Web becomes more like a database. The resolution of search, ad targeting, and publishing increases. In other words, it’s a higher-resolution Web. Search will be able to target not just keywords but specific meaning. For example, you will be able to search precisely for products or content that meet certain constraints. Multiple approaches from natural language search to the metadata of the Semantic Web will contribute to increased semantic understanding and representation of the Web.
  • Attenuation. As information moves faster, and our networks get broader, information overload gets worse in multiple dimensions. This creates a need for tools to help people filter the firehose. Filtering in its essence is a process of attenuation — a way to focus attention more efficiently on signal versus noise. Broadly speaking there are many forms of filtering from automated filtering, to social filtering, to personalization, but they all come down to helping someone focus their finite attention more efficiently on the things they care about most.
  • The WebOS. As cloud computing resources, mashups, open linked data, and open APIs proliferate, a new level of aggregator is emerging. These aggregators may focus on one of these areas or may cut across them. Ultimately they are the beginning of true cross-service WebOSes. I predict this is going to be a big trend in the future. For example, instead of writing Web apps directly to various data sources and APIs in dozens of places, developers will just write to a single WebOS aggregator that acts as middleware between their app and all these choices. It’s much less complicated for developers. The winning WebOS is probably not going to come from Google, Microsoft or Amazon. Rather, it will probably come from someone neutral, with the best interests of developers as the primary goal.
  • Decentralization. As the semantics of the Web get richer, and the WebOS really emerges it will finally be possible for applications to leverage federated, Web-scale computing. This is when intelligent agents will actually emerge and be practical. By this time the Web will be far too vast and complex and rapidly changing for any centralized system to index and search it. Only massively federated swarms of intelligent agents, or extremely dynamic distributed computing tools, that can spread around the Web as they work, will be able to keep up with the Web.
  • Socialization. Our interactions and activities on the Web are increasingly socially networked, whether individual, group or involving large networks or crowds. Content is both shared and discovered socially through our circles of friends and contacts. In addition, new technologies like Google Social Search enable search results to be filtered by social distance or social relevancy. In other words, things that people you follow like get higher visibility in your search results. Socialization is a trend towards making previously non-social activities more social, and towards making already-social activities more efficient and broader. Ultimately this process leads to wider collaboration and higher levels of collective intelligence.
  • Augmentation. Increasingly we will see a trend towards augmenting things with other things. For example, augmenting a Web page or data set with links or notes from another Web page or data set. Or augmenting reality by superimposing video and data onto a live video image on a mobile phone. Or augmenting our bodies with direct connections to computers and the Web.

If these are all strands in a larger pattern, then what is the megatrend they are all contributing to? I think ultimately it’s collective intelligence — not just of humans, but also our computing systems, working in concert.

Collective Intelligence

I think that these trends are all combining, and going real-time. Effectively what we’re seeing is the evolution of a global collective mind, a theme I keep coming back to again and again. This collective mind is not just comprised of humans, but also of software and computers and information, all interlinked into one unimaginably complex system: A system that senses the universe and itself, that thinks, feels, and does things, on a planetary scale. And as humanity spreads out around the solar system and eventually the galaxy, this system will spread as well, and at times splinter and reproduce.

But that’s in the very distant future still. In the nearer term — the next 100 years or so — we’re going to go through some enormous changes. As the world becomes increasingly networked and social the way collective thinking and decision making take place is going to be radically restructured.

Social Evolution

Existing and established social, political and economic structures are going to either evolve or be overturned and replaced. Everything from the way news and entertainment are created and consumed, to how companies, cities and governments are managed will change radically. Top-down bureaucratic control systems are simply not going to be able to keep up or function effectively in this new world of distributed, omnidirectional collective intelligence.

Physical Evolution

As humanity and our Web of information and computations begin to function as a single organism, we will literally evolve into a new species: whatever comes after Homo sapiens. The environment we will live in will be a constantly changing sea of collective thought in which nothing and nobody will be isolated. We will be more interdependent than ever before. Interdependence leads to symbiosis, and eventually to the loss of generality and increasing specialization. As each of us is able to draw on the collective mind, the global brain, there may be less pressure on us to do things on our own that used to be solitary. What changes to our bodies, minds and organizations may result from these selective evolutionary pressures? I think we’ll see several, over multi-thousand year timescales, or perhaps faster if we start to genetically engineer ourselves:

  • Individual brains will get less good at things like memorization and recall, calculation, reasoning, and long-term planning and action.
  • Individual brains will get better at multi-tasking, information filtering, trend detection, and social communication. The parts of the nervous system involved in processing live information will increase disproportionately to other parts.
  • Our bodies may actually improve in certain areas. We will become more, not less, mobile, as computation and the Web become increasingly embedded into our surroundings, and into augmented views of our environments. This may cause our bodies to get into better health and shape since we will be less sedentary, less at our desks, less in front of TVs. We’ll be moving around in the world, connected to everything and everyone no matter where we are. Physical strength will probably decrease overall as we will need to do less manual labor of any kind.

These are just some of the changes that are likely to occur as a result of the things we’re working on today. The Web and the emerging Real-Time Web are just a prelude of things to come.

The Future of the Web: BBC Interview

The BBC World Service’s Business Daily show interviewed the CTO of Xerox and me about the future of the Web, printing, newspapers, search, personalization, and the real-time Web. Listen to the audio stream here. I hear this will only be online at this location for 6 more days. If anyone finds it again after that, let me know and I’ll update the link here.

The Next Generation of Web Search — Search 3.0

The next generation of Web search is coming sooner than expected. And with it we will see several shifts in the way people search, and the way major search engines provide search functionality to consumers.

Web 1.0, the first decade of the Web (1989 – 1999), was characterized by a distinctly desktop-like search paradigm. The overriding idea was that the Web is a collection of documents, not unlike the folder tree on the desktop, that must be searched and ranked hierarchically. Relevancy was considered to be how closely a document matched a given query string.

Web 2.0, the second decade of the Web (1999 – 2009), ushered in the beginnings of a shift towards social search. In particular blogging tools, social bookmarking tools, social networks, social media sites, and microblogging services began to organize the Web around people and their relationships. This added the beginnings of a primitive “web of trust” to the search repertoire, enabling search engines to begin to take the social value of content (as evidenced by discussions, ratings, sharing, linking, referrals, etc.) as an additional measurement in the relevancy equation. Those items which were both most relevant on a keyword level, and most relevant in the social graph (closer and/or more popular in the graph), were considered to be more relevant. Thus results could be ranked according to their social value (how many people in the community liked them, and their current activity level) as well as by semantic relevancy measures.

In the coming third decade of the Web, Web 3.0 (2009 – 2019), there will be another shift in the search paradigm. This is a shift from the past to the present, and from the social to the personal.

Established search engines like Google rank results primarily by keyword (semantic) relevancy. Social search engines rank results primarily by activity and social value (Digg, Twine 1.0, etc.). But the new search engines of the Web 3.0 era will also take into account two additional factors when determining relevancy: timeliness, and personalization.

Google returns the same results for everyone. But why should that be the case? In fact, when two different people search for the same information, they may want to get very different kinds of results. Someone who is a novice in a field may want beginner-level information to rank higher in the results than someone who is an expert. There may be a desire to emphasize things that are novel over things that have been seen before, or that have happened in the past — the more timely something is the more relevant it may be as well.

These two themes — present and personal — will define the next great search experience.

To accomplish this, we need to make progress on a number of fronts.

First of all, search engines need better ways to understand what content is, without having to do extensive computation. The best solution for this is to utilize metadata and the methods of the emerging semantic web.

Metadata reduces the need for computation in order to determine what content is about — it makes that explicit and machine-understandable. To the extent that machine-understandable metadata is added or generated for the Web, it will become more precisely searchable and productive for searchers.

This applies especially to the area of the real-time Web, where for example short “tweets” of content contain very little context to support good natural-language processing. There a little metadata can go a long way. In addition, of course metadata makes a dramatic difference in search of the larger non-real-time Web as well.

In addition to metadata, search engines need to modify their algorithms to be more personalized. Instead of a “one-size fits all” ranking for each query, the ranking may differ for different people depending on their varying interests and search histories.
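As a rough illustration of per-person ranking, the sketch below re-ranks one result set for two different searchers. The `Result` fields, the `level` labels and the size of the boost are all hypothetical and chosen only for this example; it is not a description of how any real search engine works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    title: str
    base_score: float  # query relevance, identical for every searcher
    level: str         # hypothetical label: "beginner" or "expert"


def personalized_rank(results, preferred_level, boost=0.5):
    """Sort by base relevance plus a boost when the level matches the searcher."""
    def score(result):
        bonus = boost if result.level == preferred_level else 0.0
        return result.base_score + bonus
    return sorted(results, key=score, reverse=True)


results = [
    Result("Intro to RDF", 0.6, "beginner"),
    Result("OWL reasoning internals", 0.8, "expert"),
]

# The same query and the same results, ordered differently per person.
novice_view = [r.title for r in personalized_rank(results, "beginner")]
expert_view = [r.title for r in personalized_rank(results, "expert")]
```

A real system would presumably derive the searcher's preferences from interests and search history rather than from a single label.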

Finally, to provide better search of the present, search has to become more real-time. To this end, rankings need to be developed that surface not only what just happened now, but also what happened recently and is trending upwards and/or of note. Real-time search has to be more than merely listing search results chronologically. There must be effective ways to filter the noise and surface what’s most important. Social graph analysis is a key tool for doing this, but in addition, powerful statistical analysis and new visualizations may also be required to make a compelling experience.
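One way to picture a ranking that goes beyond reverse-chronological order is a score that combines upward momentum with time decay. The sketch below is only an assumption about how such a score could look: the formula, the two counting windows and the six-hour half-life are invented for illustration.

```python
import math


def trending_score(mentions_recent, mentions_previous, age_hours,
                   half_life_hours=6.0):
    """Combine volume, upward momentum and exponential time decay.

    mentions_recent / mentions_previous: counts in two consecutive windows.
    age_hours: time since the item first appeared.
    """
    growth = (mentions_recent + 1) / (mentions_previous + 1)  # > 1 means rising
    decay = 0.5 ** (age_hours / half_life_hours)  # halves every half-life
    return math.log1p(mentions_recent) * growth * decay


items = {
    "just posted, no traction": trending_score(2, 0, age_hours=0.1),
    "recent and rising": trending_score(400, 100, age_hours=3),
    "old and fading": trending_score(400, 900, age_hours=48),
}

# Highest score first: the newest item is not automatically on top.
ranked = sorted(items, key=items.get, reverse=True)
```

Here the item that is a few hours old but rising quickly outranks the item posted minutes ago, which a purely chronological listing would put first.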

Video: My Talk on The Future of Libraries — "Library 3.0"

If you are interested in semantics, taxonomies, education, information overload and how libraries are evolving, you may enjoy this video of my talk on the Semantic Web and the Future of Libraries at the OCLC Symposium at the American Library Association Midwinter 2009 Conference. This event focused around a dialogue between David Weinberger and myself, moderated by Roy Tennant. We were fortunate to have about 500 very vocal library directors in the audience, and it was an intensive day of thinking together. Thanks to the folks at OCLC for a terrific and really engaging event!

Video: My Talk on the Evolution of the Global Brain at the Singularity Summit

If you are interested in collective intelligence, consciousness, the global brain and the evolution of artificial intelligence and superhuman intelligence, you may want to see my talk at the 2008 Singularity Summit. The videos from the Summit have just come online.

(Many thanks to Hrafn Thorisson who worked with me as my research assistant for this talk).

Interest Networks are at a Tipping Point

UPDATE: There’s already a lot of good discussion going on around this post in my public twine.

I’ve been writing about a new trend that I call “interest networking” for a while now. But I wanted to take the opportunity before the public launch of Twine on Tuesday (tomorrow) to reflect on the state of this new category of applications, which I think is quickly reaching its tipping point. The concept is starting to catch on as people reach for more depth around their online interactions.

In fact – that’s the ultimate value proposition of interest networks – they move us beyond the super poke and towards something more meaningful. In the long-term view, interest networks are about building a global knowledge commons. But in the short term, the difference between social networks and interest networks is a lot like the difference between fast food and a home-cooked meal – interest networks are all about substance.

At a time when social media fatigue is setting in, the news cycle is growing shorter and shorter, and the world is delivered to us in sound bites and catchphrases, we crave substance. We go to great lengths in pursuit of substance. Interest networks solve this problem – they deliver substance.

So, what is an interest network?

In short, if a social network is about who you are interested in, an interest network is about what you are interested in. It’s the logical next step.

Twine, for example, is an interest network that helps you share information with friends, family, colleagues and groups, based on mutual interests. Individual “twines” are created for content around specific subjects. This content might include bookmarks, videos, photos, articles, e-mails, notes or even documents. Twines may be public or private and can serve individuals, small groups or even very large groups of members.

I have also written quite a bit about the Semantic Web and the Semantic Graph, and Tim Berners-Lee has recently started talking about what he calls the GGG (Giant Global Graph). Tim and I are in agreement that social networks merely articulate the relationships between people. Social networks do not surface the equally, if not more important, relationships between people and places, places and organizations, places and other places, organizations and other organizations, organizations and events, documents and documents, and so on.

This is where interest networks come in. To be clear, it’s still early days, but interest networks operate on the premise of tapping into a multi-dimensional graph that manifests the complexity and substance of our world, and delivers the best of that world to you, every day.

We’re seeing more and more companies think about how to capitalize on this trend. There are suddenly (it seems, but this category has been building for many months) lots of different services that can be viewed as interest networks in one way or another, and here are some examples:

What all of these interest networks have in common is some sort of bottom-up, user-driven crawl of the Web, which is the way that I’ve described Twine when we get the question about how we propose to index the entire Web (the answer: we don’t. We let our users tell us what they’re most interested in, and we follow their lead).

Most interest networks exhibit the following characteristics as well:

  • They have some sort of bookmarking/submission/markup function to store and map data (often using existing metaphors, even if what’s under the hood is new)
  • They also have some sort of social sharing function to provide the network benefit (this isn’t exclusive to interest networks, obviously, but it is characteristic)
  • And in most cases, interest networks look to add some sort of “smarts” or “recommendations” capability to the mix (that is, you get more out than you put in)

This last bullet point is where I see next-generation interest networks really providing the most benefit over social bookmarking tools, wikis, collaboration suites and pure social networks of one kind or another.

To that end, we think that Twine is the first of a new breed of intelligent applications that really get to know you better and better over time – and that the more you use Twine, the more useful it will become. Adding your content to Twine is an investment in the future of your data, and in the future of your interests.

At first Twine begins to enrich your data with semantic tags and links to related content via our recommendations engine that learns over time. Twine also crawls any links it sees in your content and gathers related content for you automatically – adding it to your personal or group search engine for you, and further fleshing out the semantic graph of your interests which in turn results in even more relevant recommendations.

The point here is that adding content to Twine, or other next-generation interest networks, should result in increasing returns. That’s a key characteristic, in fact, of the interest networks of the future – the idea that the ratio of work (input) to utility (output) has no established ceiling.

Another key characteristic of interest networks may be in how they monetize. Instead of being advertising-driven, I think they will focus more on a marketing paradigm. They will be to marketing what search engines were to advertising. For example, Twine will be monetizing our rich model of individual and group interests, using our recommendation engine. When we roll this capability out in 2009, we will deliver extremely relevant, useful content, products and offers directly to users who have demonstrated they are really interested in such information, according to their established and ongoing preferences.

Six months ago, you could not really prove that “interest networking” was a trend, and certainly it wasn’t a clearly defined space. It was just an idea, and a goal. But like I said, I think that we’re at a tipping point, where the technology is getting to a point at which we can deliver greater substance to the user, and where the culture is starting to crave exactly this kind of service as a way of making the Web meaningful again.

I think that interest networks are a huge market opportunity for many startups thinking about what the future of the Web will be like, and I think that we’ll start to see the term used more and more widely. We may even start to see some attention from analysts — Carla, Jeremiah, and others, are you listening?

Now, I obviously think that Twine is THE interest network of choice. After all we helped to define the category, and we’re using the Semantic Web to do it. There’s a lot of potential in our engine and our application, and the growing community of passionate users we’ve attracted.

Our 1.0 release really focuses on UE/usability, which was a huge goal for us based on user feedback from our private beta, which began in March of this year. I’ll do another post soon talking about what’s new in Twine. But our TOS (time on site) at 6 minutes/user (all time) and 12 minutes/user (over the last month) is something that the team here is most proud of – it tells us that Twine is sticky, and that “the dogs are eating the dog food.”

Now that anyone can join, it will be fun and gratifying to watch Twine grow.

Still, there is a lot more to come, and in 2009 our focus is going to shift back to extending our Semantic Web platform and turning on more of the next-generation intelligence that we’ve been building along the way. We’re going to take interest networking to a whole new level.

Stay tuned!

Watch My Best Talk: The Global Brain is Coming

I’ve posted a link to a video of my best talk, given at the GRID ’08 Conference in Stockholm this summer. It’s about the growth of collective intelligence and the Semantic Web, and the future and role of the media. Read more and get the video here. Enjoy!

New Video: Leading Minds from Google, Yahoo, and Microsoft Talk about Their Visions for the Future of the Web

Video from my panel at DEMO Fall ’08 on the Future of the Web is now available.

I moderated the panel, and our panelists were:

Howard Bloom, Author, The Evolution of Mass Mind from the Big Bang to the 21st Century

Peter Norvig, Director of Research, Google Inc.

Jon Udell, Evangelist, Microsoft Corporation

Prabhakar Raghavan, PhD, Head of Research and Search Strategy, Yahoo! Inc.

The panel was excellent, with many DEMO attendees saying it was the best panel they had ever seen at DEMO.

Our excellent panelists provided many new and revealing insights. I was particularly interested in the different ways that Google and Yahoo describe what they are working on. They covered lots of new and interesting information about their thinking. Howard Bloom added fascinating comments about the big picture and Jon Udell helped to speak about Microsoft’s longer-term views as well.

Enjoy!!!

The Future of the Desktop

This is an older version of this article. The most recent version is located here:

http://www.readwriteweb.com/archives/future_of_the_desktop.php

—————

I have spent the last year really thinking about the future of the Web. But lately I have been thinking more about the future of the desktop. In particular, here are some questions I am thinking about and some answers I’ve come up with so far.

(Author’s Note: This is a raw, first-draft of what I think it will be like. Please forgive any typos — I am still working on this and editing it…)

What Will Happen to the Desktop?

As we enter the third decade of the Web we are seeing an increasing shift from local desktop applications towards Web-hosted software-as-a-service (SaaS). The full range of standard desktop office tools (word processors, spreadsheets, presentation tools, databases, project management, drawing tools, and more) can now be accessed as Web-hosted apps within the browser. The same is true for an increasing range of enterprise applications. This process seems to be accelerating.

As more kinds of applications become available in Web-based form, the Web browser is becoming the primary framework in which end-users work and interact. But what will happen to the desktop? Will it too eventually become a Web-hosted application? Will the Web browser swallow up the desktop? Where is the desktop headed?

Is the desktop of the future going to just be a web-hosted version of the same old-fashioned desktop metaphors we have today?

No. There have already been several attempts at doing this — and they never catch on. People don’t want to manage all their information on the Web in the same interface they use to manage data and apps on their local PC.

Partly this is due to the difference in user experience between using files and folders on a local machine and doing that in “simulated” fashion via some Flash-based or HTML-based imitation of a desktop. Imitation desktops to date have simply been clunky and slow imitations of the real thing at best. Others have been overly slick. But they all have one thing in common: none of them has nailed it. The desktop of the future – what some have called “the Webtop” – has yet to be invented.

It’s going to be a hosted web service

Is the desktop even going to exist anymore as the Web becomes increasingly important? Yes, there will have to be some kind of interface that we consider to be our personal “home” and “workspace” — but ultimately it will have to be a unified space that all our devices connect to and share. This requires that it be a hosted online service.

Currently we have different information spaces on different devices (laptop, mobile device, PC). These will merge. Native local clients could be created for various devices, but ultimately the simplest and therefore most likely choice is to just use the browser as the client. This coming “Webtop” will provide an interface to your local devices, applications and information, as well as to your online life and information.

Today we think of our Web browser as an application running inside our desktop. But in the future it will be the other way around: our desktop will run inside our browser as an application.

Instead of the browser running inside, or being launched from, some kind of next-generation desktop web interface technology, the browser will be the shell and the desktop application will run within it, either as a browser add-in or as a web-based application.

The Web 3.0 desktop is going to be completely merged with the Web — it is going to be part of the Web. In fact there may eventually be no distinction between the desktop and the Web anymore.

The focus shifts from information to attention

As our digital lives shift from being focused on the old fashioned desktop to the Web environment we will see a shift from organizing information spatially (directories, folders, desktops, etc.) to organizing information temporally (feeds, lifestreams, microblogs, timelines, etc.).

Instead of being just a directory, the desktop of the future is going to be more like a feed reader or social news site. The focus will be on keeping up with all the stuff flowing in and out of the user’s environment. The interface will be tuned to help the user understand what the trends are, rather than just on how things are organized.

The focus will be on helping the user to manage their attention rather than just their information. This is a leap to the meta-level: A second-order desktop. Instead of just being about the information (the first-order), it is going to be about what is happening with the information (the second-order).

Users are going to shift from acting as librarians to acting as daytraders.

Our digital roles are already shifting from acting as librarians to becoming more like daytraders. In the PC era we were all focused on trying to manage the stuff on our computers — in other words, we were acting as librarians. But this is going to shift. Librarians organize stuff, but daytraders are focused on discovering and keeping track of trends. It’s a very different focus and activity, and it’s what we are all moving towards.

We are already spending more of our time keeping up with change and detecting trends, than on organizing information. In the coming decade the shelf-life of information is going to become vanishingly short and the focus will shift from storage and recall to real-time filtering, trend detection and prediction.

The Webtop will be more social and will leverage and integrate collective intelligence

The Webtop is going to be more socially oriented than desktops of today — it will have built-in messaging and social networking, as well as social-media sharing, collaborative filtering, discussions, and other community features.

The social dimension of our lives is becoming perhaps our most important source of information. We get information via email from friends, family and colleagues. We get information via social networks and social media sharing services. We co-create information with others in communities.

The social dimension is also starting to play a more important role in our information management and discovery activities. Instead of remaining solitary, those activities are becoming more communal. For example, many social bookmarking and social news sites use community sentiment and collaborative filtering to help highlight what is most interesting, useful or important.

It’s going to have powerful semantic search and social search capabilities built-in

The Webtop is going to have more powerful search built-in. This search will combine both social and semantic search features. Users will be able to search their information and rank it by social sentiment (for example, “find documents about x and rank them by how many of my friends liked them.”)
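The example query above ("find documents about x and rank them by how many of my friends liked them") can be sketched in a few lines. The document records, the `liked_by` field and the friend names below are made up for illustration, and plain substring matching stands in for whatever real relevance matching a Webtop would use.

```python
def social_search(documents, query, friends):
    """Return documents matching `query`, ranked by how many friends liked them."""
    matches = [d for d in documents if query.lower() in d["text"].lower()]
    return sorted(
        matches,
        key=lambda d: len(d["liked_by"] & friends),  # likes from my friends only
        reverse=True,
    )


documents = [
    {"title": "A", "text": "Notes on solar power", "liked_by": {"ann", "raj", "zed"}},
    {"title": "B", "text": "Solar panel pricing", "liked_by": {"bob", "eve"}},
    {"title": "C", "text": "Wind turbines", "liked_by": {"bob", "eve", "kim"}},
]
my_friends = {"bob", "eve", "kim"}

titles = [d["title"] for d in social_search(documents, "solar", my_friends)]
```

Document A has more likes overall, but none from the searcher's friends, so B ranks above it; C is liked by all three friends but does not match the query at all.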

Semantic search will enable highly granular search and navigation of information along a potentially open-ended range of properties and relationships.

You will be able to search in a highly structured way. For example, you could search for products you once bookmarked that have a price of $10.95 and are on sale this week, or for documents you read in the last month that were authored by Sue and related to project X.
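To make the product example concrete, here is a minimal sketch of a structured query over typed bookmark records. The field names (`type`, `price`, `sale_start`, `sale_end`) and the sample data are assumptions made up for this illustration; an actual semantic desktop would presumably use a richer data model.

```python
from datetime import date
from decimal import Decimal

# Hypothetical bookmark records with explicit, typed properties.
bookmarks = [
    {"type": "product", "name": "Notebook", "price": Decimal("10.95"),
     "sale_start": date(2009, 3, 2), "sale_end": date(2009, 3, 8)},
    {"type": "product", "name": "Pen set", "price": Decimal("10.95"),
     "sale_start": date(2009, 1, 5), "sale_end": date(2009, 1, 11)},
    {"type": "product", "name": "Backpack", "price": Decimal("39.00"),
     "sale_start": date(2009, 3, 2), "sale_end": date(2009, 3, 8)},
    {"type": "article", "name": "Review of notebooks"},
]


def find_products(items, price, today):
    """Match on typed fields (type, price, sale window) instead of keywords."""
    return [
        item["name"]
        for item in items
        if item["type"] == "product"
        and item["price"] == price
        and item["sale_start"] <= today <= item["sale_end"]
    ]


hits = find_products(bookmarks, Decimal("10.95"), today=date(2009, 3, 4))
```

A keyword search for "10.95" could not tell a price from any other number on the page, and could not know which sale windows include today; the explicit properties are what make the constraint precise.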

The semantics of the future desktop will be open-ended. That is to say that users as well as other application and information providers will be able to extend it with custom schemas, new data types, and custom fields to any piece of information.

Interactive shared spaces instead of folders

Forget about shared folders; that is an outmoded paradigm. Instead, the new metaphor will be interactive shared spaces.

The need for shared community space is currently being provided for online by forums, blogs, social network profile pages, wikis, and new community sites. But as we move into Web 3.0 these will be replaced by something that combines their best features into one. These next-generation shared spaces will be like blogs, wikis, communities, social networks, databases, workspaces and search engines in one.

Any group of two or more individuals will be able to participate in a shared space that connects their desktops for a particular purpose. These new shared spaces will not only provide richer semantics in the underlying data, social network, and search, but they will also enable groups to seamlessly and collectively add, organize, track, manage, discuss, distribute, and search for information of mutual interest.

The personal cloud

The future desktop will function like a “personal cloud” for users. It will connect all their identities, data, relationships, services and activities in one virtual integrated space. All incoming and outgoing activity will flow through this space. All applications and services that a user makes use of will connect to it.

The personal cloud may not have a center, but rather may be comprised of many separate sub-spaces, federated around the Web and hosted by different service-providers. Yet from an end-user perspective it will function as a seamlessly integrated service. Users will be able to see and navigate all their information and applications, as if they were in one connected space, regardless of where they are actually hosted. Users will be able to search their personal cloud from any point within it.

Open data, linked data and open-standards based semantics

The underlying data in the future desktop, and in all associated services it connects, will be represented using open-standard data formats. Not only will the data be open, but the semantics of the data (the schema) will also be defined in an open way. The emerging Semantic Web provides a good infrastructure for enabling this to happen.

The value of open linked-data and open semantics is that data will not be held prisoner anywhere and can easily be integrated with other data.

Users will be able to seamlessly move and integrate their data, or parts of their data, in different services. This means that your Webtop might even be portable to a different competing Webtop provider someday. If and when that becomes possible, how will Webtop providers compete to add value?

It’s going to be smart

One of the most important aspects of the coming desktop is that it’s going to be smart. It’s going to learn and help users to be more productive. Artificial intelligence is one of the key ways that competing Webtop providers will differentiate their offerings.

As you use it, it’s going to learn about your interests, relationships, current activities, information and preferences. It will adaptively self-organize to help you focus your attention on what is most important to whatever context you are in.

When you are reading something during a trip to Milan, it may organize itself to be more relevant to that time, place and context. When you later return home to San Francisco it will automatically adapt and shift to your home context. When you do a lot of searches about a certain product it will realize your context and intent has to do with that product and will adapt to help you with that activity for a while, until your behavior changes.

Your desktop will actually be a semantic knowledge base on the back-end. It will encode a rich semantic graph of your information, relationships, interests, behavior and preferences. You will be able to permit other applications to access part or all of your graph to datamine it and provide you with value-added views and even automated intelligent assistance.

For example, you might allow an agent that cross-links things to see all your data: it would go and add cross links to relevant things onto all the things you have created or collected. Another agent that makes personalized buying recommendations might only get to see your shopping history across all shopping sites you use.
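The two agents in that example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the graph is a flat list of statements tagged with a category, and permissions are sets of categories. The agent names and the visible_to helper are hypothetical.

```python
# Hypothetical personal graph stored as (subject, predicate, object, category) rows.
graph = [
    ("me", "bought", "espresso machine", "shopping"),
    ("me", "bookmarked", "semantic web article", "reading"),
    ("me", "knows", "Sue", "contacts"),
    ("me", "bought", "travel guide to Milan", "shopping"),
]

# Each agent is granted a set of categories; "*" means everything.
grants = {
    "cross_linker": {"*"},
    "shopping_recommender": {"shopping"},
}


def visible_to(agent, graph, grants):
    """Return only the statements the agent has been permitted to see."""
    allowed = grants.get(agent, set())  # unknown agents see nothing
    if "*" in allowed:
        return list(graph)
    return [row for row in graph if row[3] in allowed]


for agent in ("cross_linker", "shopping_recommender", "unknown_agent"):
    print(agent, len(visible_to(agent, graph, grants)))
```

Defaulting unknown agents to an empty grant means access has to be given explicitly, which seems like the safer design for personal data.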

Your desktop may also function as a simple personal assistant at times. You will be able to converse with your desktop eventually — through a conversational agent interface. While on the road you will be able to email or SMS in questions to it and get back immediate intelligent answers. You will even be able to do this via a voice interface.

For example, you might ask, “where is my next meeting?” or “what Japanese restaurants do I like in LA?” or “What is Sue Smith’s phone number?” and you would get back answers. You could also command it to do things for you, like reminding you to do something, or helping you keep track of an interest, or monitoring for something and alerting you when it happens.

Because your future desktop will connect all the relationships in your digital life (relationships connecting people, information, behavior, preferences and applications), it will be the ultimate place to learn about your interests and preferences.

Federated, open policies and permissions

This rich graph of meta-data that comprises your future desktop will enable the next-generation of smart services to learn about you and help you in an incredibly personalized manner. It will also, of course, be rife with potential for abuse, and privacy will be a major function and concern.

One of the biggest enabling technologies that will be necessary is a federated model for sharing meta-data about policies and permissions on data. Information that is considered to be personal and private in Web site X should be recognized and treated as such by other applications and websites you choose to share that information with. This will require a way of sharing meta-data about your policies and permissions between different accounts and applications you use.
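One way to picture policy meta-data that travels with the data is sketched below in Python. No such standard exists yet, so everything here is an assumption made for illustration: the Policy and SharedItem types, the three visibility levels, and the export and import helpers.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Policy:
    visibility: str            # "private", "friends" or "public"
    may_reshare: bool = False


@dataclass(frozen=True)
class SharedItem:
    value: str
    origin: str                # the site where the owner set the policy
    policy: Policy


def export_item(item):
    """Serialize the item together with its policy, so the two travel as one unit."""
    return asdict(item)


def import_item(payload):
    """Rebuild the item on the receiving site, keeping the sender's policy."""
    return SharedItem(payload["value"], payload["origin"], Policy(**payload["policy"]))


def can_show(item, viewer):
    """Decide visibility for a viewer who is the "owner", a "friend" or a "stranger"."""
    allowed = {
        "private": {"owner"},
        "friends": {"owner", "friend"},
        "public": {"owner", "friend", "stranger"},
    }
    # Unknown visibility values fall back to the most restrictive rule.
    return viewer in allowed.get(item.policy.visibility, {"owner"})


phone = SharedItem("555-0100", "site-x.example", Policy("friends"))
received = import_item(export_item(phone))
print(can_show(received, "friend"), can_show(received, "stranger"))
```

The point of the sketch is only that the receiving site evaluates the same policy the owner set at the origin, rather than applying its own defaults.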

The semantic web provides a good infrastructure for building and deploying a decentralized framework for policy and privacy integration, but it has yet to be developed, let alone adopted. For the full vision of the future desktop to emerge a universally accepted standard for exchanging policy and permission data will be a necessary enabling technology.

Who is most likely to own the future desktop?

When I think about what the future desktop is going to look like it seems to be a convergence of several different kinds of services that we currently view as separate.

It will be hosted on the cloud and accessible across all devices. It will place more emphasis on social interaction, social filtering, and collective intelligence. It will provide a very powerful and extensible data model with support for both unstructured and arbitrarily structured information. It will enable almost peer-to-peer like search federation, yet still have a unified home page and user-experience. It will be smart and personalized. It will be highly decentralized yet will manage identity, policies and permissions in an integrated cohesive and transparent manner across services.

By cobbling together a number of different services that exist today you could build something like this in a decentralized fashion. Is that how the desktop of the future will come about? Or will it be a new application provided by one player with a lot of centralized market power? Or could an upstart suddenly emerge with the key enabling technologies to make this possible? It’s hard to predict, but one thing is certain: It will be an interesting process to watch.

If Social Networks Were Like Cars…

I have been thinking a lot about social networks lately, and why there are so many of them, and what will happen in that space.

Today I had what I think is a "big realization" about this.

Everyone, including myself, seems to think that there is only room for one big social network, and it looks like Facebook is winning that race. But what if that assumption is simply wrong from the start?

What if social networks are more like automobile brands? In other words, there can, will and should be many competing brands in the space?

Social networks no longer compete in terms of who has what members. All my friends are in pretty much every major social network.

I also don’t need more than one social network, for the same reason — my friends are all in all of them. How many different ways do I need to reach the same set of people? I only need one.

But the Big Realization is that no social network satisfies all types of users. Some people are more at home in a place like LinkedIn than they are in Facebook, for example. Others prefer MySpace. There are always going to be different social networks catering to the common types of people (different age groups, different personalities, different industries, different lifestyles, etc.).

The Big Realization implies that all the social networks are going to be able to interoperate eventually, just like almost all email clients and servers do today. Email didn’t begin this way. There were different networks, different servers and different clients, and they didn’t all speak to each other. To communicate with certain people you had to use a certain email network, and/or a certain email program. Today almost all email systems interoperate directly or at least indirectly. The same thing is going to happen in the social networking space.

Today we see the first signs of this interoperability emerging as social networks open their APIs and enable increasing integration. Currently there is a competition going on to see which "open" social network can get the most people and sites to use it. But this is an illusion. It doesn’t matter who is dominant, there are always going to be alternative social networks, and the pressure to interoperate will grow until it happens. It is only a matter of time before they connect together.

I think this should be the greatest fear at companies like Facebook. For when it inevitably happens they will be on a level playing field competing for members with a lot of other companies large and small. Today Facebook and Google’s scale are advantages, but in a world of interoperability they may actually be disadvantages — they cannot adapt, change or innovate as fast as smaller, nimbler startups.

Thinking of social networks as if they were automotive brands also reveals interesting business opportunities. There are still several unowned opportunities in the space.

Myspace is like the car you have in high school. Probably not very expensive, probably used, probably a bit clunky. It’s fine if you are a kid driving around your hometown.

Facebook is more like the car you have in college. It has a lot of your junk in it, it is probably still not cutting edge, but it’s cooler and more powerful.

LinkedIn kind of feels like a commuter car to me. It’s just for business, not for pleasure or entertainment.

So who owns the "adult luxury sedan" category? Which one is the BMW of social networks?

Who owns the sportscar category? Which one is the Ferrari of social networks?

Who owns the entry-level commuter car category?

Who owns the equivalent of the "family stationwagon or minivan" category?

Who owns the SUV and offroad category?

You see my point. There are a number of big segments that are not owned yet, and it is really unlikely that any one company can win them all.

If all social networks are converging on the same set of features, then eventually they will be close to equal in function. The only way to differentiate them will be in terms of the brands they build and the audience segments they focus on. These in turn will cause them to emphasize certain features more than others.

In the future the question for consumers will be "Which social network is most like me? Which social network is the place for me to base my online presence?"

Sue may connect to Bob who is in a different social network — his account is hosted in a different social network. Sue will not be a member of Bob’s service, and Bob will not be a member of Sue’s, yet they will be able to form a social relationship and communication channel. This is like email. I may use Outlook and you may use Gmail, but we can still send messages to each other.

Although all social networks will interoperate eventually, depending on each person’s unique identity they may choose to be based in — to live and surf in — a particular social network that expresses their identity, and caters to it. For example, I would probably want to be surfing in the luxury SUV of social networks at this point in my life, not in the luxury sedan, not the racecar, not in the family car, not the dune-buggy. Someone else might much prefer an open source, home-built social network account running on a server they host. It shouldn’t matter — we should still be able to connect, share stuff, get notified of each other’s posts, etc. It should feel like we are in a unified social networking fabric, even though our accounts live in different services with different brands, different interfaces, and different features.

I think this is where social networks are heading. If it’s true then there are still many big business opportunities in this space.

A Few Predictions for the Near Future

This is a five-minute video in which I was asked to make some predictions for the next decade about the Semantic Web, search and artificial intelligence. It was done at the NextWeb conference and was a fun interview.


Learning from the Future with Nova Spivack from Maarten on Vimeo.

My Visit to DERI — World's Premier Semantic Web Research Institute

Earlier this month I had the opportunity to visit, and speak at, the Digital Enterprise Research Institute (DERI), located in Galway, Ireland. My hosts were Stefan Decker, the director of the lab, and John Breslin who is heading the SIOC project.

DERI has become the world’s premier research institute for the Semantic Web. Everyone working in the field should know about them, and if you can, you should visit the lab to see what’s happening there.

DERI is part of the National University of Ireland, Galway. With over 100 researchers focused solely on the Semantic Web, and very significant financial backing, DERI has, to my knowledge, the highest concentration of Semantic Web expertise on the planet today. Needless to say, I was very impressed with what I saw there. Here is a brief synopsis of some of the projects that I was introduced to:

  • Semantic Web Search Engine (SWSE) and YARS, a massively scalable triplestore. These projects are concerned with crawling and indexing the information on the Semantic Web so that end-users can find it. They have done good work on consolidating data and also on building a highly scalable triplestore architecture.
  • Sindice — An API and search infrastructure for the Semantic Web. This project is focused on providing a rapid indexing API that apps can use to get their semantic content indexed, and that can also be used by apps to do semantic searches and retrieve semantic content from the rest of the Semantic Web. Sindice provides Web-scale semantic search capabilities to any semantic application or service.
  • SIOC — Semantically Interlinked Online Communities. This is an ontology for linking and sharing data across online communities in an open manner, that is getting a lot of traction. SIOC is on its way to becoming a standard and may play a big role in enabling portability and interoperability of social Web data.
  • JeromeDL is developing technology for semantically enabled digital libraries. I was impressed with the powerful faceted navigation and search capabilities they demonstrated.
  • notitio.us is a project for personal knowledge management of bookmarks and unstructured data.
  • SCOT, OpenTagging and Int.ere.st. These projects are focused on making tags more interoperable, and on generating social networks and communities from tags. They provide a richer tag ontology and framework for representing, connecting and sharing tags across applications.
  • Semantic Web Services. One of the big opportunities for the Semantic Web that is often overlooked by the media is Web services. Semantics can be used to describe Web services so they can find one another and connect, and even to compose and orchestrate transactions and other solutions across networks of Web services, using rules and reasoning capabilities. Think of this as dynamic semantic middleware, with reasoning built-in.
  • eLite. I was introduced to the eLite project, a large e-learning initiative that is applying the Semantic Web.
  • Nepomuk. Nepomuk is a large effort supported by many big industry players. They are making a social semantic desktop and a set of developer tools and libraries for semantic applications that are being shipped in the Linux KDE distribution. This is a big step for the Semantic Web!
  • Semantic Reality. Last but not least, and perhaps one of the most eye-opening demos I saw at DERI, is the Semantic Reality project. They are using semantics to integrate sensors with the real world. They are creating an infrastructure that can scale to handle trillions of sensors eventually. Among other things I saw, you can ask things like "where are my keys?" and the system will search a network of sensors and show you a live image of your keys on the desk where you left them, and even give you a map showing the exact location. The service can also email you or phone you when things happen in the real world that you care about — for example, if someone opens the door to your office, or a file cabinet, or your car, etc. Very groundbreaking research that could seed an entire new industry.

In summary, my visit to DERI was really eye-opening and impressive. I recommend that major organizations that want to really see the potential of the Semantic Web, and get involved on a research and development level, should consider a relationship with DERI — they are clearly the leader in the space.

How about Web 3G?

I’m here at the BlogTalk conference in Cork, Ireland with a range of bloggers and technologists discussing the emerging social Web. Besides myself, and Ian Davis and Paul Miller from Talis, there are a bunch of other Semantic Web folks here, including Dan Brickley and a group from DERI Galway.

Over dinner a few of us were discussing the terms “Semantic Web” versus “Web 3.0” and we all felt a better term was needed. After some thinking, Ian Davis suggested “Web 3G.” I like this term better than Web 3.0 because it loses the “version number” aspect that so many objected to. It has a familiar ring to it as well, reminding me of the 3G wireless phone initiative. It also suggests Tim Berners-Lee’s “Giant Global Graph” or GGG — a synonym for the Semantic Web. Ian stayed up late and put together a nice blog post about the term, echoing many of my own sentiments about how this term should apply to a decade (the third decade of the Web), rather than to a particular technology.

Help us Win! Twine is a Finalist in the Crunchies!

My company’s product, Twine.com, has made it to the finalist round in the Crunchies, a new annual tech industry awards competition, under the Best Technical Achievement category. Please help us win by casting your vote for Twine here. Thanks!

UPDATE: It turns out that, for some odd reason, the Crunchies allows each voter to vote once per day per category. In other words, you can vote multiple times in the same category (one vote per user per day), so please vote for Twine again if you can.

Powerpoint Deck: Making Sense of the Semantic Web, and Twine

Now that I have been asked by several dozen people for the slides from my talk on "Making Sense of the Semantic Web," I guess it’s time to put them online. So here they are, under the Creative Commons Attribution License (you can share them with attribution to this site).

You can download the Powerpoint file at the link below:

Download nova_spivack_semantic_web_talk.ppt


Or you can view it right here:

Enjoy! And I look forward to your thoughts and comments.

True Knowledge is Cool

The most interesting and exciting new app I’ve seen this month (other than Twine of course!) is a new semantic search engine called True Knowledge. Go to their site and watch their screencast to see what the next generation of search is really going to look like.

True Knowledge is doing something very different from Twine — whereas Twine is about helping individuals, groups and teams manage their private and shared knowledge, True Knowledge is about making a better public knowledgebase on the Web — in a sense they are a better search engine combined with a better Wikipedia. They seem to overlap more with what is being done by natural language search companies like Powerset and companies working on public databases, such as Metaweb and Wikia.

I don’t yet know whether True Knowledge is supporting W3C open-standards for the Semantic Web, but if they do, they will be well-positioned to become a very central service in the next phase of the Web. If they don’t they will just be yet another silo of data — but a very useful one at least. I personally hope they provide SPARQL API access at the very least. Congratulations to the team at True Knowledge! This is a very impressive piece of work.

Radar Networks Announces Twine.com

My company, Radar Networks, has just come out of stealth. We’ve announced what we’ve been working on all these years: It’s called Twine.com. We’re going to be showing Twine publicly for the first time at the Web 2.0 Summit tomorrow. There’s lots of press coming out where you can read about what we’re doing in more detail. The team is extremely psyched and we’re all working really hard right now so I’ll be brief for now. I’ll write a lot more about this later.


Web 3.0 — The Best Official Definition Imaginable

Jason just blogged his take on an official definition of "Web 3.0" — in his case he defines it as better content, built using Web 2.0 technologies. There have been numerous responses already, but since I am one of the primary co-authors of the Wikipedia page on the term Web 3.0, I thought I should throw my hat in the ring here.

Web 3.0, in my opinion, is best defined as the third decade of the Web (2009 – 2019), during which time several key technologies will become widely used. Chief among them will be RDF and the technologies of the emerging Semantic Web. While Web 3.0 is not synonymous with the Semantic Web (there will be several other important technology shifts in that period), it will be largely characterized by semantics in general.

Web 3.0 is an era in which we will upgrade the back-end of the Web, after a decade of focus on the front-end (Web 2.0 has mainly been about AJAX, tagging, and other front-end user-experience innovations). Web 3.0 is already starting to emerge in startups such as my own Radar Networks (our product is Twine) but will really become mainstream around 2009.

Why is defining Web 3.0 as a decade of time better than just about any other possible definition of the term? Well for one thing, it's a definition that can't easily be co-opted by any company or individual around some technology or product. It's also a completely unambiguous definition — it refers to a particular time period and everything that happens in Web technology and business during that period. This would end the debate about what the term means and move it to something more useful to discuss: What technologies and trends will actually become important in the coming decade of the Web?

It's time to once again pull out my well-known graph of Web 3.0 to illustrate what I mean…

[Figure: Radar Networks graph, "Towards a WebOS"]

I've written fairly extensively on the subjects of defining Web 3.0 and the Semantic Web. Here are some links to get you started if you want to dig deeper:

The Semantic Web: From Hypertext to Hyperdata
The Meaning and Future of the Semantic Web
How the WebOS Evolves
Web 3.0 Roundup
Gartner is Wrong About Web 3.0
Beyond Keyword (And Natural Language) Search
Enriching the Connections of the Web: Making the Web Smarter
Next Step for the Web
Doing for Data What HTML Did for Documents

The Semantic Web, Collective Intelligence and Hyperdata

I’m posting this in response to a recent post by Tim O’Reilly which focused on disambiguating what the Semantic Web is and is not, as well as the subject of Collective Intelligence. I generally agree with Tim’s post, but I do have some points I would add by way of clarification. In particular, in my opinion, the Semantic Web is all about collective intelligence, on several levels. I would also suggest that the term "hyperdata" is a possibly useful way to express what the Semantic Web is really all about.

What Makes Something a Semantic Web Application?

I agree with Tim that the term "Semantic Web" refers to the use of a particular set of emerging W3C open standards. These standards include RDF, OWL, SPARQL, and GRDDL. A key requirement for an application to have "Semantic Web inside," so to speak, is that it makes use of or is compatible with, at the very least, basic RDF. An alternative definition is that for an application to be "Semantic Web" it must make at least some use of an ontology, using a W3C standard for doing so.

Semantic Versus Semantic Web

Many applications and services claim to be "semantic" in one manner or another, but that does not mean they are "Semantic Web." Semantic applications include any applications that can make sense of meaning, particularly in language such as unstructured text, or structured data in some cases. By this definition, all search engines today are somewhat "semantic" but few would qualify as "Semantic Web" apps.

The Difference Between "Data On the Web" and a "Web of Data"

The Semantic Web is principally about working with data in a new and hopefully better way, and making that data available on the Web if desired in an open fashion such that other applications can understand and reuse it more easily. We call this idea "The Data Web." The notion is that we are transforming the Web from a distributed file server into something that is more like a distributed database.

Instead of the basic objects being web pages, they are actually pieces of data (triples) and records formed from them (sets, trees, graphs or objects comprised of triples). There can be any number of triples within a Web page, and there can also be triples on the Web that do not exist within Web pages at all — they can come directly from databases for example.

One might respond to this by noting that there is already a lot of data on the Web, in XML and other formats — how is the Semantic Web different from that? What is the difference between "Data on the Web" and the idea of "The Data Web?"

The best answer to this question that I have heard was something that Dean Allemang said at a recent Semantic Web SIG in Palo Alto. Dean said, "Sure there is data on the Web, but it’s not actually a web of data." The difference is that in the Semantic Web paradigm, the data can be linked to other data in other places: it’s a web of data, not just data on the Web.

I call this concept of interconnected data, "Hyperdata." It does for data what hypertext did for text. I’m probably not the originator of this term, but I think it is a very useful term and analogy for explaining the value of the Semantic Web.

Another way to think of it is that the current Web is a big graph of interconnected nodes, where the nodes are usually HTML documents, but in the Semantic Web we are talking about a graph of interconnected data statements that can be as general or specific as you want. A data record is a set of data statements about the same subject, and they don’t have to live in one place on the network — they could be spread over many locations around the Web.

A statement to the effect of "Sue lives in Palo Alto" could exist on site A, refer to a URI for a statement defining Sue on site B, a URI for a statement that defines "lives in" on site C, and a URI for a statement defining "Palo Alto" on site D. That’s a web of data. What’s cool is that anyone can potentially add statements to this web of data; it can be completely emergent.
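The "Sue lives in Palo Alto" example can be sketched with plain Python tuples standing in for triples. The URIs below are made up (they use the reserved .example domain), and the merge and record helpers are hypothetical names; a real application would more likely use an RDF library and real vocabularies.

```python
# Hypothetical URIs; the hosts stand in for sites B, C and D in the text.
SUE = "http://site-b.example/people/sue"
LIVES_IN = "http://site-c.example/terms/livesIn"
NAME = "http://site-c.example/terms/name"
PALO_ALTO = "http://site-d.example/places/palo-alto"

# Statements published independently on different sites.
site_a = [(SUE, LIVES_IN, PALO_ALTO)]
site_b = [(SUE, NAME, "Sue")]
site_d = [(PALO_ALTO, NAME, "Palo Alto")]


def merge(*sources):
    """Union statements from many sources, dropping exact duplicates."""
    return set().union(*sources)


def record(subject, statements):
    """Collect every statement about one subject into a property -> values map."""
    result = {}
    for s, p, o in statements:
        if s == subject:
            result.setdefault(p, set()).add(o)
    return result


web_of_data = merge(site_a, site_b, site_d)
sue = record(SUE, web_of_data)

# Follow the link from Sue to the place she lives, which is described on another site.
(home_uri,) = sue[LIVES_IN]
home = record(home_uri, web_of_data)
print(home[NAME])
```

The record for Sue is assembled from statements that live in different places, which is the sense in which a data record does not have to live at one location on the network.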

The Semantic Web is Built by and for Collective Intelligence

This is where I think Tim and others who think about the Semantic Web may be missing an essential point. The Semantic Web is in fact highly conducive to "collective intelligence." It doesn’t require that machines add all the statements using fancy AI. In fact, in a next-generation folksonomy, when tags are created by human users, manually, they can easily be encoded as RDF statements. And by doing this you get lots of new capabilities, like being able to link tags to concepts that define their meaning, and to other related tags.

Humans can add tags that become semantic web content. They can do this manually or software can help them. Humans can also fill out forms that generate RDF behind the scenes, just as filling out a blog posting form generates HTML, XML, ATOM etc. Humans don’t actually write all that code, software does it for them, yet blogging and wikis for example are considered to be collective intelligence tools.
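Here is a minimal Python sketch of how a form submission ("user tags resource with label") could be turned into statements behind the scenes. The vocabulary URIs, the tag URI scheme, and both function names are invented for illustration, and the output only approximates N-Triples syntax (URIs in angle brackets, literals in quotes, one statement per line); it is not a substitute for a real RDF serializer.

```python
# Hypothetical vocabulary, used here only for illustration.
VOCAB = "http://vocab.example/tagging#"


def tagging_to_triples(user_uri, resource_uri, label, concept_uri=None):
    """Turn one manual tagging action into subject-predicate-object statements."""
    slug = label.strip().lower().replace(" ", "-")
    tag_uri = "http://tags.example/tag/" + slug
    triples = [
        (resource_uri, VOCAB + "taggedWith", tag_uri),
        (tag_uri, VOCAB + "label", label),
        (tag_uri, VOCAB + "createdBy", user_uri),
    ]
    if concept_uri is not None:
        # Link the tag to a concept that defines its meaning.
        triples.append((tag_uri, VOCAB + "means", concept_uri))
    return triples


def to_ntriples(triples):
    """Serialize in an N-Triples-like syntax: URIs in <>, other values as quoted literals."""
    def term(value):
        if value.startswith(("http://", "https://")):
            return "<" + value + ">"
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = tagging_to_triples(
    "http://people.example/nova",
    "http://site.example/post/1",
    "Semantic Web",
    concept_uri="http://concepts.example/SemanticWeb",
)
print(to_ntriples(triples))
```

The person only typed a tag; the software produced the statements, much as a blogging form produces HTML.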

So the concept of folksonomy and tagging is truly orthogonal to the Semantic Web. They are not mutually exclusive at all. In fact the Semantic Web, or at least "Semantic Web Lite" (RDF + only basic use of OWL + basic SPARQL), is capable of modelling and publishing any data in the world in a more open way.

Any application that uses data could do everything it does using these technologies. Every single form of social, user-generated content and community could, and probably will, be implemented using RDF in one manner or another within the next decade or so. And in particular, RDF and OWL + SPARQL are ideal for social networking services — the data model is a much better match for the structure of the data and the network of users and the kinds of queries that need to be done.

Folktologies

This notion that somehow the Semantic Web is not about folksonomy needs to be corrected. For example, take Metaweb’s Freebase. Freebase is what I call a "folktology": an emergent, community-generated ontology. Users collaborate to add to the ontology and the knowledge base that is populated within it. That’s a wonderful example of collective intelligence, user-generated content, and semantics. (Technically, to my knowledge, they are not using RDF for this, but from what I can see their data model is functionally equivalent, and I would expect at least a SPARQL interface from them eventually.)

But that’s not all — check out TagCommons and this Tag Ontology discussion, and also the SKOS ontology — all of which are working on semantic ways of characterizing simple tags in order to enrich folksonomies and enable better collective intelligence.

There are at least two other places where the Semantic Web naturally leverages and supports collective intelligence. The first is the fact that people and software can generate triples (people could do it by hand, but generally they will do it by filling out Web forms or answering questions or dialog boxes etc.) and these triples can live all over the Web, yet interconnect or intersect (when they are about the same subjects or objects).

I can create data about a piece of data you created, for example to state that I agree with it, or that I know something else about it. You can create data about my data. Thus a data-set can be generated in a distributed way — it’s not unlike a wiki for example. It doesn’t have to work this way, but at least it can if people do this.
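A tiny Python sketch of "data about data": give each statement an identifier, then make new statements that refer to that identifier. The hashing scheme, the "stmt:" prefix, and the predicate names are assumptions made for illustration; they are not how RDF itself identifies statements.

```python
import hashlib


def statement_id(statement):
    """Derive a stable identifier for a statement so others can refer to it."""
    digest = hashlib.sha256("|".join(statement).encode("utf-8")).hexdigest()
    return "stmt:" + digest[:12]


def about(target, statements):
    """Return every statement that mentions `target` as subject or object."""
    return [st for st in statements if target in (st[0], st[2])]


# A statement you created.
yours = ("Sue", "livesIn", "Palo Alto")
yours_id = statement_id(yours)

# Statements I create, two of which are about your statement.
mine = [
    ("me", "agreesWith", yours_id),
    (yours_id, "learnedFrom", "a conversation with Sue"),
    ("me", "knows", "Sue"),
]

annotations = about(yours_id, mine)
print(annotations)
```

Because the identifier is derived from the statement's content, two people who never coordinated can still end up annotating the same statement.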

The second point is that OWL, the ontology language, is designed to support an infinite number of ontologies — there doesn’t have to be just one big ontology to "rule them all." Anyone can make a simple or complex ontology and start to then make data statements that refer to it. Ontologies can link to or include other ontologies, or pieces of them, to create bigger distributed ontologies that cover more things.

This is like mashing up not only the data but the schemas too. Both of these are examples of collective intelligence. In the case of ontologies, this is already happening: many ontologies already make use of other ontologies such as Dublin Core and FOAF.
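As a rough sketch of what "mashing up the schemas" means, the Python below treats each ontology as a set of terms plus a list of other ontologies it includes. The `mybooks` ontology and the `all_terms` helper are invented for illustration. In real OWL the inclusion would be declared with `owl:imports`, and the terms would be full URIs.

```python
# Hypothetical sketch: an ontology is a set of terms plus the ontologies it includes.
ontologies = {
    "dc":      {"imports": [],             "terms": {"dc:title", "dc:creator"}},
    "foaf":    {"imports": [],             "terms": {"foaf:Person", "foaf:knows"}},
    # An invented ontology that reuses both of the above
    "mybooks": {"imports": ["dc", "foaf"], "terms": {"mybooks:Review"}},
}

def all_terms(name, seen=None):
    """Collect an ontology's terms plus everything it includes, transitively."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against circular includes
        return set()
    seen.add(name)
    terms = set(ontologies[name]["terms"])
    for imported in ontologies[name]["imports"]:
        terms |= all_terms(imported, seen)
    return terms

print(sorted(all_terms("mybooks")))
```

Data written against `mybooks` can then use `dc:title` or `foaf:knows` directly, without anyone having to redefine them.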

The point here is that there is in fact a natural and very beneficial fit between the technologies of the Semantic Web and what Tim O’Reilly defines Web 2.0 to be about (essentially collective intelligence). In fact the designers of the underlying standards of the Semantic Web specifically had "collective intelligence" in mind when they came up with these ideas. They were specifically trying to rectify several problems in the closed, data-silo world of old fashioned databases. The big motivation was to make data more integrated, to enable applications to share data more easily, and to be able to build data with other data, and to build schemas with other schemas. It’s all about enabling connections and network effects.

Now, whether people end up using these technologies to do interesting things that enable human-level collective intelligence (as opposed to just software level collective intelligence) is an open question. At least some companies such as my own Radar Networks and Metaweb, and Talis (thanks, Danny), are directly focused on this, and I think it is safe to say this will be a big emerging trend. RDF is a great fit for social and folksonomy-based applications.

Web 3.0 and the concept of "Hyperdata"

Where Tim defines Web 2.0 as being about collective intelligence generally, I would define Web 3.0 as being about "connective intelligence." It’s about connecting data, concepts, applications and ultimately people. The real essence of what makes the Web great is that it enables a global hypertext medium in which collective intelligence can emerge. In the case of Web 3.0, which begins with the Data Web and will evolve into the full-blown Semantic Web over a decade or more, the key is that it enables a global hyperdata medium (not just hypertext).

As I mentioned above, hyperdata is to data what hypertext is to text. Hyperdata is a great word — it is so simple and yet makes a big point. It’s about data that links to other data. It does for data what hypertext does for text. That’s what RDF and the Semantic Web are really all about. Reasoning is NOT the main point (but is a nice future side-effect…). The main point is about growing a web of data.

Just as the Web enabled a huge outpouring of collective intelligence via an open global hypertext medium, the Semantic Web is going to enable a similarly huge outpouring of collective knowledge and cognition via a global hyperdata medium. It’s the Web, only better.

Open Source Projects for Extracting Data and Metadata from Files & the Web

I’ve been looking around for open-source libraries (preferably in Java, but not required) for extracting data and metadata from common file formats and Web formats. One project that looks very promising is Aperture. Do you know of any others that are ready or almost ready for prime-time use? Please let me know in the comments! Thanks.

Microsoft Astoria Not Supporting RDF

Microsoft’s Astoria project has decided to make RDF a lower priority and is not supporting it for now. So much for Microsoft participating in the Semantic Web.

The Astoria project page describes the project thusly:


The goal of Microsoft Codename Astoria is to enable applications to
expose data as a data service that can be consumed by web clients
within a corporate network and across the internet. The data service is
reachable over HTTP, and URIs are used to identify the various pieces
of information available through the service. Interactions with the
data service happens in terms of HTTP verbs such as GET, POST, PUT and
DELETE, and the data exchanged in those interactions is represented in
simple formats such as XML and JSON.

That sounds like a perfect description of a Web 3.0 service, and a perfect use-case for RDF and SPARQL! But the team at Astoria doesn’t seem to feel anyone is actually using RDF enough to warrant supporting that open standard. Well, perhaps among Microsoft’s customers that is true enough. But on the other hand, this is a missed opportunity to exercise forward-thinking leadership that will pay dividends to Microsoft in the future when millions of developers and applications will be using RDF as their common language. When is that future? Perhaps as soon as 2 to 3 years. Certainly not more than 5 years.
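For readers who haven’t seen this style of service, here is a tiny in-memory sketch of the pattern the Astoria description outlines: URIs identify pieces of data, and HTTP verbs act on them. This is not Astoria’s API. The `handle` function, the `/customers` path, and the choice of status codes are my own assumptions for illustration.

```python
import itertools

# Hypothetical in-memory data service, keyed by URI. Not Astoria's actual API.
store = {}
_ids = itertools.count(1)

def handle(verb, uri, body=None):
    """Map an HTTP verb on a URI to a data operation; return (status, payload)."""
    if verb == "GET":
        return (200, store[uri]) if uri in store else (404, None)
    if verb == "POST":    # create a new item under a collection URI
        new_uri = f"{uri}/{next(_ids)}"
        store[new_uri] = body
        return (201, new_uri)
    if verb == "PUT":     # create or replace the item at this exact URI
        status = 200 if uri in store else 201
        store[uri] = body
        return (status, uri)
    if verb == "DELETE":
        if uri in store:
            del store[uri]
            return (204, None)
        return (404, None)
    return (405, None)    # verb not supported

status, customer_uri = handle("POST", "/customers", {"name": "Ada"})
print(status, customer_uri)
print(handle("GET", customer_uri))
```

Swap the Python dictionaries for RDF statements and the lookup for a SPARQL query, and you have the kind of data-web service I am talking about.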

Welcome to the age of openness: The more open your platform and data, the more developers will want to use it. RDF is one of the ways to make data more open, and supporting it in a data-web platform makes good sense. But it’s hard to teach an old dog new tricks!

Virtual Out of Body Experiences

A very cool experiment in virtual reality has shown it is possible to trick the mind into identifying with a virtual body:

Through these goggles, the volunteers could see a camera
view of their own back – a three-dimensional "virtual own body" that
appeared to be standing in front of them.

When the researchers stroked the back of the volunteer
with a pen, the volunteer could see their virtual back being stroked
either simultaneously or with a time lag.

The volunteers reported that the sensation seemed to be
caused by the pen on their virtual back, rather than their real back,
making them feel as if the virtual body was their own rather than a
hologram.

Even when the camera was switched to film the back of a
mannequin being stroked rather than their own back, the volunteers
still reported feeling as if the virtual mannequin body was their own.

And when the researchers switched off the goggles,
guided the volunteers back a few paces, and then asked them to walk
back to where they had been standing, the volunteers overshot the
target, returning nearer to the position of their "virtual self".

This has implications for next-generation video games and virtual reality. It also has interesting implications for consciousness studies in general.

Knowledge Networking

I’ve been thinking for several years about Knowledge Networking. It’s not a term I invented; it’s been floating around as a meme for at least a decade or two. But recently it has started to resurface in my own work.

So what is a knowledge network? I define a knowledge network as a form of collective intelligence in which a network of people (two or more people connected by social-communication relationships) creates, organizes, and uses a collective body of knowledge. The key here is that a knowledge network is not merely a site where a group of people work on a body of information together (such as Wikipedia); it’s also a social network, with an explicit representation of social relationships within it. So it’s more like a social network than, for example, a discussion forum or a wiki.

I would go so far as to say that knowledge networks are the third generation of social software. (Note this is based in part on ideas that emerged in conversations I have had with Peter Rip, so this is also his idea):

  • First-generation social apps were about communication (e.g. messaging such as email, discussion boards, chat rooms, and IM)
  • Second-generation social apps were about people and content (e.g. social networks, social media sharing, user-generated content)
  • Third-generation social apps are about relationships and knowledge (e.g. wikis, referral networks, question and answer systems, social recommendation systems, vertical knowledge and expertise portals, social mashup apps, and coming soon, what we’re building at Radar Networks)

Just some thoughts on a Saturday morning…

The Rise of the Social Operating System

In recent months we have witnessed a number of social networking sites begin to open up their platforms to outside developers. While this trend has been exhibited most prominently by Facebook, it is being embraced by all the leading social networking services, such as Plaxo, LinkedIn, Myspace and others. Along separate dimensions we also see a similar trend towards "platformization" in IM platforms such as Skype as well as B2B tools such as Salesforce.com.

If we zoom out and look at all this activity from a distance, it appears that there is a race taking place to become "the social operating system" of the Web. A social operating system might be defined as a system that provides for systematic management and facilitation of human social relationships and interactions.

We might list some of the key capabilities of an ideal "social operating system" as:

  • Identity management
    • Open portable identity
    • Personal profiles ("personas")
    • Privacy control
  • Relationship management
    • Directory and lookup services (location of people to communicate with)
    • Social networking (opt-in relationship formation, indirect social connectivity via social networks)
    • Spam control
  • Communication
    • Person to person communication
      • Synchronous (IM, VOIP)
      • Asynchronous (email, SMS)
    • Group communication
      • Synchronous (conferencing)
      • Asynchronous (group discussions)
  • Social Content distribution
    • Personal publishing (blogging, home pages)
    • Public content distribution
  • Social Coordination
    • Event management (scheduling, invitations, RSVPs)
    • Calendaring
  • Social Collaboration
    • File sharing
    • Document collaboration (communal authoring/editing)
    • Collaborative filtering
    • Recommendation systems
    • Knowledge management
    • Human powered search
    • Project management
    • Workflow
  • Commerce
    • Classified advertising
    • Auctions
    • Shopping

Today I have not seen any single player that provides a coherent solution to this entire "social stack"; however, Microsoft, Yahoo, and AOL are probably the strongest contenders. Can Facebook and other social networks truly compete, or will they ultimately be absorbed into one of these larger players?

Enriching the Connections of the Web — Making the Web Smarter

Web 3.0 — aka The Semantic Web — is about enriching the connections of the Web. By enriching the connections within the Web, the entire Web may become smarter.

I believe that collective intelligence primarily comes from connections. This is certainly the case in the brain, where the number of connections between neurons far exceeds the number of neurons; certainly there is more "intelligence" encoded in the brain’s connections than in the neurons alone. There are several kinds of connections on the Web:

  1. Connections between information (such as links)
  2. Connections between people (such as opt-in social relationships, buddy lists, etc.)
  3. Connections between applications (web services, mashups, client server sessions, etc.)
  4. Connections between information and people (personal data collections, blogs, social bookmarking, search results, etc.)
  5. Connections between information and applications (databases and data sets stored or accessible by particular apps)
  6. Connections between people and applications (user accounts, preferences, cookies, etc.)

Are there other kinds of connections that I haven’t listed? Please let me know!

I believe that the Semantic Web can actually enrich all of these types of connections, adding more semantics not only to the things being connected (such as representations of information or people or apps) but also to the connections themselves.

In the Semantic Web approach, connections are represented with statements of the form (subject, predicate, object) where the elements have URIs that connect them to various ontologies where their precise intended meaning can be defined. These simple statements are sometimes called "triples" because they have three elements. In fact, many of us are working with statements that have more than three elements ("tuples"), so that we can represent not only the subject, predicate, and object of a statement, but also things like provenance (where did the data for the statement come from?), timestamp (when was the statement made?), and other attributes. There really is no limit to what kind of metadata can be stored in these statements. It’s a very simple, yet very flexible and extensible data model that can represent any kind of data structure.
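Here is a small Python sketch of that idea: a statement that is a triple at its core but carries optional provenance and timestamp fields. The `Statement` class, its field names, and the `ex:` identifiers are all assumptions made for illustration, not any particular triple store’s format.

```python
from typing import NamedTuple, Optional

class Statement(NamedTuple):
    """A triple plus optional metadata about the statement itself."""
    subject: str
    predicate: str
    obj: str
    source: Optional[str] = None     # provenance: where the statement came from
    timestamp: Optional[str] = None  # when the statement was made (ISO date)

statements = [
    Statement("ex:alice", "ex:employeeOf", "ex:acme",
              source="ex:siteA", timestamp="2007-07-01"),
    Statement("ex:alice", "ex:friendOf", "ex:bob",
              source="ex:siteB", timestamp="2007-07-15"),
    Statement("ex:bob", "ex:friendOf", "ex:alice"),  # plain triple, no metadata
]

def as_triple(statement):
    """Drop the metadata and keep only (subject, predicate, object)."""
    return tuple(statement[:3])

def from_source(stmts, source):
    """Select the statements that claim a given provenance."""
    return [s for s in stmts if s.source == source]

print(as_triple(statements[0]))
print(len(from_source(statements, "ex:siteA")))
```

The extra fields are what let an application decide which statements to trust, or which ones are out of date, without changing the basic triple model.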

The important point for this article, however, is the contrast with the present Web, which basically provides a single type of connection: the HREF hotlink, which simply means "A and B are linked" and may carry minimal metadata in some cases. In this data model, the Semantic Web instead enables an unlimited range of arbitrarily defined connections. The meaning of these connections can be very specific or very general.

For example one might define a type of connection called "friend of" or a type of connection called "employee of." These have very different meanings (different semantics), which can be made explicit and also machine-readable using OWL. By linking a page about a person with the "employee of" link to another page about a different person, we can express that the first person is employed by the second. That is a statement that any application which can read OWL is able to see and correctly interpret, by referencing the underlying definition of "employee of," which is defined in some ontology and might for example specify that an "employee of" relation connects a person to a person or organization who is their employer. In other words, rather than just linking things with the generic "hotlink" we are all used to, they can now be linked with specific kinds of links that have very particular and unambiguous meaning and logical implications.
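A toy version of this in Python might look like the following. The `link_types` table, the `types` table, and the `is_consistent` helper are hypothetical and stand in for a real ontology. Note also that a real OWL reasoner typically uses a link’s domain and range to infer what kind of thing each end is, rather than to reject data; this sketch uses them as a simple consistency check because that is easier to show.

```python
# Hypothetical sketch: typed links with a declared domain (what the link starts
# from) and range (what it can point to). All names are invented.
link_types = {
    "ex:employeeOf": {"domain": {"Person"}, "range": {"Person", "Organization"}},
    "ex:friendOf":   {"domain": {"Person"}, "range": {"Person"}},
}

# What kind of thing each identifier refers to
types = {
    "ex:alice": "Person",
    "ex:bob": "Person",
    "ex:acme": "Organization",
}

def is_consistent(subject, predicate, obj):
    """Check a statement against the declared domain and range of its link type."""
    definition = link_types.get(predicate)
    if definition is None:
        return False  # unknown link type: nothing to interpret it with
    return (types.get(subject) in definition["domain"]
            and types.get(obj) in definition["range"])

print(is_consistent("ex:alice", "ex:employeeOf", "ex:acme"))  # a person employed by an organization
print(is_consistent("ex:alice", "ex:friendOf", "ex:acme"))    # "friend of" an organization
```

A generic hotlink could not tell these two cases apart; the typed link can, because its meaning is written down where software can read it.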

This has the potential at least to dramatically enrich the information-carrying capacity of connections (links) on the Web. It means that connections can carry more meaning, on their own. It’s a new place to put meaning in fact — you can put meaning between things to express their relationships. And since connections (links) far outnumber objects (information, people or applications) on the Web, this means we can radically improve the semantics of the structure of the Web as a whole — the Web can become more meaningful, literally. This makes a difference, even if all we do is just enrich connections between gross-level objects (in other words, connections between Web pages or data records, as opposed to connections between concepts expressed within them, such as for example, people and companies mentioned within a single document).

Even if the granularity of this improvement in connection technology is relatively gross level it could still be a major improvement to the Web. The long-term implications of this have hardly been imagined let alone understood — it is analogous to upgrading the dendrites in the human brain; it could be a catalyst for new levels of computation and intelligence to emerge.

It is important to note that, as illustrated above, there are many types of connections that involve people. In other words the Semantic Web, and Web 3.0, are just as much about people as they are about other things. Rather than excluding people, they actually enrich their relationships to other things. The Semantic Web should, among other things, enable dramatically better social networking and collaboration to take place on the Web. It is not only about enriching content.

Now where will all these rich semantic connections come from? That’s the billion dollar question. Personally I think they will come from many places: from end-users as they find things, author content, bookmark content, share content and comment on content (just as hotlinks come from people today), as well as from applications which mine the Web and automatically create them. Note that even when mining the Web, a lot of the data actually still comes from people. For example, mining Wikipedia or a social network yields lots of great data that was ultimately extracted from user contributions. So mining and artificial intelligence do not always imply "replacing people." Far from it! In fact, mining is often best applied as a means to effectively leverage the collective intelligence of millions of people.

These are subtle points that are very hard for non-specialists to see. Without actually working with the underlying technologies such as RDF and OWL, they are basically impossible to see right now. But soon there will be a range of semantically powered, end-user-facing apps that will demonstrate this quite obviously. Stay tuned!

Of course these are just my opinions from years of hands-on experience with this stuff, but you are free to disagree or add to what I’m saying. I think there is something big happening though. Upgrading the connections of the Web is bound to have a significant effect on how the Web functions. It may take a while for all this to unfold however. I think we need to think in decades about big changes of this nature.

Web 3.0 — Next-Step for Web?

The Business 2.0 Article on Radar Networks and the Semantic Web just came online. It’s a huge article. In many ways it’s one of the best popular articles written about the Semantic Web in the mainstream press. It also goes into a lot of detail about what Radar Networks is working on.

One point of clarification, just in case anyone is wondering…

Web 3.0 is not just about machines. It’s actually all about humans: it leverages social networks, folksonomies, communities and social filtering AS WELL AS the Semantic Web, data mining, and artificial intelligence. The combination of the two is more powerful than either one on its own. Web 3.0 is Web 2.0 + 1. It’s NOT Web 2.0 minus people. The "+ 1" is the addition of software and metadata that help people and other applications organize and make better sense of the Web. That new layer of semantics, often called "The Semantic Web," will add to and build on the existing value provided by social networks, folksonomies, and collaborative filtering that are already on the Web.

So at least here at Radar Networks, we are focusing much of our effort on helping people help themselves, and each other, make sense of the Web. We leverage the amazing intelligence of the human brain, and we augment that using the Semantic Web, data mining, and artificial intelligence. We really believe that the next generation of collective intelligence is about creating systems of experts, not expert systems.

Business 2.0 and BusinessWeek Articles About Radar Networks

It’s been an interesting month for news about Radar Networks. Two significant articles came out recently:

Business 2.0 Magazine published a feature article about Radar Networks in their July 2007 issue. This is perhaps the most comprehensive article to date about what we are working on at Radar Networks; it’s also one of the better articulations of the value proposition of the Semantic Web in general. It’s a fun read, with gorgeous illustrations, and I highly recommend reading it.

BusinessWeek posted an article about Radar Networks on the Web. The article covers some of the background that led to my interests in collective intelligence and the creation of the company. It’s a good article and covers some of the bigger issues related to the Semantic Web as a paradigm shift. I would add one or two points of clarification in addition to what was stated in the article. First, Radar Networks is not relying solely on software to organize the Internet; in fact, the service we will be launching combines human intelligence and machine intelligence to start making sense of information, and to help people search and collaborate around interests more productively. One other minor point: the article mentions the story of EarthWeb, the Internet company that I co-founded in the early 1990s. EarthWeb’s content business actually was sold after the bubble burst, and the remaining lines of business were taken private under the name Dice.com. Dice is the leading job board for techies and was one of our properties. Dice has been highly profitable all along and recently filed for a $100M IPO.