Defining the Semantic Graph — What is it Really?

This is written in response to a post by Anne Zelenka.

I’ve been talking about the coming “semantic graph” for quite some time now, and it seems the meme has suddenly caught on thanks to a recent article by Tim Berners-Lee in which he speaks of an emerging “Giant Global Graph” or “GGG.” But if the GGG emerges, it may or may not be semantic. For example, social networks are NOT semantic today, even though they contain various kinds of links between people and other things.

So what makes a graph “semantic?” How is the semantic graph different from social networks like Facebook for example?

Many people think that the difference between a social graph and a semantic graph is that a semantic graph contains more types of nodes and links. That’s potentially true, but not always the case. In fact, you can make a semantic social graph or a non-semantic social graph. The concept of whether a graph is semantic is orthogonal to whether it is social.

A graph is “semantic” if the meaning of the graph is defined and exposed in an open and machine-understandable fashion. In other words, a graph is semantic if the semantics of the graph are part of the graph or at least linked from the graph. This can be accomplished by representing a social graph using RDF and OWL, the languages of the Semantic Web.

Today most social networks are non-semantic, but it is relatively easy to transform them into semantic graphs. A simple way to make any non-semantic social graph into a semantic social graph is to use the FOAF ontology to define the entities and links in the graph.

FOAF stands for “friend of a friend” and is a simple ontology of people and social relationships. If a social network links its data to the FOAF ontology, and exposes these linkages to other applications on the Web, then other applications can understand the meaning of the data in the network in an unambiguous manner. In other words, it is now a semantic social graph because its semantics are visible to other applications.
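As a rough sketch of what linking a social graph to FOAF could look like, the snippet below holds a tiny graph as subject-predicate-object triples in plain Python. The FOAF namespace and the terms foaf:Person, foaf:name, and foaf:knows are real; the example.org URIs, the two people, and the helper `friends_of` are made up for illustration, and a real application would more likely use an RDF library than raw tuples.

```python
# Real namespaces for FOAF and rdf:type.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace for this network's own users.
EX = "http://example.org/people/"

# Each triple is (subject, predicate, object).
graph = {
    (EX + "alice", RDF_TYPE, FOAF + "Person"),
    (EX + "alice", FOAF + "name", "Alice"),
    (EX + "bob", RDF_TYPE, FOAF + "Person"),
    (EX + "bob", FOAF + "name", "Bob"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
}


def friends_of(graph, person):
    """Return the URIs that `person` is linked to via foaf:knows."""
    return sorted(o for s, p, o in graph if s == person and p == FOAF + "knows")


print(friends_of(graph, EX + "alice"))
```

Because the link is named with the shared foaf:knows URI rather than a private label, any application that already knows FOAF can read it without asking what it means.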

As illustrated by the FOAF example above, one way to make a graph semantic is to use the W3C open standards for the Semantic Web (RDF and OWL) to represent, and define the meaning of, the nodes and links in the graph. By using the Semantic Web, the graph becomes machine-understandable and thus more easily navigated, imported, searched, and integrated by other applications.

For example, let’s say that social network Application A comes along and wants to use the dataset of social network Application B. App A sees the graph of nodes and links in B, and it sees something called a “has team” link connecting various nodes in the graph together. What does that mean? What kinds of things can or cannot be connected with this link? What can be inferred if things are connected this way?

The meaning of “has team” is ambiguous to App A because it’s not defined anywhere that the software can see. The only way App A can use App B’s data correctly is if the programmer of App A speaks to the programmer of App B, or reads something they wrote (such as documentation), to learn what they meant by the “has team” link.

Only by knowing what the programmer of App B intended can App A treat App B’s data appropriately, without misinterpretations that might lead to mistakes or inconsistencies. This is important because, for example, if a user searches for “Yankees Players,” should people who are linked by the “has team” link to sports teams called “Yankees” be returned? Does “has team” mean “a connection from a person to a sports team they support,” “a connection from a person to a sports team they play on,” or “a connection from a person to a workgroup they participate in”? In short, App A has no idea what to do with data that is linked by App B’s “has team” link unless it is explicitly programmed to make use of it.

The OWL language (Web Ontology Language) provides a way for the programmers of App A and App B to define what the links in their graphs mean in an unambiguous and machine-understandable way. So App A just has to look up this definition and it can instantly start to use App B’s data correctly, without any new programming or difficult integration.

How is this accomplished? The programmer of App B simply uses OWL to define an ontology of social relationships for their service: for example, they define the “has team” link to be a link that connects a person to a sports team they play on. They also define what they mean by a “sports team” (for example, “a group of two or more people that play a sport,” where a sport is one of “baseball, basketball, football, soccer, hockey, tennis”), and they link these terms to another ontology of sports somewhere else on the Web. The ontology file that defines App B’s data is added to the Website of App B, and linked from its data, so that other applications can see it.
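To make this a little more concrete, here is one way App B’s definitions might look when written down as triples. This is only a sketch in plain Python, not real OWL syntax: the RDF, RDFS, OWL, and FOAF namespaces and the terms rdfs:domain, rdfs:range, rdfs:comment, owl:Class, and owl:ObjectProperty are real, while the `APPB` namespace, the names `hasTeam` and `SportsTeam`, and the helper `describe` are hypothetical. The enumeration of sports is left out to keep it short.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"
# Hypothetical namespace where App B publishes its ontology.
APPB = "http://example.org/appb/ontology#"

ontology = {
    (APPB + "SportsTeam", RDF + "type", OWL + "Class"),
    (APPB + "SportsTeam", RDFS + "comment",
     "A group of two or more people that play a sport."),
    (APPB + "hasTeam", RDF + "type", OWL + "ObjectProperty"),
    # The link starts at a person...
    (APPB + "hasTeam", RDFS + "domain", FOAF + "Person"),
    # ...and ends at a sports team.
    (APPB + "hasTeam", RDFS + "range", APPB + "SportsTeam"),
    (APPB + "hasTeam", RDFS + "comment",
     "Connects a person to a sports team they play on."),
}


def describe(ontology, term):
    """Collect what the ontology states about `term`, keyed by predicate local name."""
    info = {}
    for s, p, o in ontology:
        if s == term:
            info[p.split("#")[-1]] = o
    return info


print(describe(ontology, APPB + "hasTeam"))
```

The domain and range statements are what turn “has team” from a bare label into something another program can check and reason over.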

Now when another application such as App A comes along and looks at App B’s data, it can reference App B’s ontology to see for itself what was intended by the “has team” link: it can see exactly what that link implies and what can be inferred from it. It understands how to use App B’s data set, and how to correctly make new links using that data set which are consistent with the meaning of the links it contains.
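A simple case of “what can be inferred” is working out the types of the things at each end of a link from the link’s declared domain and range, in the style of RDF Schema. The sketch below assumes App B has declared a domain and range for a hypothetical `hasTeam` property; the `APPB` and `DATA` namespaces, the sample data, and the function `infer_types` are all illustrative, and the code handles only one domain and one range per property.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
# Hypothetical namespaces for App B's ontology and data.
APPB = "http://example.org/appb/ontology#"
DATA = "http://example.org/appb/data/"

ontology = {
    (APPB + "hasTeam", RDFS_DOMAIN, FOAF_PERSON),
    (APPB + "hasTeam", RDFS_RANGE, APPB + "SportsTeam"),
}
data = {(DATA + "alice", APPB + "hasTeam", DATA + "yankees")}


def infer_types(data, ontology):
    """Derive rdf:type triples from the declared domain and range of each link."""
    # Simplification: keeps a single domain and range per property.
    domains = {s: o for s, p, o in ontology if p == RDFS_DOMAIN}
    ranges = {s: o for s, p, o in ontology if p == RDFS_RANGE}
    inferred = set()
    for s, p, o in data:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))  # the subject is in the domain
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))   # the object is in the range
    return inferred


for triple in sorted(infer_types(data, ontology)):
    print(triple)
```

App A never had to be told that the first node is a person and the second is a sports team; it read that from the ontology.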

This is the real point of the Semantic Web open standards: RDF enables data to be represented in a database-independent manner, and OWL enables the semantics of that data to be defined in an open, machine-understandable way so that other applications can use that data without having to first be programmed to do so. As long as they speak RDF/OWL, applications can use any data they find and look up the meaning of any data they need to use so they can use the data appropriately.

For example, suppose another application, App C, is OWL-aware but has never seen App B’s data set before and was not programmed specifically to use it. When App C pulls some data out from App B’s API, it can immediately begin to use this data correctly and consistently with how App B uses it, because all that is necessary for understanding how to use B’s data is encoded in the OWL ontology that App B’s data refers to.
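Here is a hedged sketch of how App C might answer the “Yankees Players” search without knowing anything about App B in advance. It assumes App B’s ontology links its own `hasTeam` property to a `playsFor` property in a shared sports ontology using rdfs:subPropertyOf (a real RDFS term), and that App C already understands `playsFor`. The `SPORT`, `APPB`, and `DATA` namespaces, the sample data, and `players_of` are hypothetical, and only direct sub-properties are followed.

```python
RDFS_SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Hypothetical shared sports ontology that App C already understands.
SPORT = "http://example.org/sports#"
# Hypothetical namespaces for App B's ontology and data.
APPB = "http://example.org/appb/ontology#"
DATA = "http://example.org/appb/data/"

appb_ontology = {
    # App B states that its "has team" link means "plays for".
    (APPB + "hasTeam", RDFS_SUBPROP, SPORT + "playsFor"),
}
appb_data = {
    (DATA + "alice", APPB + "hasTeam", DATA + "yankees"),
    (DATA + "bob", APPB + "hasTeam", DATA + "red-sox"),
    (DATA + "yankees", RDFS_LABEL, "Yankees"),
    (DATA + "red-sox", RDFS_LABEL, "Red Sox"),
}


def players_of(team_label, data, ontology):
    """Find subjects linked to the named team by any link that means 'plays for'."""
    # Start from the shared property, then add direct sub-properties only.
    plays_for = {SPORT + "playsFor"} | {
        s for s, p, o in ontology
        if p == RDFS_SUBPROP and o == SPORT + "playsFor"
    }
    teams = {s for s, p, o in data if p == RDFS_LABEL and o == team_label}
    return sorted(s for s, p, o in data if p in plays_for and o in teams)


print(players_of("Yankees", appb_data, appb_ontology))
```

Without the ontology, the same call finds nothing, because App C has no way to know that `hasTeam` has anything to do with playing on a team.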

The point here is that using Semantic Web open standards such as RDF and OWL to encode what data means is a giant leap beyond just putting raw data onto the Web in an open format. It doesn’t just put the data itself on the Web, it also puts the definition of what the data means, and how to use it, on the Web in an open format. A semantic graph is far more reusable than a non-semantic graph: it’s a graph that carries its own meaning.

The semantic graph is not merely a graph with links to more kinds of things than the social graph. It’s a graph of interconnected things that is machine-understandable: its meaning or “semantics” is explicitly represented on the Web, just like its data. This is the real way to make social networks open. Merely opening up their APIs is just the first step.

Only when the semantics of data is defined and shared in an open way can any graph truly be said to be semantic. Once data around the Web is defined in a machine-understandable way, a whole new world of easy, instant mashups becomes possible. Applications can start to freely and instantly mix and match each other’s data, including new data they were not programmed in advance to understand. This opens up the door to the Web truly becoming a giant database and eventually an integrated operating system in which all applications are able to more easily interoperate and share data.
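One reason such mashups can be simple: when two applications describe things with triples and use the same URI for the same thing, combining their data is little more than a set union. The sketch below assumes exactly that, which in practice often takes extra work to arrange. The FOAF namespace is real; the `APPB`, `PEOPLE`, and `TEAMS` namespaces, the sample triples, and `facts_about` are hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
# Hypothetical namespaces shared by both applications.
APPB = "http://example.org/appb/ontology#"
PEOPLE = "http://example.org/people/"
TEAMS = "http://example.org/teams/"

# App A knows names and friendships.
app_a = {
    (PEOPLE + "alice", FOAF + "name", "Alice"),
    (PEOPLE + "alice", FOAF + "knows", PEOPLE + "bob"),
}
# App B knows team membership.
app_b = {
    (PEOPLE + "alice", APPB + "hasTeam", TEAMS + "yankees"),
}

# Merging is a plain set union because both sides use the same URI for Alice.
merged = app_a | app_b


def facts_about(graph, subject):
    """Return every (predicate, object) pair stated about `subject`."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in facts_about(merged, PEOPLE + "alice"):
    print(predicate, obj)
```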

The Giant Global Graph may or may not be a semantic graph. That depends on whether it is implemented with, or at least connected to, W3C standards for the Semantic Web.

I believe that because the Semantic Web makes data integration easier, it will ultimately be widely adopted. Simply put, applications that wish to access or integrate data in the Age of the Web can more easily do so using RDF and OWL. That alone is reason enough to use these standards.

Of course there are many other benefits as well, such as the ability to do more sophisticated reasoning across the data, but that is less important. Simply making data more accessible, connectable, and reusable across applications would be a huge benefit.

7 thoughts on “Defining the Semantic Graph — What is it Really?”

  1. Hi Nova, I left a response on GigaOM but also wanted to stop by here and say thanks for the comment and the post.
    I’m not arguing that you can’t represent a unified social graph semantically. I’m pointing out how a unified social graph doesn’t really adequately represent the complexity of human relationships. And I’m also wondering if moving towards a semweb approach for the social graph removes too much of the human, since semantic web technologies are all about machine processing.
    Seems to me that many people calling for a unified social graph are those that treat their friends/fans like undifferentiated nodes. Most people, however, don’t have so many people they interact with online (or so many services) that they need a unified, machine-processable approach. And a unified machine-processable approach has drawbacks (possibility for spam and privacy abuses, a loss of having multiple ways to say “you are my friend,” a loss of multiple identities online).
    Hope you had a nice Thanksgiving. 🙂

  2. Nova,
    I agree with your analysis. To me, however, I have another interpretation about Tim’s GGG claim.
    I understand GGG to be equivalent to WWW but from a different angle of view. When we talk about WWW, we take the publisher’s point of view; when we talk about GGG, we are trying to take the viewer’s point of view. Both views look upon the same Web, but gathering a different structure of the Web. Moreover, I believe that the purpose of Twine is exactly an attempt to convert the web information storage from the original publisher-oriented point of view to the more friendly viewer-oriented point of view.
    You may look at the entire analysis of my understanding of GGG at Thinking Space.
    — Yihong

  3. G’Day from the Antipodes, Nova.
    As usual, your thoughts provoke more thoughts. Is that negentropic?
    The part of this post I’m not comfortable with is the presumption that–just because RDF provides us with a well-formed container into which we humans can pour information about the information–suddenly the machines will be able to use each other’s information [“App C can immediately begin to use this data correctly and consistently with how App B uses it”] to deliver an improvement in the knowledge state experienced by some other humans…
    Surely the mere fact that RDF offers a structure does virtually nothing to improve the chance that most humans will start to think like S.R. Ranganathan and populate their ontologies etc with logical, useful, language.
    IOW: the G3 gambit makes it much easier and faster for machines to locate yet another confusing piece of human-generated information that requires some level of disambiguation.
    I think my nagging problem is that RDF appears to be used to store information fundamentally mismatched with RDF’s level of abstraction.
    The semantic web cannot be driven by more human coloratura…there’s no end to the amount of clarification that is required for usefully machine-mediated human communication, no matter how neatly the storage containers fit together.
    To me it seems more productive to use RDF as a storage grid for a type of metadata that is more native (somehow) to the logics of the machines themselves. There needs to be more of a state change in the qualities of the information held at the first (or surely the second) order of abstraction above the original.
    I understand the need to improve people’s habits of describing what they’re talking about so that people two nodes away can still get the message. But the whole project suffers mightily when you ask those same messy humans to do the job.

  4. Hi Nova, very good explanation of the semantic graph. It’s always encouraging for me to read your blog as I too am working on an ambitious semantic web project 🙂

  5. Thank you, Nova. This is a great description of how these technologies relate to the web we already know. This is one of the biggest hurdles for us to overcome: explaining what we’re so excited about, both to fellow programmers as well as to non-programmers.

  6. Nova, you say that Semantic graphs expose their data and the meaning of links through RDF and OWL and then you say other apps can make use of this data. Can you please give a few examples as to how this semantic graph can be of use for other apps?

