Proposal for GRDDLR: The Open Source GRDDL Repository

Now that GRDDL is officially on its way to becoming a W3C open
standard, I’m even more interested in its potential. For those who don’t know,
GRDDL provides a way to map from non-RDF data into RDF, on-the-fly. So for
example, if a semantic web app needs data from some Web page, and that page
does not contain RDF, a GRDDL transform can be provided that maps that Web
page’s content to appropriate RDF. The semantic web app can simply look for the
presence of a GRDDL transform for that page and then generate RDF on the fly
for it. OK, that’s probably an oversimplification, but it gets the point
across. Here are some nicely illustrative use-case examples that shed more light on the subject.
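
To make the "look for the presence of a GRDDL transform" step a bit more concrete, here is a minimal sketch in Python. It assumes the page advertises its transform the way GRDDL describes for XHTML, with a link element whose rel value includes "transformation". The function name, the sample page and the example.org URL are made up for illustration, and the sketch skips details a real GRDDL agent would handle, such as checking the profile attribute on the head element and actually running the transform.

```python
from html.parser import HTMLParser


class TransformationFinder(HTMLParser):
    """Collects the href of every <link> whose rel includes 'transformation'."""

    def __init__(self):
        super().__init__()
        self.transforms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").split()
        if "transformation" in rels and attr.get("href"):
            self.transforms.append(attr["href"])


def find_grddl_transforms(html_text):
    """Return the transform URLs a page links to (hypothetical helper)."""
    finder = TransformationFinder()
    finder.feed(html_text)
    return finder.transforms


# Illustrative page only; the transform URL is an assumption, not a real resource.
SAMPLE_PAGE = """<html><head profile="http://www.w3.org/2003/g/data-view">
<link rel="transformation" href="http://example.org/transforms/page-to-rdf.xsl" />
</head><body><p>Hello</p></body></html>"""

if __name__ == "__main__":
    print(find_grddl_transforms(SAMPLE_PAGE))
```

An app would then fetch each transform it finds and apply it to the page to get RDF; that part is left out here.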

So this got me thinking. Where can semantic web app x find good GRDDL
transforms to use? The current idea is that content creators will create and
publish GRDDL for their own content. So the publisher of Web page x would have
to create a GRDDL transform for it and host it on their site. When apps come
along that need RDF they can check to see if this GRDDL information exists
there, and if so they can use it.

But that approach puts all the burden on content publishers — many of whom may
be slow to adopt GRDDL until they see some real demand for it. So how could we
get this started faster?

This is pure speculation at this point, but what if there were a central open
site, a SourceForge of sorts, where anyone could create and publish a GRDDL
transform for anything that has a URI? Let’s call it GRDDLR (GRDDL + R for "Repository").

In other words, even though I may not be the publisher of CNN’s content, if I
wanted to I could provide GRDDL for turning that content into RDF and share it
with the world as an open-source project on GRDDLR. Others could use it and improve it if they
wanted to. This would enable app creators to generate and share GRDDL about
content sources without waiting for third-party content providers to adopt
GRDDL. And this in turn could actually motivate content providers to get
involved sooner — because if you publish Web site x, and other people are
making GRDDL about it, you will probably want to at least make sure it is
correct, or publish your own definitive version (which you could designate as
definitive by simply linking to it from your actual site, a simple way
to "claim" it as official).

But in addition, this central repository, GRDDLR, would also be a mirror and central
lookup location for all GRDDL data. Let’s say some semantic web app wants to
get an RDF representation of some Web resource x. First it should check that
resource and see if it links to any GRDDL that it recommends, and if so that
should be taken to be the "official" GRDDL. If no GRDDL link is
found, then the app would check the central repository to see if any GRDDL
transforms have been added for that URI by any other parties. If multiple
alternative GRDDL transforms are found to exist for a given resource, then
there would be a way to choose the "best" by default — perhaps the
one that has the best approval ratings from other members of the community.
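
The lookup order described above can be sketched roughly as follows. This is only an illustration under my own assumptions: GRDDLR does not exist yet, so the repository is modeled as a plain dictionary from resource URI to candidate transforms, and the Transform type, its rating field and the choose_transform function are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Transform:
    url: str
    rating: float  # hypothetical community approval score, higher is better


def choose_transform(official_links, repository, resource_uri):
    """Pick the transform URL to use for a resource, or None if there is none.

    official_links: transform URLs the resource itself links to.
    repository: dict mapping resource URI -> list of Transform candidates
                (a stand-in for the proposed central repository).
    """
    # 1. A transform the publisher links to is treated as the "official" one.
    if official_links:
        return official_links[0]
    # 2. Otherwise fall back to whatever third parties have contributed.
    candidates = repository.get(resource_uri, [])
    if not candidates:
        return None
    # 3. With several alternatives, default to the best-rated one.
    return max(candidates, key=lambda t: t.rating).url


if __name__ == "__main__":
    repo = {
        "http://example.org/page": [
            Transform("http://example.org/t/a.xsl", 3.5),
            Transform("http://example.org/t/b.xsl", 4.8),
        ]
    }
    print(choose_transform([], repo, "http://example.org/page"))
```

Using the highest rating as the default is just one possible policy; a real repository might also weigh how recently a transform was updated or who published it.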

An open-source GRDDL development portal that also serves as a lookup directory
for GRDDL would be a very useful piece of infrastructure. It could become a
resource that would be very widely used by applications in the future as RDF
becomes more and more important as the "data interchange" language of

OK, so I like this enough that I bought the domain. If anyone wants to build the site, this is my lazyweb
request. I’ll contribute the domain to the project.

(Thanks to Kingsley for peer-reviewing this before I posted it.)
