Here are some further thoughts on the potential uses of the open dataset that may result from the Meme Propagation experiment.
First of all, this dataset is different from other meme propagation tests in that it enables each instance of the meme to be time and location stamped, and provides information about how each instance of the meme was vectored (who it was discovered from). This data makes it possible to analyze how memes spread in space, time, and social networks.
It will be interesting to see how quickly this meme spreads around the planet, and where the major spreading points are. Who are the major vectors: which blogs, which blog software, which operating systems, which geographic locations? Who is reading which blogs, and what software are they reading with? What is the average number of blogs that pick up the meme per blog that posts it?
Are there any demographic patterns to the spread of the meme, for example by gender, number of years blogging, particular software, or geography? Are there any patterns to the space-time distribution of the results? Can we classify the blogs that participate into different categories, for example by type of audience reached, or by the number of downstream readers that result from them? How does this meme move through time? For example, can we track and visualize the momentum of this meme as it moves? (See my earlier article on measuring meme momentum for some ideas on how to do this.)
The dataset generated by this experiment could help us explore these questions further, if enough people participate. This is just a first step, however. I think similar techniques could be applied to tracking any story, running a survey or opinion poll, or even marketing using memes.
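As a rough illustration of that last question, here is a minimal Python sketch that computes the average number of direct pickups per posting blog. The record layout (a blog name paired with the blog it discovered the meme from) and the example hostnames are my own assumptions for illustration, not the experiment's actual format.

```python
from collections import Counter

# Hypothetical records: (blog, discovered_from). None marks the origin post.
records = [
    ("origin.example", None),
    ("a.example", "origin.example"),
    ("b.example", "origin.example"),
    ("c.example", "a.example"),
]

def pickup_counts(records):
    """Count how many blogs named each blog as their source."""
    return Counter(src for _, src in records if src is not None)

def average_pickups(records):
    """Mean number of direct downstream blogs per blog that posted the meme."""
    posters = {blog for blog, _ in records}
    if not posters:
        return 0.0
    counts = pickup_counts(records)
    # Sources that never posted a record themselves are left out of the sum.
    return sum(counts[blog] for blog in posters) / len(posters)

def top_vectors(records, n=3):
    """The n blogs that passed the meme on to the most other blogs."""
    return pickup_counts(records).most_common(n)
```

Calling `average_pickups(records)` gives the mean fan-out, and `top_vectors(records)` lists the biggest spreaders. Real data would also need the time and location stamps to answer the other questions.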
Another aspect of this that interests me is the concept of using a GUID to represent a meme so that it can easily be tracked wherever it occurs. Imagine, for example, doing a survey in this manner. Each blog is responsible for housing its own results. The results all carry the GUID for the survey. That way they can later be located and aggregated via search engines simply by searching for the GUID. In a sense, using a GUID in this manner creates a decentralized dataset.
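To make the idea concrete, here is a small Python sketch of a GUID-tagged survey. It assumes a search engine has already returned the matching pages as simple dictionaries; the `text` and `answer` fields and the function names are hypothetical, chosen only for illustration.

```python
import uuid
from collections import Counter

def new_survey_guid():
    """Generate a random GUID to embed in every post that answers the survey."""
    return str(uuid.uuid4())

def aggregate(pages, guid):
    """Tally answers from pages whose text contains the survey GUID.

    `pages` stands in for search-engine results; each item is assumed to be
    a dict with a 'text' field and an already-extracted 'answer' field.
    """
    tally = Counter()
    for page in pages:
        if guid in page["text"]:
            tally[page["answer"]] += 1
    return tally

# Usage: three blogs host their own results; one unrelated page is ignored.
guid = new_survey_guid()
pages = [
    {"text": f"My answer. Survey {guid}", "answer": "yes"},
    {"text": f"Survey {guid}: I disagree", "answer": "no"},
    {"text": f"Tagged {guid}", "answer": "yes"},
    {"text": "An unrelated post", "answer": "no"},
]
results = aggregate(pages, guid)
```

No central server stores the answers here. The GUID alone ties the scattered results together, which is what makes the dataset decentralized.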
This reminds me in some ways of my friend Paul Ford’s Future History of the Semantic Web, which has suddenly resurfaced on Slashdot today. It’s a good read, by the way, although Google is not going to beat Amazon or eBay to the Semantic Web. From what I hear, Google is philosophically opposed to the Semantic Web: they prefer more traditional information retrieval techniques, which are primarily statistical rather than semantic.