Thoughts on Uses of the Meme Propagation Dataset

Here are some further thoughts on the potential uses of the open dataset that may result from the Meme Propagation experiment.

First of all, this dataset differs from other meme propagation tests in that each instance of the meme is time- and location-stamped and records how it was vectored (who it was discovered from). This data makes it possible to analyze how memes spread through space, time, and social networks.

It will be interesting to see how quickly this meme spreads around the planet, and where the major spreading points are. Who are the major vectors: which blogs, which blog software, which operating systems, which geographic locations? Who is reading which blogs, and what software are they reading with? What is the average number of blogs that pick up the meme per blog that posts it?
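
As a rough illustration of the last question, here is a minimal Python sketch that computes the average number of direct pickups per posting blog. The `Instance` record, its field names, and the sample blog names are assumptions made for this example; they are not the experiment's actual data format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Instance:
    """One posting of the meme (hypothetical record layout)."""
    blog: str                # blog that posted the meme
    posted_at: datetime      # time stamp of the post
    source: Optional[str]    # blog it was discovered from; None for the origin


def pickups_per_blog(instances):
    """Count how many blogs picked the meme up directly from each blog."""
    # Start every posting blog at zero so blogs with no pickups are included.
    counts = Counter({inst.blog: 0 for inst in instances})
    for inst in instances:
        if inst.source is not None:
            # A source that never reported its own instance still gets counted here.
            counts[inst.source] += 1
    return counts


def average_pickups(instances):
    """Average number of direct pickups per blog that appears in the data."""
    counts = pickups_per_blog(instances)
    return sum(counts.values()) / len(counts) if counts else 0.0


# Made-up sample: the origin blog, two blogs that found it there, and one more hop.
sample = [
    Instance("origin.example", datetime(2004, 8, 1, 9, 0), None),
    Instance("blog-b.example", datetime(2004, 8, 1, 12, 0), "origin.example"),
    Instance("blog-c.example", datetime(2004, 8, 1, 15, 0), "origin.example"),
    Instance("blog-d.example", datetime(2004, 8, 2, 8, 0), "blog-b.example"),
]
print(pickups_per_blog(sample).most_common(1))  # the single biggest vector
print(average_pickups(sample))
```

The same per-blog counts could be grouped by software or location to look for the major spreading points, assuming those fields were added to the record.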

Are there any demographic patterns to the spread of the meme, for example by gender, number of years blogging, particular software, or geography? Are there any patterns to the space-time distribution of the results? Can we classify the blogs that participate into different categories, for example by type of audience reached, or by the number of downstream readers that result from them? How does this meme move through time? For example, can we track and visualize the momentum of this meme as it moves? (See my earlier article on measuring meme momentum for some ideas on how to do this.)

The dataset generated by this experiment could help us explore these questions further, if enough people participate. This is just a first step, however. I think similar techniques could be applied to tracking any story, conducting a survey or opinion poll, or even marketing using memes.

Another aspect of this that interests me is the concept of using a GUID to represent a meme so that it can easily be tracked wherever it occurs. Imagine, for example, doing a survey in this manner. Each blog is responsible for housing its own results, and each result carries the GUID for the survey. The results can later be located and aggregated via search engines simply by searching for the GUID. In a sense, using a GUID in this manner creates a decentralized dataset.
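
To make the survey idea concrete, here is a minimal Python sketch of the aggregation step. It assumes the pages have already been fetched (for example, from search results for the GUID) and that each blog writes its answer as `answer=<value>` next to the GUID. The GUID value, the answer format, and the URLs are all made up for illustration.

```python
import re
from collections import Counter

# Made-up GUID standing in for a real survey identifier.
SURVEY_GUID = "0f3c9a52-7d41-4e8b-9c6a-2b1d5e7f8a90"

# Hypothetical answer format: "answer=<word>" somewhere in the post.
ANSWER_RE = re.compile(r"answer\s*=\s*(\w+)", re.IGNORECASE)


def aggregate(pages, guid):
    """Tally answers from (url, text) pages that carry the survey GUID."""
    tally = Counter()
    for url, text in pages:
        if guid not in text:
            continue  # page is not part of this survey
        match = ANSWER_RE.search(text)
        if match:
            tally[match.group(1).lower()] += 1
    return tally


# Stand-in for pages a search for the GUID might return.
pages = [
    ("http://blog-a.example/post", f"Survey {SURVEY_GUID} answer=yes"),
    ("http://blog-b.example/post", f"My reply to {SURVEY_GUID}: answer = No"),
    ("http://blog-c.example/post", f"{SURVEY_GUID} answer=YES"),
    ("http://blog-d.example/post", "Unrelated post with answer=yes"),
]
print(aggregate(pages, SURVEY_GUID))
```

Note that this exact-match check only finds pages that carry the full, unbroken GUID, so a post that splits or shortens the GUID would be missed.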

This reminds me in some ways of my friend Paul Ford’s Future History of the Semantic Web, which has suddenly resurfaced on Slashdot today. It’s a good read, by the way, although Google is not going to beat Amazon or eBay to the Semantic Web. From what I hear, Google is philosophically opposed to the Semantic Web: they prefer more traditional information retrieval techniques, which are primarily statistical rather than semantic.

11 Responses to Thoughts on Uses of the Meme Propagation Dataset

  1. Very interesting experiment. I wish you would have chosen a shorter Google-ID, as several people already broke the current one up into several single ones. This will make it harder to search for the meme. Still it should be possible by using some sort of unique string within the text to search — “The dataset generated by this experiment could help us” should work.

  2. Nova Spivack says:

    Yeah it probably should have been a shorter GUID — but even searching for substrings within the GUID will still work, so even if people shorten it I think the results will be found (assuming Google searches for substrings, which if I recall correctly, they do).

  3. Tracking Meme Propogation

    Nova Spivack’s trying something interesting, a project to track the spread of a blog post as it travels through the Blogosphere. There’s three posts on it, here here & here. As requested, here’s the original post in full, with my…

  4. Tracking Meme Propogation

    Nova Spivack’s trying something interesting, a project to track the spread of a blog post as it travels through the Blogosphere. There’s three posts on it, here here & here. As requested, here’s the original post in full, with my…

  5. Tracking Meme Propogation

    Nova Spivack’s trying something interesting, a project to track the spread of a blog post as it travels through the Blogosphere. There’s three posts on it, here here & here. As requested, here’s the original post (after the break) in…

  6. Tracking Meme Propogation

    Nova Spivack’s trying something interesting, a project to track the spread of a blog post as it travels through the Blogosphere. There’s three posts on it, here here & here. As requested, here’s the original post (after the break) in…

  7. Nope, Google doesn’t search for substrings, as I’ve just answered here:
    http://blog.outer-court.com/forum/941.html

  8. Nova Spivack says:

    Turns out that Google is indexing the *full* GUID, so that’s good.

  9. Richard BF says:

    One thing you probably should have included is the date and time that each person read or discovered the meme. For me this is about 12 hours before I finally posted about it. Typically I read a lot, then much later at night I write up my dailies. I’m guessing that for a lot of people who blog once or less per day, this is their typical behaviour. Not sure how useful the stat would be though.

  10. Nova Spivack says:

    Good idea. Hadn’t occurred to me. My own blogging pattern is different — I usually read something and then blog it immediately or I don’t blog it at all.

  11. Czarism.com says:

    Media Mammon – A Stock Market for Ideas

    While browsing around Google for additional resources on memes I stumbled upon Minding the Planet (Nova Spivack’s journal of unusual news and ideas). I was unable to continue reading this blog for more than one minute when I