Evri Ties the Knot with Twine — Twine CEO Comments and Analysis

Today I am announcing that my company, Radar Networks, and its flagship product, Twine, have been acquired by Evri. TechCrunch broke the story here.

This acquisition consolidates two leading providers of semantic discovery and search. It is also the culmination of a long and challenging venture to pioneer the adoption of the consumer Semantic Web.

As the CEO and founder of Radar Networks and Twine.com, it is difficult to describe what it feels like to have reached this milestone during what has been a tumultuous period of global recession. I am very proud of my loyal and dedicated team and the incredible work and accomplishments we have achieved together, and I am grateful for the unflagging support of our investors and the huge community of Twine users and supporters.

Selling Twine.com was not something we had planned on doing at this time, but given the economy and the fact that Twine.com is a long-term project that will require significant ongoing investment and work to reach our goals, it is the best decision for the business and our shareholders.

While we received several offers for the company, and were in discussions about M&A with multiple industry leading companies in media, search and social software, we eventually selected Evri.

The Twine team is joining Evri to continue our work there. The Evri team has assured me that Twine.com’s data and users are safe and sound and will be transitioned into the Evri.com service over time, in a manner that protects privacy and data, and is minimally disruptive. I believe they will handle this with care and respect for the Twine community.

It is always an emotional experience to sell a company. Building Twine.com has been a long, intense, challenging, rewarding, and all-consuming effort. There were incredible high points and some very deep lows along the way. But most of all, it has been an adventure I will never forget. I was fortunate to help pioneer a major new technology — the Semantic Web — with an amazing team, including many good friends. Bringing something as big, as ambitious, and as risky as Twine.com to market was exhilarating.

Twine has been one of the great learning experiences of my life. I am profoundly grateful to everyone I’ve worked with, and especially to those who supported us financially and personally with their moral support, ideas and advocacy.

I am also grateful to unsung heroes behind the project — the families of all of us who worked on it, who never failed to be supportive as we worked days, nights, weekends and vacations to bring Twine to market.

What I’m Doing Next

I will advise Evri through the transition, but will not be working full-time there. Instead, I will be turning my primary focus to several new projects, including some exciting new ventures:

  • Live Matrix, a new venture focusing on making the live Web more navigable. Live Matrix is led by Sanjay Reddy (CEO of Live Matrix; formerly SVP of Corp Dev for Gemstar TV Guide). Live Matrix is going to give the Web a new dimension: time. More news about this soon.
  • Klout, the leading provider of social analytics about influencers on Twitter and Facebook (which I was the first angel investor in, and which I now advise). Klout is a really hot company and it’s growing fast.
  • I’m experimenting with a new way to grow ventures. It’s part incubator, part fund, part production company. I call it a Venture Production Studio. Through this initiative my partners and I are planning to produce a number of original startups, and selected outside startups as well. There is a huge gap in the early-stage arena, and to fill this we need to modify the economics and model of early stage venture investing.
  • I’m looking forward to working more on my non-profit interests, particularly those related to supporting democracy and human rights around the world, and one of my particular interests, Tibetan cultural preservation.
  • And last but not least, I’m getting married later this month, which may turn out to be my best project of all.

If you want to keep up with what I am thinking about and working on, you should follow me on Twitter at @novaspivack, and also keep up with my blog here at novaspivack.com and my mailing list (accessible in the upper right hand corner of this page).

The Story Behind the Story

In making this transition, it seems appropriate to tell the Twine.com story. This will provide some insight into how we got here, including some of our triumphs, our mistakes, and some of the difficulties we faced along the way. Hopefully this will shed some light on the story behind the story, and may even be useful to other entrepreneurs out there in what is perhaps one of the most difficult venture capital and startup environments in history.

(Note: You may also be interested in viewing this presentation, “A Yarn About Twine” which covers the full history of the project with lots of pictures of various iterations of our work from the early semantic desktop app to Twine, to T2.)

The Early Years of the Project

The ideas that led to Twine were born in the 1990s from my work as a co-founder of EarthWeb (which today continues as Dice.com), where, among many other things, we prototyped a number of new knowledge-sharing and social networking tools, alongside our primary work developing large Web portals and communities for customers, and eventually our own communities for IT professionals. My time with EarthWeb really helped me to understand the challenges and potential of sharing and growing knowledge socially on the Web. I became passionately interested in finding new ways to network people’s minds together, to solve information overload, and to enable the evolution of a future “global brain.”

After EarthWeb’s IPO I worked with SRI and Sarnoff to build their business incubator, nVention, and then eventually started my own incubator, Lucid Ventures, through which I co-founded Radar Networks with Kristin Thorisson, from the MIT Media Lab, and Jim Wissner (the continuing Chief Architect of Twine) in 2003. Our first implementation was a peer-to-peer Java-based knowledge sharing app called “Personal Radar.”

Personal Radar was a very cool app — it organized all the information on the desktop in a single semantic information space that was like an “iTunes for information” and then made it easy to share and annotate knowledge with others in a collaborative manner. There were some similarities to apps like Ray Ozzie’s Groove and the MIT Haystack project, but Personal Radar was built for consumers, entirely with Java, RDF, OWL and the standards of the emerging Semantic Web. You can see screenshots of this early work in this slideshow, here.

But due to the collapse of the first Internet bubble there was simply no venture funding available at the time and so instead, we ended up working as subcontractors on the DARPA CALO project at SRI. This kept our research alive through the downturn and also introduced us to a true Who’s Who of AI and Semantic Web gurus who worked on the CALO project. We eventually helped SRI build OpenIRIS, a personal semantic desktop application, which had many similarities to Personal Radar. All of our work for CALO was open-sourced under the LGPL license.

Becoming a Venture-Funded Company

Deborah L. McGuinness, who was one of the co-designers of the OWL language (the Web Ontology Language, one of the foundations of the Semantic Web standards at the W3C), became one of our science advisers and kindly introduced us to Paul Allen, who invited us to present our work to his team at Vulcan Capital. The rest is history. Paul Allen and Ron Conway led an angel round to seed-fund us and we moved out of consulting to DARPA and began work on developing our own products and services.

Our long-term plan was to create a major online portal powered by the Semantic Web that would provide a new generation of Web-scale semantic search and discovery features to consumers. But for this to happen, first we had to build our own Web-scale commercial semantic applications platform, because there was no platform available at that time that could meet the requirements we had. In the process of building our platform numerous technical challenges had to be overcome.

At the time (the early 2000s) there were few development tools in existence for creating ontologies or semantic applications, and in addition there were no commercial-quality databases capable of delivering high-performance Web-scale storage and retrieval of RDF triples. So we had to develop our own development tools, our own semantic applications framework, and our own federated high-performance semantic datastore.

This turned out to be a nearly endless amount of work. However we were fortunate to have Jim Wissner as our lead technical architect and chief scientist. Under his guidance we went through several iterations and numerous technical breakthroughs, eventually developing the most powerful and developer-friendly semantic applications platform in the world. This led to the development of a portfolio of intellectual property that provides fundamental DNA for the Semantic Web.

During this process we raised a Series A round led by Vulcan Capital and Leapfrog Ventures, and our team was joined by interface designer and product management expert, Chris Jones (now leading strategy at HotStudio, a boutique design and user-experience firm in San Francisco). Under Chris’ guidance we developed Twine.com, our first application built on our semantic platform.

The mission of Twine.com was to help people keep up with their interests more efficiently, using the Semantic Web. The basic idea was that you could add content to Twine (most commonly by bookmarking it into the site, but also by authoring directly into it), and then Twine would use natural language processing and analysis, statistical methods, and graph and social network analysis, to automatically store, organize, link and semantically tag the content into various topical areas.

These topics could easily be followed by other users who wanted to keep up with specific types of content or interests. So basically you could author or add stuff to Twine and it would then do the work of making sense of it, organizing it, and helping you share it with others who were interested. The data was stored semantically and connected to ontologies, so that it could then be searched and reused in new ways.
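The storage model described above can be sketched in a highly simplified form: content becomes subject-predicate-object triples that can be queried by pattern, which is what makes the data reusable "in new ways." This is only an illustrative sketch, not Twine's actual schema — all identifiers and tags here are invented:

```python
# Minimal sketch of RDF-style triple storage, loosely in the spirit of
# what Twine did at far greater scale. All names and tags below are
# hypothetical illustrations, not Twine's real data model.

triples = set()

def add(subject, predicate, obj):
    """Store one (subject, predicate, object) statement."""
    triples.add((subject, predicate, obj))

def match(s=None, p=None, o=None):
    """Return all triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# A bookmarked article, automatically tagged and filed into a topic:
add("article:42", "rdf:type", "Bookmark")
add("article:42", "hasTag", "semantic-web")
add("article:42", "inTwine", "twine:ai-research")

# Because everything is triples, the data can be queried in ways the
# original authoring UI never anticipated:
tagged = match(p="hasTag", o="semantic-web")
```

The point of the sketch is the last line: once content is decomposed into statements, any slice of it (by tag, by type, by topic) is just another pattern query.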

With the help of Lew Tucker, Sonja Erickson and Candice Nobles, as well as an amazing team of engineers, product managers, systems admins and designers, Twine was announced at the Web 2.0 Summit in October of 2007 and went into full public beta in Q1 of 2008. Twine was well-received by the press and early-adopter users.

Soon after our initial beta launch we raised a Series B round, led by Vulcan Capital and Velocity Interactive Group (now named Fuse Capital), as well as DFJ. This gave us the capital to begin to grow Twine.com rapidly to become the major online destination we envisioned.

In the course of this work we made a number of additional technical breakthroughs, resulting in more than 20 patent filings in total, including several fundamental patents related to semantic data management, semantic portals, semantic social networking, semantic recommendations, semantic advertising, and semantic search.

Four of those patents have been granted so far and the rest are still pending — and perhaps the most interesting of these patents are related to our most recent work on “T2” and are not yet visible.

At the time of beta launch and for almost six months after, Twine was still very much a work in progress. Fortunately our users and the press were fairly forgiving as we worked through evolving the GUI and feature set from what was initially just slightly better than an alpha site to the highly refined and graphical UI we have today.

During these early days of Twine.com we were fortunate to have a devoted user-base and this became a thriving community of power-users who really helped us to refine the product and develop great content within it.

Rapid Growth, and Scaling Challenges

As Twine grew the community went through many changes and some growing pains, and eventually crossed the chasm to a more mainstream user-base. Within less than a year from launch the site grew to around 3 million monthly visitors, 300,000 registered users, 25,000 “twines” about various interests, and almost 5 million pieces of user-contributed content. It was on its way to becoming the largest semantic web on the Web.

By all accounts Twine was looking like a potential “hit.” During this period the company staff increased to more than 40 people (inclusive of contractors and offshore teams) and our monthly burn rate increased to aggressive levels of spending to keep up with growth.

Despite this growth and spending we still could not keep up with demand for new features and at times we experienced major scaling and performance challenges. We had always planned for several more iterations of our backend architecture to facilitate scaling the system. But now we could see the writing on the wall — we had to begin to develop a more powerful, more scalable backend for Twine, much sooner than we had expected we would need to.

This required us to increase our engineering spending further in order to simultaneously support the live version of Twine and its very substantial backend, and run a parallel development team working on the next generation of the backend and the next version of Twine on top of it. Running multiple development teams instead of one was a challenging and costly endeavor. The engineering team was stretched thin and we were all putting in 12 to 15 hour days every day.

Breakthrough to “T2”

We began to work in earnest on a new iteration of our back-end architecture and application framework — one that could scale fast enough to keep up with our unexpectedly fast growth rate and the increasing demands on our servers that this was causing.

This initiative yielded unexpected fruit. Not only did we solve our scaling problems, but we were able to do so to such a degree that entirely new possibilities were opened up to us — ones that had previously been out of reach for purely technical reasons. In particular, semantic search.

Semantic search had always been a long-term goal of ours; however, in the first version of Twine (the one that is currently online) search was our weakest feature area, due to the challenge of scaling a semantic datastore to handle hundreds of billions of triples. But our user studies revealed that it was in fact the feature our users most wanted us to develop – search slowly became the dominant paradigm within Twine, especially when the content in our system reached critical mass.

Our new architecture initiative solved the semantic search problem to such a degree that we realized that not only could we scale Twine.com, we could scale it to eventually become a semantic search engine for the entire Web.

Instead of relying on users to crowdsource only a subset of the best content into our index, we could crawl large portions of the Web automatically and ingest millions and millions of Web pages, process them, and make them semantically searchable — using a true W3C Semantic Web compliant backend. (Note: Why did we even attempt to do this? We believed strongly in supporting open-standards for the Semantic Web, despite the fact that they posed major technical challenges and required tools that did not exist yet, because they promised to enable semantic application and data interoperability, one of the main potential benefits of the Semantic Web).

Based on our newfound ability to do Web-scale semantic search, we began planning the next version of Twine — Twine 2.0 (“T2”), with the help of Bob Morgan, Mark Erickson, Sasi Reddy, and a team of great designers.

The new T2 plan would merge new faceted semantic search features with the existing social, personalization and knowledge management features of Twine 1.0. It would be the best of both worlds: semantic search + social search. We began working intensively on developing T2, along with new hosted developer tools that would make it easy for any webmaster to add their site into our semantic index. We were certain that with T2 we had finally “cracked the code” to the Semantic Web — we had a product plan and a strategy that could really bring the Semantic Web to everyone on the Web. It elegantly solved the key challenges to adoption and on a technical level, using SOLR instead of a giant triplestore, we were able to scale to unprecedented levels. It was an exciting plan and everyone on the team was confident in the direction.
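The "semantic search + social search" combination can be illustrated with a toy faceted-search sketch: semantic facets (topic, content type) narrow the candidate set, then a keyword score ranks what remains. The document fields and facet names below are invented for illustration — T2 itself was built on SOLR, which does this at real scale:

```python
# Toy faceted search: filter a document set by semantic facets, then
# rank by naive keyword overlap. All field names are hypothetical.

DOCS = [
    {"id": 1, "text": "gluten free pasta recipes", "topic": "food",  "type": "recipe"},
    {"id": 2, "text": "indie game design patterns", "topic": "games", "type": "article"},
    {"id": 3, "text": "quick pasta dinner ideas",   "topic": "food",  "type": "article"},
]

def faceted_search(query, **facets):
    """Return docs matching every facet, ranked by query-term overlap."""
    terms = set(query.lower().split())
    # Facet filtering: only documents whose fields match every requested facet.
    hits = [d for d in DOCS
            if all(d.get(k) == v for k, v in facets.items())]
    # Naive relevance: count shared terms between query and document text.
    scored = [(len(terms & set(d["text"].split())), d) for d in hits]
    return [d for score, d in sorted(scored, key=lambda x: -x[0]) if score > 0]

results = faceted_search("pasta recipes", topic="food")
```

In a real engine the facet values would come from the automatic semantic tagging described earlier, so users can drill down by meaning (topic, entity type) rather than by keywords alone.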

To see screenshots that demo T2 and our hosted development tools click here.

The Global Recession

Our growth was fast, and so was our spending, but at the time this seemed logical because the future looked bright and we were in a race to keep ahead of our own curve. We were quickly nearing a point where we would soon need to raise another round of funding to sustain our pace, but we were confident that with our growth trends steadily increasing and our exciting plans for T2, the necessary funding would be forthcoming at favorable valuations.

We were wrong.

The global economy crashed unexpectedly, throwing a major curveball in our path. We had not planned on that happening and it certainly was inconvenient to say the least.

The recession not only hit Wall Street, it hit Silicon Valley. Venture capital funding dried up almost overnight. VC funds sent alarming letters to their portfolio companies warning of dire financial turmoil ahead. Many startups were forced to close their doors, while others made drastic, sudden layoffs, for better or for worse. We too made spending cuts, but we were limited in our ability to slash expenses until the new T2 platform could be completed. Once that was done, we would be able to move Twine to a much more scalable and less costly architecture, and we would no longer need parallel development teams. But until that happened, we still had to maintain a sizeable infrastructure and engineering effort.

As the recession dragged on, and the clock kept ticking down, the urgency of raising a C round increased, and finally we were faced with a painful decision. We had to drastically reduce our spending in order to wait out the recession and live to raise more funding in the future.

Unfortunately, the only way to accomplish such a drastic reduction in spending was to lay off almost 30% of our staff and cut our monthly spending by almost 40%. But by doing that we could not possibly continue to work on as many fronts as we had been doing. The result was that we had to stop most work on Twine 1.0 (the version that was currently online) and focus all our remaining development cycles and spending on the team needed to continue our work on T2.

This was extremely painful for me as the CEO, and for everyone on our team. But it was necessary for the survival of the business and it did buy us valuable time. However, it also slowed us down tremendously. The irony of this decision was that while it reduced our burn rate, it also slowed us down and reduced productivity to such a degree that in the end it may have cost us the same amount of money anyway.

While much of our traffic had been organic and direct, we also had a number of marketing partnerships and PR initiatives that we had to terminate. In addition, as part of this layoff we lost our amazing and talented marketing team, as well as half our product management team, our entire design team, our entire marketing and PR budget, and much of our support and community management team. This made it difficult to continue to promote the site, launch new features, fix bugs, or to support our existing online community. And as a result the service began to decline and usage declined along with it.

To make matters worse, at around the same time as we were making these drastic cuts, Google decided to de-index Twine. To this day we still are not sure why they decided to do this – it could have been that Google suddenly decided we were a competitive search engine, or it could be that their algorithm changed, or it could be that there was some error in our HTML markup that may have caused an indexing problem. We had literally millions of pages of topical user-generated content – but all of a sudden we saw drastic reductions in the number of pages being indexed, and in the ranking of those pages. This caused a very significant drop in organic traffic. With the little team we had remaining, we spent time petitioning Google and trying to get reinstated. But we never managed to return to our former levels of index prominence.

Eventually, with all these obstacles, and the fact that we had to focus our remaining budget on T2, we put Twine.com on auto-pilot and let the traffic fall off, believing that we would have the opportunity to win it back once we launched the next version. While painful to watch, this reduction in traffic and user activity at least had the benefit of reducing the pressure on the engineering team to scale the system and support it under load, giving us time to focus all our energy on getting T2 finished and on raising more funds.

But the recession dragged on and on and on, without end. VCs remained extremely conservative and risk-averse. Meanwhile, we focused our internal work on growing a large semantic index of the Web in T2, vertical by vertical, starting with food, then games, and then many other topics (technology, health, sports, etc.). We were quite confident that if we could bring T2 to market it would be a turning point for Web search, and funding would follow.

Meanwhile, we met with VCs in earnest. But nobody was able to invest in anything due to the recession. Furthermore, we were a pre-revenue company working on a risky advanced technology, and VC partnerships were far too terrified by the recession to make such a bet. We encountered the dreaded “wait and see” response.

The only way we could get the funding we needed to continue was to launch T2, grow it, and generate revenues from it, but the only way we could reach those milestones was to launch T2 in the first place: a classic catch-22 situation.

We took comfort in the fact that we were not alone in this predicament. Almost every tech company at our stage was facing similar funding challenges. However, we were determined to find a solution despite the obstacles in our path.

Selling the Business

Had the recession not happened, I believe we would have raised a strong C round based on the momentum of the product and our technical achievements. Unfortunately, we, like many other early-stage technology ventures, found ourselves in the worst capital crunch in decades.

We eventually came to the conclusion that there was no viable path for the company but to use the runway we had left to sell to another entity that was more able to fund the ongoing development and marketing necessary to monetize T2.

While selling the company had always been a desirable exit strategy, we had hoped to do it after the launch and growth of T2. However, we could not afford to wait any longer. With some short-term bridge funding from our existing investors, we worked with Growth Point Technology Partners to sell the company.

We met with a number of the leading Internet and media companies and received numerous offers. In the end, the best and most strategically compatible offer came from Evri, one of our sibling companies in Vulcan Capital’s portfolio. While we had very compelling offers from larger and more established companies, joining Evri was simply the best option.

And so we find ourselves at the present day. We got the best deal possible for our shareholders given the circumstances. Twine.com, my team, our users and their data are safe and sound. As an entrepreneur and CEO it is, as one advisor put it, of the utmost importance to always keep the company moving forward. I feel that I did manage to achieve this under extremely difficult economic circumstances. And for that I am grateful.

Outlook for the Semantic Web

I’ve been one of the most outspoken advocates of the Semantic Web during my tenure at Twine. So what about my outlook for the Semantic Web now that Twine is being sold and I’m starting to do other things? Do I still believe in the promise of the Semantic Web vision? Where is it going? These are questions I expect to be asked, so I will attempt to answer them here.

I continue to believe in the promise of semantic technologies, and in particular the approach of the W3C semantic web standards (RDF, OWL, SPARQL). That said, having tried to bring them to market as hard as anyone ever has, I can truly say they present significant challenges both to developers and to end-users. These challenges all stem from one underlying problem: data storage.

Existing SQL databases are not optimal for large-scale, high-performance semantic data storage and retrieval. Yet triplestores are still not ready for prime-time. New graph databases and column stores show a lot of promise, but they are still only beginning to emerge. This situation makes it incredibly difficult to bring Web-scale semantic applications to market cost-effectively.

Enterprise semantic applications are much more feasible today, however, because existing and emerging databases and semantic storage solutions do scale to enterprise levels. But for enormous consumer-grade Web services, there are still challenges. This is the single greatest technical obstacle that Twine faced, and it cost us a large amount of our venture funding to surmount. We did finally find a solution with our T2 architecture, but it is still not a general solution for all types of applications.

I have recently seen some new graph data storage products that may provide the levels of scale and performance needed, but pricing has not been determined yet. In short, storage and retrieval of semantic graph datasets is a big unsolved challenge that is holding back the entire industry. We need federated database systems that can handle hundreds of billions to trillions of triples under high load conditions, in the cloud, on commodity hardware and open source software. Only then will it be affordable to make semantic applications and services at Web-scale.
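One standard technique behind the high-performance triplestores described here is to maintain several permutation indexes over the same triples, so that any query pattern is a direct lookup rather than a scan over the whole dataset. The sketch below is a minimal in-memory illustration of that idea only — real triplestores add more permutations, compression, federation, and sharding, and this is not Twine's actual datastore:

```python
from collections import defaultdict

# Three permutation indexes (SPO, POS, OSP) over the same triples.
spo = defaultdict(lambda: defaultdict(set))  # subject -> predicate -> objects
pos = defaultdict(lambda: defaultdict(set))  # predicate -> object -> subjects
osp = defaultdict(lambda: defaultdict(set))  # object -> subject -> predicates

def add(s, p, o):
    """Insert one triple into all three indexes."""
    spo[s][p].add(o)
    pos[p][o].add(s)
    osp[o][s].add(p)

add("article:42", "hasTag", "food")
add("article:99", "hasTag", "food")
add("article:42", "hasTag", "recipes")

# "Which subjects carry the tag 'food'?" is two dictionary lookups,
# independent of the total number of triples stored:
subjects = pos["hasTag"]["food"]        # {"article:42", "article:99"}
objects  = spo["article:42"]["hasTag"]  # {"food", "recipes"}
```

The hard part at Web scale is not the lookup structure itself but keeping many such indexes consistent, distributed, and affordable across hundreds of billions of triples — which is exactly the gap in the database market described above.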

I believe that semantic metadata is essential for the growth and evolution of the Web. It is one of the only ways we can hope to dig out from the increasing problem of information overload. It is one of the only ways to make search, discovery, and collaboration smart enough to be significantly better than they are today.

But the notion that everyone will learn and adopt standards for creating this metadata themselves is flawed in my opinion. They won’t. Instead, we must focus on solutions (like Twine and Evri) that make this metadata automatically by analyzing content semantically. I believe this is the most practical approach to bringing the value of semantic search and discovery to consumers, as well as Webmasters and content providers around the Web.

The major search engines are all working on various forms of semantic search, but to my knowledge none of them are fully supporting the W3C standards for the Semantic Web. In some cases this is because they are attempting to co-opt the standards for their own competitive advantage, and in other cases it is because it is simply easier not to use them. But in taking the easier path, they are giving up the long-term potential gains of a truly open and interoperable semantic ecosystem.

I do believe that whoever enables this open semantic ecosystem first will win in the end — because it will have greater and faster network effects than any closed competing system. That is the promise and beauty of open standards: everyone can feel safe using them since no single commercial interest controls them. At least that’s the vision I see for the Semantic Web.

As far as where the Semantic Web will add the most value in years to come, I think we will see it appear in some new areas. First and foremost is e-commerce, an area that is ripe with structured data that needs to be normalized, integrated and made more searchable. This is perhaps the most potentially profitable and immediately useful application of semantic technologies. It’s also one where there has been very little innovation. But imagine if eBay or Amazon or Salesforce.com provided open-standards-compliant semantic metadata and semantic search across all their data.

Another important opportunity is search and SEO — these are the areas that Twine’s T2 project focused on, by enabling webmasters to easily and semi-automatically add semantic descriptions of their content into search indexes, without forcing them to learn RDF and OWL and do it manually. This would create a better SEO ecosystem and would be beneficial not only to content providers and search engines, but also to advertisers. This is the approach that I believe the major search engines should take.

Another area where semantics could add a lot of value is social media — by providing semantic descriptions of user profiles and user profile data, as well as social relationships on the Web, it would be possible to integrate and search across all social networks in a unified manner.

Finally, another area where semantics will be beneficial is to enable easier integration of datasets and applications around the Web — currently every database is a separate island, but by using the Semantic Web appropriately data can be freed from databases and easily reused, remixed and repurposed by other applications. I look forward to the promise of a truly open data layer on the Web, when the Web becomes essentially one big open database that all applications can use.

Lessons Learned and Advice for Startups

While the outcome for Twine was decent under the circumstances, and was certainly far better than the alternative of simply running out of money, I do wonder how it could have been different. I ask myself what I learned and what I would do differently if I had the chance or could go back in time.

I think the most important lessons I learned, and the advice that I would give to other entrepreneurs can be summarized with a few key points:

  1. Raise as little venture capital as possible. Raise less than you need, not more than you need. Don’t raise extra capital just because it is available. Later on it will make it harder to raise further capital when you really need it. If you can avoid raising venture capital at all, do so. It comes with many strings attached. Angel funding is far preferable. But best of all, self-fund from revenues as early as you can, if possible. If you must raise venture capital, raise as little as you can get by on — even if they offer you more. But make sure you have at least enough to reach your next funding round — and assume that it will take twice as long to close as you think. It is no easy task to get a startup funded and launched in this economy — the odds are not in your favor — so play defense, not offense, until conditions improve (years from now).
  2. Build for lower exits. Design your business model and capital strategy so that you can deliver a good ROI to your investors at an exit under $30mm. Exit prices are going lower, not higher. There is less competition and fewer buyers and they know it’s a buyer’s market. So make sure your capital strategy gives the option to sell in lower price ranges. If you raise too much you create a situation where you either have to sell at a loss, or raise even more funding which only makes the exit goal that much harder to reach.
  3. Spend less. Spend less than you want to, less than you need to, and less than you can. When you are flush with capital it is tempting to spend it and grow aggressively, but don’t. Assume the market will crash — downturns are more frequent and last longer than they used to. Expect that. Plan on it. And make sure you keep enough capital in reserve to spend 9 to 12 months raising your next round, because that is how long it takes in this economy to get a round done.
  4. Don’t rely on user-traction to raise funding. You cannot assume that user traction is enough to get your next round done. Even millions of users and exponential growth are not enough. VCs and their investment committees want to see revenues — in particular, at least breakeven revenues. A large service that isn’t bringing in revenues yet is not a business, it’s an experiment. Perhaps it’s one that someone will buy, but if you can’t find a buyer then what? Don’t assume that VCs will fund it. They won’t. Venture capital investing has changed dramatically — early stage and late stage deals are the only deals that are getting real funding. Mid-stage companies are simply left to die, unless they are profitable or will soon be profitable.
  5. Don’t be afraid to downsize when you have to. It sucks to fire people, but it’s sometimes simply necessary. One of the worst mistakes is to not fire people who should be fired, or to not do layoffs when the business needs require it. You lose credibility as a leader if you don’t act decisively. Often friendships and personal loyalties prevent or delay leaders from firing people that really should be fired. While friendship and loyalty are noble they unfortunately are not always the best thing for the business. It’s better for everyone to take their medicine sooner rather than later. Your team knows who should be fired. Your team knows when layoffs are needed. Ask them. Then do it. If you don’t feel comfortable firing people, or you can’t do it, or you don’t do it when you need to, don’t be the CEO.
  6. Develop cheaply, but still pay market salaries. Use offshore development resources, or locate your engineering team outside of the main “tech hub” cities. It is simply too expensive to compete with large public and private tech companies to pay top dollar for engineering talent in places like San Francisco and Silicon Valley. The cost of top-level engineers in major cities is too high to be affordable, and the competition to hire and retain them is intense. If you can get engineers to work for free or for half price then perhaps you can do it, but I believe you get what you pay for. So rather than skimp on salaries, pay people market salaries, but do it where market salaries are more affordable.
  7. Only innovate on one frontier at a time. For example, either innovate by making a new platform, or a new application, or a new business model. Don’t do all of these at once; it’s just too hard. If you want to make a new platform, just focus on that; don’t try to make an application too. If you want to make a new application, use an existing platform rather than also building a platform for it. If you want to make a new business model, use an existing application and platform — they can be ones you have built in the past, but don’t attempt to do it all at once. If you must do all three, do them sequentially, and make sure you can hit cash flow breakeven at each stage, with each one. Otherwise you’re at risk in this economy.

I hope that this advice is of some use to entrepreneurs (and VCs) who are reading this. I’ve personally made all these mistakes myself, so I am speaking from experience. Hopefully I can spare you the trouble of having to learn these lessons the hard way.

What We Did Well

I’ve spent considerable time in this article focusing on what didn’t go according to plan, and the mistakes we’ve learned from. But it’s also important to point out what we did right. I’m proud of the fact that Twine accomplished many milestones, including:

  • Pioneering the Semantic Web and leading the charge to make it a mainstream topic of conversation.
  • Creating the most powerful, developer-friendly platform for the Semantic Web.
  • Successfully completing our work on CALO, the largest Semantic Web project in the US.
  • Launching the first mainstream consumer application of the Semantic Web.
  • Having a very successful launch, covered by hundreds of articles.
  • Gaining users extremely rapidly — faster than Twitter did in its early years.
  • Hiring and retaining an incredible team of industry veterans.
  • Raising nearly $24mm of venture capital over 2 rounds, because our plan was so promising.
  • Developing more than 20 patents, several of which are fundamentally important for the Semantic Web field.
  • Surviving two major economic bubbles and the downturns that followed.
  • Innovating and most of all, adapting to change rapidly.
  • Breaking through to T2 — a truly awesome technological innovation for Web-scale semantic search.
  • Selling the company in one of the most difficult economic environments in history.

I am proud of what we accomplished with Twine. It’s been “a long strange trip” but one that has been full of excitement and accomplishments to remember.


If you’ve actually read this far, thank you. This is a big article, but after all, Twine is a big project – one that lasted nearly 5 years (or 9 years if you include our original research phase). I’m still bullish on the Semantic Web, and genuinely very enthusiastic about what Evri will do with Twine.com going forward.

Again I want to thank the hundreds of people who have helped make Twine possible over the years – but in particular the members of our technical and management team who went far beyond the call of duty to get us to the deal we have reached with Evri.

While this is certainly the end of an era, I believe that this story has only just begun. The first chapters are complete and now we are moving into a new era. Much work remains to be done and there are certainly still challenges and unknowns, but progress continues and the Semantic Web is here to stay.

Eliminating the Need for Search – Help Engines

We are so focused on how to improve present-day search engines. But that is a kind of mental myopia. In fact, a more interesting and fruitful question is why do people search at all? What are they trying to accomplish? And is there a better way to help them accomplish that than search?

Instead of finding more ways to get people to search, or ways to make existing search experiences better, I am starting to think about how to reduce or eliminate the need to search — by replacing it with something better.

People don’t search because they like to. They search because there is something else they are trying to accomplish. So search is really just an inconvenience — a means to an end that we have to struggle through in order to get to what we actually want to accomplish. Search is “in the way” between intention and action. It’s an intermediary stepping stone. And perhaps there’s a better way to get to where we want to go than searching.

Searching is a boring and menial activity. Think about it. We have to cleverly invent and try pseudo-natural-language queries that don’t really express what we mean. We try many different queries until we get results that approximate what we’re looking for. We click on a bunch of results and check them out. Then we search some more. And then some more clicking. Then more searching. And we never know whether we’ve been comprehensive, or have even entered the best query, or looked at all the things we should have looked at to be thorough. It’s extremely hit or miss. And it takes up a lot of time and energy. There must be a better way! And there is.

Instead of making search more bloated and more of a focus, the goal should really be to get search out of the way: to minimize the need to search, and to make any search that is necessary as productive as possible. The goal should be to get consumers to what they really want with the least amount of searching and the least amount of effort, with the greatest amount of confidence that the results are accurate and comprehensive. To satisfy these constraints one must NOT simply build a slightly better search engine!

Instead, I think there’s something else we need to be building entirely. I don’t know what to call it yet. It’s not a search engine. So what is it?

Bing’s term “decision engine” comes pretty close. But what they’ve actually released so far still looks and feels a lot like a search engine. At least it’s pushing the envelope beyond what Google has done with search, and this is good for competition and for consumers. Bing is heading in the right direction by leveraging natural language, semantics, and structured data. But there’s still a long way to go to really move the needle significantly beyond Google and win dominant market share.

For the last decade the search wars have been fought in battles around index size, keyword search relevancy, and ad targeting — but I think the new battle is going to be fought around semantic understanding, intelligent answers, personal assistance, and commerce affiliate fees. What’s coming next after search engines are things that function more like assistants and brokers.

Wolfram Alpha is an example of one approach to this trend. The folks at Wolfram Alpha call their system a “computational knowledge engine” because they use a knowledge base to compute and synthesize answers to various questions. It does a lot of the heavy lifting for you, going through various data, computing and comparing, and then synthesizes a concise answer.

There are also other approaches to getting or generating answers for people — for example, by doing what Aardvark does: referring people to experts who can answer their questions or help them. Expert referral, or expertise search, helps reduce the need for networking and makes networking more efficient. It also reduces the need for searching online — instead of searching for an answer, just ask an expert.

There’s also the semantic search approach — perhaps exemplified by my own Twine “T2” project — which basically aims to improve the precision of search by helping you get to the right results faster, with less irrelevant noise. Other consumer-facing semantic search projects of interest are Goby and Powerset (now part of Bing).

Still another approach is that of Siri, which is making an intelligent “task completion assistant” that helps you search for and accomplish things like “book a romantic dinner and a movie tonight.” In some ways Siri is a “do engine” not a “search engine.” Siri uses artificial intelligence to help you do things more productively. This is quite needed and will potentially be quite useful, especially on mobile devices.

All of these approaches and projects are promising. But I think the next frontier — the thing that is beyond search and removes the need for search is still a bit different — it is going to combine elements of all of the above approaches, with something new.

For lack of a better term, I call this a “help engine.” A help engine proactively helps you with various kinds of needs, decisions, tasks, or goals you want to accomplish. And it does this by helping with an increasingly common and vexing problem: choice overload.

The biggest problem is that we have too many choices, and the number of choices keeps increasing exponentially. The Web and globalization have increased the number of choices that are within range for all of us, but the result has been overload. To make a good, well-researched, confident choice now requires a lot of investigation, comparisons, and thinking. It’s just becoming too much work.

For example, choosing a location for an event, or planning a trip itinerary, or choosing what medicine to take, deciding what product to buy, who to hire, what company to work for, what stock to invest in, what website to read about some topic. These kinds of activities require a lot of research, evaluations of choices, comparisons, testing, and thinking. A lot of clicking. And they also happen to be some of the most monetizable activities for search engines. Existing search engines like Google that make money from getting you to click on their pages as much as possible have no financial incentive to solve this problem — if they actually worked so well that consumers clicked less they would make less money.

I think the solution to what’s after search — the “next Google” so to speak — will come from outside the traditional search engine companies. Or at least it will be an upstart project within one of them that surprises everyone and doesn’t come from the main search teams. It’s such a departure from traditional search that it will require some real thinking outside of the box.

I’ve been thinking about this a lot over the last month or two. It’s fascinating. What if there was a better way to help consumers with the activities they are trying to accomplish than search? If it existed it could actually replace search. It’s a Google-sized opportunity, and one which I don’t think Google is going to solve.

Search engines cause choice overload. That wasn’t the goal, but it is what has happened over time due to the growth of the Web and the explosion of choices that are visible, available, and accessible to us via the Web.

What we need now is not a search engine — it’s something that solves the problem created by search engines. For this reason, the next Google probably won’t be Google or a search engine at all.

I’m not advocating for artificial intelligence or anything that tries to replicate human reasoning, human understanding, or human knowledge. I’m actually thinking about something simpler. I think that it’s possible to use computers to provide consumers with extremely good, automated decision support for the kinds of activities they engage in on the Web. Search engines are almost the most primitive form of decision support imaginable. I think we can do a lot better. And we have to.

People use search engines as a form of decision-support, because they don’t have a better alternative. And there are many places where decision support and help are needed: Shopping, travel, health, careers, personal finance, home improvement, and even across entertainment and lifestyle categories.

What if there was a way to provide this kind of personal decision-support — this kind of help — with an entirely different user experience than search engines provide today? I think there is. And I’ve got some specific thoughts about this, but it’s too early to explain them; they’re still forming.

I keep finding myself thinking about this topic, and arriving at big insights in the process. All of the different things I’ve worked on in the past seem to connect to this idea in interesting ways. Perhaps it’s going to be one of the main themes I’ll be working on and thinking about for this coming decade.

Twine "T2" – Latest Demo Screenshots (Internal Alpha)

This is a series of screenshots that demo the latest build of the consumer experience and developer tools for Twine.com’s “T2” semantic search product. This is still in internal alpha — not released to public yet.

The Road to Semantic Search — The Twine.com Story

This is the story of Twine.com — our early research (with never before seen screenshots of our early semantic desktop work), and our evolution from Twine 1.0 towards Twine 2.0 (“T2”) which is focused on semantic search.

The Next Generation of Web Search — Search 3.0

The next generation of Web search is coming sooner than expected. And with it we will see several shifts in the way people search, and the way major search engines provide search functionality to consumers.

Web 1.0, the first decade of the Web (1989 – 1999), was characterized by a distinctly desktop-like search paradigm. The overriding idea was that the Web is a collection of documents, not unlike the folder tree on the desktop, that must be searched and ranked hierarchically. Relevancy was considered to be how closely a document matched a given query string.

Web 2.0, the second decade of the Web (1999 – 2009), ushered in the beginnings of a shift towards social search. In particular, blogging tools, social bookmarking tools, social networks, social media sites, and microblogging services began to organize the Web around people and their relationships. This added the beginnings of a primitive “web of trust” to the search repertoire, enabling search engines to begin to take the social value of content (as evidenced by discussions, ratings, sharing, linking, referrals, etc.) as an additional measurement in the relevancy equation. Those items which were both most relevant on a keyword level, and most relevant in the social graph (closer and/or more popular in the graph), were considered to be more relevant. Thus results could be ranked according to their social value — how many people in the community liked them and their current activity level — as well as by semantic relevancy measures.

In the coming third decade of the Web, Web 3.0 (2009 – 2019), there will be another shift in the search paradigm. This is a shift from the past to the present, and from the social to the personal.

Established search engines like Google rank results primarily by keyword (semantic) relevancy. Social search engines rank results primarily by activity and social value (Digg, Twine 1.0, etc.). But the new search engines of the Web 3.0 era will also take into account two additional factors when determining relevancy: timeliness, and personalization.

Google returns the same results for everyone. But why should that be the case? In fact, when two different people search for the same information, they may want to get very different kinds of results. Someone who is a novice in a field may want beginner-level information to rank higher in the results than someone who is an expert. There may be a desire to emphasize things that are novel over things that have been seen before, or that have happened in the past — the more timely something is the more relevant it may be as well.

These two themes — present and personal — will define the next great search experience.

To accomplish this, we need to make progress on a number of fronts.

First of all, search engines need better ways to understand what content is, without having to do extensive computation. The best solution for this is to utilize metadata and the methods of the emerging semantic web.

Metadata reduces the need for computation in order to determine what content is about — it makes that explicit and machine-understandable. To the extent that machine-understandable metadata is added or generated for the Web, it will become more precisely searchable and productive for searchers.

This applies especially to the area of the real-time Web, where, for example, short “tweets” of content contain very little context to support good natural-language processing. There, a little metadata can go a long way. And of course metadata also makes a dramatic difference in search of the larger, non-real-time Web.
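To make the point concrete, here is a toy sketch (not any real engine's implementation — the documents, predicate names, and triples below are invented for illustration) of how explicit metadata lets a search disambiguate what keywords alone cannot:

```python
# Toy store of short posts, each annotated with explicit metadata triples
# (subject, predicate, object) rather than relying on keywords alone.
docs = {
    "tweet1": {"text": "Great pie at Gino's tonight!",
               "triples": [("tweet1", "topic", "restaurants")]},
    "tweet2": {"text": "Apple fell 2% at the open",
               "triples": [("tweet2", "topic", "stocks")]},
    "tweet3": {"text": "Picked apples at the orchard all day",
               "triples": [("tweet3", "topic", "farming")]},
}

def search_by_topic(topic):
    """Return ids of docs whose metadata declares the topic explicitly --
    no guessing whether 'apple' means the fruit or the company."""
    return [doc_id for doc_id, doc in docs.items()
            if any(p == "topic" and o == topic for _, p, o in doc["triples"])]

print(search_by_topic("stocks"))   # ['tweet2']
```

A keyword search for “apple” would match both the stock tweet and the orchard tweet; the declared topic makes the distinction machine-understandable without any natural-language processing.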

In addition to metadata, search engines need to modify their algorithms to be more personalized. Instead of a “one-size-fits-all” ranking for each query, the ranking may differ for different people depending on their varying interests and search histories.
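As a rough sketch of what such personalization might look like (the weights, topics, and profiles below are invented for illustration, not a description of any real engine's algorithm):

```python
def personalized_score(base_relevance, doc_topics, user_interests):
    """Blend a query-level relevance score with how well the document's
    topics overlap the user's interest profile. Weights are illustrative."""
    overlap = len(set(doc_topics) & set(user_interests))
    interest_boost = overlap / max(len(doc_topics), 1)
    return 0.7 * base_relevance + 0.3 * interest_boost

# Two users issue the same query against the same advanced document...
doc_topics = ["benchmarks", "research"]
novice = ["tutorials", "basics"]
expert = ["benchmarks", "research"]

# ...but the expert's profile ranks it higher than the novice's does.
print(personalized_score(0.8, doc_topics, novice))   # lower
print(personalized_score(0.8, doc_topics, expert))   # higher
```

The same query thus yields different orderings per user, which is the core of the "personal" shift described above.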

Finally, to provide better search of the present, search has to become more realtime. To this end, rankings need to be developed that surface not only what just happened, but what happened recently and is trending upwards and/or of note. Realtime search has to be more than merely listing search results chronologically. There must be effective ways to filter the noise and surface what’s most important. Social graph analysis is a key tool for doing this, but in addition, powerful statistical analysis and new visualizations may also be required to make a compelling experience.
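One way to sketch such a ranking (a hypothetical scoring function with illustrative weights and constants, not any production system) is to combine base relevance with a recency decay and a trend term:

```python
import math

def realtime_score(base_relevance, age_hours, mentions_last_hour, mentions_prev_hour):
    """Score an item by blending query relevance with freshness and momentum.
    The decay constant, velocity cap, and weights are all illustrative."""
    recency = math.exp(-age_hours / 6.0)            # fresher items decay less
    velocity = mentions_last_hour / max(mentions_prev_hour, 1)
    trend = min(velocity / 5.0, 1.0)                # cap runaway spikes
    return 0.5 * base_relevance + 0.3 * recency + 0.2 * trend

# A fresh, accelerating item outranks an older, more "relevant" one --
# something a purely chronological or purely keyword ranking would miss.
fresh = realtime_score(0.6, age_hours=1, mentions_last_hour=40, mentions_prev_hour=10)
stale = realtime_score(0.9, age_hours=48, mentions_last_hour=2, mentions_prev_hour=5)
print(fresh > stale)   # True
```

The trend term is where statistical analysis of the stream comes in: it rewards acceleration, not just raw volume or mere recency.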

Sneak Peek – Siri — Interview with Tom Gruber

Sneak Preview of Siri – The Virtual Assistant that will Make Everyone Love the iPhone, Part 2: The Technical Stuff

In Part One of this article on TechCrunch, I covered the emerging paradigm of Virtual Assistants and explored a first look at a new product in this category called Siri. In this article, Part Two, I interview Tom Gruber, CTO of Siri, about the history, key ideas, and technical foundations of the product:

Nova Spivack: Can you give me a more precise definition of a Virtual Assistant?

Tom Gruber: A virtual personal assistant is a software system that

  • Helps the user find or do something (focus on tasks, rather than information)
  • Understands the user’s intent (interpreting language) and context (location, schedule, history)
  • Works on the user’s behalf, orchestrating multiple services and information sources to help complete the task

In other words, an assistant helps me do things by understanding me and working for me. This may seem quite general, but it is a fundamental shift from the way the Internet works today. Portals, search engines, and web sites are helpful but they don’t do things for me – I have to use them as tools to do something, and I have to adapt to their ways of taking input.

Nova Spivack: Siri is hoping to kick-start the revival of the Virtual Assistant category, for the Web. This is an idea which has a rich history. What are some of the past examples that have influenced your thinking?

Tom Gruber: The idea of interacting with a computer via a conversational interface with an assistant has excited the imagination for some time. Apple’s famous Knowledge Navigator video offered a compelling vision, in which a talking-head agent helped a professional deal with schedules and access information on the net. The late Michael Dertouzos, head of MIT’s Computer Science Lab, wrote convincingly about the assistant metaphor as the natural way to interact with computers in his book “The Unfinished Revolution: Human-Centered Computers and What They Can Do For Us”. These accounts of the future say that you should be able to talk to your computer in your own words, saying what you want to do, with the computer talking back to ask clarifying questions and explain results. These are hallmarks of the Siri assistant. Some of the elements of these visions are beyond what Siri does, such as general reasoning about science in the Knowledge Navigator, or self-awareness à la the Singularity. But Siri is the real thing, using real AI technology, just made very practical on a small set of domains. The breakthrough is to bring this vision to a mainstream market, taking maximum advantage of the mobile context and internet service ecosystems.

Nova Spivack: Tell me about the CALO project, that Siri spun out from. (Disclosure: my company, Radar Networks, consulted to SRI in the early days on the CALO project, to provide assistance with Semantic Web development)

Tom Gruber: Siri has its roots in the DARPA CALO project (“Cognitive Agent that Learns and Organizes”), which was led by SRI. The goal of CALO was to develop AI technologies (dialog and natural language understanding, machine learning, evidential and probabilistic reasoning, ontology and knowledge representation, planning, reasoning, service delegation) all integrated into a virtual assistant that helps people do things. It pushed the limits on machine learning and speech, and also showed the technical feasibility of a task-focused virtual assistant that uses knowledge of user context and multiple sources to help solve problems.

Siri is integrating, commercializing, scaling, and applying these technologies to a consumer-focused virtual assistant.  Siri was under development for several years during and after the CALO project at SRI. It was designed as an independent architecture, tightly integrating the best ideas from CALO but free of the constraints of a national distributed research project. The Siri.com team has been evolving and hardening the technology since January 2008.

Nova Spivack: What are the primary aspects of Siri that you would say are “novel”?

Tom Gruber: The demands of the consumer internet focus — instant usability and robust interaction with the evolving web — have driven us to come up with some new innovations:

  • A conversational interface that combines the best of speech and semantic language understanding with an interactive dialog that helps guide people toward saying what they want to do and getting it done. The conversational interface allows for much more interactivity than one-shot search-style interfaces, which aids usability and improves intent understanding. For example, if Siri didn’t quite hear what you said, or isn’t sure what you meant, it can ask for clarifying information. It can prompt on ambiguity: did you mean pizza restaurants in Chicago or Chicago-style pizza places near you? It can also make reasonable guesses based on context. Walking around with the phone at lunchtime, if the speech interpretation comes back with something garbled about food, you probably meant “places to eat near my current location”. If this assumption isn’t right, it is easy to correct in a conversation.
  • Semantic auto-complete – a combination of the familiar “autocomplete” interface of search boxes with a semantic and linguistic model of what might be worth saying. The so-called “semantic completion” makes it possible to rapidly state complex requests (Italian restaurants in the SOMA neighborhood of San Francisco that have tables available tonight) with just a few clicks. It’s sort of like the power of faceted search a la Kayak, but packaged in a clever command line style interface that works in small form factor and low bandwidth environments.
  • Service delegation – Siri is particularly deep in technology for operationalizing a user’s intent into computational form, dispatching to multiple, heterogeneous services, gathering and integrating results, and presenting them back to the user as a set of solutions to their request.  In a restaurant selection task, for instance, Siri combines information from many different sources (local business directories, geospatial databases, restaurant guides, restaurant review sources, online reservation services, and the user’s own favorites) to show a set of candidates that meet the intent expressed in the user’s natural language request.

Nova Spivack: Why do you think Siri will succeed when other AI-inspired projects have failed to meet expectations?

Tom Gruber: In general my answer is that Siri is more focused. We can break this down into three areas of focus:

  • Task focus. Siri is very focused on a bounded set of specific human tasks, like finding something to do, going out with friends, and getting around town. This task focus allows it to have a very rich model of its domain of competence, which makes everything more tractable, from language understanding to reasoning to service invocation and results presentation.
  • Structured data focus. The kinds of tasks that Siri is particularly good at involve semistructured data, usually on tasks involving multiple criteria and drawing from multiple sources. For example, to help find a place to eat, user preferences for cuisine, price range, location, or even specific food items come into play. Combining results from multiple sources requires reasoning about domain entity identity and the relative capabilities of different information providers. These are hard problems of semantic information processing and integration that are difficult but feasible today using the latest AI technologies.
  • Architecture focus. Siri is built from deep experience in integrating multiple advanced technologies into a platform designed expressly for virtual assistants. Siri co-founder Adam Cheyer was chief architect of the CALO project, and has applied a career of experience to design the platform of the Siri product. Leading the CALO project taught him a lot about what works and doesn’t when applying AI to build a virtual assistant. Adam and I also have rather unique experience in combining AI with intelligent interfaces and web-scale knowledge integration. The result is a “pure play” dedicated architecture for virtual assistants, integrating all the components of intent understanding, service delegation, and dialog flow management. We have avoided the need to solve general AI problems by concentrating on only what is needed for a virtual assistant, and have chosen to begin with a finite set of vertical domains serving mobile use cases.

Nova Spivack: Why did you design Siri primarily for mobile devices, rather than Web browsers in general?

Tom Gruber: Rather than trying to be like a search engine to all the world’s information, Siri is going after mobile use cases where deep models of context (place, time, personal history) and limited form factors magnify the power of an intelligent interface. The smaller the form factor, the more mobile the context, the more limited the bandwidth, the more important it is that the interface make intelligent use of the user’s attention and the resources at hand. In other words, “smaller needs to be smarter.” And the benefits of being offered just the right level of detail, or being prompted with just the right questions, can make the difference between task completion and failure. When you are on the go, you just don’t have time to wade through pages of links and disjoint interfaces, many of which are not suitable for mobile at all.

Nova Spivack: What language and platform is Siri written in?

Tom Gruber: Java, JavaScript, and Objective-C (for the iPhone).

Nova Spivack: What about the Semantic Web? Is Siri built with Semantic Web open standards such as RDF, OWL, and SPARQL?

Tom Gruber: No, we connect to partners on the web using structured APIs, some of which do use the Semantic Web standards. A site that exposes RDF usually has an API that is easy to deal with, which makes our life easier. For instance, we use geonames.org as one of our geospatial information sources. It is a full-on Semantic Web endpoint, and that makes it easy to deal with. The more the API declares its data model, the more automated we can make our coupling to it.

Nova Spivack: Siri seems smart, at least about the kinds of tasks it was designed for. How is the knowledge represented in Siri – is it an ontology or something else?

Tom Gruber: Siri’s knowledge is represented in a unified modeling system that combines ontologies, inference networks, pattern matching agents, dictionaries, and dialog models.  As much as possible we represent things declaratively (i.e., as data in models, not lines of code).  This is a tried and true best practice for complex AI systems.  This makes the whole system more robust and scalable, and the development process more agile.  It also helps with reasoning and learning, since Siri can look at what it knows and think about similarities and generalizations at a semantic level.

Nova Spivack: Will Siri be part of the Semantic Web, or at least the open linked data Web (by making open APIs, linked data, RDF, etc. available)?

Tom Gruber: Siri isn’t a source of data, so it doesn’t expose data using Semantic Web standards. In the Semantic Web ecosystem, it is doing something like the vision of a semantic desktop – an intelligent interface that knows about user needs and sources of information to meet those needs, and intermediates. The original Semantic Web article in Scientific American included use cases that an assistant would do (check calendars, look for things based on multiple structured criteria, route planning, etc.). The Semantic Web vision focused on exposing the structured data, but it assumes APIs that can do transactions on the data. For example, if a virtual assistant wants to schedule a dinner it needs more than the information about the free/busy schedules of participants; it needs API access to their calendars with appropriate credentials, ways of communicating with the participants via APIs to their email/sms/phone, and so forth. Siri is building on the ecosystem of APIs, which are better if they declare the meaning of the data in and out via ontologies. That is the original purpose of ontologies-as-specification that I promoted in the 1990s – to help specify how to interact with these agents via knowledge-level APIs.

Siri does, however, benefit greatly from standards for talking about space and time, identity (of people, places, and things), and authentication.  As I called for in my Semantic Web talk in 2007, there is no reason we should be string matching on city names, business names, user names, etc.

All players near the user in the ecommerce value chain get better when the information that the users need can be unambiguously identified, compared, and combined. Legitimate service providers on the supply end of the value chain also benefit, because structured data is harder to scam than text.  So if some service provider offers a multi-criteria decision making service, say, to help make a product purchase in some domain, it is much easier to do fraud detection when the product instances, features, prices, and transaction availability information are all structured data.

Nova Spivack: Siri appears to be able to handle requests in natural language. How good is the natural language processing (NLP) behind it? How have you made it better than other NLP?

Tom Gruber: Siri’s top line measure of success is task completion (not relevance).  A subtask is intent recognition, and a subtask of that is NLP.  Speech is another element, which couples to NLP and adds its own issues.  In this context, Siri’s NLP is “pretty darn good” — if the user is talking about something in Siri’s domains of competence, its intent understanding is right the vast majority of the time, even in the face of noise from speech, single finger typing, and bad habits from too much keywordese.  All NLP is tuned for some class of natural language, and Siri’s is tuned for things that people might want to say when talking to a virtual assistant on their phone. We evaluate against a corpus, but I don’t know how it would compare to the standard message and news corpora used by the NLP research community.

Nova Spivack: Did you develop your own speech interface, or are you using third-party system for that? How good is it? Is it battle-tested?

Tom Gruber: We use third party speech systems, and are architected so we can swap them out and experiment. The one we are currently using has millions of users and continuously updates its models based on usage.

Nova Spivack: Will Siri be able to talk back to users at any point?

Tom Gruber: It could use speech synthesis for output, for the appropriate contexts.  I have a long standing interest in this, as my early graduate work was in communication prosthesis. In the current mobile internet world, however, iPhone-sized screens and 3G networks make it possible to do much more than read menu items over the phone.  For the blind, embedded appliances, and other applications it would make sense to give Siri voice output.

Nova Spivack: Can you give me more examples of how the NLP in Siri works?

Tom Gruber: Sure, here’s an example, published in the Technology Review, that illustrates what’s going on in a typical dialogue with Siri. (Click link to view the table)

Nova Spivack: How personalized does Siri get – will it recommend different things to me depending on where I am when I ask, and/or what I’ve done in the past? Does it learn?

Tom Gruber: Siri does learn in simple ways today, and it will get more sophisticated with time.  As you said, Siri is already personalized based on immediate context, conversational history, and personal information such as where you live.  Siri doesn’t forget things from request to request, as stateless systems like search engines do. It always considers the user model along with the domain and task models when coming up with results.  The evolution in learning comes as users have a history with Siri, which gives it a chance to make some generalizations about preferences.  There is a natural progression with virtual assistants from doing exactly what they are asked, to making recommendations based on assumptions about intent and preference. That is the curve we will explore with experience.

Nova Spivack: How does Siri know what is in various external services – are you mining and doing extraction on their data, or is it all just real-time API calls?

Tom Gruber: For its current domains Siri uses dozens of APIs, and connects to them in both realtime access and batch data synchronization modes.  Siri knows about the data because we (humans) explicitly model what is in those sources.  With declarative representations of data and API capabilities, Siri can reason about the various capabilities of its sources at run time to figure out which combination would best serve the current user request.  For sources that do not have nice APIs or expose data using standards like the Semantic Web, we can draw on a value chain of players that do extract structure by data mining and exposing APIs via scraping.
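The run-time reasoning over declaratively modeled sources can be sketched in miniature. This is a hypothetical illustration, not Siri's actual architecture: each API is described by the capabilities it declares, and a generic planner selects sources by capability instead of hard-coding which service handles which request. All the names below are invented.

```python
# Hypothetical declarative descriptions of external API sources.
SOURCES = [
    {"name": "restaurant booking API", "capabilities": {"restaurant_search", "booking"}},
    {"name": "review site API",        "capabilities": {"restaurant_search", "reviews"}},
    {"name": "weather API",            "capabilities": {"forecast"}},
]

def plan(required: set) -> list:
    """Return the names of sources whose declared capabilities overlap
    the capabilities the current request requires."""
    return [s["name"] for s in SOURCES if s["capabilities"] & required]
```

Because capabilities are data, a new source can be added (or an unavailable one dropped) without touching the planning logic — the same property that makes the declarative approach robust and agile.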

Nova Spivack: Thank you for the information, Siri might actually make me like the iPhone enough to start using one again.

Tom Gruber: Thank you, Nova, it’s a pleasure to discuss this with someone who really gets the technology and larger issues. I hope Siri does get you to use that iPhone again. But remember, Siri is just starting out and will sometimes say silly things. It’s easy to project intelligence onto an assistant, but Siri isn’t going to pass the Turing Test. It’s just a simpler, smarter way to do what you already want to do. It will be interesting to see how this space evolves, how people will come to understand what to expect from the little personal assistant in their pocket.

Twine's Explosive Growth

Twine has been growing at 50% per month since launch in October. We've been keeping that quiet while we wait to see if it holds. VentureBeat just noticed and did an article about it. It turns out our January numbers are higher than Compete.com estimates and February is looking strong too. We have a slew of cool viral features coming out in the next few months too as we start to integrate with other social networks. Should be an interesting season.

Fast Company Interview — "Connective Intelligence"

In this interview with Fast Company, I discuss my concept of "connective intelligence." Intelligence is really in the connections between things, not the things themselves. Twine facilitates smarter connections between content, and between people. This facilitates the emergence of higher levels of collective intelligence.

Interest Networks are at a Tipping Point

UPDATE: There’s already a lot of good discussion going on around this post in my public twine.

I’ve been writing about a new trend that I call “interest networking” for a while now. But I wanted to take the opportunity before the public launch of Twine on Tuesday (tomorrow) to reflect on the state of this new category of applications, which I think is quickly reaching its tipping point. The concept is starting to catch on as people reach for more depth around their online interactions.

In fact – that’s the ultimate value proposition of interest networks – they move us beyond the super poke and towards something more meaningful. In the long-term view, interest networks are about building a global knowledge commons. But in the short term, the difference between social networks and interest networks is a lot like the difference between fast food and a home-cooked meal – interest networks are all about substance.

At a time when social media fatigue is setting in, the news cycle is growing shorter and shorter, and the world is delivered to us in soundbytes and catchphrases, we crave substance. We go to great lengths in pursuit of substance. Interest networks solve this problem – they deliver substance.

So, what is an interest network?

In short, if a social network is about who you are interested in, an interest network is about what you are interested in. It’s the logical next step.

Twine, for example, is an interest network that helps you share information with friends, family, colleagues and groups, based on mutual interests. Individual “twines” are created for content around specific subjects. This content might include bookmarks, videos, photos, articles, e-mails, notes or even documents. Twines may be public or private and can serve individuals, small groups or even very large groups of members.

I have also written quite a bit about the Semantic Web and the Semantic Graph, and Tim Berners-Lee has recently started talking about what he calls the GGG (Giant Global Graph). Tim and I are in agreement that social networks merely articulate the relationships between people. Social networks do not surface the equally, if not more, important relationships between people and places, places and organizations, places and other places, organizations and other organizations, organizations and events, documents and documents, and so on.

This is where interest networks come in. It’s still early days, to be clear, but interest networks are operating on the premise of tapping into a multi-dimensional graph that manifests the complexity and substance of our world, and delivers the best of that world to you, every day.

We’re seeing more and more companies think about how to capitalize on this trend. There are suddenly (it seems, but this category has been building for many months) lots of different services that can be viewed as interest networks in one way or another, and here are some examples:

What all of these interest networks have in common is some sort of a bottom-up, user-driven crawl of the Web, which is the way that I’ve described Twine when we get the question about how we propose to index the entire Web (the answer: we don’t. We let our users tell us what they’re most interested in, and we follow their lead).

Most interest networks exhibit the following characteristics as well:

  • They have some sort of bookmarking/submission/markup function to store and map data (often using existing metaphors, even if what’s under the hood is new)
  • They also have some sort of social sharing function to provide the network benefit (this isn’t exclusive to interest networks, obviously, but it is characteristic)
  • And in most cases, interest networks look to add some sort of “smarts” or “recommendations” capability to the mix (that is, you get more out than you put in)

This last bullet point is where I see next-generation interest networks really providing the most benefit over social bookmarking tools, wikis, collaboration suites and pure social networks of one kind or another.

To that end, we think that Twine is the first of a new breed of intelligent applications that really get to know you better and better over time – and that the more you use Twine, the more useful it will become. Adding your content to Twine is an investment in the future of your data, and in the future of your interests.

At first Twine begins to enrich your data with semantic tags and links to related content via our recommendations engine that learns over time. Twine also crawls any links it sees in your content and gathers related content for you automatically – adding it to your personal or group search engine for you, and further fleshing out the semantic graph of your interests which in turn results in even more relevant recommendations.

The point here is that adding content to Twine, or other next-generation interest networks, should result in increasing returns. That’s a key characteristic, in fact, of the interest networks of the future – the idea that the ratio of work (input) to utility (output) has no established ceiling.

Another key characteristic of interest networks may be in how they monetize. Instead of being advertising-driven, I think they will focus more on a marketing paradigm. They will be to marketing what search engines were to advertising. For example, Twine will be monetizing our rich model of individual and group interests, using our recommendation engine. When we roll this capability out in 2009, we will deliver extremely relevant, useful content, products and offers directly to users who have demonstrated they are really interested in such information, according to their established and ongoing preferences.

6 months ago, you could not really prove that “interest networking” was a trend, and certainly it wasn’t a clearly defined space. It was just an idea, and a goal. But like I said, I think that we’re at a tipping point, where the technology is getting to a point at which we can deliver greater substance to the user, and where the culture is starting to crave exactly this kind of service as a way of making the Web meaningful again.

I think that interest networks are a huge market opportunity for many startups thinking about what the future of the Web will be like, and I think that we’ll start to see the term used more and more widely. We may even start to see some attention from analysts — Carla, Jeremiah, and others, are you listening?

Now, I obviously think that Twine is THE interest network of choice. After all we helped to define the category, and we’re using the Semantic Web to do it. There’s a lot of potential in our engine and our application, and the growing community of passionate users we’ve attracted.

Our 1.0 release really focuses on UE/usability, which was a huge goal for us based on user feedback from our private beta, which began in March of this year. I’ll do another post soon talking about what’s new in Twine. But our TOS (time on site) at 6 minutes/user (all time) and 12 minutes/user (over the last month) is something that the team here is most proud of – it tells us that Twine is sticky, and that “the dogs are eating the dog food.”

Now that anyone can join, it will be fun and gratifying to watch Twine grow.

Still, there is a lot more to come, and in 2009 our focus is going to shift back to extending our Semantic Web platform and turning on more of the next-generation intelligence that we’ve been building along the way. We’re going to take interest networking to a whole new level.

Stay tuned!

New Video: Leading Minds from Google, Yahoo, and Microsoft talk about their Visions for Future of The Web

Video from my panel at DEMO Fall ’08 on the Future of the Web is now available.

I moderated the panel, and our panelists were:

Howard Bloom, Author, The Evolution of Mass Mind from the Big Bang to the 21st Century

Peter Norvig, Director of Research, Google Inc.

Jon Udell, Evangelist, Microsoft Corporation

Prabhakar Raghavan, PhD, Head of Research and Search Strategy, Yahoo! Inc.

The panel was excellent, with many DEMO attendees saying it was the best panel they had ever seen at DEMO.

Many new and revealing insights were provided by our excellent panelists. I was particularly interested in the different ways that Google and Yahoo describe what they are working on. They covered lots of new and interesting information about their thinking. Howard Bloom added fascinating comments about the big picture and Jon Udell helped to speak about Microsoft’s longer-term views as well.


How To Use Twine — Screencast!

I have made a screencast that teaches you how to get started using Twine, and explains most of the features, best-practices for using it, and where we are headed with the product. You can read more about it and discuss it with me here.

For anyone who is new to Twine, this will be really helpful. Once you see this you will understand what Twine is for and how you can start to benefit from it right away.

The high-quality version is here.

For those who prefer YouTube’s lower-quality format, here is the first part. Note that YouTube requires that videos be less than 10 minutes, but the whole screencast is about 30 minutes, so I had to break it into parts. Here is Part 1 of 4:

And here is the rest of it in YouTube format:
Part 2

Part 3

Part 4

Life in Perpetual Beta: The Film

Melissa Pierce is a filmmaker who is making a film about "Life in Perpetual Beta." It’s about people who are adapting and reinventing themselves in the moment, and about a new philosophy or approach to life. She’s interviewed a number of interesting people, and while I was in Chicago recently, she spoke with me as well. Here is a clip about how I view the philosophy of living in Beta. Her film is also in perpetual beta, and you can see the clips from her interviews on her blog as the film evolves. Eventually it will be released through the indie film circuit, and it looks like it will be a cool film. By the way, she is open to getting sponsors so if you like this idea and want your brand on the opening credits, drop her a line!

If Social Networks Were Like Cars…

I have been thinking a lot about social networks lately, and why there are so many of them, and what will happen in that space.

Today I had what I think is a "big realization" about this.

Everyone, including myself, seems to think that there is only room for one big social network, and it looks like Facebook is winning that race. But what if that assumption is simply wrong from the start?

What if social networks are more like automobile brands? In other words, perhaps there can, will, and should be many competing brands in the space.

Social networks no longer compete in terms of who has which members. All my friends are in pretty much every major social network.

I also don’t need more than one social network, for the same reason — my friends are all in all of them. How many different ways do I need to reach the same set of people? I only need one.

But the Big Realization is that no social network satisfies all types of users. Some people are more at home in a place like LinkedIn than they are in Facebook, for example. Others prefer MySpace.  There are always going to be different social networks catering to the common types of people (different age groups, different personalities, different industries, different lifestyles, etc.).

The Big Realization implies that all the social networks are going to be able to interoperate eventually, just like almost all email clients and servers do today. Email didn’t begin this way. There were different networks, different servers and different clients, and they didn’t all speak to each other. To communicate with certain people you had to use a certain email network, and/or a certain email program. Today almost all email systems interoperate directly or at least indirectly. The same thing is going to happen in the social networking space.

Today we see the first signs of this interoperability emerging as social networks open their APIs and enable increasing integration. Currently there is a competition going on to see which "open" social network can get the most people and sites to use it. But this is an illusion. It doesn’t matter who is dominant, there are always going to be alternative social networks, and the pressure to interoperate will grow until it happens. It is only a matter of time before they connect together.

I think this should be the greatest fear at companies like Facebook. For when it inevitably happens they will be on a level playing field competing for members with a lot of other companies large and small. Today Facebook and Google’s scale are advantages, but in a world of interoperability they may actually be disadvantages — they cannot adapt, change or innovate as fast as smaller, nimbler startups.

Thinking of social networks as if they were automotive brands also reveals interesting business opportunities. There are still several unowned opportunities in the space.

Myspace is like the car you have in high school. Probably not very expensive, probably used, probably a bit clunky. It’s fine if you are a kid driving around your hometown.

Facebook is more like the car you have in college. It has a lot of your junk in it, it is probably still not cutting edge, but it’s cooler and more powerful.

LinkedIn kind of feels like a commuter car to me. It’s just for business, not for pleasure or entertainment.

So who owns the "adult luxury sedan" category? Which one is the BMW of social networks?

Who owns the sportscar category? Which one is the Ferrari of social networks?

Who owns the entry-level commuter car category?

Who owns the equivalent of the "family stationwagon or minivan" category?

Who owns the SUV and offroad category?

You see my point. There are a number of big segments that are not owned yet, and it is really unlikely that any one company can win them all.

If all social networks are converging on the same set of features, then eventually they will be close to equal in function. The only way to differentiate them will be in terms of the brands they build and the audience segments they focus on. These in turn will cause them to emphasize certain features more than others.

In the future the question for consumers will be "Which social network is most like me? Which social network is the place for me to base my online presence?"

Sue may connect to Bob, whose account is hosted in a different social network. Sue will not be a member of Bob’s service, and Bob will not be a member of Sue’s, yet they will be able to form a social relationship and communication channel. This is like email. I may use Outlook and you may use Gmail, but we can still send messages to each other.

Although all social networks will interoperate eventually, depending on each person’s unique identity they may choose to be based in — to live and surf in — a particular social network that expresses their identity, and caters to it. For example, I would probably want to be surfing in the luxury SUV of social networks at this point in my life, not in the luxury sedan, not the racecar, not in the family car, not the dune-buggy. Someone else might much prefer an open source, home-built social network account running on a server they host. It shouldn’t matter — we should still be able to connect, share stuff, get notified of each other’s posts, etc. It should feel like we are in a unified social networking fabric, even though our accounts live in different services with different brands, different interfaces, and different features.

I think this is where social networks are heading. If it’s true then there are still many big business opportunities in this space.

On the Difference Between "Semantic" and "Semantic Web"

This is a brief post with one purpose: to clarify the meaning of the term "semantic." It has suddenly become chic to label every new app as somehow "semantic" but what does this mean really? Are all "semantic" apps part of the "Semantic Web?" What are the criteria for something to be "semantic" versus "Semantic Web" anyway?

It’s pretty simple actually. Any app that can understand language to some degree could be labeled as "semantic." So even Google is somewhat of a semantic application by that criterion. Of course some applications are a lot more semantic than others. Powerset is more semantic than Google, for example, because it understands natural language, not just keywords.

But for an application to be considered part of the "Semantic Web" it has to support a set of open standards defined by the W3C, including at the very least RDF, and potentially also OWL and SPARQL. These are the technologies that collectively comprise the Semantic Web. Supporting these technologies means making at least some RDF data visible to outside applications.

I’m not sure if Powerset is doing this yet, nor whether Freebase is doing it yet, but they should (and I’m guessing they will). Twine, my company’s application, is using RDF and OWL internally within our app and we are also exposing this via our site (although we are still in private beta so only beta participants can see that data today). Other companies such as Digg are already making their RDF data visible to the public. Any application that at least publishes RDF data can be considered to be both semantic and part of the Semantic Web.
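To make the distinction concrete, here is a minimal sketch of what "publishing RDF data" means in practice, written in Turtle syntax using the real FOAF and Dublin Core vocabularies. The item and user URIs are hypothetical, invented purely for illustration:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .

# A hypothetical bookmarked item exposed as RDF by a semantic app.
<http://example.com/items/42>
    dc:title   "The Semantic Web" ;
    dc:creator <http://example.com/users/sue> .

# The user who created it, described with the FOAF vocabulary.
<http://example.com/users/sue>
    a foaf:Person ;
    foaf:name  "Sue" ;
    foaf:knows <http://example.com/users/bob> .
```

An application that merely understands language is "semantic"; one that exposes statements like these, in W3C standard formats that outside applications can fetch and reuse, is part of the Semantic Web.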

A Few Predictions for the Near Future

This is a five minute video in which I was asked to make some predictions for the next decade about the Semantic Web, search and artificial intelligence. It was done at the NextWeb conference and was a fun interview.

Learning from the Future with Nova Spivack from Maarten on Vimeo.

Twine and Linked Data on the Semantic Web

Tim Berners-Lee just posted his thoughts about the importance of Linked Data on the Semantic Web. Linked data support is built into Twine. All the data in Twine is accessible as open-standard RDF and OWL today and will be accessible to other applications via several API’s including SPARQL. You can learn more about Twine’s support for Linked Data and see some examples here.

Tim says:

In all this Semantic Web news, though, the proof of the pudding is in the eating. The benefit of the Semantic Web is that data may be re-used in ways unexpected by the original publisher. That is the value added. So when a Semantic Web start-up either feeds data to others who reuse it in interesting ways, or itself uses data produced by others, then we start to see the value of each bit increased through the network effect.

So if you are a VC funder or a journalist and some project is being sold to you as a Semantic Web project, ask how it gets extra re-use of data, by people who would not normally have access to it, or in ways for which it was not originally designed. Does it use standards? Is it available in RDF? Is there a SPARQL server?

Twine provides RDF and supports SPARQL (although while we are in beta we have not opened our SPARQL API yet, but we will…). At the same time Twine also protects privacy by only providing its data according to permissions. Apps can only get Twine data they have permission to see, such as their own data, their owner’s or users’ data, data that has been shared with them, or public data in Twine.
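For readers unfamiliar with SPARQL, here is a sketch of the kind of query an outside application might send to a SPARQL endpoint once one is open. The vocabularies (FOAF, Dublin Core) are real; the specific data being queried is hypothetical:

```sparql
# Find the public items created by a user named "Sue", with their titles.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>

SELECT ?item ?title
WHERE {
  ?user  foaf:name  "Sue" .
  ?item  dc:creator ?user ;
         dc:title   ?title .
}
```

This is exactly the kind of unanticipated reuse Tim describes: a query written by someone who has never seen the publishing application, answered from its open data.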

Twine is also designed to consume external Linked Data via its APIs. Twine will be able to consume external RDF and OWL ontologies, as a means to enable other applications and users to extend its functionality and add new data to it.

First Week of Twine Beta Phase II Report

This week we began letting the second wave of beta users into the Twine invite-only beta. It’s been a very busy and exciting time for the Twine team. I’ll be providing more detailed stats on an ongoing basis in a few weeks once we have more data to analyze. For now, I will just provide some qualitative observations.

Twine is still in the early beta process, but already we are seeing a rapid increase in adoption and scale. We have only let in a few hundred more users to get the process started, but we will be letting more and more in every week as we go forward.

It has been really exciting to watch Twine grow. I find that I am increasingly glued to my Interest Feed watching the fascinating information that is flowing through from all the new members. There have been many new twines created around a wide and growing range of interests, and a large amount of content added. The recommendations are also quite interesting — I have already discovered a wide range of new people, twines and content that I didn’t know about.

As of this writing, I now have 157 social connections in Twine. My social network in Twine has doubled in size in a week and is rapidly approaching the size of my Facebook network. That’s pretty impressive considering this happened in a week (it took about half a year for my Facebook network to grow to that size).

We also had our first outside Twine client app, called "Entwine," written spontaneously by a beta user — it browses through the RDF data from various items in Twine. That was very cool and unexpected! It really got the team jazzed to see this happen.

Twine is now full of active discussions around interests, questions, ideas, suggestions, current events, technologies and products. I have been pleasantly surprised to see so much interaction among users develop so quickly. As we had hypothesized, discussions are turning out to be a very key feature.

We have received a lot of great feedback from beta users within Twine, as well as many suggestions for how to improve Twine, streamline the user experience, and integrate Twine with other applications and services. This is exactly what we had hoped for from our beta. The team is hard at work analyzing this and prioritizing our next development sprints in light of what we are learning from our users (we do minor releases every week and major ones every 3 weeks).

Most of the press reviews and user stories point to Twine being very exciting, useful and full of potential, which has been great to hear after so much work — they also universally agree that we still have room to improve the user experience and we need to work on making Twine easier to learn and use. That’s not unexpected — we opened the beta well before the app was finished in order to understand user priorities better. We are really focusing on usability and bug fixes for the next several sprints.  All this feedback has been incredibly valuable to the team. Keep it coming!

Another interesting observation. The quality of the users in Twine is distinctly impressive. It’s a very smart community of leading-edge thinkers, builders, and technology adopters. Kind of like having your own TED Conference, 24/7 around the world. We will be inviting in a wider range of users in later phases, once the app is further along. In the meantime it is really great to see so many of my colleagues in Twine, and to be making so many new contacts and friends here. For this initial phase this is exactly the audience we need — people who will really roll up their sleeves and help us make Twine into a great application.

Twine is also rapidly aggregating most of the leading minds in the worldwide Semantic Web development and research community into a social and collaborative interest network. It is great to have this global community of people interested in building and using the Semantic Web come together in Twine, an application that is built using Semantic Web technologies on the Radar Networks Semantic Web Applications platform. I look forward to beginning to share Twine with this worldwide community, and to collaborate with others to extend it and integrate it with other semantic apps and data sets. This is definitely our goal.

It’s been a great week. I haven’t slept much. I’m having too much fun in Twine!

Twine Perspective on Yahoo Semantic Web Search Announcement

The Beginning of the Mainstream Semantic Web?

It is being reported that Yahoo will be indexing a wide array of structured metadata, including Semantic Web metadata. This will make Yahoo’s search index potentially better than Google’s, although it will also open their index up to sophisticated attempts to “game the system,” which will need to be addressed. But in any event, this will undoubtedly prod Google to begin indexing and making sense of structured metadata as well (actually, Google is already indexing FOAF, a Semantic Web metadata format).

I believe Yahoo’s announcement marks the beginning of the mainstream Semantic Web. It should quickly catalyze an arms race by search engines, advertisers, and content providers to make the best use of semantic metadata on the Web. This will benefit the entire semantic sector and all players in it.

As they say, “a rising tide lifts all boats.”

Where Twine Fits Into This Ecosystem

From the perspective of a company working on a large Semantic Web driven portal venture (Twine), and a full platform for semantic applications (and search), this is good news. We’ll be happy to open up Twine’s content to Yahoo’s index (when we go into General Availability in the summer timeframe, or maybe even sooner…). In addition, as more content providers add metadata to their content, it will make Twine’s job of helping users collect, organize, share and discover interesting content that much easier.

Where does Twine fit into the emerging Semantic Web ecosystem? Twine provides presence and content on the Semantic Web. It enables individuals and groups to homestead on the Semantic Web and get immediate value, without having to learn RDF.

Currently we are not going after the “be the search engine of the Semantic Web” opportunity; we are focused on the “help users manage their information and connect with others who share their interests” and the “build thriving communities of interest” opportunities.

Our feeling is that the incumbent search engines are probably best positioned to win the race to be the search engine of the entire Semantic Web, once they decide to enter it (as Yahoo just did, and as Google most likely will soon do as well…).

Twine is generating high-quality Semantic Web metadata about people, groups, topics of interest, and resources on the Web (Web pages, images, videos, books, products, documents, etc.). The metadata we are creating results from a combination of automated processing and user contributions from our community.

The metadata Twine generates is then provided back to the users and community as open RDF that can be accessed and reused elsewhere. So we are effectively making a semantic graph of RDF about content around the Web, and related people, groups and their interests. Ultimately we become a semantic annotation layer above the Web. I can imagine that this is a dataset that Yahoo and Google and many others are going to want to be able to search.
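To make the idea of a “semantic graph” concrete, here is a minimal sketch, not Twine’s actual code, of the kind of graph described above: a set of subject-predicate-object triples linking people, topics, and Web resources, plus a simple pattern query over them. All identifiers and property names are hypothetical.

```python
# The graph is just a set of (subject, predicate, object) triples,
# the same shape of data that RDF expresses.
graph = {
    ("user:alice", "twine:interestedIn", "topic:semantic-web"),
    ("user:bob",   "twine:interestedIn", "topic:semantic-web"),
    ("user:alice", "twine:bookmarked",   "http://example.org/article"),
    ("http://example.org/article", "twine:aboutTopic", "topic:semantic-web"),
}

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for (ts, tp, to) in graph
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# Discovery query: who shares an interest in the Semantic Web?
people = sorted(ts for (ts, _, _) in
                match(graph, p="twine:interestedIn", o="topic:semantic-web"))
print(people)  # ['user:alice', 'user:bob']
```

The wildcard-pattern query is, in miniature, the same operation a SPARQL engine performs over a real RDF graph.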

The content in Twine is rapidly growing into a large semantic graph of information around people, groups and interests on the Web. We and our users are producing a large volume of high-quality original content and semantic metadata about existing Web content that will undoubtedly make the Yahoo index much richer (and will drive traffic back to Twine and to the sites we link to from our graph).

The Semantic Web Eliminates Traditional Silos By Opening Up and Linking the Data

Twine is a hosted online service, but it is not actually a “silo” in the traditional sense, because all of our data is represented in open-standards-based RDF. We are already providing access to that data on an experimental basis, and will provide even more via upcoming APIs.

This means that the data Twine is creating and gathering is open, linked data that can be reused in other applications and services. Ultimately this makes Twine part of a growing distributed ecosystem. Semantic Web metadata in RDF and OWL is even better than microformats because it carries its own description of what it means and how to use it. Software that speaks RDF and OWL can reuse it instantly, without any additional programming. To learn more about Twine’s open RDF availability, see the Twine Tour: Semantic Web section.
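Here is a hypothetical, much-simplified sketch of what “carries its own meaning” can look like in practice. In real RDF the labels below would themselves be triples (rdfs:label statements about each property) shipped in the same graph as the data; all URIs and property names here are invented for illustration.

```python
# Data triples: (subject, predicate, object)
data = [
    ("http://example.org/book/1", "ex:title",  "Gödel, Escher, Bach"),
    ("http://example.org/book/1", "ex:author", "Douglas Hofstadter"),
]

# Schema: human-readable labels for each predicate, carried alongside the
# data (in RDF, these would be rdfs:label statements in the same graph).
schema = {
    "ex:title":  "Title",
    "ex:author": "Author",
}

# A generic renderer needs no hard-coded knowledge of this vocabulary:
# it simply looks up each predicate's label, and can still display
# predicates it has never seen before.
lines = [f"{schema.get(p, p)}: {o}" for (_, p, o) in data]
print("\n".join(lines))
```

A microformat parser, by contrast, must be programmed in advance for each specific format it encounters; here the vocabulary's description travels with the data itself.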

I believe that the open standards of the Semantic Web eliminate silos. Effectively, all services that adopt these standards and make their data open become part of one big distributed worldwide database, rather than old-fashioned silos. That’s the benefit of open linked-data services powered by RDF, OWL, SPARQL, and GRDDL.

How Will End-Users Participate in the Semantic Web?

If Yahoo and possibly Google make search better by indexing all sorts of metadata, there is then an even larger opportunity to help non-technical end-users create and use that metadata. This is where services like Twine fit in. End-users need ways to author, organize, share, reuse, and discover Semantic Web content.

We don’t believe ordinary Webmasters or end-users are going to write microformats or RDF by hand. Even hard-core Semantic Web researchers don’t do that. Ultimately end-users need user-friendly services that do this for them automatically, or at least make it easier to do. Twine helps these users to participate in the Semantic Web, without requiring them to have a degree in computer science. Twine provides an (increasingly) user-friendly hosted place where users can collect, organize, share and discover other interesting content around their interests, using the Semantic Web transparently “under the hood.”
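The “under the hood” point above can be sketched in a few lines. This is a hypothetical illustration, not Twine’s actual code: the user supplies nothing but a URL, a title, and some tags, and the service emits RDF-style triples on their behalf. The `ex:` property names are invented; `dc:title` stands in for a Dublin Core-style title property.

```python
def bookmark_to_triples(user, url, title, tags):
    """Turn one ordinary user action into (subject, predicate, object) triples."""
    triples = [
        (url, "dc:title", title),       # describe the resource
        (user, "ex:bookmarked", url),   # record who saved it
    ]
    for tag in tags:                    # one triple per user-supplied tag
        triples.append((url, "ex:taggedWith", f"tag:{tag}"))
    return triples

# The user never sees RDF; they just bookmarked a page and typed two tags.
triples = bookmark_to_triples(
    "user:carol", "http://example.org/post",
    "Web 3.0 Explained", ["semantic-web", "web-3.0"],
)
for s, p, o in triples:
    print(s, p, o)
```

The design point is that the metadata is a by-product of normal use: no end-user ever has to author a triple by hand.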

Concluding Thoughts

In short, Twine is where ordinary non-technical individuals and groups can join the Semantic Web, get a presence there, and start using it in useful ways, today. If Yahoo and Google become the search engines of the Semantic Web, that will make Twine even more necessary as the place where end-users can participate in this emerging ecosystem. We believe our community, and the rich semantic graph we are growing, will become increasingly valuable as the major search engines begin to index the Semantic Web.

But this is just the beginning of our story. Twine is designed to become a platform that others can build on and integrate with as well. There is more to our strategy than we have currently opened up about. In time we will be telling the rest of our story. We have some fun surprises in store in the future…

Reminder: Twine is a Beta — A note about what Beta means

I want to remind everyone, TWINE IS A BETA. It is only a beta. Beta means not finished: under development, a work in progress, a construction site, imperfect, open to feedback, undergoing testing, getting better every day, and in need of more work. It is not synonymous with “finished” or “ready for consumer launch.” We know this. We never claimed otherwise. We opened Twine early to let the community play around with it and give us feedback to guide our future work.

Some of the recent coverage of our project has seemingly misunderstood the meaning of the term “beta” or forgotten it, or simply expected a beta to be more of a finished application. Perhaps this is because many companies never come out of beta or use beta to mean “1.0, only cooler.” In our case, beta really means Beta. We knew there were bugs and unfinished features, but we decided to open up anyway in order to get user feedback to guide our further work.

But even though Twine is a beta, it is already quite useful, and there is a large and thriving community in there sharing knowledge about interests including the Semantic Web, Web 3.0, Web 2.0, venture capital, politics, art, fashion, travel, cultures, religion, books, and many other interests.

In fact, the number of connections I have in Twine is rapidly approaching, and will probably soon surpass, the number of connections I have in Facebook. And in terms of use, we are finding that our users are visiting Twine many times a day and actively adding information, searching, and participating in discussions and debates there.

The hype around the Semantic Web (and even Twine) is, in my opinion, justified, but it will take time for that to become obvious to everyone. In the meantime, I do think it has gotten a bit out of control. There is too much wild speculation and a general feeling that somehow the Semantic Web (or services like Twine) will solve every problem on the Internet. That won’t be the case. However, the Semantic Web, and services like Twine that are built with it, will improve the content of the Web and enable applications to become smarter with less work.

To some degree the hype around the Semantic Web has set unrealistic expectations, and it’s not surprising that there is now some backlash. Some folks who came into Twine may have had impossible expectations — perhaps thinking Twine would be some kind of three-dimensional interface to all information, or a kind of HAL 9000 intelligent assistant. I’m sorry to disappoint them. Twine is much more pragmatic, focused on things like organizing, sharing and discovering information around interests. It is also just a first step in a long development path in which much more will be added in the future. And let’s not forget… Twine is in Beta. It’s not finished yet.

I think the backlash is good actually — it will reset expectations to realistic levels. Hopefully then folks can focus on what the Semantic Web (and Twine) do today, rather than what they imagine they might do in 20 years, or what they don’t do yet.

In the case of Twine, it is not a panacea, but it is certainly well on its way to becoming a leading semantically-driven online service with some interesting opportunities in the marketplace. There is certainly a lot more in the application than can be discovered in 7 minutes of using it, and I can understand how that might be frustrating to reviewers who have little time and high expectations of a finished consumer app. That is something we are working on, and when we eventually move out of beta, it is something we will be able to say we have solved.

Meanwhile, Twine is a beta, and while there is already a LOT there, we can, must, and will be doing much, much more to address usability and finish features that are still under development and imperfect.

Response to Read Write Web article about Twine

UPDATE: I posted some further notes on the fact that Twine is in beta, and what “beta” actually means and why we are in beta here.

Marshall Kirkpatrick wrote a critical review of Twine today that identified several known issues the team is working on. These are points well-taken — we certainly understand that Twine is still a work in progress and there are many areas where we can improve usability. After all, Twine is still in private invite-only beta and is not a finished application yet. There is much that is still under development and we are learning from our users everyday.

However, we have also been getting quite a lot of very positive feedback from our beta testers as well. Twine is already quite useful and works surprisingly well on a wide selection of Web content today, as our growing beta user base can attest to.

So on balance, while Marshall points out several issues we are aware of and are working on, there is much we are proud of in what we have been able to accomplish so far.

But I want to address some of the specific points Marshall made. Marshall pointed out the following issues:

  • Sometimes Twine is unable to auto-summarize the content of some pages it sees, when they are added to the service.

That’s true — it’s sometimes hard for Twine to identify the “content part” of a page when the page has a complex structure (including tables, Flash, Ajax, frames, multiple DIV areas, etc.). That said, Twine already does a good job on things like Typepad blogs, Wikipedia, YouTube, Flickr, Amazon books, WordPress, and most sites that have a relatively standard page structure and/or metadata. We are also working on making Twine smarter, so that it can do a better job even when there is uncertainty about the content and structure of a page. As Marshall points out, this is a hard problem because there is so much non-standard content on the Web, but it’s not an insurmountable one. Twine will steadily improve on this front.

  • Twine doesn’t recognize article authors as related people.

Actually, if Twine can see the author’s name, it will recognize the author as a related person. But the author’s name is not always visible on the article. This would be easier to manage if there were better metadata on pages; until that happens, the natural-language approach is the main option, and it is not always perfect.

  • Marshall mentions that he thinks Twine could be better organized.

Marshall mentions that he had a hard time getting oriented and finding his way through the application because there is so much there. One of the challenges we have is simply educating users about how to use Twine and what it is capable of. In addition, there are many improvements we know we can make to the user interface and information design to make it easier to figure out.

Marshall also asked for RSS feeds and visualizations.

RSS output is already supported to a limited extent and we will have more support for it next month. We are also planning to add RSS input as well in coming months.

Regarding visualizations, we’ve done a lot of work on visualizations in the past. Our feeling is that they usually don’t add much value beyond being eye-candy. However, we will eventually be opening up our APIs to allow others to make all the visualizations they want. If someone makes a really useful one, perhaps we’ll include it back into Twine.

Finally, I would also like to correct one thing that Marshall mentioned: We are not in fact going into general release next month — we are just starting to let more people in from our waiting list to continue to help beta test Twine. There will still be a members-only policy in effect for several more months. The full public opening (when Twine will be opened to non-member guests, and search engines, etc.) will be in the summer timeframe. Even then, Twine will still be in beta. There is a good year of additional work to do on Twine before it will be fully “baked,” to use Marshall’s term. Between now and that time we will be working to improve (and finish) the app, in partnership with our beta community.

In closing, as I have said many times, Twine is still an early Beta and we have to keep expectations in line with reality. Twine is already far beyond what any other semantic app I know of is capable of, but that still isn’t good enough. We have to push further and focus more on usability. We are opening it up early in order to get feedback, more help testing, and guidance on the direction of the app from users.

Hopefully as we work on Twine further, and we move out of Beta, Twine will eventually meet Marshall’s high expectations. Meanwhile, his comments are helpful in that they do give us feedback about what aspects of Twine we need to focus on more as we head towards a more consumer-friendly application.

Do You Want Early Access to the Twine Beta?

Special offer to readers of my blog…

There are now well over 30,000 users in the queue to get into the Twine beta. We’re going to start letting people in from the waiting list in waves and it should take about a month or two to let everyone in.

But what good is a waiting list if there’s no way to cut to the front, right? Fortunately, there is a way to skip ahead to the front of the line…

Write a blog post about Twine on your blog explaining why you want early access, and send the link to nova (at) radarnetworks (dot) com, along with your first name, last name, and email address. If I like your post, I’ll get you an early-access VIP pass to the front of the line.

See you in Twine!

Insightful Article About Twine

Carla Thompson, an analyst for Guidewire Group, has written what I think is a very insightful article about her experience participating in the early-access wave of the Twine beta.

We are now starting to let the press in, and next week we will begin letting in waves of people from our waiting list of over 30,000 users. We will continue letting people into the beta in waves every week going forward.

As Carla notes, Twine is a work in progress and we are mainly focused on learning from our users now. We have lots more to do, but we’re very excited about the direction Twine is headed in, and it’s really great to see Twine getting so much active use.


How about Web 3G?

I’m here at the BlogTalk conference in Cork, Ireland with a range of bloggers and technologists discussing the emerging social Web. In addition to myself, Ian Davis, and Paul Miller from Talis, there are a number of other Semantic Web folks here, including Dan Brickley and a group from DERI Galway.

Over dinner a few of us were discussing the terms “Semantic Web” versus “Web 3.0” and we all felt a better term was needed. After some thinking, Ian Davis suggested “Web 3G.” I like this term better than Web 3.0 because it loses the “version number” aspect that so many objected to. It has a familiar ring to it as well, reminding me of the 3G wireless phone initiative. It also suggests Tim Berners-Lee’s “Giant Global Graph” or GGG — a synonym for the Semantic Web. Ian stayed up late and put together a nice blog post about the term, echoing many of my own sentiments about how this term should apply to a decade (the third decade of the Web), rather than to a particular technology.