Privacy and Search

The recent negative hype about the lack of privacy in search results got me thinking about the needs of online services versus those of individuals. Is there a way to satisfy both constraints?

AOL’s accidental data release was one thing that worried me. Google’s "personal search" feature, where the log of all your searches is displayed, was another. The fact that everything you search for and click on, during your entire life, could potentially be logged, owned, accessed, and shared, by and with parties other than yourself, without your consent or even your knowledge, is a step towards a world I wouldn’t want to live in.

The arguments in favor of allowing this to continue hinge either on commercial needs or on homeland security and law enforcement. Regarding commercial needs: just as in other situations where commercial needs pose risks to individual privacy (medical records, for example), the government needs to step in and regulate if the industry can’t do an adequate job of self-regulating. Industry self-regulation can easily become a case of wolves guarding sheep, so it has to be carefully overseen by government on a meta-level. As for the needs of homeland security and law enforcement, access should be strictly regulated (and in theory, it already is).

The thing is, even if governments and industry stepped up and took responsibility for regulating this situation, one can never be sure that future regime change, accidents, or individuals or groups with both access and a motive won’t lead to future privacy violations. As a result, even ironclad assurances, laws, and strict procedures by organizations and governments won’t protect anyone against such unknowns. The only truly safe solution is one that puts all of the control, and all of the responsibility and liability, for one’s own private data in one’s own hands. In a digital world, where everything is potentially recorded and logged forever, this is really important.

The solution is, I think, that individuals, rather than search companies, should own and control their searchstreams and their clickstreams, so that they can use that information for their own personalization needs and can selectively (and either authentically or anonymously) share it with other services if and when they want to. Someone should build an infrastructure that enables this and then expose it as an API that all services and apps can use. The folks at Attention Trust and Root Markets are on the right track. This is a very interesting business opportunity.
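To make the idea a bit more concrete, here is a minimal sketch in Python of what a user-held store might look like. Everything in it is my own assumption for illustration: the `PersonalVault` class, its method names, and the per-service pseudonym scheme are hypothetical and are not taken from Attention Trust, Root Markets, or any existing API. It only shows the shape of the thing: the individual records their own searches and clicks, decides which services may see which kinds of events, and chooses whether to share under their real identity or under a pseudonym.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field


@dataclass
class PersonalVault:
    """Hypothetical user-held store for one person's searches and clicks."""

    owner: str
    # Secret known only to the owner; used to derive per-service pseudonyms.
    secret: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    events: list = field(default_factory=list)
    # Maps a service name to the set of event kinds it is allowed to see.
    grants: dict = field(default_factory=dict)

    def record(self, kind, value):
        """Log an event locally, e.g. kind='search' or kind='click'."""
        self.events.append({"kind": kind, "value": value})

    def grant(self, service, kinds):
        """Allow a service to see only the listed kinds of events."""
        self.grants[service] = set(kinds)

    def revoke(self, service):
        """Withdraw a service's access entirely."""
        self.grants.pop(service, None)

    def pseudonym(self, service):
        """Stable per-service ID, so two services cannot match up the same user."""
        digest = hmac.new(self.secret, service.encode(), hashlib.sha256)
        return digest.hexdigest()[:16]

    def share(self, service, anonymous=True):
        """Return only what this service has been granted, under the chosen identity."""
        allowed = self.grants.get(service, set())
        identity = self.pseudonym(service) if anonymous else self.owner
        visible = [dict(e) for e in self.events if e["kind"] in allowed]
        return {"user": identity, "events": visible}
```

In use, a person might call `vault.grant("some-search-engine", ["search"])` and then hand over `vault.share("some-search-engine")`, which would contain their searches but not their clicks, keyed to a pseudonym rather than their name. A real system would also need storage, authentication, and auditing, none of which this sketch attempts.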

I would like to see a search engine and a search toolbar for Firefox that enable you to search anonymously. I did a little research (on Google, how ironic) and found Proxify, Kaxy and Mezzy. They seem interesting, although perhaps a little clunky. What we need is a high-profile, really polished, professional, well-funded, simple anonymous proxy for Google. And a Firefox toolbar to go with it.

If a service like what I am describing existed (and there was some level of independent audit that could assure me that it really didn’t capture or save anything private without my permission — for example if all the code was open source and vetted), then I would definitely always use it instead of going directly to Google. Does it exist already? Let me know. If not, someone should build it. In fact, I wouldn’t mind if it showed me ads, just like Google does. So it could make money from my searching. I would bring my business there as would most people who have educated themselves about this issue.

Finally, wearing my corporate hat for the moment, as someone building an online service in the search space: if there were a suitable (and that is the key term here…) way for the service my company is building to give individuals control of their private data, while still being able to learn from that data in aggregate and/or anonymously for individuals, that would be great. As an online service provider I don’t really want to have to worry about keeping such private information and all the overhead and potential liability that goes with it.

Online services do need to learn from the behavior of their users in order to personalize content, target ads, and so on. But they don’t necessarily need to house that data themselves, nor do they necessarily need to be able to key it to the real identities of their users. If there were an infrastructure that enabled my service to learn, personalize and target without having to hold and manage the dataset underlying that capability, that would actually be a potential savings to my business, a reduction of risk, and a benefit to my users. The thing is, while early attempts to enable this do exist, they aren’t mature enough to rely on, and nobody knows how well they will scale or whether they will have enough funding and traction to last. So in the meantime those of us building online services are in a gray area: we need certain features for our services to function well, and we would also like to find a way to protect privacy for the individual. This is the conundrum of the moment. It’s a business opportunity for someone out there.