AI, BI, and the Necessity of Automating the Analyst

It’s Time to Automate the Analyst

I have been speaking about the need for “automating the analyst” for several years. This need is prompted not only by the data deluge — the Cambrian explosion of data volume, velocity, and variety of data sources — but also by the simple reality that enterprises cannot hire the number of data scientists they need to adapt to this new environment.

In short, data science is hard: It requires human data scientists to do data preparation and analysis, but data scientists are becoming increasingly scarce and hard to get.

By 2018, IDC predicts there will be a shortage of nearly 300,000 data scientists and analysts, and nearly 1.5M managers with proficiency in data-driven decisions, in the USA alone.

Meanwhile, the decreasing costs of creating, storing and computing on vast amounts of data are enabling large enterprises to become increasingly data- and algorithm-driven.

These two trends, the data deluge and data scientist scarcity, are in direct opposition to each other, and the tension this is causing for the enterprise will soon reach a crisis point.

The growing scarcity of analysts is already a critical roadblock for many large enterprises. In coming years it will filter down to become a major problem for small and medium businesses as well.

The Evolution of the Artificial Analyst

I believe the inevitable solution to this problem is to develop an “artificial analyst” that can run in the cloud and replicate, or at least facilitate, much of what data scientists and analysts do.

But to accomplish this level of automation, we will have to push the boundary and innovate far beyond the state-of-the-art across several disciplines. It requires coupling the most advanced and flexible analytics capabilities with powerful natural language understanding and conversation technology, and AI that can learn from, reason about, and explain patterns, relationships and discoveries in complex data sets like a human would. It’s not easy to do this, but it can be done, and it would be disruptive to the entire BI industry today (in a good way).

The automation of data science is not without precedent. In fact, we can understand it by looking at similar disruptions in other areas of technology, for example the development of printing.

Printing began as an art around 3000 BC, with the use of clay tablets by craftsmen to reproduce images and texts on cloth. This evolved to woodblock printing and early primitive moveable type using letter blocks. Then it developed into a mechanized process with the invention of the Gutenberg printing press around 1439 AD.

This resulted in printing houses, journeyman printers and an entire profession and industry of printing that culminated in the invention of the photocopier and laser printer in the 1960s, the dot matrix printer in the 1970s, and the personal laser printer in the 1980s. Finally, today, we have print-on-demand services in the cloud. Anyone can print anything on anything, anywhere, anytime from any device.

Printing went from an art that belonged only to a highly skilled set of people, to something ubiquitous that anyone can do anywhere, anytime from their mobile phone.

Like printing, data science and analytics will also evolve from an art, to a science, to products, and finally to ubiquitous services that anyone can access themselves, without needing to rely on other people.

The Future of Automated Business Intelligence

It is not enough to simply give people the tools to do data science themselves – they need the tools to be automated. The tools themselves are part of the problem — they require too much skill to use.

The future of business intelligence isn’t merely to “democratize” data science and analytics — that is not sufficient: Giving everyone access to data science tools doesn’t make everyone a data scientist. Data science is hard.

The skills of analysts and data scientists must be embodied in software and automated to the point where they are intuitive and easy enough for anyone to harness them, without having to become analysts or data scientists themselves.

I believe that by 2030, decision makers will be able to interact in a natural and human way with artificial analysts that persist in the cloud and act as their intelligent personal agents.

These agents will be able to advise them with actionable insights from their data, using natural language conversation, visualization, simulation, data storytelling, and eventually even mixed reality interfaces that illustrate insights in a more immersive way.

Artificial analysts will use our analytics and data science tools on data sets that are too vast, and changing too fast, for us to make sense of on our own. They will work proactively, and will be able to report back to us and explain their findings, just as a human analyst would.

The artificial analyst will be cheap and scalable in the cloud, enabling every business decision maker to draw on the power of their own artificial analyst team.

Consumers will also be able to draw on artificial analysts for personal decision support, for example to make better financial, health, travel, or purchasing decisions.

It is both necessary and inevitable that the artificial analyst will emerge, and that everyone will be able to leverage the power of data science and analytics without having to become data scientists themselves.

Artificial analysts may eventually replace teams of human analysts and data scientists for the majority of analyst tasks, freeing up the minds of actual human analysts and data scientists to work on higher-level, more creative thinking.

The automation of data science will turn the tables on information overload for knowledge workers: There will finally be enough supply of analytics horsepower to meet demand. This will make the world better by helping everyone make better decisions faster and more often.

The rise of the artificial analyst will be a key catalyst for enabling all types of organizations to harness the power of analytics-driven decisioning: not just multi-billion dollar corporations with huge data science teams, but also small and medium-sized businesses with no analysts or data scientists at all, and even individual consumers.

Decisions should never be made poorly simply because there is a scarcity of human analyst labor to inform them.

Data scientist scarcity will be solved through automation, and I believe that this will be one of the most important, practical and useful applications of AI.

Automating the analyst is no small task. But nothing worth doing ever is.