As part of my role at the Digital Catapult I recently participated in The Royal Society’s investigation of machine learning, and this is the blog post I contributed there. I give a little background to where machine learning is today, but then claim that we have a bigger set of challenges around systems that make decisions, regardless of whether they use machine learning algorithms or more conventional software.
Machine learning grew from the broader field of Artificial Intelligence, and gradually became a significant discipline in its own right. Machine Learning (ML) algorithms “learn” from examples and improve from experience, rather than being explicitly programmed. But why the sudden excitement? Because, over the last few years, ML has advanced to the point of making significant impact in our lives and in business.
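To make “learning from examples” concrete, here is a toy sketch (my illustration, not drawn from any real system): a one-nearest-neighbour classifier that is never given explicit rules, only labelled examples, and classifies new cases by similarity to what it has seen. The animal data is entirely made up.

```python
# Toy "learning from examples": a 1-nearest-neighbour classifier.
# No rule is ever programmed in; predictions come from labelled data.

def nearest_neighbour(examples, point):
    """Predict the label of `point` from labelled (features, label) examples."""
    def distance(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(examples, key=lambda ex: distance(ex[0], point))
    return closest[1]

# "Training" data: (height_cm, weight_kg) -> species (made-up numbers).
training = [
    ((30, 4), "cat"),
    ((35, 5), "cat"),
    ((60, 25), "dog"),
    ((70, 30), "dog"),
]

print(nearest_neighbour(training, (33, 4.5)))  # -> cat
print(nearest_neighbour(training, (65, 28)))   # -> dog
```

Adding more labelled examples improves the predictions without changing a line of code, which is the essential shift from explicit programming to learning from data.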
The highest-profile developments have been in “deep learning”: new architectures of neural networks that, combined with the huge volumes of examples to learn from (“training” data) now readily available and large-scale cloud processing power, have made spectacular progress in many areas, ranging from recognition of images and natural language to more fun demonstrations like learning to play video games, or challenging the world’s top Go player.
It is no coincidence that the big five “GAFAM” internet companies (Google, Apple, Facebook, Amazon and Microsoft) are racing to acquire companies and scientists globally, including the $400m+ acquisition of DeepMind by Google and the recent $250m acquisition of SwiftKey by Microsoft, both in the UK. They all have the necessary huge sets of training data, and they all have immediate ways to monetise any new developments with improvements to their existing services (such as Google Translate translating text from smartphone cameras, Microsoft’s Skype Translator, Facebook’s face recognition or Twitter’s porn detection).
These companies have a stranglehold on large-scale data in many areas, particularly consumer behaviour on web and mobile, and photographs and videos (1.8B images were being uploaded every day even back in 2014). This strategic advantage is so great that they have been happy to open-source sophisticated libraries and toolkits for ML projects, such as TensorFlow from Google and Torch, contributed to by Facebook, and even hardware designs, while Amazon contributed to the $1B invested to found Elon Musk and Sam Altman’s new non-profit research company OpenAI.
The Royal Society asked whether certain sectors have particularly strong uses for machine learning. Our view is that any sector where decisions are based on data can make use of all kinds of data science techniques, including ML, and these days that includes most sectors! Recent examples range across medical diagnosis, funny cartoons, image recognition, exam revision, financial trading, background checking and robot tasks.
Many of the really exciting opportunities for companies, and where machine learning will affect all of us, are with systems that become more autonomous, where an algorithm is making the decisions — often decisions about people. The move is from humans making decisions (perhaps using rules of thumb, gut instinct or behaviours handed down and learned), to humans making decisions based on data, with increasingly sophisticated analytics, and finally to machines making the decisions.
We’re particularly interested in the cases where these decisions affect our lives in significant ways. Bruce Schneier has called this development the “World-Sized Web” — imagining all the sensors globally as the eyes and ears of this “robot” and all of the actuators (including autonomous drones, or car navigation systems that can re-route traffic) as its hands and feet.
We can see an analogue with how online businesses have deepened their use of data. We started with simple streams of raw information, like counts of visitors and page views, as a minor influence on human decision making, alongside existing experience from retail or marketing. Over time the data became more sophisticated, and we could track repeat visits, correlate with purchasing both on and offline, and tie in to channels for marketing, promotion and advertising.
However, the decision making, although now very data-driven, was still largely a human activity. Finally we reach a stage where the system can become fully automated: systems run experiments that compare personalised variants of messages or ads, and optimise over time.
Many use variants of the “multi-armed bandit” algorithms that would be considered a form of ML, in particular “reinforcement” learning that adapts based on rewards or punishments. Complex recommender systems feed merchandising decisions on Amazon, sort posts in the Facebook news feed and tailor our media consumption on Netflix.
So how do I decide whether to cede control to a machine learning system? I would say there are some quite different scenarios.
We can give agency to an algorithm to directly act on our behalf. There’s a spectrum of risk. I can give my smart thermostat control over the temperature in my house. I can give my pension to a financial trading algorithm, or indeed a poker bot. As an Uber driver I can obey instructions from a central routing algorithm. The very visible ethical battleground currently is with autonomous cars and the trolley problem, but in fact these systems will become more pervasive and less visible than the robot cars soon to be seen on our high streets.
There are also algorithms whose decisions, using our personal data, directly affect us, but that aren’t visible to us or under our control: the medical diagnosis tool used by your doctor, the risk assessment algorithm used by your bank, and even the Facebook news feed, which can affect our individual moods.
And there are the fuzzier boundaries of this space, where algorithms make decisions that indirectly affect us, as a community or a population. A large group using the same navigation app, such as Waze, can all be re-routed.
As the toolkits for ML become more commonly used, we will see the work move from science to engineering. Currently, working with machine learning techniques is a rare and valuable skill among software engineers, and typically requires higher-level training in computer science and statistics. The idea of algorithms that need to be trained, that learn, and whose behaviour changes over time poses challenges to commonly used workflows for developing, testing, deploying and maintaining software.
We expect that as “black box” machine learning libraries become more commonly used, along with ever easier and cheaper bursts of cloud computing, they will integrate themselves into the day-to-day skills of a software engineer.
As the dominant designs emerge for architectures embedding machine learning within a broad range of software projects, and as the field of machine learning progresses, these decision making systems will improve, and so will be employed in more and more contexts. We believe that user control of personal data and what it’s used for is an important principle.
The corresponding question with algorithms will be: what is making decisions about me, and can I understand how and why? This is a timely debate. Many popular machine learning methods will learn complex statistical models, or patterns of weights and connections in a neural network, whose resulting decisions will be very hard to understand in an individual case.
According to Wired Magazine, even within Google there has been reluctance to surrender control of search rankings to a learning system that is hard to adjust: With machine learning, he [Amit Singhal (Head of Google’s search engine ranking team)] wrote, the trouble was that “…it’s hard to explain and ascertain why a particular search result ranks more highly than another result for a given query”. He added: “It’s difficult to directly tweak a machine learning-based system to boost the importance of certain signals over others”.
There is concern about how we’ll cope with the rise of “superintelligences” (and indeed a new £10m centre to study the Future of Intelligence has been announced between the universities of Oxford, Cambridge, Berkeley and Imperial College). However, we think the issues we’ve discussed will be upon us more quickly, with a vast number of “microintelligences” soaking into the software infrastructure powering our lives.
Although the recent advances in machine learning algorithms, particularly deep learning, are wonderful and astonishing, they are just part of the increasingly effective arsenal of data and decision making tools at our disposal. A better way to describe this new frontier is to talk about the move towards autonomous, decision making systems: algorithms that are becoming more embedded in our lives. As these algorithms move from their pure and innocent beginnings in research labs into the messy politics of the real world, we must remember that they are created for and by people in society. As Mark Rolston points out, it is up to us to develop human-centred solutions that resist corruption.
Originally published at whizzyideas.wordpress.com on February 16, 2016.