The past, present and future of machine learning

17 October 2019

If you’ve noticed a surge of AI-powered products and services hitting the marketplace lately, you are not mistaken. Artificial Intelligence (AI) and machine learning technology have been developing rapidly in recent years, with possibilities growing in tandem with the greater availability of data and advancements in computing capability and storage solutions. In fact, if you look behind the scenes, you can spot many examples of machine learning technology already in practice in all kinds of industries—ranging from consumer goods and social media to financial services and manufacturing.
 
But the question remains: How did machine learning evolve from science fiction to reality in such a short period of time? After all, it was only in 1959 that the data scientist Arthur Lee Samuel successfully developed a computer program that could teach itself how to play checkers. To find the answer, let’s chart the course of machine learning’s development by taking a look at the past and present, and envisioning what might come next.
 
What is machine learning?
Machine learning (ML) is a subset of AI in which machines, enabled by trained algorithms and neural network models, are able to autonomously learn from data and continuously improve their performance and decision-making accuracy on a specific task.
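To make that definition concrete, here is a minimal, illustrative sketch of the learn-from-data idea in Python with scikit-learn (an assumed example, not part of the original article): the model is never given explicit rules, only labeled examples, and its accuracy on unseen data is the "performance" being improved.

```python
# Minimal sketch of learning from data with scikit-learn (illustrative only).
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load a small labeled dataset (images of handwritten digits).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The model infers its own decision rules from the training examples.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Accuracy on data the model has never seen measures how well it has "learned".
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```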

Machine Learning in the Past 

Can a machine exercise intelligence?
The origin of machine learning can be traced back to a series of profound events in the 1950s in which pioneering research established computers’ ability to learn. In 1950, the famous “Turing Test” was developed by the English mathematician Alan Turing to determine whether a machine can exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. In 1952, the data scientist Arthur Lee Samuel managed to teach an IBM computer program not only to learn the game of checkers but to improve the more it played. Then in 1957, the world’s first neural network for computers was designed by the American psychologist Frank Rosenblatt. From there, experimentation escalated. In the 1960s, Bayesian methods for probabilistic inference were introduced to machine learning. And in 1986, the computer scientist Rina Dechter introduced the term “Deep Learning,” now synonymous with multi-layered artificial neural networks, to the machine learning community.
 
Adopting a data-driven approach
It wasn’t until the 1990s that machine learning shifted from a knowledge-driven approach to the data-driven approach we are familiar with today. Scientists started creating computer programs that could analyze large quantities of data and learn from the results. It was during this period that support vector machines and recurrent neural networks rose in popularity. In the 2000s, kernel methods for pattern analysis, such as Support Vector Clustering, became prominent.
 
Hardware for efficient processing
The next momentous development that helped enable machine learning as we know it today was the hardware advancement of the early 2000s. Graphics processing units (GPUs) were developed that could not only speed up algorithm training significantly—from weeks to days—but could also be used in embedded systems. In 2009, researchers began using Nvidia GPUs to train deep neural networks, and the Google Brain team later famously demonstrated networks that could learn to recognize cats in unlabeled YouTube images. Once deep learning became demonstrably feasible, a promising new era of AI and machine learning for software services and applications could begin.

Machine Learning in the Present 

Big demand for GPUs
Today, the demand for GPUs continues to rise as companies from all kinds of industries seek to put their data to work and realize the benefits of AI and machine learning. Examples of machine learning applications in use today include medical diagnosis, predictive machine maintenance, and targeted advertising.
 
However, when it comes to applying machine learning models in the real world, there’s a certain stumbling block that is hindering progress. And that stumbling block is called latency.
 
Edge machine learning
Most companies today store their data in the cloud. This means that data has to travel to a central data center—often located thousands of miles away—to be run through a machine learning model before the resulting insight can be relayed back to the device of origin. This is a critical, and even dangerous, problem in use cases such as fall detection, where time is of the essence.
 
The problem of latency is what is driving many companies to move from the cloud to the edge today. “Intelligence on the edge,” “Edge AI,” or “edge machine learning” means that, instead of being processed by algorithms located in the cloud, data is processed locally by algorithms running on the hardware device itself. This not only enables real-time operation, but also significantly reduces the power consumption and security vulnerabilities associated with processing data in the cloud.
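As a rough illustration of what "processing data locally" can look like, the sketch below runs inference on the device with TensorFlow Lite. It is an assumed example rather than the article's own: the model file name (model.tflite) is a hypothetical placeholder, and a random array stands in for a real sensor reading.

```python
# Minimal sketch of on-device ("edge") inference with TensorFlow Lite.
import numpy as np
import tensorflow as tf

# Load the model once at start-up; no network round-trip is needed afterwards.
interpreter = tf.lite.Interpreter(model_path="model.tflite")  # hypothetical file
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run inference locally; random values stand in for a real sensor reading.
sample = np.random.rand(*input_details[0]["shape"]).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], sample)
interpreter.invoke()

prediction = interpreter.get_tensor(output_details[0]["index"])
print(prediction)
```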
 
Solving power constraint issues 
As we move towards applying AI and edge machine learning to smaller and smaller devices and wearables, resource constraints are presenting another major roadblock. How can we run machine learning applications on tiny devices without sacrificing performance and accuracy?
 
While moving from the cloud to the edge is a vital step in solving resource constraint issues, many machine learning models still use too much computing power and memory to fit on the small microprocessors available on the market today. Many are approaching this challenge by creating more efficient software, algorithms and hardware, or by combining these components in specialized ways.
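One widely used example of the "more efficient software" approach is post-training quantization, sketched below with the TensorFlow Lite converter. This is an illustrative assumption rather than the article's own recipe: the saved_model/ path is a hypothetical placeholder, and the exact savings depend on the model.

```python
# Minimal sketch of post-training quantization with the TensorFlow Lite converter.
import tensorflow as tf

# Load a trained model from a (hypothetical) SavedModel directory.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")

# Default optimizations quantize weights to 8-bit integers, shrinking the
# model roughly 4x versus float32 so it fits smaller memories.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Write out the compact model for deployment on a small device.
with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```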

Machine Learning in the Future 

So what’s next? The future of machine learning is continuously evolving as new developments and milestones are achieved in the present. While that makes it difficult to offer precise predictions, we can identify some key trends.
 
Unsupervised machine learning
In the majority of AI and machine learning projects today, the tedious process of sorting and labeling data takes up the bulk of development time. In fact, the analyst firm Cognilytica estimated that in the average AI project, about 80% of project time is spent aggregating, cleaning, labeling, and augmenting the data to be used in machine learning models.
 
This is why the prospect of unsupervised learning is so exciting. In the future, more and more machines will be able to independently identify previously unknown patterns within data sets that have not been labeled or categorized. Unsupervised learning is particularly useful when you do not know what the outcome should be: for example, analyzing consumer data to determine the target market for a new product, or detecting anomalies such as fraudulent transactions or malfunctioning hardware.
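As an illustrative sketch of that anomaly-detection use case, scikit-learn's IsolationForest can flag unusual records without ever seeing fraud labels. The transaction data below is synthetic, invented purely for the example.

```python
# Minimal sketch of unsupervised anomaly detection with an Isolation Forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Mostly "normal" transactions (amount, hour of day), plus a few injected outliers.
normal = rng.normal(loc=[50.0, 12.0], scale=[20.0, 3.0], size=(500, 2))
outliers = np.array([[900.0, 3.0], [1200.0, 4.0], [750.0, 2.0]])
transactions = np.vstack([normal, outliers])

# The model is never told which rows are fraudulent; it learns what "typical"
# looks like and flags points that do not fit (-1 = anomaly, 1 = normal).
detector = IsolationForest(contamination=0.01, random_state=0)
labels = detector.fit_predict(transactions)

print("Flagged as anomalies:", np.where(labels == -1)[0])
```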
 
Hardware acceleration for Edge machine learning
A new generation of purpose-built accelerators is emerging as chip manufacturers and startups work to speed up and optimize the workloads involved in AI and machine learning projects—ranging from training to inference. Faster, cheaper, more power-efficient, and scalable, these accelerators promise to boost edge devices to a new level of performance.
 
One of the ways they achieve this is by relieving edge devices’ central processing units of the complex and heavy mathematical work involved in running deep learning models. What does this mean? Get ready for faster predictions.
 
Scaling up
In the future, the much-talked-about Internet of Things will become increasingly tangible in our everyday lives, especially as AI and machine learning technology continues to become more affordable. However, as the number of AI devices increases, we will need to ensure we have an infrastructure to match. As Drew Henry, Senior Vice President of Strategy Planning & Operations at Arm, put it in a recent article:
 
“The world of one trillion IoT devices we anticipate by 2035 will deliver infrastructural and architectural challenges on a new scale…our technology must keep evolving to cope. On the edge computing side, it means Arm will continue to invest heavily in developing the hardware, software, and tools to enable intelligent decision-making at every point in the infrastructure stack. It also means using heterogeneous [computation] at the processor level and throughout the network—from cloud to edge to endpoint device.”
 
When we look at history and where we are today, it appears that the evolution of edge machine learning is fast and unstoppable. As future developments continue to unfold, prepare for impact and make sure you're ready to seize the opportunities this technology brings.