In this age of rapid technological advancement, most people are familiar with AI. We’ve likely all read the articles about automation destroying jobs, or a future of robots taking over the world. While those articles owe more to science fiction than to any tangible future of AI, they certainly keep interest in AI top-of-mind for many people.
There are dozens of examples of AI that everyday consumers may use, such as facial recognition, auto-correct, search engines, or social media algorithms. But have you ever wondered how these programs work?
AI runs off of algorithms, but not all AI algorithms are the same. They’re developed with different goals and methods. In this article, we’ll talk about the three major categories of AI algorithms and how they all work.
What is artificial intelligence?
Artificial intelligence is a branch of computer science concerned with creating machines that can think and make decisions independently of human intervention. AI programs can perform complex tasks that were previously possible only for humans. Some handle simple tasks, others far more complex ones, and some can take in new data to learn and improve entirely without the involvement of a human developer.
What is an AI algorithm?
So then what is an AI algorithm? The definition of an algorithm is “a set of instructions to be followed in calculations or other operations.” This applies to both mathematics and computer science. So, at its most essential level, an AI algorithm is the programming that tells the computer how to learn to operate on its own.
An AI algorithm is much more complex than what most people learn about in algebra, of course. A complex set of rules drives AI programs, determining their steps and their ability to learn. Without an algorithm, AI wouldn’t exist.
How do AI algorithms work?
While a general algorithm can be simple, AI algorithms are by nature more complex. AI algorithms work by taking in training data that helps the algorithm to learn. How that data is acquired and how it is labeled marks the key difference between the different types of AI algorithms.
At the core level, an AI algorithm takes in training data (labeled or unlabeled, supplied by developers or acquired by the program itself) and uses that information to learn and grow. Then it completes its tasks, using the training data as a basis. Some types of AI algorithms can learn on their own, taking in new data to change and refine their process. Others need a programmer’s intervention in order to improve.
Types of artificial intelligence algorithms
There are three major categories of AI algorithms: supervised learning, unsupervised learning, and reinforcement learning. The key differences between these algorithms are in how they’re trained, and how they function.
Under those categories, there are dozens of different algorithms. We’ll be talking about the most popular and commonly used from each category, as well as where they are commonly used.
Supervised learning algorithms
The first and most commonly used category of algorithms is supervised learning. These algorithms work by taking in clearly labeled data during training and using it to learn and grow. They use the labeled data to predict outcomes for other data. The name “supervised learning” comes from the comparison to a student learning in the presence of a teacher or expert.
Building a supervised learning algorithm that actually works takes a team of dedicated experts to evaluate and review the results, not to mention data scientists to test the models the algorithm creates to ensure their accuracy against the original data, and catch any errors from the AI.
Definitions: Classification and Regression
Below, we jump into explaining the different types of supervised learning algorithms. Each of them can be used for classification, regression, or both.
Classification means an either/or result encoded in binary (0 = no, 1 = yes), so the algorithm classifies each item as one or the other, never both. There is also multi-class classification, which deals with organizing data into several defined categories or types relevant to a specific need.
Regression means the result will be a real number (whole or decimal). You usually have a dependent variable and an independent variable, and the algorithm uses both to estimate another possible result (either a forecast or a generalized estimate).
Decision trees
One of the most common supervised learning algorithms, decision trees get their name from their tree-like structure (even though the tree is inverted). The “root” of the tree is the training dataset, and it leads to specific nodes, each denoting a test attribute. Nodes often lead to further nodes, and a node that doesn’t lead onward is called a “leaf.”
Decision trees classify all the data into decision nodes. They use a selection criterion called an Attribute Selection Measure (ASM), which weighs various measures such as entropy, gain ratio, and information gain. Using the root data and following the ASM, the decision tree sorts the data it is given into sub-nodes until it reaches a conclusion.
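To make that concrete, here is a minimal sketch (in Python, with a made-up toy dataset) of two of the measures named above, entropy and information gain:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into `groups` (lists of labels)."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Toy example: a split on an attribute that separates the classes perfectly.
labels = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
print(information_gain(labels, split))   # 1.0: the split removes all uncertainty
```

A split with higher information gain is preferred when choosing which attribute to test at each node.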
Random forest
The random forest algorithm is actually a broad collection of different decision trees, hence the name. A random forest builds many decision trees and combines their outputs to reach more accurate results. Random forests can be used for both classification and regression.
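Here is a rough sketch of the voting idea. The “trees” below are hand-written stand-ins rather than trees trained on data, and the feature names (`exclaims`, `links`) are invented for illustration; a real random forest trains each tree on a random sample of the data with random subsets of features:

```python
from collections import Counter

# Hypothetical stand-in "trees": each maps a sample to a class label.
def tree_a(x): return "spam" if x["exclaims"] > 3 else "ham"
def tree_b(x): return "spam" if x["links"] > 2 else "ham"
def tree_c(x): return "spam" if x["exclaims"] + x["links"] > 4 else "ham"

def forest_predict(trees, x):
    """Classify by majority vote across the ensemble."""
    votes = Counter(t(x) for t in trees)
    return votes.most_common(1)[0][0]

sample = {"exclaims": 5, "links": 1}
print(forest_predict([tree_a, tree_b, tree_c], sample))  # "spam" (2 of 3 votes)
```

Because the trees disagree in different places, the majority vote tends to be more accurate than any single tree.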
Support Vector Machines
The support vector machine (SVM) algorithm is another common AI algorithm that can be used for either classification or regression (but is most often used for classification). SVM works by plotting each piece of data as a point in N-dimensional space (where N is the number of features). Then, the algorithm classifies the data points by finding the hyperplane that separates each class. There can be more than one candidate hyperplane; SVM picks the one with the widest margin between the classes.
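Here is a minimal sketch of the idea, assuming a linear SVM trained by sub-gradient descent on the hinge loss. This is purely illustrative (no kernels, no proper solver); libraries such as scikit-learn do this far more robustly:

```python
# Minimal linear SVM for 2-D points, trained on the hinge loss.
def train_svm(points, labels, lr=0.01, lam=0.01, epochs=500):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):   # y is +1 or -1
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:                        # inside the margin: push the hyperplane
                w[0] += lr * (y * x1 - lam * w[0])
                w[1] += lr * (y * x2 - lam * w[1])
                b += lr * y
            else:                                 # otherwise only shrink the weights
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

# Two linearly separable clusters.
points = [(1, 1), (2, 1), (1, 2), (6, 6), (7, 6), (6, 7)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_svm(points, labels)
print([predict(w, b, p) for p in points])   # predicted labels for the training points
```

The learned pair `(w, b)` defines the separating hyperplane; any new point is classified by which side of it the point falls on.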
Naive Bayes
The reason this algorithm is called “Naive Bayes” is that it’s based on Bayes’ Theorem and relies heavily on a large assumption: that the presence of one particular feature is unrelated to the presence of other features in the same class. That major assumption is the “naive” aspect of the name.
Naive Bayes is useful for large datasets with many different classes. It, like many other supervised learning algorithms, is a classification algorithm.
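Here is a toy sketch of the counting behind Naive Bayes for text classification. The training emails are made up, and add-one (“Laplace”) smoothing keeps unseen words from producing zero probabilities:

```python
from collections import Counter, defaultdict
from math import log

# Invented toy training set: (text, label) pairs.
train = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)          # per-class word frequencies
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for text, _ in train for w in text.split()}

def predict(text):
    best, best_score = None, float("-inf")
    for label in class_counts:
        # log P(class) + sum of log P(word | class), with add-one smoothing
        score = log(class_counts[label] / len(train))
        total = sum(word_counts[label].values())
        for word in text.split():
            score += log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

print(predict("free money"))   # "spam"
```

Working in log-probabilities avoids numeric underflow when there are many words; the “naive” independence assumption is what lets us simply add the per-word terms.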
Linear regression
Linear regression is a supervised learning AI algorithm used for regression modeling. It’s mostly used for discovering the relationship between data points, predictions, and forecasting. Much like SVM, it works by plotting pieces of data on a chart, with the X-axis as the independent variable and the Y-axis as the dependent variable. A line is then fitted through the data points to determine their relationship and forecast possible future data.
Logistic regression
A logistic regression algorithm usually uses a binary value (0/1) to estimate values from a set of independent variables. The output of logistic regression is either 1 or 0, yes or no. An example of this would be a spam filter in email. The filter uses logistic regression to mark whether an incoming email is spam (1) or not (0).
Logistic regression is only useful when the dependent variable is categorical, an either/or outcome.
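Here is a tiny sketch of how such a filter might learn, assuming a single invented feature (a count of suspicious words per email, with 1 = spam, 0 = not spam) and plain gradient descent; the data is made up:

```python
from math import exp

def sigmoid(z):
    """Squash any real number into the 0..1 range (a probability)."""
    return 1 / (1 + exp(-z))

def train(xs, ys, lr=0.1, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w += lr * (y - p) * x      # gradient ascent on the log-likelihood
            b += lr * (y - p)
    return w, b

# Suspicious-word counts per email, with labels (1 = spam, 0 = not spam).
counts = [0, 1, 1, 2, 5, 6, 8, 9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train(counts, labels)
print(1 if sigmoid(w * 4 + b) >= 0.5 else 0)   # classify an email with 4 hits
```

The model outputs a probability; thresholding it at 0.5 turns that into the binary yes/no decision described above.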
Unsupervised learning algorithms
It may at this point be relatively easy to guess what unsupervised learning means in comparison to supervised learning. Unsupervised learning algorithms are given data that isn’t labeled. They use that unlabeled data to create models and evaluate the relationships between different data points, giving more insight into the data.
Many unsupervised learning algorithms perform clustering: they sort the unlabeled data points into a pre-set number of clusters. The goal is for each data point to belong to exactly one cluster, with no overlap. A cluster can contain many data points, but a data point cannot belong to more than one cluster.
K-means clustering
K-means is an algorithm designed to perform the clustering function in unsupervised learning. It takes the pre-determined number of clusters (the “K”) and plots out all the data regardless of cluster. It then takes a randomly selected data point as the centroid of each cluster (think of a circle around each cluster, with that data point as the exact center). From there, it sorts the remaining data points into clusters based on their proximity to each cluster’s centroid, recomputing the centroids until the clusters settle.
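Those steps can be sketched in a few lines, here for one-dimensional data. The points and the choice of K are invented; real implementations (such as scikit-learn’s KMeans) add smarter initialization:

```python
import random

# Bare-bones 1-D k-means: pick random centroids, assign each point to the
# nearest one, recompute the centroids, and repeat until nothing changes.
def kmeans(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        new = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:      # assignments are stable: done
            break
        centroids = new
    return centroids, clusters

points = [1, 2, 3, 20, 21, 22]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))   # [2.0, 21.0]
```

On this toy data the two clusters are obvious; the algorithm finds the same grouping regardless of which points the random initialization picks.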
Gaussian mixture model
Gaussian mixture models are similar to K-means clustering in many ways. Both are concerned with sorting data into pre-determined clusters based on proximity. However, Gaussian models are a little more versatile in the shapes of the clusters they allow.
Picture a graph with all your data points plotted out. K-means clustering only allows circular clusters, with the centroid at the center of each. Gaussian mixture models can handle data that falls on the graph in more elongated patterns, allowing for oblong, elliptical clusters. This allows for greater clarity in clustering if one data point would otherwise land inside the circle of another cluster.
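Here is a minimal one-dimensional, two-component sketch of the expectation-maximization (EM) cycle behind a Gaussian mixture. The variance is held fixed for brevity, and the data and starting means are invented; real implementations fit full covariances in many dimensions:

```python
from math import exp, pi, sqrt

def gauss(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return exp(-((x - mu) ** 2) / (2 * var)) / sqrt(2 * pi * var)

def fit_gmm(data, mu1, mu2, var=1.0, steps=50):
    w1 = w2 = 0.5                      # mixing weights
    for _ in range(steps):
        # E-step: responsibility of component 1 for each point (soft assignment)
        r = [w1 * gauss(x, mu1, var) /
             (w1 * gauss(x, mu1, var) + w2 * gauss(x, mu2, var)) for x in data]
        # M-step: re-estimate means and weights from the soft assignments
        mu1 = sum(ri * x for ri, x in zip(r, data)) / sum(r)
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / (len(data) - sum(r))
        w1 = sum(r) / len(data)
        w2 = 1 - w1
    return mu1, mu2

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
m1, m2 = fit_gmm(data, mu1=0.0, mu2=6.0)
print(m1, m2)   # means settle near 1.0 and 5.0
```

Unlike k-means’ hard assignments, every point gets a fractional responsibility from each component, which is what lets the clusters take more flexible shapes.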
Both supervised and unsupervised algorithms
Some AI algorithms can use either supervised or unsupervised data input and still function. They may have slightly different applications depending on which mode they’re used in.
K-nearest neighbor algorithm
The K-nearest neighbor (KNN) algorithm is a simple AI algorithm that assumes similar data points sit in proximity to each other. It plots all the data points out on a graph to show the relationships between them, then calculates the distances between points to extrapolate those relationships.
In supervised learning, it can be used for either classification or regression applications. In unsupervised learning, it’s popularly used for anomaly detection; that is, finding data that doesn’t belong and removing it.
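Here is a minimal classification sketch, using 2-D points and Euclidean distance; the cat/dog data is invented for illustration:

```python
from collections import Counter

def knn_classify(known, query, k=3):
    """known: list of ((x, y), label) pairs; query: an (x, y) point."""
    # Sort the labeled points by squared distance to the query.
    by_distance = sorted(
        known,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

known = [((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
         ((8, 8), "dog"), ((8, 9), "dog"), ((9, 8), "dog")]
print(knn_classify(known, (2, 2)))   # "cat": the nearest neighbors are all cats
```

For anomaly detection, the same distances can be used the other way around: a point far from all its neighbors is a candidate outlier.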
Neural networks
“Neural network” is a term for a collection of AI algorithms that mimic the functions of a human brain. These tend to be more complex than many of the algorithms discussed above and have applications beyond the ones discussed here. In both unsupervised and supervised learning, they can be used for classification and pattern recognition.
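As a tiny illustration of the structure, here is a hand-wired two-layer network computing XOR, a function no single neuron can represent. The weights are set by hand here; in practice they are learned from training data (for example, by backpropagation):

```python
def step(z):
    """A simple activation: fire (1) if the weighted input is positive."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: two neurons with fixed weights and biases.
    h1 = step(x1 + x2 - 0.5)   # fires if at least one input is on
    h2 = step(x1 + x2 - 1.5)   # fires only if both inputs are on
    # Output neuron combines them: "at least one, but not both".
    return step(h1 - h2 - 0.5)

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

Stacking layers is what lets networks represent patterns, like XOR, that a single linear boundary cannot capture.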
Reinforcement learning algorithms
The last major type of AI algorithm is reinforcement learning algorithms, which learn by taking in feedback from the results of their actions. This feedback is typically in the form of a “reward.”
A reinforcement algorithm is usually composed of two major parts: an agent that performs an action, and the environment in which the action is performed. The cycle begins when the environment sends a “state” signal to the agent. That cues the agent to perform a specific action within the environment. Once the action is performed, the environment sends a “reward” signal to the agent, informing it of what happened, so the agent can update and evaluate its last action. Then, with that new information, it can act again. That cycle repeats until the environment sends a termination signal.
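The cycle above can be sketched with a toy environment: an agent walking a number line, rewarded for reaching position 3, which ends the episode. Everything here, including the deliberately naive random policy, is invented for illustration:

```python
import random

class LineEnv:
    """Toy environment: positions on a number line, with +3 as the goal."""
    def __init__(self):
        self.state = 0
    def step(self, action):                        # action: -1 (left) or +1 (right)
        self.state = max(self.state + action, 0)   # can't walk below zero
        done = self.state == 3
        reward = 1 if done else 0                  # the "reward" signal
        return self.state, reward, done            # state, reward, termination

env = LineEnv()
done, total_reward = False, 0
rng = random.Random(42)
while not done:
    action = rng.choice([-1, 1])           # a naive random policy; a learning
    state, reward, done = env.step(action) # agent would update itself here
    total_reward += reward
print(total_reward)                        # 1: the episode ends on reaching +3
```

The `step` call is the whole agent–environment contract: the agent sends an action in, and the state, reward, and termination signals come back.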
There are two types of reinforcement the algorithm can use: either a positive reward, or a negative one.
Definitions: Model, Policy, Value
In reinforcement algorithms, there are slightly different approaches depending on what is being measured and how it’s being measured. Here are some definitions of different models and measures:
- Policy: The approach the agent takes to determine its next action.
- Model: The situation and dynamics of the environment.
- Value: The expected long-term results. This is different from the reward, which is the result of a single action within the environment. The Value is the long-term result of many actions.
Value-based
In a value-based reinforcement algorithm, the agent pushes toward an expected long-term return instead of just focusing on the short-term reward.
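A classic value-based example is tabular Q-learning. Here is a sketch on a made-up number-line world (states 0 through 3, reward 1 for reaching state 3) in which the agent learns the long-term value of each state/action pair rather than chasing only the next reward:

```python
import random

states, actions = range(4), (-1, 1)
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2     # learning rate, discount, exploration rate
rng = random.Random(0)

for _ in range(500):                      # training episodes
    s = 0
    while s != 3:
        if rng.random() < epsilon:        # explore: try a random action
            a = rng.choice(actions)
        else:                             # exploit: take the best-known action
            a = max(actions, key=lambda x: Q[(s, x)])
        s2 = min(max(s + a, 0), 3)        # walk left/right, clamped to 0..3
        r = 1 if s2 == 3 else 0
        # Move the estimate toward: reward + discounted best future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

# After training, each non-terminal state should prefer moving right (+1).
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(3)}
print(policy)
```

The discount factor `gamma` is what makes this value-based: states closer to the goal earn higher values, so the agent learns to head toward the reward even from states where the immediate reward is zero.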
Policy-based
A policy-based reinforcement algorithm usually takes one of two approaches to determining the next course of action: either a deterministic approach, where a given state always produces the same action, or a stochastic approach, where possible actions are mapped to probabilities and each probability has its own policy response.
Model-based
In a model-based algorithm, the programmer creates a model of the dynamics of each environment. That way, when the agent is put into each different environment, it learns to perform consistently under each set of conditions.
Uses of AI algorithms
There are thousands of applications for AI systems and algorithms. We touched on what may seem like simple algorithms in this article, but even those have hundreds of possible applications. The possibilities are endless.
Some common uses of AI algorithms include:
- Data entry and classification
- Advanced or predictive analytics
- Search engines (Google, Yahoo, Bing, etc.)
- Digital assistants (Siri, Alexa, etc.)
- Robotics (assembly machines, self-driving cars, etc.)
AI algorithms and business applications
Now that you know about the different ways AI works and a little about the possible applications, it’s time to think about how you can use it in business. According to the 2021 Appen State of AI report, businesses need to adopt AI into their models or risk being left behind as the technology advances.
Tableau knows how important it is for businesses to stay on the cutting edge of analytics to ensure they can make the best steps forward at any given time. That’s why we developed AI analytics, to offer the best predictive analytics to our clients. Learn how Tableau helps customers succeed with AI analytics now.