Machine Learning (ML) and Artificial Intelligence (AI) are changing the landscape of computing around the world, allowing us to do things with computers that were previously thought to be nearly impossible. ML/AI systems are now able to hear, speak, see, touch and interact with the environment and people around them.
In this article, I'll treat ML and AI together as a single topic. ML is the computer learning subset of AI, but for this discussion, they can be thought of as one.
A high-level understanding of the layers of technology in ML/AI can help sort through the many options for developing and implementing an ML/AI system. ML/AI technology can be viewed as a three-level hierarchy:
Mathematics > Models > Applications
Each of these layers provides an essential set of elements needed for a successful ML/AI system. Let's start with the mathematics base and build from there...
Mathematics, which dates back to 3000 BC and basic arithmetic, is a field of study that uses formulas (sequences of symbols) to represent ideas and the real world. Sounds a bit abstract, right? It is, but think about it ... ML/AI by its nature is doing just that inside a computing device - representing ideas and the real world. So mathematics is naturally and ideally suited to the pursuits of ML/AI.
ML/AI derives its tremendous power from the use of mathematics to, among other things, analyze probabilistic situations and outcomes. For example, an object recognition model might return a probability of 0.73 that a given photo contains the image of a cat. When we humans see a cat, we're usually pretty sure it's a cat. However, the combination of our senses and brain has done complex, math-like analysis to reach that conclusion. Mathematics represents the calculations, estimations and processes needed to develop successful ML/AI models.
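To make the idea of a model returning a probability a bit more concrete, here is a minimal Python sketch of the softmax function, one common way of turning raw model scores into probabilities. The labels and score values are made up for illustration and don't come from any real model.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the largest score first so exp() cannot overflow.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for one photo; labels and numbers are invented.
labels = ["cat", "dog", "other"]
scores = [2.0, 0.5, 0.2]
probabilities = softmax(scores)

for label, p in zip(labels, probabilities):
    print(f"{label}: {p:.2f}")
```

The highest score always gets the highest probability, but how close that probability is to 1 depends on how far it stands out from the other scores.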
It's not necessary to understand all the math involved in an ML/AI system if you're using commercially supplied APIs or building on existing open source code. However, some understanding of the underlying math is often very useful and sometimes essential. One example is Stochastic Gradient Descent (SGD), an iterative mathematical method for finding a minimum or maximum of a function. It's used in a number of ML/AI models to iteratively improve the accuracy of their output, such as identifying objects in an input image. At a high level, the function that SGD works on looks like this:
$$Q(w) = \frac{1}{n} \sum_{i=1}^{n} Q_i(w)$$
The components of this equation are:
- Q( ): A function whose value is to be maximized or minimized
- n: The number of individual Q functions being averaged, typically one per training example
- /: Division by n, which turns the sum into the average of the individual Q function values
- i: An index identifying an individual Q function
- w: The parameter whose value is adjusted to find the minimum or maximum of the function Q
- summation: Adds up the values of all the individual Q function calculations and is represented by the symbol Σ
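The components above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the data is made up to follow y = 3x, each individual Q function is the squared error on one data point, and the learning rate and number of passes are arbitrary choices.

```python
import random

# Made-up data following y = 3 * x, so the best parameter is w = 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * x for x in xs]
n = len(xs)

def q_i(w, i):
    """Squared error on the i-th data point: one term of the sum."""
    return (w * xs[i] - ys[i]) ** 2

def q(w):
    """Q(w): the average of all n individual Q function values."""
    return sum(q_i(w, i) for i in range(n)) / n

def grad_q_i(w, i):
    """Derivative of the i-th individual Q function with respect to w."""
    return 2.0 * (w * xs[i] - ys[i]) * xs[i]

def sgd(w=0.0, learning_rate=0.01, epochs=50, seed=0):
    rng = random.Random(seed)
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the data points in random order
        for i in order:
            # Step against the gradient of a single term, not the full sum.
            w -= learning_rate * grad_q_i(w, i)
    return w

w_fit = sgd()
print(f"w = {w_fit:.4f}, Q(w) = {q(w_fit):.6f}")
```

The "stochastic" part is that each update looks at only one randomly chosen term of the sum rather than the whole average, which keeps each step cheap when n is large.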
Below is a list of some of the typical mathematical concepts and functions used in ML/AI. Wikipedia is a good source of articles to start delving into these topics:
Bayesian Probability and Statistics - Calculus - Classification - Cluster Analysis - Convolution - Deviation Analysis - Dimensional Analysis - Eigenvalues, Eigenvectors - Error Analysis, Accuracy, Precision, Sensitivity, Specificity - Functional Analysis, Activation Functions, Sigmoid Function, Rectified Linear Unit - Geometry, Geometric Transformations - Gradients, Stochastic Gradient Descent, Gradient Boosting - Graph Theory - Hyperparameter Optimization - Information Theory, Entropy, Cross Entropy - K-means Clustering - Linear Algebra - Logistic Regression - Loss/Cost Functions - Markov Chains - Mathematical Constants - Matrix Mathematics - Model Fitting, Underfitting, Overfitting, Regularization - Monte Carlo Algorithms - Pattern Recognition - Probability Theory - Regression Analysis, Linear, Non-Linear, Softmax - Sampling - Statistical Analysis, Bias, Correlation, Hypothesis Testing, Inference, Validation, Cross Validation - Time Series Analysis - Variation Analysis, Coefficient of Determination - Vector Spaces, Vector, Algebra, Scalars - Weights, Synaptic Weights
Models are the embodiment in computer code of the mathematical representations used to perform ML/AI functions. In our example of a computer recognizing a cat in a photo, the ML/AI model represents the layers of processing needed to differentiate the image of a cat from all other possibilities.
Below is a conceptual model of an Artificial Neural Network (ANN), one type of ML/AI mathematical model. Data (shown as lines) is passed forward, from left to right, between processing nodes (shown as circles). Numerical weights (w) are applied to individual data flows and biases (b) are applied to nodes in order to shape the output, such as the identification of a cat in an input image. Mathematical methods such as Stochastic Gradient Descent (discussed above) are used to adjust the weights and biases as data is repeatedly passed through the ANN.
In this diagram, the shapes and letters represent:
- i: input layer node
- h: hidden layer node
- o: output layer node
- w: weights applied to data going across layers
- b: biases applied to node values
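As a rough illustration of how these pieces fit together, here is a small Python sketch of a single forward pass through a tiny network with two input nodes, two hidden nodes and one output node. The layer sizes, the weight and bias values, and the choice of the sigmoid activation function are all assumptions made for this example. In a real ANN the weights and biases would be learned from data rather than typed in.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each node sums its weighted inputs, adds its bias,
    then applies the activation function."""
    outputs = []
    for node_weights, b in zip(weights, biases):
        total = sum(w * value for w, value in zip(node_weights, inputs)) + b
        outputs.append(sigmoid(total))
    return outputs

# Hypothetical weights (w) and biases (b); one row of weights per node.
W_HIDDEN = [[0.5, -0.6], [0.3, 0.8]]
B_HIDDEN = [0.1, -0.2]
W_OUTPUT = [[1.2, -0.7]]
B_OUTPUT = [0.05]

def forward(inputs):
    """Pass data from the input layer through the hidden layer to the output."""
    hidden = layer(inputs, W_HIDDEN, B_HIDDEN)
    return layer(hidden, W_OUTPUT, B_OUTPUT)

output = forward([1.0, 0.5])
print(f"output node value: {output[0]:.3f}")
```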
Below is a list of some of the mathematical models used in ML/AI. Wikipedia is a good source of basic information about these models:
Artificial Neural Networks - Association Rule Learning - Bayesian Networks - Decision Tree Learning - Deep Learning - Ensemble Learning - Hierarchical Clustering - Learning Classifier Systems - Learning to Rank - Long Short-Term Memory Neural Networks - Nearest Neighbors Algorithms - Recurrent Neural Networks - Reinforcement Learning - Sequence-to-Sequence Neural Networks - Similarity Learning - Sparse Dictionary Learning - Stochastic Neural Networks - Support Vector Machines - Unsupervised Learning
ML/AI applications use mathematical models, such as those discussed above, to perform meaningful tasks and produce meaningful results. Raw ML/AI results from mathematical models can be very interesting, but by themselves offer little utility. ML/AI applications provide that utility.
As an example, let's say we wanted to use our cat-detecting Artificial Neural Network to let users of our smartphone app take a photo of a cat and determine what breed it belongs to. Our development team would need to:
- Collect cat images from the internet
- Sort the images into known breeds
- Use the grouped images to train the ANN to recognize different breeds of cats
- Test the trained ANN on cat images it did not see during training
- Test the trained ANN on recognizing cats in images of many different types
- Decide on the minimum acceptable probability from the ANN for determining a cat is really a member of the indicated breed
- Maintain a table of the minimum acceptable probability for each breed of cat
- Develop the user interface functions that allow the app user to photograph their cat and get the determination of which breed the application thinks it belongs to
- Develop the server code to host the ANN and application software
- Develop the Application Programming Interfaces (APIs) to connect client applications with the server code
- Track ongoing performance results
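The two steps about minimum acceptable probabilities can be sketched as a simple lookup table in Python. The breed names, threshold values, fallback value and the `pick_breed` function are all hypothetical. A real table would be filled in from the team's own test results.

```python
# Hypothetical per-breed minimum probabilities; real values would come from testing.
MIN_PROBABILITY = {"Siamese": 0.80, "Maine Coon": 0.75, "Persian": 0.85}
DEFAULT_MIN_PROBABILITY = 0.90  # stricter fallback for breeds missing from the table

def pick_breed(breed_probabilities):
    """Return the most likely breed, or None if it falls below that breed's minimum."""
    if not breed_probabilities:
        return None
    breed, probability = max(breed_probabilities.items(), key=lambda item: item[1])
    threshold = MIN_PROBABILITY.get(breed, DEFAULT_MIN_PROBABILITY)
    return breed if probability >= threshold else None

# Made-up model outputs for two photos.
confident = pick_breed({"Siamese": 0.88, "Maine Coon": 0.07, "Persian": 0.05})
unsure = pick_breed({"Siamese": 0.45, "Maine Coon": 0.40, "Persian": 0.15})
print(confident, unsure)
```

Returning nothing when the model isn't confident enough lets the app say "not sure" instead of showing the user a guess.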
ML/AI applications are appearing in every imaginable area of computing. If you do an internet search on almost any topic and include the term 'machine learning' you'll likely find results. Below are just a few examples of areas of active ML/AI applications:
Biometrics - DNA Classification - Computer Vision - Fraud Detection - Marketing - Medical Diagnosis - Economics - Natural Language Processing - Language Translation - Online Advertising - Search Engines - Handwriting Recognition - Speaker Recognition - Speech Recognition - Financial Market Analysis and Trading - Customer Service - Systems Monitoring - Recommender Systems - Self-driving Vehicles - Robotics - Cybersecurity - Legal Research - Criminal Investigations - Security Screening - Mapping - Healthcare - Face Detection and Recognition - Object Recognition - Weather Forecasting - Image Processing