Images can improve the design and the appearance of a web page.
Example
<img src="pic_trulli.jpg" alt="Italian Trulli">
Example
<img src="img_girl.jpg" alt="Girl in a jacket">
Example
<img src="img_chania.jpg" alt="Flowers in Chania">
HTML Images Syntax
The HTML <img> tag is used to embed an
image in a web page.
Images are not technically inserted into a web page; images are linked to web
pages. The <img> tag creates a holding
space for the referenced image.
The <img> tag is empty; it contains attributes only and has no closing tag.
The <img> tag has two required
attributes:
src - Specifies the path to the image
alt - Specifies an alternate text for the image
Syntax
<img src="url" alt="alternatetext">
The src Attribute
The required src attribute specifies the path (URL) to the image.
Note: When a web page loads, it is the browser, at that
moment, that gets the image from a web server and inserts it into the page.
Therefore, make sure that the image actually stays in the same spot in relation
to the web page, otherwise your visitors will get a broken link icon. The broken
link icon and the alt text are shown if the browser cannot find the image.
Example
<img src="img_chania.jpg" alt="Flowers in Chania">
The alt Attribute
The required alt attribute provides an alternate text for an image, if the user for
some reason cannot view it (because of slow connection, an error in the src
attribute, or if the user uses a screen reader).
The value of the alt attribute should describe the image:
Example
<img src="img_chania.jpg" alt="Flowers in Chania">
If a browser cannot find an image, it will display the value of the alt
attribute:
Example
<img src="wrongname.gif" alt="Flowers in Chania">
Tip: A screen reader is a software program that reads the HTML code and allows the user to "listen" to the content. Screen readers are useful for people who are visually impaired or have learning disabilities.
Image Size - Width and Height
You can use the style attribute to specify the width and
height of an image.
Example
<img src="img_girl.jpg" alt="Girl in a jacket" style="width:500px;height:600px;">
Alternatively, you can use the width and height attributes:
Example
<img src="img_girl.jpg" alt="Girl in a jacket" width="500" height="600">
The width and height attributes always define the width and height of the
image in pixels.
Note: Always specify the width and height of an image. If width and height are not specified, the
web page
might flicker while the image loads.
Width and Height, or Style?
The width, height, and style attributes are
all valid in HTML.
However, we suggest using the style attribute. It prevents style sheets from changing the size of images:
Example
<!DOCTYPE html>
<html>
<head>
<style>
img {
  width: 100%;
}
</style>
</head>
<body>
Notes on external images: External images might be under
copyright. If you do not get permission to use it, you may be in violation of
copyright laws. In addition, you cannot control external images; they can suddenly
be removed or changed.
To use an image as a link, put the <img> tag inside the <a>
tag:
Example
<a href="default.asp">
  <img src="smiley.gif" alt="HTML tutorial" style="width:42px;height:42px;">
</a>
Image Floating
Use the CSS float property to let the image float to the right or to the left of a text:
Example
<p><img src="smiley.gif" alt="Smiley face" style="float:right;width:42px;height:42px;">
The image will float to the right of the text.</p>

<p><img src="smiley.gif" alt="Smiley face" style="float:left;width:42px;height:42px;">
The image will float to the left of the text.</p>
Common Image Formats
Here are the most common image file types, which are supported in all browsers
(Chrome, Edge, Firefox, Safari, Opera):
Abbreviation  File Format                             File Extension
APNG          Animated Portable Network Graphics      .apng
GIF           Graphics Interchange Format             .gif
ICO           Microsoft Icon                          .ico, .cur
JPEG          Joint Photographic Experts Group image  .jpg, .jpeg, .jfif, .pjpeg, .pjp
PNG           Portable Network Graphics               .png
SVG           Scalable Vector Graphics                .svg
Chapter Summary
Use the HTML <img> element to define an image
Use the HTML src attribute to define the URL of the image
Use the HTML alt attribute to define an alternate text for an image, if it cannot be displayed
Use the HTML width and height attributes
or the CSS width and height
properties to define the size of the image
Use the CSS float property to let the image float
to the left or to the right
Note: Loading large images takes time, and can slow down your
web page. Use images carefully.
HTML Image Tags
Tag        Description
<img>      Defines an image
<map>      Defines an image map
<area>     Defines a clickable area inside an image map
<picture>  Defines a container for multiple image resources
Machine Learning is a subfield of Artificial Intelligence
"Learning machines to imitate human intelligence"
Machine Learning (ML)
Traditional programming uses known algorithms to produce results from data:
Data + Algorithms = Results
Machine learning creates new algorithms from data and results:
Data + Results = Algorithms
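As a toy illustration of the two formulas above (the function and numbers here are invented for this sketch, not part of any real ML library):

```javascript
// Traditional programming: data + a known algorithm -> results
function algorithm(x) { return 2 * x; }   // the algorithm is written by hand
const result = algorithm(21);             // result is 42

// Machine learning (toy version): data + results -> algorithm
// Here the "learning" is just deriving the multiplier from a known pair
const dataPoint = 21;
const knownResult = 42;
const learnedFactor = knownResult / dataPoint;   // "learned" parameter: 2
function learnedAlgorithm(x) { return learnedFactor * x; }
```

Real machine learning derives far more complex functions from far more data, but the direction is the same: the algorithm comes out, instead of going in.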
Neural Networks (NN)
A Neural Network is:
A programming technique
A method used in machine learning
Software that learns from mistakes
Neural Networks are based on how the human brain works: Neurons are sending messages to each other. While the neurons are trying to solve a problem (over and over again), it is strengthening the connections that lead to success and diminishing the connections that lead to failure.
Perceptrons
The Perceptron defines the first step into Neural Networks.
It represents a single neuron with only one input layer, and no hidden layers.
Neural Networks
Neural Networks are Multi-Layer Perceptrons.
In its simplest form, a neural network is made up from:
An input layer (yellow)
A hidden layer (blue)
An output layer (red)
In the Neural Network Model, input data (yellow) are processed against a hidden layer (blue) before producing the final output (red).
The First Layer: The yellow perceptrons are making simple decisions based on the input. Each single decision is sent to the perceptrons in the next layer.
The Second Layer: The blue perceptrons make decisions by weighing the results from the first layer. This layer makes more complex decisions at a more abstract level than the first layer.
Deep Neural Networks
Deep Neural Networks are:
A programming technique
A method used in machine learning
Software that learns from mistakes
Deep Neural Networks are made up of several hidden layers of neural networks that perform complex operations on massive amounts of data.
Each successive layer uses the preceding layer as input.
For instance, optical reading uses low layers to identify edges, and higher layers to identify letters.
In the Deep Neural Network Model, input data (yellow) are processed against a hidden layer (blue) and modified against more hidden layers (green) to produce the final output (red).
The First Layer: The yellow perceptrons are making simple decisions based on the input. Each single decision is sent to the perceptrons in the next layer.
The Second Layer: The blue perceptrons make decisions by weighing the results from the first layer. This layer makes more complex decisions at a more abstract level than the first layer.
The Third Layer: Even more complex decisions are made by the green perceptrons.
Deep Learning (DL)
Deep Learning is a subset of Machine Learning.
Deep Learning is responsible for the AI boom of recent years.
Deep learning is an advanced type of ML that handles complex tasks like image recognition.
Machine Learning               Deep Learning
A subset of AI                 A subset of Machine Learning
Uses smaller data sets         Uses larger data sets
Trained by humans              Learns on its own
Creates simple algorithms      Creates complex algorithms
Artificial Intelligence Is a Contrast to Human Intelligence
What is Artificial Intelligence?
Artificial Intelligence suggests that machines can mimic humans in:
Talking
Thinking
Learning
Planning
Understanding
Artificial Intelligence is also called Machine Intelligence and Computer Intelligence.
Arthur Samuel 1959:
"Machine Learning is a subfield of computer science that gives computers the ability to learn without being programmed"
Arthur Samuel, IBM Journal of Research and Development, Vol. 3, 1959.
Wikipedia 2022:
Artificial intelligence is intelligence demonstrated by machines, as opposed to the natural intelligence displayed by humans and animals, which involves consciousness and emotionality.
Investopedia 2022:
Artificial intelligence refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions.
IBM 2022:
Artificial intelligence leverages computers and machines to mimic the problem-solving and decision-making capabilities of the human mind.
Britannica 2022:
Artificial intelligence is the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings, .... such as the ability to reason, discover meaning, generalize, or learn from past experience.
Artificial Intelligence (AI)
Artificial Intelligence is a scientific discipline embracing several Data Science fields ranging from narrow AI to strong AI, including machine learning, deep learning, big data and data mining.
Artificial Intelligence Narrow AI Machine Learning Neural Networks Big Data Deep Learning Strong AI
Narrow AI
Narrow Artificial Intelligence is limited to narrow (specific) areas like most of the AI we have around us today:
Email spam Filters
Text to Speech
Speech Recognition
Self Driving Cars
E-Payment
Google Maps
Text Autocorrect
Automated Translation
Chatbots
Social Media
Face Detection
Visual Perception
Search Algorithms
Robots
Automated Investment
NLP - Natural Language Processing
Flying Drones
IBM's Dr. Watson
Apple's Siri
Microsoft's Cortana
Amazon's Alexa
Netflix's Recommendations
Narrow AI is also called Weak AI.
Weak AI: Built to simulate human intelligence.
Strong AI: Built to copy human intelligence.
Strong AI
Strong Artificial Intelligence is the type of AI that mimics human intelligence.
Strong AI indicates the ability to think, plan, learn, and communicate.
Strong AI is the theoretical next level of AI: True Intelligence.
Strong AI moves towards machines with self-awareness, consciousness, and objective thoughts.
One need not decide if a machine can "think". One need only decide if a machine can act as intelligently as a human.
Alan Turing
Traditionally, Machine Learning applications use R or Python.
But JavaScript has a great future as a Machine Learning language:
JavaScript is well known. All developers can use it.
Security is built in. JavaScript cannot access your files.
JavaScript can be faster than Python.
JavaScript can use hardware acceleration.
JavaScript runs in the browser
JavaScript is Good for Machine Learning
Machine Learning can be math-heavy. The nature of neural networks is highly technical, and the jargon that goes along with it tends to scare people away.
This is where JavaScript helps, with easy-to-understand libraries that simplify the process of creating and training neural networks.
With new Machine Learning libraries, JavaScript developers can add Machine Learning and Artificial Intelligence to web applications.
WebGL API
WebGL is a JavaScript API for rendering 2D and 3D graphics in any browser.
WebGL can run on both integrated and standalone graphic cards in any PC.
WebGL brings 3D graphics to the web browser. Major browser vendors Apple (Safari), Google (Chrome), Microsoft (Edge), and Mozilla (Firefox) are members of the WebGL Working Group.
JavaScript Machine Learning Libraries
Machine Learning in the Browser means:
Machine Learning in JavaScript
Machine Learning for the Web
Machine Learning for Everyone
Machine Learning on more Platforms
Advantages:
Easy to use. Nothing to install.
Powerful graphics. Browsers support WebGL.
Better privacy. Data can stay on the client.
More platforms. JavaScript runs on mobile devices.
Math.js
Math.js is an extensive math library for JavaScript and Node.js.
Math.js is powerful and easy to use. It comes with a large set of built-in functions, a flexible expression parser, and solutions to work with many data types like numbers, big numbers, complex numbers, fractions, units, arrays, and matrices.
Brain.js
Brain.js is a JavaScript library that makes it easy to understand Neural Networks because it hides the complexity of the mathematics.
Brain.js is simple to use. You do not need to know neural networks in details to work with Brain.js.
Brain.js provides multiple neural network implementations as different neural nets can be trained to do different things well.
ml5.js
ml5.js is trying to make machine learning more accessible to a wider audience.
The ml5 team is working to wrap machine learning functionality in friendlier ways.
The example below uses only three lines of code to classify an image:
// Function to run when results arrive
function gotResult(error, results) {
  const element = document.getElementById("result");
  if (error) {
    element.innerHTML = error;
  } else {
    let num = results[0].confidence * 100;
    element.innerHTML = results[0].label + "<br>Confidence: " + num.toFixed(2) + "%";
  }
}
</script>
</body>
</html>
Image Classification
With MobileNet and ml5.js
robin, American robin, Turdus migratorius Confidence: 95.52%
Try substituting "pic1.jpg" with "pic2.jpg" and "pic3.jpg".
TensorFlow Playground
TensorFlow Playground is a web application written in d3.js.
With TensorFlow Playground you can learn about Neural Networks (NN) without math.
In your own Web Browser you can create a Neural Network and see the result.
TensorFlow.js was previously called Tf.js and Deeplearn.js.
Plotting in the Browser
Here is a list of some JavaScript libraries to use for both Machine Learning graphs and other HTML charts:
<script>
function plot(type) {
  const xArray = document.getElementById("xvalues").value.split(',');
  const yArray = document.getElementById("yvalues").value.split(',');
  let mode = "lines";
  if (type == "scatter") {mode = "markers"}
  Plotly.newPlot("myPlot", [{x:xArray, y:yArray, mode:mode, type:"scatter"}]);
}
</script>
</body>
</html>
Machine Learning Languages
Programming languages involved in Machine Learning and Artificial Intelligence are:
LISP
R
Python
C++
Java
JavaScript
SQL
LISP
LISP is the second oldest programming language in the world (1958), one year younger than Fortran (1957).
The term Artificial Intelligence was made up by John McCarthy who invented LISP.
LISP was founded on the theory of recursive functions (functions that call themselves), which suits Machine Learning programs where "self-learning" is an important part of the program.
The R Language
R is a programming language for Graphics and Statistical computing.
R comes with a wide set of statistical and graphical techniques for:
Linear Modeling
Nonlinear Modeling
Statistical Tests
Time-series Analysis
Classification
Clustering
Python
Python is a general-purpose coding language. It can be used for all types of programming and software development.
Python is typically used for server development, like building web apps for web servers.
Python is also typically used in Data Science.
An advantage for using Python is that it comes with some very suitable libraries:
NumPy (Library for working with Arrays)
SciPy (Library for Statistical Science)
Matplotlib (Graph Plotting Library)
NLTK (Natural Language Toolkit)
TensorFlow (Machine Learning)
C++
C++ holds the title: "The world's fastest programming language".
Because of its speed, C++ is a preferred language for performance-critical software such as computer games and search engines.
Google uses C++ in Artificial Intelligence and Machine Learning programs for SEO (Search Engine Optimization).
SHARK is a super-fast C++ library with support for supervised learning algorithms, linear regression, neural networks, and clustering.
MLPACK is also a super-fast machine learning library for C++.
Java
Java is another general-purpose coding language that can be used for all types of software development.
For Machine Learning, Java is mostly used to create algorithms, and neural networks.
SQL
SQL (Structured Query Language) is the most popular language for managing data.
Knowledge of SQL databases, tables and queries helps data scientists when dealing with data.
SQL is very convenient for storing, manipulating, and retrieving data in databases.
Artificial Music Intelligence
Can an algorithm compose better music than a human?
David Cope is a former professor of music at the University of California, Santa Cruz.
For over 30 years, David Cope has been developing Emmy or EMI (Experimental Musical Intelligence), an algorithm to compose music in the style of famous composers.
Bach, Larson, or EMI?
In a test performed by professor Douglas Hofstadter of the University of Oregon, a pianist performed three musical pieces in the style of Bach:
One written by Bach
One written by Steve Larson
One written by EMI
Dr. Larson was hurt when the audience concluded that his piece was written by EMI.
He felt better when the listeners decided that the piece composed by EMI was a genuine Bach.
Project Baseline
Project Baseline is an initiative to make it easy for everyone to contribute to the map of human health and to participate in clinical research.
In Project Baseline, researchers, clinicians, engineers, designers, advocates, and volunteers can collaborate on building the next generation of healthcare tools and services.
Data Scientists
Data Scientists can be experts in multiple disciplines:
Applied mathematics
Computational statistics
Computer Science
Machine learning
Deep learning
Data Scientists also have significant big data experience:
Business Intelligence
Data Base Design
Data Warehouse Design
Data Mining
SQL Queries
SQL Reporting
Artificial Health Intelligence
The Corona Pandemic pushed the need for optimizing Medical Healthcare.
Machine learning is a new technology that can provide better drug discovery, shorter development time, and lower drug costs.
Machine Learning enables healthcare to use "big data" for making better medical or clinical decisions.
FDA Statement
Statement from FDA Commissioner Scott Gottlieb, M.D. on steps toward a new, tailored review framework for artificial intelligence-based medical devices:
"Artificial intelligence and machine learning have the potential to fundamentally transform the delivery of health care. As technology and science advance, we can expect to see earlier disease detection, more accurate diagnosis, more targeted therapies and significant improvements in personalized medicine".
Linear Graphs
Machine Learning often uses line graphs to show relationships.
A line graph displays the values of a linear function: y = ax + b
Important keywords:
Linear (Straight)
Slope (Angle)
Intercept (Start value)
Linear
Linear means straight. A linear graph is a straight line.
The graph consists of two axes: x-axis (horizontal) and y-axis (vertical).
Example
const xValues = [];
const yValues = [];

// Generate values
for (let x = 0; x <= 10; x += 1) {
  xValues.push(x);
  yValues.push(x);
}

// Define Data
const data = [{x: xValues, y: yValues, mode: "lines"}];

// Define Layout
const layout = {title: "y = x"};

// Display using Plotly
Plotly.newPlot("myPlot", data, layout);
</script>
</body>
</html>
When to Use Scatter Plots
Scatter plots are great for:
Seeing the "Big Picture"
Comparing different values
Discovering potential trends
Discovering patterns in data
Discovering relationships between data
Discovering Clusters and Correlations
A Perceptron is an Artificial Neuron
It is the simplest possible Neural Network
Neural Networks are the building blocks of Machine Learning.
Frank Rosenblatt
Frank Rosenblatt (1928 – 1971) was an American psychologist notable in the field of Artificial Intelligence.
In 1957 he started something really big. He "invented" a Perceptron program, on an IBM 704 computer at Cornell Aeronautical Laboratory.
Scientists had discovered that brain cells (neurons) receive input from our senses via electrical signals.
The neurons, in turn, use electrical signals to store information and to make decisions based on previous input.
Frank had the idea that Perceptrons could simulate brain principles, with the ability to learn and make decisions.
The Perceptron
The original Perceptron was designed to take a number of binary inputs, and produce one binary output (0 or 1).
The idea was to use different weights to represent the importance of each input, and that the sum of the values should be greater than a threshold value before making a decision like yes or no (true or false) (0 or 1).
Perceptron Example
Imagine a perceptron (in your brain).
The perceptron tries to decide if you should go to a concert.
Is the artist good? Is the weather good?
What weights should these facts have?
Criteria           Input         Weight
Artist is Good     x1 = 0 or 1   w1 = 0.7
Weather is Good    x2 = 0 or 1   w2 = 0.6
Friend will Come   x3 = 0 or 1   w3 = 0.5
Food is Served     x4 = 0 or 1   w4 = 0.3
Alcohol is Served  x5 = 0 or 1   w5 = 0.4
The Perceptron Algorithm
Frank Rosenblatt suggested this algorithm:
Set a threshold value
Multiply each input by its weight
Sum all the results
Activate the output
1. Set a threshold value:
Threshold = 1.5
2. Multiply each input by its weight:
x1 * w1 = 1 * 0.7 = 0.7
x2 * w2 = 0 * 0.6 = 0
x3 * w3 = 1 * 0.5 = 0.5
x4 * w4 = 0 * 0.3 = 0
x5 * w5 = 1 * 0.4 = 0.4
3. Sum all the results:
0.7 + 0 + 0.5 + 0 + 0.4 = 1.6 (The Weighted Sum)
4. Activate the Output:
Return true if the sum > 1.5 ("Yes I will go to the Concert")
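The four steps above can be traced in a few lines of JavaScript, using the inputs and weights from the table:

```javascript
// The five inputs (0 or 1) and their weights from the table above
const inputs  = [1, 0, 1, 0, 1];
const weights = [0.7, 0.6, 0.5, 0.3, 0.4];

// 1. Set a threshold value
const threshold = 1.5;

// 2-3. Multiply each input by its weight and sum the results
let sum = 0;
for (let i = 0; i < inputs.length; i++) {
  sum += inputs[i] * weights[i];
}

// 4. Activate the output
const goToConcert = sum > threshold;   // sum is 1.6, so the answer is yes
```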
Note
If the weather weight is 0.6 for you, it might be different for someone else. A higher weight means that the weather is more important to them.
If the threshold value is 1.5 for you, it might be different for someone else. A lower threshold means they are more willing to go to any concert.
Each input is binary and can be interpreted as true or false / yes or no.
In the example above, the node values are: 1, 0, 1, 0, 1
Node Weights
Weights show the strength of each node.
In the example above, the node weights are: 0.7, 0.6, 0.5, 0.3, 0.4
The Activation Function
The activation function maps the weighted sum into a binary value of 1 or 0.
This can be interpreted as true or false / yes or no.
In the example above, the activation function is simple: (sum > 1.5)
Note
It is obvious that a decision is NOT made by one neuron alone.
Many other neurons must provide input:
Is the artist good
Is the weather good
...
Multi-Layer Perceptrons can be used for very sophisticated decision making.
Neural Networks
The Perceptron defines the first step into Neural Networks:
Neural Networks are used in applications like Facial Recognition.
These applications use Pattern Recognition.
This type of Classification can be done with a Perceptron.
Perceptrons can be used to classify data into two parts.
Perceptrons are also known as Linear Binary Classifiers.
Pattern Classification
Imagine a straight line (a linear graph) in a space with scattered x y points.
How can you classify the points over and under the line? A perceptron can be trained to recognize the points over the line, without knowing the formula for the line.
How to Program a Perceptron
To program a perceptron, we can use a simple JavaScript program that will:
Create a simple plotter
Create 500 random x y points
Display the x y points
Create a line function: f(x)
Display the line
Compute the desired answers
Display the desired answers
Create a Simple Plotter
Creating a simple plotter object is described in an earlier chapter.
Example
const plotter = new XYPlotter("myCanvas");
plotter.transformXY();

// Create Random XY Points
const numPoints = 500;
const xPoints = [];
const yPoints = [];
for (let i = 0; i < numPoints; i++) {
  xPoints[i] = Math.random() * xMax;
  yPoints[i] = Math.random() * yMax;
}

// Line Function
function f(x) {
  return x * 1.2 + 50;
}

// Plot the Line
plotter.plotLine(xMin, f(xMin), xMax, f(xMax), "black");

// Compute Desired Answers
const desired = [];
for (let i = 0; i < numPoints; i++) {
  desired[i] = 0;
  if (yPoints[i] > f(xPoints[i])) {desired[i] = 1}
}

// Display Desired Result
for (let i = 0; i < numPoints; i++) {
  let color = "blue";
  if (desired[i]) color = "black";
  plotter.plotPoint(xPoints[i], yPoints[i], color);
}
</script>
</body>
</html>
How to Train a Perceptron
In the next chapter, you will use the correct answers to train a perceptron to predict the output values of unknown inputs. You will:
Create a Perceptron Object
Create a Training Function
Train the perceptron against the correct answers
Training Task
Imagine a straight line in a space with scattered x y points.
Train a perceptron to classify the points over and under the line.
Create a Perceptron Object
Create a Perceptron object. Name it anything (like Perceptron).
Let the perceptron accept two parameters:
The number of inputs (no)
The learning rate (learningRate).
Set the default learning rate to 0.00001.
Then create random weights between -1 and 1 for each input.
Example
// Perceptron Object
function Perceptron(no, learningRate = 0.00001) {

// Set Initial Values
this.learnc = learningRate;
this.bias = 1;

// Compute Random Weights
this.weights = [];
for (let i = 0; i <= no; i++) {
  this.weights[i] = Math.random() * 2 - 1;
}

// End Perceptron Object
}
The Random Weights
The Perceptron will start with a random weight for each input.
The Learning Rate
For each mistake, while training the Perceptron, the weights will be adjusted with a small fraction.
This small fraction is the "Perceptron's learning rate".
In the Perceptron object we call it learnc.
The Bias
Sometimes, if both inputs are zero, the perceptron might produce an incorrect output.
To avoid this, we give the perceptron an extra input with the value of 1.
This is called a bias.
Add an Activate Function
Remember the perceptron algorithm:
Multiply each input by its weight
Sum the results
Compute the outcome
Example
this.activate = function(inputs) {
  let sum = 0;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * this.weights[i];
  }
  if (sum > 0) {return 1} else {return 0}
}
The activation function will output:
1 if the sum is greater than 0
0 if the sum is 0 or less
Create a Training Function
The training function guesses the outcome based on the activate function.
Every time the guess is wrong, the perceptron should adjust the weights.
After many guesses and adjustments, the weights will be correct.
// Create Random XY Points
const xPoints = [];
const yPoints = [];
for (let i = 0; i < numPoints; i++) {
  xPoints[i] = Math.random() * xMax;
  yPoints[i] = Math.random() * yMax;
}

// Line Function
function f(x) {
  return x * 1.2 + 50;
}

// Plot the Line
plotter.plotLine(xMin, f(xMin), xMax, f(xMax), "black");

// Compute Desired Answers
const desired = [];
for (let i = 0; i < numPoints; i++) {
  desired[i] = 0;
  if (yPoints[i] > f(xPoints[i])) {desired[i] = 1}
}

// Create a Perceptron
const ptron = new Perceptron(2, learningRate);

// Train the Perceptron
for (let j = 0; j <= 10000; j++) {
  for (let i = 0; i < numPoints; i++) {
    ptron.train([xPoints[i], yPoints[i]], desired[i]);
  }
}

// Display the Result
for (let i = 0; i < numPoints; i++) {
  const x = xPoints[i];
  const y = yPoints[i];
  let guess = ptron.activate([x, y, ptron.bias]);
  let color = "black";
  if (guess == 0) color = "blue";
  plotter.plotPoint(x, y, color);
}

// Perceptron Object ---------------------
function Perceptron(no, learningRate = 0.00001) {

// Set Initial Values
this.learnc = learningRate;
this.bias = 1;

// Compute Random Weights
this.weights = [];
for (let i = 0; i <= no; i++) {
  this.weights[i] = Math.random() * 2 - 1;
}

// Activate Function
this.activate = function(inputs) {
  let sum = 0;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * this.weights[i];
  }
  if (sum > 0) {return 1} else {return 0}
}

// Train Function
this.train = function(inputs, desired) {
  inputs.push(this.bias);
  let guess = this.activate(inputs);
  let error = desired - guess;
  if (error != 0) {
    for (let i = 0; i < inputs.length; i++) {
      this.weights[i] += this.learnc * error * inputs[i];
    }
  }
}

// End Perceptron Object
}
</script>
</body>
</html>
Backpropagation
After each guess, the perceptron calculates how wrong the guess was.
If the guess is wrong, the perceptron adjusts the bias and the weights so that the guess will be a little bit more correct the next time.
Strictly speaking, this weight adjustment follows the perceptron learning rule; in multi-layer networks the same error-driven idea is called backpropagation.
After trying (a few thousand times) your perceptron will become quite good at guessing.
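A single adjustment can be traced with made-up numbers. The update rule matches the train() function used in this chapter: weight += learnc * error * input.

```javascript
const learnc = 0.00001;   // the learning rate
let weight = 0.5;         // current weight (made-up value for illustration)

const input = 40;         // e.g. an x coordinate (made-up value)
const desired = 1;        // the correct answer
const guess = 0;          // the perceptron guessed wrong

// error is 1, so the weight is nudged up by learnc * 1 * 40 = 0.0004
const error = desired - guess;
weight += learnc * error * input;   // weight is now 0.5004
```

Each step is tiny, which is why thousands of training loops are needed before the weights settle.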
Create Your Own Library
Library Code
// Perceptron Object
function Perceptron(no, learningRate = 0.00001) {

// Set Initial Values
this.learnc = learningRate;
this.bias = 1;

// Compute Random Weights
this.weights = [];
for (let i = 0; i <= no; i++) {
  this.weights[i] = Math.random() * 2 - 1;
}

// Activate Function
this.activate = function(inputs) {
  let sum = 0;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * this.weights[i];
  }
  if (sum > 0) {return 1} else {return 0}
}

// Train Function
this.train = function(inputs, desired) {
  inputs.push(this.bias);
  let guess = this.activate(inputs);
  let error = desired - guess;
  if (error != 0) {
    for (let i = 0; i < inputs.length; i++) {
      this.weights[i] += this.learnc * error * inputs[i];
    }
  }
}

// End Perceptron Object
}

// Create Random XY Points
const xPoints = [];
const yPoints = [];
for (let i = 0; i < numPoints; i++) {
  xPoints[i] = Math.random() * xMax;
  yPoints[i] = Math.random() * yMax;
}

// Line Function
function f(x) {
  return x * 1.2 + 50;
}

// Plot the Line
plotter.plotLine(xMin, f(xMin), xMax, f(xMax), "black");

// Compute Desired Answers
const desired = [];
for (let i = 0; i < numPoints; i++) {
  desired[i] = 0;
  if (yPoints[i] > f(xPoints[i])) {desired[i] = 1}
}

// Create a Perceptron
const ptron = new Perceptron(2, learningRate);

// Train the Perceptron
for (let j = 0; j <= 10000; j++) {
  for (let i = 0; i < numPoints; i++) {
    ptron.train([xPoints[i], yPoints[i]], desired[i]);
  }
}

// Display the Result
for (let i = 0; i < numPoints; i++) {
  const x = xPoints[i];
  const y = yPoints[i];
  let guess = ptron.activate([x, y, ptron.bias]);
  let color = "black";
  if (guess == 0) color = "blue";
  plotter.plotPoint(x, y, color);
}
</script>
</body>
</html>
A Perceptron must be Tested and Evaluated
A Perceptron must be tested against Real Values.
Test Your Library
Generate new unknown points and check if your Perceptron can guess the right answers:
// Create Random XY Points
const xPoints = [];
const yPoints = [];
for (let i = 0; i < numPoints; i++) {
  xPoints[i] = Math.random() * xMax;
  yPoints[i] = Math.random() * yMax;
}

// Line Function
function f(x) {
  return x * 1.2 + 50;
}

// Plot the Line
plotter.plotLine(xMin, f(xMin), xMax, f(xMax), "black");

// Compute Desired Answers
const desired = [];
for (let i = 0; i < numPoints; i++) {
  desired[i] = 0;
  if (yPoints[i] > f(xPoints[i])) {desired[i] = 1}
}

// Create a Perceptron
const ptron = new Perceptron(2, learningRate);

// Train the Perceptron
for (let j = 0; j <= 10000; j++) {
  for (let i = 0; i < numPoints; i++) {
    ptron.train([xPoints[i], yPoints[i]], desired[i]);
  }
}

// Test Against Unknown Data
const counter = 500;
for (let i = 0; i < counter; i++) {
  let x = Math.random() * xMax;
  let y = Math.random() * yMax;
  let guess = ptron.activate([x, y, ptron.bias]);
  let color = "black";
  if (guess == 0) color = "blue";
  plotter.plotPoint(x, y, color);
}
</script>
</body>
</html>
Tune the Perceptron
How can you tune the Perceptron?
Here are some suggestions:
Adjust the learning rate
Increase the number of training data
Increase the number of training iterations
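The three suggestions above correspond to constants in the training code. Below is a minimal, self-contained perceptron sketch showing where each knob lives. The names (learningRate, numPoints, iterations) mirror the earlier examples, but the whole snippet is an illustrative reconstruction, not the library code used above:

```javascript
// Minimal perceptron with the three tuning knobs exposed.
const learningRate = 0.0001;  // knob 1: learning rate
const numPoints = 500;        // knob 2: amount of training data
const iterations = 10000;     // knob 3: training iterations

function f(x) { return x * 1.2 + 50; }  // the line to learn

// weights[0] and weights[1] weight x and y, weights[2] weights the bias input
const weights = [Math.random(), Math.random(), Math.random()];
const bias = 1;

function activate(inputs) {
  const sum = inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2];
  return sum > 0 ? 1 : 0;
}

function train(x, y, desired) {
  const error = desired - activate([x, y, bias]);
  weights[0] += learningRate * error * x;
  weights[1] += learningRate * error * y;
  weights[2] += learningRate * error * bias;
}

// Training data: random points labeled 1 if they lie above the line
const xs = [], ys = [], desired = [];
for (let i = 0; i < numPoints; i++) {
  xs[i] = Math.random() * 400;
  ys[i] = Math.random() * 400;
  desired[i] = ys[i] > f(xs[i]) ? 1 : 0;
}

for (let j = 0; j < iterations; j++) {
  for (let i = 0; i < numPoints; i++) train(xs[i], ys[i], desired[i]);
}

// Measure accuracy on the training set
let correct = 0;
for (let i = 0; i < numPoints; i++) {
  if (activate([xs[i], ys[i], bias]) === desired[i]) correct++;
}
```

Re-running this sketch with a different learningRate, numPoints, or iterations shows how each knob affects the final accuracy.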
Learning is Looping
An ML model is Trained by Looping over data multiple times.
For each iteration, the Weight Values are adjusted.
Training is complete when the iterations fail to Reduce the Cost.
Gradient Descent
Gradient Descent is a popular algorithm for solving AI problems.
A simple Linear Regression Model can be used to demonstrate gradient descent.
The goal of a linear regression is to fit a linear graph to a set of (x,y) points. This can be solved with a math formula. But a Machine Learning Algorithm can also solve this.
This is what the example above does.
It starts with a scatter plot and a linear model (y = wx + b).
Then it trains the model to find a line that fits the plot. This is done by altering the weight (slope) and the bias (intercept) of the line.
Below is the code for a Trainer Object that can solve this problem (and many other problems).
A Trainer Object
Create a Trainer object that can take any number of (x,y) values in two arrays (xArr,yArr).
Set the weight to zero and the bias to 1.
A learning constant (learnc) has to be set, and a cost variable must be defined:
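A constructor matching this description might look like the sketch below. The property names (xArr, yArr, points, learnc, weight, bias, cost) follow the methods shown later, but the exact constructor body and the initial learning constant are assumptions:

```javascript
// A minimal Trainer constructor, following the description above.
// The value of the learning constant is an illustrative assumption.
function Trainer(xArray, yArray) {
  this.xArr = xArray;     // the x values
  this.yArr = yArray;     // the y values
  this.points = this.xArr.length;
  this.learnc = 0.00001;  // the learning constant
  this.weight = 0;        // the slope, starts at zero
  this.bias = 1;          // the intercept, starts at 1
  this.cost = 0;          // will hold the last computed error
}
```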
A standard way to solve a regression problem is with a "Cost Function" that measures how good the solution is.
The function uses the weight and bias from the model (y = wx + b) and returns an error, based on how well the line fits the plot.
The error is computed by looping through all (x,y) points in the plot and summing the squared distances between the y value of each point and the line.
Squaring the distances is the conventional choice: it ensures positive values and makes the error function differentiable.
this.costError = function() {
  let total = 0;
  for (let i = 0; i < this.points; i++) {
    total += (this.yArr[i] - (this.weight * this.xArr[i] + this.bias)) ** 2;
  }
  return total / this.points;
}
Another name for the Cost Function is Error Function.
The formula used in the function is the Mean Squared Error:

E = 1/N * Σ (y - (mx + b))²

E is the error (cost)
N is the total number of observations (points)
y is the value (label) of each observation
x is the value (feature) of each observation
m is the slope (weight)
b is the intercept (bias)
mx + b is the prediction
1/N * Σ means we average over all N observations
The Train Function
We will now run a gradient descent.
The gradient descent algorithm should walk down the cost function towards the best line.
Each iteration should update both m and b towards a line with a lower cost (error).
To do that, we add a train function that loops over all the data many times:
this.train = function(iter) {
  for (let i = 0; i < iter; i++) {
    this.updateWeights();
  }
  this.cost = this.costError();
}
An Update Weights Function
The train function above should update the weight and the bias in each iteration.
The direction to move is calculated using two partial derivatives:
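Written out, the two partial derivatives of the cost function E = 1/N * Σ (y - (mx + b))² with respect to the slope m and the intercept b are:

```latex
\frac{\partial E}{\partial m} = \frac{1}{N}\sum_{i=1}^{N} -2\,x_i\,\bigl(y_i - (m x_i + b)\bigr)
\qquad
\frac{\partial E}{\partial b} = \frac{1}{N}\sum_{i=1}^{N} -2\,\bigl(y_i - (m x_i + b)\bigr)
```

These match the w_deriv and b_deriv sums in the code below: each term is -2 times the residual, multiplied by x for the slope derivative, then averaged over the points and scaled by the learning constant.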
// Cost Function
this.costError = function() {
  let total = 0;
  for (let i = 0; i < this.points; i++) {
    total += (this.yArr[i] - (this.weight * this.xArr[i] + this.bias)) ** 2;
  }
  return total / this.points;
}

// Train Function
this.train = function(iter) {
  for (let i = 0; i < iter; i++) {
    this.updateWeights();
  }
  this.cost = this.costError();
}

// Update Weights Function
this.updateWeights = function() {
  let wx;
  let w_deriv = 0;
  let b_deriv = 0;
  for (let i = 0; i < this.points; i++) {
    wx = this.yArr[i] - (this.weight * this.xArr[i] + this.bias);
    w_deriv += -2 * wx * this.xArr[i];
    b_deriv += -2 * wx;
  }
  this.weight -= (w_deriv / this.points) * this.learnc;
  this.bias -= (b_deriv / this.points) * this.learnc;
}

} // End Trainer Object
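Putting the pieces together, a complete self-contained run might look like the sketch below. The data, the learning constant, and the iteration count are illustrative choices, not values from the original library:

```javascript
// Self-contained sketch: a compact Trainer (as described above) fitted
// with gradient descent to points on the line y = 1.2x + 5.
function Trainer(xArray, yArray) {
  this.xArr = xArray;
  this.yArr = yArray;
  this.points = xArray.length;
  this.learnc = 0.001;  // learning constant (illustrative; tune per dataset)
  this.weight = 0;
  this.bias = 1;
  // Cost Function (mean squared error)
  this.costError = function() {
    let total = 0;
    for (let i = 0; i < this.points; i++) {
      total += (this.yArr[i] - (this.weight * this.xArr[i] + this.bias)) ** 2;
    }
    return total / this.points;
  };
  // Update Weights (one gradient descent step)
  this.updateWeights = function() {
    let wDeriv = 0, bDeriv = 0;
    for (let i = 0; i < this.points; i++) {
      const err = this.yArr[i] - (this.weight * this.xArr[i] + this.bias);
      wDeriv += -2 * err * this.xArr[i];
      bDeriv += -2 * err;
    }
    this.weight -= (wDeriv / this.points) * this.learnc;
    this.bias -= (bDeriv / this.points) * this.learnc;
  };
  // Train Function
  this.train = function(iter) {
    for (let i = 0; i < iter; i++) this.updateWeights();
    this.cost = this.costError();
  };
}

const xArr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const yArr = xArr.map(x => 1.2 * x + 5);
const trainer = new Trainer(xArr, yArr);
trainer.train(100000);
// trainer.weight should approach 1.2 and trainer.bias should approach 5
```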
Now you can include the library in HTML:
<script src="myailib.js"></script>
Machine Learning Subcategories
Supervised Learning
Unsupervised Learning
Supervised Machine Learning uses a set of input variables to predict the value of an output variable.
Unsupervised Machine Learning uses an unlabeled dataset, trying to find patterns (or groupings) in the data.
Machine Learning Inference
A Model defines the relationship between the label (y) and the features (x).
There are three phases in the life of a model:
Data Collection
Training
Inference
Supervised Learning
Supervised learning uses labeled data (data with known answers) to train algorithms to:
Classify Data
Predict Outcomes
Supervised learning can classify data like "What is spam in an e-mail", based on known spam examples.
Supervised learning can predict outcomes like predicting what kind of video you like, based on the videos you have played.
Unsupervised Learning
Unsupervised learning is used to discover undefined relationships, like meaningful patterns in data.
It is about creating computer algorithms that can improve themselves.
It is expected that machine learning will shift towards unsupervised learning, allowing programmers to solve problems without creating models.
Reinforcement Learning
Reinforcement learning builds on unsupervised learning, but it also receives feedback on whether each decision is good or bad. The feedback contributes to improving the model.
Self-Supervised Learning
Self-supervised learning is similar to unsupervised learning because it works with data without human added labels.
The difference is that unsupervised learning uses clustering, grouping, and dimensionality reduction, while self-supervised learning draws its own conclusions for regression and classification tasks.
Key Machine Learning terms are:
Relationships
Labels
Features
Models
Training
Inference
Relationships
Machine learning systems use Relationships between Inputs to produce Predictions.
In algebra, a relationship is often written as y = ax + b:
y is the label we want to predict
a is the slope of the line
x are the input values
b is the intercept
With ML, a relationship is written as y = b + wx:
y is the label we want to predict
w is the weight (the slope)
x are the features (input values)
b is the intercept
Machine Learning Labels
In Machine Learning terminology, the label is the thing we want to predict.
It is like the y in a linear graph:
Algebra: y = ax + b
Machine Learning: y = b + wx
Machine Learning Features
In Machine Learning terminology, the features are the input.
They are like the x values in a linear graph:
Algebra: y = ax + b
Machine Learning: y = b + wx
Sometimes there can be many features (input values) with different weights:
y = b + w1x1 + w2x2 + w3x3 + w4x4
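A prediction with several weighted features can be sketched as a small helper function (the function name, weights, and feature values below are illustrative):

```javascript
// Predict y = b + w1*x1 + w2*x2 + ... for given weights and features.
function predict(bias, weights, features) {
  let y = bias;
  for (let i = 0; i < weights.length; i++) {
    y += weights[i] * features[i];
  }
  return y;
}

// Example: b = 1, weights [2, 3], features [10, 100] -> 1 + 20 + 300 = 321
```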
Up to 80% of a Machine Learning project is about Collecting Data:
What data is Required?
What data is Available?
How to Select the data?
How to Collect the data?
How to Clean the data?
How to Prepare the data?
How to Use the data?
What is Data?
Data can be many things.
With Machine Learning, data is collections of facts:
Type: Examples
Numbers: Prices. Dates.
Measurements: Size. Height. Weight.
Words: Names and Places.
Observations: Counting Cars.
Descriptions: It is cold.
Intelligence Needs Data
Human intelligence needs data:
A real estate broker needs data about sold houses to estimate prices.
Artificial Intelligence also needs data:
A Machine Learning program needs data to estimate prices.
Data can help us to see and understand.
Data can help us to find new opportunities.
Data can help us to resolve misunderstandings.
Healthcare
Healthcare and life sciences collect public health data and patient data to learn how to improve patient care and save lives.
Business
The most successful companies in many sectors are data driven. They use sophisticated data analytics to learn how the company can perform better.
Finance
Banks and insurance companies collect and evaluate data about customers, loans and deposits to support strategic decision-making.
Storing Data
The most common data to collect are Numbers and Measurements.
Often data are stored in arrays representing the relationship between values.
This table contains house prices versus size:
Price: 7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15
Size: 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150
Quantitative vs. Qualitative
Quantitative data are numerical:
55 cars
15 meters
35 children
Qualitative data are descriptive:
It is cold
It is long
It was fun
Census or Sampling
A Census is when we collect data for every member of a group.
A Sample is when we collect data for some members of a group.
If we wanted to know how many Americans smoke cigarettes, we could ask every person in the US (a census), or we could ask 10 000 people (a sample).
A census is Accurate, but hard to do. A sample is less accurate, but easier to do.
Sampling Terms
A Population is a group of individuals (objects) we want to collect information from.
A Census is information about every individual in a population.
A Sample is information about a part of the population (chosen to represent the whole).
Random Samples
In order for a sample to represent a population, it must be collected randomly.
A Random Sample is a sample where every member of the population has an equal chance to appear in the sample.
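One simple (illustrative) way to draw a random sample in code is to shuffle a copy of the population and take the first n members. The function name and the Fisher-Yates shuffle are choices made for this sketch:

```javascript
// Draw a random sample of size n:
// Fisher-Yates shuffle a copy of the population, then take the first n members.
function randomSample(population, n) {
  const copy = population.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy.slice(0, n);
}
```

Because every permutation of the copy is equally likely, every member has an equal chance to end up in the sample.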
Sampling Bias
A Sampling Bias (Error) occurs when samples are collected in such a way that some individuals are less (or more) likely to be included in the sample.
Big Data
Big data is data that is impossible for humans to process without the assistance of advanced machines.
Big data has no strict definition in terms of size, but datasets are becoming larger and larger as we continuously collect more data and store it at a lower and lower cost.
Data Mining
With big data comes complicated data structures.
A huge part of big data processing is refining data.
Clusters are collections of similar data
Clustering is a type of unsupervised learning
The Correlation Coefficient describes the strength of a relationship.
Clusters
Clusters are collections of data based on similarity.
Data points that lie close together in a graph can often be classified into clusters.
In the graph below we can distinguish 3 different clusters:
Identifying Clusters
Clusters can hold a lot of valuable information, but clusters come in all sorts of shapes, so how can we recognize them?
The two main methods are:
Using Visualization
Using a Clustering Algorithm
Clustering
Clustering is a type of Unsupervised Learning.
Clustering is trying to:
Collect similar data in groups
Collect dissimilar data in other groups
Clustering Methods
Density Method
Hierarchical Method
Partitioning Method
Grid-based Method
The Density Method considers points in dense regions to be more similar to each other than to points in less dense regions. The density method has good accuracy and the ability to merge clusters. Two common algorithms are DBSCAN and OPTICS.
The Hierarchical Method forms the clusters in a tree-type structure. New clusters are formed using previously formed clusters. Two common algorithms are CURE and BIRCH.
The Grid-based Method formulates the data into a finite number of cells that form a grid-like structure. Two common algorithms are CLIQUE and STING.
The Partitioning Method partitions the objects into k clusters and each partition forms one cluster. One common algorithm is CLARANS.
Correlation Coefficient
The Correlation Coefficient (r) describes the strength and direction of a linear relationship between the x and y variables on a scatterplot.
The value of r is always between -1 and +1:
-1.00: Perfect downhill (negative) linear relationship
-0.70: Strong downhill (negative) linear relationship
-0.50: Moderate downhill (negative) linear relationship
-0.30: Weak downhill (negative) linear relationship
0: No linear relationship
+0.30: Weak uphill (positive) linear relationship
+0.50: Moderate uphill (positive) linear relationship
+0.70: Strong uphill (positive) linear relationship
+1.00: Perfect uphill (positive) linear relationship
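The correlation coefficient can be computed directly with the standard Pearson formula. This is a sketch (the function name is an assumption):

```javascript
// Pearson correlation coefficient r for two equal-length arrays.
function correlation(x, y) {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let sumXY = 0, sumXX = 0, sumYY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;  // deviation from the mean of x
    const dy = y[i] - meanY;  // deviation from the mean of y
    sumXY += dx * dy;
    sumXX += dx * dx;
    sumYY += dy * dy;
  }
  return sumXY / Math.sqrt(sumXX * sumYY);
}
```

A perfectly uphill dataset gives r = +1, a perfectly downhill dataset gives r = -1.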
A Regression is a method to determine the relationship between one variable (y) and other variables (x).
In statistics, a Linear Regression is an approach to modeling a linear relationship between y and x.
In Machine Learning, a Linear Regression is a supervised machine learning algorithm.
Scatter Plot
This is the scatter plot (from the previous chapter):
If scattered data points do not fit a linear regression (a straight line through the points), the data may fit a polynomial regression.
A Polynomial Regression, like linear regression, uses the relationship between the variables x and y to find the best way to draw a line through the data points.
The deep learning revolution started around 2010.
Since then, Deep Learning has solved many "unsolvable" problems.
The deep learning revolution was not started by a single discovery. It more or less happened when several necessary factors were in place:
Computers were fast enough
Computer storage was big enough
Better training methods were invented
Better tuning methods were invented
Neurons
Scientists agree that our brain has between 80 and 100 billion neurons.
These neurons have hundreds of billions of connections between them.
Neurons (aka Nerve Cells) are the fundamental units of our brain and nervous system.
The neurons are responsible for receiving input from the external world, for sending output (commands to our muscles), and for transforming the electrical signals in between.
Neural Networks
Artificial Neural Networks are normally called Neural Networks (NN).
Neural networks are in fact multi-layer Perceptrons.
The perceptron defines the first step into multi-layered neural networks.
Neural Networks are the essence of Deep Learning.
Neural Networks are among the most significant discoveries in history.
Neural Networks can solve problems that can NOT be solved by algorithms:
Medical Diagnosis
Face Detection
Voice Recognition
The Neural Network Model
Input data (Yellow) are processed against a hidden layer (Blue) and modified against another hidden layer (Green) to produce the final output (Red).
Tom Mitchell
Tom Michael Mitchell (born 1951) is an American computer scientist and University Professor at the Carnegie Mellon University (CMU).
He is a former Chair of the Machine Learning Department at CMU.
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
Tom Mitchell (1999)
E: Experience (the number of times). T: The Task (driving a car). P: The Performance (good or bad).
The Giraffe Story
In 2015, Matthew Lai, a student at Imperial College in London created a neural network called Giraffe.
Giraffe could be trained in 72 hours to play chess at the same level as an international master.
Computer chess programs are not new, but the way this program was created was new.
Smart chess-playing programs take years to build, while Giraffe reached its level after only 72 hours of neural network training.
Supervised Machine Learning
Unsupervised Machine Learning
Self-Supervised Machine Learning
Deep Learning
Classical programming uses programs (algorithms) to create results:
Traditional Computing
Data + Computer Algorithm = Result
Machine Learning uses results to create programs (algorithms):
Machine Learning
Data + Result = Computer Algorithm
Machine Learning
Machine Learning is often considered equivalent to Artificial Intelligence.
This is not correct. Machine learning is a subset of Artificial Intelligence.
Machine Learning is a discipline of AI that uses data to teach machines.
"Machine Learning is a field of study that gives computers the ability to learn without being programmed."
Arthur Samuel (1959)
Intelligent Decision Formula
Save the result of all actions
Simulate all possible outcomes
Compare the new action with the old ones
Check if the new action is good or bad
Choose the new action if it is less bad
Do it all over again
The fact that computers can do this millions of times has proven that computers can make very intelligent decisions.
Training and Deploying Machine Learning Models In the Browser
Tensorflow Models
Models and Layers are important building blocks in Machine Learning.
For different Machine Learning tasks you must combine different types of Layers into a Model that can be trained with data to predict future values.
TensorFlow.js supports different types of Models and different types of Layers.
A TensorFlow Model is a Neural Network with one or more Layers.
A Tensorflow Project
A Tensorflow project has this typical workflow:
Collecting Data
Creating a Model
Adding Layers to the Model
Compiling the Model
Training the Model
Using the Model
Example
Suppose you knew a function that defined a straight line:
Y = 1.2X + 5
Then you could calculate any y value with the JavaScript formula:
y = 1.2 * x + 5;
To demonstrate Tensorflow.js, we could train a Tensorflow.js model to predict Y values based on X inputs.
The TensorFlow model does not know the function.
// Create Training Data
const xs = tf.tensor([0, 1, 2, 3, 4]);
const ys = xs.mul(1.2).add(5);

// Define a Linear Regression Model
const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape: [1]}));

// Specify Loss and Optimizer
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

// Train the Model
model.fit(xs, ys, {epochs: 500}).then(() => {myFunction()});

// Use the Model
function myFunction() {
  const xArr = [];
  const yArr = [];
  for (let x = 0; x <= 10; x++) {
    xArr.push(x);
    let result = model.predict(tf.tensor([Number(x)]));
    result.data().then(y => {
      yArr.push(Number(y));
      if (x == 10) {plot(xArr, yArr)};
    });
  }
}
<script>
// Create Training Data
const xs = tf.tensor([0, 1, 2, 3, 4]);
const ys = xs.mul(1.2).add(5);

// Define a Linear Regression Model
const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape: [1]}));

// Specify Loss and Optimizer
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

// Train the Model
model.fit(xs, ys, {epochs: 500}).then(() => {myFunction()});

// Use the Model
function myFunction() {
  const xMax = 10;
  const xArr = [];
  const yArr = [];
  for (let x = 0; x <= xMax; x++) {
    let result = model.predict(tf.tensor([Number(x)]));
    result.data().then(y => {
      xArr.push(x);
      yArr.push(Number(y));
      if (x == xMax) {plot(xArr, yArr)};
    });
  }
  document.getElementById('message').style.display = "none";
}

function plot(xArr, yArr) {
  // Define Data
  const data = [{x: xArr, y: yArr, mode: "markers", type: "scatter"}];
<script>
// Create Training Data
const xs = tf.tensor([0, 1, 2, 3, 4]);
const ys = xs.mul(1.2).add(5);

// Define a Linear Regression Model
const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape: [1]}));

// Specify Loss and Optimizer
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

// Train the Model
model.fit(xs, ys, {epochs: 500}).then(() => {myFunction()});

// Use the Model
function myFunction() {
  const xMax = 20;
  const xArr = [];
  const yArr = [];
  for (let x = 10; x <= xMax; x++) {
    let result = model.predict(tf.tensor([Number(x)]));
    result.data().then(y => {
      xArr.push(x);
      yArr.push(Number(y));
      if (x == xMax) {display(xArr, yArr)};
    });
  }
}

function display(xArr, yArr) {
  let text = "Correct Predicted<br>";
  for (let i = 0; i < xArr.length; i++) {
    text += (xArr[i] * 1.2 + 5).toFixed(4) + " " + yArr[i].toFixed(4) + "<br>";
  }
  document.getElementById('message').innerHTML = text;
}
</script>
</body>
</html>
TensorFlow.js
TensorFlow Visor is a graphic tool for visualizing Machine Learning
It contains functions for visualizing TensorFlow Models
Visualizations can be organized in Visors (modal browser windows)
Can be used with custom tools like d3, Chart.js, and Plotly.js
Often called tfjs-vis
Using tfjs-vis
To use tfjs-vis, add the following script tag to your HTML file(s):
When you have your map and filter functions ready, you can write a function to fetch the data.
async function runTF() {
  const jsonData = await fetch("cardata.json");
  let values = await jsonData.json();
  values = values.map(extractData).filter(removeErrors);
}
Plotting the Data
Here is some code you can use to plot the data:
function tfPlot(values, surface) {
  tfvis.render.scatterplot(surface,
    {values: values, series: ['Original', 'Predicted']},
    {xLabel: 'Horsepower', yLabel: 'MPG'});
}
Shuffle Data
Always shuffle data before training.
When a model is trained, the data is divided into small sets (batches). Each batch is then fed to the model. Shuffling is important to prevent the model from getting the same data over again. If the same data is used twice, the model will not be able to generalize and give the right output. Shuffling gives a better variety of data in each batch.
Example
tf.util.shuffle(data);
TensorFlow Tensors
To use TensorFlow, input data needs to be converted to tensor data:
// Map x values to Tensor inputs
const inputs = values.map(obj => obj.x);
// Map y values to Tensor labels
const labels = values.map(obj => obj.y);

// Convert inputs and labels to 2d tensors
const inputTensor = tf.tensor2d(inputs, [inputs.length, 1]);
const labelTensor = tf.tensor2d(labels, [labels.length, 1]);
Data Normalization
Data should be normalized before being used in a neural network.
A range of 0 - 1 using min-max scaling is often best for numerical data:
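The idea behind min-max scaling can be sketched with plain JavaScript (with TensorFlow.js tensors, the same arithmetic is done with the tensor min(), max(), sub(), and div() methods; the function name here is an assumption):

```javascript
// Min-max normalize an array of numbers to the range 0 - 1.
function normalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map(v => (v - min) / (max - min));
}

// Example: normalize([50, 100, 150]) maps 50 -> 0, 100 -> 0.5, 150 -> 1
```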
In a sequential model, the input flows directly to the output. Other models can have multiple inputs and multiple outputs. Sequential is the easiest ML model: it lets you build a model layer by layer, with each layer's weights feeding the next layer.
TensorFlow Layers
model.add() is used to add two layers to the model.
tf.layers.dense is a layer type that works in most cases. It multiplies its inputs by a weight matrix and adds a number (the bias) to the result.
Shapes and Units
inputShape: [1] because we have 1 input (x = horsepower).
units: 1 defines the size of the weight matrix: 1 weight for each input (x value).
Compiling a Model
Compile the model with a specified optimizer and loss function:
// Plot the Result
tfPlot([values, predicted], surface1)
Input Data
Example 2 uses the same source code as Example 1.
But because another dataset is used, the code must collect different data.
Data Collection
The data used in Example 2, is a list of house objects:
{
  "Avg. Area Income": 79545.45857,
  "Avg. Area House Age": 5.682861322,
  "Avg. Area Number of Rooms": 7.009188143,
  "Avg. Area Number of Bedrooms": 4.09,
  "Area Population": 23086.8005,
  "Price": 1059033.558
},
{
  "Avg. Area Income": 79248.64245,
  "Avg. Area House Age": 6.002899808,
  "Avg. Area Number of Rooms": 6.730821019,
  "Avg. Area Number of Bedrooms": 3.09,
  "Area Population": 40173.07217,
  "Price": 1505890.915
},
The dataset is a JSON file stored at:
Cleaning Data
When preparing for machine learning, it is always important to:
Remove the data you don't need
Clean the data from errors
Remove Data
A smart way to remove unnecessary data is to extract only the data you need.
This can be done by iterating (looping over) your data with a map function.
The function below takes an object and returns only x and y from the object's Horsepower and Miles_per_Gallon properties:
function extractData(obj) { return {x:obj.Horsepower, y:obj.Miles_per_Gallon}; }
Remove Errors
Most datasets contain some type of errors.
A smart way to remove errors is to use a filter function to filter out the errors.
The code below returns false if one of the properties (x or y) contains a null value:
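The filter function itself is not shown here; a reconstructed sketch matching the description might look like this (the function name removeErrors follows the runTF snippet below):

```javascript
// Filter predicate: keep only objects where both x and y are present.
// Using != null also rejects undefined values.
function removeErrors(obj) {
  return obj.x != null && obj.y != null;
}
```

It is used with Array.filter, for example: values.filter(removeErrors).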
When you have your map and filter functions ready, you can write a function to fetch the data.
async function runTF() {
  const jsonData = await fetch("cardata.json");
  let values = await jsonData.json();
  values = values.map(extractData).filter(removeErrors);
}
Plotting the Data
Here is some code you can use to plot the data:
function tfPlot(values, surface) {
  tfvis.render.scatterplot(surface,
    {values: values, series: ['Original', 'Predicted']},
    {xLabel: 'Rooms', yLabel: 'Price'});
}
Shuffle Data
Always shuffle data before training.
When a model is trained, the data is divided into small sets (batches). Each batch is then fed to the model. Shuffling is important to prevent the model from getting the same data over again. If the same data is used twice, the model will not be able to generalize and give the right output. Shuffling gives a better variety of data in each batch.
Example
tf.util.shuffle(data);
TensorFlow Tensors
To use TensorFlow, input data needs to be converted to tensor data:
// Map x values to Tensor inputs
const inputs = values.map(obj => obj.x);
// Map y values to Tensor labels
const labels = values.map(obj => obj.y);

// Convert inputs and labels to 2d tensors
const inputTensor = tf.tensor2d(inputs, [inputs.length, 1]);
const labelTensor = tf.tensor2d(labels, [labels.length, 1]);
Data Normalization
Data should be normalized before being used in a neural network.
A range of 0 - 1 using min-max scaling is often best for numerical data:
const model = tf.sequential(); creates a Sequential ML Model.
In a sequential model, the input flows directly to the output. Other models can have multiple inputs and multiple outputs. Sequential is the easiest ML model: it lets you build a model layer by layer, with each layer's weights feeding the next layer.
TensorFlow Layers
model.add() is used to add two layers to the model.
tf.layers.dense is a layer type that works in most cases. It multiplies its inputs by a weight matrix and adds a number (the bias) to the result.
Shapes and Units
inputShape: [1] because we have 1 input (x = rooms).
units: 1 defines the size of the weight matrix: 1 weight for each input (x value).
Compiling a Model
Compile the model with a specified optimizer and loss function:
// Plot the Result
tfPlot([values, predicted], surface1)
Graphic Libraries
Graphics libraries and APIs used for rendering Artificial Intelligence graphs and other visuals (note that most of the libraries below are native APIs, not JavaScript libraries):
OpenGL: OpenGL (Open Graphics Library) is a widely used cross-platform graphics API that allows developers to interact with the GPU (Graphics Processing Unit) to render 2D and 3D graphics. It is commonly used in computer games, scientific simulations, and CAD (Computer-Aided Design) applications.
DirectX: DirectX is a collection of APIs developed by Microsoft for Windows platforms. It provides access to various multimedia features, including 2D and 3D graphics, audio, and input devices. DirectX is often used in game development on Windows.
Vulkan: Vulkan is a low-level graphics and compute API designed for high-performance graphics applications. It offers more control to developers but also requires more explicit management of resources compared to OpenGL.
Metal: Metal is Apple's graphics and GPU programming framework, primarily used on macOS and iOS devices. It allows developers to take full advantage of Apple's hardware for graphics rendering and computation.
Direct2D and Direct3D: These are subsets of the DirectX API, focusing on 2D and 3D graphics, respectively. Direct2D is often used for 2D game development and GUI rendering on Windows.
SFML (Simple and Fast Multimedia Library): SFML is a C++ multimedia library that simplifies the process of developing games and multimedia applications. It provides functions for graphics, audio, networking, and more.
SDL (Simple DirectMedia Layer): SDL is a cross-platform development library designed for multimedia applications and games. It offers support for graphics, audio, input, and window management.
Qt: Qt is a popular C++ framework for developing cross-platform GUI applications. It includes a wide range of libraries and tools for graphics, as well as other aspects of application development.
Cairo: Cairo is a 2D graphics library that provides a device-independent API for drawing vector graphics. It's often used for rendering graphics in applications and libraries that need high-quality 2D rendering.
Skia: Skia is an open-source 2D graphics library developed by Google. It is used in various Google products and can be integrated into applications for high-performance 2D rendering.
Plotly.js
Plotly.js is a charting library that comes with over 40 chart types, 3D charts, statistical graphs, and SVG maps.
Chart.js
Chart.js comes with many built-in chart types:
Scatter
Line
Bar
Radar
Pie and Doughnut
Polar Area
Bubble
Google Chart
From simple line charts to complex tree maps, Google Chart provides a number of built-in chart types:
Scatter Chart
Line Chart
Bar / Column Chart
Area Chart
Pie Chart
Donut Chart
Org Chart
Map / Geo Chart
HTML Canvas is perfect for Scatter Plots
HTML Canvas is perfect for Line Graphs
HTML Canvas is perfect for combining Scatter and Lines
// Plot Scatter
ctx.fillStyle = "red";
for (let i = 0; i < xArray.length; i++) {
  let x = xArray[i] * 400/150;
  let y = yArray[i] * 400/15;
  ctx.beginPath();
  ctx.ellipse(x, y, 2, 3, 0, 0, Math.PI * 2);
  ctx.fill();
}

// Plot Scatter
ctx.fillStyle = "red";
for (let i = 0; i < xArray.length; i++) {
  let x = xArray[i] * xMax/150;
  let y = yArray[i] * yMax/15;
  ctx.beginPath();
  ctx.ellipse(x, y, 2, 3, 0, 0, Math.PI * 2);
  ctx.fill();
}
for (let x = 0; x <= 10; x += 1) {
  x1Values.push(x);
  x2Values.push(x);
  x3Values.push(x);
  y1Values.push(eval(exp1));
  y2Values.push(eval(exp2));
  y3Values.push(eval(exp3));
}
Chart.js is a free JavaScript library for making HTML-based charts. It is one of the simplest visualization libraries for JavaScript, and comes with the following built-in chart types:
Scatter Plot
Line Chart
Bar Chart
Pie Chart
Donut Chart
Bubble Chart
Area Chart
Radar Chart
Mixed Chart
How to Use Chart.js?
Chart.js is easy to use.
First, add a link to the Chart.js CDN (Content Delivery Network):
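One common way to load Chart.js is from the jsDelivr CDN; the URL below is one widely used option (check the Chart.js documentation for the currently recommended URL and version):

```html
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
```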
About 70 000 years ago, something happened to the human brain.
Humans started to develop "Cognitive Intelligence":
Being able to understand a language
Being able to understand numbers
Being able to understand abstract thinking
Words and Numbers
Using words was a big step in the development of human intelligence:
"Elephant" is more informative than "Big Animal".
Understanding numbers was also a big step:
"5" or "50" is more informative than "few" or "many".
Languages
"There is a lion behind the big oak" is more informative than shouting "Danger!".
Having a language is probably a key characteristic that distinguishes us from animals.
Abstract Thinking
Abstract thinking is thinking about things that are not concrete, like freedom, or ideas, or concepts.
A language is a structured system of communication.
The type of communication that involves the use of words.
What is a Language?
Apes and Whales communicate with each other.
Birds and Bees communicate with each other.
But only humans have developed a real Language.
No other species can express ideas using sentences constructed by a set of words (with verbs and nouns).
This skill is remarkable. And what is even more remarkable: Even children master this skill.
Spoken Languages
We are not sure how old spoken language is. The topic is difficult to study because of the lack of evidence.
We don't know how it started. But we have a clue.
The great African apes, Pan and Gorilla, are our closest living relatives. Why are they called "Apes"? Because they ape. Apes mime to get their message across.
It is assumed that the evolution of languages must have been a long process. Our ancestors might have started speaking a million years ago, but with fewer words, more miming, and no grammar.
The Tower of Babel
Cognitive Development
According to , there are six aspects of language development:
Theory of Mind
Understanding Vocal Signals
Understanding Imitation
Understanding Numbers
Understanding Intentional Communication
Understanding Non-linguistic Representations
Human Languages
Human languages contain a limited set of Words put together in Sentences:
Example
I'm going on holiday in my new car. Vado in vacanza con la mia macchina nuova. Me voy de vacaciones en mi auto nuevo. Ich fahre mit meinem neuen Auto in den Urlaub.
Computer Languages
Computers are programmed with a limited set of Words put together in computer Statements:
Example
var points = [40, 100, 1, 5, 25, 10];
points.sort(function(a, b){return a - b});
Written Languages
Egyptian and Sumerian are the earliest known written languages.
(The oldest written language still in use today is Chinese.)
3500 BC: Sumerian
3000 BC: Egyptian
1500 BC: Chinese
1500 BC: Vedic Sanskrit
1500 BC: Greek
1000 BC: Hebrew
900 BC: Aramaic
700 BC: Etruscan
700 BC: Latin
500 BC: Tamil
400 AD: Gothic (German)
500 AD: Arabic
700 AD: Classical Sanskrit
700 AD: English
700 AD: Japanese
800 AD: French
900 AD: Italian
1000 AD: Spanish
To understand AI, it is important to understand the concept of Numbers and Counting.
AI is About Numbers
Artificial Intelligence is all about Numbers.
Numbers are easy to understand: 1,2,3,4,5 ... 11,12,13,14,15.
Studies of animals indicate that even animals can understand some numbers:
2 Wives
8 Sons
5 Eggs
The need for numbers in the modern world is absolute. We cannot live without numbers:
100 Dollars
Pi = 3.14
365 Days
25 Years
20% Tax
100 Miles
AI is About Counting
The concept of numbers leads to the concept of counting.
Imagine prehistoric thinking:
How to count apples?
How to weigh corn?
How to pay?
How far is the ocean?
Artificial Intelligence is a result of the human need for calculations.
Counting is easy to understand: 2 + 2 = 4.
Studies of animals indicate that animals can only understand very simple counting.
How do Homo Sapiens deal with calculations?
Complex calculations are done by computers.
"Yes! Computers can be smarter than humans."
Two Babylonian Scientists
About 6000 Years ago ...
Two Babylonian scientists were talking:
Scientist 1: "We need to invent a number system".
Scientist 2: "What?".
Scientist 1: "We need to give every number a name".
Scientist 2: "You mean like 1, 2, and 3".
Scientist 1: "Exactly!".
Scientist 2: "But why?".
Scientist 1: "How can I tell you I have 7 sons, if you don't know what 7 is?".
Scientist 2: "Every number should have a name?".
Scientist 1: "Exactly!".
Scientist 2: "So, how many numbers do we need? 15?".
Scientist 1: "More. Some people have more than 15 sons".
Scientist 2: "Ok. 30 then. Just to be sure".
Scientist 1: "But people older than 30 should be able to tell their age".
Scientist 2: "Ok. 60 then".
Babylonian Numbers (Base 60)
We believe that the Babylonians started the development of complex counting.
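The positional idea behind base-60 counting can be sketched in JavaScript. The helper name toBase60 is ours, for illustration only:

```javascript
// Convert a non-negative integer to base-60 digits (most significant first),
// the positional principle behind Babylonian numerals
function toBase60(n) {
  if (n === 0) return [0];
  const digits = [];
  while (n > 0) {
    digits.unshift(n % 60);   // remainder is the next digit
    n = Math.floor(n / 60);   // move to the next position
  }
  return digits;
}

console.log(toBase60(75));    // [1, 15]   -> 1*60 + 15
console.log(toBase60(3661));  // [1, 1, 1] -> 1*3600 + 1*60 + 1
```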
History of AI and ML
1950
Alan Turing publishes "Computing Machinery and Intelligence"
1952
Arthur Samuel develops a self-learning program to play checkers
1956
Artificial Intelligence used by John McCarthy in a conference
1957
First programming language for numeric and scientific computing (FORTRAN)
1958
First AI programming language (LISP)
1959
Arthur Samuel used the term Machine Learning
1959
John McCarthy and Marvin Minsky founded the MIT Artificial Intelligence Project
1961
First industrial Robot (Unimate) on the assembly line at General Motors
1965
ELIZA by Joseph Weizenbaum was the first program that could communicate on any topic
1972
First logic programming language (PROLOG)
1991
U.S. forces used DART (automated logistics planning and scheduling) in the Gulf War
1997
Deep Blue (IBM) beats the world champion in chess
2002
The first robot cleaner (Roomba)
2005
Self-driving car (STANLEY) wins DARPA
2008
Breakthrough in speech recognition (Google)
2011
A neural network wins over humans in traffic sign recognition (99.46% vs 99.22%)
2011
Apple Siri
2011
Watson (IBM) wins Jeopardy!
2014
Amazon Alexa
2014
Microsoft Cortana
2014
Self-driving car (Google) passes a state driving test
2015
Google AlphaGo defeated various human champions in the board game Go
2016
The humanoid robot Sophia by Hanson Robotics
Why AI Now?
One of the greatest innovators in the field of machine learning was John McCarthy, widely recognized as the "Father of Artificial Intelligence".
In the mid 1950s, McCarthy coined the term "Artificial Intelligence" and defined it as "the science of making intelligent machines".
The algorithms have been around since then. So why is AI more interesting now?
The answer is that, until recently:
Computing power was not strong enough
Computer storage was not large enough
Big data was not available
Fast Internet was not available
Another strong force is the major investments from big companies (Google, Microsoft, Facebook, YouTube) because their datasets became much too big to handle traditionally.
Man vs Machine
Man
Computer
Smart
Stupid
Slow
Fast
Inaccurate
Accurate
Interesting Questions
Studying AI raises many interesting questions:
"Can computers think like humans?"
"Can computers be smarter than humans?"
"Can computers take over the world?"
Machines can understand verbal commands, recognize faces, drive cars, and play games better than us.
In the future, wealthy industrialists and business magnates and their top employees reign over the city of Metropolis from colossal skyscrapers, while underground-dwelling workers toil to operate the great machines that power it.
Rated as one of the greatest and most influential films ever made. Inscribed in UNESCO's Memory of the World Register in 2001, as the first film thus distinguished.
Selected for preservation in the National Film Registry by the US Library of Congress in 1991 as Culturally, Historically, or Aesthetically Significant.
Westworld American Science-Fiction Western (1973).
An adult amusement park has 3 worlds populated with androids that are indistinguishable from human beings: Western World (American Old West), Medieval World (Medieval Europe), and Roman World (City of Pompeii).
Westworld is a story about how artificial intelligence can be used to entertain us and allow us to live out our fantasies.
Winning 7 Oscars at the 50th Academy Awards. The Empire Strikes Back (1980) and Return of the Jedi (1983) rounded out the original Star Wars trilogy.
Selected for preservation in the National Film Registry by the US Library of Congress in 1989, for being Culturally, Historically, or Aesthetically Significant.
Selected for preservation in the US National Film Registry by the Library of Congress in 2012, for being Culturally, Historically, or Aesthetically Significant.
Theodore develops a relationship with Samantha, an artificially intelligent virtual assistant personified through a female voice. Gives us a glimpse of how artificial assistants can be in the future and how we can even fall in love with them.
Based on the biography Alan Turing: The Enigma by Andrew Hodges. The film's title quotes the game Alan Turing proposed for answering the question "Can machines think?", in his 1950 seminal paper "Computing Machinery and Intelligence".
Abacus
Analog Computers
Digital Computers
Electronic Computers
Computer Speed
The First Abacus
The Babylonian Abacus was developed to Reduce the Time to perform calculations.
As stated in the previous chapter, we believe that the Babylonians invented complex counting.
The period 2700-2300 BC probably saw the first appearance of an abacus: a table of successive columns which defined the successive orders of a base-60 number system.
Abacus 2.0
The Roman Abacus used base-10 Roman numerals to Reduce the Time to perform calculations:
Image: 1911 Encyclopedia Britannica (public domain).
The Romans developed the Roman Abacus, a portable, base-10 version of earlier abacuses used by the Babylonians.
This was the world's first handheld computer, used by Roman engineers, merchants, and tax collectors.
Analog Computers
The Difference Engine (Charles Babbage 1822) was a mechanical machine designed to Reduce the Time to calculate complex mathematical functions.
The Analytical Engine (Charles Babbage 1833) was a mechanical machine designed with modern computer elements like arithmetic, logic, and memory.
Both these "computers" used decimal (base-10) mechanical cogwheels to perform mathematical calculations:
Digital Computers use 0/1 switches to perform calculations. They operate on binary values like 11100110 in contrast to analog values like 230.
The first Electric Digital Computer was designed and built by Konrad Zuse in Germany (1941).
It used 2600 electrical relays as 0/1 switches. The clock speed was about 5 Hz.
Replica of the Zuse Z3. Deutsches Museum. Munich.
Electronic Computers
First generation Computers (1945-1950) used vacuum tubes as binary switches.
Vacuum tubes are much faster than electrical relays.
The clock speed of these computers was between 500 kHz and 1 MHz.
Second Generation Computers
Second generation Computers (1950-1960) used transistors as binary 0/1 switches.
Transistors are much faster than vacuum tubes.
Third Generation Computers
Third generation Computers (from 1960) used integrated circuits as binary switches.
Integrated circuits are much faster than transistors.
Computer Speed
The first electrical computer could do 5 instructions per second.
The first electronic computer did 5000 instructions per second.
The first PC did 5 million instructions per second.
An AMD processor was the first PC processor to reach 1 billion instructions per second.
Today, the iPhone 12 can do 11 billion instructions per second.
Year    Computer    Instructions per Second    Bits per Instruction
1941    Z3          5                          4
1945    ENIAC       5,000                      8
1981    IBM PC      5,000,000                  16
1995
Industrial Robots
Artificial Intelligence Robots
Industrial Robots
Industrial robots have been around for more than 50 years.
The first robot patent was applied for in 1954 and granted in 1961.
In 1969, Victor Scheinman invented the Stanford Arm (Stanford University), and in 1972 he designed the MIT Arm for the MIT Artificial Intelligence Lab.
These "robots" are not considered intelligent. They are electrical machines designed to permit 6-axis arm movement. But the new design allowed for a machine to follow a programmed path and opened up for "robot jobs" like car painting, welding and assembly.
The pioneering robot company Unimation (with support from General Motors) took Scheinman's design to the market as the PUMA (Programmable Universal Machine for Assembly).
Most industrial robots are non-intelligent.
Most modern robots are said to be autonomous or semi-autonomous because they do not require much human input after they have been programmed.
A robot can easily be programmed to do a lot of different things (like on an assembly line), but it will never change what it is doing. It will continue to do the same job until you turn it off.
Robots and Artificial Intelligence are two different things.
Robot technology is not a subset of Artificial Intelligence.
A robot is a physical thing. After 50 years of development, almost anything is programmable, your radio, your watch, your phone, and even robots.
AI Robots
Artificial Intelligence can be built into robots, and AI is a very exciting field in future robotics.
Hanson Robotics' Sophia personifies some dreams for the future of AI.
Sophia is a combination of science, engineering, and artistry. She is a human-crafted science fiction character depicting the future of AI and robotics.
Can AI Robots Interact Socially?
Yes, AI Robots can learn to interact socially.
Kismet (an MIT robot) is programmed to understand body language and voice inflection. The creators studied how humans and babies interact, based on tone of speech and visual cues.
Kismet could be the foundation of a human-like learning system.
How long will it take before they walk among us?
What Jobs Will be Taken Over by Computers?
In 2013, the Oxford scientists Carl Benedikt Frey and Michael A. Osborne published a study estimating that computers could take over 47% of all professions within Two Decades.
The table below is from the list. It ranks occupations according to their probability of computerization.
Bookkeeping
99% of all Tax Preparers
99% of all Account Clerks
98% of all Bookkeeping Clerks
98% of all Credit Analysts
Carl Benedikt Frey and Michael A. Osborne were right.
Today, bookkeeping is automated.
For these jobs, computers are much more affordable than people.
Sales And Customers
99% of all Telemarketers
97% of all Cashiers
94% of all Door-to-Door Salesmen
92% of all Insurance Sales Agents
85% of all Sales Representatives
58% of all Financial Advisors
55% of all Customer Service
54% of all Sales Agents
They were right.
Web shopping is taking over.
Conversion rates for telephone sales are not very attractive.
Production Workers
98% of all Packaging Machine Operators
95% of all Print Binding and Finishing Workers
93% of all Industrial Truck Operators
92% of all Production Workers
87% of all Food Preparation Workers
Assembly line robots have been around for 50 years.
Today, smart robots can be programmed for a wide range of tasks.
Robots are replacing jobs at Amazon:
Read and Write
99% of all Data Entry Keyers
84% of all Proofreaders
65% of all Librarians
54% of all Film and Video Editors
Proofreading software is everywhere. From spell check to .
Bartenders
81% of all Fast Food Cooks
77% of all Bartenders
77% of all Dishwashers
Coffee robots can replace many baristas.
Robotic Bartenders on Quantum of the Seas - Royal Caribbean:
Postal Services
95% of all Postal Service Clerks
79% of all Mail Sorters
75% of all Postmasters
68% of all Mail Carriers
Airplane Pilots
55% of all Pilots
The military uses drones today.
Tomorrow AI will replace the pilots of cargo planes.
Passenger planes will have only one pilot. The second pilot will be an autopilot.
Drivers. Transportation Jobs
69% of all Taxi Drivers
69% of all Truck Drivers
67% of all Bus Drivers
Self-driving cars are already a reality.
The potential number of jobs lost is staggering.
Very soon AI could replace millions of transportation jobs.
Robots could take 20 million jobs by 2030
According to a study from (2019), there could be 14 million robots working in China by 2030.
Thoughts
Feelings
Emotions
Self Awareness
Empathy
What is Mind?
There is no single accepted definition of Mind.
Mind can be defined as an instantiation of intelligence.
Mind can be defined as a collection of knowledge.
Is thinking, feeling, and meaning mind or know-how?
Is the Mind just a big computer?
Cognitive Science
Cognitive science is the study of mind processes.
A cognitive scientist studies intelligence and behavior.
Cognitive science focuses on how brain cells process and transform information.
Cognitive Science also tries to learn how to develop intelligent computer algorithms.
The Mental Model
A mental model is an internal picture of the external reality.
Scientists expect that mental models play a major role in reasoning and decision-making (cognition).
Kenneth Craik suggested in 1943 that the mind constructs "small-scale models" of reality when trying to anticipate events.
"The image of the world around us, which we carry in our head, is just a model. Nobody in his head imagines all the world, government or country. He has only selected concepts, and relationships between them, and uses those to represent the real system." (Jay Wright Forrester, 1971)
Can AI Be Human?
Scientists are trying to discover what separates human intelligence from artificial intelligence.
What is the status? What is the future?
Year 2000: Reactive Machines
Year 2015: Machine Learning
Year 2030: Theory of Mind
Year 2050: Self-Awareness
Reactive Machines
Early AI systems were reactive. Reactive systems cannot use past experience.
In 1997 a reactive machine ("IBM Deep Blue") beat the world champion in chess.
"Deep Blue" could not think. But it stored information about the chess board and the rules for moving chess pieces.
"Deep Blue" won because it was programmed to calculate every move to win.
Machine Learning
Today, AI systems can use some information from the past.
One example is self-driving cars. They can combine pre-programmed information with information they collect while they learn how to drive.
Theory of Mind
Theory of Mind is a term from psychology about an individual's capacity for empathy and understanding of others.
This is an awareness of others being like yourself, with individual needs and intentions.
One of the abilities language users have, is to communicate about things that are not concrete, like needs, ideas, or concepts.
, British psychologist and professor at the University of Cambridge, argues (1999) that "Theory of Mind" must have preceded languages, based on knowledge about early human activities:
Teaching
Building Shared Goals
Building Shared Plans
Intentional Communication
Intentional Sharing of Topic
Intentional Sharing of Focus
Intentional Persuasion
Intentional Pretending
Intentional Deception
Self-Awareness
In psychology, "Theory of Mind" means that people have thoughts, feelings and emotions that affect their behavior.
Future AI systems must learn to understand that everyone (both people and AI objects) have thoughts and feelings.
Future AI systems must know how to adjust their behavior to be able to walk among us.
The last step, before AI can be human, is machine consciousness.
We cannot construct this software before we know much more about the human brain, memory, and intelligence.
The main branches of Mathematics involved in Machine Learning are:
Linear Functions
Linear Graphics
Linear Algebra
Probability
Statistics
Machine Learning = Mathematics
Behind every ML success there is Mathematics.
All ML models are constructed using solutions and ideas from math.
The purpose of ML is to create models for understanding thinking.
If you want an ML career:
Data Scientist
Machine Learning Engineer
Robot Scientist
Data Analyst
Natural Language Expert
Deep Learning Scientist
You should focus on the mathematical concepts described here.
Linear Functions
Linear means straight
A linear function is a straight line
A linear graph represents a linear function
Graphics
Graphics plays an important role in Math
Graphics plays an important role in Statistics
Graphics plays an important role in Machine Learning
A Function is a special relationship where each input has exactly one output.
A function is often written as f(x) where x is the input:
Results from f(x) = x
Results from f(x) = 2x
Linear Equations
A Linear Equation is an equation for a straight line:
y = x
y = x*2
y = x*2 + 7
y = ax + b
5x = 3y
y/2 = 6
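As a quick sketch, a linear equation like y = ax + b maps directly to a JavaScript function:

```javascript
// y = ax + b, where a is the slope and b is the intercept
const linear = (a, b) => (x) => a * x + b;

const f = linear(2, 7);  // y = x*2 + 7
console.log(f(0));  // 7
console.log(f(3));  // 13
```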
Plot of y = x*2 + 7, y = x*2, and y = x.
Non-Linear Equations
A Linear Equation can NOT contain exponents or square roots:
y = x**2
y = Math.sqrt(x)
y = Math.sin(x)
Linear Regression
A Linear regression tries to model the relationship between two variables by fitting a linear graph to data.
One variable (x) is considered to be the explanatory (independent) variable, and the other (y) is considered to be the dependent variable.
For example, a Linear Regression can be a model to relate the price of houses to their size.
Linear Least Squares
Linear algebra is used to solve Linear Equations.
Linear Least Squares (LLS) is a set of formulations for solving statistical problems involved in Linear Regression.
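A minimal sketch of ordinary least squares in plain JavaScript (no library assumed; the function name leastSquares is ours), fitting y = ax + b to paired data:

```javascript
// Fit y = ax + b to paired data by ordinary least squares
function leastSquares(xs, ys) {
  const n = xs.length;
  const sumX = xs.reduce((s, v) => s + v, 0);
  const sumY = ys.reduce((s, v) => s + v, 0);
  const sumXY = xs.reduce((s, v, i) => s + v * ys[i], 0);
  const sumXX = xs.reduce((s, v) => s + v * v, 0);
  const a = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
  const b = (sumY - a * sumX) / n;
  return { a, b };
}

// Data that lies exactly on a line recovers the line y = 2x + 1
console.log(leastSquares([1, 2, 3, 4], [3, 5, 7, 9]));  // { a: 2, b: 1 }
```

With noisy data (like house prices against size), the same formulas return the best-fitting slope and intercept instead of an exact line.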
Machine Learning experts cannot live without Linear Algebra:
ML makes heavy use of Scalars
ML makes heavy use of Vectors
ML makes heavy use of Matrices
ML makes heavy use of Tensors
The purpose of this chapter is to highlight the parts of linear algebra that are used in data science projects like machine learning and deep learning.
Vectors and Matrices
Vectors and Matrices are the languages of data.
With ML, most things are done with vectors and matrices.
With vectors and matrices, you can Discover Secrets.
Scalars
In linear algebra, a scalar is a single number.
In JavaScript it can be written like a constant or a variable:
const myScalar = 1;
let x = 1;
var y = 1;
Vectors
In linear algebra, a vector is an array of numbers.
In JavaScript, it can be written as an array:
const myArray = [50,60,70,80,90,100,110,120,130,140,150];
myArray.length;  // the length of myArray is 11
An array can have multiple dimensions, but a vector is a 1-dimensional array.
A vector can be written in many ways. The most common are:
Matrices
In linear algebra, a matrix is a 2-dimensional array.
C =
| 3 0 0 0 |
| 0 3 0 0 |
| 0 0 3 0 |
| 0 0 0 3 |
In JavaScript, a matrix is an array with 2 indices (indexes).
Example
const myArray = [[1,2],[3,4],[5,6]];
Tensors
A Tensor is an N-dimensional Matrix.
In JavaScript, a tensor is an array with multiple indices (indexes).
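For example, a 2x2x2 tensor (three indices) can be sketched as a nested array:

```javascript
// A 2x2x2 tensor: three indices select block, row, and column
const tensor = [
  [[1, 2], [3, 4]],
  [[5, 6], [7, 8]]
];

console.log(tensor[1][0][1]);  // 6
```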
Linear Algebra is the branch of mathematics that concerns linear equations (and linear maps) and their representations in vector spaces and through matrices.
Linear algebra is central to almost all areas of mathematics.
Wikipedia
Vectors are 1-dimensional Arrays
Vectors have a Magnitude and a Direction
Vectors typically describe Motion or Force
Vector Notation
Vectors can be written in many ways. The most common are:
Motion
Vectors are the building blocks of Motion
In geometry, a vector can describe a movement from one point to another.
The vector [3, 2] says go 3 right and 2 up.
Vector Addition
The sum of two vectors (a+b) is found by moving the vector b until the tail meets the head of vector a. (This does not change vector b).
Then, the line from the tail of a to the head of b is the vector a+b:
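The head-to-tail rule above is just element-wise addition. A sketch (the helper name addVectors is ours):

```javascript
// Element-wise vector addition: the coordinates of a+b are the sums of the coordinates
function addVectors(a, b) {
  return a.map((v, i) => v + b[i]);
}

console.log(addVectors([3, 2], [2, 3]));  // [5, 5]
```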
Vector Subtraction
Vector -a is the opposite of +a.
This means that vector a and vector -a have the same magnitude, but opposite directions:
Scalar Operations
Vectors can be modified by adding, subtracting, or multiplying a scalar (number) with every value of the vector:
a = [1 1 1]
a + 1 = [2 2 2]
[1 2 3] + 1 = [2 3 4]
Scalar multiplication of vectors has many of the same properties as normal multiplication:
[2 2 2] * 3 = [6 6 6]
[6 6 6] / 3 = [2 2 2]
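These scalar operations can be sketched in JavaScript with Array.map:

```javascript
// Apply a scalar operation to every element of a vector
const a = [1, 1, 1];

console.log(a.map((v) => v + 1));         // [2, 2, 2]
console.log([2, 2, 2].map((v) => v * 3)); // [6, 6, 6]
console.log([6, 6, 6].map((v) => v / 3)); // [2, 2, 2]
```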
Force
Force is a Vector.
Force is a vector with a Magnitude and a Direction.
Velocity
Velocity is a Vector.
Velocity is a vector with a Magnitude and a Direction.
A matrix is a set of Numbers.
A matrix is a Rectangular Array.
A matrix is arranged in Rows and Columns.
Matrix Dimensions
Square Matrices
A Square Matrix is a matrix with the same number of rows and columns.
An n-by-n matrix is known as a square matrix of order n.
A 2-by-2 matrix (Square matrix of order 2):
A 4-by-4 matrix (Square matrix of order 4):
Diagonal Matrices
A Diagonal Matrix has values on the diagonal entries, and zero on the rest:
The Identity Matrix
The Identity Matrix has 1 on the diagonal and 0 on the rest.
This is the matrix equivalent of 1. The symbol is I.
If you multiply any matrix with the identity matrix, the result equals the original.
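A sketch verifying the identity property (the helper name matMul is ours, not from the text):

```javascript
// Multiply two matrices: entry (i, j) is the dot product of row i of A and column j of B
function matMul(A, B) {
  return A.map((row) =>
    B[0].map((_, j) => row.reduce((sum, v, k) => sum + v * B[k][j], 0))
  );
}

const I = [[1, 0], [0, 1]];  // 2-by-2 identity matrix
const A = [[2, 3], [4, 5]];

console.log(matMul(A, I));  // [[2, 3], [4, 5]] (unchanged)
```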
The Zero Matrix
A Zero Matrix is a matrix with only zero values.
A Tensor is an N-dimensional Matrix:
A Scalar is a 0-dimensional tensor
A Vector is a 1-dimensional tensor
A Matrix is a 2-dimensional tensor
A Tensor is a generalization of Vectors and Matrices to higher dimensions.
Tensor Ranks
The number of directions a tensor can have in an N-dimensional space is called the Rank of the tensor.
The rank is denoted R.
A Scalar is a single number.
It has 0 Axes
It has a Rank of 0
It is a 0-dimensional Tensor
A Vector is an array of numbers.
It has 1 Axis
It has a Rank of 1
It is a 1-dimensional Tensor
A Matrix is a 2-dimensional array.
It has 2 Axes
It has a Rank of 2
It is a 2-dimensional Tensor
Real Tensors
Technically, all of the above are tensors, but when we speak of tensors, we generally speak of matrices with a dimension larger than 2 (R > 2).
Statistics are tools to get answers to questions about data:
What is Common?
What is Expected?
What is Normal?
What is the Probability?
Inferential Statistics
Inferential statistics are methods for quantifying properties of a population from a small Sample:
You take data from a sample and make a prediction about the whole population.
For example, you can stand in a shop and ask a sample of 100 people if they like chocolate.
From your research, using inferential statistics, you could predict that 91% of all shoppers like chocolate.
Incredible Chocolate Facts
Nine out of ten people love chocolate.
50% of the US population cannot live without chocolate every day.
You use Inferential Statistics to predict whole domains from small samples of data.
Descriptive Statistics
Descriptive Statistics summarizes (describes) observations from a set of data.
Since we register every newborn baby, we can tell that 51 out of 100 are boys.
From these collected numbers, we can predict a 51% chance that a new baby will be a boy.
It is a mystery that the ratio is not 50%, as basic biology would predict. We only know that this tilted sex ratio has been observed since the 17th century.
Note
Raw observations are only data. They are not real knowledge.
You use Descriptive Statistics to transform raw observations into data that you can understand.
Descriptive Statistics Measurements
Descriptive statistics are broken down into different measures:
Tendency (Measures of the Center)
The Mean (the average value)
The Median (the mid point value)
The Mode (the most common value)
Spread (Measures of Variability)
Min and Max
Standard Deviation
Variance
Skewness
Kurtosis
Descriptive Statistics is broken down into Tendency and Variability.
Tendency is about Center Measures:
The Mean (the average value)
The Median (the mid point value)
The Mode (the most common value)
The Mean
The Mean Value is the Average of all values.
This table contains 11 values:
To find the Mean Value: Add all values and divide by the number of values.
The Mean Value is: (7+8+8+9+9+9+10+11+14+14+15)/11 = 10.3636363636.
The Mean is the Sum divided by the Count.
Calculate the Mean Value:
let mean = (7+8+8+9+9+9+10+11+14+14+15)/11;
Or use a math library like math.js:
const values = [7,8,8,9,9,9,10,11,14,14,15];
let mean = math.mean(values);
The Median
A list of speed values:
99,86,87,88,111,86,103,87,94,78,77,85,86
The Median is the value in the middle (after the values are sorted):
77,78,85,86,86,86,87,87,88,94,99,103,111
Calculate the median:
const speed = [99,86,87,88,111,86,103,87,94,78,77,85,86];
let median = math.median(speed);
If there are two numbers in the middle, divide the sum of them by two.
77,78,85,86,86,86,87,87,88,94,99,103
(86 + 87) / 2 = 86.5
Calculate the median:
const speed = [99,86,87,88,86,103,87,94,78,77,85,86];
let median = math.median(speed);
The Mode
The Mode Value is the value that appears most often:
99,86,87,88,111,86,103,87,94,78,77,85,86
Calculate the mode:
const speed = [99,86,87,88,111,86,103,87,94,78,77,85,86];
let mode = math.mode(speed);
Outliers
Outliers are values "outside" the other values:
99,86,87,88,111,86,103,87,94,78,300,85,86
Outliers can change the mean a lot. Sometimes we don't use them (they might be an error), or we use the median or the mode instead.
Calculate the Mean:
const values = [99,86,87,88,111,86,103,87,94,78,300,85,86];
let mean = math.mean(values);
Descriptive Statistics is broken down into Tendency and Variability.
Variability uses these measures:
Min and Max
Variance
Deviation
Distribution
Skewness
Kurtosis
The Variance
In statistics, the Variance is the average of the squared differences from the Mean Value.
In other words, the variance describes how far a set of numbers is Spread Out from the mean (average) value.
Mean value is described in the previous chapter.
This table contains 11 values:
Calculate the Variance:
// Calculate the Mean (m)
let m = (7+8+8+9+9+9+10+11+14+14+15) / 11;

// Calculate the Sum of Squares (ss)
let ss = (7-m)**2 + (8-m)**2 + (8-m)**2 +
         (9-m)**2 + (9-m)**2 + (9-m)**2 +
         (10-m)**2 + (11-m)**2 +
         (14-m)**2 + (14-m)**2 + (15-m)**2;

// Calculate the Variance
let variance = ss / 11;
Or use a math library like math.js:
const values = [7,8,8,9,9,9,10,11,14,14,15];
let variance = math.variance(values, "uncorrected");
Standard Deviation
Standard Deviation is a measure of how spread out numbers are.
The symbol is σ (Greek letter sigma).
The formula is σ = √variance (the square root of the variance).
The Standard Deviation is (in JavaScript):
// Calculate the Mean (m)
let m = (7+8+8+9+9+9+10+11+14+14+15) / 11;

// Calculate the Sum of Squares (ss)
let ss = (7-m)**2 + (8-m)**2 + (8-m)**2 +
         (9-m)**2 + (9-m)**2 + (9-m)**2 +
         (10-m)**2 + (11-m)**2 +
         (14-m)**2 + (14-m)**2 + (15-m)**2;

// Calculate the Variance
let variance = ss / 11;

// Calculate the Standard Deviation
let std = Math.sqrt(variance);
Deviation is a measure of Distance.
How far (on average), all values are from the Mean (the Middle).
Or if you use a math library like math.js:
const values = [7,8,8,9,9,9,10,11,14,14,15];
let std = math.std(values, "uncorrected");
What is Normal Distribution?
What is Margin of Error?
What is Skewness?
What is Kurtosis?
Normal Distribution
The Normal Distribution Curve is a bell-shaped curve.
Each band of the curve has a width of 1 Standard Deviation:
Probability is about how Likely something is to occur, or how likely something is true.
Mathematically, a probability is a Number between 0 and 1.
0 indicates Impossibility and 1 indicates Certainty.
The Probability of an Event
The probability of an event is:
The number of ways the event can happen / The number of possible outcomes.
Probability = # of Ways / Outcomes
Tossing Coins
P(A) - The Probability
The probability of an event A is often written as P(A).
When tossing two coins, there are 4 possible outcomes:
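The four outcomes, and the probability of tossing two heads, can be sketched directly:

```javascript
// Enumerate all outcomes of tossing two coins (H = heads, T = tails)
const outcomes = [];
for (const first of ["H", "T"]) {
  for (const second of ["H", "T"]) {
    outcomes.push(first + second);
  }
}
console.log(outcomes);  // ["HH", "HT", "TH", "TT"]

// Probability = number of ways the event can happen / number of possible outcomes
const pTwoHeads = outcomes.filter((o) => o === "HH").length / outcomes.length;
console.log(pTwoHeads);  // 0.25
```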