Hello World! for Machine Learning

Create your first Machine Learning model with a few lines of code

Posted by Navendu Pottekkat on May 15, 2020

The most common question that keeps popping up is "How do I get started with Machine Learning?". I know it can be quite overwhelming, with all the different tools and resources available online and no idea where to start. Believe me, I have been there. So in this article, I will try to get you started with building Machine Learning models and familiarise you with the practices that are used in the industry.

I hope you know a little Python, which we will be using to code our model. If not, here is a great place for you to start: https://www.w3schools.com/python/

What is Machine Learning?

Machine Learning gives systems the ability to automatically learn and improve from experience without being explicitly programmed.

"Geez! that seems complicated..."

To put it simply, Machine Learning is just pattern recognition. That's it!

What we want is a machine that can learn from experience – Alan Turing, 1947
Learning by experience

Consider a baby playing with a toy. He has to put the blocks in the correct slots or they won't fit. He tries to push the square block into the circular hole. It doesn't work! He tries a few more times and eventually fits the square block into the square hole. Now, every time he sees a square block, he knows to fit it into the square hole!

This is a very simple idea of how Machine Learning works. Machine Learning is obviously more complicated but let us start simple.

In Machine Learning, instead of trying to define the rules and express them in a programming language, you provide the answers (typically called labels) along with the data, and the machine will infer the rules that determine the relationship between the answers and the data.

Machine Learning determines the "rules" by looking at the data and answers

The data and labels are used to train a Machine Learning algorithm, and the "rules" it learns are typically called a model.

Using this model, when the machine gets new data, it can predict the correct label for it.

When the model receives new data, it predicts the answer by looking at the rules

For example, if we train the model with labelled images of Cats and Dogs, the model would be able to predict when a new image is shown whether it is a Cat or a Dog.

Learning by experience

Now that we have a basic understanding, let us get coding!

Creating your first Machine Learning Model

Consider the following set of numbers. Are you able to figure out the relationship between them?


          X: -1 0 1 2 3 4 5 
          Y: -1 1 3 5 7 9 11
          

As you look at them you might notice that the X value is increasing by 1 as you read left to right, and the corresponding Y value is increasing by 2. So you probably think Y=2X plus or minus something. Then you'd probably look at the zero on X and see that Y = 1, and you'd come up with the relationship Y=2X+1.

Now, if we are given a value of 6 for X, we can accurately predict the value of Y to be 2*6 + 1 = 13.
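The reasoning above can be written out in a few lines of plain Python. This is only a sketch of the by-hand approach and is not Machine Learning yet. The names slopes, slope, intercept and predict are chosen here for illustration.

```python
# The seven data points from the table above.
xs = [-1, 0, 1, 2, 3, 4, 5]
ys = [-1, 1, 3, 5, 7, 9, 11]

# How much does Y change for each one-step change in X?
slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
slope = slopes[0]            # every step gives the same value for this data

# The value of Y at X = 0 is the "plus or minus something" part.
intercept = ys[xs.index(0)]

def predict(x):
    """Apply the rule we worked out by hand: Y = slope * X + intercept."""
    return slope * x + intercept

print(slopes)      # six identical slopes
print(predict(6))  # the prediction for X = 6
```

Here we worked out the rule ourselves and typed it in. In the rest of the article the computer has to find the slope and intercept on its own.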

It must have been pretty easy for you to figure this out. Now let's try to get a computer to figure it out. Put on your coding hats because we are about to code our first Machine Learning model!

Setting up the environment

We will use Google Colab for writing our code.

So what is Google Colab?

It’s an incredible online browser-based platform that allows us to train our models on machines for free! Sounds too good to be true, but thanks to Google, we can now work with large datasets, build complex models, and even share our work seamlessly with others.

Google Colab landing page: click on New notebook to get started

This is where we will train and use our model. You need a Google Account to use Colab. Once you have one, create a new notebook. Voila! You have your first notebook.

Your first notebook. Feel free to look around and try figuring things out!

Check out this tutorial for learning how to use Google Colab if you have not used it before.

Now we write code for real!

Let's get coding!

The complete notebook is available here and here (GitHub).

Imports

We are importing TensorFlow and calling it tf for ease of use.

Next we import a library called numpy, which helps us represent our data as arrays easily and quickly.

The framework for defining a model as a set of sequential layers is called keras, so we import that too.

            
import tensorflow as tf
import numpy as np
from tensorflow import keras
            
          
Create the data set

As shown earlier, we have 7 Xs and 7 Ys and we found the relationship between them to be Y = 2X + 1

A Python library called numpy provides array data structures that are the de facto standard for holding numerical data. We use them by passing our values as a list to np.array().

              
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0], dtype=float)
ys = np.array([-1.0, 1.0, 3.0, 5.0, 7.0, 9.0, 11.0], dtype=float)
              
            
Define the model

Next we will create the simplest possible neural network. It has 1 layer, and that layer has 1 neuron, and the input shape to it is just 1 value.


model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
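To make "1 layer with 1 neuron" a bit more concrete: a Dense layer with one unit and one input multiplies the input by a weight and adds a bias. The plain-Python sketch below only illustrates that calculation. The function name neuron and the hand-picked values of w and b are assumptions for illustration, since Keras picks its own starting values and adjusts them during training.

```python
def neuron(x, w, b):
    """Output of a single neuron with one input: weight * input + bias."""
    return w * x + b

# The weight and bias we hope the model ends up with (Y = 2X + 1).
print(neuron(6.0, w=2.0, b=1.0))

# An arbitrary, untrained pair of values gives a very different answer.
print(neuron(6.0, w=10.0, b=10.0))
```

Training the model means finding values of w and b that make this output match the known Ys as closely as possible.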
            

You already know that the relationship between the numbers is y=2x+1.

When the computer is trying to 'learn' that, it makes a guess...maybe y=10x+10. The loss function compares the guessed answers against the known correct answers and measures how well or how badly the guess did.

Next, the model uses the optimiser function to make another guess. Based on the loss function's result, it will try to minimise the loss. At this point maybe it will come up with something like y=5x+5. While this is still pretty bad, it's closer to the correct result (i.e. the loss is lower).
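To see what "the loss is lower" means in numbers, here is a small plain-Python sketch that scores the two guesses above with mean squared error, the loss this article uses further down. The helper name mean_squared_error and its (w, b) arguments are chosen for this illustration and are not the Keras API.

```python
# The training data from earlier in the article.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [-1.0, 1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

def mean_squared_error(w, b):
    """Average squared difference between the guesses w*x + b and the known Ys."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

print(mean_squared_error(10.0, 10.0))  # first guess, y = 10x + 10 -> 881.0
print(mean_squared_error(5.0, 5.0))    # second guess, y = 5x + 5  -> 136.0
print(mean_squared_error(2.0, 1.0))    # the true rule, y = 2x + 1 -> 0.0
```

The second guess is still far off, but its loss is much smaller than the first, and the true relationship has a loss of zero.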

The model repeats this for a set number of epochs, which you will see shortly.

But first, here's how we tell it to use mean squared error for the loss and stochastic gradient descent (sgd) for the optimiser. You don't need to understand the maths for these yet, but you can see that they work! :)

Over time you will learn the different and appropriate loss and optimiser functions for different scenarios.

              
model.compile(optimizer='sgd', loss='mean_squared_error')

Training the model

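Training is the process in which the model learns, as described above. The model.fit function fits the model to the training data we created. Calling model.fit(xs, ys, epochs=500) repeats the guess, measure and adjust loop for 500 epochs, which matches the epoch count in the output below.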

When you run this code, you'll see the loss will be printed out for each epoch.


  Epoch 1/500
  1/1 [==============================] - 0s 3ms/step - loss: 59.7111
  Epoch 2/500
  1/1 [==============================] - 0s 2ms/step - loss: 41.0885
  Epoch 3/500
  1/1 [==============================] - 0s 2ms/step - loss: 28.2783

You can see that for the first few epochs, the loss value is quite large and with each step it is getting smaller.
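If you are curious what happens during those epochs, the loop can be sketched in plain Python. This is a simplified illustration using ordinary gradient descent on all 7 points at once, not what Keras does internally. The starting values of w and b and the learning_rate of 0.01 are assumptions made for this sketch, so the numbers will not match the Keras output above.

```python
# The training data from earlier in the article.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [-1.0, 1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

w, b = 0.0, 0.0        # starting guess (assumed; Keras picks its own)
learning_rate = 0.01   # size of each adjustment (assumed for this sketch)
n = len(xs)
losses = []

for epoch in range(500):
    # 1. Guess: compare the current predictions with the known answers.
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # 2. Measure: mean squared error of this guess.
    losses.append(sum(e * e for e in errors) / n)
    # 3. Adjust: nudge w and b in the direction that lowers the loss.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2 * sum(errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(losses[0], losses[-1])  # loss at the first and last epoch
print(w, b)                   # should end up close to 2 and 1
print(w * 8.0 + b)            # prediction for X = 8
```

Just like in the Keras output, the loss starts large and shrinks with each epoch, and the learned values end up close to, but not exactly, 2 and 1.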

Using the trained model to make predictions

Finally! Our model has been trained and is ready to face the real world. Let's see how well our model can predict the value of Y given X.

We use the model.predict method to have it figure out Y for any value of X.

So, if we take the value of X as 8, then we know the value of Y is 2*8 + 1 = 17. Let's see if our model can get it right.

              
print(model.predict(np.array([8.0])))  # pass a numpy array; newer Keras versions may reject a plain list

[[17.00325]]
            

That ended up a little higher than the value we would expect.

Machine Learning deals with probabilities. Given the data that we fed the model, it calculated that there is a very high probability that the relationship between X and Y is Y=2X+1, but with only 7 data points it can't know for sure. As a result, the prediction for 8 is very close to 17, but not necessarily exactly 17.

That is it! You have covered the core concepts of Machine Learning that you will use in many different scenarios.

The steps we have used here are the same ones you would follow when building complex models. Happy coding!