
# Normal Equation Implementation From Scratch In Python

Part - 1 - Introduction

The normal equation is a closed-form solution for finding the value of the parameter vector θ that minimizes the cost function. It's called a closed-form solution in the sense that it gives the result directly from a single equation, with no iterative optimization.

θ̂ = (XᵀX)⁻¹ Xᵀ y

Here,
• θ̂, the left-hand side of the equation, is the value of θ that minimizes the cost function
• X is the matrix of feature values, with a bias column of 1s prepended to each sample
• y is a vector containing our target values or labels
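Where does this come from? A quick sketch (my addition; this is the standard least-squares derivation): write the MSE cost, set its gradient with respect to θ to zero, and solve:

```latex
J(\theta) = \tfrac{1}{m}\,\lVert X\theta - y\rVert^{2}
\;\Rightarrow\;
\nabla_{\theta} J = \tfrac{2}{m}\, X^{\top}(X\theta - y) = 0
\;\Rightarrow\;
X^{\top}X\,\theta = X^{\top}y
\;\Rightarrow\;
\hat{\theta} = (X^{\top}X)^{-1}\, X^{\top} y
```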
Part - 2 - Data generation

Now we need data on which we can apply the normal equation, so let's create some random data. We need to make sure the points don't all fall on a perfectly straight line, so we add some Gaussian noise to scatter them.
```python
import numpy as np

def generate_data():
    X = 2 * np.random.rand(100, 1)            # 100 samples, uniform on [0, 2)
    y = 4 + 3 * X + np.random.randn(100, 1)   # standard Gaussian noise

    # y = 4 + 3X + Gaussian noise
    # theta_0 or bias term = 4
    # theta_1 = 3

    return X, y
```
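One practical note (an addition, not from the original post): the data is random, so every run produces slightly different numbers. If you want reproducible output, a minimal sketch is to fix NumPy's random seed before generating:

```python
np.random.seed(42)  # any fixed integer works; makes generate_data() reproducible
X, y = generate_data()
```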
Now that our data is created, it is time to learn our parameters from this dataset. In other words, we want the value of θ that minimizes the cost function.

Part - 3 - Getting the best θ

So how do we find the best θ? We will use the normal equation.

```python
from numpy.linalg import inv

def get_best_param(X, y):
    X_transpose = X.T
    best_params = inv(X_transpose.dot(X)).dot(X_transpose).dot(y)

    # normal equation:
    # theta_best = (X.T * X)^(-1) * X.T * y

    return best_params  # an ndarray of shape (n_features, 1)
```
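A side note (my addition, not part of the original post): inverting XᵀX fails when the matrix is singular and can be numerically shaky when it is ill-conditioned. A more robust sketch uses np.linalg.lstsq, which solves the same least-squares problem via SVD without forming the inverse explicitly:

```python
def get_best_param_lstsq(X, y):
    # Solves min ||X.dot(theta) - y||^2 directly; works even when
    # X.T.dot(X) is singular or nearly so.
    best_params, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
    return best_params
```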
Part - 4 - Testing

We have our data, we have θ, so let's test things!
```python
%matplotlib inline
import matplotlib.pyplot as plt

X, y = generate_data()
plt.plot(X, y, "r.")
```

Had we not chosen to add some noise, this data would've been a dead straight line, and then we wouldn't need fancy machine learning; simple math would do. [ Apparently, we're not here for that ]
```python
X_b = np.c_[np.ones((100, 1)), X]  # set bias term to 1 for each sample
params = get_best_param(X_b, y)
params
```
```
array([[ 3.94204397],
       [ 3.08256508]])
```
Well, the expected values would've been 4 and 3, but we added noise, so some deviation is acceptable. [ I'm not saying this is perfect! ]
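As an optional sanity check (an addition, not from the original post), scikit-learn's LinearRegression solves the same least-squares problem, so it should recover essentially the same numbers:

```python
from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()
lin_reg.fit(X, y)  # note: raw X here; sklearn adds the bias term itself
print(lin_reg.intercept_, lin_reg.coef_)  # should be close to [3.942] and [[3.083]]
```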

So how do we get the prediction? Well, another equation:

ŷ = h_θ(x) = θᵀ x

Here,
• ŷ, the left-hand side of the equation, is the predicted output
• x is the feature vector of a new instance, with the bias term 1 prepended
• θ is the parameter vector we just computed
```python
# test prediction at x = 0 and x = 2
test_X = np.array([[0], [2]])
test_X_b = np.c_[np.ones((2, 1)), test_X]

prediction = test_X_b.dot(params)

# y_hat = h_theta(x) = theta.T * x
prediction
```
```
array([[ 3.94204397],
       [10.10717414]])
```
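A quick hand check of those numbers: at x = 0 the prediction is just the bias term, 3.94204397, and at x = 2 it is 3.94204397 + 2 × 3.08256508 ≈ 10.10717414, matching the output above.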
Okay, numbers, numbers all around. So let's plot the prediction and our dataset in a single graph and see how the model fits the dataset for a regression problem.
```python
plt.plot(test_X, prediction, "r--")
plt.plot(X, y, "b.")
plt.axis([0, 2, 0, 15])  # x axis range 0 to 2, y axis range 0 to 15
plt.show()
```

So, we implemented the normal equation and applied it to predict the output for a linear regression problem. One curious question: what if we hadn't added noise? How would the data plot look? Let's find out!

Extra - What if there's no noise?
```python
def generate_noiseless_data():
    X = 2 * np.random.rand(100, 1)
    y = 4 + 3 * X

    # y = 4 + 3X, no noise this time
    # theta_0 or bias term = 4
    # theta_1 = 3

    return X, y

X, y = generate_noiseless_data()
plt.plot(X, y, "r.")
```

As you can see, a dead straight line (almost!). Let's look at the parameters and the prediction.
```python
X_b = np.c_[np.ones((100, 1)), X]  # set bias term to 1 for each sample
param = get_best_param(X_b, y)
param
```
```
array([[ 4.],
       [ 3.]])
```
Well well, exactly 4 and 3. For such cases you don't need machine learning; just plugging values into the line equation does the job.
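For instance, for x = 2 the line gives y = 4 + 3 × 2 = 10 directly; no model needed when the relationship is exactly linear.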
```python
test_X = np.array([[0], [2]])
test_X_b = np.c_[np.ones((2, 1)), test_X]

prediction = test_X_b.dot(param)
plt.plot(test_X, prediction, "r--")
plt.plot(X, y, "b.")
plt.axis([0, 2, 0, 15])  # x axis range 0 to 2, y axis range 0 to 15
plt.show()
```

Nothing to separate them, really: the prediction line passes right through the data!
