This article shows how a point on the loss surface corresponds to a regression line through the data. The running example is a simple linear regression between two variables: sunshine (in hours) and attendance (in thousands).

```
# Import libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('sunshine.csv')
# Check the data
dataset.head()
# Check correlation between dependent and independent variables
dataset.corr()
# Assign columns to X and y
X = dataset.iloc[:, [0]].values
y = dataset.iloc[:, 1].values
print(X.shape)
print(y.shape)
# Check the scatter plot
plt.scatter(X, y)
plt.xlabel("Sunshine in hrs")
plt.ylabel("Attendance in '000s")
plt.title("Sunshine vs Attendance")
plt.show()
# Create LinearRegression model
from sklearn.linear_model import LinearRegression
# Create linear regression object
model = LinearRegression()
model.fit(X, y)
print(model.coef_)
print(model.intercept_)
# Draw the predicted line
plt.scatter(X, y)
plt.plot(X, model.predict(X))
plt.xlabel("Sunshine in hrs")
plt.ylabel("Attendance in '000s")
plt.title("Sunshine vs Attendance")
plt.show()
```

Now the best-fit line has a loss, defined as the sum of squared errors, i.e. the L2 loss:

min Σ(actual y − predicted y)²
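To make the formula concrete, here is a quick numeric check with hypothetical values (not taken from sunshine.csv):

```python
import numpy as np

# Hypothetical actual and predicted values, not from the sunshine data
y_actual = np.array([20.0, 25.0, 30.0])
y_pred = np.array([22.0, 24.0, 29.0])

# L2 loss: sum of squared errors
l2_loss = np.sum((y_actual - y_pred) ** 2)
print(l2_loss)  # (−2)² + 1² + 1² = 6.0
```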

So for the fitted coefficient of 5.45 we can compute this loss.

Let's plot the loss against the coefficient, side by side with our regression line:

```
# Predict with the fitted model, then compute the L2 loss
ypred = model.predict(X)
loss = np.sum((y - ypred) ** 2)
plt.scatter(model.coef_, loss)
plt.xlabel('w')
plt.ylabel('loss')
plt.show()
```

Now, let's vary the coefficient from 2.5 to 9 and plot the different lines that we get.

So for each coefficient you get a line and a corresponding loss: each loss point in the left-hand figure corresponds to a regression line in the right-hand figure. We have ignored the bias/intercept so far in this visualization.
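The coefficient sweep described above can be sketched as follows. Since the sunshine dataset isn't reproduced here, synthetic stand-in data (hypothetical values) plays the role of X and y, and the bias is ignored as in the text:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the sunshine data
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(2, 12, size=30))       # sunshine in hours
y = 5.45 * X + 15 + rng.normal(0, 3, size=30)  # attendance in '000s

# One loss value per candidate coefficient (bias ignored, as in the text)
coefs = np.arange(2.5, 9.0, 0.5)
losses = [np.sum((y - w * X) ** 2) for w in coefs]

fig, (ax_loss, ax_lines) = plt.subplots(1, 2, figsize=(10, 4))
ax_loss.plot(coefs, losses, 'o-')
ax_loss.set_xlabel('w')
ax_loss.set_ylabel('loss')
ax_lines.scatter(X, y)
for w in coefs:
    ax_lines.plot(X, w * X)  # each coefficient gives one candidate line
plt.show()
```

Each point on the left-hand loss curve corresponds to one of the candidate lines on the right.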

## Plotting L2 loss

Suppose we also vary the bias: we get the same pattern along that axis. The L2 loss function is quadratic in nature, hence we get a U-shaped curve, and a bowl-shaped surface when slope and bias vary together.

```
slope = np.arange(2.5, 7.5, 0.5)
bias = np.arange(13.2, 18, 0.5)
w0, w1 = np.meshgrid(slope, bias)
# For each (slope, bias) pair, sum the squared errors over all data points
ypred = w0[..., None] * X.ravel() + w1[..., None]
loss = np.sum((y - ypred) ** 2, axis=-1)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(w0, w1, loss, cmap='viridis', edgecolor='none')
ax.set_xlabel('Slope')
ax.set_ylabel('Bias')
ax.set_zlabel('Loss')
plt.show()
```

Geometrically, the L2 loss is a convex function, as shown above.

## Plotting L1 Loss

Similarly, you can plot the L1 loss, which is abs(y − ypred). There is no quadratic term here, so how does the geometry of this loss function look? It is V-shaped.
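A sketch of the L1 surface, under the same assumptions as before (synthetic stand-in data, since sunshine.csv isn't reproduced here):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the sunshine data
rng = np.random.default_rng(0)
X = rng.uniform(2, 12, size=30)
y = 5.45 * X + 15 + rng.normal(0, 3, size=30)

slope = np.arange(2.5, 7.5, 0.25)
bias = np.arange(13.2, 18, 0.25)
w0, w1 = np.meshgrid(slope, bias)

# Sum of absolute errors for every (slope, bias) pair
ypred = w0[..., None] * X + w1[..., None]   # shape (len(bias), len(slope), n)
l1_loss = np.sum(np.abs(y - ypred), axis=-1)

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(w0, w1, l1_loss, cmap='viridis', edgecolor='none')
ax.set_xlabel('Slope')
ax.set_ylabel('Bias')
ax.set_zlabel('L1 loss')
plt.show()
```

Instead of a smooth bowl, the faces of this surface meet along creases, which is the V-shape generalised to two parameters.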

You can visualize the other loss functions in the same way.

I have made a video on this topic and uploaded it here.

The code is also uploaded.