Gradient Descent Optimization

Introduction

Gradient descent is a popular optimization algorithm used in machine learning to minimize a loss function by updating the model parameters. It iteratively adjusts the parameters in the direction of the negative gradient of the loss, converging toward values that minimize it. Here, we'll optimize the coefficients of a linear model to fit a given dataset.
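The core update rule — step against the gradient, scaled by a learning rate — can be sketched in a few lines. This is a generic illustration on a one-dimensional function of our own choosing, not tied to the linear model below:

```python
# Minimal gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.
def grad_f(x):
    return 2 * (x - 3)  # derivative of (x - 3)^2

x = 0.0              # starting point
learning_rate = 0.1

for _ in range(100):
    x -= learning_rate * grad_f(x)  # step in the negative gradient direction

print(x)  # approaches 3, the minimizer
```

Each iteration shrinks the distance to the minimum by a constant factor here, so after 100 steps x is essentially 3.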

Implementation

import numpy as np

# Generate some example data

# X represents the input features, y represents the target values
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Add a bias term to X
# X_b is the input matrix with an additional bias (intercept) column.
# We initialize the parameters theta randomly, then refine them with
# gradient descent iterations.
X_b = np.c_[np.ones((100, 1)), X]

# Initialize model parameters
theta = np.random.randn(2, 1)

# Hyperparameters

learning_rate = 0.1
n_iterations = 1000

# Gradient Descent

for iteration in range(n_iterations):
    # Gradient of the mean squared error with respect to theta
    gradients = 2 / 100 * X_b.T.dot(X_b.dot(theta) - y)
    theta -= learning_rate * gradients

# Optimal parameters

intercept, slope = theta
print("Intercept:", intercept[0])
print("Slope:", slope[0])
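As a sanity check, the same coefficients can be recovered in closed form with the normal equation. This sketch regenerates the same data and solves directly, so gradient descent's answer can be compared against it:

```python
import numpy as np

# Recreate the same dataset
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]

# Closed-form least-squares solution: theta = (X^T X)^{-1} X^T y
theta_exact = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print("Intercept:", theta_exact[0, 0])
print("Slope:", theta_exact[1, 0])
```

Both approaches should land near the true values (intercept 4, slope 3), differing only by the noise in the data.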

Gradient Descent in JavaScript

Libraries like TensorFlow, PyTorch, and scikit-learn provide built-in optimizers that handle gradient descent and related algorithms for you. The effectiveness of gradient descent depends on factors such as the learning rate, the batch size (for mini-batch gradient descent), and the choice of optimization algorithm, so experimentation and tuning are often required to find the best settings for your specific problem. Suppose we want to use gradient descent to minimize the simple quadratic function f(x) = x^2 + 5x + 6, whose minimum lies at x = -2.5. We can find that value in JavaScript:

// Define the quadratic function and its gradient

function f(x) {
    return x ** 2 + 5 * x + 6;
}

function gradient(x) {
    return 2 * x + 5;
}

// Gradient Descent parameters
let learningRate = 0.1;
let iterations = 50;
let currentX = -10; // Starting point

// Gradient Descent algorithm
for (let i = 0; i < iterations; i++) {
    let gradientValue = gradient(currentX);
    currentX -= learningRate * gradientValue;
}

console.log("Optimal x:", currentX);
console.log("Optimal f(x):", f(currentX));
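The mini-batch variant mentioned above replaces the full-dataset gradient with gradients computed on small random subsets. Returning to the NumPy linear-regression setup, a minimal sketch (the batch size and learning rate here are arbitrary choices for illustration):

```python
import numpy as np

# Same synthetic dataset as before
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]

theta = np.random.randn(2, 1)
learning_rate = 0.05
n_epochs = 50
batch_size = 20

for epoch in range(n_epochs):
    indices = np.random.permutation(100)      # shuffle each epoch
    for start in range(0, 100, batch_size):
        batch = indices[start:start + batch_size]
        X_batch, y_batch = X_b[batch], y[batch]
        # Gradient of the MSE over this mini-batch only
        gradients = 2 / batch_size * X_batch.T.dot(X_batch.dot(theta) - y_batch)
        theta -= learning_rate * gradients

print("Intercept:", theta[0, 0])
print("Slope:", theta[1, 0])
```

The noisier per-step gradients trade some precision for cheaper updates, which matters when the dataset is too large to process in full at every step.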

Conclusion

Here we built a simple linear regression model in Python and minimized a quadratic function in JavaScript, implementing gradient descent in each case to find the optimal parameters: the slope and intercept of the best-fit line in the first example, and the minimizing x in the second.
