Machine Learning  

First Machine Learning Project in Python: Salary Prediction Using Linear Regression

Introduction

After setting up Python and VS Code correctly, the best way to start AI/ML is to build a small working project. In this article, I will build a beginner-friendly Machine Learning project to predict salary from years of experience.

Problem Statement

I want to predict salary based on experience using a simple ML algorithm.

What We Will Build

We will build a program that:

  • Loads dataset from CSV

  • Splits data into training/testing

  • Trains a Linear Regression model

  • Predicts salary for user input experience

Prerequisites

  • Python installed

  • VS Code configured

  • Virtual environment activated

  • Packages installed:

pip install pandas scikit-learn

Step-by-Step Setup

Step 1: Project Structure

Use this structure:

02_linear_regression/salary_prediction/
├── data/
│   └── data.csv
└── src/
    └── salary_predict.py

Step 2: Create Dataset File

Create data/data.csv

Experience,Salary
1,35000
2,40000
3,50000
4,60000
5,65000
6,70000
7,80000
8,85000
9,90000
10,100000

Code

Create src/salary_predict.py

import os
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Get correct CSV path (works from anywhere)
base_dir = os.path.dirname(__file__)
csv_path = os.path.join(base_dir, "..", "data", "data.csv")

# Load dataset
df = pd.read_csv(csv_path)

# Input (X) and Output (y)
X = df[["Experience"]]
y = df["Salary"]

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=420
)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# User input
experience = float(input("Enter years of experience: "))

# Predict
input_data = pd.DataFrame({"Experience": [experience]})
predicted_salary = model.predict(input_data)

print(f"Predicted salary for {experience} years experience = {predicted_salary[0]:.2f}")

Output

Run:

python src/salary_predict.py

Example:

Enter years of experience: 11
Predicted salary for 11.0 years experience = 106978.30

GitHub Repo Link

https://github.com/pramodsingh-ai/ai-ml-python-practicals

Summary

In this article, we built a complete beginner ML project using Linear Regression. This helped us understand:

  • Dataset loading

  • Train/Test split

  • Model training

  • Prediction