Machine Learning  

The ABCs of Machine Learning

AI… it’s one of those things that sounds super techy and, honestly, a bit confusing. Welcome to AI for Dummies, where I’ll walk you through the world of Artificial Intelligence without all the jargon and over-complicated stuff.

This is Part 2: The ABCs of Machine Learning

Back in Part 1: Layers of Artificial Intelligence, we talked about AI as a whole and the different levels that make it up. Now it’s time to dig into the top-most layer of them all: Machine Learning (ML).

Think of it like this: if AI is a car, then Machine Learning is the engine under the hood. It’s the part that gives power and makes a lot of today’s cool AI applications actually work.

1. What Is Machine Learning?

Machine learning is basically a branch of AI that helps computers get better at tasks by learning from data, kind of like how we humans learn from experience. Instead of us writing down every single rule for the computer to follow, we just feed it a bunch of examples, and the system figures out the rules on its own.

For example, let’s say we want a computer to recognize cats in pictures. Instead of programming it with step-by-step instructions like “look for whiskers, pointy ears, or tails,” we just give it thousands of cat photos. Over time, the algorithm learns the patterns that make a cat a cat, and then it can spot them on its own in new images.
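To make the “examples instead of rules” idea concrete, here’s a minimal sketch using scikit-learn. The fruit measurements are made-up toy numbers; the point is just that we hand over examples plus answers, and the model works out the rules itself.

```python
# Learning from examples instead of hand-written rules -- toy sketch.
from sklearn.tree import DecisionTreeClassifier

# Each example: [weight_in_grams, surface_smoothness_0_to_1] (made-up values).
examples = [[150, 0.9], [170, 0.8], [120, 0.2], [110, 0.3]]
labels   = ["apple", "apple", "orange", "orange"]   # the "right answers"

model = DecisionTreeClassifier()
model.fit(examples, labels)            # the model figures out the rules on its own

print(model.predict([[160, 0.85]]))    # -> ['apple'] for a fruit it has never seen
```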

[Image: Machine Learning process]

2. Training Data

Garbage In, Garbage Out

Say you want to bake a cake with spoiled eggs or expired milk: it doesn’t matter how good your recipe is, the cake’s going to taste awful. Machine Learning works the same way. If the data you feed it is messy, incomplete, or just plain wrong, the results will also be off.

[Image: Garbage in, garbage out]

That’s why people often say, “garbage in, garbage out.” No matter how advanced your algorithm is, if the input data is bad, the predictions will be bad too.
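As a tiny illustration of cleaning up the “garbage” before training, here’s a sketch with pandas on a made-up table that has a missing value and an impossible age.

```python
import pandas as pd

# Hypothetical raw data with typical "garbage": a missing value and an impossible age.
raw = pd.DataFrame({
    "age":    [34, None, 27, -5],
    "income": [52000, 48000, None, 61000],
})

clean = raw.dropna()                          # drop rows with missing values
clean = clean[clean["age"].between(0, 120)]   # drop obviously invalid ages
print(clean)                                  # only the sensible rows survive
```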

Labeled vs. Unlabeled Data

When training a Machine Learning model, you’ll run into different types of data. But before we jump into the details, it’s important to know that data usually comes in two main forms: labeled and unlabeled.

[Image: Labeled vs. unlabeled data]
  • Labeled Data: This is data that already comes with the “right answer” attached. It’s like flashcards: on one side you see the question (say, a picture of a fruit), and on the other side the answer is written (“apple,” “banana,” or “orange”). That way, the model knows exactly what each example is.

  • Unlabeled Data: Here, there are no answers provided. The model has to look for patterns on its own. For example, if I toss my entire wedding album into the training data without adding any tags, the algorithm won’t know what’s what. But it can still pick up on patterns like colors, shapes, or textures and group similar photos together, even without knowing who’s in them or what the event was. That way I could have my engagement album separated from my wedding album (there’s a tiny code sketch of this idea just below). Wish I had done that.

There’s also something in between called semi-supervised learning. Here, only part of the data is labeled. The algorithm uses the labeled examples to learn the basics, then applies that knowledge to make sense of the unlabeled data.
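To make the unlabeled case concrete, here’s a minimal sketch of grouping photos without any labels, using scikit-learn’s KMeans. The two-number feature vectors are completely made up; a real pipeline would extract features from the actual images first.

```python
# Grouping "photos" without labels (unsupervised clustering) -- toy sketch.
from sklearn.cluster import KMeans

# Made-up feature vectors; in reality these would be extracted from the images.
photo_features = [
    [0.90, 0.10], [0.85, 0.15],   # e.g. bright outdoor shots
    [0.20, 0.80], [0.25, 0.75],   # e.g. dim indoor shots
]

groups = KMeans(n_clusters=2, n_init=10).fit_predict(photo_features)
print(groups)  # e.g. [0 0 1 1] -- two "albums" found without any labels
```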

Structured vs. Unstructured Data

Not all data comes in the same shape and form; some of it’s nice and organized, while some of it’s a total mess - like the Crypto market, LOL.

[Image: Structured and unstructured data]
  • Structured Data: This is the neat stuff, usually organized into rows and columns like a spreadsheet. It’s the kind of data that classic ML algorithms love to work with.

  1. Tabular Data: A banking database with customer details, account number, balance, and transaction history.

  2. Time-Series Data: Fitness tracker readings, like your daily step count or heart rate over time.

  • Unstructured Data: This is the messy kind that doesn’t fit neatly into tables. It usually needs more advanced ML techniques to understand (there’s a quick code contrast after this list).

  1. Text Data: Tweets, product reviews, blog posts.

  2. Image Data: Photos, X-rays, or video frames.

  3. Audio Data: Music, podcasts, or voice recordings.
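To make the contrast concrete, here’s a tiny sketch comparing a structured table with a piece of unstructured text. The values are made up, and pandas is just one common way to hold tabular data.

```python
import pandas as pd

# Structured: rows and columns with a fixed schema, like a banking table (made-up values).
transactions = pd.DataFrame({
    "account":  ["A-101", "A-102"],
    "balance":  [2500.00, 730.50],
    "last_txn": ["2024-01-03", "2024-01-05"],
})

# Unstructured: free-form text with no schema at all.
review = "Loved the battery life, but the screen scratches way too easily."

print(transactions.dtypes)   # every column has a clear type
print(len(review.split()))   # text needs extra processing before a model can use it
```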

3. How Machines Learn

Once your data is prepped, the next step is running it through algorithms. Most ML approaches fall into three main buckets, plus a hybrid of the first two (there’s a small code sketch after the list):

[Image: How machines learn]
  • Supervised Learning: You train the model with labeled data, basically, examples where both the input and the correct answer are already known. The goal is for the model to learn from those examples so it can take new, unseen data and predict the right output.

  • Unsupervised Learning: No labels are given. No answers attached. Its job is to dig through the data and uncover hidden patterns, structures, or relationships.

  • Reinforcement Learning: The model learns through trial and error, getting “rewards” for good moves and “penalties” for bad ones. Like training a pet, it tries things, learns from the “good job” or “nope,” and gradually improves its decision-making.

  • Semi-Supervised Learning: This is a mix of the supervised and unsupervised worlds. Only part of the training data is labeled, and the algorithm uses those labeled examples to help make sense of the larger pool of unlabeled data.
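Here’s a rough sketch contrasting the first two buckets with scikit-learn. The numbers are toy values, and reinforcement learning is left out because it needs an environment to interact with, which doesn’t fit in a few lines.

```python
# Supervised vs. unsupervised learning on the same toy inputs.
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = [[1.0], [1.2], [3.8], [4.1]]   # inputs (made-up values)
y = [0, 0, 1, 1]                   # labels, only used in the supervised case

supervised = LogisticRegression().fit(X, y)            # learns input -> answer
unsupervised = KMeans(n_clusters=2, n_init=10).fit(X)  # finds groups on its own

print(supervised.predict([[1.1]]))   # -> [0], a real prediction
print(unsupervised.labels_)          # e.g. [0 0 1 1], cluster ids with no "answers" attached
```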

4. Inferencing: Putting the Model to Work

Once your model is trained, it’s ready to use what it’s learned. This step is called inferencing, basically just a fancy word for making predictions.

[Image: Inferencing]

There are two common ways to do it:

Batch Inferencing: Here, the model processes a big chunk of data all at once. For example, analyzing thousands of medical scans overnight to flag potential issues. Batch is great when accuracy matters more than speed, since you’re not in a rush for instant answers.

Real-Time Inferencing: In this case, the model makes decisions on the fly as new data comes in. Think of a fraud detection system spotting a suspicious credit card transaction instantly, or a self-driving car deciding when to hit the brakes. Here, speed is critical.

Both methods are valuable; it just depends on whether you care more about depth (batch) or speed (real-time).
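As a rough sketch, here’s what the two styles can look like in code, reusing a tiny scikit-learn model trained on made-up numbers.

```python
# Batch vs. real-time inferencing with a toy model.
from sklearn.linear_model import LogisticRegression

model = LogisticRegression().fit([[0.1], [0.2], [0.8], [0.9]], [0, 0, 1, 1])

# Batch inferencing: score a whole pile of records at once (think overnight job).
overnight_batch = [[0.15], [0.40], [0.85]]
print(model.predict(overnight_batch))        # -> predictions for every record

# Real-time inferencing: score one record the moment it arrives.
def score_transaction(features):
    return model.predict([features])[0]

print(score_transaction([0.92]))             # e.g. flag a single suspicious transaction
```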

Time to put my cloud knowledge to the test! Here are the ML-related services offered by AWS, Azure, and Google Cloud as of this article’s publishing.

1. Data Preparation & Management

| Service | AWS | Azure | Google Cloud |
| --- | --- | --- | --- |
| Data Storage | Amazon S3 | Azure Blob Storage | Google Cloud Storage |
| Data Labeling | Amazon SageMaker Ground Truth | Azure Machine Learning Data Labeling | Google Cloud Data Labeling Service |
| Data Processing | AWS Glue | Azure Data Factory | Google Cloud Dataflow |

2. Model Training

| Service | AWS | Azure | Google Cloud |
| --- | --- | --- | --- |
| Managed ML Platform | Amazon SageMaker | Azure Machine Learning | Vertex AI |
| Training Infrastructure | EC2 Instances, SageMaker Training | Azure ML Compute | Google Cloud AI Platform Training |
| Pre-built Models | SageMaker JumpStart | Azure AI Gallery | Vertex AI Workbench |

3. Model Evaluation & Tuning

| Service | AWS | Azure | Google Cloud |
| --- | --- | --- | --- |
| Hyperparameter Tuning | SageMaker Automatic Model Tuning | Azure HyperDrive | Vertex AI Hyperparameter Tuning |
| Model Explainability | SageMaker Clarify | Azure ML Interpretability | Vertex AI Explainable AI |

4. Model Deployment & Inferencing

| Service | AWS | Azure | Google Cloud |
| --- | --- | --- | --- |
| Real-Time Inference | SageMaker Endpoints | Azure ML Endpoints | Vertex AI Endpoints |
| Batch Inference | SageMaker Batch Transform | Azure ML Batch Inference | Vertex AI Batch Prediction |
| Edge Deployment | SageMaker Neo | Azure IoT Edge | Vertex AI Edge |
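To give a feel for how a deployed model actually gets called, here’s a minimal sketch of real-time inference against an Amazon SageMaker endpoint using the AWS SDK for Python (boto3). The endpoint name and the input fields are hypothetical, and it assumes a JSON-accepting model has already been deployed.

```python
import json
import boto3

# Call an already-deployed SageMaker endpoint (endpoint name and payload are hypothetical).
runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="fraud-detector-endpoint",          # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"amount": 129.99, "country": "US"}),
)

# The response body is a stream containing the model's prediction.
prediction = json.loads(response["Body"].read())
print(prediction)
```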

5. Model Monitoring & Management

| Service | AWS | Azure | Google Cloud |
| --- | --- | --- | --- |
| Model Monitoring | SageMaker Model Monitor | Azure ML Model Monitoring | Vertex AI Model Monitoring |
| Drift Detection | SageMaker Clarify | Azure ML Data Drift | Vertex AI Data Drift |

Summary

That’s Machine Learning in a nutshell: teaching computers with examples instead of nagging them with rules.

In the next article, we’ll peel back another layer and talk about Deep Learning, the part of AI that gives us things like facial recognition, voice assistants, and those “how did Netflix know I’d watch this?” moments.