
Building Real-World Data Models in Python: From Structs to Smart Classes

Table of Contents

  • Introduction

  • Why Python Classes Replace C-Style Structs

  • Data Classes: The Modern Way to Model Data

  • Real-World Scenario: Managing IoT Sensor Readings in Smart Agriculture

  • Implementation with Error Handling and Validation

  • Best Practices for Custom Data Structures

  • Conclusion

Introduction

In C and C++, developers often use struct (frequently with typedef in C) to bundle related data into a single unit, such as a SensorReading containing temperature, humidity, and timestamp. Python has no struct keyword, but it offers something more flexible: classes. Since Python 3.7, data classes make this pattern cleaner and easier to maintain.

This article shows how to model real-world data with Python classes, using an example from smart agriculture, where every sensor byte counts.

Why Python Classes Replace C-Style Structs

C structs are passive containers; Python classes are extensible and support methods, validation, and inheritance. You can:

  • Bundle data and behavior

  • Enforce data integrity

  • Add logging, conversion, or serialization logic

  • Easily integrate with modern frameworks (FastAPI, Pydantic, ORMs)

For pure data containers, data classes eliminate boilerplate while keeping all these advantages.
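As a minimal sketch of "bundling data and behavior", the hand-written class below keeps a reading together with a conversion method and a basic integrity check. The class name SoilProbe and its attributes are hypothetical and exist only for illustration.

```python
class SoilProbe:
    """Hypothetical example class: stores a reading and acts on it."""

    def __init__(self, device_id: str, temperature_c: float):
        # Enforce a basic integrity rule at construction time.
        if not device_id:
            raise ValueError("device_id must be non-empty")
        self.device_id = device_id
        self.temperature_c = temperature_c

    def temperature_f(self) -> float:
        """Convert the stored Celsius value to Fahrenheit."""
        return self.temperature_c * 9 / 5 + 32


probe = SoilProbe("FIELD-A-042", 20.0)
print(probe.temperature_f())  # 68.0
```

A C struct would need separate free functions for both the check and the conversion; here they travel with the data.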

Data Classes: The Modern Way to Model Data

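Data classes arrived in Python 3.7 with the dataclasses module. By default, the @dataclass decorator generates __init__, __repr__, and __eq__ from the annotated fields; ordering methods and immutability are opt-in through decorator arguments: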

from dataclasses import dataclass
from datetime import datetime

@dataclass
class SensorReading:
    device_id: str
    temperature: float
    humidity: float
    timestamp: datetime

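That’s it: no manual __init__ needed. You get a clean, readable data structure with type hints. Keep in mind that Python does not enforce these annotations at runtime; they document intent and support static checkers, so values still need explicit validation.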
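The short demonstration below exercises the generated methods on the class just defined. It also shows that a value of the wrong type is accepted silently, because field annotations are not checked when the instance is created. The sample values are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SensorReading:
    device_id: str
    temperature: float
    humidity: float
    timestamp: datetime


ts = datetime(2024, 6, 15, 14, 30)
a = SensorReading("FIELD-A-042", 21.5, 40.0, ts)
b = SensorReading("FIELD-A-042", 21.5, 40.0, ts)

print(a)       # generated __repr__ lists every field by name
print(a == b)  # True: generated __eq__ compares field by field

# Annotations are not enforced: a string is accepted for a float field.
c = SensorReading("FIELD-A-042", "hot", 40.0, ts)
print(type(c.temperature).__name__)  # str
```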

Real-World Scenario: Managing IoT Sensor Readings in Smart Agriculture

Problem: A farm uses 500+ soil sensors across fields. Each sensor reports:

  • Device ID (e.g., "FIELD-A-042")

  • Soil temperature (°C)

  • Moisture level (%), stored in the humidity field in the code

  • Timestamp (UTC)

These readings stream into a backend every 10 minutes. Engineers need to:

  • Validate incoming data

  • Reject outliers (e.g., humidity > 100%)

  • Serialize to JSON for dashboards

  • Compare readings for anomaly detection

A plain dictionary won’t cut it: it accepts misspelled keys, missing fields, and out-of-range values without complaint. We need a robust, self-validating data structure.
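To make the problem with dictionaries concrete, here is a small sketch with deliberately bad sample data. The misspelled key and the 140% value are invented for the example.

```python
reading = {
    "device_id": "FIELD-A-042",
    "temperature": 21.5,
    "humidty": 140.0,  # misspelled key AND an impossible value
    "timestamp": "2024-06-15T14:30:00",
}

# Nothing complained when the dict was built. The problem only
# surfaces later, at whichever line first asks for the right key.
try:
    moisture = reading["humidity"]
except KeyError as exc:
    print(f"Missing key discovered late: {exc}")
```

The failure shows up far from where the bad data entered the system, which is what a self-validating class is meant to prevent.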

Implementation with Error Handling and Validation

Here’s a SensorReading class that combines a data class with custom validation in __post_init__:

from dataclasses import dataclass
from datetime import datetime
import json

@dataclass
class SensorReading:
    device_id: str
    temperature: float
    humidity: float
    timestamp: datetime

    def __post_init__(self):
        # Validate after initialization
        if not self.device_id or not isinstance(self.device_id, str):
            raise ValueError("device_id must be a non-empty string")
        if not (-50 <= self.temperature <= 80):
            raise ValueError("Temperature out of realistic range [-50°C, 80°C]")
        if not (0 <= self.humidity <= 100):
            raise ValueError("Humidity must be between 0% and 100%")
        if not isinstance(self.timestamp, datetime):
            raise TypeError("timestamp must be a datetime object")

    def to_json(self) -> str:
        """Serialize to JSON-compatible format"""
        return json.dumps({
            "device_id": self.device_id,
            "temperature": round(self.temperature, 2),
            "humidity": round(self.humidity, 2),
            "timestamp": self.timestamp.isoformat()
        })

    @classmethod
    def from_dict(cls, data: dict):
        """Create instance from dictionary (e.g., from API or MQTT)"""
        return cls(
            device_id=data["device_id"],
            temperature=float(data["temperature"]),
            humidity=float(data["humidity"]),
            timestamp=datetime.fromisoformat(data["timestamp"])
        )

Usage in Action

# Simulate incoming sensor data
raw_data = {
    "device_id": "FIELD-B-117",
    "temperature": 23.4,
    "humidity": 68.2,
    "timestamp": "2024-06-15T14:30:00"
}

# Safe creation with validation
try:
    reading = SensorReading.from_dict(raw_data)
    print(reading)  
    # SensorReading(device_id='FIELD-B-117', temperature=23.4, humidity=68.2, timestamp=datetime.datetime(2024, 6, 15, 14, 30))

    # Send to dashboard
    json_payload = reading.to_json()
    print(json_payload)
    # {"device_id": "FIELD-B-117", "temperature": 23.4, "humidity": 68.2, "timestamp": "2024-06-15T14:30:00"}

except (ValueError, KeyError, TypeError) as e:
    print(f"Invalid sensor data: {e}")

This approach prevents bad data from corrupting analytics pipelines—critical in precision agriculture where irrigation decisions depend on accurate moisture levels.
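One detail worth noting: from_dict relies on datetime.fromisoformat, and the sample timestamp carries no offset, so the resulting datetime is naive even though the scenario specifies UTC. The sketch below shows one way to normalize timestamps to timezone-aware UTC using only the standard library. The helper name parse_utc, and the choice to treat offset-less input as UTC, are assumptions rather than part of the class above.

```python
from datetime import datetime, timezone


def parse_utc(text: str) -> datetime:
    """Hypothetical helper: parse ISO 8601 text and normalize to UTC.

    Inputs without an offset are assumed to already be in UTC.
    """
    # Python versions before 3.11 do not accept a trailing "Z".
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_utc("2024-06-15T14:30:00Z").isoformat())
# 2024-06-15T14:30:00+00:00
print(parse_utc("2024-06-15T16:30:00+02:00").isoformat())
# 2024-06-15T14:30:00+00:00
```

Aware timestamps avoid ambiguity when sensors or gateways report local time with an offset.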

Complete Code

from dataclasses import dataclass
from datetime import datetime
import json
from typing import Dict, Any

@dataclass
class SensorReading:
    device_id: str
    temperature: float
    humidity: float
    timestamp: datetime

    def __post_init__(self):
        # --- Validation after initialization ---
        if not self.device_id or not isinstance(self.device_id, str):
            raise ValueError("device_id must be a non-empty string")
        
        # Check temperature range
        if not (-50 <= self.temperature <= 80):
            # Using f-string for better error message
            raise ValueError(f"Temperature {self.temperature}°C out of realistic range [-50°C, 80°C]")
        
        # Check humidity range
        if not (0 <= self.humidity <= 100):
            raise ValueError(f"Humidity {self.humidity}% must be between 0% and 100%")
        
        if not isinstance(self.timestamp, datetime):
            raise TypeError("timestamp must be a datetime object")

    def to_json(self) -> str:
        """Serialize to JSON-compatible format (string)."""
        return json.dumps({
            "device_id": self.device_id,
            "temperature": round(self.temperature, 2),
            "humidity": round(self.humidity, 2),
            # Use isoformat() for standard timestamp representation
            "timestamp": self.timestamp.isoformat() 
        })

    @classmethod
    def from_dict(cls, data: Dict[str, Any]):
        """
        Create instance from dictionary. 
        Includes robust error handling for missing keys and invalid types.
        """
        required_keys = ["device_id", "temperature", "humidity", "timestamp"]
        
        # 1. Check for missing keys
        if not all(k in data for k in required_keys):
            missing = [k for k in required_keys if k not in data]
            raise KeyError(f"Missing required keys in input data: {', '.join(missing)}")
        
        try:
            # 2. Cast types first; only conversion failures are wrapped here
            device_id = str(data["device_id"])
            temperature = float(data["temperature"])
            humidity = float(data["humidity"])
            timestamp = datetime.fromisoformat(data["timestamp"])
        except (ValueError, TypeError) as e:
            # Errors from float() or datetime.fromisoformat()
            raise TypeError(f"Could not convert input data to correct types: {e}") from e

        # 3. Construct outside the try block so range errors raised by
        # __post_init__ propagate unchanged as ValueError
        return cls(
            device_id=device_id,
            temperature=temperature,
            humidity=humidity,
            timestamp=timestamp,
        )

    @classmethod
    def from_json(cls, json_str: str):
        """Create instance from a JSON string."""
        try:
            data = json.loads(json_str)
            # Delegate to from_dict for the main construction logic
            return cls.from_dict(data)
        except json.JSONDecodeError as e:
            raise ValueError(f"Invalid JSON string provided: {e}")


# --- Usage in Action (Updated) ---

print("--- Successful Case ---")
# Simulate incoming sensor data
raw_data = {
    "device_id": "FIELD-B-117",
    "temperature": 23.4,
    "humidity": 68.2,
    "timestamp": "2024-06-15T14:30:00"
}

try:
    reading = SensorReading.from_dict(raw_data)
    print(f"Created object: {reading}")
    
    # Test to_json and from_json cycle
    json_payload = reading.to_json()
    print(f"JSON Payload: {json_payload}")

    reading_from_json = SensorReading.from_json(json_payload)
    print(f"Object from JSON: {reading_from_json}")
    assert reading == reading_from_json, "from_json/to_json cycle failed"

except (ValueError, KeyError, TypeError) as e:
    print(f"Invalid sensor data: {e}")

print("\n--- Error Case 1: Missing Key ---")
raw_data_missing = {
    "device_id": "FIELD-B-118",
    "temperature": 25.0,
    # 'humidity' is missing
    "timestamp": "2024-06-15T14:35:00"
}

try:
    SensorReading.from_dict(raw_data_missing)
except (ValueError, KeyError, TypeError) as e:
    print(f"Caught expected error: {e}") 

print("\n--- Error Case 2: Out of Range Value ---")
raw_data_bad_range = {
    "device_id": "FIELD-B-119",
    "temperature": 90.0, # Too high
    "humidity": 50.0,
    "timestamp": "2024-06-15T14:40:00"
}

try:
    SensorReading.from_dict(raw_data_bad_range)
except (ValueError, KeyError, TypeError) as e:
    print(f"Caught expected error: {e}")

print("\n--- Error Case 3: Invalid Type ---")
raw_data_bad_type = {
    "device_id": "FIELD-B-120",
    "temperature": "twenty", # Not a float/int string
    "humidity": 55.0,
    "timestamp": "2024-06-15T14:45:00"
}

try:
    SensorReading.from_dict(raw_data_bad_type)
except (ValueError, KeyError, TypeError) as e:
    print(f"Caught expected error: {e}")
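The scenario also lists comparing readings for anomaly detection, which the code above does not cover. One possible sketch follows, using a trimmed copy of the class without validation. The helper humidity_jump and its 15-point threshold are hypothetical; a real deployment would tune the rule to its own sensors and soil.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SensorReading:  # trimmed copy of the class above, validation omitted
    device_id: str
    temperature: float
    humidity: float
    timestamp: datetime


def humidity_jump(prev: SensorReading, curr: SensorReading,
                  threshold: float = 15.0) -> bool:
    """Hypothetical helper: flag a large humidity change on one device.

    The default threshold is an illustrative assumption, not a
    recommended agronomic value.
    """
    if prev.device_id != curr.device_id:
        raise ValueError("readings come from different devices")
    return abs(curr.humidity - prev.humidity) > threshold


earlier = SensorReading("FIELD-B-117", 23.4, 68.2, datetime(2024, 6, 15, 14, 30))
later = SensorReading("FIELD-B-117", 23.9, 41.0, datetime(2024, 6, 15, 14, 40))
print(humidity_jump(earlier, later))  # True
```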

Best Practices for Custom Data Structures

  1. Use @dataclass for pure data containers—it’s concise and typed.

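  2. Validate in __post_init__ to catch errors at construction time. It runs only once, so later attribute assignments bypass it unless the class is frozen.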

  3. Add from_dict() and to_json()—for seamless I/O.

  4. Use type hints—improves IDE support and readability.

  5. Avoid mutable defaults—use default_factory if needed.

  6. Make instances immutable when possible (frozen=True): frozen instances reject attribute reassignment, which makes them safer to share between threads.
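Items 5 and 6 can be illustrated together in a short sketch. The SensorBatch class is hypothetical and appears nowhere else in this article.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True)
class SensorBatch:
    """Hypothetical container used only to illustrate items 5 and 6."""
    field_name: str
    # A bare `= []` default is rejected by @dataclass with ValueError;
    # default_factory gives each instance its own fresh list.
    device_ids: list = field(default_factory=list)


a = SensorBatch("FIELD-A")
b = SensorBatch("FIELD-B")

# frozen=True blocks rebinding attributes, not mutation of the list itself.
a.device_ids.append("FIELD-A-042")
print(b.device_ids)  # []: the two instances do not share a list

try:
    a.field_name = "FIELD-C"
except FrozenInstanceError as exc:
    print(f"Blocked: {exc}")
```

Because frozen=True does not protect the contents of mutable fields, a tuple is often a better choice than a list when full immutability matters.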

Conclusion

Python classes—and especially data classes—are the natural, powerful evolution of C-style structs. In domains like IoT, finance, or healthcare, modeling data correctly isn’t just convenient—it’s essential for reliability. By wrapping your data in smart, validated classes, you turn raw bytes into trusted, actionable information. Whether you’re tracking soil moisture in a vineyard or patient vitals in a hospital, your data structures should work as hard as your algorithms. Start small: replace your next dict with a @dataclass. Your future self—and your production logs—will thank you.