Data Science  

Building a GA4 to Power BI Data Pipeline Using Snowflake and dbt (Medallion Architecture)

Introduction

In this article, I’ll walk you through a Proof of Concept (POC) project that demonstrates how to build a modern, modular, and scalable data pipeline using.

  • Google Analytics 4 (GA4) data.
  • Snowflake for cloud data warehousing.
  • dbt (Data Build Tool) for transformation.
  • and Power BI for dashboard visualization.

The architecture follows the Medallion Architecture (Bronze → Silver → Gold), a layered approach that ensures data quality, transformation transparency, and reusability.

This guide is aimed at data engineers, BI developers, and analytics professionals interested in building efficient, cloud-native pipelines.

Google

Project Objective

To build a modular and robust data pipeline that ingests event-level data from GA4, performs layered transformations, and exposes curated metrics for business users via Power BI.

Tools Used

  • Google Analytics 4: User and event tracking
  • Snowflake: Cloud data platform
  • dbt: SQL-based transformation and documentation tool
  • Power BI: Visualization and reporting
  • SQL: For dbt transformations

Step-by-Step Implementation

Step 1. Ingest GA4 Data into Snowflake

Export GA4 data into BigQuery or Google Cloud Storage, then load it into Snowflake.

-- Sample raw JSON table in Snowflake (Bronze Layer)
CREATE TABLE GA4_DB.raw_events (
    event_timestamp   TIMESTAMP,
    event_name        STRING,
    event_params      VARIANT,
    user_properties   VARIANT,
    geo               VARIANT,
    device            VARIANT
);

Step 2. Set up dbt Project (Bronze → Silver → Gold)

Use dbt to manage transformations.

Project Structure:

ga4_dbt_project/
├── models/
│   ├── bronze/
│   ├── silver/
│   └── gold/
├── dbt_project.yml
└── profiles.yml

Bronze Layer (raw passthrough)

-- models/bronze/ga4_events_bronze.sql

SELECT
    event_timestamp,
    event_name,
    event_params,
    geo,
    device
FROM 
    {{ source('raw', 'events') }}
;

Silver Layer (cleaned & structured)

-- models/silver/ga4_events_silver.sql

SELECT
    event_timestamp::DATE AS event_date,
    event_name
  • geo: country AS country,
  • device: category AS device_type
FROM {{ ref('ga4_events_bronze') }}

Gold Layer (curated metrics)

-- models/gold/daily_event_metrics.sql

SELECT
    event_date,
    country,
    COUNT(*) AS total_events,
    COUNT(DISTINCT event_name) AS unique_events
FROM {{ ref('ga4_events_silver') }}
GROUP BY
    event_date,
    country;

Step 3. Generate Documentation

dbt docs generate
dbt docs serve

Step 4. Connect to Power BI

Use the Snowflake connector in Power BI Desktop to pull data from the gold.daily_event_metrics table.

Build visuals like,

  • Event volume over time
  • Top countries by activity
  • Device distribution

Benefits of the Medallion Architecture

Layer Purpose
Bronze Raw, untouched data from the source
Silver Cleansed and structured data
Gold Business-consumable metrics and KPIs

This structure promotes,

  • Data reusability
  • Scalability
  • Data quality control
  • Layered responsibility

Conclusion

This POC illustrates a powerful use of modern data stack principles using.

  • GA4 as the source
  • dbt for modular SQL transformations
  • Snowflake as the engine
  • Power BI as the front-end layer

It’s easy to scale this architecture to include other sources, real-time feeds, or enterprise-grade automation.