ELT Pipeline AWS — Medallion
Multi-tenant analytical platform on AWS with 4-layer Medallion architecture — 99% cost reduction vs Azure Databricks.
The Problem
The corporate analytical platform ran on Azure Databricks at ~$800/month for 5 business units, which is infeasible for a personal portfolio project that aims to keep the same technical depth.
The Solution
Serverless-first migration: S3 + Glue Data Catalog + Athena v3 (Trino) + Apache Iceberg for ACID transactions and schema evolution. dbt-athena for 45 incremental transformations. Airflow with Datasets for event-driven orchestration. Observability via CloudWatch + SNS → Lambda → Slack. 100% Terraform infra with remote state on S3 + DynamoDB lock.
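As an illustration of the SNS → Lambda → Slack alerting path, here is a minimal sketch of what such a Lambda handler could look like. It is not the project's actual code: the `SLACK_WEBHOOK_URL` environment variable and the function names are hypothetical, and it assumes the SNS topic receives CloudWatch alarm notifications (JSON messages with `AlarmName`, `NewStateValue` and `NewStateReason`) and that Slack is reached through an incoming webhook.

```python
import json
import os
import urllib.request

# Hypothetical environment variable holding the Slack incoming-webhook URL.
WEBHOOK_ENV = "SLACK_WEBHOOK_URL"


def build_slack_payload(sns_record: dict) -> dict:
    """Turn one SNS record into a Slack incoming-webhook payload."""
    sns = sns_record["Sns"]
    raw = sns.get("Message", "")
    try:
        alarm = json.loads(raw)
    except json.JSONDecodeError:
        alarm = None  # Not JSON: treat the message as plain text.

    if isinstance(alarm, dict) and "AlarmName" in alarm:
        # CloudWatch alarm notification: show state, name and reason.
        state = alarm.get("NewStateValue", "UNKNOWN")
        reason = alarm.get("NewStateReason", "")
        text = f"[{state}] {alarm['AlarmName']}: {reason}"
    else:
        # Any other SNS message: fall back to subject plus raw body.
        text = f"{sns.get('Subject') or 'Notification'}: {raw}"
    return {"text": text}


def handler(event, context):
    """Lambda entry point: forward every SNS record in the event to Slack."""
    url = os.environ[WEBHOOK_ENV]
    records = event.get("Records", [])
    for record in records:
        body = json.dumps(build_slack_payload(record)).encode("utf-8")
        request = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
    return {"forwarded": len(records)}
```

Keeping the message formatting in a pure function (`build_slack_payload`) separate from the HTTP call makes the alert text easy to unit-test without network access.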
Result
Operational platform preserving 100% of business logic (8 datamarts, 45 models, Kimball star schema, 5 tenants) at ~$6/month, a ~99% reduction from ~$800/month. CI runs in < 5 min, and make up brings up an operational Airflow in < 60 s.
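The ~99% figure follows directly from the two monthly costs quoted above. A quick check, treating the approximate ~$800 and ~$6 values as exact for the sake of the arithmetic (the function name is illustrative only):

```python
def cost_reduction_pct(before: float, after: float) -> float:
    """Percentage reduction when a cost drops from `before` to `after`."""
    return (before - after) / before * 100


monthly_before = 800.0  # approximate Azure Databricks cost (USD/month)
monthly_after = 6.0     # approximate AWS serverless cost (USD/month)

reduction = cost_reduction_pct(monthly_before, monthly_after)
annual_savings = (monthly_before - monthly_after) * 12

print(f"Reduction: {reduction:.2f}%")               # Reduction: 99.25%
print(f"Annual savings: ${annual_savings:,.0f}")    # Annual savings: $9,528
```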
Related Projects
RPA Suite Fictor
Suite of 80+ RPA pipelines automating critical logistics, supply chain and sales reports for 5 subsidiaries.
NFe OCR Pipeline
Multi-engine pipeline for classification, OCR extraction and automated organization of ~500 invoices/month — 98% accuracy.