Personal Finance Data Pipeline

An end-to-end ETL pipeline that turns raw transactions into insight — with tests and CI.

Self-directed · learning data engineering by building

Code coming soon — building in the open.

// overview

I'm teaching myself data engineering the way I learn best — by building something real. This pipeline ingests personal transaction data, validates and transforms it, stores it in a database, flags unusual spending, and surfaces everything in an interactive dashboard. I'm treating it like production software, with automated tests and continuous integration rather than a one-off script.
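To make the ingest, validate, transform, and store steps concrete, here is a minimal sketch of what such a flow could look like. It is not the project's code, which isn't published yet. The column names, the `transactions` table, the sample rows, and every function name are assumptions for illustration, and it uses only Python's standard library (`csv`, `sqlite3`) rather than pandas so that it runs without any installs.

```python
import csv
import io
import sqlite3
from datetime import date

# Hypothetical raw export; the column names are assumptions for illustration.
RAW_CSV = """date,description,amount
2024-01-03,Groceries,-54.20
2024-01-04,Coffee,-3.50
not-a-date,Broken row,-10.00
2024-01-05,Salary,2500.00
2024-01-06,Refund,
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(rows):
    """Split rows into (valid, rejected); valid rows have an ISO date and a numeric amount."""
    valid, rejected = [], []
    for row in rows:
        try:
            date.fromisoformat(row["date"])
            float(row["amount"])
        except (ValueError, TypeError):
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected


def transform(rows):
    """Normalise rows into (date, description, amount_cents) tuples."""
    return [
        (r["date"], r["description"].strip(), round(float(r["amount"]) * 100))
        for r in rows
    ]


def load(conn, records):
    """Insert records; the UNIQUE constraint plus OR IGNORE makes reruns idempotent."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transactions (
               date TEXT NOT NULL,
               description TEXT NOT NULL,
               amount_cents INTEGER NOT NULL,
               UNIQUE (date, description, amount_cents)
           )"""
    )
    conn.executemany("INSERT OR IGNORE INTO transactions VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(conn, text):
    """Run all four steps and return (valid_count, rejected_count)."""
    valid, rejected = validate(extract(text))
    load(conn, transform(valid))
    return len(valid), len(rejected)


conn = sqlite3.connect(":memory:")
print(run_pipeline(conn, RAW_CSV))
```

Running the pipeline twice on the same input leaves the table unchanged, which is one simple way to get "repeatable" behaviour. The trade-off in this sketch is that two genuinely identical transactions on the same day would collapse into one row; a real export with a transaction ID would be a better uniqueness key.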

// what I'm building

An ETL flow that ingests, validates, transforms, and stores transactions in SQLite.
Anomaly detection that flags unusual transactions using z-scores.
An interactive Plotly Dash dashboard to explore spending over time.
A pytest test suite and a GitHub Actions CI pipeline so changes stay reliable.
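As an illustration of the z-score idea in the list above: a transaction's z-score is (amount − mean) / standard deviation, and a transaction is flagged when the absolute value exceeds a chosen threshold. The sketch below is one possible version under stated assumptions, not the project's implementation. The function name `flag_anomalies`, the default threshold of 3.0, the use of the sample standard deviation, and the sample amounts are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(amounts, threshold=3.0):
    """Return the indices of amounts whose absolute z-score exceeds the threshold."""
    if len(amounts) < 2:
        return []  # a standard deviation needs at least two points
    mu = mean(amounts)
    sigma = stdev(amounts)  # sample standard deviation (n - 1 in the denominator)
    if sigma == 0:
        return []  # all amounts identical, so nothing stands out
    return [i for i, x in enumerate(amounts) if abs((x - mu) / sigma) > threshold]


# Hypothetical spending amounts with one obvious outlier at index 7.
amounts = [12.0, 15.5, 9.9, 14.2, 11.0, 13.3, 10.8, 250.0]
print(flag_anomalies(amounts, threshold=2.0))
print(flag_anomalies(amounts, threshold=3.0))
```

With only eight values, the 250.0 purchase is flagged at a threshold of 2.0 but not at 3.0. That is expected: with the sample standard deviation, the largest possible absolute z-score in a sample of n points is (n − 1) / √n, about 2.47 for n = 8, so the threshold has to suit the amount of data available. The outlier also inflates the mean and standard deviation it is measured against, which is a known weakness of plain z-scores.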

// tech stack

Python · pandas · SQLite · Plotly Dash · pytest · GitHub Actions

// what I'm focused on

Designing data flows that are reliable and repeatable, not just runnable once.
Writing tests and setting up CI: habits that carry into any real codebase.
Directing my own learning: diagnosing what I don't know and closing the gap deliberately.
