2026 — in progress
Personal Finance Data Pipeline
An end-to-end ETL pipeline that turns raw transactions into insight — with tests and CI.
Self-directed · learning data engineering by building
Code coming soon — building in the open.
// overview
I'm teaching myself data engineering the way I learn best: by building something real. This pipeline ingests personal transaction data, validates and transforms it, stores it in a SQLite database, flags unusual spending, and surfaces everything in an interactive dashboard. I'm treating it like production software, backed by automated tests and continuous integration, rather than as a one-off script.
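As a rough illustration of the ingest, validate, transform, and store steps described above, here is a minimal sketch using pandas and the standard-library `sqlite3` module. The column names (`date`, `description`, `amount`), the `transactions` table name, the sample data, and every function name are assumptions made for this example, not the project's actual code.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample input; the real column names and formats are assumptions.
RAW_CSV = """date,description,amount
2026-01-03,Groceries,-54.20
2026-01-04,Coffee,-4.50
not-a-date,Broken row,-10.00
2026-01-05,Salary,2500.00
2026-01-06,Missing amount,
"""


def extract(source) -> pd.DataFrame:
    """Read raw transactions; keep everything as strings until validation."""
    return pd.read_csv(source, dtype=str)


def validate(raw: pd.DataFrame) -> pd.DataFrame:
    """Parse dates and amounts, dropping rows where either fails to parse."""
    df = raw.copy()
    df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d", errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    return df.dropna(subset=["date", "amount"]).reset_index(drop=True)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Tidy descriptions and add a month column for later aggregation."""
    out = df.copy()
    out["description"] = out["description"].str.strip()
    out["month"] = out["date"].dt.strftime("%Y-%m")
    out["date"] = out["date"].dt.strftime("%Y-%m-%d")  # store dates as ISO text
    return out


def load(df: pd.DataFrame, conn: sqlite3.Connection, table: str = "transactions") -> int:
    """Write the table, replacing any previous run, and return the row count."""
    df.to_sql(table, conn, if_exists="replace", index=False)
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def run_pipeline(source, conn: sqlite3.Connection) -> int:
    """Run extract -> validate -> transform -> load and return rows stored."""
    return load(transform(validate(extract(source))), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(io.StringIO(RAW_CSV), connection))
```

Replacing the table on each run is one simple way to make the flow repeatable: running it twice on the same input leaves the database in the same state. An incremental load would need a different strategy, such as a unique key per transaction.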
// what I'm building
- An ETL flow that ingests, validates, transforms, and stores transactions in SQLite.
- Anomaly detection that flags unusual transactions using z-scores.
- An interactive Plotly Dash dashboard to explore spending over time.
- A pytest test suite and a GitHub Actions CI pipeline so changes stay reliable.
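To make the z-score idea from the list above concrete, here is a minimal sketch of flagging unusual amounts. It uses only the standard-library `statistics` module to stay dependency-free; the function name `zscore_flags`, the default threshold, and the sample amounts are assumptions for illustration, and the same calculation could equally be done on a pandas column.

```python
from statistics import mean, pstdev


def zscore_flags(amounts, threshold=3.0):
    """Return one boolean per amount: True where |z-score| exceeds threshold.

    The z-score of x is (x - mean) / standard deviation, computed here with
    the population standard deviation over the whole list.
    """
    if len(amounts) < 2:
        return [False] * len(amounts)
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All amounts identical: nothing can be unusual.
        return [False] * len(amounts)
    return [abs((x - mu) / sigma) > threshold for x in amounts]


if __name__ == "__main__":
    # Hypothetical spending amounts with one obvious outlier at the end.
    sample = [20, 22, 19, 21, 23, 20, 22, 21, 400]
    print(zscore_flags(sample, threshold=2.5))
```

One design point worth noting: with the population standard deviation, the largest possible |z| in a list of n values is sqrt(n - 1), so a threshold of 3 can never fire on fewer than eleven points. That is why the example above uses 2.5 on nine values. A single large outlier also inflates the mean and standard deviation it is measured against, so a median-based score may be a sturdier choice on small or skewed data.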
// tech stack
Python · pandas · SQLite · Plotly Dash · pytest · GitHub Actions
// what I'm focused on
- Designing data flows that are reliable and repeatable, not just runnable once.
- Writing tests and setting up CI, habits that carry into any real codebase.
- Directing my own learning: diagnosing what I don't know and closing the gap deliberately.