Senior Data & AI Consultant building cloud warehouses for pharma, banking, aviation and energy across DACH. I pull scattered sources — mostly SAP — into one governed Snowflake platform, model it with dbt, and ship to production with automated testing.
On top of that, I build in-warehouse ML and LLM features on Snowflake Cortex and agentic systems with LangGraph: lead scoring, forecasting, and a guard-railed Text-to-SQL agent. I am Azure-certified (Solutions Architect Expert) and work in English and German.
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class David:
    role: str = "Senior Data & AI Consultant"
    based: str = "Essen, DE · DACH + remote"
    stack: tuple[str, ...] = ("Snowflake", "dbt", "Cortex", "LangGraph")

    def expertise(self) -> dict[str, list[str]]:
        return {
            "platforms": ["SAP → Snowflake", "Data Vault 2.0", "Kimball"],
            "ai": ["Cortex ML & LLM", "agentic Text-to-SQL", "RAG"],
            "delivery": ["Terraform", "CI/CD", "automated testing"],
        }


david = David()  # 6+ yrs · pharma · banking · aviation · energy
A DACH pharmaceutical group needed one governed platform for group-wide reporting instead of data scattered across systems. I built its global analytics warehouse on Snowflake with dbt: a layered Data Vault 2.0 architecture from raw ingestion and staging, through business-vault enrichment, to subject-area marts and a governance layer. It consolidates SAP ECC and HANA, flat files, and manual reference data, and covers sales order book, invoicing, P&L, supply chain (including OTIF), and data quality. Automated incremental pipelines are promoted across dev, QA, and production on Azure DevOps.
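Data Vault 2.0 models commonly key hubs on a hash of the normalized business key. The sketch below illustrates that general pattern in Python; it is not the project's actual dbt macro. The function name, the `||` delimiter, the MD5 choice, and the sample SAP-style keys are all assumptions made for illustration.

```python
import hashlib


def hash_key(*business_keys: str, delimiter: str = "||") -> str:
    """Return an MD5 hex digest over normalized business-key parts."""
    # Trim and upper-case each part so "de01 " and "DE01" hash identically.
    normalized = [part.strip().upper() for part in business_keys]
    joined = delimiter.join(normalized).encode("utf-8")
    # usedforsecurity=False: the hash is a surrogate key, not a security control.
    return hashlib.md5(joined, usedforsecurity=False).hexdigest()


# Hypothetical business keys: a sales organisation and an order number.
hub_sales_order_hk = hash_key("de01", " 4500001234")
```

The delimiter keeps composite keys unambiguous: `("A", "B")` and `("AB", "")` produce different hashes because the joined strings differ.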
An analytics platform that turns SAMS flight data into reliable insights for operations, airlines, capacity, and planning. Data flows from Azure into a Snowflake landing zone, through automated ELT into a star-schema data mart, and on to Power BI. It covers three domains: Coordinated (scheduled flights and slot allocations), Operated (actual flight performance), and Utilization (airport capacity). Snowflake Streams and Tasks run the incremental, dependency-based loads, so dimensions are always ready before facts.
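The "dimensions before facts" guarantee is a dependency-ordering problem. In the platform the ordering comes from the Snowflake task graph; the sketch below only illustrates the idea with Python's standard-library topological sorter. The table names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical load graph: each table maps to the tables it depends on.
dependencies: dict[str, set[str]] = {
    "dim_airline": set(),
    "dim_airport": set(),
    "dim_date": set(),
    "fact_coordinated": {"dim_airline", "dim_airport", "dim_date"},
    "fact_operated": {"dim_airline", "dim_airport", "dim_date"},
    "fact_utilization": {"dim_airport", "dim_date"},
}


def load_order(graph: dict[str, set[str]]) -> list[str]:
    """Return a load order in which every table follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = load_order(dependencies)
```

Any order this returns places each fact table after all of the dimensions it references, which is the property the incremental loads rely on.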
Every new data source used to mean hand-written pipelines and ad-hoc schema changes. I built a configuration-driven staging framework where a single JSON file provisions the entire ingestion stack (Snowpipes, Streams, Tasks, formats and roles), deployed via Terraform across all environments. Every staged row keeps a full audit trail back to its source file.
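As a rough sketch of the configuration-driven idea, the Python below turns one JSON document into a plan of object names for the ingestion stack. The config schema, the naming conventions, and the `IngestionPlan` type are assumptions for illustration; the real framework's format is not shown here, and the actual provisioning is done by Terraform rather than by this code.

```python
import json
from dataclasses import dataclass

# Hypothetical config for one source; every field name is an assumption.
CONFIG = """
{
  "source": "sap_sales_orders",
  "database": "RAW",
  "schema": "STAGING",
  "file_format": "CSV",
  "schedule_minutes": 15
}
"""


@dataclass(frozen=True)
class IngestionPlan:
    stage_table: str
    pipe: str
    stream: str
    task: str
    file_format: str
    schedule: str


def plan_from_config(raw: str) -> IngestionPlan:
    """Derive the names of all ingestion objects from a single JSON config."""
    cfg = json.loads(raw)
    prefix = f'{cfg["database"]}.{cfg["schema"]}'
    name = cfg["source"].upper()
    return IngestionPlan(
        stage_table=f"{prefix}.STG_{name}",
        pipe=f"{prefix}.PIPE_{name}",
        stream=f"{prefix}.STRM_{name}",
        task=f"{prefix}.TASK_{name}",
        file_format=f'{prefix}.FF_{cfg["file_format"].upper()}',
        schedule=f'{cfg["schedule_minutes"]} MINUTE',
    )


plan = plan_from_config(CONFIG)
```

Because every object name is derived from the same config, adding a source means adding a file instead of writing a new pipeline by hand.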
A risk and performance reporting platform for a DACH energy trader, consolidating trading, risk, and master data into one governed Snowflake analytics layer. I led a team of 4 building this 2+ TB platform on Snowflake and Azure: layered dbt from raw ingestion and staging, through a Risk Data Vault and business vault, to hourly and daily reporting marts. Sources include Python risk metrics (VaR, PAR, Expected Shortfall), PSI trade quantities, HPFC price curves, contract and portfolio master data, risk limits, and PnL adjustments, with Oracle integrated via SAP OData. dbt handles Data Vault automation, data-quality tests, and column masking; GitLab CI/CD runs scheduled, manual, and merge-request pipelines.
A reference implementation for safe, observable GenAI over a real warehouse, not a single-prompt demo. Business users ask in plain English; a LangGraph agent uses Snowflake Cortex to generate SQL, enforces read-only safety with sqlglot and a dedicated read-only role, runs the query, and summarizes the answer, with bounded self-repair when a query fails. It runs on the UCI Online Retail II dataset, ingested with Python into Snowflake and modeled with dbt into a Kimball star schema and a generated semantic layer. Served through a FastAPI /ask endpoint and a Typer CLI, traced with Langfuse, and scored by execution-accuracy evaluation against a gold question set. Code is public on GitHub.
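Execution accuracy scores a generated query by running it and comparing its result set with the gold query's, rather than comparing SQL text. The sketch below shows the idea with an in-memory SQLite database standing in for Snowflake; the table, the question pairs, and the function names are hypothetical and are not taken from the repository.

```python
import sqlite3
from collections import Counter


def run(conn: sqlite3.Connection, sql: str) -> Counter:
    """Execute a query and return its rows as an order-insensitive multiset."""
    return Counter(conn.execute(sql).fetchall())


def execution_accuracy(
    conn: sqlite3.Connection, pairs: list[tuple[str, str]]
) -> float:
    """Share of (predicted, gold) SQL pairs whose result sets match."""
    hits = 0
    for predicted_sql, gold_sql in pairs:
        try:
            hits += run(conn, predicted_sql) == run(conn, gold_sql)
        except sqlite3.Error:
            pass  # a query that fails to execute counts as a miss
    return hits / len(pairs)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fct_sales (country TEXT, revenue REAL);
    INSERT INTO fct_sales VALUES ('DE', 100.0), ('DE', 50.0), ('FR', 70.0);
    """
)

pairs = [
    # Same rows in a different order: counts as correct.
    ("SELECT country, SUM(revenue) FROM fct_sales GROUP BY country",
     "SELECT country, SUM(revenue) FROM fct_sales "
     "GROUP BY country ORDER BY country DESC"),
    # Wrong aggregate: different rows, counts as a miss.
    ("SELECT country, AVG(revenue) FROM fct_sales GROUP BY country",
     "SELECT country, SUM(revenue) FROM fct_sales GROUP BY country"),
    # Invalid SQL: raises, counts as a miss.
    ("SELECT revenue FROM missing_table",
     "SELECT SUM(revenue) FROM fct_sales"),
]

score = execution_accuracy(conn, pairs)
```

Comparing rows as a multiset ignores ordering, which suits questions without an explicit sort; a stricter evaluator would compare ordered lists when the question asks for a ranking.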
The assistant answering questions on this page right now. I chunk my CV and projects, embed them with Cortex AI_EMBED into a Snowflake VECTOR column, retrieve the most relevant chunks for each question, and ground an AI_COMPLETE answer in them. A Cloudflare Worker in front rate-limits, caches, and logs requests. The assistant is bilingual (English / German) and answers only from real facts; it won't invent experience.
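The retrieve-then-ground step can be sketched in plain Python. Here tiny hand-written vectors stand in for real embeddings, and an in-process cosine ranking stands in for the similarity search that runs in the warehouse. The chunk texts, vectors, and prompt wording are invented for illustration.

```python
import math

Chunk = tuple[str, list[float]]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(question_vec: list[float], chunks: list[Chunk], k: int = 2) -> list[str]:
    """Return the texts of the k chunks most similar to the question."""
    ranked = sorted(chunks, key=lambda c: cosine(question_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Ground the answer: the model may only use the retrieved context."""
    facts = "\n".join(f"- {text}" for text in context)
    return (
        "Answer only from the facts below. If they do not cover the "
        f"question, say so.\n\nFacts:\n{facts}\n\nQuestion: {question}"
    )


# Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
chunks: list[Chunk] = [
    ("Built a Data Vault 2.0 warehouse on Snowflake with dbt.", [0.9, 0.1, 0.0]),
    ("Built a LangGraph Text-to-SQL agent.", [0.1, 0.9, 0.1]),
    ("Benchmarked ClickHouse against Snowflake.", [0.2, 0.1, 0.9]),
]

question = "What warehouse modeling experience does he have?"
context = top_k([0.8, 0.3, 0.0], chunks, k=2)
prompt = build_prompt(question, context)
```

Restricting the prompt to retrieved facts is what keeps the assistant from inventing experience: if nothing relevant is retrieved, the instruction tells the model to say so.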
Safe, observable, evaluated LangGraph text-to-SQL agent over a Snowflake warehouse — read-only guardrails, a generated semantic layer, and execution-accuracy evals.
A hands-on benchmark of ClickHouse and Snowflake on the same analytical workload — comparing load, query latency, and cost trade-offs.
A set of self-built ETL projects exploring different data-engineering tools and patterns end to end, from ingestion through transformation.






