A “small” change to a business rule breaks a dashboard three layers downstream. Every data team has lived that moment.

It happens because most organizations treat data as isolated layers. Operations builds processes. IT builds systems. Analytics builds dashboards. Nobody maps the full chain. When something shifts upstream, everything downstream absorbs the shock — and nobody saw it coming.

The dataflow framework is how I think about this problem. It traces how data moves through your organization — from the boardroom decision that creates a process, through the source systems that capture signals, into the warehouse that models them for analysis, out to dashboards that surface insights, and back to the boardroom where those insights reshape strategy.

It’s a loop. And if you don’t see the full loop, you’re always surprised when something breaks.

The Full Chain

Most data methodologies start in the middle. Kimball starts at the warehouse. Inmon starts at the enterprise model. dbt starts at the transformation layer. All useful. All incomplete.

The dataflow framework starts where data actually originates: a business decision.

Here’s the chain:

Business strategy → Business processes → Source systems → Data warehouse → Dashboards & KPIs → Feedback → repeat.

Each step depends on the one before it. Each step constrains the one after it. Miss one, and your downstream work inherits problems you can’t see from where you’re standing.

Step 1: Business Strategy — What Is Data Actually For?

Before touching any tool, any database, any dashboard: which decisions need better data?

Not “what data do we have?” That question leads to dashboard graveyards — reports nobody asked for, answering questions nobody has.

The right starting point: sit with the people who make decisions. The COO wondering why delivery times are climbing. The CFO who can’t reconcile revenue across three systems. The Head of Sales whose pipeline reports contradict what the CRM shows.

Their questions define what “good data” means. Everything downstream exists to answer those questions.

I run this as a leadership alignment session. Two hours. We map the 5-10 decisions that actually drive the business, identify which ones currently rely on gut feeling instead of facts, and define what “answered” looks like. No SQL. No architecture diagrams. Just clarity.

Related: How to Align Your Data Strategy with Business Strategy

Step 2: Business Processes — Where Data Is Born

Business decisions become business processes. Processes generate signals. Those signals become your data.

An order-to-cash process generates invoices, payments, credit notes. A hiring process generates applications, interviews, offers. A manufacturing process generates production runs, quality checks, waste reports.

If you don’t understand the process, you can’t model the data it produces. You’ll build a fact table around “orders” without realising that your company’s order process has a draft → approved → confirmed → shipped lifecycle, with a different business meaning at each stage.
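A minimal sketch of why the lifecycle matters, using made-up order records and the stage names from the example above (real systems will have their own vocabulary):

```python
# Hypothetical order records at different lifecycle stages.
orders = [
    {"id": 1, "stage": "draft",     "amount": 500},
    {"id": 2, "stage": "confirmed", "amount": 1200},
    {"id": 3, "stage": "shipped",   "amount": 800},
    {"id": 4, "stage": "draft",     "amount": 300},
]

# "Total order value" is ambiguous until you pick which stages count.
# A draft is an intention; a confirmed or shipped order is a commitment.
BOOKED_STAGES = {"confirmed", "shipped"}

pipeline_value = sum(o["amount"] for o in orders)  # every stage, drafts included
booked_value = sum(o["amount"] for o in orders if o["stage"] in BOOKED_STAGES)

print(pipeline_value)  # 2800
print(booked_value)    # 2000
```

Two numbers, both defensibly called “order value” — and a fact table that ignores the lifecycle will silently pick one of them for you.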

This is where philosophy helps, surprisingly. Kant’s insight was to start from the observer, not the object. In data terms: the same transaction looks different depending on which process generated it and who’s asking about it. A “sale” in the CRM is not the same entity as a “sale” in the ERP. Same word, different observer, different data.

Related: The Observer is Central: Why Philosophy Matters for Data

Step 3: Source Systems — Reading What You Inherit

Your ERP, CRM, project management tool, time tracking system — these weren’t built for analytics. They were built for operations. Their database schemas reflect operational concerns: transaction speed, data entry workflows, application logic.

You don’t choose these models. You inherit them. But if you can’t read what you’re looking at, your analytical model will inherit problems you didn’t know existed.

Source systems use different modeling patterns:

  • Relational databases (your ERP, most SaaS platforms) — normalized for write integrity, not for reporting. Joining 15 tables to answer “what did we sell last month?” is normal.
  • Event-driven systems (webhooks, message queues, activity logs) — data as a stream of things that happened, not a snapshot of current state.
  • Hierarchical stores (JSON APIs, document databases, nested XML exports) — flattening these is always harder than it looks.
  • Flat files — CSVs, Excel exports, SFTP drops. Not a formal model, but half your integration problems start here.

The skill isn’t just extracting this data. It’s recognising what it represents: snapshots vs. event logs, transactional data vs. state data, master data vs. reference data. Get this wrong, and your warehouse tells lies that look like truth.
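The snapshot-vs-event-log distinction is easiest to see in code. A toy sketch (field names are illustrative): reducing an event log to a current-state snapshot is trivial, but the reverse is impossible — the snapshot has already thrown the history away.

```python
# A stream of status-change events (event log), as a webhook or
# activity table would deliver them.
events = [
    {"order_id": "A", "status": "created", "at": "2024-01-01"},
    {"order_id": "A", "status": "paid",    "at": "2024-01-03"},
    {"order_id": "B", "status": "created", "at": "2024-01-02"},
    {"order_id": "A", "status": "shipped", "at": "2024-01-05"},
]

# Reducing the log to a snapshot: keep the latest event per key.
snapshot = {}
for e in sorted(events, key=lambda e: e["at"]):
    snapshot[e["order_id"]] = e["status"]

print(snapshot)  # {'A': 'shipped', 'B': 'created'}
```

The event log can answer “how long do orders sit in ‘paid’ before shipping?”; the snapshot cannot. If you extract only snapshots from an event-driven source, that class of question is gone before the data ever reaches the warehouse.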

Step 4: Data Warehouse — Modeling for Analysis

This is where most data books start. And they’re not wrong — the warehouse is where the core modeling discipline lives. But by the time you get here through the dataflow framework, you already know what to model and why.

Three analytical modeling paradigms, each solving a different problem:

Dimensional modeling (Kimball) — stars, snowflakes, facts and dimensions. Optimised for query performance and analytical usability. The right choice when your primary consumer is a dashboard or a business analyst exploring data. If you’ve used any BI tool, you’ve used dimensional models — even if nobody told you.
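The shape of a dimensional query, stripped to its essentials (table and column names are made up): a fact table of measurements joined to a dimension of descriptive attributes, grouped by an attribute.

```python
# A minimal star: one fact table keyed to one dimension.
dim_customer = {
    1: {"name": "Acme",   "region": "EU"},
    2: {"name": "Globex", "region": "US"},
}
fact_sales = [
    {"customer_key": 1, "amount": 100},
    {"customer_key": 2, "amount": 250},
    {"customer_key": 1, "amount": 50},
]

# The canonical dimensional query: join fact to dimension,
# aggregate the measure, group by the dimension attribute.
revenue_by_region = {}
for row in fact_sales:
    region = dim_customer[row["customer_key"]]["region"]
    revenue_by_region[region] = revenue_by_region.get(region, 0) + row["amount"]

print(revenue_by_region)  # {'EU': 150, 'US': 250}
```

Every “sales by region” chart in every BI tool is some version of this join-and-group, which is why the paradigm maps so cleanly onto dashboards.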

Data Vault (Linstedt) — hubs, links, satellites. Optimised for integration across multiple source systems and full auditability. The right choice when you’re combining data from 10+ sources and need to trace every value back to its origin. Heavier to implement, but nothing else handles integration at scale the same way.
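A toy illustration of the vault pattern — hubs hold business keys, satellites hold attributes stamped with load metadata, so every value traces back to a source and a point in time. The structure follows Linstedt’s pattern; all names and records here are invented.

```python
# Hub: the business key, nothing else.
hub_customer = [{"customer_hk": "h1", "business_key": "CUST-001"}]

# Satellites: attributes per source system, with load metadata.
# History is append-only — nothing is overwritten.
sat_customer_crm = [
    {"customer_hk": "h1", "name": "Acme Corp",
     "loaded_at": "2024-01-01", "source": "crm"},
    {"customer_hk": "h1", "name": "Acme Corporation",
     "loaded_at": "2024-03-01", "source": "crm"},
]
sat_customer_erp = [
    {"customer_hk": "h1", "name": "ACME",
     "loaded_at": "2024-02-01", "source": "erp"},
]

def current(satellite):
    """The 'current view' is a deliberate choice: latest load wins here."""
    return max(satellite, key=lambda r: r["loaded_at"])

print(current(sat_customer_crm)["name"])  # Acme Corporation
print(current(sat_customer_erp)["name"])  # ACME
```

Note what survives: the CRM and ERP disagree about the customer’s name, and the vault keeps both versions with full history instead of forcing an early, irreversible merge. That is the auditability the paragraph above is paying for.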

One Big Table — flatten everything into one wide table per domain. No star schema, no joins. The de facto standard in dbt projects on columnar warehouses where joins are expensive and storage is cheap. Nobody writes about it because it feels too obvious. But it’s full of trade-offs practitioners navigate daily without guidance.
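The OBT move, sketched with illustrative names: pre-join the dimension onto every fact row at build time, so consumers never join at query time.

```python
fact_sales = [
    {"customer_key": 1, "amount": 100},
    {"customer_key": 1, "amount": 50},
]
dim_customer = {1: {"name": "Acme", "region": "EU"}}

# Flatten: one wide row per fact, dimension attributes copied in.
obt_sales = [{**row, **dim_customer[row["customer_key"]]} for row in fact_sales]

print(obt_sales[0])
# {'customer_key': 1, 'amount': 100, 'name': 'Acme', 'region': 'EU'}
```

The trade-off is visible even at this scale: “Acme” and “EU” now repeat on every row, and if the customer’s region changes, the whole table must be rebuilt — cheap on a columnar warehouse, painful anywhere else.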

No single paradigm is always right. I often combine them: vault for the integration layer, dimensional for the presentation layer, OBT for specific use cases where simplicity wins. The dataflow framework doesn’t prescribe one — it tells you when each one earns its complexity.

Related: When Do You Need a Data Warehouse?

Step 5: Dashboards & KPIs — Where Modeling Meets Reality

A well-modeled warehouse makes dashboard design trivial. A poorly-modeled one makes it impossible.

This is where upstream sins surface. If your dimensional model has the wrong grain, every KPI calculation requires workarounds. If your data vault lacks business vault constructs, your BI tool can’t self-serve — every question needs a data engineer. If your OBT has 300 columns, nobody knows which “revenue” field to use.

KPIs are modeling decisions. “Revenue” means nothing until you define: gross or net? Recognised at invoice or at payment? Including returns or excluding them? These aren’t dashboard questions. They’re questions that should have been answered at step 1 and encoded in the warehouse at step 4.
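What “encoded in the warehouse” looks like in miniature — the revenue definition written down once, as code, with every ambiguity from the paragraph above turned into an explicit parameter. Transaction fields and recognition rules here are illustrative, not a real chart of accounts:

```python
transactions = [
    {"type": "invoice", "gross": 1000, "tax": 190, "paid": True},
    {"type": "invoice", "gross": 500,  "tax": 95,  "paid": False},
    {"type": "return",  "gross": -200, "tax": -38, "paid": True},
]

def net_revenue(txs, recognize_at="invoice", include_returns=True):
    """Net revenue under one explicit definition: agreed at step 1,
    encoded at step 4, consumed by every dashboard at step 5."""
    rows = txs
    if recognize_at == "payment":
        rows = [t for t in rows if t["paid"]]
    if not include_returns:
        rows = [t for t in rows if t["type"] != "return"]
    return sum(t["gross"] - t["tax"] for t in rows)

print(net_revenue(transactions))                          # 1053
print(net_revenue(transactions, recognize_at="payment"))  # 648
```

Same data, two legitimate “revenue” numbers. When the definition lives in one place like this, the CFO and the sales dashboard disagree on parameters, not on arithmetic.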

When this works, dashboards answer the questions leadership asked in the alignment session. The COO sees delivery times by region and root cause. The CFO sees reconciled revenue across all three systems. The Head of Sales sees pipeline that matches the CRM because it comes from the CRM through a modeled, validated path.

Step 6: Feedback — Closing the Loop

Here’s what no other framework tells you: dashboards change the business.

When the COO sees that delivery times in region X are 40% higher than region Y, they change the process. When processes change, source systems change. When source systems change, the warehouse needs updating. When the warehouse changes, dashboards need updating.

This is the loop. Strategy → process → systems → warehouse → dashboards → new questions → strategy. It never stops. A data architecture that can’t absorb change is a data architecture with an expiration date.

Most organizations get stuck here. They build a dashboard, declare victory, and move on. Six months later, the numbers don’t match reality anymore because three upstream processes changed and nobody told the data team.

I close this loop with documentation that captures why a model looks the way it does — not just what it contains. Decision records, not just schema diagrams. When a business process changes, you can trace which models are affected and adapt them before the dashboard breaks.
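One lightweight way to make such decision records queryable — a sketch, with every model name, process name, and rationale invented for illustration:

```python
# Each record ties a model to the business process it depends on,
# plus the "why" that a schema diagram can't capture.
decision_records = [
    {"model": "fct_orders", "process": "order-to-cash",
     "why": "grain = one row per confirmed order; drafts excluded by decision"},
    {"model": "dim_customer", "process": "crm-onboarding",
     "why": "CRM is the master system for customer attributes"},
]

def affected_models(changed_process, records):
    """When a process changes, list the models to review
    before the dashboards built on them break."""
    return [r["model"] for r in records if r["process"] == changed_process]

print(affected_models("order-to-cash", decision_records))  # ['fct_orders']
```

The format matters less than the habit: when someone announces a process change, the data team can answer “what does this touch?” in seconds instead of finding out from a broken dashboard.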

Related: Data Culture: The Context for Successful Data Projects

Why Full-Chain Thinking Matters

You can be excellent at dimensional modeling and still deliver a failed project — because the business process wasn’t understood, or the source system was misread, or the KPI definition was never agreed on.

The hard part of this profession isn’t any single discipline. It’s keeping the full chain in scope at once. A change in business strategy reshapes processes. Processes reshape source systems. Source systems reshape the warehouse. The warehouse reshapes dashboards. Dashboards reshape strategy.

Every step is a whole field of study. But you have to hold them all at once.

That’s what the dataflow framework is for. Not a methodology to follow blindly, but a mental model that tells you where you are in the chain, what’s upstream of you, and what’s downstream of you. So when something breaks, you know where to look. And when you build something new, you know what it needs to survive.

Apply It

Three ways to use this framework today:

  1. Audit your current chain. Where does it break? Is there a step you’re skipping or a handoff where context gets lost?

  2. Start your next project at step 1. Not at the tool. Not at the warehouse. At the business question. Two hours with leadership will save you months of rework.

  3. Download the Dataflow Template. A structured worksheet that walks you through each step of the framework for your own organization.

Want to apply this framework to your organization? Book a call and let’s map your dataflow together.