
What Enables Connected Decision Systems: Inside a Modern Analytics Platform

By Syed Hussnain Sherazi | 2026-05-07 | Analytics Platforms | Data Architecture | Decision Systems

A plain-English explanation of the modern analytics platform layers that support connected decision systems.

Breaking down the engine that powers data-driven organisations: one layer at a time

Not long ago, building an analytics platform required a team of specialists: developers, data scientists, infrastructure engineers. It was expensive, slow, and fragile. One person leaving could bring the whole thing down.

Today that picture looks very different. Modern analytics platforms have compressed what used to take months of custom engineering into something you can configure in weeks. But understanding why they work, and what makes one platform better than another for your specific situation, requires understanding what is actually happening under the hood.

So let me walk you through the anatomy of a modern analytics platform. Not the marketing version. The real version.

The Core Problem These Platforms Solve

Before we get into architecture, it helps to understand the problem.

Organisations have data coming from dozens or hundreds of different sources: transactional databases, SaaS applications, cloud services, IoT devices, spreadsheets, APIs, streaming events. All of this data has different formats, different update frequencies, different schemas, and different levels of quality.

At the same time, people across the business need to ask questions of that data: now, not next week. They need answers they can trust, not "it depends on which system you look at."

A modern analytics platform is the layer between messy, scattered source data and confident, reliable decisions. Let me show you how it does that.

The Architecture of a Modern Analytics Platform

Visual summary of the workflow: data sources (SQL and NoSQL databases; SaaS apps such as Salesforce, SAP, and HubSpot; streaming from Kafka, IoT devices, and application events; files in CSV, JSON, and Parquet) feed the ingestion and integration layer (batch ETL/ELT pipelines and real-time streams via CDC and an event bus).

Let me go through each layer and explain what it does and why it matters.

Layer 1: Data Sources

This is where your data actually lives before the platform touches it. It includes:

  • Operational databases: your core transactional systems, the databases behind your applications
  • SaaS platforms: Salesforce, SAP, HubSpot, Shopify, whatever your teams are using day to day
  • Streaming data: real-time events from applications, IoT sensors, clickstreams
  • Files and flat data: the CSVs, Excel files, and JSON exports that nobody wants to admit are still everywhere

Most platforms do not move your source data. They read from it. The golden rule here is: do not touch the source. You ingest a copy; you never modify the original.

Layer 2: Ingestion and Integration

This layer is the plumbing. It is responsible for moving data from sources into your platform reliably, at the right frequency, and without losing anything.

There are two main patterns:

Batch ingestion pulls data on a schedule: every hour, every night, every week. It is simpler to build and works well when data freshness is not critical. Your monthly finance report does not need real-time data.

Streaming ingestion pulls data as it is generated: milliseconds to seconds of latency. This is necessary for use cases like fraud detection, live inventory tracking, or customer-facing personalisation engines.
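
To make the streaming pattern concrete, here is a minimal consumer sketch using the kafka-python client. The topic name, broker address, and message fields are assumptions for illustration, not a reference to any particular platform.

    # Minimal streaming-ingestion sketch using the kafka-python client.
    # The "orders" topic, broker address, and message fields are
    # illustrative assumptions.
    import json

    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "orders",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    # Each event arrives within milliseconds to seconds of being produced,
    # which is what makes fraud detection or live personalisation possible.
    for message in consumer:
        event = message.value
        print(event.get("order_id"), event.get("amount"))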

Most mature platforms support both. The key capability to look for here is Change Data Capture (CDC): the ability to detect and capture only the rows that changed in a source database, rather than pulling the entire table every time. This makes ingestion dramatically more efficient at scale.
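
True CDC tools read the database's transaction log (Debezium is a common example). A simpler way to see the idea is a watermark-based incremental pull, sketched below with Python's built-in sqlite3 module; the table and column names are invented for the example.

    # Watermark-based incremental pull: a simplified stand-in for CDC,
    # which in practice reads the database transaction log instead.
    # Table and column names are illustrative assumptions.
    import sqlite3

    def pull_changed_rows(conn, last_watermark):
        """Fetch only the rows modified since the previous run."""
        rows = conn.execute(
            "SELECT id, name, updated_at FROM customers WHERE updated_at > ?",
            (last_watermark,),
        ).fetchall()
        # Persist the new watermark so the next run starts where this one
        # left off, instead of re-reading the whole table.
        new_watermark = max((r[2] for r in rows), default=last_watermark)
        return rows, new_watermark

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, updated_at TEXT)")
    conn.execute("INSERT INTO customers VALUES (1, 'Acme Ltd', '2026-05-01T10:00:00')")
    print(pull_changed_rows(conn, "2026-04-30T00:00:00"))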

Layer 3: Storage

Once data arrives, you need somewhere to store it. Modern platforms have moved away from the traditional "put everything in the data warehouse" approach toward a tiered storage model, often called the Medallion Architecture.

Bronze layer (raw zone): Data lands here exactly as it came from the source. No cleaning, no transformations. This is your safety net: if something goes wrong downstream, you can always reprocess from here.

Silver layer (cleaned zone): Data has been validated, deduplicated, standardised, and enriched. This is where raw customer records become clean customer records. This layer is where most of the data engineering effort happens.

Gold layer (domain zone): Business-ready data modelled for specific use cases. Sales reporting tables. Marketing attribution models. Finance consolidation views. The gold layer is what most business users actually query.

This three-tier model keeps raw data safe, ensures clean data is available for analysis, and makes business logic explicit and auditable.
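
Here is the three-tier flow in miniature, using pandas. The columns and cleaning rules are invented for the example; in production each layer would be a persisted table, not an in-memory DataFrame.

    # Bronze -> silver -> gold in miniature. Column names and cleaning
    # rules are illustrative assumptions.
    import pandas as pd

    # Bronze: land the data exactly as it arrived, no cleaning.
    bronze = pd.DataFrame({
        "customer_id": [1, 1, 2, None],
        "order_total": ["100.0", "100.0", "250.5", "75.0"],
        "country":     ["uk", "uk", "DE", "fr"],
    })

    # Silver: validate, deduplicate, standardise.
    silver = (
        bronze
        .dropna(subset=["customer_id"])   # validate: require a customer
        .drop_duplicates()                # deduplicate exact repeats
        .assign(
            order_total=lambda d: d["order_total"].astype(float),
            country=lambda d: d["country"].str.upper(),  # standardise codes
        )
    )

    # Gold: model for a specific use case, e.g. revenue by country.
    gold = silver.groupby("country", as_index=False)["order_total"].sum()
    print(gold)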

Layer 4: Transformation and Modelling

Raw data almost never reflects business reality. A timestamp in a database is not the same as "the date a sale was recognised for revenue purposes." A user ID in a clickstream is not the same as a customer in your CRM.

Transformation is the process of applying business logic to turn raw data into meaningful data. This is where tools like dbt (data build tool) have become extremely popular: they allow you to define transformations as SQL code, version-control them, test them, and document them in one place.
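
In dbt, this logic lives in version-controlled SQL models. The sketch below expresses the same idea with plain SQL run through sqlite3: the table, columns, and the recognition rule itself (recognise revenue on shipment, not on order) are invented examples, not dbt's own syntax.

    # dbt models are essentially SELECT statements kept under version
    # control; this query sketches that idea. Table, columns, and the
    # business rule are illustrative assumptions.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (id INTEGER, ordered_at TEXT, shipped_at TEXT)")
    conn.execute("INSERT INTO raw_orders VALUES (1, '2026-05-01', '2026-05-03')")
    conn.execute("INSERT INTO raw_orders VALUES (2, '2026-05-02', NULL)")

    # The "model": raw timestamps become a revenue-recognition date.
    # Here the (invented) rule is that revenue is recognised on shipment.
    recognised = conn.execute("""
        SELECT id, shipped_at AS revenue_recognised_on
        FROM raw_orders
        WHERE shipped_at IS NOT NULL
    """).fetchall()
    print(recognised)  # [(1, '2026-05-03')]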

For machine learning workloads, this layer also includes a feature store: a centralised repository of pre-computed features (things like "customer's average order value in the last 90 days") that can be reused across multiple models without duplication.
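
A feature store entry is, at heart, a pre-computed, named table. The sketch below computes the 90-day average order value mentioned above with pandas; the column names and dates are invented.

    # Pre-computing a reusable feature: average order value over the
    # last 90 days, per customer. Columns and dates are assumptions.
    import pandas as pd

    orders = pd.DataFrame({
        "customer_id": [1, 1, 2],
        "order_total": [100.0, 200.0, 50.0],
        "order_date":  pd.to_datetime(["2026-04-20", "2026-02-15", "2026-05-01"]),
    })

    cutoff = pd.Timestamp("2026-05-07") - pd.Timedelta(days=90)
    features = (
        orders[orders["order_date"] >= cutoff]
        .groupby("customer_id", as_index=False)["order_total"].mean()
        .rename(columns={"order_total": "avg_order_value_90d"})
    )
    # Stored once in the feature store, this table can feed a churn
    # model, a recommender, and a forecast without recomputation.
    print(features)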

Layer 5: The Semantic Layer

This is perhaps the most underappreciated layer in modern analytics, and also one of the most important.

The semantic layer sits between the physical data and the people consuming it. Its job is to define what things mean in business terms. What is a "customer"? What counts as "revenue"? How is "conversion rate" calculated?

Without a semantic layer, every team builds their own version of these metrics. The result is the "which number is right?" moment every analyst dreads.

A good semantic layer also provides data lineage: the ability to trace any number back to its source. If a CEO asks "where does this revenue figure come from?", you should be able to show exactly which tables, transformations, and calculations produced it. This is what builds trust in data.
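
Real semantic layers have their own definition formats (LookML, dbt's semantic layer, and others), but the core idea reduces to something like this sketch, where metric names, formulas, and table names are invented examples.

    # The core idea of a semantic layer: metric definitions live in one
    # place, in business terms, with lineage back to physical tables.
    # All names and formulas below are illustrative assumptions.
    METRICS = {
        "revenue": {
            "definition": "SUM(order_total)",
            "source_tables": ["gold.fct_orders"],
            "filters": "status = 'completed'",
        },
        "conversion_rate": {
            "definition": "COUNT(DISTINCT buyer_id) / COUNT(DISTINCT visitor_id)",
            "source_tables": ["gold.fct_sessions", "gold.fct_orders"],
            "filters": None,
        },
    }

    def lineage(metric):
        """Trace a metric back to the physical tables it is computed from."""
        return METRICS[metric]["source_tables"]

    print(lineage("revenue"))  # ['gold.fct_orders']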

Layer 6: The Consumption Layer

This is where humans and machines actually use the data.

Business Intelligence tools like Power BI and Tableau let non-technical users build reports, explore data, and monitor KPIs through dashboards.

Notebooks and SQL editors give analysts and data scientists the flexibility to do ad-hoc analysis, build models, and explore data in ways that dashboards cannot.

Machine learning models consume prepared data from the platform to make predictions: churn risk, demand forecasting, product recommendations.

Embedded analytics push insights directly into the applications where decisions are actually made. Think of a sales rep seeing a churn risk score directly in their CRM, without ever opening a separate analytics tool.

The best platforms make it easy to serve all of these consumption patterns from the same underlying data.

Layer 7: Governance and Security

This layer is not really a "layer" in the architectural sense: it applies across everything. But it is often what separates a platform your organisation will actually trust and use from one that creates compliance risk and political battles.

The key components are:

  • Access control: Who can see what? Sensitive financial data should not be visible to everyone. Customer PII needs to be masked or restricted based on role.
  • Data quality monitoring: Automatic checks that catch data issues before they reach business users. If a pipeline breaks and revenue figures go to zero overnight, you want to know before the CFO opens their dashboard (a minimal check of this kind is sketched after this list).
  • Compliance and audit trails: For regulated industries, you need to prove who accessed what data, when, and why. GDPR, SOX, HIPAA: different industries have different requirements, but all of them care about this.
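
Here is what such a revenue sanity check might look like; the column name and the specific checks are assumptions for illustration.

    # A minimal data quality check: catch a broken pipeline before the
    # dashboard does. Column name and thresholds are assumptions.
    import pandas as pd

    def check_revenue_sanity(daily_revenue):
        """Return a list of failures; an empty list means the data passed."""
        if daily_revenue.empty:
            return ["no rows loaded: the upstream pipeline may have failed"]
        failures = []
        if daily_revenue["revenue"].isna().any():
            failures.append("null revenue values found")
        if (daily_revenue["revenue"] <= 0).any():
            failures.append("non-positive revenue values found")
        return failures

    print(check_revenue_sanity(pd.DataFrame({"revenue": [1200.0, 0.0]})))
    # ['non-positive revenue values found']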

What Makes a Platform "Modern"

If I were evaluating a platform today, here is what I would look for:

1. Unified experience: can data engineers, analysts, and scientists all work in the same environment without constant handoffs?
2. Open formats: is the data stored in open standards like Delta Lake or Apache Parquet, or am I locked into proprietary formats?
3. Lakehouse capability: can I run both analytics and AI/ML workloads on the same data, without moving it between systems?
4. Built-in governance: are security and compliance native, or do they require additional tools bolted on?
5. Real-time support: can the platform handle streaming data as well as batch, or do I need a separate system?

Platforms like Microsoft Fabric, Databricks, and Snowflake are all competing in this space right now. Each has strengths and trade-offs, which I will cover in upcoming posts.

Closing Thought

A modern analytics platform is not a single product. It is an architecture: a set of layers that work together to get data from source systems to decision-makers in a way that is reliable, trustworthy, and fast.

Understanding these layers does not just help you pick the right technology. It helps you diagnose why your current setup is failing and where the real bottlenecks are. And in my experience, that understanding is far more valuable than any tool you can buy.

Next in this series: Building a Simple Decision-Support Workflow Using Microsoft Fabric: from raw data to actionable insight in a single platform.
