August 11, 2025
ODS and Data Marts in BI
What is an ODS (Operational Data Store)
An Operational Data Store (ODS) is a centralised database that consolidates up-to-date data from various operational systems within a company.
Data is loaded into an ODS almost in real time and is stored either without historical changes or with only a very limited history. The primary purpose of an ODS is to provide operational reporting based on current data without waiting for a full load into the corporate data warehouse (DWH). In simple terms, an ODS contains the freshest, most detailed information about business transactions and is updated very frequently (for example, every few minutes or hours).
An ODS reduces the load on live (operational) systems: analytical queries are directed to the ODS rather than to production databases, ensuring that the performance of core applications is not affected. The structure of an ODS is usually close to that of the source systems (data kept near its native format), possibly with minimal transformation and cleansing.
Unlike a data warehouse, an ODS is not intended for long-term storage or deep historical analysis — old data in the ODS is replaced with new data, and historical information is either not stored at all or kept only briefly. Therefore, an ODS does not replace a full-scale DWH but serves operational needs: quick same-day reports, monitoring of current metrics, situation rooms, and similar use cases.
Example: for an e-commerce business, an ODS might, throughout the day, aggregate the latest orders, stock updates, and delivery statuses — information needed for near real-time executive dashboards. At night, the full day’s data is transferred to the main data warehouse for long-term analytics, while the ODS is refreshed the next day with the latest operational data.
What is a Data Mart
A Data Mart is a dedicated repository containing a structured set of data on a specific business topic or function.
It is typically created for a particular group of users or a specific department to meet their information needs. Unlike a corporate data warehouse (DWH), a data mart does not cover all business processes, but focuses on a single subject area (for example, sales, marketing, or inventory) and contains only the necessary, cleansed data without extraneous information.
Example: a marketing data mart might aggregate customer and sales information from various company systems.
In most cases, data within a data mart is pre-aggregated and organised in a convenient structure (often in a star schema or snowflake schema), making report creation and analysis simpler. The data volume in a data mart is relatively small compared to a corporate data warehouse or a data lake, which allows queries to run faster. A data mart may store both detailed transactions and aggregated figures — depending on analytical needs. However, data unrelated to the subject area is generally absent, and the historical period may be limited to what the department requires (for instance, only the past few years of sales data in a sales data mart).
Example use case: a marketing department may have a data mart that consolidates all customer interaction data: contracts, order history, payments, and responses to advertising campaigns. This mart serves as the primary data source for marketing analysts, enabling them to quickly get answers to their questions without accessing the entire corporate DWH. Another example is a finance data mart, which contains aggregated financial indicators from the DWH, making it easier for the finance department to prepare reports.
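The star schema mentioned above can be illustrated with one fact table surrounded by dimension tables. A minimal sketch using SQLite (all table and column names are hypothetical, not taken from any real mart):

```python
import sqlite3

# In-memory database standing in for a small sales data mart.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes.
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, segment TEXT);
    CREATE TABLE dim_date     (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
    -- Fact table: one row per sale, foreign keys into the dimensions.
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        date_id     INTEGER REFERENCES dim_date(date_id),
        amount      REAL
    );
""")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(1, "Acme Ltd", "SMB"), (2, "Globex", "Enterprise")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2025-08-01", "2025-08"), (2, "2025-08-02", "2025-08")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 100.0), (2, 2, 1, 250.0), (3, 1, 2, 50.0)])

# A typical mart query: revenue by customer segment -- one join, one GROUP BY.
rows = conn.execute("""
    SELECT c.segment, SUM(f.amount)
    FROM fact_sales f JOIN dim_customer c USING (customer_id)
    GROUP BY c.segment ORDER BY c.segment
""").fetchall()
print(rows)  # [('Enterprise', 250.0), ('SMB', 150.0)]
```

The simplicity of the query is the point: analysts join one fact table to a handful of clearly named dimensions instead of navigating dozens of normalised source tables.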
Differences between an ODS and a Data Mart
Although both an ODS and data marts are components of the data ecosystem, their roles and characteristics differ significantly. The key distinctions can be summarised as follows:
- Data freshness and retention period: an ODS holds operational, up-to-date data and supports near-continuous updates. Historical information in an ODS is either absent or retained for only a very short time. A data mart, on the other hand, can store accumulated data for a defined period (for example, several years of sales) to enable trend analysis. It is updated periodically (e.g., daily or weekly), rather than every minute.
- Purpose: an ODS is designed for operational reporting and fast transactional queries — essentially, it is a source for “today/now” reports that show the current state of processes. A data mart is focused on analytics and business reporting for a specific topic — for example, building dashboards, performing OLAP analysis, or calculating KPIs over a given period. Put simply, an ODS answers the question “What is happening right now?”, whereas a data mart answers “What are the results and metrics for a given period or area?”.
- Data structure: in an ODS, data is often stored in a format close to the operational systems (third normal form or even straightforward copies of source tables), which facilitates quick data updates and integration. In a data mart, the data is usually denormalised to optimise reading: star or snowflake schemas, pre-calculated metrics, and aggregates are used. This speeds up complex queries, but data marts are typically read-only for end users.
- Volume and performance: an ODS handles a relatively small volume of data — only the most current records. Queries to an ODS are simple and specific (e.g., retrieving the current order status). A data mart may contain a much larger volume (accumulated data) but is designed for efficient execution of analytical queries, including complex joins and aggregations. By using pre-aggregated and topic-specific data, data marts reduce the load on the central data warehouse and accelerate reporting.
Summary: an ODS and a Data Mart complement rather than replace each other. An ODS ensures data timeliness, while a data mart provides convenience and performance for analytics in a specific subject area. They can exist independently (in some cases, only one is used), but in complex BI landscapes, they are often implemented together.
Together or instead: should ODS and Data Marts be used in combination?
An ODS and a data mart are not direct substitutes — each serves its own purpose. The choice of whether to use one, the other, or both depends on the organisation’s need for data freshness and depth of analysis.
- When data marts alone are sufficient (without an ODS): in smaller organisations, or when real-time updates are not required, a separate ODS may be unnecessary. If daily reporting is sufficient, data can be loaded directly from source systems into the DWH and then into thematic data marts. Data marts refreshed once a day will meet most needs, while operational queries can be run directly against transactional systems or against the previous day’s marts. In such cases, implementing an ODS would only add unnecessary architectural complexity.
Example: an SMB with a single primary source (such as 1C or a small ERP) can update its sales mart nightly for analysis — same-day data is not needed, and figures from the previous day are sufficient.
- When only an ODS is used (without a full data mart model): if the primary requirement is monitoring ongoing operations, with little need for complex analytical queries, a company may use an ODS directly for reporting. This happens when data freshness is valued more than multidimensional analysis.
Example: a support service might use an ODS showing the current status of tickets in real time to track incidents. Separate marts are unnecessary because historical metric analysis is minimal — the priority is seeing the “here and now”. However, it should be noted that ODS-based reports are often more basic, and as analytical needs grow, a move to data marts or a full DWH may be required.
- When an ODS and DWH/data marts are used together: in medium and large companies, ODS and data marts are often deployed in tandem, complementing each other. This setup meets both the need for fresh operational data and the need for in-depth analysis.
Scenario: during the day, business users view current metrics (for today) from the ODS. For trend analysis, plan-vs-actual comparisons, and consolidated reporting, they use data marts refreshed once a day. The ODS also serves as a source for loading marts — nightly batch processes extract the day’s accumulated data from the ODS and load it into the warehouse/marts. This two-tier architecture (ODS as the operational layer + data mart as the presentation layer) separates workloads: during the day, numerous small transactional queries go to the ODS, while heavy analytical processing runs on the marts at other times. The advantage is the elimination of competition between operational and analytical workloads (they are isolated). The drawback is more complex ETL processes and the need to synchronise layers — but modern orchestrators (such as Airflow) can automate this.
In practice, smaller companies often start with simpler solutions — for example, building several marts directly on top of a single database, without an ODS or even a full DWH. This delivers quick business value. But as data volumes and requirements grow, the need for a more sophisticated architecture emerges. In large enterprises, all layers are typically present: ODS, corporate DWH, and departmental data marts.
Thus, ODS and data marts are not mutually exclusive but complementary components of a BI ecosystem.
Architecture for integrating ODS and Data Marts
A typical BI data architecture may include several layers: data sources, an intermediate layer (staging/ODS), the corporate data warehouse core, and the data mart layer. Here’s how ODS and data marts fit into these designs:
- ODS as a source for data marts. In the classic three-tier data warehouse architecture, the ODS sits before the DWH core. Raw data first lands in the staging area — a temporary loading zone where it is cleansed and standardised. It is then integrated and updated in the ODS, which stores a consolidated set of detailed data for a short period, in the most current form. On a schedule (e.g., once per day), data from the ODS/staging layer is transferred to the DWH core — the historical data store with a complete history in a normalised structure. From the core, data marts are created (denormalised, subject-specific structures). The resulting pipeline is: Sources → ODS → DWH core → Data marts → BI reports. In this setup, data marts are typically dependent — they source their data from the DWH core. The ODS plays an indirect role, ensuring timely data delivery to the warehouse.
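The pipeline above (Sources → staging → ODS → DWH core → Data marts) can be sketched as a chain of small functions. This is a toy illustration of the layering only — field names and values are invented, and a real implementation would persist each layer in a database:

```python
from datetime import date

# Hypothetical raw records arriving from two source systems.
crm_rows = [{"order_id": 1, "amount": "100.50", "ts": "2025-08-11"}]
erp_rows = [{"order_id": 2, "amount": "80.00",  "ts": "2025-08-11"}]

def staging(*sources):
    """Land raw rows unchanged in a temporary loading zone."""
    return [row for src in sources for row in src]

def ods(staged):
    """Cleanse and standardise: typed fields, current state only."""
    return [{"order_id": r["order_id"],
             "amount": float(r["amount"]),
             "ts": date.fromisoformat(r["ts"])} for r in staged]

def dwh_core(history, current):
    """Append today's detail to the full normalised history."""
    return history + current

def sales_mart(core):
    """Denormalised, aggregated view for BI: revenue per day."""
    out = {}
    for r in core:
        out[r["ts"]] = out.get(r["ts"], 0.0) + r["amount"]
    return out

history = []  # yesterday's core (empty in this toy run)
core = dwh_core(history, ods(staging(crm_rows, erp_rows)))
print(sales_mart(core))  # {datetime.date(2025, 8, 11): 180.5}
```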
- ODS as an “operational mart” for near-real-time analytics. In some architectures, the ODS itself is used as a data mart for certain BI reports. For example, there may be an operational data mart — essentially, a reporting layer on top of the ODS for cases where extremely fresh data is needed (situation rooms, monitoring dashboards). These marts pull data directly from the ODS, bypassing the DWH layer to minimise latency. In this scenario, the ODS plays a dual role: both as a source for the DWH and as a direct source for some reports. It is important to optimise queries to an ODS-based mart, as the ODS stores data in a production-like (normalised) structure. In some cases, views or lightweight aggregates are built on the ODS to make direct reporting easier.
- Independent data marts. Sometimes, a data mart is built directly from source systems without a full DWH. These independent marts often emerge when a company lacks the resources for a large-scale warehouse project. For example, BI analysts might build a sales mart by extracting data weekly from CRM/ERP systems for reporting. This accelerates implementation but has drawbacks: data may be inconsistent between departments’ marts. Independent marts can be seen as a simplified architecture: Sources → Mart → Reports,
where the mart handles both integration and storage for a single subject area. In this case, the mart itself may act as a quasi-ODS (if refreshed frequently), or there may be no dedicated ODS at all.
- Two-tier architecture (ODS + marts without a DWH core). As a compromise, some companies implement an ODS and a mart layer without a heavyweight DWH core. The ODS stores current detailed data, while marts are built in parallel to aggregate this data for different departments. In effect, the ODS handles integration and short-term storage, and the marts provide longer-term, topic-based storage. This approach works where maintaining a full corporate core is unnecessary or impractical, but it requires strict data quality control and consistent business rules in the creation of each mart.
Example: an ODS on PostgreSQL collects data from multiple systems. Based on it, scripts generate a “Sales” mart in ClickHouse and a “Marketing” mart in Oracle. Each department gets its own mart, and the ODS ensures consistency of the source data.
Summary: the ODS is generally positioned closer to the data sources, ensuring rapid ingestion. Data marts are closer to the BI tools, providing user-friendly access to prepared datasets. In a complex architecture, all layers work together: the ODS supplies current data, the DWH core stores history and corporate reference data, and data marts deliver exactly the information each business unit needs.
Tools and technologies: Power BI, Apache Superset, Bitrix24
Modern BI solutions and data visualisation tools work hand in hand with ODS and data marts, enabling business users to gain valuable insights. Let’s look at some popular technologies and their roles.
Microsoft Power BI (and its Data Marts)
Power BI is one of the most widely used BI platforms, offering tools for data preparation, modelling, and interactive reporting. In the context of ODS and data marts, Power BI can operate on two levels: as a consumer of existing marts (by connecting to prepared data marts or ODS) and as a means for business users to create simplified, self-service data marts.
Power BI includes the concept of datasets and, more recently, has introduced the Datamart feature. A Power BI Datamart is essentially a self-service cloud-based data mart that allows users to import data from various sources, perform transformations (ETL), and store the results in a fully managed Azure SQL database behind the scenes. Power BI automatically builds a semantic model on top of this database, so users can immediately create reports and dashboards without IT intervention. In other words, a Power BI Datamart enables business users to assemble a mini–data warehouse for their domain — with tables, relationships, and measures — without needing to know SQL or configure servers.
Practical example: a finance department, without waiting for a corporate DWH to be built, can use Power BI Datamart to load exports from the accounting system, Excel budget files, and CRM payment data. Using Power Query, they transform the data and save it into a mart (Azure SQL) inside Power BI. This mart then becomes available for reporting directly within Power BI — and even external tools or SQL clients can connect to it via the provided T-SQL endpoint. It acts as a bridge between business and IT: users get a “mini-DWH” tailored to their needs quickly and without database administration. (It’s worth noting that, as of 2025, Microsoft has announced that Datamart will be integrated into the Fabric platform, but the self-service mart concept remains.)
Another common Power BI use case is connecting to existing sources — whether to mart tables in SQL Server/PostgreSQL or even directly to an ODS. Power BI supports DirectQuery mode to databases, allowing reports to be built on near real-time operational data (for example, connecting directly to an ODS in SQL Server and seeing updates almost instantly). However, direct ODS connections must consider workload impact: it is often better to feed Power BI from a mart or cube so that heavy computations do not burden the ODS.
Summary: Power BI is a versatile tool. It can consume data from the mart layer (recommended for performance), connect directly to ODS/operational databases when needed, and — in the absence of an existing infrastructure — even act as a mart itself via the Datamart feature.
Apache Superset
Apache Superset is a popular open-source platform for data visualisation and exploration. Functionally, Superset is similar to Power BI (dashboards, charts), but its architecture and use cases have some distinct differences.
Superset does not store data itself — instead, it connects to existing databases and data sources, executing SQL queries to retrieve chart data. This makes it a natural fit as a visualisation layer on top of data marts or a DWH.
Key characteristics: Superset is a lightweight web-based BI interface that is both fast and highly scalable because it leverages the processing power of your database rather than pulling all data into memory. It supports connections to most SQL-compliant databases, including modern cloud warehouses and analytical engines — from PostgreSQL and MySQL to Snowflake, ClickHouse, BigQuery, and more. It offers a simple no-code visual chart builder for end users, as well as a full SQL IDE for analysts, making it suitable for teams with varying skill levels.
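Concretely, Superset identifies each database connection by a SQLAlchemy URI. The hosts, credentials, and database names below are illustrative, and the exact ClickHouse dialect name depends on which driver package is installed:

```python
# Example SQLAlchemy URIs of the kind Superset accepts when registering
# a database. Generic shape: dialect+driver://user:password@host:port/database
POSTGRES_URI   = "postgresql://bi_user:secret@pg-host:5432/sales_mart"
CLICKHOUSE_URI = "clickhousedb://bi_user:secret@ch-host:8123/marketing_mart"

def dialect(uri: str) -> str:
    """Extract the dialect part of a SQLAlchemy URI."""
    return uri.split("://", 1)[0]

print(dialect(POSTGRES_URI))  # postgresql
```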
In the context of ODS/data marts, Superset is typically used as follows: the data mart (or DWH) serves as the source, and Superset is the tool for building dashboards on top of it.
Example: a company builds sales and marketing data marts in ClickHouse. To give marketers and executives a convenient interface, they deploy Superset, connect it to these ClickHouse marts, and create interactive dashboards. Since Superset does not require its own data-loading layer (it works directly with the existing data infrastructure), it is easy to implement without reworking pipelines. This also means that Superset’s report performance depends on the optimisation of the marts — it is often paired with columnar databases (ClickHouse, Druid, etc.) for near-instant query responses.
It is worth noting that Superset can also connect to operational databases or an ODS for simple real-time reporting. However, its main strength is revealed when working with aggregated data from marts, where it can visualise large data volumes effectively. Superset supports query caching, which speeds up chart loading on repeat views.
Summary: Apache Superset is an efficient solution for building BI dashboards on top of your data marts, particularly when open-source technology and scalability are priorities.
Bitrix24 and data integration
Bitrix24 is a popular business management platform (cloud CRM + collaboration tools). Many SMB companies use Bitrix24 to manage deals, clients, and tasks. When implementing BI, a key question arises: how can you extract data from Bitrix24 for analytics? In our context, Bitrix24 data can be one of the sources feeding into an ODS or data marts.
By default, Bitrix24 offers basic built-in reports and a BI Builder with a set of predefined datasets (ready-made selections of leads, deals, tasks, etc.). These datasets allow users to build charts and KPIs directly within Bitrix24. However, for advanced analytics, companies often export data from Bitrix24 into external storage.
There are several approaches to integrating Bitrix24 with BI systems:
- Using ready-made Bitrix24 data marts. In the Bitrix24 Marketplace, there are applications known as “ready-made data marts” for BI. Essentially, these solutions extract Bitrix24 data and present it in a convenient structure for external analytics tools. For example, the “Ready-made marketing and sales dashboard” application builds marts for leads, deals, and advertising, making it easy to connect Power BI, Google Data Studio, Yandex DataLens, and others. Bitrix24 itself states that “ready-made data marts allow you to quickly build dashboards in Power BI, Looker Studio… Tableau.” In other words, Bitrix24 provides a pre-built mart (a set of tables/views) — all that remains is to connect your BI tool.
- Replicating Bitrix24 data into an ODS/database. Another approach is to use integration tools that synchronise Bitrix24 data to an external database in near real time. For example, the “Sync Master” application exports deals, leads, contacts, tasks, and other Bitrix24 objects into your own MySQL database, refreshing data on a schedule (as often as every hour or upon each change). This effectively creates an ODS containing Bitrix24 data — a CRM data copy that can be freely queried. The advantage is that the external database is not limited by Bitrix24’s API constraints and can be joined with other sources. Once synchronisation is set up, analysts can connect Power BI or Superset directly to this MySQL database and build reports (for example, in Power BI via the MySQL connector). Bitrix24-recommended connectors can export everything — from leads to calls — without restrictions, ensuring BI has up-to-date data.
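A core detail of any such replication is pagination: Bitrix24 list methods return data in pages (typically 50 records) with an offset cursor. The sketch below shows the pagination loop only, driven by a fake `crm.deal.list` stand-in rather than real HTTP calls; the response shape (`result` plus an optional `next` offset) mirrors the Bitrix24 REST convention, but treat the details as an assumption to verify against the API docs:

```python
def fetch_all(fetch_page):
    """Drain a paginated Bitrix24-style list method.

    `fetch_page(start)` is expected to return a dict shaped like a
    Bitrix24 REST response: {"result": [...], "next": <offset>} while
    more pages remain, and no "next" key on the final page.
    """
    items, start = [], 0
    while True:
        page = fetch_page(start)
        items.extend(page["result"])
        if "next" not in page:
            return items
        start = page["next"]

# A fake crm.deal.list standing in for the real HTTP call (in production,
# fetch_page would GET https://<portal>/rest/crm.deal.list.json?start=<start>).
deals = [{"ID": i} for i in range(120)]

def fake_deal_list(start, size=50):
    chunk = deals[start:start + size]
    resp = {"result": chunk}
    if start + size < len(deals):
        resp["next"] = start + size
    return resp

print(len(fetch_all(fake_deal_list)))  # 120
```

The same loop, pointed at a real portal URL, is essentially what the synchronisation applications run on a schedule before writing the rows into the external database.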
- Combining with 1C and other systems. Bitrix24 is often used together with 1C (accounting/warehouse systems). In such cases, the ODS can serve as the integration point for data from Bitrix24 (CRM) and, for example, 1C (ERP). ETL tools or scripts periodically fetch new deals from Bitrix24 (via API or connectors) and new documents from 1C, storing them in the ODS. From there, a DWH or data marts are built. For instance, a “Sales” mart could merge invoices from 1C with deals from Bitrix24 to provide a complete picture. BI tools (such as Power BI) can then connect to the “Sales” mart to display the full cycle — from lead to payment.
Overall, Bitrix24 fits into the data architecture as a source of operational data, particularly for sales and customer interaction. In raw form, Bitrix24 data is often insufficient for business analytics, which is why it is extracted into separate storage (ODS/data marts). Thanks to ready-made solutions and its API, integration is relatively quick: there are connectors for many BI systems (Power BI, DataLens, Tableau, etc.). This allows even small businesses using Bitrix24 to build their own lightweight analytical mart and connect it to flexible visualisation tools.
Example BI architecture for an SMB/LMB company
Let’s consider a simplified example of a storage-and–data-mart architecture for a notional company, and how the components interact. Scenario: a mid-sized trading firm (e.g., an online shop) uses Bitrix24 for CRM and also has web analytics data (Google Analytics) and call-centre call data. Management wants to see operational sales and customer-activity metrics, while analysts need reports on profitability, the sales funnel, and marketing effectiveness.
SMB solution: the company decides to implement a small data platform with an ODS and data marts on a cloud DBMS.
- Data collection (ODS). An integration process is set up to export new deals, leads, and contact details from Bitrix24 every 15 minutes and load them into ODS tables (e.g., in PostgreSQL or MySQL in the cloud). Overnight, data from 1C — payments, stock balances, and invoices — is also ingested. In the ODS, this information is kept in operational tables, for example: Deals_ODS, Customers_ODS, Payments_ODS. The ODS is designed for fast updates — inserts/updates are frequent, older records are overwritten (e.g., 1–2 days of history). During the day the ODS holds the most current picture: open Deals_ODS and you’ll see all deals up to the last few minutes.
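The frequent-update pattern described here (upsert the latest state of each deal, drop anything outside the short retention window) can be sketched in SQLite; table and column names follow the hypothetical `Deals_ODS` naming above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE deals_ods (
        deal_id    INTEGER PRIMARY KEY,
        status     TEXT,
        updated_at TEXT   -- ISO date of the last change in the source
    )
""")

def upsert_deal(deal_id, status, updated_at):
    # Insert a new deal, or overwrite the current state of an existing one.
    conn.execute("""
        INSERT INTO deals_ods VALUES (?, ?, ?)
        ON CONFLICT(deal_id) DO UPDATE
        SET status = excluded.status, updated_at = excluded.updated_at
    """, (deal_id, status, updated_at))

def prune(before):
    # An ODS keeps only a short rolling window, so old rows are deleted.
    conn.execute("DELETE FROM deals_ods WHERE updated_at < ?", (before,))

upsert_deal(1, "new",  "2025-08-10")
upsert_deal(1, "paid", "2025-08-11")   # overwrites the earlier state
upsert_deal(2, "new",  "2025-08-09")
prune("2025-08-10")                    # enforce the 1-2 day retention

rows = conn.execute("SELECT deal_id, status FROM deals_ods").fetchall()
print(rows)  # [(1, 'paid')]
```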
- Operational reporting. A lightweight BI dashboard tool is connected directly to the ODS (say, Power BI in DirectQuery mode or Superset). A “Sales Today” dashboard is configured for executives: it reads directly from the ODS — number of new leads today, value of paid orders today, current warehouse stock. Thanks to the ODS, these metrics refresh in near real time without burdening production systems. For example, at 14:00 the Sales Director sees the day’s revenue of £X — effectively online (with a ~15-minute lag). In short, ODS + a lightweight BI tool deliver operational KPIs.
- Loading the DWH/marts. At night, when load is minimal, an ETL batch runs: it takes a day-end snapshot from the ODS (all deals, customers, payments) and loads it into the historical store. Suppose the company opts to build marts directly rather than a heavily normalised core. The ETL can then bypass a complex DWH core and populate several target marts. For example, a SalesMart is formed — it contains sales facts (deal + payment + items) and dimensions: customer, date, product, marketing channel. Data is aggregated to required grains (e.g., daily sales by product category). A MarketingMart is also built, combining web analytics (sessions, conversions) with CRM (leads, statuses). During loading, data is cleansed and additional fields are computed (ad ROI, stage lead-times, etc.).
- Storage and access to marts. SalesMart and MarketingMart are stored, for instance, in a cloud ClickHouse for fast reads or in Azure SQL. They refresh nightly. In the morning, business users get updated reports. Power BI connects to SalesMart and provides interactive dashboards: profit by product, monthly sales trends, manager performance. Because the mart is optimised for this purpose, reports are fast. The marketing team uses Superset on MarketingMart to review consolidated ad spend and acquired leads — enabling CAC, channel conversion, and similar analyses. These queries are heavier, but the mart contains the required aggregates and several years of history, making analysis feasible.
- Company growth (LMB). As the company scales (more data, branches, products), the architecture evolves. A full DWH core appears for centralised storage of all detail. The ODS still serves operational needs (e.g., branches view daily metrics on their dashboards). Data marts proliferate: separate marts for Finance, Inventory, HR, etc. Each mart sources from the trusted corporate warehouse. To improve performance, marts may be hosted on dedicated servers or in the cloud (e.g., the Sales mart on a high-performance DBMS such as Vertica or ClickHouse; the Finance mart on SQL Server). This avoids contention between departments: marketing hits its own mart, finance its own. Data governance (quality, conformed dimensions, and metric definitions) is enforced at the DWH level so all marts are aligned.
End state: for a larger business, the architecture is multi-layered — an ODS for day-to-day operations, an EDW (Enterprise Data Warehouse) as the “single source of truth”, and multiple data marts for specific needs. BI tools (Power BI, Superset, etc.) sit on top of the marts to deliver user-friendly visualisations. This separation is a proven approach: operational reporting does not slow down analytical workloads (thanks to the ODS), while analytics scales and remains performant on large volumes (thanks to marts and aggregations). For smaller businesses, some components may be combined (e.g., one database acting as both ODS and mart), but the principles remain the same.
Data optimisation techniques: incremental loads, aggregations, SLA
When building ODS and data marts, it’s not enough to design the architecture — you must also ensure it runs efficiently. Below are key optimisation practices related to data updates and reporting performance.
- Incremental data updates. Full data reloads are time-consuming and resource-intensive. It’s better to implement incremental loading — extracting only changed data (new or updated records) from the sources since the last load.
In ODS, Change Data Capture (CDC) is often used to track changes in transactional systems and immediately replicate them into the ODS. For example, using triggers or transaction logs, new deals can be appended to the ODS table every hour instead of reloading the entire table. For data marts, incremental updates are equally important: a mart refresh may involve adding only the latest day’s data (and optionally recalculating the last X days if corrections occurred). This shortens the load window and ensures ETL completes overnight. Incremental pipelines also reduce load on both the sources and the warehouse DBMS.
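A common way to implement this without full CDC infrastructure is a watermark: remember the highest `updated_at` seen so far and extract only rows beyond it. A minimal sketch, assuming every source row carries a monotonically increasing change timestamp:

```python
def extract_incremental(source_rows, watermark):
    """Return rows changed since the last load, plus the new watermark."""
    fresh = [r for r in source_rows if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in fresh), default=watermark)
    return fresh, new_watermark

source = [
    {"id": 1, "updated_at": "2025-08-10T09:00"},
    {"id": 2, "updated_at": "2025-08-11T10:30"},
    {"id": 3, "updated_at": "2025-08-11T11:45"},
]

# First run: watermark carried over from the previous load.
batch, wm = extract_incremental(source, "2025-08-10T23:59")
print([r["id"] for r in batch], wm)  # [2, 3] 2025-08-11T11:45

# Second run with no new changes: nothing to load, watermark unchanged.
batch, wm = extract_incremental(source, wm)
print(batch)  # []
```

Strict "greater than" on the watermark assumes timestamps never repeat; real pipelines often re-read a small overlap window (the "recalculate the last X days" mentioned above) to catch late corrections.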
- Pre-aggregations and metric calculation. To make BI reports fast, data marts often store pre-aggregated measures. For example, a mart may contain daily sales totals by category, averages, or pre-calculated customer ratings. This eliminates the need for BI tools to aggregate millions of rows each time a report opens — they simply query the ready-made values.
This is especially valuable for heavy metrics (e.g., marketing ROI, customer LTV) where calculations require multiple joins. It’s better to compute them during ETL and store them in the mart.
In some cases, separate aggregate marts or OLAP cubes (e.g., SSAS) are created to hold multidimensional aggregates. This can speed up reporting by orders of magnitude, although it increases storage volume. The balance is found experimentally — storing every possible aggregate is impractical, but the main ones are invaluable.
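The "compute once during ETL, read many times at report time" pattern looks like this in SQLite (a stand-in for whatever DBMS hosts the mart; names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (day TEXT, category TEXT, amount REAL)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    ("2025-08-10", "books", 40.0), ("2025-08-10", "books", 60.0),
    ("2025-08-10", "games", 30.0), ("2025-08-11", "books", 20.0),
])

# During ETL: materialise daily totals once, instead of re-aggregating
# the full fact table every time a report opens.
conn.execute("""
    CREATE TABLE agg_daily_sales AS
    SELECT day, category, SUM(amount) AS total
    FROM fact_sales GROUP BY day, category
""")

# At report time: a cheap key lookup against the pre-aggregated table.
row = conn.execute("""
    SELECT total FROM agg_daily_sales
    WHERE day = '2025-08-10' AND category = 'books'
""").fetchone()
print(row[0])  # 100.0
```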
- Storage structure optimisation (indexes, partitioning). ODS, being OLTP-like in structure, often needs indexes on fields used in reporting queries (e.g., update date, status). Time-based partitioning in ODS can help drop old partitions regularly, since historical data isn’t needed there.
For large data marts, date-based partitioning is standard: for example, splitting sales facts by month. New month loads go into their own partition, and reports on recent data don’t scan unnecessary partitions.
Many marts are stored in columnar DBMSs, which are inherently optimised for aggregation. Moving a mart from PostgreSQL to ClickHouse, for example, can reduce report times from minutes to seconds due to columnar storage and compression. The key is to ensure the chosen technology matches query patterns (OLAP vs OLTP). Sometimes it’s optimal to split: keep ODS on Postgres (good for frequent small updates) and marts on Vertica or ClickHouse (best for large analytical queries).
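The partition-pruning idea behind both points above can be shown in miniature: rows are routed to a partition keyed by month, and retention is enforced by dropping whole partitions rather than deleting row by row. A pure-Python sketch of the mechanism (real DBMSs do this declaratively):

```python
from collections import defaultdict

# Partition key = "YYYY-MM"; each key maps to that month's rows.
partitions = defaultdict(list)

def insert(row):
    partitions[row["day"][:7]].append(row)

def drop_partitions_before(month):
    # Cheap retention: discard whole partitions, no row-level scans.
    for key in [k for k in partitions if k < month]:
        del partitions[key]

for day in ["2025-06-01", "2025-07-15", "2025-08-11"]:
    insert({"day": day, "amount": 1.0})

drop_partitions_before("2025-07")   # retention policy: keep two months
print(sorted(partitions))  # ['2025-07', '2025-08']
```

Queries scoped to recent dates touch only the matching partitions, which is why date-based partitioning is the default for large fact tables.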
- Managing data refresh SLA. In this context, an SLA (Service Level Agreement) defines the guaranteed interval for data updates and availability to users. Two aspects matter for BI:
— Refresh frequency (e.g., “data for yesterday must be in the mart by 8:00 a.m. daily”);
— Report response time (e.g., how quickly a dashboard must open).
To meet SLAs, you plan load schedules and capacity accordingly. For example, if operational reports must be no more than 1 hour old, ODS updates should run at least hourly. If a mart must refresh daily by morning, ETL should run overnight with enough buffer to finish even if there are delays.
Example: users began requesting near-real-time metrics because previous-day data by noon was no longer sufficient. IT responded by updating the ODS every 15 minutes, so all critical dashboard KPIs now refresh within 30 minutes. SLA control includes monitoring — alerts if a load overruns or a report takes too long — with corrective actions ranging from query optimisation to infrastructure scaling.
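The monitoring side of SLA control often reduces to a deadline check on each load. A sketch using the standard library, with an illustrative warning margin (thresholds would be tuned per pipeline):

```python
from datetime import datetime, timedelta

def check_sla(load_finished_at: datetime, deadline: datetime,
              warn_margin: timedelta = timedelta(minutes=30)) -> str:
    """Classify a nightly load against its SLA deadline."""
    if load_finished_at > deadline:
        return "breach"    # alert: data was not ready on time
    if deadline - load_finished_at < warn_margin:
        return "warning"   # finished, but with too little buffer
    return "ok"

deadline = datetime(2025, 8, 11, 8, 0)   # "data in the mart by 8:00 a.m."
print(check_sla(datetime(2025, 8, 11, 6, 30), deadline))  # ok
print(check_sla(datetime(2025, 8, 11, 7, 45), deadline))  # warning
print(check_sla(datetime(2025, 8, 11, 8, 10), deadline))  # breach
```

The "warning" state is what lets a team react before an actual breach — a load that barely makes its window is usually the first symptom of growing volumes.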
- Separating workloads and resources. Splitting ODS and marts already helps — it separates write-heavy operational flows from read-heavy analytical flows. Further optimisation comes from infrastructure separation: hosting marts on a dedicated server or cluster so heavy analytics doesn’t slow down data updates.
In the cloud, this is simpler — you can horizontally scale the storage for marts. A Data Mesh–style approach distributes ownership by domain: each domain mart is maintained by its own team on its own resources, removing bottlenecks from a central DB.
A practical tip: if users run heavy custom queries (e.g., directly against the ODS), restrict access to marts only, or add caching layers (materialised views, aggregate tables) for popular queries.
Result: with these techniques in place, data arrives on time and reports run quickly. For example, incremental loading and partitioning enabled a major retailer to process 150 million CDC events in 4 hours each night — even as history grew, further optimisations (like moving ETL to Spark) kept SLA compliance. In another case, pre-aggregating data cut quarterly report preparation from weeks to hours, delivering ready-to-use mart datasets via a “data store” concept.
Properly tuned ODS and data marts not only deliver the information business needs — they do so within agreed timelines and with minimal performance cost.
Frequently asked questions (FAQ) and misconceptions
Is an ODS the same as a staging area?
Not quite. A staging area is a temporary store for raw data, where information from source systems is copied without modification and held only for the duration of processing. An ODS (Operational Data Store), on the other hand, is an operational integrated store where data is not just temporarily parked but is regularly updated and used for reporting.
The difference is that a staging area is usually inaccessible to end users and serves purely as a technical buffer, whereas an ODS is accessible for operational business queries (on cleansed, up-to-date data).
That said, ODSs are often built from staging data: data first lands in staging tables, is cleaned, and then the “clean” current version is stored in the ODS. Some architects refer to the ODS as a “persistent staging layer” — conceptually, it sits between staging and the DWH. But the key difference is that an ODS has a business purpose (operational reporting), whereas staging is purely a technical step.
In modern data lake architectures, the analogue of an ODS is the silver layer — where data has been standardised into a reasonably usable form and is ready for consumption.
In short: an ODS can be seen as an evolved version of staging, enhanced with integration and direct-read capabilities for urgent analytics. The terms shouldn’t be confused — staging usually holds no history, whereas an ODS may keep a small “rolling” history of recent data changes.
Why do we need data marts if we already have a DWH?
Data marts improve the efficiency and usability of data from a DWH. While it’s theoretically possible to run all queries directly against the corporate warehouse, in practice this creates several issues:
- Complexity. A DWH often has a highly normalised structure (dozens of tables) that is difficult for business users to navigate. A mart presents a simplified, topic-specific model — for example, one fact table and several dimension tables — with clear, business-friendly field names. This reduces report development time and minimises errors.
- Performance. A mart contains only the relevant subset of the warehouse’s data for its subject area. Less data = faster queries and lower load on the warehouse. Marts often store pre-aggregated data, which further speeds up user queries.
- Access control. Marts make it easier to separate access — marketing works with its own mart, without seeing or overloading finance data, ensuring security and independence.
- Parallel development. Different teams can create and maintain their own marts without affecting the global DWH model.
In short, a mart is not duplication — it’s an optimised representation of the warehouse tailored to a specific need.
To use an analogy: a DWH is the full library of knowledge, while a mart is a carefully curated shelf of books for a particular problem. Without marts, a corporate DWH can become “heavy” to use and scale (any change can affect everyone). Marts solve performance and usability problems, complementing the DWH.
Of course, with very small data volumes, a mart might be unnecessary — but in most cases, once a DWH is in place, building marts for key business areas significantly increases the return on investment in the warehouse.