Why do I need a data warehouse?
Do any of these scenarios sound familiar?
If so, you’ll probably want to consider a data warehouse.
Quick definition: a data warehouse is a data management system that:
- gathers data from a variety of sources (order entry, CRM, ERP etc.),
- combines, and
- transforms that data for reporting, querying and analytics.
Common use cases for data warehousing
Use case – Critical Data spread across multiple systems:
Scenario: your sales team is using a CRM to track their customer interaction and deals, sales orders and accounting transaction are tracked in your ERP, and the manufacturing department employs a custom solution to meet their needs.
You need reporting that analyzes the lifecycle from prospect to closed deal, and then ordering, manufacturing, all the way through payment receipts.
A data warehouse gathers the information from these systems and organizes and relates them in one place to allow for advanced reporting and analysis.
Use case – Historical trending:
Most systems used to run a business focus on the current values, not the historical values. This makes analyzing historical trends or changes to pricing, product naming etc. difficult over time.
A data warehouse can be configured to track those changes to allow for trending and analysis.
Use case – Business Intelligence platforms:
The rise of business intelligence tools like Power Bi, Tableau and Domo enables the rapid development of BI dashboards and reports. Out of the box these tools can connect to a variety of data sources to generate reports across systems. However, tying this data together in real time using these tools can be challenging and slow. The ETL (Extract, Transform, and Load) process of a data warehouse does the heavy lifting when it comes to connecting the data so the reporting system can be used to display useful and fast performing metrics easily.
Use case – Data analysis:
From one-off query requests, to projections, and even projection accuracy analysis, data warehouses allow for deeper dives into operational data.
For example: A company runs a one-time project over a 3 month period. During this period the project will impact data in multiple systems across the organization. Since the project will rarely if ever happen again, there is no advantage to developing a dashboard or reports to track the progress. However, there is interest in querying the data to identify the project’s impact and results. Since this data is tracked in the data warehouse, queries can be run against the warehouse to identify the results, rather than exporting data from various systems and joining using pivot tables, etc.
Use case – Performance concerns:
Many times, running complex reports can take a long time, or worse, can impact the performance of the source system, slowing transaction times for a number of users.
Data warehouses ingest the source data, transform it, and then load it into services designed to run reports. This provides better performance times on the reports being run, and doesn’t negatively impact the users of the operational systems.
Who benefits from a data warehouse?
People at all levels of an organization can benefit from a data warehouse:
- Management: A well-executed BI dashboard driven by a data warehouse provides management teams both big-picture and detailed views of the operations in their company. It also allows that information to be “sliced and diced” by region, department, product, customer or any way that helps to make informed decisions.
- Data scientists: With all the data in one place, the data science team can focus their energy on analysis and modeling, rather than on gathering data from the various operational systems that store the data they need. In addition, the data warehouse is designed for reporting and analysis, so it’s fine-tuned to deliver analysis results faster than the individual systems can.
- Sales: Sales teams can obtain data relating to their prospects’ or customers’ sales history, trends, closing rates, etc.
When is a data warehouse NOT needed?
A data warehouse won’t address all business problems. Here are some scenarios where a data warehouse is not needed or should be put off until later:
- Overkill: A small business that uses one application to run their business and the reports it provides are more than adequate for running the business
- Lack of access: When the data from a critical system is not accessible. Some software systems may not allow the data to be accessed. In this case it will not be possible for the data warehouse to ingest and transform the data.
- Bad Data: If the data in your systems is unreliable or lacks consistency, the effort to clean it in a data warehouse may be an obstacle. In this situation it may be beneficial to address the bad data problem before implementing a data warehouse solution.
How do I turn the data into valuable information I can use to run my business?
It’s true that the warehouse stores data, not insights. You’ll need additional functionality to harvest valuable and actionable information from the data warehouse:
- A BI platform: common business intelligence software, like Microsoft’s Power BI, Tableau or Domo can be connected to a data warehouse and used to design and develop BI dashboards, advanced reports, and KPIs to provide the critical information found in the data.
- Data science tools: data science teams employ a unique set of tools, languages like Python or F#, that allow them to efficiently analyze the data in a warehouse, develop one-off queries, and build models for projecting things such as market trends, inflation and sales.
Why can’t I just use my existing systems for analysis?
Sometimes you can.
Most business systems include reporting modules that help paint a picture of what’s going on in an organization. The issue with these reports is that they’re often limited:
- They’re only able to report on data found within that system.
- They traditionally can’t report on data trends over time,
- and they can sometimes drain resources from the mission-critical system that’s being used to operate the business.
Where are data warehouses hosted?
A data warehouse can be hosted either on-prem (on your office network) or in the cloud. Many cloud providers offer powerful tools like Azure Data Factory or AWS’ Data Pipeline that can be used to ingest and transform your data. Once transformed, you can store your data in cloud storage services, or load them into on-prem solutions like SQL server.
I think I need a data warehouse. How can I get started?
Want to talk with our team about your data warehousing needs?
Here are some questions you may want to start thinking about ahead of our conversation:
- What systems are currently in place in your organization?
- What KPIs’ or metrics are you interested in?
- Why are the reports from your current systems insufficient?