Work While You Wait for that Data Warehouse
When Acme Hospital generates transactional data by logging insurance payments and patient symptoms, that data comes in looking like the below. It’s messy. It’s unwieldy. It’s probably got a ham sandwich lodged in it somewhere.
Transactional data needs the loving care of a devoted DBA before it can be of any use to business professionals because it begins its life cycle as a sprawling “normalized” heap of tables and has to go through at least some denormalization in order to become reportable. Denormalization involves introducing data redundancy, joining tables, translating coded values, cleaning up user errors, adding aliases, and performing other tasks that make the data easier to handle.
Once the data has been extracted from its original storage, transformed into something more usable, and loaded into a new database reserved for the purpose, we have our ETL-generated data warehouse. The data now looks something like this:
Which is, by all accounts, preferable. But here’s the catch: data warehouses take time to build. According to David Stodder, Senior Director of Research for BI at The Data Warehousing Institute (TDWI), “[D]ata warehouse development cycles are long — they could be 18 months.” A major reason for this is that well-built data warehouses should reflect a company’s business logic, providing “long-term historical storage of data from disparate operational systems, while facilitating auditing, data lineage tracing, loading speed and resilience to change.” It can be difficult to coordinate these objectives, but they’re important to get right, so the development period has a tendency to drag out.
But what if the business professionals you serve can’t wait 1-2 years to access their data? What if you’re a SaaS company only a few months out from its launch deadline?
Exago BI connects to non-tabular data sources like stored procedures, .NET assemblies, and OLAP cubes so that you can access a more denormalized version of your transactional data during the warehousing period. In select cases, access to these types of sources may eliminate the need for a data warehouse altogether!
Below is a list of data source types commonly used for this purpose.
#1: OLAP Cubes
OLAP, or online analytical processing, performs multidimensional analysis of data and can respond to queries quickly using pre-calculated data sets called data cubes. Exago can retrieve “massaged” data sets from OLAP cubes through a SQL-like querying language called MDX (Multidimensional Expressions). Simply including the MDX query as part of a Custom SQL Object in Exago will retrieve the calculated data set.
#2: Table-Valued Functions
Table-Valued Functions (TVFs), available only in MS SQL Server, are programmable data objects that can join to other objects in the database. TVFs, written using Transact-SQL, can be used to manipulate transactional data.
#3: Custom ODBC Drivers
ODBC (Open Database Connectivity) drivers provide an alternate and standardized means of connecting to databases. They function as an ETL bridge between Exago BI and unstructured data sources that support that connection, but they can also help optimize the SQL being passed to a standard RDBMS.
#4 .NET Assemblies
.NET Assemblies are DLLs that can be used to perform calculations on transactional data and combine data sets from different types of data sources. Those calculations can be performed either in the DB or on the server; it all depends on how you write the assembly.
#5 Stored Procedures
Stored procedures are programmable data objects common to many relational database management systems, but because they cannot be joined to other objects at the database level, their performance is trickier to optimize.
#6 Web Services
Like .NET assemblies, Web Services can be used to perform intermediate calculations on raw data. In this case, however, that intermediary step involves web service protocols and an additional server, which has the potential to slow performance.
Whichever method you choose, the goal is to manipulate the data enough to report off of it while your data warehouse project is underway. Select the method that will work best in the interim and look forward to the ease and simplicity your warehouse will afford!