The True Last Mile of The Data Journey
Settling The Debate Once And For All
There is a debate about where exactly the last mile of the data journey lies. Some say it’s at the Data Layer stage (such as the Data Warehouse or Data Mart), and some say it’s at the Analytics Data Prep stage. Do you think the data warehouse is the last mile of the data journey? Or do you think it’s the dataset?
The truth is that nobody makes decisions based on datasets.
This is why we want to set the record straight. The true last mile of the data journey lies in the Data Analytics layer. Period. Analytics solutions are what people actually use to make business decisions, and organizations are now realizing that quality at the data layer alone is no longer enough.
This article will demonstrate why we believe the last mile of the data journey is in the Data Analytics layer and why organizations need to ensure the highest Analytics quality.
Understanding The True Last Mile Of The Data Journey
They say the best place to start is at the very beginning, so let’s start at the first mile of the data journey:
The very first mile is the transactional data, i.e., where your data is stored. This could be in ERPs, CRMs, business software, or even online. It’s the data in its rawest possible state. These systems contain the information needed for decisions to be made, but in reality you wouldn’t want to connect them to your BI & Analytics platform straight away. Why? Because the data wouldn’t contain aggregations, business rules, data from other systems, and so on, which lowers the quality of the data you would be analyzing.
To overcome this, the next mile of the data journey is the creation of a data warehouse and/or data marts. It’s at this stage that the business specifies which answers and information it needs in order to make decisions. The ETL team then accesses the data, merges it, cleans it, and adds the business rules, followed by thorough testing.
Here at Wiiisdom, we help organizations ensure the highest standard of Analytics data quality, but companies often ask us: “Why would we need to verify the data in the Analytics layer when we know our data warehouse is well and truly governed and tested?” And that’s the issue. A perfect data warehouse does not guarantee that a dashboard you consume in Tableau or Power BI, for example, will still be accurate. People put too much trust in the data warehouse, which creates a false sense of security when they consume the data in the data analytics layer.
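To make the merge, clean, and business-rule steps concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the record layout, the `build_warehouse_rows` helper, and the "net revenue = amount minus discount" rule are hypothetical and do not describe any particular ETL tool.

```python
# Hypothetical raw extracts; field names and values are illustrative only.
erp_orders = [
    {"order_id": 1, "customer_id": "C1", "amount": "100.00", "discount": "10.00"},
    {"order_id": 2, "customer_id": "C2", "amount": " 250.50", "discount": ""},
    {"order_id": 2, "customer_id": "C2", "amount": " 250.50", "discount": ""},  # duplicate
]
crm_customers = {"C1": "EMEA", "C2": "APAC"}  # customer_id -> region


def build_warehouse_rows(orders, customers):
    """Merge ERP orders with CRM regions, clean them, and apply one business rule."""
    seen = set()
    rows = []
    for order in orders:
        if order["order_id"] in seen:  # clean: drop repeated order ids
            continue
        seen.add(order["order_id"])
        amount = float(order["amount"].strip())  # clean: stray whitespace
        discount = float(order["discount"].strip() or 0)  # clean: empty means 0
        rows.append({
            "order_id": order["order_id"],
            "region": customers.get(order["customer_id"], "UNKNOWN"),  # merge
            "net_revenue": round(amount - discount, 2),  # assumed business rule
        })
    return rows


warehouse = build_warehouse_rows(erp_orders, crm_customers)

# The "thorough testing" step: simple assertions on the result.
assert len(warehouse) == 2
assert sum(row["net_revenue"] for row in warehouse) == 340.5
```

The assertions at the end stand in for the warehouse-level testing described above. They only prove the warehouse is correct and say nothing about what happens to the data afterwards.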
Analytics Data Prep
Once you’ve created your data warehouse and tested it, the next step is to connect your Analytics platform to it. Because the warehouse contains so much information, you will realistically want only a specific subset of it. For example, the data warehouse or data mart contains sales information for the entire company for the last 20 years, but you are only interested in Analytics on sales revenue for a specific set of brands and for a limited time period. Here is where Analytics Data Prep comes into play (e.g., Tableau Prep in Tableau and Power Query in Power BI).
Let’s say you only want Marketing data. Simply hand-picking a few columns out of hundreds, transforming them, and merging them with other external data sources will have an impact on your data. The risk increases the minute you take a subset of the data from your data warehouse. This Data Prep level is very rarely tested, and when it is, the testing is done manually. You cannot rely solely on your data warehouse, even if it is a trusted and tested source, because modern BI and Analytics solutions connect to datasets. That requires another stage of preparation and filtering, which in turn creates new data that needs to be tested.
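As an illustration of how a prep step can create new, untested data, the sketch below filters a warehouse table and joins it to an external source, then reconciles the result against the warehouse. The tables, the `in_scope` filter, and the helper names are hypothetical; real tools such as Tableau Prep or Power Query are not modeled here.

```python
from datetime import date

# Hypothetical, already-tested warehouse rows.
warehouse_sales = [
    {"sale_id": 1, "brand": "Alpha", "day": date(2023, 1, 5), "revenue": 100.0},
    {"sale_id": 2, "brand": "Alpha", "day": date(2023, 2, 9), "revenue": 200.0},
    {"sale_id": 3, "brand": "Beta", "day": date(2023, 1, 7), "revenue": 50.0},
    {"sale_id": 4, "brand": "Alpha", "day": date(2019, 6, 1), "revenue": 75.0},
]
# Hypothetical external source: two campaigns for the same brand.
campaigns = [
    {"brand": "Alpha", "campaign": "Spring"},
    {"brand": "Alpha", "campaign": "Summer"},
]


def in_scope(row):
    """The subset the analyst wants: one brand, one year."""
    return row["brand"] == "Alpha" and row["day"].year == 2023


def prep_dataset(sales, campaign_rows):
    """Filter the sales, then join each sale to every matching campaign."""
    return [
        {**sale, "campaign": campaign["campaign"]}
        for sale in sales if in_scope(sale)
        for campaign in campaign_rows if campaign["brand"] == sale["brand"]
    ]


def reconcile(sales, dataset):
    """Compare the dataset's revenue total with the warehouse's, same scope."""
    expected = sum(sale["revenue"] for sale in sales if in_scope(sale))
    actual = sum(row["revenue"] for row in dataset)
    return expected, actual


dataset = prep_dataset(warehouse_sales, campaigns)
expected, actual = reconcile(warehouse_sales, dataset)
print(f"warehouse total: {expected}, dataset total: {actual}")
```

In this sketch the join matches every sale to two campaigns, so the dataset reports 600.0 in revenue where the warehouse holds 300.0 for the same scope. Nothing in the warehouse tests would catch that; only a check at the Data Prep level does.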
Data Analytics / BI
By Data Analytics, we mean the actual data contained within your visualizations: the data imported from the Data Prep/Dataset layer. This is yet another layer where the data can be transformed again, through filters, new formulas, and so on.
Example of a Tableau dashboard – the true last mile of the data journey.
Analytics data quality is important because the data journey, and the analytics layers in particular, offer so many opportunities for the data to be transformed. These changes are not always made by IT people; they are often made by business users who may not have all the required skills. That is understandable, but it can still leave the data untrustworthy, and that is a big issue. Organizations can no longer rely only on the data warehouse, or even the Data Prep level, because quality can still be affected right up to the point of making decisions. Organizations need to carry out automated BI testing at every single stage so that the end-user can make trusted business decisions.
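A small sketch of what an automated check on a dashboard formula might look like follows. It assumes a governed definition of average order value (total revenue divided by total orders) and a user-built calculated field that averages per-region ratios instead. Both definitions, the sample rows, and the 1% tolerance are hypothetical choices for illustration.

```python
# Hypothetical rows as they arrive in the dashboard from the dataset.
rows = [
    {"region": "EMEA", "revenue": 900.0, "orders": 9},
    {"region": "APAC", "revenue": 100.0, "orders": 10},
]


def governed_aov(data):
    """Assumed governed definition: total revenue / total orders."""
    return sum(r["revenue"] for r in data) / sum(r["orders"] for r in data)


def dashboard_aov(data):
    """A calculated field a user might add: the mean of per-region ratios."""
    ratios = [r["revenue"] / r["orders"] for r in data]
    return sum(ratios) / len(ratios)


def check_kpi(expected, actual, tolerance=0.01):
    """Return True when the relative difference is within the tolerance."""
    return abs(actual - expected) <= tolerance * abs(expected)


if check_kpi(governed_aov(rows), dashboard_aov(rows)):
    print("dashboard KPI matches the governed definition")
else:
    print("dashboard KPI diverges from the governed definition")
```

With these sample rows the two formulas disagree by more than the tolerance, so the check fails even though the dataset feeding the dashboard is unchanged. Run automatically whenever a dashboard is edited, a check of this kind flags the divergence before anyone makes a decision based on it.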
Why Do You Need To Ensure Quality In The Last Mile?
Ensuring quality in the data analytics layer is not optional. It is a must, because this is where people consume the data and make important decisions. As you’ve seen, the data has many opportunities to be transformed between the very first mile and the finished dashboard in your Analytics platform. You simply cannot rely on the testing done at the data warehouse or data prep stage.
Consider an analogy. When a vaccine is produced, it goes through a lot of testing and verification before it leaves the manufacturer. Some might say this is the last mile of its journey, when in fact it is not. The vaccines still need to be delivered to the pharmacies to be given to the public. This is the real last mile of the journey. We could even argue that the last mile is between the pharmacy and the individual patient, but you get the idea. Between the factory and the pharmacy, there is still a risk that the vaccines could be damaged or tampered with, so you need to be sure of the quality of the vaccine down to the very last mile of its journey.
The same should apply to data analytics, but organizations are forgetting the importance of quality in the last mile of the data journey. The solution for a trusted data journey is AnalyticsOps: a framework that reuses the concepts from the DevOps and DataOps world and applies them to Analytics.
Ensure Analytics Quality Today
The true last mile of the data journey is the Data Analytics layer because this is where the business decisions are made. It may be the last mile of the data journey but it’s the first mile of the decision-making process, starting with business analysts and decision-makers. Ask yourself: are we considering Analytics quality as much as data quality? Can everybody 100% trust the dashboards or reports they consume to make decisions? Is data analytics properly tested?
At Wiiisdom we help organizations ensure the highest Analytics quality from the Data Prep layer right to where people consume the data in dashboards through automated testing inside Analytics. By implementing a thorough testing process, organizations can provide trust, ensure high user adoption and ultimately make the most reliable decisions for the company.