Global Market Insights estimates that cloud providers will host the majority of data warehousing workloads by 2025. But don’t take their word for it. Gartner estimates that 30 percent of data warehousing workloads now run in the cloud and that this will grow to two-thirds by 2024. As recently as 2016, the figure was less than 7 percent, also according to Gartner.
None of this should be a surprise. Even the core data warehouse technology providers have seen this trend and are spending the majority of their R&D budgets to build solutions for public cloud providers. Moreover, the public cloud providers themselves have “company killing” products, such as AWS’s Redshift, a columnar database designed to compete with the larger enterprise data warehouse players.
Past impediments to building data warehouses and data marts on public clouds included a perception that security was still an issue on public clouds. Also, petabytes of data were difficult to move from on-premises systems, considering that they had to be physically moved with portable storage systems. Finally, in many instances those running data warehouses on-premises could not find analytics tools to leverage locally and did not want to change.
The reality is that all these blocks have been removed. Most were removed well before the people building and maintaining data warehouses understood that public clouds were far ahead of most on-premises tools. Today the cloud has better security, performance, cost, and analytics.
The real killer of on-premises data warehouses has been the rise of artificial intelligence on the cloud and the ability to integrate AI with traditional data analytics. AI is not new, but the ability to pay next to nothing for data intelligence, collocated with your data on the public clouds, is. AI is a game changer, considering that data warehouses are also a source of training data that could span decades and provide business insights not yet achievable.
Another trend that has pushed many on-premises data warehouses to the cloud is the rising need to leverage transactional data directly for analytics. Data warehouses have been famous for just taking snapshots of transactional data and rolling it up into a data warehouse for analytics. This means that the information could be several weeks if not months old. More and more executives are asking for real-time dashboards that consider current data from transactional systems, such as sales order entry.
This means we’ll need to work with transactional data directly, using data abstraction layers to emulate analytical databases, and bind those layers with AI systems to make the solution even more compelling. It should come as no surprise that this technology can be found from public cloud providers, either through native services or through their ecosystems and marketplaces.
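To make the idea of a data abstraction layer concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. It is an illustration under stated assumptions, not how any particular cloud service works: the table `sales_orders`, the view `sales_by_region`, and the helper functions are all hypothetical names invented for this example. The view plays the role of the abstraction layer, presenting live transactional rows in the rolled-up shape an analytical query expects, with no snapshot step in between.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical transactional table
    and a view that stands in for the data abstraction layer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Transactional side: one row per sales order (hypothetical schema).
        CREATE TABLE sales_orders (
            order_id   INTEGER PRIMARY KEY,
            region     TEXT NOT NULL,
            order_date TEXT NOT NULL,
            amount     REAL NOT NULL
        );

        -- Abstraction layer: an analytical rollup computed on read,
        -- so it always reflects the current transactional rows.
        CREATE VIEW sales_by_region AS
        SELECT region,
               COUNT(*)    AS order_count,
               SUM(amount) AS total_amount
        FROM sales_orders
        GROUP BY region;
        """
    )
    return conn


def record_order(conn, region, order_date, amount):
    """Simulate the order-entry system writing a transaction."""
    conn.execute(
        "INSERT INTO sales_orders (region, order_date, amount) VALUES (?, ?, ?)",
        (region, order_date, amount),
    )
    conn.commit()


def dashboard_totals(conn):
    """Simulate a dashboard query; it reads only the view, never the raw table."""
    rows = conn.execute(
        "SELECT region, order_count, total_amount "
        "FROM sales_by_region ORDER BY region"
    ).fetchall()
    return {region: (count, total) for region, count, total in rows}


if __name__ == "__main__":
    conn = build_demo_db()
    record_order(conn, "EMEA", "2020-01-15", 100.0)
    record_order(conn, "APAC", "2020-01-15", 75.0)
    print(dashboard_totals(conn))

    # A new order appears in the next dashboard read without any batch load.
    record_order(conn, "EMEA", "2020-01-16", 50.0)
    print(dashboard_totals(conn))
```

The design choice worth noting is that the rollup is computed at query time rather than copied into a separate store. That trades some read cost for freshness, which is the trade-off behind real-time dashboards.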
So, on-premises data warehousing is pretty much dead. It’s survived by cloud-based data analytics and database technology that is easily augmented by cheap AI and the ability to deal with data in more innovative ways, such as using transactional data.
The movement to the cloud does a few things. First, it allows enterprises to finally consolidate data on a centrally accessible platform. Second, the data is typically more secure in the cloud. Finally, those who need to leverage the data are no longer restricted to the limitations of on-premises technology. These should be the last three nails in the coffin.