
Data Apps Demystified

Blog | August 5, 2022 | By Amit Phatak

TL;DR: Traditional approaches brought data from the data cloud to the applications. Each application augmented, transformed, and shaped this data for its own use. Data apps are purpose-built to bring applications to the data. They promote faster development times, cost efficiencies, a single source of truth, and total control over data from a security and governance standpoint. While they have their limitations today, they are here to stay and to become an integral part of the modern data and application development stack.

The trend of bringing all data into one place, whether it is a Data Warehouse, Data Lake or Lake House, is on the rise due to the proliferation of the cloud. For simplicity, let us call this one place where all data resides the Data Cloud. There has also been a paradigm shift in the way data is brought into this data cloud: the move from Extract-Transform-Load (ETL) to Extract-Load-Transform (ELT). Essentially, you get all kinds of data into the data cloud in its raw form and then deal with it. Storage is cheap now, whereas going back to fetch the raw data in order to transform it another way at a later point in time will be expensive.
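To make the ELT idea concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database purely as a stand-in for the data cloud. The table name `raw_events`, the view name `revenue_by_country` and the sample rows are all hypothetical and chosen only for illustration. The point is the ordering: the raw data is loaded untouched, and the transformation is defined afterwards, inside the data store.

```python
import sqlite3


def load_raw(conn, rows):
    """Load step: store the data exactly as it arrived, with no cleaning."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events "
        "(user_id TEXT, amount TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", rows)


def transform_in_place(conn):
    """Transform step: runs inside the store, after loading.

    The raw table is left untouched, so a different transformation
    can be defined later without extracting the data again.
    """
    conn.execute("DROP VIEW IF EXISTS revenue_by_country")
    conn.execute(
        """
        CREATE VIEW revenue_by_country AS
        SELECT UPPER(TRIM(country)) AS country,
               SUM(CAST(amount AS REAL)) AS revenue
        FROM raw_events
        GROUP BY UPPER(TRIM(country))
        """
    )


# SQLite in memory stands in for the data cloud (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
load_raw(conn, [("u1", "10.5", " us"), ("u2", "4.5", "US "), ("u3", "7", "in")])
transform_in_place(conn)

result = conn.execute(
    "SELECT country, revenue FROM revenue_by_country ORDER BY country"
).fetchall()
print(result)
```

Because the messy country codes and string amounts are kept as-is in `raw_events`, a second use case could define its own view over the same rows later, which is the flexibility the ELT approach is after.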

Bringing in raw data means that different use cases can access data at various levels of transformation. Data shares and exchanges, which provide secure and scalable ways to share data internally and externally, are also on the rise. Each organization is striving to get all its data, whether structured, semi-structured or unstructured, into the data cloud so that Business Intelligence (BI) and Artificial Intelligence/Machine Learning (AI/ML) can be applied on top of it to derive insights that help in decision making.

Figure 1: Moving Data to Applications from the Data Cloud

Software as a Service (SaaS) applications lead the way here with cloud-based platforms for both BI and AI/ML. On-premises software is being replaced by consumption- and workload-based SaaS offerings that promise all the benefits of the cloud. These require data pipelines to be set up to convert the raw data into forms that the applications can consume. Reverse ETL is the term used to describe such a use case, where the data in the data cloud is transformed and fed into a SaaS application for its use. In some cases the software or application itself provides ways to augment, transform and shape the data for use.

All this, however, ends up fragmenting the data. So while you started with a single data store, you now have a plethora of transformed silos of this data. Worse, these silos do not talk to each other, and the transformations need to be repeated separately and consistently in each of the applications. Otherwise you will end up with different definitions of the metrics used to derive insights, which in turn are used to make decisions.

Wouldn’t it be simpler to just use the data in the data cloud directly? This is the revolution data apps aim to bring in. Instead of having to take the data to where the different applications reside, you bring the applications to where the data resides. Create applications that run off the data in your data cloud! This approach solves the fragmented data and inconsistent metrics issues, along with a whole host of other issues.

Figure 2: Moving Applications to the Data in the Data Cloud

For instance, security and governance, which are the stickiest issues when it comes to data, are dealt with in an elegant manner with the data apps approach. The data never leaves the secure environment of your data cloud. There is no time spent on moving it to different applications and no risk associated with that movement. Data gets accessed in a secure and governed way from a single source of truth. The data team always maintains full control over the data.

This approach also lends itself well to the implementation of a semantic layer and/or feature store, which solves the data and metrics consistency issue by providing a universal layer for declaring the definitions of data, metrics and features so that each application uses the data consistently. Along with consistency you also get speed of development, as the data is pre-modelled and shareable across applications.
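As a rough illustration of the semantic layer idea, the Python sketch below declares a metric once and lets two applications consume it. Everything here is hypothetical: the `ORDERS` rows, the `net_revenue` metric and the two app functions are invented for the example, and a real semantic layer would sit on top of the data cloud rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical rows standing in for a table in the data cloud.
ORDERS: List[dict] = [
    {"order_id": 1, "amount": 120.0, "status": "completed"},
    {"order_id": 2, "amount": 80.0, "status": "refunded"},
    {"order_id": 3, "amount": 200.0, "status": "completed"},
]


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[List[dict]], float]


# The single place where each metric is declared.
SEMANTIC_LAYER: Dict[str, Metric] = {
    "net_revenue": Metric(
        name="net_revenue",
        description="Sum of amounts for completed orders only.",
        compute=lambda rows: sum(
            r["amount"] for r in rows if r["status"] == "completed"
        ),
    ),
}


def get_metric(name: str, rows: List[dict]) -> float:
    """Look up a metric by name and evaluate it on the given rows."""
    return SEMANTIC_LAYER[name].compute(rows)


def dashboard_app(rows: List[dict]) -> str:
    """A BI-style consumer: formats the shared metric for display."""
    return f"Net revenue: {get_metric('net_revenue', rows):.2f}"


def forecasting_app(rows: List[dict]) -> float:
    """An ML-style consumer: projects 10% growth on the same metric."""
    return get_metric("net_revenue", rows) * 1.1


print(dashboard_app(ORDERS))
print(forecasting_app(ORDERS))
```

Neither application decides for itself whether refunded orders count towards revenue. If that definition changes, it changes in one place and both applications pick it up, which is the consistency benefit described above.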

With some data warehouses, data lakes and lake houses providing native application support, integration between the data and the application will also be simpler and more secure than before, when it relied on Application Programming Interfaces (APIs). In addition, the option to code in the programming language of your choice, together with support for multiple languages, means less time spent retraining your teams and more time spent developing applications in your favourite programming languages. Importantly, the approval process to try out SaaS applications will be a lot faster and less complex than before because of this native application support.

Beyond all of these, the biggest factor could turn out to be the cost savings achieved by using data apps. No data movement, reuse of existing data, reuse of the data platform in the case of native applications, and vendors having to build only the code because the modelling is already done all lead to significant cost reduction. Depending on the use case, it could also eliminate intermediate managed applications, which adds to the cost savings.

As you can see, data apps come with some big positives. However, it is not all rosy. Data apps are good for use cases where you read data from the data cloud. Any use case where you need to write back becomes a problem because data apps are not backed by transactional databases. It takes time for the data written back to be processed and reflected in the data app. These times are not consistent with what users of traditional applications are used to, which is on the order of a few milliseconds. Real-time applications are another use case for which data apps may not be best suited.

For such use cases a transactional database might be needed to store the application data. In some cases, data warehouses, data lakes or lake houses also allow online analytical processing (OLAP) and online transaction processing (OLTP) to co-exist, which solves such use cases within the data cloud.

There is newer terminology at play to define these various kinds of applications. Traditional applications, which have their own data stores, are being referred to as managed applications. This differentiates them from data applications, which use the data cloud directly and are being called connected applications.

For the near future then, there is a good possibility of both traditional applications and data applications occupying the landscape. However, as the data clouds mature, the proliferation of data apps seems like a pretty good bet. There already are so many data apps out there in the market today that this seems like something that is here to stay and change the modern data and application development stack. 

About the Author
A Data, Analytics and AI/ ML leader with a focus on providing data enriched offerings - Analytics 3.0
Amit Phatak | Vice President – Business Development & Go To Market | USEReady