What is a Data Lake? Here’s How Data Lakes Solve Today’s Marketing Challenges


Despite the massive increase in data available to marketers today, most brands analyze only a small fraction of it. In fact, according to the World Economic Forum, the world will create 463 exabytes of data every day by 2025. To succeed, marketers need strategies for working with Big Data.

To help meet this challenge, many marketers are turning to data lakes. What are data lakes? Data lakes are capable of ingesting and processing massive amounts of structured and unstructured data through a flexible, cloud-based platform, allowing marketers to make sense of both their online and offline data – though that’s just one way in which data lakes are helping marketers overcome some of their biggest challenges.

This post is excerpted from Viant’s new white paper, Demystifying Data Lakes. Read on to learn more about how marketers are leveraging data lakes today.

Data Lakes Measure Across Multiple Platforms and Partners

Brands have been using Demand Side Platforms (DSPs) to programmatically buy digital advertising across a variety of publishers, often leveraging multiple platforms and partners. The primary problem is that this type of distributed approach doesn’t provide any consolidated reporting that would give marketers a complete view of their advertising efforts for things like global reach and frequency tracking. Additionally, most partners are measuring their own performance, which immediately brings objectivity into question.

Data Lakes Reach People Across All Their Devices with Certainty

Reaching people across all of their devices has always been a struggle, which has led to the collection of identifiers including cookies, IP addresses, device identifiers, geolocation data, home addresses and email addresses in order to accurately link devices back to their owners. Even with access to these data points, it is extremely difficult to connect them accurately and persistently. Many providers offer probabilistic solutions that use algorithms to estimate the likelihood that two devices belong to the same person. The only way to accurately connect a device to an individual or household is by working with providers who are truly people-based and use deterministic matching.
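To make the distinction concrete, here is a minimal Python sketch contrasting the two approaches. Everything in it is an assumption for illustration: the device records, the field names (ip, geo, login_email) and the scoring weights are hypothetical and do not describe any particular provider's method.

```python
# Hypothetical device records; field names and values are illustrative only.
phone = {"id": "phone", "ip": "203.0.113.7", "geo": "90210", "login_email": "pat@example.com"}
tablet = {"id": "tablet", "ip": "203.0.113.7", "geo": "90210", "login_email": None}
laptop = {"id": "laptop", "ip": "198.51.100.4", "geo": "10001", "login_email": "pat@example.com"}


def probabilistic_score(a, b):
    """Toy likelihood that two devices share an owner, based on weak signals.

    The weights (60 and 30 out of 100) are arbitrary assumptions.
    """
    points = 0
    if a["ip"] == b["ip"]:
        points += 60
    if a["geo"] == b["geo"]:
        points += 30
    return points / 100


def deterministic_match(a, b):
    """True only when both devices carry the same authenticated identifier."""
    email_a, email_b = a.get("login_email"), b.get("login_email")
    return email_a is not None and email_a == email_b


# Same household network, but no shared login: a guess, not a confirmed link.
print(probabilistic_score(phone, tablet), deterministic_match(phone, tablet))
# Different networks, same login: a confirmed link that weak signals miss.
print(probabilistic_score(phone, laptop), deterministic_match(phone, laptop))
```

In this toy data the probabilistic score is high for two devices that merely share a network, while the deterministic check links the phone and laptop that share a login even though they never share an IP address.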

The Differences Between Data Lakes and Data Warehouses

What are the differences between data lakes and data warehouses? Below, a quick breakout of the key differences between the two:

CATEGORY | DATA WAREHOUSES | DATA LAKES
DATA | Structured | Structured and unstructured
FLEXIBILITY | Fixed configuration, requires data engineering for large changes | Highly flexible, configure data as needed
COST | Can be expensive | Designed for low cost
USERS | Data professionals | Data professionals and/or data scientists

Data Lakes Bridge the World of Offline and Online Data

Effectively bridging the world of online and offline advertising is the holy grail for marketers working with Big Data, but for a long time, it was not possible. That is because the offline world uses different identifiers from the digital world. Offline data is typically organized by name and address, loyalty card accounts and phone numbers. Online identifiers include IP address, cookie ID, device ID and email address. People-based solutions use both the online and offline identifiers, often using a combination of identifiers like name, street address and email address to link the two worlds together and give brands a complete view into the customer journey to create actionable insights and strategies.
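As a rough illustration of linking the two worlds, the Python sketch below joins hypothetical offline loyalty records to hypothetical online device records on a normalized, hashed email address. The record layouts, the choice of email as the join key and the use of SHA-256 are assumptions made for the example, not a description of how any specific platform works.

```python
import hashlib


def email_key(email):
    """Normalize an email address and hash it so records can be joined on it."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical offline (CRM / loyalty) records.
offline_records = [
    {"name": "Ana Diaz", "street": "12 Elm St", "email": "Ana.Diaz@example.com", "loyalty_id": "L-100"},
    {"name": "Bo Chen", "street": "9 Oak Ave", "email": "bo@example.com", "loyalty_id": "L-200"},
]

# Hypothetical online records (for example, logged-in sessions).
online_records = [
    {"device_id": "dev-1", "email": " ana.diaz@example.com"},
    {"device_id": "dev-2", "email": "ana.diaz@example.com"},
    {"device_id": "dev-3", "email": "unknown@example.com"},
]


def link_devices(offline, online):
    """Map each loyalty account to the devices that share its email key."""
    by_key = {email_key(rec["email"]): rec for rec in offline}
    linked = {}
    for rec in online:
        person = by_key.get(email_key(rec["email"]))
        if person is not None:
            linked.setdefault(person["loyalty_id"], []).append(rec["device_id"])
    return linked


links = link_devices(offline_records, online_records)
print(links)
```

Normalizing before hashing matters here: without it, differences in capitalization or stray whitespace would prevent otherwise identical addresses from matching.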

Data Lakes Help Solve for Lack of Transparency

Traditionally, marketers received their attribution measurement from each advertising platform through a prepared report, such as a PowerPoint presentation or Excel spreadsheet. Until recently, marketers were forced to trust the accuracy of the attribution methodologies, since they often did not have access to raw data logs to validate the reporting. By housing all of their structured and unstructured data in a central location, marketers can gain unbiased insight into all of their marketing initiatives within a single platform. They now have the ability to validate reporting and gain transparency into:

  • True reach and frequency at the individual and household level
  • Unique reach across all advertising platforms
  • Performance broken out by multiple variables such as device, channel, publisher and format
  • Actual return on ad spend of all marketing initiatives

The more data you funnel into your data lake platform, the greater insight and transparency you will receive across all of your marketing initiatives.
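To show how consolidated raw logs make these metrics checkable, here is a small Python sketch that computes unique reach, average frequency and return on ad spend from a combined impression log. The log rows, platform names (dsp_a, dsp_b), person IDs, spend and revenue figures are all made-up assumptions for the example.

```python
from collections import Counter

# Hypothetical impression log combined from two platforms.
impressions = [
    {"platform": "dsp_a", "person_id": "p1", "device": "mobile"},
    {"platform": "dsp_a", "person_id": "p1", "device": "ctv"},
    {"platform": "dsp_b", "person_id": "p1", "device": "desktop"},
    {"platform": "dsp_b", "person_id": "p2", "device": "mobile"},
    {"platform": "dsp_a", "person_id": "p3", "device": "mobile"},
]


def reach_and_frequency(rows):
    """Return (unique people reached, average impressions per person)."""
    per_person = Counter(row["person_id"] for row in rows)
    reach = len(per_person)
    avg_frequency = sum(per_person.values()) / reach if reach else 0.0
    return reach, avg_frequency


def reach_by_platform(rows):
    """Return the number of unique people each platform reached on its own."""
    people = {}
    for row in rows:
        people.setdefault(row["platform"], set()).add(row["person_id"])
    return {platform: len(ids) for platform, ids in people.items()}


def return_on_ad_spend(revenue, spend):
    """Revenue attributed to advertising divided by what was spent on it."""
    return revenue / spend


reach, avg_frequency = reach_and_frequency(impressions)
per_platform = reach_by_platform(impressions)
roas = return_on_ad_spend(revenue=1500.0, spend=500.0)
print(reach, round(avg_frequency, 2), per_platform, roas)
```

In this toy data each platform would report a reach of two people, for an apparent total of four, while the consolidated log shows only three unique people because one person was reached by both platforms.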

Want to Learn More About Data Lakes?

All of these challenges have proven difficult, but solving them is imperative for today’s marketers. Being able to use both online and offline data for targeting, as well as accurately measuring the complete customer journey from the first touchpoint to conversion (either online or in-store), is necessary in today’s omnichannel world. But to accomplish this, marketers must understand exactly what a data lake is and how it compares to data warehouses or data management platforms (often called DMPs).

To learn everything you need to know to discern whether data lakes can help solve your specific challenges, download Viant’s latest guide, Demystifying Data Lakes.

