Saturday, November 23, 2024

Kiwi.com’s Path to Data Democratization, Powered by Data Products


Driving Accessibility with Governed Data Products, Seamlessly Managed in Atlan

At a Glance

  • Kiwi.com, a global travel tech company powering over 100 million searches per day for the perfect travel route, aimed to improve access to data
  • Choosing Atlan as their modern data catalog, Kiwi.com streamlined the aggregation, curation, and monitoring of thousands of data assets, consolidating them into 58 discoverable data products for simpler and more efficient consumption
  • This data product approach has reduced the workload on their central engineering team by 53% and increased data user satisfaction by 20%, since onboarding 20+ teams to responsibly share and use data across their organization

Kiwi.com powers over 100 million daily searches for optimal travel routes and supported 22.9 billion kilometers of travel in 2023, which makes data crucial to the travel tech company’s success. Their innovative algorithm enables customers to discover and book affordable flights often missed by other search engines, with billions of price checks conducted daily across 95% of global flight content.

For Kiwi.com employees who work every day to improve their product, experience, and operations, easy access to trustworthy data is crucial. Leading the charge to improve the way her colleagues use that data is Martina Ivanicova, Data Engineering Manager.

At the 2024 Gartner Data & Analytics Summit in London, Martina joined Atlan to share her experience and lessons learned from bridging the gap between data producers and consumers, introducing data products, and leveraging Active Metadata to deliver the trusted, understandable data that drives the next great business decision. Reflecting on her journey, she posed a crucial question:

“Let’s step into the shoes of a new data analyst in our company. Where do I find data in the first place?”

Journey to the Center of the Data Stack

Kiwi.com operates on a microservice architecture, with services organized by business domain. Leveraging Google Cloud Platform, they batch-extract data to BigQuery, their data warehouse, while Dataflow manages real-time data processing, storing results in Google Cloud Storage or directly in BigQuery. Additional transformations ensure data is available in Looker, their business intelligence tool. Metadata for all components is managed in Dataplex, a GCP service housing vast amounts of their data assets.

While this setup might seem straightforward, it operates on a massive scale, orchestrating and managing an enormous volume of data across the entire ecosystem. 

“We have 100 Postgres databases, tens of thousands of tables, thousands of BigQuery data sets, tens of thousands of BigQuery tables, hundreds of Airflow DAGs, and thousands of Looker objects,” Martina shared.

Finding Focus with Data Products

For Data Analysts at Kiwi.com, accessing accurate and reliable information is crucial for informed decision-making. However, even with metadata consolidated in one place, finding the right data asset remained challenging. This prompted Martina and her team to look beyond simple accessibility and make their complex data ecosystem easier to navigate.

“If you were a Data Analyst, and you went to Dataplex to search for the keyword ‘Destination’, you would find more than 200,000 entries. How would you find data for your use case? The answer is that you are not looking for data, you are looking for a data product.”

Martina and her team embraced the concept of treating data as a product, ensuring it is reliable, user-friendly, discoverable, and valuable, making it a key operational asset. This approach led to the development of a governance framework for data products, establishing key standards to scale and manage data quality, documentation, security, and related processes.

A Governed Data Product Model 

Martina and her team’s data product approach was designed to focus on their most important data assets. They divided these assets into tiers, the top tier being the most critical for strategic decisions and financial reporting. Most of their data consumers’ use cases depended on this top-tier data, driving stringent requirements for reliability and accessibility, and inspiring a six-part framework to create, manage, and improve these data products.

Ownership: Establishing clear ownership and accountability for every data product at both product and technical levels.

Documentation: Making every data product easy to find and understand, with the context and information consumers need to use it.

Quality: Monitoring the freshness, accuracy, and reliability of each data product using an in-house observability platform.

Architecture: Optimizing the creation, modification, storage, and access of data products with a sound technological setup.

Security: Adhering to security and privacy standards, ensuring data protection, and maintaining regulatory compliance.

Processes: Implementing procedures to maintain data reliability, including data contracts that ensure all consumers and producers agree on specific SLAs and SLOs, as well as incident and change management protocols.
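To make the idea of a data contract with an agreed SLO more concrete, here is a minimal sketch in Python. It is not Kiwi.com's implementation: the `DataContract` class, the `freshness_breached` helper, the product name, and the six-hour threshold are all hypothetical, chosen only to show how a freshness agreement between a producer and its consumers might be expressed and checked.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract between a data producer and its consumers."""
    product_name: str
    producer_team: str
    freshness_slo: timedelta  # maximum acceptable age of the latest load


def freshness_breached(contract: DataContract,
                       last_loaded_at: datetime,
                       now: datetime) -> bool:
    """Return True when the latest load is older than the agreed freshness SLO."""
    return now - last_loaded_at > contract.freshness_slo


# Illustrative values only; none of these come from the article.
contract = DataContract(
    product_name="bookings_daily",
    producer_team="booking-domain",
    freshness_slo=timedelta(hours=6),
)
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

on_time = freshness_breached(contract, now - timedelta(hours=2), now)
late = freshness_breached(contract, now - timedelta(hours=9), now)
```

In a setup like this, a breach would typically feed the incident process the framework describes, rather than being handled ad hoc by whoever notices stale data first.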

While Martina’s team could support this framework by blending a powerful set of existing tools, including Terraform, BigQuery, Dataplex, Monte Carlo, and Looker, managing data products would mean jumping from tool to tool as data flowed from producer to consumer. Evaluating the Active Metadata Management market for a solution that would unify these disparate tools into a single view, Kiwi.com chose Atlan.

Seamlessly integrating with their data stack, Martina and her team use Atlan to ensure their data products are accessible and understandable, performant and reliable, and align perfectly with their high standards for quality and security.

“Atlan was flexible enough to provide an umbrella over all the metadata we were trying to track. It also helped us evaluate how well our data products perform against specific criteria, ensuring they meet required standards.”
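Evaluating a data product against specific criteria can be pictured as a simple scorecard over the six framework areas. The sketch below is an assumption-laden illustration, not Atlan's API or Kiwi.com's scoring logic: the `DataProduct` class, the `unmet_areas` and `compliance_score` functions, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field

# The six areas of the governance framework described in the article.
FRAMEWORK_AREAS = (
    "ownership",
    "documentation",
    "quality",
    "architecture",
    "security",
    "processes",
)


@dataclass
class DataProduct:
    """Hypothetical record for a governed data product."""
    name: str
    domain: str
    # Maps a framework area to True once its (hypothetical) checks pass.
    checks: dict[str, bool] = field(default_factory=dict)


def unmet_areas(product: DataProduct) -> list[str]:
    """List the framework areas that are missing or failing, in framework order."""
    return [area for area in FRAMEWORK_AREAS if not product.checks.get(area, False)]


def compliance_score(product: DataProduct) -> float:
    """Fraction of framework areas the product currently satisfies."""
    passed = len(FRAMEWORK_AREAS) - len(unmet_areas(product))
    return passed / len(FRAMEWORK_AREAS)


# Illustrative values only; none of these come from the article.
product = DataProduct(
    name="bookings_daily",
    domain="booking",
    checks={
        "ownership": True,
        "documentation": True,
        "quality": True,
        "architecture": True,
        "security": False,
    },
)

gaps = unmet_areas(product)
score = compliance_score(product)
```

Treating a missing entry the same as a failing one is a deliberate choice in this sketch: a product that has never been assessed for an area should not count as compliant with it.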

A Streamlined Data Landscape

By moving from searches that returned thousands of results across a complex data estate, to building easily discoverable, governed data products, Martina and her team have significantly improved the capacity of the Data Engineering function, and are driving user satisfaction to unprecedented heights.

Currently, their team manages 58 of these top-tier data products — a carefully selected set that has focused and streamlined their workload. This landscape has since been organized by domain, curating data assets and ensuring clear ownership and documentation.

This approach has empowered data teams to take full responsibility for their data, fostering a culture of accountability. “We managed to organize the data landscape so that it matches our domains. Data is owned by the teams sitting in these domains,” Martina explained.

Through this transformation, Kiwi.com successfully onboarded over 20 teams to responsibly share and use data across the organization. Routine internal surveys revealed a 20% increase in data user satisfaction, showcasing the significant positive impact of this initiative.

The “Less is More” Lesson

Instead of digging through 272,000 hard-to-parse results, a Kiwi.com analyst can now find exactly what they need with Atlan. In one simple, easy-to-use interface, they’re offered the full picture, from ownership to related assets, data contracts, SLAs, and any data quality issues.

Sharing the most important lessons from her experience, Martina stresses that access to vast amounts of data is only the first step toward data democratization, not the destination. Getting the highest value from data means staying curious and focused on the needs of data consumers, and putting discoverable, understandable, and reliable data at their fingertips.

“People say they want access to all the data, all the time. Take this with a grain of salt. It’s not enough that we break silos and connect all the data sources across the company. It’s not enough that we offer self-service analytics tools to the company. It’s important that we offer reliable and discoverable data, and that less is more.”

Photo by kychan on Unsplash
