Friday, November 22, 2024

Tala: An Active Metadata Pioneer – Atlan


Supporting a World-class Documentation Strategy with Atlan

The Active Metadata Pioneers series features Atlan customers who have completed a thorough evaluation of the Active Metadata Management market. Paying forward what you’ve learned to the next data leader is the true spirit of the Atlan community! So they’re here to share their hard-earned perspective on an evolving market, what makes up their modern data stack, innovative use cases for metadata, and more.

In this installment of the series, we meet Tina Wang, Analytics Engineering Manager at Tala, a digital financial services platform with eight million customers, named to Forbes’ FinTech 50 list for eight consecutive years. She shares Tala’s two-year journey with Atlan, and how the company’s strong culture of documentation supports its migration to a new, state-of-the-art data platform.

This interview has been edited for brevity and clarity.


Could you tell us a bit about yourself, your background, and what drew you to Data & Analytics?

From the beginning, I’ve been very interested in business, economics, and data, and that’s why I chose to double major in Economics and Statistics at UCLA. I’ve been in the data space ever since. My professional background has been in start-ups, and in past experience, I’ve always been the first person on the data team, which includes setting up all the infrastructure, building reports, finding insights, and lots of communication with people. At Tala, I get to work with a team to design and build new data infrastructure. I find that work super interesting and cool, and that’s why I’ve stayed in this field.

Would you mind describing Tala, and how your data team supports the organization?

Tala is a FinTech company. At Tala, we know today’s financial infrastructure doesn’t work for most of the world’s population. We’re applying advanced technology and human creativity to solve what legacy institutions can’t or won’t, in order to unleash the economic power of the Global Majority.

The Analytics Engineering team serves as a layer between back-end engineering teams and various Business Analysts. We build infrastructure, we clean up data, we set up tasks, and we make sure data is easy to find and ready to be used. We’re here to make sure data is clean, reliable, and reusable, so analysts on teams like Marketing and Operations can focus on analysis and generating insights.

What does your data stack look like?

We mainly use dbt to develop our infrastructure, Snowflake to curate, and Looker to visualize. It’s been great that Atlan connects to all three, and supports our process of writing documentation in dbt YAML files and automatically syncing it to Snowflake and Looker. We really like that automation: the Analytics Engineering team doesn’t need to go into Atlan to update information. It just flows through from dbt, and our business users can use Atlan directly as their data dictionary.

Could you describe your journey with Atlan, so far? Who’s getting value from using it?

We’ve been with Atlan for more than two years, and I believe we were one of your earlier users. It’s been very, very helpful.

We started to build a Presentation Layer (PL) with dbt one year ago, and before that, we used Atlan to document all our old infrastructure manually. Previously, documentation was inconsistent between teams, and it was often challenging to chase down what a table or column meant.

Now, as we’re building this PL, our goal is to document every single column and table that is exposed to the end user, and Atlan has been pretty handy for us. It’s very easy to document, and very straightforward for the business users. They can go to Atlan and search for a table or a column, they can even search for the description, saying something like, “Give me all the columns that have people information.”

On the Analytics Engineering team, we’re normally the curators of that documentation. When we build tables, we sync with the service owners who created the DB to understand the schema, and when we build columns, we organize them in a reader-friendly manner and put them into a dbt YAML file, which flows into Atlan. We also go into Atlan and add Readmes, if they’re needed.

Business users don’t use dbt, and Atlan is the only way for them to access Snowflake documentation. They’ll go into Atlan, search for a particular table or column, read the documentation, and find out who the owner is. They can also go to the lineage page to see how one table is related to another and what code generates the table. The best thing about lineage is that it is fully automated. It has been very helpful for data exploration when someone is not familiar with a new data source.
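The goal described above, documenting every exposed table and column in dbt YAML, can be checked mechanically. The following is a minimal sketch, not Tala’s actual tooling: it assumes the schema file has already been parsed into a Python dict with the usual dbt layout (`models`, each with `name`, `description`, and `columns`), and the model and column names are invented for illustration.

```python
from typing import Any

# Hypothetical content of a dbt schema file, shown as the dict a YAML parser
# would return. Model and column names are invented for illustration.
schema: dict[str, Any] = {
    "version": 2,
    "models": [
        {
            "name": "pl_customers",
            "description": "One row per customer.",
            "columns": [
                {"name": "customer_id", "description": "Unique customer key."},
                {"name": "signup_date", "description": ""},  # blank description
                {"name": "country"},                         # description missing
            ],
        }
    ],
}


def undocumented(schema: dict[str, Any]) -> list[str]:
    """Return the models and columns that have no non-blank description."""
    missing: list[str] = []
    for model in schema.get("models", []):
        if not (model.get("description") or "").strip():
            missing.append(model["name"])
        for column in model.get("columns", []):
            if not (column.get("description") or "").strip():
                missing.append(f'{model["name"]}.{column["name"]}')
    return missing


print(undocumented(schema))
```

A check like this could run before changes are merged, so gaps are caught before the documentation flows downstream to the catalog.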

What’s next for you and your team? Anything you’re excited about building?

We have been looking into the dbt semantic layer in the past year. It will help further centralize business metric definitions and avoid duplicated definitions among various analysis teams in the company. After we mostly finish our presentation layer, we will build the dbt semantic layer on top of the presentation layer to make reporting and visualizations more seamless.

Do you have any advice to share with your peers from this experience?

Document. Definitely document.

In one of my previous jobs, there was zero documentation on their database, but their database was very small. As the first hire, I was a strong advocate for documentation, so I went in to document the whole thing, but that could live in a Google spreadsheet, which isn’t really sustainable for larger organizations with millions of tables.

Coming to Tala, I found there was so much data, it was challenging to navigate. That’s why we started the documentation process before we built the new infrastructure. We documented our old infrastructure for a year, which was not wasted time because as we’re building the new infrastructure, it’s easy for us to refer back to the old documentation.

So, I really emphasize documentation. The very start is the time and the place to centralize your knowledge, so whenever someone leaves, the knowledge stays, and it’s much easier for new people to onboard. Nobody has to play guessing games. It’s centralized, and there’s no question.

Sometimes different teams have different definitions for similar terms. Even in those cases, we’ll use the SQL to document, so we can say, “This is the formula that derives this definition of Profit.”

You want to leave very little room for misinterpretation. That’s really what I’d like to emphasize.
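One way to picture documenting competing definitions by their SQL is to store each team’s formula next to the term it defines. This is only an illustrative sketch: the team names, formulas, and the `MetricDefinition` class are assumptions made up for the example, not Tala’s definitions or an Atlan or dbt API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """A business term paired with the SQL expression that derives it."""
    name: str
    team: str
    sql: str

    def describe(self) -> str:
        return f"{self.name} ({self.team}): derived as `{self.sql}`"


# Hypothetical definitions: two teams use the same term with different formulas.
definitions = [
    MetricDefinition("Profit", "Finance", "SUM(revenue) - SUM(cost)"),
    MetricDefinition("Profit", "Marketing",
                     "SUM(revenue) - SUM(cost) - SUM(ad_spend)"),
]


def definitions_for(name: str, items: list[MetricDefinition]) -> list[MetricDefinition]:
    """Find every recorded definition of a term, ignoring letter case."""
    return [d for d in items if d.name.lower() == name.lower()]


for definition in definitions_for("profit", definitions):
    print(definition.describe())
```

Keeping the formula with the term means a reader sees exactly which calculation each team means, rather than inferring it from the name.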

Anything else you’d like to share?

I still have the spreadsheet from two years ago when I looked for documentation tools. I did a lot of market research, looking at 20 different vendors and every tool I could find. What was important to me was finding a platform that could connect to all the tools I was already using, which were dbt, Snowflake, and Looker, and that had a strong support team. I knew that when we first onboarded, we would have questions, and we would be setting up a lot of permissions and data connections, and that a strong support team would be very helpful.

I remember when we first worked with the team, everybody that I interacted with from Atlan was super helpful and very generous with their time. Now, we’re pretty much running by ourselves, and I’m always proud that I found and chose Atlan.

Photo by Priscilla Du Preez 🇨🇦 on Unsplash
