Amazon QuickSight is cloud-powered, serverless, and embeddable business intelligence (BI) service that makes it straightforward to deliver insights to your organization. As a fully managed service, Amazon QuickSight lets you create and publish interactive dashboards that can then be accessed from different devices and embedded into your applications, portals, and websites.
When authors create datasets, build dashboards, and share with end-users, the users will see the same data as the author, unless row-level security (RLS) is enabled in the Amazon QuickSight dataset. Amazon QuickSight also provides options to pass a reader’s identity to a data source using trusted identity propagation and apply RLS at the source. To learn more, see Centrally manage permissions for tables and views accessed from Amazon QuickSight with trusted identity propagation and Simplify access management with Amazon Redshift and AWS Lake Formation for users in an External Identity Provider.
However, there are a few requirements when using trusted identity propagation with Amazon QuickSight:
- The authentication method for Amazon QuickSight must be using AWS IAM Identity Center.
- The dataset created using trusted identity propagation will be a direct query dataset in Amazon QuickSight. QuickSight SPICE can’t be used with trusted identity propagation. This is because when using SPICE, data is imported (replicated) and therefore the entitlements at the source can’t be used when readers access the dashboard.
This post outlines a solution to automatically replicate the entitlements for readers from the source (AWS Lake Formation) to Amazon QuickSight. This solution can be used even when the authentication method in Amazon QuickSight is not using IAM Identity Center and can work with both direct query and SPICE datasets in Amazon QuickSight. This lets you take advantage of auto scaling that comes with SPICE. Although we focus on using a Lake Formation table that exists in the same account, you can extend the solution for cross-account tables as well. When extracting data filter rules for the table in another account, the execution role must have necessary access to the tables in the other account.
Use case overview
For this post, let’s consider a large financial institution that has implemented Lake Formation as its central data lake and entitlement management system. The institution aims to streamline access control and maintain a single source of truth for data permissions across its entire data ecosystem. By using Lake Formation for entitlement management, the financial institution can maintain a robust, scalable, and compliant data access control system that serves as the foundation for its data-driven operations and analytics initiatives. This approach is particularly crucial for maintaining compliance with financial regulations and maintaining data security. The analytics team wants to build an Amazon QuickSight dashboard for data and business teams.
Solution overview
This solution uses APIs of AWS Lake Formation and Amazon QuickSight to extract, transform, and store AWS Lake Formation data filters in a format that can be used in QuickSight.
The solution has four key steps:
- Extract and transform the row-level security (data filters) and permissions to data filters for tables of interest from AWS Lake Formation.
- Create a rules dataset in Amazon QuickSight.
We use the following key services:
The following diagram illustrates the solution architecture.
Prerequisites
To implement this solution, you should have following services enabled in the same account
- AWS Lake Formation and
- Amazon QuickSight
- AWS Identity and Access Management (IAM) permissions: Make sure you have necessary IAM permissions to perform operation across all the services mentioned in the solution overview above
- AWS Lake Formation table with data filters with right permissions
- Amazon QuickSight principals (Users or Groups)
The below section shows how you can create Amazon QuickSight groups and AWS Lake formation tables and data filters
Create groups in QuickSight
Create two groups in Amazon QuickSight: QuickSight_Readers and QuickSight_Authors. For instructions, see Create a group with the QuickSight console.
You can then form the Amazon Resource Names (ARNs) of the groups as follows. These will be used when granting permission in AWS Lake Formation for data filters.
arn:aws:quicksight:<
:> <
>:group/< >/QuickSight_Readers arn:aws:quicksight:<
:> <
>:group/< >/QuickSight_Authors
You can also get the ARN of the groups by executing the Amazon QuickSight CLI command list-groups. The following screenshot shows the output.
Create a table in AWS Lake Formation
The following section is for example purposes and not necessary for production use of this solution. Complete the following steps to create a table in AWS Lake Formation using sample data. In this post, the table is called saas_sales
.
- Download the file Saas Sales.csv.
- Upload the file to an Amazon S3 location.
- Create a table in AWS Lake Formation.
Create row-level security (data filter) in AWS Lake Formation
In AWS Lake Formation, data filters are used to filter the data in a table for an individual or group. Complete the following steps to create a data filter:
- Create a data filter called
QuickSightReaderFilter
in the tablesaas_sales
. For Row-level access, enter the expressionsegment="Enterprise"
. - Grant the Amazon QuickSight group access to this data filter. Use the reader group ARN from the first step for SAML Users and groups.
- Grant the
QuickSight_Authors
group full access to the table. Use the reader group ARN from the first step for SAML Users and groups. - (Optional) You can create another table called
second_table
and create another data filter calledSecondFilter
and grant permission to theQuickSight_Readers
group.
Now that you have set up the table, permissions, and data filters, you can extract the row-level access details for the QuickSight_Readers
and QuickSight_Authors
groups and the saas_sales
table in AWS Lake Formation, and create the rules dataset in Amazon QuickSight for the saas_sales
table.
Extract and transform data filters and permissions from AWS Lake Formation using a Lambda function
In AWS Lake Formation, data filters are created for each table. There can be many tables in AWS Lake Formation. However, for a team or a project, there are only a specific set of tables that the BI developer is interested in. Therefore, choose a list of tables to track and update the data filters for. In a batch process, for each table in AWS Lake Formation, extract the data filter definitions and write them into Amazon S3 using AWS Lake Formation and Amazon S3 APIs.
We use the following AWS Lake Formation APIs to extract the data filter details and permissions:
- ListDataCellFilters – This API is used to list all the data filters in each table that is required for the project
- ListPermissions – This API is used to retrieve the permissions for each of the data filters extracted using the
ListDataCellFilters
API
The Lambda function covers three parts of the solution:
- Extract the data filters and permissions to data filters for tables of interest from AWS Lake Formation
- Transform the data filters and permission into a format usable in Amazon QuickSight
- Persist the transformed data
Complete the following steps to create an AWS Lambda function:
- On the Lambda console, create a function called
Lake_Formation_QuickSight_RLS
. Use Python 3.12 as the runtime and create a new role for execution. - Configure Lambda function timeout to 2 minutes. This can vary depending on the number of tables to be parsed and the number of data filters to be transformed.
- Attach the following permissions to the Lambda execution role:
- Set the following environment variables for the Lambda function:
Name Value S3Bucket Value of the S3 bucket where the output files will be stored tablesToTrack List of tables to track as JSON converted to string Tmp /tmp
The Lambda function gets the list of tables and S3 bucket details from the environment variables. The list of tables is given as a JSON array converted to string. The JSON format is shown in the following code. The values for catalogId
, DatabaseName
, and Name
can be fetched from the AWS Lake Formation console.
- Add a folder named
tmp
. - Download the zip file Lake_Formation_QuickSight_RLS.zip.
Note: This is sample code for non-production usage. You should work with your security and legal teams to meet your organizational security, regulatory, and compliance requirements before deployment. - For the Lambda function code, upload the downloaded .zip file to the Lambda function, on the Code tab.
- Provide necessary access to the execution role in AWS Lake Formation. Although the AWS Identity and Access Management (IAM) permissions are given to the Lambda execution role, explicit permission has to be given to the role in AWS Lake Formation for the Lambda function to get the details about the data filters. Therefore, you have to explicitly grant access to the execution role to limit the Lambda role to read-only admin. For more details, see Viewing data filters.
In the following sections, we explain what the Lambda function code does in more detail.
Extract data filters and permissions for data filters and tables in AWS Lake Formation
The main flow of the code takes the list of tables as input and extracts table and data filter permissions and data filter rules. The approach here is to get the permissions for the entire table and also for the data filters applied to the table. This way, both full access (table level) and partial access (data filter) can be extracted.
Transform data filter definitions in to a format usable in Amazon QuickSight
The extracted permissions and filters are transformed to create a rules dataset in Amazon QuickSight. There are different ways to define data filters. The following figure illustrates some of the example transformations.
The function transformDataFilterRules
in the following code can transform some of the OR and AND conditions into Amazon QuickSight acceptable format. The following are the details available in the transformed format:
- Lake Formation catalog ID
- Lake Formation database name
- Lake Formation table name
- Lake Formation data filter name
- List of columns from all the tables provided in the input for which the data filter rules are defined
See the following code:
The following figure is an example of the transformed file. The file contains the columns for both tables. When creating a rules dataset for a specific table, the records are filtered for that table pulled into Amazon QuickSight.
The function transformFilterandTablePermissions
in the following code snippet combines and transforms the table and data filter permissions into a flat structure that contains the following columns:
- Amazon QuickSight group ARN
- Lake Formation catalog ID
- Lake Formation database name
- Lake Formation table name
- Lake Formation data filter name
See the following code:
The following figure is an example of the extracted data filter and table permissions. AWS Lake Formation can have data filters applied to any principal. However, we focus on the Amazon QuickSight principals:
- The
QuickSight_Authors
ARN has full access to two tables. This is determined by transforming the table-level permissions in addition to the data filter permissions. - The
QuickSight_Readers
ARN has limited access based on filter conditions.
Store the transformed rules and permissions in two separate files in Amazon S3
The transformed rules and permissions are then persisted in a data store. In this solution, the transformed rules are written to an Amazon S3 location in CSV format. The name of the files created by the Lambda function are:
transformed_filter_permissions.csv
transformed_filter_rules.csv
See the following code:
Create a rules dataset in Amazon QuickSight
In this section, we walk through the steps to create a rules dataset in Amazon QuickSight.
Create a table in Lake formation for the files
The first step is to create a table in AWS Lake Formation for the two files, transformed_filter_permissions.csv
and transformed_filter_rules.csv
.
Although you can directly use an Amazon S3 connector in Amazon QuickSight, creating a table and making the rules dataset using an Athena connector gives flexibility in writing custom SQL and using direct query. For the steps to bring an Amazon S3 location into AWS Lake Formation, see Creating tables.
For this post, the tables for the files are created in a separate database called quicksight_lf_transformation.
Grant permission for the tables to the QuickSight_Authors group
Grant permission in AWS Lake Formation for the two tables to the QuickSight_Authors group. This is essential for Amazon QuickSight authors to create a rules dataset in Amazon QuickSight. The following screenshot shows the permission details.
Create a rules dataset in Amazon QuickSight
Amazon QuickSight supports both user-level and group-level RLS. In this post, we use groups to enable RLS. To create the rules dataset, you first join the filter permissions table with the filter rules table on the columns catalog
, database
, table
, and filter
. Then you can filter the permissions to include the Amazon QuickSight principals, and include only the columns required for the dataset. The objective in this solution is to build a rules dataset for the saas_sales
table.
Complete the following steps:
- On the Amazon QuickSight console, create a new Athena dataset.
- Specify the following:
- For Catalog, choose
AWSDataCatalog
. - For Database, choose
quicksight_lf_transformation
. - For Table, choose
filter_permissions
.
- For Catalog, choose
- Choose Edit/Preview data.
- Choose Add data.
- Choose Add source.
- Select Athena.
- Specify the following:
- For Catalog, choose
AWSDataCatalog
. - For Database, choose
quicksight_lf_transformation
. - For Table, choose
filter_rules
.
- For Catalog, choose
- Join the permissions table with the data filter rules table on the
catalog
,database
,table
andfilter
columns. - Rename the column group as
GroupArn
. This needs to be done before filter is applied. - Filter the data where column table equals
saas_sales
. - Filter the data where column group is also filtered for values starting with
arn:aws:quicksight
(Amazon QuickSight principals). - Exclude fields that are not part of the
saas_sales
table. - Change Query mode to SPICE.
- Publish the dataset.
If your organization has a mapping of other principals to a Amazon QuickSight group or user, you can apply that mapping before joining the tables.
You can also write the following custom SQL to achieve the same result:
- Name the dataset
LakeFormationRLSDataSet
and publish the dataset.
Test the row-level security
Now you’re ready to test the row-level security by publishing a dashboard as a user in the QuickSight_Authors
group and then viewing the dashboard as a user in the QuickSight_Readers
group.
Publish a dashboard as a QuickSight_Authors group user
As an author who belongs to the QuickSight_Authors
group, the user will be able to see the saas_sales
table in the Athena connector and all the data in the table. As shown in this section, all three segments are visible for the author when creating an analysis and viewing the published dashboard.
- Create a dataset by pulling data from the
saas_sales
table using the Athena connector. - Attach
LakeFormationRLSDataSet
as the RLS dataset for thesaas_sales
dataset. For instructions, see Using row-level security with user-based rules to restrict access to a dataset. - Create an analysis using the
saas_sales
dataset as an author who belongs to theQuickSight_Authors
group. - Publish the dashboard.
- Share the dashboard with the group
QuickSight_Readers
.
View the dashboard as a QuickSight_Readers group user
Complete the following steps to view the dashboard as a QuickSight_Readers
group user:
- Log into Amazon QuickSight as a reader who belongs to the
QuickSight_Readers
group.
The user will be able to see only the segment Enterprise.
- Now, change the RLS in AWS Lake Formation, and set the segment to be
SMB
for the QuickSightReaderFilter. - Run the Lambda function to export and transform the new data filter rules.
- Refresh the SPICE dataset LakeFormationRLSDataSet in Amazon QuickSight.
- When the refresh is complete, refresh the dashboard in the reader login.
Now the reader user will see SMB data.
Cleanup
Amazon QuickSight resources
- Delete the Amazon QuickSight dashboard and analysis created
- Delete the datasets
saas_sales
andLakeFormationRulesDataSet
- Delete the Athena data source
- Delete the QuickSight groups using the DeleteGroup API
AWS Lake Formation resources
- Delete the database
quicksight_lf
transformation created in AWS Lake Formation - Revoke permission given to the Lambda execution role
- Delete the
saas_sales
table and data filters created - If you have used Glue crawler to create the tables in AWS Lake Formation, remove the Glue crawler as well
Compute resources
- Delete the AWS Lambda function created
- Delete the AWS Lambda execution role associated with the lambda
Storage resources
- Empty the content of the Amazon S3 bucket created for this solution
- Delete the Amazon S3 bucket
Conclusion
This post explained how to replicate row-level security in AWS Lake Formation automatically in Amazon QuickSight. This makes sure that the SPICE dataset in QuickSight can use row-level access defined in Lake Formation.
This solution can also be extended for other data sources. The logic to programmatically extract the entitlements from the source and transform them into Amazon QuickSight format will vary by source. After the extract and transform are in place, it can scale to multiple teams in the organization. Although this post laid out a basic approach, the automation has to be either scheduled to run periodically or triggered based on events like data filters change or grant or revoke of AWS Lake Formation permissions to make sure that the entitlements remain in sync between AWS Lake Formation and Amazon QuickSight.
Try out this solution for your own use case, and share your feedback in the comments.
About the Authors
Vetri Natarajan is a Specialist Solutions Architect for Amazon QuickSight. Vetri has 15 years of experience implementing enterprise business intelligence (BI) solutions and greenfield data products. Vetri specializes in integration of BI solutions with business applications and enable data-driven decisions.
Ismael Murillo is a Solutions Architect for Amazon QuickSight. Before joining AWS, Ismael worked in Amazon Logistics (AMZL) with delivery station management, delivery service providers, and our customer actively in the field. Ismael focused on last mile delivery and delivery success. He designed and implemented many innovative solutions to help reduce cost, influence delivery success. He is also a United States Army Veteran, where he served for eleven years.