Athena User-Level Connector with RBAC Integration

Overview

This guide provides step-by-step instructions for setting up an Athena connector with Role-Based Access Control (RBAC) and integrating it with Abacus.AI's ChatLLM platform. This configuration enables secure, user-level access to Athena data through a conversational AI interface, where each user authenticates individually via Amazon Cognito.

Prerequisites

  • An organization-level Athena connector with RBAC enabled (see Athena Connector Setup)
  • An Amazon Cognito User Pool with users provisioned
  • An Amazon Cognito App Client configured for OAuth 2.0 with the redirect URI https://abacus.ai/oauth/callback
  • An Amazon Cognito Identity Pool that maps authenticated Cognito users to IAM roles
  • Appropriate IAM roles and policies for user-level Athena access

Access Control Options

Before diving into setup, decide which access control model fits your needs. There are three options:

Option A — Role-Level (Group-Based) Access

All users in the same Cognito group share the same IAM role and see the same data scope. No per-user filtering.

  • How it works: Cognito User Groups → IAM roles → table/database-level permissions via IAM policies or Lake Formation tags.
  • Best for: Shared dashboards where entire departments access the same datasets (e.g. all finance analysts see all finance tables).

Option B — User-Level (Email-Based) Access

Per-user row isolation using the authenticated user's email address. Users in the same Cognito group can see different rows.

  • How it works: Cognito groups → IAM roles → Lake Formation data cell filters with owner_email = '${session:UserEmail}'.
  • Best for: Scenarios where each user should only see their own records (e.g. sales reps see only their own leads).
  • Requires: sts:TagSession permission on IAM roles, email scope on Cognito App Client.

Option C — Combined (Group + Email) Access

Two-layer security: Lake Formation tags control which tables a group can access (table-level), and data cell filters further restrict rows by user email (row-level).

  • How it works: TBAC (Tag-Based Access Control) for table isolation + email-based cell filters for row isolation within allowed tables.
  • Best for: Multi-department setups where different groups access different tables, AND within those tables, users see only their own data.

Decision Guide

| Need per-user row isolation? | Need department-level table isolation? | Use |
| --- | --- | --- |
| No | No | Option A: Role-level (simplest) |
| Yes | No | Option B: Email-based row filter |
| Yes | Yes | Option C: Combined TBAC + email filter |

Info: Options B and C require additional AWS configuration (Lake Formation data cell filters, sts:TagSession, Cognito email scope). See the Lake Formation Integration and Additional Requirements for Email-Based Filtering sections below.
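
The decision guide can also be written as a small helper, which is handy when the choice is made in a provisioning script. This is only an illustrative sketch: the function name is hypothetical, and mapping the unlisted "No / Yes" combination to Option A is an assumption based on the description of Option A above.

```python
def choose_access_option(per_user_rows: bool, per_department_tables: bool) -> str:
    """Map the two decision-guide questions to an option letter."""
    if per_user_rows and per_department_tables:
        return "C"  # combined TBAC + email filter
    if per_user_rows:
        return "B"  # email-based row filter
    # Assumption: the guide lists only No/No for Option A. No/Yes is treated
    # as Option A too, since group roles can already scope tables.
    return "A"


print(choose_access_option(per_user_rows=True, per_department_tables=False))  # prints B
```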

Architecture Overview

The integration follows a secure RBAC flow:

  1. Organization Admin creates an Athena connector with RBAC enabled, providing Cognito configuration.
  2. Each User authenticates via Cognito OAuth when they first interact with a ChatLLM bot connected to the RBAC-enabled Athena connector.
  3. Abacus.AI exchanges the user's Cognito tokens for temporary AWS credentials via the Cognito Identity Pool.
  4. Queries execute with the user's own AWS permissions, ensuring data access is governed by their specific IAM policies.
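
Step 3 of this flow can be sketched with the AWS SDK for Python. This is not Abacus.AI's actual implementation, only an illustration of how an ID token is typically exchanged for temporary credentials. The function names are hypothetical, and `identity_client` is assumed to behave like `boto3.client("cognito-identity", region_name=region)`.

```python
from typing import Any, Dict


def cognito_login_key(region: str, user_pool_id: str) -> str:
    """Provider name Cognito Identity uses for a User Pool login."""
    return f"cognito-idp.{region}.amazonaws.com/{user_pool_id}"


def exchange_id_token_for_credentials(
    identity_client: Any,
    identity_pool_id: str,
    region: str,
    user_pool_id: str,
    id_token: str,
) -> Dict[str, Any]:
    """Trade a Cognito ID token for temporary AWS credentials."""
    logins = {cognito_login_key(region, user_pool_id): id_token}
    # Resolve (or create) the identity for this user in the Identity Pool.
    identity = identity_client.get_id(IdentityPoolId=identity_pool_id, Logins=logins)
    # Ask for credentials scoped to the IAM role mapped to this identity.
    response = identity_client.get_credentials_for_identity(
        IdentityId=identity["IdentityId"], Logins=logins
    )
    # The dict holds AccessKeyId, SecretKey, SessionToken, and Expiration.
    return response["Credentials"]
```

Passing the client in as an argument keeps the sketch free of hard dependencies and makes it easy to exercise with a stub.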

Connector Configuration Reference

Org-Level Connector Fields (RBAC Enabled)

When creating the Athena connector with RBAC at the organization level, the following fields are sent:

| Field | Key | Required | Description |
| --- | --- | --- | --- |
| Enable RBAC | importRbac | Yes | Must be set to true to enable user-level authentication |
| AWS Region | region | Yes | The AWS region where Athena and Glue are configured (e.g. us-east-2) |
| Glue Database Name | database | Yes | The AWS Glue database containing the tables to query |
| IAM Role ARN | roleArn | Yes | The ARN of an IAM role that Abacus.AI will assume for org-level access |
| Athena Workgroup | workgroup | No | The Athena workgroup to use for queries (defaults to primary) |
| S3 Output Location | outputLocation | No | S3 path for Athena query results (e.g. s3://my-bucket/query-results/) |
| Cognito Domain | cognitoDomain | Yes (RBAC) | The Cognito hosted UI domain prefix (e.g. my-app-auth) |
| Cognito App Client ID | cognitoAppClientId | Yes (RBAC) | The Cognito App Client ID for OAuth |
| Cognito User Pool ID | cognitoUserPoolId | Yes (RBAC) | The Cognito User Pool ID (e.g. us-east-2_aBcDeFgHi) |
| Cognito Identity Pool ID | cognitoIdentityPoolId | Yes (RBAC) | The Cognito Identity Pool ID for obtaining AWS credentials |
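
As a quick pre-flight check, the required fields above can be validated before the connector is created. The sketch below uses the keys from the table. The function name and the shape of the returned messages are hypothetical, and the check is a local convenience, not part of any Abacus.AI API.

```python
from typing import Any, Dict, List

# Keys marked "Yes" or "Yes (RBAC)" in the field reference.
REQUIRED_KEYS = (
    "region",
    "database",
    "roleArn",
    "cognitoDomain",
    "cognitoAppClientId",
    "cognitoUserPoolId",
    "cognitoIdentityPoolId",
)


def connector_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in an RBAC connector config."""
    problems = [
        f"missing required field: {key}" for key in REQUIRED_KEYS if not config.get(key)
    ]
    if config.get("importRbac") is not True:
        problems.insert(0, "importRbac must be true for user-level authentication")
    return problems


example_config = {
    "importRbac": True,
    "region": "us-east-2",
    "database": "my_analytics_db",
    "roleArn": "arn:aws:iam::123456789012:role/Athena-OrgRole",  # placeholder ARN
    "cognitoDomain": "my-app-auth",
    "cognitoAppClientId": "example-client-id",  # placeholder
    "cognitoUserPoolId": "us-east-2_aBcDeFgHi",
    "cognitoIdentityPoolId": "us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
}
print(connector_config_problems(example_config))  # prints []
```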

User Connector Auth Object

The stored user-level connector auth contains:

| Field | Description |
| --- | --- |
| region | Inherited from the org-level connector |
| database | Inherited from the org-level connector |
| cognitoDomain | Inherited from the org-level connector |
| cognitoAppClientId | Inherited from the org-level connector |
| cognitoUserPoolId | Inherited from the org-level connector |
| cognitoIdentityPoolId | Inherited from the org-level connector |
| _id_token | The user's Cognito ID token |
| _access_token | The user's Cognito access token |
| _refresh_token | The user's Cognito refresh token (used for automatic token renewal) |
| is_user_level | true (indicates this is a user-level connector) |

Info: The user's email and role are determined by the Cognito Identity Pool configuration. The Identity Pool maps the authenticated Cognito user to an IAM role, which controls what databases, tables, and S3 buckets the user can access. You can use Cognito User Groups mapped to different IAM roles for fine-grained, role-based access.
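
To make the shape of this object concrete, the sketch below assembles it from an org-level config and a set of tokens. It only mirrors the field list above: the helper name is hypothetical, and how Abacus.AI actually stores the object is not documented here.

```python
from typing import Any, Dict

# Fields the user-level auth object inherits from the org-level connector.
INHERITED_KEYS = (
    "region",
    "database",
    "cognitoDomain",
    "cognitoAppClientId",
    "cognitoUserPoolId",
    "cognitoIdentityPoolId",
)


def build_user_auth(
    org_config: Dict[str, Any], id_token: str, access_token: str, refresh_token: str
) -> Dict[str, Any]:
    """Combine inherited connector fields with one user's Cognito tokens."""
    auth = {key: org_config[key] for key in INHERITED_KEYS}
    auth.update(
        {
            "_id_token": id_token,
            "_access_token": access_token,
            "_refresh_token": refresh_token,
            "is_user_level": True,
        }
    )
    return auth
```

Note that org-only fields such as roleArn are not in the list of inherited fields, so they are not copied.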


AWS Cognito Setup

Step 1: Create a Cognito User Pool

If you don't already have a Cognito User Pool:

  1. Go to the Amazon Cognito Console.
  2. Click Create user pool.
  3. Configure sign-in options (email, username, etc.) according to your organization's requirements.
  4. Complete the creation wizard.

Step 2: Create a Cognito App Client

  1. In your User Pool, navigate to App integration → App clients.
  2. Click Create app client.
  3. Configure the app client:
    • App client name: e.g. AbacusAI-Athena
    • Client type: Confidential client
    • Allowed callback URLs: https://abacus.ai/oauth/callback
    • OAuth 2.0 grant types: Authorization code grant
    • OpenID Connect scopes: openid, profile, email
    • Read attributes: Ensure email is included
  4. Note the App Client ID — you'll need this for the connector configuration.
Caution: The email scope and email read attribute are required if you plan to use email-based row filtering (Options B or C). Without them, the ID token will not contain the email claim, and user-level filters will silently return zero rows.

Step 3: Configure the Cognito Hosted UI Domain

  1. In your User Pool, navigate to App integration → Domain.
  2. Set up a Cognito domain (e.g. my-app-auth). The full domain will be my-app-auth.auth.<region>.amazoncognito.com.
  3. Note the domain prefix — you'll need this for the connector configuration.
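
With the domain prefix and App Client ID in hand, the hosted UI login URL for the authorization code grant can be composed as follows. This is a sketch for testing your Cognito setup by hand. The function name is hypothetical, and the exact URL Abacus.AI generates (for example, its state parameter) may differ.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def hosted_ui_authorize_url(
    domain_prefix: str,
    region: str,
    client_id: str,
    redirect_uri: str = "https://abacus.ai/oauth/callback",
    scopes: Sequence[str] = ("openid", "profile", "email"),
    state: Optional[str] = None,
) -> str:
    """Build the Cognito hosted UI /oauth2/authorize URL."""
    base = f"https://{domain_prefix}.auth.{region}.amazoncognito.com/oauth2/authorize"
    params = {
        "response_type": "code",  # authorization code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
    }
    if state is not None:
        params["state"] = state
    return f"{base}?{urlencode(params)}"


print(hosted_ui_authorize_url("my-app-auth", "us-east-2", "example-client-id"))
```

Opening the printed URL in a browser should show the Cognito login page if the domain, client ID, and callback URL are configured consistently.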

Step 4: Create a Cognito Identity Pool

  1. Go to the Amazon Cognito Federated Identities Console.
  2. Click Create identity pool.
  3. Under Authentication providers, add your Cognito User Pool:
    • User Pool ID: Your User Pool ID (e.g. us-east-2_aBcDeFgHi)
    • App Client ID: The App Client ID from Step 2
  4. Configure the Authenticated role — this IAM role defines what AWS resources authenticated users can access (Athena, Glue, S3).
  5. Note the Identity Pool ID (e.g. us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).

Step 5: Create Cognito User Groups

Cognito User Groups let you assign different IAM roles to different categories of users. Each group maps to an IAM role that defines what AWS resources those users can access.

  1. In your User Pool, navigate to Groups → Create group.
  2. Create groups for each access level. For example:
    • finance_analysts — access to finance tables
    • hr_viewers — access to HR tables
    • admins — access to all tables
  3. For each group, specify the IAM role ARN that group members should assume (you'll create these roles in Step 6).
  4. Assign users to groups: select a group → Add users → select the users who belong to that group.

You can also manage groups via the CLI:

# Create a group with an IAM role
aws cognito-idp create-group \
  --user-pool-id us-east-2_aBcDeFgHi \
  --group-name finance_analysts \
  --role-arn arn:aws:iam::123456789012:role/Athena-FinanceRole

# Add a user to the group
aws cognito-idp admin-add-user-to-group \
  --user-pool-id us-east-2_aBcDeFgHi \
  --username alice.finance \
  --group-name finance_analysts
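
If you provision groups from a script instead of the CLI, the same two calls can be made through the AWS SDK for Python. In this sketch the helper name is hypothetical, and `idp_client` is assumed to behave like `boto3.client("cognito-idp", region_name="us-east-2")`.

```python
from typing import Any, Iterable


def create_group_with_members(
    idp_client: Any,
    user_pool_id: str,
    group_name: str,
    role_arn: str,
    usernames: Iterable[str],
) -> None:
    """Create a Cognito group bound to an IAM role, then add users to it."""
    idp_client.create_group(
        UserPoolId=user_pool_id, GroupName=group_name, RoleArn=role_arn
    )
    for username in usernames:
        idp_client.admin_add_user_to_group(
            UserPoolId=user_pool_id, Username=username, GroupName=group_name
        )


# Example call (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# create_group_with_members(
#     boto3.client("cognito-idp", region_name="us-east-2"),
#     "us-east-2_aBcDeFgHi",
#     "finance_analysts",
#     "arn:aws:iam::123456789012:role/Athena-FinanceRole",
#     ["alice.finance"],
# )
```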

Step 5b: Configure Identity Pool Role Mappings

After creating Cognito groups and their corresponding IAM roles, configure the Identity Pool to map authenticated users to the correct role based on their group membership:

  1. In the Cognito Identity Pool Console, select your identity pool.
  2. Click Edit identity pool → expand Authentication providers.
  3. Under your Cognito User Pool provider, set:
    • Role resolution: Select "Choose role from token". This uses the cognito:groups claim in the user's token to determine which IAM role to assign.
    • Ambiguous role resolution: Select Use most-restrictive authenticated role (for cases where a user belongs to multiple groups).
  4. Save changes.

Alternatively, use Rules-based mapping for more control:

| Priority | Claim | Match type | Value | IAM Role |
| --- | --- | --- | --- | --- |
| 1 | cognito:groups | Contains | finance_analysts | arn:aws:iam::123456789012:role/Athena-FinanceRole |
| 2 | cognito:groups | Contains | hr_viewers | arn:aws:iam::123456789012:role/Athena-HRRole |
| 3 | cognito:groups | Contains | admins | arn:aws:iam::123456789012:role/Athena-AdminRole |

Tip: With Choose role from token, the Identity Pool reads the cognito:groups claim and resolves the role from the group's configured RoleArn. This is the simplest approach when each group maps to exactly one role.
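
To reason about which role a user ends up with under rules-based mapping, it can help to simulate the rule table locally. The sketch below is a simplified model, not Cognito's implementation: rules are checked in priority order and the first match wins, and "Contains" is modeled as list membership on the cognito:groups claim. All names are hypothetical.

```python
from typing import Any, Dict, List, Optional

# The rules from the table above, already sorted by priority.
RULES: List[Dict[str, str]] = [
    {"claim": "cognito:groups", "match": "Contains", "value": "finance_analysts",
     "role": "arn:aws:iam::123456789012:role/Athena-FinanceRole"},
    {"claim": "cognito:groups", "match": "Contains", "value": "hr_viewers",
     "role": "arn:aws:iam::123456789012:role/Athena-HRRole"},
    {"claim": "cognito:groups", "match": "Contains", "value": "admins",
     "role": "arn:aws:iam::123456789012:role/Athena-AdminRole"},
]


def resolve_role(claims: Dict[str, Any], rules: List[Dict[str, str]] = RULES) -> Optional[str]:
    """Return the role ARN of the first matching rule, or None if none match."""
    for rule in rules:
        claim_value = claims.get(rule["claim"])
        if claim_value is None:
            continue
        if rule["match"] == "Contains" and rule["value"] in claim_value:
            return rule["role"]
        if rule["match"] == "Equals" and claim_value == rule["value"]:
            return rule["role"]
    return None  # the pool's ambiguous/default role setting would apply here


print(resolve_role({"cognito:groups": ["hr_viewers", "admins"]}))
# prints arn:aws:iam::123456789012:role/Athena-HRRole
```

A user in both hr_viewers and admins resolves to the HR role here because that rule has the higher priority, which is worth keeping in mind when ordering rules.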

Step 6: Configure IAM Roles for Authenticated Users

Create one IAM role per Cognito group. The IAM role assigned to authenticated users in the Identity Pool should have permissions to query Athena. Example policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AthenaQueryExecution",
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:StopQueryExecution"
      ],
      "Resource": "arn:aws:athena:<region>:<account_id>:workgroup/*"
    },
    {
      "Sid": "GlueCatalogRead",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetTable",
        "glue:GetTables"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account_id>:catalog",
        "arn:aws:glue:<region>:<account_id>:database/<database_name>",
        "arn:aws:glue:<region>:<account_id>:table/<database_name>/*"
      ]
    },
    {
      "Sid": "S3DataRead",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<data_bucket>",
        "arn:aws:s3:::<data_bucket>/*"
      ]
    },
    {
      "Sid": "S3ResultsReadWrite",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<output_bucket>",
        "arn:aws:s3:::<output_bucket>/*"
      ]
    }
  ]
}
Tip: You can use fine-grained IAM policies or AWS Lake Formation to control which databases, tables, and columns each user can access. Map different Cognito groups to different IAM roles in the Identity Pool for role-based access.
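
Because you need one role per group, it can be convenient to generate the example policy from parameters instead of editing placeholders by hand. The sketch below reproduces the policy shown above. The function name is hypothetical, and the sample argument values are placeholders.

```python
import json
from typing import Any, Dict


def build_athena_role_policy(
    region: str, account_id: str, database: str, data_bucket: str, output_bucket: str
) -> Dict[str, Any]:
    """Fill the example Athena/Glue/S3 policy with concrete values."""
    glue = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AthenaQueryExecution",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:StopQueryExecution",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/*",
            },
            {
                "Sid": "GlueCatalogRead",
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetTables"],
                "Resource": [
                    f"{glue}:catalog",
                    f"{glue}:database/{database}",
                    f"{glue}:table/{database}/*",
                ],
            },
            {
                "Sid": "S3DataRead",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{data_bucket}", f"arn:aws:s3:::{data_bucket}/*"],
            },
            {
                "Sid": "S3ResultsReadWrite",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                "Resource": [f"arn:aws:s3:::{output_bucket}", f"arn:aws:s3:::{output_bucket}/*"],
            },
        ],
    }


policy = build_athena_role_policy(
    "us-east-2", "123456789012", "my_analytics_db", "my-athena-data-bucket", "my-bucket"
)
print(json.dumps(policy, indent=2))
```

To follow the least-privilege advice later in this guide, consider narrowing the workgroup and table resources per role instead of keeping the wildcards.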


Lake Formation Integration

AWS Lake Formation provides fine-grained access control (column-level and row-level security) for data in the Glue Data Catalog. This section is required for Options B and C.

Register S3 Locations

  1. Go to the Lake Formation Console → Data lake locations → Register location.
  2. Register the S3 paths where your data resides (e.g. s3://my-athena-data-bucket/).
  3. Select the IAM role Lake Formation should use to access the data (or use the service-linked role).

Set Lake Formation Admin

  1. In the Lake Formation Console, go to Administrative roles and tasks → Data lake administrators.
  2. Add your admin IAM user or role as a Lake Formation administrator.

Create Lake Formation Tags (Options A and C)

LF-Tags enable Tag-Based Access Control (TBAC) for table-level isolation:

  1. Go to LF-Tags → Add LF-Tag.
  2. Create tags that model your access control dimensions. For example:
    • Tag key: Department, Values: finance, hr, engineering
    • Tag key: UserRole, Values: analyst, viewer, admin
  3. Assign tags to your Glue databases and tables:
    • Navigate to Data lake permissions → LF-Tags → select a database or table → Assign LF-Tag.
    • For example, assign Department=finance to your finance tables.

Create Data Cell Filters (Options B and C)

Data cell filters enable row-level and column-level security:

  1. Go to Data filters → Create new filter.
  2. Configure the filter:

For department-based row filtering:

Filter name: finance_department_filter
Target database: my_analytics_db
Target table: transactions
Row filter expression: department = 'finance'

For email-based row filtering (Options B and C):

Filter name: email_row_filter
Target database: my_analytics_db
Target table: transactions
Row filter expression: owner_email = '${session:UserEmail}'

The ${session:UserEmail} variable is automatically resolved to the authenticated user's email address at query time.
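
The same filter can be defined programmatically. The sketch below builds the request for the Lake Formation `create_data_cells_filter` operation using the names from the example above. The helper name is hypothetical, the account ID is a placeholder, and the row filter expression is copied from this guide as-is.

```python
from typing import Any, Dict


def data_cells_filter_request(
    catalog_id: str, database: str, table: str, name: str, row_filter: str
) -> Dict[str, Any]:
    """Build keyword arguments for lakeformation.create_data_cells_filter."""
    return {
        "TableData": {
            "TableCatalogId": catalog_id,  # AWS account ID that owns the catalog
            "DatabaseName": database,
            "TableName": table,
            "Name": name,
            "RowFilter": {"FilterExpression": row_filter},
            "ColumnWildcard": {},  # no excluded columns: expose all columns
        }
    }


request = data_cells_filter_request(
    "123456789012",
    "my_analytics_db",
    "transactions",
    "email_row_filter",
    "owner_email = '${session:UserEmail}'",
)
# With boto3 installed and credentials configured:
# import boto3
# boto3.client("lakeformation").create_data_cells_filter(**request)
```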

Grant Lake Formation Permissions

Grant SELECT permission on the data cell filters to the appropriate IAM roles:

  1. Go to Data lake permissions → Grant.
  2. Select IAM users and roles → choose the IAM role for the Cognito group.
  3. Under LF-Tags or catalog resources, select the database and table.
  4. Under Data filters, select the cell filter you created.
  5. Grant SELECT permission.
Note: For Option C (combined TBAC + email filter), you may need separate tables per department because AWS Lake Formation allows only one LF-tag value per key per resource. Assign the department tag to each table, then apply the email-based data cell filter within each table.
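
The console grant described in the steps above corresponds to the Lake Formation `grant_permissions` operation with a data cells filter as the resource. A minimal sketch of the request follows. The helper name is hypothetical, and the account ID and role ARN are placeholders.

```python
from typing import Any, Dict


def grant_filter_select_request(
    role_arn: str, catalog_id: str, database: str, table: str, filter_name: str
) -> Dict[str, Any]:
    """Build keyword arguments for lakeformation.grant_permissions."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "DataCellsFilter": {
                "TableCatalogId": catalog_id,
                "DatabaseName": database,
                "TableName": table,
                "Name": filter_name,
            }
        },
        "Permissions": ["SELECT"],
    }


grant = grant_filter_select_request(
    "arn:aws:iam::123456789012:role/Athena-FinanceRole",
    "123456789012",
    "my_analytics_db",
    "transactions",
    "email_row_filter",
)
# With boto3 installed and credentials configured:
# import boto3
# boto3.client("lakeformation").grant_permissions(**grant)
```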


Additional Requirements for Email-Based Filtering

If you're using Option B or Option C (email-based row filtering), the following additional configuration is required.

Add sts:TagSession to IAM Roles

Every IAM role mapped to a Cognito group must include the sts:TagSession permission. This allows the UserEmail session tag to propagate through the assume-role chain to Lake Formation.

Add this statement to each role's permissions policy:

{
  "Sid": "AllowTagSession",
  "Effect": "Allow",
  "Action": "sts:TagSession",
  "Resource": "*"
}
Caution: Without sts:TagSession, the ${session:UserEmail} variable in Lake Formation data cell filters will not resolve, and queries will silently return zero rows instead of failing with an error.
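
Because a missing permission fails silently, it is worth checking each role's policy document before testing queries. The sketch below scans a policy for an Allow statement covering sts:TagSession. It is deliberately simplified: the function name is hypothetical, and Deny statements, conditions, and resource scoping are ignored.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict


def allows_tag_session(policy: Dict[str, Any]) -> bool:
    """Return True if any Allow statement's Action matches sts:TagSession."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive and may contain wildcards.
        if any(fnmatchcase("sts:tagsession", action.lower()) for action in actions):
            return True
    return False


example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AllowTagSession", "Effect": "Allow", "Action": "sts:TagSession", "Resource": "*"}
    ],
}
print(allows_tag_session(example_policy))  # prints True
```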

Cognito App Client Email Scope

Ensure your Cognito App Client (from Step 2) has:

  • email in AllowedOAuthScopes
  • email in ReadAttributes

This ensures the ID token contains the email claim, which is used as the session tag value.


Token Claims Reference

When a user authenticates via Cognito, the ID token contains claims that drive access control decisions. Here are the key claims:

| Claim | Example Value | Used For |
| --- | --- | --- |
| sub | a1b2c3d4-e5f6-7890-abcd-ef1234567890 | Unique user identifier, audit trails |
| cognito:username | alice.finance | Display name in the application |
| cognito:groups | ["finance_analysts"] | Identity Pool role mapping (determines IAM role) |
| email | alice@example.com | Row-level filter via ${session:UserEmail} (Options B/C) |
| email_verified | true | Trust check: confirms email ownership |
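
When debugging, you can inspect these claims by decoding the payload segment of an ID token. The sketch below does this with the standard library only. It does not verify the signature, so use it for troubleshooting and never for authorization decisions. The function name is hypothetical.

```python
import base64
import json
from typing import Any, Dict


def decode_jwt_claims(token: str) -> Dict[str, Any]:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]  # header.payload.signature
    # JWTs strip base64 padding, so restore it before decoding.
    padding = "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))
```

If the decoded claims lack email or cognito:groups, revisit the App Client scopes and the user's group membership before looking at Lake Formation.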

What Each User Sees — Concrete Examples

To illustrate how the access control layers work together, consider a setup with two departments and email-based row filtering (Option C):

Setup:

  • finance_analysts group → Athena-FinanceRole → LF-Tag grants access to finance_table
  • hr_viewers group → Athena-HRRole → LF-Tag grants access to hr_table
  • Data cell filter: owner_email = '${session:UserEmail}' on both tables
  • alice@example.com is in finance_analysts, assigned to rows 1, 2, 3 in finance_table
  • bob@example.com is in hr_viewers
  • dave@example.com is in finance_analysts, assigned to rows 4, 5, 6 in finance_table

Query results:

| User | Query | TBAC Check | Email Filter | Result |
| --- | --- | --- | --- | --- |
| alice | SELECT * FROM finance_table | ✅ PASS (finance group has access) | Filters to owner_email = 'alice@example.com' | Sees rows 1, 2, 3 |
| alice | SELECT * FROM hr_table | ❌ DENIED (finance group has no access) | N/A | Table is invisible |
| bob | SELECT * FROM finance_table | ❌ DENIED (hr group has no access) | N/A | Table is invisible |
| bob | SELECT * FROM hr_table | ✅ PASS (hr group has access) | Filters to owner_email = 'bob@example.com' | Sees only bob's rows |
| dave | SELECT * FROM finance_table | ✅ PASS (finance group has access) | Filters to owner_email = 'dave@example.com' | Sees rows 4, 5, 6 (different from alice!) |

This demonstrates the two-layer security model: TBAC controls which tables a user can see, and email-based cell filters control which rows within those tables.
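
The two layers can be mimicked in a few lines of Python to check your own expectations before configuring AWS. This is a toy model of the setup above, not how Lake Formation evaluates permissions. The single hr_table row is made up for illustration, and a denied table is modeled as a PermissionError.

```python
from typing import Dict, List, Set

# Layer 1 (TBAC): which tables each group may access.
GROUP_TABLES: Dict[str, Set[str]] = {
    "finance_analysts": {"finance_table"},
    "hr_viewers": {"hr_table"},
}

# Sample data: finance rows follow the setup above; the hr row is hypothetical.
TABLES: Dict[str, List[Dict[str, object]]] = {
    "finance_table": [
        {"id": i, "owner_email": "alice@example.com" if i <= 3 else "dave@example.com"}
        for i in range(1, 7)
    ],
    "hr_table": [{"id": 1, "owner_email": "bob@example.com"}],
}


def query(table: str, groups: List[str], email: str) -> List[Dict[str, object]]:
    """Apply the table-level check, then the email row filter."""
    if not any(table in GROUP_TABLES.get(group, set()) for group in groups):
        raise PermissionError(f"{table} is not visible to groups {groups}")
    # Layer 2: owner_email = '${session:UserEmail}'
    return [row for row in TABLES[table] if row["owner_email"] == email]


alice_rows = query("finance_table", ["finance_analysts"], "alice@example.com")
print([row["id"] for row in alice_rows])  # prints [1, 2, 3]
```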


Abacus.AI Setup

Step 7: Create an Athena Connector with RBAC Enabled

Follow the Athena Connector Setup Instructions and turn on the Enable RBAC toggle. Then fill in the additional Cognito fields:

  • Cognito Domain: The hosted UI domain prefix (e.g. my-app-auth)
  • Cognito App Client ID: From Step 2
  • Cognito User Pool ID: From Step 1 (e.g. us-east-2_aBcDeFgHi)
  • Cognito Identity Pool ID: From Step 4 (e.g. us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)

Click Create and then Verify Now to confirm the org-level connector works.

Step 8: Create a Project

  1. Navigate to the projects page by clicking on the Abacus.AI logo.
  2. Create a new project.
  3. Select GenAI → Custom Chatbot.
  4. Enter a descriptive name for your project.
  5. Select Skip to project dashboard.

Step 9: Train a Model with the Athena Connector

  1. Click on Model in the left toolbar and select Train Model.
  2. From the dropdown in Structured data source, select External Databases.
  3. Select the RBAC-enabled Athena connector you created in Step 7.
  4. Click Train Model to begin training.

Step 10: Deploy the Model

  1. Once training is complete, click on Models and select your trained model.
  2. Click Create a new deployment.
  3. Select Offline Batch + Realtime deployment type and click Next.
  4. Enter a deployment name and click Deploy.
  5. Wait for the deployment to reach Active state.

User Authentication Flow

Step 11: User-Level Authentication

When end users access the ChatLLM bot connected to the RBAC-enabled Athena connector:

  1. On their first query, users are prompted to authenticate with their AWS Cognito account.
  2. They are redirected to the Cognito hosted UI login page.
  3. After successful authentication, Abacus.AI receives OAuth tokens and exchanges them for temporary AWS credentials via the Identity Pool.
  4. All subsequent queries execute with the user's own AWS permissions.
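
For readers who want to see what the code-for-token exchange in step 3 involves, the sketch below builds the request for Cognito's /oauth2/token endpoint. It illustrates the standard authorization code exchange for a confidential client, not Abacus.AI's internal code. The function name is hypothetical, and sending the request (for example with urllib.request) is left out.

```python
import base64
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_token_request(
    domain_prefix: str,
    region: str,
    client_id: str,
    client_secret: str,
    code: str,
    redirect_uri: str = "https://abacus.ai/oauth/callback",
) -> Tuple[str, Dict[str, str], str]:
    """Return (url, headers, body) for the authorization code exchange."""
    url = f"https://{domain_prefix}.auth.{region}.amazoncognito.com/oauth2/token"
    # Confidential clients authenticate with HTTP Basic: client_id:client_secret.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Authorization": f"Basic {basic}",
    }
    body = urlencode(
        {
            "grant_type": "authorization_code",
            "client_id": client_id,
            "code": code,
            "redirect_uri": redirect_uri,  # must match the App Client callback URL
        }
    )
    return url, headers, body
```

A mismatch between redirect_uri here and the App Client's allowed callback URL is a common cause of failed exchanges.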

Step 12: Interactive Chat Experience

Once authenticated, users can interact with the chat interface to query their Athena data. The system processes queries and returns results based on each user's individual RBAC permissions.

The user's Athena connector will appear alongside other user-level connectors in the Connectors List.

Security Considerations

RBAC Implementation

  • User-Level Authentication: Each user must authenticate individually with their Cognito credentials.
  • Permission Inheritance: Users can only access data according to their IAM roles mapped through the Cognito Identity Pool.
  • Token Management: Cognito refresh tokens are used to maintain sessions securely. Abacus.AI automatically refreshes expired tokens.
  • Temporary Credentials: AWS credentials obtained via the Identity Pool are temporary and scoped to the user's IAM role.

Best Practices

  1. Cognito User Groups: Use Cognito groups mapped to different IAM roles in the Identity Pool to implement fine-grained access control.
  2. Lake Formation: For column-level and row-level security, integrate with AWS Lake Formation.
  3. Token Expiry: Configure appropriate token validity durations in the Cognito App Client settings.
  4. Monitoring: Use AWS CloudTrail and Athena query history to audit user access patterns.
  5. Least Privilege: Grant each user role only the minimum permissions needed for their specific data access requirements.

Troubleshooting

Common Issues

Authentication Failures

  • Verify the Cognito Domain, App Client ID, User Pool ID, and Identity Pool ID are correctly configured in the Abacus.AI connector.
  • Ensure the App Client's Allowed callback URL is set to https://abacus.ai/oauth/callback.
  • Confirm that the user exists in the Cognito User Pool and their account is confirmed/active.

"Failed to exchange Cognito auth code for tokens" Error

  • Check that the Cognito App Client has the required OAuth scopes enabled (openid and profile, plus email if you use Options B or C).
  • Verify the Cognito hosted UI domain is correctly configured and accessible.

"Failed to get AWS credentials from Cognito Identity Pool" Error

  • Ensure the Identity Pool is configured with the correct User Pool ID and App Client ID as an authentication provider.
  • Verify the authenticated IAM role has the necessary Athena, Glue, and S3 permissions.

Data Access Errors

  • Check the IAM role permissions for the specific user's Cognito group.
  • Verify Lake Formation grants if using Lake Formation for fine-grained access control.

Conclusion

This setup provides a secure, scalable solution for enabling conversational access to Athena data while maintaining strict RBAC controls. Users authenticate individually via Amazon Cognito, and their queries execute with user-specific AWS credentials, ensuring data access is always governed by their assigned permissions.