Athena User-Level Connector with RBAC Integration
Overview
This guide provides step-by-step instructions for setting up an Athena connector with Role-Based Access Control (RBAC) and integrating it with Abacus.AI's ChatLLM platform. This configuration enables secure, user-level access to Athena data through a conversational AI interface, where each user authenticates individually via Amazon Cognito.
Prerequisites
- An organization-level Athena connector with RBAC enabled (see Athena Connector Setup)
- An Amazon Cognito User Pool with users provisioned
- An Amazon Cognito App Client configured for OAuth 2.0 with the redirect URI https://abacus.ai/oauth/callback
- An Amazon Cognito Identity Pool that maps authenticated Cognito users to IAM roles
- Appropriate IAM roles and policies for user-level Athena access
Access Control Options
Before diving into setup, decide which access control model fits your needs. There are three options:
Option A — Role-Level (Group-Based) Access
All users in the same Cognito group share the same IAM role and see the same data scope. No per-user filtering.
- How it works: Cognito User Groups → IAM roles → table/database-level permissions via IAM policies or Lake Formation tags.
- Best for: Shared dashboards where entire departments access the same datasets (e.g. all finance analysts see all finance tables).
Option B — User-Level (Email-Based) Access
Per-user row isolation using the authenticated user's email address. Users in the same Cognito group can see different rows.
- How it works: Cognito groups → IAM roles → Lake Formation data cell filters with owner_email = '${session:UserEmail}'.
- Best for: Scenarios where each user should only see their own records (e.g. sales reps see only their own leads).
- Requires: sts:TagSession permission on IAM roles, email scope on Cognito App Client.
Option C — Combined (Group + Email) Access
Two-layer security: Lake Formation tags control which tables a group can access (table-level), and data cell filters further restrict rows by user email (row-level).
- How it works: TBAC (Tag-Based Access Control) for table isolation + email-based cell filters for row isolation within allowed tables.
- Best for: Multi-department setups where different groups access different tables, AND within those tables, users see only their own data.
Decision Guide
| Need per-user row isolation? | Need department-level table isolation? | → Use |
|---|---|---|
| No | No | Option A — Role-level (simplest) |
| Yes | No | Option B — Email-based row filter |
| Yes | Yes | Option C — Combined TBAC + email filter |
Options B and C require additional AWS configuration (Lake Formation data cell filters, sts:TagSession, Cognito email scope). See the Lake Formation Integration and Additional Requirements for Email-Based Filtering sections below.
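The decision guide can be expressed as a small lookup. The sketch below is illustrative only: the function names are hypothetical, and mapping the "no row isolation, but table isolation" case to Option A is an assumption based on the Option A description, since the table above does not list that combination.

```python
def choose_access_option(per_user_rows: bool, department_tables: bool) -> str:
    """Return the option letter suggested by the decision guide."""
    if per_user_rows and department_tables:
        return "C"  # combined TBAC + email filter
    if per_user_rows:
        return "B"  # email-based row filter
    # Assumption: the guide has no (No, Yes) row. Option A's group-to-role
    # mapping is treated here as covering table-level isolation as well.
    return "A"


def extra_requirements(option: str) -> list:
    """Extra AWS configuration the guide lists for Options B and C."""
    if option in ("B", "C"):
        return [
            "Lake Formation data cell filters",
            "sts:TagSession",
            "Cognito email scope",
        ]
    return []


if __name__ == "__main__":
    option = choose_access_option(per_user_rows=True, department_tables=False)
    print(option, extra_requirements(option))
```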
Architecture Overview
The integration follows a secure RBAC flow:
- Organization Admin creates an Athena connector with RBAC enabled, providing Cognito configuration.
- Each User authenticates via Cognito OAuth when they first interact with a ChatLLM bot connected to the RBAC-enabled Athena connector.
- Abacus.AI exchanges the user's Cognito tokens for temporary AWS credentials via the Cognito Identity Pool.
- Queries execute with the user's own AWS permissions, ensuring data access is governed by their specific IAM policies.
Connector Configuration Reference
Org-Level Connector Fields (RBAC Enabled)
When creating the Athena connector with RBAC at the organization level, the following fields are sent:
| Field | Key | Required | Description |
|---|---|---|---|
| Enable RBAC | importRbac | Yes | Must be set to true to enable user-level authentication |
| AWS Region | region | Yes | The AWS region where Athena and Glue are configured (e.g. us-east-2) |
| Glue Database Name | database | Yes | The AWS Glue database containing the tables to query |
| IAM Role ARN | roleArn | Yes | The ARN of an IAM role that Abacus.AI will assume for org-level access |
| Athena Workgroup | workgroup | No | The Athena workgroup to use for queries (defaults to primary) |
| S3 Output Location | outputLocation | No | S3 path for Athena query results (e.g. s3://my-bucket/query-results/) |
| Cognito Domain | cognitoDomain | Yes (RBAC) | The Cognito hosted UI domain prefix (e.g. my-app-auth) |
| Cognito App Client ID | cognitoAppClientId | Yes (RBAC) | The Cognito App Client ID for OAuth |
| Cognito User Pool ID | cognitoUserPoolId | Yes (RBAC) | The Cognito User Pool ID (e.g. us-east-2_aBcDeFgHi) |
| Cognito Identity Pool ID | cognitoIdentityPoolId | Yes (RBAC) | The Cognito Identity Pool ID for obtaining AWS credentials |
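As a rough pre-flight check, the required keys in the table can be validated before the connector is created. This is a minimal sketch: the helper name, the example role name, and the client ID value are hypothetical placeholders, and Abacus.AI may apply additional validation that is not shown here.

```python
# Keys taken from the "Key" column of the table above.
REQUIRED_KEYS = ("region", "database", "roleArn")
RBAC_KEYS = (
    "cognitoDomain",
    "cognitoAppClientId",
    "cognitoUserPoolId",
    "cognitoIdentityPoolId",
)


def missing_connector_fields(config: dict) -> list:
    """Return required keys that are absent or empty in a connector payload."""
    missing = [k for k in REQUIRED_KEYS + RBAC_KEYS if config.get(k) in (None, "")]
    if config.get("importRbac") is not True:
        # The table states importRbac must be true for user-level auth.
        missing.insert(0, "importRbac")
    return missing


# Placeholder values; the role name and client ID are made up for illustration.
EXAMPLE_CONFIG = {
    "importRbac": True,
    "region": "us-east-2",
    "database": "my_analytics_db",
    "roleArn": "arn:aws:iam::123456789012:role/AbacusAI-Athena-OrgRole",
    "cognitoDomain": "my-app-auth",
    "cognitoAppClientId": "example-app-client-id",
    "cognitoUserPoolId": "us-east-2_aBcDeFgHi",
    "cognitoIdentityPoolId": "us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
}

if __name__ == "__main__":
    print(missing_connector_fields(EXAMPLE_CONFIG))
```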
User Connector Auth Object
The stored user-level connector auth contains:
| Field | Description |
|---|---|
| region | Inherited from the org-level connector |
| database | Inherited from the org-level connector |
| cognitoDomain | Inherited from the org-level connector |
| cognitoAppClientId | Inherited from the org-level connector |
| cognitoUserPoolId | Inherited from the org-level connector |
| cognitoIdentityPoolId | Inherited from the org-level connector |
| _id_token | The user's Cognito ID token |
| _access_token | The user's Cognito access token |
| _refresh_token | The user's Cognito refresh token (used for automatic token renewal) |
| is_user_level | true (indicates this is a user-level connector) |
The user's email comes from the email claim in the Cognito ID token, and the user's role is determined by the Cognito Identity Pool configuration. The Identity Pool maps the authenticated Cognito user to an IAM role, which controls what databases, tables, and S3 buckets the user can access. You can use Cognito User Groups mapped to different IAM roles for fine-grained, role-based access.
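To make the shape of the auth object concrete, the sketch below assembles it from an org-level configuration plus the three Cognito tokens. The function name is hypothetical and the way Abacus.AI actually stores this object is internal; only the field names come from the table above.

```python
# Fields the table marks as inherited from the org-level connector.
INHERITED_KEYS = (
    "region",
    "database",
    "cognitoDomain",
    "cognitoAppClientId",
    "cognitoUserPoolId",
    "cognitoIdentityPoolId",
)

# Placeholder org-level values for illustration only.
EXAMPLE_ORG = {
    "region": "us-east-2",
    "database": "my_analytics_db",
    "roleArn": "arn:aws:iam::123456789012:role/Example-OrgRole",
    "cognitoDomain": "my-app-auth",
    "cognitoAppClientId": "example-app-client-id",
    "cognitoUserPoolId": "us-east-2_aBcDeFgHi",
    "cognitoIdentityPoolId": "us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
}


def build_user_auth(org_config: dict, id_token: str,
                    access_token: str, refresh_token: str) -> dict:
    """Combine inherited org-level fields with a user's Cognito tokens."""
    auth = {key: org_config[key] for key in INHERITED_KEYS}
    auth.update({
        "_id_token": id_token,
        "_access_token": access_token,
        "_refresh_token": refresh_token,
        "is_user_level": True,
    })
    return auth
```

Note that the org-level roleArn is not copied in this sketch, because the table does not list it among the inherited fields.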
AWS Cognito Setup
Step 1: Create a Cognito User Pool
If you don't already have a Cognito User Pool:
- Go to the Amazon Cognito Console.
- Click Create user pool.
- Configure sign-in options (email, username, etc.) according to your organization's requirements.
- Complete the creation wizard.
Step 2: Create a Cognito App Client
- In your User Pool, navigate to App integration → App clients.
- Click Create app client.
- Configure the app client:
  - App client name: e.g. AbacusAI-Athena
  - Client type: Confidential client
  - Allowed callback URLs: https://abacus.ai/oauth/callback
  - OAuth 2.0 grant types: Authorization code grant
  - OpenID Connect scopes: openid, profile, email
  - Read attributes: Ensure email is included
- Note the App Client ID — you'll need this for the connector configuration.
The email scope and email read attribute are required if you plan to use email-based row filtering (Options B or C). Without them, the ID token will not contain the email claim, and user-level filters will silently return zero rows.
Step 3: Configure the Cognito Hosted UI Domain
- In your User Pool, navigate to App integration → Domain.
- Set up a Cognito domain (e.g. my-app-auth). The full domain will be my-app-auth.auth.<region>.amazoncognito.com.
- Note the domain prefix; you'll need this for the connector configuration.
Step 4: Create a Cognito Identity Pool
- Go to the Amazon Cognito Federated Identities Console.
- Click Create identity pool.
- Under Authentication providers, add your Cognito User Pool:
  - User Pool ID: Your User Pool ID (e.g. us-east-2_aBcDeFgHi)
  - App Client ID: The App Client ID from Step 2
- Configure the Authenticated role — this IAM role defines what AWS resources authenticated users can access (Athena, Glue, S3).
- Note the Identity Pool ID (e.g. us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).
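When an ID token is later exchanged for AWS credentials, the Identity Pool expects a Logins map keyed by the User Pool's provider name. The sketch below builds that map and a request payload in the shape used by the Cognito Identity GetId operation. The helper names are hypothetical, and how Abacus.AI performs the exchange internally is not described in this guide.

```python
def identity_pool_logins(user_pool_id: str, id_token: str) -> dict:
    """Build the Logins map that identifies a User Pool to an Identity Pool."""
    # User Pool IDs have the form <region>_<suffix>, e.g. us-east-2_aBcDeFgHi.
    region = user_pool_id.split("_", 1)[0]
    provider = f"cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    return {provider: id_token}


def get_id_request(identity_pool_id: str, user_pool_id: str, id_token: str) -> dict:
    """Parameters in the shape expected by the Cognito Identity GetId call."""
    return {
        "IdentityPoolId": identity_pool_id,
        "Logins": identity_pool_logins(user_pool_id, id_token),
    }


if __name__ == "__main__":
    print(get_id_request(
        "us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "us-east-2_aBcDeFgHi",
        "<id-token>",
    ))
```

The same Logins map is then passed, together with the returned identity ID, to GetCredentialsForIdentity to obtain temporary credentials.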
Step 5: Create Cognito User Groups
Cognito User Groups let you assign different IAM roles to different categories of users. Each group maps to an IAM role that defines what AWS resources those users can access.
- In your User Pool, navigate to Groups → Create group.
- Create groups for each access level. For example:
  - finance_analysts: access to finance tables
  - hr_viewers: access to HR tables
  - admins: access to all tables
- For each group, specify the IAM role ARN that group members should assume (you'll create these roles in Step 6).
- Assign users to groups: select a group → Add users → select the users who belong to that group.
You can also manage groups via the CLI:
# Create a group with an IAM role
aws cognito-idp create-group \
--user-pool-id us-east-2_aBcDeFgHi \
--group-name finance_analysts \
--role-arn arn:aws:iam::123456789012:role/Athena-FinanceRole
# Add a user to the group
aws cognito-idp admin-add-user-to-group \
--user-pool-id us-east-2_aBcDeFgHi \
--username alice.finance \
--group-name finance_analysts
Step 5b: Configure Identity Pool Role Mappings
After creating Cognito groups and their corresponding IAM roles, configure the Identity Pool to map authenticated users to the correct role based on their group membership:
- In the Cognito Identity Pool Console, select your identity pool.
- Click Edit identity pool → expand Authentication providers.
- Under your Cognito User Pool provider, set:
  - Role resolution: Select Choose role from token. This uses the cognito:groups claim in the user's token to determine which IAM role to assign.
  - Ambiguous role resolution: Select Use most-restrictive authenticated role (for cases where a user belongs to multiple groups).
- Save changes.
Alternatively, use Rules-based mapping for more control:
| Priority | Claim | Match type | Value | IAM Role |
|---|---|---|---|---|
| 1 | cognito:groups | Contains | finance_analysts | arn:aws:iam::123456789012:role/Athena-FinanceRole |
| 2 | cognito:groups | Contains | hr_viewers | arn:aws:iam::123456789012:role/Athena-HRRole |
| 3 | cognito:groups | Contains | admins | arn:aws:iam::123456789012:role/Athena-AdminRole |
With Choose role from token, the Identity Pool reads the cognito:groups claim and resolves the role from the group's configured RoleArn. This is the simplest approach when each group maps to exactly one role.
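The rules-based mapping can be pictured as a first-match lookup in priority order. The following sketch simulates that behavior locally using the three example rules. It is a simplification: the "Contains" match is modeled as list membership, and the function name is hypothetical.

```python
from typing import List, Optional

# (priority, group that cognito:groups must contain, IAM role ARN)
RULES = [
    (1, "finance_analysts", "arn:aws:iam::123456789012:role/Athena-FinanceRole"),
    (2, "hr_viewers", "arn:aws:iam::123456789012:role/Athena-HRRole"),
    (3, "admins", "arn:aws:iam::123456789012:role/Athena-AdminRole"),
]


def resolve_role(groups: List[str], rules=RULES) -> Optional[str]:
    """Return the role of the first rule, by priority, that matches a group."""
    for _priority, group, role_arn in sorted(rules):
        if group in groups:
            return role_arn
    return None  # no rule matched; ambiguous role resolution would apply


if __name__ == "__main__":
    print(resolve_role(["hr_viewers"]))
    print(resolve_role(["admins", "finance_analysts"]))
```

In this model, a user who belongs to both admins and finance_analysts receives the finance role, because that rule has the higher priority (lower number). Order the rules accordingly.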
Step 6: Configure IAM Roles for Authenticated Users
Create one IAM role per Cognito group. Each role needs permission to run Athena queries, read the Glue Data Catalog, read the underlying data in S3, and read and write query results in the output bucket. Example policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AthenaQueryExecution",
"Effect": "Allow",
"Action": [
"athena:StartQueryExecution",
"athena:GetQueryExecution",
"athena:GetQueryResults",
"athena:StopQueryExecution"
],
"Resource": "arn:aws:athena:<region>:<account_id>:workgroup/*"
},
{
"Sid": "GlueCatalogRead",
"Effect": "Allow",
"Action": [
"glue:GetDatabase",
"glue:GetTable",
"glue:GetTables"
],
"Resource": [
"arn:aws:glue:<region>:<account_id>:catalog",
"arn:aws:glue:<region>:<account_id>:database/<database_name>",
"arn:aws:glue:<region>:<account_id>:table/<database_name>/*"
]
},
{
"Sid": "S3DataRead",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::<data_bucket>",
"arn:aws:s3:::<data_bucket>/*"
]
},
{
"Sid": "S3ResultsReadWrite",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::<output_bucket>",
"arn:aws:s3:::<output_bucket>/*"
]
}
]
}
You can use fine-grained IAM policies or AWS Lake Formation to control which databases, tables, and columns each user can access. Map different Cognito groups to different IAM roles in the Identity Pool for role-based access.
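When several roles share the same policy shape, the placeholders in the example policy can be filled in programmatically. This is a minimal sketch that reproduces the four statements above; the function name and the bucket and database values in the usage example are placeholders.

```python
import json


def athena_user_policy(region: str, account_id: str, database: str,
                       data_bucket: str, output_bucket: str) -> dict:
    """Render the example policy above with concrete resource ARNs."""
    glue = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AthenaQueryExecution",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:StopQueryExecution",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/*",
            },
            {
                "Sid": "GlueCatalogRead",
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetTables"],
                "Resource": [
                    f"{glue}:catalog",
                    f"{glue}:database/{database}",
                    f"{glue}:table/{database}/*",
                ],
            },
            {
                "Sid": "S3DataRead",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{data_bucket}/*",
                ],
            },
            {
                "Sid": "S3ResultsReadWrite",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                "Resource": [
                    f"arn:aws:s3:::{output_bucket}",
                    f"arn:aws:s3:::{output_bucket}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    policy = athena_user_policy("us-east-2", "123456789012", "my_analytics_db",
                                "my-athena-data-bucket", "my-results-bucket")
    print(json.dumps(policy, indent=2))
```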
Lake Formation Integration
AWS Lake Formation provides fine-grained access control (column-level and row-level security) for data in the Glue Data Catalog. This section is required for Options B and C.
Register S3 Locations
- Go to the Lake Formation Console → Data lake locations → Register location.
- Register the S3 paths where your data resides (e.g. s3://my-athena-data-bucket/).
- Select the IAM role Lake Formation should use to access the data (or use the service-linked role).
Set Lake Formation Admin
- In the Lake Formation Console, go to Administrative roles and tasks → Data lake administrators.
- Add your admin IAM user or role as a Lake Formation administrator.
Create Lake Formation Tags (Options A and C)
LF-Tags enable Tag-Based Access Control (TBAC) for table-level isolation:
- Go to LF-Tags → Add LF-Tag.
- Create tags that model your access control dimensions. For example:
  - Tag key: Department, Values: finance, hr, engineering
  - Tag key: UserRole, Values: analyst, viewer, admin
- Assign tags to your Glue databases and tables:
  - Navigate to Data lake permissions → LF-Tags → select a database or table → Assign LF-Tag.
  - For example, assign Department=finance to your finance tables.
Create Data Cell Filters (Options B and C)
Data cell filters enable row-level and column-level security:
- Go to Data filters → Create new filter.
- Configure the filter:
For department-based row filtering:
Filter name: finance_department_filter
Target database: my_analytics_db
Target table: transactions
Row filter expression: department = 'finance'
For email-based row filtering (Options B and C):
Filter name: email_row_filter
Target database: my_analytics_db
Target table: transactions
Row filter expression: owner_email = '${session:UserEmail}'
The ${session:UserEmail} variable is automatically resolved to the authenticated user's email address at query time.
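If you script filter creation instead of using the console, the email-based filter could be described with a payload like the one below. This is a sketch under assumptions: the field names are believed to follow the TableData parameter of the Lake Formation CreateDataCellsFilter API, the function name is hypothetical, and the payload is only built here, not sent to AWS.

```python
def email_row_filter(catalog_id: str, database: str, table: str,
                     owner_column: str = "owner_email") -> dict:
    """Describe an email-based data cell filter as a TableData-style payload."""
    return {
        "TableCatalogId": catalog_id,  # the AWS account ID that owns the catalog
        "DatabaseName": database,
        "TableName": table,
        "Name": "email_row_filter",
        # Doubled braces in the f-string emit the literal ${session:UserEmail}.
        "RowFilter": {
            "FilterExpression": f"{owner_column} = '${{session:UserEmail}}'"
        },
        # Empty wildcard: expose all columns, restrict rows only.
        "ColumnWildcard": {},
    }


if __name__ == "__main__":
    print(email_row_filter("123456789012", "my_analytics_db", "transactions"))
```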
Grant Lake Formation Permissions
Grant SELECT permission on the data cell filters to the appropriate IAM roles:
- Go to Data lake permissions → Grant.
- Select IAM users and roles → choose the IAM role for the Cognito group.
- Under LF-Tags or catalog resources, select the database and table.
- Under Data filters, select the cell filter you created.
- Grant SELECT permission.
For Option C (combined TBAC + email filter), you may need separate tables per department because AWS Lake Formation allows only one LF-tag value per key per resource. Assign the department tag to each table, then apply the email-based data cell filter within each table.
Additional Requirements for Email-Based Filtering
If you're using Option B or Option C (email-based row filtering), the following additional configuration is required.
Add sts:TagSession to IAM Roles
Every IAM role mapped to a Cognito group must include the sts:TagSession permission. This allows the UserEmail session tag to propagate through the assume-role chain to Lake Formation.
Add this statement to each role's permissions policy:
{
"Sid": "AllowTagSession",
"Effect": "Allow",
"Action": "sts:TagSession",
"Resource": "*"
}
Without sts:TagSession, the ${session:UserEmail} variable in Lake Formation data cell filters will not resolve, and queries will silently return zero rows instead of failing with an error.
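Because a missing sts:TagSession permission fails silently, it can help to check each role's policy document before testing queries. The sketch below is a deliberately simple check with a hypothetical function name: it looks only for an Allow statement that names the action, and does not evaluate Deny statements, NotAction, or conditions.

```python
def allows_tag_session(policy: dict) -> bool:
    """Return True if any Allow statement in the policy covers sts:TagSession."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # Action may be a string or a list
            actions = [actions]
        if any(a.lower() in ("sts:tagsession", "sts:*", "*") for a in actions):
            return True
    return False


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowTagSession",
            "Effect": "Allow",
            "Action": "sts:TagSession",
            "Resource": "*",
        }],
    }
    print(allows_tag_session(policy))
```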
Cognito App Client Email Scope
Ensure your Cognito App Client (from Step 2) has:
- email in AllowedOAuthScopes
- email in ReadAttributes
This ensures the ID token contains the email claim, which is used as the session tag value.
Token Claims Reference
When a user authenticates via Cognito, the ID token contains claims that drive access control decisions. Here are the key claims:
| Claim | Example Value | Used For |
|---|---|---|
| sub | a1b2c3d4-e5f6-7890-abcd-ef1234567890 | Unique user identifier, audit trails |
| cognito:username | alice.finance | Display name in the application |
| cognito:groups | ["finance_analysts"] | Identity Pool role mapping (determines IAM role) |
| email | alice@example.com | Row-level filter via ${session:UserEmail} (Options B/C) |
| email_verified | true | Trust check (confirms email ownership) |
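When debugging access issues, it is often useful to inspect these claims directly. The sketch below decodes the payload of a JWT without verifying its signature, which is acceptable for local inspection only and never for authorization decisions. The helper names are hypothetical, and make_unsigned_token exists purely to produce a demo token.

```python
import base64
import json


def decode_claims(id_token: str) -> dict:
    """Decode a JWT payload WITHOUT signature verification (debugging only)."""
    payload = id_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def email_filter_ready(claims: dict) -> bool:
    """Check that the claims needed for email-based row filtering are present."""
    return bool(claims.get("email")) and claims.get("email_verified") in (True, "true")


def _b64url(data: dict) -> str:
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_unsigned_token(claims: dict) -> str:
    """Demo helper: build an unsigned token with an empty signature part."""
    return ".".join([_b64url({"alg": "none"}), _b64url(claims), ""])


if __name__ == "__main__":
    token = make_unsigned_token({
        "sub": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "cognito:username": "alice.finance",
        "cognito:groups": ["finance_analysts"],
        "email": "alice@example.com",
        "email_verified": True,
    })
    claims = decode_claims(token)
    print(claims["cognito:groups"], email_filter_ready(claims))
```

If email_filter_ready returns False for a real token, revisit the email scope and read attribute on the App Client.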
What Each User Sees — Concrete Examples
To illustrate how the access control layers work together, consider a setup with two departments and email-based row filtering (Option C):
Setup:
- finance_analysts group → Athena-FinanceRole → LF-Tag grants access to finance_table
- hr_viewers group → Athena-HRRole → LF-Tag grants access to hr_table
- Data cell filter: owner_email = '${session:UserEmail}' on both tables
- alice@example.com is in finance_analysts, assigned to rows 1, 2, 3 in finance_table
- bob@example.com is in hr_viewers
- dave@example.com is in finance_analysts, assigned to rows 4, 5, 6 in finance_table
Query results:
| User | Queries | TBAC Check | Email Filter | Result |
|---|---|---|---|---|
| alice | SELECT * FROM finance_table | ✅ PASS (finance group has access) | Filters to owner_email = 'alice@example.com' | Sees rows 1, 2, 3 |
| alice | SELECT * FROM hr_table | ❌ DENIED (finance group has no access) | — | Table is invisible |
| bob | SELECT * FROM finance_table | ❌ DENIED (hr group has no access) | — | Table is invisible |
| bob | SELECT * FROM hr_table | ✅ PASS (hr group has access) | Filters to owner_email = 'bob@example.com' | Sees only bob's rows |
| dave | SELECT * FROM finance_table | ✅ PASS (finance group has access) | Filters to owner_email = 'dave@example.com' | Sees rows 4, 5, 6 (different from alice!) |
This demonstrates the two-layer security model: TBAC controls which tables a user can see, and email-based cell filters control which rows within those tables.
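The two layers can be modeled in a few lines to sanity-check expectations before configuring AWS. This is a local simulation only, not how Lake Formation evaluates access. The data mirrors the example setup, except for the single hr_table row owned by bob, which is a made-up placeholder.

```python
# Layer 1 (TBAC): which tables each group may see.
TABLE_ACCESS = {
    "finance_analysts": {"finance_table"},
    "hr_viewers": {"hr_table"},
}

USER_GROUP = {
    "alice@example.com": "finance_analysts",
    "bob@example.com": "hr_viewers",
    "dave@example.com": "finance_analysts",
}

TABLES = {
    "finance_table": (
        [{"id": i, "owner_email": "alice@example.com"} for i in (1, 2, 3)]
        + [{"id": i, "owner_email": "dave@example.com"} for i in (4, 5, 6)]
    ),
    # Hypothetical row; the example does not say which HR rows bob owns.
    "hr_table": [{"id": 1, "owner_email": "bob@example.com"}],
}


def visible_rows(email: str, table: str):
    """Return visible row ids, or None when the table-level check denies access."""
    group = USER_GROUP[email]
    if table not in TABLE_ACCESS.get(group, set()):
        return None  # table is invisible to this group
    # Layer 2 (data cell filter): owner_email = '${session:UserEmail}'
    return [row["id"] for row in TABLES[table] if row["owner_email"] == email]


if __name__ == "__main__":
    for user in USER_GROUP:
        for table in TABLES:
            print(user, table, visible_rows(user, table))
```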
Abacus.AI Setup
Step 7: Create an Athena Connector with RBAC Enabled
Follow the Athena Connector Setup Instructions, and enable the Enable RBAC toggle. Fill in the additional Cognito fields:
- Cognito Domain: The hosted UI domain prefix (e.g. my-app-auth)
- Cognito App Client ID: From Step 2
- Cognito User Pool ID: From Step 1 (e.g. us-east-2_aBcDeFgHi)
- Cognito Identity Pool ID: From Step 4 (e.g. us-east-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
Click Create and then Verify Now to confirm the org-level connector works.
Step 8: Create a Project
- Navigate to the projects page by clicking on the Abacus.AI logo.
- Create a new project.
- Select GenAI → Custom Chatbot.
- Enter a descriptive name for your project.
- Select Skip to project dashboard.
Step 9: Train a Model with the Athena Connector
- Click on Model in the left toolbar and select Train Model.
- From the dropdown in Structured data source, select External Databases.
- Select the RBAC-enabled Athena connector you created in Step 7.
- Click Train Model to begin training.
Step 10: Deploy the Model
- Once training is complete, click on Models and select your trained model.
- Click Create a new deployment.
- Select Offline Batch + Realtime deployment type and click Next.
- Enter a deployment name and click Deploy.
- Wait for the deployment to reach Active state.
User Authentication Flow
Step 11: User-Level Authentication
When end users access the ChatLLM bot connected to the RBAC-enabled Athena connector:
- On their first query, users are prompted to authenticate with their AWS Cognito account.
- They are redirected to the Cognito hosted UI login page.
- After successful authentication, Abacus.AI receives OAuth tokens and exchanges them for temporary AWS credentials via the Identity Pool.
- All subsequent queries execute with the user's own AWS permissions.
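For reference, the redirect and token exchange in this flow follow the standard OAuth 2.0 authorization code grant against the Cognito hosted UI endpoints (/oauth2/authorize and /oauth2/token). The sketch below only builds the URL and the form fields. The function names and the state value are hypothetical, and Abacus.AI performs these steps for you.

```python
from urllib.parse import urlencode

CALLBACK = "https://abacus.ai/oauth/callback"


def hosted_ui_base(domain_prefix: str, region: str) -> str:
    """Full hosted UI origin for a Cognito domain prefix."""
    return f"https://{domain_prefix}.auth.{region}.amazoncognito.com"


def authorize_url(domain_prefix: str, region: str, client_id: str, state: str,
                  scopes=("openid", "profile", "email")) -> str:
    """URL the user is redirected to in order to sign in."""
    query = urlencode({
        "response_type": "code",  # authorization code grant
        "client_id": client_id,
        "redirect_uri": CALLBACK,
        "scope": " ".join(scopes),
        "state": state,
    })
    return f"{hosted_ui_base(domain_prefix, region)}/oauth2/authorize?{query}"


def token_request_form(client_id: str, code: str) -> dict:
    """Form fields POSTed to /oauth2/token to exchange the code for tokens.

    A confidential client additionally authenticates with its client secret,
    which is intentionally left out of this sketch.
    """
    return {
        "grant_type": "authorization_code",
        "client_id": client_id,
        "code": code,
        "redirect_uri": CALLBACK,
    }


if __name__ == "__main__":
    print(authorize_url("my-app-auth", "us-east-2", "example-app-client-id", "xyz"))
```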
Step 12: Interactive Chat Experience
Once authenticated, users can interact with the chat interface to query their Athena data. The system processes queries and returns results based on each user's individual RBAC permissions.
The user's Athena connector will appear alongside other user-level connectors in the Connectors List.
Security Considerations
RBAC Implementation
- User-Level Authentication: Each user must authenticate individually with their Cognito credentials.
- Permission Inheritance: Users can only access data according to their IAM roles mapped through the Cognito Identity Pool.
- Token Management: Cognito refresh tokens are used to maintain sessions securely. Abacus.AI automatically refreshes expired tokens.
- Temporary Credentials: AWS credentials obtained via the Identity Pool are temporary and scoped to the user's IAM role.
Best Practices
- Cognito User Groups: Use Cognito groups mapped to different IAM roles in the Identity Pool to implement fine-grained access control.
- Lake Formation: For column-level and row-level security, integrate with AWS Lake Formation.
- Token Expiry: Configure appropriate token validity durations in the Cognito App Client settings.
- Monitoring: Use AWS CloudTrail and Athena query history to audit user access patterns.
- Least Privilege: Grant each user role only the minimum permissions needed for their specific data access requirements.
Troubleshooting
Common Issues
Authentication Failures
- Verify the Cognito Domain, App Client ID, User Pool ID, and Identity Pool ID are correctly configured in the Abacus.AI connector.
- Ensure the App Client's Allowed callback URL is set to https://abacus.ai/oauth/callback.
- Confirm that the user exists in the Cognito User Pool and their account is confirmed/active.
"Failed to exchange Cognito auth code for tokens" Error
- Check that the Cognito App Client has the correct OAuth scopes enabled (openid and profile, plus email if you use Options B or C).
- Verify the Cognito hosted UI domain is correctly configured and accessible.
"Failed to get AWS credentials from Cognito Identity Pool" Error
- Ensure the Identity Pool is configured with the correct User Pool ID and App Client ID as an authentication provider.
- Verify the authenticated IAM role has the necessary Athena, Glue, and S3 permissions.
Data Access Errors
- Check the IAM role permissions for the specific user's Cognito group.
- Verify Lake Formation grants if using Lake Formation for fine-grained access control.
Support Resources
- Refer to the AWS Cognito documentation for identity and access management.
- Refer to the Athena documentation for query and permissions setup.
- Contact support@abacus.ai for platform-specific issues.
Conclusion
This setup provides a secure, scalable solution for enabling conversational access to Athena data while maintaining strict RBAC controls. Users authenticate individually via Amazon Cognito, and their queries execute with user-specific AWS credentials, ensuring data access is always governed by their assigned permissions.