Cross-account Athena query

Imagine the world in which you have different AWS accounts for different purposes. There’s dev/staging/production AWS account for your shiny software. Also somewhere in the clouds, there’s AWS account used by the data team to prepare one source of truth about your universe (AWS Glue, S3 buckets, catalogues, Athena etc. etc.). Something like that:

Assumptions

As you can see we need two AWS accounts:

  • Account B — the account in which you have your own Athena you’d like to use AWS Glue from Account A and query data in the S3 bucket belonging to Account A.

Step 1

We need Lambda which will query AWS Glue catalogue. You can use the function from AWS Labs project. So grab from here or use one provided in Terraform module. In this project, you’ll also find more details about the required IAM roles etc. It’s really worth to check their README.md.

Step 2

You need your IAM role which will allow you to:

  • query the Glue catalogue in Account A.
  • get access to S3 bucket with data in Account A.
  • run queries and get results from Athena placed in Account B.

Step 3

Create the Athena’s data source.

Terraform

It’s quite easy to perform that steps through AWS console but here you can find Terraform module which will create everything for you. When you use this module you will have one important outcome: ARN of Lambda execution role. You will need to use this ARN in Account A to give Lambda access to Glue catalogue (check section What about Account A).

Current situation

After running Terraform you should have:

  • New data source in Athena — catalogue type: Hive metastore connected to your Lambda.

The final step in Account B

So right now we have the infrastructure to access Glue catalogue in Account A. We need one more thing — the role which you can attach for example to your ECS Fargate task so you can make actual queries to the data in Account A. So you need to create new role (and have its ARN) and attach such policy to it:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"lambda:InvokeFunction",
"athena:Get*",
"athena:BatchGet*",
"athena:StartQueryExecution",
"athena:StopQueryExecution"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "glue:*",
"Resource": [
"arn:aws:glue:eu-west-1:<<ACCOUNT_A>>:catalog",
"arn:aws:glue:eu-west-1:<<ACCOUNT_A>>:database/<DATABASE_NAME>",
"arn:aws:glue:eu-west-1:<<ACCOUNT_A>>:table/<DATABASE_NAME>/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:List*"
],
"Resource": [
"arn:aws:s3:::<<ACCOUNT_A_BUCKET_FOR_RESULTS>>",
"arn:aws:s3:::<<ACCOUNT_A_BUCKET_FOR_RESULTS>>/*"
]
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:CreateLogGroup",
"logs:PutLogEvents"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"s3:Get*",
"s3:List*"
],
"Resource": [
"arn:aws:s3:::<<TARGET_ACCOUNT_A_BUCKET_WITH_DATA>>",
"arn:aws:s3:::<<TARGET_ACCOUNT_A_BUCKET_WITH_DATA>>/*"
]
}
]
}
  • read-only access to the bucket with data in Account A;
  • Lambda execution;
  • Run Athena’s queries;

What about Account A

Account A should allow for cross-account access by:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"<<LAMBDA_EXECUTION_ROLE_ARN>>"
]
},
"Action": [
"glue:GetDatabase",
"glue:GetDatabases",
"glue:GetPartition",
"glue:GetPartitions",
"glue:GetTable",
"glue:GetTables"
],
"Resource": [
"arn:aws:glue:eu-west-1:<<ACCOUNT_A>>:catalog",
"arn:aws:glue:eu-west-1:<<ACCOUNT_A>>:database/<DATABASE_NAME>",
"arn:aws:glue:eu-west-1:<<ACCOUNT_A>>:table/<DATABASE_NAME>/*"
]
}
]
}
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "<<IAM_ROLE_ACCOUNT_B>>"
},
"Action": [
"s3:List*",
"s3:Get*"
],
"Resource": [
"arn:aws:s3:::<<BUCKET_ACCOUNT_A>>"
]
}

Summary

I hope that this article was helpful to you. If so please drop a comment or clap :)

Tech Lead / Senior Software Engineer @ Zoopla

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store