This post was originally published on Medium.
Recently, I’ve been getting my hands dirty with some EKS security, especially around credentials management. While granting an EKS pod IAM credentials is fairly straightforward, is it just as easy to trace an AWS event back to the pod that triggered it? Let’s find out.
First things first: when considering methods for pods to obtain credentials, you have two main options:
- IRSA — IAM Roles for Service Accounts
- EKS Pod Identity
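Both options grant pods temporary credentials, but through different mechanisms: IRSA relies on OIDC (OpenID Connect) federation, while EKS Pod Identity relies on the EKS Pod Identity Agent and the EKS Auth API.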
What Is IRSA?
IRSA was AWS’s initial method for assigning credentials to pods. With IRSA, the IAM role’s trust policy (its trusted entities) specifies which service account is allowed to assume the role. On the Kubernetes side, you annotate the service account attached to your pod with the ARN of the role it should assume.
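To make the two sides of IRSA concrete, here is a minimal Python sketch that builds the trust policy and the service account annotation. The function names are hypothetical, and the account ID, issuer, namespace, and service account values are borrowed from the log examples later in this post purely for illustration.

```python
import json


def irsa_trust_policy(account_id: str, oidc_issuer: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets one service account assume the role.

    oidc_issuer is the cluster issuer without the scheme, for example
    "oidc.eks.us-east-1.amazonaws.com/id/<ISSUER_ID>".
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_issuer}"
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_issuer}:sub": subject,
                        f"{oidc_issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


def irsa_annotation(account_id: str, role_name: str) -> dict:
    """Annotation placed on the Kubernetes service account."""
    return {"eks.amazonaws.com/role-arn": f"arn:aws:iam::{account_id}:role/{role_name}"}


# Illustrative values only, taken from the sample logs in this post.
policy = irsa_trust_policy(
    "012345678910",
    "oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21",
    "airflow",
    "my-flow-1220d985a",
)
print(json.dumps(policy, indent=2))
print(irsa_annotation("012345678910", "my-airflow-role-1c34dg"))
```

The `sub` condition is what ties the role to a single namespace and service account, which is also why those two values show up later in CloudTrail.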
What Is EKS Pod Identity?
EKS Pod Identity, on the other hand, is fully integrated with the EKS management console, allowing you to bind an IAM role to a Kubernetes service account in a given namespace directly through the EKS API, as detailed in the EKS documentation.
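As a rough illustration of what such a binding involves, the sketch below builds the role trust policy for the Pod Identity service principal and the parameters of an association. The helper names are hypothetical, and the cluster, namespace, service account, and role values are taken from the sample logs later in this post.

```python
import json


def pod_identity_trust_policy() -> dict:
    """Trust policy allowing the EKS Pod Identity service to assume the role
    and attach session tags to the resulting session."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "pods.eks.amazonaws.com"},
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }


def association_params(cluster_name: str, namespace: str,
                       service_account: str, role_arn: str) -> dict:
    """Parameters describing one Pod Identity association (hypothetical helper)."""
    return {
        "clusterName": cluster_name,
        "namespace": namespace,
        "serviceAccount": service_account,
        "roleArn": role_arn,
    }


# Illustrative values only, taken from the sample logs in this post.
params = association_params(
    "eks-my-cluster-1",
    "production",
    "production-sa",
    "arn:aws:iam::012345678910:role/tiller-role-01",
)
print(json.dumps(pod_identity_trust_policy(), indent=2))
print(params)
```

Assuming boto3 is available and credentials are configured, a dictionary shaped like `params` could be passed as keyword arguments to the EKS client's `create_pod_identity_association` call.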
Both methods result in the same outcome: a Kubernetes pod with IAM credentials.
Now that we’ve covered that, let’s look at how to do the reverse: tracing an action made on AWS back to an EKS pod.
Each authentication method discussed above is implemented differently on AWS’s side, so each requires its own approach to unchain the role connection. To perform this EKS data enrichment, I’ll use CloudTrail logs. All the log examples in this post are partial and contain only the relevant fields.
Approach #1: Using IRSA to Unchain the Role Connection
Here is a CloudTrail event in which an entity triggered the GetSecretValue action:
{
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::012345678910:assumed-role/my-airflow-role-1c34dg",
    "accountId": "012345678910",
    "accessKeyId": "ASIAQV9HXLTLJIT5R3XCH",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "arn": "arn:aws:iam::012345678910:role/my-airflow-role-1c34dg",
        "accountId": "012345678910",
        "userName": "my-airflow-role-1c34dg"
      },
      "webIdFederationData": {
        "federatedProvider": "arn:aws:iam::012345678910:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21"
      }
    }
  },
  "eventTime": "2024-11-13T08:44:27Z",
  "eventSource": "secretsmanager.amazonaws.com",
  "eventName": "GetSecretValue"
}
As we see in the log, the $.userIdentity.sessionContext.webIdFederationData field indicates that this action was triggered by an OIDC entity. Under $.userIdentity.arn, we see that the role arn:aws:sts::012345678910:assumed-role/my-airflow-role-1c34dg performed this action. But who is actually assuming this role?
To find out, we’ll take the access key specified in $.userIdentity.accessKeyId and look for the STS event that issued it.
The CloudTrail event for an OIDC assume role is called AssumeRoleWithWebIdentity, and here’s what it looks like:
{
  "userIdentity": {
    "type": "WebIdentityUser",
    "principalId": "arn:aws:iam::012345678910:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21:sts.amazonaws.com:system:serviceaccount:airflow:my-flow-1220d985a",
    "userName": "system:serviceaccount:airflow:my-flow-1220d985a",
    "identityProvider": "arn:aws:iam::012345678910:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21"
  },
  "eventTime": "2024-11-12T13:28:26Z",
  "eventSource": "sts.amazonaws.com",
  "eventName": "AssumeRoleWithWebIdentity",
  "awsRegion": "us-east-1",
  "responseElements": {
    "credentials": {
      "accessKeyId": "ASIAQV9HXLTLJIT5R3XCH",
      "sessionToken": "********",
      "expiration": "Nov 12, 2024, 1:43:26 PM"
    },
    "provider": "arn:aws:iam::012345678910:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21",
    "audience": "sts.amazonaws.com"
  }
}
From this event, we can extract some key information:
- $.responseElements.credentials.accessKeyId: The access key generated after executing the assume role. Every action triggered by this new session will use this access key—just like the GetSecretValue action above!
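- $.userIdentity.principalId: The identity of the entity that triggered this assume role event. It ends with system:serviceaccount:<namespace>:<service-account>, here the my-flow-1220d985a service account in the airflow namespace.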
- $.userIdentity.identityProvider: The cluster’s identity provider—OIDC details. This is our best clue in the log to identify the cluster.
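The lookup described in this list can be sketched in a few lines of Python. This is only an illustration over already-parsed CloudTrail records: the function names are hypothetical, and the sample events are trimmed versions of the two logs shown above.

```python
from typing import Iterable, Optional, Tuple


def find_web_identity_session(action_event: dict,
                              sts_events: Iterable[dict]) -> Optional[dict]:
    """Return the AssumeRoleWithWebIdentity event that issued the access key
    used by action_event, or None if it is not among sts_events."""
    access_key = (action_event.get("userIdentity") or {}).get("accessKeyId")
    if not access_key:
        return None
    for event in sts_events:
        if event.get("eventName") != "AssumeRoleWithWebIdentity":
            continue
        creds = (event.get("responseElements") or {}).get("credentials") or {}
        if creds.get("accessKeyId") == access_key:
            return event
    return None


def parse_service_account(user_name: str) -> Optional[Tuple[str, str]]:
    """Split "system:serviceaccount:<namespace>:<name>" into (namespace, name)."""
    parts = user_name.split(":")
    if len(parts) != 4 or parts[:2] != ["system", "serviceaccount"]:
        return None
    return parts[2], parts[3]


# Trimmed copies of the two events shown above.
get_secret = {
    "eventName": "GetSecretValue",
    "userIdentity": {"accessKeyId": "ASIAQV9HXLTLJIT5R3XCH"},
}
assume = {
    "eventName": "AssumeRoleWithWebIdentity",
    "userIdentity": {"userName": "system:serviceaccount:airflow:my-flow-1220d985a"},
    "responseElements": {"credentials": {"accessKeyId": "ASIAQV9HXLTLJIT5R3XCH"}},
}

session = find_web_identity_session(get_secret, [assume])
if session is not None:
    print(parse_service_account(session["userIdentity"]["userName"]))
```

A linear scan is fine for a handful of events; for larger volumes, indexing the STS events by access key first would avoid rescanning them for every action.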
So, we have:
- Kubernetes Namespace
- Kubernetes Service Account
To complete the picture, we just need to correlate the OIDC details with our clusters to identify which one triggered this event.
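One way to do this correlation is to compare the issuer ID at the end of the provider ARN with the OIDC issuer URL of each cluster. The sketch below assumes a mapping from issuer ID to cluster name has already been built (for example from the `cluster.identity.oidc.issuer` field returned by the EKS `DescribeCluster` API); the helper names and the cluster name `my-irsa-cluster` are hypothetical.

```python
from typing import Dict, Optional


def issuer_id(value: str) -> Optional[str]:
    """Extract the trailing issuer ID from an OIDC provider ARN or issuer URL.

    Both forms end with ".../id/<ISSUER_ID>".
    """
    _, separator, tail = value.rpartition("/id/")
    return tail if separator else None


def cluster_for_provider(provider_arn: str,
                         issuer_to_cluster: Dict[str, str]) -> Optional[str]:
    """Look up the cluster whose OIDC issuer matches the provider ARN."""
    found = issuer_id(provider_arn)
    return issuer_to_cluster.get(found) if found else None


# Hypothetical inventory: issuer URL per cluster, keyed by the extracted issuer ID.
cluster_issuers = {
    "my-irsa-cluster": "https://oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21",
}
issuer_to_cluster = {issuer_id(url): name for name, url in cluster_issuers.items()}

provider = ("arn:aws:iam::012345678910:oidc-provider/"
            "oidc.eks.us-east-1.amazonaws.com/id/0YY74BNCIDF85NC326F7545VF21")
print(cluster_for_provider(provider, issuer_to_cluster))
```

This sketch keys on the issuer ID alone; if you operate clusters in several regions or accounts, keying on the full issuer host and path is the safer choice.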
Approach #2: Using Pod Identities to Unchain the Role Connection
This approach operates a bit differently. Let’s start with a typical AWS event that records the ModifyDBInstance action:
{
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::012345678910:assumed-role/tiller-role-01/eks-my-cluster-1-be-server--29b4f071-8cfa-4570-8723-fc43315ac690",
    "accountId": "012345678910",
    "accessKeyId": "ASIA4DTNDYXABJKN3G2W",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "arn": "arn:aws:iam::012345678910:role/tiller-role-01",
        "accountId": "012345678910",
        "userName": "tiller-role-01"
      },
      "webIdFederationData": {}
    }
  },
  "eventTime": "2024-10-08T09:01:25Z",
  "eventSource": "rds.amazonaws.com",
  "eventName": "ModifyDBInstance"
}
From this event, we can again retrieve the access key from the $.userIdentity.accessKeyId field. Unlike in the IRSA event, webIdFederationData is empty here.
AWS created the AssumeRoleForPodIdentity action, which is called when a pod needs to assume a role. However, this call in turn produces a standard AssumeRole event that contains all the relevant details, and that is the event we match on the access key:
{
  "userIdentity": {
    "type": "AWSService",
    "invokedBy": "pods.eks.amazonaws.com"
  },
  "eventTime": "2024-10-08T08:47:06Z",
  "eventSource": "sts.amazonaws.com",
  "eventName": "AssumeRole",
  "userAgent": "pods.eks.amazonaws.com",
  "requestParameters": {
    "roleArn": "arn:aws:iam::012345678910:role/tiller-role-01",
    "roleSessionName": "eks-eks-my-cluster-1-be-server--29b4f071-8cfa-4570-8723-fc43315ac690",
    "tags": [
      {
        "key": "eks-cluster-arn",
        "value": "arn:aws:eks:us-east-1:012345678910:cluster/eks-my-cluster-1"
      },
      {
        "key": "eks-cluster-name",
        "value": "eks-my-cluster-1"
      },
      {
        "key": "kubernetes-namespace",
        "value": "production"
      },
      {
        "key": "kubernetes-service-account",
        "value": "production-sa"
      },
      {
        "key": "kubernetes-pod-name",
        "value": "be-server-7f958859c6-s62w5"
      },
      {
        "key": "kubernetes-pod-uid",
        "value": "af4af1ce-140a-41e6-96ac-4710e7ccc38b"
      }
    ],
    "transitiveTagKeys": [
      "eks-cluster-arn",
      "eks-cluster-name",
      "kubernetes-namespace",
      "kubernetes-service-account",
      "kubernetes-pod-name",
      "kubernetes-pod-uid"
    ]
  },
  "responseElements": {
    "credentials": {
      "accessKeyId": "ASIA4DTNDYXABJKN3G2W",
      "sessionToken": "********",
      "expiration": "Oct 8, 2024, 2:47:06 PM"
    },
    "assumedRoleUser": {
      "assumedRoleId": "AROA4DT3YIDNSFVHQP6:eks-eks-my-cluster-1-be-server--29b4f071-8cfa-4570-8723-fc43315ac690",
      "arn": "arn:aws:sts::012345678910:assumed-role/tiller-role-01/eks-my-cluster-1-be-server--29b4f071-8cfa-4570-8723-fc43315ac690"
    }
  }
}
Examining this AssumeRole event reveals some critical information:
- $.responseElements.credentials.accessKeyId: The access key generated after executing the assume role. Every action triggered by this new session will use this access key—just like the ModifyDBInstance action mentioned above.
- $.userIdentity.invokedBy: This field shows that the AssumeRole event was triggered by the Pod Identity API.
- $.requestParameters.tags: Contains details about the pod that triggered the AssumeRole event, most importantly the cluster ARN, service account name, pod name, pod UID, and namespace.
Notably, this method provides more detailed information about the entity triggering actions than the IRSA method. The logs are structured differently, so with Pod Identity, we can extract not only the service account and namespace but also the specific pod name and cluster.
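Extracting those details from the AssumeRole event is mostly a matter of flattening the session tags. The following is a minimal sketch, not a complete parser: the function name and the shape of the returned dictionary are my own choices, and the sample event is a trimmed copy of the log above.

```python
from typing import Optional


def pod_identity_from_assume_role(event: dict) -> Optional[dict]:
    """Return the pod details carried by a Pod Identity AssumeRole event,
    or None if the event was not issued by the Pod Identity service."""
    if event.get("eventName") != "AssumeRole":
        return None
    if (event.get("userIdentity") or {}).get("invokedBy") != "pods.eks.amazonaws.com":
        return None

    # Flatten [{"key": ..., "value": ...}, ...] into a plain dictionary.
    tags = (event.get("requestParameters") or {}).get("tags") or []
    details = {t["key"]: t["value"] for t in tags if "key" in t and "value" in t}

    # Keep the issued access key so later actions can be matched to this pod.
    creds = (event.get("responseElements") or {}).get("credentials") or {}
    details["accessKeyId"] = creds.get("accessKeyId")
    return details


# Trimmed copy of the AssumeRole event shown above.
assume_role = {
    "eventName": "AssumeRole",
    "userIdentity": {"type": "AWSService", "invokedBy": "pods.eks.amazonaws.com"},
    "requestParameters": {
        "tags": [
            {"key": "eks-cluster-name", "value": "eks-my-cluster-1"},
            {"key": "kubernetes-namespace", "value": "production"},
            {"key": "kubernetes-service-account", "value": "production-sa"},
            {"key": "kubernetes-pod-name", "value": "be-server-7f958859c6-s62w5"},
        ]
    },
    "responseElements": {"credentials": {"accessKeyId": "ASIA4DTNDYXABJKN3G2W"}},
}

print(pod_identity_from_assume_role(assume_role))
```

Checking `invokedBy` first matters because ordinary AssumeRole calls made by users or other services share the same event name but carry none of these tags.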
How EKS Unchaining Enables AWS Event Enrichment
EKS unchaining is a non-trivial but powerful way to associate AWS events with the workloads that triggered them. It essentially lets us track every action performed by our pods and classify all AWS events by their corresponding Kubernetes service accounts and, with Pod Identity, by the specific pods.
Going through this process manually for each event is not scalable. At Mitiga, we leverage our Cloud Security Data Lake, cloud discovery tools to map AWS environments, automatic log collection, and PySpark code for necessary aggregations. This fully automates the process, enriching each AWS event triggered and collected into our data lake with data that reveals which workload in your EKS clusters performed the action.
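To give a feel for what such automation does, here is a small pure-Python sketch of the two steps from this post applied to a batch of events: index the STS events by the access key they issued, then attach the workload details to every event that used that key. It is an illustration under simplifying assumptions, not Mitiga's pipeline; the function names and the `eksWorkload` field are hypothetical.

```python
from typing import Dict, Iterable, List

POD_IDENTITY_PRINCIPAL = "pods.eks.amazonaws.com"


def build_session_index(events: Iterable[dict]) -> Dict[str, dict]:
    """Map each issued access key ID to the workload that requested it."""
    index: Dict[str, dict] = {}
    for event in events:
        if event.get("eventSource") != "sts.amazonaws.com":
            continue
        creds = (event.get("responseElements") or {}).get("credentials") or {}
        access_key = creds.get("accessKeyId")
        if not access_key:
            continue
        identity = event.get("userIdentity") or {}
        name = event.get("eventName")
        if name == "AssumeRoleWithWebIdentity":  # IRSA
            parts = (identity.get("userName") or "").split(":")
            if len(parts) == 4 and parts[:2] == ["system", "serviceaccount"]:
                index[access_key] = {
                    "method": "irsa",
                    "namespace": parts[2],
                    "service_account": parts[3],
                    "oidc_provider": identity.get("identityProvider"),
                }
        elif name == "AssumeRole" and identity.get("invokedBy") == POD_IDENTITY_PRINCIPAL:
            params = event.get("requestParameters") or {}
            tags = {t.get("key"): t.get("value") for t in params.get("tags") or []}
            index[access_key] = {
                "method": "pod_identity",
                "cluster": tags.get("eks-cluster-name"),
                "namespace": tags.get("kubernetes-namespace"),
                "service_account": tags.get("kubernetes-service-account"),
                "pod": tags.get("kubernetes-pod-name"),
            }
    return index


def enrich(events: Iterable[dict], index: Dict[str, dict]) -> List[dict]:
    """Return copies of the events with an added eksWorkload field
    (None when the access key is not in the index)."""
    enriched = []
    for event in events:
        access_key = (event.get("userIdentity") or {}).get("accessKeyId")
        enriched.append({**event, "eksWorkload": index.get(access_key)})
    return enriched
```

In practice the index has to cover a time window at least as long as the credential lifetime, since the STS event can precede the actions it explains by hours.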