In our previous blog, we briefly touched on the security challenges you can run into when working with Terraform on AWS. In this blog we dig a little deeper into IAM by explaining 10 pitfalls to look out for when you configure AWS IAM. Let’s tackle them one by one.

1. Implicit denies

For this first pitfall, understanding policy evaluation logic is key.
Say we want to grant a role access to everything but IAM. Essentially there are two ways to configure this: implicitly and explicitly.

Take a look at the following two policies:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ImplicitDeny",
      "Effect": "Allow",
      "NotAction": "iam:*",
      "Resource": "*"
    }
  ]
}

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ExplicitDeny",
      "Effect": "Deny",
      "Action": "iam:*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*"
    }
  ]
}

Both of these policies grant access to everything but IAM. The difference is in how they do it. The first policy implicitly “denies” IAM by not including it in the allowed statements. The second one has an explicit deny on all IAM actions. This difference is crucial.

Evaluated on their own, the outcome of each policy would be the same. But what happens when we attach a different policy (or statement) to the same role, which allows IAM? Say, something like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowIam",
      "Effect": "Allow",
      "Action": "iam:*",
      "Resource": "*"
    }
  ]
}

Unfortunately, this bypasses the first policy. Its implicit deny only means “not allowed by this policy”, so the new Allow statement adds IAM to the list of allowed actions. The second policy still has the desired effect. This is because 1) an explicit deny overrides any allow and 2) IAM evaluates explicit denies first.

This means it’s safer to use an explicit deny rather than an implicit one if you want to prevent unintended access. Like the Zen of Python states, “Explicit is better than implicit”. Apply that to your AWS policies too!
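To make the evaluation order concrete, here is a minimal Python sketch. It is a deliberately simplified model and not AWS’s actual evaluator: it only looks at identity-based policies and at Action/NotAction, and it ignores resources, conditions, SCPs, and permission boundaries. The function names are our own.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM accepts a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def _matches_any(action, patterns):
    # Action names are case-insensitive; "*" and "?" act as wildcards.
    return any(fnmatchcase(action.lower(), p.lower()) for p in _as_list(patterns))


def statement_applies(statement, action):
    """True if the statement covers `action` (resources and conditions ignored)."""
    if "Action" in statement:
        return _matches_any(action, statement["Action"])
    # NotAction covers everything except the listed patterns.
    return not _matches_any(action, statement["NotAction"])


def is_allowed(policies, action):
    """Simplified evaluation: explicit deny first, then allow, else implicit deny."""
    applicable = [
        s for p in policies for s in p["Statement"] if statement_applies(s, action)
    ]
    if any(s["Effect"] == "Deny" for s in applicable):
        return False  # an explicit deny always wins
    return any(s["Effect"] == "Allow" for s in applicable)


implicit = {"Statement": [{"Effect": "Allow", "NotAction": "iam:*", "Resource": "*"}]}
explicit = {"Statement": [
    {"Effect": "Deny", "Action": "iam:*", "Resource": "*"},
    {"Effect": "Allow", "Action": "*", "Resource": "*"},
]}
allow_iam = {"Statement": [{"Effect": "Allow", "Action": "iam:*", "Resource": "*"}]}

print(is_allowed([implicit, allow_iam], "iam:CreateUser"))  # implicit deny is bypassed
print(is_allowed([explicit, allow_iam], "iam:CreateUser"))  # explicit deny still holds
```

Running this shows the difference: the combination with the implicit policy allows iam:CreateUser, while the combination with the explicit policy does not.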

For more information on policy evaluation, check out the AWS docs on this topic.

2. The additive resource policies

This next pitfall is related to understanding policy evaluation logic as well.
Resource policies can take different shapes or forms. The S3 bucket policy is the most notable example of a resource policy. Other examples are KMS key policies or Lambda function invocation permissions.

Resource policies are additive – this means they can grant access in addition to what you define in IAM. We often see resource policies that grant access too broadly.

A role can’t access a resource if you don’t allow it in IAM. The issues arise when you don’t consider the access outside of the AWS account you’re working and testing in.

Encryption for all

For example, consider the following key policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnableKeyManagement",
      "Effect": "Allow",
      "Principal" : { "AWS" : "*" },
      "Action": "kms:*",
      "Resource": "*"
    }
  ]
}

There are three *’s in this policy, two of which we don’t really have an immediate problem with. Starting from the bottom, since the resource policy is only tied to one resource, * is acceptable. Even starred out, it only applies to the resource it is attached to.

Next, for the action, kms:* isn’t really an issue here since this policy’s purpose is to facilitate key management. To improve it a little, you could scope it down so the managing party wouldn’t be able to encrypt/decrypt with the key without making a policy change.

The last * is where the problem lies. This policy grants access to anyone and anything with an AWS account. Anyone who knows or guesses your key ID can do whatever they like with it. This would set you up for high API costs at best, or complete denial of service and possibly data leaks at worst.
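A check like this is easy to automate. The sketch below is a minimal example of our own (the function name is hypothetical): it flags Allow statements in a resource policy that match any principal and carry no Condition. It does not judge whether a condition is strict enough, so treat it as a first filter only.

```python
def _as_list(value):
    return [value] if isinstance(value, str) else list(value or [])


def open_to_everyone(policy):
    """Return Allow statements whose principal is "*" and that have no Condition."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow" or statement.get("Condition"):
            continue
        principal = statement.get("Principal")
        # Both "Principal": "*" and "Principal": {"AWS": "*"} match any principal.
        aws = principal.get("AWS") if isinstance(principal, dict) else principal
        if "*" in _as_list(aws):
            flagged.append(statement)
    return flagged


key_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "EnableKeyManagement",
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": "kms:*",
        "Resource": "*",
    }],
}
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123412341234:root"},
        "Action": "kms:*",
        "Resource": "*",
    }],
}

print(len(open_to_everyone(key_policy)))     # the key policy above is flagged
print(len(open_to_everyone(scoped_policy)))  # an account-scoped principal is not
```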

S3 is a popular storage solution, so we see a lot of breaches that involve overly open S3 buckets. It doesn’t help that there are bucket policies, bucket ACLs, public access blocks, and S3 access points on top of IAM. Anyway, that’s a topic for a different deep dive.

3. Inline policies and named policies

When you create a role, user, or group, you can attach inline policies or named policies (AWS calls the latter managed policies). AWS recommends that you use named policies where possible, and we’re of the same opinion.

Named policies have several advantages. They offer reusability, central change management, versioning, and rollback. For AWS’ opinion on this, check the docs.

That said, it’s not necessarily a problem to use inline policies – especially for the one-to-one relationships Lambda functions have with roles.

The problem here is that you can combine these two to create a complex mess for yourself. Do yourself and your teammates a favor and be consistent. Use named policies unless they don’t fit your use case, and don’t combine the two.
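If you want to find out where the two are already mixed, a small script can help. The sketch below assumes its input looks like the RoleDetailList section of the output of `aws iam get-account-authorization-details`, where inline policies appear under RolePolicyList and attached policies under AttachedManagedPolicies. The role and policy names in the sample data are made up.

```python
def roles_mixing_policy_types(role_details):
    """Names of roles that have both inline and attached (named) policies."""
    return sorted(
        role["RoleName"]
        for role in role_details
        if role.get("RolePolicyList") and role.get("AttachedManagedPolicies")
    )


# Hypothetical sample data in the assumed RoleDetailList shape.
role_details = [
    {
        "RoleName": "report-lambda",
        "RolePolicyList": [{"PolicyName": "inline-logs", "PolicyDocument": {}}],
        "AttachedManagedPolicies": [],
    },
    {
        "RoleName": "dev-engineer",
        "RolePolicyList": [{"PolicyName": "inline-s3", "PolicyDocument": {}}],
        "AttachedManagedPolicies": [
            {"PolicyName": "ReadOnly", "PolicyArn": "arn:aws:iam::123412341234:policy/ReadOnly"}
        ],
    },
]

print(roles_mixing_policy_types(role_details))  # only dev-engineer mixes both kinds
```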

If you are already lost in the woods, one of the tools that can help you understand what happens is the IAM policy simulator (AWS account required). It doesn’t play nice with Service Control Policies (SCPs), though.

4. Extensive role chaining

IAM roles are temporary identities in AWS. They can be assumed by those who need them and are allowed to use them. Users, groups, services and applications can use these roles in various ways. For now, let’s focus on your users. A user can assume a role if 1) the user’s IAM policies allow sts:AssumeRole on that role and 2) the role’s trust relationship allows it.

Let’s start with the assume role policy. This policy lets a user assume a predefined role. Take the following example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::123412341234:role/*"
    }
  ]
}

Note: 123412341234 stands in for your AWS account ID. This policy allows a user to assume any role in that account.

Next up is the trust relationship, which is defined in a trust policy. Through its trust policy, an IAM role defines who can assume it. Take a look at the following example:

 ...Role definitions for "example-role"...
        "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                "AWS": "arn:aws:iam::123412341234:role/dev-engineer"
              },
              "Action": "sts:AssumeRole",
              "Condition": {}
            }
          ]
        }
  }

Here we say that anyone with the role dev-engineer in the account with the ID 123412341234 is allowed to assume the example-role to which this trust policy is attached.

You can easily extend the example above by allowing another role assumption from this example-role. Maybe there are some other roles the dev-engineer or the example-role can assume (see also: role chaining).

By chaining these AssumeRole calls, it’s possible that the dev-engineer can now suddenly do more than we intended. We have seen various examples where an organization allowed (external) engineers to first use a sort of view-only role to have a look at some of the configurations in AWS, after which they could assume a role that could see classified secrets. This happened because another engineering role had to look at those secrets as well and used the view-only role as an intermediate step.
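One way to spot such chains is to treat trust policies as a graph and walk it. The sketch below is a rough approximation of our own: it only follows exact role ARNs listed as AWS principals, and it ignores conditions, account-wide principals, and whether the caller’s own IAM policies allow sts:AssumeRole. The role names view-only and secrets-reader are hypothetical.

```python
from collections import deque


def _as_list(value):
    return [value] if isinstance(value, str) else list(value or [])


def trusted_principals(trust_policy):
    """AWS principals that a trust policy allows to call sts:AssumeRole."""
    principals = set()
    for statement in trust_policy.get("Statement", []):
        principal = statement.get("Principal")
        if statement.get("Effect") != "Allow" or not isinstance(principal, dict):
            continue
        if "sts:AssumeRole" in _as_list(statement.get("Action")):
            principals.update(_as_list(principal.get("AWS")))
    return principals


def reachable_roles(start_arn, trust_policies):
    """Roles reachable from `start_arn` through one or more AssumeRole hops.

    `trust_policies` maps a role ARN to that role's trust policy document.
    """
    reachable, queue = set(), deque([start_arn])
    while queue:
        current = queue.popleft()
        for role_arn, policy in trust_policies.items():
            if role_arn in reachable or role_arn == start_arn:
                continue
            if current in trusted_principals(policy):
                reachable.add(role_arn)
                queue.append(role_arn)
    return reachable


PREFIX = "arn:aws:iam::123412341234:role/"


def trusts(role_name):
    """Trust policy that lets the named role assume the role it is attached to."""
    return {"Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": PREFIX + role_name},
        "Action": "sts:AssumeRole",
    }]}


trust_policies = {
    PREFIX + "view-only": trusts("dev-engineer"),
    PREFIX + "secrets-reader": trusts("view-only"),
}

# dev-engineer reaches secrets-reader through view-only.
print(sorted(reachable_roles(PREFIX + "dev-engineer", trust_policies)))
```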

Handling role chaining

Always check the trust policies attached to a role, and make sure that you do not add an overly broad trust policy to powerful roles, that is, roles with powerful policies attached. Make use of the Condition clause of the AssumeRole Policy Document as explained in this blog post from AWS.
If the AWS IAM configuration is under your control, always check that the principals have a properly scoped AssumeRole policy attached.
Leverage AWS SSO (now called AWS IAM Identity Center). AWS SSO is able to provision all of the roles your users need across the organization. Users log in to AWS SSO and jump to their target role in just one step. This virtually eliminates the need for role chaining and greatly simplifies the setup. Happy users, happy you.
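As an illustration of the Condition clause, the sketch below adds one condition to every Allow statement of a trust policy. The helper name is our own, and the choice of aws:PrincipalOrgID with the placeholder value o-exampleorgid is only an assumption for the example. Which condition key fits depends on your setup.

```python
import copy
import json


def with_condition(trust_policy, operator, key, value):
    """Return a copy of the trust policy with one extra condition on every Allow statement."""
    hardened = copy.deepcopy(trust_policy)
    for statement in hardened["Statement"]:
        if statement["Effect"] == "Allow":
            statement.setdefault("Condition", {}).setdefault(operator, {})[key] = value
    return hardened


trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123412341234:role/dev-engineer"},
        "Action": "sts:AssumeRole",
        "Condition": {},
    }],
}

# "o-exampleorgid" is a placeholder, not a real organization ID.
hardened = with_condition(trust_policy, "StringEquals", "aws:PrincipalOrgID", "o-exampleorgid")
print(json.dumps(hardened, indent=2))
```

The helper returns a copy, so the original document stays untouched and you can compare the two before applying anything.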

Note: Role assumption has long been difficult to trace. The SourceIdentity attribute makes it a lot easier to follow in CloudTrail logs. For more information, have a look at this AWS blog post.

5. <ACCOUNTNUMBER>/root in trust policy documents

Remember the statement "AWS": "arn:aws:iam::123412341234:role/dev-engineer" in the trust policy of section 4? Had it stated "AWS": "arn:aws:iam::123412341234:root" instead, that would be a red flag. If you include this, you rely completely on the IAM setup of account 123412341234. Do this only if you fully and completely trust it. That should be almost never.

The principal statement above basically implies that the role to which the trust policy is attached can be assumed by any authenticated and authorized principal in the 123412341234 account. So if you do this, make sure you’re in full control of that account and ensure the role has limited privileges.
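Spotting this pattern across many roles is straightforward to script. The following sketch, with a function name of our own, lists account-wide principals in a trust policy. It recognizes both the :root ARN and the bare 12-digit account ID, which AWS accepts as shorthand for the same thing.

```python
import re

# Matches "arn:aws:iam::<12 digits>:root" and the bare 12-digit account ID shorthand.
_ACCOUNT_PRINCIPAL = re.compile(r"^(arn:aws[\w-]*:iam::\d{12}:root|\d{12})$")


def account_wide_principals(trust_policy):
    """Return the principals in Allow statements that trust a whole account."""
    found = []
    for statement in trust_policy.get("Statement", []):
        principal = statement.get("Principal")
        if statement.get("Effect") != "Allow" or not isinstance(principal, dict):
            continue
        aws = principal.get("AWS", [])
        for value in [aws] if isinstance(aws, str) else aws:
            if _ACCOUNT_PRINCIPAL.match(value):
                found.append(value)
    return found


broad = {"Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::123412341234:root"},
    "Action": "sts:AssumeRole",
}]}
narrow = {"Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::123412341234:role/dev-engineer"},
    "Action": "sts:AssumeRole",
}]}

print(account_wide_principals(broad))   # flags the :root principal
print(account_wide_principals(narrow))  # a specific role is not flagged
```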

6. What does your service allow you to do?

One often overlooked pitfall is the scope of policies attached to the service roles of various services. When a user cannot access data in bucket A, but can access SageMaker, then clever usage of SageMaker can leak the contents of bucket A to the user.

One of the interesting SageMaker policies AWS references in its docs is this one:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "iam:PassRole",
                "sagemaker:DescribeEndpointConfig",
                "sagemaker:DescribeModel",
                "sagemaker:InvokeEndpoint",
                "sagemaker:ListTags",
                "sagemaker:DescribeEndpoint",
                "sagemaker:CreateModel",
                "sagemaker:CreateEndpointConfig",
                "sagemaker:CreateEndpoint",
                "sagemaker:DeleteModel",
                "sagemaker:DeleteEndpointConfig",
                "sagemaker:DeleteEndpoint",
                "cloudwatch:PutMetricData",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:CreateLogGroup",
                "logs:DescribeLogStreams",
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket",
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage"
            ],
            "Resource": "*"
        }
    ]
}

There are several potentially problematic statements in there, especially combined with the Resource: "*" part. Essentially this policy allows passing any role (potentially highly privileged ones), and allows putting objects into and fetching them from any S3 bucket. It also allows publishing logs anywhere, deleting all SageMaker models, etc.

You can expect this from a managed policy, as they are built for compatibility rather than security. Take this into account when selecting or copy-pasting readily available policies.
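Before you copy a policy like this, it helps to list which sensitive actions it grants on every resource. The sketch below does that for a shortlist of actions. The shortlist and the function name are our own assumptions, and the check ignores conditions and NotAction.

```python
from fnmatch import fnmatchcase

# Assumption: our own shortlist of actions that deserve a scoped Resource.
SENSITIVE_ACTIONS = ("iam:PassRole", "s3:GetObject", "s3:PutObject")


def _as_list(value):
    return [value] if isinstance(value, str) else list(value or [])


def unscoped_sensitive_actions(policy, sensitive=SENSITIVE_ACTIONS):
    """Sensitive actions that an Allow statement grants on every resource."""
    hits = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        if "*" not in _as_list(statement.get("Resource")):
            continue
        for pattern in _as_list(statement.get("Action")):
            # Policy actions may contain wildcards such as "s3:*".
            hits.update(a for a in sensitive if fnmatchcase(a.lower(), pattern.lower()))
    return sorted(hits)


# A shortened version of the SageMaker policy shown above.
sagemaker_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "iam:PassRole",
            "sagemaker:InvokeEndpoint",
            "logs:PutLogEvents",
            "s3:GetObject",
            "s3:PutObject",
            "s3:ListBucket",
        ],
        "Resource": "*",
    }],
}

print(unscoped_sensitive_actions(sagemaker_policy))
```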

Similarly, QuickSight can serve confidential content to readers. Therefore it is key to configure these services in a way that only shows information according to the entitlements of the users.

Another example is in EC2: When users can execute actions on an EC2 instance, and the instance has access to privileged data on S3 with its role, then users can use this EC2 instance to access that data.

If you want to play around with privilege escalation, check out IAM-Vulnerable. Highly recommended!

7. The “more control by fine-grained control” fallacy

There’s a German proverb that more or less says: Who has the choice, has the pain (“Wer die Wahl hat, hat die Qual”). IAM is fantastic because of the fine-grained access it allows you to configure. This is also where it gets painful, and the complexity here is one of the reasons multiple AWS accounts are a best practice.

Sometimes you have to do some grunt work to get the policy just right, while taking into account all of the other policies that apply to the same entity. Wrapping your head around everything can be hard.
To help you, here are a few pointers:

Structure your policies. For example, if you need to grant permissions related to a specific feature, give it an appropriate name and put everything required for said feature in that policy. Resist the temptation to edit other policies just because you “already have some s3 permissions in there”. Try to apply this logic to the statements within your policy as well.
Sometimes it’s easier to add a new AWS account than to get your policies just right. AWS accounts form a natural boundary and hit the reset button on ‘what can access what’ within your account. In determining when you need a separate AWS account, think of the blast radius. What could possibly go wrong if something’s breached? How will that impact other resources in your account?

8. IAM Service limits

You’ll probably hit IAM service limits sooner or later. You write a policy that’s too big, attach too many policies to a role or group, or attach too many SCPs to an organizational unit. Even if your Terraform plan or CloudFormation change set looks alright, these limits may break your pipeline and cause you to refactor your setup.

It’s useful to check the AWS IAM and STS limits every so often. Especially do so when working with SCPs, as problems here could disrupt your entire organization.
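You can catch the policy-size limit before it breaks your pipeline. The sketch below is a rough pre-check of our own. It assumes the managed policy quota of 6,144 characters that AWS documents at the time of writing, and it approximates the whitespace-free size by serializing the policy as compact JSON. Verify the number against the current quotas before relying on it.

```python
import json

# Assumption: the documented managed-policy quota at the time of writing.
MANAGED_POLICY_MAX_CHARS = 6144


def policy_size(policy):
    """Approximate policy size: JSON length without insignificant whitespace."""
    return len(json.dumps(policy, separators=(",", ":")))


def fits_managed_policy(policy, limit=MANAGED_POLICY_MAX_CHARS):
    return policy_size(policy) <= limit


small = {"Version": "2012-10-17", "Statement": []}

# Filler actions to simulate a policy that grew too large; these are not real actions.
large = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [f"example:Action{i:04d}" for i in range(400)],
        "Resource": "*",
    }],
}

print(policy_size(small), fits_managed_policy(small))
print(policy_size(large), fits_managed_policy(large))
```

A check like this fits well as a unit test or a pipeline step that runs before the Terraform plan or CloudFormation change set.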

9. Attribute-based access control (ABAC) and tag policies

ABAC is a contentious topic. AWS states it is important for making security scalable, but AWS users and security experts argue that it has major flaws.

On the one hand it saves you from constantly updating your least-privilege policies when you add new resources and services. On the other hand not all resources and services support tagging. This renders it useless in some scenarios.

Our view is that you shouldn’t rely on tag-based authorization within an AWS account, but it could work for granting access through role assumptions.

The reason here is that a lot of resources don’t support fine-grained access based on tags. Quite often, if you can create resources, you can set tags for them. If you can create resources, you can probably also create roles and policies. And let’s face it, you probably need to. The problem is that this renders authorization based on these tags moot within any account where you’re able to do so, even by extension (via CI/CD).

Similarly, if you want to create a role for creating EC2 instances with certain tags, and that same application should terminate only EC2 instances with said tags, good luck creating a policy that effectively narrows this down.

On the other hand, setting tags on a group and only allowing role assumptions for roles that have the same tags works decently. If that’s your use case, rock that ABAC.
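A tag-matched role assumption could look like the statement built below. This is a sketch under assumptions: the helper name and the tag key team are hypothetical, and it relies on the iam:ResourceTag condition key and the aws:PrincipalTag policy variable, which to our knowledge can be combined this way for sts:AssumeRole. Test it in your own account before depending on it.

```python
import json


def assume_role_by_tag_statement(account_id, tag_key):
    """Allow sts:AssumeRole only on roles whose tag equals the caller's own tag."""
    return {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": f"arn:aws:iam::{account_id}:role/*",
        "Condition": {
            "StringEquals": {
                # The role's tag must equal the calling principal's tag.
                f"iam:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
            }
        },
    }


# "team" is a hypothetical tag key.
statement = assume_role_by_tag_statement("123412341234", "team")
print(json.dumps(statement, indent=2))
```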

10. Not looking back

Okay, you noticed some misconfigurations. Or maybe you didn’t, and you have a bigger problem. The thing is, IAM is very much dynamic. Therefore, you need to put in place proper monitoring. Concretely, you’ll want to:

Evaluate high-risk events (do some threat modeling to know what they are).
Set up AWS Config to track configurations. Trust us, you’ll want to be able to create a timeline on several resources in case of an incident. And no, that won’t necessarily be in your version-controlled infrastructure as code.
Create an organization-wide IAM Access Analyzer. This way you can keep track of the external exposure on S3 buckets, roles, KMS keys, and Lambda layers.
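Evaluating high-risk events can start small. The sketch below filters CloudTrail records for a handful of IAM API calls. The watchlist is only an illustrative assumption, since your threat model decides what belongs on it, and the sample records are made up. The field names (eventSource, eventName, eventTime, userIdentity.arn) follow the CloudTrail record format.

```python
# Assumption: an illustrative shortlist; your threat model decides what belongs here.
HIGH_RISK_IAM_EVENTS = {
    "AttachRolePolicy",
    "CreateAccessKey",
    "CreatePolicyVersion",
    "PutRolePolicy",
    "UpdateAssumeRolePolicy",
}


def high_risk_events(records, watchlist=HIGH_RISK_IAM_EVENTS):
    """Yield (time, actor ARN, event name) for IAM events on the watchlist."""
    for record in records:
        if record.get("eventSource") != "iam.amazonaws.com":
            continue
        if record.get("eventName") in watchlist:
            actor = record.get("userIdentity", {}).get("arn", "unknown")
            yield record.get("eventTime"), actor, record["eventName"]


# Hypothetical sample records.
records = [
    {
        "eventTime": "2022-01-01T10:00:00Z",
        "eventSource": "iam.amazonaws.com",
        "eventName": "UpdateAssumeRolePolicy",
        "userIdentity": {"arn": "arn:aws:sts::123412341234:assumed-role/dev-engineer/alice"},
    },
    {
        "eventTime": "2022-01-01T10:05:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "GetObject",
        "userIdentity": {"arn": "arn:aws:sts::123412341234:assumed-role/dev-engineer/alice"},
    },
]

for event in high_risk_events(records):
    print(event)
```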

Now what?

Whew, that’s a lot to take in. Apart from finding and selecting the AWS service you need, IAM is probably the most complex part of AWS. So keep that policy evaluation logic close at hand, and think of the structure you want in your IAM setup. Create something structured. It’s natural to pivot to a different solution or architecture when your organization or stack changes, but don’t try to do a bit of everything. That’ll leave you with a tangled mess. Create separate AWS accounts, and by all means re-read this blog on IAM pitfalls every once in a while.

If you find you still have problems, or want a thorough assessment of your AWS setup, don’t hesitate to reach out!
