Securing Container Workloads on AWS Fargate

When containers first became mainstream (think PyCon 2013 with Solomon Hykes on stage), everyone thought it had potential and began to test running containers on their own, but almost no one set out to put containers in production that day. They wanted to see it battle-tested…which has happened over time. Containers have matured from an emerging technology to production-ready where it’s generally considered safe, but there’s a new problem. Now, we need our business processes, tools, and architecture models to mature as well.

The top ask I hear about containers comes down to security. Containers were built as a way to isolate workloads for one another, but many of the security models from virtual machines do not work in containers, and thus we must evolve our thought process.

To that end, I presented an AWS Online Tech Talk about how to secure container workloads using AWS Fargate (though many of the lessons also apply generally across containers). I demonstrated some quick steps to make your containers more secure during the build process as well as how to enhance visibility and security around containers running in AWS Fargate.

Automatically Deploy Hugo Blog to Amazon S3

I had grand aspirations of maintaining a personal blog on a weekly basis, but sometimes that isn’t always possible. I’ve been using my iPad and Working Copy to write posts, but had to use my regular computer to build and publish. CI/CD pipelines help, but I couldn’t find the right security and cost optimizations for my use case…until this year.

My prior model had my blog stored on GitLab because it enabled a free private repository (mainly to hide drafts and future posts). I was building locally using a Docker container and then uploading to Amazon S3 via a script.

At the beginning of the year, GitHub announced free private repositories (for up to 3 contributors), and I promptly moved my repo to GitHub. (NOTE: I don’t use CodeCommit because it’s more difficult to plumb together with Working Copy.)

I was able to now plumb CodePipeline and CodeBuild to build my site, but fell short on deploying my blog to S3. I had to build a Lambda function to extract the artifact and upload to S3. The function is only 20 lines of Python, so it wasn’t difficult.

But then, AWS announced deploying to S3 from CodePipeline, meaning my Lambda function was useful for exactly 10 days!

Now, I can write a post from my iPad and publish it to my blog with a simple commit on the iPad! It’s a good start to 2019, with (hopefully) more topics coming soon…

Rotate IAM Access Keys

How often do you change your password?

Within AWS is a service called Trusted Advisor. Trusted Advisor runs checks in an AWS account looking for best practices around Cost Optimization, Fault Tolerance, Performance, and Security.

In the Security section, there’s a check (Business and Enterprise Support only) for the age of an Access Key attached to an IAM user. The Trusted Advisor check that will warn for any key older than 90 days and alert for any key older than 2 years. AWS recommends rotating the access keys for each IAM user in the account.

From Trusted Advisor Best Practices (Checks):

Checks for active IAM access keys that have not been rotated in the last 90 days. When you rotate your access keys regularly, you reduce the chance that a compromised key could be used without your knowledge to access resources. For the purposes of this check, the last rotation date and time is when the access key was created or most recently activated. The access key number and date come from the access_key_1_last_rotated and access_key_2_last_rotated information in the most recent IAM credential report.

The reason for these times is the mean time to crack an access key. Using today’s standard processing unit, and AWS Access Key could take xxx to crack, and users should rotate their Access Key before that time.

Yet in my experience, this often goes unchecked. I’ve come across an Access Key that was 4.5 years old! I asked why not change it, and the answer is mostly the same–the AWS Administrators and Security teams do not own and manage the credential, and the user doesn’t want to change the credential for fear it will break their process.

Rotating an AWS Access Key is not difficult. It’s a few simple commands to the AWS CLI (which you presumably have installed if you have an Access Key).

  1. Create a new access key (CreateAccessKey API)
  2. Configure AWS CLI to use the new access key (aws configure)
  3. Disable the old access key (UpdateAccessKey API)
  4. Delete the old access key (DeleteAccessKey API)

Instead of requiring each user to remember the correct API calls and parameters to each, I’ve created a script in buzzsurfr/aws-utils called rotate_access_key.py that orchestrates the process. Written in Python (a dependency of AWS CLI, so again should be present), the script minimizes the number of parameters and removes the undifferentiated heavy lifting associated with selecting the correct key. The user’s access is confirmed to be stable by using the new access key to remove the old access key. The script can be scheduled using crown or Scheduled Tasks and supports CLI profiles.

usage: rotate_access_key.py [-h] --user-name USER_NAME
                            [--access-key-id ACCESS_KEY_ID]
                            [--profile PROFILE] [--delete] [--no-delete]
                            [--verbose]

optional arguments:
  -h, --help            show this help message and exit
  --user-name USER_NAME
                        UserName of the AWS user
  --access-key-id ACCESS_KEY_ID
                        Specific Access Key to replace
  --profile PROFILE     Local profile
  --delete              Delete old access key after inactivating (Default)
  --no-delete           Do not delete old access key after inactivating
  --verbose             Verbose

In order to use the script, the user must have the right set of permissions for their IAM user. This template is an example and only grants the IAM user permissions to change their own access Key.

From IAM: Allows IAM Users to Rotate Their Own Credentials Programmatically and in the Console:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "iam:ListUsers",
                "iam:GetAccountPasswordPolicy"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:*AccessKey*",
                "iam:ChangePassword",
                "iam:GetUser",
                "iam:*ServiceSpecificCredential*",
                "iam:*SigningCertificate*"
            ],
            "Resource": ["arn:aws:iam::*:user/${aws:username}"]
        }
    ]
}

This script is designed for users to rotate their credentials. This does not apply for “service accounts” (where the credential is configured on a server or unattended machine). If the machine is an EC2 Instance or ECS Task, then attaching an IAM Role to the instance or task will automatically handle rotating the credential. If the machine is on-premise or hosted elsewhere, then adapt the script to work unattended (I’ve thought about coding it as well).

As an AWS Administrator, you cant simply pass out the script and expect all users to rotate their access keys on time. Remember to build the system around it. Periodically query the TA check looking for access keys older than 90 days (warned), and send that user a reminder to rotate their access key. Take it a step further by automatically disabling access keys older than 120 days (warn them in the reminder). Help create good security posture and a good experience for your users, and make your account more secure at the same time!

Add Athena Partition for ELB Access Logs

If you’ve worked on a load balancer, then at some point you’ve been witness to the load balancer taking the blame for an application problem (like a rite of passage). This used to be difficult to exonerate, but with AWS Elastic Load Balancing you can capture Access Logs (Classic and Application only) and very quickly identify whether the load balancer contributed to the problem.

Much like any log analysis, the volume of logs and frequency of access are key to identify the best log analysis solution. If you have a large store of logs but infrequently access them, then a low-cost option is Amazon Athena. Athena enables you to run SQL-based queries against your data in S3 without an ETL process. The data is durable and you only pay for the volume of data scanned per query. AWS also includes documentation and templates for querying Classic Load Balancer logs and Application Load Balancer logs.

This is a great model, but with a potential flaw–as the data set grows in size, the queries become slower and more expensive. To remediate, Amazon Athena allows you to partition your data. This restricts the amount of data scanned, thus lowering costs and increasing speed of the query.

ELB Access Logs store the logs in S3 using the following format:

s3://bucket[/prefix]/AWSLogs/{{AccountId}}}}/elasticloadbalancing/{{region}}/{{yyyy}}/{{mm}}/{{dd}}/{{AccountId}}_elasticloadbalancing_{{region}}_{{load-balancer-name}}_{{end-time}}_{{ip-address}}_{{random-string}}.log

Since the prefix does not pre-define partitions, the partitions must be created manually. Instead of creating partitions ad-hoc, create a CloudWatch Scheduled Event that runs daily targeted at a Lambda function that adds the partition. To simplify the process, I created buzzsurfr/athena-add-partition.

This project is both the Lambda function code and a CloudFormation template to deploy the Lambda function and the CloudWatch Scheduled Event. Logs are sent from the Load Balancer into a S3 bucket. Daily, the CloudWatch Scheduled Event will invoke the Lambda function to add a partition to the Athena table.

Using the partitions requires modifying the SQL query used in the Athena console. Consider the basic query to return all records: SELECT * FROM logs.elb_logs. Add/append to a WHERE clause including the partition keys with values. For example, to query only the records for July 31, 2018, run:

SELECT *
FROM logs.elb_logs
WHERE
  (
    year = '2018' AND
    month = '07' AND
    day = '31'
  )

This query with partitions enabled restricts Athena to only scanning

s3://bucket/prefix/AWSLogs/{{AccountId}}/elasticloadbalancing/{{region}}/2018/07/31/

instead of

s3://bucket/prefix/AWSLogs/{{AccountId}}/elasticloadbalancing/{{region}}/

resulting in a significant reduction in cost and processing time.

Using partitions also makes it easier to enable other Storage Classes like Infrequent Access, where you pay less to store but pay more to access. Without partitions, every query would scan the bucket/prefix and potentially cost more due to the access cost for objects with Infrequent Access storage class.

This model can be applied to other logs stored in S3 that do not have pre-defined partitions, such as CloudTrail logsCloudFront logs, or for other applications that export logs to S3, but don’t allow modifications to the organizational structure.