AWS Security Configuration Scanner
The security scanner project has now been suspended. If you want to scan your workloads, I suggest you look at Prowler.
Large enterprises tend to invest in CSPM (Cloud Security Posture Management) systems like Dome9, Prisma Cloud, or Orca Security. For smaller companies, a CSPM can be cost prohibitive, so they tend to simply do nothing, and hope they don't have any breaches. This is a dangerous place to be in.
Let's assume you do look at tools like Trusted Advisor once in a while. It will show you some of the big-ticket items you need to look at, but it doesn't go into a lot of detail. That's where the AWS Security Info Configuration Scanner comes in. It's a Python project I've been working on for the past two years, and it is finally ready for release.
As the title implies, it's a tool you can use to scan the configuration of your AWS account. It has a number of built-in security controls that will give you an overview of where the security issues in your AWS account could be. At a high level, the CIS AWS Foundations Benchmark was used as the basis for the majority of the security controls.
I can already hear some of you saying: why should I use this script, when I can simply use Security Hub? And you'd be right: you could use Security Hub (and in fact, I highly recommend it!). The big difference is that Security Hub requires Config rules to be set up, and Config will incur additional charges. This is not necessarily a bad thing. The problem, however, is that Security Hub will keep generating alerts, and unless you're actively monitoring them, the alerts will simply go into a black hole, never to be seen again.
Why use this script then? I view it more as an audit tool. It can generate a point-in-time snapshot of what the security configuration of your AWS account looks like, and the output can then be used by auditors to discuss and challenge the findings with the various cloud security architecture teams.
How to use it
I would recommend that you run the script from the us-east-1 region. Since this is the central region for AWS (where all the IAM functions live), most of the API calls will occur against this region, so it's recommended that you use either CloudShell or a spot instance in that region to run the script.
CloudShell
Fire up CloudShell in the us-east-1 region. Assuming you have ReadOnly access to the AWS account, simply execute the following commands:
git clone https://github.com/massyn/aws-security
pip3 install boto3 mako --upgrade
python3 aws-security/scanner/scanner.py --collect /tmp/%a-%d.json --report /tmp/%a.html
That's it! The script should start running. Depending on the size of your environment, it may take about 30 minutes to run, maybe more.
Spot instance
This is a work in progress. I have been successful in running a spot instance to execute the script. I am busy packaging the solution, and will update this blog post once it is ready. Essentially, you need to do the following (a rough sketch follows the list):
- Create an IAM role with an EC2 instance profile that has read-only access to the entire AWS account, and write access to a specific S3 bucket.
- Spin up an EC2 spot instance with a public IP address, attach the instance profile to the spot instance, and run the same commands as mentioned above.
- When the script is done, copy the generated files to an S3 bucket, and destroy the EC2 instance.
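Until the packaged solution is ready, here is a rough idea of what launching such a spot instance could look like with boto3. The AMI id, instance profile name, and S3 bucket below are placeholders you would replace with your own:

import boto3

# Placeholder user data: install dependencies, run the scan, write straight to S3, then shut down.
# Remember to terminate the instance afterwards, as per step 3 above.
USER_DATA = """#!/bin/bash
yum install -y git python3-pip
pip3 install boto3 mako --upgrade
git clone https://github.com/massyn/aws-security
python3 aws-security/scanner/scanner.py --collect s3://my-scanner-bucket/%a-%d.json --report s3://my-scanner-bucket/%a.html
shutdown -h now
"""

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",          # an Amazon Linux 2 AMI in us-east-1
    InstanceType="t3.medium",
    MinCount=1,
    MaxCount=1,
    IamInstanceProfile={"Name": "scanner-readonly-profile"},  # the read-only role from step 1
    InstanceMarketOptions={"MarketType": "spot"},
    UserData=USER_DATA,
)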
Operation
The script connects to AWS using the default credentials, and starts to interrogate each of the services to retrieve the data. This is where the json file comes in. When it's done with the data extraction, you'll have a single json file that contains (most of) the system configuration that has been defined on your AWS account. This opens up a lot of possibilities: if you're interested in digging through the config, you can write your own jmespath queries to retrieve anything your heart desires.
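For example, assuming the collector stored the result of an S3 list_buckets call under a key such as s3 (the exact key names depend on the data format of the release you run), a jmespath query could look like this:

import json
import jmespath   # pip3 install jmespath

with open("/tmp/123456789012-2021-12-01.json") as f:
    data = json.load(f)

# Hypothetical query - adjust the path to match the actual data structure
print(jmespath.search("s3.list_buckets[].Name", data))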
Once the json file has been created, the policy parser kicks in. It reads through the json file, looking for the logic that has been predefined in the script, and then generates a report (in HTML format) of all the findings.
Hidden features
When specifying the output file names (--collect, --report, --evidence), you can specify %a (for the account id) or %d (for the date). This allows you to have a batch file or a shell script you can run against a number of accounts, and it will keep a file per account, per day.
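As an illustration, a small Python wrapper that loops over a list of role ARNs (placeholders below) would produce one file per account, per day:

import subprocess

# Placeholder role ARNs - one per account you want to scan
ROLES = [
    "arn:aws:iam::111111111111:role/security-audit",
    "arn:aws:iam::222222222222:role/security-audit",
]

for role in ROLES:
    # The scanner substitutes %a with the account id and %d with today's date
    subprocess.run([
        "python3", "aws-security/scanner/scanner.py",
        "--rolearn", role,
        "--externalid", "your-external-id",
        "--collect", "/tmp/%a-%d.json",
        "--report", "/tmp/%a.html",
    ], check=True)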
You can also request the cloud team to run the json extract for you. Once you have the json file, you can parse the output yourself using the --nocollect flag. This will simply skip the data ingestion step, read the provided --collect file, and parse the security rules.
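For example, assuming the cloud team handed you a file called 123456789012-2021-12-01.json:

python3 aws-security/scanner/scanner.py --nocollect --collect /tmp/123456789012-2021-12-01.json --report /tmp/report.html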
Did you know you can specify an S3 path for the html or json files? That's right! You can store the HTML file directly in S3!
Known issues
- The script does support running AssumeRole to connect to another account where it has permissions. The issue, however, is that the provided credentials are only valid for one hour. If your data collection takes more than an hour to run, the script will start failing. As a workaround, the json file is updated continuously, so simply restarting the process will allow the script to continue where it left off, and complete.
- Not all use cases could be tested. There is a chance that some data collection or policy parsing will fail, simply because I wasn't able to test it. For example, I do not have access to Direct Connect on my lab system (and I'm really not going to request a dedicated leased line just for that), so you may see some failures as a result. If so, simply open a case on the GitHub issue log and let me know, so I can resolve it.
- The script takes too long to run as a Lambda function.
- Data ingestion for CloudFront (list_functions) may fail. If so, update your boto3 Python library.
What's next?
This is where you come in. The main driver for this project is to give something back to the AWS community, to make AWS a more secure environment for its customers. Some of the things I'd like to still do are:
- Fix the Lambda function. This will require decoupling the script, letting multiple Lambda functions collect the data from multiple regions, and possibly storing the data in DynamoDB.
- Add more policies. Do you have some ideas? Log them in the GitHub issue log.
- Add multi-threading. When connecting to individual regions, do it in a multi-threaded manner so that we can speed up the execution of the data collector (see the sketch after this list).
- Build a web frontend. I am playing with the idea of turning the script into a full-blown CSPM solution.
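To illustrate the multi-threading idea (a sketch of the concept, not the project's actual code), the per-region collection could be fanned out with a thread pool:

import boto3
from concurrent.futures import ThreadPoolExecutor

def collect_region(region):
    # Example collector: list the EC2 instances in a single region
    ec2 = boto3.client("ec2", region_name=region)
    return region, ec2.describe_instances()["Reservations"]

regions = [r["RegionName"] for r in
           boto3.client("ec2", region_name="us-east-1").describe_regions()["Regions"]]

# Each region is collected on its own thread, with its own boto3 client
with ThreadPoolExecutor(max_workers=8) as pool:
    results = dict(pool.map(collect_region, regions))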
Community Support
The project is hosted in GitHub, and being in GitHub means that you can fork your own copy of the code and adjust it. All I ask is that you give credit, and that you contribute back to the overall project with source code suggestions, or new policies you'd like to see.
Reference guide
collector
The collector script is responsible for connecting to AWS and retrieving the data objects from the various service APIs.
--collect
Specify the output path where the target file should be stored. Note that the file is written continuously throughout execution. This is by design: should the script fail, you're able to restart it without losing any of the previously extracted data.
--collect c:/temp/myfile.json
Save the file to the c:/temp folder, called myfile.json.
--collect c:/temp/output-%a.json
Save the file to the c:/temp folder, but call it output- followed by the AWS account ID.
--collect c:/temp/output-%a-%d.json
Save the file to the c:/temp folder, but call it output- followed by the AWS account ID and today's date in YYYY-MM-DD format. This is useful if you want to save a point-in-time snapshot of the data extract.
--collect s3://mys3bucket/%a.json
Save the file to an S3 bucket.
--nocollect
Skip the data collection part. Useful if you only want to parse the output against a set of policies.
--evidence
Once the policy engine has evaluated all the policies, you can (optionally) save the findings as a local json file.
--report
The output report (in HTML) can also be stored for consumption through a regular web browser.
Ways to authenticate
- By default, the script will use your default credentials, typically the [default] profile.
- The second way is to provide the access keys. This is not recommended, as it carries the risk of exposing credentials when they're not properly managed:
--aws_access_key_id AWS_ACCESS_KEY_ID
--aws_secret_access_key AWS_SECRET_ACCESS_KEY
--aws_session_token AWS_SESSION_TOKEN
- The preferred way to authenticate is by assuming a role. For this to work, you will need the role ARN and the external id.
--rolearn "arn:aws:iam::123456789012:role/name-of-role" --externalid "your-external-id"
- The last way is to combine the two. When both access keys and role details are provided, the script will use the access keys to authenticate against the STS service and assume the new role, as sketched below.
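Conceptually, the combined flow maps onto the STS API like this (a sketch, not the script's exact code):

import boto3

# Step 1: authenticate with the provided access keys
sts = boto3.client(
    "sts",
    aws_access_key_id="AKIA...",               # --aws_access_key_id
    aws_secret_access_key="...",               # --aws_secret_access_key
)

# Step 2: assume the target role with the external id
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/name-of-role",   # --rolearn
    RoleSessionName="aws-security-scanner",
    ExternalId="your-external-id",                           # --externalid
)["Credentials"]

# Step 3: every subsequent API call uses the temporary credentials
session = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)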
Release Notes - December 2021
The December release is a significant rewrite. If you relied on the original data format, you'll notice that it is no longer the same.
- Splitting collector and policy parser - Previously the scanner script called both the collector and the policy parser. They have now been split into two distinct scripts.
- Data ingestion moved to yaml file - Rather than hardcoding data ingestion in Python, the data ingestion can now be configured through a yaml file (note that some use cases cannot be moved to yaml, so they will remain as Python modules).
- Updated data format - The json data output file had a significant change: the data format was simplified and the data structure flattened for typical pagination operations, making it easier to parse the data downstream (see the sketch after this list). Any custom policies or code that rely on the data format will need to be modified to cater for the updated format.
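To illustrate the flattening (a conceptual sketch, not the project's exact code): rather than storing one object per response page, a boto3 paginator's pages can be merged into a single flat list before they land in the json file:

import boto3

iam = boto3.client("iam")

# Merge every page of list_users into one flat list
users = []
for page in iam.get_paginator("list_users").paginate():
    users.extend(page["Users"])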
Release Notes - August 2021
It's been a bit of a quiet month for updates to the AWS Security Info modules. There are a couple of changes that I'm publishing today.
New features
- We now support Organizations! That's right: if you point the script to the master account, and you specify the --organization parameter with the name of your organizational role, the script will interrogate every account in your organization.
- The --regions flag will allow you to specify the regions you operate in, thus reducing the total number of API calls being made.
- Managed AWS policies' get_policy_version data is now added to the initial.json file, and fed into the data load on first load. This speeds up the data collection process significantly by reducing the number of API calls for managed AWS policies, which do not change very often (see the sketch after this list). Should AWS make changes to their policies, simply delete the initial.json file, and let the script run through it once.
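The caching idea looks roughly like this (a sketch; the real initial.json layout may differ):

import json
import os
import boto3

CACHE_FILE = "initial.json"
iam = boto3.client("iam")

# Load previously cached managed policy documents, if any
cache = json.load(open(CACHE_FILE)) if os.path.exists(CACHE_FILE) else {}

def policy_document(arn, version_id):
    key = f"{arn}:{version_id}"
    if key not in cache:
        # Only call the API for policy versions we haven't seen before
        doc = iam.get_policy_version(PolicyArn=arn, VersionId=version_id)
        cache[key] = doc["PolicyVersion"]["Document"]
        with open(CACHE_FILE, "w") as f:
            json.dump(cache, f)
    return cache[key]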
Bug fixes
- Using --assumerole works fine; however, when you have an empty --externalid, the sts module fails.
- checkVersion incorrectly flagged newer boto versions as not being upgraded.
Data collection
- AWS SSO has been added. Note that some aspects are still missing (like identitystore, and the visibility of MFA settings in SSO and for its users).
Policy updates
- NEW: Ensure SSM is enabled on all EC2 instances
- UPDATE: Lambda functions with deprecated runtimes now checks for nodejs10.x, ruby2.5 and python2.7 (see the sketch below)
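As an illustration of what that check does (a sketch, not the project's policy code):

import boto3

DEPRECATED = {"nodejs10.x", "ruby2.5", "python2.7"}

lam = boto3.client("lambda")
for page in lam.get_paginator("list_functions").paginate():
    for fn in page["Functions"]:
        if fn.get("Runtime") in DEPRECATED:
            print(f"FAIL {fn['FunctionName']} runs deprecated runtime {fn['Runtime']}")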