Securing EC2 instances using Ansible and Conjur
August 11, 2016 | DevOps | Kevin Gilpin
Ansible provides powerful orchestration capabilities to launch and manage machines in the cloud. Conjur provides advanced identity management and access control capabilities to help secure that infrastructure. This is the first in a series of articles in which we explore best practices for using them together, combining Ansible’s legendary ease of use with the visibility and control provided by Conjur’s declarative security, public key management, granular authorization, and detailed audit logging.
In the modern IT environment, code and machines provision and access other code and machines, and therefore need the same privileged access that used to belong only to people. Conjur integrates with and complements Ansible and Ansible Tower security to control and audit access between the thousands of interconnected microservices, machines, and people that compose your digital business infrastructure. It also distributes that control to your cloud edge, across data centers and remote or mobile infrastructure, for fast, non-stop protection.
In this article, we show how to automate the process of:
- Launching a new EC2 instance on AWS
- Defining an identity and access policy for the new EC2 instance in Conjur
- Associating the identity and access policy with the new instance
- Configuring the instance for SSH access
- Auditing access to the instance
Launching an EC2 instance with Ansible and Conjur starts in exactly the way you’d expect. Our sample playbook, available on GitHub, follows common patterns found in other Ansible EC2 examples:
- The playbook creates and manages its own security group called EE_Demos_Ansible. This operation is idempotent; the security group will be updated if it already exists, so it’s safe to run the playbook repeatedly.
- The instance is tagged with a distinct EC2 tag indicating the application name. The Ansible ec2 action is instructed to launch exactly one instance with this tag, so the playbook is also idempotent with respect to the instance.
- The add_host action is used to add the host to the Ansible in-memory “inventory” (the list of hosts and host groups that the playbook knows about).
- The wait_for action is used to wait for the SSH port 22 to become available. Ansible needs this to do anything else, because it uses SSH to connect to machines and configure them.
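The four steps above could be sketched as a playbook along the following lines. This is an illustrative outline, not the article's sample playbook. It assumes the legacy Ansible ec2, ec2_group, add_host, and wait_for modules that were current in 2016 (newer AWS collections replace ec2 with other modules). The region, AMI ID, key pair name, instance type, tag key, and the "launched" group name are placeholders chosen for the example.

```yaml
# Illustrative sketch only; values marked "placeholder" are assumptions.
- name: Launch the frontend instance
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    region: us-east-1          # placeholder region
    image_id: ami-xxxxxxxx     # placeholder AMI ID
    key_name: my-keypair       # placeholder EC2 key pair
  tasks:
    - name: Create or update the security group (safe to re-run)
      ec2_group:
        name: EE_Demos_Ansible
        description: Security group managed by this playbook
        region: "{{ region }}"
        rules:
          - proto: tcp
            from_port: 22
            to_port: 22
            cidr_ip: 0.0.0.0/0   # placeholder; restrict to known addresses in practice

    - name: Ensure exactly one instance carries the application tag
      ec2:
        region: "{{ region }}"
        image: "{{ image_id }}"
        instance_type: t2.micro  # placeholder instance type
        key_name: "{{ key_name }}"
        group: EE_Demos_Ansible
        instance_tags:
          app: frontend
        count_tag:
          app: frontend
        exact_count: 1           # launch only if no tagged instance exists yet
        wait: true
      register: ec2_result

    - name: Add the instance to the in-memory inventory
      add_host:
        name: "{{ item.public_ip }}"
        groups: launched
      with_items: "{{ ec2_result.tagged_instances }}"

    - name: Wait for SSH on port 22 before configuring the host
      wait_for:
        host: "{{ item.public_ip }}"
        port: 22
        delay: 10
        timeout: 320
        state: started
      with_items: "{{ ec2_result.tagged_instances }}"
```

Combining exact_count with count_tag is what makes the launch step idempotent: the module counts running instances that match the tag and launches new ones only if the count falls short. The tagged_instances result is used instead of instances so that later tasks still see the host on repeat runs, when nothing new is launched.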
Once the EC2 instance is up, we configure it for Conjur security. In this example, the instance runs a sample application called “frontend”. A declarative policy (available on GitHub), written in YAML, defines a Conjur “layer” for this application.
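The article's own frontend.yml is not reproduced here. As a rough illustration of the shape such a policy can take, the following minimal sketch uses the Conjur policy YAML tags !policy, !layer, and !host-factory. The host-factory entry is an assumption, included because the enrollment step described later relies on a Host Factory. The real policy may define additional groups and permissions, and the exact syntax should be checked against the Conjur version in use.

```yaml
# Minimal sketch of a Conjur policy; not the article's actual frontend.yml.
- !policy
  id: frontend
  body:
    # The layer that hosts running the "frontend" application belong to.
    - !layer

    # Assumption: a Host Factory that enrolls new hosts into the layer above.
    - !host-factory
      layers: [ !layer ]
```

Inside a policy body, an unnamed !layer takes the policy's own id, so this layer would be addressed as "frontend".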
The policy is loaded into Conjur like this:
$ conjur policy load --as-group ops frontend.yml
Then the policy can be inspected in the Conjur user interface:
[Figure: Policy role graph]
Enrolling the new instance into Conjur
Once the instance is launched, it needs to be enrolled. Enrollment is the process by which the instance obtains a unique Conjur identity and a set of roles (provided by membership in layers).
Conjur has a Host Factory facility for this purpose. Each Host Factory manages a set of cryptographic tokens (opaque, time-limited secrets). When the instance presents a valid token, the Host Factory creates a new host identity, communicates it to the instance, and adds the host to the appropriate layers.
Ansible can provide Host Factory tokens to new instances as part of its workflow. Once Ansible has launched an instance, it obtains a Host Factory token from Conjur and passes it to the /opt/conjurize script on the instance, which in turn contacts the Host Factory on the Conjur server using the token.
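The token hand-off described above might look like the following pair of tasks. This is a sketch under several assumptions: the Conjur CLI is installed and logged in on the Ansible control machine, the Host Factory id is "frontend", the CLI prints a JSON array of token objects, and /opt/conjurize accepts the token as its first argument. The article does not specify how the script receives the token, so that invocation in particular is hypothetical, and the CLI flag should be verified against the installed version.

```yaml
# Illustrative tasks only, intended for a play that targets the new instance.
- name: Request a short-lived Host Factory token from Conjur
  # Assumption: "frontend" is the Host Factory id; the token expires after 5 minutes.
  command: conjur hostfactory tokens create --duration-minutes 5 frontend
  delegate_to: localhost
  register: hf_tokens
  no_log: true               # keep the token out of Ansible logs

- name: Enroll the instance using the token
  # Hypothetical invocation: the argument convention of /opt/conjurize is assumed.
  command: /opt/conjurize "{{ (hf_tokens.stdout | from_json)[0].token }}"
  become: true
  no_log: true
```

Keeping the token lifetime short limits the damage if it leaks, since it is only needed for the few seconds between launch and enrollment.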
Configuring the instance for SSH access
Once the instance has its identity, the “conjurize” script proceeds to configure the host for Conjur SSH access management (the installer for which is free software available on GitHub).
Conjur SSH provides SSH authentication via public keys and SSH authorization via PAM + Conjur LDAP, in addition to the following features:
- Each user has their own private key.
- This private key is used to SSH to EC2 instances using the standard SSH tool chain (an ssh or PuTTY client and an OpenSSH server).
- Users are uniquely authenticated by private and public key, with the public key provided by the Conjur pubkeys service.
- Users are then uniquely authorized to the host according to the Conjur role-based access control model, as defined by policies loaded into Conjur.
- Users do not log in to the host as the “root” user or any other shared account (e.g. “ec2-user” or “ubuntu”). This is wonderful from an audit and compliance standpoint.
- All login, logout, and sudo activity on the instance is recorded in the Conjur audit database, which is compatible with external auditing services such as Splunk.
Enrolling and rotating the host “break glass” SSH key
If a machine becomes accidentally misconfigured, certain highly privileged personnel can be given a “break glass” SSH key to log in and regain control of the machine. This is a shared login, so Conjur makes this “last resort” credential secure by managing a unique SSH key for each machine, which is rotated regularly and on demand.
Once Ansible has finished launching the instance and conjurizing it, it runs a local play using Conjur to create and install the instance’s unique “break glass” key. Conjur writes the public key to the authorized_keys file on the machine, and Ansible stores the private key in a Conjur variable accessible only to a defined set of authorized security personnel.
In order to maintain least privilege and prevent itself from becoming a backdoor, Ansible revokes its own access to the “break glass” key once it’s been created.
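The key creation and storage steps could be approximated with tasks like the ones below. This is a stand-in sketch, not the article's play: it uses ssh-keygen and Ansible's authorized_key module to install the public key, where the article has Conjur performing that step. It also assumes the Conjur CLI is logged in on the control machine and that the target Conjur variable already exists. The names key_path, break_glass_user, and variable_id are hypothetical. Rotation and the revocation of Ansible's own access are not shown.

```yaml
# Illustrative tasks only; key_path, break_glass_user, and variable_id are
# hypothetical variables supplied by the surrounding play.
- name: Generate a unique break-glass key pair for this host
  command: ssh-keygen -t rsa -b 4096 -N "" -f "{{ key_path }}"
  args:
    creates: "{{ key_path }}"
  delegate_to: localhost

- name: Install the public key on the instance
  authorized_key:
    user: "{{ break_glass_user }}"
    key: "{{ lookup('file', key_path ~ '.pub') }}"
    state: present

- name: Store the private key in a Conjur variable
  # Note: the file lookup strips the trailing newline from the key, and the
  # value is briefly visible in the local process list as an argument.
  command:
    argv:
      - conjur
      - variable
      - values
      - add
      - "{{ variable_id }}"
      - "{{ lookup('file', key_path) }}"
  delegate_to: localhost
  no_log: true

- name: Remove the local copies of the key pair
  file:
    path: "{{ item }}"
    state: absent
  delegate_to: localhost
  with_items:
    - "{{ key_path }}"
    - "{{ key_path }}.pub"
```

Deleting the local copies means the only remaining copy of the private key lives in Conjur, which is what allows access to it to be restricted and audited.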
The Power of Ansible and Conjur together
The full power of Ansible automation can be combined with Conjur security and access management capabilities to create deployment environments that are both highly automated and highly advanced from a privileged access management standpoint.