10.2: Secrets Management (ESO)

Secrets Management (External Secrets Operator)

Transcript:

The other major developer experience issue that I've hinted at throughout the course is managing secrets. For all of our Kubernetes configurations, we are able to declare them within our Git repository, which serves as a source of truth that we can then deploy from and interact with. However, we don't want to store sensitive information in plain text in a Git repo, and this creates some friction in managing those secrets. There are a number of approaches I have seen used, some of which work better than others.

The first one that many teams get started with is to manually create and update secrets separately from however they're managing their other configurations. Maybe they store the secret.yaml in their password manager, and whenever they need to make an update, someone has to log into that password manager, pull down that YAML, make some updates, and apply it manually to the cluster. This is how many teams get started because it requires no additional effort up front. However, it quickly becomes a nightmare to manage, and it's very easy for a human to make a mistake in that process: forget to update the source of truth, or forget to apply it to the cluster at the right time. I suggest that teams look at some of the other options rather than stick with that initial approach.

The second option, which we're going to leverage and demo here, is the External Secrets Operator. This allows you to store the source of truth for a secret in a system outside of the cluster, such as a secret manager from a cloud provider or HashiCorp Vault. There are a number of sources you can use, but the idea is that you keep a non-sensitive configuration in your repo that points to an external secret holding the sensitive data, and the operator mirrors the value from that external secret into a Kubernetes secret for you to consume within your cluster.
Another couple of options are Sealed Secrets and SOPS. These are both mechanisms to encrypt your secrets. If you encrypt your secrets with a secure key, the encrypted versions become less sensitive, and you're able to store those encrypted versions inside of Git along with your other configurations. The key to unseal or decrypt those secrets then lives only in the cluster. This can provide a mechanism to keep your secrets in your Git repo and manage them just like all your other configurations. The challenge with these workflows is that in order to create a secret, you now need access to that encryption key, and there are also some challenges with rotating those encryption keys if you need to down the road.

You also could skip using Kubernetes secrets entirely and use something like HashiCorp Vault, AWS Secrets Manager, or Doppler, having your application query those systems directly and pull in the credential values at runtime, avoiding the need to deploy Kubernetes secrets at all. This is sort of a bootstrapping process, where you need some credential to get access to the system, at which point you can pull in whatever credentials you need. Depending on where your secret store lives, that initial credential could be a Kubernetes secret, or it could use workload identity like we did for the CloudNativePG backups, where we tied a Kubernetes service account to a Google Cloud service account to leverage those cloud resources. One good option here that avoids the need for any static credentials is to use your cloud secret manager and then use workload identity to access those secrets, either with a tool like External Secrets Operator or by pulling them in at runtime from your application directly. The version we're going to implement for the course uses External Secrets Operator with Google Cloud Platform Secret Manager.
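To make the encryption-based workflows concrete, here is a minimal sketch of what each looks like on the command line. The file names and the age key are placeholders of my own, not from the course, and neither tool is the approach this lesson actually implements:

```shell
# Sealed Secrets: pipe a plain Secret manifest through kubeseal, which
# encrypts it against the controller's public key; only the controller
# running in the cluster can decrypt the result.
kubeseal --format yaml < secret.yaml > sealed-secret.yaml

# SOPS: encrypt the file for an age recipient; the matching private key
# (held in the cluster or CI) is required to decrypt it later.
sops --encrypt --age <age-public-key> secret.yaml > secret.enc.yaml
```

In both cases the encrypted output (`sealed-secret.yaml` / `secret.enc.yaml`) is what gets committed to Git, which is exactly the property described above: the repo holds only non-sensitive, encrypted material.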
So on the left-hand side, we're defining a secret in GCP's Secret Manager. It's called ExternalSecretsExample, and its value is the text "the secret is stored in Google Secrets Manager". We then define a manifest within our Git repository for this external secret, which references the Google Cloud secret and pulls those values into a Kubernetes secret. We deploy this external secret resource to our cluster; the External Secrets Operator running in that cluster sees our new custom resource, goes and retrieves the value, and populates a Kubernetes secret accordingly.

Let's go ahead and set this up. Navigating into the External Secrets subdirectory, the first thing we're going to do is switch over to our Google Cloud cluster. The reason we want to use our Google Cloud cluster is that we can leverage the workload identity feature to access Secret Manager with no static credentials. Now we can install External Secrets, which we're going to do via Helm: we add the External Secrets repo and then install it with a helm install command.

With the External Secrets Operator installed, we're now going to follow the same set of steps we did for the Google Cloud Storage bucket permissions: create an IAM service account, tie that to a Kubernetes service account, and then use that to access Google Cloud Platform Secret Manager. So we're creating a service account named External Secrets with the gcloud command line. We then need to grant that IAM service account access to secrets within Google Cloud, so we tie the service account we just created to that role. We then need to attach the Workload Identity User role to that service account, which will allow us to tie together our Kubernetes service account with the IAM service account; specifically, we give it the context of the Kubernetes service account that's going to be the consumer.
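The steps above can be sketched roughly as follows. The project ID, service account name, and namespace are assumptions on my part (the course repo may use different names), but the commands themselves are the standard Helm and gcloud invocations for this pattern:

```shell
# Install External Secrets Operator via Helm
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
  --namespace external-secrets --create-namespace

# Create the IAM service account that will read from Secret Manager
gcloud iam service-accounts create external-secrets

# Grant it read access to secrets in the project
gcloud projects add-iam-policy-binding my-project \
  --member "serviceAccount:external-secrets@my-project.iam.gserviceaccount.com" \
  --role "roles/secretmanager.secretAccessor"

# Allow the Kubernetes service account (namespace/name) to impersonate
# the IAM service account via Workload Identity
gcloud iam service-accounts add-iam-policy-binding \
  external-secrets@my-project.iam.gserviceaccount.com \
  --role "roles/iam.workloadIdentityUser" \
  --member "serviceAccount:my-project.svc.id.goog[external-secrets/external-secrets]"

# Annotate the Kubernetes service account to complete the link
kubectl annotate serviceaccount external-secrets -n external-secrets \
  iam.gke.io/gcp-service-account=external-secrets@my-project.iam.gserviceaccount.com
```

The `--member` string in the Workload Identity binding encodes exactly the "context of the Kubernetes service account" mentioned above: the workload identity pool for the project, plus the consuming namespace and service account name.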
In this case, it's in an External Secrets namespace, and it's named External Secrets. We then need to annotate our Kubernetes service account with the appropriate annotation to enable that connection to take place. If we look here at our service account, the annotation we provided is for the GKE IAM service: specifically, we want to be able to use the IAM service account named External Secrets in this GCP project.

Now, instead of defining this service account YAML in our repo and applying it after the fact, we could have handled it within our Helm installation directly. To do that, we would need a values.yaml file. Looking at the Helm chart, we can see that there's a service account field, and within that, an annotations object, so we can set the service account annotations there, with our annotation as the value. Then within our installation command, we make it a helm upgrade --install command and pass it our values file. Now if we reinstall, that annotation is automatically added to our service account without needing that extra step.

The next piece of the puzzle is to create what is known as a secret store. This is the custom resource within External Secrets Operator that tells the operator where to find the secrets. For this, I'm using the ClusterSecretStore kind; you could also use a SecretStore, which is namespace-scoped. I'm calling this the GCP store, and specifically, I'm using the Google Cloud Platform Secret Manager within my particular project. So if I apply that... it's come up, it's ready, and it has ReadWrite capabilities. Great.

Now we can apply an external secret configuration. This is the same configuration I showed here on the slide, so let's take a look at what's in here. I'm naming it example. Once every hour, I want the operator to check for new values and automatically pull them in.
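A ClusterSecretStore for Google Cloud Secret Manager looks roughly like this. The store name and project ID here are placeholders of my own; the shape (the `gcpsm` provider with a `projectID`) is the standard External Secrets Operator schema:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore          # cluster-scoped; use SecretStore for namespace scope
metadata:
  name: gcp-store
spec:
  provider:
    gcpsm:                        # Google Cloud Secret Manager provider
      projectID: my-project       # the GCP project holding the secrets
```

Note there are no credentials in this manifest at all: because the operator's pod runs as the annotated service account, authentication happens entirely through workload identity.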
There is a way to add an annotation to our external secret to force it to sync sooner, but one hour is fine as a default. I then reference that cluster secret store I just created; next I specify which Kubernetes secret I want External Secrets Operator to populate with this data; and finally, I specify where External Secrets Operator should find the values for this particular secret. So I'm saying: look for the secret in Google Cloud Secret Manager named External Secrets Example, and here is where within the Kubernetes secret it should store those values.

If you recall, back in Secret Manager I've pre-populated this secret. It's called External Secrets Example GCP, and the value is shown here. If we look at the external secret I just created, we can see that it has successfully synced and is ready. That means if we now look at the secrets in this namespace, here is the secret it created based on the target name. If we look at its values under the data key, at the key I specified it should store the value at, and base64-decode it, we get the value that was stored on the Secret Manager side, now within our Kubernetes secret.

And so now we can store this external secret resource, which has no sensitive data, in our Git repo alongside the rest of our configurations, while keeping the source of truth within Google Cloud Secret Manager, which is purpose-built for controlling and managing sensitive information. The team can go into Google Cloud Secret Manager and update and manage secrets accordingly. You can control access to those secrets directly via Google Cloud IAM, but you also get the benefit of automatically syncing them into Kubernetes secrets that you can consume from applications. This is a pattern that I use quite frequently for managing secrets.
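An ExternalSecret matching the description above would look something like this. The target secret name, data key, and remote secret name are my own placeholders rather than the exact ones from the course repo:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: example
  namespace: external-secrets
spec:
  refreshInterval: 1h             # re-check the remote store once every hour
  secretStoreRef:
    kind: ClusterSecretStore
    name: gcp-store               # the store defined earlier
  target:
    name: example-secret          # the Kubernetes Secret ESO will create and keep in sync
  data:
    - secretKey: my-key           # key inside the Kubernetes Secret's data
      remoteRef:
        key: external-secrets-example-gcp   # the secret's name in Secret Manager
```

To verify the sync, you can pull the value back out of the generated Secret and decode it, e.g. `kubectl get secret example-secret -n external-secrets -o jsonpath='{.data.my-key}' | base64 -d`.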
As long as you're on a cloud provider that has a secret manager like this, and AWS, Google, and Azure all do, this can be a great option to streamline secrets management and avoid that manual out-of-band process that you may have started with. So hopefully that gives you an idea of how you can solve some of the developer experience challenges associated with Kubernetes, both the iteration speed, so that process of building, pushing, and deploying new images into your cluster as you iterate, as well as the secrets management problem of how you can efficiently manage secrets while also keeping them secure.