9.2: Trivy Operator

Transcript

The next tool I want to add to my Kubernetes cluster is Trivy Operator. Trivy is an open-source project from a company called Aqua Security, and it is used for scanning container images and cluster configurations. Once we deploy the operator, it automatically detects every container image running in the cluster and scans it when it is detected, and then it rescans every 24 hours or at whatever interval you specify.
This is really useful for two reasons. First, you get automatic visibility into all of your container images, whether they're first-party or third-party. Second, even if you scan images in a CI pipeline, vulnerabilities are often discovered after you have deployed an image, so that initial scan result can become outdated and needs to be superseded by one that uses the latest vulnerability database. The operator also exposes metrics that you can send to Prometheus or Datadog, for example, and use to alert your security team based on certain criteria.
We're going to deploy the Trivy Operator into our cluster. It then watches for any Job, ReplicaSet, DaemonSet, or StatefulSet and extracts the image and tag from it. That triggers Kubernetes Jobs that run both a vulnerability scan and a configuration audit. The results are stored in the cluster as additional custom resources, which we can view as a vulnerability report and a configuration audit report. So let's go ahead and install the Trivy Operator and then see what these look like.
We're going to install the Trivy Operator with Helm: a helm repo add for the Aqua repo, and then a helm upgrade --install for the Trivy Operator. We're using the default values. Again, there are a number of configuration options, but for this use case the defaults work just fine, and we're specifying a version.
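The install steps just described can be sketched as a small shell function. This is a minimal sketch, not the exact commands from the video: the repo alias aqua, the repo URL, the release name trivy-operator, and the trivy-system namespace are assumptions based on the Trivy Operator Helm chart's documented defaults, and the chart version is a required argument because the transcript does not say which version was pinned.

```shell
# Sketch: install Trivy Operator with Helm using the chart's default values.
# Assumptions (not stated in the transcript): repo alias "aqua", the repo URL,
# release name "trivy-operator", and namespace "trivy-system".
install_trivy_operator() {
  # The chart version is required; the transcript pins one but does not name it.
  local chart_version="${1:?usage: install_trivy_operator <chart-version>}"

  helm repo add aqua https://aquasecurity.github.io/helm-charts/ &&
    helm repo update &&
    # "upgrade --install" installs the release if absent, upgrades it otherwise.
    helm upgrade --install trivy-operator aqua/trivy-operator \
      --namespace trivy-system \
      --create-namespace \
      --version "${chart_version}"
}
```

Calling install_trivy_operator with the chart version you want to pin runs the three Helm commands in order and stops at the first failure. Pinning the version keeps the install reproducible instead of tracking whatever the latest chart happens to be.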
If we do k get pods in the trivy-system namespace, you can see the operator pod, which came up 26 seconds ago. Already, in the first 30 seconds of its life, it has spun up a number of vulnerability report scan jobs, one for each image that it detects in the cluster. We can then do k get vulnerabilityreports. None of those scan jobs have completed yet, but as they complete, Trivy writes to these custom resources, where we'll be able to look at the specific vulnerabilities that our images have. As the scan jobs complete, more and more vulnerability reports are added. It started with the kube-system namespace, and now some of the images that we built are included.
You can also use the -o wide flag. The output now includes a summary of how many vulnerabilities Trivy has found at each severity level. For example, in our migrator job, we see that there is one critical vulnerability. What if we wanted to dive in and look at specifically what that was? We can do k get vulnerabilityreport in the demo app namespace, give it the name, add -o yaml, and pipe that to yq, and now we get all of the details about the vulnerabilities that it found. I can search for critical, and we can see that there is a standard library vulnerability in Go 1.22.2, associated with the net/netip package, that was fixed in 1.22.4. To fix this, we should upgrade the version of Go that our base image is using.
We can click the link provided to see the Aqua Security summary of the vulnerability, or we can search for the CVE number and see the official NIST details about the CVE. That page describes what the CVE is, has some additional resources, and again tells us which configurations it has been fixed in.
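The inspection steps above can be sketched as a few shell helpers. This is a hedged sketch rather than the exact commands from the video: k there is an alias for kubectl, the function names are hypothetical, the field paths (.report.vulnerabilities and .severity) assume the VulnerabilityReport layout described in the Trivy Operator documentation, and the yq expression assumes the Go-based yq v4 syntax. The namespace and report name are arguments because the transcript does not spell them out.

```shell
# List the operator pod and the scan jobs it has spawned.
# Assumes the operator runs in the "trivy-system" namespace.
list_scan_pods() {
  kubectl get pods --namespace trivy-system
}

# List reports in every namespace; -o wide adds the per-severity counts.
list_vulnerability_reports() {
  kubectl get vulnerabilityreports --all-namespaces -o wide
}

# Print only the CRITICAL findings from a single report.
# Arguments: <namespace> <report-name> (both hypothetical placeholders).
critical_findings() {
  local namespace="${1:?namespace required}"
  local report="${2:?report name required}"
  kubectl get vulnerabilityreport "${report}" --namespace "${namespace}" -o yaml |
    yq '.report.vulnerabilities[] | select(.severity == "CRITICAL")'
}
```

A typical flow would be to run list_vulnerability_reports, copy the name of a report with a nonzero critical count, and pass its namespace and name to critical_findings. Filtering with yq replaces the manual search for "critical" in the full YAML output.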
With that basic install of the Trivy Operator, we got custom resources that now give us deep insight into vulnerabilities across our workloads.