Transcript
Losio: In this session, we are going to be chatting about the security of Kubernetes and cloud native environments. I would like to say a couple of words about today's topic, security of Kubernetes and cloud native environments. With the advent and rapid evolution of cloud-based solutions, Kubernetes, I think, is the dominant force at the moment in orchestrating containerized applications. When we think about security, it's a key part we need to think about. We are going to discuss what the common mistakes are in securing Kubernetes clusters, and what the best practices are. Also, when you start, what can you do? Where should you start? Are the defaults good enough, or should you change something? Is there any tool that's going to make your life easier? We'll try basically to understand what we can do, what is the next step in our security journey.
Background, & Journey with Kubernetes Security
My name is Renato Losio. I’m Principal Architect at Funambol. I’m an editor here at InfoQ. I’ll start by giving each one of our speakers the chance to introduce themselves, and basically give our audience the idea of where you are in your journey with security and Kubernetes.
Nardiello: My name is Jacopo Nardiello. I’m the founder and CEO of SIGHUP. We are a scaleup based in Milan. We are an Italian company. We work mainly in Europe. We’re focused on Kubernetes adoption within the enterprise, open source Kubernetes. We are focused on the upstream side of the cloud native environment, and how it can be used efficiently within large organizations. That’s mainly what we do. Of course, right now we are in the moment where Kubernetes is reaching Day 2 life. We already have our clusters, now what? Of course, security is a big topic into that overall discussion.
Devochko: My name is Kristina Devochko. I'm based in Oslo, Norway. When it comes to my role, it's always hard to say because I do a lot of different things as part of my work, but I call it software architect. In Norway, we have a term called potato, where you do a lot of different tasks. I work a lot with Kubernetes, both at my work and also as part of my personal community contributions, and especially with managed Kubernetes services like Azure Kubernetes Service provided by Microsoft, for instance. I have always been interested in security, since I started my career in IT as a developer. There was a transition when we started working with Kubernetes as part of my job, and I think it's a huge topic when it comes to the Kubernetes ecosystem. There are a lot of important and also fun topics to touch upon.
Chenetz: I'm Michael Chenetz. I work for Outshift at Cisco. Don't think of Cisco as this networking-only company: I have nothing to do with networking. Really what we do is we look at cloud native technologies. We have cloud native technologies that are around cloud native application protection and things around Istio and Kafka. Really, the reason why I'm really interested in this is because I tend to think about all of the people getting into cloud native, the people that are thinking, I have to take this legacy application and turn it into microservices. I have to think about all of the different elements that are needed in order to get this application up and running. Developers, we know, because I'm one of them, are built for speed, meaning that you have to get this application out the door. What you're not thinking about when you're told that you have to get to microservices is this aspect of security. That's an afterthought. The CNCF is great, but it's also overwhelming. You have 5000 different things on the landscape. You're thinking about, how do I maintain? How do I secure this? There's a lot of stress there. It's a great spot to start talking about why we're here.
Security Pain Points of a Kubernetes Cluster
Losio: I’d like really to start with, what do you consider a pain point today? I have a Kubernetes cluster on the cloud or wherever I want to have it. From a security point of view, what do you see as the main pain point? I think Michael just mentioned the too many options, and how to decide, how to manage? What do you consider a pain point right now?
Nardiello: The whole point is that it's such a complex topic. What do we mean by security? Do we mean threat analysis? Do we mean artifacts management, software bill of materials? Do we mean runtime security, secrets management? How do we run our builds and our pipelines? These are all challenges. This is not even to mention the regulation part, certifications. Now from Europe, we have the CRA, which is the new act from the European Union, basically introducing accountability for whatever weaknesses open source software introduces into your software.
Devochko: This is definitely quite an extensive and complex topic. We hear a lot about the complexity of Kubernetes and the Kubernetes ecosystem. You have that landscape that Michael mentioned. There are some great memes out there about looking at that infographic of the CNCF landscape and being close to getting a heart attack. What are you going to choose? Then, there's also the change in mindset. How are you thinking about containers? How are you thinking about orchestrators like Kubernetes? That's not the same as working, for instance, purely with VMs. Understanding also that layered approach and how the changes you make in your software, for instance in the software development lifecycle, can affect the security posture of, for instance, your Kubernetes clusters where those apps are running. All of those can become pain points, especially in the beginning of your adoption journey. Getting some helpers, some frameworks that could guide you, is also really important, to not make it more complicated than it should be, I think.
Chenetz: There’s a couple things I always think about. First of all, is Kubernetes the right solution? You shouldn’t be thinking about initially your platform, you should be thinking about the needs of the application. What are the needs of the application? Break it down. Do you need microservices? Do you need that at all? If you do, maybe something like Wasm, depending on what your needs are. Figure out your needs first. Do you need redundancy? Do you need a way for your developers to work on things by themselves and then bring it together at the end? You may need microservices. I’m not saying you don’t. What you shouldn’t do is just say, I’m going to pick a platform and design for it. That’s the wrong way, in my mind, of doing this. Once you have decided on that, then you should start to think about, how do I design this in a way that’s going to scale, that’s going to be secure, and that we can manage? Unfortunately, with things like Log4j, we’ve seen that supply chain is an issue. There’s a lot of issues that come into play when you’re using all these. The good thing and the bad thing about developers is that we tend to go for the easiest API that’s going to get something done. We don’t think about the ramifications of that until afterwards. We’re like, “The application’s good. We’ve done it. It’s built. It’s working. I put in this cool thing called Kubernetes. I’m going to bring it to prod.” No, you’re not. Have you thought about all these different ramifications? Really, the developer, we can’t put the onus on them to be the security experts. It’s not the same domain knowledge.
Starting Out with Kubernetes
Losio: I'm a developer. I join this roundtable. It's all very cool, but I'm building my application. Yes, I might have to use Kubernetes. Is it even my problem? As a developer, as an architect, where is the barrier? I know the entire concept of shift left. With all these things, all these concerns, and as Jacopo mentioned before, security means many different things. What should I care about? Can I even just put my head in the sand and not care about anything, since security is not my area? I know that's incorrect. What do developers see as the goal? I know there are 10,000 different ramifications, but if I start today with Kubernetes, what should I really do as a developer?
Nardiello: As a developer, the bare minimum is to have a little bit of ownership of the Kubernetes primitives when it comes to security. Have an idea of what RBAC is. Have an idea of how to manage your policies. What does it mean to use Kubernetes in a secure way? Know your tool, if you want to be aware of Kubernetes. Sometimes you just don't want to care about it. Sometimes you just run it and don't want to know how it's run. These are all topics for someone else. Still, I think that as a developer, you need to be aware of these basic things. Then the other key thing, which is not really Kubernetes related, is be aware of your dependencies. That's a big thing. Your operating system is a dependency, and all the libraries within it are a dependency. Especially when it comes to containers, where, yes, we have the kernel running stuff, and then you as a developer pick what base image you are going to use. You need to be aware of the choices that you're making. Yes, dependencies and basic stuff like policy management and RBAC are the things any developer should know. They don't need to be an expert in other things, which are more specialized, but at least this stuff, they should take it seriously.
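To make that concrete, here is a minimal sketch of a least-privilege RBAC setup: a namespaced Role that only allows reading pods and their logs, bound to a single service account. The namespace, role, and service account names are purely illustrative.

```yaml
# Hypothetical example: read-only access to pods and logs in one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: team-app
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: team-app
subjects:
  - kind: ServiceAccount
    name: ci-deployer          # illustrative service account
    namespace: team-app
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Anything not covered by a rule is denied, so widening access becomes a deliberate, reviewable change rather than the default.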
Devochko: When it comes to developers, it really depends on how the organization is set up, like how your teams are operating. I have been in projects where developers were directly involved with, for instance, Kubernetes, because, for example, they didn't have a dedicated platform team, due to different constraints. You have teams that are more mature, that have a dedicated team maintaining and administering those Kubernetes clusters. I think the bottom line there is that for you, as a developer, it's important to understand the fundamentals of Kubernetes, what it is, what containerization and orchestration technologies are. I think it's also important to be aware of the policies and the gates that will be set in case you have others ensuring the security and reliability, for instance, of the workloads and of the Kubernetes cluster. In case you introduce a misconfiguration and get stopped by those automated gates, you need to be prepared for this: you will need to check it out, follow up, and take responsibility for the service you're developing. Many teams do have responsibility for building those containerized applications, so they have responsibility for ensuring that the container image is hardened and secured, that you're using a slim base image, and all that stuff. Understanding how to do that, and having the dialogue with the Kubernetes cluster administrators, if you have those, is important to ensure that they can help reduce that cognitive load on you.
Since Michael mentioned Log4j, I also want to mention one thing, which is something I really always want people to remember when you start working with the supply chain: don't use the wildcard, don't use the latest version. Because, let's say in Kubernetes, you have different clusters, but you have a single source code base where you define the latest version, and those clusters are being deployed and provisioned at different points in time. Then you may end up using different versions of the same library if you're using latest, because a new release could have come out. With Log4j, I've seen that pain, trying to get an overview of which version of that library I actually have in which cluster. That is not easy if you have latest as part of your source code definition. Understanding that and using a specific version is also an important step, a quite easy one to begin with.
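As a small illustration of that point, with a hypothetical image and registry, the difference is simply pinning the tag (or, even better, the digest) in the Deployment instead of relying on the mutable latest tag:

```yaml
# Hypothetical Deployment fragment: pin an explicit version instead of "latest".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders-api
  template:
    metadata:
      labels:
        app: orders-api
    spec:
      containers:
        - name: orders-api
          # Avoid: registry.example.com/orders-api:latest
          image: registry.example.com/orders-api:1.4.2
          # Or pin by digest for full immutability:
          # image: registry.example.com/orders-api@sha256:<digest>
```

With an explicit version in source control, every cluster provisioned from that code base runs the same bits, and answering "which version is running where?" becomes a grep rather than an investigation.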
Chenetz: Just going back to whose responsibility is it, though? I had Kelsey Hightower on my show, and we were chatting about this. He's like, you wouldn't take a plumber and tell them to do electrical work. I love the way he always phrases stuff, because it makes it so much easier to think about things. Really, the key is you have to think of the main objective. The main objective of designing these applications for a business is profit. You want to make sure that your development is not coming off the track. Yes, the developers need to do secure coding. Yes, they need to do stuff that's in their domain, but, really, we want them to still be doing development. That means that we have to start to enable other processes, other tools, other things, to have a framework and to keep them on track and make sure it's checking for things like supply chain issues, CVEs, and things like that.
K8s Specific Vulnerabilities
Losio: I fully get the part about the Log4j example. As a developer, I feel like that applies as well if I don't use Kubernetes. I understand that there's nothing really specific to Kubernetes there. I know that the distributed architecture of Kubernetes can make it worse. What I'm thinking is, is there anything specific to Kubernetes, for example vulnerabilities, that we as cloud specialists should think about?
Chenetz: Actually, the Kubernetes webpage has a really good overview of cloud native security. They talk about the 4C’s of cloud native security, which is code, container, cluster, and cloud, or colo, or corporate data center. They wanted to keep with the C, obviously, there. They have a really good page that talks about each of the different configuration options and things that you should consider in cloud native security. Really, it comes down to, and this is what I was thinking about, is that the containers within Kubernetes and parts of Kubernetes are designed to be open by default. That’s because they want people to be able to run their applications. People would complain and say, our application is not working, and there would be a lot more tickets, there’d be a lot more problems and it wouldn’t be adopted as well. It’s, by default, very open and there are things that you need to do to lock it down. That’s the key there.
How to Secure the Kubernetes Cluster
Losio: That’s really the key. As a person who is starting to play with Kubernetes, we are saying, ok, so the default is open. What is my first step in the direction to secure my Kubernetes cluster?
Nardiello: There are so many aspects that you need to take care of whenever you run Kubernetes, and whenever you run stuff on top of your Kubernetes cluster. It will be easy for us to say just use a managed Kubernetes, that’s going to take away some of the aspects. Running clusters is one of the aspects. There are so many other things that you need to take care of.
Advantages of Managed Kubernetes vs. Other Alternatives
Losio: You mentioned something actually interesting in terms of managed Kubernetes. I think the words managed Kubernetes already came up a couple of times in terms of cloud offerings. Really thinking about security, what are the advantages you see in adopting a managed solution, AWS, Azure, or Google Cloud, versus basically doing it yourself?
Nardiello: I'm an expert on this topic because we maintain our own distro. Running Kubernetes on-prem is a big topic for us. The key difference is that whenever you have managed Kubernetes, besides the complexity of running Kubernetes itself, you get some pre-configuration and some enhanced features that are abstracted away by the provider. That's definitely the key benefit. Still, you need to be aware of the stuff that you do. As an example, by default, you might have a public control plane. You need to be aware that that shouldn't be the case if you're running production workloads. The fact that they are abstracting away some of the complexity, making it more accessible, is truly a benefit. On the other hand, it does not really take away the fact that you should be aware of what you do. Of course, if you run stuff by yourself, on your own, on whatever infrastructure, it could also be EC2 instances, so just VMs, or VMs on your provider, you need to take care of everything. Nothing comes for free. It's a tradeoff. There's no simple answer to that.
Chenetz: There's two different things. There's the actual Kubernetes infrastructure that you have to maintain, if you're going to do that yourself. Obviously, there is that Kubernetes page, which we can link and maybe show people at some point. The great thing about using a managed service versus that is that there are preset templates, preset security, preset ways for you to handle those things. When you're doing it yourself, then you have to think about certain things. You have to think about the infrastructure layer. You have to be a domain semi-expert in multiple things. One thing is infrastructure, which is like network access, things like that, access to etcd. You have to be an expert in containers, so things like container vulnerabilities, image signing, using container runtimes, all those kinds of things. Also, the workload security. You're getting down to these different levels that you need to be experts in. If you're doing it yourself, you have to have a lot more of that knowledge. If you're doing it managed, they have that knowledge for you. It's a lot easier. You don't have that barrier to entry to go into the cloud provider versus your on-prem.
Devochko: I'm coming also from the consumer perspective. You do not only need to maintain multiple things like the control plane, which is the operations center, the crucial component of Kubernetes. You need to ensure that it is redundant, available, and secure, because everything in terms of your workloads depends on it. I would think that you would also need multiple resources to be able to handle all of these components, because you don't want to be the only one handling that. Also, it requires much more from you to set up what does not come out of the box, which means that you would need more people to ensure that it's production ready. In terms of managed Kubernetes services, cloud providers do invest quite a lot into the security aspect of things. To quite a high extent you get some security controls from them out of the box, for instance, in terms of the infrastructure, the servers that those Kubernetes nodes are running on, and the control plane as well. Which can help, especially if you don't have that hardcore production experience with running bare-metal Kubernetes yourself.
You mentioned earlier, Renato, a new developer or an architect who would like to understand the security aspects of Kubernetes. I think it's important to mention that if you are going for a managed Kubernetes service on cloud, you should start from the security benchmark that the cloud provider has for their managed Kubernetes service. Because going through that information will give you an understanding of how the clusters are configured by default when you start them up. For example, if we think about Azure Kubernetes Service, it has less secure configurations enabled by default, because the strategy overall is to make it easier and faster for consumers to start playing around with it. For instance, the API server is enabled and publicly available by default. You would know that by checking that documentation, where it says step by step, for every layer, what comes by default, what you would need to do to secure and harden it, and what limitations it has. If you go for a private cluster, which is only available privately through private IPs and all that, then it does have some limitations in terms of build agents and how you're having all the CI/CD flow in place. I would not underestimate the security benchmarks as the foundational place to start getting an understanding of the Kubernetes offering you're opting in for.
Chenetz: One other aspect of this that you have to consider is the human aspect too. Because we’re in a time where there’s a lot of shifting employment and things like that, and the person that you have that’s going to manage this internally might be way different, that might not be the same person that’s going to manage it in the future. If you do it in the cloud versus doing it yourself, you might have a better chance of continuing the management of that. That’s one aspect of it. The other thing is that there’s different skill sets. There’s different skill sets to manage the cloud providers’ ecosystem versus managing it on-prem. That could lead to more security drift, because you have to know a little bit different domain knowledge in order to do one versus the other.
Handling Kubernetes Plugins/Operators
Losio: How do you handle Kubernetes plugins in terms of security? Is that as scary as it sounds, as scary as you read about?
Nardiello: What do you mean by Kubernetes plugins? Are we talking about operators?
Losio: I was thinking maybe operators.
Nardiello: They are pieces of software that use the event-based system internal to Kubernetes to handle the state of pieces of infrastructure. They are very complex. They can contain arbitrarily complex logic within themselves. Is it as scary as it sounds? I don't think it's as scary as it sounds, in terms of Kubernetes providing you a common framework and common layer of logic that you can rely on. Again, you can embed whatever you want within them. Yes, they are just another piece of software that you are running within Kubernetes. You need to be aware of that. In general, I agree that they are complex. You need to be aware of what you're doing. A few years ago, we had this operator craziness, because operators were the cool thing to do, and they had just gotten started. I think that right now, they're getting a little bit more mature. In general, they are a super powerful pattern. They allow you to use the control feedback loop of Kubernetes within your other pieces of infrastructure. Again, they might introduce some complex logic. Yes, it's better if they are supported by someone if you have to run them in prod, or if you have a sufficient level of skill to handle them.
Chenetz: I just go back to one thing, and that is managing your supply chain, because, really, what it comes down to is that every piece of software that you’re running, whether it be in a container or reaching out to an API, you have to think about, what is that built of? If you’re running more operators, you’re running more things within your system. That should be considered as part of that, because of the fact that it’s additional things to manage. It’s additional things that have different images and different components in there. When you have that, then you’re increasing that attack surface, not always, but potentially increasing that. You have to make sure that you really need that operator. What we mentioned before is that people went operator crazy. We went from Helm, then we went to operators, and everything was operators. Honestly, not everything needs to be an operator. Really think about what needs to be in there. What function does it have? Do you need to maintain a lifecycle of something? Then you may need to use an operator. If there’s a simpler way of doing it, where you don’t need to maintain a lifecycle, then you probably don’t need an operator. It comes down to that.
Best Practices for Securing a Kubernetes Cluster
Losio: I'd like to go back to the beginning, where we were saying that there are common mistakes in securing a Kubernetes cluster. I would like to start giving some advice and action items: what do you see as the best practices for, for example, least privilege, secrets management, or whatever you think is important in terms of securing your cluster?
Chenetz: I think the key here is, don't allow privileged access if you don't need it. That's first and foremost. A lot of people, by default, just allow that, and that's going to open up doors to everything. Role Based Access Control, make sure you use that. A lot of people run things as the default Kubernetes user and administrator, and that shouldn't be the way that you run things, but it is the way that people come out of the door. A lot of these simple things. Those are the two biggest things that I think, right out of the door, you should take a look at. If we had to take two things away, those are the two things that I think are key. We could start to talk about some of the smaller things too. Yes, network policy between all the microservices, make sure you're only communicating what you need to communicate. Outside world, what are you using as a load balancer? What are you using for your inbound network? Are you using a service mesh? If so, how is that configured? There's a lot of things to consider. Really, it comes down to least privilege. You want to make sure that you provide the least amount of privileges needed to run your application. If you're doing that, you have a good start.
Devochko: Michael mentioned quite a few great points, like the least privilege principle, which in this case is the common thread. Of course, there's what we mentioned about the API server: you don't need to have that broad, unnecessary threat vector in place, even though it just returns, for instance, an unauthenticated 401 error code. It still gives an attacker some information that they can work further with. Think about what you can do with this. Can you introduce IP ranges to begin with, or could you use a private cluster in itself? Here, I think it's important also to take it step by step. Taking one step at a time, implementing it, seeing how it works and if anything breaks. You need to follow the top priorities and not try to implement everything at the same time. In terms of RBAC, what is good with managed Kubernetes services is that you can use the RBAC solution provided by the cloud provider. In terms of Microsoft Azure, you could integrate it with Azure AD. What is good about this is that you could limit the roles and the permissions every user has in the cluster to the bare minimum that they need to do their job, once again, least privilege. You could implement just-in-time access for those who only occasionally need to go in and do something or check something. This is a good advantage of the managed Kubernetes services.
Network policies, obviously, are important, because namespaces do not provide isolation by default. This is also a misconception that has been there for a while. By default, workloads in all namespaces in the cluster can communicate freely with each other. You would need to restrict that with the help of network policies, or a service mesh that can provide these capabilities in some cases. Once again, all those security frameworks that are out there should be used as a checklist for you, as a cheat sheet. Even the OWASP Top 10 for Kubernetes is a great start. It gives you a checklist: exactly, check this, and this is the way you can fix it, for example by using a security context on the pods and containers to ensure that they are not running as root. Here are some of the examples and points to think about. I see that I can digress a lot if I start going step by step, but these are the first things to check. Maybe to make it easier, you could start by implementing some tool that can actually flag these things for you and make it easier to get an overview of what you need to get in place.
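A minimal sketch of what that restriction can look like, with a hypothetical namespace and app labels: a default-deny policy for the namespace, followed by an explicit allow rule for the one flow that is actually needed. Note that NetworkPolicy only takes effect if the cluster's CNI plugin enforces it.

```yaml
# Hypothetical default-deny: selects every pod in the namespace and allows
# no ingress or egress until more specific policies are added.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-app
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Hypothetical allow rule: only frontend pods may reach orders-api on port 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-orders
  namespace: team-app
spec:
  podSelector:
    matchLabels:
      app: orders-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```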
Nardiello: Another great framework that I feel like mentioning is NIST. NIST provides great guidelines on how to secure containerized workloads. They are pretty broad. Kubernetes is one of the key aspects, but it's not the only one. NIST provides a great, even introductory, document. It's a great place to start. Michael and Kristina gave a great overview of some of the key important things that you need to take care of. I may add some others or elaborate a little bit. Someone was mentioning pod security policies or OPA. Open Policy Agent is a key element that I would definitely integrate into all production clusters. There are alternatives like Kyverno. Just define your policies as code, and define what it means for your organization. What is acceptable to your organization? Like Michael was mentioning, avoid the whole privilege escalation path, or avoid running containers as root, these kinds of things. These can all become policy as code that you can actually embed within your cluster. The clusters will become aware of what it means to be secure, and what a secure workload is.
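As a sketch of what such a policy can look like with Kyverno (the field names may differ slightly between Kyverno versions, and Open Policy Agent/Gatekeeper expresses the same intent in Rego), here is a cluster-wide rule rejecting pods that allow privilege escalation:

```yaml
# Hypothetical Kyverno policy: reject any pod whose containers allow
# privilege escalation.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privilege-escalation
spec:
  validationFailureAction: Enforce   # start with "Audit" to see what would be blocked
  rules:
    - name: deny-privilege-escalation
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privilege escalation is not allowed."
        pattern:
          spec:
            containers:
              - securityContext:
                  allowPrivilegeEscalation: false
```

Running such policies in audit mode first, then switching to enforce, is a low-risk way to roll them into an existing cluster.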
Another thing I want to mention is, sign your containers. There is the whole Sigstore initiative; implement Cosign within your clusters. Make sure that you run signed workloads and not just random stuff that you find on the internet. Make sure that what you run is what you build. It's not to be taken for granted. Kristina was mentioning service meshes, definitely. They will give you basic stuff, like mutual TLS and basic observability within your clusters. Invest in that. These are all expensive topics, meaning that they still require you to put effort into them. These are all very actionable things and very practical projects that you should definitely onboard within your clusters. That's not even to mention the whole ingress topic: how you secure your ingresses or how you expose your services. That's another key topic. It's related to service meshes, but not only.
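Both of those points can be expressed as cluster configuration. As a rough sketch, assuming Kyverno is used to verify Cosign signatures (the registry, public key, and exact schema are placeholders that vary by version) and Istio is the service mesh:

```yaml
# Hypothetical Kyverno rule: only admit images from our registry that carry a
# valid Cosign signature for the given public key.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "registry.example.com/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <cosign public key goes here>
                      -----END PUBLIC KEY-----
---
# Hypothetical Istio policy: require mutual TLS for all workloads in the mesh.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```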
Securing Container Images
Losio: You mentioned something that I found interesting was the topic of securing container images. What am I supposed to do as a developer?
Nardiello: This is a topic which is very close to my heart, meaning that it's the typical topic which is ignored 90% of the time. Whoever is using Kubernetes clusters, they're just, yes, I will just take whatever from Docker Hub and run from there. It's such a core thing. Make sure to use sane, good base images. You can invest as much as you want into policy as code, into service meshes, into operators, whatever. If you're using an image which has bitcoind running within it, it's a problem. Right now, I think we are also at a turning point when it comes to secure base images, meaning that there is a lot happening with Sigstore and Cosign. Chainguard is doing amazing work on that with Wolfi and distroless images. We have our own secure containers catalog from SIGHUP. The whole point is, make sure to use images which are sane. By sane I mean maintained, somehow, because sometimes they are just not maintained. That's the very first thing. Second thing is, they run with good defaults, so rootless, mostly. Then, with good best practices. What these best practices are depends on the workload that you're running.
Devochko: I can maybe chime in here on what Jacopo mentioned regarding trusted base images. I think it's also important to mention opting, if you can, for the slimmest, most minimal base image that is out there, because many providers of those base images have different options that you can choose from. Be aware of what image you go for; don't just choose the one that is there by default, but evaluate: can I use the minimal image? Does it have enough for me to run my application on? It has multiple benefits in terms of security: you don't run what you don't need as part of that image, so the attack surface is reduced. Also, in terms of resource requirements, it will be more lightweight when there is less packaged into that image. This is also a benefit. Also, looking into trusted registries, thinking about what registries you're pulling the images from, will also be a benefit in terms of security, of course.
Chenetz: This is all about repeatability. You don't want to do any of this stuff manually. You want to make sure that you set up a pipeline that checks for these things. You're creating this application, it has all these microservices, it's going to pull down Docker images. You want tools that can check those images and the layers in those images for things like CVEs and other things like that. You also want to check the software supply chain. You want to check SBOMs and all these other things. You want to make sure that everything is good before it goes into production. You don't put something into production until it's verified as known good. It has to be repeatable. None of this stuff is manual. We have tools. This is where you wouldn't think Cisco plays. We have open source tools that will help with API security, with Kubernetes security, with data security, all these other things. We also have a full cloud native application protection suite where you could just say, let's check the policies. Let's check the Kubernetes policy. Let's check the APIs for vulnerabilities. Let's check all these things. None of this is manual. You can actually automate a lot of this stuff. Find the tool that works for you. Make sure to implement and automate all of this stuff, all of these best practices, all these things, and leave the code for the coders. Make sure that you trust that automation that's going to check for all these various things.
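As one possible sketch of such a pipeline step, assuming GitHub Actions and the aquasecurity/trivy-action (the image name and registry are illustrative; other CI systems and scanners follow the same pattern):

```yaml
# Hypothetical CI job: build the image, scan it, and fail the build on
# high/critical findings before anything moves toward production.
name: build-and-scan
on: [push]
jobs:
  build-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t registry.example.com/orders-api:${{ github.sha }} .
      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: registry.example.com/orders-api:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: "1"        # non-zero exit fails the pipeline on findings
```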
Devochko: I think it's important just to also mention one thing in terms of building container images and in terms of using the root user. I think it's important also to have in the back of your head that there are still many base images that come with a root user that runs that container by default. There are providers of those images that can offer an unprivileged version. For instance, NGINX has two different offerings, where they have one that is NGINX unprivileged, and one which is regular. That one will run with the root user by default. This also ties back to the pods that are being exposed and which user is allowed to run them. Be aware of that, and look into whether you can use an unprivileged base image offering, or in some cases, you may need to do something when you build your image. In some cases, you can override it. You can do some workarounds, like with .NET images, where you could still override it and use a different user to run your containers. Just knowing about that is important. Also, Open Policy Agent and the policies you can enforce will also help you flag the containers that are running with a root user.
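Whatever the base image defaults to, the pod spec can force the issue. Here is a minimal sketch, with hypothetical names and a UID chosen purely for illustration:

```yaml
# Hypothetical pod: refuse to run as root and drop privileges regardless of
# what the base image was built with.
apiVersion: v1
kind: Pod
metadata:
  name: orders-api
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    runAsGroup: 10001
  containers:
    - name: orders-api
      image: registry.example.com/orders-api:1.4.2
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
```

If the image genuinely needs root, or writes to its root filesystem, the container will fail to start, which is exactly the kind of early, visible feedback you want rather than a silently privileged workload.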
Chenetz: This isn’t a one and done, for base images. Images are constantly changing. There are tools like Slim.AI, and some of these other places that constantly monitor the images and check for those vulnerabilities. It is one of the most important things that you could do. Definitely be consistent with these things.
Stripped Down Distributions
Losio: I think Kristina before mentioned the topic of distributions that are basically stripped down to the basics to run as a container. Having fewer packages is not just faster, but more secure as well. Of course, in that sense, cloud providers go in that direction. I was wondering if there's any recommendation, because if I'm on a cloud, I tend to stick to the distribution of the specific cloud provider, whatever it is, Amazon Linux 2, or whatever else I'm using. I was wondering, from your side, when I'm running my own cluster, what do you recommend? Do you have any specific recommendation for a specific use case?
Chenetz: There are so many, depending on the provider that you have. We mentioned things like Wolfi and things like that, and there’s Slim.AI. There’s a bunch of different ways to get that great base image. The key is to keep it slim to make sure that you understand what’s in it. Don’t just pull something down. There are so many that you can go for. I’m hesitant to recommend one because it’s still a space that’s emerging.
Container Runtime Execution Rules
Losio: It would be good if the Kubelet could stop the container runtime if it doesn't pass a security health check according to a predefined rules profile.
Nardiello: The Kubelet is just a process, and the container runtime as well. It's unlikely that what he suggests could happen, at least not that I know of. With policy management, OPA and such, you can establish what the rules are for the container runtime to actually execute a container, which is what Omar means and what he wants to achieve. If you have a compromised CRI container runtime, or if you have a compromised Kubelet, or whatever other component, it means that your host machine, the node, is compromised. That's one level of security. Then you have what happens within the containers, or escalations from containers to node. For what it means to have security issues within the container, OPA policy management is your answer, mostly. You can define rules by which the CRI is going to execute, or the orchestrator is going to schedule, your containers. If we're talking about node level runtime security, I think it's mandatory to mention projects like Falco, for example, or eBPF based security tools. There are a bunch of them. Falco is an open source CNCF upstream project that you can run within your nodes, which is going to perform runtime security checks. There are two things that I feel are unrelated to Omar's question, which I think we answered, but that we did not mention and which are also very important. The first one is dynamic secrets management. That's something super relevant. You can bake passwords into your containers; you shouldn't do that. They need to be injected dynamically. How do you do that? Again, a bunch of tools, more or less enterprise, more or less upstream, more or less open source. That's the very first thing.
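Falco's rules are themselves written in YAML. As a rough, Falco-style sketch (the upstream default rules use predefined macros, so the exact condition differs), a rule flagging an interactive shell spawned inside a container could look like this:

```yaml
# Hypothetical Falco-style rule: alert when a shell process starts in a container.
- rule: Shell spawned in container
  desc: Detect an interactive shell started inside a running container
  condition: evt.type = execve and container.id != host and proc.name in (bash, sh, zsh)
  output: "Shell in container (user=%user.name container=%container.name image=%container.image.repository)"
  priority: WARNING
  tags: [container, shell, runtime]
```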
Open Source Kubernetes Security Tools
Losio: What are open source tools or tools in general, that can help me that you would recommend? I know it’s a wide spectrum. If you had to suggest one tool, which one would you recommend to a developer?
Devochko: I would like to mention the Trivy tool from Aqua Security. I have used it for a while now. I think it's quite good. It can cover a lot, like checking container images and also the source code for misconfigurations. You have a bunch of different functionality that you can configure in the tool, both as part of the CI/CD pipeline and also in the clusters themselves. I also want to mention Kubescape, because it's a relatively new tool. I really like how easy it is to set up and how lightweight it is to just scan against specific frameworks. I think it scans against the NIST framework; it at least scans against the NSA hardening guide for Kubernetes, which is also a very good framework to use. What I really loved about it is that it can actually visualize the RBAC configurations in your cluster, which I think is pretty cool. You could see the whole connections between the different service accounts, role bindings, and cluster role bindings, so that you could see who has permission to do what. I think there are some limitations in terms of how many nodes you could use in terms of utilizing that functionality. You would need to check that. For enterprise production settings, I think it may require a license. Just to start playing around with it and checking it out, I think you get quite a few nodes for free. That would be my choice.
Secrets Management
Losio: Why should we consider changing the secret management from the cloud provider to something like Vault or other ways to store secrets? The default seems good enough.
Chenetz: I think that's a matter of preference. People tend to go towards what's easiest for them at that time. I think that's perfectly fine. If you're on a cloud provider, it's always easier just to use the cloud provider. If your needs change, and you need to do something that's maybe hybrid, or on-prem, you may need to use a secrets solution, something like Vault. Vault is probably the most common. There are operators. There's the External Secrets Operator, which manages some of that for you, still using things like Vault. It really depends on your needs. It's not one-size-fits-all. If you're going to go with the cloud provider, obviously, that's the easiest way to do it. If you don't have a need to do anything else, don't do it. If you do ever need to do something else, then do it. It's a really easy answer. That's really what it comes down to. There's not a silver bullet for it. It depends on how locked in you want to be towards a vendor, and how much you need to have that openness to do other things.
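Assuming the operator referred to here is the External Secrets Operator, a rough sketch of the pattern looks like this, with illustrative names and a placeholder Vault path: a custom resource describes where the secret lives, and the operator keeps a native Kubernetes Secret in sync with it.

```yaml
# Hypothetical ExternalSecret: sync a credential from Vault into a Kubernetes
# Secret instead of hardcoding it in manifests or images.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: team-app
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend         # a SecretStore/ClusterSecretStore configured for Vault
    kind: ClusterSecretStore
  target:
    name: db-credentials        # the Kubernetes Secret the operator creates/updates
  data:
    - secretKey: password
      remoteRef:
        key: app/db             # placeholder path in Vault
        property: password
```

Swapping the backing store, whether a cloud provider secret manager, Vault, or something else, then becomes a change to the SecretStore rather than to every workload.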
Security Tooling for Kubernetes
Losio: Do you have any other tool that you’d like to recommend for security?
Nardiello: There are so many. Kubernetes by itself would schedule stuff on your nodes, so everything else pretty much is a tool. I actually wanted to connect to what Michael was saying regarding secrets management and stuff. I totally agree with him. It depends where you’re running your stuff, and what is your specific use case. Are you a big enterprise with an HSM? Probably, then you need a more enterprise solution like Vault, or CyberArk, or whatever else. Are you just on AWS, or any other provider? Just use whatever the provider is giving you.
Actions Items to Improve Security of a K8s Deployment
Losio: Can you suggest one action item, something that I can do as a developer to improve the security of my Kubernetes deployment tomorrow? It can be reading an article. It can be watching something. It can be doing something, a checklist, any one concrete action item you suggest.
Devochko: Since I prefer learning by doing, if you're using a managed Kubernetes service, since it's so easy, I would suggest just enabling the policy management solution, just to start getting that overview of what you have configured. If not, a more theoretical way to start with, I think, is just checking either the NIST framework or, I would really recommend, OWASP, because it's really straight to the point in terms of concrete actions.
Chenetz: Everything we said, but also there’s a learning aspect of this. Make sure you understand what you’re getting into. Make sure you understand what you have, and make sure you can visualize it. Use tools. There’s a tool called Monokle, which does baseline security and a bunch of other things. It’s a newer tool. It’s like your IDE for everything that is Kubernetes. It also provides scanning tools for those things. It’s open source. I love it. Just know what you have. Go into it learning.