Article originally posted on InfoQ.
Transcript
Brunton-Spall: I’m Michael Brunton-Spall. This talk is going to be about proactive defense in the age of advanced threats, which is, essentially, I am going to talk through a whole bunch of really interesting, juicy security stories that you probably read about, in a whole bunch of detail. I need to get through some of the things that I’m contractually required to do.
First of all, I wrote a book with Laura and Rich and Jim a while ago, “Agile Application Security”. I write a weekly newsletter in an entirely personal context, not the thoughts, responsibilities, opinions of government in any way, shape, or form. It just summarizes the news I’ve read that week. I am told that the second most terrifying words in the English language are, “I’m from the government, and I’m here to help”. Ronald Reagan apparently said that. What are the most terrifying words in the English language? Actually, the most terrifying words in the English language are, “Hi, I’m from security, and I’m here to help”.
I’m from government security, and I’m here to help. I work in the UK government, in the cabinet office, right in the heart, defending how government does its things, its systems, and its things online. My team in particular does horizon scanning. It thinks about what happens in the next 2 to 5 to 10-year period in cybersecurity. It does cyber threat. It tries to understand what’s going on in all the bad places. It does cyber policy, how do we tell government departments how to defend themselves effectively? You might think that sounds incredibly exciting and cool, which it is. It is really cool. Most of what we do is read stuff online and tell people about it, because most of you don’t have time to do that.
The three things I’m going to cover on this talk are, what are advanced persistent attackers doing today? Why should you care in any way, shape, or form? What can you do now that will help protect you against those kinds of things? I think what they do is really interesting and really exciting, but you need to be able to take something away. You need to be able to not just go away going, I heard some interesting stories. That’s fabulous. You need to actually feel like you’ve got something you can focus on. I’m going to try and give you five takeaways, five things you can do in your organizations today that will protect you tomorrow, and hopefully you’ll have a bit of an understanding of why.
Before that, I want to ask, how many people here are from security? How many people here are in engineering leadership, in some way, shape, or form? How many of you are software developers, or in actual hands writing code? I’m going to do you a quick thing on why security matters, and that’s because in most organizations, security is the thing the CISO does, and in engineering, we ignore it as much as possible, because it gets in the way. I’m going to try and explain to you why this matters, and then pick some fun stories.
Why Security Matters
Why does it matter? I’m going to wind back to when I was a wee lad in 2006, and this is from Information is Beautiful, a guy called David McCandless who does amazing visualizations of data. In this case, this is data breaches over the years. In 2006 you can see some of the early data breaches that we had up there on screen. AOL being one of the first big recorded ones. What happened was one of the systems administrators took down the names and email addresses of everybody on AOL back in 2005 and sold it to spammers so that they could send emails to everybody saying, “Please buy Viagra pills online”. Fabulous breach of everybody’s personal data. Nobody liked it. We’ve got some early ones here.
The UK Revenue and Customs, UK Government put two CDs in the post, and they didn’t turn up at the other end. That’s considered a data breach, because the CDs contained unencrypted data of everybody who was claiming child benefits, which turns out a lot of young people. A not very happy situation for the UK government. This is where cyber policy started for us, in a lot of places. TK/TJ Maxx, who lost a whole bunch of cardholder data.
It’s 2006, the internet’s young and exciting. Memes are only just a thing. We move on to 2012, things aren’t getting much better. In fact, actually, what we’re seeing is more data being breached. We have Yahoo winning the prize there for the first loss of a billion records in 2012. We’ve got Court Ventures. We’ve got the Massive American business hack, which is one of my favorites of all time, because I think the company is called Massive American. It sounds like a giant thing. We move on to 2017, it’s getting worse. Actually, we are getting better at security. We’re getting better at technology, but the amount of breaches that we’re having is going up, and the amount of data in them is going up.
You’ve got Yahoo here going for another big loss. You’ve got Aadhaar, 1.1 billion people from the Indian biometric database. That is Indian state holding the fingerprints and iris recognition information for a whole bunch of people in the Indian state. Marriott International lost records of people who stayed in their hotels around the world. All public data breaches and all showing that we are getting increasingly bad at security to a certain degree. Or, you could take it another way, we’re getting better at being public about reporting the breaches, which is a different way of saying the same thing. There is a lot of this stuff going on, and there are stories in all of this as well.
This was the reveal of an APT malware deployed into telecoms systems, sitting underneath the network, picking up individual SMS messages when those individuals were targeted, monitoring the phone calls and SMS records, and shipping them off elsewhere. Georgia was attacked in a massive cyber-attack. The Indian nuclear power plant’s network was hacked, with officials confirming that there was malware on the system. It obviously wasn’t serious, is what they said, which is a lovely thing to hear when your nuclear power plant has been hacked: it’s ok, it is malware, but it’s not serious. Former Twitter employees were charged with spying for Saudi Arabia.
They were taking money on the inside to look for accounts of the kingdom’s critics. Sweden handed over the keys to the kingdom in an IT security slip-up. It turned out they’d handed them over to one of their subcontractors, who was based not in Sweden, but in fact, in a nearby country that they didn’t think should have the keys to the kingdom, full of their accounts, for some reason. You can’t possibly imagine why. U.S. Customs and Border Protection said photos of travelers were taken in a data breach. A company called Perceptics had managed to get ransomwared. One of their developer’s laptops was imaged, put up on the dark web, and on it you could see the face photos of everybody who went through the border, that was being used to train the system.
You could also see the mp3 files they had on their desktop of what they listened to on a regular basis while coding this facial recognition system, which is a fascinating insight into the systems that protect our nation. British Airways suffered a compromise, 22 lines of JavaScript added to their payment processing portal, which took a copy of every credit card number being passed through. Wasn’t actually on the system for very long, but it did manage to steal thousands of card details from thousands of victims, and they had to say, how did this happen? It turned out, the JavaScript was hosted by a subprocessor somewhere. That one bit got compromised and it was pulled into the PCI environment. Really bad breach. Lots of things learned from that.
We’ve got APT33 targeting industrial control system software jumping around in operational technology. We have Log4Shell. Some of us lost quite a lot of Christmas to the realization that write once, run anywhere for Java, turns out to mean run Log4j anywhere and everywhere, including in our cases, our phone systems, our printers, our scanners. If it had a microchip, it was probably running both Java and a version of Log4j which was almost certainly vulnerable to JNDI exploit. That was a bad couple of weeks for quite a few of us trying to work out, how do you fix this? It turns out, our software is everywhere.
AttachMe, this was a vulnerability in Oracle Cloud Infrastructure that allowed people to map the virtual disks you had in your Oracle Cloud over to another running instance in a different tenancy, which is generally bad. You don’t really want another company to be able to suddenly attach your disks to theirs. That was slightly embarrassing for them. Then there was an outstanding bit of work by cybersecurity researchers in abusing company CI systems via webhooks. They were able to use webhooks to take over the rest of those pieces. Then Uber, whose contractor had their account compromised, and it was compromised because the attacker sent thousands of MFA push notifications saying, please sign in, please sign in. Eventually the person just gave up and went, ok.
Dark Halo, SolarWinds and FireEye/Mandiant
Attackers don’t have to be advanced and complex. They can do it very simply. So many to pick from. I said I’m going to pick three examples I’m going to talk through in detail. I’m going to talk through the timeline. I’m going to talk through some of the details of what happened. I’m going to talk about SolarWinds, which is everyone’s favorite vulnerability and experience. Something called Volt Typhoon, and something called Storm-0558. Dark Halo, SolarWinds, and FireEye/Mandiant. All of these organizations have funky names because the first security organization to find them gets to name them, and so we tend to use the same name. Dark Halo is the name given to this adversary in various forms. What was the timeline? Does anybody here remember SolarWinds?
The timeline we have, back in 2019, a security company called Volexity spots a bunch of odd activity by an organization they call Dark Halo. They’ve not seen them before. They are running rampant around one of their customer’s systems, and they’re stealing emails. They manage to kick them off. They find some really odd behavior in there, around the use of tokens to get hold of their Outlook Web Access server. They find them back again in mid-2020, June 2020, I think it is. They’re like, they got back in. How did they get back in? That’s a really weird thing they got back in. We’d kicked them all out. It’s the same people, we saw the same sets of behaviors and activities.
At the same time, the Department of Justice says they noticed something going on in some of their networks as well, and they reach out to a bunch of their suppliers to be like, did you see anything weird? People start connecting the dots. Nothing public comes out, apart from the original Volexity reports, being like, there’s this group called Dark Halo. They’re quite neat. They do some funky things. You should look out for these things.
In November 2020, Mandiant/FireEye, it was FireEye at the time; the company was later renamed Mandiant. They see a really odd behavior, which is one of their employees, who is on vacation, triggers an MFA alert by authorizing a new device on their account. They do it from a VPN that is not one of the corporate VPNs, on a phone that they don’t normally use. Mandiant, being a security company, are like, that’s really weird. We should investigate that. They investigate, and it’s not the employee. They contact them outbound and they’re like, “Not me, I don’t know what’s happened”. They start investigating. They realize they have been compromised.
They are a security company, and somebody has got into their network and is looking for the red team tools, the penetration tools they use to attack other companies legally when you contract them. They don’t just do it for fun. Those tools enable them to get access to network systems and services. FireEye are one of the best in the world at this. They’re very good. They keep their tools private and protected, and they don’t let people use them. This is a really competent, powerful set of tools. After some investigation, in December they announce, we found the compromise. It is this set of activity. They reference back to the Dark Halo report to say, we think it’s the same actors. There are some similar things, but there’s some really odd stuff going on.
Over that weekend, there’s a flurry of activity between security companies. Mandiant and Microsoft release public blog posts saying that they’ve also been hacked, and they’ve traced it down to a piece of software called Orion, published by a company called SolarWinds. SolarWinds produce a blog post that says, we were hacked as well. This is really bad. We were hacked. That statement and that thing really kicked off the whole thing of people going, what’s going on? How has SolarWinds been hacked? What do we use it for? Orion is a systems management piece of software, so you put it on your network and it manages all your servers for you. If it’s compromised, your network is compromised.
How does it work? There were hidden files injected into SolarWinds’ Orion, so bits of additional code that ran in it. That code which they called SUNBURST, when it was deployed, when you deployed your update of Orion, it would stay dormant for 12 to 14 days. It wouldn’t do anything. It literally had a timer in it being like, I’ve started, don’t do anything, wait 12 days or 14 days. It was randomized. Once it then ran after 12, 14 days, it would analyze the server. It would get information about what network it was on, what DNS records it had, what IP address it had. It would bundle it all together, and it would send it off to the internet, and it would send it off to a unique domain name that was made from a hash of the IP address of the company.
Something on the backend would see that information and would make a decision about whether to download new malware onto that system. That malware was much more targeted, and it was a piece of malware called TEARDROP. It was a far more complex architecture. It had pluggable systems that allowed the attackers to run different types of code. What it did when it got onto the system was it stole credentials. It looked across the network for emails, for documents, for source code, and it shipped it off of your network. It was targeted to a few small organizations.
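To see why this eventually became spottable in DNS telemetry, here is a hedged sketch of the general idea: derive a per-victim beacon subdomain from host identity, so the operators can tell victims apart from DNS logs alone. SUNBURST used its own custom encoding rather than SHA-256, and the base domain below is a placeholder, not the real infrastructure:

```python
import hashlib

def beacon_domain(hostname: str, base: str = "appsync-api.example.com") -> str:
    """Illustrative only: hash the victim's host identity and use a
    truncated digest as a unique subdomain. The operators' backend can
    recover which victim is calling home purely from the DNS query,
    before any second-stage payload is served."""
    digest = hashlib.sha256(hostname.encode()).hexdigest()[:16]
    return f"{digest}.{base}"
```

The defensive flip side is that these high-entropy, never-seen-before subdomains stand out in DNS logs, which is one of the ways the campaign was reconstructed after disclosure.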
I don’t think that’s the interesting bit of this attack, although, for the organizations targeted, they probably thought that was an interesting bit of this attack. What I thought was really interesting was, how did they actually compromise this Orion bit of software? Because you went and downloaded the legitimate patch, and in fact, if you went and checked and did the GPG signing, which we all do, you would have found that it was, in fact, a signed update. It was deployed properly and built by SolarWinds. They did it by compromising SolarWinds as a company. They got into SolarWinds. SolarWinds publicly said that sometime around February 2020, they first were able to deploy the SUNSPOT code into the SolarWinds TeamCity agent.
How many people here run TeamCity, or some other CI system? TeamCity is one of many continuous integration systems. It does builds. It has an agent-based system which can spin up a VM, build all your software inside that VM, and then shut it down. It turns out that what had happened in this case, and there’s an excellent write-up in Wired that you should go read after this if you want to see more, is that one of those VMs didn’t shut down properly and was not cleaned up properly, which meant that when they went back, they found this VM and found the compromised code in the VM image that hadn’t been cleaned up. What they had done was they’d put it into the build agent. When the build agent was building all of the software, what it would do is, instead of just running the C# compiler to build the Orion software, it would add files into the build directory.
Those files weren’t in the source code repository. They didn’t exist, but they were added in during the build stage on the build agent. It would correctly build the software, sign it, package it, and then push it into the repository, which was a fabulous attack because almost nobody can see it. There’s no evidence anywhere apart from on your build agent that this thing is actually compromising and backdooring your software.
In June 2020, the attackers came back and deleted everything. They removed the malware. They removed the backdoor. They just deleted everything, and then they disappeared. I thought that was really interesting, why? Why did they delete it in June? The reason, it seems, is that the Department of Justice had spotted some unusual activity, and they wrote out to a bunch of their suppliers, including SolarWinds, and they said, we spotted some really weird activity on the network. It looks a bit like this. Wonder if any of you have seen it? It’s suspected, it’s not actually confirmed by anybody, but the Wired article strongly suggests that the attackers were reading the corporate emails in SolarWinds from customers to see if anybody had spotted the activity, saw that, and went, “We are rumbled. Let’s delete everything so that nobody can see how it works”.
That is an incredibly cool hack that affects the highest end of the sorts of things that we do. What does it mean for you? Because, most of you are going, they wouldn’t target me. That’s not terribly interesting to me. Actually, I think there’s something we can pull out from this that is really interesting. Your development infrastructure needs to be as secure as your product. Your development infrastructure is a part of building the security of your product. For most of us, the build servers that we run are probably hung together by a couple of developers in their spare time. They’re not run by enterprise IT, because enterprise IT aren’t necessarily entirely sure what TeamCity is, or why somebody would run CircleCI, or whatever your CI system is.
Your development infrastructure has to be as secure as your product, because otherwise somebody can use it to compromise your product. It doesn’t matter how good your actual secure software development lifecycle is. If your build servers and your build infrastructure are compromised, your product is compromised. I think that’s something that we need to think about and make sure that we are investing in. There are a set of public references here.
I’m going to add an addendum to this story, which is an activity by what is probably the same actors, but named by a different company, in this case, named by Microsoft, called Midnight Blizzard. This adds one thing to the end of that, which was, Microsoft itself in January 2024 says that they were attacked by Midnight Blizzard. They were compromised, and they were looking for emails. Primarily, they’re looking for emails about Midnight Blizzard. This, I thought was interesting, because of how the attack happened as well.
In this case, there was a compromise of an old test application that Microsoft ran that was unmonitored, unmaintained, and it turned out, had some weak passwords, and the attackers just sprayed every possible password at the system. Because it was a test system, it didn’t have any of the normal things that would alert or do anything about that. They found a password, they got into the system. It turns out that system had an OAuth component in it that allowed you to authenticate to that test application with your core Microsoft corporate credentials, because you don’t want to roll your own identity system inside your test system.
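Password spraying has a distinctive shape in authentication logs: many different usernames, a few attempts each, often from one source, which is the opposite of classic brute force hammering a single account. A minimal sketch of spotting that shape; the event format and threshold are my own illustrative choices, not from the talk:

```python
from collections import defaultdict

def spray_sources(failed_logins: list, min_users: int = 5) -> list:
    """Flag source IPs whose failed logins span many *different* accounts.
    failed_logins is a list of (source_ip, username) tuples from your
    authentication telemetry. The min_users threshold is illustrative
    and should be tuned against your own baseline."""
    users_per_ip = defaultdict(set)
    for source_ip, username in failed_logins:
        users_per_ip[source_ip].add(username)
    return sorted(ip for ip, users in users_per_ip.items()
                  if len(users) >= min_users)
```

The point of the incident, of course, is that this kind of telemetry existed for production sign-in paths but not for the forgotten test application.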
That application had powers far beyond what it should have had. One of them was the ability to act as any user who had been there. They were able to use it to create a new OAuth app with wider permissions. They were able to use that to jump into the corporate admin of Microsoft Corp itself. Then they used that to generate tokens and access rights to their Exchange web server. They targeted some of the highest-level executives in Microsoft itself, as well as the security team, looking for emails about Midnight Blizzard.
Again, it’s a really neat attack, probably not one that would target you, you would think. What does it mean for you? I drew this out: your test environments are almost certainly part of production. It’s easy to think the test environments are just test environments, but actually, for most of us, there is a bunch of infrastructure that we have to share. It might be our core corporate identity system. It’s quite likely, in some cases, our VPN infrastructure. Quite likely there are certificates and other bits you have to deploy into your test infrastructure, because doing duplicates of that is hard.
Your test environment can become a weak part, because it’s very rare that you monitor it to the level that you monitor your production infrastructure. Probably for good reasons. The monitoring and governance in your production infrastructure is painful and difficult, and the test environment needs to change on a regular basis. Your test environment will be the weak part of your organization, if you’re not careful. There are some more references to that. That’s the first of three. The actors are probably the same.
Volt Typhoon – Critical Infrastructure Organizations
I was talking about Volt Typhoon. Volt Typhoon, we have a lot less information on, in part because the public attributions about it don’t go into quite as much detail, so the timelines are a bit vaguer, but we’ll go with the timeline here. Volt Typhoon in mid-2021 probably started this campaign of targeting critical national infrastructure across the U.S. The activity was attributed by Microsoft in 2023, and the U.S. Cybersecurity and Infrastructure Security Agency (CISA) published the very snappily named AA23-144A advisory, which said, you should watch out for this. They are scanning networks. They are compromising people’s systems. They are targeting critical national infrastructure.
Microsoft also issued the Volt Typhoon blog post with a whole bunch of technical details about how this stuff happened. People probably didn’t pay as much attention as we would like. In February 2024, CISA issued the snappily named AA24-038A. You got to love a government organization that knows how to name a document. This one really talked about the fact that not only is this activity ongoing, not only has it got some really unusual behaviors that you should be aware of and you should pay attention to, but that it’s operating in a way that’s unlike many of the normal espionage activities that we see online.
The U.S. authoring agencies assessed with high confidence that Volt Typhoon are pre-positioning themselves on IT networks to enable lateral movement into the operational networks to potentially disrupt functions. They were like, we’re worried about this. This is really dangerous. Who did they target? Critical national infrastructure. They targeted communications, manufacturing, utility, transport, construction, maritime, government. Who works in IT, information technology? Anybody work in education? I think Duolingo is here as well. They targeted a lot of people.
How did it work? Again, incredibly neat. First thing, preemptive reconnaissance. They looked online at things like public conference talks. They looked online at blog posts. They looked online at GitHub repos to see, what did people put out about their infrastructure? What type of infrastructure did they run? Were they customers of certain vendors? They then used initial access with zero-day attacks on the network’s border. They used these zero days. Over the last couple of years, you will see news reports about, there is an Ivanti vulnerability, or there is a Cisco vulnerability. Quite a lot of these probably come from a bunch of this activity, seeing people use it. They used it against pretty much every remote access vendor there is out there, VPNs and remote desktop gateways primarily.
Once they got onto those VPNs, what they would do is sit on the VPN and modify it to take copies of the usernames, passwords, and credentials used to access those VPNs. Then they would reuse those to move around the network, which is quite neat. It’s really dangerous when you think, I have to give my username and password to the VPN, so if somebody compromises the VPN, you’re in trouble. Once on the network, they didn’t download malware, unlike the SolarWinds operators with their SUNBURST malware, which was a really neat bit of software. This was what they called hands on keyboard.
The chances are somebody, some real person, was sat there with a remote terminal, typing letters into the terminal and running things on the network. They ran tools that were already on the servers. They didn’t have to download anything. They didn’t trigger any weird warnings from your computer going out to the internet and downloading an odd bit of malware. They used PowerShell. They used Ntdsutil, which is a tool for dumping the domain directory in Windows. They used xcopy, which is a tool I didn’t realize still existed. I used that in DOS days in Windows 3.1. They used whoami. They used tasklist to find tasks like your security systems, and taskkill to turn them off and kill them.
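Because nothing is downloaded, detection has to key off which built-in tools are being run, and by whom. A hedged sketch of that idea follows; the event shape is my own assumption, and in practice any single hit on something like whoami or tasklist is a weak signal on its own, so real detections score combinations, timing, and context:

```python
# Living-off-the-land binaries named in the advisories. The set and the
# event format are illustrative choices, not an official detection rule.
SUSPECT_TOOLS = {"ntdsutil", "xcopy", "whoami", "tasklist", "taskkill",
                 "powershell"}

def suspicious_process_events(events: list) -> list:
    """Return process-start events whose binary name is a known LOLBin.
    Each event is expected to look like
    {'host': 'dc1', 'image': 'C:\\\\Windows\\\\System32\\\\ntdsutil.exe'}."""
    hits = []
    for event in events:
        # Normalise the path and take the bare binary name.
        name = event["image"].replace("\\", "/").rsplit("/", 1)[-1].lower()
        if name.removesuffix(".exe") in SUSPECT_TOOLS:
            hits.append(event)
    return hits
```

In a real environment you would feed this from Sysmon or EDR process telemetry and alert on clusters, for example ntdsutil on a domain controller followed shortly by xcopy to an unusual share.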
They also had a really interesting activity, which is the compromise of small and home office routers. They would target small organizations, small and medium enterprises across the U.S., and, in fact, across the world, compromise their routers, the same kind of Ivanti, Cisco tech things you would buy, but they wouldn’t compromise the organization. They would just sit on the router and use it to bounce their traffic through, so that you become part of the network that is attacking these critical national infrastructures.
Again, bar the fact that most of you seem to work in the IT industry, what does it actually mean to you? Actually, the thing I think you take away from this is that you might be more interesting than you think, not necessarily because of who you are and your customers’ data, but because the organizations your networks are connected to, the network infrastructure that you provide, and the things that you manage can be a target for some of these organizations. You might be compromised just to provide a smoke screen for those actors. Some more references for those of you who like to read, such as the Microsoft advisory. Google “Volt Typhoon”, this will all come up. It’s fine.
Storm-0558 and Microsoft
The nice thing about these names is that they are, in fact, all quite memorable in interesting ways, apart from this one, Storm-0558. There’s a catchy name for an actor. The reason it’s called Storm-0558, is because Microsoft use this when they don’t know who to attribute it to. Storm is one of the things where they’re like, we don’t know who this is, but it’s a collection of activity we’ve grouped together. This is the 558th one that they say they’ve seen, assuming they started at 1 and not at 100, or something to that effect. This is, I think, probably the neatest of these three attacks. What did they do?
Back in 2021, a crashdump on a developer’s desktop inside Microsoft seemed to contain some key material that it shouldn’t have. In fact, what it had was a private signing key for Microsoft’s own customer infrastructure that allowed it to sign certain requests in the customer infrastructure. Essentially the private SSH key, or equivalent private key, that allowed them to sign requests for the Office 365 online estate in its entirety. It was only a signing key, but that’s fine. According to the report, the activity started sometime in May 2023. Microsoft noticed it just a few weeks later, which is quite quick. The attackers targeted about 25 different organizations, according to Microsoft. Within a few weeks, Microsoft said, we’d spotted it, we’d contacted all the organizations, we’d remediated the issues.
Then they published a blog post three weeks after that, saying, “We were compromised, and this one was bad. We are doing a deep technical investigation”. Two months after that to the day, they published an incredible deep dive technical investigation into what happened.
How did it work? This is where I think it’s absolutely astonishing. Microsoft takes really good care of this key material. The ability to craft an SSL certificate or a key for the Office 365 estate is incredibly powerful. Nobody in Microsoft is supposed to be able to access those keys. They are generated in secure environments. They never leave the secure environment. We’re software developers, sometimes we write software and it doesn’t work. I know it’s shocking. Mine’s always worked first time. I’ve never had to debug anything. Sometimes it doesn’t work. How do you debug these things?
Microsoft had a tool for pulling down a crashdump onto a developer’s desktop, and it specifically went through, in the secure environment, and zeroed out all of the areas where the private key material would be. It turned out there was a bug, and one time it didn’t run, and so a crashdump came down that included private key material from the secure area. The key material was on the developer’s desktop. The developer’s desktop was compromised by this actor, who found the key material, extracted it from an in-memory crashdump of a crashed process, and was able to recover the key. It also turned out the key only had very limited permissions.
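The lesson generalises: scrubbing sensitive material from artifacts that leave a secure environment should fail closed, so that a scrubber bug blocks the export rather than silently leaking. A toy sketch of that pattern follows; the byte markers are hypothetical placeholders, and for brevity this only zeroes the marker itself rather than locating and zeroing the full key region:

```python
# Hypothetical byte patterns that indicate key material in a dump.
SECRET_MARKERS = [b"-----BEGIN PRIVATE KEY-----", b"MSKey:"]

def scrub_dump(dump: bytes) -> bytes:
    """Zero out regions that look like key material, then re-scan and
    refuse to release the dump if anything survives. The re-scan is the
    fail-closed step whose absence let a signing key escape the secure
    environment in the Storm-0558 story."""
    cleaned = bytearray(dump)
    for marker in SECRET_MARKERS:
        start = 0
        while (idx := bytes(cleaned).find(marker, start)) != -1:
            cleaned[idx:idx + len(marker)] = b"\x00" * len(marker)
            start = idx + len(marker)
    # Fail closed: if any marker is still present, abort the export.
    for marker in SECRET_MARKERS:
        if marker in bytes(cleaned):
            raise RuntimeError("scrubbing failed; dump must not leave the enclave")
    return bytes(cleaned)
```

The design choice worth copying is the second loop: the scrubber does not trust its own first pass, so a partial failure raises instead of shipping a dirty artifact.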
It was only really for signing. It wasn’t supposed to generate authorization tokens. There was a bug in the software library that meant that if you generated a token with it, the receiving system did not validate whether the key used to issue the authentication token was an issuing key or a signing key. It had a valid signature, and therefore it was fine. It didn’t check the role of the key that had done the signing. They were able to use that to sign a request to log in to Outlook Web Access. That request said, I am user x, I have completed my MFA. I’m currently coming from this location in your office, doing all these things, and I would like to download my email over the web. The system went, sure, signed by Microsoft’s infrastructure, you are good to go. Then, highly targeted, they went for the email inboxes of individuals.
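The missing check is easy to state in code: a valid signature is necessary but not sufficient, because the verifier must also confirm the signing key is authorised for this token type. A hedged sketch with a made-up key directory and token shape, not Microsoft’s actual validation library:

```python
# Hypothetical key directory mapping key IDs to their authorised role.
# In the real incident, a consumer signing key was accepted for
# enterprise tokens because nothing checked what the key was *for*.
KEY_ROLES = {
    "msa-signing-001": "consumer-signing",
    "aad-issuer-007": "enterprise-issuing",
}

def validate_token(token: dict, required_role: str = "enterprise-issuing") -> bool:
    """Accept a token only if (a) its signature verified, and (b) the key
    that produced the signature is authorised for the required role.
    The token dict shape here is an illustrative assumption."""
    if not token.get("signature_valid"):
        return False
    return KEY_ROLES.get(token.get("kid")) == required_role
```

The second condition is the one the vulnerable library skipped: it stopped at “the signature is valid” and never asked “valid for what?”.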
This is probably one of the most targeted attacks that we have. What does it mean for you? What can you learn from this kind of attack? Your developer desktops should be assumed to be compromised, because, actually, developers are not always the best at their personal security for a bunch of reasons. Sometimes they’re better than IT, sometimes they’re worse. What’s on that developer desktop that gets compromised? Those of you who are developers, particularly those of you who are SREs, how many of you have SSH keys that enable access to production on your desktop? I’m going to go with, quite a lot of you.
How many people have got keys that would allow you to contribute code into your GitHub source repository? How many of you sign those keys and that key is available somewhere on your desktop? If you assume those things are compromised, you have to think about what it means to be running and writing code on a compromised device. That’s the level of threat you can really worry about. There are things you can do, and that’s what I’m going to come to. We’ve got some references.
I’ve given a couple of takeaways throughout, but what does this actually mean for you? To summarize, attackers are interested in your data. Attackers are interested in your products and your systems in various ways. Attackers might just view you as collateral and be quite happy to use you in some way, shape, or form. Attackers care about your customers, because you are part of a supply chain that reaches people who might be more interesting to them in various ways. Then the final one, which I didn’t really cover in a bunch of this but which I believe is true in most organizations: your CISO probably has very little oversight or power over your development infrastructure.
In most organizations, your CISO’s primary concern is the compliance requirements of the organization and ensuring that your user data doesn’t get breached. They care about the production things that happen. They care about the data, the way your business runs. They probably have very little oversight and power over the security of development infrastructure. The question I leave with you is, who is responsible for the security of that? Because I think the answer is probably your CTO, your VP of engineering, which I think means most of you are probably responsible for the security of a lot of this stuff. You can’t outsource it to the CISO.
What can you do? This could be a very dismal tale. If I just walk off stage now, you’ll all go away being like, everything’s ruined. Why do I even get up in the morning? Actually, there are things you can do. These are the top 0.1% of attacks. I picked these three because, of the publicly attributed attacks from the last 3 or 4 years that we have good records of, they are probably the best examples. They’re all well documented, they’re relevant in various interesting ways, and they’re exciting. If you’ve not heard about them, you go away thinking, I’ve got cool stories to tell at the pub about attacks. I can go read these. I love them for that.
There are things that we can learn. I’ve got five things I’m going to go through. The first is good enterprise security. The second is multifactor authentication. The third is modern administration practices. The fourth is about knowing your assets. The fifth is about helping your customers. I’m going to give you a little bit of detail about that. Good enterprise security. What does it mean to do good enterprise security? I’m going to say something that I think might get a groan from people: it depends.
I’ve not been allowed to write code for the last 7 years, being a manager instead, so I never get hands on keyboard anymore. Use your organization’s managed devices for your development teams. I know lots of developers who hate this idea, the idea that somebody in IT can control their device. As a developer, I would spend weeks crafting my bash login to make sure that the prompt was exactly right, that when I changed into a directory, it showed me the git status of it. Developers, we like to craft our machines. Every developer I know has a different desire for how they want their desktop to run, the tools they want to run on it. Lots of those things require power over the device.
When you get given a managed device, which often feels like it runs slowly, it prevents you doing any of the things you want to do, developers find it really frustrating. However, those devices are secured at least to a level where people will know when they’re compromised. They come with a bunch of features and things on it. The most important one of those is, run an endpoint detection and response tool on those endpoints, and your servers in block mode. Your developer desktops should be running something that is detecting malware running on the device and at least telling people that it’s there.
Your servers, your source control servers, your build servers, your development infrastructure should all be running this stuff and making sure it’s detecting it. Because if you got SUNBURST, or you got one of these pieces of malware dropped on the device, these things get updated with signatures, and they can spot it, and they can tell you that this thing has happened. The hardest of these is application allowlisting where practical. This is particularly true of those big servers. There is no good reason I can think of that your source control server sat in the middle of your organization with your gold, your organization source code for everything that you do, should allow you to run xcopy.exe.
There is no good reason for you to be able to run that on that machine. If, for some reason, the administrators need to run it, they can allow it, run a script, disallow it. They have the power to do that. Where practical, you should be saying, only things I know should be running on this machine, should be running. Ntdsutil is not something you should be running on general machines, so make sure that you are allowlisting the things that should be running on those. Development machines can’t do this because you probably build EXEs and run them on a daily basis. Even when you do an npm update, you’re probably running a billion pieces of unsigned code that you didn’t know where it came from, but you should be doing it on the servers and that development infrastructure.
Multifactor authentication, I talked earlier about that exhaustion attack, about the people pressing the buttons. If you can, use phish-proof multifactor authentication. If at all possible, use hardware tokens. The FIDO2 hardware tokens are excellent. You can get a Google Titan key, a YubiKey, all kinds of hardware tokens. One of the things you can do with those is you can configure SSH, GPG, and other things to use that hardware token to keep the cryptographic secrets on the token. That means if somebody compromises your device, steals things off your device and moves it elsewhere, they still can’t log in to the production servers as you because they need a piece of hardware that only you have.
If you have that hardware, enable it, and use it to store your developer secrets. You may not want to buy hardware tokens for every single developer in your organization. That’s fine. They’re not expensive, but they’re not cheap. You can also use things like, on Windows, Windows Hello, and on the Mac, the Secure Enclave. You can store passwords in the enclave on the device. That means that if the device is imaged and taken elsewhere, nobody can get passwords, nobody can get keys. Turn that on, use it. Use it to store your most critical developer credentials, because that stuff matters, and it matters hugely to the security of the organization.
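As a sketch, the SSH side of this looks something like the following. It assumes OpenSSH 8.2 or later built with FIDO2 support, a hardware token plugged in, and an illustrative file path; check your own tooling before relying on it.

```shell
# Generate a key pair whose private half lives on the FIDO2 token.
# The file written to disk is only a handle that references the token;
# copying it to another machine is useless without the hardware.
ssh-keygen -t ed25519-sk -f ~/.ssh/id_ed25519_sk

# Logging in now requires the physical token (and a touch), on top of
# anything an attacker might have copied off the laptop.
ssh -i ~/.ssh/id_ed25519_sk you@prod-server
```

The same idea applies to GPG signing keys: the secret stays on the token, so a compromised desktop can request operations but never exfiltrate the key itself.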
Modern administration practices. You should be managing your developer infrastructure, and in fact all of your infrastructure, with modern administration practices. I’m assuming most of you do not run the IT department in your organization, because this is mostly a developer- and CTO-level conference: people writing software, running development teams. Some of you might manage your IT team. I suspect most of you have a separate enterprise IT team who provide things for you. Some of you will have the joyful experience of being in a very small company where you are the IT team, the expenses team, and the CEO at the same time.
If you have an IT team, you might want to help them with this one, because this is for them as well. Do not log into your servers as the domain admin. The domain admin account has the power to do anything, including create other domain admins, including do all kinds of things. You should almost never log in to that server as the domain admin, because the number one thing that ransomware operators and some of these organizations do is run a tool on the device that will take the password out of memory. It’s a really neat little tool, and it should be picked up if you’re running MDR.
Lots of people are not running the endpoint detection and response system on there. Instead, you should be creating accounts that have just the right sets of privileges, that don’t have the privilege to create a new domain admin. However, on your laptops and developer desktops, you should switch to a separate admin account instead of using sudo. This is probably quite a controversial one, and it particularly affects Windows, but it’s true on Linux and Macs as well. If an adversary can compromise your normal account, they can do anything on that account that you can do. If that means they can type sudo rm -rf everything without typing a password, then they can do that as you.
In Windows, quick user switching is unbelievably good. It means that you switch between another account and this one. It means you keep all of your normal stuff running, not as administrator, with no power over the device. When you need to install software on your computer, you can switch to the admin user and switch back. On the Macs, the quick user switching is a bit harder. You might want to think about how you do that on the Mac, or how you do it in other places. One organization I know didn’t let the developers run as admins, but they did allow them to contribute to a source control repository that would push admin commands to their machine.
The idea is, you can push something to the repository and it can push to you, and it means everything’s logged and monitored. It’s harder on Linux and Mac desktops, but on Windows, use quick user switching and use two separate accounts, if you can. The reason for that is that the prompt that pops up and says, do you want to run this as an administrator? is trivially easy to bypass on lots of computers. It’s actually incredibly easy to hide, and very easy to make it approve automatically. It is one of the things that red teams often present at security conferences: here’s the latest way that I’ve got through that.
You should also ensure your development infrastructure is logging. Do your source control, your build systems, your data science laboratory log to the central enterprise logging server? They probably don’t. They definitely should. Putting all those logs in a central location lets people find odd anomalies, things going on in your development infrastructure. Make sure that you know what’s going on. I’ll come back to it, but you should be logging weirdness in that stuff as well: things that are really odd or unusual, because that stuff matters.
This bit, I hope we’ll be preaching to the choir, but knowing your assets. This is really hard for enterprise IT, people who don’t know what systems they’ve got. If you have immutable infrastructure, so if every time you deploy, you blow away the virtual machine, you put a new one out, it’s really hard for adversaries to maintain any persistence on your system. They have to keep compromising you over again. Any system you are doing where you’re rolling out new VMs from golden images or whatever, and rolling it on a regular basis is much harder to compromise. You should also be using infrastructure as code.
Huge numbers of organizations just do not know how many servers they have. They don’t know how many build servers they have, what they have for their TeamCity agents. If you’re using infrastructure as code, you do have a good recognition about what those are. This is a place where development teams are almost certainly ahead of IT teams, because a lot of IT teams are still configuring your corporate IT by logging into machines and clicking things on the GUI. It’s been 10 years. We can show them better ways of doing a lot of this stuff, PowerShell, and there’s neat tools that will do a lot of this stuff for you. It will give you massive amounts of confidence in what you can and can’t do.
You should help your customers too. SolarWinds was a really good example where the tool ran on the machines and it was easy to compromise in various ways. Reduce the attack surface of your products. Don’t expose ports you don’t need to. Encourage people to use the administrative tools that are out there. Allow people to use their single sign-on solution to sign in to your tool, preferably at the free tier. When that’s locked behind the enterprise tier, it drives me up the wall. People who want to use your product, people who want to use Trello, or Jira, or whatever inside their development community, if they can’t do that single sign-on, you create a vulnerability where somebody can target those things. Make it easy for people to use your tool effectively. I would also add, if you can, try and make it so that your tools can work either offline or with really limited network connectivity.
One of the things that caused Orion to be so damaging was it controlled the core corporate network of huge organizations in various ways. It let you automate a bunch of that systems administration. It also was very chatty, it talked to the internet, and it meant you had to have an open connection to the internet and allow it to go off and download malware and run it on the device. If that machine could operate offline and couldn’t connect to the internet, it would have been so much harder for the adversaries to get commands to that machine. It would have been really difficult. If you can’t run offline, and sometimes you can’t, there’s lots of things, at least describe what connections your system should make so that your customers can put that into their logging. Did it make a call to a domain it doesn’t normally make a call to? Did it do something unusual?
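As a toy illustration of that last point, if a vendor publishes the list of domains its product is expected to contact, a customer can diff their egress logs against it. All the domain names below are invented for the example.

```python
# Hypothetical list a vendor might publish: every domain the product
# should contact in normal operation.
EXPECTED_DOMAINS = {
    "updates.example-vendor.com",
    "licensing.example-vendor.com",
}

def unexpected_connections(egress_log: list[str]) -> set[str]:
    """Destinations seen in the egress log that the vendor never
    documented -- exactly the kind of anomaly worth alerting on."""
    return {domain for domain in egress_log if domain not in EXPECTED_DOMAINS}
```

A real deployment would feed this from firewall or DNS logs, but the principle is the same: without the vendor’s published list, the customer has nothing to diff against.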
That’s what the Department of Justice noticed on the network, was their SolarWinds machine making unusual calls. They weren’t sure quite what it was because it changed regularly, so they constantly got these alerts. They were like, something odd is going on? Then, finally, your development logs are security logs in an incident, those logs that you create that say these weird things happened, this signature is wrong, or this key is in a really weird format. After, when you discover you’ve been breached and you go back, those logs help you reconstruct what happened. They don’t feel like security logs. They’re not there about authentication failures. They’re not there about the VPN. Those things that you log can be used in a security investigation. Make sure that your logs are logging in a structured fashion, if at all possible. I shouldn’t have to tell people to do structured logging, it’s 2024. Make sure they’re date and timestamped. Make sure they can be shipped off into another location.
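A minimal sketch of what that looks like in practice, using Python purely as an illustration: one JSON object per line, timestamped in UTC, easy to ship off to a central location.

```python
import json
import logging
import sys
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Minimal structured formatter: one JSON object per line,
    UTC timestamped, safe to ship to a central log store."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        })

# Wire the formatter into an ordinary logger.
logger = logging.getLogger("build-pipeline")
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)

# The kind of "weirdness" log that becomes a security log in an incident.
logger.warning("artifact signature in unexpected format")
```

The point is not this particular formatter, it is that structured, timestamped lines can be parsed and correlated months later, when you are reconstructing what happened.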
What Does the Future Hold?
I said my team does horizon scanning. I thought I’d give you five quick tips about things my team thinks are coming in the future. The reason for this is John Boyd, who was a fighter pilot who helped build the modern fighter jet back in the ’50s, I think. He said, “We have to act, not just faster than the adversary. We don’t just need faster planes. We need more maneuverable planes, because we have to be able to observe the environment, orient, decide to do something and act, and go around this loop faster than the other people”. Horizon scanning lets us do that. I love this quote from William Gibson, who wrote some sci-fi, “The future is already here, it’s just not evenly distributed”. When we think about what’s happening in the future, what I look at is stuff that’s already happening in small places, and ask what’s going to grow.
Key Takeaways
I’ve got five areas that I thought I would leave you with. The first is zero trust. I hate the term zero trust. As Ashley said, zero trust is sold to people, they can buy it as a box. Actually, the fundamental principles of, know who your people are, make sure you assume your networks are breached, assume that your developer devices are breached. That’s going to be a much bigger thing over the next 5 years. Dependencies and vulnerability management, they’re going to become increasingly an issue. They’re already an issue for us. Those who do follow the headlines will be seeing stuff. npm, the Node Package Manager, the amount it downloads, and the amount of dependencies that are in modern software is unthinkable, and it’s getting so hard to know what’s going on.
That leads into that fundamental software supply chain. In reality, huge amounts of the stuff that we rely on day to day is relying on open-source projects maintained by some poor guy in Texas somewhere with no budget. Build pipelines are going to be a huge thing. The fact that we saw it in an advanced attacker means that in the next 5 years, we will see ransomware operators, other people doing it. We’re already seeing reports of lower tier actors starting to target some of those systems because they’re interesting, and they contain credentials, they contain stuff. Then, finally, cloud admins. You can set up the most secure system in the world, but if somebody can log in and turn off all your firewalls into your cloud control panel, the security doesn’t matter. How you administer the cloud itself, how you manage the control of that cloud is going to be a huge thing that we need to pay attention to.
Conclusion
It is really easy for us to kick some of these organizations for being breached. It’s really easy to go look at them, they did a bad job. They were breached. Look at that thing they did, their system failed. This thing was really bad. Actually, when people are open and transparent about what went wrong, it lets us all learn. It’s something that I picked up in the DevOps movement about the value of retrospectives, the value of saying, there’s not a blame culture, there’s a thing of saying, what did we learn from this thing? That can only happen if there was a culture of transparency. We need organizations to say, something went wrong, or, there was a near miss, and doing it publicly and providing information about it makes us all better as defenders.
It makes it much easier for us to learn from each other’s failures, because we’re not all targeted equally. Somebody will target one organization, and if it works well, they’ll move on to another one. If we share stuff, when it goes wrong, we can defend. It’s like building antibodies in a system. That transparency is one of the most powerful things we can have. In this case, the amount of transparency those breached organizations provided in their public reports enables talks like this, which enable you to go away and say, “I learned something valuable from it”. We have to bundle that with empathy.
I like to go back to the Agile Retrospective, the core manifesto of the Agile Retrospective, which is, regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills, and abilities, the resources available, and the situation at hand. Because it’s easy to kick people when they’re transparent. It’s easy to say, “Look at them. They did these things badly, and they should do better”. Actually, if we have to have empathy with people who were breached, then we have to go, it could have been me, and it could be me next time. If I have empathy for them now, then they’ll have empathy for me next time. That kind of classic HugOps.
Work for your government. They need you. We do interesting, fun stuff. We aren’t always liked for a whole bunch of reasons. Being a civil servant is not the most exciting thing. Some of you do not live in the UK. I am not saying come and work for the UK government. Work for your government in some way, shape, or form. If you do work in the UK, then I would encourage you to come and work in the UK Government, because we are doing cool, exciting, and interesting things. You can have a full career working in digital, data and technology, or come and work with me in security in some way, shape, or form.
If you don’t want to commit to that, if you just want to do a tour, we have a digital and data and security secondment program trying to bring people in from industry. If you go there and your organization is willing to give you up for two days a week, three days a week, something like that, you can come and work in government. You can give back to government, make sure that public services are good, are capable, and deliver the things that make our citizenry better in various forms. You can do it while seconding into government. If at all possible, do a tour, work for government, because your government needs you.