As part of Microsoft’s research into ways to use machine learning and AI to improve security defenses, the company has released an open source attack toolkit to let researchers create simulated network environments and see how they fare against attacks.
Microsoft 365 Defender Research released CyberBattleSim, which creates a network simulation and models how threat actors can move laterally through the network looking for weak points. When building the attack simulation, enterprise defenders and researchers create various nodes on the network and indicate which services are running, which vulnerabilities are present, and what type of security controls are in place. Automated agents, representing threat actors, are deployed in the attack simulation to randomly execute actions as they try to take over the nodes.
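To make the idea concrete, here is a minimal sketch of such a simulation in plain Python. It is not CyberBattleSim's own code or API: the `Node` class, the `random_attack` function, and the node and vulnerability names are all hypothetical, and the model assumes that exploiting a vulnerability on an owned node simply hands the attacker the nodes it exposes unless they are hardened.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """One simulated machine (hypothetical model, not CyberBattleSim's own)."""
    services: list
    # Maps a vulnerability name to the node ids it exposes once exploited.
    vulnerabilities: dict = field(default_factory=dict)
    # Stand-in for a security control: a hardened node cannot be taken over.
    hardened: bool = False


def random_attack(network, entry, max_steps=200, seed=0):
    """Run random exploit actions from owned nodes; return the owned node ids."""
    rng = random.Random(seed)
    owned = {entry}
    for _ in range(max_steps):
        source = rng.choice(sorted(owned))
        vulns = network[source].vulnerabilities
        if not vulns:
            continue  # nothing to exploit on this node
        exploited = rng.choice(sorted(vulns))
        for target in vulns[exploited]:
            if not network[target].hardened:
                owned.add(target)  # lateral movement to a newly exposed node
    return owned


network = {
    "workstation": Node(["rdp"], {"cached_credentials": ["fileserver"]}),
    "fileserver": Node(["smb"], {"weak_share_acl": ["database", "backup"]}),
    "database": Node(["sql"]),
    "backup": Node(["ssh"], hardened=True),
}
owned = random_attack(network, entry="workstation")
print(sorted(owned))
```

In this toy network the random attacker can reach the database through the file server, while the hardened backup node stays out of reach.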
“The simulated attacker’s goal is to take ownership of some portion of the network by exploiting these planted vulnerabilities. While the simulated attacker moves through the network, a defender agent watches the network activity to detect the presence of the attacker and contain the attack,” the Microsoft 365 Defender Research Team wrote in a post discussing the project.
Using reinforcement learning for security
Microsoft has been exploring how machine learning techniques such as reinforcement learning can be used to improve information security. Reinforcement learning is a type of machine learning in which autonomous agents learn how to make decisions based on what happens while interacting with the environment. The agent’s goal is to maximize its reward, and agents gradually make better decisions (to get a bigger reward) through repeated attempts.
The most common example is playing a video game. The agent (player) gets better at playing the game after repeated tries by remembering the actions that worked in previous rounds.
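The learning loop can be illustrated with tabular Q-learning, one common reinforcement learning algorithm. The article does not say which algorithms Microsoft's agents use, so treat this as a generic sketch: the "game" is a hypothetical five-position corridor in which the agent is rewarded only for reaching the last position, and the parameter values are arbitrary choices.

```python
import random

N_STATES = 5        # positions 0..4; reaching position 4 ends the episode
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values by repeatedly playing the corridor game."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Explore occasionally (or on ties); otherwise use the best-known action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # Q-learning update: nudge the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

After enough episodes the learned policy should point right at every position, which is the "remembering what worked" behavior described above.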
In a security scenario, there are two types of autonomous agents: the attackers trying to steal information out of the network and defenders trying to block the attack or mitigate its effects. The agents’ actions are the commands that attackers can execute on the computers and the steps defenders can perform in the network. Using the language of reinforcement learning, the attacking agent’s goal is to maximize the reward of a successful attack by discovering and taking over more systems on the network and finding more things to steal. The agent has to execute a series of actions to gradually explore the network, and it has to do so without setting off any of the security defenses that may be in place.
Security training and games
Much like the human mind, AI learns better by playing games, so Microsoft turned CyberBattleSim into a game. Capture-the-flag competitions and phishing simulations help strengthen security by creating scenarios in which defenders can learn from attacker methods. By using reinforcement learning to get the reward of “winning” a game, the CyberBattleSim agents can make better decisions on how they interact with the simulated network.
CyberBattleSim focuses on modeling how an attacker can move laterally through the network after the initial breach. In the attack simulation, each node represents a machine with an operating system, software applications, specific properties (security controls), and a set of vulnerabilities. The toolkit uses the OpenAI Gym interface to train automated agents using reinforcement learning algorithms. The Python source code is open source and available on GitHub.
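The sketch below shows roughly what a Gym-style environment for lateral movement could look like. It follows the classic Gym convention of `reset()` returning an observation and `step()` returning an observation, reward, done flag, and info dictionary, but it does not import Gym and is not CyberBattleSim's actual environment. The `Machine` and `ToyLateralMovementEnv` classes, the reward values, and the per-action cost are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical node description mirroring the attributes listed above."""
    os: str
    software: list
    properties: set  # e.g. security controls in place
    vulnerabilities: dict = field(default_factory=dict)  # name -> exposed node ids
    value: float = 1.0  # assumed reward for taking this node over


class ToyLateralMovementEnv:
    """Mimics the classic Gym reset/step convention without importing gym."""

    def __init__(self, machines, entry):
        self.machines = machines
        self.entry = entry
        self.node_ids = sorted(machines)
        self.owned = set()

    def reset(self):
        self.owned = {self.entry}  # the initial breach
        return self._observation()

    def step(self, action):
        """Apply an action given as (source_node, vulnerability_name)."""
        source, vuln = action
        reward = -0.1  # assumed small cost for every action taken
        if source in self.owned:
            for target in self.machines[source].vulnerabilities.get(vuln, []):
                if target not in self.owned:
                    self.owned.add(target)
                    reward += self.machines[target].value
        done = len(self.owned) == len(self.machines)
        return self._observation(), reward, done, {}

    def _observation(self):
        # One flag per node, in sorted node-id order: is it owned yet?
        return tuple(node in self.owned for node in self.node_ids)


machines = {
    "client": Machine("Windows", ["browser"], {"antivirus"},
                      {"cached_credentials": ["web"]}),
    "web": Machine("Linux", ["nginx"], set(),
                   {"leaked_ssh_key": ["db"]}, value=2.0),
    "db": Machine("Linux", ["postgres"], {"disk_encryption"}, value=5.0),
}
env = ToyLateralMovementEnv(machines, entry="client")
obs = env.reset()
obs, r1, done, _ = env.step(("client", "cached_credentials"))
obs, r2, done, _ = env.step(("web", "leaked_ssh_key"))
print(obs, done)
```

Because the environment exposes only `reset()` and `step()`, any agent written against that convention, random or learned, could be dropped in without knowing how the network is modeled internally.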
Erratic behavior should quickly trigger alarms, and security tools would respond and evict the malicious actor. But if the actor has learned how to compromise systems in fewer steps, that gives defenders insight into where security controls are needed and helps them detect the activity sooner.
CyberBattleSim is part of Microsoft’s broader research into using machine learning and AI to automate many of the tasks security defenders currently handle manually. In a recent Microsoft study, almost three-quarters of organizations said their IT teams spent too much time on tasks that should be automated. Autonomous systems and reinforcement learning “can be harnessed to build resilient real-world threat detection technologies and robust cyber-defense strategies,” Microsoft wrote.
“With CyberBattleSim, we are just scratching the surface of what we believe is a huge potential for applying reinforcement learning to security,” the company added.