>> What an honor it is to be here at DEF CON; I'm grateful to be able to share some research with you. We're going to do three things: we're releasing a tool to attack next-gen AV, which you can find at the GitHub address shown, and I'm going to describe it, demonstrate it, and show you how to use it.

First, let's set the stage. Today we're talking about evading next-gen AV that uses static analysis to detect Windows PE malware. To motivate this, let's first talk about rules, and how one might write a rule to detect malware. On this little chart I've plotted a bunch of totally fictitious red dots and blue dots, which are meant to represent files, described first by file size and second by the number of registry-key strings contained in the file. [laughter] A feature of this presentation. Then, by hand, I create a YARA rule, shown in that black box, that defines a region of this feature space, file size versus number of strings, and cordons off all of the malware in my data set. This is nice, but of course it's really easy to break: if I just take my malware sample and add some bytes to the end of the file, in a way that breaks neither the PE file format nor the function of the malware, then I can break this rule.
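To make that brittleness concrete, here's a minimal sketch of a hand-written rule over those two features. The thresholds, the string heuristic, and the names are hypothetical, chosen only for illustration; they are not taken from the slide.

```python
import re

def hand_written_rule(data: bytes) -> bool:
    """Flag a file as malicious if it falls inside a hand-picked
    region of the (file size, string count) feature space."""
    file_size = len(data)
    # Count printable-ASCII runs of five or more characters as "strings".
    num_strings = len(re.findall(rb"[\x20-\x7e]{5,}", data))
    return file_size < 200_000 and num_strings > 400   # hypothetical cutoffs

# Appending junk to the overlay changes file_size without touching the PE
# structure or the malware's behavior, and can push the sample right out of
# the rule's box:
#   hand_written_rule(malware_bytes)                      # True  (detected)
#   hand_written_rule(malware_bytes + b"\x00" * 300_000)  # False (evaded)
```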
So what makes machine learning harder to break, harder to attack? There are a couple of reasons. One is that you can think of machine learning as a much more sophisticated and graceful rule: it learns complex relationships from the data automatically, and furthermore, instead of presenting a brittle cliff between malicious and benign, there's a smooth territory where the model can tell you its confidence that a sample is malicious or benign. This is important because it allows graceful degradation: if someone modifies a malware sample, there's a gradual falloff from malicious to benign. That can make it hard to attack.

So can we break machine learning? The short answer is yes, we can. For Windows PE, the idea is that we'd like to take a file that our model knows is malicious with high confidence, make a few subtle changes to the bytes on disk, modifying elements that break neither the PE file format nor the behavior of the malware, and trick our model into thinking it's benign.

It's actually become quite fashionable to break machine learning in recent years, and this will be a constant reminder to keep me on my toes; look how fast that's going. If you haven't seen this image yet, it's sort of famous in the image domain: you can take an image of a bus, for example, that an image-recognition computer-vision model knows is a bus with high confidence, change the pixels ever so slightly, and now, even though there's no difference to your eye, the model thinks it's an ostrich with high confidence. That's fun for images, but there are three takeaways from this kind of adversarial machine learning research, whether for images or not. The first is that all machine learning models have blind spots, so next-gen AV, buzzword, we do it at my company Endgame, has blind spots too. Number two is that depending on how much knowledge an attacker has about your model, those blind spots can actually be really convenient to exploit; the research I'm presenting today sits in the least convenient category, the most inconvenient spot from which to attack. And the third takeaway is a little scary: if I find this bus-to-ostrich confusion example for my model, there's a decent chance it will also work against your model. So the attacker often doesn't even need to attack your model directly to find some success evading it. That keeps people up at night.

All right, so that's for images, the bus-to-ostrich case. The way this really works is that in most cases two things hold. First, the attacker knows everything about your model: he essentially has the source code, he knows the weights, he knows the parameters. In fact it has to be a special kind of model, like deep learning and neural nets, that is fully differentiable. Given that, for my image of a bus I can actually ask the model, "What would confuse you the most? Tell me which pixels I should change to confuse you the most," and it will happily give me an answer. By changing those few pixels, the good news is I have not broken what it means to be an image. But let's think about applying this to PE malware. If I present some model with the bytes of a malware sample and ask it what bytes or features I should change, and I change those bytes on disk, then at worst I've totally broken the PE file format, and at best I've broken what the malware was intended to do. So, two problems: it requires full knowledge, you have to know everything about a deep learning model, and the samples it generates are not necessarily malware, in fact not necessarily even valid PE files.

A cooler attack, one that's black-box, so it doesn't need to be deep learning, it can be any machine learning model that reports a score to you, has been investigated by my co-researchers at the University of Virginia. It's based on genetic algorithms. In a nutshell, these follow the evolutionary principle of survival of the fittest: I start with a big batch of malware and sort of breed it with benignware. The attack takes structures or elements from benign samples, in this case for PDF malware, and inserts them randomly, mutating the DNA of the malware, then passes the result back to the machine learning model. If that decreases the model's score, I keep that malware sample around for the next round, the next generation of breeding. After doing this for, say, two weeks, you can evade these kinds of classifiers. The difficulties, however, are twofold. First, I need a model that reports a score, a number between zero and one, not just malicious or benign; it has to say 90 percent malicious or 20 percent malicious. And secondly, in this process it's quite possible that some mutated variant of the malware no longer performs the malicious behavior, so my colleagues at the University of Virginia used a sandbox as an oracle to make sure the behavior did not change from before mutation to after. That can be quite expensive, and it's why this kind of attack can take so long.
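In code, that loop looks roughly like this. It's a rough sketch in the spirit of the University of Virginia work, where `mutate`, `score`, and `behavior_unchanged` are hypothetical placeholders for their mutation engine, the target classifier, and the sandbox oracle, not their actual implementation.

```python
import random

def genetic_evasion(malware, benign_donors, score, mutate, behavior_unchanged,
                    population_size=50, generations=200, threshold=0.5):
    """Breed a malware sample against a scoring model until a variant evades it."""
    population = [malware] * population_size
    for _ in range(generations):
        # Mutate each variant by splicing in elements drawn from benign files.
        candidates = [mutate(variant, random.choice(benign_donors))
                      for variant in population]
        # Keep only variants whose malicious behavior is unchanged
        # (checked against a sandbox oracle -- the expensive step).
        candidates = [c for c in candidates if behavior_unchanged(malware, c)]
        if not candidates:
            continue
        # Survival of the fittest: retain the variants the model scores lowest.
        population = sorted(candidates, key=score)[:population_size]
        if score(population[0]) < threshold:
            return population[0]   # an evasive variant was found
    return None
```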
So I'm setting the stage here; I hope you see I'm trying to paint a picture of why PE malware makes it hard to attack machine learning. We want to avoid requiring full knowledge of a deep learning model or any other kind of model; in fact, we don't want to care what kind of model we're attacking, or even whether it's a machine learning model. Secondly, we want to make sure that whatever malware we produce by attacking the model maintains its file format and its functionality. And thirdly, we want to avoid the expense of running things through a sandbox to check behavior, where possible.

So our goal, an AI buzzword in the title, but true, is to design an artificially intelligent agent that will learn to play a game against your machine learning model, choosing mutations that are known to preserve file format and function. For this we're going to turn to reinforcement learning, and to do that, hopefully without insulting your retro childhood or current retro lifestyle, let me explain the game of Atari Breakout in two sentences. It's a game where you move a paddle left and right, hoping to bounce a ball off the paddle and launch it toward a brick, and every time you knock down a brick, you get a reward.

So how would I build an AI agent for this based on machine learning? One way, done by the folks at OpenAI, is to wrap it in a so-called reinforcement learning framework, and it's actually really simple. I've got a screenshot of my environment: it includes the display of the Atari output, the ability to move the paddle left or right or do nothing, and a scoring mechanism that gives me a reward every time I knock down a brick. Then I train an agent, shown on the bottom, and the agent learns through delayed feedback. Given a state of the environment, which is literally a screenshot of the Atari game play, from which it can presumably learn the positions of the ball, the bricks, and the paddle, it needs to choose the best action: move left or move right. Based on that, it may eventually receive some reward for an action that led to a reward. The basic idea is that after playing thousands and thousands of games, the agent learns to answer the question: what action is most useful given a screenshot of Atari game play? This is a fun problem, and you can actually go download an AI agent for Atari from that OpenAI website that will be better than you at Atari Breakout.

We're going to change this to play a new game, but let me first describe why we've wrapped this in reinforcement learning. In the Atari example, when I move my paddle right, there's no reward for that; I get no points. I move it right again by chance, move it left by chance, move it right, and by some stroke of luck the ball bounces off my paddle; again, no points. I move right again, but eventually the ball goes and breaks a brick, and I get some points. Now, in isolation, none of those moves was useful and resulted in a reward, but because I got that eventual reward, I'm going to distribute credit and reward the whole sequence of actions as having provided some useful benefit.
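As a concrete picture of that loop, here's a minimal sketch against the classic OpenAI Gym API (the older step interface that returns four values, and it assumes the Atari extras are installed). The random action stands in for whatever learned policy the agent ends up with.

```python
import gym

env = gym.make("Breakout-v0")            # environment: screen pixels, paddle actions, brick rewards
observation = env.reset()                # state: a raw screenshot of the game

for _ in range(1000):
    action = env.action_space.sample()   # a learned policy would replace this random choice
    observation, reward, done, info = env.step(action)  # reward arrives only when a brick breaks
    if done:                             # out of lives: start a new game
        observation = env.reset()
env.close()
```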
And it's this very same concept we're going to use to break next-gen AV. So here's the new game. Instead of a screenshot of Atari pixels, we have a malware sample. The scoring mechanism is my next-gen AV; instead of a score, it's going to say "yes, I believe you're malware" or "no, I believe you're benignware." The agent learns to select from a buffet of options that are known to preserve the file format and the function of the malware by statically manipulating the binary on disk. By playing thousands and thousands of games, the hope is that the agent can learn basic ideas like: given this kind of malware sample, I should add an import, or append to the overlay, or create a new entry point and use a trampoline to get to the old entry point; things that can hide the presence of malicious activity by creating camouflage in the binary.

So we are releasing a tool to do this: you can go to GitHub, endgameinc/gym-malware, and download some very rudimentary code to do just this. We've provided the following elements of the game play. It is literally a gym that can be used in the OpenAI framework for creating your own reinforcement learning agent, and we've provided some very basic agents to begin with. It works like this. In the case of Atari, the state was a screenshot; in our case the state is a feature vector that summarizes, coarsely and imperfectly, the state of the malware: what does the sample I'm using to attack the next-gen AV look like? That feature vector is based on general file information, header info, section characteristics, imports, strings, file bytes, and file entropy, things that are often used in static malware classification by next-gen AV. We feed that into a neural network that learns, given this state, what the best action is. As for the actions I can choose from, right now our buffet has just a few options: I can create an entry point, create a section, add bytes in places that don't break the file format or functionality, or apply modifications like packing and unpacking, idempotent operations that don't change the behavior of the malware but change how it's presented to a malware classifier. We're using the very cool tool called LIEF, the Library to Instrument Executable Formats by Quarkslab, so shout out to them. Finally, also included in this release is a toy next-gen AV. It's a decent toy, worthwhile to attack and see how it performs. The key here is that the game doesn't care what you put in that black box. It could be our toy model, or you could rip that out and put in your own next-gen AV model, a traditional antivirus engine, whatever. At the end of the day you just have to retrofit it so that it reports a zero or a one: one for malicious, zero for benign.
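Assuming the released environments follow the standard Gym interface, playing the new game looks roughly like the sketch below. The environment id `malware-v0` is an assumption on my part; check the repo's registration code for the exact names.

```python
import gym
import gym_malware   # assumed to register the malware environments released with this tool

env = gym.make("malware-v0")             # assumed id; the black-box AV model lives inside
observation = env.reset()                # feature vector summarizing the current PE sample

done = False
while not done:
    action = env.action_space.sample()   # pick a format- and function-preserving mutation
    observation, reward, done, info = env.step(action)
    # reward is granted when the black box flips from 1 (malicious) to 0 (benign)
env.close()
```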
All right, let me demonstrate how this worked on some samples, but first, just to drive home how hard this is: the agent has a very incomplete view of the world, since the malware is summarized into a feature vector that is not at all perfect, and its actions are stochastic in nature. In Atari game play I can say "move right" but not how far the paddle moves; similarly here, I can say "add an import," but the environment chooses randomly from a list of known benign imports. So there's this random nature, and furthermore I know nothing about the model I'm attacking. This might be a little bit like trying to traverse a maze without a map, while wearing kaleidoscope glasses, and while intoxicated, which probably describes a lot of your day jobs, I don't know. This is a really, really difficult problem. Nevertheless, we hope we can learn.

You probably can't see this; I can hardly see it myself. What I'm showing here is the output of two examples. At first the agent is just guessing totally at random and getting nowhere. After I wait several minutes, the agent, through its exploration process, catches a lucky break: it creates a new entry point, which evades the machine learning model on that malware sample, and it updates how it learns to evade that model. By getting lucky enough times over tens of thousands of games, even with the rudimentary model we put in place, we can begin to learn to break next-gen AV.

So here are the results, in one minute. Given a batch of malware samples that neither the agent nor the model had ever seen before, we modified those samples with our agent, which had learned to play the game against the next-gen AV, and 60 percent of them snuck past. Furthermore, remember how I don't necessarily have to attack your model to bypass a different model? We uploaded those samples to VirusTotal, both pre-modification and post-modification. Pre-modification, 35 out of 62 engines caught those samples. After our agent got hold of them, 10 additional antivirus engines whiffed on those malware samples. That's pretty cool. We also ran purely random mutations, because we wanted to make sure our agent was learning something and not just getting lucky all the time. So we did the "lucky" experiment, and it turns out lucky is pretty good too, but the agent is about 50 percent better than lucky.

All right, we're done. The summary is this. You can go to GitHub, endgameinc/gym-malware, and try this game for yourself. No knowledge of the target model is needed. It will manipulate raw binaries and produce new binaries this world has never seen, some fraction of which may evade your machine learning model. I hope people contribute and make it better; we use these things at Endgame to help harden our models. Stepping back a moment, it turns out that machine learning is actually fairly robust: even under direct attack, the machine learning models warded off most of these attacks. Nevertheless, all models have blind spots, so don't buy into the hype. And with that, thank you. [applause]