>> Good afternoon, how are we? We can do better than that. Maybe get a beer. So how many people made it over to the CTF? Go check it out. Look at it. It's fun. And we've heard a little bit about the CTF, and these guys will talk more about it. Good stuff. Let's give these guys a big round of applause. [ applause ] >> All right. I sound really loud. Welcome to our talk. We are here to talk about some work we've been doing in the context of our CTF team and security research, and in the context of the CGC and open source. We are here to talk about angr, and we will get to that later. First, who are we? I'm Zardus and this is Fish. We are from Shellphish. We are a CTF team. We are playing the CTF right now on the DEF CON floor over there. We are fighting hard. As you will see, this talk is full of live demos. We are going to try some CTF challenges and show how to approach each challenge with this framework. You will see everything melting down completely, and a lot of fun stuff. The point is, the same thing is happening on the CTF floor: we come in badass and prepared, and then the game starts. Anyway, we are Shellphish. We are also from UC Santa Barbara, where we have an awesome computer security lab, and that's where angr was created. The major contributors: me and Fish. Andrew Dutcher, amazing dude. Nezorg, aka John. He's the leetest one of us. And Chris, very creative. And Chris, a postdoc. This is the angr team. As for me, I've been coming to DEF CON since DEF CON 9, back at the Alexis Park. It was awesome then and it's awesome now. It's been a lifelong dream to come here and speak. And now I'm here, and it's awesome to be in front of you guys. I'm a student in Santa Barbara, and I'm actually there because of CTF. Shellphish is mostly from Santa Barbara, and I joined them for a DEF CON qualifier, got pulled in, and got pulled into the lab. That was pretty cool. And I'll let Fish here introduce himself. >> All right. You guys can hear me? Awesome. Thanks. I'm Fish, obviously. This is not my real name. 
I don't think you guys can read my real name if you are not from China. Anyone from China who can read it? Probably not. You guys are honest. I'm a Ph.D. student from Santa Barbara. Super famous from melting work. >> Super famous. >> And I've been playing DEF CON CTF; this is my fourth year. But I've been playing CTF for six years. I'm a reverser. Yesterday I just solved my first one. First in four years. So I'm very happy. To solve that challenge I didn't really sleep last night, so if I talk some nonsense today... That's a little bit about me, and I will hand it back to Yan. So this is our itinerary today. And we might fail completely. Why did we build it? Spoiler alert: we are releasing it. Then we will talk about what we designed it to do. We will talk about the different parts of our analysis system. We will talk about applications of it, show off solving or assisting with some CTF challenges, and then the open-source release. >> Any of you guys playing CTF right now? Raise your hand. A couple of them. Okay. So you don't want to leave before the talk ends. >> Unless you are not Shellphish; then you should leave before we get to the live example. Okay. Let's jump into it. Why angr? Why did we build a binary analysis platform mostly from scratch instead of using one of those available out there right now? In fact, there are tons of them out there. I went through to see what our competition is, kind of. There are enough to fill an entire slide. When we started angr two years ago there were not this many. But now everyone is staring at binaries in IDA. It came out in 2005 or so. IDA is kind of the de facto binary analysis tool that everyone uses. So, of course, our long-term goal is to, you know, replace IDA. Working with IDA quite a lot, it's sometimes frustrating. We have moments of "why is it doing this, why is it designed that way." The truth is, designing an analysis tool is extremely difficult. 
We are nowhere near replacing IDA, but there are things we do that IDA does not, and for some of them there is no other software out there that is capable of doing them the way we do. We will add one more name to that list, and hopefully you will find it useful; at least we do. So we are pretty excited. So let's talk about the fundamentals of angr. What did we design it to be like? The idea is that we are all Python users, mostly, in the lab. Show of hands: who uses Python as a primary hacking language? So angr is written in Python, for Python, working on binaries. That gives us flexibility and lets us explore analyses in a powerful and unique way. We will show how to use angr to quickly script symbolic execution, the finding of ROP gadgets, and so forth. And, of course, a core requirement of any modern analysis system is that it has to work on platforms other than just x86. By leveraging the intermediate representation called VEX, we support the 64-bit and 32-bit versions of all major architectures. And [inaudible] SPARC. We spent a couple of days hacking SPARC support into this and it's almost there. It's pretty extensible. This is what a user using angr might see. You just import it, open up an example binary, and then you go into all of the analyses and all of the things that angr offers, which we will see later in the demos. angr has several different components. A binary loader that's general. We can load PE files. We can't do much with PE files yet, but we can load them and start executing until we hit some environmental interactions. We load Linux binaries and so forth, and we even support [ inaudible ], so if you have a dump off of an IoT device, angr will tell you where you can load it in memory and start analyzing it. There is a static analysis engine, which Fish will talk about, and a symbolic execution engine. The symbolic execution engine is capable of identifying unsafe situations and of reversing what inputs are needed to drive a program down a specific path. So let's dive in. We will start with symbolic execution. Whoa. And we are done. 
[ laughter in the room ] It's been nice talking to you guys. I'm sure this is one of many such situations. Awesome. And now it's no longer full screen. Awesome. And okay. We are almost there. Boom! [applause] Thank you. That was the first demo. We start with symbolic execution. It's a subject that has been around for a little while and is gaining more prominence. I don't know if you were at yesterday's talk on symbolic execution in another analysis system. This is kind of an analogous thing. What symbolic execution does is answer the question: how do I trigger a certain path or a certain condition? So you might imagine a binary that does something when you give it a certain input, like a crackme CTF challenge, which we will look at later. How would you interact with that? You could just give it input. You say, here's a guess, is it good? And it will tell you no. Most likely you are not going to guess a flag. Or you can do some sort of static analysis: open IDA, look at the binary, click here and there. You can do it this way, but it's not going to give you an answer, because it's not precise enough. We will talk about static analysis later on. So now we need symbolic execution. We interpret the application, and as we interpret it we track constraints on symbolic variables. When a required condition is triggered and we see a path that we like, we concretize the input, the variables, to identify a possible concrete input. A quick example of this: if you have a constraint on a symbolic x, you can do a constraint solve and come up with the number 42. In this case it's super trivial, but in general it's an NP-complete problem and it's kind of a pain in the ass. You start a constraint solve and it will never finish. That's one of the challenges in symbolic execution. Let's go to an example program. angr analyzes binaries, not Python, but Python is more approachable. Bonus points if you can catch the vulnerability in this program, by the way. The first thing we do with symbolic execution is go line by line until we hit this input. 
You see it in blue, and we execute it. That input becomes a symbolic variable x. x is unconstrained. It has no known constraints on it. Then we continue executing and we hit this branch, and what it does is split into two possibilities. One of the possibilities is that x was greater than ten and the branch was taken; otherwise it's the inverse of that, x is not greater than ten. And so we continue executing. Now we have two states. Keep that in mind. With symbolic execution there are multiple states, and we will see why that's a problem. So now we have two states. One branch, one state, is not done yet, and it splits as well according to the different possibilities. Then, suppose you want to answer the question of what it takes to print two in this scenario. In order to print two, we have our constraints, we have the state of the path that made it there, and we do a constraint solve. The constraint solver tells you that you can input 99 or 42 and so forth, and you give that dynamically to the program, launch it, and see the expected output there. So let's do a demo of a very simple binary, a toy binary that has a backdoor that we want to detect with symbolic execution. This hiccup is on me, because I stupidly did a git pull right before doing this, and so I don't think it works anymore. So we might have to switch to Fish's laptop. Can everyone see that? Awesome. So we launch it. Okay. We will switch to Fish's laptop for this demo. This is what live demos are all about: Python exceptions. There you go. So Fish uses Windows. I know, it's embarrassing. Bear with us. >> You know, in this kind of situation Windows is never [ inaudible ] [ laughter in the room ] >> At least it's a Linux VM. >> angr currently supports Linux. In the future it will run on Windows. >> Allegedly. So this is angr management, angr's GUI for doing symbolic analysis and static analysis. We will look at this toy binary we have that's nice for testing, and explain what symbolic execution means. Let's look at it in IDA first. 
As I said, of course, everyone uses IDA. Oh, the source code? Yeah, we do. Great. All right. So we will just look at it. This binary is a binary that asks... we do have the source code. All right. So here is... >> Is it readable? >> Awesome. All right. Guys, if you can't read this, then we are screwed. So it's a very simple binary. It takes a user name and password as input. If the authenticate function returns one, it says you've been accepted. The authenticate function has a backdoor: if you pass this string compare, then you authenticate automatically. It's possible to detect this automatically in angr management. So here's the GUI. Over here we have the display of which paths are currently active in the analysis. We can run multiple analyses at the same time; we never run just one. We can step these paths and we can look at what's currently in their registers. Is there a way you can scroll somehow? So this is what's currently on the stack. So then we can take that and step it. Let's execute until it branches. So here we have a path that branched. It branched for some reason, and that reason is that there's user input that's a symbolic variable, which can be concretized to anything. And here we can actually look at it. You can look at the user input and we see that... Fish, I can't use your mouse. Oh, I'm touching... but you can see that on one hand the user can input any password and it does one thing, and if the user inputs sneaky it does another. If we look at standard output instead and keep stepping, there, you'll see that when the user input so sneaky it immediately trusted him and let him in. So this is an example of how symbolic execution can help us analyze binaries, and we will go into more complex ones for CTF challenges. So let's... there. Oh, come on. Great. That was my temporary DEF CON password. It's gone! Great. Yeah. Let's keep going on yours. 
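The forking walkthrough from the talk (split the state at each branch, collect constraints, then solve for an input that reaches the path you want) can be sketched in plain Python. This is a toy illustration, not angr code: the `State` class, the three-way example program, and the brute-force "solver" over a small input domain are all assumptions made for the sketch. A real engine would hand the constraints to an SMT solver instead.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A constraint is a readable label plus a predicate over the symbolic input x.
Constraint = Tuple[str, Callable[[int], bool]]


@dataclass
class State:
    """One execution path: the constraints collected so far and what it prints."""
    constraints: List[Constraint] = field(default_factory=list)
    output: Optional[str] = None

    def fork(self, label, predicate):
        """Return a new state that additionally assumes `predicate` holds."""
        return State(self.constraints + [(label, predicate)], self.output)

    def satisfied_by(self, x):
        """Check whether a concrete input x would follow this path."""
        return all(pred(x) for _, pred in self.constraints)

    def solve(self, domain=range(256)):
        """Toy constraint solve: brute-force a small domain (assumption)."""
        for x in domain:
            if self.satisfied_by(x):
                return x
        return None


def explore():
    """Symbolically run a hypothetical program:

        x = input()
        if x > 10:
            if x < 100: print("two")
            else:       print("three")
        else:
            print("one")
    """
    start = State()  # x is unconstrained here
    # First branch: the state splits into two possibilities.
    big = start.fork("x > 10", lambda x: x > 10)
    small = start.fork("x <= 10", lambda x: x <= 10)
    small.output = "one"
    # The state that is not done yet splits again.
    mid = big.fork("x < 100", lambda x: x < 100)
    mid.output = "two"
    huge = big.fork("x >= 100", lambda x: x >= 100)
    huge.output = "three"
    return [small, mid, huge]


def input_that_prints(target):
    """Find a concrete input that drives the program to print `target`."""
    for state in explore():
        if state.output == target:
            return state.solve()
    return None
```

Calling `input_that_prints("two")` returns one concrete input on that path; inputs such as 42 or 99 satisfy the same constraints. The number of states grows with every branch, which is the multiple-state problem mentioned in the walkthrough.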
People tell me not to use Linux for presentations, but I don't believe them. I think it's just fine. But it's dark magic. So, oh well, it's you. So, along with that, static analysis. >> I just figured out Yan has taken too much of my time, so I will keep it simple and talk a little bit about it. If you are interested, you should come to my lab and become a Ph.D. student. So let's start. Before we understand a binary, we need to know its control flow. In IDA, the first thing you see is a pop-up box. You click okay and you see the control flow graph. We do the same. In angr management, that's our GUI, we show you the graph of every single function, very similar to IDA. What's the difference? Ours is more accurate and more adjustable; the price is that it's much slower. That is because we support multiple options, like context sensitivity levels, and techniques like backward slicing to automatically resolve some stuff that's hard to resolve normally or statically, for example jump targets or virtual pointer tables. In comparison, IDA's CFG is faster. This is how we create a CFG in angr. First line, import. Second line, create the project with the binary name. Third line, you call .CFG. Press enter and it will give you the CFG. We want to see how many basic blocks there are. There are 78. So if you want a faster CFG and you don't want to buy IDA, you can check this out. It's a fast mode of CFG generation that doesn't do any symbolic solving. There's also BoyScout. All right. Another static analysis routine in angr is value set analysis. This is a kind of abstract interpretation. In case some people haven't heard of it, that is a kind of static analysis that executes part of the program. Say there's a loop that will loop three times. We figure out the semantics of the program and execute only part of the program. That gives us the possibility of enumerating the state space, because we are not executing the whole program; we are exhausting the state space. On top of that we can have variable recovery. 
And on top of that we can build memory and type inference. Credit goes to the author of this paper. I tried so hard to read your name. He's the creator of VSA, value set analysis. Here's an example of what value set analysis looks like. Here's a piece of x86-64 assembly. You have five seconds to read it. Okay. Great. I think if you know assembly, you will understand this program. So what is rbx in the yellow square? With symbolic execution, it will just keep executing it. The problem is that at every iteration of the loop it will branch out. Zero acts 25 thousand different states. If you are using a naive static analysis, rbx could be anything, because we are not following every single branch. With range analysis we can actually tell that rbx is less than 1025. Is that good enough? Let's try to do better. Value set analysis: this is one of the types of values that value set analysis uses. It's called a strided interval. A set of numbers can be described by a lower bound, an upper bound, the stride between each value, and their size. So here the interval can be concretized, and it means nine different values, starting from 0x100 with a stride of 4. That's what stride means. So what is rbx in the little square? The first time we take the loop, rbx can be from zero to 4. Second iteration, it can be from zero to 8. And next, zero to... and so on, and the loop is not terminating. What do we do? Do we keep looping forever? No. We widen: rbx goes to infinity. After that, zero to infinity is not accurate enough, so we perform a narrowing. With that it becomes zero to 1024. In this case it's pretty accurate. We extended the original value set analysis with the following two improvements. The first one we named limited relation analysis. In this case, normal VSA will be able to tell the bound of rex should be 5, rcx they don't do any relation tracking. They don't know that. We are doing some limited amount of variable relation tracking, and in this case we are able to tell rcx equals r plus 1 and rcx 36. The other improvement: we made our VSA signedness-agnostic. 
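The strided-interval idea and the widen-then-narrow step described for the rbx loop can be sketched as follows. This is a simplified illustration, not angr's VSA implementation: the `StridedInterval` class, the loop shape (rbx starts at 0, adds 4, continues while rbx is below 1024), and the choice to widen after two iterations are all assumptions, and bit sizes and lower-bound widening are left out.

```python
from dataclasses import dataclass
from math import gcd, inf


@dataclass(frozen=True)
class StridedInterval:
    """stride[lower, upper]: lower, lower+stride, ..., upper (assumed non-empty)."""
    stride: int
    lower: int
    upper: float  # an int, or math.inf after widening

    def values(self):
        """Concretize a bounded interval into its list of values."""
        step = self.stride or 1
        return list(range(self.lower, int(self.upper) + 1, step))

    def add(self, k):
        """Abstract version of `x + k`."""
        return StridedInterval(self.stride, self.lower + k, self.upper + k)

    def join(self, other):
        """Smallest strided interval covering both operands."""
        stride = gcd(gcd(self.stride, other.stride), abs(self.lower - other.lower))
        return StridedInterval(stride, min(self.lower, other.lower),
                               max(self.upper, other.upper))

    def widen(self, other):
        """Widening: if the upper bound is still growing, jump to infinity."""
        joined = self.join(other)
        upper = joined.upper if joined.upper <= self.upper else inf
        return StridedInterval(joined.stride, joined.lower, upper)

    def clamp_upper(self, bound):
        """Apply a guard `x <= bound`, keeping the upper bound on the stride."""
        if self.upper <= bound:
            return self
        if self.stride == 0:
            return StridedInterval(0, self.lower, self.lower)
        upper = self.lower + ((bound - self.lower) // self.stride) * self.stride
        return StridedInterval(self.stride, self.lower, upper)


def analyze_loop(limit=1024, step=4, widen_after=2):
    """Abstractly run: rbx = 0; while rbx < limit: rbx += step (hypothetical loop)."""
    init = StridedInterval(0, 0, 0)  # the single value 0
    head = init
    history = []
    iteration = 0
    while True:
        # Values of rbx flowing back to the loop head after one more iteration.
        candidate = init.join(head.clamp_upper(limit - 1).add(step))
        if iteration >= widen_after:
            new_head = head.widen(candidate)
        else:
            new_head = head.join(candidate)
        if new_head == head:  # fixed point reached
            break
        head = new_head
        history.append(head)
        iteration += 1
    # Narrowing: one more pass through the guard recovers a finite bound.
    narrowed = init.join(head.clamp_upper(limit - 1).add(step))
    return history, head, narrowed
```

Under these assumptions `analyze_loop()` passes through 4[0, 4] and 4[0, 8], widens to 4[0, inf], and narrows to 4[0, 1024]. The nine-value interval from the slide would be `StridedInterval(4, 0x100, 0x120)`, where the upper bound 0x120 is derived here from "nine values, starting at 0x100, stride 4" rather than stated in the talk.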
We included another analysis called wrapped interval analysis. The credit goes to this guy, who published it in 2012. With that the precision is greatly improved. And now I'll give it back to Yan, and we will talk about applications and real demos. >> All that technical talk, or theoretical talk, may be a little boring, but it's necessary to get us into the actual angr applications. Here we will demo the things that we do and that you can do with angr. First we will demo a ROP gadget finder. ROPgadget or xrop will tell you that there's a gadget and what its instructions are. This one tells you what the gadget does, and you can filter it down later. And, of course, it's implemented in angr and it's super easy to use. So here's the example. We load a CTF binary called nuclear. It's not from this DEF CON CTF but from a different CTF in the past, and we analyze it. We want all the ROP gadgets: find them and print them out. So let's do that. Because it does semantic analysis, it's a little bit slow. It takes 20 seconds, maybe a little bit more, for this guy. Right now angr is analyzing every basic block and figuring out its semantics: what registers it touches, how much it changes the stack by, and where it writes to in relation to the various registers that it uses. So here's an example gadget. It's a gadget at 0x4040c in the binary. It changes the stack by 0x14, it pops rbx and rbp, and it does a memory write to this address. So this is actually on the stack; it does a memory write onto the stack. And it does a memory read from an address that depends on rbp. This gives a lot of information. In fact, our next step is to implement a ROP compiler based on this. We were hoping to have this ready, but it's not quite there, so stay tuned. Another thing that I'll demo is how to solve a crypto challenge in angr. This is more of a crackme, but it's a cool little demo of angr's abilities. The challenge is from a WhiteHat CTF. It's a CTF that happened last month, and I was looking at the challenges later to find some crypto. And found this. 
Figured it would be a good example for you guys. This challenge takes input on the command line and, in standard crackme fashion, tells you if you are right or wrong. In general, when we try to guess, we are wrong. Let's open it up in IDA. We start looking around, and the binary is really big. So, of course, we can start drilling down into parts of the binary, figuring out what it might do. We can decompile it and try to figure it out. One of the first things we see immediately is that it does something, and if this returns zero, it says please check again. All right. Let's look at it. It does some complicated stuff. But there is an equals-equals eight. This is part of the process of solving the challenge. So I quit out of IDA and went to angr. I wrote a little... whoa, a little bit of code. It's heavily commented in our example repository. We open the binary. angr's symbolic execution has trouble with certain kinds of code, especially in static binaries, so I hooked those with Python replacements to help it along. And then basically I ran it. I created a path group, which is the single entry point to the symbolic execution engine, and I told it to go and find this point. This is the point where it passes this stage and says the input is okay. After that it does some more processing, and that is a pain in the butt to get through. So let's look at where we are at this point. It turns out that the possible key space that makes it to that point is much reduced, and after that it is brute-forceable. So here I get the possible values from angr. And, of course, how to do this is all in the docs, or look at this example. Then, with a very fancy progress bar, I test every possible value until I get the right one, allowing me to solve this crypto challenge. So let's see how this goes. Run it here. Here angr is stepping through the binary. It's at this point where the input is tested again. I guess at this point. 
And now it's just trying to reduce the set of possibilities, which went from 8 bytes down to 6 thousand or so possibilities. This is the example just iterating through the possible keys that can even make it through this point, trying to find one that says success. It should find it at the 80% mark. I'm surprised that it hasn't crashed yet. Boom. We found it. The flag is this. If we run... actually, this is the input. If you run the binary with it, boom. So angr is very useful for these sorts of challenges. I'll pass it on to Fish to look at a real-world example, a CTF that's happening now, and how to use angr for that. >> So one of angr's abilities is to load up a binary and execute an arbitrary part of the code in it. I had some demos for this prepared before DEF CON, but yesterday when I was playing the DEF CON CTF there was another challenge. This is a good one to talk about for angr. >> Cover your ears, please. >> rxc is a 64-bit binary. It's big, and reversing it is hard. We spent a long time reversing it. Before that, we got some suspected ROP chains. What does a ROP chain do? I mean, we could definitely hire a bunch of monkeys to figure it out, but we have angr. >> The monkeys we hire would be ourselves. >> So this is our ROP chain execution program, called derop. You pass it the ROP chain. We load the binary rxc, create a state, dump the ROP chain onto the stack, and then execute it. I use explore, execute, and return. Let's run this program. Python rop chain to pi. We run the first ROP chain. Bummer. It's called r. >> Very descriptive variable name. >> There's an unconstrained path. >> Of course, this is all. >> Fie rows. >> This is technical. You can read the documentation to see what's going on. >> This is the exact path that the ROP chain is following. And now, of course, you have the ability to read every single state at every point in the chain. The next example is for the same binary. In this binary there's a really interesting function. It does some encryption. 
And later on we figured out it's t. We didn't want to reimplement it when we were writing exploits -- (inaudible) -- so what do we do about it? Luckily we have Python. Great. So there's another small program I wrote, called collarbone. What it does is take in some data and encrypt it with the exact encryption function in that rxc program. It is 30-something lines of code, and then you don't need to understand the encryption function anymore. You start Python and it automatically does the encryption for you. Let's try it. Python. Call it pi. 8 bytes. And then you get encrypted data, so it all works. >> Whoa. There is also binary diffing, but in the interest of time you can check that out on your own, and we will briefly talk about the CGC. You know, it's the Cyber Grand Challenge. That's the machine, one of the machines that will be running the finals, where machines will battle each other, hacking for some prizes, next DEF CON. Shellphish accepted this challenge and we managed to qualify. There has been a lot of presenting. Go back. This is a very clever set of slides. Shellphish participated in this challenge, and we qualified, going from just another CTF team to one of the richest CTF teams in the world, along with the others who qualified. For the CGC we built a cyber reasoning system that exploits binaries and patches them. It is complex, and angr actually sits at the core of every component, which is pretty cool. So check out the system. It's a real-world system with real-world uses, and we love it. And it's open source. With special thanks to our professors and to DARPA, with two different projects that angr was developed for. And, of course, to all of the contributors to angr that we've gone over. You can pull it from GitHub. Go to angr.io, subscribe to our mailing list, and we welcome questions. We are hoping to make this the next-generation binary analysis tool, and we hope to work with you to do it. 
angr is two years old now, with almost 60 thousand lines of code and about 6 thousand commits, and we would love to have all of you working on it with us. Any questions? I guess no questions. [ applause ] Thanks. >> Thank you guys.