>> Good morning, everybody. How are we feeling today? Good. Thank you all for coming. These guys had an unfair advantage, because everybody came in yesterday morning and said, "What in the world is that thing?" It looks like the promotion worked. So let's learn about how machines are going to replace us all. Let's give these guys a big hand. [ applause ]
>> Hello, DEF CON. Thank you for the intro. Yes, we did bring a high-performance machine this year. My name is Mike Walker. I'm a program manager at the Defense Advanced Research Projects Agency, DARPA.
>> Hi, everybody. I want to point out the URL at the left front of the stage. If we are going to talk about capture the flag, we should let you guys play one. If you go there (you might want a laptop), there's a binary, source, and connection information. Have at it. We might have to give you hints later on. And one little bit of advice: if you connect to it, try not to hold a long session. You will see why later.
>> Hopefully some of you can hack it and capture the flag. In the meantime, I want to talk to you a little bit about why we are here. When I walk around DEF CON and I'm from DARPA [ inaudible ]. We are also known for our history of challenges. Starting in 2004, DARPA started opening global competitions, first in self-driving cars. We gave a million dollars to the first team that could drive an autonomous car, and another in city traffic. Again, global open competitions to develop technology that doesn't really exist yet. We are here today to talk about bringing autonomy to the sport of hackers, capture the flag. Next year we will take this room, knock down the air walls, and have a space three times bigger, in real time, with sportscasting. Imagine an event where all the contestants are machines. We will talk about how it works and why it's so important. So if we are going to talk about computer security and autonomy, we need to recognize that computer security is an adversarial contest of the mind. Bruce and Dan talk about a field that is defined by an intelligent opponent.
And computers have been playing adversarial contests of the mind for a while. We will start with checkers. It's a solved game. We are able to write every single position in a database, all ten to the 20th positions, and solve for what perfect play looks like. It turns out, once checkers was solved, the conclusion was that with perfect play the winning move is not to play. But the big game is chess. Chess was proposed as a grand challenge for machines by Claude Shannon, the father of information theory. And the idea of a computer that could beat the very best people at chess took 46 years. It wasn't until years later that Deep Blue beat the world chess champion. In 1970 the ACM created an all-computer chess league just so computers could play each other. At a prototype competition in 1977, one of those competition computers beat a grandmaster for the first time. It was foreshadowing things to come. That's chess. Let's talk about a harder game. This is Go. Recently Go computers have started to be competitive with the very best masters at Go on this board, which is a beginner board. This is not the board that masters play on. They play on this one. 10 to the 7th 6th power positions. 10 to the 86th power: the number of atoms in the known universe. That is not infinite, but it's not possible to reason about. When it comes to those boards, computers don't have a chance against the Go players. So when we start talking about big search spaces, machines start to break down when they play against us. But the games I've talked about are actually very simple games. Let's talk about a real game. We appear to be locked up. I'm just going to keep talking. Let's talk about poker. Poker is a very difficult game for machines to play, and it's difficult for a variety of reasons. The first is that it's multi-opponent. You don't have a single opponent in poker. Say I'm at a table playing poker and I'm a little fish, and at the table are another little fish and a big fish.
It behooves me to get the big fish to eat the other guy, which means I need to model player-versus-player interactions that are not my own. Thank you. Additionally... keep the mic? All right, I'll keep the mic. Oh, yeah. We can switch that back to the slides. We'll do it. This game is non-zero-sum. So what does that mean? If I play 11 games of chess and win every one but the 11th, I'm winning. If you win 10 hands of poker for $1 each and lose a million dollars on the 11th, you are not winning. Poker is a game of incomplete information. In chess, I can see your pieces and you can see mine. In poker, everything is a secret. As a player you have to keep a statistical probability model of what your opponent has and carry it throughout the game. So how are computers doing against humans at poker? This year: four versus 4, 700 thousand chips. Do not reproduce. So since we are not just in Las Vegas, we are at DEF CON, let's talk about a really hard game: capture the flag. It's being played right now in the valley with about 15. You can tell from the get-go that this is multi-opponent. This is the live network exercise, a big team sport. Let's talk about what the teams are doing. Imagine you have a friend who is not much of a C coder. He's written a whole bunch of new services, and he's giving you a new one and saying, plug it into a server with the best ones in the world. That's capture the flag. First, you have to defend information. The flag is data, in code you have never seen. You want to keep your flag, so you are protecting data, all as fast as you can. Second, you want to take your opponents' flags. You want to [ inaudible ] and take as many flags as you can in as short an amount of time as possible. If somebody hands you that server and says please plug it into this hardware, the clever amongst you will say, "I know how to win this game. I will turn everything off." This is a network exercise. I'm going to talk about what the referee is doing.
It's a gigantic sender of all the data in the game. The game changes over time. The upshot of this is that you don't know who the sender is. The referee, the game organizer, is talking to every piece of software and making sure that it still works. They are connecting to it: if it's an e-mail server, they send you e-mail and make sure it's working correctly. If it's a web server, they make sure all the content is there and that it's up and running as it needs to be. So if you slow the software down or damage it, you lose points. If you turn it off, you lose all your points. Keep your data, take others' data, and don't break any of the software you are trying to defend. This is obviously a game of incomplete information: you don't know the flaws your opponents have. So, sounds hard. Let's continue. How is it that teams play this exercise? If you play a live network defense contest, you might think: it's simple, I'm going to sit down with Wireshark and get to work. And Wireshark won't decode a single packet of data. The reason is that it's all running new protocols and new software. Nessus doesn't have a single vulnerability signature that works. You have to do binary reverse engineering the entire time. You are given binary code and no documentation, and the only way to know how it works is to reverse it, to write your own vulnerability scanner, and to do it as fast as you can while your opponents are trying to do the same thing to you. Search space: the number of atoms in the universe and the size of the Go board are big numbers. It turns out that when you want to reason about the number of inputs into arbitrary unexamined software, we have a good proof that says we don't know anything about software in the general case. We don't know when it's going to halt, let alone what inputs it has. The search space of software is infinite, and it gets harder from there. If you want to explore the state space of software, it's non-trivial. You need to learn how to have a conversation with it. Say I'm an e-mail client calling a brand new e-mail server and I say hello.
And it answers: hello, your sequence number is 50. I don't know what to do next unless I reverse it out. Maybe I need to send 51, or maybe I need to hash it, or do some math on it. Even to explore the state space and know how many positions there are, I need to synthesize logic in order to talk to the software. You have multiple opponents, a non-zero-sum game, and incomplete information. If machines can't win at Go and can't win at poker, do machines have a chance of doing this at all? And that's exactly what we are talking about doing: taking the teams away from the CTF table and letting the machines play. But not just any machine: this one over here. We will fire it up for you guys. [applause] So this year we brought one of these. Next year we are bringing 15 racks. But that one is 1300, with 16 TB of RAM. The whole computer layout is a half megawatt, and we are going to run it, and all that heat, in the Las Vegas summer. Machines are not enough. This contest is about automation, about the software systems that are going to solve this challenge. Let's talk a little bit about why this is feasible. For the last year we've been running the qualifying round. The results of that qualification are free and open, and you can download them. That's everything the machines did in our qualifier: every binary they patched themselves, every vulnerability they built. How much capture the flag did we let the machines play? For scale, at DEF CON capture the flag, teams of up to 80 people have to solve 10 challenges in 48 hours. Ten difficult reverse engineering binary challenges. The machines: 131 in 24 hours. That's machine-scale capture the flag. They were able to synthesize a vulnerability in 75% of the software we released. When I say they proved a vulnerability, I don't mean they scanned binary code and spat out "there's an integer overflow here." No false positives. They were able to create the input and the logic for a binary they had never seen. That means the conversation logic and the input that creates the segfault. We asked them to patch software.
Now, obviously we have conditions on this, because it's easy to do that: start, exit. So what we asked them to do was preserve the original functionality. We used unit tests to make sure the software was reasonably undamaged and not slowed down, performing within limits. Given those preconditions, we know there are 590 bugs, and of those bugs that we can test for, as a field the machines patched 100%. So we think we have believable autonomy in this space. Seven finalists qualified, who built the systems, and we will introduce them to you later. But think about the scale of the capture the flag that we will try to bring onstage next year: 131 binaries, run in a day, live, networked, head to head. That is a whole ton of data to sportscast to you at a live event. So how do we go deeper from a sportscasting perspective? It's easy to have a race-of-the-bars contest, where each team has points. But we wanted to actually be able to see, in real time, structurally, what a great patch looked like, and what a great crashing input or flag capture looked like. To do that we had to build some visualization software, and we decided not to bring you screenshots. We brought a live demo. With that, Jordan, you are on.
>> Thankfully I'm shorter. This mic works. Let's go ahead and pull up what we are talking about here. Mike talked about software that's able to look at other software. If you've done binary reverse engineering, this will be familiar to you. This is strace, a sample syscall log. I'm able to see from here what kernel system calls were made. But I don't see the logic. I don't see comparisons. I don't see certain things, so to do that I'm going to pull out the debugger and look at what it's actually doing. CTF is a command-line game, and understanding it is a command-line exercise.
>> I do have IDA Pro, so that gets you some graphical interface. You are looking at x86. Don't try to read that. This one, though, is the challenge you are running right now. You've seen the source. Hopefully you are working hard at it. I would love to see someone crash it.
300 lines of code. Clapping for something they saw there. Oh, that's good. [ applause ]
>> So we have a little tradition to welcome new speakers. Are these guys doing a good job? [ applause & cheers ]
>> So we have a couple of patches here for these guys that we want to give to them. And we have one for the competitor over there. Thank you, guys. Give them a round of applause. [ applause ]
>> Nice. That makes more sense than source code. So this isn't really all that useful. Let's look at what the program does. It's a customer support message board. It's a classic application. It has two vulnerabilities that we know about, possibly more: that's the beauty of C. I'm going to go ahead and put my name in there. Let's view some threads. "Welcome to our message board." Let's reply to this one: "Your software broke." Like most customer support requests. Check the threads. Oh, good, it's got my message. So we can add new messages. Simple service. Let's exit back out. Now we get to the fun part. Instead of looking at that, that execution right there is loaded up live in here, and maybe it's 37. This is good. We've got lots of input. I don't know if you can tell which one is mine. Let's see if this one is mine.
>> So do you want to tell us technically what this is?
>> This isn't me; I'm looking at someone else's. This is hexes. It's a visualization engine for CGC, and this view is the tracker, a memory trace viewer. That's what we are looking at. This is software running over time, with time, of course, on the x axis; that's how it's done. We start here at the beginning, with the assembly at the bottom left.
>> So the program ran in a dynamic sandbox, the events were recorded, and what you are seeing left to right is execution over time, and the structure that was created by that data being executed by that software.
>> So the fancy explanation is we've got the address mapped onto a curve to maintain locality. What it means is it's a picture that shows what the program did. We took the program and we fed it input.
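A minimal sketch of the "address mapped onto a curve to maintain locality" idea, assuming a Hilbert curve. The talk does not name the curve, so that choice, the function names `hilbert_d2xy` and `address_to_cell`, the one-byte-per-cell layout, and the example base address are all illustrative assumptions rather than a description of the actual visualizer. It shows why consecutive addresses land in neighboring cells, so a tight loop stays compact and a far jump looks far.

```python
from typing import Tuple


def hilbert_d2xy(side: int, d: int) -> Tuple[int, int]:
    """Map distance d along a Hilbert curve to (x, y) on a side-by-side grid.

    side must be a power of two and 0 <= d < side * side.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the quadrant so the sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def address_to_cell(address: int, base: int, side: int = 16) -> Tuple[int, int]:
    """Place an address on the grid (hypothetical layout: one byte per cell)."""
    return hilbert_d2xy(side, address - base)


if __name__ == "__main__":
    base = 0x08048000  # assumed load address, for illustration only
    for addr in range(base, base + 4):
        print(hex(addr), address_to_cell(addr, base))
```

With a mapping like this, any two addresses that differ by one are drawn in adjacent cells, which is the property the speakers rely on when they say that code close together is grouped together.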
And in this case it's what I just did. That's what we are looking at right now. The other traces, these are what you are doing. I'd like to see a little red one. If somebody manages to figure that out, we will see it there. We can step through this trace, let it run, and change some layouts and views of it. We've got system calls. As we saw earlier, this is my use of the program. I've got transmit: I can see output from the program. Allocation: I can see memory. It's a layer on top of Linux. All of these are being shown in the GUI. We've cheated: you don't need source code to trace. This is view thread. And looking at this you can see structure. You can tell the piece that outputs data is over here. This is the region of memory, the piece of code, that is doing word wrapping. It's reading over, bouncing back when it gets to the end. You don't have to know exactly what the assembly is. You can make comparisons without having that.
>> Locality is preserved. Code that is close together is grouped together. So a far jump is a far jump.
>> And likewise a tight loop down here looks thin. This is the code in the original disassembly of the program.
>> Over time is cool. Over time it's able to compare traces. We are still hoping that somebody is going to capture our flag and give us a crashing trace. We will show one now. Should we drop a hint?
>> Yes.
>> String lengths are super interesting. A string length of 30. So I might want to try that, minus 1, plus 1. Anyway.
>> I'm going to go ahead and generate a different use of the program now. I'm going to post a new thread. My subject: "Software." "It's better. My spelling is too." All right. Back to look. That one is not mine. I don't think that's mine.
>> Let's go take a look at that.
>> Here's mine. That's the different one. Let's look at it here. Let's pull them both up. So now, you...
>> You can see the green trace has the red X of doom, which means a security-harmful crash. Signal 11. Generally bad.
>> Well done. We have your name.
If you can tell us your name, that validates that you were the person. And this is yours, the first to solve our capture the flag. What was your name, sir? Funny thing, that. Trust and verify.
>> If you can come up to the table and tell us quietly what you signed in as.
>> You can also tell us the password. There's a root password [ inaudible ]. Do you want to scroll through what happened in this case?
>> So in this case, whoever was able to crash it, you can see outputs we already know. We see output: someone viewed the message board. But in this case there's extra data. It leaks out extra data if your subject line is exactly 30 characters long. So [ inaudible ] reading out the data actually sends way too much data and causes a segmentation fault.
>> It's a memory over-read.
>> So let's look at a patch for that. Patch it for us. We want to see what it looks like when it's fixed.
>> Before, we were using the same software to compare different inputs. Now we will put the same input into different software: the crash generated by the crowd, going into the patched and unpatched versions. You can tell immediately which one is patched. The unpatched one has a huge memory leak off toward the horizon. The patched one ends normally. We should see the moment that they diverge.
>> We can see it right in here. Let's look at a different one: a different patch for a different program.
>> So notice that was a very tight patch. Whoever wrote it knew exactly where to test. They didn't change the program; there wasn't a whole bunch of testing. This is software that is not connected to the internet. This is a completely different program.
>> If you look at the shape of these, clearly it's doing different stuff. There's a flat initialization and broader peaks and valleys. But the two look similar: two runs of the same program. One was able to trigger a problem.
>> The blue one has the X of doom. Let's see what we can learn about this patch. Does anyone see a patch?
>> Remember, a far jump is a far jump.
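The message-board over-read described in this segment can be sketched as follows. This is a simulation under stated assumptions, not the challenge's actual code: the talk only says that a subject line of exactly 30 characters leaks extra data, so the missing-terminator cause, the memory layout, and the names `store_subject`, `read_c_string`, and `read_bounded` are hypothetical. A Python `bytearray` stands in for process memory, and running off its end (an `IndexError`) stands in for the segmentation fault.

```python
SUBJECT_SIZE = 30  # field width from the talk; everything else is assumed


def store_subject(memory: bytearray, offset: int, subject: bytes) -> None:
    """Copy at most SUBJECT_SIZE bytes into the field.

    A NUL terminator is written only when it still fits inside the field,
    so a subject of exactly SUBJECT_SIZE bytes is left unterminated.
    """
    data = subject[:SUBJECT_SIZE]
    memory[offset:offset + len(data)] = data
    if len(data) < SUBJECT_SIZE:
        memory[offset + len(data)] = 0


def read_c_string(memory: bytearray, offset: int) -> bytes:
    """Unpatched read: scan until a NUL byte, however far away it is."""
    end = offset
    while memory[end] != 0:  # IndexError past the end models the crash
        end += 1
    return bytes(memory[offset:end])


def read_bounded(memory: bytearray, offset: int) -> bytes:
    """Patched read: never look past the end of the subject field."""
    field = bytes(memory[offset:offset + SUBJECT_SIZE])
    return field.split(b"\x00", 1)[0]


if __name__ == "__main__":
    memory = bytearray(64)
    memory[30:43] = b"hypothetical!"        # unrelated data next to the field
    store_subject(memory, 0, b"A" * 29)
    print(read_c_string(memory, 0))          # stops inside the field
    store_subject(memory, 0, b"A" * 30)
    print(read_c_string(memory, 0))          # runs on into the neighbor
    print(read_bounded(memory, 0))           # bounded read stays in the field
```

The bounded read is one way a "tight" patch could look: a single length check at the point of the flaw, leaving the rest of the program unchanged.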
In the very beginning it's jumping as far away as it can, calling allocate, and jumping back, and that to me looks like a classic jump patch inserted by the patch author: "I'm going to allocate something to protect the program." You will never see one this fast at CTF. And here's the end.
>> Nice and easy. Something happens at the beginning: an allocation. Maybe it's tracking where memory is legitimately allocated. Instead of crashing, it exits cleanly, with an error code.
>> Stack cookie detection. It looks like detection: a jump away to hand-written code. We have only one more surprise for you about this patch and how it was made.
>> This one was written by a computer. This is a binary from the qualification round that we recently finished, and this is software patching software with no human interaction. Completely autonomous. This software was unknown to the system that generated this patch. It had no access to source code or documentation. It did this all automatically. It decided to submit this to us as its best-case approach, and it healed the vulnerability. And it looks like what it is: an incredibly tight patch for a flaw that it probably knew about.
>> In that case, it was a tight patch at the moment of the flaw, so it was able to detect it exactly there. It went ahead, cleaned up, and exited with a readable error. Those are about 200 thousand. The third one is 900 thousand. So right off the bat it's a much longer execution, but it's still the same program. If you were to pull this up in IDA and step through it in gdb, it would have different functionality. If you look at it here: no system calls, no other interaction until the very end, much like the others. So it's the same shape, same [ inaudible ], and in this case a different program doing the same thing. It's got the same input. So you get an overall shape that looks the same, but it's got a lot more to it. And we can see what it's doing, hiding underneath. I call this the railroad track. If I click along any of the spots on the blue plane, it shows where it was in that binary.
If it's in the same spot, it's in the same spot. This bottom rail, this line here: it was doing the same little check over and over again. We could hone in and look at the assembly. We don't need to, because we know this program successfully defended against this particular flaw. Instead of patching at that flaw, it was checking everywhere. It was a longer run. But it was able to do that without knowing exactly where the flaw is. I like looking at the traces and digging through the qualifier. They almost have a personality, because you see the different approaches the systems take. Here anybody can see it: you can tell right away where the changes are and what's happening.
>> In that third trace, a machine is grappling with uncertainty. It's one thing when Jordan and his team can actually pull out differences in approach. It's another thing to say, "I know which system built that." So which system built this and, more important, who built this system? We started this whole thing with about a hundred teams registered around the world, and we qualified the top seven scoring teams, and they are mostly in the audience with us today. So I want to take a moment to appreciate that. We asked: can machines do this stuff in an adversarial format? And we had researchers say, maybe it's possible, and maybe we can do it. So I want to talk about the teams. You can see the submissions of their systems. The capture the flag operating system is available online. I want those teams to come up right now and join us in front of the stage. If you are a finalist and you are here today, come on up, guys. [ applause ]
>> So when I call you out, please step forward and let everyone give you a round of applause. Our finalists, in no particular order. A team partnership with the University of Virginia: team TECHx. [ applause ] From the home of [ inaudible ] University in PA, this is a team with deep CTF roots: ForAllSecure. Come on up. [ applause ] Team CodeJitsu. This was a much bigger team than was able to make it. An international team.
A whole bunch of folks calling in from Skype. [ applause ] Team disekt, if you play on the CTF circuit. Michael Cotraous. [ applause ] From the University of Idaho, led by Dr. Jim Alves-Foss. [ applause ] Team Shellphish. You know Shellphish, working out of the lab at the University of California, Santa Barbara, led by Yuns. [ applause ] And team Deep Red, led by Dr. Tim. Say hi, everybody. [applause] These are our seven finalists, and I want to close with a few parting thoughts. First, why are we doing this? Why teach computers to play this game? Because it's not just a game: it's one of the toughest applied reverse engineering contests on earth. And these are the hard skills of computer [ inaudible ], hard skills where right now machines have no chance. I'm from DARPA. That means we try to invent the technology of the future, and it's really easy to understand why we need something that can react to a new threat and attack in real time. There was time to deny before. These are the pioneers who signed up to do it. I want to send out a big thank you to the DEF CON conference and to everyone on the team who let us put this big logistics move into the hotel. Thank you very much. I want to close with the most important word I've said all day. It's not cyber and it's not DARPA. This talk is Thursday, August 6. The machines play capture the flag the day before DEF CON. That means if you want to come back to this room when we triple the size, put in 15 of these racks, and play capture the flag, you need to show up one day early. I'd like to close with a round of applause for these teams. By this time next year, on that machine, live in this room, the machines play capture the flag. Thank you very much. [ applause ]
>> And I think we close this up with time for questions.
>> If you have questions, come down here. I have the mic.
>> That guy has the questions mic.
>> There's one important question. Looks interesting.
>> So what do you guys think about inviting the winning team to play against humans at DEF CON capture the flag next year?
>> I'm going to repeat the question. I believe we have a member of the Legitimate Business Syndicate, who run capture the flag, with us, and he asked if the machines will play humans next year. I can't ask these teams to do that. Well, you can. Show of hands: would you guys do that if we set it up? [applause] Okay. If you guys will play, we will keep the computers on. If you will basically tell everybody that it is a fair and open contest for machines and people, good? Okay. We've got a deal. [ applause ] And for the record, I think it's probably early. But if you guys will make the game open to a machine entering, and the team wants to play, we will keep the machines on, absolutely.
>> Awesome.
>> Great presentation. Just a couple of observations and two questions. I want to start off with [ inaudible ] a lot of discussion of the domain, by way of checkers, poker, and CTF, and throwing a lot of power at this. So is it safe to say this is mainly a brute-force type of problem? Or, like Demis at DeepMind, can you create game AI in a naturalistic way, from neuroscience, or is that just science fiction? Sorry. A lot of rants there.
>> I think the question is basically how feasible this is and what category of problem it fits into. The world's top programmers' minds are literally standing right in front of me. You should ask them. We don't know what kind of problem it is; we will find out. Is that it?
>> That was a cool visualization of the tracing. What did you use to trace instructions to do that? An emulator?
>> How were the traces generated in this case? We've got a patched QEMU. You can substitute that with anything. You can use anything that does execution tracing.
>> Are you going to make your visualization software available after the contest?
>> So far everything that's part of the infrastructure, from the operating system we built for CTF up to all the challenges, is going to be released to the public, and we are also releasing everything that happens in the event.
We have an open track for almost everything we are building, but for that software, I'll let him answer the question.
>> That's joint work with another company that generates bits of it. We are still working on that, mainly because it's an early prototype. We wouldn't rule it out, that's for sure.
>> Thanks.
>> Thank you, guys. Let's give them another big round of applause. [ applause ]