1 00:00:00,600 --> 00:00:03,900 ALEJANDRO CACERES: Thank you, mistress. 2 00:00:03,900 --> 00:00:09,740 So that all just happened. (laughter) Where do we even go from here, 3 00:00:09,740 --> 00:00:15,849 really? All right. Let's get started with the talk. Let's talk about some computer 4 00:00:15,849 --> 00:00:21,519 security and not wooden paddles. Who asked that? 5 00:00:21,519 --> 00:00:27,729 Are you nervous? ALEJANDRO CACERES: I can't even see. Who 6 00:00:27,729 --> 00:00:32,090 is that? I'm in pain, but thank you. You are welcome. 7 00:00:32,090 --> 00:00:39,090 ALEJANDRO CACERES: Oh, it's the girl that spanked me. Oh, no, I think you got me good. 8 00:00:41,480 --> 00:00:48,480 Thank you, though. No, I'm good. Thanks. Appreciate it. That was great. 9 00:00:49,040 --> 00:00:52,720 So that's like a little hidden perk that they don't tell you about for the speaker's package. 10 00:00:52,720 --> 00:00:57,020 You get a nice cool badge. You get access to the speaker's room and a little bit of 11 00:00:57,020 --> 00:01:01,400 ass touching. Anyway, let's go ahead and get started. So 12 00:01:01,400 --> 00:01:06,350 welcome, everyone. Thanks a lot for coming to my talk. I hope everyone's enjoyed the 13 00:01:06,350 --> 00:01:11,640 conference so far. This talk's on massive attacks with Open Source distributed computing, 14 00:01:11,640 --> 00:01:15,819 and obviously I'll tell you all ‑‑ what all those words mean together here in just 15 00:01:15,819 --> 00:01:21,579 a minute. I hope you guys enjoy it. So who am I? Just so you know who is up here 16 00:01:21,579 --> 00:01:27,869 talking at you, I'm Alejandro Caceres. You can call me Alex. I'm the owner/founder of 17 00:01:27,869 --> 00:01:34,030 Hyperion Gray. It's just a small R & D and Open Source start‑up. We are completely 18 00:01:34,030 --> 00:01:38,469 focused on the nexus between distributed computing and offensive security.
So I think there's 19 00:01:38,469 --> 00:01:42,969 huge potential in the field. Hopefully after this talk you guys agree with me. 20 00:01:42,969 --> 00:01:48,229 So I studied physics back in college. Most of my research was focused on kind of distributed 21 00:01:48,229 --> 00:01:53,520 computing with scientific experiments. Now I'm really hoping to branch out into breaking 22 00:01:53,520 --> 00:01:57,909 shit with that. That's where I'm at. I'm also the founder of the PunkSPIDER project. 23 00:01:57,909 --> 00:02:03,709 Anybody heard of the PunkSPIDER project? Oh, sweet. More than, like, five people. More 24 00:02:03,709 --> 00:02:07,340 than I expected. Awesome. So I won't say too much about it because we 25 00:02:07,340 --> 00:02:11,710 are going to get into it here in just a couple slides. So don't worry about it. 26 00:02:11,710 --> 00:02:17,320 So as a little background, I came up with this talk after I presented PunkSPIDER at ShmooCon. 27 00:02:17,320 --> 00:02:24,320 Word got back to the CEO at the time that I was building a cyber weapon, PunkSPIDER 28 00:02:25,950 --> 00:02:32,950 is a community. So after laughing about that for a minute, I kind of got to thinking, you 29 00:02:33,430 --> 00:02:38,950 know, what would it take to actually build a distributed attack platform, right? So the different 30 00:02:38,950 --> 00:02:42,990 examples that I'll show you here today are just kind of what came out of tinkering 31 00:02:42,990 --> 00:02:48,470 with that idea. So there's also three demos in this talk. It's really highly demo focused. 32 00:02:48,470 --> 00:02:55,470 So stick around. It will be a lot of fun. My ass hurts, too. 33 00:02:56,430 --> 00:03:03,060 Anyway, so let's get into it. To start off, distributed computing is really big right 34 00:03:03,060 --> 00:03:08,090 now. You've heard a lot about it. There's all kinds of IBM commercials and stuff like 35 00:03:08,090 --> 00:03:15,090 that. You hear big data a lot. It's a nice little buzzword.
So a big reason for that is that 36 00:03:15,330 --> 00:03:18,450 we've seen some really cool stuff come out that makes distributed processing 37 00:03:18,450 --> 00:03:23,700 really, really easy. The short of it is pretty much what folks are doing with this is lots 38 00:03:23,700 --> 00:03:30,700 of powerful analytics. Kind of cool. Analytics are cool. We all like that. But I'm not really 39 00:03:31,420 --> 00:03:35,450 into that kind of thing. It bores me a little bit. So I've been looking for more interesting 40 00:03:35,450 --> 00:03:39,570 use cases for distributed computing. A couple of technologies that have come out 41 00:03:39,570 --> 00:03:46,570 are Apache Hadoop. We'll get all up in that in a few minutes and I won't go too far up 42 00:03:49,900 --> 00:03:55,240 into that right now. You might ask, you know, if analytics bores 43 00:03:55,240 --> 00:04:00,760 you, what is some fun stuff we can do with distributed computing? The answer is massive 44 00:04:00,760 --> 00:04:04,440 attacks with Open Source distributed computing, which you might notice is the title of my 45 00:04:04,440 --> 00:04:09,420 talk. So what's the high level idea behind distributed 46 00:04:09,420 --> 00:04:14,430 attacks? What exactly do I mean when I say something like massive attacks; right? So 47 00:04:14,430 --> 00:04:18,780 what I'm talking about here is conducting really well‑known, often effective attacks, 48 00:04:18,780 --> 00:04:24,680 stuff that has a relatively high rate of success, and then doing that hundreds ‑‑ hundreds 49 00:04:24,680 --> 00:04:30,060 of thousands or even millions of times in a really coordinated and effective manner. 50 00:04:30,060 --> 00:04:34,680 So what I found in my research into this so far is that ‑‑ hopefully this isn't too 51 00:04:34,680 --> 00:04:40,000 much of a spoiler ‑‑ you'll break into so many things. Part of the problem is 52 00:04:40,000 --> 00:04:44,520 what do I do with all this broken shit?
What do I do with all this information from the 53 00:04:44,520 --> 00:04:48,570 stuff I've broken? We're not going to get too far into what 54 00:04:48,570 --> 00:04:53,350 we do with that information afterwards. We'll be more interested in the breaking of things, 55 00:04:53,350 --> 00:04:59,520 if you will. Everybody with me so far? Cool. Some head nods. 56 00:04:59,520 --> 00:05:06,410 All right. So let's define what we mean by a distributed attack. By this I mean an attack 57 00:05:06,410 --> 00:05:12,160 that uses various computing resources in an effective and coordinated manner. So why do 58 00:05:12,160 --> 00:05:18,380 we want to do this, really? Why is this going to be to our advantage? One reason is the time required 59 00:05:18,380 --> 00:05:22,490 to attack a massive number of things. Again, remember I'm talking about hundreds of thousands 60 00:05:22,490 --> 00:05:27,810 or millions of things all at once. It could take a really long time to do this. 61 00:05:27,810 --> 00:05:31,900 So you don't want to be waiting months, even potentially a year or years for an attack 62 00:05:31,900 --> 00:05:37,460 to finish. It's not just annoying and impractical, but it also allows response 63 00:05:37,460 --> 00:05:42,270 teams at the particular target to respond in lots of different and complex ways. So 64 00:05:42,270 --> 00:05:47,260 you kind of want to bang out the attack, get in and get out, sort of thing. 65 00:05:47,260 --> 00:05:52,610 So just to give you an example, picture a target of, like, 250,000 web applications, 66 00:05:52,610 --> 00:05:57,990 for example, associated with a particular target; right? So this could be every web 67 00:05:57,990 --> 00:06:04,600 application associated with a country, for example. So let's say you try to run basic 68 00:06:04,600 --> 00:06:10,690 web app fuzzing followed by an automatic SQL injection kind of thing.
With an optimistic estimate 69 00:06:10,690 --> 00:06:15,960 doing this in a nonparallel way, it might be something like a minute per target. That's 70 00:06:15,960 --> 00:06:22,960 pretty optimistic. You end up with 173 days, 174 days to actually finish that attack. We 71 00:06:23,419 --> 00:06:30,110 don't want to wait that long for obvious reasons. If you think that is unrealistic: you heard 72 00:06:30,110 --> 00:06:37,110 me mention PunkSPIDER, and we've done checks on 1.3 or 1.4 million sites so far. Our target 73 00:06:37,180 --> 00:06:44,130 is 250 million sites. It's a completely realistic target when you're talking about really large 74 00:06:44,130 --> 00:06:51,130 attacks. So why else? So sometimes you need a little 75 00:06:52,070 --> 00:06:55,970 bit of coordination between your computing resources. To illustrate this, again, picture 76 00:06:55,970 --> 00:07:00,800 a large scale attack on a massive network, maybe a fairly significant portion of the 77 00:07:00,800 --> 00:07:05,040 Internet, for example. And let's say that you realize that in order to conduct that 78 00:07:05,040 --> 00:07:09,020 attack on a large scale you'll need more computing power. Like I said, we don't want this attack 79 00:07:09,020 --> 00:07:16,020 to take too long. In a noncoordinated manner, you might spin up some cloud servers, sort of in the dumb 80 00:07:16,540 --> 00:07:22,490 way you would expect, and have a little script that runs and executes a few shell commands 81 00:07:22,490 --> 00:07:27,639 on each machine, for example. You start running into a whole bunch of problems 82 00:07:27,639 --> 00:07:32,889 with that; right? If anybody's ever tried an attack like that, you know this. So you 83 00:07:32,889 --> 00:07:39,110 might want to know, like, when an attack is actually finished on one of those nodes; right? 84 00:07:39,110 --> 00:07:43,400 So once a node has finished its part in the attack, you have just freed up some computing 85 00:07:43,400 --> 00:07:49,320 resources.
In order to make it as efficient as possible, you'll want to run more stuff 86 00:07:49,320 --> 00:07:54,639 on that node. That's really not going to be possible in this way. You could hack something 87 00:07:54,639 --> 00:07:59,919 out, but it's not gonna be ideal. So another issue that you run into is how 88 00:07:59,919 --> 00:08:03,580 do you actually make sure that your computing resources are kind of pushed to the limit? 89 00:08:03,580 --> 00:08:07,090 You might have lots of different types of servers. Maybe you're running this out of 90 00:08:07,090 --> 00:08:13,070 your basement somewhere on commodity hardware. How do you actually know that all these resources 91 00:08:13,070 --> 00:08:19,680 are being pushed to the limit? You could hack out some threading code, something that monitors 92 00:08:19,680 --> 00:08:24,360 the resources on a particular machine and ensures it's using them all at once. Again, 93 00:08:24,360 --> 00:08:28,400 it's not going to be ideal or you'll spend a significant amount of time on that. 94 00:08:28,400 --> 00:08:35,400 We want to be able to do this relatively easily. If all of this sounds hard to you, there are 95 00:08:37,310 --> 00:08:42,750 advances in the field that make this not hard to do, that solve every single problem associated 96 00:08:42,750 --> 00:08:48,060 with using large numbers of nodes to conduct a coordinated attack. 97 00:08:48,060 --> 00:08:51,930 We'll get into talking about some of these and then move right into the three examples 98 00:08:51,930 --> 00:08:55,770 and three demos that I talked about. So for the most part we'll be talking about one of 99 00:08:55,770 --> 00:08:59,399 the best and most popular tools out there for distributed computing, which is Apache 100 00:08:59,399 --> 00:09:04,620 Hadoop. How many of you are familiar with Apache Hadoop and know all about it already? 101 00:09:04,620 --> 00:09:09,490 That's way more than I expected.
We need to go over some background on what 102 00:09:09,490 --> 00:09:16,490 Hadoop is. Bear with me if you know all this. I've used a couple different protocols for 103 00:09:18,300 --> 00:09:23,870 message passing, for distributed computing, like MPI, which mainly has support for Fortran 104 00:09:23,870 --> 00:09:30,870 and C. I had to deal with Fortran MPI and it was a pain in the ass, even Fortran 77, 105 00:09:33,260 --> 00:09:39,550 which is ancient stuff. If we get into how Hadoop works, which is 106 00:09:39,550 --> 00:09:44,460 through MapReduce, what I'll show you is that, if it's implemented right, and it is in Apache 107 00:09:44,460 --> 00:09:51,460 Hadoop, it's really easy to code and you don't have to do that much work. I'll show you how you would 108 00:09:53,240 --> 00:09:56,430 do that. I've mentioned MapReduce a couple times 109 00:09:56,430 --> 00:10:01,130 already, but what exactly is it; right? How many of you guys are familiar with MapReduce, 110 00:10:01,130 --> 00:10:07,270 the parallel programming concept? Cool. Awesome. Pretty good amount of people. Awesome. 111 00:10:07,270 --> 00:10:10,860 Let's say you have a problem that you'd like to distribute across nodes. This is how 112 00:10:10,860 --> 00:10:16,050 MapReduce works. You would start out with what's called a map function. So I'm actually 113 00:10:16,050 --> 00:10:21,730 going to go very in depth into what MapReduce is. It might appear a little bit confusing 114 00:10:21,730 --> 00:10:26,410 why we are doing things the way we are. There are a couple more slides on this that will 115 00:10:26,410 --> 00:10:32,890 illustrate all of that for you guys. Also, a couple really good examples. Just bear with 116 00:10:32,890 --> 00:10:38,529 me if you don't get all this all at once. So first thing you do is you write a map function. 117 00:10:38,529 --> 00:10:43,230 The map function is really simple.
It takes in data as key-value pairs and outputs a set 118 00:10:43,230 --> 00:10:49,310 of key-value pairs as its result. That function is written such that it's a single operation 119 00:10:49,310 --> 00:10:54,820 on a single key-value pair. So as the person writing it, you're just writing 120 00:10:54,820 --> 00:10:58,520 this for one input at a time. You don't have to worry about all that massive amount of 121 00:10:58,520 --> 00:11:04,160 data. You're writing it for one input record at a time only. This is automatically distributed 122 00:11:04,160 --> 00:11:09,089 across the cluster in Hadoop, this operation for each of your key-value pairs. Each machine 123 00:11:09,089 --> 00:11:12,890 in the cluster has the map function and it has a set of key-value pairs that it's responsible 124 00:11:12,890 --> 00:11:17,649 for doing whatever operation it is that you'd like your map function to do. I like to think 125 00:11:17,649 --> 00:11:22,680 of the map step as the part that generates somewhat processed big data, if you will, in a distributed 126 00:11:22,680 --> 00:11:26,660 manner. It's usually not the solution to your problem, although sometimes it can be. 127 00:11:26,660 --> 00:11:31,670 It's pretty simple. All it is is inputting key-value pairs, running a map function leveraging 128 00:11:31,670 --> 00:11:37,050 all the machines in the cluster, and then outputting key-value pairs after that. Pretty 129 00:11:37,050 --> 00:11:39,220 simple. After the map step is done, you move on to 130 00:11:39,220 --> 00:11:44,300 the reduce step. There can be some intermediate steps for the processing, but generally you 131 00:11:44,300 --> 00:11:48,520 would move to the reduce step. The input of the reduce step is really simply just 132 00:11:48,520 --> 00:11:53,060 the output of the previous map step.
So a partitioner is going to take the values of 133 00:11:53,060 --> 00:11:57,339 the map step with common keys and distribute them such that one node in the cluster is 134 00:11:57,339 --> 00:12:03,810 responsible for running the reduce function on all the values with common keys. So this is, again, 135 00:12:03,810 --> 00:12:07,420 distributed across the entire cluster. The reducer is usually the part that gives you 136 00:12:07,420 --> 00:12:12,000 the solution to the problem. And I know that was, like, a lot of words that I just said 137 00:12:12,000 --> 00:12:14,980 at you, and it might be a little bit confusing. There are a couple slides that might clarify 138 00:12:14,980 --> 00:12:21,980 this if it's not completely clear to you yet. I'll actually shut up for a second. I'll let you read 139 00:12:23,910 --> 00:12:30,910 through this, and I'll read through it myself. This will clear it up, along with the example after 140 00:12:31,050 --> 00:12:38,050 it. Here is all that's happening, in summary, in MapReduce. You have inputs to 141 00:12:53,680 --> 00:13:00,680 the map function. The map function yields a list of results, each with a key and a value. So that can be something 142 00:13:02,360 --> 00:13:07,110 like key sub two, value sub three, key sub two, value sub four, and so on and so forth 143 00:13:07,110 --> 00:13:12,970 for each key-value pair. All the values with the same key, key sub two, are logically grouped 144 00:13:12,970 --> 00:13:19,339 together. Our reducer function would be applied to each group in parallel 145 00:13:19,339 --> 00:13:22,850 and then yield something in return. So these would usually be what we would call 146 00:13:22,850 --> 00:13:29,269 our results. So a common question when you're kind of first dealing with MapReduce is 147 00:13:29,269 --> 00:13:33,660 why do we do it that way? What's the use of having the values with the same key grouped 148 00:13:33,660 --> 00:13:40,660 together?
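As a rough illustration of the map, group, reduce flow summarized above, here is a minimal single-process Python sketch. It is not Hadoop code and not anything shown in the talk: `map_fn`, `reduce_fn`, and `run_map_reduce` are hypothetical names, the even/odd example is made up, and the grouping that Hadoop's partitioner performs across nodes is imitated here with an ordinary dictionary.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map step: one input record in, zero or more (key, value) pairs out."""
    # Made-up example: tag each number as even or odd. The input key is ignored.
    yield ("even" if value % 2 == 0 else "odd", value)


def reduce_fn(key, values):
    """Reduce step: every value that shares one key in, one result out."""
    return key, sum(values)


def run_map_reduce(records, map_fn, reduce_fn):
    # Map: each record is processed independently, so a real cluster
    # can spread this loop across many machines.
    mapped = []
    for key, value in records:
        mapped.extend(map_fn(key, value))

    # Group: collect the values that share a key. On a cluster this is
    # the partitioning step that routes one key to one node.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # Reduce: one call per key group.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


# Input keys are None because only the values matter in this example.
records = [(None, n) for n in [1, 2, 3, 4, 5, 6]]
result = run_map_reduce(records, map_fn, reduce_fn)
print(result)
```

Because each reduce call sees every value for its key and nothing else, the per-key result can be computed without any coordination between groups, which is the property the grouping step exists to provide.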
I'll show you why we do that here in just a minute in the next slide. 149 00:13:40,839 --> 00:13:46,050 So a few things to keep in mind: Once you write a map and a reduce, Hadoop will distribute 150 00:13:46,050 --> 00:13:53,050 it to the nodes and slaves automatically. Anything that's distributing things and dealing with 151 00:13:54,290 --> 00:14:00,390 where things happen, or why things happen in the places that they do happen, Hadoop takes 152 00:14:00,390 --> 00:14:07,390 care of: a lot of compute distribution, automated partitioning across remote nodes, automated 153 00:14:07,450 --> 00:14:14,290 assurance that the job's going to get done. If, for example, you have a node that goes 154 00:14:14,290 --> 00:14:21,290 down, Hadoop just very seamlessly detects that. It takes it a step further in dealing 155 00:14:22,769 --> 00:14:27,740 with nodes that go down by actually expecting that nodes will go down. You can run it on really 156 00:14:27,740 --> 00:14:34,350 shitty hardware, I do all the time, and get really solid results from it. So what else? 157 00:14:34,350 --> 00:14:37,790 There's also a few configuration items that you can set in Hadoop that are really useful. 158 00:14:37,790 --> 00:14:42,350 I mentioned before that you want to be able to push your resources to their absolute limit; 159 00:14:42,350 --> 00:14:46,610 right? You can do that very easily with Hadoop, with just a couple lines of configuration. 160 00:14:46,610 --> 00:14:50,540 You don't have to deal with going to each of your nodes and figuring out some kind of 161 00:14:50,540 --> 00:14:54,600 code to make sure your resources are all being pushed to the limit. Hadoop will do all of 162 00:14:54,600 --> 00:14:59,690 that for you with just a couple of configuration items. Pretty cool. 163 00:14:59,690 --> 00:15:06,690 So let's get into the specific example. First off, I have very few complaints about the 164 00:15:07,870 --> 00:15:14,870 distributed computing community and Apache.
If you look up MapReduce and just Google 165 00:15:18,470 --> 00:15:24,329 it and find some examples of it, the only freaking thing you'll find is a word count 166 00:15:24,329 --> 00:15:30,360 example. So that's really annoying, because once you start seeing the same example again, 167 00:15:30,360 --> 00:15:34,930 if you don't quite get it at first, you want to see another simple example that will kind 168 00:15:34,930 --> 00:15:39,750 of help you out with that. So it always seems to me like with Hadoop you're either reading 169 00:15:39,750 --> 00:15:46,750 a word count example or you have to pore through hundreds of lines of Java code. Also, word 170 00:15:47,200 --> 00:15:53,380 counts are really, really boring. It essentially counts the instances of a word in a particular 171 00:15:53,380 --> 00:16:00,380 piece of text. So that's kind of lame. This example is a tool called PunkSCAN, free, 172 00:16:08,769 --> 00:16:14,670 that Hyperion Gray released. We'll get into it. Picture a situation where you have 173 00:16:14,670 --> 00:16:21,110 a list of URLs. You have a ton of URLs, potentially like a few hundred thousand or even a million. 174 00:16:21,110 --> 00:16:26,470 So we want to be able to perform a MapReduce job in Hadoop to fuzz these URLs quickly 175 00:16:26,470 --> 00:16:31,420 and search for vulnerabilities on the pages. Another constraint we'll place on the job 176 00:16:31,420 --> 00:16:38,420 is we want all the vulnerabilities for a site to be grouped together. You don't want a bunch of 177 00:16:40,180 --> 00:16:47,180 disparate vulnerabilities on their own without knowing who they belong to. This is where you will see the way that 178 00:16:49,120 --> 00:16:53,470 MapReduce works, by grouping the values with common keys together during the reduce step, is going 179 00:16:53,470 --> 00:16:57,510 to help you a lot. So are we still good? Everybody still good? 180 00:16:57,510 --> 00:17:02,810 Could I get some head nods from everybody? Cool.
181 00:17:02,810 --> 00:17:09,380 What's the job flow look like within something like PunkSCAN? As I mentioned before, we start 182 00:17:09,380 --> 00:17:16,380 with the mapper step, inputting key-value pairs. We care about a list of URLs in this 183 00:17:17,199 --> 00:17:23,059 case. Our input key will be none. We don't really care about a key in this case. Our 184 00:17:23,059 --> 00:17:27,260 URL will be the value. Essentially what this does is it makes it just a dumb list. We're 185 00:17:27,260 --> 00:17:33,840 not associating any keys with the specific URLs that come in. Not yet at least. We apply 186 00:17:33,840 --> 00:17:40,000 the mapper to each URL in parallel. The mapper just essentially fuzzes the URL using a really 187 00:17:40,000 --> 00:17:44,429 simple fuzzing library that I wrote and then determines the domain of the URL. That's it. 188 00:17:44,429 --> 00:17:51,419 That's all the mapper's going to do. So after that, it yields its output, and its output 189 00:17:51,419 --> 00:17:55,630 is going to be the domain of the URL fuzzed as the key and the list of vulnerabilities 190 00:17:55,630 --> 00:18:02,630 for that URL as the value: key being the domain, value being the list of vulnerabilities. 191 00:18:08,220 --> 00:18:11,900 All this will get distributed across the cluster for you. The URLs are going to be 192 00:18:11,900 --> 00:18:18,900 fuzzed in parallel as much as possible, completely in an automated way using Hadoop. We don't 193 00:18:22,400 --> 00:18:27,160 have to write that logic ourselves, which is really, really useful. Now, the domain 194 00:18:27,160 --> 00:18:32,549 of the URL fuzzed is the output key of the mapper as well as the input key of the reducer. So keep 195 00:18:32,549 --> 00:18:38,450 that in mind. The reducer function runs as a single process for each group of URLs with a common domain, 196 00:18:38,450 --> 00:18:45,450 and the groups are processed in parallel, one per common domain.
All that, of course, is going 197 00:18:45,679 --> 00:18:51,140 to get distributed across the cluster as well. What you're seeing already is that each domain 198 00:18:51,140 --> 00:18:56,410 is going to get handled by a particular node at a time in a specific reduce step. 199 00:18:56,410 --> 00:19:03,410 Now, why is that actually useful? The reducer function is just a combined ‑‑ just outputs ‑‑ 200 00:19:03,640 --> 00:19:07,750 it does ‑‑ sorry ‑‑ all the reducer function does is combine the list ‑‑ 201 00:19:07,750 --> 00:19:14,750 I think that vodka is hitting me like right about now, by the way. I'm all fucked up. 202 00:19:16,110 --> 00:19:22,020 Anyway, all the reducer function does is combine the lists of vulnerable pages into one big 203 00:19:22,020 --> 00:19:27,919 list for a specific domain. Then it will index them to a back end search engine. For PunkSPIDER 204 00:19:27,919 --> 00:19:32,400 we're using Apache Solr as our back end, which wasn't that tough a choice because we 205 00:19:32,400 --> 00:19:39,400 were running a search engine with an Apache Solr back end. Overall that's pretty simple; right? 206 00:19:40,890 --> 00:19:46,950 But how easy is it to code, really? I keep mentioning it and it's still kind of abstract 207 00:19:46,950 --> 00:19:52,320 to you guys. I mention it's easy. What do I mean by easy? A hundred lines of code? What 208 00:19:52,320 --> 00:19:56,020 is it? I wanted to show you this. Don't worry 209 00:19:56,020 --> 00:20:02,000 about actually reading all of this and doing a thorough code review or anything like that, 210 00:20:02,000 --> 00:20:07,049 but just take a look at it. If you notice, it's about 12 lines of actual code. It's written 211 00:20:07,049 --> 00:20:12,870 in Python. So that's one. This is our mapper right here. And up next is going to be our 212 00:20:12,870 --> 00:20:19,539 reducer, and our reducer is just like ridiculously simple. It's like six lines of code.
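The slide code itself is not part of this transcript, so what follows is only a hedged sketch of what a mapper and reducer of this shape could look like, not the PunkSCAN source. It assumes a streaming-style job in which each step reads text lines and writes tab-separated key/value lines. `fuzz_url` is a hypothetical offline stand-in for a real fuzzing library, the example URLs are invented, and the Solr indexing step described above is left out.

```python
import json
from itertools import groupby
from urllib.parse import urlparse


def fuzz_url(url):
    """Hypothetical stand-in for a real fuzzer; returns a list of finding labels.

    It only flags URLs that carry a query string, so the sketch runs offline
    and sends no traffic anywhere.
    """
    return ["has-parameters:" + url] if urlparse(url).query else []


def mapper(lines):
    """Map step: one URL per input line -> 'domain<TAB>JSON list of findings'."""
    for line in lines:
        url = line.strip()
        if not url:
            continue
        domain = urlparse(url).netloc
        yield f"{domain}\t{json.dumps(fuzz_url(url))}"


def reducer(sorted_lines):
    """Reduce step: input arrives sorted by key; merge all findings per domain."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for domain, group in groupby(pairs, key=lambda kv: kv[0]):
        merged = []
        for _, findings in group:
            merged.extend(json.loads(findings))
        yield f"{domain}\t{json.dumps(merged)}"


urls = [
    "http://a.example/index.php?id=1",
    "http://b.example/about",
    "http://a.example/search?q=x",
]

# Sorting between the two steps imitates the sort-by-key that happens
# between map and reduce on a real cluster.
output = list(reducer(sorted(mapper(urls))))
for line in output:
    print(line)
```

In a real streaming job the two functions would live in separate scripts that read `sys.stdin` and print to standard output; keeping them as plain generator functions here makes the flow easy to run and inspect in one file.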
213 00:20:19,539 --> 00:20:25,780 You notice a couple things in the mapper and reducer. First off, as I mentioned, they are 214 00:20:25,780 --> 00:20:32,320 written in Python. What we have done is used Hadoop Streaming, which uses standard in and standard 215 00:20:32,320 --> 00:20:38,039 out, to set up the job properly. I don't want to get into how exactly you would use that, 216 00:20:38,039 --> 00:20:42,770 but suffice it to say it's a bash one‑liner to run a job in MapReduce after you've written 217 00:20:42,770 --> 00:20:49,770 your mapper and your reducer. If you're the kind of person that wants nitty‑gritty details, 218 00:20:53,010 --> 00:21:00,010 follow me on Twitter. I'll be giving out the particulars there and on the blog, if you want to keep 219 00:21:01,760 --> 00:21:05,140 in touch. Another thing I wanted to point out is that 220 00:21:05,140 --> 00:21:09,520 the mapper and the reducer that I showed you are really the only part of PunkSCAN that's 221 00:21:09,520 --> 00:21:14,230 distributed computing focused. In other words, if you were to actually download PunkSCAN, 222 00:21:14,230 --> 00:21:21,230 which you can off of Bitbucket, it's pretty standard stuff. We're not doing anything too 223 00:21:21,270 --> 00:21:28,270 crazy to distribute this code. It's a standard fuzzing library, some Solr indexing stuff, 224 00:21:29,530 --> 00:21:34,520 some other fairly simple things, but then you see also a mapper and reducer, which, 225 00:21:34,520 --> 00:21:38,039 again, is the only part of it that's really distributed computing focused. What I'm trying 226 00:21:38,039 --> 00:21:43,419 to get at here is there is nothing too mysterious about writing your own distributed focused 227 00:21:43,419 --> 00:21:48,789 code. It's all ‑‑ if you understand the base concepts, you'll really be able to write 228 00:21:48,789 --> 00:21:52,870 distributed attack code relatively easily. This guy is falling asleep, by the way. That's 229 00:21:52,870 --> 00:21:59,350 killing me.
Hopefully that will prevent that from happening. 230 00:21:59,350 --> 00:22:06,350 What's that? Drink me or him? Both. 231 00:22:14,570 --> 00:22:21,570 (applause) (laughter) Drinking will keep him from falling 232 00:22:37,910 --> 00:22:41,710 asleep; right? (laughter) 233 00:22:41,710 --> 00:22:45,510 Great idea. (laughter) 234 00:22:45,510 --> 00:22:48,720 So demo time. I keep mentioning that this talk has a bunch of demos, but all I've been 235 00:22:48,720 --> 00:22:55,720 doing is talking at you. We need to stop that. All right. So first off, the first demo I'll 236 00:22:55,799 --> 00:23:02,030 show you, this is PunkSPIDER. Obviously, first thing we want to do here is read the banner. 237 00:23:02,030 --> 00:23:09,030 We're providing a lot of vulnerability information on stuff we don't own. The goal is to provide free 238 00:23:11,520 --> 00:23:16,110 information to website users and owners regarding website security status. If you go on the 239 00:23:16,110 --> 00:23:21,669 site and look for vulnerabilities, what I'm looking for is if you're a site owner or site 240 00:23:21,669 --> 00:23:27,160 user, you want to know the vulnerability state of that site. If you're giving your credit 241 00:23:27,160 --> 00:23:31,840 card number or personal information, you want to make sure that's not being leaked all over 242 00:23:31,840 --> 00:23:36,130 the place. That's really what the site is being used for. 243 00:23:36,130 --> 00:23:41,070 Don't be a dick. (laughter) 244 00:23:41,070 --> 00:23:44,030 So a couple things you can see here. Can everybody see that okay? Does that come out all right 245 00:23:44,030 --> 00:23:48,460 over there? Perfect. So a couple things we can do here. We can 246 00:23:48,460 --> 00:23:53,049 search by a particular URL or by the title of a site. So we'll just go ahead and search 247 00:23:53,049 --> 00:23:58,919 by URL. Down here is where you specify the specific vulnerabilities that you would like 248 00:23:58,919 --> 00:24:05,919 to search for.
So we'll go ahead and check all of them. And this changes it from an AND to an OR, 249 00:24:07,240 --> 00:24:13,900 so we'll see any site with the search term that I type in with any type ‑‑ any of 250 00:24:13,900 --> 00:24:17,830 these types of vulnerabilities. These are blind SQL. 251 00:24:17,830 --> 00:24:24,260 Google.com. I'll do you one better. We'll search for 252 00:24:24,260 --> 00:24:30,270 every single site that has vulnerabilities in it. It supports wildcard characters. You 253 00:24:30,270 --> 00:24:33,650 can go in and type a little star and you get absolutely everything in the database and 254 00:24:33,650 --> 00:24:37,070 it will be dumped back to you. It will take it just a second to search, because that's 255 00:24:37,070 --> 00:24:40,320 actually a large query right there, but not too long. 256 00:24:40,320 --> 00:24:46,710 If we scroll down, you'll start seeing sites that are essentially a mess; right? These 257 00:24:46,710 --> 00:24:50,510 are vulnerable sites, and if you were a user giving your personal information to any 258 00:24:50,510 --> 00:24:55,720 of these sites you would be pretty pissed off; right? Let's go down to the bottom. We 259 00:24:55,720 --> 00:25:02,250 actually see the number of pages of vulnerable sites is 6,166. Just to be clear on this, 260 00:25:02,250 --> 00:25:09,250 there were a lot of articles on PunkSPIDER after we presented it at ShmooCon. But this is 6,166 261 00:25:12,210 --> 00:25:19,210 pages of vulnerable domains. So within each domain we can have several vulnerable 262 00:25:19,840 --> 00:25:26,840 websites and vulnerable pages. We have ten domains per page. So that's 61,660 vulnerable 263 00:25:27,799 --> 00:25:34,480 domains. Within each domain, if we go ahead and expand it, searching for more than 264 00:25:34,480 --> 00:25:40,450 one, within each page for each domain we have several vulnerabilities.
Anyway, long story 265 00:25:40,450 --> 00:25:44,750 short, what I'm trying to get at is there's a lot more vulnerabilities than 61,660. It's 266 00:25:44,750 --> 00:25:51,750 right up at about 300,000 or so so far. This was all made possible by using PunkSCAN. As 267 00:25:52,460 --> 00:25:57,630 I mentioned, PunkSCAN is what powers this on the back end. And making it distributed 268 00:25:57,630 --> 00:26:02,080 over actually a relatively small Hadoop cluster and pushing our resources really, really hard 269 00:26:02,080 --> 00:26:07,190 is what allowed us to get this level of data. Actually, the main issue that we've had with 270 00:26:07,190 --> 00:26:14,190 PunkSPIDER is terms of service. We tried to run it on cloud servers and we run it through 271 00:26:14,470 --> 00:26:21,080 a bunch of proxies and stuff like that. I guess they have some kind of monitoring, because we 272 00:26:21,080 --> 00:26:26,409 get kicked off of cloud providers all the fricking time. Anybody work for a cloud 273 00:26:26,409 --> 00:26:30,120 provider? Rackspace. 274 00:26:30,120 --> 00:26:37,120 I love Rackspace. I won't say too much more because we have cloud providers here. 275 00:26:42,860 --> 00:26:49,130 (laughter) You can sort of see the picture of a MapReduce job 276 00:26:49,130 --> 00:26:56,130 running here. Look at ajaxa.cnn, maybe something with AJAX. You can see us attempting to inject 277 00:27:03,380 --> 00:27:07,940 into parameters and reading the output. So here you see that this one's looking for, 278 00:27:07,940 --> 00:27:13,100 let me zoom in a little bit more, cut ID over here, and then we see it moving to the next 279 00:27:13,100 --> 00:27:17,770 parameter page over here. Then we see it kind of moving down. This is our map step that 280 00:27:17,770 --> 00:27:24,159 I was talking about. We're essentially taking a URL, attempting a few basic, basic, really 281 00:27:24,159 --> 00:27:28,919 safe, by the way, injections, and reading the output.
We're not doing anything else with 282 00:27:28,919 --> 00:27:33,409 that. We're not exploiting any vulnerabilities, obviously, or anything like that. We're just 283 00:27:33,409 --> 00:27:40,409 providing this back to the user in order to be used for good things and not bad things. 284 00:27:41,110 --> 00:27:47,309 Somebody's laughing over there for some reason. Anyway, so this is PunkSPIDER. What made all 285 00:27:47,309 --> 00:27:53,529 of this possible, what allowed us to basically target the entire Internet is to distribute 286 00:27:53,529 --> 00:27:58,330 this job; right? This actually would not have been ‑‑ we probably would have 10,000 287 00:27:58,330 --> 00:28:02,890 sites done here if we hadn't been distributing this and using Hadoop to help us push our 288 00:28:02,890 --> 00:28:09,890 resources as well as coordinate all this stuff in a really simple manner. So that's PunkSPIDER. 289 00:28:10,140 --> 00:28:17,140 What do you guys think of PunkSPIDER? (applause) 290 00:28:21,690 --> 00:28:28,690 Thank you, guys. All right. So I've shown you some stuff. Now I want to get into specific 291 00:28:34,460 --> 00:28:38,850 use cases of that. That was just an example to kind of whet your appetite. What you'll 292 00:28:38,850 --> 00:28:45,850 see is me showing you or explaining demos. We'll see tools related to each one. The one 293 00:28:47,330 --> 00:28:54,330 is distributed recon. I'll talk about this really, really quickly. Essentially you want 294 00:28:55,029 --> 00:29:00,880 to greatly speed up repetitive tasks. A lot of network or application reconnaissance on 295 00:29:00,880 --> 00:29:04,669 targets is repetitive tasks when you're dealing with massive targets, so we're not getting 296 00:29:04,669 --> 00:29:11,039 into really low level complex attacks here. We're getting into common stuff that succeeds 297 00:29:11,039 --> 00:29:16,070 a lot is our goal. 
The only thing I really did want to say about 298 00:29:16,070 --> 00:29:20,240 distributed recon and writing your own Map Reduce jobs and things like that is to always 299 00:29:20,240 --> 00:29:26,990 be careful to consider your problem. Are you in need of CPU, memory, bandwidth? What exactly 300 00:29:26,990 --> 00:29:33,990 is it you're trying to solve? So with PunkSCAN, we had the issue, we just needed faster fuzzing. 301 00:29:34,919 --> 00:29:41,919 We had to figure out what would help us fuzz faster. Are we going to need bandwidth, CPU, 302 00:29:42,840 --> 00:29:46,360 memory? We actually had to do a little bit of pre‑research to figure that out. If you're 303 00:29:46,360 --> 00:29:51,730 interested in the details, this was presented at ShmooCon this year. That talk goes into a lot 304 00:29:51,730 --> 00:29:58,730 more detail. CPU was far more important than any kind of bandwidth. This turned out to 305 00:30:00,429 --> 00:30:05,490 be useful to us because distributing the job we knew would help us. It turns out it did. 306 00:30:05,490 --> 00:30:10,870 It helped us a ton. So just always consider your problem and be really careful before 307 00:30:10,870 --> 00:30:16,960 you write these things. All right. The next one is the really fun 308 00:30:16,960 --> 00:30:23,960 one. So just to be clear and just as I've mentioned, don't misuse PunkSPIDER and don't 309 00:30:24,000 --> 00:30:28,450 attack the sites on PunkSPIDER. That's really not what it was built for. I would be kind 310 00:30:28,450 --> 00:30:30,919 of pissed if I found out that people were actually using it for that. 311 00:30:30,919 --> 00:30:37,730 But now we'll look at what we could do with that type of information if we were complete 312 00:30:37,730 --> 00:30:44,730 dicks.
Mostly because it's fun, and that's off and on thing to do, and we like writing 313 00:30:46,320 --> 00:30:50,409 distributed computing code, but also for the same reasons that I've been mentioning all 314 00:30:50,409 --> 00:30:56,029 along. We want to speed up our attack and we want it to help us coordinate our resources. 315 00:30:56,029 --> 00:31:03,029 The demo is a distributed version of sqlmap. How many of you are familiar with sqlmap? 316 00:31:04,330 --> 00:31:08,760 It's essentially an automated database takeover and stealing tool kind of thing. Really, really 317 00:31:08,760 --> 00:31:13,610 cool. It was presented at DEF CON about four years ago or something like that. That's probably 318 00:31:13,610 --> 00:31:16,409 completely wrong. It was presented at DEF CON at some point. I have no idea when and 319 00:31:16,409 --> 00:31:22,299 I made up that number on the fly. For the demo and example I'm going to show you, 320 00:31:22,299 --> 00:31:27,520 the source code for all this stuff is going to be available online immediately after the 321 00:31:27,520 --> 00:31:31,610 conference. So definitely take a look at it if you want to know more about it. It's in 322 00:31:31,610 --> 00:31:38,220 the proof of concept phase right now and not what I would call a real tool just yet. But 323 00:31:38,220 --> 00:31:45,220 if you're coming to DerbyCon, I'll be working on a refined feature. The tool 324 00:31:49,260 --> 00:31:56,190 is called Mr. Injector. The reason for this is MR equals Map Reduce. So injector because 325 00:31:56,190 --> 00:32:03,190 injection, obviously. MR injector turned into Mr. Injector, which I think is kind of funny. 326 00:32:04,730 --> 00:32:09,770 Literally nobody else has ever thought that this was a funny name for anything, but that's 327 00:32:09,770 --> 00:32:14,679 kind of just how I work. Also, in my head I picture it as, like, a cross between Mr. Donut 328 00:32:14,679 --> 00:32:20,029 and Mr. Peanut. And it gets amazing for me.
But nobody else really thinks that that's 329 00:32:20,029 --> 00:32:27,029 entertaining in any way. So we'll just move on. 330 00:32:27,149 --> 00:32:32,049 So let me set the stage for you here. This is the next demo. The screen you're seeing 331 00:32:32,049 --> 00:32:37,539 is divided into two parts. So the left‑hand side is sqlmap owning targets in a nondistributed 332 00:32:37,539 --> 00:32:43,200 manner. This is written how you would expect it. You have a simple Python or shell script 333 00:32:43,200 --> 00:32:47,299 that runs sqlmap on targets in a row. You go one after the other exactly how you would script 334 00:32:47,299 --> 00:32:50,029 it if you didn't want to spend much time on it. 335 00:32:50,029 --> 00:32:57,029 The right‑hand side is distributed, using a Hadoop cluster for the attack. This is a real attack 336 00:32:57,529 --> 00:33:04,529 running on a test bed of servers. It's an actual attack that we conducted. What you'll 337 00:33:05,520 --> 00:33:09,799 see is a series of ‑‑ you'll see those shells run, but you don't have to read that 338 00:33:09,799 --> 00:33:14,730 too much. Under them you'll see little red squares pop up each time a target has been 339 00:33:14,730 --> 00:33:20,539 owned. By "owned" I mean we're stealing the system hashes. So take a look and what I really 340 00:33:20,539 --> 00:33:24,470 want you to pay attention to is the rate at which these things attack. It will actually 341 00:33:24,470 --> 00:33:29,909 be pretty obvious what I want you to look for. But again, this is not a simulation. 342 00:33:29,909 --> 00:33:33,570 We didn't just do a bunch of calculations to see if this would work. We actually ran 343 00:33:33,570 --> 00:33:40,570 this attack and recorded it to show to you guys. We're also kind of jumping in the middle 344 00:33:40,590 --> 00:33:44,890 of the attack. The whole thing was really too long to make a good demo. But the important 345 00:33:44,890 --> 00:33:51,890 thing is the rate that you're seeing here.
So you're starting to see targets get owned. 346 00:33:53,380 --> 00:33:57,380 This is actually real‑time. This is not sped up in any way or anything like that. 347 00:33:57,380 --> 00:34:02,059 It's real‑time targets being owned. You see that obviously the right side is much, 348 00:34:02,059 --> 00:34:06,500 much, much faster. Even though when I look at the left side I'm always kind of pulling 349 00:34:06,500 --> 00:34:13,249 for it; right? I'm like, come on, little buddy, let's go. Come on. Hey, hey, there's another 350 00:34:13,249 --> 00:34:16,109 one. All right. (applause) 351 00:34:16,109 --> 00:34:23,109 One point. And it will continue to run. I'll stand here awkwardly while I let that run. 352 00:34:26,629 --> 00:34:33,629 I'm feeling that alcohol even more now. (Off microphone) 353 00:34:34,599 --> 00:34:41,599 How many mappers? I believe we were running something like ten mappers per node, ten nodes. 354 00:34:43,029 --> 00:34:48,219 Yeah. So that's what it's running in parallel. So already you see that just with the relatively 355 00:34:48,219 --> 00:34:55,219 small cluster, small nodes, we greatly, greatly speed up the attack. Right? Gotcha. 356 00:34:56,509 --> 00:35:03,509 It greatly speeds up the attack. That was 61 targets in 45 seconds. So we have under 357 00:35:04,049 --> 00:35:08,440 a second per target. What makes this really possible, it's not just the fact that you 358 00:35:08,440 --> 00:35:11,709 have more computing resources. It's really not. It's the fact you're able to push those 359 00:35:11,709 --> 00:35:16,650 resources to their absolute limit with really simple code. You don't have to get into complex 360 00:35:16,650 --> 00:35:21,229 stuff in order for that to happen. So my goal with this, what I really wanted 361 00:35:21,229 --> 00:35:25,839 to show you is these techniques actually work.
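The "under a second per target" figure follows directly from the numbers quoted in the demo. The sketch below just redoes that arithmetic; the mapper and node counts are the speaker's "something like" estimates, and the average task duration at the end assumes every parallel slot stayed busy for the whole run, which the talk does not confirm.

```python
# Numbers quoted in the demo; names are illustrative.
targets = 61
elapsed_seconds = 45
mappers_per_node = 10  # "something like ten mappers per node"
nodes = 10

seconds_per_target = elapsed_seconds / targets
parallel_tasks = mappers_per_node * nodes

# Assumption: all slots busy the whole time, so one task's wall-clock
# time is roughly the overall rate multiplied by the slot count.
approx_seconds_per_task = seconds_per_target * parallel_tasks

print(f"{seconds_per_target:.2f} s per target")
print(f"{parallel_tasks} tasks in parallel")
print(f"~{approx_seconds_per_task:.0f} s per individual task")
```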
So maybe there's some skeptics out there that 362 00:35:25,839 --> 00:35:29,390 think oh, well, bandwidth will be your limiting factor, you're at the same gateway, that's 363 00:35:29,390 --> 00:35:35,420 just not gonna work and you suck. So I don't suck, first of all. And next of all, it actually 364 00:35:35,420 --> 00:35:42,420 works. So shut up, imaginary person. (laughter) 365 00:35:43,079 --> 00:35:46,839 This is an example of the mapper that I wrote. Actually, this is a really, really early version 366 00:35:46,839 --> 00:35:53,680 of the mapper that I wrote. It's really simple; right? It's Python code. All we're doing is 367 00:35:53,680 --> 00:36:00,680 running a simple subprocess that runs a shell command and replacing it. You run this through 368 00:36:04,900 --> 00:36:11,900 Hadoop Streaming. We refined this with the help of my friend Mark right there in the 369 00:36:12,469 --> 00:36:18,749 red shirt. We refined that a good amount. But this code actually works and runs really, 370 00:36:18,749 --> 00:36:23,160 really well. So as you can see, that's, what? Ten lines or something like that? Really, 371 00:36:23,160 --> 00:36:30,160 really simple. All right. So the output gets written into 372 00:36:31,880 --> 00:36:38,069 the Hadoop file system. This is something else that will make all this stuff even easier 373 00:36:38,069 --> 00:36:42,459 for you. So the Hadoop file system is a virtual file system that's distributed across all 374 00:36:42,459 --> 00:36:46,719 the nodes. It's fully accessible on absolutely any node that you have out there. So you don't 375 00:36:46,719 --> 00:36:51,069 have to worry about what node you're on in order to retrieve the output. You can actually 376 00:36:51,069 --> 00:36:54,869 be on any one of your distributed nodes and just grab it from anywhere and you have that 377 00:36:54,869 --> 00:37:01,109 information right at your disposal for whatever it is that you're into with that information.
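The slide itself is not reproduced in the transcript, so the following is only a generic sketch of the pattern described: a Hadoop Streaming style mapper that takes one record per input line, runs an external command on it in a subprocess, and emits a tab-separated key and value. The command here is a harmless placeholder (a child Python process that reports the record's length), and the function names are hypothetical, not the speaker's code.

```python
import subprocess
import sys


def run_command(item):
    """Run one external command for one input record and return its output."""
    # Placeholder command (an assumption, not the talk's code): a child
    # Python process that prints the length of the record it was given.
    cmd = [sys.executable, "-c", "import sys; print(len(sys.argv[1]))", item]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    return result.stdout.strip()


def mapper(lines):
    """Yield 'key<TAB>value' strings, the line format Hadoop Streaming reads."""
    for line in lines:
        item = line.strip()
        if not item:
            continue  # skip blank input lines
        yield f"{item}\t{run_command(item)}"
```

Under Hadoop Streaming the mapper reads its records from standard input and writes to standard output, so a driver would loop over `mapper(sys.stdin)` and print each record; the framework then collects that output into the Hadoop file system.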
378 00:37:01,109 --> 00:37:06,549 So it's really, really convenient. So what do we end up with? A bunch of password hashes. 379 00:37:06,549 --> 00:37:11,209 We need to do something with these. What else would we be here for? So we just owned a large 380 00:37:11,209 --> 00:37:16,619 amount of targets. What would be really cool is if only we had a really fast distributed 381 00:37:16,619 --> 00:37:21,390 password cracker. So I'll tell you about a really fast distributed 382 00:37:21,390 --> 00:37:26,989 password cracker that I wrote. We conducted reconnaissance on a bunch of targets; right? 383 00:37:26,989 --> 00:37:31,769 We exploited a number of targets and stole a bunch of password hashes. This could take 384 00:37:31,769 --> 00:37:38,769 a long time to crack. We're impatient. We want something quick and without any specialized 385 00:37:39,359 --> 00:37:46,359 hardware. We don't want anything ‑‑ we don't want to have to go out and buy a bunch 386 00:37:46,729 --> 00:37:53,519 of GPUs. We want to be able to click a few things and crack some hashes; right? 387 00:37:53,519 --> 00:37:57,739 So you might notice in the previous examples I made the assumption that you can build or 388 00:37:57,739 --> 00:38:01,670 have access to enough machines to actually run a Hadoop cluster. That's actually not 389 00:38:01,670 --> 00:38:07,209 that hard. For anybody that feels intimidated by that stuff, that's really a simple process. 390 00:38:07,209 --> 00:38:11,789 There's a bunch of guides out there. You can get a decent one running with eight to ten 391 00:38:11,789 --> 00:38:16,640 nodes in a couple hours. So it's really, really simple. Let's say that you're just 392 00:38:16,640 --> 00:38:19,999 really busy. You don't want to deal with all that. What you want to do is be able to click 393 00:38:19,999 --> 00:38:24,319 a few buttons and have an instant cluster to use.
394 00:38:24,319 --> 00:38:29,969 So I'll show you how you can do that, and then crack a password over the cluster by 395 00:38:29,969 --> 00:38:36,969 using Hyperion Gray's tool which is called PunkCRACK. Admittedly this wasn't a simple 396 00:38:38,369 --> 00:38:45,099 tool to write. The job of actually distributing the stuff was not trivial. You actually have 397 00:38:45,099 --> 00:38:48,979 to worry about how exactly you'll partition this stuff. I mean, to me when we started 398 00:38:48,979 --> 00:38:54,380 this, it seemed really simple; right? Each operation you're just hashing a string and 399 00:38:54,380 --> 00:39:01,380 then comparing it to another hash. It seemed simple enough. It's easily "parallelizable." 400 00:39:04,039 --> 00:39:11,039 What you run into is I am ‑‑ is that a word? It's a list of things. We don't have 401 00:39:11,279 --> 00:39:17,180 that massive list in any way. If we just tried to compute all the hashes and input 402 00:39:17,180 --> 00:39:23,489 a list from a file, that would crash for any reasonable password, so it was a little bit 403 00:39:23,489 --> 00:39:27,130 complicated. We had to write our own little language that could represent a series of 404 00:39:27,130 --> 00:39:32,849 characters in order to distribute this job. So I think I'm actually getting close to the 405 00:39:32,849 --> 00:39:38,380 end here. But what I'll show you is spinning up a power cluster over Amazon, a point and 406 00:39:38,380 --> 00:39:45,380 click, run this Hadoop job and get me on my way. Really cool. Last thing, I know there 407 00:39:47,209 --> 00:39:51,369 are lots of ways to crack passwords. I'm not claiming this is the best way, the fastest 408 00:39:51,369 --> 00:39:54,940 way, the most efficient way, anything like that. I'm just saying it's an option and something 409 00:39:54,940 --> 00:39:59,339 you can have in your tool belt.
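The talk does not show PunkCRACK's "little language", so the sketch below is not that format. It only illustrates the underlying idea under stated assumptions: instead of materializing every candidate string in a file, each worker is handed a compact description of its slice (a character set, a length, and a start and end index) and generates its candidates lazily. All function names are hypothetical.

```python
def candidate_at(index, charset, length):
    """Return the index-th string of the given length over charset."""
    chars = []
    for _ in range(length):
        # Treat the index as a number written in base len(charset).
        index, digit = divmod(index, len(charset))
        chars.append(charset[digit])
    return "".join(reversed(chars))


def split_keyspace(charset, length, parts):
    """Split the keyspace into at most `parts` contiguous (start, end) ranges."""
    total = len(charset) ** length
    step = -(-total // parts)  # ceiling division
    return [(start, min(start + step, total)) for start in range(0, total, step)]


def candidates(start, end, charset, length):
    """Lazily generate the candidates in one range, never the whole list."""
    for index in range(start, end):
        yield candidate_at(index, charset, length)


# Each (start, end) pair is a few bytes, however many strings it stands for.
ranges = split_keyspace("abcdefghijklmnopqrstuvwxyz", 4, 8)
print(len(ranges), ranges[0])
```

The design point is that the job input stays tiny: a range such as `(0, 57122)` stands in for tens of thousands of strings, so the list never has to exist on disk or in memory.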
If you don't mind spending some money for convenience for 410 00:39:59,339 --> 00:40:06,339 a cracker, this is a really good technique. This is actually a really long video. There 411 00:40:16,380 --> 00:40:23,380 we go. I start out ‑‑ there we go. I start out by showing you a really screwed 412 00:40:27,420 --> 00:40:34,420 up screen. There we go. I go here. Can everybody see that okay, by the way? Sort of? 413 00:40:34,459 --> 00:40:39,559 I'll walk you through it anyway. Don't worry about it. I'll walk you through it. It's okay. 414 00:40:39,559 --> 00:40:46,559 Full screen. How do you full screen Windows Media player? That didn't work. 415 00:40:47,109 --> 00:40:50,059 (laughter). That's the last time I'll listen to anybody 416 00:40:50,059 --> 00:40:57,059 at DEF CON. This is the job flow, setting up your specific job configurations. I'm telling 417 00:41:01,569 --> 00:41:06,779 it the location of the jar and a few basic arguments on the jar. I'll skip forward. This 418 00:41:06,779 --> 00:41:13,779 is a cool screen. I'm specifying the instance types and instance numbers. How large do you 419 00:41:14,819 --> 00:41:20,759 want the cluster to be? I want an extra large machine for the master node, which is a pretty 420 00:41:20,759 --> 00:41:27,759 big machine on Amazon's EC2. For my one slave machine I set a cluster compute 8 extra large, 421 00:41:28,369 --> 00:41:35,049 which is a 32 processor machine, which is pretty big. Then at the bottom over here where 422 00:41:35,049 --> 00:41:40,509 you see this 17, I'm setting it to, again, a really large number. So I have about 19 nodes 423 00:41:40,509 --> 00:41:45,759 here. I really wanted to show you this demo with some extra zeroes, so that would be like 424 00:41:45,759 --> 00:41:51,289 170 or even 1700, and you could pretty much crack a password like that.
It does get a 425 00:41:51,289 --> 00:41:54,959 little bit expensive and you have to be careful how you use that, because you need special 426 00:41:54,959 --> 00:42:01,199 permission from Amazon. They already kind of hated me. Anybody from Amazon here? No. 427 00:42:01,199 --> 00:42:03,219 Okay. (laughter) 428 00:42:03,219 --> 00:42:10,219 They have some really powerful stuff and some really cool stuff, but they hate me. So what 429 00:42:12,519 --> 00:42:19,519 we're doing here is we're configuring the node to ‑‑ what? Sorry. Kind of skipped around 430 00:42:20,709 --> 00:42:23,019 a little bit. Anyway, what we're doing here, if you see 431 00:42:23,019 --> 00:42:27,029 down here where it says one bootstrap action created. It specified ‑‑ one minute. 432 00:42:27,029 --> 00:42:34,029 Okay. What it did was specify one particular action to do on this across the cluster, before 433 00:42:34,479 --> 00:42:39,219 your job actually starts. What I did there is set the number of mapper tasks. So that's 434 00:42:39,219 --> 00:42:43,069 the number of parallel tasks that occur on each node. And that's what I'm talking about 435 00:42:43,069 --> 00:42:47,099 where I say you can push your resources to their absolute limit with really, really simple 436 00:42:47,099 --> 00:42:50,489 configuration items. Long story short, because I'm running out 437 00:42:50,489 --> 00:42:57,420 of time, in case you can't predict what's going to happen, you crack the hash and it's 438 00:42:57,420 --> 00:43:03,339 done. And you did that in a completely distributed manner pretty quickly. 439 00:43:03,339 --> 00:43:10,339 And that's PunkCRACK. (applause) 440 00:43:12,519 --> 00:43:19,519 I hope you guys have enjoyed it ‑‑ you. No, you can't finish yet. Give it up. 441 00:43:21,329 --> 00:43:28,329 I need your time. Just stand there. Rebecca, are you a first time speaker at 442 00:43:31,420 --> 00:43:34,789 DEF CON? ALEJANDRO CASERES: I am.
But I presented 443 00:43:34,789 --> 00:43:38,779 earlier today. Oh, right, you were the guy with the thing. 444 00:43:38,779 --> 00:43:45,069 We're not here for you because we already shot you. However, we learned Rebecca, please 445 00:43:45,069 --> 00:43:52,069 come up, did anybody see Rebecca's talk? Come on. Clap: Rebecca did not do a shot. So we'll 446 00:43:58,369 --> 00:44:04,279 fix that right now. Rebecca is going to start a new tradition. She's going to take Tylenol 447 00:44:04,279 --> 00:44:09,849 with her shot. That's awesome. I had a shot last night. Didn't go over 448 00:44:09,849 --> 00:44:14,449 so well this morning. (laughter) 449 00:44:14,449 --> 00:44:21,449 ALEJANDRO CASERES: So ridiculous. This one is yours. 450 00:44:25,279 --> 00:44:30,010 (Off microphone) (laughter) 451 00:44:30,010 --> 00:44:32,829 No. Don't touch it. (laughter) 452 00:44:32,829 --> 00:44:39,829 You're not done yet. All right. Thank you. All right. Here is to Rebecca. 453 00:44:41,660 --> 00:44:48,660 (applause) Thank you. Now you can finish. 454 00:44:51,369 --> 00:44:55,689 ALEJANDRO CASERES: Thank you. Sorry to interrupt. 455 00:44:55,689 --> 00:44:58,130 And thanks for coming. (applause) 456 00:44:58,130 --> 00:45:04,019 ALEJANDRO CASERES: It's like the fourth shot I've had to do today because of this 457 00:45:04,019 --> 00:45:10,049 whole thing. And we're out of time. I'll give you another 458 00:45:10,049 --> 00:45:14,859 minute. We hazed you enough. ALEJANDRO CASERES: Thanks. I appreciate 459 00:45:14,859 --> 00:45:20,699 it. The spanking was worth at least another minute. Definitely enjoyed this whole thing. 460 00:45:20,699 --> 00:45:26,170 Short of it is, distributing computing is awesome. When you need to run extremely ‑‑ 461 00:45:26,170 --> 00:45:31,680 I'm freaking hammered at this point. (laughter) 462 00:45:31,680 --> 00:45:33,529 When you need to run massive attacks ‑‑ More. 
463 00:45:33,529 --> 00:45:39,959 ALEJANDRO CASERES: Can I do another one and take another minute? 464 00:45:39,959 --> 00:45:42,400 (applause) (Cheers) 465 00:45:42,400 --> 00:45:49,400 ALEJANDRO CASERES: This one is for you, I assume? 466 00:45:53,369 --> 00:45:57,029 Oh, yeah. (laughter) 467 00:45:57,029 --> 00:46:00,809 Considering I'm not going to have time to go home and shower before I get on an airplane, 468 00:46:00,809 --> 00:46:06,769 the person ‑‑ the people on both sides of me on the Southwest flight will love me. 469 00:46:06,769 --> 00:46:13,769 ALEJANDRO CASERES: There's very little. We're out of liquor. Liquor? More booze. 470 00:46:19,380 --> 00:46:22,529 More booze. (applause) 471 00:46:22,529 --> 00:46:26,730 (Cheers) Not you again! 472 00:46:26,730 --> 00:46:32,339 ALEJANDRO CASERES: I'm actually mixing the vodka with the rum. 473 00:46:32,339 --> 00:46:35,130 (laughter) ALEJANDRO CASERES: That's disgusting. And 474 00:46:35,130 --> 00:46:38,390 I'm doing this for you. You're welcome. 475 00:46:38,390 --> 00:46:45,390 ALEJANDRO CASERES: All right. Cheers! (applause) 476 00:46:48,670 --> 00:46:55,670 ALEJANDRO CASERES: So distributed computing. (laughter) ‑‑ shit. All right. So where 477 00:47:06,410 --> 00:47:13,410 do you even go from here? What do I even do? Where am I? ‑‑ did somebody say drink 478 00:47:15,829 --> 00:47:22,249 again? So definitely enjoyed presenting the concept to you here. What exactly does this 479 00:47:22,249 --> 00:47:27,170 mean for you; right? Leveraging distributed computing from an offensive perspective lets 480 00:47:27,170 --> 00:47:31,839 you run really powerful massive attack scenarios, all using Open Source technologies, commodity 481 00:47:31,839 --> 00:47:35,989 hardware, shit where you can say to your friends, I need a bunch of hardware, give me your old 482 00:47:35,989 --> 00:47:41,569 shit, and run it on there. Really, really cool stuff.
Imagine pentesting massive targets 483 00:47:41,569 --> 00:47:48,569 with this. So something like pentesting an entire freaking country would be awesome. 484 00:47:51,809 --> 00:47:56,519 I really think the security implications of this are broad. If we can feasibly simulate 485 00:47:56,519 --> 00:48:00,619 a massive attack scenario we can better study it and better prepare for it and see what 486 00:48:00,619 --> 00:48:05,509 exactly that's going to mean for massive targets like an entire country. 487 00:48:05,509 --> 00:48:12,509 Follow me on Twitter. I'll answer all your questions. Anything, almost anything at all. 488 00:48:14,989 --> 00:48:21,069 Definitely see more about us and check out some more details on the presentation at www.HyperionGray.com. 489 00:48:21,069 --> 00:48:28,069 I don't even know what the last one says. Thanks to everybody. Thomas, who is the dude, when 490 00:48:31,699 --> 00:48:38,699 I say "we" wrote that, it's usually Thomas; if it's not Thomas, it's Mark right there. 491 00:48:38,779 --> 00:48:45,170 Thanks to Amanda, my girlfriend, SQL foundation, and Apache software. Thanks a lot. 492 00:48:45,170 --> 00:48:45,420 (applause)