1 00:00:00,000 --> 00:00:02,792 MICHAEL SCHRENK: Thank you for coming to my talk. 2 00:00:02,792 --> 00:00:04,751 It's always a treat to be able to do this. 3 00:00:04,751 --> 00:00:11,918 I've had the opportunity to do a lot of really cool things in my career 4 00:00:11,918 --> 00:00:14,542 and with bots. 5 00:00:14,876 --> 00:00:18,792 But the one thing that gave me more satisfaction than anything else I've ever 6 00:00:18,792 --> 00:00:21,999 done is the time I wrote a botnet that purchased millions 7 00:00:21,999 --> 00:00:24,125 of dollars' worth of cars and defeated 8 00:00:24,125 --> 00:00:26,334 the Russian hackers. 9 00:00:26,626 --> 00:00:29,999 So let's have some fun with this, all right? 10 00:00:29,999 --> 00:00:33,417 I'm going to tell you a story that involves hacking. 11 00:00:33,999 --> 00:00:35,999 It involves cars. 12 00:00:35,999 --> 00:00:36,999 I like cars. 13 00:00:37,709 --> 00:00:41,083 It involves Russian hackers, which is pretty cool. 14 00:00:41,542 --> 00:00:45,334 And more than anything else, it involves screwing with the system. 15 00:00:45,334 --> 00:00:46,542 (Applause) Thank you. 16 00:00:46,542 --> 00:00:47,542 Thank you. 17 00:00:49,999 --> 00:00:53,626 Or as I like to tell my mother, creating competitive advantages 18 00:00:53,626 --> 00:00:55,209 for clients. 19 00:00:55,542 --> 00:00:58,751 (Laughter) That's important. 20 00:00:58,751 --> 00:01:00,667 It's easier to get a loan that way, too. 21 00:01:01,209 --> 00:01:05,792 So I've been writing bots since about '95. 22 00:01:05,834 --> 00:01:09,876 Started out doing remote medicine bots, if you can believe that. 23 00:01:09,876 --> 00:01:14,834 I've been involved with privacy, fraud detection, private investigations. 24 00:01:14,834 --> 00:01:16,876 I've done work for foreign governments. 25 00:01:16,959 --> 00:01:18,751 I've got a fair amount of my business that 26 00:01:18,751 --> 00:01:21,083 is with automotive clients.
27 00:01:21,542 --> 00:01:23,292 What makes me a little bit different, I mean, a lot 28 00:01:23,292 --> 00:01:25,250 of people write bots. 29 00:01:25,250 --> 00:01:28,292 What makes me a little different is I actually talk about it. 30 00:01:28,417 --> 00:01:31,542 Unfortunately, the only projects I get to talk about are things that are 31 00:01:31,542 --> 00:01:34,459 in-house projects that I've been doing. 32 00:01:34,459 --> 00:01:38,083 It's really rare that I get a chance to talk about a specific project that I've 33 00:01:38,083 --> 00:01:40,083 done for a client. 34 00:01:40,250 --> 00:01:42,792 But I got permission to talk about this one. 35 00:01:42,999 --> 00:01:48,083 And it came about largely because when my last book was done, 36 00:01:48,083 --> 00:01:54,250 this one, No Starch Press by the way, they approached Linux magazine 37 00:01:54,250 --> 00:01:59,083 and said, can Mike write an article for us? 38 00:01:59,209 --> 00:02:02,999 I really didn't have anything ready to write for them. 39 00:02:02,999 --> 00:02:05,083 So I approached this old client and I said, you know, 40 00:02:05,083 --> 00:02:07,292 enough time has passed. 41 00:02:07,292 --> 00:02:08,792 It's been, like, six years. 42 00:02:08,792 --> 00:02:10,834 Let me write about this for a change. 43 00:02:11,083 --> 00:02:13,083 And they agreed to let me do this. 44 00:02:13,209 --> 00:02:14,959 But that's really key. 45 00:02:14,959 --> 00:02:17,626 Because when you've got a piece of technology that provides 46 00:02:17,626 --> 00:02:22,459 a competitive advantage or allows you to screw with the system strategically, 47 00:02:22,459 --> 00:02:26,292 you don't want to tell people about it, right? 48 00:02:26,292 --> 00:02:28,918 Because that's a trade secret, really.
49 00:02:29,125 --> 00:02:31,918 So if you want to get a little different view of this project, 50 00:02:31,918 --> 00:02:35,501 you can pick up one of the old copies of Linux magazine, I write about it 51 00:02:35,501 --> 00:02:39,999 in a little different way than the way I am presenting it here tonight. 52 00:02:41,167 --> 00:02:42,918 What are you going to learn? 53 00:02:42,999 --> 00:02:46,542 You are going to learn what makes a good bot project. 54 00:02:46,999 --> 00:02:50,083 I'm going to have to give you a little bit of insight 55 00:02:50,083 --> 00:02:53,792 in how retail automotive works in order for this whole thing 56 00:02:53,792 --> 00:02:55,584 to make sense. 57 00:02:55,834 --> 00:03:00,375 You are going to get an awareness of commercial bots and botnets. 58 00:03:00,375 --> 00:03:01,834 And they actually do exist. 59 00:03:01,918 --> 00:03:04,751 And I'm also going to talk a little bit about if I were 60 00:03:04,751 --> 00:03:09,083 to do this again today, how would I do this differently? 61 00:03:09,083 --> 00:03:12,959 Because keep in mind this happened, like, six, seven years ago. 62 00:03:13,999 --> 00:03:16,918 So what makes a good bot project? 63 00:03:16,918 --> 00:03:19,083 The very first thing you need to know is that you cannot be afraid 64 00:03:19,083 --> 00:03:21,375 to do something different. 65 00:03:21,375 --> 00:03:22,375 Okay? 66 00:03:22,375 --> 00:03:25,125 If your company has an internet strategy, assuming it has 67 00:03:25,125 --> 00:03:28,167 an internet strategy, that just involves browsers 68 00:03:28,167 --> 00:03:33,125 and things you can do with a browser, you are really missing out. 69 00:03:33,125 --> 00:03:35,584 Because you got the whole big wide internet available 70 00:03:35,584 --> 00:03:36,999 to you. 71 00:03:37,250 --> 00:03:41,667 And everybody uses the same tool, the browser, right, to access it. 
72 00:03:41,667 --> 00:03:44,751 And if you expand your scope a little bit and do things 73 00:03:44,751 --> 00:03:47,834 outside of the way browsers work, or do things 74 00:03:47,834 --> 00:03:51,999 outside of the way websites are presented to you, you can create 75 00:03:51,999 --> 00:03:54,918 a lot of really cool things. 76 00:03:55,918 --> 00:03:58,250 Don't assume just a raise of hands. 77 00:03:58,250 --> 00:04:00,999 How many people here have written a screen scraper? 78 00:04:01,999 --> 00:04:03,125 Okay. 79 00:04:03,125 --> 00:04:04,125 Cool. 80 00:04:04,125 --> 00:04:06,125 How many people have written a Spider? 81 00:04:06,125 --> 00:04:07,125 Wow. 82 00:04:07,125 --> 00:04:08,125 Cool. 83 00:04:08,125 --> 00:04:09,083 Just if you've got a client, make sure 84 00:04:09,083 --> 00:04:11,250 they realize that just because you know how 85 00:04:11,250 --> 00:04:14,959 to scrape screens and can write a spider it doesn't mean you can make 86 00:04:14,959 --> 00:04:16,999 a copy of the internet. 87 00:04:17,334 --> 00:04:18,334 Okay? 88 00:04:18,459 --> 00:04:20,792 And you will be surprised I get people approaching me all the time 89 00:04:20,792 --> 00:04:22,834 with ideas for projects. 90 00:04:22,834 --> 00:04:25,999 A lot of them basically want to create a copy of the internet. 91 00:04:26,501 --> 00:04:30,083 So if your project requires both batch processing and realtime results, 92 00:04:30,083 --> 00:04:32,334 you've got a problem. 93 00:04:32,999 --> 00:04:36,375 Or if you've got a project that requires just ridiculous 94 00:04:36,375 --> 00:04:39,125 scaling, you've got a problem. 95 00:04:39,250 --> 00:04:44,626 Because unless you've got one of these, your project's going to fail. 96 00:04:44,792 --> 00:04:47,083 You know, you are not going to replicate Google unless you've got 97 00:04:47,083 --> 00:04:48,751 one of these. 98 00:04:49,083 --> 00:04:52,542 Then I tell clients after I say you really can't do this. 
99 00:04:52,542 --> 00:04:53,626 It's like, why not? 100 00:04:53,834 --> 00:04:55,999 I'll say because Google spends about a million dollars 101 00:04:55,999 --> 00:04:58,083 a day on electricity. 102 00:04:58,083 --> 00:04:59,083 That's why. 103 00:04:59,083 --> 00:05:01,083 That's why your project's going to fail. 104 00:05:02,459 --> 00:05:09,417 Realize that you don't own, I refer to targets as the subject server. 105 00:05:09,417 --> 00:05:12,083 Don't assume that you own that server. 106 00:05:12,083 --> 00:05:13,083 Okay? 107 00:05:13,083 --> 00:05:16,083 For example, I had a potential client approach me 108 00:05:16,083 --> 00:05:18,250 a few years ago. 109 00:05:18,542 --> 00:05:21,918 And he wanted to monitor prices on Amazon. 110 00:05:22,959 --> 00:05:25,167 For about 100,000 items. 111 00:05:25,459 --> 00:05:29,292 I thought, that sounds really like a useful thing to do. 112 00:05:29,292 --> 00:05:31,999 This guy was a big-time Amazon seller. 113 00:05:31,999 --> 00:05:35,209 Until I found out he wanted to do this every five seconds. 114 00:05:35,999 --> 00:05:37,083 (Laughter). 115 00:05:37,083 --> 00:05:38,501 That's not going to work. 116 00:05:38,501 --> 00:05:39,751 It's not going to work. 117 00:05:39,999 --> 00:05:41,667 For lots of reasons. 118 00:05:41,999 --> 00:05:45,292 If you did something like this, Amazon would actually have 119 00:05:45,292 --> 00:05:50,167 to build additional infrastructure to support your project. 120 00:05:50,334 --> 00:05:52,626 And you'd end up in court with what they call 121 00:05:52,626 --> 00:05:56,834 a trespass to chattels suit, and you want to avoid that. 122 00:05:57,167 --> 00:05:58,584 It's very illegal. 123 00:05:59,626 --> 00:06:00,999 Okay. 124 00:06:00,999 --> 00:06:01,999 Number four. 125 00:06:01,999 --> 00:06:03,999 This is maybe the most important thing. 126 00:06:03,999 --> 00:06:07,375 You have to have a realistic profit model.
127 00:06:07,375 --> 00:06:10,584 You notice I'm saying "profit model" and not business model. 128 00:06:10,999 --> 00:06:12,584 Why do I say that? 129 00:06:12,584 --> 00:06:13,584 This is why. 130 00:06:13,999 --> 00:06:15,250 Okay? 131 00:06:15,999 --> 00:06:21,626 If I'm showing my age here a little bit, you can look at these. 132 00:06:22,626 --> 00:06:25,834 Myspace actually made the list twice. 133 00:06:25,834 --> 00:06:27,501 I think that's pretty impressive. 134 00:06:27,501 --> 00:06:28,501 (Laughter). 135 00:06:28,501 --> 00:06:30,083 That's staying power. 136 00:06:30,083 --> 00:06:32,999 Why is it important you have a realistic profit model? 137 00:06:33,334 --> 00:06:35,542 Why is it when people approach me and they want 138 00:06:35,542 --> 00:06:39,501 to do something that could just as easily be done on eBay, for example, 139 00:06:39,501 --> 00:06:43,709 this is important because the developer has to get paid. 140 00:06:43,709 --> 00:06:44,709 Okay. 141 00:06:44,709 --> 00:06:45,876 It's very important. 142 00:06:46,584 --> 00:06:47,999 Okay. 143 00:06:47,999 --> 00:06:49,792 About automotive retailing. 144 00:06:49,792 --> 00:06:53,083 Just a little bit here; without this, the project doesn't make sense. 145 00:06:53,542 --> 00:06:58,334 New car sales are not as profitable as people think they are. 146 00:06:58,334 --> 00:07:00,501 Even if you combine service with that. 147 00:07:00,501 --> 00:07:03,542 Because it's incredibly capital intensive. 148 00:07:03,999 --> 00:07:07,209 And it's super, super competitive. 149 00:07:07,542 --> 00:07:09,999 But you need to have new car sales. 150 00:07:09,999 --> 00:07:13,375 So you've got credibility if you want to sell used cars. 151 00:07:13,542 --> 00:07:18,083 This is particularly true if you want to sell high-end used cars. 152 00:07:18,083 --> 00:07:21,083 Nobody wants to go to the corner lot for that kind of stuff.
153 00:07:22,417 --> 00:07:26,999 The thing that I learned, and I didn't realize, I just assumed the 154 00:07:26,999 --> 00:07:30,709 used cars on a car lot were all trade-ins. 155 00:07:30,709 --> 00:07:32,083 Well that's not the case. 156 00:07:32,083 --> 00:07:33,876 It can't be the case. 157 00:07:33,876 --> 00:07:37,125 Because you can't grow a business if you are going to do that, right. 158 00:07:37,125 --> 00:07:38,542 And it's really limiting. 159 00:07:38,542 --> 00:07:42,751 Car dealerships spend tons of money acquiring good used cars 160 00:07:42,751 --> 00:07:45,542 to put on the car lot. 161 00:07:45,999 --> 00:07:48,250 It's kind of bizarre the way it works. 162 00:07:48,250 --> 00:07:51,125 Because you walk in to a car lot and you know what 163 00:07:51,125 --> 00:07:54,292 the price should be for a particular car 164 00:07:54,292 --> 00:07:57,999 because it's very well documented. 165 00:07:57,999 --> 00:08:00,501 You can go to Kelley Blue Book or anyplace. 166 00:08:00,501 --> 00:08:04,375 So dealers don't have a lot of space to work on the price, 167 00:08:04,375 --> 00:08:07,083 the final retail price. 168 00:08:07,459 --> 00:08:10,584 But down on the wholesale side, that's where the profits, that's where 169 00:08:10,584 --> 00:08:12,501 the margins are made. 170 00:08:12,501 --> 00:08:14,709 If you are good at buying things for a great price, 171 00:08:14,709 --> 00:08:17,999 that's how you make money with used cars. 172 00:08:17,999 --> 00:08:19,999 And that's what this project is about. 173 00:08:20,751 --> 00:08:25,999 So a car dealer came to me; he had this great opportunity. 174 00:08:25,999 --> 00:08:27,751 Found this wonderful website. 175 00:08:27,999 --> 00:08:30,709 It was part of the national franchise. 176 00:08:30,709 --> 00:08:36,083 They were getting in used rental cars, two years old, 12 to 16,000 miles, 177 00:08:36,083 --> 00:08:41,584 perfect cars that you would want to have on your lot.
178 00:08:41,584 --> 00:08:42,709 Well maintained. 179 00:08:43,083 --> 00:08:46,999 Unfortunately, there was a lot of competition for these cars. 180 00:08:46,999 --> 00:08:49,083 Because all the people in that dealership chain wanted 181 00:08:49,083 --> 00:08:50,999 the same cars. 182 00:08:50,999 --> 00:08:52,999 And the website was horrible and made it almost impossible 183 00:08:52,999 --> 00:08:54,876 to buy the cars. 184 00:08:55,083 --> 00:08:56,918 So there's a lot of frustration. 185 00:08:56,999 --> 00:08:59,334 This is kind of the way it worked. 186 00:08:59,334 --> 00:09:03,626 There would be maybe two to 300 cars presented every day. 187 00:09:03,667 --> 00:09:06,334 And the cars would have little display ads like this that gave a little bit 188 00:09:06,334 --> 00:09:08,083 of a description. 189 00:09:08,209 --> 00:09:11,959 And there was an inactive buy now button. 190 00:09:11,999 --> 00:09:12,999 Okay? 191 00:09:13,459 --> 00:09:20,250 And at exactly sale time the button would appear. 192 00:09:20,709 --> 00:09:21,999 Okay. 193 00:09:21,999 --> 00:09:24,751 But the problem with this was it wasn't using AJAX 194 00:09:24,751 --> 00:09:26,501 or anything. 195 00:09:26,501 --> 00:09:29,292 You had to physically sit and refresh the browser constantly 196 00:09:29,292 --> 00:09:31,999 to get that button to appear. 197 00:09:32,083 --> 00:09:34,876 Well this led to another problem. 198 00:09:34,959 --> 00:09:37,542 In that there was incredible server lag. 199 00:09:37,667 --> 00:09:41,083 My client, and I think he was probably pretty typical 200 00:09:41,083 --> 00:09:44,999 of all of them in this chain, he would grab every person 201 00:09:44,999 --> 00:09:48,792 he could find, people out of parts, off the sales floor, 202 00:09:48,792 --> 00:09:52,584 administrative assistants, he would sit them all in front 203 00:09:52,584 --> 00:09:57,667 of computers and each one was assigned maybe about six cars.
204 00:09:57,792 --> 00:09:58,584 So they would have six browser 205 00:09:58,584 --> 00:10:00,083 windows open. 206 00:10:00,292 --> 00:10:04,999 And they're all sitting there frantically hitting the refresh button constantly. 207 00:10:05,501 --> 00:10:08,751 So, if you think about this, okay, so this would have been roughly 208 00:10:08,751 --> 00:10:12,292 the equivalent of 36 users for one dealership. 209 00:10:12,751 --> 00:10:16,667 I don't know, maybe there were 750 dealers that were doing this. 210 00:10:16,667 --> 00:10:19,542 So that was almost 30,000 simultaneous downloads that were 211 00:10:19,542 --> 00:10:21,959 happening at sale time. 212 00:10:22,250 --> 00:10:23,959 What made this worse, I mean, servers should be able 213 00:10:23,959 --> 00:10:25,792 to handle that, right. 214 00:10:25,792 --> 00:10:29,999 But I think there was some inefficiency with the database possibly. 215 00:10:29,999 --> 00:10:31,792 Some bad queries were being made. 216 00:10:32,000 --> 00:10:35,792 This caused a ridiculous peak in server lag time right 217 00:10:35,792 --> 00:10:40,125 at the point where you don't want to have it. 218 00:10:40,125 --> 00:10:43,167 It wouldn't be unusual for it to take 15 or 30 seconds 219 00:10:43,167 --> 00:10:46,417 for the screen to refresh at sale time. 220 00:10:46,999 --> 00:10:48,667 Sometimes it would just time out. 221 00:10:48,999 --> 00:10:50,542 So this was a real problem. 222 00:10:50,999 --> 00:10:55,667 The other problem is that out of these, say, 200 cars that were 223 00:10:55,667 --> 00:11:00,083 up for sale every day, there were maybe five that every single 224 00:11:00,083 --> 00:11:03,542 dealership in the country wanted. 225 00:11:03,751 --> 00:11:06,334 Either because they were the right color. 226 00:11:06,334 --> 00:11:09,542 Probably because they were a really great price. 227 00:11:09,918 --> 00:11:11,792 Or for whatever reason. 228 00:11:11,792 --> 00:11:12,792 I don't know.
229 00:11:12,792 --> 00:11:15,167 But every dealership would want these five cars. 230 00:11:15,250 --> 00:11:18,501 So he had a lot of competition for the same cars. 231 00:11:18,667 --> 00:11:23,125 Plus, server lag, bad web design. 232 00:11:23,250 --> 00:11:26,125 Had to involve a lot of people to do this. 233 00:11:26,125 --> 00:11:28,501 So this particular client, I had written a number of bots 234 00:11:28,501 --> 00:11:30,501 for him in the past. 235 00:11:30,751 --> 00:11:32,751 And he gave me a call and said, can you help me 236 00:11:32,751 --> 00:11:34,209 out, Mike. 237 00:11:34,209 --> 00:11:35,459 I said, let's take a look. 238 00:11:35,751 --> 00:11:39,834 So the problems were the system was way too manual 239 00:11:39,834 --> 00:11:41,999 to begin with. 240 00:11:42,209 --> 00:11:44,083 So the way this would work, he would have to manually go 241 00:11:44,083 --> 00:11:46,999 and select the cars that he wanted to buy. 242 00:11:46,999 --> 00:11:50,626 He would have to distribute the VIN numbers to the various people. 243 00:11:50,626 --> 00:11:53,125 He would have to call people off their normal duties 244 00:11:53,125 --> 00:11:55,459 they would be doing. 245 00:11:55,667 --> 00:11:58,417 They would be dedicating probably a good 15 20 minutes hitting 246 00:11:58,417 --> 00:12:00,999 the refresh button every day. 247 00:12:01,918 --> 00:12:04,667 So that wasn't good. 248 00:12:04,667 --> 00:12:08,250 Plus the buy button took way too long to appear because of the server lag. 249 00:12:09,667 --> 00:12:12,792 So we ended up with two solutions. 250 00:12:12,792 --> 00:12:15,209 One of them because it worked. 251 00:12:15,209 --> 00:12:17,501 The second one because we had competition. 252 00:12:17,626 --> 00:12:19,626 So let's look at Phase I first here. 253 00:12:19,918 --> 00:12:24,375 Again, this is not, like, classic bot design. 254 00:12:24,667 --> 00:12:27,083 Keep in mind, this was done, like, six years ago.
255 00:12:27,083 --> 00:12:29,626 So I don't develop like this anymore. 256 00:12:29,999 --> 00:12:31,876 So here's what I did. 257 00:12:31,999 --> 00:12:34,751 I came up with a web interface for my client. 258 00:12:34,999 --> 00:12:38,751 If you look here, this is basically just four HTML frames that were independent 259 00:12:38,751 --> 00:12:40,626 from each other. 260 00:12:40,876 --> 00:12:45,918 And, you know, they could just go to the URL, pull this up, and by the way, 261 00:12:45,918 --> 00:12:50,501 I say botnet but this was all done on computers that we controlled, not 262 00:12:50,501 --> 00:12:52,999 controlled, we owned. 263 00:12:52,999 --> 00:12:54,375 There's a difference, right. 264 00:12:54,375 --> 00:12:55,375 (Laughter). 265 00:12:55,876 --> 00:13:01,626 In fact, all of the bots that I write, they're all commercial bots. 266 00:13:01,626 --> 00:13:02,876 We own all the hardware. 267 00:13:02,876 --> 00:13:04,584 Just want to let you guys know that. 268 00:13:04,584 --> 00:13:07,125 So instead of hauling in all these people to hit 269 00:13:07,125 --> 00:13:11,999 the refresh button constantly while they should be doing something else, 270 00:13:11,999 --> 00:13:16,334 my client was able to pull up something like this. 271 00:13:16,459 --> 00:13:19,334 And quite frequently we would have two or three computers set up with this 272 00:13:19,334 --> 00:13:20,999 in the browser. 273 00:13:20,999 --> 00:13:23,250 He would select what cars he wanted. 274 00:13:23,542 --> 00:13:25,959 The first step was to log on. 275 00:13:26,709 --> 00:13:31,167 They had several accounts for, it was a closed sale, basically. 276 00:13:31,375 --> 00:13:33,584 They had several accounts they could use. 277 00:13:33,584 --> 00:13:35,999 The first thing they would do is they would pick which account 278 00:13:35,999 --> 00:13:39,501 they wanted to use for this particular bot.
279 00:13:39,542 --> 00:13:45,375 And the next step was you would pick the VIN number of the car you wanted. 280 00:13:45,626 --> 00:13:48,083 And it would go ahead and it would validate that that was 281 00:13:48,083 --> 00:13:50,250 an actual car for sale. 282 00:13:50,626 --> 00:13:51,999 That's important. 283 00:13:51,999 --> 00:13:54,626 Because any time you are writing a bot you don't want 284 00:13:54,626 --> 00:13:59,626 to do something that could not possibly be done by a human. 285 00:13:59,999 --> 00:14:03,167 And if there's a car that says is not available for sale, 286 00:14:03,167 --> 00:14:06,250 you don't want to try to buy that. 287 00:14:06,292 --> 00:14:08,876 Because some system admin somewhere is going 288 00:14:08,876 --> 00:14:11,709 to say, how did they do that? 289 00:14:11,959 --> 00:14:13,999 What was that IP address? 290 00:14:14,250 --> 00:14:16,125 They're generating a lot of traffic. 291 00:14:16,167 --> 00:14:17,959 Really good traffic. 292 00:14:18,334 --> 00:14:20,834 So it's important to validate stuff like that. 293 00:14:20,999 --> 00:14:25,209 So as soon as the VIN was validated, a little start button would appear. 294 00:14:25,459 --> 00:14:28,375 So instead of being, you know, right on time when the sale was, 295 00:14:28,375 --> 00:14:31,501 you could do this hours in advance, hit the start button, 296 00:14:31,501 --> 00:14:34,542 and then it would start to count down. 297 00:14:34,876 --> 00:14:38,999 The way it would do this is it was basically synchronizing its 298 00:14:38,999 --> 00:14:43,083 clock with the server clock of the sales server. 299 00:14:43,459 --> 00:14:46,125 And this was really simple stuff. 300 00:14:46,334 --> 00:14:48,999 In the meta refresh, the HTML meta refresh, 301 00:14:48,999 --> 00:14:53,292 it would just start refreshing every so often.
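[Editor's sketch] The countdown just described, a page that syncs to the sale server's clock and reloads itself through an HTML meta refresh, might look roughly like the Python below. This is not the code from the talk. It assumes the server time is read from an HTTP `Date` response header, and the function names and interval thresholds are invented for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def clock_offset(date_header: str, local_now: datetime) -> timedelta:
    """Server clock minus local clock, taken from an HTTP Date header value.

    Assumption: the talk does not say how the server time was obtained.
    """
    return parsedate_to_datetime(date_header) - local_now


def refresh_interval(seconds_until_sale: float) -> int:
    """Seconds until the next reload; shorter as the sale gets closer.

    The thresholds are hypothetical, not the ones used in the project.
    """
    if seconds_until_sale > 3600:
        return 300
    if seconds_until_sale > 300:
        return 30
    if seconds_until_sale > 30:
        return 5
    return 1


def meta_refresh_tag(seconds: int) -> str:
    """The HTML tag that tells a browser to reload after `seconds`."""
    return f'<meta http-equiv="refresh" content="{seconds}">'


if __name__ == "__main__":
    local = datetime(2008, 1, 1, 17, 0, 0, tzinfo=timezone.utc)
    offset = clock_offset("Tue, 01 Jan 2008 17:00:05 GMT", local)
    print("server is ahead by", offset)
    print(meta_refresh_tag(refresh_interval(120)))
```

Each reload would recompute the time remaining on the corrected clock and emit a new tag, so the page tightens its own polling without anyone touching it.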
302 00:14:53,542 --> 00:14:56,250 And it would get, you know, as the sale got closer and closer, 303 00:14:56,250 --> 00:14:59,709 it would refresh more often until right at the end it was, like, 304 00:14:59,709 --> 00:15:02,709 right lock step with the server clock. 305 00:15:03,999 --> 00:15:06,999 As soon as it timed out, it would go ahead and would attempt 306 00:15:06,999 --> 00:15:08,999 to purchase the car. 307 00:15:08,999 --> 00:15:12,250 Now this shows just one bot client. 308 00:15:12,250 --> 00:15:15,709 Basically the bot clients acted as triggers for the server that actually 309 00:15:15,709 --> 00:15:17,751 made the purchase. 310 00:15:17,876 --> 00:15:21,876 And there may have been 16 to 30 of these bots running, 311 00:15:21,876 --> 00:15:24,375 triggering the server. 312 00:15:25,250 --> 00:15:27,999 Sometimes we'd miss one. 313 00:15:28,209 --> 00:15:32,626 But more often, the sale was successful. 314 00:15:32,999 --> 00:15:34,792 And we would send an email confirmation 315 00:15:34,792 --> 00:15:37,999 to my client saying you bought this car. 316 00:15:38,167 --> 00:15:41,459 And we would also arrange for financing for him. 317 00:15:41,918 --> 00:15:43,083 And while we were at it, we made sure that 318 00:15:43,083 --> 00:15:46,959 the car actually was shipped correctly back to his dealership. 319 00:15:46,959 --> 00:15:49,667 So the bot provided a lot of utility in that regard. 320 00:15:50,250 --> 00:15:53,083 So how successful were we? 321 00:15:53,083 --> 00:15:55,999 Well before he wasn't getting anything. 322 00:15:55,999 --> 00:15:58,250 And this was really frustrating for him. 323 00:15:58,250 --> 00:15:59,999 Because these were cars he really wanted and he knew 324 00:15:59,999 --> 00:16:02,209 he could make a profit on them given the price 325 00:16:02,209 --> 00:16:04,334 they were selling for. 
326 00:16:04,417 --> 00:16:08,083 After we were getting probably about 95 to 97 percent of the cars 327 00:16:08,083 --> 00:16:10,375 he was trying to buy. 328 00:16:10,375 --> 00:16:12,584 So the difference was phenomenal. 329 00:16:12,999 --> 00:16:17,542 It was so much fun because even after I was done developing this I would 330 00:16:17,542 --> 00:16:20,667 get a call every day from my client 15 20 minutes 331 00:16:20,667 --> 00:16:26,083 after the sale and he would say, Mike, we bought five out of six today. 332 00:16:26,417 --> 00:16:28,250 We got seven out of seven. 333 00:16:28,250 --> 00:16:29,999 We got nine out of 12. 334 00:16:29,999 --> 00:16:33,209 I'm, like, settle down don't get greedy here. 335 00:16:35,417 --> 00:16:37,584 Don't kill the golden goose. 336 00:16:38,334 --> 00:16:41,417 So why were we successful at this? 337 00:16:41,918 --> 00:16:44,959 Well the main problem with the old one is that people had 338 00:16:44,959 --> 00:16:49,459 to wait for that stupid refresh button or that buy it now button to happen 339 00:16:49,459 --> 00:16:53,417 and there was so much problems, so much server lag that that was 340 00:16:53,417 --> 00:16:55,125 the problem. 341 00:16:55,459 --> 00:16:57,999 And usually whoever got the buy button first was 342 00:16:57,999 --> 00:17:00,876 the person that bought the car. 343 00:17:01,250 --> 00:17:05,083 So, basically, what we did is we got rid of the buy button. 344 00:17:05,083 --> 00:17:06,626 We just got rid of it. 345 00:17:06,709 --> 00:17:09,334 And we replaced it with a timer that was automated so 346 00:17:09,334 --> 00:17:11,918 he didn't need the person hitting refresh 347 00:17:11,918 --> 00:17:14,999 all the time and it would just know what time to buy 348 00:17:14,999 --> 00:17:18,459 the car and it would go ahead and buy it. 349 00:17:18,459 --> 00:17:22,501 This type of a bot is typically called a sniper. 350 00:17:22,501 --> 00:17:24,167 Ever heard that term before? 
351 00:17:25,999 --> 00:17:28,584 I remember back in the day when I was doing this we 352 00:17:28,584 --> 00:17:30,209 were testing. 353 00:17:30,667 --> 00:17:34,125 And I was going to write him an email that said something 354 00:17:34,125 --> 00:17:37,375 to the effect of I've got six snipers waiting 355 00:17:37,375 --> 00:17:39,918 to hit cars at noon. 356 00:17:42,250 --> 00:17:45,459 (Laughter) Hopefully we'll make some hits today or have some kills 357 00:17:45,459 --> 00:17:47,584 or something like that. 358 00:17:47,584 --> 00:17:49,709 I was just about ready to send that email. 359 00:17:49,709 --> 00:17:50,501 And I started thinking about Carnivore and some 360 00:17:50,501 --> 00:17:52,999 of the stuff happening back then. 361 00:17:52,999 --> 00:17:54,999 And I thought, no, I'll just give him a call. 362 00:17:56,876 --> 00:18:00,209 Today I would never send an email like that. 363 00:18:00,334 --> 00:18:01,375 Never. 364 00:18:01,375 --> 00:18:03,292 I'm not even sure I'd make a phone call. 365 00:18:03,626 --> 00:18:05,751 So watch your language. 366 00:18:05,999 --> 00:18:07,250 Okay. 367 00:18:07,250 --> 00:18:09,959 So everything worked great for about six months. 368 00:18:09,999 --> 00:18:15,501 And then all of a sudden things weren't as rosy anymore. 369 00:18:15,501 --> 00:18:20,083 We started not, you know, my client would call and he would say, 370 00:18:20,083 --> 00:18:24,999 you know, we only got two out of seven today. 371 00:18:24,999 --> 00:18:26,459 Something's wrong. 372 00:18:26,792 --> 00:18:29,334 And he did some research.
373 00:18:29,667 --> 00:18:31,999 And he discovered through his connections he's got lots 374 00:18:31,999 --> 00:18:35,626 of connections that there was a group of Russian hackers that were 375 00:18:35,626 --> 00:18:38,542 hired to write a competing bot and they were some place 376 00:18:38,542 --> 00:18:40,667 out in New Jersey or the dealership was 377 00:18:40,667 --> 00:18:43,250 out in New Jersey or something. 378 00:19:02,542 --> 00:19:04,999 Competition is good right. 379 00:19:04,999 --> 00:19:06,834 That leads to innovation. 380 00:19:06,834 --> 00:19:08,334 That was the kind of thinking, yeah this is going 381 00:19:08,334 --> 00:19:12,083 to be fun now we've got an arms race going on here. 382 00:19:12,959 --> 00:19:15,501 Here's part two of the solution. 383 00:19:16,542 --> 00:19:20,250 What I did differently is while I was synchronizing clocks 384 00:19:20,250 --> 00:19:24,751 with the sale server, I started looking at lag time. 385 00:19:24,999 --> 00:19:29,334 I got to the point where I got really good at estimating how much lag time there 386 00:19:29,334 --> 00:19:31,876 would be at the sale time. 387 00:19:31,876 --> 00:19:33,501 In other words, what I was essentially doing was 388 00:19:33,501 --> 00:19:36,792 estimating how many users were on the system. 389 00:19:36,999 --> 00:19:41,083 And with that information I would not set one attempt to buy the car 390 00:19:41,083 --> 00:19:44,751 but for each bot I would launch maybe between I forget what 391 00:19:44,751 --> 00:19:47,334 the real number was because I haven't looked 392 00:19:47,334 --> 00:19:49,667 at the code for ages. 393 00:19:49,999 --> 00:19:52,292 But I probably launched between five and seven attempts 394 00:19:52,292 --> 00:19:54,083 to buy the car. 
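[Editor's sketch] One way to picture the second-phase idea: estimate the lag, then have each bot fire several purchase attempts at staggered times so that one of them is likely to land right as the sale opens. The talk does not say how lag was estimated or how the attempts were spaced, so both the median-based estimate and the even spacing below are assumptions, and the function names are hypothetical.

```python
from statistics import median


def estimate_lag(recent_response_times_s: list[float]) -> float:
    """Guess the lag at sale time from recently measured response times.

    Assumption: a plain median; the real estimator is not described.
    """
    return median(recent_response_times_s)


def launch_offsets(estimated_lag_s: float, attempts: int = 6) -> list[float]:
    """Seconds before sale time at which to fire each attempt.

    Attempts are spread evenly from the full estimated lag down to zero,
    so the earliest request is sent one lag-length ahead of the sale.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    if attempts == 1:
        return [estimated_lag_s]
    step = estimated_lag_s / (attempts - 1)
    return [estimated_lag_s - i * step for i in range(attempts)]


if __name__ == "__main__":
    lag = estimate_lag([8.0, 12.0, 10.0])
    print(launch_offsets(lag, attempts=6))
```

The default of six attempts sits inside the five-to-seven range mentioned in the talk; the exact number used is not known.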
395 00:19:54,250 --> 00:19:56,999 And based on the amount of lag time that I was going 396 00:19:56,999 --> 00:20:00,209 to anticipate at the sale time, I would launch them just 397 00:20:00,209 --> 00:20:04,667 a little bit before, incrementally before the sale time. 398 00:20:04,751 --> 00:20:07,417 And this was real successful. 399 00:20:07,626 --> 00:20:10,626 So now there will be a number of bots and each one 400 00:20:10,626 --> 00:20:13,999 of those basically had a warhead that launched multiple 401 00:20:13,999 --> 00:20:16,375 attempts to buy the cars. 402 00:20:16,999 --> 00:20:20,125 So our success rate prior to making this fix, 403 00:20:20,125 --> 00:20:25,876 during the competition, was about, he was getting about 50 percent. 404 00:20:25,876 --> 00:20:28,542 After, we were back right on the money. 405 00:20:28,542 --> 00:20:30,667 We were getting every car we wanted. 406 00:20:30,792 --> 00:20:34,334 And it stayed that way through the duration of this program. 407 00:20:35,918 --> 00:20:38,999 So how successful was the bot? 408 00:20:39,167 --> 00:20:41,999 These are all guesses, okay, because I don't have any hard 409 00:20:41,999 --> 00:20:43,626 facts here. 410 00:20:43,751 --> 00:20:47,918 But I know it was in operation for about 40 weeks. 411 00:20:47,999 --> 00:20:51,501 And they were buying roughly five cars a day. 412 00:20:51,626 --> 00:20:54,417 So that's about 800 cars, I'm going to estimate, 413 00:20:54,417 --> 00:20:56,999 were purchased with this. 414 00:20:57,334 --> 00:21:00,167 If you figure the average wholesale cost 415 00:21:00,167 --> 00:21:03,417 of the cars they were purchasing was probably 416 00:21:03,417 --> 00:21:05,501 around $16,000. 417 00:21:05,542 --> 00:21:10,083 So in a 40 week period, this bot purchased almost $13 million 418 00:21:10,083 --> 00:21:12,167 worth of cars. 419 00:21:13,375 --> 00:21:19,167 And that has a huge impact on a small dealer like this one.
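[Editor's sketch] The back-of-the-envelope estimate above can be checked in a few lines. All inputs are the speaker's rough guesses. The number of sale days per week is never stated; the last line only shows what the 800-car figure would imply, which is an inference rather than something from the talk.

```python
weeks = 40                 # "in operation for about 40 weeks"
cars_per_day = 5           # "roughly five cars a day"
cars_total = 800           # the speaker's estimate
avg_wholesale_usd = 16_000

# Total spend implied by the speaker's figures.
total_usd = cars_total * avg_wholesale_usd

# Sale days per week that would make 40 weeks x 5 cars/day equal 800 cars.
implied_sale_days_per_week = cars_total / (weeks * cars_per_day)

print(f"${total_usd:,}")            # $12,800,000, i.e. "almost $13 million"
print(implied_sale_days_per_week)   # 4.0
```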
420 00:21:19,167 --> 00:21:23,250 So this is a great example of not accepting the web as it is. 421 00:21:23,417 --> 00:21:26,918 Not using browsers the way everybody else would. 422 00:21:26,918 --> 00:21:29,709 And doing something different and not being afraid to step 423 00:21:29,709 --> 00:21:32,501 outside of the box a little bit. 424 00:21:32,959 --> 00:21:36,626 So what would I do differently today if I was going to do this? 425 00:21:37,959 --> 00:21:41,083 First, there were things that were done pretty well back then and things that 426 00:21:41,083 --> 00:21:42,999 I still do today. 427 00:21:43,083 --> 00:21:46,209 I really like having very lightweight clients. 428 00:21:46,209 --> 00:21:47,709 The lighter the better. 429 00:21:47,876 --> 00:21:51,709 Everything was easily updated because it was all online. 430 00:21:51,792 --> 00:21:53,584 And it was easily distributed. 431 00:21:53,584 --> 00:21:55,584 I could make changes on the server. 432 00:21:55,584 --> 00:21:57,501 It would get distributed everywhere. 433 00:21:57,501 --> 00:22:00,959 Because basically these were just... these bot clients were essentially just web 434 00:22:00,959 --> 00:22:04,792 pages with some JavaScript and stuff going on. 435 00:22:05,250 --> 00:22:09,375 One of the things that I really definitely would do if I were to do this 436 00:22:09,375 --> 00:22:13,918 over is I would build in some analytics and collect metrics. 437 00:22:13,999 --> 00:22:16,501 So I would really want to know exactly what our success 438 00:22:16,501 --> 00:22:17,999 rate was. 439 00:22:17,999 --> 00:22:20,959 I would want to know exactly how much these cars 440 00:22:20,959 --> 00:22:23,167 were purchased for. 441 00:22:23,375 --> 00:22:25,334 It would be really great to also know how much 442 00:22:25,334 --> 00:22:27,375 they were sold for. 443 00:22:27,417 --> 00:22:29,250 So I could actually show value. 444 00:22:29,876 --> 00:22:32,167 That's something I really wish I had done.
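The metrics wished for here (success rate, purchase prices, resale prices) could be collected with something as small as the sketch below. Nothing like this existed in the original bot, by the speaker's own account; the record type `PurchaseAttempt`, its field names, and the `summarize` function are hypothetical, and the numbers in the usage lines are invented purely to exercise the code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PurchaseAttempt:
    """One attempt to buy a vehicle (hypothetical record layout)."""
    vehicle_id: str
    won: bool
    purchase_price: Optional[float] = None  # known only if the car was won
    resale_price: Optional[float] = None    # known only once the car is sold


def summarize(attempts: List[PurchaseAttempt]) -> dict:
    """Compute success rate, total spend, and gross margin on resold cars."""
    wins = [a for a in attempts if a.won]
    spent = sum(a.purchase_price for a in wins if a.purchase_price is not None)
    margin = sum(
        a.resale_price - a.purchase_price
        for a in wins
        if a.purchase_price is not None and a.resale_price is not None
    )
    rate = len(wins) / len(attempts) if attempts else 0.0
    return {
        "attempts": len(attempts),
        "wins": len(wins),
        "success_rate": rate,
        "total_spent": spent,
        "gross_margin": margin,
    }


# Made-up sample records, for illustration only.
report = summarize([
    PurchaseAttempt("car-1", True, 16000.0, 18500.0),
    PurchaseAttempt("car-2", False),
    PurchaseAttempt("car-3", True, 15000.0),
    PurchaseAttempt("car-4", True, 17000.0, 19000.0),
])
```

Logging one such record per attempt would replace the guesses about cars purchased and dollars spent with hard numbers, and the margin figure is what would let the bot's value be shown directly.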
445 00:22:33,083 --> 00:22:37,250 The other thing I think that would have been nice if I were to do this 446 00:22:37,250 --> 00:22:40,751 over again is build in some process that actually assists 447 00:22:40,751 --> 00:22:43,167 in the selection of which vehicles you want 448 00:22:43,167 --> 00:22:44,999 to purchase. 449 00:22:44,999 --> 00:22:47,584 So, in other words, maybe what I would have done 450 00:22:47,584 --> 00:22:51,751 is I would have also had my bot look at Kelley Blue Book and figure 451 00:22:51,751 --> 00:22:57,375 out what the good wholesale prices are for cars and look for discrepancies. 452 00:22:57,542 --> 00:23:00,167 Locate the ones that are underpriced. 453 00:23:00,167 --> 00:23:02,584 That would have been a really good thing to do. 454 00:23:04,167 --> 00:23:08,292 The other thing that occurred to me, actually within the last week, 455 00:23:08,292 --> 00:23:10,999 is probably the only thing I really needed 456 00:23:10,999 --> 00:23:13,584 to do here is make that Buy It Now button 457 00:23:13,584 --> 00:23:15,417 happen, right? 458 00:23:15,834 --> 00:23:18,375 I could have done that simply by making the server act kind 459 00:23:18,375 --> 00:23:20,209 of like a proxy. 460 00:23:20,626 --> 00:23:23,417 So as the HTML is coming in with the grayed-out button, 461 00:23:23,417 --> 00:23:26,999 I could have just replaced it with a real button and sent it 462 00:23:26,999 --> 00:23:29,709 off to the browser, right? 463 00:23:29,876 --> 00:23:32,125 That probably would have worked. 464 00:23:32,751 --> 00:23:34,626 The problem there is that conceivably you could have 465 00:23:34,626 --> 00:23:37,250 bought cars before the purchase time. 466 00:23:37,876 --> 00:23:39,626 And that may have been allowed. 467 00:23:39,999 --> 00:23:42,834 But that is something you don't want to do for the same reason you don't 468 00:23:42,834 --> 00:23:45,083 want to buy cars that don't exist. 469 00:23:45,167 --> 00:23:47,999 You don't want to show your hand.
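The proxy idea just described, rewriting the incoming HTML so the grayed-out button becomes a live one before the page reaches the browser, might look something like this. It is a sketch under a big assumption: that the button was grayed out with the standard HTML `disabled` attribute, which the talk does not actually say. The function name `enable_buttons` and the sample markup are made up for illustration, and a regular expression is used only to keep the example short; a real proxy would more likely use an HTML parser.

```python
import re

# Matches a `disabled` attribute, with or without a value, inside a tag.
_DISABLED_ATTR = re.compile(
    r'\s+disabled\b(?:\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+))?',
    re.IGNORECASE,
)
# Matches opening <button ...> and <input ...> tags.
_BUTTON_TAG = re.compile(r"<(?:button|input)\b[^>]*>", re.IGNORECASE)


def enable_buttons(html: str) -> str:
    """Strip the `disabled` attribute from every button/input tag in html."""
    return _BUTTON_TAG.sub(
        lambda match: _DISABLED_ATTR.sub("", match.group(0)),
        html,
    )


# Hypothetical page fragment as it might arrive from the sale server.
page = '<form><input type="submit" name="buy" value="Buy It Now" disabled="disabled"></form>'
rewritten = enable_buttons(page)
```

As the speaker notes, this would let a purchase be submitted before the sale time, which is exactly the kind of behavior that shows your hand.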
470 00:23:49,959 --> 00:23:56,125 The website, the target, was a very traditional website. 471 00:23:56,125 --> 00:23:59,584 It used HTML forms which were really easy for me 472 00:23:59,584 --> 00:24:04,083 to emulate or submit using just PHP and curl. 473 00:24:04,083 --> 00:24:06,709 Today you don't find that so often. 474 00:24:06,709 --> 00:24:08,125 You find a lot of JavaScript. 475 00:24:08,125 --> 00:24:10,083 You find a lot of AJAX. 476 00:24:10,501 --> 00:24:17,959 There's a lot of JavaScript validation of form data before it's submitted. 477 00:24:18,209 --> 00:24:20,999 Makes it a lot harder to do this kind of thing today. 478 00:24:21,626 --> 00:24:24,542 So today the kind of approach that I take now is I end 479 00:24:24,542 --> 00:24:27,250 up with a task queue, which is basically a table in 480 00:24:27,250 --> 00:24:31,125 a database that keeps track of what needs to be done. 481 00:24:31,501 --> 00:24:33,626 And there's a web interface into that. 482 00:24:33,834 --> 00:24:39,584 In this particular case my client would essentially be loading a task queue. 483 00:24:39,999 --> 00:24:45,959 And that task queue would be fed to individual computers which I refer 484 00:24:45,959 --> 00:24:48,250 to as harvesters. 485 00:24:48,792 --> 00:24:51,792 And they can exist anywhere. 486 00:24:51,792 --> 00:24:52,999 They can be on the cloud. 487 00:24:52,999 --> 00:24:55,999 They can be in a, you know, in a closet. 488 00:24:55,999 --> 00:24:57,501 They can be in your office. 489 00:24:57,834 --> 00:24:58,999 They can be anywhere. 490 00:24:59,125 --> 00:25:04,083 What I have them do now, since there's so much more complexity 491 00:25:04,083 --> 00:25:09,542 in websites and so much more use of client-side scripting, I do a lot 492 00:25:09,542 --> 00:25:12,125 of stuff in iMacros. 493 00:25:12,292 --> 00:25:14,626 Anybody here use iMacros? 494 00:25:14,918 --> 00:25:17,584 It is the most amazing tool.
495 00:25:17,584 --> 00:25:19,751 It's just an add-on to your browser that essentially lets 496 00:25:19,751 --> 00:25:22,542 you create a macro for your browser that you can just play 497 00:25:22,542 --> 00:25:24,501 over and over again. 498 00:25:25,167 --> 00:25:31,999 And what I do now is the harvesters will dynamically create that macro. 499 00:25:31,999 --> 00:25:34,459 So you can get them to do some very specific things. 500 00:25:34,459 --> 00:25:36,834 Once I learned how to do that, there was not a single website 501 00:25:36,834 --> 00:25:39,709 on the planet I could not manipulate. 502 00:25:39,999 --> 00:25:42,417 It was like the gods handing me fire. 503 00:25:42,417 --> 00:25:43,417 Like, here. 504 00:25:43,626 --> 00:25:45,792 Here, Mike, you have been a good boy. 505 00:25:46,083 --> 00:25:48,292 So that's what I do now. 506 00:25:48,292 --> 00:25:52,417 And so I actually communicate through Firefox, so it's very easy 507 00:25:52,417 --> 00:25:56,999 for me to emulate human activity now with bots. 508 00:25:57,375 --> 00:25:59,876 So I would have them hit the sale server. 509 00:25:59,876 --> 00:26:02,999 The difficulty there would be to get the timing down correctly. 510 00:26:02,999 --> 00:26:04,999 But I think that could have been done. 511 00:26:04,999 --> 00:26:06,999 And then the harvesters, after they do their thing 512 00:26:06,999 --> 00:26:10,834 with the sale server, the target server, they report back to the bot server 513 00:26:10,834 --> 00:26:13,417 and the queue is updated, and that's how you can tell 514 00:26:13,417 --> 00:26:16,250 what the results are of what you did. 515 00:26:17,083 --> 00:26:21,584 If you are interested in how that kind of stuff works, go on YouTube 516 00:26:21,584 --> 00:26:24,626 and look up my DEF CON 17 talk.
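The task-queue arrangement described here (a database table that records what needs doing, harvesters that pick up tasks, and a report back that updates the queue) can be illustrated with a minimal sketch. This is not the speaker's implementation: SQLite, the `tasks` table layout, and the function names `create_queue`, `add_task`, `claim_task`, and `report_result` are all assumptions chosen to keep the example self-contained, and the web interface and the iMacros step are left out entirely.

```python
import sqlite3


def create_queue(conn):
    """Create the (hypothetical) task table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               id        INTEGER PRIMARY KEY AUTOINCREMENT,
               payload   TEXT NOT NULL,
               status    TEXT NOT NULL DEFAULT 'pending',
               harvester TEXT,
               result    TEXT
           )"""
    )
    conn.commit()


def add_task(conn, payload):
    """Load one task into the queue (the client's side) and return its id."""
    with conn:
        cur = conn.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_task(conn, harvester):
    """Hand the oldest pending task to a harvester, or None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM tasks WHERE status = 'pending' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE tasks SET status = 'claimed', harvester = ? WHERE id = ?",
            (harvester, row[0]),
        )
    return row


def report_result(conn, task_id, result):
    """Record what the harvester reported back and mark the task done."""
    with conn:
        conn.execute(
            "UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
            (result, task_id),
        )


# Walk one made-up task through the queue using an in-memory database.
conn = sqlite3.connect(":memory:")
create_queue(conn)
task_id = add_task(conn, "buy vehicle 123")
claimed = claim_task(conn, "harvester-1")
report_result(conn, task_id, "purchased")
```

With several harvesters on separate machines, the claim step would need proper locking or a server-side database so that two harvesters cannot take the same task; that is omitted here.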
517 00:26:24,792 --> 00:26:30,792 That talk is all about manipulating iMacros in that way to do screen scrapers 518 00:26:30,792 --> 00:26:38,375 for very difficult to scrape sites or difficult to automate kind of sites. 519 00:26:39,250 --> 00:26:41,375 So that's my talk. 520 00:26:41,375 --> 00:26:43,250 Thanks, all of you, for coming. 521 00:26:43,250 --> 00:26:46,125 Thank you to the call for papers goons. 522 00:26:46,125 --> 00:26:47,125 Thank you.