1 00:00:00,000 --> 00:00:01,792 TOM KEENAN: I'm going to actually start while they try 2 00:00:01,792 --> 00:00:03,250 to set this up. 3 00:00:03,250 --> 00:00:06,459 I'll tell you a little bit about what we are going to talk about. 4 00:00:06,459 --> 00:00:07,959 Open data is a movement and it has a lot to do 5 00:00:07,959 --> 00:00:10,999 with the idea that the government has information 6 00:00:10,999 --> 00:00:12,292 on you. 7 00:00:12,292 --> 00:00:14,167 We all know a lot about that. 8 00:00:14,167 --> 00:00:18,083 But legitimately they collect all kinds of stuff. 9 00:00:18,083 --> 00:00:19,792 They collect bus schedules. 10 00:00:19,792 --> 00:00:22,125 They collect voting records, things like that. 11 00:00:22,125 --> 00:00:24,375 So the reason there's an open data movement 12 00:00:24,375 --> 00:00:29,125 is because in fact there is a need for it, right? 13 00:00:29,125 --> 00:00:30,751 We want to know the bus time. 14 00:00:30,751 --> 00:00:32,542 And let's face it, governments are not always 15 00:00:32,542 --> 00:00:35,042 the best writers of software. 16 00:00:35,042 --> 00:00:36,167 I'm a professor. 17 00:00:36,167 --> 00:00:38,709 I spent a frantic two hours yesterday trying to deal 18 00:00:38,709 --> 00:00:42,751 with government grant software in Canada because it was designed 19 00:00:42,751 --> 00:00:44,876 by civil servants. 20 00:00:44,876 --> 00:00:49,834 Often it's best to actually say: Let the community go out and design it. 21 00:00:49,834 --> 00:00:51,999 In New York City, for example, all the subways have 22 00:00:51,999 --> 00:00:56,042 a sign up: "Our transit apps are whiz kid certified." 23 00:00:56,042 --> 00:00:58,334 That means the MTA didn't do them. 24 00:00:58,334 --> 00:01:00,250 They let other companies create applications 25 00:01:00,250 --> 00:01:04,459 to use their data which they make available openly.
26 00:01:04,459 --> 00:01:07,334 One of the reasons governments open up the data is in the interest 27 00:01:07,334 --> 00:01:10,334 of giving better service to customers. 28 00:01:10,334 --> 00:01:12,876 Another one relates to the fact that if they have the data, 29 00:01:12,876 --> 00:01:15,459 they figure sooner or later somebody will put it 30 00:01:15,459 --> 00:01:17,792 out on Wikileaks anyway. 31 00:01:17,792 --> 00:01:19,999 They might as well release what they have. 32 00:01:20,083 --> 00:01:23,250 The ability to go out there and see the government data seems 33 00:01:23,250 --> 00:01:25,667 to be a natural thing. 34 00:01:25,667 --> 00:01:28,751 It resonates with the whole spirit of DEF CON which is data wants 35 00:01:28,751 --> 00:01:30,292 to be free. 36 00:01:30,292 --> 00:01:31,876 Data wants to be open. 37 00:01:31,876 --> 00:01:33,501 As a result, governments are opening up more 38 00:01:33,501 --> 00:01:35,751 and more of their data. 39 00:01:35,751 --> 00:01:37,999 The other thing, of course, is they are cheap. 40 00:01:37,999 --> 00:01:39,459 They don't want to go out there and spend money 41 00:01:39,459 --> 00:01:43,250 on building applications if other people will do it for free. 42 00:01:43,250 --> 00:01:46,125 The reality is, we have a lot of good reasons. 43 00:01:46,125 --> 00:01:48,834 I am not -- this is kind of the big preamble. 44 00:01:48,834 --> 00:01:50,375 I am not against open data. 45 00:01:50,417 --> 00:01:51,918 Open data is great. 46 00:01:51,918 --> 00:01:55,250 There are open data hackathons all over the world. 47 00:01:55,292 --> 00:01:58,834 They have basically the advantage of getting the best minds to take 48 00:01:58,834 --> 00:02:02,667 the government's data and think of creative uses.
49 00:02:02,667 --> 00:02:05,584 If you have four kids and need to take one to the soccer field 50 00:02:05,584 --> 00:02:08,999 and one to the baseball diamond and you want to optimize that, 51 00:02:08,999 --> 00:02:10,834 that's good. 52 00:02:10,876 --> 00:02:13,167 The problem is that a lot of the data 53 00:02:13,167 --> 00:02:16,626 the government collects is related to people. 54 00:02:16,959 --> 00:02:18,292 I'm still hoping we will see images 55 00:02:18,292 --> 00:02:21,626 because I have images to share with you. 56 00:02:21,626 --> 00:02:24,876 I'll tell you one example relates to voting. 57 00:02:24,876 --> 00:02:27,959 It's pretty obvious in a democracy that we want 58 00:02:27,959 --> 00:02:31,709 to know what the voting results were. 59 00:02:31,709 --> 00:02:33,125 So just, and this is an example that I cooked 60 00:02:33,125 --> 00:02:36,667 up to illustrate one of the problems with open data. 61 00:02:36,667 --> 00:02:38,999 We have a city in Alberta called Edmonton. 62 00:02:38,999 --> 00:02:42,626 It has a wildly popular mayor, Stephen Mandel. 63 00:02:42,626 --> 00:02:46,751 He got over half the votes in every ward. 64 00:02:46,999 --> 00:02:49,876 When they published the votes there were minority 65 00:02:49,876 --> 00:02:53,999 candidates, people who only got a few votes and got zero votes 66 00:02:53,999 --> 00:02:55,959 in some areas. 67 00:02:55,959 --> 00:02:59,999 Say Mr. Dowdele, one of the other candidates, the wife was 68 00:02:59,999 --> 00:03:06,876 in the hospital and she says, "Yes, dear, I voted and I voted for you." 69 00:03:06,876 --> 00:03:09,999 He goes and pulls down the online data and guess what? 70 00:03:09,999 --> 00:03:13,083 He got zero votes in the city wide hospital poll. 71 00:03:13,083 --> 00:03:14,834 That's a type of torturing of data that 72 00:03:14,834 --> 00:03:17,667 the government never anticipated.
73 00:03:17,667 --> 00:03:22,083 They never anticipated we would ask questions like that of the data. 74 00:03:22,083 --> 00:03:24,751 So the term torturing data in my title really relates 75 00:03:24,751 --> 00:03:28,083 to doing what most people in this room love to do, which 76 00:03:28,083 --> 00:03:34,083 is to go out there and find things that you are not supposed to be able to find. 77 00:03:34,083 --> 00:03:36,459 So when people ask what is DEF CON? 78 00:03:36,459 --> 00:03:40,667 I say basically it's not people who are doing things that they shouldn't do. 79 00:03:40,667 --> 00:03:43,667 It's people doing things that they shouldn't be able to do. 80 00:03:43,667 --> 00:03:46,375 And you can do that with data quite effectively. 81 00:03:46,375 --> 00:03:48,959 So again while we are praying to the A/V Gods there, 82 00:03:48,959 --> 00:03:52,209 I'll start telling you about an example. 83 00:03:52,209 --> 00:03:54,999 We'll go through it quickly when the video comes up. 84 00:03:54,999 --> 00:03:55,999 There actually was a challenge called 85 00:03:55,999 --> 00:03:58,999 the open data challenge held in Europe. 86 00:03:59,125 --> 00:04:01,999 They got entries from most of the EU countries. 87 00:04:02,375 --> 00:04:05,292 The one that won was from Slovenia. 88 00:04:05,292 --> 00:04:08,501 I can't speak Slovenian, so I can't say the name. 89 00:04:08,501 --> 00:04:12,083 But it translates as: From our taxes. 90 00:04:12,083 --> 00:04:16,292 What they did is they actually took all the government contracts that were 91 00:04:16,292 --> 00:04:18,999 registered online and they started going 92 00:04:18,999 --> 00:04:22,083 out there and putting them together and finding 93 00:04:22,083 --> 00:04:25,083 out what names were in common. 94 00:04:25,083 --> 00:04:26,999 Which people were the directors of different companies, 95 00:04:26,999 --> 00:04:29,209 which people were the lawyers who represented 96 00:04:29,209 --> 00:04:31,250 the deals and so on.
97 00:04:31,250 --> 00:04:35,667 And an interesting thing happened: A woman named Ludmilla decided this 98 00:04:35,667 --> 00:04:38,417 was invading her privacy. 99 00:04:38,501 --> 00:04:42,375 She actually convinced a judge in Slovenia to go out there and 100 00:04:42,375 --> 00:04:46,834 to order the NGO that created this award-winning application 101 00:04:46,834 --> 00:04:49,292 to take down the data. 102 00:04:49,417 --> 00:04:51,417 Somebody asked me before the talk how technical this 103 00:04:51,417 --> 00:04:53,083 is going to be. 104 00:04:53,083 --> 00:04:55,209 This is about as technical as it gets. 105 00:04:55,209 --> 00:04:56,999 They didn't have that data. 106 00:04:56,999 --> 00:04:59,999 That data was scraped off government databases that were 107 00:04:59,999 --> 00:05:02,876 maintained by the government. 108 00:05:02,876 --> 00:05:04,417 So here we have a judge, and it proves 109 00:05:04,417 --> 00:05:07,250 the old saying that judges are about ten years 110 00:05:07,250 --> 00:05:09,542 behind the average member of society 111 00:05:09,542 --> 00:05:12,459 in understanding of technology. 112 00:05:12,459 --> 00:05:13,459 Yes, that's right. 113 00:05:13,459 --> 00:05:14,459 Yes, thank you. 114 00:05:14,459 --> 00:05:19,292 And so the judge is ordering them to take down this information. 115 00:05:19,292 --> 00:05:22,459 And they can't take it down because they don't really have it. 116 00:05:22,459 --> 00:05:25,459 I suppose they could put filters into their program that somehow allow 117 00:05:25,459 --> 00:05:29,501 them to go out there and to get that information trapped so that it 118 00:05:29,501 --> 00:05:33,501 couldn't be revealed, but it just kind of shows a misunderstanding, 119 00:05:33,501 --> 00:05:36,667 a fundamental misunderstanding. 120 00:05:36,667 --> 00:05:38,584 Do we have any hopes back there, guys? 121 00:05:39,209 --> 00:05:41,417 (Speaker away from microphone.)
TOM KEENAN: 122 00:05:41,417 --> 00:05:43,834 Hope springs eternal, I guess. 123 00:05:43,834 --> 00:05:45,375 I could hold the screen up and give everybody a telescope 124 00:05:45,375 --> 00:05:47,459 or something like that. 125 00:05:47,709 --> 00:05:50,751 Let's talk about other kinds of open data. 126 00:05:50,751 --> 00:05:52,584 We have a member of the legislative assembly 127 00:05:52,584 --> 00:05:57,209 of Alberta who had the misfortune to do two things wrong. 128 00:05:57,209 --> 00:06:01,584 On a government trip he solicited a prostitute and got caught. 129 00:06:02,751 --> 00:06:04,876 His name is Mike Allen. 130 00:06:04,876 --> 00:06:06,167 It was all over the news. 131 00:06:06,167 --> 00:06:10,083 I said okay, what else can I find out about Mike Allen? 132 00:06:10,083 --> 00:06:11,459 I went to a database. 133 00:06:11,459 --> 00:06:16,083 I went to the Ramsey County, Minnesota, sheriff's office database. 134 00:06:16,125 --> 00:06:19,834 There is poor Mike's information on what he was doing, 135 00:06:19,834 --> 00:06:22,542 the fact that it was gross. 136 00:06:22,542 --> 00:06:25,417 They use the words "gross misconduct." 137 00:06:25,417 --> 00:06:27,083 And his home address. 138 00:06:27,083 --> 00:06:29,709 And that's where things get pretty interesting. 139 00:06:29,709 --> 00:06:31,459 If you think about it, having his home address up there, 140 00:06:31,459 --> 00:06:34,417 it's private information in a sense. 141 00:06:34,626 --> 00:06:39,209 Hey, he was arrested, some sheriff in Minneapolis took that and posted it. 142 00:06:39,209 --> 00:06:43,999 Different countries have different regard for privacy. 143 00:06:43,999 --> 00:06:45,459 So anybody from Germany here? 144 00:06:45,751 --> 00:06:49,083 (One cheer.) TOM KEENAN: If you're from Germany, you're the best. 145 00:06:49,083 --> 00:06:52,459 The people from Facebook, they can't even operate in Germany. 146 00:06:52,999 --> 00:06:54,876 Everything is against the law there.
147 00:06:56,083 --> 00:06:59,083 Canada is between Germany and the U.S. 148 00:06:59,083 --> 00:07:01,918 A lot of things you can get away with in the U.S. 149 00:07:02,083 --> 00:07:06,125 in terms of posting information on people would not fly in Canada. 150 00:07:06,125 --> 00:07:09,167 To take that specific example I'm sure if this guy was arrested 151 00:07:09,167 --> 00:07:12,792 they wouldn't willy-nilly put up the information about where 152 00:07:12,792 --> 00:07:16,167 he lives because what is the need for that? 153 00:07:16,167 --> 00:07:18,250 But in the United States there's certainly a tendency to do more 154 00:07:18,250 --> 00:07:19,999 and more of that. 155 00:07:19,999 --> 00:07:22,834 One of the photos if we get to it, I will be able to show you, it's 156 00:07:22,834 --> 00:07:25,292 from Hernando County, Florida. 157 00:07:25,292 --> 00:07:26,999 Hernando County is a place you never want 158 00:07:26,999 --> 00:07:28,751 to get arrested. 159 00:07:29,876 --> 00:07:34,125 Those from Germany who don't know, the sheriffs are elected officials, 160 00:07:34,125 --> 00:07:37,999 dog catchers, people like that, they get to be what they are 161 00:07:37,999 --> 00:07:40,959 by having people vote for them. 162 00:07:40,959 --> 00:07:43,083 The reality is, they go out there and want to show that 163 00:07:43,083 --> 00:07:45,501 they are doing their job. 164 00:07:45,501 --> 00:07:47,083 They are actually a good sheriff. 165 00:07:47,083 --> 00:07:51,918 What better way than to actually arrest people? 166 00:07:51,918 --> 00:07:54,375 And sure enough, in Hernando County, they post the mug shot 167 00:07:54,375 --> 00:07:56,999 of everyone who is arrested. 168 00:07:56,999 --> 00:07:59,083 If you are done for speeding or whatever down there, 169 00:07:59,083 --> 00:08:02,334 and there's a really seedy -- and we may have the slide one 170 00:08:02,334 --> 00:08:04,083 of these days.
171 00:08:04,083 --> 00:08:07,999 If we do, you'll see seedy characters, people you would cross the street 172 00:08:07,999 --> 00:08:11,083 to avoid, who have been arrested. 173 00:08:11,334 --> 00:08:14,709 You'll also see a photo of a 12-year-old boy. 174 00:08:14,834 --> 00:08:17,792 I captured that information off the government database 175 00:08:17,792 --> 00:08:20,751 and used it in my presentation. 176 00:08:20,751 --> 00:08:22,083 Now, I obscured his name. 177 00:08:22,083 --> 00:08:23,417 First name is Bobby. 178 00:08:23,417 --> 00:08:24,667 I obscured the surname. 179 00:08:24,667 --> 00:08:27,999 Put a black bar across his eyes so you can't see 180 00:08:27,999 --> 00:08:33,876 Bobby's personal identity, but the sheriff didn't do that. 181 00:08:33,876 --> 00:08:38,417 The first thing to realize, he's up there, a minor child, 12 years old. 182 00:08:38,876 --> 00:08:42,334 He's being permanently tarred in some ways. 183 00:08:42,334 --> 00:08:44,999 Now, you say, well, not really permanently because after all, 184 00:08:44,999 --> 00:08:47,876 it's only up there for 30 days. 185 00:08:47,876 --> 00:08:52,125 The answer is yeah, but it's been in my presentation for over a year now. 186 00:08:52,125 --> 00:08:56,876 When data is out there, it is -- there's no way to call it back. 187 00:08:56,876 --> 00:09:00,999 There's no way to bring data back into the fold if you've let it loose. 188 00:09:00,999 --> 00:09:03,375 This brings a lot of vulnerability. 189 00:09:03,375 --> 00:09:07,083 The reality is quite a bit of data is being put out there, 190 00:09:07,083 --> 00:09:09,626 captured by people. 191 00:09:09,626 --> 00:09:11,167 They say if you put a photo up on Facebook, 192 00:09:11,167 --> 00:09:15,417 at the very least it's been copied by the NSA, but probably by a lot 193 00:09:15,417 --> 00:09:17,999 of other people as well. 194 00:09:18,250 --> 00:09:24,250 There's no way to call back data from being posted somewhere. 
195 00:09:24,876 --> 00:09:26,501 One aspect again, I'm randomly remembering 196 00:09:26,501 --> 00:09:28,334 my presentation. 197 00:09:28,334 --> 00:09:32,250 I usually print out a cheat sheet, but I trusted the tech here. 198 00:09:32,417 --> 00:09:33,417 Ha-ha. 199 00:09:33,876 --> 00:09:36,459 (Cheers and applause and laughter.) TOM KEENAN: 200 00:09:36,459 --> 00:09:40,083 Another aspect relates to DNA information. 201 00:09:40,083 --> 00:09:42,959 How many of you know ancestry.com? 202 00:09:44,667 --> 00:09:47,999 Or ancestry.co, all the different versions. 203 00:09:48,083 --> 00:09:53,083 I won't ask because it's embarrassing to ask, who has profiles on that? 204 00:09:53,083 --> 00:09:55,542 The business of ancestry, they allow you to sign 205 00:09:55,542 --> 00:10:01,125 up for 14 days for free and do all kinds of exploring with their data. 206 00:10:01,125 --> 00:10:04,792 They are using public information, census data. 207 00:10:05,000 --> 00:10:09,626 They are using military records, prison records, all kinds of things. 208 00:10:09,626 --> 00:10:11,999 Those are particularly useful for Australia. 209 00:10:11,999 --> 00:10:13,792 Lots of prison records down there. 210 00:10:13,792 --> 00:10:15,417 And so they take all this data. 211 00:10:15,417 --> 00:10:18,584 They make it freely available to you for 14 days. 212 00:10:18,584 --> 00:10:23,417 On the 15th day they charge you $299 if you don't cancel the membership. 213 00:10:23,417 --> 00:10:24,999 It's an interesting model. 214 00:10:24,999 --> 00:10:29,042 They take public data and make money from it. 215 00:10:29,042 --> 00:10:31,000 We assume that's probably okay. 216 00:10:31,000 --> 00:10:35,792 The second thing, of course, they, after you leave, after you tell them, hey, 217 00:10:35,792 --> 00:10:39,626 I don't want to sign up for $299, they go out there and 218 00:10:39,626 --> 00:10:41,999 they keep the data. 219 00:10:41,999 --> 00:10:43,834 They know your family tree. 
220 00:10:43,834 --> 00:10:46,999 You have enriched their database and guess what? 221 00:10:46,999 --> 00:10:49,209 You can never -- you can cancel your account. 222 00:10:49,209 --> 00:10:51,999 You can check into the Hotel California. 223 00:10:51,999 --> 00:10:53,626 You can never check out your data. 224 00:10:53,626 --> 00:10:55,959 It is permanently part of their database. 225 00:10:55,959 --> 00:10:57,999 A few years ago they got a brain wave. 226 00:10:57,999 --> 00:11:00,334 You know what would make this a lot better? 227 00:11:00,334 --> 00:11:04,459 Send us your DNA and we'll tell you if you are descended from Adam 228 00:11:04,459 --> 00:11:09,459 or Noah or whoever is way, way back in your family tree. 229 00:11:09,626 --> 00:11:13,250 Some people were actually dumb enough to send their DNA. 230 00:11:13,751 --> 00:11:14,959 Doing that, of course, provides 231 00:11:14,959 --> 00:11:17,792 a tremendous amount of information. 232 00:11:17,918 --> 00:11:19,999 If you think about it, with DNA information it's not just 233 00:11:19,999 --> 00:11:21,459 about you. 234 00:11:21,459 --> 00:11:25,999 It's about your siblings, your family, all kinds of people who didn't give any 235 00:11:25,999 --> 00:11:29,626 consent for you to give that information. 236 00:11:29,626 --> 00:11:31,667 So the reality is, putting DNA information 237 00:11:31,667 --> 00:11:33,918 out there is risky. 238 00:11:33,918 --> 00:11:36,083 Giving it up voluntarily is dumb. 239 00:11:36,083 --> 00:11:39,167 And privacy international, the NGO in the U.K. 240 00:11:39,167 --> 00:11:43,083 actually launched a lawsuit against ancestry.com. 241 00:11:43,083 --> 00:11:45,083 Now, I was down in Utah and I thought I would visit 242 00:11:45,083 --> 00:11:48,999 ancestry.com and I tried to find out all about this. 243 00:11:48,999 --> 00:11:52,250 They said oh, well, we just keep the database here. 
244 00:11:52,250 --> 00:11:55,667 That's the genetic genealogy project and it's with the smart people 245 00:11:55,667 --> 00:11:57,876 down in California. 246 00:11:57,876 --> 00:12:02,083 So the genetic genealogy project is something to worry about. 247 00:12:02,083 --> 00:12:03,459 The possibility is out there for somebody 248 00:12:03,459 --> 00:12:06,918 to get tremendous amounts of information on you. 249 00:12:06,999 --> 00:12:09,125 Do we have any hope back there? 250 00:12:09,292 --> 00:12:11,375 Hope springs eternal? 251 00:12:12,709 --> 00:12:14,125 (Speaker away from microphone.) TOM KEENAN: 252 00:12:14,125 --> 00:12:15,999 You're getting closer? 253 00:12:18,999 --> 00:12:21,751 (Speaker away from microphone.) TOM KEENAN: 254 00:12:21,751 --> 00:12:25,167 Everyone leave your laptop at the door, right? 255 00:12:25,292 --> 00:12:28,999 A couple of principles on dealing with this data. 256 00:12:28,999 --> 00:12:31,918 New York City released with great fanfare 257 00:12:31,918 --> 00:12:35,292 in 2009 the New York City data mine. 258 00:12:35,292 --> 00:12:36,999 They had 210 databases. 259 00:12:36,999 --> 00:12:39,999 There they had interesting things like all the women's organizations 260 00:12:39,999 --> 00:12:43,501 in New York City are now in this database. 261 00:12:43,542 --> 00:12:46,999 They realized a day later that they had forgotten to take 262 00:12:46,999 --> 00:12:51,999 out the private e-mail addresses and the secret questions. 263 00:12:52,125 --> 00:12:57,167 So what was your first pet is the most common secret question. 264 00:12:57,167 --> 00:12:59,292 And Fluffy is the most common answer. 265 00:12:59,584 --> 00:13:01,999 That information was disclosed out there. 266 00:13:01,999 --> 00:13:04,501 You know, the next day there were only 109 public databases 267 00:13:04,501 --> 00:13:06,999 from New York City because they had to go in there 268 00:13:06,999 --> 00:13:09,999 and take back some of that information.
269 00:13:09,999 --> 00:13:13,375 I want to explain an experiment to you that I did with what 270 00:13:13,375 --> 00:13:16,125 is called open Philly.com. 271 00:13:16,125 --> 00:13:17,626 Philadelphia is one of the leading cities 272 00:13:17,626 --> 00:13:19,626 in making data open. 273 00:13:19,626 --> 00:13:22,417 They went out and actually put online 274 00:13:22,417 --> 00:13:25,417 contribution records. 275 00:13:25,542 --> 00:13:29,375 Because it's an international audience, it varies a lot from country to country, 276 00:13:29,375 --> 00:13:33,999 but in the United States there are laws about campaign contributions. 277 00:13:34,167 --> 00:13:37,083 For those over $200, they say that if you are making a contribution, 278 00:13:37,083 --> 00:13:41,999 you have to go out and give your name, address and occupation. 279 00:13:42,083 --> 00:13:43,918 Those are public. 280 00:13:45,334 --> 00:13:49,250 Philadelphia took all the contributions, even those of $1.49 281 00:13:49,250 --> 00:13:52,667 for the last seven years retroactively and put them 282 00:13:52,667 --> 00:13:55,375 up in a wonderful database. 283 00:13:55,375 --> 00:13:57,667 We are going to get technical one more time. 284 00:13:57,667 --> 00:14:00,250 It was put up there with a front end that said you can query 285 00:14:00,250 --> 00:14:02,834 this database, but hey, you better not go 286 00:14:02,834 --> 00:14:05,959 out there and actually download it. 287 00:14:05,959 --> 00:14:07,334 It is not for downloading. 288 00:14:07,667 --> 00:14:09,292 Here is how you download it. 289 00:14:09,292 --> 00:14:13,417 Any of my computer science students who didn't get this get an F. 290 00:14:13,417 --> 00:14:17,501 First you say download all the people whose name begins 291 00:14:17,501 --> 00:14:17,999 with A. 292 00:14:17,999 --> 00:14:23,292 Then the last name beginning with A, and then last names, B and C and D. 293 00:14:23,292 --> 00:14:25,501 See the pattern?
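The letter-by-letter download described above can be sketched roughly as follows. This is only an illustration, not the speaker's actual script: the records, the `query_by_prefix` function (a stand-in for the city's query-only front end), and the field names are all made up for the example.

```python
import csv
import io
import string

# Hypothetical stand-in for a query-only front end: it answers one
# last-name-prefix search at a time and offers no bulk export.
SAMPLE_RECORDS = [
    {"last_name": "Adams", "first_name": "Pat", "amount": "25.00"},
    {"last_name": "Baker", "first_name": "Lee", "amount": "1.49"},
    {"last_name": "Smith", "first_name": "J.", "amount": "200.00"},
    {"last_name": "Smith", "first_name": "John", "amount": "50.00"},
]


def query_by_prefix(prefix):
    """Return every record whose last name starts with `prefix`."""
    return [r for r in SAMPLE_RECORDS
            if r["last_name"].upper().startswith(prefix.upper())]


def download_everything():
    """Rebuild the full table by asking for A, then B, then C, and so on."""
    rows = []
    for letter in string.ascii_uppercase:
        rows.extend(query_by_prefix(letter))
    return rows


def to_csv(rows):
    """Serialize the collected rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["last_name", "first_name", "amount"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


all_rows = download_everything()
csv_text = to_csv(all_rows)
```

Twenty-six narrow queries add up to the whole table, which is why a "query but don't download" front end offers little protection. Note that this sketch would miss any last name that does not start with a letter A to Z.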
294 00:14:25,918 --> 00:14:28,501 Within a few minutes we had a comma separated file 295 00:14:28,501 --> 00:14:31,125 of all these contributions. 296 00:14:31,417 --> 00:14:34,083 Then you torture the data and have fun with it. 297 00:14:34,459 --> 00:14:35,999 What can I find out? 298 00:14:35,999 --> 00:14:37,709 Who in this room knows who Ronald 299 00:14:37,709 --> 00:14:39,250 Rivest is? 300 00:14:43,542 --> 00:14:46,334 There is a candidate, named Shelly something 301 00:14:46,334 --> 00:14:48,999 or other in Philadelphia. 302 00:14:48,999 --> 00:14:50,709 All of her contributions seem to be local 303 00:14:50,709 --> 00:14:53,501 except one came from Massachusetts. 304 00:14:53,501 --> 00:14:54,501 Who was it? 305 00:14:54,501 --> 00:14:55,501 It was Ron Rivest. 306 00:14:55,501 --> 00:14:59,834 He obviously endorsed this candidate enough that he sent money to her. 307 00:15:00,083 --> 00:15:02,918 By plotting the data on a bit of a graph I was actually able 308 00:15:02,918 --> 00:15:05,375 to find some interesting things. 309 00:15:05,417 --> 00:15:09,209 The most interesting thing I found was that an awful lot of people 310 00:15:09,209 --> 00:15:12,918 in Philadelphia live in the same place. 311 00:15:12,999 --> 00:15:15,125 1719 Spring Street. 312 00:15:15,375 --> 00:15:17,999 I mean, thousands of people live there. 313 00:15:17,999 --> 00:15:20,876 (Laughter.) TOM KEENAN: I went: What the heck is this? 314 00:15:20,876 --> 00:15:22,834 I went to Google maps and brought it up. 315 00:15:22,834 --> 00:15:24,751 It is the offices of the International Brotherhood 316 00:15:24,751 --> 00:15:26,999 of Electrical Workers. 317 00:15:27,542 --> 00:15:31,417 So when Guido comes in and is going to get a job, 318 00:15:31,417 --> 00:15:36,709 they say "Come over here and make this contribution." 319 00:15:36,709 --> 00:15:38,375 Oh, yeah, I have to do that. 320 00:15:38,375 --> 00:15:39,501 You fill out the form.
321 00:15:39,501 --> 00:15:41,999 When people do campaign contributions, they put 322 00:15:41,999 --> 00:15:43,999 the address down because they want 323 00:15:43,999 --> 00:15:45,999 the tax receipt. 324 00:15:47,083 --> 00:15:48,999 Maybe he's in the witness protection program 325 00:15:48,999 --> 00:15:52,167 or something and doesn't want to use his address. 326 00:15:52,167 --> 00:15:53,999 They use the address of the union. 327 00:15:53,999 --> 00:15:55,999 A significant number of the contributions 328 00:15:55,999 --> 00:16:01,083 to some candidates come back to one address, the union. 329 00:16:01,083 --> 00:16:06,709 I thought I would run a statistical study. 330 00:16:06,709 --> 00:16:11,542 I found out the most common names in the United States, Jones and Smith. 331 00:16:11,542 --> 00:16:13,375 And I did an analysis. 332 00:16:13,375 --> 00:16:15,918 The only reason really to publish the name and address 333 00:16:15,918 --> 00:16:20,083 of political contributors that I could see is if there's two John Smiths 334 00:16:20,083 --> 00:16:23,250 and you want to know which one it is. 335 00:16:23,250 --> 00:16:25,375 I actually did a statistical test. 336 00:16:25,375 --> 00:16:26,834 I found there were 385 J. 337 00:16:26,834 --> 00:16:33,083 Smiths and the actual power or the need to actually resolve duplicates 338 00:16:33,083 --> 00:16:37,626 came down to only about three or four. 339 00:16:37,626 --> 00:16:41,083 There was almost no information added by having this, but there was 340 00:16:41,083 --> 00:16:43,999 a significant privacy risk. 341 00:16:43,999 --> 00:16:45,709 So what I'm suggesting to governments is that they need 342 00:16:45,709 --> 00:16:48,167 to go out there and think long and hard about how they put 343 00:16:48,167 --> 00:16:50,999 out the data, why they put out the data. 344 00:16:51,375 --> 00:16:53,999 Again, I don't know if we will ever get the images.
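The two checks described above, counting how many contributions share one address and asking how often an address is actually needed to tell same-named donors apart, might be sketched like this. The records are invented for illustration, and the "ambiguous name" test is one plausible reading of the statistical test mentioned in the talk, not a reconstruction of it.

```python
from collections import Counter, defaultdict

# Made-up contribution records (name, address); not the Philadelphia data.
contributions = [
    ("J. Smith", "12 Oak St"),
    ("J. Smith", "12 Oak St"),
    ("J. Smith", "98 Elm Ave"),
    ("A. Jones", "1719 Spring St"),
    ("B. Jones", "1719 Spring St"),
    ("C. Lee", "1719 Spring St"),
]

# How many contributions list each address? A large count at one address
# suggests an organization's office rather than a home.
per_address = Counter(address for _, address in contributions)

# For each name, collect the distinct addresses it appears with. The address
# only helps tell people apart when one name maps to more than one address.
addresses_by_name = defaultdict(set)
for name, address in contributions:
    addresses_by_name[name].add(address)

ambiguous_names = sorted(
    name for name, addrs in addresses_by_name.items() if len(addrs) > 1
)
```

If only a handful of names land in `ambiguous_names`, publishing every donor's address resolves very few real collisions while exposing everyone's location.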
345 00:16:53,999 --> 00:16:57,334 We are going to have a flash dance of the images if we get them up here. 346 00:16:57,334 --> 00:17:01,999 You have a complete stalking exercise I did on a notable Albertan. 347 00:17:01,999 --> 00:17:04,334 He was the president of an airline. 348 00:17:04,334 --> 00:17:08,709 To tell you the truth, I started with the premier of the province. 349 00:17:08,709 --> 00:17:11,292 I said how well does he protect his privacy? 350 00:17:11,292 --> 00:17:14,125 The first test was, is his phone number unlisted? 351 00:17:14,125 --> 00:17:15,125 It was. 352 00:17:15,125 --> 00:17:16,751 I worked down to other people. 353 00:17:16,751 --> 00:17:19,083 I found the president of a pretty big airline whose home 354 00:17:19,083 --> 00:17:21,083 number is listed. 355 00:17:21,083 --> 00:17:23,792 It is still listed if you look hard enough. 356 00:17:23,959 --> 00:17:24,999 I went from -- (Cheers 357 00:17:24,999 --> 00:17:30,459 and applause.) TOM KEENAN: Hey, we have the technology! 358 00:17:30,501 --> 00:17:31,876 Okay! 359 00:17:32,083 --> 00:17:33,999 So monkey, or whoever is back. 360 00:17:34,709 --> 00:17:36,709 Work with me darling. 361 00:17:36,709 --> 00:17:37,709 Make it happen. 362 00:17:37,709 --> 00:17:38,834 Go forward. 363 00:17:39,834 --> 00:17:43,083 Just keep -- I'll say next, next, next. 364 00:17:43,584 --> 00:17:44,959 We talked about this. 365 00:17:44,959 --> 00:17:45,959 Stop right there. 366 00:17:45,959 --> 00:17:46,959 New York City. 367 00:17:46,959 --> 00:17:48,459 This is one we didn't talk about. 368 00:17:48,459 --> 00:17:52,459 New York City data mine where the women's organizations were outed. 369 00:17:52,459 --> 00:17:54,083 So there's a problem which is neglecting 370 00:17:54,083 --> 00:17:57,876 to weed and redact data before releasing it. 371 00:17:57,876 --> 00:17:58,876 Next slide. 372 00:17:58,918 --> 00:18:00,626 Yes, Toronto.
373 00:18:00,626 --> 00:18:03,250 You call 311, you complain about something. 374 00:18:03,250 --> 00:18:06,626 Sure enough, in Toronto they built a database of that. 375 00:18:06,626 --> 00:18:09,792 They are supposed to be careful to anonymize it. 376 00:18:09,792 --> 00:18:11,959 We have six digit postal codes. 377 00:18:11,959 --> 00:18:13,999 They are supposed to put in the first three. 378 00:18:13,999 --> 00:18:15,292 I looked at the database. 379 00:18:15,292 --> 00:18:18,999 Sometimes they go and put actual intersections. 380 00:18:19,834 --> 00:18:23,417 Sloppy, lazy data entry, the next problem. 381 00:18:23,417 --> 00:18:24,417 Next slide. 382 00:18:24,501 --> 00:18:27,584 This is actually from my place in Calgary. 383 00:18:27,584 --> 00:18:31,292 Do any of you know SeeClickFix? You want to report a pothole or 384 00:18:31,292 --> 00:18:35,125 a place where criminals hang out, you go anonymously 385 00:18:35,125 --> 00:18:38,999 on this system and you enter the data. 386 00:18:38,999 --> 00:18:40,584 Sure enough, in Calgary somebody hates 387 00:18:40,584 --> 00:18:43,334 the car wash in his neighborhood. 388 00:18:43,334 --> 00:18:47,083 Every day he puts in excessive noise, dangerous ice. 389 00:18:47,083 --> 00:18:49,501 He has a hate on for that car wash. 390 00:18:49,501 --> 00:18:50,501 Next slide. 391 00:18:50,959 --> 00:18:53,334 Okay, this is the European open data challenge. 392 00:18:53,334 --> 00:18:54,334 Next slide. 393 00:18:54,459 --> 00:18:55,999 Who won? 394 00:18:55,999 --> 00:18:58,334 There it is in Slovenia, from our taxes. 395 00:18:58,334 --> 00:18:59,334 Next slide. 396 00:18:59,709 --> 00:19:01,709 Okay, who wouldn't agree with that? 397 00:19:01,709 --> 00:19:04,167 Well, the judge ordered them to take down the data. 398 00:19:04,167 --> 00:19:05,167 Next slide. 399 00:19:05,375 --> 00:19:07,083 A little technical side. 400 00:19:07,083 --> 00:19:08,834 They couldn't take down the data.
401 00:19:08,834 --> 00:19:11,167 Judges lag behind technology; we know that. 402 00:19:11,167 --> 00:19:13,999 Next slide, election results. 403 00:19:13,999 --> 00:19:15,876 There's the guy who got all the votes. 404 00:19:15,876 --> 00:19:16,876 Next slide. 405 00:19:16,876 --> 00:19:17,999 Did my wife vote for me? 406 00:19:17,999 --> 00:19:19,209 Apparently not because all these people got 407 00:19:19,209 --> 00:19:20,959 zero votes. 408 00:19:21,999 --> 00:19:24,167 Philadelphia, finance records. 409 00:19:24,167 --> 00:19:27,709 Next slide, there it is, full home addresses provided. 410 00:19:27,709 --> 00:19:28,709 Next slide. 411 00:19:28,792 --> 00:19:31,999 Okay, that's the kind of computer that we had. 412 00:19:31,999 --> 00:19:33,459 I had one of those Commodores. 413 00:19:33,459 --> 00:19:34,459 Anybody have a PET? 414 00:19:34,459 --> 00:19:35,459 Yeah. 415 00:19:35,459 --> 00:19:37,709 1970s when they wrote the election law. 416 00:19:37,709 --> 00:19:39,375 Maybe they're a little out of date. 417 00:19:39,375 --> 00:19:40,375 Next slide. 418 00:19:40,501 --> 00:19:42,334 There's the kind of stuff I can get. 419 00:19:42,334 --> 00:19:45,375 Ronald Rivest's home address if you want to write to him. 420 00:19:47,999 --> 00:19:50,999 Yes, that's Ron and all of his family gave 421 00:19:50,999 --> 00:19:54,125 to Barack Obama in this case. 422 00:19:54,125 --> 00:19:55,125 Next slide. 423 00:19:55,292 --> 00:19:57,083 Let's keep going. 424 00:19:57,083 --> 00:19:58,083 Next slide. 425 00:19:58,083 --> 00:20:01,083 So the files were downloadable. 426 00:20:01,083 --> 00:20:02,083 Next slide. 427 00:20:02,083 --> 00:20:05,501 Common names, Smith, Johnson, Williams. 428 00:20:05,501 --> 00:20:06,999 That's the number of Smiths and Johnsons, 429 00:20:06,999 --> 00:20:09,626 very small number of duplicates. 430 00:20:09,626 --> 00:20:10,626 Next slide.
431 00:20:10,918 --> 00:20:14,999 It was really useless and when I looked at the address on Spring Street, 432 00:20:14,999 --> 00:20:16,709 next slide. 433 00:20:16,709 --> 00:20:19,999 There it is, International Brotherhood of Electrical Workers. 434 00:20:24,125 --> 00:20:27,292 Inconsistent data, like some people giving a home address, 435 00:20:27,292 --> 00:20:29,626 some giving the union. 436 00:20:29,626 --> 00:20:32,999 And the question of why we are requiring an address needs to be examined 437 00:20:32,999 --> 00:20:35,709 in the light of new tools. 438 00:20:35,709 --> 00:20:36,709 Next slide. 439 00:20:36,709 --> 00:20:38,918 There's the guy, Clyde Beddow. 440 00:20:39,876 --> 00:20:42,292 There's his home address from the phone book. 441 00:20:42,292 --> 00:20:43,918 Took out the phone number. 442 00:20:43,918 --> 00:20:44,918 It's still valid. 443 00:20:45,000 --> 00:20:47,999 Property tax assessment, next slide. 444 00:20:48,167 --> 00:20:50,876 Details of his property tax. 445 00:20:50,876 --> 00:20:51,876 Next slide. 446 00:20:51,876 --> 00:20:53,250 There he is plotted on a map. 447 00:20:53,250 --> 00:20:56,292 His house is worth 710 less. 448 00:20:56,292 --> 00:20:57,501 The neighbors less. 449 00:20:57,501 --> 00:21:01,375 This is an industry of writing letters. 450 00:21:01,417 --> 00:21:05,751 Dear Mr. Beddow, you are assessed this much money. 451 00:21:05,751 --> 00:21:09,083 Do you know that your neighbor is assessed at 414,000? 452 00:21:09,083 --> 00:21:11,751 Would you like us to appeal your taxes? 453 00:21:11,792 --> 00:21:14,375 The city had to shut down the database. 454 00:21:14,999 --> 00:21:19,375 They now have technical and legal safeguards which are dumb. 455 00:21:19,501 --> 00:21:23,083 You can only query ten times from an IP address in a day. 456 00:21:23,083 --> 00:21:25,751 Anybody in this room can spoof that one. 457 00:21:25,751 --> 00:21:26,751 Next slide. 458 00:21:27,083 --> 00:21:28,501 Next one? 
459 00:21:28,959 --> 00:21:31,999 Here is Hernando County, the sheriff, next slide. 460 00:21:31,999 --> 00:21:32,999 Smiley face. 461 00:21:33,417 --> 00:21:36,626 There are not-so-friendly people who got arrested for things 462 00:21:36,626 --> 00:21:40,542 like uttering a forged instrument and contempt of court. 463 00:21:40,542 --> 00:21:41,542 Next slide. 464 00:21:41,667 --> 00:21:43,709 Poor Bobby, that's real. 465 00:21:43,959 --> 00:21:48,999 1997, that made him 12 years, 12 months at the time of arrest. 466 00:21:48,999 --> 00:21:50,375 Shame on you, sheriff. 467 00:21:50,999 --> 00:21:52,959 Data journalism. 468 00:21:52,959 --> 00:21:54,918 I lived in the Bronx for a long time. 469 00:21:54,918 --> 00:22:00,250 The Backman County newspaper listed people with pistol permits. 470 00:22:00,250 --> 00:22:03,751 Sure enough, one of them lived in the Bronx on Milltown Road 471 00:22:03,751 --> 00:22:06,959 a few blocks from where I lived. 472 00:22:07,209 --> 00:22:09,125 This is public information. 473 00:22:09,125 --> 00:22:10,999 You get interesting results. 474 00:22:10,999 --> 00:22:12,334 People have pistol, multiple pistol permits 475 00:22:12,334 --> 00:22:15,542 and live opposite an elementary school. 476 00:22:15,542 --> 00:22:16,542 Next slide. 477 00:22:19,417 --> 00:22:22,083 Should the rich have less privacy? 478 00:22:22,083 --> 00:22:23,292 When we go out and do data journalism 479 00:22:23,292 --> 00:22:26,083 on rich people like corporate directors we can go 480 00:22:26,083 --> 00:22:30,876 out there and find people abusing the SNAP food stamp program. 481 00:22:30,918 --> 00:22:35,459 Interesting societal questions, do we go after those people or just 482 00:22:35,459 --> 00:22:39,292 the easy pickings, the high profile ones? 483 00:22:39,709 --> 00:22:41,542 Indirect risk. 484 00:22:41,542 --> 00:22:43,999 People collecting this data like ChoicePoint. 
485 00:22:44,542 --> 00:22:47,584 Ancestry.com, they have 6 billion records mostly 486 00:22:47,584 --> 00:22:49,999 courtesy of the public. 487 00:22:49,999 --> 00:22:50,999 Next slide. 488 00:22:50,999 --> 00:22:52,792 There is your DNA test. 489 00:22:52,792 --> 00:22:56,501 They want your DNA and charge you $99 to take it from you. 490 00:22:56,501 --> 00:22:58,918 Next slide or if you're in New York City there's mobile 491 00:22:58,918 --> 00:23:00,584 DNA testing. 492 00:23:00,584 --> 00:23:02,999 All you have to do is flag him down, there he is. 493 00:23:02,999 --> 00:23:05,999 He will tell you who your daddy was right on the spot. 494 00:23:05,999 --> 00:23:07,083 Or who your daddy isn't. 495 00:23:07,083 --> 00:23:10,584 (Cheers and applause.) TOM KEENAN: Next 496 00:23:10,584 --> 00:23:12,999 slide, please. 497 00:23:12,999 --> 00:23:16,334 There is the complaint against him from Privacy International. 498 00:23:16,334 --> 00:23:17,334 Next slide. 499 00:23:17,334 --> 00:23:19,375 Tools that we use for this. 500 00:23:19,375 --> 00:23:21,959 Inquiring mind, that's the important one. 501 00:23:21,959 --> 00:23:25,999 Having a motive like financial motive, ID theft, et cetera. 502 00:23:25,999 --> 00:23:31,501 Scripting language, Scraper Wiki again, getting Python, PHP scripts. 503 00:23:31,501 --> 00:23:32,999 All this stuff is out there. 504 00:23:32,999 --> 00:23:36,709 There are libraries of scripts, if you want to do it, go to a hackathon. 505 00:23:36,999 --> 00:23:38,999 Some friendly person will tell you how. 
506 00:23:46,083 --> 00:23:48,876 Companies like ChoicePoint actually go out there and 507 00:23:48,876 --> 00:23:51,459 they pay high school students to go into the basement 508 00:23:51,459 --> 00:23:54,292 of courthouses and hand copy people's divorce settlements 509 00:23:54,292 --> 00:23:57,375 because you are not allowed to photocopy them and not allowed 510 00:23:57,375 --> 00:24:01,709 to download them electronically, but they are public information. 511 00:24:01,709 --> 00:24:04,501 Once that stuff gets put into a public file as it 512 00:24:04,501 --> 00:24:08,834 is at ChoicePoint it can be accessed for a fee. 513 00:24:08,834 --> 00:24:10,918 One guy could never get a job. 514 00:24:10,918 --> 00:24:11,918 He didn't know why. 515 00:24:11,918 --> 00:24:12,999 He was an engineer. 516 00:24:12,999 --> 00:24:14,999 He would get interviewed, never hired. 517 00:24:14,999 --> 00:24:19,334 He finally got a friend to pull his private file from ChoicePoint. 518 00:24:19,334 --> 00:24:20,334 What was it? 519 00:24:20,334 --> 00:24:21,999 He was a convicted murderer. 520 00:24:21,999 --> 00:24:24,375 Well, he wasn't actually a convicted murderer! 521 00:24:24,375 --> 00:24:26,667 They got his Social Security number wrong. 522 00:24:26,709 --> 00:24:29,334 His friend said you know why you're not getting a job. 523 00:24:29,334 --> 00:24:30,876 They said you're a murderer. 524 00:24:30,876 --> 00:24:31,876 He said no, I'm not. 525 00:24:31,876 --> 00:24:32,876 Next slide. 526 00:24:32,876 --> 00:24:34,083 Here are the dirty dozen. 527 00:24:34,709 --> 00:24:37,751 Sloppy data entry, malicious information, lag 528 00:24:37,751 --> 00:24:40,459 between law and policy, making inferences 529 00:24:40,459 --> 00:24:45,083 from the small numbers like very few people who voted. 530 00:24:45,083 --> 00:24:47,501 Assuming nobody like me will torture the data. 531 00:24:47,501 --> 00:24:49,125 My name is Tom and I torture data. 532 00:24:49,125 --> 00:24:50,125 Next. 
533 00:24:50,709 --> 00:24:52,167 Inconsistent data. 534 00:24:52,167 --> 00:24:55,375 Where, you know, I did a project with the RAND Corporation years ago 535 00:24:55,375 --> 00:24:59,999 where we were collecting data on New York City fire trucks. 536 00:24:59,999 --> 00:25:03,584 We had people getting from the Battery, like Battery Park to Harlem 537 00:25:03,584 --> 00:25:07,459 in a fire truck in three minutes and 40 seconds. 538 00:25:07,501 --> 00:25:09,709 We couldn't figure out how that could be. 539 00:25:09,709 --> 00:25:10,999 They are not helicopters. 540 00:25:11,250 --> 00:25:12,999 We watched them. 541 00:25:12,999 --> 00:25:13,709 And what happened, the guy who sits 542 00:25:13,709 --> 00:25:16,459 next to the driver was recording the data. 543 00:25:16,584 --> 00:25:19,375 They would go out and put fires out all day and say oh, we forgot 544 00:25:19,375 --> 00:25:22,999 to do that, but the RAND guys, we need to fix it. 545 00:25:22,999 --> 00:25:24,167 They made up the data. 546 00:25:24,834 --> 00:25:29,292 Jigsawing, taking data public and private and putting it together. 547 00:25:29,292 --> 00:25:32,125 Facial recognition, big area coming up. 548 00:25:32,125 --> 00:25:33,667 Your face can be. 549 00:25:35,083 --> 00:25:39,709 Somebody stood here last year and said he can take your photo 550 00:25:39,709 --> 00:25:43,918 on Facebook, compare it to a photo on Match.com, and tell 551 00:25:43,918 --> 00:25:48,083 if you're Sexy Babe or you're Hung Dude 204. 552 00:25:48,501 --> 00:25:52,501 He can disambiguate those using your face. 553 00:25:53,999 --> 00:25:57,375 Recommendations, we have to scan the file for PII things that need 554 00:25:57,375 --> 00:25:59,083 to be included. 555 00:26:00,918 --> 00:26:02,999 This is on the CD. 556 00:26:02,999 --> 00:26:04,417 I don't need to read it to you. 557 00:26:04,417 --> 00:26:05,417 Next slide. 558 00:26:06,083 --> 00:26:08,918 I want to tell you a little bit about this image. 
559 00:26:08,999 --> 00:26:12,876 There's a great project out there, concepts for all. 560 00:26:12,876 --> 00:26:16,292 It basically is, I want to end on a hopeful note. 561 00:26:16,292 --> 00:26:19,918 There was a poster program at the University of Quebec in Montreal, 562 00:26:19,918 --> 00:26:21,918 a design program going to be shut 563 00:26:21,918 --> 00:26:25,834 down because it's so expensive to print posters. 564 00:26:26,083 --> 00:26:29,167 My friend discovered the Internet. 565 00:26:29,167 --> 00:26:31,999 Now all these images get posted on the Internet. 566 00:26:31,999 --> 00:26:33,709 They are beautiful images. 567 00:26:33,709 --> 00:26:36,999 The only condition for using them for something like this 568 00:26:36,999 --> 00:26:41,250 is that you say this was courtesy of Elyse Pachef. 569 00:26:42,918 --> 00:26:44,626 I did that. 570 00:26:44,999 --> 00:26:47,417 I recommend that you talk to the government. 571 00:26:47,417 --> 00:26:48,626 We are the people. 572 00:26:48,626 --> 00:26:49,751 We are the democracy. 573 00:26:49,751 --> 00:26:53,999 We get to decide supposedly what our governments do to or for us. 574 00:26:53,999 --> 00:26:56,792 This is an area you will be hearing much more about. 575 00:26:56,792 --> 00:26:58,417 I promised to stay on time. 576 00:26:58,417 --> 00:27:00,542 I think I have five minutes for questions. 577 00:27:00,542 --> 00:27:01,542 Is that about right? 578 00:27:01,876 --> 00:27:02,876 Okay. 579 00:27:02,876 --> 00:27:04,501 So let's have the first question. 580 00:27:04,501 --> 00:27:05,501 Thank you! 581 00:27:09,876 --> 00:27:11,250 (Cheers and applause.) TOM KEENAN: 582 00:27:11,250 --> 00:27:14,999 Thanks for all the hard working goons who made this come up there. 583 00:27:14,999 --> 00:27:17,209 We paved the way for first speakers. 584 00:27:17,209 --> 00:27:18,999 It's hard to get the first question. 585 00:27:18,999 --> 00:27:20,792 I'll answer the second question. 
586 00:27:23,083 --> 00:27:26,959 Who has a question? 587 00:27:26,959 --> 00:27:27,959 Okay. 588 00:27:27,959 --> 00:27:28,959 Turn him on, please. 589 00:27:34,999 --> 00:27:37,918 AUDIENCE: I would rather stay off the stage. 590 00:27:37,918 --> 00:27:39,292 I don't want to be recognized. 591 00:27:39,667 --> 00:27:42,417 There is a movement in the United States to crack 592 00:27:42,417 --> 00:27:46,250 down on perceived voter registration fraud. 593 00:27:46,250 --> 00:27:50,083 It's mostly bullshit, but that's the way to keep black people from voting. 594 00:27:50,501 --> 00:27:53,999 (Cheers.) AUDIENCE: But in general, I think governments feel 595 00:27:53,999 --> 00:27:58,501 like if you require people to provide an address it keeps them honest even 596 00:27:58,501 --> 00:28:01,501 if it isn't particularly useful. 597 00:28:01,501 --> 00:28:05,334 Is there another way you can think of to keep people more honest? 598 00:28:05,334 --> 00:28:07,083 TOM KEENAN: Is there another way? 599 00:28:07,083 --> 00:28:08,999 AUDIENCE: Should we stop worrying about it? With 600 00:28:08,999 --> 00:28:12,999 actual voter fraud where somebody pretends to vote or votes twice, 601 00:28:12,999 --> 00:28:15,999 it's hard to find an instance. 602 00:28:15,999 --> 00:28:16,999 What do you think? 603 00:28:16,999 --> 00:28:20,292 TOM KEENAN: You can take their DNA, all kinds of things you can do. 604 00:28:20,501 --> 00:28:21,999 The point of the address. 605 00:28:21,999 --> 00:28:24,542 AUDIENCE: (Speaker away from microphone.) TOM KEENAN: 606 00:28:24,542 --> 00:28:27,250 You can do -- he asked how you can do something 607 00:28:27,250 --> 00:28:30,999 other than addresses to eliminate voter fraud and I joked you 608 00:28:30,999 --> 00:28:33,083 can take people's DNA. 609 00:28:33,083 --> 00:28:36,250 There are countries in the world where people vote. 
610 00:28:36,751 --> 00:28:39,209 They get their finger marked with ink and sometimes 611 00:28:39,209 --> 00:28:42,999 they don't want to vote because they might get that finger cut 612 00:28:42,999 --> 00:28:46,459 off by somebody or be punished for voting. 613 00:28:46,626 --> 00:28:51,542 There are alternative ways of doing it, but I can't think of too many other ways. 614 00:28:51,542 --> 00:28:53,999 If the contribution is small enough, the reality is giving 615 00:28:53,999 --> 00:28:56,292 the address is optional. 616 00:28:56,292 --> 00:28:59,334 Give the address of the union and never go pick up the result. 617 00:29:00,083 --> 00:29:04,999 Campaign laws, the big idea is that the campaign law needs to be reviewed. 618 00:29:04,999 --> 00:29:07,083 It was passed in the 1970s. 619 00:29:07,959 --> 00:29:10,459 We didn't have the technology to do the kinds of things we do now 620 00:29:10,459 --> 00:29:11,999 with the data. 621 00:29:11,999 --> 00:29:13,918 It's only going to get bigger. 622 00:29:13,918 --> 00:29:16,751 The data now is going to stick around forever. 623 00:29:16,751 --> 00:29:20,667 So they will go back seven years, but maybe 27 years. 624 00:29:20,751 --> 00:29:21,999 Yes? 625 00:29:21,999 --> 00:29:25,999 AUDIENCE: (Speaker away from microphone.) TOM KEENAN: 626 00:29:25,999 --> 00:29:30,083 Scraper Wiki has a whole bunch of tools. 627 00:29:31,501 --> 00:29:35,501 I used Excel because that's all I really had to use to take that, 628 00:29:35,501 --> 00:29:40,999 but in the presentation there are those lists of Scraper Wiki and so on. 629 00:29:40,999 --> 00:29:42,876 There are a bunch of places where you have customized 630 00:29:42,876 --> 00:29:47,375 scripts that allow you to go out there and scrape public databases. 631 00:29:47,501 --> 00:29:49,501 The world is your oyster. 632 00:29:49,501 --> 00:29:54,083 I think every municipality now has something like that. 633 00:29:54,125 --> 00:29:56,501 We find more and more. 
634 00:29:56,501 --> 00:29:59,999 I haven't found one I don't have problems with. 635 00:29:59,999 --> 00:30:03,000 The challenge between now and next year is to find more holes 636 00:30:03,000 --> 00:30:07,042 in open government and bring them to our attention. 637 00:30:08,250 --> 00:30:10,167 One more question maybe? 638 00:30:12,584 --> 00:30:14,042 Going once? 639 00:30:14,584 --> 00:30:17,334 You have one more. 640 00:30:17,334 --> 00:30:19,334 TOM KEENAN: Good. 641 00:30:19,334 --> 00:30:21,999 AUDIENCE: Do you know if any of those terms of service 642 00:30:21,999 --> 00:30:25,792 for public data requests have been tested in courts? 643 00:30:25,792 --> 00:30:28,876 TOM KEENAN: I don't, and I actually asked Marcia Hoffmann 644 00:30:28,876 --> 00:30:32,375 that question and she's researching that. 645 00:30:32,375 --> 00:30:34,999 I thought that's an interesting thing. 646 00:30:34,999 --> 00:30:38,501 All I can tell you in Calgary, I'll summarize the story. 647 00:30:38,501 --> 00:30:42,000 Because of people doing what I did and also because of that secondary use 648 00:30:42,000 --> 00:30:45,834 of the data to actually try to commercialize it, the city shut 649 00:30:45,834 --> 00:30:49,709 the database down for a period of time and when it came back there 650 00:30:49,709 --> 00:30:52,709 was a very elaborate terms of use. 651 00:30:52,709 --> 00:30:54,584 I violated it here today because it says this is only 652 00:30:54,584 --> 00:30:57,501 for checking your tax records because when it first came 653 00:30:57,501 --> 00:31:01,375 out people were checking their boss's house assessment. 654 00:31:01,375 --> 00:31:04,501 They were checking their ex-wife, everybody that they knew. 655 00:31:04,876 --> 00:31:06,459 They now control it. 656 00:31:06,459 --> 00:31:08,959 So what they've done, and it's a good point. 657 00:31:08,959 --> 00:31:10,999 They have two different levels of access. 
658 00:31:10,999 --> 00:31:13,334 If you want to just know -- I should tell you, you are allowed 659 00:31:13,334 --> 00:31:15,999 to know what your neighbor is assessed because that 660 00:31:15,999 --> 00:31:18,083 is an issue of fairness. 661 00:31:18,083 --> 00:31:20,999 If you are assessed unfairly, you have a right to appeal. 662 00:31:20,999 --> 00:31:22,876 You are not allowed to know the square feet and 663 00:31:22,876 --> 00:31:25,125 all the other details. 664 00:31:25,125 --> 00:31:30,334 They have a public-facing one, and there's a deeper level where you have 665 00:31:30,334 --> 00:31:33,999 to sign up and be validated. 666 00:31:33,999 --> 00:31:37,125 That's what Calgary did: an admirable job of taking 667 00:31:37,125 --> 00:31:40,501 a real mess and making it good. 668 00:31:40,584 --> 00:31:42,999 Hopefully other cities will follow that lead. 669 00:31:43,542 --> 00:31:45,292 Are we done? 670 00:31:45,292 --> 00:31:46,918 MODERATOR: We're done. 671 00:31:46,918 --> 00:31:48,417 TOM KEENAN: Thanks very much.