00:00:00.167-->00:00:05.172 >> Aah, next up we have Dr. Paul Vixie and I will remind you before he speaks, ummm as you're 00:00:05.172-->00:00:09.076 coming in on the left side, ummm, when the talks done, make sure when you exit out, ummm, 00:00:09.076-->00:00:13.914 exit out your left side or stage right go out those back doors or the side doors here and we kinda 00:00:13.914-->00:00:18.185 keep the flow going through when you're done So I'd like to present Dr. Paul Vixie 00:00:23.123-->00:00:28.128 [clapping] >> Thank you very much, thank you for inviting me Ummm, we have been doing some 00:00:30.664-->00:00:36.036 science fair project work at my day job And we have some intermediate results that I 00:00:36.036-->00:00:40.974 thought would be interesting to this crowd and the programming committee agrees So let us 00:00:40.974-->00:00:45.979 begin. Ummm, front-runners are people who are grabbing things, uh, that may be valuable to 00:00:48.949-->00:00:54.054 others for the purpose of either hoarding them to drive up the price of other things or uh just 00:00:54.054-->00:00:58.725 keeping them so that you have to pay them essentially ransom to get it. I was a member of the 00:00:58.725-->00:01:02.896 ICANN security and stability committee when we published this report because there was an 00:01:02.896-->00:01:07.501 awful lot of front-running in form of domain grabbing People were grabbing domain names for 00:01:07.501-->00:01:13.540 reasons other than uh using them and uh we thought that was a security and stability problem 00:01:13.540-->00:01:18.545 and uh wrote a report. uh one of the ways that this manifested was in tasting domain tasting it 00:01:21.148-->00:01:27.054 used to be possible within certain limits for some registrant to grab a domain 00:01:27.054-->00:01:33.627 name, keep it for three days and then return it without owing any money uh as I said it's within 00:01:33.627-->00:01:38.999 limits they had some things they couldn't do too many per day and they had to pay for some other 00:01:38.999-->00:01:44.237 number of them and so forth it was basically it was a loop hole by which a small number of 00:01:44.237-->00:01:50.944 people were keeping a whole large number of domain names uh because you could grab the same 00:01:50.944-->00:01:56.249 one seventy three hours later and uh anyway that was all shut down ICANN finally did something 00:01:56.249-->00:02:01.688 they realised uh that this was bad for the world and as a 501C3 public charity they thought that 00:02:01.688-->00:02:07.094 they should help the world in that way so they got rid of domain tasting uh that does not 00:02:07.094-->00:02:12.099 mean that front-running has gone away so we are a passive DNS company uh at the moment we have 00:02:16.370-->00:02:20.807 a lot of other data sources but we're known for passive DNS and we have a lot of real-time data 00:02:20.807-->00:02:26.947 and we thought we would take a look at whether the real-time information flowing through the 00:02:26.947-->00:02:31.952 DNS could be of help to somebody who wanted to uh acquire domain names for the purpose of selling 00:02:34.421-->00:02:39.926 it later or maybe collecting traffic because of typo-squatting or whatever ummm 00:02:39.926-->00:02:45.699 now most of what we do focuses on things that exist but we do have a channel that uh just 00:02:45.699-->00:02:51.838 talks about obversations of non-existance which is called NXDOMAIN in the uh in the DNS 00:02:51.838-->00:02:57.611 field ummm and so although we don't have a database for that we do have very good real-time 00:02:57.611-->00:03:04.051 flows so this turned out to be kind of a good way to use our position of observability in the 00:03:04.051-->00:03:09.489 industry to uh figure out whether this was a problem that we could then bring to the 00:03:09.489-->00:03:14.494 attention of ICANN and others ummm we are not concerned about people who are amateurs at this 00:03:17.497-->00:03:23.136 if you think of a cool name of a cool domain name when you're taking a shower or driving on 00:03:23.136-->00:03:28.141 your commute and you go register that you're not going to cause a problem for anybody ummm there 00:03:28.141-->00:03:33.914 was a chance to do that thirty years ago whoever registered scuba.com probably later sold it 00:03:33.914-->00:03:39.119 for millions and I can tell you that the guy who registered altavista.com did in fact sell 00:03:39.119-->00:03:44.357 it for millions to DEC when they had a search engine by that name but we're not worried about that 00:03:44.357-->00:03:50.530 ummm because it's by definition not scaling but the professionals who are working at 00:03:50.530-->00:03:57.170 scale uh are really getting in the way it's inevitable now that when you start a company that 00:03:57.170-->00:04:03.643 one of the things you test for is can I get the dot com name if you can't get the dot com name 00:04:03.643-->00:04:08.949 chances are you will not choose that name for your company or your product uh which means 00:04:08.949-->00:04:13.820 these name are very powerful they are as powerful as if an international trademark would be 00:04:13.820-->00:04:20.227 if such a thing existed and umm wherever there is money to be made you'll find people looking 00:04:20.227-->00:04:26.533 for loopholes that will allow some of that money to flow their way and umm I think we have to 00:04:26.533-->00:04:32.539 pay attention to that I think that the DNS should really be available for people who want to 00:04:32.539-->00:04:38.044 add value to the Internet rather than adding money to their bank account as their first act I'm 00:04:38.044-->00:04:44.818 happy to do well by doing good, but doing well at the expense of others is a problem Sorry if I'm 00:04:44.818-->00:04:50.157 going a little fast for ya I have 50 slides this is a one hour talk I have 20 minutes to 00:04:50.157-->00:04:55.162 do it in ummm so ummm we don't sell the NXDOMAIN feed to spammers or domainers and uhh 00:05:01.168-->00:05:07.274 that does not just reflect my own ethical concerns about it it's a very practical concern uh 00:05:07.274-->00:05:13.079 we have 200 thousand cache miss transactions coming to us every second and these are from 00:05:13.079-->00:05:18.151 customers who operate a sensor for us in exchange for a discount on commercial services 00:05:18.151-->00:05:23.156 umm and they could if they saw us doing evil with their feed, uh stop sending us that data so 00:05:26.893-->00:05:31.731 we're under contract and that contract could be terminated they could choose to either stop 00:05:31.731-->00:05:37.304 using our commercial services or to stop getting a discount on them which means I have to be 00:05:37.304-->00:05:42.609 extremely careful with what we allow their data as it flows through our system to then be 00:05:42.609-->00:05:47.180 allowed to do and a couple of them are ethically very concerned about things like 00:05:47.180-->00:05:52.252 front-running things like domaining. That's why we have a fairly long contracting process 00:05:52.252-->00:05:56.756 when people want to buy data from us will end up sitting with a fairly senior person at 00:05:56.756-->00:06:01.561 Farsight and describing their intended use and signing a contract indicating that and 00:06:01.561-->00:06:06.700 only that will be the use thay make of our data and if we later find that they made other uses 00:06:06.700-->00:06:11.738 then they're gonna discover that the contract has teeth. So as you listen to this please 00:06:11.738-->00:06:16.876 understsand that we dont do surveillance and we don't help spammers even though we have a 00:06:16.876-->00:06:21.881 very good observation framework within those constraints. So again this is a science fair 00:06:25.552-->00:06:31.124 project and I'll tell you what our hypothesis was. We thought that there would be somebody out 00:06:31.124-->00:06:36.429 there who could see NXDOMAIN traffic although probably not from us because as I said we 00:06:36.429-->00:06:42.469 wouldn't sell to them, but we're not the only game in town and a lot of other isps and so forth 00:06:42.469-->00:06:47.307 are data-mining everything they can because they're engaged in a race to the bottom on the 00:06:47.307-->00:06:52.312 margins on their primary product and so they're all giving an eye towards social networking 00:06:55.482-->00:07:00.420 basically selling your PII as an additional revenue source for them and ummm so we thought are 00:07:03.790-->00:07:09.329 there people looking for typo-squatting are they registering domains that are one 00:07:09.329-->00:07:14.734 letter or in some cases one bit off of some existing name for the purpose of catching nearby 00:07:14.734-->00:07:19.739 traffic umm or are they looking for permutations uhh in Vietnam we called this recon by fire: 00:07:21.975-->00:07:28.448 shoot a machine gun into the forest and see if anybody yells ummm and uh we looked at various 00:07:28.448-->00:07:35.455 thing like hamming distance and whatnot to find what nearby meant in the context of nearby 00:07:35.455-->00:07:40.460 domain names we have of course I mentioned we have a passive DNS database and that's what we're 00:07:42.495-->00:07:46.833 most known for but it's built on the realtime foundation called the security information 00:07:46.833-->00:07:51.838 exchange and umm we're seeing maybe 700 megabits a second of real-time data which about half 00:07:54.007-->00:07:59.813 is DNS the other half is random other stuff like spam ummm and that is the foundation of our 00:07:59.813-->00:08:05.752 passive DNS database so although the database pays the bills the real time exchange is the source 00:08:05.752-->00:08:12.225 of the data that we used for this science fair experiment and umm I wanna mention that if you 00:08:12.225-->00:08:18.498 are out there doing good uhh by our definition and you're not charging a fee for it, you 00:08:18.498-->00:08:22.836 probably qualify for a free grant of our services, whether the good you're doing is 00:08:22.836-->00:08:28.508 generating chapters for your masters thesis or whether you're an internet superhero it doesn't 00:08:28.508-->00:08:35.482 matter to us uhh so please do talk to us about that, we're not just a commercial company ummm 00:08:35.482-->00:08:40.687 so we looked at NXDOMAIN data, we looked at newly observed data, we have a channel for each 00:08:40.687-->00:08:46.793 one uhh I've just gotten my 5 minute warning believe it or not so uhh I can't go through these 00:08:46.793-->00:08:52.031 in as much detail as I'd like but they will be available online, and of course you all 00:08:52.031-->00:08:56.736 apparently have my email address please use it This is what it looks like on the nxdomain 00:08:56.736-->00:09:03.510 channel broken out into ascii So you can see the delegation point there at the end ummm, that 00:09:03.510-->00:09:07.847 delegation point nflixvideo.net is what you would have to register so that's what concerns 00:09:07.847-->00:09:12.852 us and the expression of negativity is below that ip 41.0 etc is what didn't exist 00:09:15.321-->00:09:20.326 nflixvideo.net very much did exist ummm you can see that on whois it belongs to netflix the 00:09:22.429-->00:09:26.499 other channel we have is newly observed they look like this and you're just seeing the domain 00:09:26.499-->00:09:31.504 startjobs.xyz in this case so ummm, these are in vastly different volume scales, some 00:09:34.441-->00:09:40.380 5000 increments on the top and 2 million on the bottom so you're seeing a lot more NXDOMAIN 00:09:40.380-->00:09:45.385 traffic than you're seeing the uh newly observed domains uhh, the stuff that you see as the 00:09:48.121-->00:09:53.026 parent of the things that don't exist is what you'd expect a lot of people look for non-existant 00:09:53.026-->00:09:58.031 ip6.int or non-existent spamhaus.org subdomains ummm we filtered we did all kinds of 00:10:00.133-->00:10:06.172 stuff umm, a lot of junk, huge amounts of junk bad characters that make it into DNS that 00:10:06.172-->00:10:11.177 shouldn't they're just buggy libraries and buggy applications ummm and what do we find, this 00:10:15.114-->00:10:20.119 is what I really wanted to get to so I'll use my last minute for that ummm, so there wasnt' a 00:10:22.122-->00:10:28.328 huge amount of evidence that NXDOMAIN correlates to newly observed domains and we think 00:10:28.328-->00:10:33.566 that we can uhh hone this down by doing different science using slightly different data and 00:10:33.566-->00:10:39.072 filtering it out in different ways so we did learn some things but we're skipping the slides 00:10:39.072-->00:10:43.610 that tell you what we learned but ultimately there were only 181 cross-overs where something 00:10:43.610-->00:10:48.414 was negative before it was positive and out of the size of the data set we have lasting a 00:10:48.414-->00:10:54.721 week that could easily be completely reasonable people that are just registering stuff 00:10:54.721-->00:11:01.494 that they uhh they do plan to use so in other words at the moment we have no evidence that 00:11:01.494-->00:11:07.166 the bad guys are using NXDOMAIN to drive their operations as to whether they should I'll leave 00:11:07.166-->00:11:12.171 that to them to decide If they do we'll be watching Ummm, there's a huge amount of other 00:11:14.774-->00:11:19.779 crud I wanted to let you know it turns out NXDOMAINs is a great source of DGA intelligence: 00:11:23.116-->00:11:29.055 Domain Generation Algoritm. Bot-nets that are using the DNS in order to find their Command 00:11:29.055-->00:11:35.328 and Control use these gibberish domain names that are computed based on the current time of day 00:11:35.328-->00:11:39.199 in order to decide where their Command and Control are going to be so you as the owner of the 00:11:39.199-->00:11:44.637 botnet need only register one of the ones that your botnet is going to be using tomorrow and 00:11:44.637-->00:11:49.976 then tomorrow you'll be able to send it commands, but all of them have to do lookups for all 00:11:49.976-->00:11:54.714 of the names that might be used for tomorrow and so NXDOMAIN as a data-source this is something 00:11:54.714-->00:12:00.887 Damballa, David Beggin's team at Georgia Tech told us 10 years ago and I thought that perhaps 00:12:00.887-->00:12:05.992 that would mean that we would have dealt with it by now, but no according to our uh our data 00:12:05.992-->00:12:10.997 this is very much real so my conclusion is, umm well, my conclusion way down at the other 00:12:16.235-->00:12:20.206 way, excuse me one more thing I'm going to go over time they're gonna kill me now Paypal 00:12:20.206-->00:12:25.945 has a problem, ummm, these are all things that had payp somewhere in their names, and 00:12:25.945-->00:12:31.651 umm, you know Paypal is one of my partners so I've already informed them of this, but every 00:12:31.651-->00:12:37.690 domain holder, every trademark holder has a problem like this one and you're gonna find these 00:12:37.690-->00:12:43.162 often in the NXDOMAIN first bceause they will be doing reconnaisance to see if this 00:12:43.162-->00:12:48.167 exists before they try and register it umm and again this is just one trademark, so, ummm, 00:12:51.971-->00:12:57.910 yeah here we go, way to many uh, slides there, anyway, ummm, my hope is that NXDOMAIN traffic 00:12:57.910-->00:13:02.682 uhh is going to be the next big data-mining oppportunity, and my hope is that uh we will somehow 00:13:02.682-->00:13:07.687 keep that data-mining from happening on the dark side of the economy and uh if you are 00:13:12.425-->00:13:17.397 interested in doing sceince for which you do not charge a fee you can get a NXDOMAIN feed in 00:13:17.397-->00:13:22.468 real-time from us but whatever you do I hope you will at least give some consideration to 00:13:22.468-->00:13:26.572 things that don't exist and the implications of that non-existence on your your 00:13:26.572-->00:13:31.577 security and the security of the Internet thank you, do I have a minute for questions? ok 00:13:51.664-->00:13:56.669 [clapping] >> [inaudible] >> just scream it out >> [inaudible] >> umm, to me it was 00:14:02.608-->00:14:09.248 the fact that non-existent names that have a short hamming distance from Paypal are being 00:14:09.248-->00:14:15.021 reconned. I assumed that people would try to register these things and use the you know "I'm 00:14:15.021-->00:14:18.558 sorry that name already exists" as their signal that they couldn't get it but they're not 00:14:18.558-->00:14:23.563 they're doing reconnaisance first >> [inaudible] >> umm, we did not look at, you'd have to 00:14:30.436-->00:14:35.441 decode the IDN strings into uh UTF or something and our level of processing was command-line 00:14:38.778-->00:14:43.049 awk scripts for this so no we did not look at any of the international names for this 00:14:43.049-->00:14:49.155 study. I'll be back next year for an update on this that has a lot more detail >> [inaudible] 00:14:49.155-->00:14:54.627 >> ummm, no, the new gTLDs were conspicuously absent from this study, ummm, I wanna say as I 00:14:54.627-->00:15:00.800 usually do that the way that most of us in this room first become aware of the existence of 00:15:00.800-->00:15:05.805 a new gTLD is when we get spammed by it. and so I was expecting to see a fair amount 00:15:09.308-->00:15:14.313 of the bad behaviour happening there but it's not and I think it's either that the bad guys 00:15:21.020-->00:15:25.858 have other lower hanging fruit and it's more profitable to forget about that stuff or they 00:15:25.858-->00:15:30.329 realise that these new gTLDs are very sparsely populated unilike something like .de or .com and 00:15:33.266-->00:15:37.804 so there won't be very many opportunities to sort of play battle-ship in those spaces 00:15:37.804-->00:15:42.809 until and unless one of them ever succeeds. >> [inaudible] >> louder >> [inaudible] >> uhh 00:15:57.623-->00:16:04.030 that is true. if someone does a registration event and then they don't use the domain we will not 00:16:04.030-->00:16:08.734 see it in the newly observed domain feed so when we're looking for things that were 00:16:08.734-->00:16:15.241 first observed negatively and then observed positively, so you could bypass a study like ours 00:16:15.241-->00:16:20.780 by simply avoiding dns queries and doing everything at the registry level umm and we'll 00:16:20.780-->00:16:27.653 probably do a study there we have three bulk whois providers that we can use for that study, 00:16:27.653-->00:16:31.991 they're partners, umm and we'll probably have an update on that in the next version of this talk 00:16:38.130-->00:16:43.135 >> [inaudible] >> the question is: how much whois information is private and my answer is ummm 00:16:49.542-->00:16:55.081 I estimate that very little of it is private, probably on the order of 10 to 15%, uh because 00:16:55.081-->00:17:00.119 bad guys would rather use the address that corresponds to the stolen credit card they used to 00:17:00.119-->00:17:04.757 buy the domain because that will help them get in the door in case the whois information is 00:17:04.757-->00:17:09.762 used to validate the credit card number. ummm, so really the whois privacy tends mostly to be 00:17:11.764-->00:17:18.304 used by people who just don't want to be spammed, rather than by people who wanna do crime. 00:17:18.304-->00:17:22.975 That doesn't mean I love whois privacy I'd like to see it go away, but I don't think it's the 00:17:22.975-->00:17:27.980 problem that people are worried about. Is that it? >> [inaudible] >> How do you get 00:17:32.985-->00:17:39.425 the slides? >> [inaudible] >> ummm, my email address is vixie@fsi.io so you could send 00:17:39.425-->00:17:44.430 me mail or you could wait for it to appear on the defcon website, which it will inevitably do. 00:17:48.467-->00:17:51.671 Right, thank you [clapping]