>> Good afternoon, how's everyone doing? Let's talk about attribution now. Nation-state stuff is a big deal. The other night I went to bed and turned on the television in my hotel room, and I saw Ted Cruz talking about national security and nation-state-to-nation-state hacking. Let's give our next speakers a big round of applause.
>> Welcome to Big Game Hunting. Actually, the title is "Fuck your attribution, show us your .idb". I'm currently the director of security at First Look Media. Test. Test. Sorry. I'm also a senior researcher at the University of Toronto, and I advise a bunch of organizations on security.
>> Hello, I'm not as busy as Morgan is. I occasionally reverse engineer malware. In my spare time I give classes and hang out at conferences like this one. I'm happy to be part of the most diverse group here.
>> I was advised that the panel was more diverse, but that's not true. We are starting a Claudio hashtag. We've got a lot to get through. Today we are going to cover a little bit about how antivirus moved to antivirus 2.0, that is, threat intelligence, and we will talk about how that's made. We will discuss the industry's fascination with attribution and a novel approach to binary attribution. I'll do some Hacking Team material, and at the end, by which stage you will all be experts, we will have a choose-your-own-adventure with a set of nation-state malware, where the audience will help us out.
I will start at least at the beginning of the modern era, in terms of realizing that this game was going on. I worked at Google when Google was hacked by China in 2009. It was publicly announced in 2010, and the reaction was interesting. It led to finger-pointing: how can you hack our beloved Silicon Valley? We learned that Silicon Valley was owned by China, and also that everything was owned by China. There was a lot of chest-beating about China. And then we learned that it wasn't just China: everyone was hacking everyone.
Another thing on life's big stage: you probably know about Stuxnet, the world's first digital weapon. It was a joint effort by the U.S. and Israel to sabotage Iran's nuclear program. This is called the Tilded platform. More recently I did a bunch of work with Claudio, who isn't here, on malware called Regin, and this was tied to the Belgacom hack.
>> On the other side, and funnier: at the beginning of 2015 we uncovered that not only is the British government interested in operating its own spyware, but the French also had their fingers in Iran and certain institutions, and were spying in the Middle East.
>> So earlier this year [ inaudible ] exposed the NSA network. What we actually see is basically all states using hacking; at least the larger ones are, and the smaller ones probably want to be doing this. Around 2012 we learned there were commercial companies selling this capability to nation states. This is a bunch of commercial material from a surveillance conference, and a bunch of people went there to sell this capability to various nation states. So we learned about FinFisher, Hacking Team, VUPEN. They were selling implants and exploits and stuff, which led to finger-wagging. These days everyone is selling implants and exploits. There is probably someone not far from here, dressed in black and wearing a black hat, who will offer to sell you these things. So there's a commercial scene, and a vigorous public debate about people actually selling commercial capability in this area. Right.
This has actually led to a corresponding change in the industry: AV 2.0. A lot of people might be familiar with this: if you are not paying for the software, then you are the product. With AV 2.0 you are the product and you are still paying for the software. So they actually manage to do one better.
>> The question is how. I tried to find a definition, and I didn't find anything that would fit on one slide. So by threat intelligence we understand malware watching.
With the threat detection industry it was all about detecting threats, and now it's about watching threats. What's interesting is where the malware is, where it came from, who is operating it, and what that person wants to do with it. Lots has been written on this, and a lot of money is flowing into threat intel. You can imagine it like a robbery: someone breaks into your house and steals your stuff, and you hire a security company. The security company says, don't stop him, watch what he wants to steal, because we have to figure out where he came from and where he's going. So this guy is stealing the stuff from your house and the security company is watching what the thief is doing. This is what we are seeing today with antivirus intelligence.
This led to the industry working on tracking who is operating the threat. Having an interesting threat and an interesting operator is actually worth revenue and gains market value. What's even more interesting to me is the numbering of APTs for the sake of publicity. As for revenue: nation-state malware is interesting because it draws a lot of public interest in the cyber warfare scene, and the number of targets is small. So a security company can be fairly sure that by holding the information back, not publishing soon, and watching the actor for a long time, it actually puts only a small number of customers at risk.
How do we find these? Let's think about this as a needle-in-a-haystack problem. It is a problem of big malware processing: samples are automatically processed and traded. Companies exchange samples, scan them, send them on to other vendors or whatever. All these big haystacks might or might not contain interesting samples, our precious treasures. How do we find these treasures? As you might know, both of us here have worked on such malware. The very specific problem we have is a massive haystack in which we have to find the needle.
There are different ways of finding needles. One thing you can do is use indicators: you know the malware you are searching for, you query the data for certain indicators, and if you find your indicator, the sample is already there in the data. This is one way. By the way, it has the same problem: in that data, you don't find much of it. Another interesting source is leaked documents. This is a recent phenomenon: we have documents actually describing these precious needles. Then there are the infected machines. If you don't have a big haystack to dig through at all, ask people what they have on their machines. The last point is gossip. You wouldn't think how much information is exchanged at the bar after a couple of drinks. This never ceases to amaze me.
So what we have today is endpoint wars. I started my career at an antivirus company. What antivirus companies have actually been doing for some years is not writing antivirus engines but endpoint security products. So they have endpoint agents. These sit on the endpoints to scan for indicators of compromise, or whatever threat indicator is being searched for, and the agent applies mitigation tactics. The endpoint sends data back to the company operating the product. So this is the data. How does it work?
Here's our endpoint agent, doing infection detection and mitigation. These are the signatures and detection patterns the security company produces. They are sent out frequently so the agents are up to date, depending on the threat indicators. What the agent sends back to the security company is data on threat indicator hits. This is quality assurance data. It could be the binary itself, so the security company can check whether it's a real threat or a false positive. And hit frequency. So these security companies actually know what their agents do. This is the precious data security vendors use to say: we have a detection for this here, and we assume it is, say, the government in India.
How does it work backstage? We do signature testing: the security company sends a signature to the endpoints and sees how it works on real data. The returned data is searched for false positives.
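To make the agent-to-vendor feedback loop a little more concrete, here is a minimal Python sketch of what a single hit report could contain: the signature that matched, a hash of the file, the hit frequency, and optionally the binary itself. Everything in it (the `Signature` and `HitReport` types, the `scan` function, the byte-pattern matching) is a hypothetical illustration and not any vendor's actual agent or protocol.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signature:
    sig_id: str      # hypothetical identifier assigned by the vendor
    pattern: bytes   # simplistic stand-in for a real detection pattern


@dataclass
class HitReport:
    sig_id: str              # which signature fired
    sha256: str              # hash of the scanned file
    hit_count: int           # hit frequency inside this file
    sample: Optional[bytes]  # the binary itself, only if upload is enabled


def scan(data: bytes, signatures: List[Signature],
         upload_sample: bool = False) -> List[HitReport]:
    """Match every signature against the file and build the reports
    an agent could send back as quality assurance data."""
    digest = hashlib.sha256(data).hexdigest()
    reports = []
    for sig in signatures:
        hits = data.count(sig.pattern)  # non-overlapping occurrences
        if hits:
            reports.append(HitReport(sig.sig_id, digest, hits,
                                     data if upload_sample else None))
    return reports


if __name__ == "__main__":
    sigs = [Signature("sig-1", b"EVIL"), Signature("sig-2", b"NOPE")]
    for report in scan(b"xxEVILxxEVIL", sigs):
        print(report.sig_id, report.hit_count, report.sha256[:12])
```

With `upload_sample=True` the report carries the whole file, which is the point made in the talk: the same channel that lets a vendor weed out false positives also moves binaries off the customer's machine.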
Another phenomenon is silent signatures. They are detections sent out to the endpoint to work silently and generate QA data, to see if the signature works or if it might produce massive false positives. The signature is not activated at the beginning. These have been known to be used occasionally to do searching as well, without indicating to the customer that there's something wrong on the machine. This is a theory. Binaries are sent back to the threat detection companies to check signatures, so they can look at a binary and sneak some binary out of the machine with their endpoint agent. Summing it up: if you install a free security product, be aware that it produces data. Free security products are actually contributing to the threat intel machine.
Another phenomenon: while digging through our haystacks and searching for our needles, you get to talk to other people who are interested, occasionally in the same malware you are searching for. You will find yourself in very interesting conversations with people you never thought would talk to you, who are all of a sudden very friendly, or the opposite: "Oh, no. I saw your e-mail and I just didn't have time." These things happen.
>> Yeah. So this is frequently the issue when you are actually looking for something particular. Maybe you are hunting a country. Maybe you know New Zealand is doing interesting operations.
>> Never happened.
>> A real-world example of this is malware that I worked on with a variety of people for a significant period, which is attributed to the Five Eyes. We will get to that. GCHQ and that sort of thing. It was allegedly used to hack European targets. When I was working on this, it became apparent that it was the worst-kept secret in antivirus, because people had known about it for quite some time. So, somewhat naively, I suspected they wouldn't publish, and then somehow it all got published that week. Symantec published Sunday night, another vendor a few hours later, and my report came out the next morning.
>> Imagine getting your marketing team working on a Sunday night.
>> I'm not bitter about this. But this actually happens when you are doing this type of work. All of a sudden five or six different people in the antivirus industry are searching for the same thing, you are all working on it, and the question is who is actually releasing it. It gets tricky very fast.
>> All right. Before we actually get to our big game hunting, we want to speak out to our friends in Africa. I want to tell you that no animals were harmed during the presentation.
>> Cecil. We sort of forget about the actual victims and targets really, really fast. There's this kind of obsession with sophistication, and attribution fuels reports on this. I noticed this perspective really illustrated when I went to an antivirus company. I want to do a small test. Google does this state-sponsored warning: they will stick a banner on your Gmail saying, we think you have been targeted. Are you guys familiar with this? How many people here have actually received that warning? Not too bad. A couple. So I went to this conference in the Middle East, primarily people who had done a lot of political writing: fair, balanced reporting on how their governments were not very keen on freedom of the press. I asked this question, and half the audience put their hand up.
So I was at this AV company recently, and this guy says to me, "Hey, where did you find that interesting sample the other day?" "Well, it was sent to me." The guy said, "That's cheating." If you're not obsessing about the malware, about sophistication, it's cheating. So this industry is kind of messed up about what's interesting and what you should be doing in this research.
Malware is used for espionage in some places where policing is draconian. This guy: I spent a couple of years tracking the digital campaign going on in Syria. This guy was talking to aid workers a lot. His computer was, actually their computers were, compromised.
It turned out the Syrian police had all the records of his Skype conversations, e-mails and so on and so forth, and this stuck with me. That's actually a hard drive. It got smuggled out and FedExed to me in the U.S. I actually have it onstage with me, because no one leaves anything interesting in the hotel at DEF CON.
This woman's name is Alasha, and she is actually FinFisher zero. FinFisher is a suite of governmental intrusion software sold by a German company, and they did a bunch of interesting things, such as selling it to the Egyptians during the revolution. When I did the publication a few years ago, she was the first person who sent me a sample. She's a London economics professor. Her husband was seized, and they got no word of what had happened to him for 48 days. He was held for spreading information about the government, largely because of comments he made on Facebook.
This guy, I believe, is also an advisor to Human Rights Watch. His official charge was insulting an officer. He had no idea how they were tracking him. So I did some work on his machine and I found malware by these guys. Anyone here from Hacking Team? In this case, attribution was in the malware itself: it pointed back to a particular address. So in this case it was relatively easy to find out who was doing the spying.
Which brings us to this guy. Some of you might be familiar with this case. He was an Argentine attorney who was about to bring charges against the president of Argentina and high-level politicians for covering up a terrorist attack which killed people. Four days before, he was found dead. It was ruled a suicide; apparently he shot himself in the head without leaving powder on his hands. His death led to protests in Argentina. It was actually published in a small Argentine news outlet that they had found a file, and it's called [ on screen ].
>> [speaking in Spanish] Happy to help.
>> And it was a bit confusing. The malware was actually for Windows. It turns out the conditions around the forensics of the devices get murkier and murkier. What can we do?
We can extrapolate a bunch about the targeting. The way you do this is you search for the sample: one hit, one upload, from Argentina. So it is related. The command-and-control domain, the people being targeted, the use of political bait, and the network-based indicators suggest to us that the actors were based in Argentina.
The actual malware itself was interesting. It started as a proof of concept. Someone actually wrote a proof of concept for Android, and it's called Frutas. Then someone started selling it, and it popped up, reasonably cheap, as another piece of for-sale malware. Recently we have seen it doing the rounds in attacks as AlienSpy. As you can see on screen, it's mostly Spanish-language and does a variety of stuff: it can turn on the microphone on your laptop or cell phone so the operator can listen to ambient noise around the device, and that sort of thing.
This is similar targeting that the same group is doing. I can't tell if the document is real or not. The document that is supposed to deliver the implant looks like it is from the embassy of one country to Brazil. The indicators point to people in Argentina, well, to actors based in Argentina. However, we also see the use of hosting services in the U.S., Germany, Sweden, GoDaddy. So it could have been anybody. Right?
>> It was definitely a suicide.
>> As you said about attribution: sometimes it's tricky, sometimes it's easy, it depends. In my case, attribution was semi-tricky, because when working on the malware I didn't actually do any attribution; the attribution had already been done. That is the case of Babar, which was allegedly written by French intelligence. The attribution was actually made through a leaked document, published earlier this year, saying that Canadian intelligence services had found the malware and attributed it to the French. The Canadians did a good job and I totally agree with them. But, of course, in real life we don't usually have this opportunity. Furthermore, Babar wasn't the only tool used; it came with brothers and sisters. There were downloaders and scripts, partially changed, but there was more: there were other families, namely Casper and Dino.
We researched these samples and, yeah, these looked like the same author. People were like, why are you so sure that they belong together? Especially for NBot we had a serious issue with actually proving our statement: okay, these are allegedly operated by French intelligence. So what do we do to help with this problem?
You might have seen a lot of these posts: oh my god, we found more samples, and Duqu 2 is related to Duqu 1. These statements never came with any understandable proof of how the analysts got to their results. So the problem is how to transparently prove that two binaries are related or not related, so that one can draw a suitable conclusion about who the actor was. Babar was mentioned in the document. The document said it was French, so we then know that the family members related to Babar are as well.
How did we do this? Malware attribution asks who wrote the malware, who controls it, and who the victims are. The problem is, if you have ever looked into a binary, it doesn't tell you any of this. The time when authors wrote their names into the malware was back in the '90s. On top of this, the people who wrote the malware are not necessarily the same people who use it: someone may have bought it and used it later on. We want to take the binary and put it into a context, but we can't get to that through the binary alone. So what we want to do is link binaries together: we have this set of binaries, they look like they are related to known malware, and we conclude that the actor is the same. So how?
There's already research on how source code can be attributed to an author. If you have a certain set of samples for machine learning, you extract attributes from the samples to determine who the author of the source code was. This works on source code if you have one author writing hundreds of samples to train the machine. The problem with binaries is that you don't have any handwriting left. You don't see white space, variable names, commas, nothing, and the compiler itself has a massive influence on the binary.
Another problem is that there is a team of authors, and this team might change over time. So looking for the one person who wrote this malware won't help us find the writers. So how can we tell if the same authors wrote this?
Very important: the attributes we use for describing our binaries come from different domains. That means we don't only look at the techniques the malware uses or at overlapping code, because code can be copy-pasted by anyone. We try to grab attributes from other domains to spread the probability and even out human influence, so we try to get as many attributes as we can get our hands on. Another reason is that attributes can be faked. If you are a malware author, you don't want people to recognize your binary and attribute it, so you can even plant fake attributes that link to another binary. The assumption is that it's impossible to fake or randomize all attributes. And you even out human influence: we are not interested in the individual guy who wrote the code but in who operated it and stole the data.
Here are the attributes that we propose in our paper. I'm sure you can read it very well. Memorize them all. I'm kidding. These are the four domains I was talking about. First, the string constants in the binary, which we found very helpful. Second, the implementation traits: how specific activities are performed, memory handling, how constructors and destructors are implemented, et cetera. The third column is custom features in malware. Malware constantly does the same things, and if you are someone writing malware, you generally do not implement these several times over. You just reuse them, because they are usually expensive: it's hard to find people who are able to implement these traits. The fourth column is infrastructure, to raise the bar even higher. Infrastructure such as C&C servers and their location is helpful in [ inaudible ] binaries.
Here's our proof of concept, and I'm sure you can read more later. It is a proof of concept on malware where we tried attribute linking; the yellow lines indicate that attributes overlap.
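The linking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speakers' tooling or the method from their paper: the sample names, the attribute values, the use of Jaccard overlap, and the `threshold` and `min_domains` parameters are all hypothetical. What it tries to capture is the argument above: overlap in a single domain (for example copy-pasted code) should not be enough to link two binaries, while overlap across several independent domains is harder to explain away.

```python
from itertools import combinations

# Hypothetical attribute sets, grouped by the four domains named in the talk.
# Sample names and attribute values are invented for illustration only.
SAMPLES = {
    "sample_a": {
        "strings": {"s1", "s2", "s3"},
        "implementation": {"api_hash_loader", "heap_wrapper"},
        "features": {"av_enum_wmi", "rc4_config"},
        "infrastructure": {"c2-one.test"},
    },
    "sample_b": {
        "strings": {"s1", "s2", "s4"},
        "implementation": {"api_hash_loader", "heap_wrapper"},
        "features": {"av_enum_wmi"},
        "infrastructure": {"c2-two.test"},
    },
    "sample_c": {
        "strings": {"s9"},
        "implementation": {"api_hash_loader"},
        "features": {"keylogger"},
        "infrastructure": {"c2-three.test"},
    },
}


def jaccard(a, b):
    """Share of attributes two sets have in common (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_by_domain(x, y):
    """Jaccard overlap per attribute domain for two samples."""
    return {domain: jaccard(x[domain], y.get(domain, set())) for domain in x}


def linked(x, y, threshold=0.5, min_domains=2):
    """Link two samples only if several independent domains overlap."""
    scores = overlap_by_domain(x, y)
    return sum(score >= threshold for score in scores.values()) >= min_domains


if __name__ == "__main__":
    for (name_x, x), (name_y, y) in combinations(SAMPLES.items(), 2):
        print(name_x, name_y, overlap_by_domain(x, y), linked(x, y))
```

In this toy data, `sample_a` and `sample_b` overlap in strings, implementation traits and features, so they are linked, while `sample_c` shares only one implementation trait with the others and stays unlinked. A score like this can only rank candidates; as the talk stresses, deciding whether two attributes really match still takes an analyst.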
The overlap is in the eye of the analyst. I grab the attributes describing the dynamic API loading and all these things and ask: do these look similar? And yes, they do. What does this tell me? The binaries of Babar helped us link the other families to it. [ on screen ] linked with each other. By doing so we were actually able to create a much bigger picture than we had: that the operators would do espionage, that they would do it in Syria, because that's where infected machines were found, and in Iran, and, given the document, that this might be the French government. They ran botnets in 2010. Now, we didn't figure out why the French would need a denial-of-service botnet.
There are several problems, not to get into details. Although we were able to do attribution by linking binaries, there was a lot we still didn't know. We need to do a lot of reverse engineering. There's very little automation or machine learning; it is always in the eye of the analyst.
>> So we are going to talk a little bit about how, I guess, attribution in our industry has been done in a number of ways. There's soft attribution: malware families are linked together because they have the same temp files, so this malware family was written by the people behind Dino. A good example of soft attribution is when we look at a complex piece of malware with lots of moving parts, looking at it for a long period of time, maybe a year. It starts off on a mailing list, where people are talking about difficult-to-reverse samples of Chinese binaries, and someone posts this and says, look at this stuff. Crickets chirping. And then someone said, wait, that's from China?
As you can see, this is an underscore sample: it looks like a forensics sample from a compromised machine, taken with a forensics tool which leaves its imprint. There's a process log, and this actually gives us the name of the system analyzed and where it was. What we get is the name. Then: Britain's GCHQ hacked Belgian telecom. The timing is right, and so on. So, I mean, you kind of know. But I call it soft attribution. You don't have hard proof.
You don't have someone standing up and saying, this is me. However, we did end up getting this. It turns out, looking at this code closely, that this code is identical to the blah, blah plugin. A friend of mine said it better [ inaudible ]. That's what I call hard binary attribution.
The problem with the evidence you are producing in research in these areas is that spies are legally obliged to lie. The DNI was caught in a difficult position in front of the government and on TV, asked whether the NSA was spying on millions of Americans. Once it came out that he had lied, he said he had said the least untruthful thing. This has gone on. If any of you have looked at the Hacking Team leak, there are stacks and stacks of sketchy lies. We pointed to the sale of malware to another government. After a while they were asked if it had been sold, and they said, no, not us. But the leak showed money being wired, millions, in connection with Sudan. So you actually can't expect, when you produce a report about government espionage, that they say: you got us, good on you. That's not how it works.
Hacking Team is an interesting case. This is a slide from an internal presentation of theirs about the people they were worried about. They were worried about Citizen Lab and a variety of others. If you are worried about democracy activists worldwide, you should probably change your business model. Now, the lies actually continued right up to the point of the leak, where this statement said there was an attack and, not true, that the leak contained a virus. Note: it contains the virus which you sell.
Hacking Team don't particularly like me, and they issued a statement about me saying that I've been crying wolf, which I found quite hurtful, actually. [ applause ] But I'm sure it wasn't personal. They named me by name, and there are photos and audio recordings of me. This just gets weirder and weirder. And this is the most bizarre one. In great opsec, they have a link on this: "FBI quietly formed security unit".
It turns out they actually say, about meeting with these people: if anything good came out of the Citizen Lab article, it is that it brought us this contact. So the FBI read the report and thought, this malware is good, maybe we should buy it. So, Hacking Team, I want my 15% sales cut. That's weird. We are getting harassed, so we've got to skip to the fun bit: choose your own malware adventure. Something new. We had difficulty naming this.
>> We stuck to the industry standard on naming stuff. We went with the numbering.
>> You are going to silence us and stop us releasing?
>> If you could do it in 7 seconds, I will not stop you.
>> This is really funny. This is the tool of the government, and we are about to out this. This is pretty neat.
>> I am not a tool of the government.
>> We are trying to speak the truth. [ applause ]
>> Sorry, guys.
>> Don't blame me. Blame the man. [ applause ]
>> Want to give them 3 minutes?
>> All right. Joking aside, it doesn't have anything to do with the character. We found an interesting sample, active from 2002 to 2011. Choose your own adventure. So we have an actor being on for almost ten years.
>> It is rare for a malware family to be around this long. Nine years is a long time to be actually writing this family of malware.
>> So it checks for security processes, whatever.
>> Talk about the 16-bit stuff.
>> Cool stuff on the machine. It was prepared to run on Windows 95, and it actually had a check built in: it searches for the NE value. I don't know, I'm a young engineer, that was before my time. It was built for Microsoft 16-bit systems.
>> The next related malware we found is from the modern era, 2007 to 2009, which suggests that it was still in use although it's pretty old. It might be a good time to note that governments are awesome at updating their systems; try getting them prepared for XP, for example.
>> To be running, it looked for a window: the Program Manager, which was [ inaudible ] Windows Vista. So we were doing archaeology, digging up a grave site. So we knew these belonged together.
>> Lots of C&C servers. Polling.
Very stealthy communication and exfiltration.
>> Faster. Here's another sample. It doesn't try to enumerate other machines, so the author knew where it was. Summing up: they learned very well how to work their way through the network. The disassembly shows temp files dropped on the machine, and these all start with a "d". Remember our introduction.
>> So people at DEF CON love to see Black Hat. When we did this there, we ran out of time. So we did that, and the audience did what it did, which is attribute it to China. [ applause ] So who actually thinks this is actually -- (no audio) -- before we go off stage. The first person who gets it right, I will buy them a drink. I can't comment on that. All right. We are out of here.
>> Thank you. [ applause ]