>> I have a question. >> What's the question? >> Is this your first time speaking at DEF CON? >> It is. >> Oh, excellent! [ APPLAUSE ] >> So, actually, I wanted to say that we were experimenting with something new this year at DEF CON. Can you set us up, please? We were looking for a drone delivery system. >> What is this? [ LAUGHTER ] >> What do you think? >> Yeah. >> You all know how this works. You've been here long enough. >> Okay. >> Whoa. Hi! [ APPLAUSE ] >> Here. Thank you. >> Cheers. >> And now, back to our regularly scheduled talk. >> So... [ LAUGHTER ] It is a bit warmer. Let's see how the exploitation process takes place. If you are trying to exploit something, you first of all have to find a vulnerability, and this vulnerability has to be useful. By useful, I mean that it lets you divert the control flow of the program where you want. Once you are able to do this, you can perform your desired actions, I don't know, copying wallets, whatever your actual final aim is. Our focus is on this last part. We assume that we are able to divert the (Indiscernible), and we are going to see what we do afterwards: how we carry out the last part of the exploitation process against the countermeasures that I present. Just being able to divert the control flow is not enough. The question is, where do I point it to? That's the important thing. Since it is 2015, we can no longer just upload shellcode and jump to it; the operating system prevents this kind of stuff. So the hackers and the exploit writers came up with the concept of code-reuse attacks. We cannot inject new code, so we use existing code. That is where return-oriented programming and the like come from, which you are probably a bit familiar with. The point is that we are still able to perform our attack even if we are not able to inject new code, a shellcode. Then there is ASLR, which is address space layout randomization: the code that you want to use, for instance the system library functions, is not always in the same position.
Their position is not deterministic. The code that I want to use is there, but where is it? The typical way to get around ASLR is to use the functions that are already imported. Take the main executable, and let's suppose that it uses the printf function, which is very common. In the memory area dedicated to the main binary, it keeps a reference to the printf function. The typical way to bypass ASLR is to read the address of this printf function, which is imported by the binary. If you are targeting the system C library, you compute the distance between printf and the function that you want to call to perform your malicious operations, say execve, and you add that distance to the leaked address. Now we know the address of execve and we can call execve from our attack. This works, but it has problems. First of all, you don't just have to be able to divert the control flow; you also have to be able to leak this piece of information, so you need at least two vulnerabilities. You need knowledge about the layout of the library that you are targeting, meaning an exact copy of the library, and that's not always available. There are instances where you don't have access to it. And there is another point: the victim needs to interact with the attacker. It is not just that you launch your attack and it works. You have to read the leak on the attacker side and then send another stage of the exploit. So, it is two-stage. This is a problem because your target (Indiscernible); maybe the victim is just opening a JPG or a pcap file, and you cannot really communicate with it. How can we solve this problem? Let's try and zoom out a bit; this was actually our idea when we came up with this technique. What are we trying to do? I have a name and I want its address, so that I'm able to call it.
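The classic two-stage bypass described above boils down to one addition. The following is a minimal sketch of that arithmetic, not part of the talk: the offsets in `LIBC_OFFSETS` and the leaked address are made-up illustrative values, and in a real attack they would have to come from the exact library copy used by the target.

```python
# Hypothetical offsets of two functions inside one particular libc build.
# These numbers are invented for illustration only.
LIBC_OFFSETS = {"printf": 0x0004D410, "execve": 0x000B5BE0}


def address_from_leak(leaked_addr, leaked_name, wanted_name, offsets=LIBC_OFFSETS):
    """Derive the runtime address of wanted_name from one leaked address."""
    # The distance between two functions of the same library is not
    # affected by ASLR, only the base address of the library is.
    distance = offsets[wanted_name] - offsets[leaked_name]
    return leaked_addr + distance


# Pretend the first stage of the exploit leaked the randomized address of printf.
leaked_printf = 0xB7E4D410
execve_addr = address_from_leak(leaked_printf, "printf", "execve")
print(hex(execve_addr))
```

The sketch makes the three costs listed above visible: a leak is needed to obtain `leaked_printf`, the offsets table requires the exact library, and the result has to be sent back in a second stage.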
If the operating system is doing its job, this is exactly the job of the loader. The dynamic loader takes the main binary or a library, sees what the imported functions are, for instance printf, and obtains the address where the printf function actually resides in memory. Let's try and get to know this guy, the dynamic loader. First of all, I'm going to talk about dynamic loading and everything else in the context of ELF-based systems. If we consider an ELF executable, it is made of sections, and you have probably met the most important ones if you have been doing reverse engineering. All of the section names start with a dot. We have .text, which holds the actual binary code of the application, and .data, which holds all of the global data, and (Indiscernible). Then there is .bss; what it stands for, we don't make it up, (Indiscernible). It is the section for global variables that are not initialized: if you write a global array and do not initialize it, it ends up there. Now look at this piece of code, a very simple program. It just prints something out. It has a call to printf in the standard library and, like I said, its position is randomized, so you cannot know ahead of time where printf is going to end up. So it calls printf in the PLT, which is a small piece of assembly code that we are going to see. The PLT is another section, which is a table of these trampolines, one for each imported function. Let's see what it looks like. This trampoline is used to support lazy loading. Instead of resolving all of the imported functions and getting their real addresses at the startup of the program, you obtain the address the first time you call them. So, if a function is never going to be called, you are saving time, and startup is faster for the end user. It works like this.
The first time it is called, the main binary is going to pass control to the dynamic loader through this function, _dl_runtime_resolve, which we will explore in a little bit. Otherwise, it is going to jump to the cached version. If this is printf, somewhere in memory there is a place where the address of printf is kept. _dl_runtime_resolve is going to take care of finding where printf is, storing the address in the cache slot for printf and also calling the function. It takes two parameters. What is a relocation? A relocation is basically a directive for the dynamic loader telling it: take this symbol, and a symbol is a concept that represents, for instance, a function or a global variable in an external library, so printf in this case, and write its address at this specific location. It has one field that represents where the address of the symbol has to be written and one that is basically the identifier of the symbol. The second parameter of _dl_runtime_resolve is the index of this relocation in a table. This is relative to the .rel.plt section, which holds the relocations of the PLT. The symbol identifier is an index in another table, which is an array of symbols. A symbol has a lot of details we are not interested in. The thing that we are interested in is the name field, which leads to another table. This is the last one, I promise. It is the dynamic string table, and it holds the names of all of the symbols that are imported in a certain binary; the name field is an offset into it. So from the relocation we pass to the symbol table, and from the symbol table we pass to the name. In this case, it is printf. We are going to get back to these things. To recap, _dl_runtime_resolve goes from the relocation to the symbol, from the symbol to the name and from the name to the address. It writes the address at the location specified in the relocation, and then it transfers execution to the resolved function, so it also calls it.
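The relocation, symbol, name chain recapped above can be walked by hand. What follows is a small sketch, assuming the 32-bit little-endian ELF layouts (`Elf32_Rel` is 8 bytes, `Elf32_Sym` is 16 bytes) and a byte offset into the relocation table as the second parameter; the three toy tables and the `resolve_name` helper are invented for illustration and only mimic the lookup, they do not search any library.

```python
import struct

ELF32_REL = struct.Struct("<II")       # r_offset, r_info
ELF32_SYM = struct.Struct("<IIIBBH")   # st_name, st_value, st_size, st_info, st_other, st_shndx
R_386_JMP_SLOT = 7


def resolve_name(rel_plt, dynsym, dynstr, reloc_offset):
    """Follow relocation -> symbol -> name, as the lazy resolver does."""
    r_offset, r_info = ELF32_REL.unpack_from(rel_plt, reloc_offset)
    sym_index = r_info >> 8                  # upper 24 bits select the symbol
    st_name = ELF32_SYM.unpack_from(dynsym, sym_index * ELF32_SYM.size)[0]
    end = dynstr.index(b"\x00", st_name)     # names are NUL-terminated
    # r_offset is where the resolved address would be written (the GOT slot).
    return r_offset, dynstr[st_name:end].decode()


# Toy tables: one empty symbol (index 0) and one symbol named "printf".
dynstr = b"\x00printf\x00"
dynsym = ELF32_SYM.pack(0, 0, 0, 0, 0, 0) + ELF32_SYM.pack(1, 0, 0, 0x12, 0, 0)
rel_plt = ELF32_REL.pack(0x0804A00C, (1 << 8) | R_386_JMP_SLOT)

got_slot, name = resolve_name(rel_plt, dynsym, dynstr, 0)
print(hex(got_slot), name)
```

Every attack in the rest of the talk targets one of the three inputs of this walk: the string table, the index that selects the relocation, or the pointers that tell the loader where the tables are.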
As you might understand at this point, it is going to write the address of printf into that cached-address slot. Next time I call the trampoline, it just jumps right there. It is a sort of optimization. Okay, where does this cached printf address actually stay? It is in another section, which is called .got.plt. It is the place where these cached addresses are kept. Initially, they are all initialized, and... well, let's try and come up with an attack on this system. First, the assumptions. Our assumption is that we are able to write to an arbitrary memory location: let's say that we have a small gadget that takes a value and an address and writes the value there. Very simple stuff. What can we do? If you look at this setting, the idea is: what if I'm able to replace the printf string, for instance with execve, and then invoke the resolver with the printf index? The loader is going to go through the relocation and the symbol, and in the end it is going to end up in the dynamic string table, but it won't find printf there. So, it will resolve the address of execve and invoke that function. In this case, we can invoke any function that we want, as long as we are able to write to this memory area. Unfortunately, this approach does not work. There is no reason why the dynamic string table should be writable. It is a piece of information that you write once and never change. So this attack will not work, even if you have this gadget: you would be trying to write to a read-only memory location, and you would get a crash. Let's try to work around this. We have been talking about a lot of different sections, but the dynamic loader doesn't look up sections by name. It doesn't look up the PLT section by name. It uses another section, .dynamic, and it holds (Indiscernible) where the key represents one of the sections and the value is its actual address. For instance, SYMTAB is a pointer to the dynamic symbol table, and so on.
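The key/value section just described can be read with a few lines of code. This is a sketch under stated assumptions: it uses the 32-bit `Elf32_Dyn` layout (a signed tag followed by a 32-bit value) and the standard tag numbers `DT_STRTAB = 5`, `DT_SYMTAB = 6` and `DT_JMPREL = 23`; the addresses in the toy blob and the `parse_dynamic` helper are invented for illustration.

```python
import struct

ELF32_DYN = struct.Struct("<iI")   # d_tag (signed), d_un (value or pointer)
DT_NULL, DT_STRTAB, DT_SYMTAB, DT_JMPREL = 0, 5, 6, 23


def parse_dynamic(blob):
    """Return a {tag: value} map for the entries before the DT_NULL terminator."""
    entries = {}
    for off in range(0, len(blob) - ELF32_DYN.size + 1, ELF32_DYN.size):
        tag, value = ELF32_DYN.unpack_from(blob, off)
        if tag == DT_NULL:
            break
        entries[tag] = value
    return entries


# Toy .dynamic contents with made-up addresses.
toy_dynamic = (
    ELF32_DYN.pack(DT_STRTAB, 0x0804822C)
    + ELF32_DYN.pack(DT_SYMTAB, 0x080481CC)
    + ELF32_DYN.pack(DT_JMPREL, 0x080482B0)
    + ELF32_DYN.pack(DT_NULL, 0)
)

dyn = parse_dynamic(toy_dynamic)
print(hex(dyn[DT_STRTAB]))
```

Because the loader trusts these values, whoever controls the value stored next to `DT_STRTAB` decides where the loader looks for symbol names.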
The nice thing is that the .dynamic section is writable. We can trick the loader into thinking that the dynamic string table is somewhere else. We can build a fake string table in .bss or in any other memory area that is writable by us; .bss is always writable. So we go to the .dynamic section and overwrite the string table entry, and what the dynamic loader is going to do is look up the strings and, instead of going to the real dynamic string table, go to .bss. As the attacker, I forge a string table there and, basically, we are able to perform the attack and call any library function. Okay, this approach is quite (Indiscernible), and actually we were not the first to think about this. So the developers of the linker introduced a protection against the attack that we described. Why is the .dynamic section writable? It has to be writable when you start your binary, but after initialization you can mark it as a read-only section. With that, basically, our .dynamic attack doesn't work anymore. Maybe we can do something else? So far, we have played with the .dynamic section. What happens if, instead of passing the index of an existing relocation, we pass an index that is big enough to land after the relocation section? Maybe we can trick the loader into going somewhere more interesting. Let's take a look. Let's suppose that the index is counted from the beginning of the real relocation section of the PLT. If we pass an index that is big enough, we go past .dynamic and end up in .data or in .bss. If we are able to trick the loader into going there and to forge a fake relocation there, we can basically build our own relocation and resolve any library function without touching the .dynamic section, which is not writable. This is an example of the thing. We pass an index which is big enough to get into .bss, where we forge the relocation. The fake relocation in turn points to a forged fake symbol, which is reached through the same trick: an index big enough to land in .bss. And then, the same trick again for the name.
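The "index big enough" trick is plain pointer arithmetic relative to the real tables. The sketch below is an illustration under assumptions, not the tool's code: it assumes the 32-bit x86 convention where the resolver receives a byte offset into the PLT relocation table, 16-byte symbol entries, and made-up addresses for the real tables and for the forged structures in .bss; `forge_indices` is a hypothetical helper.

```python
SYM_SIZE = 16          # size of an Elf32_Sym entry
R_386_JMP_SLOT = 7


def forge_indices(jmprel, symtab, strtab, fake_rel, fake_sym, fake_str):
    """Compute the values that make the loader land on forged structures."""
    if (fake_sym - symtab) % SYM_SIZE:
        # The loader multiplies the symbol index by the entry size, so the
        # fake symbol must sit at a whole number of entries from .dynsym.
        raise ValueError("fake symbol is not aligned relative to the symbol table")
    reloc_offset = fake_rel - jmprel                  # argument for the resolver
    sym_index = (fake_sym - symtab) // SYM_SIZE
    r_info = (sym_index << 8) | R_386_JMP_SLOT        # goes in the fake relocation
    st_name = fake_str - strtab                       # goes in the fake symbol
    return reloc_offset, r_info, st_name


# Made-up layout: real tables in read-only memory, forged data in .bss.
result = forge_indices(
    jmprel=0x080482B0, symtab=0x080481CC, strtab=0x0804822C,
    fake_rel=0x0804A800, fake_sym=0x0804A80C, fake_str=0x0804A81C,
)
print([hex(v) for v in result])
```

The alignment check hints at why this variant needs some care in placing the forged data: the symbol index is not a free choice, it is dictated by the distance between the writable area and the real symbol table.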
The symbol in turn holds an offset into the dynamic string table, and we are basically forging all of the data structures needed to resolve the library function (Indiscernible). We have to put the right index in the r_info field to be able to perform our attack. There is a complication. The binary, as on several distributions, uses symbol versioning, which basically allows you to depend not on any printf function but on a specific version of printf. For instance, I want the printf from version, I don't know, 2.2. This is actually a GNU extension and, since it is popular, we have to deal with it. If symbol versioning is enabled, the index is not used just as an index in the symbol table, but also in another table, which is called the GNU version table, which I will talk about. The fact is that the index is used for two different things, so we have an additional constraint: the same index has to work for both tables, and the version entry that it selects must not get in the way. We can pick the index so that the version lookup ends up on a zero; then you have basically disabled versioning. Is it doable? Yes. And it can also be automated. The problem is that there are some situations where this is very, very hard: 64-bit binaries and huge pages. There, memory pages are large, in the megabyte range, and the read-only part of the binary, where we have the text, the relocation table, the dynamic string table and all of these things, is very far away, like megabytes away, from the writable pages. That makes it really hard to satisfy the constraint that we just saw. And 64-bit binaries are pretty popular. We found another solution for this. The idea in this case is not to play with the relocation index but with the current object info. Let's look at the current object info. It is the link_map structure, and a pointer to it is always stored in an entry of the GOT, the second entry of the GOT. It has a field called l_info, which actually keeps a cache of pointers to the entries of the .dynamic section.
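Going back to the symbol versioning constraint mentioned above, the double use of the symbol index can be checked mechanically. This is a rough sketch under assumptions: version table entries are taken to be 2 bytes wide, "ending up on a zero" is simplified to the entry's low 15 bits being zero, and memory is modeled by a caller-supplied `read_u16` function (hypothetical) that returns `None` for unmapped addresses. The addresses are made up.

```python
VERSYM_ENTRY_SIZE = 2        # one 16-bit value per symbol
VERSION_INDEX_MASK = 0x7FFF  # top bit is a "hidden" flag, not part of the index


def versym_entry_addr(versym_base, sym_index):
    """Address the loader reads when it looks up the version of a symbol."""
    return versym_base + VERSYM_ENTRY_SIZE * sym_index


def index_disables_versioning(sym_index, versym_base, read_u16):
    """True if the index also selects a readable, zero version entry."""
    value = read_u16(versym_entry_addr(versym_base, sym_index))
    return value is not None and (value & VERSION_INDEX_MASK) == 0


# Toy memory: only two 16-bit cells are mapped.
toy_memory = {0x08048768: 0x0000, 0x0804876A: 0x0002}
versym_base = 0x080482A0

usable = index_disables_versioning(0x264, versym_base, toy_memory.get)
versioned = index_disables_versioning(0x265, versym_base, toy_memory.get)
unmapped = index_disables_versioning(0x300, versym_base, toy_memory.get)
print(usable, versioned, unmapped)
```

With 16-byte symbols and 2-byte version entries, a large index moves eight times further in the symbol table than in the version table, which is one way to see why a writable area that is megabytes away makes the constraint hard to satisfy.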
We can basically go back to the first attack, when we were tampering with the .dynamic section. This is what happens. We go to the STRTAB slot of l_info, which is a pointer to the dynamic entry, itself holding a pointer to the dynamic string table. We can change its value and make it point to .bss, build a fake dynamic entry there, and trick the dynamic loader into thinking that the string table is in this area. We are back to the first attack that we saw. We are still tampering with the .dynamic section, but indirectly, by corrupting the internal data structure of the dynamic loader, whose address is always in the GOT, in the second entry of the GOT. Still, there is an additional protection against this. RELRO comes in two flavors. Partial, which we just saw. Full RELRO has all of the features of partial RELRO, but it also disables lazy loading. The GOT is fully initialized at startup, and we cannot write to the GOT anymore. Also, since they are not used any longer, the special GOT entries are no longer initialized, so we lost the pointer to this critical data structure that we need to bypass RELRO. We were getting its address from the second GOT entry. With these three things, we don't have a pointer to the link_map, we don't have a pointer to (Indiscernible) and a pointer to (Indiscernible). It seems like we are pretty far from an attack. But there is this feature, the typical feature that is just staying there and waiting for someone to abuse it. And here we are. This is DT_DEBUG, an entry of the .dynamic section that debuggers use to follow certain events related to dynamic loading: a new library being loaded, new symbols and things that we don't really care about. The nice thing is that this dynamic entry doesn't point to a section, but to the r_debug structure, and that holds a pointer to the link_map. We are able to go through these and get back to the previous attack. It is not that simple, though. As we said, we also, sorry, this is the last drawing I'm showing you. I promise.
[ LAUGHTER ] This is really interesting because we didn't have a pointer, and we found one through the (Indiscernible) dynamic entry. Let's see how this goes. In the .dynamic section, we take this entry, follow its value, and we get to the r_debug structure. Then we have this field which points to the link_map structure that we saw before. The first thing that we do is corrupt the DT_STRTAB entry, that's the dynamic string table. We do what we did before: we build a fake dynamic entry and a fake string table. What will the dynamic loader do? It is going to go through all of this stuff and resolve the symbol at the end. Nice. But then, one of its features, as you saw, is storing the resolved address in the GOT. The GOT is completely read-only because of full RELRO. So, we get a segfault. So, we have to fake another entry of the link_map data structure, and that's DT_JMPREL, an entry holding a pointer to the relocation table. We create a fake relocation table whose relocation, instead of pointing to the GOT, points to a writable memory area, in this case one just after it. So we solve this problem too. We have one last problem: we no longer have the address of _dl_runtime_resolve. The trick here is that it is usually, well, not always, the main binary that is protected with full RELRO. The libraries are not protected with full RELRO. If we are able to get to the GOT of one of the libraries, we are able to get that address. The link_map structure is not just a structure per se; it is part of a linked list of data structures representing all of the loaded ELF objects. We can follow the next pointer in the linked list to the data structure of any library, dereference its dynamic entry, and we get the address of its GOT. We go to the third entry, get the address of the resolver, and problem solved. Putting this all together, we are able to bypass full RELRO, which is a pretty cool thing. Okay, so this is very interesting, but how can you do that in practice?
We implemented Leakless, which is the name of our tool. It analyzes the binary and tells us the approach that is most suitable based on the protections that are enabled, 64-bit, huge pages and those kinds of things, and it can produce two different kinds of output. It can generate the exploit for you, or it can produce a JSON file, which basically tells you how to build the exploit: write this thing here, write this thing there, and then call the runtime resolver. If you provide Leakless with the gadgets that it needs, and we are going to see the kinds of gadgets that it needs, it is going to produce the exploit to resolve one or more functions, one after the other, and call them with the appropriate parameters. You can find the code there; we have just published it, so you can play around with it. Gadgets. We talked about gadgets, and they depend on which of the four attacks that I presented you are using; well, the first one, the naive approach, was not working. Depending on the attack that we are dealing with, we need different things. For the first one, basically overwriting the .dynamic section, you need a gadget that, given an address and a value, writes the value there. Basically, all of the techniques need this kind of gadget. (Indiscernible) stands for partial RELRO, then all of this other kind of stuff, and then full RELRO. The next one is the one that we use to tamper with the link_map data structure: you have a pointer, you have an offset, you dereference it and you write a value there. This is used for the huge pages and the full RELRO attacks. This other one is basically a gadget that we need to store somewhere some data that we took from all of those data structures: you choose a memory location where you want this address saved, and you take it from this data structure at this offset. I initially skipped over this. For the first three attacks, to call _dl_runtime_resolve, we can use a part of the PLT (Indiscernible). So, we need up to four gadgets, depending on the attack.
What loaders are actually vulnerable to this? We tested the GNU C library and (Indiscernible), C libraries for embedded systems, and the BSDs such as OpenBSD. They basically all behave the same as far as the standard ELF features are concerned, except for minor modifications. Not all of them cache the dynamic entries, so there is some difference, but it is not a real big deal. The one that is not vulnerable is Bionic; as far as I know, it only supports (Indiscernible) binaries. There, basically, our attack is worthless. It is a feature. And with the FreeBSD loader, the second attack that I presented, not the first, basically the one which was overflowing the relocation table and the symbol table and those kinds of things, does not work. It is actually checking the boundaries. These are the basic things to do, but they are not the only ones: the other attacks do work. So: vulnerable and not vulnerable. Let's recap. What are the advantages of Leakless? It is single stage. You don't require interaction with the victim. We are not sending anything back to the attacker. We are doing a single exploit. This enables offline attacks, which before were not feasible. It is reliable: it is deterministic, and nothing is left to randomness. And you don't need a copy of the target library. If you think back to the beginning, I told you that in the typical situation you have to compute the distance from printf. For that, you need the library. In this case, we do not care about the layout of the library. We just abuse the dynamic loader, which does the lookup in the proper way. It is also very portable in most cases: GNU, OpenBSD and so forth. As I told you, there are minor adaptations to make. And the exploits are short. The alternative is to reimplement in the exploit what the loader is doing, and going through all of the relocations, symbols and all of those things is very complex and takes a very long exploit.
And another advantage of Leakless: once you have set up the data structures, if you want to make multiple calls to different library functions, you just need to change the name of the function that you want to call and call (Indiscernible) again, whereas if you reimplement the loader, you have to redo the whole process. Shorter means higher feasibility. If you don't want to interact with the victim again, you cannot read more input. If you want to do it single stage, being short is important. (Indiscernible) criticism that one is going to make of such an attack is: why don't you just do this with a syscall? Because it may not be available. If you can do syscalls, you can do everything that you want. But spawning a new process, for instance if you do an execve, is kind of invasive, and here is an example that we came up with. It is a single exploit using functions of the client itself. What we are doing here is taking the pointer to the proxy object, because the attack that we want to carry out is to set a proxy, so that I can intercept the instant messenger traffic of the user. We connect and disconnect the user. You could do it with syscalls somehow, but you would be losing program functionality; this is less intrusive. Leakless is doing everything except finding the gadgets. Maybe in the future we can add some gadget finding. Countermeasures. An actual solution is PIE, but according to our survey, PIE is not being deployed a lot. These protections are used only on the binaries that are deemed critical. Why they are not deployed on all of the binaries doesn't make sense to me. Then, the debug entry: just remove it, or enable it only when you want it, through an environment variable. And then, with the loader (Indiscernible), you get a crash, which is also a very good confirmation. The key point is that these core components of the operating system should be recoded with security in mind. The ones that we have inspected just trust their input. And since they are all over the place, this is very dangerous.
If you are able to attack them, you obtain something that is portable and works all over the place. The last thing: I want to thank all of the people who worked with me. Yan is not here. And that's basically all. Maybe I have a second to do a very short, just to show you some code. So, this is where we tried to attack a real binary. We wanted to do something real. We took a very basic bug, a stack buffer overflow, and we tried to generate an exploit with Leakless, just providing the gadgets. And, testing with the pcap which I generated, I got a shell. [ APPLAUSE ] It is very easy. Shell is always good. It would be the same experience for you, just trust me. The last thing, and this is the code part: Leakless has been implemented in Python, and this is the code that you need to write if you want to provide the gadget info to Leakless. It is mostly boilerplate code that could be factored out; it basically takes the gadgets and mixes them together, and the rest is done by Leakless, which just takes what it needs from the binary. I also wanted to show you the JSON output. This is Leakless when we ask it to produce (Indiscernible) the JSON format, which is the second kind of output I presented. What it does is basically give you some instructions like: write this thing at this address. If you don't want the exploit to be generated automatically, because maybe it is a complicated setting, it can give you the instructions to perform the attack. And with this, thank you for your attention. [ APPLAUSE ]