>> Good morning, and thank you for attending a Sunday morning talk. Today we’re going to talk about Android packers and how they work. We’re going to analyze a few of them, and I’m going to show a tool we developed that provides a generic way to handle most of the packers on the market.

So who are we? I am Avi. I’m currently a founder at myDRO, but I was formerly a Mobile R&D Team Leader at Check Point for a few years, and this is where I did this research. Before that I worked in security research at Lacoon Mobile Security. Unfortunately Slava, my co-presenter, could not attend today. Besides being a very talented researcher and a good friend, he also did a lot of the heavy lifting on this research. We miss you, Slava.

So let’s get down to business: boxing apps. Malware authors use various boxing or packing techniques to prevent static code analysis and reverse engineering. A malware author has invested a lot of time in developing his cool malware, and he doesn’t want security people to understand what’s happening or to detect it, whether with their automatic tools or with manual reverse engineering. This is the same as in the PC world, where most of the malware today is packed.

So how can one protect his code? He can use a proprietary technique or third-party software that protects the app’s code. What should this software do? It includes code protection, anti-debugging, anti-tampering, anti-dumping: all of the methods that prevent security researchers, static code analysis engines, and reverse engineers from understanding what’s happening inside the app, the malware.

And what was our motive for this research? In the PC world, malware has been packed by different packers for years now. We see this trend rising in the Android world as well.
This is a snapshot of an analysis done in a Check Point system in May. We saw that almost 25 percent of the packed apps were detected by us as malware. And we asked ourselves: can we improve the detection? Is this really the share of malicious apps among all the packed apps, or are we missing something because we are not doing static code analysis, or full static code analysis, on these apps, because they’re packed? So this was our motive: to understand how packing works and to find a generic way to unpack these apps.

So what techniques exist to protect an app’s code? Let’s talk about the main three: obfuscators, packers, and protectors.

What is obfuscation? Obfuscation means adding redundant code to the main app’s code in a way that doesn’t affect the functionality of the app, and changing the function names and variable names. This is done to prevent a reverse engineer from understanding what’s happening inside the app. Today in the Android world there’s a default obfuscation tool called ProGuard, which comes with Android Studio and is used by most Android developers. But I have to say it’s not the best obfuscator on the market, and even a not very experienced reverse engineer shouldn’t have any problem understanding what’s happening inside the app.

Another method, the one we are going to concentrate on in this talk, is packers. What does packing do? Let’s say I have an APK, an Android app, and it contains a DEX file. What’s a DEX file? It’s the smali bytecode, what the code becomes after it is compiled, and this is what Android executes. So I have the original code. What does the packing process do? It takes this original code and encrypts it, packs it in some manner.
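To make the packing step just described concrete, here is a minimal Python sketch. It is not any real packer’s algorithm: the single-byte XOR in `toy_pack` is a stand-in for whatever encryption a packer actually uses, and `make_fake_dex` and `looks_like_dex` are hypothetical helpers. The only format facts relied on are that a DEX file starts with the magic `dex\n`, has a 0x70-byte header, and stores its total file size at offset 32.

```python
import struct

DEX_MAGIC_PREFIX = b"dex\n"   # followed by a 3-digit version and a NUL byte
DEX_HEADER_SIZE = 0x70


def looks_like_dex(blob: bytes) -> bool:
    """Cheap sanity check: DEX magic plus a size field that matches the blob."""
    if len(blob) < DEX_HEADER_SIZE or not blob.startswith(DEX_MAGIC_PREFIX):
        return False
    (file_size,) = struct.unpack_from("<I", blob, 32)
    return file_size == len(blob)


def toy_pack(dex: bytes, key: int = 0x5A) -> bytes:
    """Stand-in for a packer's encryption step (assumption: plain XOR)."""
    return bytes(b ^ key for b in dex)


# XOR is its own inverse, so unpacking is the same operation.
toy_unpack = toy_pack


def make_fake_dex(payload_len: int = 16) -> bytes:
    """Build a header-only fake DEX for the demo; it is not a loadable file."""
    header = bytearray(DEX_HEADER_SIZE)
    header[0:8] = b"dex\n035\x00"
    struct.pack_into("<I", header, 32, DEX_HEADER_SIZE + payload_len)  # file_size
    struct.pack_into("<I", header, 36, DEX_HEADER_SIZE)                # header_size
    return bytes(header) + bytes(payload_len)


if __name__ == "__main__":
    original = make_fake_dex()
    packed = toy_pack(original)
    print("original looks like DEX:", looks_like_dex(original))
    print("packed looks like DEX:  ", looks_like_dex(packed))
    print("round trip ok:          ", toy_unpack(packed) == original)
```

The point of the sketch is only that a static analyzer looking at the packed bytes no longer sees anything it can parse as a DEX file until the loader reverses the transformation at run time.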
To open it again, the packer adds a packer loader, which is now the entry point for the execution of the app. Once the app is executed, the packer loader takes the bundled, encrypted original DEX file, unpacks it, and loads it into memory so it can be used by the app.

There are also protectors, which work in a different way. They take the original DEX file, but they don’t only encrypt it, they also modify it. Why do they do that? They want to add another layer of protection. For example, one of the protectors we saw in the wild adds encryption at the class level, meaning a class is decrypted only when it is initialized. What’s surprising here is that we didn’t see wide use of protectors in the wild. Our guess is that it’s because they might affect the logic of the malware, so malware authors don’t use them as much as packers. So we decided to concentrate our research on packers.

In order to understand more about how packers work, we need to go back to basics a bit and understand some things about Android. So let’s talk a bit about ART, the Android Runtime VM. This is a schematic of how things look in Android 6, which is good enough for us to understand this world. How does the Android Runtime VM work? It can work in two modes: one is interpreting the smali bytecode, and the second is working with compiled code, ahead-of-time compilation. That was something that was introduced in Android 4.4. What happens is that when you install an app, it goes through a process of compilation, and then the VM works on compiled ELF code. This gives a lot of improvements in RAM, battery, performance, and the startup time of the application.
But it’s important to remember that the VM can work in both ways: interpreting [inaudible] code, or running compiled native code.

So what happens when you load a DEX file, when you start an app? You trigger the zygote process, which is an empty process that contains preloaded classes and is designed to shorten the startup time of the app. The zygote process forks itself into an empty app process and loads the app code, the OAT file. But what happens if the OAT file is missing? It will trigger dex2oat, the process that compiles the DEX file into an OAT file, and then the OAT file will be used to execute the app.

I talked a bit about OAT files, so let’s try to explain what they are. An OAT file is basically an ELF file with some added sections. One of them is oatdata, which contains the original DEX file, and another is oatexec, which contains the compiled version of the DEX file. Both of these sections are used by the ART VM when executing an app: one is for creation of different headers and one for interpreting the app. But it’s important again to note that you don’t have to have the native code, all of the oatexec, in order to execute an app. You can fall back to the smali version and interpret the code.

So now that we understand a bit more about Android, let’s try to think about ways to unpack an app. The first one is finding the algorithm. We can take the different packers, analyze how they pack an app and which algorithm they use, and reverse the steps in order to decrypt the app. The problem is that this doesn’t scale. You need to understand how each packer works, and if a packer makes even a minor modification to its packing algorithm, your script will break and you need to start your research again. So this is not the way you want to go.
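Since an OAT file, as described above, is an ELF container that normally embeds the original DEX, one quick experiment is to scan such a file for embedded DEX images. The Python sketch below is a naive carver, not a real OAT parser: it assumes only that an embedded DEX starts with `dex\n`, three version digits and a NUL byte, and that the little-endian `file_size` field sits at offset 32 of the DEX header. The function name `carve_dex` is made up for this example.

```python
import struct

DEX_HEADER_SIZE = 0x70


def carve_dex(blob: bytes):
    """Return (offset, dex_bytes) for every plausible DEX image inside blob."""
    found = []
    pos = blob.find(b"dex\n")
    while pos != -1:
        header_ok = (
            pos + DEX_HEADER_SIZE <= len(blob)
            and blob[pos + 4:pos + 7].isdigit()   # version, e.g. b"035"
            and blob[pos + 7] == 0                # NUL terminator of the magic
        )
        if header_ok:
            (size,) = struct.unpack_from("<I", blob, pos + 32)  # file_size field
            # Reject sizes that are too small or run past the end of the blob.
            if DEX_HEADER_SIZE <= size <= len(blob) - pos:
                found.append((pos, blob[pos:pos + size]))
        pos = blob.find(b"dex\n", pos + 1)
    return found


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        for offset, dex in carve_dex(fh.read()):
            print(f"DEX candidate at offset {offset:#x}, {len(dex)} bytes")
```

A carver like this returns nothing useful when the packer has stripped or encrypted the embedded DEX, which is why a static approach of this kind is not sufficient on its own.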
Another method could be extracting the DEX file from the compiled OAT. As we said, we have the DEX file inside the OAT. But what we saw in different packers is that you don’t have to have the whole DEX file inside the OAT, you need only part of it. So some of the packers delete the DEX file from the OAT, and you can’t just take it and use the DEX file.

Another method might be just dumping the DEX file from memory. But again, this does not always work, because the DEX might be missing and the packer will use the OAT file.

So we wound up thinking about using a custom Android ROM. This is something we already do at Check Point: we have a dynamic analysis engine, and introducing a few modifications to the custom Android ROM would allow us to place a few hooks in interesting places. This would allow us to dump the DEX file and pass it to our static code analysis engine.

Before continuing, I want to talk about a few notable works done in this area. One of them was Android Hacker Protection Level 0, which was presented here at DEF CON a few years ago. It was a very good talk about the different packers and protectors in the wild, and the authors also released a set of scripts that work on some of the packers and dump the DEX file. Another very interesting talk came from the people who released the DexHunter tool, which is a modified version of the Android Dalvik/ART VM that reconstructs a new DEX file from memory. While this was a very interesting project, it was not what we aimed for. We wanted to get the original DEX file as it was before the packing process began, in order to have the same hash as the original file. So we wanted to take a different path.

So what was our approach? We wanted to find a solution that would require minimal changes to the Android source code, so that it would be portable, and that would work on most of the packers.
So how did we address the problem? We took the most popular packers that we witnessed in our systems and reversed them. We also analyzed the way Android loads a DEX file in order to truly understand how it works. The result was a patch of a few lines to the Android runtime that allows us to dump the files and analyze them in our static code analysis.

So which packers did we analyze? The most popular packers we encountered were Baidu, Bangcle, Tencent, Ali, and 360 Jiagu. Baidu is the same huge Chinese company that you know. They also have a packing service. It’s a web service: you send an APK and you get back the packed version of it. It was very surprising to see that they offer this kind of service. In this talk, I am going to talk about Baidu and Bangcle. What’s interesting about them is that they work in slightly different ways, but covering both of them allowed us to find a solution that works on almost all the packers we encountered.

So let’s try to think about the abstract way in which a packer should work. As I said, you have the packer loader, which loads the bundled packed DEX. Loading the DEX triggers libart, the ART VM, which opens the DEX file and maps its data into memory. Then you have the DEX in memory. But something is missing here: where does the unpacking process take place? What we thought is that most of the unpacking process would be inside a native library. Why? Because reversing a native file is harder for reverse engineers. So it’s a good idea to put your unpacking logic somewhere obfuscated and protected, and for that you don’t do it in the Java bytecode packer loader but in a separate native file. And what does this native file do? It needs to inject itself somewhere so it can decrypt and unpack the packed file.
It does this with hooking. It might hook libart or libc, and then when libart opens the file, the unpacking process takes place, and libart gets the original DEX file that it can execute. So, cool. Now that we have a hypothesis of how packers should work, we want to verify whether that’s really what we see in the field.

So let’s look at the first packer, Bangcle. Bangcle is very easy to identify: it has various classes which show up in every app packed by Bangcle, and different files. One of them is the native packer library, which is used to unpack the app, and another is the packed version of the DEX file, which is the Bangcle classes file.

This is a snippet of the Java loader implementation. We can see that Bangcle loads a native library file and, through JNI, which is the bridge between Java and native functions, calls functions from this native file. Then it loads the DEX file, which triggers the libart .so.

We wanted to understand what’s happening inside the native file, so we tried to open it with IDA, and it crashed. This is one step afterwards, after we fixed it; I’ll explain in a second how we did it. What we noticed is that we didn’t see any mapping between the native functions and the function names in the Java interface. Something is missing here: where does this mapping happen? So what we needed to do is understand the mapping.

I’ll take a step back. I said that IDA crashed. Why did IDA crash? We know that this file is in use, so it should work. But when we dumped the file headers, we saw that some of the segments were missing. We didn’t see the text segment. What we noticed is that it is defined in the dynamic section, and we had to manually reconstruct the different sections in the file from the info we got from the dynamic section.
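As a rough illustration of reading the dynamic section when the section headers cannot be trusted, here is a minimal Python sketch. It is not the procedure used in the research. It assumes a little-endian ELF32 file (as on 32-bit ARM), walks the program headers to find `PT_DYNAMIC`, and collects the dynamic tags. The layouts and the constants `PT_DYNAMIC = 2`, `DT_STRTAB = 5`, `DT_SYMTAB = 6` and `DT_INIT = 12` come from the ELF specification; the function name `read_dynamic` is made up for this example.

```python
import struct

PT_DYNAMIC = 2
DT_NULL, DT_STRTAB, DT_SYMTAB, DT_INIT = 0, 5, 6, 12

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 file header, 52 bytes
PHDR = struct.Struct("<8I")                # ELF32 program header, 32 bytes
DYN = struct.Struct("<iI")                 # ELF32 dynamic entry: d_tag, d_val


def read_dynamic(elf: bytes) -> dict:
    """Collect dynamic tags via the program headers, ignoring section headers."""
    fields = EHDR.unpack_from(elf, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF32 file")
    e_phoff, e_phentsize, e_phnum, e_shnum = fields[5], fields[9], fields[10], fields[12]

    tags = {}
    for i in range(e_phnum):
        p_type, p_offset, _, _, p_filesz, *_ = PHDR.unpack_from(
            elf, e_phoff + i * e_phentsize)
        if p_type != PT_DYNAMIC:
            continue
        for off in range(p_offset, p_offset + p_filesz, DYN.size):
            d_tag, d_val = DYN.unpack_from(elf, off)
            if d_tag == DT_NULL:          # end of the dynamic array
                break
            tags[d_tag] = d_val           # values are virtual addresses or sizes
    return {"section_headers": e_shnum, "dynamic": tags}
```

A result with `section_headers` equal to 0 but a populated `dynamic` table matches the situation described here: the loader can still run the library, while tools that depend on section headers need those sections rebuilt from entries such as `DT_STRTAB` and `DT_SYMTAB`.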
And then we could analyze it in IDA. But that wasn’t enough, because even when we opened the file in IDA, the entry point was not valid. It didn’t point to anything interesting, so something else is happening here. What happens is that there’s a call to an INIT function from the dynamic section. What does this INIT function do? The native file contains a compressed section of code, and the INIT function decompresses this code and overwrites the text section. Now the original entry point of the ELF is valid. And what we saw is that one of the functions inside the native file is JNI_OnLoad, and JNI_OnLoad provides the mapping between the functions in the native file and the JNI, the Java side. So now we can understand what each function does.

Okay, so now let’s see how Bangcle works. The first function extracts a file from the assets. The second one, which is the interesting one, forks three different processes. The first one is just the app’s process. The second one is an anti-debugging process, which does different tricks to prevent us from understanding what’s happening. The third one executes only when the OAT file does not exist, and as we know, this is mostly the first time the DEX is loaded. But it doesn’t execute dex2oat in the regular way. It uses LD_PRELOAD to hook some of the functions in dex2oat and create a special version of an OAT file. This OAT file will later be used by the ART VM when executing the app. I hate Windows.

Okay, so what does the hooking in Bangcle do? It hooks 8 different functions, and we have here an example of one of the hooked functions and the way it is hooked. On the left we can see the function without any hooking. And on the right we can see
the hooked version, and we can see that the first bytes were overwritten. What it does is change the PC register in order to redirect the flow of execution to the unpacking process of the app.

So let’s do a recap of Bangcle and how it works. It creates a stub, a packer loader, as a Java activity to load the native library. The native library is protected with different anti-research techniques that we had to bypass. It hooks libc for the unpacking process, and when libart encounters the OAT file, it unpacks it and provides an unpacked version to the libart VM.

So we understand how Bangcle works. Let’s look at Baidu. Again, classification is pretty straightforward: we have a stub application and a stub provider, and again a native lib and the packed original DEX. And again, we couldn’t see the mapping between the native functions used in the loader and the functions in the native lib, and you can see that Baidu again used the INIT section in order to decompress some of the code. Again, we couldn’t see it in IDA, and only after the decompression could we understand what’s happening inside the file. And again it uses the JNI_OnLoad function to provide the mapping and do some other interesting stuff.

These are the things it does. It has anti-debugging techniques, which I will elaborate on. It registers the native methods, meaning the mapping between the JNI and the native functions. It extracts the packed DEX from the assets and creates an empty DEX file, not an OAT file but a DEX file. And it sets up the hooking. So what are the anti-debugging techniques used in Baidu? We have obfuscation, log disabling, checks that gdb isn’t running and JDWP isn’t running, and a few other anti-debugging techniques.
And we can see that the hooking of libart in Baidu is a bit different. It hooks the Android log print function in order to prevent any logging, so if you try to debug it, it will be harder for you to understand what is happening. It hooks the execv function, and when dex2oat is executed by Android, it prevents the compilation of the DEX file. That means the OAT exec section won’t be empty, but Android won’t use it to execute the logic of the app. It will fall back to the smali code. And it hooks the open function, meaning that when Android tries to look for the 1.jar file, it instead decrypts the packed DEX file and supplies it to the libart VM.

So again, let’s see what Baidu does. It creates a stub as a Java activity. The native lib is protected with different anti-research techniques. And it hooks libart to handle the opening of the DEX file. Well, this looks familiar, but this is a different packer, or almost a different packer. So what can we understand from this? That most of the unpacking process might be generic, with a few minor changes. We can see that the trigger for the decryption in the different packers is when libc opens the file. In Bangcle it’s when it opens the classes OAT file. In Baidu it’s when it opens the DEX file. So if we place hooks in the first places in the libart VM process where it tries to open an OAT file and a DEX file, and dump the files, we should have the decrypted, unpacked version of the DEX file.

So that’s exactly what we did. We understood the way an app is loaded by the ART VM and where the first places are in the loading flow where we can place a hook in the VM code so we can dump the files: one function in the OAT file loading process and one in the DEX loading process. And as you can see, it’s only a few lines of code. One is three lines and the other is only a few.
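The dump-at-load idea can be modeled in a few lines of Python. This is only a conceptual sketch, not the actual ART patch, which is a change to the runtime’s own source. It assumes the hook sees the bytes the runtime is about to parse, after any packer hook has already decrypted them. The names `dump_if_interesting`, `hooked_open`, `real_open` and `DEFAULT_DUMP_DIR` are all hypothetical.

```python
import hashlib
import os
import tempfile

# Magic prefixes worth dumping: a raw DEX file, or an ELF container (OAT files are ELF).
MAGICS = {b"dex\n": "dex", b"\x7fELF": "oat"}
DEFAULT_DUMP_DIR = os.path.join(tempfile.gettempdir(), "dex_dumps")


def dump_if_interesting(data: bytes, dump_dir: str = DEFAULT_DUMP_DIR):
    """Write data to dump_dir if it starts with a known magic; return the path or None."""
    for magic, ext in MAGICS.items():
        if data.startswith(magic):
            os.makedirs(dump_dir, exist_ok=True)
            # Name the dump by content hash so repeated loads do not pile up.
            name = hashlib.sha256(data).hexdigest()[:16] + "." + ext
            path = os.path.join(dump_dir, name)
            with open(path, "wb") as fh:
                fh.write(data)
            return path
    return None


def hooked_open(data: bytes, real_open, dump_dir: str = DEFAULT_DUMP_DIR):
    """Model of a patched loader entry point: dump first, then continue as normal."""
    dump_if_interesting(data, dump_dir)
    return real_open(data)
```

Placing the dump at the first point where the runtime touches the file is the design choice that matters: whatever transformation the packer applied has to be undone by then, otherwise the runtime itself could not load the code.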
And this allows us to dump the decrypted version of packed DEX files. So let’s see a demo. Okay, I’ll try to... cool. So this is a demo of a tool we created that can generically unpack most of the packers. What you see now in the background is an app which is packed, as you can see, by Bangcle, and you can see the packed DEX file, where you can’t really understand what’s happening because it’s packed. Now we execute our tool, which is a forked version of the AOSP build of Android. I’d like to fast-forward this, but unfortunately I can’t, so we’ll have to wait. What’s happening here is that the Android emulator is loading, and once it’s loaded we will load the app, and our two different hooks will dump two versions of the DEX file. One should be valid and one should not; it depends on how the packer works. Some packers hook in both places, and some of them hook in only one of the loading flows. But this enables us to unpack the apps. So, well, this will take a few more seconds. Sorry. Okay. So how are you guys today? [Audience responds]

What I can mention is that the hooking used by the packers is not persistent: they place the hooks during the loading process and then they remove them. We had to really understand when the hooks are placed so we could dump the DEX file at the right time, because trying to connect later on with GDB and dump the memory, or just dumping it after the app has already executed, won’t always work. So it was crucial for us to understand the DEX loading process. And... wow, Android is so slow. [Laughs] Oh man. [Audience laughs] [Laughs] [Clapping] Okay, you’ll believe it works, but you don’t need to believe me. You can download the tool
for yourself from our repository. It’s not a compiled version of Android but a patch that you can apply and a script that wraps the unpacking process. You can go over them, execute them, see that it works, and enjoy.

So, sorry. We understood how the packing process works in the different packers, and we introduced only a few changes to the ART VM. This enabled us to work with about 90 percent of the packers we encountered in our systems. And what was very interesting is that this change allowed us to send the unpacked DEX files to our static code analysis systems, and we got a 50 percent increase in the detection of maliciousness in packed apps from this feature, which was very good for us. [Applause] Thank you, thank you, you’re far too kind. [Applause] And that’s it. If you have any questions, feel free to ask. [Applause]