Reverse engineering, as applied to software, is the process of breaking down a program into its base components in order to reveal more about how it functions. One example of reverse engineering is the attempt to analyze a program's implementation of digital rights management (DRM) copy protection mechanisms. If enough is learned about how the copy protection works at a lower level, it can be broken.
Even if you don't have access to an app's source code during your pen test, you may be able to obtain the app's binaries or capture information about the app during execution; this can enable you to reverse engineer the app to look for potential weaknesses in design, programming, or implementation.
When it comes to software, there are three primary methods of performing reverse engineering: decompilation, disassembly, and debugging.
Use an Obfuscator!
Decompilation is the reverse engineering process of translating an executable into high-level source code. This typically involves translating the machine language code of compiled binaries into the source code that the software was written in before being run through a compiler. However, decompilation can also involve translating intermediary bytecode that is normally executed by an interpreter into the original source code.
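One reason interpreted bytecode tends to be easier to decompile than native machine code is that it keeps much of the original program's metadata. The sketch below is not a decompiler; it only uses Python's standard `dis` module and code-object attributes to show what survives compilation. The `check_license` function and its contents are hypothetical stand-ins.

```python
import dis


def check_license(key):
    # Hypothetical routine standing in for logic a tester might want to recover.
    secret = "ABC-123"
    return key == secret


code = check_license.__code__

# Metadata that is still present in the compiled code object.
recovered = {
    "function": code.co_name,
    "arguments": code.co_varnames[:code.co_argcount],
    "locals": code.co_varnames,
    "constants": code.co_consts,
}

for field, value in recovered.items():
    print(f"{field}: {value}")

# The instruction stream that a decompiler would lift back into source code.
opnames = [ins.opname for ins in dis.get_instructions(check_license)]
print(opnames)
```

Because names and constants are stored alongside the instructions, a decompiler for this kind of bytecode has far more to work with than one that starts from stripped native machine code.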
Being able to deconstruct an executable into its source code means that you don't need to rely solely on dynamic analysis to test a target app. Decompilation can be used to recover lost source code and to examine malware. It also lets you perform static code analysis on the recovered source to find errors. Decompiling an app will help you determine if the app's logic will produce unintended results, if the app uses insecure libraries and APIs, and if the app exhibits any of the other poor coding practices that developers can fall prey to.
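Once source code has been recovered, a simple static check can flag calls that deserve a closer look. The following is a minimal sketch that uses Python's `ast` module; the watch list in `RISKY_CALLS` and the sample in `RECOVERED_SOURCE` are assumptions made for illustration, not a complete rule set.

```python
import ast

# Assumed watch list for illustration only; real reviews use much larger rule sets.
RISKY_CALLS = {"eval", "exec", "os.system", "pickle.loads"}

# Hypothetical decompiled source to be reviewed.
RECOVERED_SOURCE = '''
import os
import pickle

def load(blob, cmd):
    data = pickle.loads(blob)
    os.system(cmd)
    return data
'''


def dotted_name(node):
    """Return 'a.b' for attribute access, 'a' for a plain name, else None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def find_risky_calls(source):
    """Return (line number, call name) pairs for calls on the watch list."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


for lineno, name in find_risky_calls(RECOVERED_SOURCE):
    print(f"line {lineno}: call to {name}")
```

A match is only a lead for manual review; whether a flagged call is actually exploitable depends on where its arguments come from.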
Some apps are easier to deconstruct than others. For example, the nature of the class files in the Java programming language enables them to be easily decompiled into source code. You can therefore reverse engineer apps written in Java with freely available, easy-to-use tools. However, some languages and third-party tools are designed to obfuscate source code before it is compiled. Obfuscated code is difficult to dissect because it uses convoluted expressions that are not friendly to human analysis. For example, the name of a variable in the source code might be something simple and self-explanatory like count, but in the decompiled code, it may appear as a seemingly random string of digits, like 42893285936546456421324. This makes it more difficult for a human reviewer to understand and remember the variable's purpose, as well as to trace the variable throughout the code.
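The renaming effect described above can be imitated in a few lines. The sketch below is a toy identifier-renaming pass built on Python's `ast` module (it needs Python 3.9 or later for `ast.unparse`); the `RenameLocals` class and the sample `total` function are hypothetical, and real obfuscators apply many more transformations.

```python
import ast

# Hypothetical readable source, using the self-explanatory name "count".
SOURCE = '''
def total(prices):
    count = 0
    for price in prices:
        count = count + price
    return count
'''


class RenameLocals(ast.NodeTransformer):
    """Replace every argument and variable name with an opaque identifier.

    This toy version renames every name it sees, so it is only safe for
    code that does not reference builtins or globals, like the sample above.
    """

    def __init__(self):
        self.mapping = {}

    def _opaque(self, name):
        # Hand out names in first-seen order: _0x0, _0x1, ...
        if name not in self.mapping:
            self.mapping[name] = f"_0x{len(self.mapping):x}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._opaque(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._opaque(node.id)
        return node


renamer = RenameLocals()
obfuscated = ast.unparse(renamer.visit(ast.parse(SOURCE)))
print(obfuscated)
```

The transformed function behaves the same as the original, but a reviewer reading it no longer gets any hint from the names about what each variable is for.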
The following table compares some popular decompilers.
Note: For more information on decompilers, see https://en.wikibooks.org/wiki/X86_Disassembly/Disassemblers_and_Decompilers
Disassembly and Debugging
Disassembly is the reverse engineering process of translating low-level machine code into higher-level assembly language code. Assembly language is lower level than typical source code, but it is still human readable and can include familiar programming elements like variables, functions, and even comments. Like decompilation, the purpose of disassembly is to better understand how an app functions in ways that might not be visible during normal execution. A tool that performs disassembly is called a disassembler.
Disassembly has its disadvantages when compared to decompilation. Assembly code is not as concise as high-level code; it's more repetitive; the flow of the code is not as well structured; and, of course, it requires knowledge of assembly, which not many people possess. However, disassemblers tend to be more common than decompilers, as accurate decompilation is difficult. In addition, disassembly is deterministic: a given machine code instruction will always translate to the same assembly instruction. In decompilation, by contrast, the same machine code can correspond to multiple different high-level expressions, so the decompiler has to guess which one the developer wrote.
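The deterministic nature of disassembly can be seen in a table-driven translator. The following is a minimal sketch that handles only a handful of one-byte x86 opcodes as they decode in 32-bit mode; the `disassemble` function and the sample bytes are illustrative, and a real disassembler must also handle multi-byte instructions, prefixes, and operands.

```python
# Register order used by the one-byte push/pop encodings.
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Fixed opcode-to-mnemonic table: each byte value has exactly one meaning.
OPCODES = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}
for index, reg in enumerate(REGISTERS):
    OPCODES[0x50 + index] = f"push {reg}"
    OPCODES[0x58 + index] = f"pop {reg}"


def disassemble(machine_code: bytes) -> list[str]:
    """Translate each byte to a mnemonic; unknown bytes are shown as raw data."""
    listing = []
    for offset, byte in enumerate(machine_code):
        mnemonic = OPCODES.get(byte, f"db 0x{byte:02x}")
        listing.append(f"{offset:04x}: {mnemonic}")
    return listing


# Hypothetical four-byte sample: push ebp / nop / pop ebp / ret.
sample = bytes([0x55, 0x90, 0x5D, 0xC3])
for line in disassemble(sample):
    print(line)
```

Because the translation is a fixed lookup, running it twice on the same bytes always yields the same listing. A decompiler has no such table to fall back on, since many different source constructs can compile to the same instructions.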
Note: Hex-Rays IDA is also a disassembler/debugger.
Debugging is the process of manipulating a program's running state in order to analyze it for general bugs, vulnerabilities, and other issues. You manipulate its running state by stepping through, halting, or otherwise modifying portions of the program's underlying code, directly affecting the program as it executes. Debuggers are common in integrated development environments (IDEs) for developers to debug code as they write or test it, but they can also be used on compiled software as a form of interactive reverse engineering. These debuggers can include a decompiler for modification of source code, but more commonly they include a disassembler for modification of assembly instructions during execution.
Debugging can aid a pen test because a debugger with a built-in disassembler not only translates machine code for static analysis, but also enables you to change that code while the program runs and observe the effect through dynamic analysis. This can make it much easier to understand how an app functions and how it might be vulnerable.
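The idea of stepping through a program and inspecting its state can be sketched with Python's `sys.settrace` hook, which is the mechanism Python debuggers are built on. This example only observes state rather than modifying it, and the `apply_discount` target function is hypothetical.

```python
import sys


def apply_discount(price, percent):
    # Hypothetical target function to be stepped through.
    discount = price * percent // 100
    final = price - discount
    return final


snapshots = []


def tracer(frame, event, arg):
    # Only follow the target function; ignore every other frame.
    if frame.f_code is not apply_discount.__code__:
        return None
    if event == "line":
        # Record the line about to run and a copy of the locals at that point.
        snapshots.append((frame.f_lineno, dict(frame.f_locals)))
    elif event == "return":
        # For a return event, arg holds the value being returned.
        snapshots.append(("return", arg))
    return tracer


sys.settrace(tracer)
try:
    result = apply_discount(200, 15)
finally:
    # Always remove the hook so later code runs untraced.
    sys.settrace(None)

for entry in snapshots:
    print(entry)
```

Each recorded snapshot corresponds to one step of execution, which is the same information an interactive debugger shows when you halt on a line and inspect the variables in scope.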
The following table summarizes some popular disassembler/debugger tools.