Decompilers
A decompiler tries to translate an object file into a compilable
source file. There are many decompilers for C# or Java,
but only a few for C/C++. See in particular:
-
Ghidra:
An open-source decompiler developed by the U.S. National Security Agency.
It is an advanced interactive environment (seemingly inspired by IDA, see below)
for binary analysis and decompilation. It's written in Java and has a user
interface resembling the Eclipse IDE (in fact there's also a plug-in for Eclipse).
I've analyzed its implementation (the decompiler is in C++) and it has
many of the features I wanted to implement in my own decompiler (see REC, below).
Users can write their own plug-ins for target-specific analysis in either
Java or Python!
It runs on Windows, Linux and macOS, and supports many processors. New processors
can be added by writing text files that specify the processor's architecture
and its instruction set.
Overall, an excellent piece of work that sets a new standard for decompilers.
-
reko:
Another open-source decompiler. Written in C#, it only
runs on Windows, or on platforms supporting Mono.
It accepts binaries compiled for many processors. It has a GUI
with all the standard views (disassembly, hexdump, C source, project),
and can also be used from the command line.
-
RetDec:
Originally developed as an on-line service by the Brno University of Technology,
Czech Republic, and AVG Technologies (now part of Avast), it can be downloaded
from a GitHub repository and run locally.
I have not evaluated it, but I read the paper published by the
Brno University team, and it seemed to be at the level of the other advanced
decompilers available at the time.
-
C4Decompiler:
(The original link seems to be dead. I'm leaving the description here
in case it becomes available again. I think I have an old version
downloaded on my hard disk.)
A new decompiler under development. It is Windows only and has a slick
user interface inspired by Visual Studio 2010, with many useful
interactions that unfortunately are not always obvious: one
has to right-click to discover them.
The analysis seems very good, at least for the debug-compiled
example included in the installation. Trying it on random executables
from the Windows folder had mixed results, from completion of the
analysis to crashes to endless loops.
Still, it's very promising, as its authors have
clearly put a lot of thought and effort into its development.
-
Boomerang:
An open-source C decompiler. It has a very advanced set of analyses
that attempt to solve the most difficult problems
facing decompilers. The generated code quality varies greatly:
some functions are almost perfect in their representation of code
structure, local variables and types. Other functions look highly
obfuscated by the number of variables and their uses.
It's also rather fragile, as it often crashes with big programs.
-
REC:
My own C decompiler for Linux, DOS and Windows.
It was the first decompiler to work on multiple platforms and to support
multiple processors (16- and 32-bit x86, MIPS, 680x0, PowerPC).
It's very stable, as it's been tested with hundreds of programs.
The quality of the output is not as good as Boomerang's,
since its implementation is based on a 20-year-old coding style
(read: very difficult to extend). I've now published a new version,
RecStudio 4, which supports 64-bit executables. It has not been
tested on as many executables, so problems still remain. Also, the
different analyses it performs (SSA) generate totally different
code that at times may seem of much worse quality (although it's
probably more correct) than the code generated by the previous
version.
-
Hex Rays:
A decompiler plug-in for IDA Pro. The combination
with IDA's advanced disassembly capabilities and run-time debugger
makes it the ideal choice. However, it's still very new, and requires IDA Pro.
Unlike the other decompilers, it's not free.
It also has yet to stand the test of time in terms of stability. Very promising.
-
Dcc:
A DOS-to-C decompiler, and one of the first decompilers. It shows its age,
but it's still referenced by many other decompilers for its
structuring abilities. It only supports 8086 (16-bit) programs.
-
More on other decompilers at the
Program Transformation Wiki on Decompilation.
Here's a comparison of the various decompilers:
C4Decompiler | Windows | IA64 | PE-COFF | Interactive GUI | No | Very good | Good | Fair |
Boomerang | Windows/Linux | IA32 MIPS PPC | ELF PE-COFF Mac-OS | Batch with GUI front-end | No | Very good | Good | Very good |
REC | Windows/Linux | IA32 IA64 MIPS PPC mc68k | ELF PE-COFF AOUT RAW PS-X | Batch / Interactive | No | Good | Fair | Partial |
dcc | Windows | 8086 | DOS .com | Batch | No | Good | Fair | Poor |
Hex Rays | Windows | ? | ? | Interactive | ? | ? | ? | ? |
Testing Decompilers
The quality of a decompiler is based on how good the code it generates is,
and how well it performs in the presence of "unexpected" input.
Particularly difficult problems are posed by the use of compiler optimizations
which make the input code highly unstructured and difficult to understand, even
for a human. Handling the following cases defines the quality of a decompiler:
No information on symbol names in the binary file (stripped executable)
Statically vs. dynamically linked executable files (using pattern matching vs.
dynamic linker information to identify calls to library functions)
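
To make the pattern-matching case more concrete: in a statically linked, stripped executable the library code carries no names, so it can only be recognized by its bytes. The following is a minimal sketch in Python of that idea, not the method used by any of the decompilers above. The signature table, the function names (`my_strlen`, `my_abs`) and the byte patterns are all invented for illustration. A `None` entry is a wildcard, standing for bytes (such as relocated addresses) that differ from one executable to another.

```python
from typing import Optional, Sequence

# A signature is a sequence of byte values; None marks a wildcard position.
Signature = Sequence[Optional[int]]

# Hypothetical signature table: names and patterns are made up for this
# example and do not come from any real library.
SIGNATURES = {
    "my_strlen": [0x55, 0x89, 0xE5, 0x8B, 0x45, 0x08, None, None, 0xC3],
    "my_abs": [0x55, 0x89, 0xE5, 0x8B, 0x45, 0x08, 0x99, 0x31, 0xD0],
}


def matches(code: bytes, offset: int, sig: Signature) -> bool:
    """Return True if the bytes at `offset` agree with `sig`, ignoring wildcards."""
    if offset < 0 or offset + len(sig) > len(code):
        return False  # the signature would run past the end of the code
    return all(b is None or code[offset + i] == b for i, b in enumerate(sig))


def identify(code: bytes, offset: int) -> Optional[str]:
    """Return the name of the first signature matching at `offset`, or None."""
    for name, sig in SIGNATURES.items():
        if matches(code, offset, sig):
            return name
    return None


if __name__ == "__main__":
    # One padding byte followed by a body that fits the "my_strlen" pattern.
    code = bytes([0x90, 0x55, 0x89, 0xE5, 0x8B, 0x45, 0x08, 0x12, 0x34, 0xC3])
    print(identify(code, 1))
    print(identify(code, 0))
```

For a dynamically linked executable none of this is needed, since the dynamic linker information already names the imported functions.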