RetDec: An Open-Source Machine-Code Decompiler

Size: px
Start display at page:

Download "RetDec: An Open-Source Machine-Code Decompiler"

Transcription

1 RetDec: An Open-Source Machine-Code Decompiler Jakub Křoustek Peter Matula Petr Zemek Threat Labs Botconf / 51

2 > whoarewe Jakub Křoustek founder of RetDec Threat Labs reverse engineer, malware hunter, security Botconf / 51

3 > whoarewe Jakub Křoustek founder of RetDec Threat Labs reverse engineer, malware hunter, security Peter Matula main developer of the RetDec decompiler senior rock climbing and Botconf / 51

4 Quiz Time Botconf / 51

5 Quiz Time Botconf / 51

6 Quiz Time Botconf / 51

7 Quiz Time Botconf / 51

8 Disassembling vs. Decompilation Botconf / 51

9 Decompilation? What is it? Botconf / 51

10 Decompilation? What good is it? Binary analysis reverse engineering malware analysis vulnerability detection verification binary comparison... Botconf / 51

11 Decompilation? What good is it? Binary analysis reverse engineering malware analysis vulnerability detection verification binary comparison... Binary recompilation (yeah, like that s ever gonna work) porting bug fixing adding new features original sources got lost optimizations Botconf / 51

12 Ok, why aren t we already using it? Multiple existing tools: Hex-Rays, Hopper, Snowman, etc. Botconf / 51

13 Ok, why aren t we already using it? Multiple existing tools: Hex-Rays, Hopper, Snowman, etc. It is damn hard Botconf / 51

14 Ok, why aren t we already using it? Multiple existing tools: Hex-Rays, Hopper, Snowman, etc. It is damn hard compilation is not lossless high-level constructions data types names comments, macros,... Botconf / 51

15 Ok, why aren t we already using it? Multiple existing tools: Hex-Rays, Hopper, Snowman, etc. It is damn hard compilation is not lossless high-level constructions data types names comments, macros,... compilers are optimizing Botconf / 51

16 Ok, why aren t we already using it? Multiple existing tools: Hex-Rays, Hopper, Snowman, etc. It is damn hard compilation is not lossless high-level constructions data types names comments, macros,... compilers are optimizing computer science goodies undecidable problems complex algorithms exponential complexities Botconf / 51

17 Ok, why aren t we already using it? Multiple existing tools: Hex-Rays, Hopper, Snowman, etc. It is damn hard compilation is not lossless high-level constructions data types names comments, macros,... compilers are optimizing computer science goodies undecidable problems complex algorithms exponential complexities obfuscation, packing, anti-debugging Botconf / 51

18 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Botconf / 51

19 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Many ABIs Botconf / 51

20 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Many ABIs Many OFFs (object-file formats) ELF, PE, apple Mach-O,... Botconf / 51

21 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Many ABIs Many OFFs (object-file formats) ELF, PE, apple Mach-O,... Many programming languages Botconf / 51

22 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Many ABIs Many OFFs (object-file formats) ELF, PE, apple Mach-O,... Many programming languages Many compilers & optimizations Botconf / 51

23 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Many ABIs Many OFFs (object-file formats) ELF, PE, apple Mach-O,... Many programming languages Many compilers & optimizations Statically linked code Botconf / 51

24 Generic decompilation? Even harder Many architectures x86, ARM, MIPS, PowerPC,... CISC vs. RISC bit length, endianness, floating points versions & extensions Many ABIs Many OFFs (object-file formats) ELF, PE, apple Mach-O,... Many programming languages Many compilers & optimizations Statically linked code... Botconf / 51

25 Retargetable Decompiler (RetDec) Goal generic decompilation of binary code Botconf / 51

26 Retargetable Decompiler (RetDec) Goal generic decompilation of binary code History (AVG + BUT FIT via TAČR TA grant) (AVG + BUT FIT students via diploma theses) 2016 * (Avast + BUT FIT students) Botconf / 51

27 Retargetable Decompiler (RetDec) Goal generic decompilation of binary code History (AVG + BUT FIT via TAČR TA grant) (AVG + BUT FIT students via diploma theses) 2016 * (Avast + BUT FIT students) People 3-4 core developers 20 BSc/MSc/PhD students Botconf / 51

28 Retargetable Decompiler (RetDec) Goal generic decompilation of binary code History (AVG + BUT FIT via TAČR TA grant) (AVG + BUT FIT students via diploma theses) 2016 * (Avast + BUT FIT students) People 3-4 core developers 20 BSc/MSc/PhD students Lines of code 419,451 code 205,222 comments, etc ,673 total Botconf / 51

29 RetDec? What does it do Supports architectures (32-bit): x86, ARM, PowerPC, MIPS OFFs: ELF, PE, COFF, Mach-O, Intel HEX, AR, raw compilers (we test with): gcc, Clang, MSVC Botconf / 51

30 RetDec? What does it do Supports architectures (32-bit): x86, ARM, PowerPC, MIPS OFFs: ELF, PE, COFF, Mach-O, Intel HEX, AR, raw compilers (we test with): gcc, Clang, MSVC Does compiler/packer detection statically linked code detection OS loader simulation recursive traversal disassembling high-level constructions/types reconstruction pattern detection... Botconf / 51

31 RetDec? What does it do Supports architectures (32-bit): x86, ARM, PowerPC, MIPS OFFs: ELF, PE, COFF, Mach-O, Intel HEX, AR, raw compilers (we test with): gcc, Clang, MSVC Does compiler/packer detection statically linked code detection OS loader simulation recursive traversal disassembling high-level constructions/types reconstruction pattern detection... Runs on (hopefully) Botconf / 51

32 Good news everyone! RetDec goes open-source under the MIT license december 2017, shortly after the conference Botconf / 51

33 Good news everyone! RetDec goes open-source under the MIT license december 2017, shortly after the conference Repositories 11 core 6 support 8 third party Contacts info@retdec.com Botconf / 51

34 How to get from EXE to C... Botconf / 51

35 ... by using cool technologies Botconf / 51

36 We need to go deeper! Botconf / 51

37 Preprocessing Botconf / 51

38 Preprocessing Botconf / 51

39 Preprocessing Botconf / 51

40 Preprocessing Botconf / 51

41 Preprocessing Botconf / 51

42 Preprocessing Botconf / 51

43 Preprocessing Botconf / 51

44 Preprocessing Botconf / 51

45 Preprocessing Botconf / 51

46 Preprocessing Botconf / 51

47 Preprocessing Botconf / 51

48 Preprocessing repos Fileformat fileformat, loader, cpdetect, fileinfo, unpacker ar-extractor, macho-extractor,... Botconf / 51

49 Preprocessing repos Fileformat fileformat, loader, cpdetect, fileinfo, unpacker ar-extractor, macho-extractor,... PeLib strengthened new modules (rich header, delayed imports, security dir,... ) Botconf / 51

50 Preprocessing repos Fileformat fileformat, loader, cpdetect, fileinfo, unpacker ar-extractor, macho-extractor,... PeLib strengthened new modules (rich header, delayed imports, security dir,... ) ELFIO strengthened Botconf / 51

51 Preprocessing repos Fileformat fileformat, loader, cpdetect, fileinfo, unpacker ar-extractor, macho-extractor,... PeLib strengthened new modules (rich header, delayed imports, security dir,... ) ELFIO strengthened PDBparser will hopefully be replaced by LLVM parsers Botconf / 51

52 Preprocessing repos Fileformat fileformat, loader, cpdetect, fileinfo, unpacker ar-extractor, macho-extractor,... PeLib strengthened new modules (rich header, delayed imports, security dir,... ) ELFIO strengthened PDBparser will hopefully be replaced by LLVM parsers Yaracpp YARA C++ wrapper Botconf / 51

53 Core Botconf / 51

54 Core Botconf / 51

55 Core Botconf / 51

56 Core Botconf / 51

57 Core: LLVM dozens of analysis & transform & utility passes dead global elimination, constant propagation, inlining, reassociation, loop optimization, memory promotion, dead store elimination,... Botconf / 51

58 Core: LLVM dozens of analysis & transform & utility passes dead global elimination, constant propagation, inlining, reassociation, loop optimization, memory promotion, dead store elimination,... clang -o hello hello.c -O3 217 passes -targetlibinfo -tti -tbaa -scoped-noalias -assumption-cache-tracker -profile-summary-info -forceattrs -inferattrs -ipsccp -globalopt -domtree -mem2reg -deadargelim -domtree -basicaa -aa -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -inline -functionattrs -argpromotion -domtree -sroa -basicaa -aa -memoryssa -early-cse-memssa -speculative-execution -domtree -basicaa -aa -lazy-value-info -jump-threading... Botconf / 51

59 Core: LLVM IR LLVM IR = LLVM Intermediate Representation kind of assembly language / three address = global i32 define %arg) { %x = load i32, %y = add i32 %x, %arg store i32 return i32 %y } Botconf / 51

60 Core: LLVM IR LLVM IR = LLVM Intermediate Representation kind of assembly language / three address = global i32 define %arg) { %x = load i32, %y = add i32 %x, %arg store i32 return i32 %y } SSA = Static Single Assignment %y = add i32 %x, %arg Load/Store architecture %x = load i32, Functions, arguments, returns, data types (Un)conditional branches, switches Botconf / 51

61 Core: LLVM IR LLVM IR = LLVM Intermediate Representation kind of assembly language / three address = global i32 define %arg) { %x = load i32, %y = add i32 %x, %arg store i32 return i32 %y } SSA = Static Single Assignment %y = add i32 %x, %arg Load/Store architecture %x = load i32, Functions, arguments, returns, data types (Un)conditional branches, switches Universal IR for efficient compiler transformations and analyses Botconf / 51

62 Core: decoder Botconf / 51

63 Core: decoder Botconf / 51

64 Core: decoder Botconf / 51

65 Core: decoder Botconf / 51

66 Core: decoder Botconf / 51

67 Core: decoder Botconf / 51

68 Core: decoder Botconf / 51

69 Core: decoder Botconf / 51

70 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Botconf / 51

71 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Architectures (core instruction sets): ARM + Thumb extension 32-bit MIPS 32/64-bit PowerPC 32/64-bit x86 32/64-bit Capstone: 64-bit ARM, SPARC, SYSZ, XCore, m68k, m680x, TMS320C64x Botconf / 51

72 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Architectures (core instruction sets): ARM + Thumb extension 32-bit MIPS 32/64-bit PowerPC 32/64-bit x86 32/64-bit Capstone: 64-bit ARM, SPARC, SYSZ, XCore, m68k, m680x, TMS320C64x Decompilation & advanced insns Botconf / 51

73 Would you rather... PMULHUW Multiply Packed Unsigned Integers and Store High Result if (OperandSize == 64) { //PMULHUW instruction with 64-bit operands: Tmp0[0..31] = Dst[0..15] * Src[0..15]; Tmp1[0..31] = Dst[16..31] * Src[16..31]; Tmp2[0..31] = Dst[32..47] * Src[32..47]; Tmp3[0..31] = Dst[48..63] * Src[48..63]; Dst[0..15] = Tmp0[16..31]; Dst[16..31] = Tmp1[16..31]; Dst[32..47] = Tmp2[16..31]; Dst[48..63] = Tmp3[16..31]; } else { //PMULHUW instruction with 128-bit operands: // Even longer... } asm_pmulhuw(mm1, mm2); Botconf / 51

74 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Architectures (core instruction sets): ARM + Thumb extension 32-bit MIPS 32/64-bit PowerPC 32/64-bit x86 32/64-bit Capstone: 64-bit ARM, SPARC, SYSZ, XCore, m68k, m680x, TMS320C64x Decompilation & advanced insns Botconf / 51

75 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Architectures (core instruction sets): ARM + Thumb extension 32-bit MIPS 32/64-bit PowerPC 32/64-bit x86 32/64-bit Capstone: 64-bit ARM, SPARC, SYSZ, XCore, m68k, m680x, TMS320C64x Decompilation & advanced insns full semantics only for simple instructions Botconf / 51

76 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Architectures (core instruction sets): ARM + Thumb extension 32-bit MIPS 32/64-bit PowerPC 32/64-bit x86 32/64-bit Capstone: 64-bit ARM, SPARC, SYSZ, XCore, m68k, m680x, TMS320C64x Decompilation & advanced insns full semantics only for simple instructions Implementation details, testing framework (Keystone Engine + LLVM emulator), keeping LLVM IR ASM mapping,... Botconf / 51

77 Core: Capstone2LlvmIR Capstone insn sequence of LLVM IR Handcoded sequences 32/64-bit x86 1 person 2-3 weeks Architectures (core instruction sets): ARM + Thumb extension 32-bit MIPS 32/64-bit PowerPC 32/64-bit x86 32/64-bit Capstone: 64-bit ARM, SPARC, SYSZ, XCore, m68k, m680x, TMS320C64x Decompilation & advanced insns full semantics only for simple instructions Implementation details, testing framework (Keystone Engine + LLVM emulator), keeping LLVM IR ASM mapping,... Botconf / 51

78 Core: low-level passes Botconf / 51

79 Core: low-level passes Botconf / 51

80 Core: assembly generation Botconf / 51

81 Core: high-level passes Botconf / 51

82 Core repos RetDec bin2llvmir library bin2llvmirtool Botconf / 51

83 Core repos RetDec bin2llvmir library bin2llvmirtool Capstone2LlvmIR Capstone instruction to LLVM IR translation Botconf / 51

84 Core repos RetDec bin2llvmir library bin2llvmirtool Capstone2LlvmIR Capstone instruction to LLVM IR translation Capstone-dumper what does Capstone know about any instruction Botconf / 51

85 Core repos RetDec bin2llvmir library bin2llvmirtool Capstone2LlvmIR Capstone instruction to LLVM IR translation Capstone-dumper what does Capstone know about any instruction Fnc-patterns statically linked function pattern creation and detection Botconf / 51

86 Core repos RetDec bin2llvmir library bin2llvmirtool Capstone2LlvmIR Capstone instruction to LLVM IR translation Capstone-dumper what does Capstone know about any instruction Fnc-patterns statically linked function pattern creation and detection Yaramod YARA to AST parsing & C++ API to build new YARA rulesets Botconf / 51

87 Core repos RetDec bin2llvmir library bin2llvmirtool Capstone2LlvmIR Capstone instruction to LLVM IR translation Capstone-dumper what does Capstone know about any instruction Fnc-patterns statically linked function pattern creation and detection Yaramod YARA to AST parsing & C++ API to build new YARA rulesets Ctypes extraction and presentation of C function data types Botconf / 51

88 Core repos RetDec bin2llvmir library bin2llvmirtool Capstone2LlvmIR Capstone instruction to LLVM IR translation Capstone-dumper what does Capstone know about any instruction Fnc-patterns statically linked function pattern creation and detection Yaramod YARA to AST parsing & C++ API to build new YARA rulesets Ctypes extraction and presentation of C function data types Demangler gcc/clang, Microsoft Visual C++, and Borland C++ Botconf / 51

89 Backend Botconf / 51

90 Backend Botconf / 51

91 Backend Botconf / 51

92 Backend: BIR is an AST BIR = Backend IR AST = Abstract syntax tree while (x < 20){ x = x + (y * 2); } Botconf / 51

93 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

94 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

95 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

96 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

97 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

98 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

99 Backend: code structuring LLVM IR: only (un)conditional branches & switches identify high-level control-flow patterns restructure BIR: if-else, for-loop, while-loop, switch, break, continue Botconf / 51

100 Backend: optimizations copy propagation reducing the number of variables Botconf / 51

101 Backend: optimizations copy propagation reducing the number of variables arithmetic expression simplification a a + 3 Botconf / 51

102 Backend: optimizations copy propagation reducing the number of variables arithmetic expression simplification a a + 3 negation optimization if (!(a == b)) if (a!= b) Botconf / 51

103 Backend: optimizations copy propagation reducing the number of variables arithmetic expression simplification a a + 3 negation optimization if (!(a == b)) if (a!= b) pointer arithmetic *(a + 4) a[4] Botconf / 51

104 Backend: optimizations copy propagation reducing the number of variables arithmetic expression simplification a a + 3 negation optimization if (!(a == b)) if (a!= b) pointer arithmetic *(a + 4) a[4] conversion of while (true){... if (cond) break;... } for (cond){... } while (cond){... } Botconf / 51

105 Backend: optimizations copy propagation reducing the number of variables arithmetic expression simplification a a + 3 negation optimization if (!(a == b)) if (a!= b) pointer arithmetic *(a + 4) a[4] conversion of while (true){... if (cond) break;... } for (cond){... } while (cond){... } conversion of if/else-if/else chains to switch Botconf / 51

106 Backend: optimizations copy propagation reducing the number of variables arithmetic expression simplification a a + 3 negation optimization if (!(a == b)) if (a!= b) pointer arithmetic *(a + 4) a[4] conversion of while (true){... if (cond) break;... } for (cond){... } while (cond){... } conversion of if/else-if/else chains to switch... Botconf / 51

107 Backend: code generation variable name assignment induction variables: for (i = 0; i < 10; ++i) function arguments: a1, a2, a3,... general context names: return result; stdlib context names: int len = strlen(); Botconf / 51

108 Backend: code generation variable name assignment induction variables: for (i = 0; i < 10; ++i) function arguments: a1, a2, a3,... general context names: return result; stdlib context names: int len = strlen(); stdlib context literals var_ffff7dc6 = socket(2, 3, 255) Botconf / 51

109 Backend: code generation variable name assignment induction variables: for (i = 0; i < 10; ++i) function arguments: a1, a2, a3,... general context names: return result; stdlib context names: int len = strlen(); stdlib context literals var_ffff7dc6 = socket(2, 3, 255) sock_id = socket(pf_inet, SOCK_RAW, IPPROTO_RAW) Botconf / 51

110 Backend: code generation variable name assignment induction variables: for (i = 0; i < 10; ++i) function arguments: a1, a2, a3,... general context names: return result; stdlib context names: int len = strlen(); stdlib context literals var_ffff7dc6 = socket(2, 3, 255) sock_id = socket(pf_inet, SOCK_RAW, IPPROTO_RAW) flock(sock_id, 7) flock(sock_id, LOCK_SH LOCK_EX LOCK_NB) Botconf / 51

111 Backend: code generation variable name assignment induction variables: for (i = 0; i < 10; ++i) function arguments: a1, a2, a3,... general context names: return result; stdlib context names: int len = strlen(); stdlib context literals var_ffff7dc6 = socket(2, 3, 255) sock_id = socket(pf_inet, SOCK_RAW, IPPROTO_RAW) flock(sock_id, 7) flock(sock_id, LOCK_SH LOCK_EX LOCK_NB) output generation C CFG = Control-Flow Graph Call Graph Botconf / 51

112 Backend repos RetDec llvmir2hll library llvmir2hlltool Botconf / 51

113 How to use RetDec Online decompilation service Botconf / 51

114 How to use RetDec Online decompilation service REST API Botconf / 51

115 How to use RetDec Online decompilation service REST API Build it yourself CMake, gcc/clang, Visual Studio 2015 Update 2 Perl, GNU Bison, Flex, GNU Tar, scp, GNU bash, UPX, dot Recursively clone the main RetDec repository mkdir build && cd build cmake.. make && make install Botconf / 51

116 How to use RetDec Online decompilation service REST API Build it yourself CMake, gcc/clang, Visual Studio 2015 Update 2 Perl, GNU Bison, Flex, GNU Tar, scp, GNU bash, UPX, dot Recursively clone the main RetDec repository mkdir build && cd build cmake.. make && make install Run it yourself decompile.sh binary.exe Botconf / 51

117 How to use RetDec Online decompilation service REST API Build it yourself CMake, gcc/clang, Visual Studio 2015 Update 2 Perl, GNU Bison, Flex, GNU Tar, scp, GNU bash, UPX, dot Recursively clone the main RetDec repository mkdir build && cd build cmake.. make && make install Run it yourself decompile.sh binary.exe Get RetDec IDA plugin Botconf / 51

118 What is RetDec IDA plugin Botconf / 51

119 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

120 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

121 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

122 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

123 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

124 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

125 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

126 How does RetDec IDA plugin work Goals look & feel native same object names as IDA interactive Botconf / 51

127 RetDec IDA plugin is interactive Botconf / 51

128 How was RetDec used so far retdec.com launched on Botconf / 51

129 How was RetDec used so far retdec.com launched on ,000 registered users Botconf / 51

130 How was RetDec used so far retdec.com launched on ,000 registered users 423,000 decompilations 350,000 Web 73,000 API 410 decompilations daily Botconf / 51

131 Real example #1: Vawtrak (x86) Botconf / 51

132 Real example #2: Vawtrak (x86) Botconf / 51

133 Real example #3: CryproWall (x86) Botconf / 51

134 Real example #4: Psyb0t (mips) RetDec Botconf / 51

135 Real example #5: Psyb0t (mips) RetDec main ip2c system flock function_ function_404b1c fileno backup Daemonize RSeed getip fetch sleep strcmp xdec fopen fread fclose fork exit gettimeofday srand snprintf strncmp parse strncpy strlen Botconf / 51

136 Should you throw away your Hex-Rays? Botconf / 51

137 Should you throw away your Hex-Rays? IDA and Hex-Rays are great output quality interactive seamlessly integrated mature many plugins official support NO! Botconf / 51

138 Should you throw away your Hex-Rays? IDA and Hex-Rays are great output quality interactive seamlessly integrated mature many plugins official support IDA and Hex-Rays have flaws not free proprietary big monolithic GUI app NO! Botconf / 51

139 RetDec is handy because... Obvious reasons it is free + MIPS architecture MIT license you can play with the sources Botconf / 51

140 RetDec is handy because... Obvious reasons it is free + MIPS architecture MIT license you can play with the sources Not so obvious reasons LLVM is awesome Botconf / 51

141 RetDec is handy because... Obvious reasons it is free + MIPS architecture MIT license you can play with the sources Not so obvious reasons LLVM is awesome different basic designs: interactive GUI vs. pipeline Botconf / 51

142 RetDec is handy because... Obvious reasons it is free + MIPS architecture MIT license you can play with the sources Not so obvious reasons LLVM is awesome different basic designs: interactive GUI vs. pipeline LLVM is OP (don t worry, it won t be nerfed) Botconf / 51

143 RetDec is not only decompiler RetDec the decompiler RetDec IDA plugin Hex-Rays impersonation Botconf / 51

144 RetDec is not only decompiler RetDec the decompiler RetDec IDA plugin Hex-Rays impersonation Fileformat generic OFF parsing and analysis Capstone2LlvmIR binary to LLVM translation Fnc-patterns statically linked code detection in YARA (IDA F.L.I.R.T.) Yaramod hack YARA rules in C++ Yaracpp YARA C++ wrapper Ctypes info on function types Botconf / 51

145 What s next Release the sources on Github shortly after the conference Botconf / 51

146 What s next Release the sources on Github shortly after the conference Throw a release party Botconf / 51

147 What s next Release the sources on Github shortly after the conference Throw a release party Solve some inevitable hey guys, I m unable to build your repo Botconf / 51

148 What s next Release the sources on Github shortly after the conference Throw a release party Solve some inevitable hey guys, I m unable to build your repo Write more technical documentation on how it all works Botconf / 51

149 What s next Release the sources on Github shortly after the conference Throw a release party Solve some inevitable hey guys, I m unable to build your repo Write more technical documentation on how it all works Present it somewhere (maybe LLVM dev meeting) Botconf / 51

150 What s next Release the sources on Github shortly after the conference Throw a release party Solve some inevitable hey guys, I m unable to build your repo Write more technical documentation on how it all works Present it somewhere (maybe LLVM dev meeting) Continue improving the implementation make it more portable: Bash Python, bit architectures supports replace some libraries & modules Botconf / 51

151 What s next Release the sources on Github shortly after the conference Throw a release party Solve some inevitable hey guys, I m unable to build your repo Write more technical documentation on how it all works Present it somewhere (maybe LLVM dev meeting) Continue improving the implementation make it more portable: Bash Python, bit architectures supports replace some libraries & modules We will see... Botconf / 51

152 That s all folks Thanks! Contacts info@retdec.com Botconf / 51

RetDec: An Open-Source Machine-Code Decompiler. Jakub Křoustek Peter Matula

RetDec: An Open-Source Machine-Code Decompiler. Jakub Křoustek Peter Matula RetDec: An Open-Source Machine-Code Decompiler Jakub Křoustek Peter Matula Who Are We? 2 Jakub Křoustek Founder of RetDec Threat Labs lead @Avast (previously @AVG) Reverse engineer, malware hunter, security

More information

DragonEgg - the Fortran compiler

DragonEgg - the Fortran compiler DragonEgg - the Fortran compiler Contents DragonEgg - the Fortran compiler o SYNOPSIS o DESCRIPTION o BUILD DRAGONEGG PLUGIN o USAGE OF DRAGONEGG PLUGIN TO COMPILE FORTRAN PROGRAM o LLVM Optimizations

More information

Clang - the C, C++ Compiler

Clang - the C, C++ Compiler Clang - the C, C++ Compiler Contents Clang - the C, C++ Compiler o SYNOPSIS o DESCRIPTION o OPTIONS SYNOPSIS clang [options] filename... DESCRIPTION clang is a C, C++, and Objective-C compiler which encompasses

More information

P4LLVM: An LLVM based P4 Compiler

P4LLVM: An LLVM based P4 Compiler P4LLVM: An LLVM based P4 Compiler Tharun Kumar Dangeti, Venkata Keerthy Soundararajan, Ramakrishna Upadrasta Indian Institute of Technology Hyderabad First P4 European Workshop - P4EU September 24 th,

More information

A New Approach to Instruction-Idioms Detection in a Retargetable Decompiler

A New Approach to Instruction-Idioms Detection in a Retargetable Decompiler Computer Science and Information Systems 11(4):1337 1359 DOI: 10.2298/CSIS131203076K A New Approach to Instruction-Idioms Detection in a Retargetable Decompiler Jakub Křoustek, Fridolín Pokorný, and Dušan

More information

Flang - the Fortran Compiler SYNOPSIS DESCRIPTION. Contents. Flang the Fortran Compiler. flang [options] filename...

Flang - the Fortran Compiler SYNOPSIS DESCRIPTION. Contents. Flang the Fortran Compiler. flang [options] filename... Flang - the Fortran Compiler Contents Flang the Fortran Compiler o SYNOPSIS o DESCRIPTION o OPTIONS SYNOPSIS flang [options] filename... DESCRIPTION Flang is the Fortran front-end designed for integration

More information

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g., C++) to low-level assembly language that can be executed by hardware int a,

More information

OptiCode: Machine Code Deobfuscation for Malware Analysis

OptiCode: Machine Code Deobfuscation for Malware Analysis OptiCode: Machine Code Deobfuscation for Malware Analysis NGUYEN Anh Quynh, COSEINC CONFidence, Krakow - Poland 2013, May 28th 1 / 47 Agenda 1 Obfuscation problem in malware analysis

More information

Detection of Non-Size Increasing Programs in Compilers

Detection of Non-Size Increasing Programs in Compilers Detection of Non-Size Increasing Programs in Compilers Implementation of Implicit Complexity Analysis Jean-Yves Moyen 1 Thomas Rubiano 2 1 Department of Computer Science University of Copenhagen (DIKU)

More information

Lecture Compiler Middle-End

Lecture Compiler Middle-End Lecture 16-18 18 Compiler Middle-End Jianwen Zhu Electrical and Computer Engineering University of Toronto Jianwen Zhu 2009 - P. 1 What We Have Done A lot! Compiler Frontend Defining language Generating

More information

Tour of common optimizations

Tour of common optimizations Tour of common optimizations Simple example foo(z) { x := 3 + 6; y := x 5 return z * y } Simple example foo(z) { x := 3 + 6; y := x 5; return z * y } x:=9; Applying Constant Folding Simple example foo(z)

More information

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573 What is a compiler? Xiaokang Qiu Purdue University ECE 573 August 21, 2017 What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g.,

More information

ECE 498 Linux Assembly Language Lecture 1

ECE 498 Linux Assembly Language Lecture 1 ECE 498 Linux Assembly Language Lecture 1 Vince Weaver http://www.eece.maine.edu/ vweaver vincent.weaver@maine.edu 13 November 2012 Assembly Language: What s it good for? Understanding at a low-level what

More information

Cross Compiling. Real Time Operating Systems and Middleware. Luca Abeni

Cross Compiling. Real Time Operating Systems and Middleware. Luca Abeni Cross Compiling Real Time Operating Systems and Middleware Luca Abeni luca.abeni@unitn.it The Kernel Kernel OS component interacting with hardware Runs in privileged mode (Kernel Space KS) User Level Kernel

More information

LIEF: Library to Instrument Executable Formats

LIEF: Library to Instrument Executable Formats RMLL 2017 Romain Thomas - rthomas@quarkslab.com LIEF: Library to Instrument Executable Formats Table of Contents Introduction Project Overview Demo Conclusion About Romain Thomas (rthomas@quarkslab.com)

More information

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA CISC 662 Graduate Computer Architecture Lecture 4 - ISA Michela Taufer http://www.cis.udel.edu/~taufer/courses Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer Architecture,

More information

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall INTRODUCTION TO LLVM Bo Wang SA 2016 Fall LLVM Basic LLVM IR LLVM Pass OUTLINE What is LLVM? LLVM is a compiler infrastructure designed as a set of reusable libraries with well-defined interfaces. Implemented

More information

CMSC 611: Advanced Computer Architecture

CMSC 611: Advanced Computer Architecture CMSC 611: Advanced Computer Architecture Compilers Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science

More information

Reminder: tutorials start next week!

Reminder: tutorials start next week! Previous lecture recap! Metrics of computer architecture! Fundamental ways of improving performance: parallelism, locality, focus on the common case! Amdahl s Law: speedup proportional only to the affected

More information

Writing LLVM Backends Construindo Backends para o LLVM. Bruno Cardoso Lopes

Writing LLVM Backends Construindo Backends para o LLVM. Bruno Cardoso Lopes Writing LLVM Backends Construindo Backends para o LLVM Bruno Cardoso Lopes Agenda What s LLVM? Why LLVM? Who is using? Infrastructure. Internals. The Backend. Writing LLVM backends. Work. What's LLVM?

More information

Lecture 4: Instruction Set Architecture

Lecture 4: Instruction Set Architecture Lecture 4: Instruction Set Architecture ISA types, register usage, memory addressing, endian and alignment, quantitative evaluation Reading: Textbook (5 th edition) Appendix A Appendix B (4 th edition)

More information

CS 426 Fall Machine Problem 4. Machine Problem 4. CS 426 Compiler Construction Fall Semester 2017

CS 426 Fall Machine Problem 4. Machine Problem 4. CS 426 Compiler Construction Fall Semester 2017 CS 426 Fall 2017 1 Machine Problem 4 Machine Problem 4 CS 426 Compiler Construction Fall Semester 2017 Handed out: November 16, 2017. Due: December 7, 2017, 5:00 p.m. This assignment deals with implementing

More information

CENG3420 Lecture 03 Review

CENG3420 Lecture 03 Review CENG3420 Lecture 03 Review Bei Yu byu@cse.cuhk.edu.hk 2017 Spring 1 / 38 CISC vs. RISC Complex Instruction Set Computer (CISC) Lots of instructions of variable size, very memory optimal, typically less

More information

PLT Course Project Demo Spring Chih- Fan Chen Theofilos Petsios Marios Pomonis Adrian Tang

PLT Course Project Demo Spring Chih- Fan Chen Theofilos Petsios Marios Pomonis Adrian Tang PLT Course Project Demo Spring 2013 Chih- Fan Chen Theofilos Petsios Marios Pomonis Adrian Tang 1. Introduction To Code Obfuscation 2. LLVM 3. Obfuscation Techniques 1. String Transformation 2. Junk Code

More information

ECE 5775 (Fall 17) High-Level Digital Design Automation. Static Single Assignment

ECE 5775 (Fall 17) High-Level Digital Design Automation. Static Single Assignment ECE 5775 (Fall 17) High-Level Digital Design Automation Static Single Assignment Announcements HW 1 released (due Friday) Student-led discussions on Tuesday 9/26 Sign up on Piazza: 3 students / group Meet

More information

Instructions: Language of the Computer

Instructions: Language of the Computer CS359: Computer Architecture Instructions: Language of the Computer Yanyan Shen Department of Computer Science and Engineering 1 The Language a Computer Understands Word a computer understands: instruction

More information

LLVM and IR Construction

LLVM and IR Construction LLVM and IR Construction Fabian Ritter based on slides by Christoph Mallon and Johannes Doerfert http://compilers.cs.uni-saarland.de Compiler Design Lab Saarland University 1 Project Progress source code

More information

ISA: The Hardware Software Interface

ISA: The Hardware Software Interface ISA: The Hardware Software Interface Instruction Set Architecture (ISA) is where software meets hardware In embedded systems, this boundary is often flexible Understanding of ISA design is therefore important

More information

CSE 504: Compiler Design. Code Generation

CSE 504: Compiler Design. Code Generation Code Generation Pradipta De pradipta.de@sunykorea.ac.kr Current Topic Introducing basic concepts in code generation phase Code Generation Detailed Steps The problem of generating an optimal target program

More information

Lecture 3 Overview of the LLVM Compiler

Lecture 3 Overview of the LLVM Compiler LLVM Compiler System Lecture 3 Overview of the LLVM Compiler The LLVM Compiler Infrastructure - Provides reusable components for building compilers - Reduce the time/cost to build a new compiler - Build

More information

A Plan 9 C Compiler for RISC-V

A Plan 9 C Compiler for RISC-V A Plan 9 C Compiler for RISC-V Richard Miller r.miller@acm.org Plan 9 C compiler - written by Ken Thompson for Plan 9 OS - used for Inferno OS kernel and limbo VM - used to bootstrap first releases of

More information

Math 230 Assembly Programming (AKA Computer Organization) Spring MIPS Intro

Math 230 Assembly Programming (AKA Computer Organization) Spring MIPS Intro Math 230 Assembly Programming (AKA Computer Organization) Spring 2008 MIPS Intro Adapted from slides developed for: Mary J. Irwin PSU CSE331 Dave Patterson s UCB CS152 M230 L09.1 Smith Spring 2008 MIPS

More information

ECE 471 Embedded Systems Lecture 4

ECE 471 Embedded Systems Lecture 4 ECE 471 Embedded Systems Lecture 4 Vince Weaver http://www.eece.maine.edu/ vweaver vincent.weaver@maine.edu 12 September 2013 Announcements HW#1 will be posted later today For next class, at least skim

More information

ECE 486/586. Computer Architecture. Lecture # 7

ECE 486/586. Computer Architecture. Lecture # 7 ECE 486/586 Computer Architecture Lecture # 7 Spring 2015 Portland State University Lecture Topics Instruction Set Principles Instruction Encoding Role of Compilers The MIPS Architecture Reference: Appendix

More information

Lecture 4: RISC Computers

Lecture 4: RISC Computers Lecture 4: RISC Computers Introduction Program execution features RISC characteristics RISC vs. CICS Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) represents an important

More information

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization CISC 662 Graduate Computer Architecture Lecture 4 - ISA MIPS ISA Michela Taufer http://www.cis.udel.edu/~taufer/courses Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer Architecture,

More information

Lecture 4: Instruction Set Design/Pipelining

Lecture 4: Instruction Set Design/Pipelining Lecture 4: Instruction Set Design/Pipelining Instruction set design (Sections 2.9-2.12) control instructions instruction encoding Basic pipelining implementation (Section A.1) 1 Control Transfer Instructions

More information

LLVM & LLVM Bitcode Introduction

LLVM & LLVM Bitcode Introduction LLVM & LLVM Bitcode Introduction What is LLVM? (1/2) LLVM (Low Level Virtual Machine) is a compiler infrastructure Written by C++ & STL History The LLVM project started in 2000 at the University of Illinois

More information

15-411: LLVM. Jan Hoffmann. Substantial portions courtesy of Deby Katz

15-411: LLVM. Jan Hoffmann. Substantial portions courtesy of Deby Katz 15-411: LLVM Jan Hoffmann Substantial portions courtesy of Deby Katz and Gennady Pekhimenko, Olatunji Ruwase,Chris Lattner, Vikram Adve, and David Koes Carnegie What is LLVM? A collection of modular and

More information

Proceedings of the 2013 Federated Conference on Computer Science and Information Systems pp

Proceedings of the 2013 Federated Conference on Computer Science and Information Systems pp Proceedings of the 2013 Federated Conference on Computer Science and Information Systems pp. 1507 1514 Reconstruction of Instruction Idioms in a Retargetable Decompiler Jakub Křoustek, Fridolín Pokorný

More information

Practical Malware Analysis

Practical Malware Analysis Practical Malware Analysis Ch 4: A Crash Course in x86 Disassembly Revised 1-16-7 Basic Techniques Basic static analysis Looks at malware from the outside Basic dynamic analysis Only shows you how the

More information

Computer Organization MIPS ISA

Computer Organization MIPS ISA CPE 335 Computer Organization MIPS ISA Dr. Iyad Jafar Adapted from Dr. Gheith Abandah Slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE 232 MIPS ISA 1 (vonneumann) Processor Organization

More information

McSema: Static Translation of X86 Instructions to LLVM

McSema: Static Translation of X86 Instructions to LLVM McSema: Static Translation of X86 Instructions to LLVM ARTEM DINABURG, ARTEM@TRAILOFBITS.COM ANDREW RUEF, ANDREW@TRAILOFBITS.COM About Us Artem Security Researcher blog.dinaburg.org Andrew PhD Student,

More information

Chapter 5:: Target Machine Architecture (cont.)

Chapter 5:: Target Machine Architecture (cont.) Chapter 5:: Target Machine Architecture (cont.) Programming Language Pragmatics Michael L. Scott Review Describe the heap for dynamic memory allocation? What is scope and with most languages how what happens

More information

Abstract Interpretation

Abstract Interpretation Abstract Interpretation MATHE MATICAL PROGRAM CHE CKING Overview High level mathematical tools Originally conceived to help give a theoretical grounding to program analysis Useful for other kinds of analyses

More information

Welcome to CSE131b: Compiler Construction

Welcome to CSE131b: Compiler Construction Welcome to CSE131b: Compiler Construction Lingjia Tang pic from: http://xkcd.com/303/ Course Information What are compilers? Why do we learn about them? History of compilers Structure of compilers A bit

More information

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program.

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program. Language Translation Compilation vs. interpretation Compilation diagram Step 1: compile program compiler Compiled program Step 2: run input Compiled program output Language Translation compilation is translation

More information

Compiler construction. x86 architecture. This lecture. Lecture 6: Code generation for x86. x86: assembly for a real machine.

Compiler construction. x86 architecture. This lecture. Lecture 6: Code generation for x86. x86: assembly for a real machine. This lecture Compiler construction Lecture 6: Code generation for x86 Magnus Myreen Spring 2018 Chalmers University of Technology Gothenburg University x86 architecture s Some x86 instructions From LLVM

More information

Introducing LLDB for Linux on Arm and AArch64. Omair Javaid

Introducing LLDB for Linux on Arm and AArch64. Omair Javaid Introducing LLDB for Linux on Arm and AArch64 Omair Javaid Agenda ENGINEERS AND DEVICES WORKING TOGETHER Brief introduction and history behind LLDB Status of LLDB on Linux and Android Linaro s contributions

More information

LDC: The LLVM-based D Compiler

LDC: The LLVM-based D Compiler LDC: The LLVM-based D Compiler Using LLVM as backend for a D compiler Kai Nacke 02/02/14 LLVM devroom @ FOSDEM 14 Agenda Brief introduction to D Internals of the LDC compiler Used LLVM features Possible

More information

Chapter 2A Instructions: Language of the Computer

Chapter 2A Instructions: Language of the Computer Chapter 2A Instructions: Language of the Computer Copyright 2009 Elsevier, Inc. All rights reserved. Instruction Set The repertoire of instructions of a computer Different computers have different instruction

More information

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation Weiping Liao, Saengrawee (Anne) Pratoomtong, and Chuan Zhang Abstract Binary translation is an important component for translating

More information

T Reverse Engineering Malware: Static Analysis I

T Reverse Engineering Malware: Static Analysis I T-110.6220 Reverse Engineering Malware: Static Analysis I Antti Tikkanen, F-Secure Corporation Protecting the irreplaceable f-secure.com Representing Data 2 Binary Numbers 1 0 1 1 Nibble B 1 0 1 1 1 1

More information

COMPILER CONSTRUCTION Seminar 03 TDDB

COMPILER CONSTRUCTION Seminar 03 TDDB COMPILER CONSTRUCTION Seminar 03 TDDB44 2016 Martin Sjölund (martin.sjolund@liu.se) Mahder Gebremedhin (mahder.gebremedhin@liu.se) Department of Computer and Information Science Linköping University LABS

More information

CSE 351. GDB Introduction

CSE 351. GDB Introduction CSE 351 GDB Introduction Lab 2 Out either tonight or tomorrow Due April 27 th (you have ~12 days) Reading and understanding x86_64 assembly Debugging and disassembling programs Today: General debugging

More information

07 - Program Flow Control

07 - Program Flow Control September 23, 2014 Schedule change this week The lecture on thursday needs to move Lab computers The current computer lab (Bussen) is pretty nice since it has dual monitors However, the computers does

More information

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer

More information

Instructions: Language of the Computer

Instructions: Language of the Computer CS359: Computer Architecture Instructions: Language of the Computer Yanyan Shen Department of Computer Science and Engineering 1 The Language a Computer Understands Word a computer understands: instruction

More information

Intermediate Code Generation

Intermediate Code Generation Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target

More information

What do Compilers Produce?

What do Compilers Produce? What do Compilers Produce? Pure Machine Code Compilers may generate code for a particular machine, not assuming any operating system or library routines. This is pure code because it includes nothing beyond

More information

LLVM, Clang and Embedded Linux Systems. Bruno Cardoso Lopes University of Campinas

LLVM, Clang and Embedded Linux Systems. Bruno Cardoso Lopes University of Campinas LLVM, Clang and Embedded Linux Systems Bruno Cardoso Lopes University of Campinas What s LLVM? What s LLVM Compiler infrastructure Frontend (clang) IR Optimizer Backends JIT Tools Assembler Disassembler

More information

Tutorial: Building a backend in 24 hours. Anton Korobeynikov

Tutorial: Building a backend in 24 hours. Anton Korobeynikov Tutorial: Building a backend in 24 hours Anton Korobeynikov anton@korobeynikov.info Outline 1. From IR to assembler: codegen pipeline 2. MC 3. Parts of a backend 4. Example step-by-step The Pipeline LLVM

More information

Design of an Automatically Generated Retargetable Decompiler

Design of an Automatically Generated Retargetable Decompiler esign of an Automatically Generated Retargetable ecompiler Lukáš Ďurfina, Jakub Křoustek, Petr Zemek, ušan Kolář, Tomáš Hruška, Karel Masařík, Alexander Meduna Faculty of Information Technology, Brno University

More information

Chapter 11 Introduction to Programming in C

Chapter 11 Introduction to Programming in C Chapter 11 Introduction to Programming in C C: A High-Level Language Gives symbolic names for containers of values don t need to know which register or memory location Provides abstraction of underlying

More information

Compiler Optimization Intermediate Representation

Compiler Optimization Intermediate Representation Compiler Optimization Intermediate Representation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology

More information

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer

More information

For our next chapter, we will discuss the emulation process which is an integral part of virtual machines.

For our next chapter, we will discuss the emulation process which is an integral part of virtual machines. For our next chapter, we will discuss the emulation process which is an integral part of virtual machines. 1 2 For today s lecture, we ll start by defining what we mean by emulation. Specifically, in this

More information

Formats of Translated Programs

Formats of Translated Programs Formats of Translated Programs Compilers differ in the format of the target code they generate. Target formats may be categorized as assembly language, relocatable binary, or memory-image. Assembly Language

More information

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona CSc 453 Compilers and Systems Software 13 : Intermediate Code I Department of Computer Science University of Arizona collberg@gmail.com Copyright c 2009 Christian Collberg Introduction Compiler Phases

More information

Introduction to C. Why C? Difference between Python and C C compiler stages Basic syntax in C

Introduction to C. Why C? Difference between Python and C C compiler stages Basic syntax in C Final Review CS304 Introduction to C Why C? Difference between Python and C C compiler stages Basic syntax in C Pointers What is a pointer? declaration, &, dereference... Pointer & dynamic memory allocation

More information

17. Instruction Sets: Characteristics and Functions

17. Instruction Sets: Characteristics and Functions 17. Instruction Sets: Characteristics and Functions Chapter 12 Spring 2016 CS430 - Computer Architecture 1 Introduction Section 12.1, 12.2, and 12.3 pp. 406-418 Computer Designer: Machine instruction set

More information

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008 Math 230 Assembly Programming (AKA Computer Organization) Spring 2008 MIPS Intro II Lect 10 Feb 15, 2008 Adapted from slides developed for: Mary J. Irwin PSU CSE331 Dave Patterson s UCB CS152 M230 L10.1

More information

Operating Systems. 18. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Spring /20/ Paul Krzyzanowski

Operating Systems. 18. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Spring /20/ Paul Krzyzanowski Operating Systems 18. Remote Procedure Calls Paul Krzyzanowski Rutgers University Spring 2015 4/20/2015 2014-2015 Paul Krzyzanowski 1 Remote Procedure Calls 2 Problems with the sockets API The sockets

More information

Intermediate Code & Local Optimizations

Intermediate Code & Local Optimizations Lecture Outline Intermediate Code & Local Optimizations Intermediate code Local optimizations Compiler Design I (2011) 2 Code Generation Summary We have so far discussed Runtime organization Simple stack

More information

Topics Power tends to corrupt; absolute power corrupts absolutely. Computer Organization CS Data Representation

Topics Power tends to corrupt; absolute power corrupts absolutely. Computer Organization CS Data Representation Computer Organization CS 231-01 Data Representation Dr. William H. Robinson November 12, 2004 Topics Power tends to corrupt; absolute power corrupts absolutely. Lord Acton British historian, late 19 th

More information

Lecture Outline. Code Generation. Lecture 30. Example of a Stack Machine Program. Stack Machines

Lecture Outline. Code Generation. Lecture 30. Example of a Stack Machine Program. Stack Machines Lecture Outline Code Generation Lecture 30 (based on slides by R. Bodik) Stack machines The MIPS assembly language The x86 assembly language A simple source language Stack-machine implementation of the

More information

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world.

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world. Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world. Supercharge your PS3 game code Part 1: Compiler internals.

More information

Compiler, Assembler, and Linker

Compiler, Assembler, and Linker Compiler, Assembler, and Linker Minsoo Ryu Department of Computer Science and Engineering Hanyang University msryu@hanyang.ac.kr What is a Compilation? Preprocessor Compiler Assembler Linker Loader Contents

More information

Computer Systems Architecture I. CSE 560M Lecture 3 Prof. Patrick Crowley

Computer Systems Architecture I. CSE 560M Lecture 3 Prof. Patrick Crowley Computer Systems Architecture I CSE 560M Lecture 3 Prof. Patrick Crowley Plan for Today Announcements Readings are extremely important! No class meeting next Monday Questions Commentaries A few remaining

More information

Lecture Notes on Decompilation

Lecture Notes on Decompilation Lecture Notes on Decompilation 15411: Compiler Design Maxime Serrano Lecture 20 October 31, 2013 1 Introduction In this lecture, we consider the problem of doing compilation backwards - that is, transforming

More information

Chapter 11 Introduction to Programming in C

Chapter 11 Introduction to Programming in C Chapter 11 Introduction to Programming in C Original slides from Gregory Byrd, North Carolina State University Modified slides by Chris Wilcox, Colorado State University C: A High-Level Language! Gives

More information

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei Instruction Set Architecture part 1 (Introduction) Mehran Rezaei Overview Last Lecture s Review Execution Cycle Levels of Computer Languages Stored Program Computer/Instruction Execution Cycle SPIM, a

More information

Announcements HW1 is due on this Friday (Sept 12th) Appendix A is very helpful to HW1. Check out system calls

Announcements HW1 is due on this Friday (Sept 12th) Appendix A is very helpful to HW1. Check out system calls Announcements HW1 is due on this Friday (Sept 12 th ) Appendix A is very helpful to HW1. Check out system calls on Page A-48. Ask TA (Liquan chen: liquan@ece.rutgers.edu) about homework related questions.

More information

About Codefrux While the current trends around the world are based on the internet, mobile and its applications, we try to make the most out of it. As for us, we are a well established IT professionals

More information

Slides for Lecture 6

Slides for Lecture 6 Slides for Lecture 6 ENCM 501: Principles of Computer Architecture Winter 2014 Term Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary 28 January,

More information

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands Stored Program Concept Instructions: Instructions are bits Programs are stored in memory to be read or written just like data Processor Memory memory for data, programs, compilers, editors, etc. Fetch

More information

CISC / RISC. Complex / Reduced Instruction Set Computers

CISC / RISC. Complex / Reduced Instruction Set Computers Systems Architecture CISC / RISC Complex / Reduced Instruction Set Computers CISC / RISC p. 1/12 Instruction Usage Instruction Group Average Usage 1 Data Movement 45.28% 2 Flow Control 28.73% 3 Arithmetic

More information

SSA Construction. Daniel Grund & Sebastian Hack. CC Winter Term 09/10. Saarland University

SSA Construction. Daniel Grund & Sebastian Hack. CC Winter Term 09/10. Saarland University SSA Construction Daniel Grund & Sebastian Hack Saarland University CC Winter Term 09/10 Outline Overview Intermediate Representations Why? How? IR Concepts Static Single Assignment Form Introduction Theory

More information

Measuring the User Debugging Experience. Greg Bedwell Sony Interactive Entertainment

Measuring the User Debugging Experience. Greg Bedwell Sony Interactive Entertainment Measuring the User Debugging Experience Greg Bedwell Sony Interactive Entertainment introducing DExTer introducing Debugging Experience Tester introducing Debugging Experience Tester (currently in internal

More information

Code Generation. Lecture 30

Code Generation. Lecture 30 Code Generation Lecture 30 (based on slides by R. Bodik) 11/14/06 Prof. Hilfinger CS164 Lecture 30 1 Lecture Outline Stack machines The MIPS assembly language The x86 assembly language A simple source

More information

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization CS/COE0447: Computer Organization and Assembly Language Chapter 2 Sangyeun Cho Dept. of Computer Science Five classic components I am like a control tower I am like a pack of file folders I am like a conveyor

More information

Connecting the EDG front-end to LLVM. Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd.

Connecting the EDG front-end to LLVM. Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd. Connecting the EDG front-end to LLVM Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd. 1 Outline Why EDG Producing IR ARM support 2 EDG Front-End LLVM already has two good C++ front-ends, why

More information

GCC Plugins Die by the Sword

GCC Plugins Die by the Sword GCC Plugins Die by the Sword Matt Davis (enferex) Ruxcon 2011 mattdavis9@gmail.com November 20 2011 mattdavis9@gmail.com (Ruxcon 2011) Getting Dirty with GCC November 20 2011 1 / 666 Overview 1 Overview

More information

Who are we? Andre Platzer Out of town the first week GHC TAs Alex Crichton, senior in CS and ECE Ian Gillis, senior in CS

Who are we? Andre Platzer Out of town the first week GHC TAs Alex Crichton, senior in CS and ECE Ian Gillis, senior in CS 15-411 Compilers Who are we? Andre Platzer Out of town the first week GHC 9103 TAs Alex Crichton, senior in CS and ECE Ian Gillis, senior in CS Logistics symbolaris.com/course/compiler12.html symbolaris.com

More information

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine Machine Language Instructions Introduction Instructions Words of a language understood by machine Instruction set Vocabulary of the machine Current goal: to relate a high level language to instruction

More information

Introduction to LLVM. UG3 Compiling Techniques Autumn 2018

Introduction to LLVM. UG3 Compiling Techniques Autumn 2018 Introduction to LLVM UG3 Compiling Techniques Autumn 2018 Contact Information Instructor: Aaron Smith Email: aaron.l.smith@ed.ac.uk Office: IF 1.29 TA for LLVM: Andrej Ivanis Email: andrej.ivanis@ed.ac.uk

More information

Computer Architecture. MIPS Instruction Set Architecture

Computer Architecture. MIPS Instruction Set Architecture Computer Architecture MIPS Instruction Set Architecture Instruction Set Architecture An Abstract Data Type Objects Registers & Memory Operations Instructions Goal of Instruction Set Architecture Design

More information

Binary Code Analysis: Concepts and Perspectives

Binary Code Analysis: Concepts and Perspectives Binary Code Analysis: Concepts and Perspectives Emmanuel Fleury LaBRI, Université de Bordeaux, France May 12, 2016 E. Fleury (LaBRI, France) Binary Code Analysis: Concepts

More information

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009 101 Assembly ENGR 3410 Computer Architecture Mark L. Chang Fall 2009 What is assembly? 79 Why are we learning assembly now? 80 Assembly Language Readings: Chapter 2 (2.1-2.6, 2.8, 2.9, 2.13, 2.15), Appendix

More information

Intermediate Representation (IR)

Intermediate Representation (IR) Intermediate Representation (IR) Components and Design Goals for an IR IR encodes all knowledge the compiler has derived about source program. Simple compiler structure source code More typical compiler

More information