Automated Software Transplantation
Earl T. Barr, Mark Harman, Yue Jia, Alexandru Marginean, Justyna Petke
CREST, University College London
Why Autotransplantation?
Your video player cannot handle H.264. What do you do?
* Start from scratch, or
* check open source repositories: ~100 video players already exist.
Related Work
Manual Code Transplants * Clone Detection * Code Migration * In-Situ Code Reuse * Dependence Analysis * Feature Location * Automatic Error Fixing * Code Salvaging * Feature Extraction * Synchronising Manual Transplants * Automatic Replay * Copy-Paste * Autotransplantation
Related Work
Miles et al.: "In situ reuse of logically extracted functional components" — a binary organ is extracted and reused inside the running program via a debugger (in-situ code reuse).
Related Work
Sidiroglou-Douskos et al.: "Automatic Error Elimination by Multi-Application Code Transfer" — code transferred from multiple donor applications to fix errors automatically.
Related Work
Among manual code transplants, in-situ code reuse, and automatic error fixing, autotransplantation is the gap this work fills.
Human Organ Transplantation
Automated Software Transplantation
Manual work only: mark the organ entry point (V_ENTRY) of organ O in the donor, write the organ's test suite, and choose the implantation point in the beneficiary.
μTrans
Stage 1: Static Analysis of the donor. Stage 2: Genetic Programming, guided by the organ's test suite. Stage 3: Organ Implantation into the beneficiary.
Stage 1: Static Analysis
Starting from the organ entry (O_E) in the donor, dependency analysis extracts the organ plus its vein (the statements the organ needs to execute) and builds two artefacts: a dependency graph and a matching table that pairs each donor variable (e.g. the x in the donor statement `int x = X * 10;`) with the set of type-compatible variables around the host's implantation point (e.g. from the declaration `int A, B, C;`).
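The vein computation above can be sketched as a transitive closure over the donor's statement-level dependency graph. This is a toy illustration of the idea, not μSCALPEL's actual implementation; the `vein` helper, the statement ids, and the `deps` graph are all hypothetical:

```python
# Hypothetical sketch of Stage 1's vein extraction (not the real muScalpel code):
# given a statement-level dependency graph of the donor, collect every statement
# the organ entry point transitively depends on -- that set is the "vein".

def vein(deps, organ_entry):
    """deps maps a statement id to the ids of statements it depends on."""
    needed, stack = set(), [organ_entry]
    while stack:
        s = stack.pop()
        if s in needed:
            continue
        needed.add(s)
        stack.extend(deps.get(s, ()))
    return needed

# Toy donor: S3 (the organ entry) uses a value computed in S2, which uses a
# declaration in S1; S4 is unrelated and is sliced away.
deps = {"S3": ["S2"], "S2": ["S1"], "S4": ["S1"]}
print(sorted(vein(deps, "S3")))  # ['S1', 'S2', 'S3']
```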
Stage 2: Genetic Programming
An individual is a pair: a selection of organ statements (S1 ... Sn) and a variable matching that binds each donor variable (V1_D, V2_D, ...) to a host variable drawn from its matching-table set (V1_H ... V5_H), e.g. M1: V1_D -> V1_H, M2: V2_D -> V4_H. Fitness uses test-based proxies:

fitness(i) = 1/3 * (1 + |TX_i|/|T| + |TP_i|/|T|)   if i ∈ I_C
fitness(i) = 0                                     if i ∉ I_C

where I_C is the set of individuals that compile, TX_i ⊆ T are the test cases i executes without crashing (weak proxy), and TP_i ⊆ T are the test cases for which i produces the correct output (strong proxy).
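The fitness on this slide can be written down directly. A minimal sketch, assuming per-individual counts of executed and passed tests are already available (the `fitness` function name and its signature are illustrative, not μSCALPEL's API):

```python
# Hedged sketch of the slide's fitness: an individual scores 0 unless it
# compiles; otherwise 1/3 * (1 + |TX_i|/|T| + |TP_i|/|T|), where TX_i are the
# tests executed without crashing (weak proxy) and TP_i are the tests whose
# output is correct (strong proxy).

def fitness(compiles, executed, passed, total_tests):
    if not compiles:  # i not in I_C
        return 0.0
    return (1.0 / 3.0) * (1 + executed / total_tests + passed / total_tests)

print(fitness(False, 0, 0, 5))  # 0.0 -- does not compile
print(fitness(True, 5, 5, 5))   # 1.0 -- compiles, runs, all outputs correct
```

Compilation gates the score entirely, so GP first pushes individuals toward compilable organs, then toward crash-free execution, then toward correct output.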
Research Questions
RQ1: Do we break the initial functionality? (regression tests)
RQ2: Have we really added the new functionality? (donor organ, acceptance tests)
RQ3: How about the computational effort?
RQ4: Is autotransplantation useful?
Research Questions
RQ1 and RQ2 — Empirical study: 15 transplantations, 300 runs, 5 donors, 3 beneficiaries.
RQ3 and RQ4 — Case study: H.264 encoding transplantation.
Validation
* Regression tests on the beneficiary (RQ1.1)
* Augmented regression tests (RQ1.2)
* Acceptance tests, derived from the donor's tests (RQ2)
* Manual validation
ISSTA Artifact Evaluated (AEC): Consistent * Complete * Well Documented * Easy to Reuse

Subjects
Subject     Type         Size (KLOC)  Reg. Tests  Organ Test Suite
Idct        Donor        2.3          3           5
Mytar       Donor        0.4          -           4
Cflow       Donor        25           6           20
Webserver   Donor        1.7          -           3
TuxCrypt    Donor        2.7          4           5
Pidgin      Beneficiary  363          88          -
Cflow       Beneficiary  25           21          -
SoX         Beneficiary  43           157         -
Case Study
x264        Donor        63           -           5
VLC         Beneficiary  422          27          -
Minimal size: 0.4 KLOC; max size: 422 KLOC; average donor: 16 KLOC; average beneficiary: 213 KLOC.
Experimental Methodology
Setup: count LOC with CLOC; run μSCALPEL 20 times per donor/beneficiary pair, given the donor, the organ entry (O_E), the organ's test suite, and the implantation point; validate with the test suites; measure time with GNU Time; collect coverage information with Gcov. Machine: Ubuntu 14.10, 16 GB RAM, 8 threads.
Empirical Study (RQ1, RQ2): runs passing each test suite, out of 20 per pair

Donor   Beneficiary  All  Regression  Regression++  Acceptance
                          (RQ1.1)     (RQ1.2)       (RQ2)
Idct    Pidgin       16   20          17            16
Mytar   Pidgin       16   20          18            20
Web     Pidgin       0    20          0             18
Cflow   Pidgin       15   20          15            16
Tux     Pidgin       15   20          17            16
Idct    Cflow        16   17          16            16
Mytar   Cflow        17   17          17            20
Web     Cflow        0    0           0             17
Cflow   Cflow        20   20          20            20
Tux     Cflow        14   15          14            16
Idct    SoX          15   18          17            16
Mytar   SoX          17   17          17            20
Web     SoX          0    0           0             17
Cflow   SoX          14   16          15            14
Tux     SoX          13   13          13            14
TOTAL                188/300  233/300  196/300  256/300
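As a sanity check on the totals row, the implied pass rates over the 300 runs (a small illustrative script, not part of the study's tooling):

```python
# Pass-rate check for the totals row above: 15 donor/beneficiary pairs
# x 20 runs = 300 runs per test-suite column.
totals = {"all": 188, "regression": 233, "regression++": 196, "acceptance": 256}
runs = 300
for suite, passed in totals.items():
    print(f"{suite}: {passed}/{runs} = {passed / runs:.0%}")
```

Acceptance fares best (256/300, about 85%), i.e. the transplanted organ usually works; the "All" column is lowest because a run must pass every suite at once to count.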
Empirical Study (RQ3): timing information, 20 runs per pair

Donor   Beneficiary  Mean (min)  Std. Dev.  Total (min)
Idct    Pidgin       5           7          97
Mytar   Pidgin       3           1          65
Web     Pidgin       8           5          160
Cflow   Pidgin       58          16         1151
Tux     Pidgin       29          10         574
Idct    Cflow        3           5          59
Mytar   Cflow        3           1          53
Web     Cflow        5           2          102
Cflow   Cflow        44          9          872
Tux     Cflow        31          11         623
Idct    SoX          12          17         233
Mytar   SoX          3           1          60
Web     SoX          7           3          132
Cflow   SoX          89          53         74
Tux     SoX          34          13         94
Mean times sum to 334 min; std. dev. averages 10; total machine time: ~72 hours.
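The "72 hours" headline follows from the per-pair totals in the rightmost column. A quick arithmetic check (numbers copied from the table above):

```python
# Per-pair total minutes from the RQ3 timing table; their sum should match
# the reported ~72 hours of total machine time.
totals_min = [97, 65, 160, 1151, 574, 59, 53, 102, 872, 623,
              233, 60, 132, 74, 94]
total_hours = sum(totals_min) / 60
print(round(total_hours))  # 72
```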
Case Study (RQ4)
Transplanting H.264 encoding from x264 into VLC took 26 hours of machine time, and the result passed 100% of the regression, regression++, and acceptance tests. The donor, x264, is a strong organ source: in MSU's Second and Sixth Annual MPEG-4 AVC/H.264 Video Codecs Comparisons and Doom9's 2005 Codec Shoot-Out it ranked first, encoding ~24% better than second place.
Summary
μTrans: manual work is limited to marking the organ entry (V_ENTRY) in the donor, supplying the organ's test suite, and choosing the implantation point in the beneficiary. μSCALPEL then performs static analysis, genetic programming, and organ implantation, validated with regression tests (RQ1.a), augmented regression tests (RQ1.b), and acceptance tests (RQ2), plus manual validation, on subjects ranging from 0.4 KLOC to 422 KLOC.