High-Fidelity of software systems
Dr. Nikolai Mansourov
Chief Technology Officer, KDM Analytics
http://www.kdmanalytics.com
5 March 2007

Agenda
1. Motivation: of security properties of existing software systems
2. The SwA Ecosystem
3. OMG Knowledge Discovery Metamodel (KDM)
4. Fidelity of the static
5. Micro KDM high-fidelity intermediate representation
6. Platform knowledge in KDM
7. Next steps

SwA Ecosystem
- Traditional tools are built as silos
- Today's tools have limited interoperability
- Economy of tools, COTS components and services emerges
Diagram labels: Analysis Tools; Reverse Engineering Tools; Formal Methods; software assessment tool; Software assurance tools; extractor; reporting; KDM; SBVR; Existing software

Why KDM?
KDM is an OMG foundation for software assurance and modernization
- Meta-model for describing existing software assets
- Includes software, its parts and operating environment
- Focuses on structure of existing software to provide context for modernizations
- Represents existing software as Entities, Relationships and Attributes
- Exchange of information between tool vendors
- Language- and platform-independent yet extensible
- Spans multiple abstractions yet traceable back to the artifacts

KDM: precise models of existing enterprise software
- Top-down models specify constraints ("as-required" properties)
- Bottom-up models specify existing reality ("as-is" properties)
- KDM approximates top-down models and can be used to validate compliance with requirements
- KDM provides complete high-fidelity bottom-up models of existing software
Diagram labels:
- UML 2.1: Top-Down modeling: Architecture
- CWM: Bottom-Up modeling: Persistent data
- KDM: Bottom-Up modeling: Programming Languages, Data, User Interface, Build, Platform, Runtime, Architecture, Business
- SBVR, BPDM: Top-Down modeling: Business
- CAM, EAI: Bottom-Up modeling: Data types & Messages
Gartner Group

Motivation: High-fidelity static
- Fidelity: trustfulness, closeness in sound, facts, color, etc. to the original
- Fidelity of the internal
- Accuracy
- Adequacy
- Proper scope

Factors that Limit Fidelity
- Low fidelity models
- Technical platform considerations (even a high-fidelity may be incomplete)
- Other mismatches
- Dynamic aspects
- Preprocessor
- Code generation

Background: Language-dependent models
- Abstract Syntax Tree (AST): language-dependent; emphasizes form, not meaning
- Abstract Syntax Graph (ASG)
Pipeline labels: Source code; Lexical tokens; parser; AST; reporting
Source code:
    main() {
        unsigned char field[3];
        memcpy(field, "ABC", 3);
        EXEC CICS LINK PROGRAM("PROG2") COMMAREA(field) LENGTH(sizeof(field));
    }
AST node types shown: CASTFunctionDefinition, CASTCompoundExpression, CASTExpressionStatement, CASTFunctionCallExpression, CASTExpressionList, CASTTIdExpression
AST leaves shown: main, unsigned char, field, memcpy, ABC, 3, cics_link, PROG2

KDM: Entities-Relationship
- KDM is an Entity-Relationship Model container
Diagram labels: Term; Fact; Entity-Relationship Model; parser; reporting
The diagram repeats the AST of the example program (CASTFunctionDefinition, CASTCompoundExpression, CASTExpressionStatement, CASTFunctionCallExpression, CASTExpressionList, CASTTIdExpression) with the leaves main, unsigned char, field, memcpy, ABC, 3, cics_link, PROG2.

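As a rough illustration of the term/fact idea, the example program can be written down as a handful of entities and relationships. This is a minimal sketch, not the KDM schema: the classes `Entity` and `Relationship`, the kind labels, and the choice of facts are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-ins for entities ("terms") and
# relationships ("facts"); the kind labels are illustrative only.
@dataclass(frozen=True)
class Entity:
    kind: str
    name: str

@dataclass(frozen=True)
class Relationship:
    kind: str
    source: Entity
    target: Entity

main = Entity("CallableUnit", "main")
field = Entity("StorableUnit", "field")
memcpy = Entity("CallableUnit", "memcpy")
cics_link = Entity("PlatformCall", "cics_link")

facts = [
    Relationship("Calls", main, memcpy),
    Relationship("Writes", main, field),   # memcpy fills the buffer
    Relationship("Calls", main, cics_link),
    Relationship("Reads", main, field),    # COMMAREA(field) passes the buffer on
]

def targets(source, kind):
    """Return names of entities that `source` is related to by `kind`."""
    return [r.target.name for r in facts if r.source == source and r.kind == kind]

print(targets(main, "Calls"))
```

Unlike the AST, this form says nothing about how the statements were spelled; it records only which things exist and how they relate, which is what makes it comparable across languages.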
Micro KDM: Lower-Level models of software
- Intermediate Representation (IR)
- KDM terms and facts reach down to this level
Diagram labels: parser; reporting; the AST of the example program; IR nodes IRLabel, IRCallInst, IRArgInst with operands main, memcpy, cics_link, field, ABC, 3, PROG2
Example IR:
- Sun Compiler 3-address IR
- Intel compiler XIL
- DEC GEM Compiler for Alpha
- Java bytecode
- Microsoft CLR
- DIANA IR for Ada
- SUIF compiler
- RTL of GCC compiler
- GIMPLE of GCC compiler
- LAST of the McCAT compiler

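The step from a syntax tree to a flat intermediate representation can be sketched for the example program. The opcode names IRLabel, IRArgInst and IRCallInst come from the slide; the function `lower`, the tuple layout and the ordering of instructions are assumptions for illustration and do not reproduce any particular compiler IR or the Micro KDM definition.

```python
from typing import List, Tuple

Instruction = Tuple[str, str]  # (opcode, operand), a hypothetical layout

def lower(function_name: str, calls: List[Tuple[str, List[str]]]) -> List[Instruction]:
    """Flatten calls into a label followed by argument and call instructions."""
    code: List[Instruction] = [("IRLabel", function_name)]
    for callee, args in calls:
        for arg in args:
            code.append(("IRArgInst", arg))
        code.append(("IRCallInst", callee))
    return code

# The two statements of the example program, already reduced to calls.
calls = [
    ("memcpy", ["field", '"ABC"', "3"]),
    ("cics_link", ['"PROG2"', "field", "3"]),
]

ir = lower("main", calls)
for opcode, operand in ir:
    print(f"{opcode:<11}{operand}")
```

In this flat form the embedded CICS command and the library call look alike: each is a sequence of arguments followed by a call, which is the level of detail the slide says KDM terms and facts reach.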
Evaluation target
enterprise software system
Gartner Group

Programming language is not enough
Diagram labels: control flow; data flow; CICS commarea; CICS TS; programs A and B; T1 and T2; numbered steps 1 to 6

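Why the programming language alone is not enough can be sketched with a small reachability check. All node and edge names below are hypothetical: the language-level edges stand for what a parser of the source language could extract, and the single platform edge stands for a CICS LINK that transfers control, together with the commarea, from program A to program B.

```python
from collections import deque

# Hypothetical edges visible from the source language alone.
language_edges = {
    "A.main": ["memcpy", "cics_link"],
    "B.main": ["B.process"],
}
# Hypothetical platform-level edge: the link transfers control to B.
platform_edges = {
    "cics_link": ["B.main"],
}

def reachable(start, *edge_sets):
    """Breadth-first search over the union of the given edge maps."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for edges in edge_sets:
            for nxt in edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

code_only = reachable("A.main", language_edges)
with_platform = reachable("A.main", language_edges, platform_edges)

# Nodes that only become reachable once platform knowledge is added.
print(sorted(with_platform - code_only))
```

Without the platform edge the analysis stops at the call into the platform, so any flow that continues in program B is invisible to it.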
Typical security pattern, involving platform
Diagram labels: platform; Component A; Component B; x; a, b, c, d, e, f; numbered steps 1 to 3

scenario
Diagram labels, in the order shown:
- Composite Applications
- Custom business processes & analytics
- Reusable, Executable Services
- Service Enablement Layer
- Reusable, Executable Services
- Traditional Applications
- Application Platform
- Business Process Platform
- Middleware
- Transaction Systems
- DBMS
- Network Systems
- Operating System / VM
- Machine
- Technical Platform
SAP AG and D. Frankel, 2005

Platform knowledge
- platform provides resources to application code
- platform provides services that are related to resources
- application code invokes platform services to manage the life-cycle of resources
- platform provides component deployment mechanism
- platform defines control and data flow between application components
- platform provides error handling across application components
- platform provides of application components

KDM architecture
- Abstractions layer (Conceptual, Structure, Build): higher-level entities and relationships; aggregated relationships; derivative, implicit; experts, analysts
- Runtime Resource layer (UI, Data, Event, Platform): resources provided by the runtime platform, managed through the API calls to external packages
- Program Elements layer (Code, Action): primitive entities and relationships; explicit, automatically extracted
- Infrastructure layer (Core, kdm, source)

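The contrast between primitive, automatically extracted relationships and derived, aggregated ones can be sketched as follows. The element names, the `owner` mapping and the counting rule are assumptions chosen for illustration; this is not the KDM definition of an aggregated relationship.

```python
from collections import Counter

# Hypothetical containment: which higher-level component owns each element.
owner = {
    "main": "ComponentA",
    "helper": "ComponentA",
    "prog2_entry": "ComponentB",
    "update_db": "ComponentB",
}

# Hypothetical primitive relationships extracted from code: (kind, from, to).
primitive = [
    ("Calls", "main", "helper"),
    ("Calls", "main", "prog2_entry"),
    ("Calls", "helper", "prog2_entry"),
    ("Calls", "prog2_entry", "update_db"),
]

def aggregate(relationships, owner):
    """Count primitive relationships that cross component boundaries."""
    totals = Counter()
    for _kind, src, dst in relationships:
        a, b = owner[src], owner[dst]
        if a != b:
            totals[(a, b)] += 1
    return totals

aggregated = aggregate(primitive, owner)
print(dict(aggregated))
```

The aggregated view is implicit in the primitive facts, which is why it can be derived by an analyst or a tool rather than extracted directly from the source.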
Thank you!