EffectiveSan: Type and Memory Error Detection using Dynamically Typed C/C++

Size: px
Start display at page:

Download "EffectiveSan: Type and Memory Error Detection using Dynamically Typed C/C++"

Transcription

1 EffectiveSan: Type and Memory Error Detection using Dynamically Typed C/C++ Gregory J. Duck and Roland H. C. Yap Department of Computer Science National University of Singapore {gregory, arxiv: v1 [cs.pl] 17 Oct 2017 Abstract Low-level programming languages such as C and C++ are vulnerable to errors related to the misuse of memory. Such errors include bounds overflows, sub-object bounds overflows, use-after-free, reuse -after-free and type confusion. These errors account for many of the exploits in programs implemented in such unsafe languages. Most bug detection tools (sanitizers) tend to focus only on detecting specific classes of errors, e.g. bounds overflow or type confusion only. In this paper, we present a new type confusion and memory error sanitizer based on dynamically checking the effective type (a.k.a. the dynamic type) of C/C++ objects at runtime. We show that this effective type sanitizer (EffectiveSan) can detect the memory misuse errors mentioned above, all while using the same underlying methodology (dynamic typing). Our approach uses a combination of low-fat pointers, type meta data and type check instrumentation. We also have a novel approach to preventing sub-object bound overflow errors leveraging on the C/C++ types. We show EffectiveSan finds type confusion, (sub-)object bounds overflow, and use-after-free bugs in the SPEC2006 benchmark suite. I. INTRODUCTION Memory corruption (e.g., bounds overflows), (re)use-afterfree and type confusion errors are a major source of security vulnerabilities in programs written in low-level languages such as C/C++. Such errors allow attackers to access or modify memory in ways not intended by the programmer. This is the basis of many attacks, such as control flow hijacking or information leakage. For example, these errors have accounted for over 75% remote code execution vulnerabilities in Microsoft software [1]. A search of the National Vulnerability Database (NVD) [2] for the year 2016 alone yields 110 hits for use after free, 125 for out-of-bounds read/write and 35 for type confusion, indicating that such errors are still active threats. One countermeasure is to use a sanitizer that instruments the program with additional code that detects such bugs. Sanitizers are typically used for testing or debugging during the software development process, and in some cases may be used to harden production code (with a performance penalty). One weakness with many sanitizers is that they tend to be specialized to one class of errors. For example, TypeSan [3], BaggyBounds [4] and CETS [5] are specialized tools designed to detect type confusion, bounds overflows and use-after-free errors respectively. Neither tool offers any protection against other kinds of errors. Furthermore, many sanitizers only offer incomplete protection against the class of errors they do target. For example, a known limitation for AddressSanitizer [6], LowFat [7], [8] and BaggyBounds [4] is that they do not protect against sub-object overflows, e.g. struct account { char name[10]; int balance; } modification of balance from overflow in name will not be detected. Another example is CAVER [9] and TypeSan [3] which can be used to detect type confusion (a.k.a. bad casting) errors. However, these tools are specialized only for bad C++style static downcasts, and do not protect against type confusion caused by other kinds of casts. Combining sanitizers (for greater error detection) generally suffers from incompatibilities due to competing instrumentation/meta-data schemes. In this paper, we introduce the Effective Type Sanitizer (a.k.a. EffectiveSan) that detects a wide range of memory misuse errors, including type confusion, (sub-)object bounds overflows and (re)use-after-free. EffectiveSan works by dynamically checking an object s effective type (see [10] , a.k.a. the dynamic type) against the declared static type at runtime effectively turning C/C++ into a dynamically typed language. EffectiveSan can directly detect the following classes of errors: Type-confusion: By dynamically checking object types, we directly detect type confusion errors (or type errors for short) caused by bad casts. EffectiveSan protects all types of bad casts, including: C++-style casts, C-style casts, unions, and other implicit casts. (Sub-)Object-bounds-overflows: C/C++ types can encode bounds information, e.g. (int[100]), which is enforced at runtime by EffectiveSan. EffectiveSan also detects subobject bounds overflows that do not escape the containing allocated object. Furthermore, EffectiveSan can detect some classes of (re)useafter-free errors using dynamic typing: (Re)Use-after-free and Double-free: By binding unallocated objects to a special type, EffectiveSan can also detect some use-after-free and double-free errors. Likewise, reuse-after-free (when the object is reallocated before the erroneous access) is protected if the reallocated object is bound to a different type. Unlike other more specialized tools, EffectiveSan is a generalist sanitizer that can target multiple classes of errors using a single underlying methodology (dynamic type checking). As mentioned above, such errors account for the majority of attacks [1], and we envision that the primary use case

2 of EffectiveSan is to detect such errors during the software development and testing life-cycle. In this paper, we also present an implementation of EffectiveSan that uses low-fat pointers [7], [8] to dynamically bind type meta data to allocated objects. This differs from existing sanitizers that typically store meta data in shadow space or some other adjunct memory. Low-fat pointers have several benefits, namely speed, very low memory overheads and compatibility with uninstrumented code (e.g. system or proprietary third party libraries). We experimentally evaluate EffectiveSan against the SPEC2006 benchmarks and the Firefox web browser [11]. EffectiveSan finds several previously unreported sub-object bounds errors in SPEC2006, as well as previously reported bounds overflow and use-after-free bugs. EffectiveSan also finds multiple type errors that rely on undefined behavior under the C/C++ standards. With full instrumentation enabled, EffectiveSan has higher performance overhead compared to existing tools. Our philosophy is that EffectiveSan is primarily designed to be a error detection tool which aims to find more rather than less errors. This differs in philosophy from program hardening tools such as TypeSan that are far more specialized (i.e., C++ downcasts only, no (sub-)object bounds errors, no use-after-free errors). As such, we prioritize comprehensive instrumentation over raw performance sometimes leading to orders of magnitude more runtime checks compared to the related work. By design, EffectiveSan aims to detect errors which have security risks due to potentially unsafe code e.g., even type confusion between scalar types that may be used as indices into arrays. Finally, we note that it is easy to trade performance for completeness by reducing the level of instrumentation. To demonstrate this we also evaluate two reduced-instrumentation variants of EffectiveSan, namely: EffectiveSan-type: for type-confusion-only; and EffectiveSan-bounds: for bounds-checking-only. Both variants give significantly less overheads from full instrumentation with error detection being similar (or better) than existing state-of-the-art specialized sanitizers. In summary, the main contributions of this paper are: Dynamic Type Checking: We introduce dynamic type checking for C/C++ as a general methodology against a range of errors relating to the misuse of memory, such as type confusion and use-after-free bugs. Sub-object Bounds Checking: Dynamic type checking is also a novel approach to sub-object bounds checking. Most existing bounds-checking tools either check object bounds only (e.g., AddressSanitizer [6]), or require explicit tracking of sub-object bounds information (e.g., Softbound [5]), a major source of incompatibility. The EffectiveSan Tool: We present an implementation of EffectiveSan, the first practical sanitizer combining dynamic type checking with memory error detection. EffectiveSan offers comprehensive protection against type confusion errors (for both C and C++), comprehensive protection against (sub-)object bounds overflow errors, as well as partial protection against classes of (re-)use-after-free errors, all using the same underlying methodology. Evaluation: We experimentally evaluate EffectiveSan against the SPEC2006 benchmark suite [12] and the Firefox web browser [11]. SPEC2006 is a very heavily analyzed code-base, yet EffectiveSan is able to detect several new errors. As far as we are aware, EffectiveSan is the first full type and sub-object bounds error detection tool used to analyze a web browser. The paper is organized as follows: Section II discusses background and related work, Section III introduces a dynamic type system for C/C++, Section IV introduces the EffectiveSan instrumentation schema, Section V introduces the EffectiveSan runtime environment including meta data representation, Section VI experimentally evaluates EffectiveSan, and Section VII concludes. II. BACKGROUND We summarize related work on dynamic type systems and sanitizers for C/C++. A. Dynamic Typing Dynamically typed languages, such as JavaScript, Python, Lua, etc., check the types of objects at runtime. In contrast, statically typed languages, such as C, only check types at compile time. Similarly, C++ is a statically typed language with the limited exception of Run-Time Type Information (RTTI) and the (dynamic_cast) operator for downcasting (casting from a base to a derived class). The C/C++ type system is intentionally weak, i.e. allows arbitrary pointer casting and pointer arithmetic, meaning that type confusion and memory errors will not be prevented at compile time. By using dynamic typing, we can detect such errors at runtime at the cost of additional runtime overheads. Note that, in the context of C/C++, dynamic typing concerns pointer or reference casts only, e.g. (q = (float *)p) is a type confusion error if p does not point to a (float) object. Casts that create copies of objects, e.g. (f = (float)i), are valid conversions and not type errors. Aside from RTTI, there is little existing work on dynamic type systems for C/C++. The work of [13] proposes a very simple dynamic checking system for C that tags each data word with a basic type, e.g., integral, real, pointer, etc. Unlike our approach, there is no distinction between different types of pointers (i.e. all pointers are treated as (void *)) Overheads are also very high at for SPEC95. Type confusion sanitizers CAVER [9] and TypeSan [3] also provide a limited form of dynamic typing (for downcasts) discussed below. Bounds-checking can also be considered a weak form of dynamic typing (where only the effective type size is checked). B. Sanitizers Type confusion and memory errors have long been recognized as a major source of security vulnerabilities in programs written in low-level languages such as C/C++. As such, many

3 different bug detection tools (sanitizers) have been proposed. In this section, we give a brief overview as well as a comparison against EffectiveSan. Type Confusion: As mentioned above, C++ provides a very limited form of dynamic typing in the form of RTTI and (dynamic_cast) for downcasting. However, many programmers opt for the faster, albeit unsafe, (static_cast) version of the same operation and this is a known source of security vulnerabilities. CAVER [9] and TypeSan [3] are specialized for detecting type confusion errors caused by such downcasts, as illustrated by the following example: Example 1 (C++ Type Confusion). For example, consider: struct Derived1 : public Base { float f; }; struct Derived2 : public Base { int i; }; void set(base *ptr) { static_cast<derived1 *>(ptr)->f = 1.23f; } It is a type confusion error if function (set) is passed a pointer to a (Derived2) object. CAVER and TypeSan work by binding type meta data to pointers which is verified at runtime. An alternative approach is Undefined Behavior Sanitizer (UBSan) [14] that transforms static_casts into dynamic_casts to enable standard RTTI protections. The main limitation of this approach is that it only works for polymorphic classes (where a virtual function pointer table is present), so it is not applicable to Example 1. Another limitation of existing type confusion sanitizers is that they make the strong assumption of memory safety. For example, suppose Example 1 is modified to include a bounds overflow: (1 + static_cast<derived1 *>(ptr))->f = 1.23f; Even if (ptr) points to a (Derived1) object, the resulting out-of-bounds pointer may point to an object of any type, or any memory. TypeSan does not detect such errors, being out of scope. In contrast, EffectiveSan does not assume memory safety but rather explicitly enforces it, and will detect the outof-bounds access. Furthermore, existing tools are primarily designed for program hardening rather than comprehensively detecting type errors. As such, these tools tend to focus more on specific classes of bad casts, i.e. static downcasts. However, it is possible to disguise downcasts as more low-level casts. For example, the downcast from Example 1 could be replaced with the more low-level cast ((float *)ptr = 1.23f) without changing the functionality of the program. However, since the low-level cast is outside the scope of TypeSan s protection (i.e. no longer a downcast between classes), the type confusion error will not be detected. In contrast with existing tools, EffectiveSan aims for comprehensive type error detection, and can detect type confusion errors from any kind of cast (C++style casts, C-style casts, unions, implicit casts, downcasts, upcasts, etc.) and between any C/C++ type (integers, floats, pointers, classes, structures, etc.). (Sub-)Object Bounds Overflows: Object bounds overflows are well known to be a major source of security vulnerabilities. As such, many existing solutions have been proposed, including [4] [8], [15] [21] amongst others. Many solutions, such as BaggyBounds [4], LowFat [7], [8], Intel MPX [17] and Soft- Bound [5], work by binding bounds meta data (object size and base) to each pointer. The binding is typically implemented using some form of shadow memory (e.g., SoftBound, MPX) or encoding the meta data within the pointer itself (e.g., LowFat with low-fat pointers). Solutions that use shadow memory may also have compatibility issues interfacing with uninstrumented code that allocates its own memory (the corresponding entries in the shadow memory will not be initialized). This can be partially mitigated by intercepting standard memory allocation functions, or by hardware-based solutions (e.g. with MPX). Low-fat pointers avoid the problem by encoding bounds meta data within the pointer itself. Another approach to memory safety is AddressSanitizer [6] which uses poisoned red-zones and shadow memory to track the state of each word of memory, e.g. unallocated, allocated or red-zone. Out-of-bounds memory access that maps to a red-zone will be detected, however memory errors that skip red-zones may be missed. Most existing bounds overflow sanitizers protect allocation or object bounds only. This means the overflows contained within an allocated object will not be detected, e.g. the overflow into (balance) from Section I. Some bounds checking systems, e.g., SoftBound [5] and Intel MPX [17], can also detect sub-object bounds overflows by using static type information for bounds narrowing, i.e. an operation that further constrains bounds meta information to a specific subobject (e.g. the account (name)) for more accurate protection. This also requires sub-object bounds to be associated with pointers when they are passed between contexts, e.g. a pointer parameter in a function call. For example, MPX solves this problem by passing bounds information through special registers (bnd0-bnd3), or failing that, by resorting to the bounds directory stored in shadow memory. Similarly, SoftBound also explicitly tracks bounds information, e.g. by inserting additional function parameters [5]. Both MPX and Softbound use shadow memory schemes that are known to be unstable in multi-threaded environments [22]. EffectiveSan uses a different approach: sub-object bounds meta information is never passed explicitly, rather it will be reconstructed on demand based on the dynamic type information. Our approach preserves the function call Application Binary Interface ABI, meaning that EffectiveSan-instrumented code can be interfaced with uninstrumented code without compatibility issues. Unlike EffectiveSan, none of the existing (sub-)object bounds sanitizers offer direct protection against type confusion errors. That said, some indirect (incidental) protection may occur for bad casts between types of different sizes, leading to an object bounds overflow. In contrast, EffectiveSan directly detects type errors regardless of object size. (Re)Use-After-Free: Use-after-free sanitizers include tools such as AddressSanitizer [6] and Compiler Enforced Temporal Safety (CETS) [23]. AddressSanitizer stores the allocation state in shadow memory, which can detect use-after-free (but not reuse-after-free) errors. That said, AddressSanitizer mitigates reuse-after-free by putting freed objects in a quarantine to delay reallocation (a technique also applicable to

4 Sanitizer Type Conf. Overflows U.A.F. CAVER [9] Partial TypeSan [3] Partial UBSan [14] Partial BaggyBounds [4] Partial LowFat [7], [8] Partial Intel MPX [17] SoftBound [5] CETS [23] AddressSanitizer [6] Partial Partial SoftBound+CETS [5], [23] EffectiveSan Partial Fig. 1: Summary of different sanitizers and capabilities against type confusion and memory errors. Here ( ) means comprehensive protection, ( ) means no or incidental protection, and (Partial) means partial protection with caveats. The caveats are: ( ) only protects C++ static downcasts, ( ) only protects allocation bounds, ( ) only protects use-after-free (not reuse-afterfree), and ( ) only protects reuse-after-free if the reallocated object is bound to a different type. EffectiveSan). CETS uses a more sophisticated identifier-based approach, that binds a unique tag to each allocated object, allowing for complete (re)use-after-free detection. EffectiveSan s use-after-free protection is more closely related to the AddressSanitizer approach but with type meta data replacing AddressSanitizer s shadow memory scheme. EffectiveSan can directly detect reuse-after-free in the case where the object is reallocated using a different type. Although EffectiveSan s (re)use-after-free protection is not as comprehensive as more specialized tools, such as CETS, it is nevertheless worthwhile to target such errors using dynamic typing anyway, since this will incur no additional runtime cost. Other Approaches: There is a tradeoff between sanitizer/defense accuracy and overhead. To lessen the costs for production software, many low(er) overhead program hardening methods have been proposed, including Address Space Layout Randomization (ASLR) [24], Data Execution Prevention (DEP) [25], shadow stacks [26], Control Flow Integrity (CFI) [27] and Code Pointer Integrity (CPI) [28] amongst others. The protection tends to be incomplete and/or targets one specific kind of attack (e.g., control flow hijacking). In contrast, EffectiveSan aims for more comprehensive error detection rather than attack mitigation, which is more commonly used for program hardening. This also means that EffectiveSan can detect errors that can be used for other kinds of attacks, such as data flow [29], that are not normally protected against by most existing hardening methods. That said, EffectiveSan is primarily intended to be used for bug detection rather program hardening. C. Our Approach Figure 1 summarizes existing sanitizers and their capabilities that were described in Section II-B. Most sanitizers are specialized to one particular class of error and/or offer partial protection against the classes of errors they do support. This means that, if more comprehensive error sanitation is desired, then multiple different tools must be deployed at once. However, this is problematic, since most sanitizers are compiler specific (e.g. clang versus gcc) and use competing instrumentation/shadow-memory schemes not designed to be interoperable. 1 A notable exception is SoftBounds+CETS whose combination gives spatial and temporal safety, sans type confusion. That said, even SoftBounds+CETS has difficulty running the full SPEC2006 benchmarks (e.g., [28] were only able to run 4/19 benchmarks). Our proposal targets a more comprehensive range of memory misuse errors using a single underlying methodology, i.e. dynamic type checking for C/C++. The basic idea is to bind a dynamic type to each allocated object. This dynamic type is retrieved at runtime, and used to prevent type confusion errors by comparing the dynamic with the static type declared by the programmer. The type information is comprehensive, and supports all the standard C/C++ types (int, float, arrays, structures, classes, etc.) and even complex types such as structures with flexible array members. This allows for the detection of type confusion errors beyond static downcasts supported by CAVER and TypeSan. Furthermore, dynamic type information also encodes (sub-)object bounds information which can be enforced to prevent (sub)-object bounds overflow errors. Our bounds enforcement is byte-precise and supports sub-objects, thus offering more comprehensive protection than that of BaggyBounds, LowFat and AddressSanitizer. Finally, use-after-free errors can be detected by binding freed objects to a special type representing deallocated memory. Dynamic typing can also detect some forms of reuse-after-free errors. Although use-after-free protection is partial, it incurs no additional costs over dynamic type checking, while protecting against many common cases. III. DYNAMIC TYPES FOR C/C++ In this section, we present the dynamic type system for C/C++. This is essentially equivalent to the standard static type system, but also includes extensions for handling unallocated memory, and methods for calculating sub-object types and bounds at runtime. The dynamic type of an object is a qualifier-free 2 version of the effective type ([10] ) as defined by the C standard. The dynamic type can be any C/C++ type, including fundamental types (e.g., int, float, etc.), pointers, function pointers, arrays, structures, classes and unions. Dynamic types are always complete, i.e. that the type s size is known (e.g., (int[50]) is complete but (int[]) is not). We assume w.l.o.g. that type aliases (e.g. typedef) are fully expanded and C++ templates and namespaces are fully instantiated. Structures, classes and unions are considered equivalent based on tag (i.e. the name), or in the case of anonymous types, based on layout. 1 As an experiment, we tried to compile a simple test program with both TypeSan and AddressSanitizer enabled (for combined type confusion, bounds overflow and use-after-free protection). This resulted in compiler errors (multiple defined symbols) meaning that these sanitizers are not currently compatible. 2 Qualifiers do not affect memory layout or access (as per [10] ).

5 L : Type Z P (Type Z) (a) L(T, 0) T, 0 (b) L(T, sizeof(t )) T, sizeof(t ) (c) L(T [N], k) L(T, k mod sizeof(t )) (d) L(T [N], k) T [N], k if k mod sizeof(t ) = 0 (e) L(struct S, k) L(T elem, k offsetof(s, elem))) (f) L(class C, k) L(T elem, k offsetof(c, elem))) (g) L(union U, k) L(T elem, k offsetof(u, elem))) (h) L(FREE, k) = { FREE, 0 } Fig. 2: The layout function (L). Rules (c)-(h) implicitly assume that k is within the bounds of the object, i.e., 0 k < sizeof(u) for rule matching L(U, k). Here sizeof and offsetof are the standard ANSI C operators. We denote the set of all types as (Type). For brevity, we use the C++ convention of referring to types by their tag, e.g., (S) is short for (struct S). During a new allocation (e.g., stack allocation or heap allocation via, new, new[]) the dynamic type will be bound to the object. For stack allocations and C++ s new/new[] operators, the dynamic type is the same as the declared type of the object as defined by the program. For and C++ s operator new (not to be confused with the new operator) the dynamic type is deemed equivalent to the first lvalue usage type. The latter is determined by a simple program analysis. Example 2 (Dynamic Types). Consider the type definitions and allocations: struct T {int a[3]; char *s;}; struct S {float f; struct T t;}; T p[8]; q = new S; r = (S *)(sizeof(s)); s = (S *)(100*sizeof(S)); Pointer p will be bound to type (T[8]), q and r bound to type (S[1]), and s bound to type (S[100]). Notice that all dynamic types are complete, i.e., the type size is known (as determined by the allocation size). Deriving Sub-object Types: The dynamic type represents the type of the top-level allocated object. In C/C++, it is common for pointers to point to sub-objects contained within larger objects so called interior pointers. Interior pointers can point to array elements, or to an element (a.k.a. a field or member) contained within a structure or a class. Another example is C++ and classes with inheritance, where base class(es) are typically implemented as sub-objects of the derived class. EffectiveSan only explicitly tracks the dynamic types of toplevel allocated objects. Sub-objects are handled by deriving their type dynamically from the containing allocated object s dynamic type (or containing type for short) and an offset, i.e., the pointer difference (in bytes) between the interior pointer and the base pointer of the containing object. For the ease of presentation, we shall assume all pointer arithmetic uses byte offsets regardless of the underlying pointer type. To derive subobject types, we assume a runtime system that can map interior pointers to containing types and offsets (see Section V). Next the containing type and offset is mapped to a set of possible sub-object types using a memory layout function, denoted (L), that is formalized as the relation defined inductively over rules (a)-(h) from Figure 2. Essentially, given a pointer p to the base of an allocated object of dynamic type T and an offset k (in bytes), the function L(T, k) returns a set of type/integer pairs U, δ that represent all valid sub-objects pointed to by pointer (p+k). Here, the type (U) represents the sub-object s type, and integer δ represents the distance from the pointer (p+k) and the sub-object s base. The integer δ is used later for sub-object bounds calculation. For example, the layout for int assuming sizeof(int)=4 is L(int, 0) = { int, 0 } L(int, 4) = { int, 4 } L(int, k) = (otherwise) Thus, if p points to int, then both (p+0) and (p+4) also point to int by rules Figure 2(a)-(b) respectively. Rule (b) accounts for the one-past-the-last-element required by the C standard ([10] ,8). The layouts for other fundamental types, pointers, functions and enumerations are defined similarly. For compound types (arrays, structures and unions), we use Figure 2 rules (c)-(g) to build more complicated layouts. Rules (c),(e)-(g) state that the layout of an array/struct/class/union element (elem) of type (T elem ) includes the layout of (T elem ) offset within the containing type (normally the offset is zero for unions). For classes with inheritance, we consider any base class to be an implicit embedded element. Finally, special rule (d) states that interior pointers to array elements can also be considered pointers to containing array itself. This is necessary because it is a common idiom to scan arrays using pointers rather than element indices. Example 3 (Structure Layout). Consider a pointer p to type (S) defined in Example 2. Then all (sub-)objects for p are described by the following table: Sub-obj. Offset Type p p+0 S p->f p+0 float p->t p+4 T p->t.a p+4 int[3] Sub-obj. Offset Type p->t.a[0] p+4 int p->t.a[1] p+8 int p->t.a[2] p+12 int p->t.s p+16 char * Consider (p+4), which points to the base of sub-objects (p->t), (p->t.a) and (p->t.a[0]) respectively, as well as pointing to the end of sub-object (p->f). Using the rules from Figure 2, we derive: L(S, 4) = { T, 0, int[3], 0, int, 0, float, 4 } Pointers to array elements can also be treated as pointers to the array itself (rule Figure 2(c)). Thus for (p+12): L(S, 12) = { int[3], 8, int, 0, int, 4 } corresponding to the array sub-object (p->t.a) via rule (c), the array element (p->t.a[2]) and the end of the previous array element (p->t.a[1]), respectively. The layout for compound objects is a flattened representation, meaning that (L) returns types for even nested objects, e.g., (p->t.a[2]) is three levels deep.

6 f( type * ptr ) { (a) BOUNDS b = type_ check ( ptr, type[]); } (b) (c) (d) (e) (f) (g) type * ptr = f( ); BOUNDS b = type_ check ( ptr, type[]); type * ptr = *( ptr2 ); BOUNDS b = type_ check ( ptr, type[]); type * ptr = ( type *) ptr2 ; BOUNDS b = type_ check ( ptr, type[]); type * ptr = &ptr2 -> field ; BOUNDS b = bounds_narrow (b2, ptr2 -> field ); type * ptr = ptr2 + k; BOUNDS b = b2; bounds_check (ptr, b); val = * ptr ; or * ptr = val ; or ptr escapes [7] Fig. 3: Type check instrumentation schema. Here b represents the bounds for ptr, and b2 for ptr2. A Special Type for Deallocated Memory: Deallocated objects are bound to a special type (FREE) that is defined to be distinct from all other C/C++ types. This reduces use-afterfree and double-free errors to type errors without any other special treatment. The (FREE) type has a special layout defined by rule Figure 2(h). Essentially, if p points to deallocated memory, then so does (p+k) for all k. Reuse-after-free is already handled for the case where the reallocated object has a different type to that of the dangling pointer. Deriving Sub-object Bounds: C/C++ types also encode bounds information. For example, the type (int[100]) is an array object of length 100, and accessing an element outside the range is an object bounds error. Hence, full dynamic type checking necessitates the enforcement of object bounds at runtime. To support this, we use (sub-)object bounds derived from dynamic type information. The basic idea is as follows: let p point to type T, then each pair U, δ L(T, k) corresponds to a sub-object of type U pointed to by q=(p+k). The integer δ represents the distance from q to the start of the sub-object, and is not necessarily zero (e.g., interior pointers in arrays). The sub-object bounds, represented as an address range, can be calculated using the following helper function: type bounds(q, U, δ ) = q δ.. q δ+sizeof(u) For example, let us consider the pointer q=(p+12) into the sub-object (p->t.a) corresponding to the pair int[3], 8 L(S, 12) from Example 3. The sub-object bounds are (p+4)..(p+16). That is, the sub-object (p->t.a) spans offsets 4 16 bytes. IV. DYNAMIC TYPE CHECK INSTRUMENTATION Dynamic type checking verifies that the dynamic type bound to an object matches the static type declared by the programmer. If there is a mis-match, a type error is logged. Dynamic type checking is achieved by an automatic program transformation that inserts explicit type checks, i.e., type check instrumentation. In our implementation, the transformation is implemented as an LLVM [30] compiler infrastructure pass (see Section VI-A for details). The basic type check instrumentation schema is shown in Figure 3. Essentially dynamic type checking is split into two parts: (i) input pointers (e.g. function parameters) are typed checked against the static type declared by the programmer; and (ii) derived pointers (e.g. pointers derived from arithmetic and field access) are bounds checked against the (sub-)object bounds. As a simplification, we assume that all static types are incomplete, i.e. of the form (type[]) with no bounds information. Complete static types can be decomposed into an incomplete type followed by a bounds check. The rules for input pointers are shown in Figure 3(a)- (d), and cover function parameters, function calls, memory reads and casts respectively. Rule (d) is also extended to other kinds of casts such as C++ s (static_cast). Input pointers are type checked against the incomplete static type, represented by (type[]), by a call to a (type_check) function that is supplied by the runtime system (to be defined later in Section V). The (type_check) function will log an error message if the pointer does not point to an object, or subobject of an larger object, of the complete type (type[n]) for some N. For example: int *p = new int[100]; BOUNDS b1 = type_check(p, int[]); BOUNDS b2 = type_check(p, float[]); The first type check passes but the second fails since (int) and (float) are distinct types. Assuming there is no error, the (type_check) function also return the bounds of the matching (sub-)object. Bounds are represented by a pair of pointers, e.g., b1=p..p+100 sizeof(int). The final step of dynamic type checking is to ensure that all actual usage of the (sub-)object are within the (sub-)object s bounds, i.e. those returned by the (type_check) function. For this, the rule Figure 3(g) inserts a bounds check, as represented by a call to a (bounds_check) function, before each memory read or write operation on a derived pointer. The (bounds_check) macro uses a standard definition: #define bounds_check(ptr, b) \ if (ptr < b.lb ptr > b.ub-sizeof(*ptr)) \ bounds_error(); Here (b.lb) and (b.ub) represent the lower and upper bounds respectively. Finally, rule Figure 3(e) represents bounds narrowing to the sub-object selected by field access. Narrowing can be defined as interval intersection: #define bounds_narrow(b, field) \ {max(b.lb, &field), \ min(b.ub, &field+sizeof(field))}

7 1 int length ( node *xs) { 2 BOUNDS b = type_ check ( xs, node[]); 3 int len = 0; 4 while ( xs!= NULL ) { 5 len ++; 6 node ** tmp = &xs -> next ; 7 b = bounds_narrow (b, xs -> next ); 8 bounds_check (tmp, b); 9 xs = * tmp ; 10 b = type_ check ( xs, node[]); 11 } 12 return len ; 13 } int sum ( int *a, int len ) { 16 BOUNDS b = type_ check (a, int[]); 17 int sum = 0; 18 for ( int i = 0; i < len ; i ++) { 19 int * tmp = a + i; 20 bounds_check (tmp, b); 21 sum += * tmp ; 22 } 23 return sum ; 24 } Fig. 4: Instrumented length and sum functions. The narrowing operation is similar to that of MPX [17]. Note that ordinary pointer arithmetic (e.g. array access) is not narrowed, see rule 3(f), since the resulting pointer may still refer to the containing array. Example 4 (Type Check Instrumentation). We consider two functions: length calculates the length of a linked-list and sum calculates the sum of an array. The instrumented versions of these functions are shown in Figure 4. The instrumentation in lines {2, 7, 8, 10, 16, 20} is highlighted. The original functions can be obtained by deleting these lines and eliminating temporary variables. For the length function, the input pointer(s) (xs) on lines {2, 10} are type checked against the static type (node[]) declared by the programmer. This means that (xs) must point to an object (or sub-object of a larger object) of type (node). Thus far, it has not been specified that (xs) points to the base of a (node) object (recall rule Figure 2(b), where (xs) may point the end of an object), so derived pointer (tmp) may be an overflow. To prevent this, the derived pointer (tmp) is bounds checked on line 8. Similarly, for the sum function, the input pointer (a) is type checked against the static type (int[]). Derived pointer (tmp) is bounds checked before access. By design, the instrumentation from Figure 3 enables full type and bounds checking of all input pointers. Essentially the type and bounds of input pointers are never trusted, meaning that even obscure type errors (e.g. mixing up function prototype definitions, multiple definitions for the same type, etc.) can be detected. It is possible to reduce instrumentation by trading protection for speed, e.g., by instrumenting type casts only, which is more similar to CAVER and TypeSan. We shall discuss such variants in Section VI. V. DYNAMIC TYPE CHECK RUNTIME We now discuss an implementation based on object and type meta data using low-fat pointers. Low-fat Pointers: Low-fat pointers [7], [8], [19] are a method for encoding bounds meta data (i.e. size and base of an allocation) within the native machine pointer representation itself. Low-fat pointers require sufficient pointer bit-width, and are feasible for 64-bit systems (e.g. the x86_64). To use lowfat pointers, objects must be allocated using a special lowfat memory allocator that ensures the returned pointers are suitably encoded. The low-fat heap allocator [7] provides replacement functions (lowfat_, lowfat_free, etc.) to all the stdlib memory allocation functions. The replacement functions have the same interface (i.e. function prototype) as the originals. We also implement low-fat pointers for both stack [8] and global objects. Several low-fat pointer encodings have been proposed. For this paper, we use the low-fat pointer encoding from [7], [8]. This encoding provides the following abstract operations: given a low-fat pointer p to (possibly the interior of) a low-fat allocated object O, then: size(p) = sizeof(o) base(p) = &O That is, given a low-fat pointer p, we can use the size and base operations to quickly determine bounds meta data (the size and base address) of the allocated object. For example, if str = lowfat (sizeof(char[32])) then size(str+10)=32 and base(str+10)=str, etc. Not all pointers will be low-fat pointers, and such pointers are referred to as legacy. For legacy pointer q, size(q) = SIZE MAX and base(q) = NULL. Legacy pointer support is essential as they can arise from non-instrumented code, e.g. libraries, or from Custom Memory Allocators (CMAs). The low-fat pointer encoding of [7], [8] works by (1) arranging objects into different memory regions based on allocation size, and (2) ensuring that all objects are allocation size-aligned. Thus, given a pointer p, we can quickly derive the allocation size (i.e. size(p)) based on which memory region p points to. Next the base(p) operation is implemented by rounding p down to the nearest size(p)-aligned address. Both the size(p) and base(p) operations are fast and constant time O(1). For more on low-fat pointers, see [7], [8]. Using Low-fat Pointers for Meta Data: Low-fat pointers were originally designed for allocation bounds checking using the meta data encoded in the pointer. That is, given pointer p, any access outside the range base(p)..base(p)+size(p) is a bounds error that will abort the program. In this paper, we do not use low-fat pointer directly for bounds checking. Rather, we repurpose low-fat pointers as a general method for binding meta data (in our case, type information) to objects. The basic idea is to store the meta data at the base of the object, and this meta data can be retrieved by any interior pointer by using the base(p) operation. We refer

8 META base(p) q f t.a[0] t.a[1] t.a[2] p t.a t.s Fig. 5: Object and object meta data layout. to this as object meta data, since it is associated with every allocated object. In the case of EffectiveSan, the object meta data will contain a representation of the dynamic type of the allocated object. Example 5 (Object Meta Data). An example of the combined object and meta data layout is shown in Figure 5. Here we assume the object is of type (S) from Example 2, and the layout of each sub-object (e.g. f, t, t.a[0], etc.) from Example 3 is also illustrated. The memory is divided into two main parts: space for the object meta data (META) and space for the allocated object itself. Given a pointer p to the object or subobject (e.g., p=&t.a[2] in Figure 5), the pointer to the object meta data can be retrieved by base(p). It is important to note that the meta data (META) is attached to the outermost object only, and is not attached to each sub-object. This means that the memory layout of C/C++ objects is unchanged. The (META) is analogous to an extra header that is transparent to the program. Under our scheme we assume that the (META) header is a type-integer pair storing (1) the (top-level) allocation type of the object, and (2) the object s allocation size (e.g., the parameter to ()). For sub-objects, EffectiveSan retrieves the type using the layout function (L) discussed below. To implement the object layout of Figure 5 we replace the standard () with the version shown in Figure 6 lines 1-6. Here, (typed_) is a thin wrapper around the underlying low-fat memory allocator (lowfat_). The wrapper functions takes a type (t) as an argument (similar to C++ s new operator). Here we treat types as first class objects of type (TYPE). In lines 2-3, the wrapper allocates space for both the object (of sizeof(t)) and the object meta data (of sizeof(type)) using the underlying low-fat allocator. Line 4 stores the allocated object s meta data at the base address. Line 5 returns the pointer to the start of the allocated object excluding the meta data. The (typed_) function essentially binds the allocation type (t) to the memory returned for the allocated object. In addition to heap objects, we similarly wrap low-fat stack [8] and global objects with meta data. Note that the (typed_) function returns the base address of allocated object (i.e., pointer q from Figure 5), and not a pointer to the object meta data. This means that, from the point-of-view of the program, the object meta data is hidden. The typed (t) function can thus be used as a drop-in t 1 void * typed_ ( size_t size, TYPE t) { 2 META * meta = lowfat_ ( 3 sizeof ( META )+ size ); 4 meta type = t; 5 meta size = size ; 6 return ( void *)( meta + 1); 7 } 8 9 BOUNDS type_ check ( void * ptr, TYPE s) { 10 META * meta = base(ptr); 11 if ( meta == NULL ) /* legacy pointer */ 12 return (0.. UINTPTR_ MAX ); 13 TYPE t = meta type ; 14 void * bptr = ( void *)( meta + 1); 15 BOUNDS b = ( bptr.. bptr + meta size ); 16 ssize_t k = ptr - bptr ; 17 for ( SUBOBJECT o : L(t, k)) 18 if ( o. type == s) { 19 BOUNDS c = type_ bounds ( ptr, o); 20 return bounds_ narrow (b, c); 21 } 22 type_error (); /* report error */ 23 return (0.. UINTPTR_ MAX ); 24 } Fig. 6: Simplified definitions for the typed and type check functions. replacement for standard (sizeof(t)) or (new t) without any further modification of the program. Memory deallocation is handled by a (typed_free) replacement for stdlib (free). The replacement function overwrites the object meta data with the special type (FREE) defined in Section III before returning the memory to the underlying low-fat allocator. The low-fat allocator has also been modified to ensure that the meta data will be preserved until the memory is reallocated. The (typed_free) function also detects double free errors, i.e., when the freed object already has type (FREE). A. Type Checking with Meta Data By replacing the standard allocators, all objects are bound to the allocation type that can be retrieved using the base operation. Combined with the layout function (L), the (type_check) function can be implemented as shown in Figure 6 lines Here (type_check) function has three basic steps: 1) Get the allocation type (t), bounds (b) and object base pointer (bptr) (lines 10-15); 2) Calculate the sub-object offset (k) (lines 16); 3) Scan all sub-objects at offset (k) as returned by the layout function (L) (lines 17-22). Return the bounds of the subobject that matches the declared static type (s) narrowed to the allocation bounds, else raise a type error (line 22) if no match exists. For legacy pointers, the (type_check) function always succeeds and returns wide bounds (lines 11-12) for compatibility reasons. Likewise, wide bounds are returned after a type error has been logged.

9 Example 6 (Type Check). Let p point to an allocated object of type (S) from Example 3. Assuming that sizeof(meta)=16, then the type will be stored as object meta data at address base(p) = (p sizeof(meta)) = (p 16). Consider the interior pointer (q=p+12). Then type check(q, int[]) proceeds as follows: 1) t = (META *)base(q)->type = S 2) k = q base(q)+sizeof(meta) = 12 3) L(S, 12) = { int[3], 8, int, 0, int, 4 } Type (int[]) matches the first sub-object int[3], 8, and the bounds p+4..p+16 are returned. In contrast, type check(q, double[]) fails since there is no matching sub-object for type (double[]). As illustrated in Example 6, it is sometimes possible to have multiple matching sub-objects, i.e., int[3], 8, int, 0, and int, 4 all match type (int[]). In such cases, the following tie-breaking rules are used: 1) sub-objects with wider bounds are preferred; and 2) pointers-to-the-end-of-sub-objects (see Figure 2(b)) are matched last. Thus, the sub-object bounds for (int[3]) is returned. Note that our approach for deriving (sub-)object bounds differs from that of other systems such as SoftBound [5] and MPX [17]. These system track (sub-)object bounds by passing meta data whereas EffectiveSan always (re)calculates bounds using the dynamic type. Explicit bounds tracking may allow for narrower bounds for some cases of type ambiguity, e.g., when the bounds for (int) is intended. In order to pass narrowed pointer arguments, SoftBound necessitates changing the Application Binary Interface (ABI). EffectiveSan s approach achieves very good binary compatibility since the underlying ABI is not changed. Furthermore, Softbound/MPX require meta data updates when a pointer is written to memory, a problem for multi-threaded applications [22]. Our approach requires no such updates, allowing for better multi-threading support. B. Layout and Type Meta Data Implementation The object meta data is a representation of the effective type of the corresponding object. EffectiveSan represents incomplete types (i.e., T []) as pointers to a type meta data structure containing useful information, such as the type s size (i.e., sizeof(t )), name (for reflection) and layout information. The type meta data structure is defined once per type. To reduce overheads, the current EffectiveSan implementation uses a layout hash table representation. The basic idea is as follows: for all possible type (T), sub-object type (S) and sub-object offset (k) combinations: S, δ L(T, k) 0 k sizeof(t) the layout hash table will contain a corresponding entry: T S k δ..δ + sizeof(s) which maps the (T S k) triple to the corresponding subobject bounds. In order to keep the hash table finite, only entries corresponding to offsets 0 k sizeof(t) are stored. Otherwise, for entries outside this range, the offset is first normalized (k:=k mod sizeof(t)). If multiple matching subobjects exist for the same (S) the above tie-breaking rules apply. Using this implementation, the sub-object matching of Figure 6 lines can be efficiently implemented as an O(1) hash table lookup. Our approach can also be used to implement standard C/C++ language features, including: 1) Structure types with flexible array members; and 2) Allowable coercions between types under the C/C++ standards ([10] 6.2.7). Structures with Flexible Array Members (FAMs) have definitions of the form 3 (struct T { ; U member[];}), where (member) of type (U[]) is the FAM. The size of the FAM is determined by the object s allocation size. EffectiveSan implements structures with FAM by treating the type layout as that equivalent to (struct T { ; U member[1];}) and by using a different offset normalization for k>sizeof(t): k := ((k sizeof(t)) mod sizeof(u)) + sizeof(t) The final feature is automatic type coercion between different types. The C/C++ standards allow objects of type (char[]) to be coerced to any other type. Essentially, we wish to treat objects of type (char[]) as untyped memory, that can be accessed as any type without causing errors. To implement this, EffectiveSan uses two layout hash table lookups instead of one: if the first lookup (T S k) fails, next (T char k) is tried, representing the coercion from (char[]) to (S[]). Only if the second lookup fails then a type error occurs. This idea can be generalized to other kinds of useful coercions, such as (void *) to (S *), etc. Our approach is configurable meaning that strict (no coercions) or loose (many allowable coercions) type checking can be implemented. Type meta data, including the layout hash table, is automatically generated using a compiler pass, once per compiled module. Each type meta data object is declared as a weak symbol meaning that only one copy will be included in the final executable. The type meta data structures are constant (read-only) and thus cannot be modified at runtime. Related Work: Both CAVER and TypeSan similarly map offsets to sub-object type information, but use a different meta data representation. For example, TypeSan stores sub-object type information in an ordered vector that is scanned linearly to find matching sub-objects. The worst-case behavior of this approach is O(N), where N is the number of embedded or inherited class sub-objects, although in practice N is typically small. In contrast, EffectiveSan allows any type (including int, float, pointers, etc.) to be a sub-objects, meaning that a vector representation does not scale. In addition to O(1) sub-object matching, EffectiveSan improves on CAVER and TypeSan in several ways, including: (1) an encoding of (sub-)object bounds information necessary for bounds-checking, (2) support for structures with flexible array members, and (3) support for coercions. Such a detailed 3 Other forms may also be possible.

10 layout information is necessary for comprehensive detection of type errors beyond C++-style static downcasts, as well as the detection of (sub-)object bounds errors. In contrast, both CAVER and TypeSan assume the program is memory safe rather than explicitly enforce it. VI. EXPERIMENTS In this section, we experimentally evaluate a prototype of EffectiveSan on several benchmarks to gauge the effectiveness and performance of our approach. A. Implementation and Setup We have implemented a prototype version of EffectiveSan using the LLVM compiler infrastructure [30] version for the x86_64 architecture. EffectiveSan instrumentation is a two step process. In the first step, a modified clang front-end generates a type annotated LLVM [30] Intermediate Representation (IR) of the C/C++ program. Here, type annotations are associated with each LLVM IR instruction/global/function using the standard DWARF [31] debug format (similar to that generated by the (-g) command-line option). In the second step, the type annotated IR is instrumented using the schema from (Figure 3). This step also replaces all heap/stack/global allocations with the typed variants described in Section V, and generates the runtime type meta data described in Section V-B. Our implementation fully supports all types described in Section III, including fundamental types, pointers, structures, classes, unions, etc., as well as standard C/C++ features such as inheritance, virtual inheritance, templates, multi-threading, basic coercions between (T ) to/from (char[]) and (T *) to/from (char *), and objects with flexible array members. In addition to the Figure 3 schema, our EffectiveSan prototype supports basic optimizations such as: removing dynamic type checks that can never fail (e.g., C++ upcasts) and removing redundant bounds narrowing operations. For speed, all instrumentation except (type_check) is inlined. By default, EffectiveSan logs type/bounds errors (sans duplicates) without stopping the program. EffectiveSan can also be configured to abort the program after the first error. For our experiments we use the default logging mode. Limitations: The EffectiveSan prototype may not detect all possible type and memory error bug. For example, Effective- San can only partially protect legacy pointers in the form of bounds narrowing. For non-legacy pointers, EffectiveSan must correctly bind the allocation type with each allocated object. For global/stack objects as well as objects allocated using C++ s new, the allocation type is simply the declared type and is unambiguous. However, for heap objects allocated by () we use a simple program analysis (see Example 2). The allocation type can also be obfuscated by Custom Memory Allocators (CMAs), a problem EffectiveSan shares with related tools such as CAVER and TypeSan. For our experiments we use a version of SPEC2006 with CMAs removed where possible, see Appendix B for more details. Another limitation of EffectiveSan relates to sub-object matching. By default, EffectiveSan heuristically chooses the SPEC2006 kilo- checks (billions) #Issues- Bench. sloc #Type #Bound found perlbench bzip gcc mcf gobmk hmmer sjeng libquantum h264ref omnetpp astar xalancbmk milc namd dealii soplex povray lbm sphinx Totals (all) Totals (C++) Fig. 7: Summary of the SPEC2006 benchmarks. C++ benchmarks are marked with a (++), the rest in C. We bucket issues by type and offset. Benchmarks with issues are highlighted. sub-object with the widest bounds (see Section V-A tiebreaking rules), which may differ from the intended bounds. However, this is a reasonable choice for code compatibility. MPX and SoftBound sometimes use narrower bounds but are known to have difficulty compiling all of SPEC2006 [5], [22]. B. Effectiveness To test the effectiveness of EffectiveSan, we use the full SPEC2006 [12] benchmarks as well the Firefox web browser version 52 (ESR). The SPEC2006 benchmarks ( 1.1million sloc) comprise several integer and floating point C/C++ programs. For SPEC2006 we use the standard workloads. For Firefox ( 7.9million sloc) we use standard web benchmarks. The results for EffectiveSan on SPEC2006 is summarized in Figure 7. Here (kilo-sloc) represents the source lines of code (in thousands), (#Type) the number of type checks, (#Bounds) the number of bounds checks, (#Issues-found) the number of issues logged by EffectiveSan. We bucket issues by type and offset to prevent the same issue from being reported at multiple different program points. Of the 2.2 trillion type checks in Figure 7, only 1.1% were performed on legacy (non-fat) pointers, meaning that EffectiveSan achieves high coverage. 1) SPEC2006 Issues Found: For SPEC2006 our Effective- San prototype detects several issues (see Figure 7): A use-after-free bug in perlbench (reported in [6]). 4 A bounds overflow error in h264ref (reported in [6]). Three sub-object bounds overflow errors in gcc, h264ref and soplex. 5 4 Only applicable to the SPEC2006 test workload. 5 Some are also found by MPX, see the technical report [22].

A program execution is memory safe so long as memory access errors never occur:

A program execution is memory safe so long as memory access errors never occur: A program execution is memory safe so long as memory access errors never occur: Buffer overflows, null pointer dereference, use after free, use of uninitialized memory, illegal free Memory safety categories

More information

Stack Bounds Protection with Low Fat Pointers

Stack Bounds Protection with Low Fat Pointers Stack Bounds Protection with Low Fat Pointers Gregory J. Duck, Roland H.C. Yap, and Lorenzo Cavallaro NDSS 2017 Overview Heap Bounds Protection with Low Fat Pointers, CC 2016 Stack New method for detecting

More information

HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer

HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer Motivation C++ is a popular programming language Google Chrome, Firefox,

More information

CS527 Software Security

CS527 Software Security Security Policies Purdue University, Spring 2018 Security Policies A policy is a deliberate system of principles to guide decisions and achieve rational outcomes. A policy is a statement of intent, and

More information

CS-527 Software Security

CS-527 Software Security CS-527 Software Security Memory Safety Asst. Prof. Mathias Payer Department of Computer Science Purdue University TA: Kyriakos Ispoglou https://nebelwelt.net/teaching/17-527-softsec/ Spring 2017 Eternal

More information

CCured. One-Slide Summary. Lecture Outline. Type-Safe Retrofitting of C Programs

CCured. One-Slide Summary. Lecture Outline. Type-Safe Retrofitting of C Programs CCured Type-Safe Retrofitting of C Programs [Necula, McPeak,, Weimer, Condit, Harren] #1 One-Slide Summary CCured enforces memory safety and type safety in legacy C programs. CCured analyzes how you use

More information

Identifying Memory Corruption Bugs with Compiler Instrumentations. 이병영 ( 조지아공과대학교

Identifying Memory Corruption Bugs with Compiler Instrumentations. 이병영 ( 조지아공과대학교 Identifying Memory Corruption Bugs with Compiler Instrumentations 이병영 ( 조지아공과대학교 ) blee@gatech.edu @POC2014 How to find bugs Source code auditing Fuzzing Source Code Auditing Focusing on specific vulnerability

More information

Fast, precise dynamic checking of types and bounds in C

Fast, precise dynamic checking of types and bounds in C Fast, precise dynamic checking of types and bounds in C Stephen Kell stephen.kell@cl.cam.ac.uk Computer Laboratory University of Cambridge p.1 Tool wanted if (obj >type == OBJ COMMIT) { if (process commit(walker,

More information

Cling: A Memory Allocator to Mitigate Dangling Pointers. Periklis Akritidis

Cling: A Memory Allocator to Mitigate Dangling Pointers. Periklis Akritidis Cling: A Memory Allocator to Mitigate Dangling Pointers Periklis Akritidis --2010 Use-after-free Vulnerabilities Accessing Memory Through Dangling Pointers Techniques : Heap Spraying, Feng Shui Manual

More information

CPSC 3740 Programming Languages University of Lethbridge. Data Types

CPSC 3740 Programming Languages University of Lethbridge. Data Types Data Types A data type defines a collection of data values and a set of predefined operations on those values Some languages allow user to define additional types Useful for error detection through type

More information

CS201 - Introduction to Programming Glossary By

CS201 - Introduction to Programming Glossary By CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with

More information

Short Notes of CS201

Short Notes of CS201 #includes: Short Notes of CS201 The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with < and > if the file is a system

More information

TypeSan: Practical Type Confusion Detection

TypeSan: Practical Type Confusion Detection TypeSan: Practical Type Confusion Detection Hui Peng peng124@purdue.edu Istvan Haller istvan.haller@gmail.com Herbert Bos herbertb@few.vu.nl Yuseok Jeon jeon41@purdue.edu Mathias Payer mathias.payer@nebelwelt.net

More information

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Projects 1 Information flow analysis for mobile applications 2 2 Machine-learning-guide typestate analysis for UAF vulnerabilities 3 3 Preventing

More information

The Typed Racket Guide

The Typed Racket Guide The Typed Racket Guide Version 5.3.6 Sam Tobin-Hochstadt and Vincent St-Amour August 9, 2013 Typed Racket is a family of languages, each of which enforce

More information

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1 CSE 401 Compilers Static Semantics Hal Perkins Winter 2009 2/3/2009 2002-09 Hal Perkins & UW CSE I-1 Agenda Static semantics Types Symbol tables General ideas for now; details later for MiniJava project

More information

Typed Racket: Racket with Static Types

Typed Racket: Racket with Static Types Typed Racket: Racket with Static Types Version 5.0.2 Sam Tobin-Hochstadt November 6, 2010 Typed Racket is a family of languages, each of which enforce that programs written in the language obey a type

More information

C-LANGUAGE CURRICULAM

C-LANGUAGE CURRICULAM C-LANGUAGE CURRICULAM Duration: 2 Months. 1. Introducing C 1.1 History of C Origin Standardization C-Based Languages 1.2 Strengths and Weaknesses Of C Strengths Weaknesses Effective Use of C 2. C Fundamentals

More information

Review of the C Programming Language for Principles of Operating Systems

Review of the C Programming Language for Principles of Operating Systems Review of the C Programming Language for Principles of Operating Systems Prof. James L. Frankel Harvard University Version of 7:26 PM 4-Sep-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights

More information

Tokens, Expressions and Control Structures

Tokens, Expressions and Control Structures 3 Tokens, Expressions and Control Structures Tokens Keywords Identifiers Data types User-defined types Derived types Symbolic constants Declaration of variables Initialization Reference variables Type

More information

Heap Bounds Protection with Low Fat Pointers

Heap Bounds Protection with Low Fat Pointers Heap Bounds Protection with Low Fat Pointers Gregory J. Duck Roland H. C. Yap Department of Computer Science National University of Singapore, Singapore {gregory,ryap}@comp.nus.edu.sg Abstract Heap buffer

More information

Programming Languages Third Edition. Chapter 7 Basic Semantics

Programming Languages Third Edition. Chapter 7 Basic Semantics Programming Languages Third Edition Chapter 7 Basic Semantics Objectives Understand attributes, binding, and semantic functions Understand declarations, blocks, and scope Learn how to construct a symbol

More information

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11 CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11 CS 536 Spring 2015 1 Handling Overloaded Declarations Two approaches are popular: 1. Create a single symbol table

More information

in memory: an evolution of attacks Mathias Payer Purdue University

in memory: an evolution of attacks Mathias Payer Purdue University in memory: an evolution of attacks Mathias Payer Purdue University Images (c) MGM, WarGames, 1983 Memory attacks: an ongoing war Vulnerability classes according to CVE Memory

More information

High System-Code Security with Low Overhead

High System-Code Security with Low Overhead High System-Code Security with Low Overhead Jonas Wagner, Volodymyr Kuznetsov, George Candea, and Johannes Kinder École Polytechnique Fédérale de Lausanne Royal Holloway, University of London High System-Code

More information

Introduction to C++ Introduction. Structure of a C++ Program. Structure of a C++ Program. C++ widely-used general-purpose programming language

Introduction to C++ Introduction. Structure of a C++ Program. Structure of a C++ Program. C++ widely-used general-purpose programming language Introduction C++ widely-used general-purpose programming language procedural and object-oriented support strong support created by Bjarne Stroustrup starting in 1979 based on C Introduction to C++ also

More information

HA2lloc: Hardware-Assisted Secure Allocator

HA2lloc: Hardware-Assisted Secure Allocator HA2lloc: Hardware-Assisted Secure Allocator Orlando Arias, Dean Sullivan, Yier Jin {oarias,dean.sullivan}@knights.ucf.edu yier.jin@ece.ufl.edu University of Central Florida University of Florida June 25,

More information

Advanced Systems Security: New Threats

Advanced Systems Security: New Threats Systems and Internet Infrastructure Security Network and Security Research Center Department of Computer Science and Engineering Pennsylvania State University, University Park PA Advanced Systems Security:

More information

September 10,

September 10, September 10, 2013 1 Bjarne Stroustrup, AT&T Bell Labs, early 80s cfront original C++ to C translator Difficult to debug Potentially inefficient Many native compilers exist today C++ is mostly upward compatible

More information

Introduction to C++ with content from

Introduction to C++ with content from Introduction to C++ with content from www.cplusplus.com 2 Introduction C++ widely-used general-purpose programming language procedural and object-oriented support strong support created by Bjarne Stroustrup

More information

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result. Every program uses data, either explicitly or implicitly to arrive at a result. Data in a program is collected into data structures, and is manipulated by algorithms. Algorithms + Data Structures = Programs

More information

SoK: Eternal War in Memory Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song In: Oakland 14

SoK: Eternal War in Memory Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song In: Oakland 14 SoK: Eternal War in Memory Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song In: Oakland 14 Presenter: Mathias Payer, EPFL http://hexhive.github.io 1 Memory attacks: an ongoing war Vulnerability classes

More information

Review of the C Programming Language

Review of the C Programming Language Review of the C Programming Language Prof. James L. Frankel Harvard University Version of 11:55 AM 22-Apr-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights reserved. Reference Manual for the

More information

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking CS 430 Spring 2015 Mike Lam, Professor Data Types and Type Checking Type Systems Type system Rules about valid types, type compatibility, and how data values can be used Benefits of a robust type system

More information

Cpt S 122 Data Structures. Course Review Midterm Exam # 2

Cpt S 122 Data Structures. Course Review Midterm Exam # 2 Cpt S 122 Data Structures Course Review Midterm Exam # 2 Nirmalya Roy School of Electrical Engineering and Computer Science Washington State University Midterm Exam 2 When: Monday (11/05) 12:10 pm -1pm

More information

Symbol Tables Symbol Table: In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier in a program's source code is

More information

A brief introduction to C programming for Java programmers

A brief introduction to C programming for Java programmers A brief introduction to C programming for Java programmers Sven Gestegård Robertz September 2017 There are many similarities between Java and C. The syntax in Java is basically

More information

Stack Bounds Protection with Low Fat Pointers

Stack Bounds Protection with Low Fat Pointers Stack Bounds Protection with Low Fat Pointers Gregory J. Duck and Roland H. C. Yap Department of Computer Science National University of Singapore {gregory, ryap}@comp.nus.edu.sg Lorenzo Cavallaro Information

More information

Preventing Use-after-free with Dangling Pointers Nullification

Preventing Use-after-free with Dangling Pointers Nullification Preventing Use-after-free with Dangling Pointers Nullification Byoungyoung Lee, Chengyu Song, Yeongjin Jang Tielei Wang, Taesoo Kim, Long Lu, Wenke Lee Georgia Institute of Technology Stony Brook University

More information

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline CS 0 Lecture 8 Chapter 5 Louden Outline The symbol table Static scoping vs dynamic scoping Symbol table Dictionary associates names to attributes In general: hash tables, tree and lists (assignment ) can

More information

A Fast Instruction Set Simulator for RISC-V

A Fast Instruction Set Simulator for RISC-V A Fast Instruction Set Simulator for RISC-V Maxim.Maslov@esperantotech.com Vadim.Gimpelson@esperantotech.com Nikita.Voronov@esperantotech.com Dave.Ditzel@esperantotech.com Esperanto Technologies, Inc.

More information

Lightweight Memory Tracing

Lightweight Memory Tracing Lightweight Memory Tracing Mathias Payer*, Enrico Kravina, Thomas Gross Department of Computer Science ETH Zürich, Switzerland * now at UC Berkeley Memory Tracing via Memlets Execute code (memlets) for

More information

Chapter 5 Names, Binding, Type Checking and Scopes

Chapter 5 Names, Binding, Type Checking and Scopes Chapter 5 Names, Binding, Type Checking and Scopes Names - We discuss all user-defined names here - Design issues for names: -Maximum length? - Are connector characters allowed? - Are names case sensitive?

More information

5.Coding for 64-Bit Programs

5.Coding for 64-Bit Programs Chapter 5 5.Coding for 64-Bit Programs This chapter provides information about ways to write/update your code so that you can take advantage of the Silicon Graphics implementation of the IRIX 64-bit operating

More information

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms SE352b Software Engineering Design Tools W3: Programming Paradigms Feb. 3, 2005 SE352b, ECE,UWO, Hamada Ghenniwa SE352b: Roadmap CASE Tools: Introduction System Programming Tools Programming Paradigms

More information

Preface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS

Preface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS Contents Preface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS 1.1. INTRODUCTION TO COMPUTERS... 1 1.2. HISTORY OF C & C++... 3 1.3. DESIGN, DEVELOPMENT AND EXECUTION OF A PROGRAM... 3 1.4 TESTING OF PROGRAMS...

More information

Lecture Notes on Generic Data Structures

Lecture Notes on Generic Data Structures Lecture Notes on Generic Data Structures 15-122: Principles of Imperative Computation Frank Pfenning, Penny Anderson Lecture 21 November 7, 2013 1 Introduction Using void* to represent pointers to values

More information

Chapter 10. Improving the Runtime Type Checker Type-Flow Analysis

Chapter 10. Improving the Runtime Type Checker Type-Flow Analysis 122 Chapter 10 Improving the Runtime Type Checker The runtime overhead of the unoptimized RTC is quite high, because it instruments every use of a memory location in the program and tags every user-defined

More information

C++ Programming: Polymorphism

C++ Programming: Polymorphism C++ Programming: Polymorphism 2018 년도 2 학기 Instructor: Young-guk Ha Dept. of Computer Science & Engineering Contents Run-time binding in C++ Abstract base classes Run-time type identification 2 Function

More information

Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization

Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization Anton Kuijsten Andrew S. Tanenbaum Vrije Universiteit Amsterdam 21st USENIX Security Symposium Bellevue,

More information

Increases Program Structure which results in greater reliability. Polymorphism

Increases Program Structure which results in greater reliability. Polymorphism UNIT 4 C++ Inheritance What is Inheritance? Inheritance is the process by which new classes called derived classes are created from existing classes called base classes. The derived classes have all the

More information

Pointers (continued), arrays and strings

Pointers (continued), arrays and strings Pointers (continued), arrays and strings 1 Last week We have seen pointers, e.g. of type char *p with the operators * and & These are tricky to understand, unless you draw pictures 2 Pointer arithmetic

More information

In Java we have the keyword null, which is the value of an uninitialized reference type

In Java we have the keyword null, which is the value of an uninitialized reference type + More on Pointers + Null pointers In Java we have the keyword null, which is the value of an uninitialized reference type In C we sometimes use NULL, but its just a macro for the integer 0 Pointers are

More information

Appendix. Grammar. A.1 Introduction. A.2 Keywords. There is no worse danger for a teacher than to teach words instead of things.

Appendix. Grammar. A.1 Introduction. A.2 Keywords. There is no worse danger for a teacher than to teach words instead of things. A Appendix Grammar There is no worse danger for a teacher than to teach words instead of things. Marc Block Introduction keywords lexical conventions programs expressions statements declarations declarators

More information

Types and Type Inference

Types and Type Inference CS 242 2012 Types and Type Inference Notes modified from John Mitchell and Kathleen Fisher Reading: Concepts in Programming Languages, Revised Chapter 6 - handout on Web!! Outline General discussion of

More information

Lecture Notes on Generic Data Structures

Lecture Notes on Generic Data Structures Lecture Notes on Generic Data Structures 15-122: Principles of Imperative Computation Frank Pfenning Lecture 22 November 15, 2012 1 Introduction Using void* to represent pointers to values of arbitrary

More information

Absolute C++ Walter Savitch

Absolute C++ Walter Savitch Absolute C++ sixth edition Walter Savitch Global edition This page intentionally left blank Absolute C++, Global Edition Cover Title Page Copyright Page Preface Acknowledgments Brief Contents Contents

More information

CSCI 171 Chapter Outlines

CSCI 171 Chapter Outlines Contents CSCI 171 Chapter 1 Overview... 2 CSCI 171 Chapter 2 Programming Components... 3 CSCI 171 Chapter 3 (Sections 1 4) Selection Structures... 5 CSCI 171 Chapter 3 (Sections 5 & 6) Iteration Structures

More information

SoK: Eternal War in Memory

SoK: Eternal War in Memory SoK: Eternal War in Memory László Szekeres, Mathias Payer, Tao Wei, Dawn Song Presenter: Wajih 11/7/2017 Some slides are taken from original S&P presentation 1 What is SoK paper? Systematization of Knowledge

More information

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019 Hacking in C Pointers Radboud University, Nijmegen, The Netherlands Spring 2019 Allocation of multiple variables Consider the program main(){ char x; int i; short s; char y;... } What will the layout of

More information

Object Model. Object Oriented Programming Spring 2015

Object Model. Object Oriented Programming Spring 2015 Object Model Object Oriented Programming 236703 Spring 2015 Class Representation In Memory A class is an abstract entity, so why should it be represented in the runtime environment? Answer #1: Dynamic

More information

COS 320. Compiling Techniques

COS 320. Compiling Techniques Topic 5: Types COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 Types: potential benefits (I) 2 For programmers: help to eliminate common programming mistakes, particularly

More information

Inheritance, Polymorphism and the Object Memory Model

Inheritance, Polymorphism and the Object Memory Model Inheritance, Polymorphism and the Object Memory Model 1 how objects are stored in memory at runtime? compiler - operations such as access to a member of an object are compiled runtime - implementation

More information

SoftBound: Highly Compatible and Complete Spatial Safety for C

SoftBound: Highly Compatible and Complete Spatial Safety for C SoftBound: Highly Compatible and Complete Spatial Safety for C Santosh Nagarakatte, Jianzhou Zhao, Milo Martin, Steve Zdancewic University of Pennsylvania {santoshn, jianzhou, milom, stevez}@cis.upenn.edu

More information

Chapter 5. Names, Bindings, and Scopes

Chapter 5. Names, Bindings, and Scopes Chapter 5 Names, Bindings, and Scopes Chapter 5 Topics Introduction Names Variables The Concept of Binding Scope Scope and Lifetime Referencing Environments Named Constants 1-2 Introduction Imperative

More information

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics PIC 10A Pointers, Arrays, and Dynamic Memory Allocation Ernest Ryu UCLA Mathematics Pointers A variable is stored somewhere in memory. The address-of operator & returns the memory address of the variable.

More information

Pointers (continued), arrays and strings

Pointers (continued), arrays and strings Pointers (continued), arrays and strings 1 Last week We have seen pointers, e.g. of type char *p with the operators * and & These are tricky to understand, unless you draw pictures 2 Pointer arithmetic

More information

Runtime Defenses against Memory Corruption

Runtime Defenses against Memory Corruption CS 380S Runtime Defenses against Memory Corruption Vitaly Shmatikov slide 1 Reading Assignment Cowan et al. Buffer overflows: Attacks and defenses for the vulnerability of the decade (DISCEX 2000). Avijit,

More information

Software security, secure programming

Software security, secure programming Software security, secure programming Lecture 4: Protecting your code against software vulnerabilities? (overview) Master on Cybersecurity Master MoSiG Academic Year 2017-2018 Preamble Bad news several

More information

SoK: Sanitizing for Security

SoK: Sanitizing for Security SoK: Sanitizing for Security Dokyung Song, Julian Lettner, Prabhu Rajasekaran, Yeoul Na, Stijn Volckaert, Per Larsen, Michael Franz University of California, Irvine {dokyungs,jlettner,rajasekp,yeouln,stijnv,perl,franz}@uci.edu

More information

COMP322 - Introduction to C++ Lecture 02 - Basics of C++

COMP322 - Introduction to C++ Lecture 02 - Basics of C++ COMP322 - Introduction to C++ Lecture 02 - Basics of C++ School of Computer Science 16 January 2012 C++ basics - Arithmetic operators Where possible, C++ will automatically convert among the basic types.

More information

Resource-Conscious Scheduling for Energy Efficiency on Multicore Processors

Resource-Conscious Scheduling for Energy Efficiency on Multicore Processors Resource-Conscious Scheduling for Energy Efficiency on Andreas Merkel, Jan Stoess, Frank Bellosa System Architecture Group KIT The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe

More information

Data Storage. August 9, Indiana University. Geoffrey Brown, Bryce Himebaugh 2015 August 9, / 19

Data Storage. August 9, Indiana University. Geoffrey Brown, Bryce Himebaugh 2015 August 9, / 19 Data Storage Geoffrey Brown Bryce Himebaugh Indiana University August 9, 2016 Geoffrey Brown, Bryce Himebaugh 2015 August 9, 2016 1 / 19 Outline Bits, Bytes, Words Word Size Byte Addressable Memory Byte

More information

Subprograms. Copyright 2015 Pearson. All rights reserved. 1-1

Subprograms. Copyright 2015 Pearson. All rights reserved. 1-1 Subprograms Introduction Fundamentals of Subprograms Design Issues for Subprograms Local Referencing Environments Parameter-Passing Methods Parameters That Are Subprograms Calling Subprograms Indirectly

More information

Chapter 2. Procedural Programming

Chapter 2. Procedural Programming Chapter 2 Procedural Programming 2: Preview Basic concepts that are similar in both Java and C++, including: standard data types control structures I/O functions Dynamic memory management, and some basic

More information

Protecting Dynamic Code by Modular Control-Flow Integrity

Protecting Dynamic Code by Modular Control-Flow Integrity Protecting Dynamic Code by Modular Control-Flow Integrity Gang Tan Department of CSE, Penn State Univ. At International Workshop on Modularity Across the System Stack (MASS) Mar 14 th, 2016, Malaga, Spain

More information

Software Vulnerabilities August 31, 2011 / CS261 Computer Security

Software Vulnerabilities August 31, 2011 / CS261 Computer Security Software Vulnerabilities August 31, 2011 / CS261 Computer Security Software Vulnerabilities...1 Review paper discussion...2 Trampolining...2 Heap smashing...2 malloc/free...2 Double freeing...4 Defenses...5

More information

HideM: Protecting the Contents of Userspace Memory in the Face of Disclosure Vulnerabilities

HideM: Protecting the Contents of Userspace Memory in the Face of Disclosure Vulnerabilities HideM: Protecting the Contents of Userspace Memory in the Face of Disclosure Vulnerabilities Jason Gionta, William Enck, Peng Ning 1 JIT-ROP 2 Two Attack Categories Injection Attacks Code Integrity Data

More information

Object-Oriented Programming for Scientific Computing

Object-Oriented Programming for Scientific Computing Object-Oriented Programming for Scientific Computing Dynamic Memory Management Ole Klein Interdisciplinary Center for Scientific Computing Heidelberg University ole.klein@iwr.uni-heidelberg.de 2. Mai 2017

More information

Black Hat Webcast Series. C/C++ AppSec in 2014

Black Hat Webcast Series. C/C++ AppSec in 2014 Black Hat Webcast Series C/C++ AppSec in 2014 Who Am I Chris Rohlf Leaf SR (Security Research) - Founder / Consultant BlackHat Speaker { 2009, 2011, 2012 } BlackHat Review Board Member http://leafsr.com

More information

Ironclad C++ A Library-Augmented Type-Safe Subset of C++

Ironclad C++ A Library-Augmented Type-Safe Subset of C++ Ironclad C++ A Library-Augmented Type-Safe Subset of C++ Christian DeLozier, Richard Eisenberg, Peter-Michael Osera, Santosh Nagarakatte*, Milo M. K. Martin, and Steve Zdancewic October 30, 2013 University

More information

Baggy bounds with LLVM

Baggy bounds with LLVM Baggy bounds with LLVM Anton Anastasov Chirantan Ekbote Travis Hance 6.858 Project Final Report 1 Introduction Buffer overflows are a well-known security problem; a simple buffer-overflow bug can often

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Winter 2017 Lecture 4a Andrew Tolmach Portland State University 1994-2017 Semantics and Erroneous Programs Important part of language specification is distinguishing valid from

More information

CS201 Some Important Definitions

CS201 Some Important Definitions CS201 Some Important Definitions For Viva Preparation 1. What is a program? A program is a precise sequence of steps to solve a particular problem. 2. What is a class? We write a C++ program using data

More information

mmap: Memory Mapped Files in R

mmap: Memory Mapped Files in R mmap: Memory Mapped Files in R Jeffrey A. Ryan August 1, 2011 Abstract The mmap package offers a cross-platform interface for R to information that resides on disk. As dataset grow, the finite limits of

More information

The New C Standard (Excerpted material)

The New C Standard (Excerpted material) The New C Standard (Excerpted material) An Economic and Cultural Derek M. Jones derek@knosof.co.uk Copyright 2002-2008 Derek M. Jones. All rights reserved. 1456 6.7.2.3 Tags 6.7.2.3 Tags type contents

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Fall 2016 Lecture 3a Andrew Tolmach Portland State University 1994-2016 Formal Semantics Goal: rigorous and unambiguous definition in terms of a wellunderstood formalism (e.g.

More information

Rvalue References as Funny Lvalues

Rvalue References as Funny Lvalues I. Background Document: Author: Date: 2009-11-09 Revision: 1 PL22.16/09-0200 = WG21 N3010 William M. Miller Edison Design Group Rvalue References as Funny Lvalues Rvalue references were introduced into

More information

Putting the Checks into Checked C. Archibald Samuel Elliott Quals - 31st Oct 2017

Putting the Checks into Checked C. Archibald Samuel Elliott Quals - 31st Oct 2017 Putting the Checks into Checked C Archibald Samuel Elliott Quals - 31st Oct 2017 Added Runtime Bounds Checks to the Checked C Compiler 2 C Extension for Spatial Memory Safety Added Runtime Bounds Checks

More information

Programming Languages Third Edition. Chapter 10 Control II Procedures and Environments

Programming Languages Third Edition. Chapter 10 Control II Procedures and Environments Programming Languages Third Edition Chapter 10 Control II Procedures and Environments Objectives Understand the nature of procedure definition and activation Understand procedure semantics Learn parameter-passing

More information

The New C Standard (Excerpted material)

The New C Standard (Excerpted material) The New C Standard (Excerpted material) An Economic and Cultural Commentary Derek M. Jones derek@knosof.co.uk Copyright 2002-2008 Derek M. Jones. All rights reserved. 39 3.2 3.2 additive operators pointer

More information

A Taxonomy of Expression Value Categories

A Taxonomy of Expression Value Categories Document: Author: Date: 2010-03-12 Revision: 6 PL22.16/10-0045 = WG21 N3055 William M. Miller Edison Design Group A Taxonomy of Expression Value Categories Revision History: Revision 6 (PL22.16/10-0045

More information

Compiler construction 2009

Compiler construction 2009 Compiler construction 2009 Lecture 6 Some project extensions. Pointers and heap allocation. Object-oriented languages. Module systems. Memory structure Javalette restrictions Only local variables and parameters

More information

CSE 351, Spring 2010 Lab 7: Writing a Dynamic Storage Allocator Due: Thursday May 27, 11:59PM

CSE 351, Spring 2010 Lab 7: Writing a Dynamic Storage Allocator Due: Thursday May 27, 11:59PM CSE 351, Spring 2010 Lab 7: Writing a Dynamic Storage Allocator Due: Thursday May 27, 11:59PM 1 Instructions In this lab you will be writing a dynamic storage allocator for C programs, i.e., your own version

More information

Pointers. Reference operator (&) ted = &andy;

Pointers. Reference operator (&)  ted = &andy; Pointers We have already seen how variables are seen as memory cells that can be accessed using their identifiers. This way we did not have to care about the physical location of our data within memory,

More information

Overload Resolution. Ansel Sermersheim & Barbara Geller Amsterdam C++ Group March 2019

Overload Resolution. Ansel Sermersheim & Barbara Geller Amsterdam C++ Group March 2019 Ansel Sermersheim & Barbara Geller Amsterdam C++ Group March 2019 1 Introduction Prologue Definition of Function Overloading Determining which Overload to call How Works Standard Conversion Sequences Examples

More information

Types II. Hwansoo Han

Types II. Hwansoo Han Types II Hwansoo Han Arrays Most common and important composite data types Homogeneous elements, unlike records Fortran77 requires element type be scalar Elements can be any type (Fortran90, etc.) A mapping

More information

CSC 533: Organization of Programming Languages. Spring 2005

CSC 533: Organization of Programming Languages. Spring 2005 CSC 533: Organization of Programming Languages Spring 2005 Language features and issues variables & bindings data types primitive complex/structured expressions & assignments control structures subprograms

More information

Introduction to Programming Using Java (98-388)

Introduction to Programming Using Java (98-388) Introduction to Programming Using Java (98-388) Understand Java fundamentals Describe the use of main in a Java application Signature of main, why it is static; how to consume an instance of your own class;

More information

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus Types Type checking What is a type? The notion varies from language to language Consensus A set of values A set of operations on those values Classes are one instantiation of the modern notion of type

More information