A brief introduction to C programming for Java programmers

Sven Gestegård Robertz <sven.robertz@cs.lth.se>
September 2017

There are many similarities between Java and C. Java's syntax is essentially C syntax that has been adapted and extended, so most of the arithmetic operations and control flow statements (if, while, for, etc.) are the same. Despite the similar syntax, however, the languages are quite different, and this document outlines some important differences between Java and C. It is not a C programming tutorial, and is by no means a complete list of the differences between C and Java, but a set of pointers to topics that a Java programmer may find difficult or confusing when first encountering C.

C is not object-oriented. There are no classes in C; the top-level building block is the function. However, C does allow grouping variables in a struct, which is like a Java class with public attributes and without methods. Operations on that struct can then be expressed as functions that take a pointer to the struct as an argument (much like the implicit this parameter of Java methods). The main things that have no direct counterpart in C are inheritance and polymorphism.

A pointer is a variable containing the address of a variable. In C, the use of pointers is a common source of confusion, starting with the syntax, where the prefix operator * is used in both declarations and expressions, with different meanings. In declarations, prefix * means "is pointer to":

    int *x;        // x is a pointer-to-int

In expressions, prefix * means "contents of" (dereference):

    int i = *x;    // i gets the value that x points to

and prefix & means "address of" (getting a pointer to a variable):

    int *y = &i;   // y points to the variable i

Note that *y is an int variable, as y is a pointer to int. Thus the statement *y = 5; means that the variable that y points to (in this case i) is assigned 5.

C has no runtime safety net.
In Java, the JVM performs runtime checks to detect or prevent certain types of errors. For instance, if you attempt to access an element outside an array (e.g., trying to write the tenth element of an array of size 5, like int a[5]), you will get an ArrayIndexOutOfBoundsException. In C, you get undefined behaviour, which often means that the program will crash due to memory corruption, often at a later point in time. The same applies to type casts: in Java, if you attempt to cast a variable to an incompatible type you get a ClassCastException. In C, the programmer is responsible for ensuring that the type cast makes sense. If you, for instance, convert a pointer to a pointer to a larger data type and assign through it, you will probably get memory corruption and undefined behaviour. The following snippet illustrates this:

    char c;
    int *p = &c;
    *p = 0;

Here, the problem is that the size of the variable c is one byte, but the size of int is typically 4 bytes. Therefore, when *p = 0 is executed, four bytes are written to a variable of size 1, overwriting three bytes of memory adjacent to the variable c. If you enable warnings, the compiler will give a warning: initialization from incompatible pointer type.

In C, function arguments can be passed by value or by reference. In Java, function arguments of primitive types are always passed by value, and objects (class instances) are always passed by reference. In C, you can choose (with the exception that arrays cannot be passed by value [1]). That means that you can use pass-by-reference (by passing a pointer) for any data type. This is commonly used in the system libraries, where functions like

    ssize_t read(int fildes, void *buf, size_t nbyte);

take an output parameter (in this case buf) that is used to provide the result to the caller. Example:

    #include <stdio.h>

    void f_by_value(int x)
    {
        printf("f_by_value: x = %d\n", x);
        x += 10;
        printf("f_by_value: x = %d\n", x);
    }

    void f_by_reference(int *x)
    {
        printf("f_by_reference: x = %d\n", *x);
        *x += 10;
        printf("f_by_reference: x = %d\n", *x);
    }

    void example()
    {
        int x = 10;
        printf("example: x = %d\n", x);
        f_by_value(x);
        printf("example: x = %d\n", x);
        f_by_reference(&x);
        printf("example: x = %d\n", x);
    }

Here, calling f_by_value() does not change the value of x in the calling function, whereas calling f_by_reference() does.

[1] If you really want to pass an array by value, you can define a struct containing an array, as structs can be passed by value.
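
As a minimal sketch of the suggestion in footnote 1, and of the earlier remark that operations on a struct are written as functions taking a pointer to it, the following wraps an array in a struct. The names int_triple, sum_after_clear_by_value and clear_first_by_pointer are hypothetical and chosen only for illustration.

```c
/* Hypothetical wrapper type: a struct whose only member is an array. */
struct int_triple {
    int v[3];
};

/* Receives a copy of the whole struct, including the array inside it. */
int sum_after_clear_by_value(struct int_triple t)
{
    t.v[0] = 0;                      /* modifies the local copy only */
    return t.v[0] + t.v[1] + t.v[2];
}

/* Receives a pointer, so the write goes to the caller's struct.
 * t->v is shorthand for (*t).v. */
void clear_first_by_pointer(struct int_triple *t)
{
    t->v[0] = 0;
}
```

Called on a struct t holding {1, 2, 3}, sum_after_clear_by_value(t) returns 5 and leaves t.v[0] equal to 1 in the caller, whereas clear_first_by_pointer(&t) sets the caller's t.v[0] to 0.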

C does not have function overloading. You cannot have two functions with the same name. Also, the namespace is flat, so all function names must be unique. Beware of name clashes with functions from the standard and system libraries, as common names like open, close, exit, shutdown, etc., exist there. A common way of naming functions is to prefix their names with, e.g., a module name (mydevice_init() instead of just init()).

C arrays decay to a pointer to the first element. You often read that a C array is simply a pointer to the first element. While true in many situations, there is an important distinction. When you declare an array variable (like int a[5];) you allocate space (for a local variable, in the current function) for an array of five ints (i.e., an object whose size is 5*sizeof(int)) and bind the name a to it. If you declare int *p; you simply allocate space for the pointer, and no array. Thus, when talking about objects, or variables, an array and a pointer are different types, and in the given example, a is an array of five ints, and p is a pointer to int. However, if you use the name of an array variable (a) as an argument to a function taking a pointer (like void f(int*)), in an expression like f(a);, then what gets passed to the function is not the entire array, but a pointer to the first element (as if you had written f(&(a[0]));). This is known as array decay. Here, the syntax of C may be a bit confusing, as an array parameter to a function can be written either int a[] or int *a. This is just syntactic sugar, and the function prototypes void f(int a[]); and void f(int *a); are equivalent: in both cases, a is a pointer to int. Adding to the confusion, the length of an array can be omitted in an array declaration where the array is also initialized. For instance, the declaration char s[] = "Hello"; is equivalent to char s[6] = "Hello"; as the compiler uses the length from the supplied initial value.
In this case, the variable s is an array of chars, and not just a pointer to char.

C arrays contain no length information. All functions that take an array as a parameter just get a pointer to the first element (array decay). Thus, there is no built-in information about the length of the array, and such functions also need a length parameter. It is the responsibility of the programmer to ensure that the length passed is correct. A word of warning: C array variables carry no length information at run time, but the compiler does know the length of an array in the scope where the array variable is defined (typically a function), and only there. Sometimes you see code like the following:

    int a[] = ...; // some initializer
    int i;
    for (i = 0; i < sizeof(a)/sizeof(*a); ++i) {
        // do something with a[i]
    }

This idiom is dangerous unless you are careful about when it works as intended. sizeof applied to an array variable yields the size (in bytes) of the array, and thus the expression sizeof(a)/sizeof(*a) is the number of elements in a (the size of the entire array divided by the size of one element).
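
The following sketch shows the idiom used where it is valid, together with an explicit length parameter for the function that only receives a pointer. The function names sum_ints and sum_of_local_array, and the initializer values, are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints starting at a. Here a is only a
 * pointer, so the length has to be passed in by the caller. */
long sum_ints(const int *a, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* The array is defined in this scope, so sizeof sees the whole array. */
long sum_of_local_array(void)
{
    int a[] = {1, 2, 3, 4};
    size_t n = sizeof(a) / sizeof(*a);   /* number of elements: 4 */
    return sum_ints(a, n);               /* a decays to a pointer here */
}
```

The element count is computed once, in the scope that owns the array, and then travels with the pointer as an ordinary argument.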

However, if a were a pointer, sizeof(a) would yield the size of a pointer, and not the size of whatever it points to (as that cannot be determined without run-time information). Thus, the above idiom of using sizeof to get the length of an array only works in the same scope as the declaration of the array variable. To illustrate the pitfall, the following code

    #include <stdio.h>

    void f(char a[])
    {
        printf("f: The length of %s is %zu\n", a, sizeof(a));
    }

    int main()
    {
        char s[] = "Hello, world!";
        printf("main: The length of %s is %zu\n", s, sizeof(s));
        f(s);
    }

will give the output

    main: The length of Hello, world! is 14
    f: The length of Hello, world! is 8

as s in main() is an array and thus sizeof gives the actual size, but in f(), a is a pointer and sizeof gives the size of a pointer (in this case 64 bits = 8 bytes). With warnings enabled, the compiler will give the (somewhat cryptic) warning: sizeof on array function parameter will return size of char * instead of char [].

C has pointer arithmetic. In C, a pointer is a variable containing the address of another variable of a specified type. C pointers support arithmetic operators, and the most common use is array indexing, which is tightly connected to pointer arithmetic. Consider the following code, which fills an array with zeroes:

    int a[10];
    int i;
    for (i = 0; i < 10; ++i) {
        a[i] = 0;
    }

Here, the expression a[i] is equivalent to *(a+i) (i.e., the contents of the ith element of a). Thus, the assignment to array element i can also be written as *(a+i) = 0;. Written this way it is mostly cryptic, but pointer arithmetic is commonly useful when passing part of an array to another function. E.g., assume we have char buf[BUFSIZE]; and we want to read some data from a file descriptor fd into the buffer buf, starting at position pos (for instance, to continue reading a message after reading the first pos bytes of it). We can do that with the statement

    read(fd, buf + pos, BUFSIZE - pos);

and here, the expression buf+pos is, arguably, clearer than the equivalent expression &(buf[pos]). In order for this to work, adding a number k to a pointer means adding k times the size of the pointed-to type to the address. For example, if the array a in the above example is stored at address 0x10000, and sizeof(int) is 4, the value of the expression a+2 would be 0x10008. This is why the expression a[i] is equivalent to *(a+i). Incidentally, that means that you can write array indexing "backwards": the expressions a[5] and 5[a] are equivalent (both meaning *(a+5)), although one obviously shouldn't use the latter.
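
The scaling rule and the buf+pos pattern can be sketched as follows. Both functions are hypothetical: byte_offset measures how far apart a and a+k are in bytes, and append_bytes is a simple in-memory stand-in for a call like read() that writes into the remaining part of a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Distance in bytes between a+k and a; equals k * sizeof(int).
 * a+k must stay within (or one past the end of) the same array. */
size_t byte_offset(const int *a, size_t k)
{
    return (size_t)((const char *)(a + k) - (const char *)a);
}

/* Copies at most room bytes of the string src to dst and returns the
 * number of bytes copied. No terminating null is written. */
size_t append_bytes(char *dst, size_t room, const char *src)
{
    size_t n = strlen(src);
    if (n > room)
        n = room;            /* never write past the end of the buffer */
    memcpy(dst, src, n);
    return n;
}
```

With a 4-byte int, byte_offset(a, 2) is 8, matching the step from 0x10000 to 0x10008 in the text. A caller that keeps a running position can fill a buffer piece by piece with pos += append_bytes(buf + pos, sizeof(buf) - pos, "abc"); where buf is an array in the caller's own scope.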

C strings are null-terminated character arrays. In Java, the String class contains information about the length of a string and methods for copying, assigning, concatenating, comparing, etc., strings. In C, a string is simply an array of characters terminated with a null character (i.e., the integer value zero), so for instance the string "hello" is an array of 6 chars (the five letters plus a null). Copying, concatenating and comparing strings is done using standard library functions (e.g. strncpy, strncat and strncmp). Another subtle detail about strings is that you can define a string variable in two ways:

    const char *s1 = "hello";
    char s2[] = "hello";

where s1 cannot be modified (as it is just a pointer pointing to a string literal which may be stored in read-only memory) whereas s2 can (as it is a char array allocated as a local variable and initialized with the string literal). Sometimes one sees char *s = "hello"; (without the const). The compiler accepts this, but it is misleading, as changing the string pointed to by s is still undefined behaviour.

C has no boolean type. [2] Instead any integral value can be used in a boolean context. Here zero is interpreted as false, and non-zero as true. This means that any expression with an integer value can be used in a boolean context. A common mistake is to write an assignment (which is an expression) instead of a comparison. For instance, the snippet

    int x = 5;
    if (x = 0)
        printf("x is zero\n");
    else
        printf("x is %d\n", x);

will print x is 0, as x = 0 is an assignment and its value is zero, which is interpreted as false in the boolean context and thus the else branch is chosen. This mistake is quite common, so if you turn on compiler warnings (e.g., with the option -Wall) you will get a warning if you use an assignment in a boolean context.

C has both signed and unsigned integer types. Unlike Java, which has only signed integer types, C has both signed and unsigned types.
For instance, the type signed char typically has the value range [-128, 127], whereas unsigned char has the range [0, 255]. For normal integer arithmetic this is of little importance, so just use the signed types (like int) to represent numbers. On the other hand, if you want to represent, or manipulate, bit patterns, using unsigned types is recommended, as that avoids problems related to sign extension. Also, with signed types the behaviour of the right shift operator (>>) when applied to a negative number (i.e., whether a zero or a one is shifted in) is implementation-defined, whereas a right shift of an unsigned integer always shifts in zeros.

[2] The type bool (available through <stdbool.h>) was introduced in the C99 standard, so it is not available when compiling for older standards or with compilers that lack C99 support. What is said here about integers as boolean values is true also in C99.

C has no automatic memory management. In Java, you allocate objects with new, and when an object can no longer be reached by the program, the memory used by the object is automatically reclaimed by the garbage collector. In C, the programmer is responsible for freeing memory when it is no longer used, and failure to do so will cause a memory leak, and may eventually lead to an out-of-memory error. For this reason, the recommendation is to use stack allocation (i.e., use local variables in functions) whenever possible. Make sure that a pointer does not outlive the object it points to. In particular, do not return pointers to local variables.

C has no exception mechanism. Error handling has to be done explicitly. Often, the return value of a function is used to indicate success or failure. For instance, a function like read() returns the number of bytes read on success, or a negative value on failure. Functions returning pointers often return NULL to indicate failure. A third option is to just return an error code: often 0 (i.e., false) if there was no error, or non-zero (true) on error. The typical use of such a function is

    if (my_function_that_may_fail()) {
        // handle error
    }

or, if the user needs to handle different error codes differently,

    int result = my_function_that_may_fail();
    if (result == ERROR1) {
        // handle error type 1
    } else if (result == ERROR2) {
        // handle error type 2
    }

Turn on compiler warnings. C has many pitfalls, and some legal constructions are often not what the programmer intended (a common example is writing if(a = 0) instead of if(a == 0)). To get as much help as possible from the compiler it is recommended to enable warnings, and often also to make the compiler treat warnings as errors (with -Werror). With gcc and clang, the following set of options is a reasonable default:

    -Wall -Werror -pedantic -pedantic-errors
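
Returning to the points on manual memory management and explicit error handling, the following sketch combines the two: a function that allocates memory, reports failure through an error code, and hands the result back through an output parameter. The function copy_string and the COPY_* codes are hypothetical names used only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes: zero means success, non-zero means failure. */
enum { COPY_OK = 0, COPY_ERR_NULL_ARG = 1, COPY_ERR_NO_MEMORY = 2 };

/* Stores a heap-allocated copy of src in *out. On success the caller
 * owns the copy and must release it with free(). */
int copy_string(const char *src, char **out)
{
    size_t len;
    char *copy;

    if (src == NULL || out == NULL)
        return COPY_ERR_NULL_ARG;

    len = strlen(src) + 1;          /* +1 for the terminating null */
    copy = malloc(len);
    if (copy == NULL)               /* malloc returns NULL on failure */
        return COPY_ERR_NO_MEMORY;

    memcpy(copy, src, len);
    *out = copy;
    return COPY_OK;
}
```

A caller checks the returned code before using the result and calls free() exactly once when the copy is no longer needed. Unlike a pointer to a local variable, the allocated block stays valid after copy_string returns.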