CS201 - Lecture 1: The C Programming Language
Raoul Rivas, Portland State University

History of the C Language
- The C language was created in the early 1970s (usually dated to 1972) by Dennis Ritchie.
- Dennis Ritchie and Ken Thompson were employees at Bell Labs (AT&T) developing a multi-user operating system called UNIX.
  - The older cousin of Linux.
- They were using a lot of assembly and a programming language called the B Language.
  - The B Language was not powerful enough.
  - Writing Unix in assembly was not fast enough (development was too slow).
- They modified the B Language to do what they needed and called the result the C Language.

From AT&T to ANSI C
- C was initially used on DEC's PDP-11.
- Soon compilers for other systems were written.
  - Honeywell 6000, IBM's System/370.
- Kernighan and Ritchie's book was used as a loose specification of the language.
- As popularity increased and many libraries appeared, the need for a formal standard arose.
- In 1983 the American National Standards Institute created a committee to standardize the C language.
- The ANSI C standard was ratified in 1989 and is known as C89.
[Figure: DEC's PDP-11]

The C Language Nowadays
- ANSI C is fully portable:
  - If the programmer makes no assumptions about the platform.
    - Do not assume the size of data types or the endianness.
  - If the programmer uses only ANSI C standard code and libraries.
    - No proprietary libraries or extensions.
- C compilers exist for nearly every architecture.
- Operating systems and many other applications are written in C and/or C++.
  - Microsoft Windows, Linux, Android, iOS, Microsoft Office, OS X, Adobe Photoshop, Google Chrome, Mozilla Firefox, GCC, Visual Studio.

The C Language Nowadays
- In 1983 Ritchie and Thompson received the Turing Award for the development of Unix.
  - The "Nobel Prize of Computer Science".
- In 1999 they received the U.S. National Medal of Technology for the design of Unix and the C Language.
  - The highest award for innovation given to a U.S. citizen.

Hello World in C
    #include <stdio.h>

    int main()
    {
        printf("hello, world\n");
        return 0;
    }
- #include <stdio.h>: include references to the required library stdio.h (so we can use printf).
- int main(): declare the function main. ANSI C programs start at function main.
- printf(...): call the function printf to print the string "hello, world".

A simple C example
    unsigned int compute_f(unsigned int n)
    {
        /* n - 1 rather than --n: modifying n while it is also read
           in the same expression is not well defined in C */
        return (n < 1) ? 1 : n * compute_f(n - 1);
    }

    int main()
    {
        unsigned int result;
        unsigned int input = 3;
        result = compute_f(input);
        return 0;
    }
1. What is the value of result after the program executes compute_f()?
   result = 6
2. What is compute_f() doing?
   It computes the factorial of n.
3. What is the value of input after the program executes compute_f()?
   input = 3 (C passes parameters by value)

If you know Java, C++ or C#
- There are no classes in C. C is function oriented.
- There is no automatic memory management.
  - So no new and no garbage collection.
- In ANSI C (C89), variables must be declared at the top of the block.
  Wrong! How do we fix it?
    main()
    {
        int a;
        a = 1 + 2;
        int b;
        b = a + 3;
    }
  Fixed:
    main()
    {
        int a;
        int b;
        a = 1 + 2;
        b = a + 3;
    }

If you know Java, C++ or C#
- There are no exceptions in C.
    int speed(int d, int t)
    {
        return d / t;
    }
  - If t == 0 no exception is raised: division by zero is undefined behavior in ANSI C, and on many platforms the program simply crashes.
- Error handling is done using the return values of functions.
- Remember factorial? What if n < 0?
    int factorial(int n)
    {
        if (n < 0) return -1;
        return (n < 1) ? 1 : n * factorial(n - 1);
    }
- It is not pretty, but it is efficient!

Basic Data Types
    char            8-bit character. Strings are represented as arrays of characters (more on a later slide).
    short int       A short integer. 16 bits on IA32.
    int             Integer. Usually 32 bits on IA32.
    long int        Long integer. 32 bits on IA32.
    long long int   64 bits on IA32 (standardized in C99; a compiler extension in ANSI C).
    float           32 bits. Typically an IEEE 754 single-precision floating point.
    double          64 bits. Typically an IEEE 754 double-precision floating point.

Basic Data Types
- The sizes of some types are architecture dependent.
  - So it's best to always use sizeof(type) and preprocessor definitions to check sizes.
- The file limits.h defines the range of each data type.
  - INT_MIN, INT_MAX, LONG_MIN, LONG_MAX, etc.
- Most programs can be coded independently of the exact size of these types.
- If you need to manipulate data of a specific size, use the fixed-length integer types from stdint.h.
  - int32_t, int16_t, uint32_t, uint16_t, int64_t, etc.

Static Arrays
- Static arrays cannot change in size.
- Single-dimension arrays:
    type name[elements];
    char myarray[25];
- Multidimensional arrays:
    type name[rows][columns];
  - Implemented in row-major order:
    int test[2][3];
    Memory layout: 11 12 13 21 22 23 (all of row 1, followed by all of row 2)
- The programmer needs to know the implementation to optimize code.
  - Access patterns will impact performance.

C Strings
- Implemented as static arrays of characters:
    char mystr[length];
- Strings are not a type in C. They are an array!
- The last character must be the null character (value zero), written '\0'.
  - So if you need to store words of 5 letters you need an array of characters of length 6.
    char one[6] = "Hello";
    char two[6] = { 'H', 'e', 'l', 'l', 'o', '\0' };
- In memory (ASCII):
    Character:  H    e    l    l    o    \0
    Value:      72   101  108  108  111  0

Structures
- Structures were a feature that Ritchie thought they needed for their Unix project.
    #include <stdio.h>

    struct person {
        char* name;
        int age;
    };

    int main()
    {
        struct person entry;
        entry.name = "John Smith";
        entry.age = 27;
        printf("Name: %s, Age: %d\n", entry.name, entry.age);
        return 0;
    }

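Pointers
- A pointer is a variable that stores the address of a region in memory.
- Use arithmetic operators to manipulate pointers.
- Use the dereference operator * to access what the pointer points to.
- Use the address-of operator & to get the address of a variable.
    int *ptrdata;
    int Val[2] = {55, 66};
    ptrdata = Val;      /* same as &Val[0] */
    *ptrdata = 88;      /* Val[0] is now 88 */
    ptrdata++;          /* now points to Val[1] */
    *ptrdata = 77;      /* Val[1] is now 77 */
[Figure: memory diagram showing ptrdata (stored at 0x1000, holding 0x1238) and Val (55 at 0x1234, 66 at 0x1238)]
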
Pointers
https://xkcd.com/138/

Pointers
- What's wrong with this code?
    void swap(int a, int b)
    {
        int temp;
        temp = a;
        a = b;
        b = temp;
    }
- There are no side effects outside of swap because C is pass-by-value!

Pointers
- We can emulate pass-by-reference using pointers.
    void swap(int* a, int* b)
    {
        int temp;
        temp = *a;
        *a = *b;
        *b = temp;
    }
- What's the proper invocation of this function?

Dynamic Memory
- With static allocation we need to know the size of the array we are declaring before we compile our program.
  - Most of the time we do not know how much memory we will need.
- Dynamic memory allows us to allocate memory while the program is running.
  - In C++ we use new and delete; in Java we use new and the garbage collector releases the memory.
  - In C we use malloc() and free().
- malloc() returns a pointer to the allocated memory.
- The memory region must be released by calling free() when it is no longer needed.
  - Watch for memory leaks and double frees.

Dynamic Memory
    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        int i;
        int *ptr, *curr;
        int array[5] = {10, 20, 30, 40, 50};

        ptr = (int*) malloc(sizeof(int) * 5);
        curr = ptr;
        for (i = 0; i < 5; i++) {
            ptr[i] = *(array + i);
            curr++;
            printf("Element %d: %d\n", i, *curr);
        }
        free(ptr);
        return 0;
    }
1. What's the output of this program?
2. Are there any memory leaks?

Command Line Arguments
- Most programs use command line arguments to configure their execution.
    gzip -k input1.txt input2.txt
- To retrieve them we use the full declaration of main:
    int main(int argc, char *argv[])
  - argc: number of arguments (including the program name).
  - argv: pointer to an array of pointers to strings.
  - argv[0] contains the program name.
- What's the value of argc?
  argc = 4
- What are the contents of each entry in argv[]?
  argv[0] = "gzip", argv[1] = "-k", argv[2] = "input1.txt", argv[3] = "input2.txt"

Command Line Arguments
    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        int i;
        for (i = 0; i < argc; i++) {
            printf("Argument %d: %s\n", i, argv[i]);
        }
        return 0;
    }
- Read section 5.10 of K&R and make sure you understand the code on page 117.

Pointers to Structures
- entry->name is equivalent to (*entry).name
    #include <stdio.h>
    #include <stdlib.h>

    struct person {
        char* name;
        int age;
    };

    int main(int argc, char *argv[])
    {
        struct person *entry;
        entry = (struct person*) malloc(sizeof(struct person));
        if (entry == NULL) return 1;
        entry->name = "John Smith";
        entry->age = 27;
        printf("Name: %s, Age: %d\n", entry->name, entry->age);
        free(entry);
        return 0;
    }

Self-Referential Structures
- Pointers allow us to declare self-referential structures.
- They are the basis for advanced data structures: linked lists, hash tables, trees, etc.
  - You took or are probably taking CS163.
    typedef struct LIST_NODE LIST_NODE;

    struct LIST_NODE {
        LIST_NODE* next;
        int data;
    };

    LIST_NODE* mylist;

Pointers to Functions
- C also allows us to create pointers to functions.
  - Change the execution of a program at runtime.
  - Create plugins and extensions.
    #include <stdio.h>

    void print_even(int i) { printf("Even: %d\n", i); }
    void print_odd(int i) { printf("Odd: %d\n", i); }

    int main(int argc, char *argv[])
    {
        void (*fp)(int);
        /* argc % 2 is nonzero when argc is odd */
        fp = (argc % 2 == 0) ? print_even : print_odd;
        fp(argc);
        return 0;
    }

Using GCC
- To compile our C code we will use the GNU C Compiler:
    gcc input_files [flags] -o output_file
- Switches you must use when compiling your code for this class!
  - -ansi: specify that the input is ANSI C.
  - -pedantic: strictly adhere to the ANSI standard.
  - -Wall: enable the most common warnings.
  - -O0: disable all optimizations.
- If you need to use the debugger (GDB) you need to include:
  - -g: enable GDB information ("symbols").

Compilation Overview
- What happens when we call GCC?
    gcc list.c math.c other.c -o myapp
- Each source file (list.c, math.c, other.c) goes through the compiler, producing an object file (list.o, math.o, other.o).
- The linker combines the object files with libraries to produce the executable (myapp).
- gcc -c compiles but does not link.

Make
- A simple scripting language specifically designed to automate compilation.
  - A recipe for compiling your code.
- The script file is called makefile or Makefile (big or little M).
  - The make utility will use that by default.
- Makefiles are composed of a set of rules (targets and actions).
  - The first line is a target. The second line is an action.
  - The second line of each rule (the action) must start with a tab, not spaces!
    target: dependencies
    <TAB>commands
- The first rule in the Makefile is used by default if you just say make with no arguments.

The simplest Makefile
    sd: sd.c
    <TAB>gcc sd.c -o sd
- <TAB> is a tab character (not spaces!).
- Generates sd by compiling sd.c using gcc.
  - Target: sd
  - Dependencies: sd.c
  - Action: gcc sd.c -o sd

A More Practical Example
(each command line below starts with a tab character, shown as <TAB>)
    CFLAGS = -Wall -ansi -pedantic -O0

    all: sd test1

    sd: sd.c
    <TAB>gcc $(CFLAGS) -o sd sd.c

    test1: part1.o part2.o
    <TAB>gcc -o test1 part1.o part2.o

    part1.o: part1.c
    <TAB>gcc -c -o part1.o part1.c

    part2.o: part2.c
    <TAB>gcc -c -o part2.o part2.c

    clean:
    <TAB>rm -f *.o sd test1 test2
- What happens step by step when the user types make test1?
- What happens when the user types make all?

GNU TAR
- A popular archiving file format used in UNIX.
- Combined with other programs (e.g., gzip, bzip2) it can provide file compression.
  - Similar functionality to ZIP in Windows.
- To create a TAR file:
    tar cvf mytarfile.tar input1.txt input2.txt input3.txt
- To extract from a TAR file:
    tar xvf mytarfile.tar

Summary
- C is a powerful language widely used in industry.
- Pointers are variables that contain the memory addresses of other variables.
- C uses a row-major implementation for arrays.
- Strings are not basic types in C; instead they are implemented as arrays of characters.
- C uses pass-by-value, but we use pointers to emulate pass-by-reference.
- Dynamic memory allows us to specify the size of an array at runtime.
  - Programmers must ensure their code is free of memory leaks and double frees.