Data Types. (with Examples In Haskell) COMP 524: Programming Languages Srinivas Krishnan March 22, 2011

Similar documents
Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

CS558 Programming Languages

G Programming Languages Spring 2010 Lecture 6. Robert Grimm, New York University

COS 140: Foundations of Computer Science

CS558 Programming Languages

Lecture 12: Data Types (and Some Leftover ML)

Final Review. COMP 524: Programming Languages Srinivas Krishnan April 25, 2011

CPSC 3740 Programming Languages University of Lethbridge. Data Types

TYPES, VALUES AND DECLARATIONS

COS 140: Foundations of Computer Science

Question No: 1 ( Marks: 1 ) - Please choose one One difference LISP and PROLOG is. AI Puzzle Game All f the given

22c:111 Programming Language Concepts. Fall Types I

Programming Languages, Summary CSC419; Odelia Schwartz

CSCI312 Principles of Programming Languages!

CA341 - Comparative Programming Languages

Fundamentals of Programming Languages

Data Types. Data Types. Introduction. Data Types. Data Types. Data Types. Introduction

Introduction. Primitive Data Types: Integer. Primitive Data Types. ICOM 4036 Programming Languages

CS321 Languages and Compiler Design I Winter 2012 Lecture 13

CS321 Languages and Compiler Design I. Fall 2013 Week 8: Types Andrew Tolmach Portland State University

Topic 9: Type Checking

Topic 9: Type Checking

Lecture Overview. [Scott, chapter 7] [Sebesta, chapter 6]

Computer Components. Software{ User Programs. Operating System. Hardware

Lecture Notes on Programming Languages

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking

Types. What is a type?

Example: Haskell algebraic data types (1)

CSC324 Principles of Programming Languages

CSCI-GA Scripting Languages

Chapter 6 part 1. Data Types. (updated based on 11th edition) ISBN

COSC252: Programming Languages: Basic Semantics: Data Types. Jeremy Bolton, PhD Asst Teaching Professor

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

Chapter 5 Names, Binding, Type Checking and Scopes

Programmiersprachen (Programming Languages)

Introduction Primitive Data Types Character String Types User-Defined Ordinal Types Array Types. Record Types. Pointer and Reference Types

CS112 Lecture: Primitive Types, Operators, Strings

Organization of Programming Languages (CSE452) Why are there so many programming languages? What makes a language successful?

Haskell 98 in short! CPSC 449 Principles of Programming Languages

The New C Standard (Excerpted material)

Introduction to Visual Basic and Visual C++ Introduction to Java. JDK Editions. Overview. Lesson 13. Overview

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

9/7/17. Outline. Name, Scope and Binding. Names. Introduction. Names (continued) Names (continued) In Text: Chapter 5

Types. Chapter Six Modern Programming Languages, 2nd ed. 1

The current topic: Python. Announcements. Python. Python

Chapter 5. Names, Bindings, and Scopes

LECTURE 18. Control Flow

Storage. Outline. Variables and Updating. Composite Variables. Storables Lifetime : Programming Languages. Course slides - Storage

Chapter 5 Names, Bindings, Type Checking, and Scopes

Chapter 7:: Data Types. Mid-Term Test. Mid-Term Test (cont.) Administrative Notes

Chapter 6. Data Types

Types and Static Type Checking (Introducing Micro-Haskell)

Input And Output of C++

Fundamental of Programming (C)

Organization of Programming Languages CS3200 / 5200N. Lecture 06

Index. object lifetimes, and ownership, use after change by an alias errors, use after drop errors, BTreeMap, 309

Features of C. Portable Procedural / Modular Structured Language Statically typed Middle level language

COMP6700/2140 Data and Types

Concepts of Programming Languages

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Zheng-Liang Lu Java Programming 45 / 79

Questions? Static Semantics. Static Semantics. Static Semantics. Next week on Wednesday (5 th of October) no

Final-Term Papers Solved MCQS with Reference

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

CSC 533: Organization of Programming Languages. Spring 2005


EMBEDDED SYSTEMS PROGRAMMING Language Basics

Full file at

CPS122 Lecture: From Python to Java last revised January 4, Objectives:

1 class Lecture2 { 2 3 "Elementray Programming" / References 8 [1] Ch. 2 in YDL 9 [2] Ch. 2 and 3 in Sharan 10 [3] Ch.

Types and Static Type Checking (Introducing Micro-Haskell)

Type Checking and Type Equality

Basic Types & User Defined Types

Chapter 15. Functional Programming. Topics. Currying. Currying: example. Currying: example. Reduction

Organization of Programming Languages CS320/520N. Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Programming Languages Third Edition. Chapter 7 Basic Semantics

Chapter 8 :: Composite Types

Motivation was to facilitate development of systems software, especially OS development.

Lecture 13: Complex Types and Garbage Collection

Visual C# Instructor s Manual Table of Contents

Motivation was to facilitate development of systems software, especially OS development.

Introduction to Programming, Aug-Dec 2006

false, import, new 1 class Lecture2 { 2 3 "Data types, Variables, and Operators" 4

Data Types. Outline. In Text: Chapter 6. What is a type? Primitives Strings Ordinals Arrays Records Sets Pointers 5-1. Chapter 6: Data Types 2

Introduction to C. Why C? Difference between Python and C C compiler stages Basic syntax in C

HANDLING NONLOCAL REFERENCES

Binding and Variables

Data Types. CSE 307 Principles of Programming Languages Stony Brook University

Review for COSC 120 8/31/2017. Review for COSC 120 Computer Systems. Review for COSC 120 Computer Structure

Declaration and Memory

Pace University. Fundamental Concepts of CS121 1

Introduction. A. Bellaachia Page: 1

Storage. Outline. Variables and updating. Copy vs. Ref semantics Lifetime. Dangling References Garbage collection

Programming Languages

Topic 7: Algebraic Data Types

Fundamentals of Programming Languages. Data Types Lecture 07 sl. dr. ing. Ciprian-Bogdan Chirila

LECTURE 17. Expressions and Assignment

false, import, new 1 class Lecture2 { 2 3 "Data types, Variables, and Operators" 4

Variables and Constants

User-Defined Algebraic Data Types

Transcription:

Data Types (with Examples In Haskell) COMP 524: Programming Languages Srinivas Krishnan March 22, 2011 Based in part on slides and notes by Bjoern 1 Brandenburg, S. Olivier and A. Block. 1

Data Types Hardware-level: only little (if any) data abstraction. Computers operate on fixed-width words (strings of bits). 8 bits (micro controllers), 16 bit, 32 bits (x86), 64 bits (x86-64, ia64, POWER, SPARC V9). Often include ability to address smaller (but not larger) words Intel x86 chips can also address bytes (8 bits) and half-words (16 bits) Number, letter, address: all just a sequence of bits. Pragmatic view. Data types define how to interpret bit strings of various lengths. Allow compiler / runtime system to detect misuse (type checking). Semantical view (greatly simplified; this is an advanced topic in itself). A data type is a set of possible values (the domain). Together with a number of pre-defined operations. 2 2

Kinds of Data Types Constructive View Primitive types. A primitive value is atomic; the type is structureless. Built into the language. Special status in the language. e.g., literals, special syntax, special operators Often correspond to elementary processor capabilities. E.g., integers, floating point values. Composite Types. Types constructed from simpler types. Can be defined by users. Basis for abstract data types. Recursive Types. Composite types that are (partially) defined in terms of themselves. Lists, Trees, etc. 3 3

Primitive Types logic numbers letters Boolean. Explicit type in most languages. In C, booleans are just integers with a convention. Zero: False; any other value: True. True&False: literals or pre-defined constant symbol. In Haskell/LISP. Type: Bool. Values: True and False. Functions: not, && (logical and), (logical or), 4 4

Primitive Types logic numbers letters Integers. Every language has them, but designs differ greatly. Size (in bits) and max/min value. signed vs. unsigned. Use native word size or standardized word size? Java: standardized, portable, possibly inefficient. C: native, portability errors easy to make, efficient. In Haskell/LISP. Type: Int. Signed, based on native words, fast, size impl.-dependent. Type: Integer. Signed, unlimited size (no overflow!), slower. Sometimes known as BigNums in other languages. 5 5

Primitive Types logic numbers letters Integers. Ada Range Types: Every language has them, but designs differ greatly. Size (in bits) and (Pascal max/min also value. has range types.) signed vs. unsigned. type Month is range 1..12; type Day is range 1..31; type Year is range 1..10000; Use native word size or standardized word size? Java: standardized, portable, possibly inefficient. C: native, portability errors easy to make, efficient. In Haskell. Type: Int. Signed, based on native words, fast, size impl.-dependent. Type: Integer. Signed, unlimited size (no overflow!), slower. Sometimes known as BigNums in other languages. 6 6

Enumeration Types. Primitive Types (small) set of related symbolic constants. Compiled to ordinary integer constants. But much better in terms of readability readability. Can be emulated with regular constants (e.g., classic Java) But compiler can check for invalid assignments if explicitly declared as an enumeration. enum in C, C++. In Haskell/LISP. Integral part of the language. logic numbers letters Example: data LetterGrade = A B C D F. 7 7

Primitive Types logic numbers letters Floating point. IEEE 754 defines several standard floating point formats. Tradeoff between size, precision, and range. Subject to rounding. Not all computers support hardware floating point arithmetic. In Haskell/LISP. Type: Float. Signed, single-precision machine-dependent floating point. Type: Double. Double-precision, double the size. 8 8

Representing money. Primitive Types Uncontrolled rounding is catastrophic error in the financial industry (small errors add up quickly). Fixed-point arithmetic. Binary-coded decimal (BCD). Hardware support in some machines. New 128 bit IEE754 floating point formats with exponent 10 instead of 2. Allows decimal fractions to be stored without rounding. In Haskell/LISP. Not in the language standard. logic numbers letters But you can build your own types (next lecture). Also, can do rounding-free rational arithmetic 9 9

Primitive Types logic numbers letters Rational numbers. Store fractions as numerator / denominator pairs. Primitive type in some languages (e.g., Scheme). In Haskell/LISP. Not primitive. Type: (Integral a) => Rational a. Type class that can be instantiated for either Int (native words) or Integer (no overflow). With a Rational Integer, you never (!) have to worry about lack of precision or over/underflow. (We ll discuss type classes soon ) 10 10

Primitive Types Characters. logic numbers letters Every language has them, but some only implicitly. In legacy C, a character is just an 8-bit integer. Only 256 letters can be represented (ASCII + formatting). Chinese alone has over 40000 characters To be relevant, modern languages must support Unicode. Full Unicode codepoint support requires 32bit characters. Java (16bit char type) was designed for Unicode, but the Unicode standard was revised and extended Modern C and C++ support wide characters. In Haskell/LISP. Type: Char Unicode characters. 11 11

Source: Wikimedia Commons, subject to the GFDL Digression: Phaistos Disk Nobody knows what it means, but it s in Unicode. http://unicode.org/charts/pdf/u101d0.pdf 12 12

Source: Wikimedia Commons, subject to the GFDL Digression: Phaistos Disk Nobody knows what it means, but it s in Unicode. http://unicode.org/charts/pdf/u101d0.pdf 13 13

Type Checking How do you ensure meaning is preserved across platforms, runtimes and various inputs? Enforced by the language design via the Compiler or Interpreter Introduced in 1991 by Luca Cardelli in his paper Typeful Programming Strong Typing Types cannot be intermixed. Different levels of strength C, C#, Java are strongly typed, but C is weaker than C# and Java C allows conversion e.g. int a = (int) a ; printf ( %d, a); char i int p = 258; i = p; printf("%d", i); 97 2 14 14

Type Checking Weak Typing More permissive, lot more implicit conversions Perl, PHP $b = '2'; $c = 2; $d.= $b; $d.= $c; $e = $a +$c print "$d $e"; 22 4 15 15

Mapping Types An relation between two sets. m:i V Mathematical function. Maps values from a domain to values in a codomain. In programming languages. Array: maps a set of integer indices to values. In practice, integer indices must be consecutive (and often start at 0). This enables efficient implementations using offsets. Associative Array: maps arbitrary indices to values. Called dictionary in some scripting languages. Usually based on hashing + arrays. Subroutines / functions: implement arbitrary mappings. Each function signature defines a type. 16 16

Functions in Haskell m:i V square :: Integer -> Integer square x = x * x Named mappings. Type declaration (optional). Defined by equation. 17 17

Functions in Haskell m:i V square :: Integer -> Integer square x = x * x Named mappings. Type declaration (optional). Defined by equation. Type declaration: type of a symbol defined with :: keyword. Example: a mapping from Integers to Integers. 18 18

Functions in Haskell m:i V square :: Integer -> Integer square x = x * x Named mappings. Type declaration (optional). Defined by equation. Definition: simple equation defines the mapping. The square of x is given by x * x. 19 19

Composite Types types consisting of multiple components Mathematical foundation. Recall that each type is a set of values. composite: one value of each component type Cartesian product: S T = {(x, y) x S y T } The set of all tuples in which the first element is in S and the second element is in T. 20 20

Composite Types types consisting of multiple components Mathematical foundation. Recall that each type is a set of values. composite: one value of each component type Cartesian product: S T = {(x, y) x S y T } The set of all tuples in which the first element is in S and the second element is in T. Example: Given a 1024x768 pixel display, each coordinate of the form (x, y) is element of the set: {1,...,1024} {1,...,768} 21 21

Composite Types in Programming Languages History. Cobol was the first language to formally use records. Adopted and generalized by Algol. Fortran and LISP historically do not use record definitions. Classic LISP structures everything using cons cells (linked lists). Virtually all modern languages have some means to express structured data. Basis for abstract data types (ADTs)! Composite types go by many names. C/C++: struct Pascal/Ada: record Prolog: structures (= named tuples) Python: tuples Object-orientation: from a data point of view, classes also define composite types. We ll look at OO in depth later. 22 22

Composite Types in Haskell (1) Explicit type declaration. Named type. Named tuple. Components optionally named. -- Implicit fields: only types are given, no explicit names -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate = Coord2D Int Int -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int } -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 23 23

Composite Types in Haskell (1) Explicit type declaration. data declaration: introduces a type name. Named type. Named tuple. -- Implicit fields: only types are given, no explicit names -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate () = Coord2D Int Int -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int } -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 24 24

Composite Types in Haskell (1) Explicit type declaration. named tuple: introduces a constructor name. Named type. Named tuple. Components optionally named. -- Implicit fields: only types are given, no explicit names -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate = Coord2D Int Int -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int } -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 25 25

Composite Types in Haskell (1) Explicit type declaration. component Named type. names: give each field a meaningful name. Named tuple. -- Implicit fields: only types are given, no explicit names -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate = Coord2D Int Int -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int } -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 26 26

Composite Types in Haskell (2) Tuples. Not explicitly introduced as a type declaration. Can be used directly as a type. Can be named using type synonyms. stats :: [Double] -> (Double, Double, Double) stats lst = (maximum lst, average lst, minimum lst) where average lst = sum lst / fromintegral (length lst) type Statistics = (Double, Double, Double) stats2 :: [Double] -> Statistics stats2 = stats 27 27

Composite Types in Haskell (2) Tuples. Not explicitly introduced as a type declaration. Can be used directly as a type. Can be named using type synonyms. stats :: [Double] -> (Double, Double, Double) stats lst = (maximum lst, average lst, minimum lst) where average lst = sum lst / fromintegral (length lst) type Statistics = (Double, Double, Double) Type of function: stats maps lists of doubles to 3-tuples of doubles. stats2 :: [Double] -> Statistics stats2 = stats stats : ListsOfDoubles Double Double Double 28 28

Composite Types in Haskell (2) Tuples. Tuples used directly without declaration. Pragmatic view: multiple return values. Not explicitly introduced as a type declaration. Can be used directly as a type. Can be named using type synonyms. stats :: [Double] -> (Double, Double, Double) stats lst = (maximum lst, average lst, minimum lst) where average lst = sum lst / fromintegral (length lst) type Statistics = (Double, Double, Double) stats2 :: [Double] -> Statistics stats2 = stats 29 29

Composite Types in Haskell (2) Tuples. Not explicitly introduced as a type declaration. Type synonym: optionally named. Can be used directly as a type. Can be named using type synonyms. stats :: [Double] -> (Double, Double, Double) stats lst = (maximum lst, average lst, minimum lst) where average lst = sum lst / fromintegral (length lst) type Statistics = (Double, Double, Double) stats2 :: [Double] -> Statistics stats2 = stats 30 30

Composite Types in Haskell (2) Tuples. Not explicitly introduced as a type declaration. Can be used directly as a type. Can be named using type synonyms. stats :: [Double] -> (Double, Double, Double) stats lst = (maximum lst, average lst, minimum lst) where average lst = sum lst / fromintegral (length lst) type Statistics = (Double, Double, Double) stats2 :: [Double] -> Statistics stats2 = stats Type synonym: equivalent, but nicer to read. 31 31

Disjoint Union One value, chosen from multiple (disjoint domains). Mathematical view. Simply a union of all possible types (= sets of values). Each value is tagged to tell to which domain it belongs. Tag can be used for checks at runtime. ({1} S) ({2} T )={(t, x) (t =1 x S) (t =2 y T )} 32 32

Disjoint Union One value, chosen from multiple (disjoint domains). Mathematical view. Simply a union of all possible types (= sets of values). Each value is tagged to tell to which domain it belongs. Tag can be used for checks at runtime. ({1} S) ({2} T )={(t, x) (t =1 x S) (t =2 y T )} Example: A pixel color can be defined using RGB (red, green, blue color channels) or HSB (hue, saturation, brightness). Both are simply three-tuples, but values must be distinguished at runtime in order to be rendered correctly. 33 33

Disjoint Union in Haskell Algebraic data type. enumeration of named tuples Generalizes enumeration types and composite types. -- Implicit fields: only types are given, no explicit names -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate = Coord2D Int Int Coord3D Int Int Int -- Enumeration type. data ColorName = White Black Green Red Blue CarolinaBlue -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int} Named ColorName HSB { hue :: Double, sat :: Double, bright :: Double} -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 34 34

Disjoint Union in Haskell Algebraic data type. enumeration of named tuples Disjoint Union: enumeration of constructors. Generalizes enumeration types and composite types. -- Implicit fields: only types are given, no explicit names -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate = Coord2D Int Int Coord3D Int Int Int -- Enumeration type. data ColorName = White Black Green Red Blue CarolinaBlue -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int} Named ColorName HSB { hue :: Double, sat :: Double, bright :: Double} -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 35 35

Disjoint Union in Haskell Algebraic data type. enumeration of named tuples Generalizes enumeration types and composite types. -- Data Implicit types fields: only of types sub-domains are given, no explicit can names be heterogenous. -- These can be accessed using pattern matching -- (de-structuring bind). data Coordinate = Coord2D Int Int Coord3D Int Int Int -- Enumeration type. data ColorName = White Black Green Red Blue CarolinaBlue -- Explicit field names. data Color = RGB { red :: Int, green :: Int, blue :: Int} Named ColorName HSB { hue :: Double, sat :: Double, bright :: Double} -- Composite type of composite types. -- Again, implicit fields. data Pixel = Pixel Coordinate Color 36 36

Classic example: List. Recursive Types defined as a head (some value) and a tail (which is a list). Semantical view: infinite set of values. Rigorous treatment of the semantics of recursive types is non-trivial. Implementation. Requires pointers (abstraction of addresses) or references (abstraction of object location). Pointer arithmetic: calculate new addresses based on new ones. No arithmetic on references. types defined in terms of themselves References not necessarily exposed in programming language. e.g., Haskell does not have a reference type! However, references must be exposed to construct cyclical data structures. 37 37

Recursive Types in Haskell data IntList = EndOfList Link { elem :: Int, tail :: IntList } Algebraic type with self-reference. Can use name of type in definition of type. However, no explicit references. No doubly-linked lists! Haskell has generic built-in lists 38 38

Recursive Types in Haskell data IntList = EndOfList Link { elem :: Int, tail :: IntList } Algebraic type with self-reference. Can use name of type in definition of type. However, no explicit references. No doubly-linked lists! Haskell has generic built-in lists Type that is being defined is used in definition. 39 39

What are Strings? Character sequences. Is it a primitive type? Most languages support string literals. Is it a composite type? Array of characters (e.g., C). Object type? Is it a recursive type? sequence = list (e.g., Prolog). In Haskell. type String = [Char] Strings are simply lists of characters. A type synonym, both ways of referring to the type can be used interchangeably. 40 40

What are Strings? Character sequences. Is it a primitive type? Most languages support string literals. Is it a composite type? Array of characters (e.g., C). Object type? Is it a recursive type? sequence = list (e.g., Prolog). In Haskell. type String = [Char] Strings are simply lists of characters. Bottom line: No Consensus No approach to treating strings has been universally accepted; each approach has certain advantages. A type synonym, both ways of referring to the type can be used interchangeably. 41 41