Strings, characters and character literals

Internally, computers only manipulate bits of data; every item of data can be represented as a number encoded in base 2. However, when it comes to processing text, arithmetic on the numeric values of letters and punctuation is uncomfortable to write and use. To ease the manipulation of text, many programming languages provide extra facilities that help the programmer. The two main constructs available are:

- a simple data type to represent a single character: a letter, digit, punctuation sign, etc.;
- a more complex data type to represent a sequence of zero or more characters. This is called a character string.

Character types

Character types are not strictly necessary: for example, Python does not have a dedicated character type, and programmers use strings of length one to manipulate individual characters instead. However, dedicated character types are useful because they are more compact in memory: they typically occupy much less space (1-2 bytes) than an ordinary integer (4-8 bytes). So if a program manipulates many characters, it is more memory efficient to use the character type when it exists. For example, Java has two small types commonly used for character data, called char and byte, both much smaller than int, which is 32 bits long:

    Data type  Size     Set of allowable values  Adequate for
    char       16 bits  0 to 65535               Processing international text
    byte       8 bits   -128 to 127              Processing binary files

Which type to use depends on the application: if text is to be read in or printed for human users, char is usually adequate; whereas reading or writing characters for use by another program should probably use byte instead.

String types and Java's String

Every language has its own way to manipulate sequences of zero or more characters. Some languages like C do not have a dedicated type for character strings, so a programmer has to build their own using arrays of individual characters. In Java, like in Python and most modern languages, there is a dedicated type; in Java, the name for this is String. Like numbers, strings can be read in (Scanner.next()) and printed out (PrintStream.println(), .printf()) as-is. We can also express a string literal in the Java program using double quotes:

    String a = "hello";

This fragment creates a variable of type String and initializes it to store the sequence of characters h, e, l, l, o.

Testing for String equality

Java's direct equality comparison operators == and != compare values only for primitive types. This is why we can compare two ints using == (int is a primitive type). Applied to compound types like String, == does not compare the characters; it only tests whether both variables refer to the very same object. This is arguably a shortcoming of the Java language in particular: many other programming languages do allow programmers to use == to compare character strings. So unfortunately you will have to remember this as an exception. It is especially unfortunate because Java does not warn you if you make the mistake of using == to compare strings. For example, with the following code:

    String a = "abc";
    String b = new String("abc"); // same characters, but a separate object
    if (a == b)
        out.println("all is well");

the program will not print "all is well" as one might expect, because == silently tested something other than the contents. (With two identical literals the test can even appear to succeed, which makes the mistake harder to spot.) The full explanation of what is going on here will come only later (when talking about references and aliasing), so at this point just remember this: to compare strings, use the method equals(), like this:

    String a = "abc";
    String b = new String("abc");
    if (a.equals(b))
        out.println("all is well");

This program fragment properly prints "all is well" as expected.

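The difference between == and equals() can be observed with a short program. This is only a minimal sketch: the class name EqualityDemo and the helper sameContents are made up for illustration, and the second string is built with new String(...) merely to guarantee that it is a distinct object.

```java
class EqualityDemo {
    // Hypothetical helper: compares two strings by their contents.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = new String("abc"); // same characters, but a separate object

        System.out.println(a == b);             // prints false: not the same object
        System.out.println(sameContents(a, b)); // prints true: same characters
    }
}
```

The helper only wraps equals(); in real code, calling a.equals(b) directly is the usual choice.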
String and number conversions

Strings are not numbers, so we cannot perform numeric arithmetic on them:

    String b = "123";
    String c = b / 4; // produces an error: b is not a numeric variable

The only arithmetic operator available with Java's String is +, which concatenates (attaches together) two strings:

    String b = "123";
    String c = b + b;
    out.println(c); // prints 123123

To convert a String to a number (int, double, etc.) you can use the Scanner, already seen previously. To perform the opposite conversion (from int, double, etc. to String), String provides a service called valueOf:

    String d = String.valueOf(3.1415); // d = "3.1415"
    String e = d + d;
    out.println(e); // prints 3.14153.1415

Useful String methods

The type String provides numerous services that help with the use of character strings in programs. A complete list is provided in the online documentation of the Java language, so look up String there whenever you are looking for a string-related language feature. However, it may be useful to commit the following to long-term memory:

- a.equals(b) compares two strings a and b for equality (replaces ==);
- a.compareTo(b) compares two strings a and b for alphabetical order, based on character codes (replaces < and >); the result is negative, zero or positive;
- a.length() returns the length of the string a, i.e. the number of characters it contains;
- a.charAt(i) retrieves the character in the string a at position i (positions start at 0);
- a.substring(i, j) returns the part of the string that begins at position i and ends just before position j, as a new string;
- a.indexOf(c) returns the position of the first occurrence of the character c in a, or -1 if there is none.

Also, even if you do not need to remember the specifics by heart, you should remember that String provides other services that help with handling international text, in particular comparison while ignoring the difference between lower and upper case, case conversion, etc.

Literals, character names and escape sequences

A literal is a construct in a programming language that directly denotes a value. For example, the text 123, formed by the 3 digits 1, 2, 3 concatenated together, is a valid integer literal whose value is one hundred and twenty-three. In every programming language, some values can be represented by multiple different literals. For example, the approximate value represented by the literal 3.1415 is also represented by 31.415e-1 (31.415 × 10^-1 = 3.1415). Characters in particular typically have many names in programming languages.
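Before turning to character literals in detail, the conversions and String services described above can be tried out in a short program. This is a minimal sketch rather than a reference implementation: the class name StringServicesDemo and the helper parseFirstInt are hypothetical, and the sample strings are chosen only for illustration.

```java
import java.util.Scanner;

class StringServicesDemo {
    // Hypothetical helper: reads the first integer found in the text,
    // using a Scanner over the string itself.
    static int parseFirstInt(String text) {
        try (Scanner in = new Scanner(text)) {
            return in.nextInt();
        }
    }

    public static void main(String[] args) {
        // String -> number
        int n = parseFirstInt("123");
        System.out.println(n / 4);                        // prints 30 (integer division)

        // number -> String, then concatenation
        String s = String.valueOf(n);
        System.out.println(s + s);                        // prints 123123

        String a = "hello";
        System.out.println(a.length());                   // prints 5
        System.out.println(a.charAt(1));                  // prints e
        System.out.println(a.substring(1, 3));            // prints el
        System.out.println(a.indexOf('l'));               // prints 2
        System.out.println(a.compareTo("help") < 0);      // prints true
        System.out.println(a.equalsIgnoreCase("HELLO"));  // prints true
    }
}
```

Note that substring(1, 3) returns two characters, those at positions 1 and 2, because the end position is excluded.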
In Java, for instance, the following three character literals all encode the same value:

    'a'    '\u0061'    '\141'

The first form is the simplest: this is the character itself. The other two forms are alternative encodings that start with a backslash (\) followed by a code. The presence of the backslash indicates that what follows must be interpreted differently. A construct of this form is called a character escape sequence. (Other languages such as C and Python also accept the hexadecimal form '\x61'; Java does not.)

The reason why these special forms with a backslash are useful is that some characters cannot easily be entered in a text editor. For example, what if I wanted to check whether a string starts with the Greek character alpha? In some environments I cannot type it in the source code because special characters are not allowed in the text editor, or not saved properly. In order to work around this shortcoming, I can use the character code instead:

    final char GREEK_ALPHA = '\u03b1';
    if (a.charAt(0) == GREEK_ALPHA) ...

Java, C, Python and many other programming languages recognize most of the following escape sequences:

    Sequence  Meaning                                  Example  Same as (examples)
    \uNNNN    Unicode character with code NNNN         \u0061   a
    \xNN      Character with hexadecimal code NN       \x61     a
              (C and Python only, not Java)
    \NNN      Character with octal code NNN            \141     a
    \'        The apostrophe character                          \x27, \047, \u0027
    \"        The double quote character                        \x22, \042, \u0022
    \\        The character \ itself                            \x5c, \134, \u005c
    \n        The newline character                             \x0a, \012, \u000a
    \t        The tab character                                 \x09, \011, \u0009
    \r        The carriage return character                     \x0d, \015, \u000d
    \f        The form feed (new page) character                \x0c, \014, \u000c

In Java, the \uNNNN forms of the newline, carriage return, quote and backslash characters cannot be used inside literals, because Unicode escapes are translated before the rest of the source is read; use \n, \r, \', \" and \\ there.

To summarize, if a character can be entered as-is in the program text, use that directly; otherwise, use a character escape sequence.

Important concepts

- character type and string type and the difference between them;
- dedicated character types are more compact in memory than other integer types;
- testing for equality using equals(), do not use ==;
- conversion between String and numbers in Java using Scanner and String.valueOf();
- String's charAt, compareTo, length, substring and indexOf methods;
- character literals;
- character escape sequences.

Further reading

- Think Java, sections 8.1-8.6 (pp. 91-96)

- Introduction to Programming, sections 2.2.3-2.2.4 (pp. 26-28), section 2.3.3 (pp. 34-35)
- Absolute Java, section 1.3 (pp. 33-45)
- WikiBooks: Programmeren in Java, Stringbewerkingen

Copyright and licensing

Copyright © 2014, Raphael 'kena' Poss. Permission is granted to distribute, reuse and modify this document according to the terms of the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/.