Java Programming. String Processing. 1 Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Overview
This lesson covers the following topics:
- Read, search, and parse Strings
- Use StringBuilder to create Strings
- Use regular expressions to search, parse, and replace Strings

Strings
A String object can be thought of as a group of characters stored together as a single object. Strings, like arrays, begin their index at 0 and end their index at stringName.length()-1. There are many different ways of approaching String manipulation.

Using a FOR Loop
One way to manipulate Strings is to use a FOR loop. This code segment initializes a String and steps through its characters, printing each one to the console.

    String str = "Sample String";
    for (int index = 0; index < str.length(); index++) {
        System.out.print(str.charAt(index));
    }

Benefits to Using a FOR Loop
Stepping through a String with a FOR loop is useful if you want to:
- Search for a specific character or String inside of the String.
- Read the String backwards (from last element to first element).
- Parse the String.

Print String to Console
An easier way to print a String to the console does not involve stepping through the String. This code is shown below.

    System.out.print(str);

Common String Methods
A few other common String methods are:

String Method                    Description
length()                         Returns the number of characters in the String.
charAt(int i)                    Returns the character at index i.
substring(int start)             Returns part of the String from index start to the end of the String.
substring(int start, int end)    Returns part of the String from index start to index end, but does not include the character at index end.
replace(char oldC, char newC)    Returns a String where all occurrences of character oldC have been replaced with newC.

Searching and Strings
There are a few different ways to search for a specific character or String inside of a String. The first is a for loop, which can be altered to search, count, or replace characters or substrings contained in Strings.

Searching and Strings Example
The code below uses a for loop to count the number of spaces found in the String.

    String str = "Searching for spaces";
    int count = 0;
    for (int i = 0; i < str.length(); i++) {
        if (str.charAt(i) == ' ')
            count++;
    }

Since the index of a String begins at 0, we must begin searching for a ' ' at index 0. Search through the String until the index reaches the last element of the String, which is at index str.length()-1. This means that i must stay less than str.length(). If i exceeds str.length()-1, charAt throws a StringIndexOutOfBoundsException (an "index out of bounds" error).
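The counting loop can be wrapped in a small method so it is easy to try on different inputs. This is only a sketch: the class name SpaceCounter and the method name countSpaces are made up for illustration and are not part of the lesson.

```java
class SpaceCounter {
    // Counts how many times the space character occurs in str.
    static int countSpaces(String str) {
        int count = 0;
        // Valid indexes run from 0 to str.length() - 1.
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSpaces("Searching for spaces")); // prints 2
    }
}
```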

Calling Methods on the String
Another way to search for something in a String is to call any of the following methods on the String. These methods are helpful when working on programming problems that involve the manipulation of Strings.

String Method               Description
contains(CharSequence s)    Returns true if the String contains s.
indexOf(char ch)            Returns the index within this String of the first occurrence of the specified character, or -1 if the character is not in the String.
indexOf(String str)         Returns the index within this String of the first occurrence of the specified substring, or -1 if the String does not contain the substring str.
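A short sketch of these search methods in use. The class name SearchMethodsDemo and the helper firstIndex are hypothetical names chosen for this example; contains and indexOf are the standard String methods.

```java
class SearchMethodsDemo {
    // Returns the index of the first occurrence of target in text, or -1 if absent.
    static int firstIndex(String text, String target) {
        return text.indexOf(target);
    }

    public static void main(String[] args) {
        String str = "Searching for spaces";
        System.out.println(str.contains("for")); // true
        System.out.println(str.indexOf(' '));    // 9
        System.out.println(str.indexOf("for"));  // 10
        System.out.println(str.indexOf('z'));    // -1, 'z' does not occur
    }
}
```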

Reading Strings Backwards
Typically a String is read from left to right. To read a String backwards, simply change the starting index and ending index of the FOR loop that steps through the String.

    String str = "Read this backwards";
    String strBackwards = "";
    for (int i = str.length() - 1; i >= 0; i--) {
        strBackwards += str.substring(i, i + 1);
    }

Start the FOR loop at the last index of the String (which is str.length()-1), and decrease the index all the way down to the first index (which is 0).
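The backwards loop can be compared with the built-in StringBuilder.reverse() method, which produces the same result for this input. The sketch below assumes hypothetical names (ReverseDemo, reverseWithLoop, reverseWithBuilder) chosen only for illustration.

```java
class ReverseDemo {
    // Builds the reversed String one character at a time, last index first.
    static String reverseWithLoop(String str) {
        String backwards = "";
        for (int i = str.length() - 1; i >= 0; i--) {
            backwards += str.substring(i, i + 1);
        }
        return backwards;
    }

    // Same result using the standard StringBuilder.reverse() method.
    static String reverseWithBuilder(String str) {
        return new StringBuilder(str).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseWithLoop("Read this backwards"));    // sdrawkcab siht daeR
        System.out.println(reverseWithBuilder("Read this backwards")); // sdrawkcab siht daeR
    }
}
```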

Parsing a String
Parsing means dividing a String into a set of substrings. Typically, a sentence (stored as a String) is parsed by spaces to separate the words of the sentence rather than keeping the whole sentence. This makes it easier to rearrange the words than if they were all together in one String. You may parse a String by any character or substring. Below are two techniques for parsing a String:
- For loop
- Split

Steps to Parsing a String with a For Loop
1. Increment through the for loop until you find the character or substring where you wish to parse it.
2. Store the parsed components.
3. Update the String.
4. Manipulate the parsed components as desired.

Steps to Parsing a String with a For Loop

    import java.util.*;

    public class StringParser {
        public static void main(String[] args) {
            String str = "Parse this String";
            ArrayList<String> words = new ArrayList<String>();

            // Peel one word off the front of str on each pass.
            while (str.length() > 0) {
                for (int i = 0; i < str.length(); i++) {
                    if (i == str.length() - 1) {
                        // Reached the end: the rest of str is the last word.
                        words.add(str.substring(0));
                        str = "";
                        break;
                    } else if (str.charAt(i) == ' ') {
                        // Found a space: store the word and drop it from str.
                        words.add(str.substring(0, i));
                        str = str.substring(i + 1);
                        break;
                    }
                }
            }

            for (String s : words)
                System.out.print(s + ' ');
        }
    }

Parsing a String: Split
Split is a method inside the String class that parses a String at every match of the delimiter you pass in, such as a single space. The delimiter argument is required. It returns an array of Strings that contains the substrings (or words) that parsing the String gives. How to call split on a String:

    String sentence = "This is my sentence";
    String[] words = sentence.split(" ");
    // words will look like {"This", "is", "my", "sentence"}
    String[] tokens = sentence.split("i");
    // tokens will look like {"Th", "s ", "s my sentence"}

Split a String by More than One Character
It is also possible to split a String by more than one specified character if you use brackets [ ] around the characters. Here is an example.

    String sentence = "This is my sentence";
    String[] tokens = sentence.split("[ie]");
    // tokens will look like {"Th", "s ", "s my s", "nt", "nc"}
    // each token is separated by any occurrence of
    // an i or any occurrence of an e.

Notice how the brackets are used to include i and e.
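The split calls from the last two slides can be run together to inspect the resulting arrays, including the spaces that stay attached to some tokens. The class name SplitDemo and the helper splitBy are hypothetical; Arrays.toString is used only to print each array.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits sentence around every match of the given delimiter expression.
    static String[] splitBy(String sentence, String delimiter) {
        return sentence.split(delimiter);
    }

    public static void main(String[] args) {
        String sentence = "This is my sentence";
        System.out.println(Arrays.toString(splitBy(sentence, " ")));    // [This, is, my, sentence]
        System.out.println(Arrays.toString(splitBy(sentence, "i")));    // [Th, s , s my sentence]
        System.out.println(Arrays.toString(splitBy(sentence, "[ie]"))); // [Th, s , s my s, nt, nc]
    }
}
```

Note that the final e in "sentence" would leave an empty token at the end, but split drops trailing empty Strings by default.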

StringBuilder
StringBuilder is a class that represents a String-like object. It is made of a sequence of characters, like a String. The difference between String and StringBuilder objects is that:
- StringBuilder includes methods that can modify the StringBuilder once it has been created by appending, removing, replacing, or inserting characters.
- Once created, a String cannot be changed. It is replaced by a new String instead.

Strings Cannot be Modified
It is not possible to make modifications to a String. Methods that appear to modify a String actually create a new String in memory with the specified changes; they do not modify the old one. This is why StringBuilders are usually faster when a value is modified many times: they can be changed in place and do not require a new String to be created with each modification.

StringBuilder and String Shared Methods
StringBuilder shares many of the same methods with String, including but not limited to:
- charAt(int index)
- indexOf(String str)
- length()
- substring(int start, int end)

StringBuilder Methods
StringBuilder also has some methods specific to its class, including the four below:

Method                                    Description
append(Type t)                            Is compatible with any Java type or object; appends the String representation of the Type argument to the end of the sequence.
delete(int start, int end)                Removes the characters in the substring from start up to, but not including, end.
insert(int offset, Type t)                Is compatible with any Java type; inserts the String representation of the Type argument into the sequence at offset.
replace(int start, int end, String str)   Replaces the characters in a substring of this sequence with the characters in str.
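A minimal sketch of the four methods applied in sequence to one StringBuilder. The class name StringBuilderDemo, the method name build, and the sample text are illustrative assumptions; the comments show the contents after each call.

```java
class StringBuilderDemo {
    // Applies append, insert, replace, and delete to a single StringBuilder.
    static String build() {
        StringBuilder sb = new StringBuilder("Java");
        sb.append(' ').append(8);   // "Java 8"       (append accepts char, int, String, ...)
        sb.insert(0, "Hello ");     // "Hello Java 8" (insert at offset 0)
        sb.replace(6, 10, "JDK");   // "Hello JDK 8"  (indexes 6..9 held "Java")
        sb.delete(0, 6);            // "JDK 8"        (removes "Hello ")
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // JDK 8
    }
}
```

Every call changes the same object in place, whereas the equivalent String operations would each create a new String.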

Methods to Search Using a StringBuilder
Searching using a StringBuilder can be done using either of the methods below:

Method                                Description
charAt(int index)                     Returns the character at index.
indexOf(String str, int fromIndex)    Returns the index of the first occurrence of str at or after fromIndex.

StringBuilder versus String

StringBuilder:
- Changeable
- Easier insertion, deletion, and replacement.
- Can be more difficult to use, especially with the regular expressions introduced on the next few slides.
- Use when memory needs to be conserved.

String:
- Immutable
- Easier concatenation.
- Visually simpler to use, similar to primitive types rather than objects.
- Use with simpler programs where memory is not a concern.

Regular Expressions
A regular expression is a character or a sequence of characters that represents a String or multiple Strings. Regular expressions:
- Are supported by the java.util.regex package. Import it whenever your program uses the Pattern or Matcher classes directly; String methods such as matches, split, and replaceAll accept regular expressions without an import.
- Use a syntax that is different from what you are used to, but that allows for quicker, easier searching, parsing, and replacing of characters in a String.

String.matches(String regex)
The String class contains a method matches(String regex) that returns true if a String matches the given regular expression. This is similar to the String method equals. The difference is that comparing the String to a regular expression allows variability. For example, how would you write code that returns true if the String animal is "cat" or "dog" and returns false otherwise?

Equals Versus Matches
A standard answer may look something like this:

    if (animal.equals("cat"))
        return true;
    else if (animal.equals("dog"))
        return true;
    return false;

An answer using regular expressions would look something like this:

    return animal.matches("cat|dog");

The second solution is much shorter. The regular expression symbol | means "or", so matches checks whether animal is equal to "cat" or "dog" and returns true accordingly.
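Both answers can be placed side by side to confirm that they agree. The class and method names below (AnimalCheck, isCatOrDogEquals, isCatOrDogRegex) are hypothetical and exist only for this sketch.

```java
class AnimalCheck {
    // The "standard" answer using equals.
    static boolean isCatOrDogEquals(String animal) {
        if (animal.equals("cat"))
            return true;
        else if (animal.equals("dog"))
            return true;
        return false;
    }

    // The regular expression answer: | means "or".
    static boolean isCatOrDogRegex(String animal) {
        return animal.matches("cat|dog");
    }

    public static void main(String[] args) {
        System.out.println(isCatOrDogRegex("cat"));    // true
        System.out.println(isCatOrDogRegex("dog"));    // true
        System.out.println(isCatOrDogRegex("catdog")); // false, the whole String must match
    }
}
```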

Square Brackets
Square brackets are used in regular expressions to allow for character variability. For example, if you wanted to return true if animal is equal to "dog" or "Dog", but not other capitalizations (such as "DOG"), using equalsIgnoreCase would not work and using equals would take multiple lines. If you use a regular expression, this task can be done in one line as follows. This code tests if animal matches "Dog" or "dog" and returns true if it does.

    return animal.matches("[Dd]og");

Include Any Range of Characters
Square brackets aren't restricted to two character options. They can be combined with a hyphen to include any range of characters. For example, you are writing code to create a rhyming game and you want to see if String word rhymes with banana. The definition of a rhyming word is a word that contains all the same letters except the first letter, which may be any letter of the alphabet. Your first attempt at coding may look like this:

    if (word.length() == 6)
        if (word.substring(1, 6).equals("anana"))
            return true;
    return false;

Using Square Brackets and a Hyphen
A shorter, more generic way to complete the same task is to use square brackets and a hyphen (regular expression) as shown below.

    return word.matches("[a-z]anana");

This code returns true if word begins with any lower case letter and ends in anana. To include upper case characters we would write:

    return word.matches("[a-zA-Z]anana");

Using Square Brackets and a Hyphen
To allow the first character to be any digit or a space in addition to a lower or upper case letter, simply add a space and 0-9 inside the brackets (note the space before 0).

    return word.matches("[ 0-9a-zA-Z]anana");
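The two bracket expressions can be checked against a few sample words. This is a sketch under assumed names: RhymeCheck, rhymesWithBanana, and rhymesAllowingDigitOrSpace are not part of the lesson.

```java
class RhymeCheck {
    // First character must be a letter (either case), followed by "anana".
    static boolean rhymesWithBanana(String word) {
        return word.matches("[a-zA-Z]anana");
    }

    // First character may also be a space or a digit.
    static boolean rhymesAllowingDigitOrSpace(String word) {
        return word.matches("[ 0-9a-zA-Z]anana");
    }

    public static void main(String[] args) {
        System.out.println(rhymesWithBanana("banana"));           // true
        System.out.println(rhymesWithBanana("Canana"));           // true
        System.out.println(rhymesWithBanana("9anana"));           // false
        System.out.println(rhymesAllowingDigitOrSpace("9anana")); // true
        System.out.println(rhymesAllowingDigitOrSpace(" anana")); // true
    }
}
```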

The Dot
The dot (.) represents any single character in regular expressions. For example, you are writing a decoder for a top secret company and you think that you have cracked the code. You need to see if String element consists of a digit followed by any other single character. This task is done easily with the dot as shown below. This code returns true if element consists of a digit followed by any character.

    return element.matches("[0-9].");

Repetition Operators
A repetition operator is any symbol in regular expressions that indicates the number of times a specified character appears in a matching String.

Repetition Operator   Definition              Sample Code                  Code Meaning
*                     0 or more occurrences   return str.matches("A*");    Returns true if str consists of zero or more A's and no other character.
?                     0 or 1 occurrence       return str.matches("A?");    Returns true if str is "" or "A".
+                     1 or more occurrences   return str.matches("A+");    Returns true if str is 1 or more A's in a sequence.

More Repetition Operators

Repetition Operator   Definition                    Sample Code                     Code Meaning
{x}                   x occurrences                 return str.matches("A{7}");     Returns true if str is a sequence of 7 A's.
{x,y}                 Between x and y occurrences   return str.matches("A{7,9}");   Returns true if str is a sequence of 7, 8, or 9 A's.
{x,}                  x or more occurrences         return str.matches("A{5,}");    Returns true if str is a sequence of 5 or more A's.
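The six repetition operators can be exercised with one small helper that appends an operator to the letter A. The helper name isRunOfA and the class name RepetitionDemo are assumptions made for this sketch.

```java
class RepetitionDemo {
    // Returns true if s matches the letter A followed by the given repetition operator.
    static boolean isRunOfA(String s, String operator) {
        return s.matches("A" + operator);
    }

    public static void main(String[] args) {
        System.out.println(isRunOfA("", "*"));            // true: zero A's is allowed
        System.out.println(isRunOfA("AB", "*"));          // false: B is not allowed
        System.out.println(isRunOfA("AA", "?"));          // false: at most one A
        System.out.println(isRunOfA("", "+"));            // false: at least one A required
        System.out.println(isRunOfA("AAAAAAA", "{7}"));   // true: exactly 7 A's
        System.out.println(isRunOfA("AAAAAAAA", "{7,9}"));// true: 8 is between 7 and 9
        System.out.println(isRunOfA("AAAA", "{5,}"));     // false: fewer than 5 A's
    }
}
```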

Combining Repetition Operators Example 1
In the code below:
- The dot represents any character.
- The asterisk represents any number of occurrences of the character preceding it.
- The .* means any number of any characters in a sequence will return true.

    return str.matches(".*");

Combining Repetition Operators Example 2
If the code below returns true, str must be a sequence of 10 digits (each between 0 and 5) and may have 0 or 1 characters preceding the sequence. Remember, all symbols of regular expressions may be combined with each other, as shown below, and with standard characters.

    return str.matches(".?[0-5]{10}");
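A few sample inputs make the combined expression easier to read. The class name CombinedDemo and the method name isCode are hypothetical names for this sketch.

```java
class CombinedDemo {
    // Optional leading character, then exactly ten digits in the range 0-5.
    static boolean isCode(String str) {
        return str.matches(".?[0-5]{10}");
    }

    public static void main(String[] args) {
        System.out.println(isCode("0123450123"));   // true: ten digits, no prefix
        System.out.println(isCode("x0123450123"));  // true: one prefix character
        System.out.println(isCode("xy0123450123")); // false: two prefix characters
        System.out.println(isCode("0123456789"));   // false: 6-9 are outside 0-5
    }
}
```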

Pattern
Pattern is a class in the java.util.regex package that stores the format of the regular expression. For example, to initialize a Pattern of characters as defined by the regular expression [A-F]{5,}.* you would write the following code:

    Pattern p = Pattern.compile("[A-F]{5,}.*");

The compile method returns a Pattern as defined by the regular expression given in the parameter.

Matcher
Matcher is a class in the java.util.regex package that stores a possible match between the Pattern and a String. A Matcher is initialized as follows:

    Matcher match = patternName.matcher(stringName);

The matcher method returns a Matcher object. The following code returns true if the regular expression given in the declaration of the Pattern patternName matches the String stringName.

    return match.matches();

Matcher: Putting it All Together
To put it all together, we have:

    Pattern p = Pattern.compile("[A-F]{5,}.*");
    Matcher match = p.matcher(stringName);
    return match.matches();
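The three lines can be dropped into a method and run as a complete program. The class name PatternMatcherDemo and the method name startsWithFiveAToF are assumptions for this sketch; Pattern.compile, matcher, and matches are the standard java.util.regex calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternMatcherDemo {
    // True if s is five or more letters A-F followed by anything.
    static boolean startsWithFiveAToF(String s) {
        Pattern p = Pattern.compile("[A-F]{5,}.*");
        Matcher match = p.matcher(s);
        return match.matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithFiveAToF("ABCDEF123")); // true
        System.out.println(startsWithFiveAToF("ABCDE"));     // true
        System.out.println(startsWithFiveAToF("ABCD1"));     // false: only four letters A-F
    }
}
```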

Benefits to Using Pattern and Matcher
This seems like a very complex way of completing the same task as the String method matches. Although that may be true, there are benefits to using a Pattern and Matcher, such as:
- Capturing groups of Strings and pulling them out, which lets you work with dates or other fixed formats without having to create special classes for them.
- Matcher has a find() method that allows for detection of multiple instances of a pattern within the same String.

Regular Expressions and Groups
Segments of regular expressions can be grouped using parentheses, opening the group with ( and closing it with ). These groups can later be accessed with the Matcher method group(groupNumber). For example, consider reading in a sequence of dates, Strings in the format DD/MM/YYYY, and printing out each date in the format MM/DD/YYYY. Using groups would make this task quite simple.

Regular Expressions and Groups Example

    import java.util.regex.Pattern;
    import java.util.regex.Matcher;
    import java.util.Scanner;

    public class RegExpressionsPractice {
        public static void main(String[] args) {
            Pattern dateP;
            // Group 1: ([0-9]{2})   Group 2: ([0-9]{2})   Group 3: ([0-9]{4})
            dateP = Pattern.compile("([0-9]{2})/([0-9]{2})/([0-9]{4})");
            Scanner in = new Scanner(System.in);
            System.out.println("Enter a Date: (dd/mm/yyyy)");
            String date = in.nextLine();
            while (!date.equals("")) {
                Matcher dateM = dateP.matcher(date);
                if (dateM.matches()) {
                    // Recalls each group of the Matcher.
                    String day = dateM.group(1);
                    String month = dateM.group(2);
                    String year = dateM.group(3);
                    System.out.println(month + "/" + day + "/" + year);
                }
                System.out.println("Enter a Date: (dd/mm/yyyy)");
                date = in.nextLine();
            }
        }
    }

Group 1 and Group 2 are defined to consist of 2 digits each. Group 3 (the year) is defined to consist of 4 digits. Note: It is still possible to get the whole match by calling group(0).
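The same group logic can be tried without console input by moving it into a method that takes the date as a parameter. The names DateReorder and toMonthFirst are hypothetical, and returning null for input that does not match is a choice made for this sketch rather than something the lesson specifies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateReorder {
    // Three capturing groups: day, month, year.
    private static final Pattern DATE =
            Pattern.compile("([0-9]{2})/([0-9]{2})/([0-9]{4})");

    // Returns the date as MM/DD/YYYY, or null if the input is not DD/MM/YYYY.
    static String toMonthFirst(String date) {
        Matcher m = DATE.matcher(date);
        if (!m.matches()) {
            return null;
        }
        return m.group(2) + "/" + m.group(1) + "/" + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(toMonthFirst("25/12/2013")); // 12/25/2013
        System.out.println(toMonthFirst("2013-12-25")); // null
    }
}
```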

Matcher.find()
Matcher's find method will return true if the defined Pattern exists as a substring of the String of the Matcher. For example, if we had a pattern defined by the regular expression [0-9], as long as we give the Matcher a String that contains at least one digit somewhere in the String, calling find() on this Matcher will return true.
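A sketch of find() used two ways: once to test whether a digit occurs anywhere, and once in a loop to count every run of digits, since each call to find() moves on to the next match. The class and method names (FindDemo, containsDigit, countNumbers) are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // True if s contains at least one digit anywhere.
    static boolean containsDigit(String s) {
        return Pattern.compile("[0-9]").matcher(s).find();
    }

    // Counts separate runs of digits; each find() call advances to the next match.
    static int countNumbers(String s) {
        Matcher m = Pattern.compile("[0-9]+").matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(containsDigit("abc7"));     // true
        System.out.println(containsDigit("abc"));      // false
        System.out.println(countNumbers("a1b22c333")); // 3
    }
}
```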

Parsing a String with Regular Expressions
Recall the String method split() introduced earlier in the lesson, which splits a String and returns the resulting pieces in an array of Strings. The parameter passed to split is a regular expression that describes where the programmer wishes to split the String. For example, if we wished to split the String at any sequence of one or more digits, we could write something like this:

    String[] tokens = str.split("[0-9]+");
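A runnable sketch of splitting on runs of digits. The class name DigitSplitDemo, the method name splitOnDigits, and the sample input are made up for illustration.

```java
import java.util.Arrays;

class DigitSplitDemo {
    // Splits str wherever one or more digits occur.
    static String[] splitOnDigits(String str) {
        return str.split("[0-9]+");
    }

    public static void main(String[] args) {
        String[] tokens = splitOnDigits("one1two22three");
        System.out.println(Arrays.toString(tokens)); // [one, two, three]
    }
}
```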

Replacing with Regular Expressions
There are a few simple options for replacing substrings using regular expressions. The following two are the most commonly used methods. For use with Strings, the method replaceAll("insertRegularExpressionHere", newSubstring) will replace all occurrences of the defined regular expression found in the String with the defined String newSubstring.

replaceAll Method
This method works the same if called by a Matcher rather than a String. However, it does not require the regular expression. It will simply replace any matches of the Pattern you gave it when you initialized the Matcher. The method example shown below results in a replacement of all matches identified by the Matcher with the String "abc".

    matcherName.replaceAll("abc");
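Both forms of replaceAll can be compared on the same input. In this sketch the class name ReplaceDemo, the two method names, and the sample Strings are assumptions; String.replaceAll and Matcher.replaceAll are the standard library methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDemo {
    // String version: the regular expression is passed with each call.
    static String replaceWithString(String text, String replacement) {
        return text.replaceAll("[0-9]+", replacement);
    }

    // Matcher version: the regular expression comes from the Pattern.
    static String replaceWithMatcher(String text, String replacement) {
        Matcher m = Pattern.compile("[0-9]+").matcher(text);
        return m.replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(replaceWithString("a1b22c", "#"));    // a#b#c
        System.out.println(replaceWithMatcher("a1b22c", "abc")); // aabcbabcc
    }
}
```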

Terminology
Key terms used in this lesson included:
- Matcher
- Parsing
- Pattern
- Regular Expression
- Regular Expression Dot
- Regular Expression Groups
- Regular Expression Square Brackets
- Repetition Operator
- Split
- StringBuilder

Summary
In this lesson, you should have learned how to:
- Read, search, and parse Strings
- Use StringBuilder to create Strings
- Use regular expressions to search, parse, and replace Strings