Quick XPath Guide

Introduction
What is XPath?
Nodes
Expressions
How Does XPath Traverse the Tree?
Different ways of choosing XPaths
Tools for finding XPath
Firefox Portable
Google Chrome
Fire IE Selenium Tool
How to verify XPath
Examples
Example 1. Log into web application
Example 2. Create Invoice
Example 3. Download Invoice
Example 4. Working with Elements in IFrames

Introduction

This topic is a brief introduction to XPath in RPA Express. It gives you a quick overview of the main techniques you can use in your recordings to handle elements on a web page, for example, to click a link or an object, or to get or set values in fields or menus. This tutorial explains the basics of XPath with suitable examples.

What is XPath?

XPath is a query language used for traversing an XML (or HTML) document and identifying parts of it by indicating nodes by position, relative position, type, content, and so on. In other words, XPath is a quick way to reference an element on a web page: the page is searched top to bottom for elements that match the criteria defined in the expression.

Nodes

Similar to the Document Object Model (DOM), XPath allows us to pick nodes and sets of nodes out of an HTML tree. As far as the language is concerned, there are four main node types XPath has access to in HTML:

1. Root Node
2. Element Nodes
3. Attribute Nodes
4. Text Nodes

Example

The path to our element (link) is as follows:
We can locate the element by its full path using the following XPath:

    /html/body/div/a

These are the nodes (attributes and text) that can be used to locate the element:

We can use the following XPaths to locate the element:

using an attribute

    //*[@href='new_invoice_link']

Note
'*' matches an element of any type whose href attribute has this value. On a typical page this narrows the search to links, since href is used mostly on <a> tags (it is also valid on a few others, such as <link> and <area>).

using a text

    //a[contains(text(),'create New Invoice')]

Expressions
Commonly, XPath is used to search for particular elements or attributes with matching patterns. The patterns are defined with expressions that identify an element on a web page when it does not have an easy, unique identifier. Each element has three main parts that can be used as identifiers: the type, the attributes, and the text.

Here is a quick reference for the most common operators and functions used in XPath expressions:

//          search all descendant elements;
/           search all child elements;

            Difference between single / and double //:
            a single slash (/) at the start of an XPath instructs the XPath engine to search for the element starting from the root or parent node;
            a double slash (//) at the start of an XPath instructs the XPath engine to search for the matching element anywhere in the document.

*           search elements regardless of their type;
[]          specifies something about the element you are looking for, for example, an order number or an attribute;
@           specifies an element attribute, for example, @class;
text()      gets the text of the element;
contains()  a function used inside a predicate to look for specific text contained in an attribute or text() value when you cannot do a full string match.

How Does XPath Traverse the Tree?

XPath can use location paths, attribute location steps, and compound location paths to retrieve nodes from our document quickly and efficiently.

There are two basic simple location paths: the root location path (/) and child element location paths.

The forward slash (/) serves as the root location path; it selects the root node of the document. It is important to realize that this does not retrieve the root element, but the entire document itself. The root location path is an absolute location path: no matter what the context node is, it always refers to the root node.

A child element location step is simply a single element name. For example, the XPath div refers to all div children of the context node.
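The operators described above can be tried outside the browser. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module; this module is not part of RPA Express and supports only a subset of XPath (no contains() or text()). Paths are written relative to the root <html> element, hence the leading dot, and the page fragment is a hypothetical one modelled on the link example.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment (an assumption, not the real demo page).
PAGE = """
<html>
  <body>
    <div>
      <a href="new_invoice_link" title="Invoices">Create New Invoice</a>
    </div>
    <div>
      <p title="Note">No link here</p>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Child steps (single slash): body -> div -> a, starting from <html>.
# This plays the role of the full path /html/body/div/a.
by_path = root.findall("./body/div/a")

# Descendant search (double slash): any <a> anywhere below the root.
anywhere = root.findall(".//a")

# Wildcard plus attribute predicate: an element of any type with this href.
by_attr = root.findall(".//*[@href='new_invoice_link']")

# Attribute test: every element that carries a title attribute.
titles = [el.get("title") for el in root.findall(".//*[@title]")]

# Positional predicate: the second div child of body (counting starts at 1).
second_div = root.find("./body/div[2]")

print(len(by_path), len(anywhere), len(by_attr))
print(titles)
print(second_div[0].tag)
```

All three link locators return the same single element here; in a browser the same expressions would be written without the leading dot.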
One of the really handy things about XPath is that we have quick access to all attributes by using the at sign (@) followed by the name of the attribute we want to retrieve. For example, we can quickly retrieve all title attributes by using @title.

Let's see an example. Open https://pub_demo.s3.amazonaws.com/invoiceplane-2/invoiceplanedemo.html in a browser, for example, Google Chrome. Press F12 to open DevTools.
This is an XPath to the input field for the email:

    /html/body/div/div/form/div/div[2]/input

Let's analyze the expression. We start at the top element (also known as a node). The / means to look only at the current element's children, so /div means: look for a div element among the children of the current element. The brackets [] specify something about that element; in this case, the position of the div among its sibling div elements.

Another XPath to the element is:

    //*[@id='email']

Press Ctrl+F to open the search field in the Elements panel and enter or paste the XPaths to verify that the expressions are correct.

Different ways of choosing XPaths

Absolute XPath

An absolute XPath finds an element starting from the very first node of the document and traversing the HTML tree top to bottom. The absolute path for the email input field in the previous example is

    /html/body/div/div/form/div/div[2]/input
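The two locators given above for the email field, the full path and the id-based expression, can be compared in a small sketch. It assumes Python's standard-library xml.etree.ElementTree (a limited XPath subset, with paths relative to the root <html> element) and a hypothetical form fragment whose nesting merely mirrors the path; the real page markup may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical login form; only the nesting matters for this illustration.
PAGE = """
<html>
  <body>
    <div>
      <div id="login">
        <form>
          <div>
            <div><label>Email</label></div>
            <div><input id="email" class="form-control" type="email" name="email"/></div>
          </div>
        </form>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Full path: every step, including the position div[2], must exist as written.
by_full_path = root.find("./body/div/div/form/div/div[2]/input")

# Attribute-based: search the whole tree for the element with id="email".
by_id = root.findall(".//*[@id='email']")

# A locator is usable for a single element when it matches exactly once,
# which is what the match count in the DevTools search field shows.
print(len(by_id))
print(by_full_path is by_id[0])
```

Both expressions resolve to the same input element, and the id-based one matches exactly once in this fragment.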
Relative XPath

As an absolute XPath is often too lengthy, it is possible to use a shorter one. The absolute XPath above will technically work, but each of those nested relationships must be present 100% of the time, or the locator will not function. There is a good chance that the path will change whenever the web page changes. It is always better to choose a relative XPath, as it reduces the chance of the "Element not found" exception.

To choose the relative XPath, it is advisable to look for the nearest id attribute. Let's have a look at the HTML code of the input field:

    <input id="email" class="form-control" type="email" value="demo@invoiceplane.com" name="email" />

You can see that the nearest id is email. This id is appropriate in this case, so a quality XPath looks like this:

    //input[@id='email']

What is the difference between the Absolute and Relative XPaths?

    Absolute XPath: /html/body/div/div/form/div/div[2]/input
    Relative XPath: //input[@id='email']

Attention
It is highly recommended to use single quotes (') around attribute values; note that Chrome DevTools generates XPaths with double quotes (").

The absolute XPath uses a single slash at the start of the expression, whereas the relative XPath uses a double slash. This means that the relative XPath searches the entire web page for the specified locator, while the absolute XPath follows strictly the path defined from the root node.

In the brackets [] we specify the attribute (id) with the @ symbol. We can chain as many of these together as we need.

Partial XPath (Contains)

Quite often we face issues when the locator's properties are generated dynamically or are changed by web developers. In this case we can search for an element using the contains function in a predicate.
So, with contains we can create the following XPath expression to find the input field for the email:

    //input[contains(@id,'email')]

Tools for finding XPath

There are a lot of tools for working with XPath. We suggest using just a few of them, as they are easily accessed and provide the complete functionality required for working with XPath:

1. Firefox Portable with the XPath plugin from the RPA Express package
2. Google Chrome Developer Tools (optionally, you can install the XPath helper plugin for more productivity)
3. Fire IE Selenium Tool for IE Browser
Firefox Portable

Firefox Portable, which is a part of RPA Express, has a special plugin that allows you to get the XPath of elements on a web page.

Open a web page in Firefox Portable and take the following steps to get the XPath of a web page element:

1. Hover the mouse over the element and right click.
2. Choose XPaths... from the context menu.
3. Select one of the XPaths suggested by the plugin; the selected XPath is copied to the Clipboard.

Note
The green tick indicates a unique XPath of the element.
The red exclamation sign marks an XPath that matches a number of elements on the web page.

Google Chrome

The Chrome Developer Tools (DevTools for short) are a set of web authoring and debugging tools built into Google Chrome. DevTools provide web developers with deep access to the internals of the browser and their web application.

Inspecting the DOM

The Elements panel lets you see everything in one DOM tree, and allows inspection and on-the-fly editing of DOM elements.

Open a web page in Google Chrome and take the following steps to get the XPath of a web page element:

1. Hover the mouse over the element, right click and choose Inspect.
2. Go to Elements and right click the selected element in the tree.
3. Choose Copy > Copy XPath from the context menu.
4. The XPath to the selected element is copied to the Clipboard.
Note
Depending on the complexity of the XPath, DevTools will create either an absolute or a relative XPath.

Fire IE Selenium Tool

The tools described above are used to work with XPaths in Firefox and Google Chrome. In some cases you have to deal with Internet Explorer, and there the approach should be different, as XPaths obtained in Firefox or Chrome may not work in Internet Explorer. There are just two options for getting element locators for IE-only web sites: IE Inspector or Fire IE Selenium Tool for IE Browser.

As IE Inspector has only limited functionality for viewing the elements in the tree, you can use Fire IE Selenium Tool for IE Browser to get locators for an element when the web application supports only the IE browser. The tool uses an Excel-based WebBrowser Control.

Download Fire IE Selenium Tool from the Fire IE Selenium download page. The tool is an Excel macro-enabled workbook. Instructions for use are available on that web site.
How to verify XPath

If you need to verify that your XPath expression is correct and unique, you can use Chrome DevTools for the purpose.

1. Open the web page in Chrome.
2. Press F12 to open DevTools.
3. Switch to the Elements panel.
4. Press Ctrl+F to open the Search field.
5. Enter or paste the XPath in the field.
6. If the XPath expression has at least one match, the respective element is highlighted in yellow in the tree.
7. If a number of elements match the criteria defined in the expression, you will see the count of such elements to the right of the search field. It indicates the following:

   a. It is correct if you are going to iterate through these elements. A set of elements that can be accessed with the same XPath is called a Collection. Each single element of a collection can be accessed by index.

      Hint
      For example, an HTML page contains a collection of elements with the same class sample_class. An XPath to access the collection is

          //*[@class='sample_class']

      You can access the first element in the collection via //*[@class='sample_class'][1], the second via //*[@class='sample_class'][2], and so on. Note that the index is evaluated relative to each parent element, so this form works when the elements of the collection are siblings; to index across the whole document, wrap the expression in parentheses, for example (//*[@class='sample_class'])[1].

      In the recording you can iterate through the collection with the following XPath to get each single element:

          //*[@class='sample_class'][${recorder_var}]

      where recorder_var is a Recorder variable of Number type.

   b. If you need just a single element, you should edit the criteria for more accurate matching.

Examples

The samples explain how you can use XPath in RPA Express recordings to automate interactions with a web application. To get XPaths we will use Google Chrome DevTools; for details, refer to the respective section above.

Note
XPaths of elements in the examples below can be provided using Recorder variables of String type.

Example 1. Log into web application

In this sample we will use RPA Express to log into a web application (Invoice Plane).

1. Open the URL (https://pub_demo.s3.amazonaws.com/invoiceplane-2/invoiceplanedemo.html) in Google Chrome.
2. Open DevTools and get XPaths for the following elements:
   a. Email input field
      //*[@id="email"]
   b. Password input field
      //*[@id="password"]
   c. Login button
      //*[@id="login"]/form/input

3. Create the recording to automate the login procedure:

   Step 1. Create Recorder variables for email and password to log in.
      Settings:
         Login
            a. Name: email
            b. Type: String
            c. Value: any email, as the web application operates in test mode.
         Password
            a. Name: password
            b. Type: String
            c. Value: any text, as the web application operates in test mode.

   Step 2. Open web site to enter email and password and log into application.
      Recorder Action: Open Website
      Settings:
         Site URL Value: https://pub_demo.s3.amazonaws.com/invoiceplane-2/invoiceplanedemo.html

   Step 3. Enter email.
      Recorder Action: Web Element
      Settings:
         1. Mode: Set value
         2. Input: email
         3. Options: XPath of the element: //*[@id="email"]

   Step 4. Enter password.
      Recorder Action: Web Element
      Settings:
         1. Mode: Set value
         2. Input: password
         3. Options: XPath of the element: //*[@id="password"]

   Step 5. Click Login to log into application with the email and password provided.
      Recorder Action: Click Mouse
      Settings:
         Locator: Click on web element (Xpath)
         Mouse button: Left button, Single click
         XPath of the target element: //*[@id="login"]/form/input

Play back the recording to make sure that all data is entered correctly, the Login button is clicked, and the login procedure is executed.

You can download the recording for further tests.

Note
The recording was made with RPA Express 8.

Download sample recording

Example 2. Create Invoice

In this sample we will use RPA Express to click the Create Invoice button in Invoice Plane. The step in the example is an extension to Example 1, as it is performed when we are logged into Invoice Plane.

1. Open the URL (https://pub_demo.s3.amazonaws.com/invoiceplane-2/invoiceplanedemodashboard.html) in Google Chrome.
2. Open DevTools and get the XPath for the following element:

   a. Create Invoice on the Quick Action panel
      //*[@id="panel-quick-actions"]/div[2]/a[3]
3. Alternatively, you can use the main menu, the Invoices dropdown and the Create Invoice menu item:

   a. Create Invoice menu item
      //*[@id="ip-navbar-collapse"]/ul[1]/li[4]/ul/li[1]/a
      or
      //a[@class='create-invoice']
   You can also find the link by its text using the following expressions:

      //a[contains(text(), 'Create Invoice')]

   or, if the text is known and unique,

      //a[text()='Create Invoice']

4. Add the following step to the previous recording:

   Step 1. Click Create Invoice to open the input mask to create a new invoice.
      Recorder Action: Click Mouse
      Settings:
         Locator: Click on web element (Xpath)
         Mouse button: Left button, Single click
         XPath of the target element: //*[@id="panel-quick-actions"]/div[2]/a[3]
         Alternatively, you can use //*[@id="ip-navbar-collapse"]/ul[1]/li[4]/ul/li[1]/a or //a[@class='create-invoice']

      Note
      As Invoice Plane is used for tests, the input mask does not open.

Play back the recording to make sure that Create Invoice is clicked.

You can download the recording for further tests.

Note
The recording was made with RPA Express 8.

Download sample recording

Example 3. Download Invoice

In this sample we will use RPA Express to click a link pointing to an invoice in image format in order to download the image. The step is an extension to Example 1, as it is performed when we are logged into Invoice Plane.

1. Open the URL (https://pub_demo.s3.amazonaws.com/invoiceplane-2/invoiceplanedemodashboard.html) in Google Chrome.
2. Open DevTools and get the XPath for the following element:
   Note
   We've got the XPath to the first element:

      //*[@id="panel-recent-invoices"]/div[2]/table/tbody/tr[1]/td[3]/a

   As we want to iterate through all available invoices in the panel, we need an XPath that gets the links from all table rows. It looks as follows:

      //*[@id="panel-recent-invoices"]/div[2]/table/tbody/tr/td[3]/a

   To get each link we create a counter in Recorder variables (for example, counter with type Number) and use it in the XPath:

      //*[@id="panel-recent-invoices"]/div[2]/table/tbody/tr[${counter}]/td[3]/a

3. Create the following recording:

   Step 1. Open web site to download invoices.
      Recorder Action: Open Website
      Settings:
         Site URL Value: https://pub_demo.s3.amazonaws.com/invoiceplane-2/invoiceplanedemodashboard.html

   Step 2. Create and use a List variable to save all links to invoices.
      Recorder Action: Web Element
      Settings:
         1. Mode: Get value
         2. Options: XPath of the element: //*[@id="panel-recent-invoices"]/div[2]/table/tbody/tr/td[3]/a
         3. Input: links_list (Recorder variable)

   Step 3. Create a loop to iterate through all items in the list (links_list).
      Recorder Action: For Each Loop
      Settings:
         Perform the nested action for each: item in links_list

   Step 4. Increment Counter (initially defined as 0).
      Recorder Action: Variables Expression Value
      Settings:
         Select variable: counter
         Expression: ${counter}+1

   Step 5. Right click a link to download an invoice in image format from Recent Invoices.
      Recorder Action: Click Mouse
      Settings:
         Locator: Click on web element (Xpath)
         Mouse button: Right button, Single click
         XPath of the target element: //*[@id="panel-recent-invoices"]/div[2]/table/tbody/tr[${counter}

   Step 6. Click Save Link As... in the context menu.
      Recorder Action: Click Mouse
      Settings:
         Capture the context menu.
         Locator: Click on image
         Mouse button: Left button, Single click
         Target location: set Anchor region and Click position in the captured image, so as the robot clicks the Save Link As... item in th

   Step 7. Click Save in the Save window.
      Recorder Action: Click Mouse
      Settings:
         Capture the Save window.
         Locator: Click on image
         Mouse button: Left button, Single click
         Target location: set Anchor region and Click position in the captured image, so as the robot clicks the Save button in the Save d

Play back the recording to make sure that all invoices are downloaded.
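The counter technique used in steps 2 to 5 can be sketched outside RPA Express. The snippet below is only an illustration, assuming Python's standard-library xml.etree.ElementTree and a hypothetical stand-in for the Recent Invoices panel; the file names, the cell contents and the BASE helper string are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the real panel on the demo page may be structured differently.
PANEL = """
<div>
  <div id="panel-recent-invoices">
    <div>Recent Invoices</div>
    <div>
      <table>
        <tbody>
          <tr><td>Paid</td><td>Client A</td><td><a href="invoice_1.png">1</a></td></tr>
          <tr><td>Sent</td><td>Client B</td><td><a href="invoice_2.png">2</a></td></tr>
          <tr><td>Draft</td><td>Client C</td><td><a href="invoice_3.png">3</a></td></tr>
        </tbody>
      </table>
    </div>
  </div>
</div>
"""

root = ET.fromstring(PANEL)

# Shared prefix of the locators (leading dot: relative to the parsed root).
BASE = ".//*[@id='panel-recent-invoices']/div[2]/table/tbody"

# Collection XPath: no index on tr, so the link of every row is returned.
links_list = root.findall(f"{BASE}/tr/td[3]/a")

counter = 0  # initially defined as 0, as in the recording
hrefs = []
for _ in links_list:
    counter += 1  # increment first, because XPath positions start at 1
    # Same locator with the row index substituted, like tr[${counter}].
    link = root.find(f"{BASE}/tr[{counter}]/td[3]/a")
    hrefs.append(link.get("href"))

print(hrefs)
```

Incrementing the counter before it is used matters: XPath positions are 1-based, so a counter that starts at 0 must be raised to 1 before the first row is addressed.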
You can download the recording for further tests.

Note
The recording was made with RPA Express 8.

Download sample recording

Example 4. Working with Elements in IFrames

The HTML <iframe> element is used for embedding another HTML page into the current page. Let's see how we can work with elements on the embedded page.

Let's take an example from the w3schools.com web site. We will open the web page and click Next, located in an IFrame, to open the next topic of the embedded page.

1. Open the URL (https://www.w3schools.com/html/html_iframe.asp) in Google Chrome.
2. Open DevTools and get XPaths for the following elements:

   a. IFrame (XPath to the IFrame with the embedded document)
      //iframe[contains(@height, '310px')]
   b. Next button
      //*[@id="main"]/div[2]/a[2]

3. Create the recording to automate clicking Next on the embedded document:

   Step 1. Create three String variables in Recorder.
      Settings:
         url: https://www.w3schools.com/html/html_iframe.asp
         target_iframe: //iframe[@src='default.asp']
         web_element: //*[@id="main"]/div[2]/a[2]

   Step 2. Open web site to work with an IFrame.
      Recorder Action: Open Website
      Settings:
         Site URL Value: ${url}

   Step 3. Click Next on the web page embedded in the IFrame.
      Recorder Action: Click Mouse
      Settings:
         Locator: Click on web element (Xpath)
         Mouse button: Left button, Single click
         XPath of the target element: ${web_element}
         Select the Search in iframe(-s) option
         XPath of parent iframe(-s): ${target_iframe}
Play back the recording to make sure that it works correctly, i.e. the next topic is displayed in the IFrame.

You can download the recording for further tests.

Note
The recording was made with RPA Express 8.

Download sample recording