For data scraping, the first step is to find the data itself. This can be done in many ways: through unique attributes, class names, IDs, or CSS selectors. However, dynamic elements sometimes make it harder to find the data and to identify HTML elements. This is where XPath comes in handy.
When a web page is loaded, the browser builds a DOM (Document Object Model) structure from it. XPath is a query language for selecting objects in the DOM, which makes it a good way to find elements on a web page with Selenium as well.
Syntax of XPath
XML Path, commonly known as XPath, is a query language for XML documents. It lets one describe a path through the document that leads to any web element.
The XPath syntax consists of DOM attributes and tags, which makes it possible to locate any element on a web page using the DOM. In general, XPath starts with "//" and looks like this:
//tag_name[@Attribute_name = "Value"]/child nodes
Here tag_name is the node name, @ marks the start of the attribute name, and the value filters the results.
An example XPath:
//*[@id="w-node"]/div/a[1]
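To make the anatomy concrete, here is a minimal sketch that runs a path of this shape with Python's standard-library xml.etree.ElementTree. This is not Selenium: ElementTree only supports a limited subset of XPath, and the tiny document below is made up for illustration. Because ElementTree evaluates paths relative to an element, the expression starts with ".//" instead of "//".

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented only for this illustration.
doc = ET.fromstring(
    '<root>'
    '<div id="w-node"><div><a>first</a><a>second</a></div></div>'
    '<div id="other"><div><a>third</a></div></div>'
    '</root>'
)

# tag name: *  |  attribute filter: [@id='w-node']  |  child steps: /div/a[1]
links = doc.findall(".//*[@id='w-node']/div/a[1]")
print([a.text for a in links])  # ['first']
```

Only the first link inside the element with id "w-node" is selected; the links under the other div are filtered out by the attribute condition.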
Types of XPath
There are only 2 types of XPaths in Selenium - absolute XPaths and relative XPaths.
The examples use a web page with the following HTML code:
<!DOCTYPE html>
<html>
<head>
<title>A sample shop</title>
</head>
<body>
<div class="product-item">
<img src="example.com\item1.jpg">
<div class="product-list">
<h3>Pen</h3>
<span class="price">10$</span>
<a href="example.com\item1.html" class="button">Buy</a>
</div>
</div>
<div class="product-item">
<img src="example.com\item2.jpg">
<div class="product-list">
<h3>Book</h3>
<span class="price">20$</span>
<a href="example.com\item2.html" class="button">Buy</a>
</div>
</div>
</body>
</html>
Absolute XPath
An absolute XPath pinpoints one specific element by spelling out the full path from the document root. For example, let's write an absolute XPath for the product name:

Absolute XPath:
/html/body/div[1]/div/h3
To copy an XPath from Chrome DevTools (press F12 to open), inspect the element (Ctrl+Shift+C or the Inspect button):

Then right-click the highlighted line in the Elements panel and choose Copy > Copy full XPath:

The resulting XPath can be checked in the console, for example with $x("/html/body/div[1]/div/h3"):

Here one can also copy the HTML code of this element. Just right-click the result and choose “Copy Object”:

The result:
<h3>Pen</h3>
This method, also known as a single-slash search, is the most vulnerable to minor changes in the page structure.
Relative XPath
A relative XPath is more flexible and does not depend on minor changes in the page structure. The following relative XPath finds the same element as the absolute XPath above, along with every other element that matches the pattern:
//*[@class="product-list"]/h3
Let's check:

The result:
[ {<h3>Pen</h3>}, {<h3>Book</h3>} ]
Relative XPath can start to search anywhere in the DOM structure. Moreover, it is shorter than Absolute XPath.
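The difference in robustness can be sketched outside the browser with Python's standard-library xml.etree.ElementTree, which supports a subset of XPath. The pages below are simplified, well-formed copies of the sample page, and the extra <main> wrapper is an assumed redesign added only to show what breaks. ElementTree paths are relative to the root element, so "./body/..." stands in for "/html/body/...".

```python
import xml.etree.ElementTree as ET

ITEM = (
    '<div class="product-item">'
    '<div class="product-list"><h3>{name}</h3></div>'
    '</div>'
)
items = ITEM.format(name="Pen") + ITEM.format(name="Book")

# Simplified, well-formed copy of the sample page.
original = ET.fromstring(f"<html><body>{items}</body></html>")
# Assumed redesign: the products are now wrapped in an extra <main> element.
redesigned = ET.fromstring(f"<html><body><main>{items}</main></body></html>")

ABSOLUTE = "./body/div[1]/div/h3"            # mirrors /html/body/div[1]/div/h3
RELATIVE = ".//*[@class='product-list']/h3"  # mirrors //*[@class="product-list"]/h3


def names(page, path):
    """Return the text of every element matched by path."""
    return [h3.text for h3 in page.findall(path)]


print(names(original, ABSOLUTE))    # ['Pen']
print(names(original, RELATIVE))    # ['Pen', 'Book']
print(names(redesigned, ABSOLUTE))  # []
print(names(redesigned, RELATIVE))  # ['Pen', 'Book']
```

One extra wrapper element is enough to make the absolute path return nothing, while the relative path keeps working.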
XPath VS CSS Selectors
Readers who already know CSS selectors may find it hard to choose between the two. The main difference is that XPath can move both forward and backward through the DOM, while a CSS selector only moves forward and cannot select parent elements. However, XPath engines can behave differently from browser to browser, so an expression is not always portable.
In short, CSS selectors are the better choice when speed or simpler code matters, whereas XPath is more suitable for more complex tasks. A full article about CSS selectors is here.
Using XPath in Selenium
To scrape data with Selenium, use the By class, which selects the locator strategy. Two methods work together with By to find page elements:
find_element
returns the first web element in the DOM that matches the locator. If no element is found, the method throws a NoSuchElementException.
find_elements
returns a list of all web elements that match the locator, or an empty list if nothing is found.
So, to find the product name of the pen using XPath in Selenium:
from selenium.webdriver.common.by import By
# driver is assumed to be a started WebDriver with the sample page loaded
driver.find_element(By.XPATH, '//*[@class="product-list"]/h3')
And to get a list of all product names:
from selenium.webdriver.common.by import By
driver.find_elements(By.XPATH, '//*[@class="product-list"]/h3')
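Because find_elements returns an empty list instead of raising, it can double as a safe existence check. The helper below is a minimal sketch of that idea: first_text_or_none is a hypothetical name, and driver is assumed to be an already started Selenium WebDriver. The locator strategy is passed as the plain string "xpath", which is the value of By.XPATH, so the snippet itself needs no Selenium import; in real code one would normally pass By.XPATH.

```python
XPATH = "xpath"  # the string value of Selenium's By.XPATH constant


def first_text_or_none(driver, xpath):
    """Return the text of the first element matching xpath, or None.

    driver is assumed to be a running Selenium WebDriver (or any object
    with a compatible find_elements method).
    """
    matches = driver.find_elements(XPATH, xpath)  # [] when nothing matches
    return matches[0].text if matches else None
```

With the sample page loaded, calling first_text_or_none(driver, '//*[@class="product-list"]/h3') should give the text of the first product name, while an XPath that matches nothing gives None rather than a NoSuchElementException.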
Dynamic XPath in Selenium
To perform specific queries, one can use special commands and XPath operators.
XPath Using Logical Operators: OR & AND
Logical operators make a search more precise by combining conditions. XPath has 2 logical operators: or and and. They are case-sensitive, so "OR" and "AND" are incorrect.
Logical Operator OR
This XPath query returns the elements that match the first condition, the second condition, or both. For example:
//tag_name[@Attribute_name = "Value" or @Attribute_name2 = "Value2"]
For each element, the result depends on the two conditions:
First condition | Second condition | Result |
False | False | Element is not returned |
True | False | Element is returned |
False | True | Element is returned |
True | True | Element is returned |
Let's change the example above to see how the logical operator or works. Imagine that the pen price is stored in this element:
<span time-in="150" class="price">10$</span>
And book price:
<span time-in="100" class="price">20$</span>
Use the logical operator or:
//span[@time-in = "100" or @class = "price"]
The result:

The query returned both prices because both elements have the class "price".
Logical Operator AND
This XPath query returns only the elements that match both conditions. For example:
//tag_name[@Attribute_name = "Value" and @Attribute_name2 = "Value2"]
For each element, the result depends on the two conditions:
First condition | Second condition | Result |
False | False | Element is not returned |
True | False | Element is not returned |
False | True | Element is not returned |
True | True | Element is returned |
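To check it, take the example above and replace the operator or with and:
//span[@time-in = "100" and @class = "price"]
This time only the book price is returned, because it is the only element that satisfies both conditions.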
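The semantics of both operators can be sketched in plain Python. The standard-library xml.etree.ElementTree does not understand or and and inside a predicate, so this sketch applies the same conditions with ordinary Python expressions instead of XPath. The wrapping div is an assumption made to keep the fragment well-formed.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<div>'
    '<span time-in="150" class="price">10$</span>'
    '<span time-in="100" class="price">20$</span>'
    '</div>'
)
spans = list(doc.iter("span"))

# Mirrors //span[@time-in = "100" or @class = "price"]
either = [s.text for s in spans
          if s.get("time-in") == "100" or s.get("class") == "price"]

# Mirrors //span[@time-in = "100" and @class = "price"]
both = [s.text for s in spans
        if s.get("time-in") == "100" and s.get("class") == "price"]

print(either)  # ['10$', '20$']
print(both)    # ['20$']
```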

XPath using Starts-With()
This function finds elements whose text or attribute value starts with a given string. For example, let's find the article "Web Scraping with Python: from Fundamentals to Practice".

The XPath is:
//a[starts-with(text(),'Web Scraping')]
or
//a[starts-with(text(),'Web')]
Let's check:

But the following will not match, because the title does not start with this text:
//a[starts-with(text(),'Scraping with Python')]
This function works not only for static elements but also for dynamic ones (such as buttons). For example:
//span[starts-with(@class, 'read-more-link')]
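The prefix rule is easy to check with a small sketch. The standard-library xml.etree.ElementTree has no starts-with() function, so the sketch below emulates it with Python's str.startswith. The helper name links_starting_with and the second link title are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The second title is hypothetical; it only provides a non-matching link.
doc = ET.fromstring(
    '<ul>'
    '<li><a>Web Scraping with Python: from Fundamentals to Practice</a></li>'
    '<li><a>CSS Selectors Cheat Sheet</a></li>'
    '</ul>'
)


def links_starting_with(root, prefix):
    """Emulate //a[starts-with(text(), prefix)] and return the link texts."""
    return [a.text for a in root.iter("a") if (a.text or "").startswith(prefix)]


print(links_starting_with(doc, "Web Scraping"))          # the article title
print(links_starting_with(doc, "Scraping with Python"))  # [] (not a prefix)
```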
XPath using Index
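This method is useful when several elements match and only one of them is needed. The index is 1-based:
//tag[@attribute_name='value'][element_num]
Note that the index is applied among the siblings of each parent. To take the n-th element of the whole result set, wrap the expression in parentheses: (//tag[@attribute_name='value'])[element_num]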
Let's return to the or example and keep only the first result of the whole result set:
(//span[@time-in = "100" or @class = "price"])[1]
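An index predicate is counted inside each parent, not across the whole result list, which is easy to miss. The sketch below shows this with the standard-library xml.etree.ElementTree on a simplified copy of the two price elements. Taking element [0] of the Python list is used here as a stand-in for indexing the whole result set.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<body>'
    '<div><span class="price">10$</span></div>'
    '<div><span class="price">20$</span></div>'
    '</body>'
)

# [1] is evaluated per parent: each span is the first span inside its own div.
per_parent = doc.findall(".//span[@class='price'][1]")
print([s.text for s in per_parent])  # ['10$', '20$']

# First match of the whole result set, like (//span[@class="price"])[1].
first_overall = doc.findall(".//span[@class='price']")[0]
print(first_overall.text)  # 10$
```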

XPath using Following
This axis selects the element or elements that come after a known one in the document. The syntax is:
//tag[@attribute_name='value']/following::tag
The target does not have to be next to the known tag or at the same level. With find_element, Selenium returns the first match in document order, which is the nearest one:

XPath using Following-Sibling
This axis selects the siblings that come after the known element, that is, later elements with the same parent. It has the following syntax:
//tag[@attribute_name='value']/following-sibling::tag
The result is the same as in the previous example.
XPath using Preceding
The preceding axis selects all the elements that come before the current node in the document, excluding its ancestors:
//tag[@attribute_name='value']/preceding::tag
It searches at all levels of the document, not only among siblings.
XPath using Preceding-Sibling
The same as the previous one, but it only selects elements before the current node that share its parent:
//tag[@attribute_name='value']/preceding-sibling::tag
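The difference between following and following-sibling is easiest to see side by side. The standard-library xml.etree.ElementTree does not support these axes, so the sketch below emulates them by walking the tree in document order. The helper names following and following_siblings are hypothetical, and the document is a simplified copy of the sample page. The preceding axes are the mirror image: look before the node instead of after it.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<body>'
    '<div><h3>Pen</h3><span>10$</span></div>'
    '<div><h3>Book</h3><span>20$</span></div>'
    '</body>'
)
order = list(doc.iter())  # every element, in document order
parent_of = {child: parent for parent in order for child in parent}


def following(node):
    """Elements after node in document order, excluding its descendants."""
    inside = set(node.iter())  # node itself plus its descendants
    return [e for e in order[order.index(node) + 1:] if e not in inside]


def following_siblings(node):
    """Later children of the same parent."""
    children = list(parent_of[node])
    return children[children.index(node) + 1:]


pen_price = doc.find("div/span")  # the span holding 10$
print([e.text or e.tag for e in following(pen_price)])  # ['div', 'Book', '20$']
print([e.text for e in following_siblings(pen_price)])  # []
```

The pen price has no later siblings inside its own div, yet the following axis still reaches the whole second product.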
XPath using Child
This axis selects all the child elements of a particular node:
//tag[@attribute_name='value']/child::tag

XPath using Parent
This axis selects the parent element of a particular node:
//tag[@attribute_name='value']/parent::tag
XPath using Descendants
This axis selects all the descendants (children, grandchildren, etc.) of a particular node:
//tag[@attribute_name='value']/descendant::tag
XPath using Ancestors
This axis selects all the ancestors (parent, grandparent, etc.) of a particular node:
//tag[@attribute_name='value']/ancestor::tag
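The four vertical axes can be compared on one small tree. This is a sketch with the standard-library xml.etree.ElementTree, which has no axis syntax, so each axis is emulated with plain tree traversal. The helper ancestors and the parent_of map are hypothetical names, and the document is a simplified copy of one product from the sample page.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<body>'
    '<div class="product-item">'
    '<div class="product-list"><h3>Pen</h3><span>10$</span></div>'
    '</div>'
    '</body>'
)
parent_of = {child: parent for parent in doc.iter() for child in parent}


def ancestors(node):
    """Parent, grandparent, ... up to the root (nearest first)."""
    chain = []
    while node in parent_of:
        node = parent_of[node]
        chain.append(node)
    return chain


item = doc.find("div")     # <div class="product-item">
title = doc.find(".//h3")  # <h3>Pen</h3>

children = [e.tag for e in item]                             # ~ child::*
descendants = [e.tag for e in item.iter() if e is not item]  # ~ descendant::*
parent = parent_of[title].get("class")                       # ~ parent::*
ancestor_tags = [e.tag for e in ancestors(title)]            # ~ ancestor::*

print(children)       # ['div']
print(descendants)    # ['div', 'h3', 'span']
print(parent)         # product-list
print(ancestor_tags)  # ['div', 'div', 'body']
```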
Conclusion and Takeaways
So, XPath in Selenium helps locate elements for further scraping. It works with both static and dynamic data. Moreover, unlike CSS selectors, XPath can operate on all levels of the DOM structure, including parent elements.