In this post I will show you how easy it is to write Python code that extracts a list of hotels from booking.com. The code stays simple thanks to Selenium WebDriver, which does the actual data extraction here.
Let’s say we need to extract the names of hotels in Berlin. All we really need to do is fill out the following search form on booking.com and click the Search button:
This time I will start with a complete code snippet, and then I will explain what each part means.
Here is the code that fills the form, clicks the Search button, extracts hotel names and prints the result on the screen:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
driver = webdriver.Firefox()
driver.get('http://booking.com')
driver.find_element_by_css_selector("#destination").send_keys("Berlin")
WebDriverWait(driver, 1, poll_frequency=0.1).\
    until(lambda drv: len(drv.find_elements_by_css_selector("ul.ui-autocomplete li")) > 0)
driver.find_element_by_css_selector("ul.ui-autocomplete li").click()
driver.find_element_by_css_selector("#availcheck").click()
driver.find_element_by_css_selector("#searchbox_btn").submit()
for link in driver.find_elements_by_css_selector("a.hotel_name_link"):
    print(link.text)
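A note for readers on a recent Selenium: the find_element_by_* helpers used above were removed in Selenium 4.3, so the snippet needs the find_element(By.CSS_SELECTOR, ...) form there. Below is a sketch of the same steps in that style. The function name print_berlin_hotels is made up for this example, the selectors are simply the ones from this post and may no longer match the live site, and the imports sit inside the function only so that defining it does not require Selenium to be installed.

```python
def print_berlin_hotels():
    """Same flow as the snippet above, written for Selenium 4.3 or later."""
    # Imported here so that defining the function does not need Selenium
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get("http://booking.com")
        # Type the city and wait for the autocomplete list to show up
        driver.find_element(By.CSS_SELECTOR, "#destination").send_keys("Berlin")
        WebDriverWait(driver, 1, poll_frequency=0.1).until(
            lambda drv: len(drv.find_elements(By.CSS_SELECTOR, "ul.ui-autocomplete li")) > 0
        )
        driver.find_element(By.CSS_SELECTOR, "ul.ui-autocomplete li").click()
        # Tick the checkbox and submit the search form
        driver.find_element(By.CSS_SELECTOR, "#availcheck").click()
        driver.find_element(By.CSS_SELECTOR, "#searchbox_btn").submit()
        for link in driver.find_elements(By.CSS_SELECTOR, "a.hotel_name_link"):
            print(link.text)
    finally:
        # Close the browser even if one of the steps fails
        driver.quit()
```

Calling print_berlin_hotels() opens a Firefox window, so it needs Firefox and its driver available on the machine.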
1. Opening the home page
First we need to create a WebDriver and go to the home page of booking.com. Let’s use Firefox WebDriver as it’s already available in the standard selenium library:
driver = webdriver.Firefox()
driver.get('http://booking.com')
2. Selecting the destination
Then we need to type “Berlin” in the destination box and select the first item from the drop down list:
Here is the code that does it:
driver.find_element_by_css_selector("#destination").send_keys("Berlin")
WebDriverWait(driver, 1, poll_frequency=0.1).\
    until(lambda drv: len(drv.find_elements_by_css_selector("ul.ui-autocomplete li")) > 0)
driver.find_element_by_css_selector("ul.ui-autocomplete li").click()
The first line causes the WebDriver to type “Berlin” into the text box with id=”destination”.
The second line is a bit more complicated. It makes the WebDriver wait until the autocompletion list appears on the page. This is necessary because when you type the name of the city, booking.com makes an AJAX request to the server to get a list of destinations matching your input. WebDriverWait periodically (every 0.1 sec in our case) checks a condition (defined in the lambda function). Its until method returns as soon as the condition becomes true, and raises a TimeoutException if the timeout (1 sec in our case) expires first.
The third line simply clicks on the first item in the list of proposed destinations.
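To make the waiting step less of a black box, here is a rough sketch of the polling idea in plain Python. It is not Selenium's actual implementation: wait_until and list_has_items are names invented for this example, and the sketch raises the built-in TimeoutError where Selenium raises its own TimeoutException.

```python
import time


def wait_until(condition, timeout=1.0, poll_frequency=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError if `timeout` seconds
    pass without the condition becoming true.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.2f s" % timeout)
        time.sleep(poll_frequency)


# Usage: a stand-in condition that becomes true on the third check,
# much like an autocomplete list that appears after a short delay.
attempts = []


def list_has_items():
    attempts.append(1)
    return len(attempts) >= 3


wait_until(list_has_items, timeout=1.0, poll_frequency=0.1)
```

In the real code the condition is the lambda that counts the "ul.ui-autocomplete li" elements, and the driver is passed to it on every check.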
3. Checking the checkbox
The next line checks the “I don’t have specific dates yet” checkbox to get a list of hotels independent of any dates:
driver.find_element_by_css_selector("#availcheck").click()
4. Clicking the Search button
Finally we need to click the Search button to start the search:
driver.find_element_by_css_selector("#searchbox_btn").submit()
5. Extracting the hotel names
After we have submitted the search request, the WebDriver will wait until the page reloads and then we can extract all the hotel names listed in the search results and print them on the screen:
for link in driver.find_elements_by_css_selector("a.hotel_name_link"):
    print(link.text)
Note though that for the purpose of simplicity I didn’t implement moving to the next page of the search results here. You can easily do it by yourself by forcing the WebDriver to click the “Next page” link at the bottom of the page.
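If you want a starting point for the paging, here is a minimal sketch. It assumes the same Selenium version as the rest of this post (the find_elements_by_css_selector method). The function name collect_hotel_names, the max_pages limit and especially the "a.paging-next" selector are placeholders invented for this example, so look up the real selector of the “Next page” link in the developer tools first.

```python
def collect_hotel_names(driver, next_selector="a.paging-next", max_pages=5):
    """Collect hotel names from up to max_pages pages of search results.

    next_selector is a hypothetical CSS selector for the "Next page" link.
    """
    names = []
    for page_number in range(max_pages):
        # Same selector as in step 5 of the post
        for link in driver.find_elements_by_css_selector("a.hotel_name_link"):
            names.append(link.text)
        next_links = driver.find_elements_by_css_selector(next_selector)
        # Stop when there is no "Next page" link or the page limit is reached
        if not next_links or page_number == max_pages - 1:
            break
        next_links[0].click()
    return names
```

Depending on how the site loads the next page, the click may return before the new results are in place. In that case a WebDriverWait after the click, like the one in step 2, would probably be needed.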
How to find web page elements
You have probably noticed that to get the web page elements to work with, I use the find_element_by_css_selector and find_elements_by_css_selector functions. Both receive a CSS selector as a parameter. find_element_by_css_selector returns the first matching element and throws a NoSuchElementException if nothing is found. find_elements_by_css_selector returns a list of all matching elements, and that list is simply empty if nothing is found.
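One practical consequence: when an element may or may not be on the page, asking for the list avoids having to catch an exception. A small sketch of that, where find_first_or_none is a helper name made up for this example and the driver is assumed to offer find_elements_by_css_selector as in the rest of this post:

```python
def find_first_or_none(driver, selector):
    """Return the first element matching the CSS selector, or None."""
    # find_elements_* returns an empty list instead of raising an exception
    matches = driver.find_elements_by_css_selector(selector)
    return matches[0] if matches else None
```

For example, find_first_or_none(driver, "#availcheck") gives back the checkbox element if it is on the page and None otherwise.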
You can easily find out an element’s identifier by looking at it in your web browser’s developer tools:
In Chrome and Firefox you can open this tool set by pressing Ctrl+Shift+I and in Internet Explorer you can get it by hitting F12.
There you are. If you have any questions or suggestions feel free to comment below!
I would also like to offer you a video showing what I was talking about:
5 replies on “A Simple Code that Extracts a Hotel List from Booking.com”
Thank you for good practical post.
Step5 does not seem to be working anymore. No value is returned.
Though, thanks much for this helpful blog!
Hi Amy. Just checked. It still works fine for me.
Hi Michael
What I meant was that your codes take me to this error page
http://www.booking.com/searchresults.html?src=index&nflt=&ss_raw=&error_url=http%3A%2F%2Fwww.booking.com%2Findex.en-us.html%3Fsid%3Df9c15aa06893bfe29342e0f8e25779af%3Bdcid%3D1%3B&dcid=1&sid=f9c15aa06893bfe29342e0f8e25779af&si=ai%2Cco%2Cci%2Cre%2Cdi&ss=Berlin&checkin_monthday=0&checkin_year_month=0&checkout_monthday=0&checkout_year_month=0&idf=on&interval_of_time=any&sb_predefined_group_options_value=2&no_rooms=1&group_adults=2&group_children=0
Amy, I don’t see any error on this page. Usually this page is shown if you didn’t click an item on step #2.