Get HTML Source with Python Selenium

Selenium is a web automation module that can be used to get a web page's HTML code. This article shows how to do that.

You can use the web driver's .page_source attribute to grab the HTML code of any web page.
If you are new to Selenium, I recommend the course below.

Related course
Browser Automation with Python Selenium

Install selenium

If you haven’t done so already, install the selenium module (with pip), a web browser, and the web driver for that browser.


pip install selenium

For this example, you may need to add the Chromium directory to your PATH:


export PATH=$PATH:/usr/lib/chromium/
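To check whether the PATH change took effect, you can ask Python where it finds the browser. This is only a minimal sketch: the helper name find_browser and the candidate binary names are assumptions for illustration, and the names on your system may differ.

```python
import shutil


def find_browser(candidates=("chromium", "chromium-browser", "google-chrome")):
    """Return the full path of the first candidate found on PATH, or None.

    The candidate names are assumptions; adjust them for your system.
    """
    for name in candidates:
        path = shutil.which(name)  # looks the name up in the PATH directories
        if path:
            return path
    return None


# Prints the path of the browser binary, or None if nothing was found
print(find_browser())
```

If this prints None, the browser is probably not on your PATH yet, or it is installed under a different name.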

Get HTML source

Import webdriver from the selenium module. Then create a webdriver object (Chromium here); you can optionally tell it to ignore certificate errors.

Of course any supported web browser can be used, but for this example I’ve used Chromium.

Once the web browser has started, we navigate to a web page URL using the get() method. Then we read the page source.


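from selenium import webdriver

# Optional Chromium flags and the location of the browser binary
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument("--test-type")
options.binary_location = "/usr/bin/chromium"

# Pass the options with the `options` keyword (`chrome_options` is no longer accepted by current Selenium releases)
driver = webdriver.Chrome(options=options)
driver.get('https://python.org')

# page_source holds the HTML of the page that is currently loaded
html = driver.page_source
print(html)

# Close the browser when you are done
driver.quit()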

It will output the webpage source, which is stored in the variable html.

Selenium will start the Chromium browser automatically.
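Printing a whole page to the terminal is hard to read, so you may prefer to write the source to a file. Below is a minimal sketch of that idea. The helper name save_page_source and the file name are made up for this example; the function only assumes that the object you pass in has a page_source attribute, as a Selenium web driver does.

```python
from pathlib import Path


def save_page_source(driver, path):
    """Write driver.page_source to `path` as UTF-8 and return its length in characters.

    `driver` is assumed to be any object with a `page_source` attribute,
    such as a Selenium web driver that has already loaded a page.
    """
    html = driver.page_source
    Path(path).write_text(html, encoding="utf-8")
    return len(html)


# Usage with the driver from the example above (hypothetical file name);
# call it before driver.quit():
# save_page_source(driver, "python_org.html")
```

You can then open the saved file in a text editor or a browser to inspect the HTML.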
