Selenium get links


Selenium
Selenium automates browsers. With the selenium module you can drive a real browser from Python for automated testing, automating web tasks, and data extraction. In this article we’ll use it for data extraction: collecting the links from a web page.

Install it using:

pip install selenium

To use the module, you need a web driver for your browser. All the popular browsers are supported, including Chrome (chromedriver) and Firefox (geckodriver).

After installing the web driver, you may need to add its directory to your PATH:

export PATH=$PATH:/usr/lib/chromium/
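
Before launching the browser, it can help to confirm that the driver executable is actually visible on your PATH. The sketch below uses only the Python standard library. The helper name find_driver is made up for this example, and chromedriver is assumed as the driver's file name (use geckodriver for Firefox).

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable, or None if it is not on PATH."""
    # shutil.which searches the directories listed in the PATH environment variable
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH")
    else:
        print("chromedriver found at", path)
```

If the driver is not found, check the directory you exported above. Recent Selenium releases (4.6 and later) can also locate a driver on their own, so this check mostly matters on older setups.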

Starting Selenium
Test if selenium is installed correctly using:

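from selenium import webdriver
import time

# Configure Chrome/Chromium before launching it
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--test-type')
# Path to the browser binary; adjust for your system
options.binary_location = "/usr/bin/chromium"
# Selenium 4 expects the options= keyword (chrome_options= is no longer accepted)
driver = webdriver.Chrome(options=options)

driver.get('https://www.w3.org/')
time.sleep(3)  # keep the window open briefly so you can see the page
driver.quit()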

Depending on your setup (for example, when the browser and driver are in their default locations), you can start it without parameters:

webdriver.Chrome()

Extract links
To get the links from a web page, use the code below:

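from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--test-type')
options.binary_location = "/usr/bin/chromium"
driver = webdriver.Chrome(options=options)

driver.get('https://www.w3.org/')
# find_elements_by_xpath was removed in recent Selenium 4 releases;
# use find_elements with a By locator instead
for a in driver.find_elements(By.XPATH, './/a'):
    # href is None for anchors that have no href attribute
    print(a.get_attribute('href'))
driver.quit()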
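
The loop above prints every href as it comes, which can include None for anchors without an href, duplicates, and non-web schemes such as mailto:. A minimal sketch of a clean-up step follows. clean_links is a hypothetical helper, not part of Selenium, and it assumes you have first collected the values into a list instead of printing them.

```python
from urllib.parse import urlparse


def clean_links(hrefs):
    """Drop empty values, non-HTTP(S) links and duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for href in hrefs:
        if not href:
            continue  # skip None and empty strings
        if urlparse(href).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, tel: and similar
        if href in seen:
            continue  # skip links already collected
        seen.add(href)
        result.append(href)
    return result


if __name__ == "__main__":
    # Sample values standing in for what get_attribute('href') might return
    sample = [
        "https://www.w3.org/",
        None,
        "mailto:someone@example.org",
        "https://www.w3.org/",
        "https://www.w3.org/standards/",
    ]
    for link in clean_links(sample):
        print(link)
```

With Selenium, the input list could be built with a list comprehension over the anchor elements, calling get_attribute('href') on each one.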