Selenium get links

Selenium

Selenium automates browsers. The selenium module can drive a browser through most tasks a user could perform by hand, including automated testing, automating web tasks and data extraction. In this article we'll use it for data extraction: collecting the links from a web page.

Install it using:

pip install selenium
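To confirm the install worked before going further, a small check like the one below may help. It is only a sketch: the helper name installed_version is made up for this example, and it relies on the standard library's importlib.metadata (Python 3.8 or newer).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing.

    installed_version is a hypothetical helper name used only in this sketch.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if pip installed the package, otherwise None.
print("selenium:", installed_version("selenium"))
```

If this prints None, pip probably installed the package into a different Python environment than the one running the script.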

To use the module, you need a Selenium web driver for your browser. All the popular browsers are supported, including Chrome and Firefox.

After installing the web driver, you may need to add its directory to your PATH:

export PATH=$PATH:/usr/lib/chromium/
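To see whether the PATH change took effect, you could ask Python where it finds the driver binary. This is a sketch under assumptions: the binary names chromedriver and geckodriver are the usual ones for Chrome and Firefox but may differ on your system, and find_drivers is a hypothetical helper name.

```python
import shutil


def find_drivers(names=("chromedriver", "geckodriver")):
    """Map each driver binary name to its full path on PATH, or None if not found.

    The default names are assumptions; pass your own if your driver is named differently.
    """
    return {name: shutil.which(name) for name in names}


for name, path in find_drivers().items():
    print(name, "->", path if path else "not found on PATH")
```

A "not found" result for the driver you installed usually means the export line pointed at the wrong directory or was run in a different shell session.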

Starting Selenium

Test whether selenium is installed correctly using:

from selenium import webdriver
import time

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument("--test-type")
options.binary_location = "/usr/bin/chromium"
# Selenium 4 expects options=; the older chrome_options= keyword has been removed
driver = webdriver.Chrome(options=options)

driver.get('https://www.w3.org/')
time.sleep(3)  # keep the window open briefly so the page is visible
driver.quit()

Depending on your setup, you can start it without parameters:

webdriver.Chrome()
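The test script ends with driver.quit(), but if an exception is raised before that line the browser window stays open. One possible way to guard against this is a small context manager, sketched below. The names browser_session and make_driver are hypothetical and not part of Selenium; make_driver is any callable that returns a driver object with a quit() method.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver, hand it to the with-block, and always call quit() afterwards.

    browser_session and make_driver are hypothetical names for this sketch.
    """
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs even if the with-block raised, so the browser is closed.
        driver.quit()
```

With Selenium installed, it could be used as `with browser_session(webdriver.Chrome) as driver:` followed by an indented `driver.get('https://www.w3.org/')`.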

Extract links

To get the links from a web page, use the code below:

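from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument("--test-type")
options.binary_location = "/usr/bin/chromium"
# Selenium 4 expects options=; the older chrome_options= keyword has been removed
driver = webdriver.Chrome(options=options)

driver.get('https://www.w3.org/')
# find_elements_by_xpath was removed in Selenium 4.3; use find_elements with By.XPATH
for a in driver.find_elements(By.XPATH, './/a'):
    print(a.get_attribute('href'))
driver.quit()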
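The raw list of href values usually contains duplicates, anchors with no href at all (returned as None), and non-web schemes such as mailto:. A possible post-processing step is sketched below. The function name clean_links is hypothetical, and the filtering rules (drop fragments, keep only http and https) are assumptions you may want to change.

```python
from urllib.parse import urldefrag, urljoin


def clean_links(hrefs, base_url):
    """Return unique absolute http(s) URLs from hrefs, in first-seen order.

    clean_links is a hypothetical helper; its filtering rules are assumptions.
    """
    seen = set()
    result = []
    for href in hrefs:
        if not href:
            continue  # anchors without an href give None or an empty string
        # Resolve relative links against the page URL, then drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if not absolute.startswith(("http://", "https://")):
            continue  # skip mailto:, javascript:, tel: and similar schemes
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

In the loop above, you would collect the get_attribute('href') values into a list and pass that list, together with the page URL, to clean_links.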
