selenium get link


What is Selenium?
Selenium is a tool for automating web browsers. It is best known for testing, but it also works for other web tasks such as data extraction. This tutorial shows how to use Selenium to extract the links from a web page.

Setting Up Selenium
Before diving in, ensure you have Selenium installed. If not, you can easily install it with pip:

pip install selenium
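
Selenium's Python API differs between major versions, so it can help to confirm which version pip installed before running older scripts. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of Selenium.

```python
from importlib import metadata


def installed_version(package="selenium"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(f"selenium version: {version or 'not installed'}")
```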

After installing the Selenium module, set up a web driver: the browser-specific program that lets Selenium control a browser. Chrome (through ChromeDriver) and Firefox (through geckodriver) are both well supported.

On some setups you may also need to add the directory that contains the web driver to your system's PATH:

export PATH=$PATH:/usr/lib/chromium/
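
To see whether a driver executable is actually reachable through PATH, a quick check like the one below may help. It is only a sketch: `locate_drivers` is a hypothetical helper, and the executable names `chromedriver` and `geckodriver` are the usual defaults, which may differ on your system.

```python
import shutil


def locate_drivers(names=("chromedriver", "geckodriver")):
    """Map each driver executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_drivers().items():
    print(f"{name}: {path or 'not found on PATH'}")
```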

Testing Your Selenium Setup
To ensure that Selenium has been set up correctly, you can run the following code:

from selenium import webdriver
import time

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument("--test-type")
# Path to the browser binary; adjust for your system.
options.binary_location = "/usr/bin/chromium"
# Selenium 4 takes the options object through the `options` keyword.
driver = webdriver.Chrome(options=options)

driver.get('https://www.w3.org/')
time.sleep(3)  # keep the window open briefly so you can see the page
driver.quit()

If the browser and its driver are in their default locations, the driver can be created without any options:

driver = webdriver.Chrome()
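
The test script above only reaches driver.quit() if every earlier line succeeds, so a failed page load can leave a browser process running. One possible way to guard against that is a small context manager, sketched below. `managed_driver` and `page_title` are hypothetical helper names, and the `factory` parameter is an assumption added so the helper can be exercised without a real browser; by default it falls back to `webdriver.Chrome`.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory=None):
    """Yield a WebDriver and always quit it, even if the body raises."""
    if factory is None:
        # Imported lazily so the helper can be defined before a browser is set up.
        from selenium import webdriver
        factory = webdriver.Chrome
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # closes the browser and ends the driver process


def page_title(url, factory=None):
    """Load url and return the page title."""
    with managed_driver(factory) as driver:
        driver.get(url)
        return driver.title
```

With a working setup, calling page_title('https://www.w3.org/') should open the page, return its title, and close the browser whether or not the load succeeds.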

How to Extract Links with Selenium?
To extract the links from a page, find every anchor (a) element and read its href attribute. The script below prints each link on the W3C home page:

from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument("--test-type")
options.binary_location = "/usr/bin/chromium"
driver = webdriver.Chrome(options=options)

driver.get('https://www.w3.org/')
# Selenium 4 removed find_elements_by_xpath; use find_elements with By.XPATH.
for a in driver.find_elements(By.XPATH, './/a'):
    print(a.get_attribute('href'))
driver.quit()
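
The raw href values usually need some cleanup before they are useful for data mining: anchors without an href yield None, some links use schemes such as mailto: or javascript:, and the same URL often appears more than once. The sketch below shows one way to tidy the list with the standard library. `clean_links` is a hypothetical helper, and keeping only http and https URLs is an assumption about what counts as a useful link.

```python
from urllib.parse import urljoin, urlsplit


def clean_links(hrefs, base_url):
    """Return unique absolute http(s) URLs from raw href values, in first-seen order."""
    seen = set()
    result = []
    for href in hrefs:
        if not href:  # anchors without an href yield None or ''
            continue
        absolute = urljoin(base_url, href.strip())
        if urlsplit(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, tel: and similar
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


sample = [
    "https://www.w3.org/standards/",
    None,
    "mailto:someone@example.org",
    "/about/",
    "https://www.w3.org/standards/",
]
print(clean_links(sample, "https://www.w3.org/"))
```

In a Selenium script, the hrefs argument would be the values returned by a.get_attribute('href') for each anchor, and base_url could be driver.current_url.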






