python requests library
Python’s Requests library provides an intuitive way to send HTTP and HTTPS requests. Unlike the traditional urllib library, which often requires extra boilerplate, Requests keeps common tasks such as fetching pages, posting data, and reading responses down to a line or two.
Requests is a versatile HTTP library written in Python. It is built on urllib3, which in turn uses Python’s standard HTTP client (httplib in Python 2, http.client in Python 3), and it hides the intricacies of direct HTTP communication.
Installation
To install the Requests library from source, run the following commands:
git clone https://github.com/kennethreitz/requests.git
cd requests
sudo python setup.py install
Alternatively, running pip install requests installs the latest release from PyPI.
With the library installed, you can now dive into various examples illustrating its utility.
Extracting Raw HTML
To fetch and display the raw HTML of a website, use the following code:
import requests
r = requests.get('http://pythonspot.com/')
# r.text is the decoded page; r.content holds the same data as raw bytes
print(r.text)
Running the script will yield the site’s HTML content.
Downloading Images
Requests can also download binary data such as images. Here’s how:
from io import BytesIO
from PIL import Image  # provided by the Pillow package
import requests
r = requests.get('http://1.bp.blogspot.com/_r-MQun1PKUg/SlnHnaLcw6I/AAAAAAAAA_U')
# r.content holds the raw bytes; wrap them in a file-like object for PIL
i = Image.open(BytesIO(r.content))
i.show()
This code fetches and displays an image directly in Python.
Checking Website Status
Check the status of a website using:
import requests
r = requests.get('http://pythonspot.com/')
print(r.status_code)
A response of 200 signifies that the request succeeded and the website is up and running. For a comprehensive list of status codes, check out this Wikipedia page.
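As a small extension of the check above, the sketch below groups a status code into its broad class and wraps the request so that connection problems are reported rather than raised. The helper names status_class and check_site and the 5-second timeout are illustrative assumptions, not part of Requests.

```python
import requests


def status_class(code):
    """Map an HTTP status code to a coarse category (hypothetical helper)."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "informational or unknown"


def check_site(url, timeout=5):
    """Return a short status description, or the error type if the request fails."""
    try:
        r = requests.get(url, timeout=timeout)
    except requests.RequestException as exc:
        # Covers DNS failures, refused connections, timeouts and malformed URLs
        return "request failed: " + type(exc).__name__
    return "{} ({})".format(r.status_code, status_class(r.status_code))
```

Calling check_site('http://pythonspot.com/') returns a short string describing the outcome. Because Requests follows redirects by default, the code reported is normally that of the final page.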
Fetching JSON Data
Fetching JSON data from a web server takes one extra call: r.json() decodes the response body into Python lists and dictionaries. Here’s a snippet:
import requests
r = requests.get('https://api.github.com/events')
print(r.json())
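Once decoded, the JSON is ordinary Python data, so extracting something from it is plain list and dictionary work. The sketch below counts events by their type; it assumes each event is a dictionary with a "type" key, and the function names fetch_events and count_event_types are made up for this example.

```python
from collections import Counter

import requests


def fetch_events(url='https://api.github.com/events', timeout=10):
    """Download and decode the JSON document at `url`."""
    r = requests.get(url, timeout=timeout)
    r.raise_for_status()   # raise an exception on 4xx/5xx responses
    return r.json()        # already parsed into Python lists and dicts


def count_event_types(events):
    """Count how often each event type occurs (assumes a 'type' key per event)."""
    return Counter(event.get('type', 'unknown') for event in events)
```

A typical use would be count_event_types(fetch_events()).most_common(3), which lists the three most frequent event types in the current feed.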
Sending HTTP POST Requests
Send data to a server with a POST request as follows:
import requests
# the dictionary is sent as form-encoded data in the request body
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.post("http://httpbin.org/post", data=payload)
print(r.text)
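To see what actually goes over the wire, a request can be prepared without being sent. The sketch below compares the data= argument, which form-encodes the dictionary, with the json= argument, which serializes it as JSON; it reuses the payload and URL from the example above.

```python
import requests

payload = {'key1': 'value1', 'key2': 'value2'}

# Build the requests without sending them, so the encoded body can be inspected
form_req = requests.Request('POST', 'http://httpbin.org/post', data=payload).prepare()
json_req = requests.Request('POST', 'http://httpbin.org/post', json=payload).prepare()

print(form_req.headers['Content-Type'])  # application/x-www-form-urlencoded
print(form_req.body)                     # key1=value1&key2=value2
print(json_req.headers['Content-Type'])  # application/json
```

Which form to use depends on what the server expects: HTML form handlers usually want data=, while many JSON APIs want json=.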
SSL Verification
Verify the SSL certificate of a website with:
import requests
# verify=True is the default; an invalid certificate raises requests.exceptions.SSLError
print(requests.get('https://github.com', verify=True))
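When verification fails, Requests raises an exception rather than returning a response. The sketch below turns that into a readable result; the function name certificate_status, the return strings, and the 5-second timeout are assumptions made for illustration.

```python
import requests


def certificate_status(url, timeout=5):
    """Report whether an HTTPS request passes certificate verification."""
    try:
        requests.get(url, verify=True, timeout=timeout)
    except requests.exceptions.SSLError:
        # Raised, for example, when the certificate is expired, self-signed,
        # or issued for a different host name
        return "certificate rejected"
    except requests.exceptions.RequestException as exc:
        # Any other failure (DNS, timeout, malformed URL) says nothing
        # about the certificate itself
        return "error: " + type(exc).__name__
    return "certificate accepted"
```

The SSLError clause has to come first, because SSLError is itself a subclass of RequestException.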
Delving into HTTP Response Headers
Every HTTP response comes with a set of headers. Read them using:
import requests
r = requests.get('http://pythonspot.com/')
print(r.headers)
The headers object already behaves like a case-insensitive dictionary, so individual fields can be read directly, without converting anything to JSON first:
import requests
r = requests.get('http://pythonspot.com/')
headers = r.headers
# .get() returns None instead of raising KeyError when a header is absent
print(headers.get('server'))
print(headers.get('content-length'))
print(headers.get('content-encoding'))
print(headers.get('content-type'))
print(headers.get('date'))
print(headers.get('x-powered-by'))
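The object stored in r.headers is a requests.structures.CaseInsensitiveDict. The sketch below builds one by hand to show how lookups behave without needing a network connection; the header values are sample data invented for the example.

```python
from requests.structures import CaseInsensitiveDict

# Sample values standing in for a real response's headers (assumed for illustration)
headers = CaseInsensitiveDict({
    'Content-Type': 'text/html; charset=UTF-8',
    'Server': 'nginx',
})

print(headers['content-type'])      # lookup ignores case
print(headers.get('X-Powered-By'))  # None: the header is absent, no KeyError
for name, value in headers.items():
    print(name, '=', value)         # names keep their original capitalization
```

Since not every server sends every header, .get() with an optional default is usually safer than indexing with square brackets.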
Parsing HTML Responses
Once the data has been retrieved from a server, the next step is parsing it. Python string functions can handle simple cases, but libraries like BeautifulSoup provide a more robust approach:
from bs4 import BeautifulSoup
import requests
# fetch html content
r = requests.get('http://stackoverflow.com/')
html_doc = r.content
# instantiate a BeautifulSoup object, naming the parser explicitly
soup = BeautifulSoup(html_doc, 'html.parser')
# retrieve title
print(soup.title)
# list all hyperlinks
for link in soup.find_all('a'):
    print(link.get('href'))
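Many of the href values found this way are relative paths. As a sketch, the helper below resolves them against the page URL and skips anchors that have no href at all; the name extract_links and the inline sample HTML are assumptions made so the example runs without a network connection.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    for a in soup.find_all('a', href=True):  # href=True skips anchors without one
        links.append(urljoin(base_url, a['href']))
    return links


sample = ('<html><body><a href="/questions">Questions</a>'
          '<a name="top">no link here</a>'
          '<a href="https://example.com/">Example</a></body></html>')
print(extract_links(sample, 'http://stackoverflow.com/'))
# ['http://stackoverflow.com/questions', 'https://example.com/']
```

With a live page, the same function can be called as extract_links(r.content, r.url), where r is the response returned by requests.get.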
Leave a Reply:
We have to get Pillow Library to execute 'Download binary image using Python'
How to :
pip install Pillow
Hi,
Can you extend this section by adding actions on the response from a GET request (for example, extracting something from the response)? I want to learn about this.
Hi, yes I will extend the section