Create and read CSV files

Spreadsheets often export CSV (comma-separated values) files because they are easy to read and write. A CSV file simply consists of values, commas and newlines. While the format is called 'comma-separated values', you can use another separator, such as the pipe character.
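
For example (a small sketch added here; the filename is arbitrary), passing a different delimiter to Python's csv module produces a pipe-separated file:

import csv

# write a pipe-separated file instead of a comma-separated one
with open('persons_pipe.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='|')
    writer.writerow(['Name', 'Profession'])
    writer.writerow(['Derek', 'Software Developer'])

# the file now contains:
# Name|Profession
# Derek|Software Developer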

Related course
Data Analysis in Python with Pandas

Create a spreadsheet file (CSV) in Python
Let us create a file in CSV format with Python. We will use the comma character as the separator (delimiter).

import csv

# open the file for writing; newline='' avoids blank lines on Windows
with open('persons.csv', 'w', newline='') as csvfile:
    filewriter = csv.writer(csvfile, delimiter=',',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    filewriter.writerow(['Name', 'Profession'])
    filewriter.writerow(['Derek', 'Software Developer'])
    filewriter.writerow(['Steve', 'Software Developer'])
    filewriter.writerow(['Paul', 'Manager'])

Running this code gives us the file persons.csv with this content:

Name,Profession
Derek,Software Developer
Steve,Software Developer
Paul,Manager

You can import the persons.csv file in your favorite office program.
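
A note on the quotechar argument (a small sketch added for illustration; the filename and extra row are hypothetical): with QUOTE_MINIMAL, the writer only wraps a field in the quote character when the field contains the delimiter:

import csv

with open('quoting_demo.csv', 'w', newline='') as csvfile:
    filewriter = csv.writer(csvfile, delimiter=',',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    # 'Sales, Manager' contains a comma, so it is written as:
    # Anna,|Sales, Manager|
    filewriter.writerow(['Anna', 'Sales, Manager'])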

(Image: spreadsheet file created in Python)

Read a spreadsheet file (CSV)
Once you have created a CSV file, you can read it row by row with the code below:

import csv

# open file
with open('persons.csv', 'r', newline='') as f:
    reader = csv.reader(f)

    # read file row by row
    for row in reader:
        print(row)

This will simply show every row as a list:

['Name', 'Profession']
['Derek', 'Software Developer']
['Steve', 'Software Developer']
['Paul', 'Manager']

Perhaps you want to store the data in Python lists instead. We read the rows from the CSV file and append the values to two lists, skipping the header row with an if statement because it does not belong in the data. Full code:

import csv

# create list holders for our data
names = []
jobs = []

# open file
with open('persons.csv', 'r', newline='') as f:
    reader = csv.reader(f)

    # read file row by row
    for row_nr, row in enumerate(reader):
        # skip the header row (row number 0)
        if row_nr >= 1:
            names.append(row[0])
            jobs.append(row[1])

# print data
print(names)
print(jobs)

Result:

['Derek', 'Steve', 'Paul']
['Software Developer', 'Software Developer', 'Manager']
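
As an alternative (a sketch added here, not part of the original tutorial), csv.DictReader reads the header row itself and returns every data row as a dictionary, so no manual skipping is needed:

import csv

names = []
jobs = []

# DictReader consumes the header row and maps each data row to a dict
with open('persons.csv', 'r', newline='') as f:
    for row in csv.DictReader(f):
        names.append(row['Name'])
        jobs.append(row['Profession'])

print(names)
print(jobs)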

Most spreadsheet and office programs can export CSV files, so we recommend creating a few CSV files of your own and playing around with them 🙂

Next tutorial: Zip archives

Requests: HTTP for Humans

If you want to request data from webservers, the traditional way to do that in Python is the urllib library. While this library is effective, you can easily end up with more complexity than needed when building something. Is there another way?

Requests is an Apache2 Licensed HTTP library, written in Python. It's powered by urllib3, but it does all the hard work for you.

To install, type:

pip install requests

The Requests library is now installed. We will list some examples below:

Related course
Python BeautifulSoup: Extract Web Data Beautifully

Grabbing raw HTML using HTTP/HTTPS requests
We can now query a website as follows:

import requests

r = requests.get('http://pythonspot.com/')
print(r.content)

Save it (for example as website.py) and run it with:

python website.py

It will output the raw HTML code.
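
A quick note (added for clarity): r.content holds the raw response as bytes, while r.text holds it decoded to a string using the encoding Requests inferred from the response headers:

import requests

r = requests.get('http://pythonspot.com/')
print(type(r.content))  # <class 'bytes'>
print(type(r.text))     # <class 'str'>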

Download a binary image using Python

from io import BytesIO
from PIL import Image
import requests

# the image URL in the original post was truncated; substitute any image URL
url = 'http://1.bp.blogspot.com/_r-MQun1PKUg/SlnHnaLcw6I/AAAAAAAAA_U'
r = requests.get(url)

# wrap the raw bytes in BytesIO so PIL can open them like a file
i = Image.open(BytesIO(r.content))
i.show()

(Image: an image retrieved using Python)
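
If you want to keep the image on disk (a sketch added for illustration; the URL and filename are placeholders), write the raw response bytes straight to a file:

import requests

# replace with the URL of the image you want to save
url = 'http://example.com/image.jpg'
r = requests.get(url)

# write the raw bytes to a local file
with open('image.jpg', 'wb') as f:
    f.write(r.content)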

Website status code (is the website online?)

import requests

r = requests.get('http://pythonspot.com/')
print(r.status_code)

This returns 200 (OK). A list of status codes can be found here: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
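
In a script you usually want to act on the status code rather than just print it. A small sketch: r.ok is True for any status below 400, and raise_for_status() turns 4xx/5xx responses into exceptions:

import requests

r = requests.get('http://pythonspot.com/')
if r.ok:
    print('website is online')

# or raise an exception on an error response
r.raise_for_status()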

Retrieve JSON from a webserver 
You can easily grab a JSON object from a webserver.

import requests

r = requests.get('https://api.github.com/events')
print(r.json())
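
r.json() parses the response body into ordinary Python objects, here a list of dictionaries. A sketch of drilling into the result, assuming the GitHub events feed still returns event objects with a 'type' field:

import requests

r = requests.get('https://api.github.com/events')
events = r.json()

# print the type of the most recent event, e.g. 'PushEvent'
print(events[0]['type'])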

HTTP Post requests using Python

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.post("http://httpbin.org/post", data=payload)
print(r.text)
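
httpbin.org echoes the request back as JSON, which makes it easy to verify what was sent. A small sketch: the submitted form fields come back under the 'form' key:

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.post("http://httpbin.org/post", data=payload)

# httpbin returns the posted form data in the 'form' field
print(r.json()['form'])  # {'key1': 'value1', 'key2': 'value2'}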

SSL verification, verify certificates using Python

import requests

print(requests.get('https://github.com', verify=True))
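
verify=True is the default and checks the server certificate against the system's CA bundle. You can also point verify at a CA bundle of your own, or disable checking entirely (insecure, for testing only). A sketch; the bundle path is hypothetical:

import requests

# use a custom CA bundle (hypothetical path)
requests.get('https://github.com', verify='/path/to/ca-bundle.crt')

# skip certificate verification (insecure, for testing only)
requests.get('https://github.com', verify=False)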

Extract data from the HTTP response header
With every request you send to an HTTP server, the server sends back some additional data. You can extract data from an HTTP response using:

#!/usr/bin/env python
import requests

r = requests.get('http://pythonspot.com/')
print(r.headers)

This prints all response headers. r.headers is a case-insensitive, dictionary-like object, so individual header fields can be read directly:

#!/usr/bin/env python
import requests

r = requests.get('http://pythonspot.com/')

# r.headers behaves like a dictionary; which fields are present depends on the server
headers = r.headers
print(headers['server'])
print(headers['content-length'])
print(headers['content-encoding'])
print(headers['content-type'])
print(headers['date'])
print(headers['x-powered-by'])
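
To see everything the server sent back, you can loop over all header fields (a small sketch):

import requests

r = requests.get('http://pythonspot.com/')
for name, value in r.headers.items():
    print(name, ':', value)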

Extract data from HTML response
Once you get the data from a server, you can parse it using Python string functions or use a library. BeautifulSoup is often used. Example code that gets the page title and links:

from bs4 import BeautifulSoup
import requests

# get html data
r = requests.get('http://stackoverflow.com/')
html_doc = r.content

# create a beautifulsoup object using the built-in html parser
soup = BeautifulSoup(html_doc, 'html.parser')

# get title
print(soup.title)

# print all links
for link in soup.find_all('a'):
    print(link.get('href'))
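
The href values can be relative (for example '/questions'). A sketch using urllib.parse.urljoin to turn them into absolute URLs:

from urllib.parse import urljoin
from bs4 import BeautifulSoup
import requests

base = 'http://stackoverflow.com/'
soup = BeautifulSoup(requests.get(base).content, 'html.parser')

for link in soup.find_all('a'):
    href = link.get('href')
    if href:
        # resolve relative links against the page URL
        print(urljoin(base, href))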