Read xls with Pandas


Pandas, a data analysis library, has built-in support for loading Excel data (xls and xlsx).
The function read_excel loads the contents of an Excel file into a Pandas DataFrame:

read_excel(filename)

If you have a large Excel file with several sheets, you may want to specify which sheet to load:

df = pd.read_excel(file, sheet_name='Elected presidents')
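
To see how sheet selection behaves, here is a minimal sketch that builds a small two-sheet workbook in memory and reads one sheet back. The sample rows, the second sheet name 'Notes' and the helper names make_workbook, list_sheets and load_sheet are made up for illustration, and the sketch assumes the openpyxl package is installed for xlsx support.

```python
import io

import pandas as pd


def make_workbook():
    """Return the bytes of a small two-sheet xlsx workbook (made-up data)."""
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        pd.DataFrame({"President": ["President A", "President B"]}).to_excel(
            writer, sheet_name="Elected presidents", index=False)
        pd.DataFrame({"Note": ["placeholder"]}).to_excel(
            writer, sheet_name="Notes", index=False)
    return buffer.getvalue()


def list_sheets(source):
    """Return the names of all sheets in the workbook."""
    with pd.ExcelFile(source) as workbook:
        return workbook.sheet_names


def load_sheet(source, sheet):
    """Load a single sheet, selected by name, into a DataFrame."""
    return pd.read_excel(source, sheet_name=sheet)


data = make_workbook()
print(list_sheets(io.BytesIO(data)))
print(load_sheet(io.BytesIO(data), "Elected presidents"))
```

With a real file you would pass the file path instead of the in-memory buffer. Passing sheet_name=None to read_excel returns a dictionary that maps every sheet name to its own DataFrame.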

Read Excel with Pandas
The code below reads Excel data into a Pandas DataFrame (the dataset can be downloaded below).

import pandas as pd

# path to the Excel file (see the Dataset section)
file = r'data/Presidents.xls'

# load the first sheet into a DataFrame
df = pd.read_excel(file)

# print a single column
print(df['Occupation'])
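
Before selecting a column it can help to check what was actually loaded. The sketch below uses a few made-up rows in place of the spreadsheet, so the names and occupations are placeholders, not values from the real dataset.

```python
import pandas as pd

# made-up rows standing in for the result of pd.read_excel(file)
df = pd.DataFrame({
    "President": ["President A", "President B", "President C"],
    "Occupation": ["Lawyer", "Soldier", "Lawyer"],
})

# number of rows and columns
print(df.shape)

# column names, useful for spotting typos or stray spaces in headers
print(list(df.columns))

# first two rows
print(df.head(2))

# how often each occupation occurs
occupation_counts = df["Occupation"].value_counts()
print(occupation_counts)
```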

Once loaded, the DataFrame can be filtered and summarized, as shown in the example below:

import pandas as pd

file = r'data/Presidents.xls'
df = pd.read_excel(file)

# remove messy data: drop rows where the value is the text 'n/a'
df = df[df['Years in office'] != 'n/a']

# show summary statistics for the column
print('Min: ', df['Years in office'].min())
print('Max: ', df['Years in office'].max())
print('Sum: ', df['Years in office'].sum())
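
If the column contains other non-numeric entries, or numbers stored as text, comparing against 'n/a' may not be enough. One possible alternative, sketched here with made-up rows and a hypothetical helper clean_numeric_column, is to convert the column with pd.to_numeric and errors='coerce', which turns anything that is not a number into NaN, and then drop those rows.

```python
import pandas as pd


def clean_numeric_column(df, column):
    """Return a copy of df with `column` converted to numbers.

    Rows whose value cannot be parsed as a number are dropped.
    """
    cleaned = df.copy()
    cleaned[column] = pd.to_numeric(cleaned[column], errors="coerce")
    return cleaned.dropna(subset=[column])


# made-up rows standing in for the spreadsheet
sample = pd.DataFrame({
    "President": ["President A", "President B", "President C", "President D"],
    "Years in office": [8, "n/a", 4, "4"],
})

clean = clean_numeric_column(sample, "Years in office")
print('Min: ', clean['Years in office'].min())
print('Max: ', clean['Years in office'].max())
print('Sum: ', clean['Years in office'].sum())
```

Note that read_excel already treats some strings, including 'n/a', as missing values by default, so depending on the file the column may arrive with NaN entries instead of the literal text.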

Dataset
For demonstration purposes, you can use the dataset from depaul.edu.

xls dataset
A large dataset stored in XLS format