Visualize data with Python
Get the xls data for this tutorial from depaul.edu. This dataset contains a list of US presidents, their associated parties, professions and more.
Python Pandas Dataset
Beautiful Plots with Pandas
We can plot the data in this large Excel file with a few lines of code. We select the column “Occupation” for this demonstration using df['Occupation'].
Full code:
import matplotlib.pyplot as plt
import pandas as pd

# Load the spreadsheet into a DataFrame
file = r'data/Presidents.xls'
df = pd.read_excel(file)

colors = ['yellowgreen', 'gold', 'lightskyblue', 'lightcoral', 'red', 'green', 'blue', 'orange', 'white', 'brown']

# Count the presidents per occupation and draw the counts as a pie chart
df['Occupation'].value_counts().plot(kind='pie', title='Occupation by President', colors=colors)
plt.show()
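To see what the pie chart is built from, it can help to look at value_counts() on its own. The following is a minimal sketch that uses a small made-up Series instead of the real Presidents.xls data; the occupation values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical occupations, standing in for the 'Occupation' column
occupations = pd.Series(['Lawyer', 'Soldier', 'Lawyer', 'Planter', 'Lawyer', 'Soldier'])

# value_counts() tallies each distinct value, most frequent first
counts = occupations.value_counts()
print(counts)
```

Each entry of the resulting Series becomes one slice of the pie, sized by its count.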
Data cleaning and plotting
We can also plot the popularity by president. Before we do that, we need to clean some data:
Data cleaning with Pandas
Some of the cells do not contain numerical values. We can either remove those rows or replace the values. To remove them:
df = df[df['% popular'] != 'NA()']
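Replacing the non-numerical cells instead of removing them could look roughly like the sketch below. It assumes a tiny made-up DataFrame in place of the real spreadsheet, and it fills the gaps with the column mean, which is only one possible choice and not necessarily the right one for this dataset.

```python
import pandas as pd

# Hypothetical values standing in for the '% popular' column
df = pd.DataFrame({'% popular': [50.7, 'NA()', 43.8, 'NA()', 61.0]})

# Turn anything non-numeric (such as 'NA()') into NaN
df['% popular'] = pd.to_numeric(df['% popular'], errors='coerce')

# Replace the missing entries with the mean of the remaining values
mean_popular = df['% popular'].mean()
df['% popular'] = df['% popular'].fillna(mean_popular)
print(df)
```

Unlike the filter above, this keeps every row, so the DataFrame still has one entry per president.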
To plot the popularity:
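import matplotlib.pyplot as plt
import pandas as pd

# Load the spreadsheet into a DataFrame
file = r'data/Presidents.xls'
df = pd.read_excel(file)

# Drop the rows whose popularity is the placeholder string 'NA()'
df = df[df['% popular'] != 'NA()']

# Make sure the remaining values are numeric before plotting
popular = pd.to_numeric(df['% popular'])
print(popular)

# density=True normalises the histogram so that the bar areas sum to 1
popular.plot(kind='hist', bins=8, title='Popularity by President', facecolor='blue', alpha=0.5, density=True)
plt.show()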
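The normalisation option in the histogram call (normed in older matplotlib releases, density in current ones) rescales the bars so that their total area is 1. A small sketch of that rule, using NumPy's histogram function and made-up percentages in place of the real column:

```python
import numpy as np

# Hypothetical popular-vote percentages
values = np.array([43.8, 47.3, 50.7, 52.9, 55.0, 58.2, 61.0, 49.5])

# density=True divides each bin count by (total count * bin width)
heights, edges = np.histogram(values, bins=8, density=True)
widths = np.diff(edges)

# The bar areas (height * width) add up to 1
total_area = float(np.sum(heights * widths))
print(total_area)
```

The bar heights are therefore densities, not counts, which is why the y-axis of the normalised plot shows small fractional values.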
Popularity by President