In this article you will learn how to read CSV files with Python 3. A CSV file contains comma-separated values. The comma is known as the delimiter; another character, such as a semicolon, may be used instead.
A CSV file is a table of values, separated by commas. To read a CSV file from Python, you need to import the csv module or the pandas module.
Python Programming Bootcamp: Go from zero to hero
CSV stands for “comma-separated values”. It is a common file format for data exchange, storage, and editing. In fact, the .csv files you may open in a spreadsheet application (like Excel) are just plain text files, with one very simple rule:
All of the fields in your records must be separated by commas.
For example, the following might be a small part of a sample spreadsheet in csv format:
Another example csv file:
The process will be:
Read CSV file
One of the first things you will need to do when creating a data-driven Python application is to read your data from a CSV file into a dataset. If you are familiar with Excel, reading a CSV file will feel natural; if you are new to CSV, the steps below show how easy it is.
The most basic method to read a csv file is:
# load csv module
We import the csv module. This is a simple module to read and write CSV files in Python.
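A minimal sketch of this basic approach, assuming a small two-column file. The file name sample.csv and its contents are made up for illustration, and the file is written first only so that the example runs on its own:

```python
import csv
import os
import tempfile

# Create a small sample file so the sketch is self-contained
# (the file name and its contents are hypothetical).
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write("2024-01-01,10\n2024-01-02,12\n")

# Open the file and let csv.reader split every line on the delimiter.
rows = []
with open(path, newline="") as f:
    reader = csv.reader(f, delimiter=",")
    for row in reader:
        rows.append(row)  # each row is a list of strings
        print(row)
```

Opening the file with newline="" is what the csv module documentation recommends, so that line endings inside quoted fields are handled correctly.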
You can read every row in the file. Every row is returned as a list and can be accessed as such. To print the first cell of a row, we could simply write:
For the second cell, you would use:
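As a rough illustration of both lookups, the sketch below reads one row from in-memory sample data (made up for this example) and picks out its first and second cells by index:

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for an opened file.
data = io.StringIO("2024-01-01,10\n2024-01-02,12\n")

reader = csv.reader(data)
first_row = next(reader)     # fetch the first row as a list

first_cell = first_row[0]    # index 0 is the first cell
second_cell = first_row[1]   # index 1 is the second cell
print(first_cell)
print(second_cell)
```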
It is better to store the data in named lists, because names are easier to understand than numeric indices such as row[0], row[1], etc.
You can do that by adding the cells to a list during loading. The example below demonstrates this:
# load module
We create two lists: dates and scores. We use the append method to add the cells to the lists.
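One way this could look, assuming each row holds a date followed by a score. The sample rows are invented, and converting the score to a float is an extra step added here for illustration:

```python
import csv
import io

# Hypothetical sample data in (date, score) format.
data = io.StringIO("2024-01-01,10\n2024-01-02,12\n")

dates = []
scores = []
for row in csv.reader(data):
    dates.append(row[0])           # first cell: the date, kept as text
    scores.append(float(row[1]))   # second cell: the score, converted to a number

print(dates)
print(scores)
```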
If you want to use a different delimiter simply change the reader call:
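For example, a semicolon-separated file could be read as sketched below. The delimiter keyword is part of csv.reader; the sample data is hypothetical:

```python
import csv
import io

# Hypothetical sample data that uses semicolons instead of commas.
data = io.StringIO("2024-01-01;10\n2024-01-02;12\n")

# Only the delimiter argument changes compared to the comma case.
rows = list(csv.reader(data, delimiter=";"))
print(rows)
```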
If you have many csv files in an identical format, you can create a function for loading the data. That way you don’t have to write duplicate code.
For instance, if your csv files have the format (dates,scores) then you can write this code:
Given a CSV filename, the function will read and parse the CSV data. The data is added to the lists dates and scores, which are then returned.
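A sketch of such a function, under the assumption that every file has exactly two columns (date, score). The name load_scores and the sample file are hypothetical:

```python
import csv
import os
import tempfile


def load_scores(filename):
    """Read a (date, score) CSV file and return two parallel lists."""
    dates = []
    scores = []
    with open(filename, newline="") as f:
        for row in csv.reader(f):
            dates.append(row[0])
            scores.append(row[1])
    return dates, scores


# Write a small sample file so the function can be tried out directly.
path = os.path.join(tempfile.mkdtemp(), "scores.csv")
with open(path, "w", newline="") as f:
    f.write("2024-01-01,10\n2024-01-02,12\n")

dates, scores = load_scores(path)
print(dates, scores)
```

Every further file in the same format can then be loaded with a single call to the function.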
CSV files can also be read with the pandas library. The read_csv() function in pandas is used to read CSV files. You pass it a file path or a file-like object containing your data.
Pandas is not part of the Python standard library, so you will need to install it with the pip package manager. The pandas read_csv function reads all columns of the file in one call.
import pandas as pd
Pandas uses its own data structure called a DataFrame (df). It is different from the Python lists that you used with the csv module. Once a dataset has been read, many data manipulation functions become available.
To access a row you can use the index like this
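A small sketch of reading and indexing with pandas, assuming pandas is installed and the data has a header line. The column names date and score are made up for this example:

```python
import io

import pandas as pd

# Hypothetical sample data with a header row; a file path would work as well.
csv_text = io.StringIO("date,score\n2024-01-01,10\n2024-01-02,12\n")

df = pd.read_csv(csv_text)          # returns a DataFrame

first_row = df.iloc[0]              # access a row by its position
second_score = df.loc[1, "score"]   # access a single cell by index label and column

print(df)
print(first_row["date"], second_score)
```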
In this tutorial you will learn how to use the SQLite database management system with Python. You will learn how to use SQLite, SQL queries, RDBMS and more of this cool stuff!
Related course: Master SQL Databases with Python
Data is retrieved from a database system using the SQL language.
Python has bindings for many database systems including MySQL, PostgreSQL, Oracle, Microsoft SQL Server and MariaDB.
One of these database management systems (DBMS) is called SQLite. SQLite was created in the year 2000 and is one of the many management systems in the database zoo.
SQL is a special-purpose programming language designed for managing data held in databases. The language has been standardized since 1986 and is worth learning.
SQLite is a self-contained, serverless, zero-configuration, transactional SQL database engine. The SQLite project is sponsored by Bloomberg and Mozilla.
Install SQLite with this command:
$ sudo apt-get install sqlite3
Verify if it is correctly installed. Copy this program and save it as test1.py
$ python test1.py
It should output:
SQLite version: 3.8.2
What did the script above do?
The script connected to a new database called test.db with this line:
con = lite.connect('test.db')
It then queries the database management system with the command
SELECT SQLITE_VERSION()
which in turn returns its version number. That line is known as an SQL query.
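A minimal sketch of such a version check, using the sqlite3 module from the standard library. It connects to an in-memory database (":memory:") instead of test.db, an assumption made here so that no file is left behind:

```python
import sqlite3 as lite

# ":memory:" is used instead of 'test.db' so the sketch leaves no file on disk.
con = lite.connect(":memory:")
try:
    cur = con.cursor()
    cur.execute("SELECT SQLITE_VERSION()")   # ask the engine for its version
    version = cur.fetchone()[0]              # the result is a single-column row
    print("SQLite version:", version)
finally:
    con.close()
```

The version printed depends on the SQLite library bundled with your Python installation.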
SQL Create and Insert
The script below will store data into a new database called user.db
SQLite is a database management system that uses tables. These tables can have relations with other tables; such a system is called a relational database management system, or RDBMS. A table defines the structure of the data and holds the data. A database can hold many different tables. The table gets created using the command:
cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")
We add records into the table with these commands:
cur.execute("INSERT INTO Users VALUES(2,'Sonya')")
The first value is the ID. The second value is the name. Once we run the script the data gets inserted into the database table Users:
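Put together, creating the table and inserting records might look like the sketch below. It uses an in-memory database rather than user.db, and the first user record is invented for illustration; only the record for Sonya comes from the text above:

```python
import sqlite3 as lite

con = lite.connect(":memory:")   # stand-in for lite.connect('user.db')

with con:  # the connection commits automatically if the block succeeds
    cur = con.cursor()
    cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")
    cur.execute("INSERT INTO Users VALUES(1,'Alex')")    # hypothetical record
    cur.execute("INSERT INTO Users VALUES(2,'Sonya')")

# Count the stored records to confirm that the inserts worked.
count = con.execute("SELECT COUNT(*) FROM Users").fetchone()[0]
print(count)
con.close()
```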
SQLite query data
We can explore the database using two methods: the command line and a graphical interface.
From console: To explore using the command line type these commands:
This will output the data in the table Users.
sqlite> SELECT * FROM Users;
From GUI: If you want to use a GUI instead, there are many to choose from. Personally I picked sqliteman, but there are many others. We install it using:
sudo apt-get install sqliteman
We start the application sqliteman. A GUI pops up.
Press File > Open > user.db. It may appear that not much has changed; do not worry, this is just the user interface. On the left is a small tree view; press Tables > users. The full table, including all records, will now be shown.
This GUI can be used to modify the records (data) in the table and to add new tables.
The SQL database query language
SQL has many commands to interact with the database. You can try the commands below from the command line or from the GUI:
We can use those queries in a Python program:
This will output all data in the Users table from the database:
$ python get.py
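A sketch of what such a query script could look like. To stay self-contained it first fills an in-memory database with two sample records (one of them invented) instead of opening user.db:

```python
import sqlite3 as lite

con = lite.connect(":memory:")   # stand-in for lite.connect('user.db')
cur = con.cursor()

# Set up sample data; the user named Alex is hypothetical.
cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")
cur.executemany("INSERT INTO Users VALUES(?, ?)", [(1, "Alex"), (2, "Sonya")])
con.commit()

# Fetch all records; every row comes back as a tuple.
cur.execute("SELECT * FROM Users ORDER BY Id")
rows = cur.fetchall()
for row in rows:
    print(row)

# A query with a placeholder, so the value is passed separately from the SQL.
cur.execute("SELECT Name FROM Users WHERE Id = ?", (2,))
name = cur.fetchone()[0]
con.close()
```

Using ? placeholders instead of building the SQL string by hand avoids quoting problems and SQL injection.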
Creating a user information database
We can structure our data across multiple tables. This keeps our data structured, fast and organized. If we had a single table to store everything, we would quickly end up with a big chaotic mess. Instead, we will create multiple tables and use them in combination. We create two tables:
You can create these tables by hand in the GUI or with the script below:
# -*- coding: utf-8 -*-
The jobs table has an extra column, Uid. We use it to connect the two tables in an SQL query:
SELECT users.name, jobs.profession FROM jobs INNER JOIN users ON users.ID = jobs.uid
You can incorporate that SQL query in a Python script:
It should output:
$ python get2.py
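The sketch below shows the join end to end. The exact table layouts and all sample rows are assumptions made for illustration (the Jobs columns Id, Uid and Profession are inferred from the query above), and an ORDER BY clause is added so the result order is predictable:

```python
import sqlite3 as lite

con = lite.connect(":memory:")   # in-memory stand-in for the real database file
cur = con.cursor()

# Hypothetical schemas and sample rows.
cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")
cur.execute("CREATE TABLE Jobs(Id INT, Uid INT, Profession TEXT)")
cur.executemany("INSERT INTO Users VALUES(?, ?)", [(1, "Sonya"), (2, "Alex")])
cur.executemany(
    "INSERT INTO Jobs VALUES(?, ?, ?)",
    [(1, 1, "Scientist"), (2, 2, "Engineer")],
)
con.commit()

# Jobs.Uid refers to Users.Id; the join pairs every job with its user.
cur.execute(
    "SELECT users.name, jobs.profession "
    "FROM jobs INNER JOIN users ON users.ID = jobs.uid "
    "ORDER BY users.ID"
)
pairs = cur.fetchall()
for name, profession in pairs:
    print(name, profession)
con.close()
```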
In this article you will learn how to use the PostgreSQL database with Python. PostgreSQL is a relational database management system (RDBMS). PostgreSQL supports foreign keys, joins, views, triggers, stored procedures and much more.
For this tutorial you will need the PostgreSQL DBMS and the psycopg2 module.
On an Ubuntu system you can install the PostgreSQL database system with this command:
Test if the PostgreSQL database system is up and running with this command:
If you do not see the above screen, try one of these commands:
Psycopg is a PostgreSQL database adapter for Python.
This command installs the module:
We create a database and database user (also called a role)
Reload the database:
Create table and Insert data
Run this program:
It will create a database table (this data structure holds the data). Data is inserted into the table using:
The line below is mandatory; it commits the transaction so that the changes made by the SQL queries are saved:
Data can be read using the SELECT SQL query. A list is returned for every row:
PostgreSQL table data can be updated with this code:
Delete data from a PostgreSQL table with this code:
An object relational mapper (ORM) maps a relational database system to objects. If you are unfamiliar with object-oriented programming, read this tutorial first. The ORM is independent of which relational database system is used. From within Python, you talk to objects and the ORM maps them to the database. In this article you will learn to use the SQLAlchemy ORM.
What an ORM does is shown in an illustration below:
Creating a class to feed the ORM
We create the file tabledef.py. In this file we will define a class Student. An abstract visualization of the class below:
Observe we do not define any methods, only variables of the class. This is because we will map this class to the database and thus won’t need any methods.
This is the contents of tabledef.py:
from sqlalchemy import *
When you run tabledef.py, the ORM creates the database file student.db. It will output the SQL query to the screen; in our case it showed:
CREATE TABLE student (
Thus, while we defined a class, the ORM created the database table for us. This table is still empty.
Inserting data into the database
The database table is still empty. We can insert data into the database using Python objects. Because we use the SQLAlchemy ORM, we do not have to write a single SQL query. We simply create Python objects that we feed to the ORM. Save the code below as dummy.py
The ORM will map the Python objects to a relational database. This means your application does not interact with the database directly; you simply interact with objects. If you open the database with SQLiteman or another SQLite database application, you’ll find the table has been created:
Query the data
We can query all items of the table using the code below. Note that Python will see every record as a unique object as defined by the Student class. Save the code as demo.py
On execution you will see:
To select a single object use the filter() method. A demonstration below:
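A compact sketch of the whole round trip (define the class, insert objects, query all, filter one), assuming SQLAlchemy 1.4 or later is installed. The column username, the sample names, and the in-memory database are assumptions made for illustration and are not taken from tabledef.py:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 and later
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


class Student(Base):
    """Mapped class; only attributes, no methods."""
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    username = Column(String)  # hypothetical column


# In-memory database instead of student.db, so nothing is written to disk.
engine = create_engine("sqlite:///:memory:", echo=False)
Base.metadata.create_all(engine)   # the ORM issues CREATE TABLE for us

Session = sessionmaker(bind=engine)
session = Session()

# Insert data by handing Python objects to the ORM.
session.add_all([Student(username="james"), Student(username="lara")])
session.commit()

# Query every record, then select a single object with filter().
all_names = [s.username for s in session.query(Student).order_by(Student.id)]
match = session.query(Student).filter(Student.username == "lara").first()
match_name = match.username
print(all_names, match_name)
session.close()
```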
Finally, if you do not want the ORM to output any of the SQL queries, change the create_engine statement to:
engine = create_engine('sqlite:///student.db', echo=False)