python logo


Category: database

csv to array python

Reading CSV files using Python 3 is what you will learn in this article. The file data contains comma separated values (csv). The comma is known as the delimiter, it may be another character such as a semicolon.

A CSV file is a table of values, separated by commas. To read a CSV file from Python, you need to import the csv module or the pandas module.

Related course
Python Programming Bootcamp: Go from zero to hero

Read CSV

csv stands for “comma-separated values”. they are a common file format for data exchange, storage, and editing. in fact, the .csv files you may open in a spreadsheet application (like excel) are just plain text files, with one very simple rule:

all of the fields in your records must be separated by commas.

For example, the following might be a small part of a sample spreadsheet in csv format:

"first_name","last_name","email","address","city","state","zip","phone"
"charlie","davidson","[email protected]","123 main street, akron, ohio","akron, ohio","23678"
"tanya","jones","[email protected]", "734 main street", "ny", "new york", "nyc", "12354"

Another example csv file:


01/01/2016, 4
02/01/2016, 2
03/01/2016, 10
04/01/2016, 8

The process will be:

read csv file Read csv file (first step is optional)

Read CSV file

One of the first things you will need to do when creating a data-driven Python application is to read your data from a CSV file into a dataset. If you’re familiar with Excel, reading data from a CSV file is easy but if you’re new to CSV, let me show you how easy it is.

The most basic method to read a csv file is:

# load csv module
import csv

# open file for reading
with open('file.csv') as csvDataFile:

# read file as csv file
csvReader = csv.reader(csvDataFile)

# for every row, print the row
for row in csvReader:
print(row)

We import the csv module. This is a simple module to read/write csv files in python.


import csv

You can read every row in the file. Every row is returned as an array and can be accessed as such, to print the first cells we could simply write:


print(row[0])

For the second cell, you would use:


print(row[1])

It is better to have the data in arrays, because it’s easier to understand than those indices like [0],[1],[2] etc.

You can do that by adding the cells to a list during loading. The example below demonstrates this:

# load module
import csv

# first cell data
dates = []

# second cell data
scores = []

# open file for reading
with open('file.csv') as csvDataFile:

# open file as csv file
csvReader = csv.reader(csvDataFile)

# loop over rows
for row in csvReader:

# add cell [0] to list of dates
dates.append(row[0])

# add cell [1] to list of scores
scores.append(row[1])

# output data
print(dates)
print(scores)

We creates two arrays: dates and scores. We use the append method to add the cells to the arrays.

If you want to use a different delimiter simply change the reader call:


csvReader = csv.reader(delimiter=';')

Load CSV function

If you have many csv files in an identical format, you can create a function for loading the data. That way you don’t have to write duplicate code.

For instance, if your csv files have the format (dates,scores) then you can write this code:


import csv

def readMyFile(filename):
dates = []
scores = []

with open(filename) as csvDataFile:
csvReader = csv.reader(csvDataFile)
for row in csvReader:
dates.append(row[0])
scores.append(row[1])

return dates, scores


dates,scores = readMyFile('file.csv')

print(dates)
print(scores)

Given a csv filename, the function will read and parse the csv data. Its added to the arrays dates and scores and returned.

Read csv with pandas

CSV Files can be read by the Pandas library in Python. The read_csv() function in Pandas is used to read CSV files. You must pass it a file-like object containing your data

Pandas is not part of the Python standard library, so you will need to install it with the pip package manager. Panda’s read_csv function can read multiple columns

import pandas as pd

df = pd.read_csv('data.csv')

print(df.to_string())

Pandas uses its own data structure called a DataFrame (df), it is different than a Python list that you used with the csv module. Once a dataset has been read then many data manipulation functions become available.

To access a row you can use the index like this

print(df.loc[0])

Related course: Python Programming Bootcamp: Go from zero to hero

dbms in python

In this tutorial you will learn how to use the SQLite database management system with Python. You will learn how to use SQLite, SQL queries, RDBMS and more of this cool stuff!

Related course: Master SQL Databases with Python

Pyton Database

Python Database Python Database.
Data is retrieved from a database system using the SQL language. Data is everywhere and software applications use that. Data is either in memory, files or databases.

Python has bindings for many database systems including MySQL, Postregsql, Oracle, Microsoft SQL Server and Maria DB.

One of these database management systems (DBMS) is called SQLite. SQLite was created in the year 2000 and is one of the many management systems in the database zoo.

SQL is a special-purpose programming language designed for managing data held in a databases. The language has been around since 1986 and is worth learning. The is an old funny video about SQL

SQLite


SQLite SQLite, a relational database management system. SQLite is the most widely deployed SQL database engine in the world. The source code for SQLite is in the public domain.

It is a self-contained, serverless, zero-configuration, transactional SQL database engine. The SQLite project is sponsored by Bloomberg and Mozilla.

Install SQLite:

Use this command to install SQLite:

$ sudo apt-get install sqlite


Verify if it is correctly installed. Copy this program and save it as test1.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sqlite3 as lite
import sys

con = None

try:
    con = lite.connect('test.db')
    cur = con.cursor()  
    cur.execute('SELECT SQLITE_VERSION()')
    data = cur.fetchone()
    print "SQLite version: %s" % data                
except lite.Error, e:   
    print "Error %s:" % e.args[0]
    sys.exit(1)
finally:    
    if con:
        con.close()


Execute with:
$ python test1.py

It should output:
SQLite version: 3.8.2


What did the script above do?
The script connected to a new database called test.db with this line:

con = lite.connect('test.db')

It then queries the database management system with the command

SELECT SQLITE_VERSION()


which in turn returned its version number. That line is known as an SQL query.

Related course: Master SQL Databases with Python

SQL Create and Insert


The script below will store data into a new database called user.db

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sqlite3 as lite
import sys

con = lite.connect('user.db')

with con:

cur = con.cursor()
cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")
cur.execute("INSERT INTO Users VALUES(1,'Michelle')")
cur.execute("INSERT INTO Users VALUES(2,'Sonya')")
cur.execute("INSERT INTO Users VALUES(3,'Greg')")


SQLite is a database management system that uses tables. These tables can have relations with other tables: it’s called relational database management system or RDBMS. The table defines the structure of the data and can hold the data. A database can hold many different tables. The table gets created using the command:

cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")

We add records into the table with these commands:
cur.execute("INSERT INTO Users VALUES(2,'Sonya')")
cur.execute("INSERT INTO Users VALUES(3,'Greg')")


The first value is the ID. The second value is the name. Once we run the script the data gets inserted into the database table Users:

SQL Table SQL Table

SQLite query data


We can explore the database using two methods: the command line and a graphical interface.

From console: To explore using the command line type these commands:

sqlite3 user.db
.tables
SELECT * FROM Users;

This will output the data in the table Users.

sqlite> SELECT * FROM Users;
1|Michelle
2|Sonya
3|Greg

From GUI: If you want to use a GUI instead, there is a lot of choice. Personally I picked sqllite-man but there are many others. We install using:

sudo apt-get install sqliteman

We start the application sqliteman. A gui pops up.

sqliteman sqliteman

Press File > Open > user.db. It appears like not much has changed, do not worry, this is just the user interface. On the left is a small tree view, press Tables > users. The full table including all records will be showing now.

sqliteman sqliteman

This GUI can be used to modify the records (data) in the table and to add new tables.

Related course: Master SQL Databases with Python

The SQL database query language


SQL has many commands to interact with the database. You can try the commands below from the command line or from the GUI:

sqlite3 user.db 
SELECT * FROM Users;
SELECT count(*) FROM Users;
SELECT name FROM Users;
SELECT * FROM Users WHERE id = 2;
DELETE FROM Users WHERE id = 6;

We can use those queries in a Python program:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sqlite3 as lite
import sys


con = lite.connect('user.db')

with con:

cur = con.cursor()
cur.execute("SELECT * FROM Users")

rows = cur.fetchall()

for row in rows:
print row

This will output all data in the Users table from the database:

$ python get.py 
(1, u'Michelle')
(2, u'Sonya')
(3, u'Greg')

Creating a user information database

We can structure our data across multiple tables. This keeps our data structured, fast and organized. If we would have a single table to store everything, we would quickly have a big chaotic mess. What we will do is create multiple tables and use them in a combination. We create two tables:

Users:

SQL Table SQL Table

Jobs:

SQL Table SQL Table

To create these tables, you can do that by hand in the GUI or use the script below:

# -*- coding: utf-8 -*-

import sqlite3 as lite
import sys

con = lite.connect('system.db')

with con:

cur = con.cursor()
cur.execute("CREATE TABLE Users(Id INT, Name TEXT)")
cur.execute("INSERT INTO Users VALUES(1,'Michelle')")
cur.execute("INSERT INTO Users VALUES(2,'Howard')")
cur.execute("INSERT INTO Users VALUES(3,'Greg')")

cur.execute("CREATE TABLE Jobs(Id INT, Uid INT, Profession TEXT)")
cur.execute("INSERT INTO Jobs VALUES(1,1,'Scientist')")
cur.execute("INSERT INTO Jobs VALUES(2,2,'Marketeer')")
cur.execute("INSERT INTO Jobs VALUES(3,3,'Developer')")

The jobs table has an extra parameter, Uid. We use that to connect the two tables in an SQL query:

SELECT users.name, jobs.profession FROM jobs INNER JOIN users ON users.ID = jobs.uid

You can incorporate that SQL query in a Python script:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sqlite3 as lite
import sys


con = lite.connect('system.db')

with con:

cur = con.cursor()
cur.execute("SELECT users.name, jobs.profession FROM jobs INNER JOIN users ON users.ID = jobs.uid")

rows = cur.fetchall()

for row in rows:
print row

It should output:

$ python get2.py
(u'Michelle', u'Scientist')
(u'Howard', u'Marketeer')
(u'Greg', u'Developer')

You may like: Databases and data analysis

python mysql

python postgresql

In this article you will learn how to use the PostgreSQL database with Python. PostgreSQL is an relational database management system (rdbms). PostgreSQL supports foreign keys, joins, views, triggers, stored procedures and much more.

Related course
Master SQL Databases with Python

Installation

Install PostgreSQL
For this tutorial you will need the PostgreSQL dbms and the psycopg2 module.
On an Ubuntu system you can install the PostgreSQL database system with this command:


sudo apt-get install postgresql postgresql-contrib

Test if the PostgreSQL database system is up and running with this command:


sudo /etc/init.d/postgresql status

python-database-postgresql PostreSQL database status

If you do not see the above screen, try one of these commands:


sudo service postgresql start
sudo service postgresql restart

Install psycopg2
Psycopg is a PostgreSQL database adapter for Python.
This command installs the module:


sudo apt-get install python-psycopg2

We create a database and database user (also called a role)


sudo -u postgres createuser -D -A -P pythonspot
sudo -u postgres createdb -O pythonspot testdb

Reload the database:


sudo /etc/init.d/postgresql reload

Python PostgreSQL

Create table and Insert data
Run this program :


#!/usr/bin/python
# -*- coding: utf-8 -*-

import psycopg2
import sys


con = None

try:
con = psycopg2.connect("host='localhost' dbname='testdb' user='pythonspot' password='password'")
cur = con.cursor()
cur.execute("CREATE TABLE Products(Id INTEGER PRIMARY KEY, Name VARCHAR(20), Price INT)")
cur.execute("INSERT INTO Products VALUES(1,'Milk',5)")
cur.execute("INSERT INTO Products VALUES(2,'Sugar',7)")
cur.execute("INSERT INTO Products VALUES(3,'Coffee',3)")
cur.execute("INSERT INTO Products VALUES(4,'Bread',5)")
cur.execute("INSERT INTO Products VALUES(5,'Oranges',3)")
con.commit()
except psycopg2.DatabaseError, e:
if con:
con.rollback()

print 'Error %s' % e
sys.exit(1)

finally:
if con:
con.close()

It will create a database table (this datastructure holds the data). Data is inserted into the table using:


cur.execute("INSERT INTO Products VALUES(5,'Oranges',3)")

The line below is mandatory, it executes all sql queries:


con.commit()

pgadmin3 The program pgadmin3 can be used to graphically explore the database. In this example we see the data we have inserted with the program.

Read data
Data can be read using the SELECT SQL query. A list is returned for every row:


#!/usr/bin/python
# -*- coding: utf-8 -*-

import psycopg2
import sys


con = None

try:
con = psycopg2.connect("host='localhost' dbname='testdb' user='pythonspot' password='password'")
cur = con.cursor()
cur.execute("SELECT * FROM Products")

while True:
row = cur.fetchone()

if row == None:
break

print("Product: " + row[1] + "\t\tPrice: " + str(row[2]))

except psycopg2.DatabaseError, e:
if con:
con.rollback()

print 'Error %s' % e
sys.exit(1)

finally:
if con:
con.close()

Update data
PostgreSQL table data can be updated with this code:


#!/usr/bin/python
# -*- coding: utf-8 -*-

import psycopg2
import sys


con = None

try:
con = psycopg2.connect("host='localhost' dbname='testdb' user='pythonspot' password='password'")
cur = con.cursor()
cur.execute("UPDATE Products SET Price=%s WHERE Id=%s", (10, 4))
con.commit()
except psycopg2.DatabaseError, e:
if con:
con.rollback()

print 'Error %s' % e
sys.exit(1)

finally:
if con:
con.close()

Delete data
Delete data from a PostgreSQL table with this code:


#!/usr/bin/python
# -*- coding: utf-8 -*-

import psycopg2
import sys


con = None

try:
con = psycopg2.connect("host='localhost' dbname='testdb' user='pythonspot' password='password'")
cur = con.cursor()
cur.execute("DELETE FROM Products WHERE Id=" + str(4))
con.commit()
except psycopg2.DatabaseError, e:
if con:
con.rollback()

print 'Error %s' % e
sys.exit(1)

finally:
if con:
con.close()

sqlalchemy orm

An object relational mapper maps a relational database system to objects. If you are unfamiliar with object orientated programming, read this tutorial first. The ORM is independent of which relational database system is used. From within Python, you can talk to objects and the ORM will map it to the database. In this article you will learn to use the SqlAlchemy ORM.

What an ORM does is shown in an illustration below:

ORM Object Relational Mapping ORM Object Relational Mapping. We communicate with the database using the ORM and only use Python objects and classes.


Related course:

Creating a class to feed the ORM
We create the file tabledef.py. In this file we will define a class Student. An abstract visualization of the class below:

class Class definition

Observe we do not define any methods, only variables of the class. This is because we will map this class to the database and thus won’t need any methods.

This is the contents of tabledef.py:

from sqlalchemy import *
from sqlalchemy import create_engine, ForeignKey
from sqlalchemy import Column, Date, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, backref

engine = create_engine('sqlite:///student.db', echo=True)
Base = declarative_base()

########################################################################
class Student(Base):
""""""
__tablename__ = "student"

id = Column(Integer, primary_key=True)
username = Column(String)
firstname = Column(String)
lastname = Column(String)
university = Column(String)

#----------------------------------------------------------------------
def __init__(self, username, firstname, lastname, university):
""""""
self.username = username
self.firstname = firstname
self.lastname = lastname
self.university = university

# create tables
Base.metadata.create_all(engine)

Execute with:

python tabledef.py

The ORM created the database file tabledef.py. It will output the SQL query to the screen, in our case it showed:

CREATE TABLE student (
id INTEGER NOT NULL,
username VARCHAR,
firstname VARCHAR,
lastname VARCHAR,
university VARCHAR,
PRIMARY KEY (id)
)

Thus, while we defined a class, the ORM created the database table for us. This table is still empty.

Inserting data into the database
The database table is still empty. We can insert data into the database using Python objects. Because we use the SqlAlchemy ORM we do not have to write a single SQL query. We now simply create Python objects that we feed to the ORM. Save the code below as dummy.py

import datetime
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from tabledef import *

engine = create_engine('sqlite:///student.db', echo=True)

# create a Session
Session = sessionmaker(bind=engine)
session = Session()

# Create objects
user = Student("james","James","Boogie","MIT")
session.add(user)

user = Student("lara","Lara","Miami","UU")
session.add(user)

user = Student("eric","Eric","York","Stanford")
session.add(user)

# commit the record the database
session.commit()

Execute with:

python dummy.py

The ORM will map the Python objects to a relational database. This means you do not have any direct interaction from your application, you simply interact with objects. If you open the database with SQLiteman or an SQLite database application you’ll find the table has been created:

Data in database table. Data in database table.

Query the data
We can query all items of the table using the code below. Note that Python will see every record as a unique object as defined by the Students class. Save the code as demo.py

import datetime
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from tabledef import *

engine = create_engine('sqlite:///student.db', echo=True)

# create a Session
Session = sessionmaker(bind=engine)
session = Session()

# Create objects
for student in session.query(Student).order_by(Student.id):
print student.firstname, student.lastname

On execution you will see:

James Boogie
Lara Miami
Eric York

To select a single object use the filter() method. A demonstration below:

import datetime
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from tabledef import *

engine = create_engine('sqlite:///student.db', echo=True)

# create a Session
Session = sessionmaker(bind=engine)
session = Session()

# Select objects
for student in session.query(Student).filter(Student.firstname == 'Eric'):
print student.firstname, student.lastname

Output:

Eric York

Finally, if you do not want the ORM the output any of the SQL queries change the create_engine statement to:

engine = create_engine('sqlite:///student.db', echo=False)