Create a database engine¶

SQLite database
- simple and fast

SQLAlchemy
- Works with many Relational Database Management Systems
  - databasetype:///schema_name.extention

from sqlalchemy import create_engine
engine = create_engine('sqlite:///test.sqlite')

table_names = engine.table_names()
table_names

[]

example: Chinook Database¶

http://chinookdatabase.codeplex.com/
- the Chinook database contains information about a semi-fictional digital media store in which media data is real and customer, employee and sales data has been manually created.

# Import necessary module
from sqlalchemy import create_engine

# Create engine: engine
engine = create_engine('sqlite:///Chinook_Sqlite.sqlite')

# Save the table names to a list: table_names
table_names = engine.table_names()

# Print the table names to the shell
print(table_names)

[u'Album', u'Artist', u'Customer', u'Employee', u'Genre', u'Invoice', u'InvoiceLine', u'MediaType', u'Playlist', u'PlaylistTrack', u'Track']

The Hello World of SQL Queries!¶

SELECT, in order to retrieve all columns of the table Album in the Chinook database. Recall that the query SELECT * selects all columns.

# Import packages
from sqlalchemy import create_engine
import pandas as pd

# Create engine: engine
engine = create_engine('sqlite:///Chinook_Sqlite.sqlite')

# Open engine connection
con = engine.connect()

# Perform query: rs
rs = con.execute('select * from Album')

# Save results of the query to list: ll
ll = rs.fetchall()


# Close connection
con.close()

# Print head of list
print len(ll),type(ll)
ll[:3]

347 <type 'list'>

[(1, u'For Those About To Rock We Salute You', 1),
 (2, u'Balls to the Wall', 2),
 (3, u'Restless and Wild', 2)]

# Save results of the query to DataFrame
df=pd.DataFrame(ll)
df.head(3)

Customizing the Hello World of SQL Queries¶

Select specified columns from a table;

Select a specified number of rows;

Import column names from the database table.

# Open engine in context manager
# Perform query and save results to DataFrame: df
with engine.connect() as con:
    rs = con.execute("SELECT LastName, Title FROM Employee")
    df = pd.DataFrame(rs.fetchmany(size=3))
    
    # Using the rs object, set the DataFrame's column names to the corresponding names of the table columns.
    df.columns = rs.keys()

# Print the length of the DataFrame df
print(len(df))

# Print the head of the DataFrame df
df

3

Filtering your database records using SQL's WHERE¶

# Create engine: engine
engine = create_engine('sqlite:///Chinook_Sqlite.sqlite')

# Open engine in context manager
# Perform query and save results to DataFrame: df
with engine.connect() as con:
    rs = con.execute('select * from Employee where EmployeeID >= 6')
    df = pd.DataFrame(rs.fetchall())
    df.columns = rs.keys()

# Print the head of the DataFrame df
print df.shape
df.head(1)

(3, 15)

Ordering your SQL records with ORDER BY¶

# Create engine: engine
engine = create_engine('sqlite:///Chinook_Sqlite.sqlite')

# Open engine in context manager
with engine.connect() as con:
    rs = con.execute("SELECT * FROM Employee ORDER BY BirthDate")
    df = pd.DataFrame(rs.fetchall())

    # Set the DataFrame's column names
    df.columns = rs.keys()

# Print head of DataFrame
print df.shape
df.head(3)

(8, 15)

Querying relational databases directly with pandas¶

df = pd.read_sql_query("queries", engine)

# Import packages
from sqlalchemy import create_engine
import pandas as pd

# Create engine: engine
engine = create_engine('sqlite:///Chinook_Sqlite.sqlite')

# Execute query and store records in DataFrame: df
df = pd.read_sql_query("SELECT * FROM Album", engine)
print df.shape

(347, 3)

compare

# Open engine in context manager
# Perform query and save results to DataFrame: df1
with engine.connect() as con:
    rs = con.execute("SELECT * FROM Album")
    df1 = pd.DataFrame(rs.fetchall())
    df1.columns = rs.keys()

# Confirm that both methods yield the same result: does df = df1 ?
print(df.equals(df1))

True

Pandas for more complex querying¶

You'll build a DataFrame that contains the rows of the Employee table for which the EmployeeId is greater than or equal to 6 and you'll order these entries by BirthDate.

# Import packages
from sqlalchemy import create_engine
import pandas as pd

# Create engine: engine
engine = create_engine('sqlite:///Chinook_Sqlite.sqlite')

# Execute query and store records in DataFrame: df
df = pd.read_sql_query(
    "SELECT * FROM Employee WHERE EmployeeId >= 6 ORDER BY BirthDate",
    engine
)

# Print head of DataFrame
print df.shape
df

(3, 15)

INNER JOIN¶

# Open engine in context manager
# Perform query and save results to DataFrame: df
with engine.connect() as con:
    rs = con.execute("SELECT Title, Name FROM Album INNER JOIN Artist on Album.ArtistID = Artist.ArtistID")
    df = pd.DataFrame(rs.fetchall())
    df.columns = rs.keys()

# Print head of DataFrame df
print df.shape
df.head(3)

(347, 2)

pandas filter¶

# Execute query and store records in DataFrame: df
df = pd.read_sql_query(
    "SELECT * FROM PlaylistTrack INNER JOIN Track on PlaylistTrack.TrackId = Track.TrackId WHERE Milliseconds < 250000",
    engine
)

# Print head of DataFrame
print df.shape
df.head(3)

(4073, 11)

	LastName	Title
0	Adams	General Manager
1	Edwards	Sales Manager
2	Peacock	Sales Support Agent

	EmployeeId	LastName	FirstName	Title	ReportsTo	BirthDate	HireDate	Address	City	State	Country	PostalCode	Phone	Fax	Email
0	4	Park	Margaret	Sales Support Agent	2.0	1947-09-19 00:00:00	2003-05-03 00:00:00	683 10 Street SW	Calgary	AB	Canada	T2P 5G3	+1 (403) 263-4423	+1 (403) 263-4289	margaret@chinookcorp.com
1	2	Edwards	Nancy	Sales Manager	1.0	1958-12-08 00:00:00	2002-05-01 00:00:00	825 8 Ave SW	Calgary	AB	Canada	T2P 2T3	+1 (403) 262-3443	+1 (403) 262-3322	nancy@chinookcorp.com
2	1	Adams	Andrew	General Manager	NaN	1962-02-18 00:00:00	2002-08-14 00:00:00	11120 Jasper Ave NW	Edmonton	AB	Canada	T5K 2N1	+1 (780) 428-9482	+1 (780) 428-3457	andrew@chinookcorp.com

	EmployeeId	LastName	FirstName	Title	ReportsTo	BirthDate	HireDate	Address	City	State	Country	PostalCode	Phone	Fax	Email
0	8	Callahan	Laura	IT Staff	6	1968-01-09 00:00:00	2004-03-04 00:00:00	923 7 ST NW	Lethbridge	AB	Canada	T1H 1Y8	+1 (403) 467-3351	+1 (403) 467-8772	laura@chinookcorp.com
1	7	King	Robert	IT Staff	6	1970-05-29 00:00:00	2004-01-02 00:00:00	590 Columbia Boulevard West	Lethbridge	AB	Canada	T1K 5N8	+1 (403) 456-9986	+1 (403) 456-8485	robert@chinookcorp.com
2	6	Mitchell	Michael	IT Manager	1	1973-07-01 00:00:00	2003-10-17 00:00:00	5827 Bowness Road NW	Calgary	AB	Canada	T3B 0C5	+1 (403) 246-9887	+1 (403) 246-9899	michael@chinookcorp.com

	PlaylistId	TrackId	TrackId	Name	AlbumId	MediaTypeId	GenreId	Composer	Milliseconds	Bytes	UnitPrice
0	1	3390	3390	One and the Same	271	2	23	None	217732	3559040	0.99
1	1	3392	3392	Until We Fall	271	2	23	None	230758	3766605	0.99
2	1	3393	3393	Original Fire	271	2	23	None	218916	3577821	0.99

Manipulate database -2 SQLAlchemy PostgreSQL, filtering, grouping, aggregating funcs To Pandas DataFrame and plotting with matplotlib and seaborn

SQLAlchemy connecting to PostgreSQL,

filtering,

grouping, aggregating funcs

To Pandas DataFrame

and plotting with matplotlib and seaborn

Manipulate database basic -1 queries and SQLAlchemy

create engine

connect

queries

SQLAlchemy

Add Anaconda new Python environment & add to jupyter notebook kernel

database python connection basic Sqlalchemy & Pandas

Create a database engine¶

example: Chinook Database¶

The Hello World of SQL Queries!¶

Customizing the Hello World of SQL Queries¶

Filtering your database records using SQL's WHERE¶

Ordering your SQL records with ORDER BY¶

Querying relational databases directly with pandas¶

Pandas for more complex querying¶

INNER JOIN¶

pandas filter¶

Mysql database common commands

	0	1	2
0	1	For Those About To Rock We Salute You	1
1	2	Balls to the Wall	2
2	3	Restless and Wild	2

	Title	Name
0	For Those About To Rock We Salute You	AC/DC
1	Balls to the Wall	Accept
2	Restless and Wild	Accept

Data Science Notebook

Manipulate database -2 SQLAlchemy PostgreSQL, filtering, grouping, aggregating funcs To Pandas DataFrame and plotting with matplotlib and seaborn

SQLAlchemy connecting to PostgreSQL,

filtering,

grouping, aggregating funcs

To Pandas DataFrame

and plotting with matplotlib and seaborn

Manipulate database basic -1 queries and SQLAlchemy

create engine

connect

queries

SQLAlchemy

Add Anaconda new Python environment & add to jupyter notebook kernel

database python connection basic Sqlalchemy & Pandas

Create a database engine¶

example: Chinook Database¶

The Hello World of SQL Queries!¶

Customizing the Hello World of SQL Queries¶

Filtering your database records using SQL's WHERE¶

Ordering your SQL records with ORDER BY¶

Querying relational databases directly with pandas¶

Pandas for more complex querying¶

INNER JOIN¶

pandas filter¶

Mysql database common commands