● Before querying your database, you’ll want to know what is in it: what the tables are, for example:
from sqlalchemy import create_engine
engine = create_engine('sqlite:///census.sqlite')
print engine.table_names()
connection = engine.connect()
import pandas as pd
df = pd.read_csv('all_050_in_36.P1.csv')
df.info()
print df.shape
df.head(3)
df.to_sql('census',connection)
engine.table_names()
connection.execute('SELECT * FROM census LIMIT 10').fetchall()
connection.execute('SELECT COUNT(*) FROM cencus LIMIT 10').fetchall()
● Reflection reads database and builds SQLAlchemy Table objects
Your job is to reflect the census table available on your engine into the census variable.
from sqlalchemy import MetaData, Table
metadata = MetaData()
# Reflect census table from the engine: census
census = Table('census', metadata, autoload=True, autoload_with=engine)
# Print census table metadata
print(repr(census))
census.columns.keys() would return a list of column names of the census table.
use the metadata container to find out more details about the reflected table such as the columns and their types.
# Print the column names
print census.columns.keys()
# Print full table metadata
print(repr(metadata.tables['census']))
● Select, Insert, Update & Delete data
● Create & Alter data
● SELECT column_name FROM table_name
● SELECT pop2008 FROM People
● SELECT * FROM People
from sqlalchemy import create_engine
engine = create_engine('sqlite:///census.sqlite')
engine = create_engine('sqlite:///census_nyc.sqlite')
stmt = 'SELECT * FROM census'
result_proxy = connection.execute(stmt)
results = result_proxy.fetchall()
print type(results)
print type(results)
print results[0]
print type(results[0])
print
print results[0].keys()
print results[0].NAME
● Provides a Pythonic way to build SQL statements
● Hides differences between backend database types
from sqlalchemy import Table, MetaData, select
metadata = MetaData()
census = Table('census', metadata, autoload=True, autoload_with=engine)
stmt = select([census])
results = connection.execute(stmt).fetchall()
print type(stmt)
print stmt
print type(results), type(results[0]), len(results)
● Requires a list of one or more Tables or Columns
● Using a table will select all the columns in it
first = results[0]
type(first)
print dir(first)
print first.keys()
first['POP100'], first['NAME']
first[9], first['NAME']
Recall the differences between a ResultProxy and a ResultSet:
ResultProxy: The object returned by the .execute() method. It can be used in a variety of ways to get the data returned by the query.
ResultSet: The actual data asked for in the query when using a fetch method such as .fetchall() on a ResultProxy.
This separation between the ResultSet and ResultProxy allows us to fetch as much or as little data as we desire.