Outline
You have a web scraper setup that pulls in tons of data, but what do you do with that data. You could process it right away and then discard it. Often times though, you will need historical data to build a more complete picture. SQLITE3 DB's are a popular choice to store that data and allow for fast and easy querying. Let's take a look into how we can create a database, add data, and query it using SQLITE3 and Python. For this post, we will be displaying the different parts of the DB class and talking about their inner workings.
Connecting to the DB
Setup
We will need the os and sqlite3 libraries. The os
library allows us to work with file paths and the sqlite3
library allows us to interact with the database.
import os
import sqlite3
We will need a path to the database. Often it is easier to work with relative paths to the file calling the DB than an absolute path since we can then move the folder containing the Python code without worrying about the path structure.
To build a relative path for the database, we will use the os
library. The below code will create an absolute path dynamically and also create the database if the file/folder structure does not exist.
# the relative file path
path = 'db/db.sqlite3'
# get the path to the directory this script is in
scriptdir = os.path.dirname(__file__)
# add the relative path to the database file from there
db_path = os.path.join(scriptdir, path)
# make sure the path exists and if not create it
os.makedirs(os.path.dirname(db_path), exist_ok=True)
Connecting
Once we have the file path situation sorted out, connecting to the database is really quite simple. At this point, the database file can be empty.
# create the connection
con = sqlite3.connect(db_path)
# have queries return a dictionary rather than a tuple (optional)
con.row_factory = sqlite3.Row
# create a reference to the database cursur for later use
cur = con.cursor()
Connection is for committing to the database after
Insert
,Update
, andDelete
Cursor is for executing SQL queries to the database
Dis-connecting
Once you are done working with the database, you will want to close your connection.
# Close connection
con.close()
Creating tables
Alright, we now have a database connection
and cursor
but no tables. Let's create a User
table with an id
as the primary key and name
as the other column. For more reference on SQLITE3 create statements, follow this link. We then run the query using the execute
method on the cursor.
# the SQL to create the User table
create_table_sql = """
CREATE TABLE IF NOT EXISTS User(
id integer PRIMARY KEY,
name text NOT NULL
)
"""
# run the create statement
cur.execute(create_table_sql)
Awesome! You just created your first table! Since this is straight SQL, you could add foreign key options, create triggers, etc.
Insert/Update/Delete data
The command to insert, update, and delete data is essentially the same from Python's perspective. One key piece of code we now need to use is the commit()
method on the connection. Without this, our query will not save to the database.
Example INSERT
command:
# execute the sql
cur.execute("INSERT INTO User (name) VALUES (?)", ("John",))
# commit it to the database
con.commit()
Prepared statements
When writing an SQL query, you will want to use prepared statements. Essentially, this cleans up your inserted arguments (John in the example above) to prevent SQL injection and provide safer queries. Where the argument goes is denoted by a ?
. At the end of the statement, place a tuple with your desired data in the order it should be inserted. For more details, follow this link
Querying data
Now that we placed the data into the database, we want to get it back. For this, we use the SELECT
SQL command.
# execute the sql
for row in cur.execute("SELECT * FROM User"):
print(dict(row))
# returns: {'id': 1, 'name': 'John'}
This is a simple select statement, but this could be very complex including JOIN
, ORDER BY
, prepared arguments, and much more!
Bringing it all together
Now that you have a basic understanding of how to create and use SQLITE3 databases in Python, try adding one to your current project! Learning by doing is one of the best ways to cement an idea. Below is a complete example from the snippets above:
import os
import sqlite3
""" DB PATH """
# the relative file path
path = 'db/db.sqlite3'
# get the path to the directory this script is in
scriptdir = os.path.dirname(__file__)
# add the relative path to the database file from there
db_path = os.path.join(scriptdir, path)
# make sure the path exists and if not create it
os.makedirs(os.path.dirname(db_path), exist_ok=True)
""" DB CONNECTION """
# create the connection
con = sqlite3.connect(db_path)
# have queries return a dictionary rather than a tuple (optional)
con.row_factory = sqlite3.Row
# create a reference to the database cursur for later use
cur = con.cursor()
""" CREATE TABLE """
# the SQL to create the User table
create_table_sql = """
CREATE TABLE IF NOT EXISTS User(
id integer PRIMARY KEY,
name text NOT NULL
)
"""
# run the create statement
cur.execute(create_table_sql)
""" INSERT DATA """
# execute the sql
cur.execute("INSERT INTO User (name) VALUES (?)", ("John",))
# commit it to the database
con.commit()
""" GET DATA """
# execute the sql
for row in cur.execute("SELECT * FROM User"):
print(dict(row))
""" CLOSE CONNECTION """
con.close()
If you are looking for a small SQLITE3 wrapper, I have made a Python class called DB
that wraps the create, alter, select, insert, update, and delete statements into a convenient package. This handles the database connection and cursor/commits plus includes a function to execute arbitrary SQL. Give it a try! The source code is here on GitHub.
Comments
Login to Add comments.