A database is an abstraction on top of an operating system's file system to ease creating, reading, updating, and deleting persistent data.
At a high level web applications store data and present it to users in a useful way. For example, Google stores data about roads and provides directions to get from one location to another by driving through the Maps application. Driving directions are possible because the data is stored in a structured way.
Databases make structured storage reliable and fast. They also give you a mental framework for how the data should be saved and retrieved instead of having to figure out what to do with the data every time you build a new application.
The database storage abstraction most commonly used in Python web development is sets of relational tables. Alternative storage abstractions are explained in the NoSQL section of this guide.
Relational databases store all data in a series of tables. Interconnections between the tables are specified as foreign keys.
Databases storage implementations vary in complexity. SQLite, a database included with Python, creates a single file for all data per database. Other databases such as Oracle, PostgreSQL, and MySQL have more complicated persistence schemes while offering additional advanced features that are useful for web application data storage.
PostgreSQL and MySQL are two of the most common open source databases for storing Python web application data.
SQLite is a database that is stored in a single file on disk. SQLite is built into Python but is only built for access by a single connection at a time. Therefore is highly recommended to not run a production web application with SQLite.
PostgreSQL is the recommended relational database for working with Python web applications. PostgreSQL's feature set, active development and stability contribute to its usage as the backend for millions of applications live on the Web today.
This post on using PostgreSQL with Django or Flask is a great quickstart guide for either framework.
PostgreSQL Weekly is a weekly newsletter of PostgreSQL content from around the web.
Braintree wrote about their experiences scaling PostgreSQL. The post is an inside look at the evolution of Braintree's usage of the database.
This post estimates the costs of a PostgreSQL connection.
There is no such thing as total security but this IBM article covers hardening a PostgreSQL database.
Craig Kerstiens wrote a detailed post about understanding PostgreSQL performance.
Handling growth with Postgres provides 5 specific tips from Instagram's engineering team on how to scale the design of your PostgreSQL database.
Inserting And Using A New Record In Postgres shows some SQL equivalents to what many developers just do in their ORM of choice.
Following a Select Statement Through Postgres Internals provides a fascinating look into the internal workings of PostgreSQL during a query.
This article explains how and why PostgreSQL can handle full text searching for many use cases.
If you're just getting started with PostgreSQL here are 10 beginner tasks you should know how to execute.
The title's a bit presumptuous but here's a useful list of 7 PostgreSQL data migration hacks you should be using, but aren't.
This guide to PostgreSQL monitoring is handy for knowing what to measure and how to do it.
While you can use a graphical interface for working with PostgreSQL, it's best to spend some time getting comfortable with the command-line interface.
MySQL is another viable open source database backend option for Python web applications. MySQL has a slightly easier initial learning curve than PostgreSQL. The database is deployed in production at some of the highest trafficked sites such as Twitter, Facebook and many others major organizations. However, since the company focused on MySQL development, MySQL AB, was purchased by Sun Microsystems (which was in turn purchased by Oracle), there have been major defections away from the database by Wikipedia and Google. MySQL remains a viable database option but I always recommend new Python developers learn PostgreSQL if they do not already know MySQL.
28 Beginner's Tutorials for Learning about MySQL Databases is a curated collection on various introductory MySQL topics.
This tutorial shows how to install MySQL on Ubuntu.
Terrible Choices: MySQL is a blog post about specific deficiencies in MySQL's implementation that hinder its usage with Django's ORM.
Graph Data From MySQL Database in Python is an interesting study with code of how to pull data out of MySQL and graph the data with Plotly.
Pinterest open sourced many of their MySQL tools to manage instances of the database.
To work with a relational database using Python, you need to use a code library. The most common libraries for relational databases are:
SQLite support is built into Python 2.7+ and therefore a separate library is not necessary. Simply "import sqlite3" to begin interfacing with the single file-based database.
Object-relational mappers (ORMs) allow developers to access data from a backend by writing Python code instead of SQL queries. Each web application framework handles integrating ORMs differently. There's an entire page on object-relational mapping (ORMs) that you should read to get a handle on this subject.
Numerous companies run scalable database servers as a hosted service. Hosted databases can often provide automated backups and recovery, tightened security configurations and easy vertical scaling, depending on the provider.
Amazon Relational Database Service (RDS) provides pre-configured MySQL and PostgreSQL instances. The instances can be scaled to larger or smaller configurations based on storage and performance needs.
Google Cloud SQL is a service with managed, backed up, replicated, and auto-patched MySQL instances. Cloud SQL integrates with Google App Engine but can be used independently as well.
BitCan provides both MySQL and MongoDB hosted databases with extensive backup services.
How does a relational database work? is a detailed longform post on the sorting, searching, merging and other operations we often take for granted when using an established relational database such as PostgreSQL.
Why I Love Databases is a great read on the CAP Theorem, distributed systems and other topics that are at the core of database theory and implementation. Well worth the time to read.
DB-Engines ranks the most popular database management systems.
DB Weekly is a weekly roundup of general database articles and resources.
A different view provides some perspective on the impedance mismatch between ORMs and traditional SQL queries.
Databases integration testing strategies covers a difficult topic that comes up on every real world project.
PostgreSQL vs. MS SQL Server is one perspective on the differences between the two database servers from a data analyst.
Install PostgreSQL on your server. Assuming you went with Ubuntu run
sudo apt-get install postgresql.
Make sure the psycopg2 library is in your application's dependencies.
Configure your web application to connect to the PostgreSQL instance.
Create models in your ORM, either with Django's built-in ORM or SQLAlchemy with Flask.
Build your database tables or sync the ORM models with the PostgreSQL instance, if you're using an ORM.
Start creating, reading, updating and deleting data in the database from your web application.
Searching for a complete, step-by-step deployment walkthrough? Learn more about The Full Stack Python Guide to Deployments book.
Searching for a complete, step-by-step deployment walkthrough? Learn more about The Full Stack Python Guide to Deployments book.