Django’s only prerequisite is a working installation of Python. However, this book focuses on one of Django’s sweet spots, which is developing database-backed Web sites, so you’ll need to install a database server of some sort, for storing your data. If you just want to get started playing with Django, skip ahead to the “Starting a Project” section—but trust us, you’ll want to install a database eventually. All of the examples in the book assume you have a database set up. As of the time of this writing, Django supports many database engines:
- PostgreSQL (http://www.postgresql.org/)
- SQLite 3 (http://www.sqlite.org/)
- MySQL (http://www.mysql.com/)
The Django Web site will always have the latest information about supported databases. SQLite is an extremely simple in-process database engine that doesn’t require any sort of server setup or configuration. On Windows, obtaining database driver binaries is sometimes an involved process.
Django includes a number of applications by default (e.g. the admin program and user management and authentication). Some of these applications make use of at least one database table, so we need to create tables in a database before we can use them. To do that, change into the mysite folder created in the last step (type cd mysite at the command prompt) and run the following command:
python manage.py migrate
The migrate command creates a new SQLite database and any necessary database tables according to the settings file created by the startproject command (more on the settings file later in the book). If all goes to plan, you’ll see a message for each migration it applies:
(env_mysite) C:\Users\Nigel\OneDrive\Documents\mysite_project\mysite>python manage .py migrate Operations to perform: Apply all migrations: admin, auth, contenttypes, sessions Running migrations: Applying contenttypes.0001_initial... OK Applying auth.0001_initial... OK Applying admin.0001_initial... OK ### several more migrations (not shown)
Django installs and configures SQLite automatically, with no input from you.
Persistent Connections
Persistent connections avoid the overhead of re-establishing a connection to the database in each request. They’re controlled by the CONN_MAX_AGE parameter which defines the maximum lifetime of a connection. It can be set independently for each database. The default value is 0, preserving the historical behavior of closing the database connection at the end of each request. To enable persistent connections, set CONN_MAX_AGE to a positive number of seconds. For unlimited persistent connections, set it to None.
Connection Management
Django opens a connection to the database when it first makes a database query. It keeps this connection open and reuses it in subsequent requests. Django closes the connection once it exceeds the maximum age defined by CONN_MAX_AGE or when it isn’t usable any longer.
In detail, Django automatically opens a connection to the database whenever it needs one and doesn’t have one already – either because this is the first connection, or because the previous connection was closed.
At the beginning of each request, Django closes the connection if it has reached its maximum age. If your database terminates idle connections after some time, you should set CONN_MAX_AGE to a lower value, so that Django doesn’t attempt to use a connection that has been terminated by the database server. (This problem may only affect very low traffic sites.)
At the end of each request, Django closes the connection if it has reached its maximum age or if it is in an unrecoverable error state. If any database errors have occurred while processing the requests, Django checks whether the connection still works, and closes it if it doesn’t. Thus, database errors affect at most one request; if the connection becomes unusable, the next request gets a fresh connection.
Since each thread maintains its own connection, your database must support at least as many simultaneous connections as you have worker threads.
Sometimes a database won’t be accessed by the majority of your views, for example because it’s the database of an external system, or thanks to caching. In such cases, you should set CONN_MAX_AGE to a low value or even 0, because it doesn’t make sense to maintain a connection that’s unlikely to be reused. This will help keep the number of simultaneous connections to this database small.
The development server creates a new thread for each request it handles, negating the effect of persistent connections. Don’t enable them during development.
When Django establishes a connection to the database, it sets up appropriate parameters, depending on the backend being used. If you enable persistent connections, this setup is no longer repeated every request. If you modify parameters such as the connection’s isolation level or time zone, you should either restore Django’s defaults at the end of each request, force an appropriate value at the beginning of each request, or disable persistent connections.
Encoding
Django assumes that all databases use UTF-8 encoding. Using other encodings may result in unexpected behavior such as value too long errors from your database for data that is valid in Django. See the database specific notes below for information on how to set up your database correctly.
MySQL
Django supports MySQL 5.5 and higher. Django’s inspectdb feature uses the information_schema database, which contains detailed data on all database schemas. Django expects the database to support Unicode (UTF-8 encoding) and delegates to it the task of enforcing transactions and referential integrity. It is important to be aware of the fact that the two latter ones aren’t actually enforced by MySQL when using the MyISAM storage engine.
Storage Engines – MySQL has several storage engines. You can change the default storage engine in the server configuration. Until MySQL 5.5.4, the default engine was MyISAM. The main drawbacks of MyISAM are that it doesn’t support transactions or enforce foreign-key constraints. On the plus side, it was the only engine that supported full-text indexing and searching until MySQL 5.6.4.
Since MySQL 5.5.5, the default storage engine is InnoDB. This engine is fully transactional and supports foreign key references. It’s probably the best choice at this point. However, note that the InnoDB autoincrement counter is lost on a MySQL restart because it does not remember the AUTO_INCREMENT value, instead recreating it as max(id)+1. This may result in an inadvertent reuse of AutoField values.
If you upgrade an existing project to MySQL 5.5.5 and subsequently add some tables, ensure that your tables are using the same storage engine (i.e. MyISAM vs. InnoDB). Specifically, if tables that have a ForeignKey between them use different storage engines, you may see an error like the following when running migrate:
_mysql_exceptions.OperationalError: ( 1005, “Can’t create table ‘\\db_name\\.#sql-4a8_ab’ (errno: 150)”)
MySQL DB API Drivers – The Python Database API is described in PEP 249. MySQL has three prominent drivers that implement this API:
- MySQLdb is a native driver that has been developed and supported for over a decade by Andy Dustman.
- mysqlclient is a fork of MySQLdb which notably supports Python 3 and can be used as a drop-in replacement for MySQLdb. At the time of this writing, this is the recommended choice for using MySQL with Django.
- MySQL Connector/Python is a pure Python driver from Oracle that does not require the MySQL client library or any Python modules outside the standard library.
All these drivers are thread-safe and provide connection pooling. MySQLdb is the only one not supporting Python 3 currently.
In addition to a DB API driver, Django needs an adapter to access the database drivers from its ORM. Django provides an adapter for MySQLdb/mysqlclient while MySQL Connector/Python includes its own.
postgreSQL
Django supports PostgreSQL 9.0 and higher. It requires the use of Psycopg2 2.0.9 or higher. Django needs the following parameters for its database connections:
- client_encoding: ‘UTF8’
- default_transaction_isolation: ‘read committed’ by default, or the value set in the connection options
- timezone: ‘UTC’ when USE_TZ is True, value of TIME_ZONE otherwise.
If these parameters already have the correct values, Django won’t set them for every new connection, which improves performance slightly. You can configure them directly in postgresql.conf or more conveniently per database user with ALTER ROLE.
Django will work just fine without this optimization, but each new connection will do some additional queries to set these parameters.
Like PostgreSQL itself, Django defaults to the READ COMMITTED isolation level. If you need a higher isolation level such as REPEATABLE READ or SERIALIZABLE, set it in the OPTIONS part of your database configuration in DATABASES:
Under higher isolation levels, your application should be prepared to handle exceptions raised on serialization failures. This option is designed for advanced uses.
When specifying db_index=True on your model fields, Django typically outputs a single CREATE INDEX statement. However, if the database type for the field is either varchar or text (e.g., used by CharField, FileField, and TextField), then Django will create an additional index that uses an appropriate PostgreSQL operator class for the column. The extra index is necessary to correctly perform lookups that use the LIKE operator in their SQL, as is done with the contains and startswith lookup types.
Creating Your Database
You can create your database using the command-line tools and this SQL:
CREATE DATABASE <dbname> CHARACTER SET utf8;
This ensures all tables and columns will use UTF-8 by default.
Collation Settings
The collation setting for a column controls the order in which data is sorted as well as what strings compare as equal. It can be set on a database-wide level and also per-table and per-column. This is documented thoroughly in the MySQL documentation. In all cases, you set the collation by directly manipulating the database tables; Django doesn’t provide a way to set this on the model definition.
By default, with a UTF-8 database, MySQL will use the utf8_general_ci collation. This results in all string equality comparisons being done in a case-insensitive manner. That is, “Fred” and “freD” are considered equal at the database level. If you have a unique constraint on a field, it would be illegal to try to insert both “aa” and “AA” into the same column, since they compare as equal (and, hence, non-unique) with the default collation.
In many cases, this default will not be a problem. However, if you really want case-sensitive comparisons on a particular column or table, you would change the column or table to use the utf8_bin collation. The main thing to be aware of in this case is that if you are using MySQLdb 1.2.2, the database backend in Django will then return bytestrings (instead of Unicode strings) for any character fields it receives from the database. This is a strong variation from Django’s normal practice of always returning Unicode strings.
It is up to you, the developer, to handle the fact that you will receive bytestrings if you configure your table(s) to use utf8_bin collation. Django itself should mostly work smoothly with such columns (except for the contrib.sessions Session and contrib.admin LogEntry tables described below), but your code must be prepared to call django.utils.encoding.smart_text() at times if it really wants to work with consistent data – Django will not do this for you (the database backend layer and the model population layer are separated internally so the database layer doesn’t know it needs to make this conversion in this one particular case).
If you’re using MySQLdb 1.2.1p2, Django’s standard CharField class will return Unicode strings even with utf8_bin collation. However, TextField fields will be returned as an array.array instance (from Python’s standard array module). There isn’t a lot Django can do about that, since, again, the information needed to make the necessary conversions isn’t available when the data is read in from the database. This problem was fixed in MySQLdb 1.2.2, so if you want to use TextField with utf8_bin collation, upgrading to version 1.2.2 and then dealing with the bytestrings (which shouldn’t be too difficult) as described above is the recommended solution.
Should you decide to use utf8_bin collation for some of your tables with MySQLdb 1.2.1p2 or 1.2.2, you should still use utf8_general_ci (the default) collation for the django.contrib.sessions.models.Session table (usually called django_session) and the django.contrib.admin.models.LogEntry table (usually called django_admin_log). Please note that according to MySQL Unicode Character Sets, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. If this is acceptable for your application, you should use utf8_general_ci because it is faster. If this is not acceptable (for example, if you require German dictionary order), use utf8_unicode_ci because it is more accurate.
Connecting to the Database
Connection settings are used in this order:
- OPTIONS.
- NAME, USER, PASSWORD, HOST, PORT
- MySQL option files.
In other words, if you set the name of the database in OPTIONS, this will take precedence over NAME, which would override anything in a MySQL option file. Here’s a sample configuration which uses a MySQL option file:
# settings.py
DATABASES = {
‘default’: {
‘ENGINE’: ‘django.db.backends.mysql’,
‘OPTIONS’: {‘read_default_file’: ‘/path/to/my.cnf’,},
}
}
# my.cnf
[client]database = NAME
user = USER
password = PASSWORD
default-character-set = utf8
Several other MySQLdb connection options may be useful, such as ssl, init_command, and sql_mode.
Using Django Without a Database – As mentioned earlier, Django doesn’t actually require a database. If you just want to use it to serve dynamic pages that don’t hit a database, that’s perfectly fine. With that said, bear in mind that some of the extra tools bundled with Django do require a database, so if you choose not to use a database, you’ll miss out on those features.