Fixing Database Connections in Django
If you’re looking to get better performance from your Django apps you can check out Pro Django, PostgreSQL High Performance, or read some of my earlier posts on Postgres Performance. All of these are of course good things to do. You can also start by correcting an incredibly common but painful performance issue that remains unaddressed in Django until 1.6.
Django’s current default behavior is to establish a new connection for each request within a Django application. In many cases, and particularly in distributed cloud environments, this accounts for a large share of your response time. An example application running on Heroku shows a typical connection time of 70ms. A large part of this time is the SSL negotiation that occurs in connecting to your database, which is a good practice to ensure security of your data. Regardless, this is a long time to spend simply establishing a connection. As a point of comparison, it’s commonly encouraged that most queries to your database take under 10ms.
An example from a small, lightweight application, as displayed by New Relic, shows the bulk of the request time being spent establishing the connection:
One option to remedy this is running a connection pooler on the database side, such as Pgpool or PgBouncer. In fact Ask the Pony already highlighted these potential gains. While running an external DB, they’re essentially testing the benefits of connection pooling. This is an obvious gain, and it can be had in a much more lightweight form.
Connection Pooling in Django
As Django establishes a connection on each request, there is an opportunity to both pool connections and persist connections. There are a few options for pooling; each works quite well with Django and provides some dramatic improvements. While the first request may take the 70ms of connection time, subsequent requests show absolutely no connection time since the connection already exists. This shows up clearly when comparing the time spent grabbing a connection before and after.
Clearly there’s plenty of value in having a persistent connection or a pool within Django itself. As of today there are a few options for that:
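To make the difference concrete, here is a minimal sketch that counts how many connections get opened under each strategy. It is a toy model, not Django's request handling: the helper names (CountingConnector, serve_per_request, serve_persistent) are made up for illustration, and an in-memory SQLite database stands in for Postgres so the example runs anywhere.

```python
import sqlite3


class CountingConnector:
    """Wraps a connect function and counts how often it is called."""

    def __init__(self, connect):
        self._connect = connect
        self.opened = 0

    def __call__(self):
        self.opened += 1
        return self._connect()


def serve_per_request(connector, n_requests):
    """Open and close a fresh connection for every request."""
    for _ in range(n_requests):
        conn = connector()
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()


def serve_persistent(connector, n_requests):
    """Open one connection on first use and reuse it afterwards."""
    conn = None
    try:
        for _ in range(n_requests):
            if conn is None:
                conn = connector()  # only the first request pays this cost
            conn.execute("SELECT 1").fetchone()
    finally:
        if conn is not None:
            conn.close()


def connect_sqlite():
    # Stand-in for a (much slower) network connection to Postgres.
    return sqlite3.connect(":memory:")


per_request = CountingConnector(connect_sqlite)
serve_per_request(per_request, 100)

persistent = CountingConnector(connect_sqlite)
serve_persistent(persistent, 100)

print(per_request.opened, persistent.opened)  # prints: 100 1
```

With a connection cost of around 70ms, the first strategy pays that cost 100 times and the second pays it once.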
Django-PostgresPool
The first, Django-PostgresPool, was created by kennethreitz. Since I’d generally encourage the use of dj_database_url anyway, you can easily begin using his package (once installed) with:
import dj_database_url
DATABASES = { 'default': dj_database_url.config() }
DATABASES['default']['ENGINE'] = 'django_postgrespool'
An important thing to note is that if you’re using South you’ll also want to set up the adapter for it:
SOUTH_DATABASE_ADAPTERS = {
    'default': 'south.db.postgresql_psycopg2'
}
djorm-ext-pool
The second option, djorm-ext-pool, was created by niwibe. Once you’ve installed djorm-ext-pool, add it to your INSTALLED_APPS within your settings.py. From there you can set up your pool:
DJORM_POOL_OPTIONS = {
    "pool_size": 20,
    "max_overflow": 0
}
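To give a feel for what these two options control, here is a toy pool written from scratch. It assumes pool_size and max_overflow mean what the identically named SQLAlchemy QueuePool arguments mean (connections kept around, plus extra ones allowed under load). ToyPool is a hypothetical class for illustration only, not the code djorm-ext-pool runs, and SQLite again stands in for Postgres.

```python
import queue
import sqlite3
import threading


class ToyPool:
    """Keeps up to pool_size idle connections and allows up to
    max_overflow extra ones when all of those are checked out."""

    def __init__(self, connect, pool_size=20, max_overflow=0):
        if pool_size < 1:
            raise ValueError("pool_size must be at least 1")
        self._connect = connect
        self._idle = queue.Queue(maxsize=pool_size)
        self._limit = pool_size + max_overflow
        self._lock = threading.Lock()
        self.open_count = 0  # connections currently open (idle or in use)

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            pass
        with self._lock:
            if self.open_count >= self._limit:
                # A real pool would typically wait here; this sketch fails fast.
                raise RuntimeError("pool exhausted")
            self.open_count += 1
        try:
            return self._connect()
        except Exception:
            with self._lock:
                self.open_count -= 1
            raise

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)  # keep it for the next caller
        except queue.Full:
            conn.close()  # overflow connection: discard instead of keeping
            with self._lock:
                self.open_count -= 1


pool = ToyPool(lambda: sqlite3.connect(":memory:"), pool_size=2, max_overflow=0)
first = pool.acquire()
second = pool.acquire()
try:
    pool.acquire()  # a third connection would exceed pool_size + max_overflow
    exhausted = False
except RuntimeError:
    exhausted = True

pool.release(first)
third = pool.acquire()  # handed back the connection that was just released
print(exhausted, third is first, pool.open_count)  # prints: True True 2
```

With "max_overflow": 0 as in the settings above, the pool never grows past pool_size, which keeps the number of connections per process predictable.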
django-db-pool
The third and final option is django-db-pool. You can set it up with:
import dj_database_url
DATABASES = {'default': dj_database_url.config()}
DATABASES['default']['ENGINE'] = 'dbpool.db.backends.postgresql_psycopg2'
DATABASES['default']['OPTIONS'] = {
    'MAX_CONNS': 10
}
Gotchas
Each of these does work with recent versions of Django, though in some cases there are gotchas. If you’re using a production-worthy Python web server such as Gunicorn or uWSGI and running with gevent or eventlet, some edge cases can present themselves. Regardless of potential gotchas, it is worth attempting this and of course providing feedback to maintainers and the community as you find them.
The future
More recently, Django has started to directly address the high cost of establishing a connection. The first major step here is this patch from Aymeric. You can find more discussion around this particular patch here. Essentially, with this patch, which will land in Django 1.6, developers get a persistent connection, which will help reduce the time. If you’re interested in trying the 1.6 master you can do so by adding it to your requirements.txt as:
https://github.com/django/django/archive/master.zip
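For reference, in Django 1.6 persistent connections are controlled by the CONN_MAX_AGE key of each DATABASES entry. A minimal settings sketch follows; the database name is a placeholder, and the 600 second lifetime is just an example value, not a recommendation from the patch.

```python
# settings.py fragment (sketch only)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',  # hypothetical database name
        # Reuse each connection for up to 600 seconds instead of
        # reconnecting on every request. 0 (the default) keeps the old
        # per-request behavior; None keeps connections open indefinitely.
        'CONN_MAX_AGE': 600,
    }
}
```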
At this point it does not introduce pooling, which could allow even more gains, though I’m sure if there’s enough need it’ll be on a roadmap at some point. As it stands today, before 1.6, your best bet is one of the above options.