Thursday, November 12, 2009

Getting up to speed with django

I finally tracked down why my experience of django has been really slow, when supposedly one of its qualities is speed. So it is meant to be faster than a lot of LAMP frameworks, eg. Symfony, and also Ruby on the Rails, so certainly better than a non-rdbms centric platform like zope/plone - yet I was getting really terrible performance.

Initially with a standard Apache set up I was getting 10-20 secs/req ! ... gak unusable what was wrong?
Simply adding KeepAlive Off to the apache conf had a radical effect. Now I was getting 3-4 secs/req.
So pretty terrible but not totally unusable. Things were a bit better under mod_wsgi so 2-3 secs/req.

However the real nub of the issue was Oracle, the backend was not using pooled connection so each page was renegotiating an Oracle connection. With our distributed Oracle server network that could take 2-3 secs (ie around 90% of the request time).

I investigated pyorapool and it doubled performance, but had issues and it seemed overkill to run something more suited to pooling connections across servers.

So I went back to a post about a pooled oracle backend by Taras Halturin, which I initially couldnt get working. Having fixed it for current django 1.1.1 I have packaged it up in our repo as ilrtdjango.oracle_pool I guess if I add some extra features to it such as logging etc, I should contact Taras about punting it up on pypi.

This really deals with the problem by wrapping up cx_Oracle's own connection pooling. Now I am seeing 0.4 secs/req on average, with cached pages down to 0.1 secs.
There is a lot more that could be done with better fragment caching and use of memcache rather than a simple file system backend. But for the moment I am happy that django really is fast, ie. the framework can do 10 req/sec out of the box. With good queries and materialised views for remote databases, using a pooled backend a few Oracle queries should at most halve that speed, so a 5 req/sec uncached baseline.

This is a much better performance point to work from for database applications, and if I need the sort of 50 req/sec performance of a high traffic site then I have all the caching headroom to use (e.g. memcache or a web cache frontend).

No comments:

Post a Comment

Site code, Google Apps integration and design - Ed Crewe 2011