Wednesday, October 29, 2014

Fixing third party Django packages for Python 3

With the release of Django 1.7 it could be argued that the balance has finally tipped towards Python 3 being its preferred platform. Given that Python 2.7 is the last 2.* release, it's probably time we all thought about moving to Python 3 for our Django deployments.

The problem is those pesky third party package developers, because unless you are a determined wheel reinventor (unlikely if you use Django!) you are bound to have a range of third party eggs in your Django sites. As one of those pesky third party developers myself, it is about time I added Python 3 compatibility to my Django open source packages.

There are a number of resources on porting Python from 2 to 3, including some specifically for Django, but hopefully this post may still prove useful as a summarised approach for doing it for your Django projects or third party packages. Hopefully it isn't too much work, and if you have been writing Python as long as me, it may also get you out of any legacy syntax habits you have.

So let's get started. The first thing is to set up Django 1.7 with Python 3.
For repeatable builds we want pip and virtualenv, so install them if they are not already there.
On a Linux platform such as Ubuntu you will have python3 installed as standard (although not yet as the default python), so if you just add pip3, that lets you add the rest ...

Install Python 3 and Django for testing

sudo apt-get install python3-pip
(OR sudo easy_install3 pip)
sudo pip3 install virtualenv

So now you can run virtualenv with python3 in addition to the default python (2.*)

virtualenv --python=python3 myenv3
cd myenv3
bin/pip install django

Then add a src directory to hold the egg you want to make compatible with Python 3, for example from git (of course you can do this in one pip line if the source is in git)

mkdir src
git clone src/django-pesky
bin/pip install -e src/django-pesky

Then run the django-pesky tests (assuming nobody uses an egg without any tests!).
The command to run pesky's tests may be something like the following ...

bin/ test pesky.tests --settings=pesky.settings

One rather disconcerting thing you will notice with tests is that the default assertEqual message is truncated in Python 3, where it wasn't in Python 2. The missing characters are replaced by a count in square brackets, e.g.

AssertionError: Lists differ: ['Failed to open file /home/jango/myenv/sr[85 chars]tem'] != []

Common Python 2 to Python 3 errors

And wait for those errors. The most common ones are:

  1. print statement without brackets
  2. except Error as err (NOT except Error, err)
  3. File open and file methods differ.
    Text files require better quality encoding - so more files default to bytes because strings in Python 3 are all stored in unicode
    (On the down side this may need more work for initial encoding clean up *,
    but on the plus side functional errors due to bad encoding are less likely to occur)
  4. There is no unicode() function in Python 3 since all strings are now unicode - i.e. it has become str(), and hence strings no longer need the u'string' marker
  5. Since unicode is not available, __unicode__ is not used for a Django model's default representation. Hence just using
    def __str__(self):
    is the future proofed method. I actually found that, in Django 1.7 and Python 3, models with both __unicode__ and __str__ methods may not return any representation, rather than the __str__ one being used as one might assume
  6. dictionary has_key has gone, must use in (if key in dict)
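
As a minimal sketch of items 1, 2 and 6 above, the following should run unchanged on Python 2.7 and Python 3. The function name parse_port and the settings dictionary are hypothetical, made up purely for illustration.

```python
def parse_port(settings):
    """Return the integer port from a settings dict, or None if unusable."""
    # 'in' replaces the removed settings.has_key('port')
    if 'port' not in settings:
        # print with brackets and a single argument works on both versions
        print('No port configured')
        return None
    try:
        return int(settings['port'])
    except ValueError as err:  # NOT 'except ValueError, err'
        print('Bad port value: %s' % err)
        return None
```

Calling print with several arguments on Python 2.7 would additionally need from __future__ import print_function at the top of the module.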

* I found more raw strings were treated as bytes by Python 3, and these then required raw_string.decode(charset) to avoid them going into the database string (e.g. varchar) fields as pseudo-bytes, i.e. strings that held 'élément' as the text '\xc3\xa9l\xc3\xa9ment', rather than being real bytes, i.e. b'\xc3\xa9l\xc3\xa9ment'
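
The decode step in the footnote above could be wrapped in a small helper, sketched here on the assumption that the incoming data is UTF-8. The name to_text is hypothetical and is not part of Django or of any package mentioned in this post.

```python
def to_text(value, charset='utf-8'):
    """Return value as a text string, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


raw = b'\xc3\xa9l\xc3\xa9ment'  # UTF-8 encoded bytes, e.g. as read from a file
text = to_text(raw)             # a seven character text string, safe for a varchar field
```

Values that are already text pass through untouched, so the helper can be applied to every field before saving.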

Ideally you will want to maintain one version but keep it compatible with Python 2 and 3,
since this is both less work and gets you into the habit of writing transitional Python :-)

Test the same code against Python 2 and 3

To do that you want to run your tests in builds for both Pythons.
So repeat the above but with virtualenv --python=python2 myenv2
and just symlink src/django-pesky into the Python 2 src folder.

Now you can run the tests for both versions against the same egg code -
and make sure when you fix for 3 you don't break for 2.

For current Django 1.7 you would just need to support the latest Python 2.7
and so the above changes are all compatible except for use of unicode() and how you call open().
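
For those two exceptions, one possible approach (a sketch, not the only way) is to alias the text type once, and to call io.open, which is available on Python 2.7 and is the same function as the built-in open on Python 3. The names text_type and read_text are hypothetical.

```python
import io
import sys

# Python 3 has no unicode(), so alias whichever text type exists.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # only evaluated on Python 2


def read_text(path, charset='utf-8'):
    """Read a whole file as text, with an explicit encoding, on 2.7 or 3."""
    with io.open(path, encoding=charset) as handle:
        return handle.read()
```

Code that previously called unicode(value) can then call text_type(value) on either version.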

Version specific code

However in some cases you may need to write code that is specific to 2 or 3.
If that occurs you can either try the latest version's code first and fall back to the older code if it fails for any reason (cross fingers), i.e. a try / except around

    latest version compatible code (e.g. Python 3 - Django 1.7)
    older version compatible code (e.g. Python 2 - Django < 1.7)
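
A minimal sketch of that latest-first approach, using a standard library module that moved between Python 2 and 3 as the example. Catching ImportError, rather than using a bare except, is my own choice here so that unrelated errors stay visible.

```python
try:
    # Python 3 location
    from urllib.parse import urlparse
except ImportError:
    # Python 2 location
    from urlparse import urlparse

parts = urlparse('http://example.com/path?x=1')
```

Either way the rest of the module just uses urlparse and never needs to know which version it is running on.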

Or you can use specific version targeting ...

import sys
import django

# django.VERSION is a tuple of numbers, so compare numbers rather than strings
django_version = django.VERSION[:2]

if sys.version_info[0] == 3 and django_version == (1, 7):
    pass  # latest version
elif sys.version_info[0] == 2 and django_version == (1, 6):
    pass  # older django version
else:
    pass  # older version

where ...

django.get_version() -> '1.6' or '1.7.1'
django.VERSION -> (1, 6, 0, 'final', 0) or (1, 7, 1, 'final', 0)
sys.version_info -> sys.version_info(major=3, minor=4, micro=0, releaselevel='final', serial=0)


So how did I get on with my first egg, django-csvimport? It actually proved quite time consuming, since csv.reader was far more sensitive to bad character encoding in Python 3, and so a more thorough manual alternative had to be implemented for those important edge cases, which the tests aim to cover. After all, if a CSV file is really well encoded and you already have a model for it, it hardly needs a pesky third party egg for CSV imports - just a few django shell lines using the csv library will do the job.