jbalogh

Jeff Balogh
Tue, Oct 20th

Highlights from DjangoCon 2009

Long (long!) overdue, here’s a bunch of links that point to the good things I learned at djangocon.

Restful Ponies

  • Starts with a basic REST overview and gets interesting when it shows how easily you can expose resources using Piston
  • django-roa builds on Piston to give models a basic REST API for free
  • remoteobjects is an object-restational mapper that maps objects to REST APIs on the web. It’s really cool and I’m playing with it for the new Bugzilla REST view.

Deploying Django

Talks about how they’re doing “repeatable, automated, isolated” deployments using Python tools. Don’t you wish we were rocking this on AMO?

  • virtualenv and pip to keep the deployment environment isolated
  • fabric to script vcs checkout and installation
  • using lightweight tags in git to mark deployment versions
  • up and down migrations with South to stay safe
  • mod_wsgi daemon mode is preferred, fastcgi is cool too

Scaling Django

Pownce was serving hundreds of requests/sec and thousands of db ops/sec, so it can be done. Automatic caching and invalidation are simple when your queries go through the ORM, and Django’s signals decouple the invalidation process from your update code. Using multiple databases with Django is not straightforward, but it’s getting easier with an SoC project that’s ready to be merged in.
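Here’s a rough sketch of the signal-driven invalidation idea. It deliberately avoids importing Django so it runs anywhere: the `Signal` class is a tiny stand-in I made up, not Django’s dispatcher, and `database`, `cache`, `get_note`, and `save_note` are hypothetical names. The point is only that the code doing the update never mentions the cache.

```python
class Signal:
    """Tiny stand-in for a signal dispatcher (not Django's implementation)."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, sender, **kwargs):
        for receiver in self.receivers:
            receiver(sender=sender, **kwargs)


post_save = Signal()
cache = {}
database = {1: "first draft"}  # pretend this is a table


def get_note(pk):
    """Read through the cache, falling back to the 'database'."""
    key = f"note:{pk}"
    if key not in cache:
        cache[key] = database[pk]
    return cache[key]


def save_note(pk, text):
    """Update code: writes the row and announces it, nothing else."""
    database[pk] = text
    post_save.send(sender="Note", pk=pk)


def invalidate_note(sender, pk, **kwargs):
    """Receiver: drops the stale cache entry whenever a note is saved."""
    cache.pop(f"note:{pk}", None)


post_save.connect(invalidate_note)

first = get_note(1)           # fills the cache
save_note(1, "second draft")  # signal fires, cache entry is dropped
second = get_note(1)          # re-read from the 'database'
```

In a real project the receiver would be connected to Django’s own `post_save` signal and would talk to the cache backend instead of a dict.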

The Realtime Web

Showed how they created an IRC client in the browser over comet with Django on the backend. The browser and the app keep “persistent” connections to an orbited server that handles all the messy details so the app can pretend it’s really connected directly to the browser.

This coincided with the release of FriendFeed’s Tornado web server, which led to an interesting focus on async during the conference.

Using Django in Non-Standard Ways

Covers a lot of little hurdles that people might consider show-stoppers when using Django, and how to overcome them.

WSGI middleware is fun:

  • repoze.bitblt: automatically scales images
  • repoze.squeeze: Merges JS/CSS automatically based on statistical analysis
  • repoze.profile: Aggregates Python profiling data across all requests, and provides an html frontend for viewing the data
  • repoze.slicer: extract/filter pieces of an html response
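To make the middleware idea concrete, here’s a minimal sketch of a WSGI wrapper. It isn’t taken from any of the repoze packages: `UppercaseMiddleware`, `app`, and `call` are made-up names, and the only library involved is the standard library’s wsgiref. Any WSGI app, a Django one included, could sit where `app` does.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A trivial WSGI application standing in for the real site."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<p>hello</p>"]


class UppercaseMiddleware:
    """Wraps any WSGI app and upper-cases the response body."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, environ, start_response):
        # Call the wrapped app, then transform each chunk it returns.
        return [chunk.upper() for chunk in self.wrapped(environ, start_response)]


def call(application):
    """Invoke a WSGI callable once with a fake request; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


status, body = call(UppercaseMiddleware(app))
```

Because the wrapper only relies on the WSGI calling convention, it doesn’t need to know anything about the framework underneath.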

Pinax Tutorial

Pinax tries to be a collection of reusable apps that work well together, but it looks like you spend more time trying to configure things than actually making useful apps.

Sat, May 23rd

The worst schema versioning system, ever?

schematic talks to your database over stdin on the command line. Thus, it supports all DBMSs that have a command line interface and doesn’t care what programming language you worship. Win!

It only looks for files in the same directory as itself so you should put this script, settings.py, and all migrations in the same directory.

Configuration is done in settings.py, which should look something like:

# How to connect to the database
db = 'mysql --silent -p blam -D pow'
# The table where version info is stored.
table = 'schema_version'

It’s python so you can do whatever crazy things you want, and it’s a separate file so you can keep local settings out of version control.

Migrations are just sql in files whose names start with a number, like 001-adding-awesome.sql. They’re matched against '^\d+' so you can put zeros in front to keep ordering in ls happy, and whatever you want after the migration number, such as text describing the migration.
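A small sketch of how that filename matching could work. This is my illustration rather than schematic’s actual source, and `migration_number` and the sample filenames are made up:

```python
import re


def migration_number(filename):
    """Return the leading integer of a filename, or None if there isn't one."""
    match = re.match(r"^\d+", filename)
    return int(match.group()) if match else None


files = ["010-more.sql", "001-adding-awesome.sql", "settings.py", "2-second.sql"]

# Keep only files that start with a number, ordered by that number.
numbered = sorted(
    (migration_number(name), name)
    for name in files
    if migration_number(name) is not None
)
```

Converting the match to an int is what makes 001 and 1 the same migration number, so the zero padding only matters for how ls sorts the files.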

schematic creates a table (named in settings.py) with one column, that holds one row, which describes the current version of the database. Any migration file with a number greater than the current version will be applied to the database and the version tracker will be upgraded. The migration and version bump are performed in a transaction.

The version-tracking table will initially be set to 0, so the 0th migration could be a script that creates all your tables (for reference). Migration numbers are not required to increase linearly.
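The version logic can be sketched roughly like this. Note the swap: schematic pipes SQL to your database’s command line client, while this sketch uses Python’s sqlite3 module on an in-memory database so it is self-contained. `apply_pending`, the `version` column name, and the sample migrations are assumptions for illustration.

```python
import sqlite3


def apply_pending(conn, migrations, table="schema_version"):
    """Apply migrations numbered above the stored version, lowest first.

    `migrations` maps a migration number to a list of SQL statements.
    Each migration and its version bump share one transaction.
    """
    current = conn.execute(f"SELECT version FROM {table}").fetchone()[0]
    applied = []
    for number in sorted(n for n in migrations if n > current):
        conn.execute("BEGIN")
        try:
            for statement in migrations[number]:
                conn.execute(statement)
            conn.execute(f"UPDATE {table} SET version = ?", (number,))
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")
            raise
        applied.append(number)
    return applied


# isolation_level=None means sqlite3 never opens transactions implicitly,
# so the BEGIN/COMMIT/ROLLBACK above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE schema_version (version INTEGER)")
conn.execute("INSERT INTO schema_version VALUES (0)")

# Numbers don't have to be consecutive.
migrations = {
    1: ["CREATE TABLE users (id INTEGER PRIMARY KEY)"],
    5: ["ALTER TABLE users ADD COLUMN name TEXT"],
}
applied = apply_pending(conn, migrations)
version = conn.execute("SELECT version FROM schema_version").fetchone()[0]
```

Running `apply_pending` a second time does nothing, since no migration number is greater than the stored version.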

schematic doesn’t pretend to be intelligent. Running migrations manually without upgrading the version tracking will throw things off.

Tested on sqlite and mysql.

NOTE: any superfluous output, like column headers, will cause an error. On mysql, this is fixed by using the --silent parameter.

Things that might be nice: downgrades, running python files.

Thu, May 14th

Introducing poboy

I’d be surprised if poboy is useful to anyone I don’t work with, but I wrote a README, so that should be shared with the internet.

Finds all the gettext calls that have an inline fallback and moves that fallback into the messages.po file. Thus, you can use ___('msgid', 'msgstr') when you’re writing new code and use this script to clean up afterwards.

poboy won’t edit any code files. Instead, it prints out a unified diff that you can check for correctness and send to patch. I didn’t want to deal with rewriting files safely.
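The print-a-diff approach looks roughly like the sketch below. This is not poboy’s code: the regex only handles simple single-quoted strings, and `FALLBACK_CALL`, `cleanup_diff`, and the sample source are hypothetical. It uses difflib from the standard library to build the unified diff.

```python
import difflib
import re

# Hypothetical pattern: ___('msgid', 'fallback') with plain single-quoted strings.
FALLBACK_CALL = re.compile(r"___\('([^']*)',\s*'([^']*)'\)")


def cleanup_diff(path, source):
    """Return (fallbacks, diff) without touching the file on disk."""
    # msgid -> fallback text that would be moved into the .po file
    fallbacks = dict(FALLBACK_CALL.findall(source))
    # Same source with the inline fallback argument dropped
    cleaned = FALLBACK_CALL.sub(r"___('\1')", source)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        cleaned.splitlines(keepends=True),
        fromfile=path,
        tofile=path,
    )
    return fallbacks, "".join(diff)


source = "title = ___('welcome_msg', 'Welcome!')\nfooter = ___('bye')\n"
fallbacks, diff = cleanup_diff("views.py", source)
print(diff)
```

Since nothing is written to disk, the worst a bad regex can do is produce a diff you decide not to apply.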

How I use it

Find all the strings that have a fallback:

poboy locale/en_US/LC_MESSAGES/messages.po --find

Find the strings with a fallback that aren’t already in messages.po:

poboy locale/en_US/LC_MESSAGES/messages.po -an

That’s -a for --add (to the .po file) and -n for --dry_run.

Show the strings that will be added and the cleanup patch:

poboy locale/en_US/LC_MESSAGES/messages.po -n

And the fun one, add the strings to messages.po and generate a cleanup patch:

poboy locale/en_US/LC_MESSAGES/messages.po > poboy.patch

Tue, Mar 24th

pyquery: a jquery-like library for python

pyquery is a fantastic little library for dealing with XML and HTML documents. It brings the power and ease of jQuery into Python, letting you deal with CSS selectors and functions instead of a clunky DOM. I try to avoid dealing with XML as much as possible, but slinging around pyquery almost makes XML fun.

Building lxml

The hardest part of working with pyquery is getting it installed. pyquery gets all of its XML power from lxml, which has a reputation for being difficult. Ian Bicking mentioned that lxml 2.2 has become much easier to install by providing an option to compile the troublesome C libs as static libraries, which has avoided any problems for me. All you need to do is define STATIC_DEPS=true in the build environment:

STATIC_DEPS=true pip install pyquery

This has worked for me on OS X with pip, easy_install, buildout, and probably anything else based on distutils.

Web Scraping

Web scraping is ridiculously easy with pyquery. Grabbing a Shakespearean insult from the web is as simple as

import pyquery

p = pyquery.PyQuery('http://www.pangloss.com/seidel/Shaker/')
insult = p('font').text()

Finding the insult on that page is aided by the author’s semantic font tag.

Testing

I like to make sure that my views are working correctly, another task in which I’m finding pyquery indispensable. I’ve seen regexen used for the same task, but examining a real DOM is much more resilient than trying to pick out pieces by matching strings. Testing views is especially useful when dealing with template systems like Django’s and Jinja’s which silently hide errors instead of raising exceptions.

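d = pyquery.PyQuery(response.data)  # assumes `response` is the view's response, as in the validator example
assert d('#stats').text() == '5 tests: +2 -3'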

I’ve noticed that testing the HTML in this manner has improved my semantic markup. Pulling out and testing pieces of the page forces me to add meaningful ids and classes to the elements.

Bonus

For extra HTML goodness, the tests submit response pages to the w3c Validator using this multipart form encoder. Then, of course, I use pyquery to make sure all is well.

validator = post_multipart('validator.w3.org', '/check',
                           {'fragment': response.data})
assert pyquery.PyQuery(validator)('#congrats')

Sun, Nov 2nd

Nose Test Runner for Django

Update: you can now find django-nose on pypi and github with much better documentation.

I am not a big fan of Python’s unittest library. The Java-inspired API and the difficulty of running tests are too much for me to deal with. That’s why I love nose: I can use regular asserts (or the Pythonic helpers in nose.tools) and running all my tests is as simple as calling nosetests from the command line. On top of that, nose also supports cool plugins like generating coverage reports and running tests interactively, test fixtures at any granularity level, and simple selection of tests to run, making me a happy tester.

Which is why I wrote a custom test runner as soon as I started working on basie. Django provides its own test runner framework, but it’s far less advanced than nose.

I haven’t packaged it up for PyPI yet, but you can download nose_runner.py from our repository. Here’s the documentation:


Django test runner that invokes nose.

Usage:
    ./manage.py test DJANGO_ARGS -- NOSE_ARGS

The 'test' argument, and any other args before '--', will not be passed
to nose, allowing django args and nose args to coexist.

You can use

    NOSE_ARGS = ['list', 'of', 'args']

in settings.py for arguments that you always want passed to nose.
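The argument splitting described in that docstring could be done along these lines. This is a sketch, not the code from nose_runner.py: `split_args` is a made-up name, and putting the settings.py arguments ahead of the command line ones is my assumption.

```python
def split_args(argv, always=()):
    """Split argv at the first '--' into (django_args, nose_args).

    `always` plays the role of NOSE_ARGS from settings.py and is placed
    before whatever was given on the command line (an assumed ordering).
    """
    if "--" in argv:
        cut = argv.index("--")
        django_args, nose_args = argv[:cut], argv[cut + 1:]
    else:
        django_args, nose_args = list(argv), []
    return django_args, list(always) + nose_args


django_args, nose_args = split_args(
    ["manage.py", "test", "--settings=local", "--", "-x", "--with-coverage"],
    always=["--verbosity=2"],
)
```

With no '--' on the command line, everything stays with Django and nose only sees the arguments from settings.py.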