Content-type: matter-transport/sentient-life-form

Django Paste - Ian is starting to integrate Django with paste (and paste deploy). I for one will most definitely try to support that, so his list of related tickets is already down by one. Paste deploy might even be taken as the future default FCGI/SCGI solution - because it uses the same FLUP lib, it is as capable as my scripts, but due to the structure of Paste, installation should be much easier (and might even be standard in the future with Python hosters).

MoinMoin Release 1.5 - wow, the new MoinMoin looks really slick.

code.enthought.com - Enthought Tool Suite - sounds like an interesting GUI library that builds on wxPython and enables even more comfortable application development. Particularly interesting is the use of the "Traits" concept for the automatic creation of interfaces.

Codeville - and yet another version control system, this one is written in Python and specifically addresses the problem of merge algorithms.

cucumber2: an object-relational mapping system for Python and PostgreSQL - another ORM for Python. Special feature here: PostgreSQL table inheritance is used to make the transitions between objects and classes easier. However, it is also not portable to other databases.

LGT: Lightweight Game Toolkit for Python - particularly interesting are the NanoThreads (coroutines for Python), the EventNet (enhanced event programming) and Gherkin (an alternative to Pickle/Marshal). There is now an enhanced successor to NanoThreads and EventNet called FibraNet (which is independent of LGT).

Webstemmer - HTML-Grabber that extracts the actual core text from websites based on the layout.

simple_json 1.0 - Alternative to json.py with fewer quirks (in exchange, with a sociopath as a programmer - but you can't have everything. In this case, functional code is more important than a friendly tone).

Dejavu - Trac - another ORM for Python. This one is characterized by absurd class names (Arena, Sandbox, Units ...)

appscript - Python as an alternative to AppleScript. Thus, application control via the AppleScript interfaces directly from Python programs.

Generic Functions with Python

PEAK has been offering generic functions similar to CLOS for Python for quite some time. I always wanted to play around with it, but for a long time it was just part of PyProtocols, and the installation was a bit tricky. However, since September of this year, it has been decoupled and much easier to install. So I dove right in.

And I must say: wow. What Phillip J. Eby has accomplished is truly fantastic. The integration with Python (works from Python 2.3 - he even invented his own implementation of decorators for Python 2.3) is superb, even if, of course, some things take a bit of getting used to.

A small example:

import dispatch

[dispatch.generic()]
def anton(a,b):
 "handle two objects"

[anton.when('isinstance(a,int) and isinstance(b,int)')]
def anton(a,b):
 return a+b

[anton.when('isinstance(a,str) and isinstance(b,str)')]
def anton(a,b):
 return a+b

[anton.when('isinstance(a,str) and isinstance(b,int)')]
def anton(a,b):
 return a*b

[anton.when('isinstance(a,int) and isinstance(b,str)')]
def anton(a,b):
 return b*a

[anton.before('True')]
def anton(a,b):
 print type(a), type(b)

This small example simply provides a function called 'anton', which executes different code based on the parameter types. The example is of course completely nonsensical, but it shows some important properties of generic functions:

Generic functions are - unlike classic object/class methods - not bound to any classes or objects. Instead, they are selected based on their parameter types.
Parameter types must therefore be defined - this usually happens via a mini-language in which the selection conditions are formulated. This is also the only syntactic part that I don't like so much: the conditions are stored as strings. However, the integration is very good, and you get clean syntax errors as soon as the module is loaded.
A generic function can be overloaded with any conditions - not just the first parameter is decisive. Conditions can also make decisions based on values - any arbitrary Python expression can be used there.
With method combinations (methods are the concrete manifestations of a generic function here), you can modify a method before or after its call without touching the code itself. The example uses a before method that is always (hence the 'True') used to generate debugging output. Of course, you can also use conditions with before/after methods to attach to specific manifestations of the call of the generic function - making generic functions a full-fledged event system.
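To make these properties a little more concrete, here is a toy predicate dispatcher in plain Python 3. It is not RuleDispatch and not how PEAK implements it: the Generic class and the name combine are made up for illustration, conditions are plain callables instead of strings, and the first matching method wins rather than the most specific one.

```python
class Generic:
    """Toy generic function: methods are chosen by predicates over all arguments."""

    def __init__(self):
        self.methods = []   # (predicate, implementation) pairs
        self.befores = []   # (predicate, hook) pairs run ahead of the method

    def when(self, predicate):
        def register(func):
            self.methods.append((predicate, func))
            return self  # keep the name bound to the generic function
        return register

    def before(self, predicate):
        def register(func):
            self.befores.append((predicate, func))
            return self
        return register

    def __call__(self, *args):
        # Run every applicable before hook, then the first applicable method.
        for predicate, hook in self.befores:
            if predicate(*args):
                hook(*args)
        for predicate, func in self.methods:
            if predicate(*args):
                return func(*args)
        raise TypeError("no applicable method for %r" % (args,))


combine = Generic()
seen = []  # filled by the before hook, stands in for debugging output


@combine.when(lambda a, b: isinstance(a, int) and isinstance(b, int))
def combine(a, b):
    return a + b


@combine.when(lambda a, b: isinstance(a, str) and isinstance(b, int))
def combine(a, b):
    return a * b


@combine.before(lambda a, b: True)
def combine(a, b):
    seen.append((type(a).__name__, type(b).__name__))
```

Calling combine(2, 3) picks the int/int method, combine("ab", 2) picks the str/int one, and the before hook records the argument types in both cases.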

A pretty good article about RuleDispatch (the generic functions package) can be found at Developerworks.

The example, by the way, shows the Python 2.3 syntax for decorators. With Python 2.4, of course, the @ syntax can also be used. One disadvantage should not be kept secret: the definition of generic functions and their methods is not possible interactively - at least not with the Python 2.3 syntax. Unfortunately, you generally have to work with external definitions in files here.

RuleDispatch will definitely find a place in my toolbox - the syntax is simple enough, the possibilities, however, are gigantic. As an event system, it surpasses any other system in flexibility, and as a general way of structuring code, it comes very close to CLOS. It's a shame that Django will likely align with PyDispatch - in my opinion, RuleDispatch would fit much better (as many aspects in Django could be written as dispatch on multiple parameter types).

Is Rails a DSL? What is a DSL, and is it possible in Python? - Domain Specific Languages - a quite useful description and examination of the situation in Python and Ruby.

Python Cheese Shop : python-fastcgi 1.0 - FastCGI implementation based on the OpenMarket FastCGI C library and therefore significantly faster than pure Python solutions.

Python OpenID 1.0.1 Released — OpenID Enabled - OpenID client and server in Python. I should check it out, could be quite interesting for comment functions.

How-To Guide for Descriptors - a very good explanation of how properties work in Python and what the magic methods __get__, __set__, and __delete__ are all about (and how __getattribute__ plays into this).
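For a quick feel of what the guide covers, here is a minimal data descriptor in Python 3. The class names Positive and Account are invented for this sketch and do not come from the guide.

```python
class Positive:
    """Data descriptor: stores a value per instance and rejects non-positive numbers."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class itself
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("%s must be positive" % self.name)
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]


class Account:
    balance = Positive("balance")

    def __init__(self, balance):
        self.balance = balance  # goes through Positive.__set__
```

Because Positive defines __set__, it takes precedence over the instance dictionary, so every read and write of account.balance is routed through the descriptor.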

Jacobian.org : Django performance tips - Jacob, one of the Django Core-Devs, writes about performance tuning for Django applications. Strongly aligns with my experiences.

SystemExit and exception handlers

Frequently used: SystemExit. A Python exception that many people don't know. The special thing about this exception: it is not an error. It also does not occur unexpectedly. It is simply raised by sys.exit. The idea behind this is that you can hook termination handling into the dynamic flow (e.g. some file cleanups) without hooking into global exit processing (with all the problems that entails).

The problem is that many programs and libraries install a global exception handler. One that catches every error and sends it nicely formatted by mail, logs it somewhere or something similar. I do this all the time. It also works great - except when you actually want to initiate an early end in your program. Then nothing works anymore - because you get corresponding errors for a non-error.

This becomes particularly critical in connection with multiple processes. If you start a process during operation, you also want to terminate it without executing any subsequent code. You can best see this in an example program:

import os
import sys

try:
    pid = os.fork()
    if pid:
        print "Parent process", os.getpid()
    else:
        print "Child process", os.getpid()
        sys.exit(0)
except:
    # catch-all handler: this also swallows the child's SystemExit
    print 'Error occurred in process', os.getpid()

print "Only the parent process may execute this", os.getpid()

This code simply has a global error handler that catches errors in a rather unspecific way. Within the code, a parallel process is started with fork. However, since SystemExit is treated like all other exceptions, the child process is not terminated correctly - a forked process copies the entire state of the parent process, including return addresses, open error handling, files, database connections and so on.

This is of course fatal - because here the SystemExit raised by sys.exit is caught. So there is an error message for the quite normal sys.exit(0) call. And even worse: since SystemExit is not treated separately, execution continues normally afterwards - and the child process runs into code meant for the parent process. Code runs twice, which can have critical results under certain circumstances.

If you can fully control the entire software stack, the solution is simple:

import os
import sys

try:
    pid = os.fork()
    if pid:
        print "Parent process", os.getpid()
    else:
        print "Child process", os.getpid()
        sys.exit(0)
except SystemExit:
    # not an error: pass the exit request on unchanged
    raise
except:
    print 'Error occurred in process', os.getpid()

print "Only the parent process may execute this", os.getpid()

This simply re-raises the SystemExit - i.e. triggers it again - without printing a message. In most cases, Python's standard handling will then kick in and convert the SystemExit into a normal termination.
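The same pattern, wrapped into a reusable function so it can be tried without forking. This is a sketch in current Python 3 syntax; run_guarded and the log list are names invented here. Since Python 2.5, SystemExit derives from BaseException rather than Exception, so a handler written as "except Exception" lets it pass as well; the explicit re-raise is only needed in front of a bare except.

```python
import sys


def run_guarded(task, log):
    """Run task under a catch-all handler that lets SystemExit pass through."""
    try:
        task()
    except SystemExit:
        raise  # an exit request, not an error
    except:  # deliberately bare, like the global handlers described above
        log.append("error in task: %r" % (sys.exc_info()[1],))


log = []
run_guarded(lambda: 1 / 0, log)  # a real error gets logged

try:
    run_guarded(lambda: sys.exit(3), log)  # the exit request is not logged
except SystemExit as exit_request:
    exit_code = exit_request.code
```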

But what to do if you have several stacked variants of the wrong error handling? I had something like this with Django and FLUP (the FCGI/SCGI server for Python). In Django I changed it, then the error hit in FLUP. What do you do then?

The solution is a bit more brutal:

import os
import signal

try:
    pid = os.fork()
    if pid:
        print "Parent process", os.getpid()
    else:
        print "Child process", os.getpid()
        # no exception involved: the child signals itself to terminate
        os.kill(os.getpid(), signal.SIGTERM)
except:
    print 'Error occurred in process', os.getpid()

print "Only the parent process may execute this", os.getpid()

Ultimately, the process simply commits suicide - it sends itself a SIGTERM, i.e. a termination signal. The same one you would normally send from the shell with kill. However, you must then ensure that any necessary cleanups are either already done or run in a SIGTERM handler (SIGKILL cannot be caught) - otherwise you may have problems (e.g. database transactions should already be committed).
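A sketch of what such a cleanup handler might look like, in Python 3 and for Unix only. The function name and the pipe used to report back to the parent are assumptions made for this example. The handler leaves via os._exit, which raises no exception, so no stacked catch-all handler can swallow it.

```python
import os
import signal
import time


def run_child_with_sigterm_cleanup():
    """Fork a child that ends itself via SIGTERM; report what the parent saw."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the cleanup lives in the SIGTERM handler.
        os.close(read_fd)

        def on_sigterm(signum, frame):
            os.write(write_fd, b"cleanup done")  # stand-in for real cleanup
            os._exit(0)  # no exception is raised, nothing can catch this

        signal.signal(signal.SIGTERM, on_sigterm)
        try:
            os.kill(os.getpid(), signal.SIGTERM)
            time.sleep(1)  # gives the handler time to run if it has not yet
        except:
            pass  # stands in for the stacked catch-all handlers
        os._exit(1)  # only reached if the handler never ran

    # Parent: wait for the child and collect its cleanup note.
    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    note = os.read(read_fd, 100)
    os.close(read_fd)
    exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return exited_cleanly, note
```

The parent also calls waitpid here, which is what keeps the terminated child from lingering as a zombie.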

With this solution, you also have to be careful that no open resources block the process - otherwise you may produce zombie processes. Often it is better for such multiprocessing to start a management process much earlier in the system - outside the error handling chain - and then use it to start processing processes. However, this then has the disadvantage that processes started in this way do not inherit the environment of the parent process. Therefore, you usually have to make more preparations to perform the desired actions. Incidentally, Apache pursues a similar approach - there the processes are created from a very early basic state, so that they come as resource-free as possible.

Vampire - An extension of mod_python that makes it more developer-friendly. For example, it can also perform automatic code reloading.

Commentary - Sticky notes for websites, implemented as WSGI middleware. Very interesting, could be particularly interesting for source views or similar, or for longer texts.

pyinotify - very nice, finally a usable wrapper for the inotify function in Linux. With it, Python programs can be informed about changes in the file system - ideal for directory monitoring.

akismet.py - Python interface for the (central) Akismet Spam Scanner.

Louie - a new event dispatching module for Python. Builds on PyDispatcher.

SQLAlchemy README - another ORM for Python, heavily oriented towards SQL and offering a lot of magical syntax. Fascinating how in this area programmers try to abuse every language feature just to avoid writing SQL ...

Again something from the crafting front

Content-type: matter-transport/sentient-life-form - for those who want to get a taste of where my blog is headed. Not quite finished yet, some bugs in my software, a few things waiting for patches in Django, but overall I'm already quite satisfied.

Another OPML server...

Phil Pearson does it again - this time he has reimplemented the community server for Dave Winer's OPML editor in Python (previously he had rebuilt the Radio Community Server, a project I was also briefly involved in). In any case, you can now also publish your OPML editor on your own Linux machine if you have Python and SCGI available there.

JobControl - Django Projects - Trac - a simple job control system for Django, with which you can set up background jobs.

Weird Python 2.3 Bug

Some bugs you chase are really strange. Just take a look at the following Python script:


import re

r = re.compile('^', re.M)

src = '''<html> <head> <title>Logviewer</title> </head> <body> <div> <h1>Titel</h1> </div> <div> {{}}
 {% block content %}
 {% endblock %}
 </div> </body> </html>
'''

for match in r.finditer(src):
    print match.start()

Looks quite harmless - it just returns the positions of the line starts (yes, I know, you do this differently - the source is not mine). Under Python 2.3 the script goes into an infinite loop on the last, closing newline. If you remove it (i.e., put the ''' directly behind the last tag without a line break), the script works. Under Python 2.4, both variants work. And you have to chase after things like that...

Do I really need to emphasize that this little snippet of code was hidden in a larger pile of code?
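For the record, one way to do it "differently": a regex-free sketch in Python 3 that collects the same offsets with str.find. The function name line_starts is made up here.

```python
def line_starts(text):
    """Return the offset of every line start: 0 plus the position after each newline."""
    starts = [0]
    pos = text.find("\n")
    while pos != -1:
        starts.append(pos + 1)
        pos = text.find("\n", pos + 1)
    return starts
```

Since the search position always moves forward past the newline just found, a trailing newline cannot make this loop forever.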

Closures python,scheme,ruby - a good explanation of the somewhat faulty lookups for lexical variables in Python (at least when an assignment is involved in an inner scope).

Routes 1.0 Released - this is the Python version of the URL routes from Ruby-on-Rails. Very interesting, I must sit down at some point and see if I can't build this into Django as an alternative URL dispatcher.

Dejavu - Trac - another Object-Relational-Mapper for Python. Sounds quite interesting in some points.

Some things annoy me terribly

For example, when umlauts are not processed cleanly - as with prepopulate_from in Django. Therefore, I no longer use this in my CMSProject, but simply fill the slug in _pre_save and let a corresponding routine run there. This is not really perfect, but at least it is usable ...

And yes, this is a test post for the function to create a slug from a title with umlauts.
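Such a routine could look roughly like this. It is a sketch in Python 3, not the code used on this blog; the function name and the replacement table are assumptions.

```python
import re

# German umlauts and sharp s, transliterated the usual way.
UMLAUTS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def slug_from_title(title):
    """Build a URL slug, transliterating umlauts instead of dropping them."""
    for char, replacement in UMLAUTS.items():
        title = title.replace(char, replacement)
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")
```

A hook such as _pre_save would then only need to assign the result of this function to the slug field.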

sql relay is a SQL connection pool that can serve various databases and handles client connections to the database via a central pool. Ideal in multi-host environments and when the connection load is too high (e.g., Django generates a connection per request).

coverage is a tool for creating coverage reports - which parts of a program were executed and which were not. Useful as a supplement for unit tests to ensure that the unit tests also cover all areas of the code.

A Test Framework for Django

DjangoTesting is part of my DjangoStuff project and is the start of a testing framework for Django, modeled after the testing framework that Ruby on Rails provides. Currently only model tests are implemented, request/response tests are planned.

The testing framework is built solely on unittest and django, so you don't need additional modules (besides my DjangoStuff project, of course). It provides python-based fixture notations (fixtures are just python classes with attributes in a DATA subclass) and a basic command line utility to make use of those tests and fixtures.

Tests and fixtures are stored in applications and projects, so you can have application-specific tests (especially useful with generic applications) and project-level tests that integrate stuff across several applications.

I think a good testing framework would really be important for Django applications, especially for applications that should be shared between projects. But I do think that a good testing framework needs some banging on, too - so I started it as a small subproject on my own. But if it grows into something useful, I will opt for inclusion into Django trunk.

Case/When/Otherwise for Django

If you have any evil plans for a switch statement for Django (hi rjwittams!), you might want to look into my TagLib. There is a case/when/otherwise statement in there. It's quite easy to use:

{% case variable %}
{% when "value1" %}
{% endwhen %}
{% when "value2" %}
{% endwhen %}
{% otherwise %}
{% endotherwise %}
{% endcase %}

The reason for the tag structure is that the django template parser only looks for parameterless block-closing tags in the parsefor function and so you can't just pull an easy one like this:

{% if condition %}
{% elif condition %}
{% else %}
{% endif %}

You would have to copy over much from the template parser to get a parsefor that looks for a token with a tag and parameters to close the current block.

So I opted for the scoped tags approach where the "case" tag only sets up a context variable "case" and populates it with a dictionary with "value" and "fired" - the latter being a trigger that can be fired by any "when" tag to prevent other "when" tags or the "otherwise" tag from firing themselves. A bit ugly, but working.
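Stripped of all template machinery, the mechanism boils down to something like the following. This is a plain Python 3 sketch, not the TagLib code: the function names are invented, and a dictionary stands in for the template context.

```python
def case_tag(context, value):
    """The case tag only sets up the shared state."""
    context["case"] = {"value": value, "fired": False}


def when_tag(context, expected, body):
    """A when tag renders its body only if it matches and nothing fired before."""
    state = context["case"]
    if state["fired"] or state["value"] != expected:
        return ""
    state["fired"] = True
    return body


def otherwise_tag(context, body):
    """The otherwise tag renders only if no when tag fired."""
    return "" if context["case"]["fired"] else body


def render(value):
    context = {}
    case_tag(context, value)
    return (when_tag(context, "value1", "first")
            + when_tag(context, "value2", "second")
            + otherwise_tag(context, "fallback"))
```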

Adhoc-Organization in CM-Systems

Adhoc organization is what I named the basic design decisions for my new content management system (blog system, personal wiki, digital image shoebox - whatever). It's coming along nicely, even though up to now I only used it as a sample application to make use of my little tools from the DjangoStuff pseudo-project. And it still is one of the best ways to see how tagging or searching or the new calendar tag or other stuff is used.

But it's coming along so well that I think I will be able to change over some sites in the near future. The basic design decisions are somewhat documented in the linked document in my trac-wiki. The main objective for me is to get something that I can use as easily for image presentation as for text presentation and that allows me to really integrate both parts. So that articles really can consist of a multitude of media and text.

It's quite fun to work on a project where you tear down the model and rebuild part of it from time to time, or make major refactoring decisions that leave you with a broken heap of python-bullshit for a while.

cucumber2 is a very interesting Object-Relational-Mapper for Python and PostgreSQL, which also supports table inheritance in PostgreSQL.

Django Project - a very nice web framework that I use here.

"Fitting on" some framework

How do you know whether a framework fits your style of thinking? It's not as if you could just look into a mirror to see whether it suits you nicely, you need other ways to decide that. One way to decide it is productivity - how fast you get your project up and running.

But does that really tell you the whole story? What if the project had been something completely different? Did you just hit the sweet spot of the framework? Were you just lucky?

One way to decide whether some framework, language or tool fits my style of working is to look at the basic abstractions this tool gives me. And to look at how I can use them and how naturally they fit my thinking - do I stumble on problems, not immediately knowing what abstraction to use, what tool to pull? Or do things just fall into place?

I discovered quite early on that I am a bit uncommon in programming, in that I don't build my own abstractions and try to translate from them into what the language or framework gives me, but that I start to think directly in the abstractions and syntaxes given to me - but only if they match my way.

So that's for me the ultimate measurement of whether a framework really fits into my thinking: checking from time to time whether I try to do translations or whether stuff just flows. Reaching "the flow" is what it's all about for me nowadays.

So how does Django match up? Quite nicely. It really gives me what I need in most cases, there are only very few areas where "the flow" is broken, where I need to think around problems, start to do translations. One area is special behaviour of entry fields - this currently is done in Django with parameterized instances of predefined field classes. There is no really nice way to do subclassing, you end up copying code from other parts of the Django source - definitely breaking "the flow".

But most other parts just fall into place: middleware for global management of the request-response span. Template loaders for - well - template loading (yes, it's not a big deal - but being able to write your own template loader really is helpful). The urlpatterns - hey, that's really a cool idea, because of its absolutely loose coupling you don't even try to model your urls after your code structure, but tend to design them. And that's how it should be.

Models are just powerful enough to really move the model-related functionality there (although the class MODULE stuff will make it even nicer, especially the kind of ugly module_globals thingy). It would be cool if model classes supported mixin classes, so that abstract apps could provide stuff that users would just reference to add functionality. But you can solve many of those problems with generated classes - thanks to Python introspection (although you need to know a bit about Django's model magic).

Most complex stuff tends to go into template tags and generic views - my CMS project currently only has 3 view functions of its own, the rest is abstracted away into generic views (for searching and tagging). Template tags could be a bit easier to write, especially since the parser is too primitive - a library of helper functions for easily deconstructing the tag string would be good (hey, maybe I'll write one, the basics are already in my SVN repository).

Template filters are a bit of an ugly duckling - they don't see the request context, so they can't do much more than just take the incoming object and some constant parameters. I think they should get the context passed in, so that they could be a bit smarter, if needed (like allowing filters to resolve a parameter against the context).

Generic views are quite nice, too - even though I don't use the predefined ones that often. The main reason is that more often than not I end up wrapping the generic views in some code that modifies their behaviour - and then it's quite often simpler to just roll my own. But they are great for first starts into areas, just tack them into your project and functionality is available. You can always exchange them with your own view functions if you discover that you need to.

And the admin, the one thing that makes Django stand out in the crowd? In my first play-projects I loved it, in later ones I didn't use it (the Gallery doesn't need it), but the CMS project is the first one that makes really heavy use of it. And I have to say, I like it. It should get a bit more flexibility (the new_admin branch might help there, as it moves more stuff into templates, so they can be overridden), but overall it's really cool and useful.

Two things, though, are definitely needed for the admin. The first is full transaction support bound to request-response (ticket #9 in the Django trac), because changing stuff and ending up with inconsistent tables is no fun. Like getting an exception because something broke in __repr__, so the log entry isn't written, but the object is. Of course you don't notice it, go back, send again, end up with two objects and still no log message ...

The other thing that is needed: basic hooks for object-based authentication. Not a full blown ACL or anything like that, just some really simple hooks from the admin to the model that the user can define to tell the admin whether some object should be editable or should only be shown readonly. The main problem with the current solution is that it only handles full tables - you can't even tell the admin that some user can only work on the current site and can't change objects of other sites (my CMS project makes heavy use of the multi-site capability in Django - one admin server should manage multiple sites in one admin interface).

But all in all webapp building with Django is real fun. It's not only productive to me, it just feels natural to do things the Django way. So, yes, Django fits my thinking style. Seems to have hit home right on.

Markdown for Django

Django already includes a markdown filter (in contrib.markup), but I nonetheless rolled my own Markdown for Django mini-app. The main benefits are link integration with Django models (by using generic model queries and get_absolute_url), a documentation generic view that handles language switching and a nice macro facility for markdown. Macros are a useful way to extend markdown by writing Django template snippets that are called whenever the user calls the macro in his markdown source.

It was formerly part of the CMS project, but I think it's useful on its own and so much better placed in the stuff pseudo-project.

Twisted Book is out

Those who can't easily squeeze their brains into the twisted world of Twisted might find help in Twisted Network Programming Essentials - a new book on what is probably the most powerful internet protocol platform for Python.

akaDAV - Lightweight WebDAV server and python module is a WebDAV module for Twisted. With it, you can build your own WebDAV server. Could be useful for me, because then I can run it under user rights, instead of under the rights of the web server ...

python webdav server is another WebDAV server for Python - not updated since 2000, but if it works, it might be sufficient. Perhaps more understandable than Twisted code.

generic search service for Django

If your Django application needs searching capabilities, you can roll your own. Or you can use my generic search view. This provides a parser for queries and a search machinery that is suitable for moderate database sizes. It provides an extensible google-like syntax.

The main problem is that Django doesn't support OR query combinations and that it doesn't support "icontainsnot" queries. So the search engine does multiple selects to get one query. It starts with the longest search word and goes down in size from that result set, restricting it from one step to the next. But since it needs to keep the last result set in memory (at least the list of IDs), this might pose problems for your server if your database contains too many rows (especially if the users do silly queries that produce large result sets).

Maybe in future this will learn some optimizations to make it work better with larger databases, but it's quite fine as a search engine for your blog or standard content management systems.
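The narrowing strategy itself, boiled down to a sketch in plain Python 3 over an in-memory dictionary. This is not the actual view code; search_ids and the rows mapping are assumptions for illustration, and the real engine runs one database select per step instead of a set comprehension.

```python
def search_ids(rows, query):
    """rows maps id -> text; return the ids whose text contains every query word."""
    # Longest word first: it is likely the most selective one.
    words = sorted(set(query.lower().split()), key=len, reverse=True)
    ids = None
    for word in words:
        # The first step looks at all rows, later steps only at the survivors.
        pool = rows if ids is None else ids
        ids = {row_id for row_id in pool if word in rows[row_id].lower()}
        if not ids:
            break  # nothing left to restrict
    return ids if ids is not None else set()
```

Only the set of surviving IDs is kept between steps, which is exactly the part that grows uncomfortably with large result sets.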

very simple view functions

Sometimes you have a bunch of really simple view functions in your Django project (yes, this is for you, bitprophet!). View functions that are actually not more than just a render_to_response call - take a template, take some data from the request, stuff them in and render a response. It's rather boring to write them down and it breaks the DRY principle. So what to do? Write your own generic view.

from django.core.extensions \
 import render_to_response

def simple_view(request, template, **kwargs):
 return render_to_response(
 template, kwargs)

That's all. A simple and nice view function that just does that - render a template. It even can be fed with context variables from the urlpattern. Use it like this in your urlconf:

urlpatterns = patterns('',
(r'^page/(?P<arg>.*)/$', 'cool.simple_view',
 {'template': 'app/mytemplate'}),
)

That way a /page/foo/ view would be routed to the 'app/mytemplate' template with a context that just includes the variable 'arg' with the value 'foo'. And you will never need to write those simple view functions again. For extra spice you could throw a context_instance = DjangoContext(request) into the render_to_response call to even get the authenticated user and stuff like that from the request.

Module Hacking for Django

Django itself constructs model modules dynamically from your model classes. That's what I used in my first take at the abstract tagging application. Now I found a better way in the current version - I can modify the dynamic module myself quite easily, generate a dynamic model class and push that into the model module. What it actually does is just mimicking what happens when Python defines a class - most stuff is done by the meta.ModelBase metaclass in Django, anyway. I only had to add some module hacking stuff. Python introspection rules!

What this gives you is a much cleaner interface to create the tagrelation class for your model - just a function call, no silly subclassing or superfluous assignment. Everything happens as by magic.

It is magic.
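The Django-independent core of the trick can be sketched in a few lines of Python 3. Everything here is a stand-in: push_dynamic_class, the demo_models module and the Model base class are invented names, and Django's own metaclass does a lot more than this.

```python
import types


def push_dynamic_class(module, name, base, attrs):
    """Build a class the way a class statement would and store it in module."""
    namespace = dict(attrs)
    namespace["__module__"] = module.__name__
    metaclass = type(base)  # reuse whatever metaclass the base class has
    cls = metaclass(name, (base,), namespace)
    setattr(module, name, cls)
    return cls


models = types.ModuleType("demo_models")  # stand-in for a generated model module


class Model(object):
    """Stand-in for a framework base class."""


TagRelation = push_dynamic_class(models, "TagRelation", Model, {"kind": "tag"})
```

After the call, models.TagRelation is an ordinary class that reports demo_models as its module, just as if it had been written there by hand.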

Twisted Names I should take a look at - a DNS server in Python based on Twisted. I could rewrite it for database usage as an alternative to PowerDNS.

Tagging with Django

Since the question about how to do tagging with Django shows up quite often, I have written a small solution to this problem: AbstractTagging. This is a generic application and generic views that give you a very simple solution to add tagging to any model you have in your django apps. It's currently used by me in my CMS project. The source is in the stuff project.

It was a bit weird to build, because I had to dynamically construct a base class you can subclass in your models - this is because of the magic in django.core.meta, where model classes are turned into modules. But the result is quite nice, I think.

call of the noodle

Someone is writing a Lisp compiler for Python bytecode - very interesting, as you could use the Python libraries under a Lisp dialect. Let's see what the Lisp dialect will look like when the first release comes out and what features it will cover. With support for Lisp macros, it would be very interesting.

Using Django as a CMS

I am currently reworking one of my sites - Content-type: matter-transport/sentient-life-form. It was an Apache::MiniWiki based site before and is now in the transition to being a Django based site. The idea of the code for that site is to build a CMS based on Django that fully uses the Django admin. So the users should be able to do all management only in the admin, while the site itself behaves a bit like a Wiki. Autolinking, Autoediting of missing pages, Editlinks, Versioning (currently missing in the source) - all that should be done based on tools the Django admin provides.

This isn't for a full-blown site, though - the linked site is almost empty, I never put much up there. It's more a project to dig deeper into the Django admin to see what it's like to work in it - so I know about that stuff when I start to build real projects.

The code itself is freely available - and there is already a nice thing in it. It's a template loader that pulls stuff from the database instead of from the filesystem or from python eggs. It's "ticketed" at Django under #633, so it might make it into django.contrib some day.

python - 16.10.2005 - 13.1.2006