Programming - 3.10.2005 - 22.11.2005

"The Whitespace Thing" for OCaml - Indentation as a syntax element (ala Python and Haskell) for OCaml. Interesting. Although OCaml already has minimal syntax overhead, so I don't really think it's necessary.

Dejavu - Trac - another Object-Relational-Mapper for Python. Sounds quite interesting in some points.

Critical Vulnerability in Content Management System Mambo

PHP is increasingly becoming a security dump:

Disabling register_globals in PHP does not always increase security. Sometimes it actually opens up a vulnerability. This is the case with the Content Management System Mambo, which, according to a posting on the security mailing list Full Disclosure, contains a vulnerability that allows attackers to execute their own code on the server.

This is certainly due to the fact that there is hardly any language - apart perhaps from Perl - that carries as much cruft as PHP. The result shows itself again and again in esoteric problems that even catch people who should be prepared for such things based on their experience.

Apple's WebObjects with new licensing terms

Apple has clarified the licensing issues with WebObjects - Deployment on Linux boxes is now also completely allowed. Thus, the XCode environment with WebObjects is now completely free from development to deployment.

Loss of Reality Among SAP Board Members

SAP Executive Rants Against Open Source:

Otherwise, it is important not to mess too much with the code of high-quality software programs.

Wait a minute. "High-quality software programs". He works at SAP? Where do they have any high-quality software programs? I mean, if you don't consider "high-quality" as "hopelessly overpriced junk", as he probably does?

Devil's Grin

wikiCalc - a mixture of spreadsheet and wiki. Strange. By Mr. VisiCalc himself. Currently Windows-only despite being written in Perl. Well, for me spreadsheets fit right in with Perl and Windows - all shady stuff.

sql relay is a SQL connection pool that can serve various databases and handles client connections to the database via a central pool. Ideal in multi-host environments and when the connection load is too high (e.g., Django generates a connection per request).

coverage is a tool for creating coverage reports - which parts of a program were executed and which were not. Useful as a supplement for unit tests to ensure that the unit tests also cover all areas of the code.

A Test Framework for Django

DjangoTesting is part of my DjangoStuff project and is the start of a testing framework for Django, modeled after the testing framework that Ruby on Rails provides. Currently only model tests are implemented, request/response tests are planned.

The testing framework is built solely on unittest and django, so you don't need additional modules (besides my DjangoStuff project, of course). It provides python-based fixture notations (fixtures are just python classes with attributes in a DATA subclass) and a basic command line utility to make use of those tests and fixtures.

Tests and fixtures are stored in applications and projects, so you can have application-specific tests (especially useful with generic applications) and project-level tests that integrate stuff across several applications.

I think a good testing framework would really be important for Django applications, especially for applications that should be shared between projects. But I do think that a good testing framework needs some banging on, too - so I started it as a small subproject on my own. But if it grows into something useful, I will opt for inclusion into Django trunk.

Selenium is a test automator for web applications. It runs directly in the browser and uses IFrames and JavaScript to hook into the page being tested.

Case/When/Otherwise for Django

If you have any evil plans for a switch statement for Django (hi rjwittams!), you might want to look into my TagLib. There is a case/when/otherwise statement in there. It's quite easy to use:

{% case variable %}
{% when "value1" %}
{% endwhen %}
{% when "value2" %}
{% endwhen %}
{% otherwise %}
{% endotherwise %}
{% endcase %}

The reason for the tag structure is that the django template parser only looks for parameterless block-closing tags in the parsefor function and so you can't just pull an easy one like this:

{% if condition %}
{% elif condition %}
{% else %}
{% endif %}

You would have to copy over much from the template parser to get a parsefor that looks for a token with a tag and parameters to close the current block.

So I opted for the scoped tags approach where the "case" tag only sets up a context variable "case" and populates it with a dictionary with "value" and "fired" - the latter being a trigger that can be fired by any "when" tag to prevent other "when" tags or the "otherwise" tag from firing. A bit ugly, but working.
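The "fired" trigger can be sketched in plain Python without any Django machinery. This is only an illustration of the idea, not the TagLib code: the function names start_case, when, otherwise and render_case are hypothetical, and a plain dict stands in for the template context.

```python
# Hypothetical stand-ins for the tags; not the actual TagLib implementation.

def start_case(context, value):
    """What the "case" tag does: store the value and an unfired trigger."""
    context["case"] = {"value": value, "fired": False}


def when(context, candidate, body):
    """Render body only if no earlier branch fired and the value matches."""
    case = context["case"]
    if not case["fired"] and case["value"] == candidate:
        case["fired"] = True
        return body
    return ""


def otherwise(context, body):
    """Render body only if no "when" branch fired."""
    return "" if context["case"]["fired"] else body


def render_case(value):
    """Play through one case block with two "when" branches and a fallback."""
    context = {}
    start_case(context, value)
    return (when(context, "value1", "first")
            + when(context, "value2", "second")
            + otherwise(context, "fallback"))
```

Each branch only looks at the shared dictionary, which is why no tag needs to know about its siblings.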

Adhoc-Organization in CM-Systems

Adhoc organization is what I named the basic design decisions for my new content management system (blog system, personal wiki, digital image shoebox - whatever). It's coming along nicely, even though up to now I only used it as a sample application to make use of my little tools from the DjangoStuff pseudo-project. And it still is one of the best ways to see how tagging or searching or the new calendar tag or other stuff is used.

But it's coming along so well that I think I will be able to change over some sites in the near future. The basic design decisions are somewhat documented in the linked document in my trac-wiki. The main objective for me is to get something that I can use as easily for image presentation as for text presentation and that allows me to really integrate both parts. So that articles really can consist of a multitude of media and text.

It's quite fun to work on a project where you tear down the model and rebuild part of it from time to time, or make major refactoring decisions that leave you with a broken heap of python-bullshit for a while.

cucumber2 is a very interesting Object-Relational-Mapper for Python and PostgreSQL, which also supports table inheritance in PostgreSQL.

PostgreSQL 8.1

PostgreSQL 8.1 with Two-Phase-Commits and User Roles:

Transactions can now be prepared on multiple computers with PREPARE TRANSACTION and executed together later. If a machine fails after PREPARE, the transaction can be correctly completed with COMMIT PREPARED after the restart.

Yes!

"Fitting on" some framework

How do you know whether a framework fits your style of thinking? It's not as if you could just look into a mirror to see whether it suits you nicely; you need other ways to decide that. One way to decide it is productivity - how fast you get your project up and running.

But does that really tell you the whole story? What if the project would have been something completely different? Did you just hit the sweet spot of the framework? Were you just lucky?

One way to decide whether some framework, language or tool fits my style of working is to look at the basic abstractions this tool gives me. And to look at how I can use them and how naturally they fit my thinking - do I stumble on problems, not immediately knowing what abstraction to use, what tool to pull? Or do things just fall into place?

I discovered quite early on that I am a bit uncommon in programming, in that I don't build my own abstractions and try to translate from them into what the language or framework gives me, but that I start to think directly in the abstractions and syntaxes given to me - but only if they match my way.

So that's for me the ultimate measurement of whether a framework really fits into my thinking: checking from time to time whether I try to do translations or whether stuff just flows. Reaching "the flow" is what it's all about for me nowadays.

So how does Django match up? Quite nicely. It really gives me what I need in most cases; there are only very few areas where "the flow" is broken, where I need to think around problems and start to do translations. One area is special behaviour of entry fields - this currently is done in Django with parameterized instances of predefined field classes. There is no really nice way to do subclassing, so you end up copying code from other parts of the Django source - definitely breaking "the flow".

But most other parts just fall into place: middleware for global management of the request-response span. Template loaders for - well - template loading (yes, it's not a big deal - but being able to write your own template loader really is helpful). The urlpatterns - hey, that's really a cool idea: because of its absolutely loose coupling you don't even try to model your URLs after your code structure, but tend to design them. And that's how it should be.

Models are just powerful enough to really move the model-related functionality there (although the class MODULE stuff will make it even nicer, especially the kind of ugly module_globals thingy). It would be cool if model classes supported mixin classes, so that abstract apps could provide stuff that users would just reference to add functionality. But you can solve many of those problems with generated classes - thanks to Python introspection (although you need to know a bit about Django's model magic).

Most complex stuff tends to go into template tags and generic views - my CMS project currently only has 3 view functions of its own, the rest is abstracted away into generic views (for searching and tagging). Template tags could be a bit easier to write; especially the parser is too primitive - a library of helper functions for easily deconstructing the tag string would be good (hey, maybe I'll write one, the basics are already in my SVN repository).

Template filters are a bit of an ugly duckling - they don't see the request context, so they can't do much more than just take the incoming object and some constant parameters. I think they should get the context passed in, so that they could be a bit smarter, if needed (like allowing filters to resolve a parameter against the context).

Generic views are quite nice, too - even though I don't use the predefined ones that often. The main reason is that more often than not I end up wrapping the generic views in some code that modifies their behaviour - and then it's quite often simpler to just roll my own. But they are great for first starts into areas: just tack them into your project and functionality is available. You can always exchange them with your own view functions if you discover that you need to.

And the admin, the one thing that makes Django stand out in the crowd? In my first play-projects I loved it, in later ones I didn't use it (the Gallery doesn't need it), but the CMS project is the first one that makes really heavy use of it. And I have to say, I like it. It should get a bit more flexibility (the new_admin branch might help there, as it moves more stuff into templates, so they can be overridden), but overall it's really cool and useful.

Two things, though, are definitely needed for the admin: full transaction support bound to request-response (ticket #9 in the django trac), because changing stuff and ending up with inconsistent tables is no fun. Like getting an exception because something broke in __repr__, so the log entry isn't written, but the object is. Of course you don't notice it, go back, send again, and end up with two objects and still no log message ...

The other thing that is needed: basic hooks for object-based authentication. Not a full-blown ACL or anything like that, just some really simple hooks from the admin to the model that the user can define to tell the admin whether some object should be editable or should only be shown read-only. The main problem with the current solution is that it only handles full tables - you can't even tell the admin that some user can only work on the current site and can't change objects of other sites (my CMS project makes heavy use of the multi-site capability in Django - one admin server should manage multiple sites in one admin interface).

But all in all webapp building with Django is real fun. It's not only productive to me, it just feels natural to do things the Django way. So, yes, Django fits my thinking style. Seems to have hit home right on.

The JavaScript Interactive Interpreter is a nice toy: you can enter JavaScript expressions and see the results directly. So in principle a JavaScript shell - only it runs in the browser window, of course.

Markdown for Django

Django already includes a markdown filter (in contrib.markup), but I nonetheless rolled my own Markdown for Django mini-app. The main benefits are link integration with Django models (by using generic model queries and get_absolute_url), a documentation generic view that handles language switching and a nice macro facility for markdown. Macros are a useful way to extend markdown by writing Django template snippets that are called whenever the user calls the macro in his markdown source.

It was formerly part of the CMS project, but I think it's useful on its own and so much better put into the stuff pseudo-project.

Scatha and Glaurung are two chess programs written in OpenMCL, with Cocoa support from OpenMCL. Nice examples of how to build native OS X applications with OpenMCL - and they are also interesting to play, especially the hexagonal chess version.

Twisted Book is out

Those who can't easily squeeze their brains into the twisted world of Twisted might find help in Twisted Network Programming Essentials - a new book on what is probably the most powerful internet protocol platform for Python.

akaDAV - Lightweight WebDAV server and python module is a WebDAV module for Twisted. With it, you can build your own WebDAV server. Could be useful for me, because then I can run it under user rights, instead of under the rights of the web server ...

Google's Web Accelerator and Damager

Google at it again - Ian pretty much says everything there is to say about it. Google claims they don't want to be "evil." But they are infinitely stupid, as shown by the repeated launch of the Web Damager.

What does the Web Accelerator do, and why is it such a stupid piece of software? Well, it simply follows links. And it does so in advance, before the user does - so to speak, speculative web crawling, but privately for the user. That doesn't sound so bad at first, except that servers are bombarded with traffic they might never have otherwise - because every link is followed, even if the user doesn't go there. And that multiplied by the users who use this thing...

But the traffic is not the real problem - the real problem comes when you consider the context in which this thing runs. And that is, it runs on the user's private computer, between the browser and the network. Just a little proxy of its own. Which, for its work, remembers cookies and similar things and then sends requests to the pages that look as if they come from the user's browser. With their security headers. And cookies.

Apart from the fact that I wouldn't particularly like it if my headers with passwords or session cookies appeared anywhere other than in the browser and the target server - this approach also enables the Web Accelerator to look at areas that a central crawler would not see. For example, areas of pages that are behind logins. Content management systems, where additional links appear after login. Wikis, whose edit links then appear when someone starts a session. Webmail systems, where each mail is represented as a link.

All these systems have one thing in common: actions that change something do not always require a form submission. Often, clicking a link is enough. The link to quickly delete the current version of a wiki page in order to remove wiki spam - a simple link, only visible to the logged-in user. The mail in the webmail inbox, which is automatically marked as read when opened. The publish link in the CMS, with which a page is put live.

Of course, responsible web application programmers try to put destructive actions behind forms (and thus POST requests) so that a simple link doesn't destroy anything. But this usually only happens in the publicly accessible areas, where otherwise the web robots of the various search engines and spam automata would cause chaos.

But precisely in the areas shielded by login, one normally does not expect automated clicks - and therefore builds comfort features, because one can be sure that a link is clicked consciously and intentionally.

Well, until the Google Web Accelerator came along. From the company that claims to understand the web. Thanks a lot, you assholes.

PS: and contrary to the first version, the new version no longer sends a header with which one could recognize the prefetch requests in order to block them in such critical areas.
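The rule that destructive actions belong behind POST, also inside logged-in areas, can be sketched in a few lines. This is only an illustration under assumptions: handle_delete and the pages dictionary are hypothetical and not taken from any particular wiki or CMS.

```python
def handle_delete(method, page_id, pages):
    """Delete a page only for POST requests; a followed link (GET) is refused."""
    if method != "POST":
        # A prefetching proxy that merely follows links ends up here.
        return "405 Method Not Allowed"
    if page_id not in pages:
        return "404 Not Found"
    del pages[page_id]
    return "200 OK"
```

A prefetcher that only issues GET requests then gets a refusal instead of deleting anything, no matter which cookies it sends along.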

python webdav server is another WebDAV server for Python - not updated since 2000, but if it works, it might be sufficient. Perhaps more understandable than Twisted code.

generic search service for Django

If your Django application needs searching capabilities, you can roll your own. Or you can use my generic search view. This provides a parser for queries and a search machinery that is suitable for moderate database sizes. It provides an extensible Google-like syntax.

The main problem is that Django doesn't support OR query combinations and that it doesn't support "icontainsnot" queries. So the search engine does multiple selects to answer one query. It starts with the longest search word and goes down in size from that result set, restricting it from one step to the next. But since it needs to keep the last result set in memory (at least the list of IDs), this might pose problems for your server if your database contains too many rows (especially if the users do silly queries that produce large result sets).

Maybe in future this will learn some optimizations to make it work better with larger databases, but it's quite fine as a search engine for your blog or standard content management systems.
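The stepwise narrowing described above (longest word first, each step restricted to the IDs that survived the previous one) can be sketched without a database. This is an assumption-laden illustration, not the code of the generic search view: the search function is hypothetical, and a dict mapping IDs to text stands in for the table.

```python
def search(rows, words):
    """Return the ids of rows whose text contains every word (case-insensitive).

    rows maps id -> text and stands in for a database table.
    """
    # Longest word first: it is likely to be the most selective one.
    ordered = sorted(words, key=len, reverse=True)
    ids = None
    for word in ordered:
        needle = word.lower()
        # The first pass scans everything, later passes only the surviving ids.
        candidates = rows if ids is None else ids
        ids = {i for i in candidates if needle in rows[i].lower()}
        if not ids:
            break
    # Without any search words, every row matches.
    return ids if ids is not None else set(rows)
```

The set of IDs kept between steps is exactly the part that has to fit in memory, which is where large result sets start to hurt.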

Version Control with SVK

Version Control with SVK is an online book about SVK - a distributed version control system that works very well with SVN and CVS (among others). And it offers quite a relief, especially for working with patches for upstream systems and for local forks of open source software.

The book is far from complete, but you can already find quite a lot of information in it.

very simple view functions

Sometimes you have a bunch of really simple view functions in your Django project (yes, this is for you, bitprophet!). View functions that are actually not more than just a render_to_response call - take a template, take some data from the request, stuff them in and render a response. It's rather boring to write them down and it breaks the DRY principle. So what to do? Write your own generic view.

from django.core.extensions \
    import render_to_response

def simple_view(request, template, **kwargs):
    # Remaining keyword arguments (URL captures) become the template context.
    return render_to_response(
        template, kwargs)

That's all. A simple and nice view function that just does that - render a template. It even can be fed with context variables from the urlpattern. Use it like this in your urlconf:

urlpatterns = patterns('',
    (r'^page/(?P<arg>.*)/$', 'cool.simple_view',
        {'template': 'app/mytemplate'}),
)

That way a /page/foo/ view would be routed to the 'app/mytemplate' template with a context that just includes the variable 'arg' with the value 'foo'. And you will never need to write those simple view functions again. For extra spice you could throw a context_instance = DjangoContext(request) into the render_to_response call to even get the authenticated user and stuff like that from the request.

Module Hacking for Django

Django itself constructs model modules dynamically from your model classes. That's what I used in my first take at the abstract tagging application. Now I found a better way in the current version - I can modify the dynamic module myself quite easily, generate a dynamic model class and push that into the model module. What it actually does is just mimic what happens when Python defines a class - most stuff is done by the meta.ModelBase metaclass in Django, anyway. I only had to add some module hacking stuff. Python introspection rules!

What this gives you is a much cleaner interface to create the tagrelation class for your model - just a function call, no silly subclassing or superfluous assignment. Everything happens as by magic.

It is magic.
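The general Python mechanism behind this (build a class at runtime, then push it into a module) can be sketched without Django. The following is an illustration only, not the AbstractTagging code: make_relation_class and the fake_models module are hypothetical names, and the plain type metaclass stands in for Django's meta.ModelBase.

```python
import sys
import types


def make_relation_class(module, model_name):
    """Build a class at runtime and publish it in the given module."""
    name = model_name + "TagRelation"
    attrs = {
        "__module__": module.__name__,
        "model_name": model_name,
        "describe": lambda self: "tags for " + self.model_name,
    }
    # Calling type(name, bases, attrs) is roughly what a class statement
    # does when the default metaclass is in charge.
    cls = type(name, (object,), attrs)
    setattr(module, name, cls)
    return cls


# A throwaway module standing in for a dynamically generated model module.
models = types.ModuleType("fake_models")
sys.modules["fake_models"] = models
make_relation_class(models, "Article")
```

After the call, fake_models.ArticleTagRelation can be imported and used like any class written out by hand.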

I should take a look at Twisted Names - a DNS server in Python based on Twisted. I could rewrite it for database usage as an alternative to PowerDNS.

Aperture

It's been announced, and now it's here - Aperture. By Apple. The video about it is nice, and it looks very useful, what they've done. And I would even let myself be persuaded to pay the almost 500 Euros - okay, Photoshop updates would be cheaper for me, but Aperture is built with a focus on RAW and Photoshop only has a RAW importer. But what really bothers me: the hardware requirements. The programmers have lost their marbles.

Yes, photo editing needs memory - after all, it's a lot of data. And you need decently powerful hardware for using filters. And yes, a fast graphics card is useful. But the minimum requirements for Aperture are already partly beyond belief - especially since we know how these minimum requirements will work - probably as well as Mac OS X with 256 MB of memory ...

Sorry, but photo editing is not rocket science and not weather simulation - what is this completely exaggerated resource demand of the software? Have today's programmers completely forgotten how to optimize?

Man, I scanned and processed an entire film with Photoshop 5 on an Apple with 128 MB of RAM and a 275 MHz 603e CPU not so long ago. Of course, RAW images are larger - but why should a photo editing program require a dual G5? Ridiculous. Delusions of grandeur.

So I'll probably just continue working with Photoshop 7, even if the Open Dialog still crashes under Tiger. At least it works decently on my nice, old 12" Powerbook (yes, the one with 867 MHz and only 640 MB of memory). It's enough for my purposes, I don't want to shell out several thousand Euros just to be able to start the photo editing program ...

Tagging with Django

Since the question about how to do tagging with Django shows up quite often, I have written a small solution to this problem: AbstractTagging. This is a generic application and generic views that give you a very simple solution to add tagging to any model you have in your django apps. It's currently used by me in my CMS project. The source is in the stuff project.

It was a bit weird to build, because I had to dynamically construct a base class you can subclass in your models - this is because of the magic in django.core.meta, where model classes are turned into modules. But the result is quite nice, I think.

No idea if it's really The Coolest DHTML / JavaScript Calendar, but it looks quite nice. And it has a few quite important features - such as the ability to move it.

call of the noodle

Someone is writing a Lisp compiler for Python bytecode - very interesting, as you could use the Python libraries under a Lisp dialect. Let's see what the Lisp dialect will look like when the first release comes out and what features it will cover. With support for Lisp macros, it would be very interesting.

Using Django as a CMS

I am currently reworking one of my sites - Content-type: matter-transport/sentient-life-form. It was an Apache::MiniWiki based site before and is now in transition to being a Django based site. The idea of the code for that site is to build a CMS based on Django that fully uses the Django admin. So the users should be able to do all management only in the admin, while the site itself behaves a bit like a Wiki. Autolinking, Autoediting of missing pages, Editlinks, Versioning (currently missing in the source) - all that should be done based on tools the Django admin provides.

This isn't for a full-blown site, though - the linked site is almost empty, I never put much up there. It's more a project to dig deeper into the Django admin to see what it's like to work in it - so I know about that stuff when I start to build real projects.

The code itself is freely available - and there is already a nice thing in it. It's a template loader that pulls stuff from the database instead of from the filesystem or from Python eggs. It's "ticketed" at Django under #633, so it might make it into django.contrib some day.

Business Model of Open Source ...

... just hearing that gives me goosebumps. This absolute inability of economic interest groups to imagine motivation without a business model is sad.

Tailor - Version Broker

Tailor is a Python script that can exchange changesets between different versioning systems. In principle, you can also mirror repositories in other versioning systems with it.

Trac on Darcs

I love Trac as a project tool - and I'm quite satisfied with Subversion for version control (or, more recently, with SVK as a version control tool based on Subversion). But darcs has also excited me because it's so beautifully bureaucratic - and now there's also Trac on Darcs - a patched version that works with darcs instead of Subversion. I think I'll take a closer look at that ...

The same patch also allows a Bazaar-NG backend for Trac - which is particularly interesting because both Trac and Bazaar-NG are written in Python. I think I should take a look at that part as well.

And just now it was pointed out to me in IRC that SCons is also Python - a pretty nice replacement for the already quite old Make.

This all sounds almost like Developer Nirvana.

If you ever need to help a colleague without SQL experience with Oracle, like I did: Oracle/SQL Tutorial. Quite nicely done.

SVK - Subversion distributed

SVK is a distributed version control system with a special feature: it can mirror trees of other version control systems (including SVN) and then define working copies on them. You can mirror a Subversion repository of a project, create your own local branch, work in it, and version the changes locally. When your own branch is ready, you can make a commit against the original repository, or simply pull a diff and send it to the developer. Very nice thing, especially because of the good integration with Subversion.

For vim: snippetsEmu emulates the snippet function of TextMate, but with vim macros.

How to convert accent characters to their base characters in Python. Pretty basic approach, but sufficient for many purposes. For proper solutions, there's PyICU ...
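One basic way to do this with the standard library is sketched below. It is not necessarily the approach from the linked recipe; the function name strip_accents is made up for the example.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

This only handles characters that decompose into a base character plus combining marks; letters such as "ß" or "ø" pass through unchanged, which is one reason a proper solution needs something like PyICU.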

No idea if I've had this before, but Caching Tutorial for Web Authors and Webmasters looks like a useful description of HTTP caching directives, with explanations for application programmers.

Django i18n status

I worked a bit more on the i18n stuff in Django today and finally switched my gallery to the i18n branch. You can now see the strings on that site with either English or German settings. Other languages get English output (so if you are in Germany and still see English strings - check the language settings in your browser to see whether German is defined with higher priority than English).

The code works quite nicely and I think I will give it a week or so to settle and then start to put finishing touches to it - like adding many more translation hooks to the Django source.

Ajax differently

Subway's new Ajax framework has an interesting approach: using inspect, Python source is retrieved from a method and then translated to JavaScript. Of course, only a subset of Python is supported, but the idea is quite interesting - Python syntax for JavaScript.

Of course, the semantic differences between Python's and JavaScript's interpreter execution will bite you sooner or later, but for simpler things (and many Ajax things are actually quite banal code on the JavaScript side), you can build without JavaScript source.

The whole thing would of course be much more elegant if Python had a reasonable macro language integrated - or if you could build macros in Python in the way you can in Common Lisp. This would make the definition of the subset and the creation of JavaScript from it much more elegant. Perhaps Philip J. Eby's work on configuration languages could help - it is essentially the approach of a macro facility for Python.

Personally, I would rather pursue an approach in Python where JavaScript is generated through Python code execution (i.e., not parsing and compiling) - because many Ajax functionalities are quite standardized processes. The DOM tree is usually manipulated according to fixed specifications, with data delivered via JSON. Most of this could be well standardized. However, I don't have any concrete code ready yet, so far Ajax for me is still direct JavaScript code - although with the help of MochiKit.
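What "JavaScript generated through Python code execution" might look like is sketched here, purely as an assumption about the direction: the helpers set_text, hide and script are hypothetical and belong to no existing framework. Each Python call returns one JavaScript statement, with json.dumps taking care of string quoting.

```python
import json


def set_text(element_id, text):
    """Emit a JavaScript statement that replaces an element's text."""
    return "document.getElementById(%s).textContent = %s;" % (
        json.dumps(element_id), json.dumps(text))


def hide(element_id):
    """Emit a JavaScript statement that hides an element."""
    return "document.getElementById(%s).style.display = %s;" % (
        json.dumps(element_id), json.dumps("none"))


def script(*statements):
    """Join generated statements into one script body."""
    return "\n".join(statements)
```

A view could then return script(set_text("status", "Saved"), hide("spinner")) as the body of an Ajax response, without any hand-written JavaScript on the Python side.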

Let's see what else is happening in the Ajax-Python land. CrackAjax is at least another approach that might inspire others to build on this a bit better.

Pragmatic Ajax is a book (currently in beta - you can pre-order and get the betas as PDF) about Ajax and all the surrounding stuff. The Pragmatic Bookshelf books are usually quite pragmatic (aha) and pleasant to read, so it might be worth it.

WSGI and WSGI Middleware is Easy describes how to work with WSGI middleware and what it actually is.
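For a feel of what WSGI middleware is, here is a minimal sketch that is not taken from the linked article: a wrapper that adds one response header to whatever application it wraps. The names add_header_middleware, hello_app and call are made up for the example; only the calling convention (environ, start_response) comes from the WSGI specification.

```python
def add_header_middleware(app, name, value):
    """Wrap a WSGI app so every response carries one extra header."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            headers = list(headers) + [(name, value)]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def hello_app(environ, start_response):
    """A trivial WSGI application used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def call(app):
    """Invoke a WSGI app with a minimal environ and capture the result."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

The middleware is itself a WSGI application, so wrappers like this can be stacked in any order around the real application.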

RobotFlow is based on FlowDesigner - something like Open Source LabView - and is a graphical robot programming platform, comparable to RoboLab. Unfortunately, there is still no LegOS backend (or at least one for the Lego Mindstorms bytecode) for RobotFlow.

OpenMCL 1.0 is out - after quite a long time, finally a decent version jump.

Python Paste Power

Python Paste Power is a very interesting article about Python Paste, the metaframework by Ian Bicking. It makes the application and distribution of web applications in Python much easier (at least if the framework with which one wants to build the application has Paste support).

IRC Logger update

The IRC logger is working fine, but I wasn't happy with the dependence on muh - so I wrote my own little logger bot in Python, based on irclib. It works fine and does only what I want it to do - logging. I always feel a bit queasy when IRC bots have command structures and stuff like that, and I actually don't need any of those ...

So now the project is mostly complete - just use the django admin to add channels to your database, point the logger bot to some IRC host and see how it joins channels and starts logging.

Oh, there are still things to do - for example the bot needs to rescan the list of channels so it notices newly added channels and leaves deleted channels (and maybe I should add channel activation/deactivation so I can switch off channels for some time without losing the archives), but for now it just logs #django and for that it's good enough.
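The rescan itself boils down to comparing two sets. A small sketch of that step, with a hypothetical function name and no irclib or database code involved:

```python
def plan_channel_changes(joined, configured):
    """Return (to_join, to_leave) as sorted lists of channel names."""
    joined, configured = set(joined), set(configured)
    return sorted(configured - joined), sorted(joined - configured)
```

The bot would call this periodically with the channels it currently sits in and the channels found in the database, then join or part accordingly.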

37signals again

This time it's Writeboard for collaborative text editing over the web. Maybe something like SubEthaEdit for the very poor. From the FAQ:

Is this some sort of wiki?

No way. Not at all. Nope. Wikis are icky. Writeboard is about writing and editing solo or with others. It's all about the words. Wikis are about way more than that which is why they are generally pretty confusing to most folks.

Yes, of course, wikis are icky. And hard to understand. Logical. Might apply to someone who thinks to-do list programs are more brilliant than sliced white bread. Sorry, but it's just getting ridiculous what's coming out of that place. Banal programs don't get smarter just because you wrap them in candy colors ...

Retrocomputing - MIT CADR Lisp Machines

Yeeeehaaaa! The source code of the MIT CADR Lisp Machines - the precursor of most high-end Lisp machines - has been released under a BSD license!

This should hopefully give the CADR Lisp Emulator a further boost. In recent times, things have been a bit quiet around the emulator.

If Symbolics could finally bring themselves to port their OpenGenera platform to OS X, I would be even happier.

And some more news about the emulator - there is the first support for ChaosNet, including a file server for Linux. And the Lispmachine-Board mentioned in the link would be pretty cool ...