Taste for the Web - a nice cartoon about Paul Graham's articles. Yes, his sometimes rather avuncular style, with its constant plugs for Yahoo Stores, can get on your nerves from time to time.
programmierung - 22.11.2005 - 2.1.2006
Re: Web application design: the REST of the story - a very interesting discussion of two currently dominant architectural styles for web applications: REST and Continuations.
LGT: Lightweight Game Toolkit for Python - particularly interesting are the NanoThreads (coroutines for Python), the EventNet (enhanced event programming) and Gherkin (an alternative to Pickle/Marshal). There is now an enhanced successor to NanoThreads and EventNet called FibraNet (which is independent of LGT).
Webstemmer - HTML-Grabber that extracts the actual core text from websites based on the layout.
simple_json 1.0 - Alternative to json.py with fewer quirks (in exchange you get a sociopath as a programmer - but you can't have everything. In this case, working code is more important than a friendly tone).
CSS2/DOM - Styling an input type="file" - wild hacks to style file upload buttons with CSS or JavaScript.
Arithmetic Games
I just realized that 90 days expressed in seconds is 90*24*60*60, not 90*24*60. With that fixed, comment cookies should now also last longer than 1.5 days.
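Spelled out as a small check (a sketch in current Python 3 syntax; the variable names are made up for illustration):

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60

intended = 90 * SECONDS_PER_DAY        # 7776000 seconds, i.e. 90 days
buggy = 90 * 24 * 60                   # 129600: this is 90 days in minutes
buggy_days = buggy / SECONDS_PER_DAY   # 1.5 days when read as seconds

# The standard library agrees with the corrected value.
reference = timedelta(days=90).total_seconds()
```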
StickBlog » Blog Archive » Upload multiple files with a single file element - a nice method to upload multiple files without having to deal with a forest of browse buttons.
Weblogs - Variation on the previous link, here JavaScript and CSS together.
Dejavu - Trac - another ORM for Python. This one is characterized by absurd class names (Arena, Sandbox, Units ...)
Download DrScheme v300 - a new version of the best Scheme system in the world is out. Grab it while it's fresh. Now with Unicode!
appscript - Python as an alternative to AppleScript. This lets you control applications via the AppleScript interfaces directly from Python programs.
Generic Functions with Python
PEAK has been offering generic functions similar to CLOS for Python for quite some time. I always wanted to play around with it, but for a long time it was just part of PyProtocols, and the installation was a bit tricky. However, since September of this year, it has been decoupled and much easier to install. So I dove right in.
And I must say: wow. What Phillip J. Eby has accomplished is truly fantastic. The integration with Python (works from Python 2.3 - he even invented his own implementation of decorators for Python 2.3) is superb, even if, of course, some things take a bit of getting used to.
A small example:
    import dispatch

    [dispatch.generic()]
    def anton(a,b):
        "handle two objects"

    [anton.when('isinstance(a,int) and isinstance(b,int)')]
    def anton(a,b):
        return a+b

    [anton.when('isinstance(a,str) and isinstance(b,str)')]
    def anton(a,b):
        return a+b

    [anton.when('isinstance(a,str) and isinstance(b,int)')]
    def anton(a,b):
        return a*b

    [anton.when('isinstance(a,int) and isinstance(b,str)')]
    def anton(a,b):
        return b*a

    [anton.before('True')]
    def anton(a,b):
        print type(a), type(b)
This small example simply provides a function called 'anton', which executes different code based on the parameter types. The example is of course completely nonsensical, but it shows some important properties of generic functions:
- Generic functions are - unlike classic object/class methods - not bound to any classes or objects. Instead, they are selected based on their parameter types.
- The selection conditions must therefore be declared somewhere - this usually happens via a mini-language in which the conditions are formulated. This is also the only syntactic part that I don't like so much: the conditions are stored as strings. However, the integration is very good, and you get clean syntax errors as early as load time.
- A generic function can be overloaded with any conditions - not just the first parameter is decisive. Conditions can also make decisions based on values - any arbitrary Python expression can be used there.
- With method combinations (methods are the concrete manifestations of a generic function here), you can modify a method before or after its call without touching the code itself. The example uses a before method that is always (hence the 'True') used to generate debugging output. Of course, you can also use conditions with before/after methods to attach to specific manifestations of the call of the generic function - making generic functions a full-fledged event system.
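The properties listed above can be tried out without installing anything. What follows is a toy sketch in current Python 3 syntax, not RuleDispatch itself: the `Generic` class and the `seen` list are hypothetical names, conditions are plain callables instead of strings, and the first matching condition simply wins (no attempt is made to order conditions by specificity).

```python
class Generic:
    """Toy generic function: picks an implementation by testing predicates."""

    def __init__(self, name):
        self.name = name
        self._methods = []  # (predicate, implementation), in definition order
        self._before = []   # (predicate, hook), run before the chosen method

    def when(self, predicate):
        def register(func):
            self._methods.append((predicate, func))
            return self  # keep the name bound to the generic function
        return register

    def before(self, predicate):
        def register(func):
            self._before.append((predicate, func))
            return self
        return register

    def __call__(self, *args):
        for predicate, hook in self._before:
            if predicate(*args):
                hook(*args)
        for predicate, func in self._methods:
            if predicate(*args):
                return func(*args)
        raise TypeError("no applicable method for %s%r" % (self.name, args))


anton = Generic("anton")
seen = []  # filled by the before-hook instead of printing

@anton.when(lambda a, b: isinstance(a, int) and isinstance(b, int))
def anton(a, b):
    return a + b

@anton.when(lambda a, b: isinstance(a, str) and isinstance(b, str))
def anton(a, b):
    return a + b

@anton.when(lambda a, b: isinstance(a, str) and isinstance(b, int))
def anton(a, b):
    return a * b

@anton.when(lambda a, b: isinstance(a, int) and isinstance(b, str))
def anton(a, b):
    return b * a

@anton.before(lambda a, b: True)
def anton(a, b):
    seen.append((type(a).__name__, type(b).__name__))

results = [anton(1, 2), anton("a", "b"), anton("ab", 2), anton(2, "ab")]
```

The dispatch decision here depends on both arguments, and the before-hook runs for every call without any of the four implementations being touched.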
A pretty good article about RuleDispatch (the generic functions package) can be found at Developerworks.
The example, by the way, shows the Python 2.3 syntax for decorators. With Python 2.4, of course, the @ syntax can also be used. One disadvantage should not be kept secret: the definition of generic functions and their methods is not possible interactively - at least not with the Python 2.3 syntax. Unfortunately, you generally have to work with external definitions in files here.
RuleDispatch will definitely find a place in my toolbox - the syntax is simple enough, the possibilities, however, are gigantic. As an event system, it surpasses any other system I know in flexibility, and as a general way of structuring code, it comes very close to CLOS. It's a shame that Django will likely align with PyDispatcher - in my opinion, RuleDispatch would fit much better (as many aspects in Django could be written as dispatch on multiple parameter types).
LTK - The Lisp Toolkit - for when you just need a bit of GUI and not necessarily the big hammer: LTK offers simple bindings for Tk in Common Lisp. Works excellently with OpenMCL, and even CLISP likes it.
Sams Teach Yourself Shell Programming in 24 Hours - A whole book about shell programming. And of course, a pretty good introduction to the various tools that Unix systems provide. Certainly recommended for anyone who, for example, has gotten a root server and now wants to do more with it - but otherwise knows Linux mainly from the GUI.
[GOODIE] Headless Squeak for OS X (Re: Mac VM 3.2.X) (http://lists.squeakfoundation.org/pipermail/squeak-dev/2002-April/037668.html) - how to get a headless Squeak (Smalltalk environment without GUI component) running under OS X for server services. Particularly interesting for using Seaside.
Hyper Estraier: a full-text search system for communities - Full-text database with attribute search and some other nice features - as well as bindings for various programming languages
The Xapian Project - another full-text indexer, this one with various advanced features such as stemming for different languages.
Inets 2.5.5 - Webserver in Erlang
Is Rails a DSL? What is a DSL, and is it possible in Python? - Domain Specific Languages - a quite useful description and examination of the situation in Python and Ruby.
Linux Daemon Writing HOWTO - how to write a daemon under Linux (general information)
Yaws - another web server in Erlang - this one is HTTP 1.1 compatible and contains approaches for web development
Python Cheese Shop : python-fastcgi 1.0 - FastCGI implementation based on the OpenMarket FastCGI C library and therefore significantly faster than pure Python solutions.
Python OpenID 1.0.1 Released — OpenID Enabled - OpenID client and server in Python. I should check it out, could be quite interesting for comment functions.
Hacking the jProject - The Daily WTF - ouch. An order system where each order is stored in its own table in the SQL Server. Great idea.
How-To Guide for Descriptors - a very good explanation of how properties work in Python and what the magic methods __get__, __set__, and __delete__ are all about (and how __getattribute__ plays into this).
Just a Thought
What would actually happen if the GNOME developers went to the Linux kernel mailing list and announced that they recommend FreeBSD to users because the chroot model of Linux is pathetic, the kernel APIs are a mess anyway, Linux still doesn't have really good filesystem notifications, and Linux development simply doesn't take GUI requirements into account enough? And that they therefore suggest that users switch to FreeBSD, because the Linux kernel programmers are all idiots anyway?
What would Linus' reaction look like?
And we all make the same mistakes again
There is currently a lot of activity in the area of Microformats - the idea behind it: to store information blocks not in XML, but in predefined HTML. CSS classes are then used to define what a single format is. Logically, there are problems with colliding styles - what a surprise. I myself am always amazed at how much energy developers can spend on stupid ideas.
We once had HTML that not only dealt with semantics but also with layout. And that produced the all-time favorite FONT-TAG orgies on HTML pages. Over time, most people have come to the realization that separating semantics and layout makes sense - semantics as a basis for marking up content, layout in the CSS files, and as a connection between these, the IDs and classes on tags. Additionally, with DIV and SPAN "anonymous" tags without predefined semantics (except "this is a block of text" and "this is an inline stretch of text" - where this meaning can be easily overloaded), for things that don't work with the normal semantic markup (which is mainly due to the rather stupid idea of HTML that there are markings for headings, but no markings for sections of text to which these headings would belong).
What do Microformats do now? Well, the same stupid idea of misusing something - namely in this case the connecting pieces between semantics and layout mentioned above. Microformats give these a meaning - for example, a DIV with a class 'description' would then be the description of a review - read the details in the hReview reference. Sorry, but this must inevitably lead to conflicts - have the idiots never heard of namespaces? The Microformats explicitly address XHTML - and that has exactly the purpose of embedding namespaces. And if you think you have to implement such a stupid idea - couldn't you at least be smart enough to give the parts more cryptic but unambiguous classes?
As I said, it's amazing how much energy goes into such stupid ideas that are doomed to create more problems than solutions.
Deadlock - interesting article about deadlocks in systems and about zombie processes, signal handling, etc.
setting user passwords in admin
A rather ugly - but still useful - monkeypatch:
    # monkey-patch for auth.users
    from django.models.auth import User

    def user_pre_save(self):
        if not self.password.startswith('sha1$'):
            self.set_password(self.password)

    User._pre_save = user_pre_save
Put this into your model file (or somewhere else that is loaded early on) and you will be able to set passwords in the admin by entering clear text passwords. If the password starts with 'sha1$', it is treated as already hashed and nothing happens. If it doesn't start with that string, it is converted using the standard Django function for password hashing.
No, this isn't something that should go into core - it's far too ugly for that. But at least it allows you to set passwords through the admin, without requiring the user to calculate the actual password hash.
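To see the guard at work without a Django installation, here is a minimal sketch in current Python 3 syntax. `FakeUser` and `hash_password` are hypothetical stand-ins rather than Django code, and the `sha1$<salt>$<hexdigest>` layout with a fixed salt is an assumption made purely for illustration.

```python
import hashlib

def hash_password(raw_password, salt="ab12c"):
    """Hypothetical stand-in for the framework's password hashing."""
    digest = hashlib.sha1((salt + raw_password).encode("utf-8")).hexdigest()
    return "sha1$%s$%s" % (salt, digest)

class FakeUser:
    """Hypothetical stand-in for the real User model."""

    def __init__(self, password):
        self.password = password

    def set_password(self, raw_password):
        self.password = hash_password(raw_password)

def user_pre_save(self):
    # Only hash values that do not already look like a stored hash.
    if not self.password.startswith("sha1$"):
        self.set_password(self.password)

# The same kind of monkeypatch, applied to the stand-in class.
FakeUser._pre_save = user_pre_save

user = FakeUser("secret")
user._pre_save()            # clear text gets hashed on the first save
first_hash = user.password
user._pre_save()            # a later save leaves the stored hash alone
```

One consequence of the prefix check: a clear text password that itself starts with 'sha1$' would be stored as entered, without hashing.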
SystemExit and exception handlers
Frequently used, yet unknown to many people: SystemExit. The special thing about this Python exception is that it is not an error, and it does not occur unexpectedly either. It is simply triggered by sys.exit. The idea behind this is that you can hook termination handling (e.g. some file cleanup) into the dynamic flow of the program without tying into global exit processing (with all the problems that entails).
The problem is that many programs and libraries install a global exception handler: one that catches every error and sends it nicely formatted by mail, logs it somewhere, or something similar. I do this all the time. It also works great - except when you actually want to end your program early. Then nothing works anymore, because you get error reports for something that is not an error.
This becomes particularly critical in connection with multiple processes. If you start a process during operation, you also want to terminate it without executing any subsequent code. You can best see this in an example program:
    import signal
    import os
    import sys

    try:
        pid = os.fork()
        if pid:
            print "Parent process", os.getpid()
        else:
            print "Child process", os.getpid()
            sys.exit(0)
    except:
        print 'Error occurred in process', os.getpid()

    print "Only the parent process may execute this", os.getpid()
This code simply has a global error handler that catches errors in a rather unspecific way. Within the code, a parallel process is started with fork. However, since SystemExit is treated like all other exceptions, the child process is not terminated correctly - and a forked process is a copy of the entire state of the parent process, including return addresses, active error handlers, open files, database connections and so on.
This is of course fatal - because here sys.exit is caught. So there is an error message for the quite normal sys.exit(0) call. And even worse: since SystemExit is not treated separately, execution continues normally afterwards - and the child process runs into code meant for the parent process. Code runs twice, which can have critical results under certain circumstances.
If you can fully control the entire software stack, the solution is simple:
    import signal
    import os
    import sys

    try:
        pid = os.fork()
        if pid:
            print "Parent process", os.getpid()
        else:
            print "Child process", os.getpid()
            sys.exit(0)
    except SystemExit:
        raise
    except:
        print 'Error occurred in process', os.getpid()

    print "Only the parent process may execute this", os.getpid()
This simply re-raises the SystemExit - i.e. triggers it again - without printing a message. In most cases, Python's standard handling will then kick in and convert the SystemExit into a normal termination.
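The difference between the handler variants can be checked without forking. This is a small sketch in current Python 3 syntax; the function names are made up for illustration. The third variant relies on the fact that in current Python versions SystemExit derives from BaseException rather than Exception, so `except Exception` does not trap it.

```python
import sys

def swallow_everything():
    """Bare except: also traps the SystemExit raised by sys.exit()."""
    try:
        sys.exit(0)
    except:
        return "caught"

def reraise_exit():
    """Pass SystemExit through explicitly, handle everything else."""
    try:
        sys.exit(0)
    except SystemExit:
        raise
    except:
        return "caught"

def catch_exception_only():
    """except Exception never sees SystemExit in the first place."""
    try:
        sys.exit(0)
    except Exception:
        return "caught"

def exits_cleanly(func):
    """Return True if func lets SystemExit with code 0 escape."""
    try:
        func()
    except SystemExit as exc:
        return exc.code == 0
    return False

outcome = {
    "bare except": exits_cleanly(swallow_everything),
    "re-raise": exits_cleanly(reraise_exit),
    "except Exception": exits_cleanly(catch_exception_only),
}
```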
But what to do if you have several stacked variants of the wrong error handling? I had something like this with Django and FLUP (the FCGI/SCGI server for Python). In Django I changed it, then the error hit in FLUP. What do you do then?
The solution is a bit more brutal:
    import signal
    import os

    try:
        pid = os.fork()
        if pid:
            print "Parent process", os.getpid()
        else:
            print "Child process", os.getpid()
            os.kill(os.getpid(), signal.SIGTERM)
    except:
        print 'Error occurred in process', os.getpid()

    print "Only the parent process may execute this", os.getpid()
Ultimately, the process simply commits suicide - it sends itself a SIGTERM, i.e. a termination signal, the same one you would normally send from the shell with kill. However, you must then ensure that any necessary cleanup is either already done or runs in a SIGTERM handling routine (SIGKILL cannot be caught) - otherwise you may have problems (e.g. database transactions should already be committed).
With this solution, you also have to be careful that no open resources block the process - otherwise you may produce zombie processes. Often it is better for such multiprocessing to start a management process much earlier in the system - outside the error handling chain - and then use it to start processing processes. However, this then has the disadvantage that processes started in this way do not inherit the environment of the parent process. Therefore, you usually have to make more preparations to perform the desired actions. Incidentally, Apache pursues a similar approach - there the processes are created from a very early basic state, so that they come as resource-free as possible.
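A cleanup routine for the self-sent termination signal could look roughly like this. It is a sketch in current Python 3 syntax for POSIX systems, not the code I actually run: the function names and the pipe used to report back to the parent are made up for illustration, and leaving the handler via os._exit is my choice here because it ends the process without raising SystemExit.

```python
import os
import signal

def run_child_with_cleanup():
    """Fork a child that ends itself with SIGTERM; return what the parent sees."""
    read_fd, write_fd = os.pipe()

    def on_sigterm(signum, frame):
        # Stand-in for real cleanup work (commit transactions, close files).
        os.write(write_fd, b"cleanup done")
        os.close(write_fd)
        # os._exit ends the process at once and raises no SystemExit,
        # so no exception handler further up the stack can trap it.
        os._exit(0)

    pid = os.fork()
    if pid == 0:
        # Child process.
        os.close(read_fd)
        try:
            # SIGTERM can be handled; SIGKILL cannot.
            signal.signal(signal.SIGTERM, on_sigterm)
            os.kill(os.getpid(), signal.SIGTERM)
        except:
            pass  # simulates a catch-all handler somewhere up the stack
        os._exit(1)  # safety net, only reached if the handler did not run

    # Parent process: read the child's report, then collect its exit status.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        message = pipe.read()
    _, status = os.waitpid(pid, 0)
    return message.decode("ascii"), os.WEXITSTATUS(status)

message, exit_code = run_child_with_cleanup()
```

The waitpid call in the parent also collects the terminated child, so it does not linger as a zombie.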
Vampire - An extension of mod_python that makes it more developer-friendly. For example, it can also perform automatic code reloading.
Learning Seaside - cool demo of what can be done with Seaside (Smalltalk web framework) and AJAX. Essentially a database interface with a freely configurable database model - something like Google Base, only cooler.
Ajax Sucks Most of the Time (Jakob Nielsen's Alertbox December 2005) - why Jakob Nielsen is right - sometimes.
Commentary - Sticky notes for websites, implemented as WSGI middleware. Very interesting, could be particularly interesting for source views or similar, or for longer texts.
pyinotify - very nice, finally a usable wrapper for the inotify facility in Linux. With it, Python programs can be informed about changes in the file system - ideal for directory monitoring.
Paj's Home: Cryptography: JavaScript MD5: sha1.js - JavaScript implementation of SHA1 - practical if you want to avoid plaintext passwords in web forms. Of course, you should always have a fallback, because not everyone has JavaScript available or activated. The site also has MD5 and MD4 implementations and a few other snippets on the topic.
akismet.py - Python interface for the (central) Akismet Spam Scanner.
Development « Akismet - the Akismet API
Louie - a new event dispatching module for Python. Builds on PyDispatcher.
SQLAlchemy README - another ORM for Python, heavily oriented towards SQL and offering a lot of magical syntax. Fascinating how in this area programmers try to abuse every language feature just to avoid writing SQL ...
axentric. a web designer's “tackboard”. - generalized version of the yellow-fade technique by 37signals. Nice for highlighting parts of pages that shouldn't stay permanently.
Again something from the crafting front
Content-type: matter-transport/sentient-life-form - for those who want to get a taste of where my blog is headed. Not quite finished yet, some bugs in my software, a few things waiting for patches in Django, but overall I'm already quite satisfied.
Another OPML server...
Phil Pearson does it again - this time he has reimplemented the community server for Dave Winer's OPML editor in Python (previously he had rebuilt the Radio Community Server, a project I was also briefly involved in). In any case, you can now also publish your OPML editor on your own Linux machine if you have Python and SCGI available there.
DragAndDrop - MochiKit - Trac - Drag and drop with MochiKit.
Weird Python 2.3 Bug
Some bugs you chase are really strange. Just take a look at the following Python script:
    import re

    r = re.compile('^', re.M)

    src = '''<html> <head> <title>Logviewer</title> </head> <body> <div> <h1>Titel</h1> </div> <div> {{}}
    {% block content %}
    {% endblock %}
    </div> </body> </html>
    '''

    for match in r.finditer(src):
        print match.start()
Looks quite harmless - it just returns the positions of the line starts (yes, I know, you would do this differently - the source is not mine). Under Python 2.3, the script goes into an infinite loop at the final, closing newline. If you remove that newline (i.e., put the ''' directly behind the last tag without a line break), the script works. Under Python 2.4, both variants work. And you have to chase after things like that...
Do I really need to mention that this little snippet of code was hidden in a larger pile of code?
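For comparison, this is what the same pattern does on a current Python 3, where the loop terminates. It is a sketch with made-up names, and it says nothing about what exactly went wrong inside 2.3. Note that a trailing newline yields one extra match at the very end of the text.

```python
import re

LINE_START = re.compile(r"^", re.MULTILINE)

def line_start_offsets(text):
    """Return every offset at which '^' matches in multiline mode."""
    return [match.start() for match in LINE_START.finditer(text)]

with_trailing_newline = line_start_offsets("one\ntwo\n")   # [0, 4, 8]
without_trailing_newline = line_start_offsets("one\ntwo")  # [0, 4]
```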
Microsoft to Standardize Office Formats in ECMA
Stephen Walli (ex-Microsoft) on the pitfalls to expect in Microsoft's latest move:
It will likely be a royalty free license, because the current patent license around the proprietary specification is royalty free. That patent license, however, couldn't be sublicensed, so an implementer that wanted to license their implementation under the GPL couldn't. Indeed previous examples around the IETF SenderID standard would force users of other implementations to engage in a license with Microsoft which is a rather onerous problem for free and open source licensed software.
The reference to the SenderID story is quite important: there, too, Microsoft kept saying that it was an open standard, while always concealing that its understanding of open standards is absolutely incompatible with many areas of open source development - Microsoft will surely block the GPL again.
Apart from that, I find it pretty pathetic that Microsoft simply refuses to implement ODF and thinks it has to turn its own stuff into a pseudo-standard - especially since we know exactly how Microsoft then behaves towards such standards. They get extended in the appropriate places, and that's the end of free access.
Web Development Bookmarklets - various bookmarklets that are very helpful for web development.
Closures python,scheme,ruby - a good explanation of the somewhat faulty lookups for lexical variables in Python (at least when an assignment is involved in an inner scope).
Routes 1.0 Released - this is the Python version of the URL routes from Ruby-on-Rails. Very interesting, I must sit down at some point and see if I can't build this into Django as an alternative URL dispatcher.