Archive 25.1.2005 - 1.2.2005

Nuclear Elephant: DSPAM

Nuclear Elephant: DSPAM is a Bayesian spam filter, but one that typically runs for an entire group of users rather than for a single one. I have it running on simon.bofh.ms to scan all the mailboxes there - it integrates well and has a whole range of interesting features: a web interface for managing the spam filter, and a quite pragmatic method for reporting false detections back to the filter. Also nice is the broad database support (MySQL, PostgreSQL, SQLite, and several db* types). Overall, it makes a really well-rounded impression - the only downside is that the interface isn't translated.

Whether it actually filters well I can't say yet, of course, because there hasn't been enough mail - the emails first need to accumulate and be used for training. User reports, however, are quite positive - as is typical for Bayesian spam filters.

Found at Schneier on Security: the weakest link. So much for the topic of security.

Solaris 10 is now available for free download - even though I certainly won't be using it in production, it would definitely be worth taking a look at.

Away with Trackback

Isotopp is pondering trackback spam on the occasion of spam day and presents several approaches. One of them uses a counter-check of the trackback URL against the IP of the submitting computer - if the computer has a different IP than the server advertised in the trackback, it would probably be spam. I've written down my own comments on this - and explained why I'd rather be rid of trackback today than tomorrow. Completely. And yes, that's a complete 180-degree turn on my part regarding trackback.

The IP test approach once again comes from the perspective of purely server-based blogs. Unfortunately, there is a large heap of trackback-capable software that doesn't need to run (and often doesn't run) on the server where the blog pages are located - all tools that produce static output, for example. Radio Userland blogs are the large installations, PyDS blogs the smaller ones. The same goes for Blosxom variants in offline mode (provided there are trackback-capable versions by now - but since these are typical hacker tools, they definitely exist).

Then there are the various tools that aren't trackback-capable, where users then use an external trackback agent to submit trackbacks.

And last but not least, there are the various Blogger/MetaWeblogAPI clients that submit the trackback themselves, because, for example, only Movable Type's MetaWeblogAPI allows triggering trackbacks and other APIs don't.

Because of this, the IP approach can only be treated as a filter that lets some of the trackbacks straight through, or else it locks out trackbacks from the users mentioned above. And the latter would be extremely unpleasant.
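To make the limitation concrete, here is a minimal Python sketch of the IP counter-check as described above. The function names are hypothetical and not taken from Isotopp's post or from any blog software; the sketch only assumes that the check resolves the host in the trackback URL and compares the result with the address of the submitting computer.

```python
import socket
from urllib.parse import urlparse


def resolve_host_ips(hostname):
    """Return the set of IP addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    return {info[4][0] for info in infos}


def passes_ip_check(trackback_url, sender_ip, resolver=resolve_host_ips):
    """True if sender_ip is one of the addresses of the host in trackback_url.

    `resolver` is injectable so the check can be tried without real DNS.
    """
    hostname = urlparse(trackback_url).hostname
    if hostname is None:
        return False
    try:
        return sender_ip in resolver(hostname)
    except OSError:
        # The host could not be resolved: treat the check as failed.
        return False
```

A trackback sent from a desktop tool or an external trackback agent fails this check even though it is legitimate, which is exactly the objection raised above.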

Actually, the problem is quite simple: Trackback is a sick protocol that was cobbled together in a hurry, without the developer giving the whole thing even a moment's thought. It therefore belongs, in my opinion, on the garbage heap of API history. I support it here simply because WordPress implements it by default. Once the manual moderation effort becomes too high, trackback will be removed here completely.

Sorry, but on the trackback point the Movable Type makers really behaved a lot like Microsoft: they pushed through a completely inadequate pseudo-standard via market dominance - without giving even a thought to the security implications. Why do you think RFCs are always required to have a section on security considerations? Unfortunately, all the blog developers faithfully followed along (yes, me too - with Python Desktop Server) and now we're stuck with this silly protocol. And its - completely predictable - problems.

Better to develop and push a better alternative now - for example PingBack. PingBack specifies that a page that wants to send a PingBack to another page must really contain a link to that page, exactly as given - the API always transmits two URLs, the sender's own and the foreign one. The page at the sender's own URL must link to the foreign URL in its source; only then will the foreign server accept the PingBack.

For spammers this is pretty absurd to handle - they would have to rebuild the page before every spam or ensure through appropriate server mechanisms that the spammed weblogs then present a page during testing that contains this link. Of course that's quite doable - but the effort is significantly higher and due to the necessary server technology, this is no longer feasible with foreign open proxies and/or dial-up access.
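The verification step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `LinkCollector`, `page_links_to` and `accept_pingback` are made up here, the XML-RPC transport that PingBack uses is left out, and the link match is a plain exact comparison of href values.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.add(value)


def page_links_to(html, target_url):
    """True if the HTML contains a link whose href is exactly target_url."""
    collector = LinkCollector()
    collector.feed(html)
    return target_url in collector.hrefs


def accept_pingback(source_url, target_url, fetch=None):
    """Accept only if the page at source_url really links to target_url.

    `fetch` is injectable (hypothetical parameter) so the logic can be
    exercised without network access.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=10) as response:
                return response.read().decode("utf-8", errors="replace")
    return page_links_to(fetch(source_url), target_url)
```

The receiving server does the fetching itself, so a spammer has to actually host a page with the link somewhere, which is the extra effort mentioned above.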

Because of this, the right approach would simply be to switch the link protocol. Away with Trackback. You can't plug the trackback hole.

PS: anyone who looks at my trackback in Isotopp's post will immediately see the second problem with trackback: apart from the huge security problem, the character set support of trackbacks is simply a complete disaster. The original author of the pseudo-standard didn't think for a minute about possible problems here either. And then some people still wonder why TypeKey from the Movable Type people isn't so well accepted - sorry, but people who make such lousy standards won't be getting my login management either ...

Interview with a link spammer | The Register - of course this could be fake, but the guys from The Register claim they conducted an interview here with a blog spammer.

IT Manager's Journal | Bitter struggle to control SCO Group parent company - cool, the SCO management is tearing itself apart in court proceedings

law blog » MONEY BACK FROM JAMBA & CO. - interesting reference and interesting discussion on the question of whether parents have to get money back from Jamba if they demand it - and their children who are not fully legally competent have taken out a subscription with Jamba.

Orange Data Mining

Another link for the number crunchers: Orange is a data mining library with Python integration and—at least judging by the screenshots—an interesting GUI.

How do you stand it?

Phil Ringnalda recounts his dream about the history of RSS, in which he finds himself in a conversation with early RSS developers discussing the technical choices and philosophical debates that shaped the format.

In the dream, Phil is asked by one of the developers: "How do you stand it?" — referring to the frustrations and complexities that came with RSS adoption and the various competing standards that emerged.

The post reflects on the tensions between simplicity and functionality, and how different visions for what RSS should be led to fragmentation in the ecosystem. Phil uses the dream narrative to explore the human and technical dimensions of this important web technology.


A series of small nice freeware tools for OS X. I particularly like the WordServices and the CalcService (a simple formula evaluator as a service).

SSH on Mobile

MidpSSH | SSH and Telnet client for MIDP / J2ME devices was recommended to me in the comments on an older post. I installed it on my phone and have to say, I'm impressed. Regardless of how silly the idea is to operate an SSH shell via a phone, it works. And with the macros, it could even be useful for some special cases.

Ok, it doesn't make a lot of sense for our server fleet - most of our servers aren't directly accessible from outside. And switching to the next server is quite annoying with mobile text input. But usually I only need access to the front servers to trigger actions from there - and where these are still missing, you could certainly set up scripts on the front servers.

US court: Guantanamo tribunals are unlawful | tagesschau.de - interesting. But will that impress Bush much?

WordPress Related Entries plugin

わさび » Archives » WordPress Related Entries plugin - a very nice little plugin that searches for related articles using MySQL's full-text index. Of course, this is only a fairly simple algorithm and the quality of results is nowhere near Google's level, but I installed it anyway. When you go to the detail page of a post (e.g., by clicking on the title), a list of up to 5 matching other articles is displayed.

I also expect this to give somewhat better positioning for various older posts - without having to remember to manually set a link to them every time (hey, most of the time I've forgotten about them myself!). And maybe it will also help people who come via search engines to find what they're looking for.

Besides, it's cool, and cool is good

It's cool, man!

Bill Gates wants to make the Internet more secure - will he discontinue the entire Windows operating system line and eliminate Internet Explorer?

Camera Bellows and Hoods - Bellows manufacturer that produces replacement bellows. Possibly a solution for my Fujica problem.

Camera Bellows Restoration Trick - Tips on the repair and sealing of camera bellows.

darcs - Distributed Versioning

darcs is one of many version control systems vying to succeed CVS. Specifically, darcs belongs to the class of distributed version control systems and is thus naturally superior to Subversion with its centralist approach (at least if you want to manage a distributed project and can't just get by with the central repository). Normally I wouldn't say much about something like this — after all, there are currently more version control projects than there were editors in the 80s. But seriously now: who can ignore a version control system that is written in a functional programming language with lazy evaluation (yes, exactly, this thing is in Haskell — so much for the claim that Haskell is unsuitable for practical projects) and describes itself as being based on a "theory of patches" with roots in quantum mechanics? And the programmers even use literate programming — yes, that somewhat forgotten method by Knuth of combining documentation and code in a single source file and developing a program from a documentation-centric perspective. Simply cool.

Reprinted Repair Manuals - all kinds of service manuals for all kinds of camera types.

Student must go to prison for one and a half years because of computer worm - but when will the company whose vulnerable garbage software enables these attacks finally be brought to court? They sit there and rake in billions - without being held liable for their product defects. Any automobile manufacturer whose products had such massive security flaws would have been sued into the ground long ago.

8 Pieces Winter

This afternoon I had the opportunity to take the little digital camera for a walk. By the way, you can use "View Image" on the larger image to display the image in its original size - with newer images, these are then available in 800x600.

[Photos: 8 Pieces Winter - 1 to 8]

"Bild" violates human dignity

BILDblog » "Bild" violates human dignity - and even gets this legally confirmed. Unfortunately just one of many cases. And I don't believe this will put an end to the smut journalism in Bild - they get stopped far too rarely for that.

fjf's (Cocoa) AbiWord for Mac (MacOSX) - funny, I don't seem to have linked to this yet. AbiWord is really a nice word processor. Certainly a worthwhile alternative to larger packages for occasional writers.

Music industry warns heise online over report on copying software

Music industry warns heise online over copying software report - I hope Heise's lawyers have a lot of fun when they (hopefully!) tear apart the music industry in court. I definitely trust Heise's lawyers much more than Waldorf and Stettler ...

Notify visitors about comment moderation in WordPress.

Stupid Spambot at Work

Right now a pretty stupidly constructed spambot is hammering away at my comment function and clogging up my moderation queue - nothing gets through from it because it's so stupid that it posts everything in plain text, loads of links and typical spam words. So it gets caught by the most basic filters. Nonetheless, something like this can of course have fallout - namely comments from others that end up in moderation (e.g. because the number of links is too high) could be overlooked by me in the mess of hundreds of spam comments and accidentally deleted along with it. If that happens, it's not personal. I just don't feel like scrutinizing carefully when dealing with several hundred spam comments to make sure I'm really only deleting spam...

Update: After taking a closer look at it, I've put it in /dev/null for now - the moderation queue is no longer burdened by it and legitimate moderated comments won't accidentally get deleted. What struck me during the closer examination: a large number of very widely scattered IP addresses are being used. Sounds very much like a botnet, especially since the IP addresses, based on spot checks, appear to all be dynamic dialup addresses. So our friends with remotely controlled Windows machines are once again the horse that spam rides on here. Great. Thanks, Microsoft...

DNA Analysis in the Bundestag

DNA Analysis in the Bundestag [raben.horst] - and so we continue building the police state. Never mind that the Constitutional Court restricted the use of DNA samples to particularly serious crimes. Never mind that genetic fingerprinting - and compulsory at that - offers far more possibilities than conventional fingerprints. As long as the hardliners get their surveillance and control obsession confirmed.

Using the .Mac SDK - Objective C (and probably also Python via PyObjC) interface to .Mac.

Copyscape - Website Plagiarism Search - Web Site Content Copyright Protection - I just wanted to make a note of this. A search engine that searches for plagiarism of websites.

No more direct access to newsgroups at AOL - we could now dream that September comes to an end ...

Bundestag's Legal Committee votes against software patents

Legal Affairs Committee of the Bundestag votes against software patents - will someone in government finally wake up? Or will the Bundestag's position - like the EU Parliament's position before it - be trampled underfoot?

SCO vs. Linux: SCO Finds IBM's Code Demands Unreasonable

SCO vs. Linux: SCO finds IBM's code demands unreasonable. Amusing - crying for code themselves, but unable to hand over their own. And if handing over their own code would really tie them up that much, how do they intend to sift through the vastly larger amounts of code from IBM? It's remarkable that the SCO people aren't embarrassed about this whole mess...

Constitutional Court Lifts Ban on Tuition Fees

Constitutional Court lifts ban on tuition fees. Welcome to a two-tier society when it comes to education. No, 500 euros per semester is not a socially acceptable fee. But that's the agenda anyway - those at the bottom are not supposed to have a chance to move up. It's all about elite universities and tuition fees creating an elite - the financial elite. And so after all these decades we've drawn a final line under equal educational opportunities - fittingly in the year when international studies confirm that we don't have much to boast about when it comes to equal opportunity in education anyway.

A nation of poets and thinkers? Not at all. A nation of sheep and fools seems more fitting...

WP-Questionnaire Plugin

Ok, I've finished the plugin for WordPress 1.5. Simple thing - a plugin and a small management page where you can set up various questions. To install it, you download the plugin, copy the files to the locations specified in the readme.txt and activate the plugin. Then you just add a few questions in the management section under Questionnaire and you're done. When someone comments, a more or less silly question is asked, and the answer should be as short as possible (we don't want to annoy the commenters too much). If the answer is correct, the comment - provided no other anti-spam methods kick in first - is released immediately. If the answer is wrong, the comment goes into moderation and must be approved by the admin.
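The decision logic described above is small enough to sketch. The following Python is not the plugin's code (the plugin is a WordPress plugin); it is a hypothetical illustration with a made-up question pool and made-up function names, assuming answers are compared case-insensitively after trimming whitespace.

```python
import random

# Hypothetical question pool; each site would set up its own questions.
QUESTIONS = {
    "What is 1+1?": {"2", "two"},
    "What color is the sky on a clear day?": {"blue"},
}


def pick_question(rng=random):
    """Pick one of the configured questions to show in the comment form."""
    return rng.choice(sorted(QUESTIONS))


def normalize(answer):
    """Trim whitespace and ignore case so short answers are forgiving."""
    return answer.strip().lower()


def comment_status(question, answer):
    """Return 'approved' for a correct answer, otherwise 'moderation'."""
    accepted = QUESTIONS.get(question, set())
    return "approved" if normalize(answer) in accepted else "moderation"
```

A wrong answer never discards the comment; it only routes it to moderation, matching the behavior described above.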

You can of course also build a secret IQ test for your commenters with this and instead of simple questions put small riddles in there - only those who solve them are allowed to comment immediately.

I've activated the plugin on my site, let's see if it has any effects on the commenting behavior of people here. You can share your opinions here about what you think of such an anti-spam methodology.

A fairly interesting possible attack on any captcha solution can be found incidentally in the comments to Eric Meyer's WP-Gatekeeper: you can simply collect and save the comment forms. Additionally, you need a site where you can use these - for example, a site for free porn videos. There you present the captchas to the users of these sites and take their answers. You then send this answer to the saved form and the comment is done. Of course you can also take countermeasures against this - probably best would be an encoded timecode in the form and rejection of a timecode that's too old, since the answers from the porn viewers probably won't come immediately. Interesting approach, the whole thing.
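The timecode countermeasure could look roughly like the following Python sketch. Everything named here is an assumption for illustration: the secret, the two-minute limit and the function names are invented, and signing the timestamp with an HMAC is one way to "encode" it so that it cannot simply be edited in the saved form.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MAX_AGE_SECONDS = 120      # assumed limit; tune to taste


def _sign(issued):
    """HMAC-SHA256 signature over the issue time (a decimal string)."""
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def make_timecode(now=None):
    """Create the hidden form value: '<issue time>:<signature>'."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}:{_sign(issued)}"


def timecode_is_fresh(timecode, now=None):
    """True if the timecode is untampered and not older than the limit."""
    try:
        issued, signature = timecode.split(":")
        issued_at = int(issued)
    except ValueError:
        return False
    if not hmac.compare_digest(signature.encode(), _sign(issued).encode()):
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued_at <= MAX_AGE_SECONDS
```

A form that was collected, parked on another site and answered later fails the freshness check, while the signature keeps the spammer from simply writing a newer time into the saved form.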

Update: the plugin still has two bugs. For one, it also catches trackbacks (which of course never have the necessary variables). For another, it can currently still be circumvented pretty easily if you know what to look for in the form - you just need to solve one captcha and then you can spam other comments by changing the comment ID. The latter is actually a bug in many captcha solutions - it's an easy trap to fall into: you forget to bind the captcha to some kind of serial number or similar, so that a given form can only be used once...

So I'll be making an update to the plugin in the near future.

Update 2: the problem with trackbacks and pingbacks should now be solved. The problem with replay is still in there. I still need to think about that a bit. My previous solution approaches don't really appeal to me for that.
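One possible way to close the replay hole, sketched in Python as an assumption and not as the plugin's actual fix: issue a random token with every form, remember which post it was issued for, and throw it away on first use. The in-memory dictionary and the function names are hypothetical; a real plugin would keep the tokens in the database.

```python
import secrets

# In-memory store for illustration only; maps token -> post ID.
_issued_tokens = {}


def issue_form_token(post_id):
    """Create a random token bound to the post the form belongs to."""
    token = secrets.token_hex(16)
    _issued_tokens[token] = post_id
    return token


def redeem_form_token(token, post_id):
    """Valid exactly once, and only for the post it was issued for."""
    # pop() consumes the token even when the post ID does not match,
    # so a failed attempt cannot be retried against another post.
    bound_post_id = _issued_tokens.pop(token, None)
    return bound_post_id is not None and bound_post_id == post_id
```

With this, solving one question yields exactly one comment on one post; changing the ID in the form or resubmitting it is rejected.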

Update 3: I've now switched it off here again. I haven't gotten any comment spam so far and without a compelling reason, even a simple question to answer is pretty annoying...

Quotes from Karl Valentin in lecture scripts allowed under conditions - one wonders what Valentin himself would have had to say about that...

From my search engine referrers

The really nice thing about my Zeitgeist is that it also shows me the absurd little things. So I would like to let anyone know who searched for naked pictures of Bill Gates that I don't possess any such pictures and don't intend to have them here on the blog. You have to draw the line somewhere.

The Government's Rip-off Aid for Electricity Producers

Large consumers are to be relieved of electricity costs at the expense of private households. And this is not some backbencher demanding this – rather these are demands from the government to a regulatory authority in the energy sector to be established. Great – another piece of evidence that all these wonderful regulations are only about allowing companies to cut themselves the largest possible pieces of the cake at consumers' expense. Politically sanctioned rip-off. An excellent example of Clement-style special democracy.

As the Schockwellenreiter already correctly asks: is it any wonder when the members of parliament are paid by energy suppliers?

A little spot in the green?

A Spot in the Green?

First appearance as CDU general - and failed

Debut as CDU General: Kauder shocks Red-Green with Nazi comparison. At least one stops wondering why the JU invites Hohmann as a keynote speaker - the tree simply doesn't fall far from the apple here. The Union has been playing with the right-wing fringe time and again since Kohl.

Protests against the situation in Saxony only exist because a few seats didn't go to their own ultra-rightists there. So no real difference of opinion, but pure turf warfare...

Eric's Archived Thoughts: WP-Gatekeeper

Eric's Archived Thoughts: WP-Gatekeeper is a very interesting approach to comment spam: it simply asks one of many pre-configured questions that a human can answer very easily, but a spam bot cannot. Similar forms are already being used in various blogs, but here it's nicely worked out (although in my view it could also be completely realized as a plugin). The basic idea is essentially that of a CAPTCHA - but a textual CAPTCHA. A human can easily answer the question what is 1+1 - a spam bot won't get anywhere with that. Sure, spammers can create databases of questions and answers. But if everyone sets up their own collection of questions, it won't get them far. For comment spam, it should be a very usable solution.

Unfortunately, there's no such simple solution for trackbacks...

Update: since I find the idea somewhat amusing, I'm currently writing a corresponding plugin. So it's possible that my comments might behave a bit strangely tonight.

FDP's Presentation on Education Policy

FDP: "In Germany, the wrong people are having children" - I was also sitting there pretty flabbergasted at the garbage that Bahr spouted. I just hadn't quite figured out how to verbally attack it. Ralf took that off my hands. Go read it.

freshmeat.net: Project details for JRuby - cool, JRuby has now reached Ruby 1.8. A nice alternative in the Java environment to simply program with Ruby. The Jython folks should get a move on and finally make Jython fully Python 2.3 compatible - there's still a lot that needs work there.

heute.de - The Unequal Brothers. A good summary of the blogosphere and its relationship to journalism.