sysadmin - 5.11.2002 - 28.11.2002

One of the stupidest talks about browsers

One of the stupidest talks about browsers - sorry, but those guys just don't get it. Read their little article, find at least 5 errors (there are many more in there, but we want to keep this simple, don't we?) and then you can keep them. This could really be funny if it weren't for the fact that they are serious about it ... (found through the "Jürgen Möllemann Gedächtnis Bloq")

Post without title

Found on Scripting News: A feature for a mail server?

What if every mail server supported a new feature. An XML-RPC interface with one entry point. It takes one parameter, a user name and returns a struct containing a boolean. The boolean is true if there is such a user on that machine. It's a struct so more info can be returned later. My email program could send a message to the server each piece of mail came from. Hey you got someone with this name, and do they send out spam? If the answer is no, filter it to the bit bucket.
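The proposed interface is easy enough to sketch with Python's standard xmlrpc module. Everything below is made up for illustration - the method name, the struct field and the toy user list - since no such standard exists:

```python
from xmlrpc.server import SimpleXMLRPCServer

# Toy "user database"; a real server would ask its local account store.
KNOWN_USERS = {"alice", "bob"}

def user_exists(name):
    # Returns a struct (dict) so more info can be added later,
    # exactly as the quote suggests.
    return {"exists": name in KNOWN_USERS}

# Bind to an ephemeral port; serve_forever() would actually run it.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(user_exists, "mail.userExists")
# server.serve_forever()
```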

Nice idea, but it already exists. OK, not as XML-RPC, but there are other formats out there, some of them much older. You can use the SMTP VRFY command on many mail servers to verify a user. The problem: checking users is very hard work on large machines (think of sites like http://hotmail.com/ or http://gmx.de/ - multi-million-user sites!), so not every host supports it; many hosts disable VRFY so as not to give out too much data (spammers can use this interface to check addresses for legitimacy, too!); and some just give you an OK on every check (for the same reasons - they just hide better).
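For the curious, the VRFY mechanism can be sketched with Python's standard smtplib. The helper names and return strings here are mine, and as noted above, many real servers will answer 252 or refuse VRFY outright:

```python
import smtplib

def interpret_vrfy(code):
    """Map an SMTP VRFY reply code to a rough verdict (labels are mine)."""
    if code in (250, 251):
        return "exists"
    if code == 252:
        return "cannot verify"   # server refuses to confirm or deny
    if code == 550:
        return "no such user"
    if code in (500, 502):
        return "VRFY disabled"
    return "unknown reply"

def verify_address(host, address):
    """Ask an SMTP server about an address via VRFY (network sketch only)."""
    server = smtplib.SMTP(host, 25, timeout=10)
    try:
        code, message = server.verify(address)
        return interpret_vrfy(code), message.decode(errors="replace")
    finally:
        server.quit()
```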

So would it solve the problem at hand? No. Spammers would just start using the very same interface to validate their own address lists, then pick one randomly out of that pool as the sender. What would we gain? Nothing better than now, only better disguised. You have to take into account that spammers learn, too. They may be at the bottom of social behaviour on the net, but they are not necessarily stupid.

Maybe I'm missing something or it's too early in the morning, but couldn't we ask the servers if they know about this person sending me the spam. I have a feeling that most of the spam I get comes from made-up people.

Oh, sure they are. They have been for some time now. One of the most important things to note: spammers don't care about email replies. They actually don't care about the recipient at all - all they do is send out mail; that's what they are paid for. There may be click-throughs (most porn spam exists to get people to click the links in the mail, which is why most porn spam nowadays is HTML with embedded images), but nobody in that business wants you to return anything to the sender.

So what to do about spam? The current best practice is to set up a Bayesian mail filter like bmf or one of the similar projects. Some integrate into mail clients, some into the mail server - just look around. I use bmf to filter mail on my server, and it works quite well after being fed several hundred mails, getting better with every round. False positives are down to only 2-3 a day (mostly administrative stuff that is easily spotted in the spambox) and false negatives are down to 5-7 a day, easily spotted in the prefiltered inbox, too. The vast majority of the roughly 70-80 spam mails per day are filtered out just fine.
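The core idea behind such a filter fits in a few lines of Python. This is only a toy sketch of the principle (real filters like bmf add proper tokenizing, smarter smoothing and a persistent word database):

```python
import math
from collections import Counter

class TinyBayes:
    """Toy Bayesian spam scorer: a positive score means 'looks like spam'."""

    def __init__(self):
        self.spam, self.ham = Counter(), Counter()
        self.nspam = self.nham = 0

    def train(self, text, is_spam):
        words = text.lower().split()
        if is_spam:
            self.spam.update(words)
            self.nspam += 1
        else:
            self.ham.update(words)
            self.nham += 1

    def spam_score(self, text):
        # log-odds of spam vs. ham, with add-one smoothing per word
        score = math.log((self.nspam + 1) / (self.nham + 1))
        for w in text.lower().split():
            p_spam = (self.spam[w] + 1) / (sum(self.spam.values()) + 2)
            p_ham = (self.ham[w] + 1) / (sum(self.ham.values()) + 2)
            score += math.log(p_spam / p_ham)
        return score

f = TinyBayes()
f.train("cheap viagra click here", True)
f.train("cheap pills click now", True)
f.train("meeting notes for the server migration", False)
f.train("squid proxy config question", False)
```

After this tiny training set, messages full of "spammy" words score above zero and normal admin chatter scores below - the same effect, at much larger scale, that sorts my inbox.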

Found on Scripting News.

Post without title

What's wrong with Joshua Allen's comment on yEnc? One simple thing: yEnc tried to solve a problem in a way that is not stable. There are already implemented standards for encoding, but yEnc ignores them all and puts its own on top of NNTP and Netnews. It ignores every RFC in existence and reinvents the wheel.

Will this break things? Sure, it already did. Not everybody jumps on the bandwagon to implement yEnc, so people have to use external tools to get at the stuff they want. But since yEnc is implemented in the worst way possible, this won't always work. Take for example a dual-charset environment like the Mac OS: you usually use Mac charsets externally, but speak Latin-1 or other standard charsets at the protocol level, so your data gets converted from one charset to the other. Since yEnc gives applications that don't know about it no clue about its existence, the content gets broken.

yEnc makes almost all the errors UUENCODE made, but adds several layers of its own errors on top. This is just plain stupid. And that it "works" is no argument - it works to the extent that people using yEnc-capable programs can exchange information. But the basic idea of the Internet is to enable as many people as possible to use something. That's why MIME is so complicated: its idea is to set up markers in advance that let programs know something problematic is ahead, even if they don't know what it actually is - and hand it over to a helper application.

yEnc's stupid idea of body tags makes this automatic, forward-compatible way of handling content problematic.

A very good discussion of the problems of yEnc is at http://www.exit109.com/~jeremy/news/yenc.html - read it, understand it, and you will know what is wrong with John's little talk.

Post without title

If everything works, this should show up on LiveJournal and in my own sysadmin's corner, making my life easier, as I only need to edit one system to replicate posts to several hosts. Footbridge rules.

Post without title

I hate broken mail servers and mail filters going haywire. Today bmf played up on me and trashed some mail. Of course there was bound to be some interesting content in there, but I can't tell, since I can't read it any more. fsck. So bmf is out again. But how to get rid of spam? At least bmf filtered enough of the shit out that I didn't have to wade through it all by hand. Now I have to set up some redirection to run only the stuff that's probably spam through the filter, to protect important mails. Damn, that's bound to be work.

Post without title

ObStupidProgrammers: start/stop scripts that check for the existence of a pid file, but not whether the contained pid is actually still running or just left over from a crash ...
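A sketch of the missing check in Python (the helper name is mine): the trick is that sending signal 0 tests whether the process exists without actually signalling it.

```python
import os

def pidfile_is_live(path):
    """Return True only if the pid file exists AND its pid is still running."""
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False          # no pid file, or garbage in it
    try:
        os.kill(pid, 0)       # signal 0: existence check, sends nothing
        return True
    except ProcessLookupError:
        return False          # stale pid file left over from a crash
    except PermissionError:
        return True           # process exists, just owned by someone else
```

A start script would then refuse to start only when `pidfile_is_live()` is true, instead of whenever the file merely exists.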

Post without title

People taking their notebook out of the network to go on a journey - and taking with them not only the notebook but also the cabling adapter and terminator for the 10base2 network - should expect network trouble. And please don't blame it on the firewall ...

Post without title

Setting up a virtual host with Apache for a Python Community Server with Manila-style hostnames (one host for each server) is quite easy:

set up your domain to have a wildcard A record for the domain of the server

set up your apache to have a virtual host with a "ServerAlias *.doma.in" in it

add a ProxyPass rule for the new server where you rewrite / to become http://pycs.server.doma.in:5445/~~vhost~~/%{HTTP_HOST}/

enable the rewriting rules in rewrite.conf for PyCS so it recognizes those addresses
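Put together, the virtual host might look roughly like this. The domain, port and hostnames are the placeholders from the steps above, and since ProxyPass itself can't interpolate %{HTTP_HOST}, the proxying is expressed here as a RewriteRule with the [P] flag (mod_proxy and mod_rewrite must be loaded):

```apache
# Step 1 happens in your DNS zone: *.doma.in -> this machine

<VirtualHost *:80>
    ServerName  pycs.doma.in
    ServerAlias *.doma.in

    # Step 3: hand every request to the Python Community Server,
    # encoding the requested hostname into the path (~~vhost~~ convention)
    RewriteEngine On
    RewriteRule ^/(.*)$ http://pycs.server.doma.in:5445/~~vhost~~/%{HTTP_HOST}/$1 [P]
</VirtualHost>
```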

That's it. Run it. Have fun.

Post without title

Squid overdid it today: all those rules from yesterday didn't work and Squid went on caching files again. So now it has been replaced by Apache. And guess what? It works as expected ...

Post without title

Squid annoyed me. Again. I have seen Squid ping servers even though it shouldn't. OK, annoying, but you can cope with that (although it produces timeouts from time to time).

I can live with Squid sending out requests two or more times even though there really isn't a need to - it looks like its internal GET processing gets fucked up when a timeout occurs on the browser side.

But what really did it for me was its stupid HEAD handling. It caches HEAD requests. Yeah, stupid idea - HEAD was originally invented to save bandwidth. HEAD requests are supposed to go out and check the server, not be answered from the cache: if the document didn't change, the client won't need to fetch it, but it needs to know whether it changed at all.

OK, so set up an ACL rule for the HEAD method and deny it in no_cache (actually, what braindead idiot invented the "no_cache deny ACL" syntax, where the ACL describes pages that should actually not be cached? That's arse-backwards twice over!). Should work. Doesn't work. If there was a GET or POST request before, a HEAD will deliver data from the cache, even though it was set not to cache. Stupid. Bad. Ugly.
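The rule in question looks roughly like this in Squid 2.x syntax (the ACL name is arbitrary); note the inverted logic complained about above - the ACL names what must NOT be cached:

```
# squid.conf: don't cache replies to HEAD requests
acl head_requests method HEAD
no_cache deny head_requests
```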

So I had to come up with an additional rule to suppress caching only for those GET requests that could lead to bad HEAD requests (luckily this was possible, because my problems were with AmphetaDesk and its updates, and AmphetaDesk sends its own browser name). So now I don't cache HEAD requests and don't cache results for AmphetaDesk. Does it work? Not really: if there are still documents in Squid's cache, it answers HEAD from those documents, regardless of configuration. Damn.

So I had to remove them. That's what the PURGE method is for, right? Wrong. PURGE only purges documents, not cached headers. So you first have to GET the document, then PURGE it, to remove its cached HEAD response. Oh my god.

And now I still get the occasional TCP_MEM_HIT in the log, although it shouldn't cache at all. Looks like it handles memory caching differently from disk caching. Oh, and this is reproducible with 2.2 and 2.4. Damn. Sucker.

Couldn't life be made a little easier for sysadmins? Please?