Programming - 28.3.2011 - 11.5.2011

App Engine Go Overview. Honestly, I would find it more exciting if Google would move away from outdated Python 2.5. But well, instead of Python 2.7 or a JVM language, you can now program the AppEngine with Go. At the same time, however, prices and conditions have changed, so it would probably be better to first check whether it is worth it at all. Because you can also use Go just as well on your own root server ...

bconstantin / django_polymorphic. Why am I only finding this now? This is a very nice thing for Django projects with inherited models. With plain Django, a query against a common base model class only returns instances of that base class - with Django-Polymorphic you get instances of the concrete subclasses. In principle, this makes the ORM behave more like an object database. However, this might come at the expense of performance, as more SQL queries are generated.

Mixing it up: when F# meets C#. Since you never program in isolation, the connections between languages are quite important - especially on platforms like .NET and the JVM. The mappings of F# data types to C# data types and how they are used look quite interesting. Using C# data from F# is trivial, but the other way around there are some peculiarities. A similar situation exists with Scala and Java.

birkenfeld / karnickel. Quite a weird thing: macros at AST level for Python. However, in a form that is more reminiscent of C macros - simple expression macros (and quite limited block macros). Above all, you get all the nasty problems of such an unhygienic macro system - like name conflicts between macro-local variables and outer variables. It is really more a proof that it works, and of what you can do with the ast module that ships with Python.
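
To make the name-conflict problem concrete, here is a minimal sketch using only the standard ast module. It is not karnickel's API; the SWAP macro, the SwapExpander class and run_with_macros are made up for illustration, and ast.unparse assumes Python 3.9 or newer.

```python
import ast


class SwapExpander(ast.NodeTransformer):
    """Expands the hypothetical statement macro SWAP(a, b) at AST level."""

    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id == "SWAP"
                and len(call.args) == 2):
            a, b = (ast.unparse(arg) for arg in call.args)
            # The macro-local name "tmp" is pasted in verbatim.
            # Nothing renames it, so the expansion is unhygienic.
            return ast.parse(f"tmp = {a}\n{a} = {b}\n{b} = tmp").body
        return node


def run_with_macros(source):
    """Expand macros in source, execute it, and return the resulting namespace."""
    tree = SwapExpander().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<macro>", "exec"), namespace)
    return namespace


# The caller happens to use a variable called "tmp" as well.
result = run_with_macros("tmp = 'important'\nx = 1\ny = 2\nSWAP(x, y)\n")
print(result["x"], result["y"], result["tmp"])
```

The swap itself works, but the caller's own tmp is silently overwritten by the expansion. A hygienic macro system would rename the macro-local variable to avoid exactly this.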

dyoo/moby-scheme. Another interesting thing for Android: a PLT Scheme (i.e., Racket) dialect with a matching toolchain. Applications written in Racket's Advanced Student Language plus World Primitives are compiled to JavaScript and then bundled into Android applications (ASL is already quite an extensive Scheme dialect in Racket, and the World Primitives are for reactive programming in Scheme). So: programming Android phones in a reactive Scheme dialect. Or even shorter: parentheses for Android.

icylisper.in - jark. Hmm, yet another of many solutions for Clojure that enables simplified deployment of Clojure scripts, complete with persistent VM and #! support. Somehow, there seem to be quite a few of these lately.

Pygame Subset for Android. Wow - there is a PyGame subset for Android. Usage is a bit clunky because there is no IDE - you have to place the files on the SD card (hmm - a Nexus S doesn't have an SD card, where do they go there?) and edit them some other way.

android-scripting - Scripting Layer for Android brings scripting languages to Android. Interesting project with which you can run various scripting languages on Android phones. Support for Shell, Python, Perl, Ruby, Lua, TCL and JavaScript is already included. For me, Python is of course particularly interesting. Especially because the Android API is made available - so you can play around with it directly, either interactively or from scripts.

Scala 2.9.0 RC3 | The Scala Programming Language. Hmm, especially the parallel collections sound interesting - so to speak map/reduce for multicore on local data structures.

jQuery: » jQuery 1.6 Released. Regarding jQuery - a new version has been released. I personally find this .attr vs. .prop change somewhat unpleasant - it could bite me in a few places where I work directly with input fields (various widget code in a rather heavy Django application). Of course, it's great that it gets faster - faster is almost always good.

jgrowl. Definitely worth checking out, because our homegrown notifications are just not as nice or as stable. jGrowl makes a much better impression, and as a jQuery plugin it should not collide with our jQuery codebase either.

PyPy Status Blog: PyPy 1.5 Released: Catching Up. Yay! PyPy is now on par with CPython 2.7! And again a few additional performance improvements. Moreover, the interface for CPython extension modules (i.e. those not written in Python) has been improved; the first successes are Tkinter and IDLE.

spock - The Chicken Scheme wiki. If Dylan on JavaScript doesn't suit you, how about Scheme? What's interesting here is the connection to Chicken Scheme - Chicken Scheme is one of the more interesting recent Scheme implementations, one that specifically focuses on integration into normal system environments (FFI and easy linking with C libraries), so we can expect something similar from Spock with regard to JavaScript. And the documented functions already look quite good - not just a toy implementation, but apparently already a lot of functionality.

ralph. And if JavaScript under Flusspferd becomes too dull for someone, they can simply install Ralph and then have a Dylan-like Lisp that compiles its function definitions to JavaScript. Why one would want that, who knows - maybe just because it's possible.

Flusspferd - CommonJS platform | Javascript bindings for C++. For those who want to play with JavaScript completely outside the client world, Flusspferd might be interesting. It is a REPL for JavaScript and various JavaScript libraries (which are oriented towards CommonJS).

PDP-11 emulator. In JavaScript. Runs Unix Version 6. Yes, just like that, with disk access and all the well-known programs from back then. Because there aren't enough strange things already.

iPhone Location Data Again

Once again on Apple's response to the movement profile allegations, and why Apple is right but there is still a problem (albeit one that is significantly smaller than the problem dramatized in the press).

Apple builds a database of position data from iPhones with GPS enabled, in which the positions of networks are stored. The data is collected anonymously; so far there are no indications that it is not. Networks in this context are the GSM and 3G cell towers and the WLANs that the iPhone sees at that moment. However, this is not what is stored in the database that everyone is talking about. It is only the basis from which the contents of that database are derived.

The data sent to Apple is averaged internally and a "center" is determined for the networks reported by various iPhones (since the exact position of WLAN routers or radio masts is not simply provided - this must first be determined in some way). This data is stored in a large database at Apple. The position data therefore refers to the center of radio identifications. The original position data is only basic material for the determined position data.

The iPhone can now determine an approximate position via the visible radio identifications and their position information and a weighted average of the data based on transmission strength - but internet access is required for this. And internet access to the database at Apple. Therefore, the iPhone downloads the information about radio identifications and caches this locally. But of course not the entire database - that would be too much. Rather, a relevant excerpt determined by algorithms. This is now the database on the iPhone.

Apparently, Apple not only downloads the networks that the iPhone currently sees, but also neighboring networks - which makes sense, as the user moves around more often and the data from neighboring networks will be needed (potentially - the iPhone does not know in advance where I am going). Presumably, the iPhone will say "I see networks A, B, C" and the database will then provide "here are the networks A-M from the metropolitan area where you are located". The iPhone then takes X% of A, Y% of B and Z% of C as a basis and calculates a rough position and says "here I am". If it then moves into the visibility of network D, its position is already known and the iPhone can perform the position calculation directly without downloading.
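
The position estimate described above could be sketched roughly like this. This is my guess at the principle, not Apple's actual algorithm: the Network class, the estimate_position function, the signal weights and the coordinates are all made up for illustration, and averaging latitude and longitude linearly is only a reasonable approximation over small areas.

```python
from dataclasses import dataclass


@dataclass
class Network:
    """A cached radio identification with its estimated center (hypothetical)."""
    name: str
    lat: float
    lon: float


def estimate_position(visible, cache):
    """Weighted average of the cached centers of all visible networks.

    visible maps a network name to a signal weight (assumed: larger = stronger).
    cache maps a network name to a Network downloaded earlier.
    Returns (lat, lon), or None if no visible network is in the cache,
    which is the case where a download would be needed.
    """
    total = lat = lon = 0.0
    for name, weight in visible.items():
        network = cache.get(name)
        if network is None:
            continue  # unknown network, not cached yet
        total += weight
        lat += weight * network.lat
        lon += weight * network.lon
    if total == 0.0:
        return None
    return lat / total, lon / total


# Cached excerpt: more networks than are currently visible (the neighbors).
cache = {
    "A": Network("A", 52.0, 13.0),
    "B": Network("B", 52.2, 13.4),
    "D": Network("D", 52.5, 13.9),
}
# Only A and B are visible right now, B three times as strong as A.
position = estimate_position({"A": 1.0, "B": 3.0}, cache)
print(position)
```

Network D is not used for the current estimate, but because it is already in the cache, the phone can use it as soon as it comes into range without downloading anything.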

In addition, the iPhone seems to store a temporal history of these downloads - presumably the developer assumed that if the user has been somewhere before, there is a high chance that he will go there again. For this purpose, the iPhone keeps the data for one year. Apple's claim that the storage duration is a bug is certainly somewhat of an embellishment - presumably a developer simply made up a duration and used it without considering how long would really be sensible. After all, in his understanding this was no special data, just a technical cache for downloads that happen anyway when the user asks for his position.

What does this mean for the user? The coordinates in the data do not show where he was - they only show where the radio identifications are that he was roughly in the vicinity of. And since the data also contains neighboring networks, this is really very approximate. Of course, a rough spatial profile of the user can be derived from this - for example, in my data I can indeed see that I have been in Amsterdam, in Frankfurt and in Berlin.

Conversely, it also means that only those approximate regions are included where you had network reception with the ability to download. I was in Copenhagen - there I had network access via the hotel, so traces of it are present. In Malmö, and in Russia at the turn of the year, I did not have network access - GSM yes, but no internet access - and therefore the iPhone could not reach this location data and could not download radio identifications with positions. So this data is completely missing from my iPhone and there are no traces of Malmö, Yekaterinburg or Nizhny Tagil (the same should apply if you have activated airplane mode or simply turn off WLAN and mobile data).

Furthermore, the areas should become larger in more rural regions - few WLANs, so mainly GSM cells, and these have a larger range and are more scattered. If you store a cell together with its neighbors, that already covers a fairly large area. In large cities, on the other hand, the covered area should be significantly smaller, simply because WLANs have much smaller ranges and there are more of them. Radio cells there are also usually smaller (a cell can only serve a finite number of users, and the user density in cities is higher).

This is particularly interesting for programmers: when you program, do you think about what can be derived from cached data? As a thought experiment, assume that someone has access to your DNS cache - which every system has internally, simply to reduce DNS queries. What picture of you could this technically harmless information paint? These are the small pitfalls that programmers like to stumble over. It all starts out harmless - auxiliary data that you fetch from the network. Throw it away after use? Well, if it is needed again, it makes sense to keep the most frequently used entries ready, right? And that is exactly when you run into problems like the one Apple currently has.

The discussion with your wife about why your browser cache contains porn pictures, should she find them there, could get quite interesting (for example because you read your mail with Outlook, opened a spam mail and had image display enabled - not a far-fetched situation!). The data no longer shows why it ended up where it did.

As stated in the title: I am referring here to the answer from Apple and have only checked this with my own data. My own data matches the information from Apple's statement and this statement itself is also consistent - both the contents and the specification of the use match quite well. I therefore see no reason why I should distrust the statement.

Apple's answer that the iPhone does not record the user's motion profile is therefore correct - it simply stores information for a position determination as an alternative to GPS. At the same time, however, it is at least a profile of the stay in large areas. Criticism is therefore quite appropriate. But in my opinion, it should be more intelligent than "Apple stores the user's positions in the last year", because this is simply wrong.

But as Apple says in the introduction to the answer: these are technical relationships that are more complicated than simply "does Apple store a motion profile Yes/No". And our press has massive problems with questions to which an answer contains more than two sentences. "Apple stores data from which the presence in large areas can be derived" does not sound so great and catchy as a headline.

Unfortunately, this very imprecise reporting can itself cause problems. If I know that the data only covers regions where I have been, not the precise places I visited, then explaining why my data from Frankfurt also includes the red light district (it is simply near the train station) is much easier than if I have to assume that these are all places where I have actually been.

Apple must (and will, according to its own statement) improve this - caching data for a year is nonsense. Backing up the data is also nonsense; it can simply be downloaded again if it is missing. Similarly, the data does not need to be stored if all location services are globally deactivated. It might also be generally interesting to have a switch "Pseudo-GPS Yes/No" or something like that, with which this type of position determination can be deactivated - then the user simply has to wait until a GPS fix has been acquired. Likewise, in my opinion, the anonymous data collection for WLANs and cell towers should be switchable.

In my opinion, no cache should exist without a control function for this cache (just as you can also empty the browser cache). Because one thing must be clear: due to the general necessity of linking access time and loaded data (because only in this way can a cache with temporary storage function), every type of cache provides a kind of user profile. And this should be at least rudimentarily controllable by the user (in the sense of deleting). Setting up caches fundamentally with a clear function and a UI for this should become just as much a best practice as the encrypted storage of passwords on servers (hello Sony!).
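
As a minimal sketch of what I mean by a cache with a control function: entries carry their access time (which is exactly what makes them profile-like), they expire after a fixed lifetime, and there is an explicit clear() that a UI could call. The class and method names are made up for illustration; the injectable clock is only there to make the behavior easy to test.

```python
import time


class ControllableCache:
    """Cache with a bounded entry lifetime and a user-facing clear()."""

    def __init__(self, max_age_seconds, clock=time.time):
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def put(self, key, value):
        # Storing the timestamp is necessary for expiry, and it is also
        # what turns every cache into a small user profile.
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._max_age:
            del self._entries[key]  # expired: forget it instead of keeping it
            return None
        return value

    def purge_expired(self):
        """Drop all expired entries and return how many were removed."""
        now = self._clock()
        expired = [key for key, (stored_at, _) in self._entries.items()
                   if now - stored_at > self._max_age]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def clear(self):
        """The control function: wipe everything on user request."""
        self._entries.clear()

    def __len__(self):
        return len(self._entries)
```

A "clear cache" button then only needs to call clear(), and a periodic purge_expired() keeps old entries from piling up for a year.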

kiorky/spynner. Wow, that sounds really interesting - a programmatic (i.e., without a user interface) web browser based on QtWebkit as a Python extension. The advantage? Since a full web engine is underneath, you can use all the features of the web browser - for example, client-side JavaScript and all the other things used in web applications. This could be very interesting for automated testing of web applications - or for scraping more complex websites.

IgniteInteractiveStudio/SLSharp. For .NET - write GLSL shaders in C#, the IL code is then automatically loaded onto the GPU. High-performance computing, anyone?

IronScheme. Interesting - a Scheme for .NET. And unlike some dead projects I found, something seems to be happening here. Ok, I probably tend more towards IronPython, F# or if it's supposed to be Lisp, Clojure for .NET (there are now quite up-to-date binary packages to try out, unfortunately probably only Windows, at least it spits out errors under Mono).

F Sharp Programming - Wikibooks, open books for an open world. Seems to be quite a nice basic overview of F# - especially for those who don't already have prior experience (e.g. from OCaml).

Home - Redline Smalltalk - Smalltalk for the Java Virtual Machine. Not very far along yet, but could become interesting at some point - and as an old Smalltalk fan, I naturally have to bookmark it on the blog.

The plan for mods : The Word of Notch. This is how other game studios should handle mods. Don't sue the people who build on your game, but openly welcome them. Notch even releases the entire source code for mod developers.

tvON / python-wordpress. And to get posts and images into WordPress, I could work with this - a Python library that provides various WordPress functions. However, it comes in different versions, in different states of non-maintenance, so I have to go through it and see if everything runs as I want it to.

Backing Up Flickr. Because I just stumbled over it (I'm looking for ways to automatically push Flickr uploads to the WordPress media library, preferably from the server, without always having to intervene manually). For this, I would actually have to marry it with WordPress functions (it is a Python script that backs up Flickr images to directories). The backup functionality works, by the way. Maybe not such a bad idea to back up your Flickr account from time to time ...

Jess, the Rule Engine for the Java Platform. If you ever need a rules engine for Java: Jess is based on the core ideas of CLIPS, which has been around for quite some time now (since around the mid-80s), but integrates into the Java world. An alternative would be Hammurabi, a rules engine written in Scala that offers an integrated DSL built with Scala's own language features.

Evolutie test. An evolutionary algorithm in JavaScript with visualization in processing.js - it starts with a random string, the fitness function is the edit distance to the target string, and evolution does the rest. The visualization shows the spread of the population and its convergence towards the entered target string.
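
The principle fits in a few lines. The following is my own sketch in Python, not a port of the linked JavaScript: the population size, the survivor fraction, the alphabet and the mutation operators are arbitrary assumptions, and the target is assumed to be non-empty.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # replace or match
        prev = cur
    return prev[-1]


def mutate(s, rng):
    """Apply one random insert, delete or replace to the string."""
    op = rng.choice(("replace", "insert", "delete"))
    pos = rng.randrange(len(s) + 1)
    if op == "insert" or not s:
        return s[:pos] + rng.choice(ALPHABET) + s[pos:]
    pos = min(pos, len(s) - 1)
    if op == "delete":
        return s[:pos] + s[pos + 1:]
    return s[:pos] + rng.choice(ALPHABET) + s[pos + 1:]


def evolve(target, population_size=50, generations=500, seed=0):
    """Return the best string found and the best distance per generation."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET)
                for _ in range(rng.randint(1, 2 * len(target))))
        for _ in range(population_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=lambda s: edit_distance(s, target))
        history.append(edit_distance(population[0], target))
        if population[0] == target:
            break
        # The best fifth survives unchanged, the rest are mutated copies.
        survivors = population[:population_size // 5]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda s: edit_distance(s, target))
    return best, history


best, history = evolve("hello world")
print(best, history[0], history[-1])
```

Because the survivors are carried over unchanged, the best distance can never get worse from one generation to the next, which is the convergence you see in the visualization.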

Re: Factor: Mail with GUI. Nice to see how a more general approach to GUIs makes the code nice and compact - this whole thing reminds me very much of CLIM in terms of structure.

Re: Factor: XKCD. If you want to get an impression of one of the crazier languages - John Benediktsson's blog has a lot of example snippets in Factor, which are usually directly usable in the Factor REPL (or create manageable vocabulary extensions). I am always impressed by the compactness of Factor code. John's code also has the advantage that I can usually understand what is happening - Slava's code, for example, is often much more idiomatic and therefore cryptic for me. But this is certainly also due to the fact that Slava usually writes about the internals of the language, while John simply describes small tricks.

Akka Project. And I definitely had that on the old blog before, but never mind, everything is repeated on TV all the time. And a lot has happened with Akka lately, and it is increasingly establishing itself as the future platform for fault-tolerant systems on the JVM. Many parallels with Erlang's ideas, but with the broader JVM-typical platform (there is hardly anything for which there is not some Java class library and thus also for Scala). Very interesting: Akka brings an implementation of Software Transactional Memory for the Java platform.

Programming Scala. I think I already had this one, but never mind: the second freely available online book about Scala that I stumbled upon today. You can also read it alongside the other one, but it covers a similar language version (i.e., before 2.8).

ScalaQuery. Yes, it's Scala day today. One of the things I was missing so far was a good database integration that also makes use of the DSL features and type safety of Scala. So not just sending SQL around via JDBC, but something like LINQ, only for Scala. This looks quite nice.

Programming in Scala, First Edition. And since I have Scala on my mind: the first edition of Programming in Scala is now freely available on the web. Of course, it lacks some things that came with the current Scala version (the collection libraries in particular are indeed different in 2.8), but it is certainly still a good starting point for reading up on the language.

Scala IDE for Eclipse. Hmm, it seems that the tooling is starting to develop there. I don't have anything against command lines in general and feel more at home on them than in IDEs, but for the general acceptance of a language, IDEs are quite practical. And Scala is still one of the more interesting languages in the JVM environment, even if things have been rather quiet around it lately.

agronholm / jython-swingutils. No idea what I could do with it yet, but if the Java world ever becomes interesting, this could become an interesting GUI library (Swing for Jython).

Code rant: Message Queue Shootout!. Not a real shootout and only an incomplete selection of message queues. But still something interesting as a result: if you have nodes that already have their own persistence and transaction solutions, between which you just want to send messages as quickly as possible - there is nothing better than ZeroMQ. It is - due to its architecture - simply the fastest solution. And we are talking about really drastic differences.

NOSQL Databases. Excellent overview of all available NoSQL databases. Good starting point if you want to inform yourself about the available systems and their orientation and implementation.

visionmedia/asset. After having pip (for Python modules) and jip (for Java libraries), here is an analogous tool for JavaScript libraries. So for the automatic installation of JavaScript libraries in node.js project directories from the command line.

jRumble | A jQuery Plugin That Rumbles Elements. The new blink tag! (okay, there are indeed sensible applications, e.g., if you want to briefly display an element on the webpage to indicate that something has happened there - similar to bouncing icons in the OSX Dock).

Python Package Index : pip 1.0. For the sake of completeness, I'm blogging this even though pip is already a fixed part of the Python infrastructure for me. But maybe one or the other has not yet played around with pip, then now is probably the right time to do so. In my opinion, you should always use it together with virtualenv, because then you can easily install exactly the right dependencies for each project and keep them separate from other projects.

sunng87/jip. I'm not currently doing much with Jython, but jip sounds very practical: it's an analog to pip, but for Java libraries. So a simple command-line tool that downloads the necessary jar files and puts them in the right place. Integrated with virtualenv. Much more pleasant for me than, for example, dealing with Maven or similar tools from the usual Java infrastructure.

Exploring Beautiful Languages: A quick look at APL. Simply because APL has always fascinated me. Even if, based on my own experiences with it, I never want to be in the position of having to maintain an APL program - for me, APL is the quintessential write-only language.

How I learned to stop worrying and write my own ORM. A bit of background information on why Dapper was developed and which use cases it solves - it is essentially used where direct SQL access was previously "tunneled" over Linq for performance reasons, because Linq2Sql is inefficient there.

Dapper-dot-net - Simple SQL object mapper for SQL Server. Could potentially be quite interesting at work. C# also offers Linq, but according to their measurements, Dapper seems to be significantly optimized for performance.

philikon / python-weave client is even more interesting than the other tool: a Python library for accessing Mozilla Sync. With this, I could build various small tools that automatically mix links into bookmarks or extract them from the sync and move them into other bookmark files. Or how about a cron job that takes links from bookmarks in a special group and automatically posts them to the weblog? All sorts of fun things are possible ...

philikon / weaveclient-chromium. Not tried yet; it is a Chrome extension that integrates Mozilla Sync into Chrome and Chromium. With this, you could finally exchange bookmarks between Chrome and Firefox without having to go through XMarks. If someone now also builds this extension for Safari, I would be happy - the fact that I cannot properly sync between browsers, because each one does its own thing, is highly annoying. Mozilla Sync is free to use and behind it is an organization that I trust much more in this area than all the others.

Pipe is a module with infix syntax for chained function calls over potentially lazy streams (internally these are generators). Unlike stream (which I mentioned here before), it does not support parallelism, so it's just syntactic sugar. However, I prefer the sugar from stream (i.e., the syntax) and the parallelism of stream is also more interesting than just providing a slightly different syntax.
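
How such an infix syntax can work in Python is quickly sketched: the right-hand operand implements __ror__, so that "iterable | stage" calls the stage with the iterable. This is a minimal illustration of the trick, not necessarily how the Pipe module implements it; the Pipe class and the stage names where, select, take and as_list are my own choices for this example.

```python
import itertools


class Pipe:
    """Wraps a function so that `iterable | pipe` calls function(iterable)."""

    def __init__(self, function):
        self.function = function

    def __ror__(self, iterable):
        # Called for `iterable | self` when the left side has no usable __or__.
        return self.function(iterable)

    def __call__(self, *args, **kwargs):
        # Bind extra arguments and return a new stage awaiting its input.
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))


@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))


@Pipe
def select(iterable, func):
    return (func(x) for x in iterable)


@Pipe
def take(iterable, n):
    return itertools.islice(iterable, n)


@Pipe
def as_list(iterable):
    return list(iterable)


# Lazy all the way: the infinite counter is only consumed as far as needed.
result = (itertools.count(1)
          | where(lambda x: x % 2 == 0)
          | select(lambda x: x * x)
          | take(3)
          | as_list)
print(result)  # [4, 16, 36]
```

Every stage is a generator, so nothing is computed until as_list pulls the values through the chain. That is all there is to it, which is why I call it syntactic sugar.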

markrendle/Simple.Data - GitHub. I should check this out, it looks quite interesting - an ORM for .NET.

Basho: An Introduction to Riak. I should take a closer look at this; it has quite a clean and simple architecture, and all nodes in the system are equal (similar to Cassandra). The whole thing is written in Erlang, but what is interesting is the MapReduce interface: functions can be supplied as JavaScript code and the communication goes over a simple JSON interface.

HBase vs Cassandra: why we moved « Dominic Williams. Not entirely uninteresting blog post that dares to compare Hadoop/HBase with Cassandra and tries to highlight the different focuses. His conclusion: HBase is more for warehousing, Cassandra more for transaction processing. That alone would make something like Brix even more interesting, if it could really combine these two aspects.