"Clever" date formatting accessibility

November 10, 2005
Last night I wrote a little function that tries to show dates cleverly: it compares the date with today's date and formats it differently depending on how old it is.

If the date is today it just says "Today 10:00" and for yesterday it says "Yesterday 10:00". If it's within a week it shows it like this: "Thursday 10:00". If the date is older than about 30 days it skips the time part and just shows "13-May 2005", and for anything else (i.e. > 7 and < 30 days) it shows the whole thing like this: "13-Oct 2005 10:00".
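The rules above can be sketched roughly like this. It's only a minimal sketch, not the function from the demo: the name clever_date, the optional now parameter and the exact boundaries (what counts as "within a week" and "about 30 days") are my assumptions.

```python
from datetime import datetime


def clever_date(when, now=None):
    """Format `when` relative to `now` (hypothetical helper, boundaries assumed)."""
    now = now or datetime.now()
    # Compare calendar days, not 24-hour periods
    days = (now.date() - when.date()).days
    clock = when.strftime('%H:%M')
    if days == 0:
        return 'Today ' + clock
    if days == 1:
        return 'Yesterday ' + clock
    if 1 < days < 7:
        # Within a week: weekday name plus time
        return when.strftime('%A ') + clock
    if days > 30:
        # Old dates: skip the time part
        return when.strftime('%d-%b %Y')
    # Everything else (including future dates): show the whole thing
    return when.strftime('%d-%b %Y ') + clock
```

Passing now explicitly makes it easy to try out, e.g. clever_date(datetime(2005, 11, 9, 10, 0), now=datetime(2005, 11, 10, 12, 0)). Note that the weekday and month names follow the current locale.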

What do you think about this from a usability/accessibility point of view? One counterargument I have is that if you print a page where it says "Today 12:22" and leave that printed paper for a few days, what "Today" means will change.

To demonstrate it, I've put together a little demo page so that you can get a feel for how it works. Please let me know what you think.

Whitelist blacklist logic

November 2, 2005
Tonight I needed a little function that lets me define a list of whitelisted email addresses and a list of blacklisted email addresses. These are then "merged" in a function called acceptOriginatorEmail(emailaddress) which is used to see if a particular email address is acceptable.

I've never written something like this before so I had to reinvent the wheel and guess my way towards a solution. My assumptions are that you start with the whitelist and return True on a match there, then you check against the blacklist and return False on a match, and default to True if no match is made.

This makes it possible to define which email addresses should be accepted and which ones should be rejected like this:

whitelist = ('*', '')
blacklist = ('*',)
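Here is a minimal sketch of that order of checks. It's an assumption on my part that the patterns are shell-style wildcards matched with fnmatch, and that the two lists are passed in as parameters; the example.com/example.org addresses in the usage note are made up for illustration.

```python
from fnmatch import fnmatchcase


def acceptOriginatorEmail(emailaddress, whitelist=(), blacklist=()):
    """Return True if the address is acceptable (sketch, wildcard matching assumed)."""
    address = emailaddress.strip().lower()
    # 1. Whitelist wins: any match means accept straight away
    for pattern in whitelist:
        if fnmatchcase(address, pattern.lower()):
            return True
    # 2. Otherwise a blacklist match means reject
    for pattern in blacklist:
        if fnmatchcase(address, pattern.lower()):
            return False
    # 3. No match at all: default to accept
    return True
```

With whitelist=('boss@example.com',) and blacklist=('*@example.com',), the address boss@example.com is accepted, everybody else at example.com is rejected, and any other domain is accepted by default.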

Using MD5 to check equality between files

October 28, 2005
To some Python users this is old-school old-news stuff but since I've never used it before I found it worth mentioning.

I have a script that scans a rather large tree of folders filled with files. None of the folders have the same name but they can mistakenly contain the same files, e.g.:

folder XYZ-2005-11-27/
folder CBA-2005-07-10/

Sometimes two different folders contain exactly the same file names. Sometimes the file sizes are equal too. But in some of those cases, even though the file sizes and names are the same, they are different files. But! If they are the same files just in different locations I want to find them. How to do that?
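One way to answer that is to compare MD5 digests of the file contents. A minimal sketch, assuming the standard hashlib module; the helper names md5_of_file and same_content are hypothetical and not taken from my script:

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to save memory."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Since hashing every file is slow on a large tree, it probably makes sense to compare the file sizes first and only compute digests for files whose sizes are equal.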

"Increment numbers in a string"

October 20, 2005
I've just uploaded my second Python Cookbook recipe. It's unfortunately not rocket science but its application is potentially very useful. With this little function you can generate the next number in a string that contains at least one number.

The mini unittest is quite interesting perhaps:

$ python
from 10dsc_0010.jpg to 10dsc_0011.jpg
from dsc_9.jpg to dsc_10.jpg
from 0000001.exe to 0000002.exe
from ref-04851 to ref-04852
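For reference, here is a minimal sketch that reproduces the output above. It is not the recipe itself; the function name increment_last_number and the regular-expression approach are assumptions. It bumps the last run of digits in the string and keeps any zero padding.

```python
import re

# Last run of digits, followed only by non-digits up to the end of the string
_LAST_NUMBER = re.compile(r'(\d+)(\D*)$')


def increment_last_number(s):
    """Return `s` with its last number increased by one (hypothetical helper)."""
    def bump(match):
        digits, tail = match.groups()
        # zfill keeps the original width, e.g. 0010 -> 0011, but 9 -> 10
        return str(int(digits) + 1).zfill(len(digits)) + tail
    return _LAST_NUMBER.sub(bump, s, count=1)
```

A string without any digits is returned unchanged.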

Playing with Reverend Bayesian

October 19, 2005
I've been playing around with Reverend a bit to get it to correctly guess appropriate "Sections" for issues on the Real issuetracker. I downloaded all 140 issue texts and their "Sections" attribute, which is a list (often of length 1). From this dataset I looped over each text and the sections within it (skipping the default section General), so something like this:

from reverend.thomas import Bayes

guesser = Bayes()

data = ({'sections': ['General', 'Installation'],
         'text': "bla bla bla..."},
        {'sections': ['Filter functions'],
         'text': "Lorem ipsum foo bar..."})
for item in data:
    # skip the default section, every issue can have it
    secs = [each for each in item['sections'] if each != 'General']
    for section in secs:
        guesser.train(section, item['text'])

Dream: python bindings for squidclient

October 11, 2005
At the moment I'm not running Squid for this site but if experimentation time permits I'll have it running again soon. One thing I feel uneasy about is how to "manually" purge cached pages that need to be updated. For example, if you read this page (and it's cached for one hour) and post a comment, then I'd like to re-cache this page with a purge. Setting an HTTP header could be something, but that I would only be able to do on the page where you have this in the URL:


which, because of the presence of a querystring, is not necessarily cached anyway. The effect is that as soon as the "?msg=Comment+added" is removed from the URL, the viewer will see the page as it was before she posted her comment. squidclient might be the solution. ...sort of.
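Until real bindings exist, a poor man's version could simply shell out to squidclient. This is only a sketch under assumptions: the function names are hypothetical, Squid is assumed to listen on localhost:3128, and its ACLs must allow the PURGE method from this machine.

```python
import subprocess


def purge_command(url, host='localhost', port=3128):
    """Build the squidclient command line that asks Squid to purge `url`."""
    return ['squidclient', '-h', host, '-p', str(port), '-m', 'PURGE', url]


def purge(url, host='localhost', port=3128):
    """Run squidclient; requires the binary to be installed and on PATH."""
    return subprocess.run(purge_command(url, host, port),
                          capture_output=True, text=True)
```

The idea would be to call purge() with the plain page URL (without the querystring) right after a comment has been saved, so the next viewer gets a freshly rendered page.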

Ruby and Python benchmarked

September 25, 2005
Some Ruby (the programming language) blogger has tried to implement the Edit-Distance algorithm in three different ways in both Python and in Ruby. His conclusion is more steered towards the difference between the algorithms and not so much the language. I guess that's fair because the implementation might be done better in Python than it was in Ruby. But, please notice that this is a Ruby coder and yet his Python implementations are always faster.

A quick glance tells me that the Python programs run about 3 times as fast as the Ruby ones. See the benchmarks

Smurl from Python

September 22, 2005
If you thought the Web Service example on the Smurl about page was complicated, here's a much simpler version. I use this code on my very own site for the email notifications, which can contain long URLs.

Feel free to steal this code into your own projects:

from urllib import urlopen, quote

# variable 'url' is defined elsewhere
if len(url) > 80:
    url = urlopen('' % quote(url)).read()

Was that hard?

Python regular expression tester

September 19, 2005
I've just discovered retest by Christof Hoeke, which is a developer's tool for testing and experimenting with regular expressions. It doesn't have a desktop GUI; instead it uses SimpleHTTPServer to serve a web interface on http://localhost:8087 that uses AJAX to make the interface snappier. You use this if you feel uncertain about how to write your regular expression and need a helpful sandbox to play in.

This is cool because as an application it's very modern. The source code is only 100 lines of Python code, some JavaScript code for the AJAX and a relatively simple HTML page. A genuine GUI app would be considerably more code but would admittedly run faster. However, considering how "basic" this application is, speed is not an issue.

Random ID generator for Zope

September 2, 2005
I was working on a little application that is similar to (where any URL gets a unique random id that can be used to redirect with long clumsy URL strings) and came up with this little algorithm that I wanted to share with the world for some feedback.

There are two loops: an inner loop nested inside one big loop that goes on forever. The inner loop creates a list of possible combinations from a sample of letters and numbers, e.g. (abc, acb, bac, bca, cab, cba). For each such combination it checks whether it has been saved before, and if not it exits the loop like this:

if not hasattr(self, combination):
    return combination
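Outside of Zope the same idea can be sketched like this. The name generate_id, the alphabet and the id length are assumptions, and a plain set of already used ids stands in for the hasattr(self, combination) check above.

```python
import itertools
import random
import string

ALPHABET = string.ascii_lowercase + string.digits


def generate_id(taken, length=3):
    """Return a random id that is not in `taken` (sketch, loops forever if all are used)."""
    while True:
        # Pick a random sample of distinct characters...
        sample = random.sample(ALPHABET, length)
        # ...and try every ordering of it before drawing a new sample
        for combination in itertools.permutations(sample):
            candidate = ''.join(combination)
            if candidate not in taken:
                return candidate
```

The caller is responsible for adding the returned id to taken. Like the description above, this never terminates once every possible id has been used, so a real version would want to grow length at some point.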

