Most unusual letters in English language

May 12, 2009
11 comments Python

I needed to find out which letters are used least in the English language. I pulled down a list of more than 100,000 English words, split them all into a list of about 1,000,000 letters, sorted the letters by usage and came up with this result, ordered from most used to least used:


esiarntoldcugpmhbyfkwvzxjq

It would be interesting to make a heatmap of this over an image of a QWERTY keyboard.
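
The counting step can be sketched roughly as follows. This is a minimal illustration, not the original script: the function name `letters_by_frequency`, the ASCII-only filter and the alphabetical tie-break are all assumptions.

```python
from collections import Counter


def letters_by_frequency(words):
    """Return the letters a-z found in `words`, most used first."""
    counts = Counter()
    for word in words:
        # Count only ASCII letters; ignore case, hyphens, apostrophes etc.
        counts.update(ch for ch in word.lower() if "a" <= ch <= "z")
    # Sort by descending count; ties are broken alphabetically (arbitrary choice).
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ordered)


if __name__ == "__main__":
    print(letters_by_frequency(["see", "tea"]))  # prints "east"
```

Fed a real word list (one word per line in a file, say), the same function would produce a 26-letter string like the one above, though the exact order depends on the list used.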

To JSON, Pickle or Marshal in Python

May 8, 2009
4 comments Python

I was reading David Cramer's tip to use JSONField in Django to store arbitrary fields in a SQL database. Nice. But is it fast enough? Well, I can't answer that, but I did look into the difference in read/write performance between simplejson, cPickle and marshal.

Only reading:


JSON 0.00593531370163
PICKLE 0.0109532237053
MARSHAL 0.00413788318634

Reading and writing:


JSON 0.0434390544891
PICKLE 0.0289686655998
MARSHAL 0.00728442907333

Clearly marshal is faster but to quote the documentation:

"Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."

Clearly simplejson is a very fast reader and the JSON format has the delicious advantage that it's "human readable" (compared to the others).

NOTE! I spent about 5 minutes putting together the script and about 10 minutes writing this, so feel free to doubt its scientific accuracy.
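
For anyone who wants to repeat the comparison, here is a rough sketch of such a benchmark. It is not the original script: it assumes a modern Python where the standard-library `json` and `pickle` modules stand in for simplejson and cPickle, and the sample data and the `benchmark` function are made up for illustration.

```python
import json
import marshal
import pickle
import timeit

# Hypothetical sample payload; the original test data is not known.
SAMPLE = {"name": "Peter", "tags": ["python", "json"], "scores": list(range(100))}

CODECS = {
    "JSON": (json.dumps, json.loads),
    "PICKLE": (pickle.dumps, pickle.loads),
    "MARSHAL": (marshal.dumps, marshal.loads),
}


def benchmark(data, number=1000):
    """Return {name: (read_seconds, read_and_write_seconds)} per serializer."""
    results = {}
    for name, (dump, load) in CODECS.items():
        blob = dump(data)
        # Sanity check: the round trip must give back the same data.
        assert load(blob) == data
        read = timeit.timeit(lambda: load(blob), number=number)
        read_write = timeit.timeit(lambda: load(dump(data)), number=number)
        results[name] = (read, read_write)
    return results


if __name__ == "__main__":
    for name, (read, read_write) in benchmark(SAMPLE).items():
        print(name, read, read_write)
```

The numbers will vary with the machine, the Python version and the shape of the data, so treat any ranking it produces as indicative only.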

Git + Twitter = Friedcode

April 22, 2009
10 comments Python, Linux

I've now written my first Git hook. If you don't know what Git is, you have either lived under a rock for the past few years or you're not into computer programming at all.

The hook is a post-commit hook that sends the last commit message to a Twitter account I called "friedcode". I guess it's not entirely useful, but if you want to be loud about your work and the progress you make, it can make sense. The same goes if you're a team and want a brief overview of what your teammates are up to. For me, it was mostly an experiment to try Git hooks and pytwitter. Here's how I did it:

Truncated! Read the rest by clicking the link below.

Too much Python makes Peter a shit Javascript developer

March 13, 2009
0 comments JavaScript

This murdered a good half hour of my time splattered with lots of alert() statements to debug. Basically, in Firefox you can do this:


var word = "Peter";
alert(word[1]); // "e" in Firefox, undefined in IE

This is the wrong way to get a character from a string in JavaScript. The correct way is to use charAt() like this:


var word = "Peter";
alert(word.charAt(1)); // "e" in Firefox and IE

I don't know about the other browsers but finally Crosstips.org now works in IE7 too. I haven't even looked at it in IE6 and don't intend to either.

To $('#foo p') or to $('p', $('#foo'))

February 24, 2009
2 comments JavaScript

For performance-minded jQuery users, please check out this thread.

For the impatient, read Stephen's reply. He benchmarked what I asked about and concluded that $("p", $("#foo")) is much faster in jQuery 1.3.2. I've been coding in this style in jQuery for all recent projects, so I'm happy with this outcome.

UPDATE

John Resig himself joined in on the discussion and had this to say:

"You should always use $("#foo").find("p") in favor of $("p", $("#foo")) - the second one ends up executing $(...) 3 times total - only to arrive at the same result as doing $("#foo").find("p")."

UPDATE 2

Not only did John join in on the discussion, it also prompted him to work on jQuery 1.3.3 (not yet released at the time of writing) so that you get the same performance whichever format you use. See the benchmark here.

To assert or assertEqual in Python unit testing

February 14, 2009
17 comments Python

When you write unit tests in Python you can use these methods:


self.assertEqual(var1, var2, msg=None)
self.assertNotEqual(var1, var2, msg=None)
self.assertTrue(expr, msg=None)
self.assertRaises(exception, func, para, meters, ...)

That's fine, but is it "pythonic" enough? The alternative is to do it with "pure python". E.g.:


assert var1 == var2, msg
assert var1 != var2, msg
assert expr, msg
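try:
   func(para, meter)
except exception:
   pass
else:
   # Raised outside the try block, so it can't be swallowed by the except
   raise AssertionError('expected exception was not raised')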

I'm sure there are several benefits to using the unittest methods that I don't understand, but I do understand the benefits of brevity and readability. The more tests you write, the more tedious it becomes to write self.assertEqual(..., ...) every time. In my own code I prefer simple assert statements to the verbose unittest alternative. Partly because I'm lazy and partly because they read better, and the word assert is highlighted in red in my editor so it just looks nicer from a distance.

Perhaps some much more clever people than me can explain what a cardinal sin it is to not use the unittest methods over the lazy more pythonic ones.

Incidentally, while jotting down this blog post I reviewed some old inherited code and changed this:


self.assertEqual(len(errors),0)

into this:


assert not errors

Isn't that just nicer to use/read/write?
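
One practical difference between the two styles is the failure message. The sketch below puts both side by side in one test case; `find_errors` and `failure_message` are hypothetical helpers invented for the illustration. A bare assert with no message fails with an empty AssertionError, whereas assertEqual reports both values. Plain assert statements are also skipped entirely when Python runs with the -O flag.

```python
import unittest


def find_errors(values):
    """Hypothetical helper: treat negative numbers as 'errors'."""
    return [v for v in values if v < 0]


class ErrorsTest(unittest.TestCase):
    def test_unittest_style(self):
        self.assertEqual(len(find_errors([1, 2, 3])), 0)

    def test_plain_assert_style(self):
        errors = find_errors([1, 2, 3])
        # Passing the list as the message shows it if the assert fails.
        assert not errors, errors

    def test_raises(self):
        # assertRaises also works as a context manager.
        with self.assertRaises(ZeroDivisionError):
            1 / 0


def failure_message(check):
    """Run `check` and return the AssertionError text, or None if it passes."""
    try:
        check()
    except AssertionError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    case = ErrorsTest("test_unittest_style")
    print(failure_message(lambda: case.assertEqual(1, 2)))  # prints "1 != 2"
    unittest.main()
```

Whether the richer message is worth the extra typing is a matter of taste. Giving the plain assert a message, as in test_plain_assert_style, narrows the gap.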

Formatting numeric amounts in Javascript

January 16, 2009
1 comment JavaScript

Dear Lazyweb,

Is there a better method than this to format numeric amounts? Here's a solution I picked up from somewhere and slightly modified. It's heavily string based but passed the tests:


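function format_amount(i) {
  // Treat anything that isn't a number as zero
  if (isNaN(i)) { i = 0; }
  var minus = (i < 0) ? '-' : '';
  i = Math.abs(i);
  // Round to the nearest cent (Math.floor instead of parseInt on a number)
  i = Math.floor((i + 0.005) * 100) / 100;
  // Declared with var so that s doesn't leak out as a global
  var s = String(i);
  // Pad to exactly two decimals
  if (s.indexOf('.') < 0) { s += '.00'; }
  if (s.indexOf('.') == (s.length - 2)) { s += '0'; }
  return minus + s;
}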

The "tests" are:


format_amount(100)       == "100.00";
format_amount(100.0)     == "100.00";
format_amount(100.05)    == "100.05";
format_amount(100.051)   == "100.05";
format_amount(-100)      == "-100.00";
format_amount(-100.0)    == "-100.00";
format_amount(-123.45)   == "-123.45";
format_amount(-123.450)  == "-123.45";

So functionally it's OK but I'm not sure it's the best way to do it.

bool is instance of int in Python

December 5, 2008
15 comments Python

I lost about half an hour just moments ago debugging this and pulling out a fair amount of hair. I had some code that looked like this:


result = []
for key, value in data.items():
   if isinstance(value, int):
       result.append(dict(name=key, value=value, type='int'))
   elif isinstance(value, float):
       result.append(dict(name=key, value=value, type='float'))
   elif isinstance(value, bool):
       result.append(dict(name=key, type='bool',
                          value=value and 'true' or 'false'))
...

It looked so simple but further up the tree I never got any entries with type="bool" even though I knew there were boolean values in the dictionary.

The pitfall I fell into was this:


>>> isinstance(True, bool)
True
>>> isinstance(False, bool)
True
>>> isinstance(True, int)
True
>>> isinstance(False, int)
True

Not entirely obvious if you ask me. The solution in my case was just to change the order of the if and the elif so that bool is tested first.
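
A minimal sketch of the reordered check follows. The function name `describe` and the sample dictionary are made up for illustration. Because bool is a subclass of int, the bool branch has to come before the int branch.

```python
def describe(data):
    """Return one dict per item, tagging each value with its type name."""
    result = []
    for key, value in data.items():
        # bool must be tested before int: isinstance(True, int) is True.
        if isinstance(value, bool):
            result.append(dict(name=key, type='bool',
                               value='true' if value else 'false'))
        elif isinstance(value, int):
            result.append(dict(name=key, value=value, type='int'))
        elif isinstance(value, float):
            result.append(dict(name=key, value=value, type='float'))
    return result


if __name__ == "__main__":
    for entry in describe({'flag': True, 'count': 3, 'ratio': 0.5}):
        print(entry['name'], entry['type'])
```

An alternative is to compare with type(value) is int, which does not match booleans, at the cost of also rejecting other int subclasses.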

domstripper - A lxml.html test project

November 20, 2008
1 comment Python

I'm just playing with the impressive lxml.html package. It makes it possible to easily work with HTML trees and manipulate them.

I had this crazy idea of a "DOM stripper" that removes all but specified elements from an HTML file. For example, you want to keep the contents of the <head> tag intact but only keep the <div id="content">...</div> tag, thus omitting <div id="banner">...</div> and <div id="nav">...</div>. domstripper now does that. This can be used, for example, as a naive proxy that transforms a bloated HTML page into a smaller, stripped-down version suitable for, say, mobile web browsers. It's more a proof of concept than anything else.

To test it you just need a virtual Python environment and the system libraries needed to install lxml. This worked for me:


$ sudo apt-get install cython libxslt1-dev zlib1g-dev libxml2-dev
$ cd /tmp
$ virtualenv --no-site-packages testenv
$ cd testenv
$ source bin/activate
$ easy_install domstripper

Now you can use it like this:


>>> from domstripper import domstripper
>>> help(domstripper)
...
>>> domstripper('bloat.html', ['#content', 'h1.header'])
<!DOCTYPE...
...

Best to just play with it and see if it makes sense. I'm not saying this is an amazing package, but it goes to show what can be done with lxml.html and its extremely user-friendly CSS selectors.

The importance of env (and how it works with virtualenv)

September 18, 2008
8 comments Python

I have for a long time wondered why I'm supposed to use this at the top of my executable Python files:


#!/usr/bin/env python

Today I figured out why.

The alternative, which you see a lot, is something like this:


#!/usr/bin/python

Here's why it's better to use env rather than the direct path to the executable: virtualenv. Perhaps there are plenty of other reasons the Linux experts can teach me but this is now my first obvious benefit of doing it the way I'm supposed to do it.

If you create a virtualenv, enter it and activate it so that writing:


$ python 

starts the python executable of the virtual environment, then this will be respected if you use the env shebang header. Good to know.
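
One way to see this in action is a tiny script that reports which interpreter is running it. This is only an illustrative sketch (the function name `interpreter_info` is made up). Save it, make it executable, and run it directly both inside and outside an activated virtualenv.

```python
#!/usr/bin/env python
"""Report which Python interpreter is running this script."""
import sys


def interpreter_info():
    # sys.executable is the path of the running interpreter;
    # sys.prefix points at the environment it belongs to.
    return {"executable": sys.executable, "prefix": sys.prefix}


if __name__ == "__main__":
    info = interpreter_info()
    print(info["executable"])
    print(info["prefix"])
```

With the env shebang, the printed path should change to the virtualenv's interpreter once the environment is activated. With a hard-coded #!/usr/bin/python it stays the system one.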