Peterbe.com

Peter Bengtsson's blog

Filtered by JavaScript, Python

Page 42

Note to self about Jeditable

November 22, 2007
0 comments JavaScript

I've been struggling hard this morning to get Jeditable to work in IE (6 and 7). Whilst everything was working perfectly fine in Firefox, in IE the clickable editable text would just disappear and never return. The solution was to upgrade to the latest jQuery, 1.2.1. I was using version 1.1.4, which was why it didn't work.

Jeditable is a brilliant plugin with really good configuration options (hint: read the source code's documentation comment). I'll now send an email to Mika about this pitfall and suggest that he includes it in his documentation.

Spellcorrector 0.2

September 24, 2007
3 comments Python

Unlike previous incarnations, Spellcorrector now does not load the two huge language files for English and Swedish by default. Alternatively, or additionally, you can load your own language file. The difference between loading a language file and training on your own words is that trained words are always assumed to be correct.

Another major change with this release is that a pickle file is created once the language file or your own training file has been parsed. This works like a cache: if the original text file changes, the pickle file is recreated. The outcome is that creating a Spellcorrector instance takes a few seconds the first time if the language file is large, but virtually no time at all the second time.
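The caching idea can be sketched roughly like this. It is a minimal illustration in modern Python, not Spellcorrector's actual code: the function names, the ".pickle" file suffix and the modification-time check are all assumptions made for the example.

```python
import os
import pickle


def load_word_counts(text_path, parse):
    """Return parsed word data for text_path, using a pickle file as a cache.

    `parse` is a hypothetical callable that turns the file's text into a
    picklable object, for example a dict of word -> count.
    """
    cache_path = text_path + ".pickle"  # assumed naming convention
    # Reuse the cache only if it is at least as new as the source text file.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(text_path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    # Slow path: parse the text and store the result for next time.
    with open(text_path, encoding="utf-8") as f:
        data = parse(f.read())
    with open(cache_path, "wb") as f:
        pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
    return data


def count_words(text):
    """Toy parser: count whitespace-separated, lowercased words."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts
```

The first call pays for the parsing; later calls only unpickle, until the text file gets a newer modification time than the cache.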


Vertically expanding textarea input boxes

September 19, 2007
0 comments JavaScript

I've recently improved the IssueTrackerProduct so that when you start to write in the little textarea it expands and grows vertically as the text gets larger and larger. Other sites like Highrise do this too for note taking.

Long story short, here's the demo and here's the solution:


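// Estimate how many visual lines the textarea's text occupies:
// one per hard line break, plus extra lines for text wider than `cols`
function _getNoLines(element) {
  var hardlines = element.value.split('\n');
  var total = hardlines.length;
  for (var i=0, len=hardlines.length; i<len; i++) {
     // a long hard line wraps, so count the additional soft lines it needs
     total += Math.max(Math.round(hardlines[i].length / element.cols), 1) - 1;
  }
  return total;
}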

$(function() {

  // First, for all the textareas that already have more lines of text
  // in them than rows, grow their number of rows by 50% until it fits
  $('textarea.autoexpanding').each(function() {
     while (_getNoLines(this) > parseInt(this.rows, 10))
       this.rows = '' + Math.round(parseInt(this.rows, 10) * 1.5);
  });

  // When a user enters new lines, if they have entered more
  // lines than the textarea has rows, grow the textarea's rows by 50%
  $('textarea.autoexpanding').bind('keyup', function() {
     if (_getNoLines(this) > parseInt(this.rows, 10))
       this.rows = '' + Math.round(parseInt(this.rows, 10) * 1.5);
  });

});


html2plaintext Python script to convert HTML emails to plain text

August 10, 2007
12 comments Python

From the doc string:


A very spartan attempt at a script that converts HTML to
plaintext.

The original use for this little script was when I send HTML emails out I also
wanted to send a plaintext version of the HTML email as multipart. Instead of 
having two methods for generating the text I decided to focus on the HTML part
first and foremost (considering that a large majority of people don't have a 
problem with HTML emails) and make the fallback (plaintext) created on the fly.

This little script takes a chunk of HTML and strips out everything except the
<body> (or an element ID) and inside that chunk it makes certain conversions
such as replacing all hyperlinks with footnotes where the URL is shown at the
bottom of the text instead. <strong>words</strong> are converted to *words*
and it makes a fair attempt at getting the linebreaks right.

As a last resort, it strips away all other tags left that couldn't be gracefully
replaced with a plaintext equivalent.
Thanks to Fredrik Lundh's unescape() function, things like:
   'Terms &amp; Conditions' is converted to
   'Terms & Conditions'

It's far from perfect but a good start. It works for me for now.

Version at the time of writing this: 0.1.
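To make the conversions in the doc string concrete, here is a rough sketch of the same idea. It is not html2plaintext.py itself: that script uses BeautifulSoup, whereas this illustration swaps in regular expressions and the standard library's html.unescape(), and the function name html_to_text is made up for the example.

```python
import html
import re


def html_to_text(fragment):
    """Very rough HTML-to-plaintext conversion (illustrative only)."""
    footnotes = []

    def link_to_footnote(match):
        # Remember the URL and leave a numbered marker after the link text.
        footnotes.append(match.group(1))
        return "%s [%d]" % (match.group(2), len(footnotes))

    # <a href="URL">label</a>  ->  label [n], with the URL listed at the bottom
    text = re.sub(r'(?is)<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>',
                  link_to_footnote, fragment)
    # <strong>words</strong>  ->  *words*
    text = re.sub(r"(?is)<strong>(.*?)</strong>", r"*\1*", text)
    # <br> and closing </p> tags become line breaks
    text = re.sub(r"(?i)<br\s*/?>|</p>", "\n", text)
    # Last resort: drop every remaining tag, then decode entities like &amp;
    text = html.unescape(re.sub(r"(?s)<[^>]+>", "", text)).strip()
    if footnotes:
        text += "\n\n" + "\n".join(
            "[%d] %s" % (i, url) for i, url in enumerate(footnotes, 1))
    return text
```

Regular expressions are fragile on real-world HTML (nested tags, single-quoted attributes and so on), so treat this only as a picture of the footnote and *emphasis* conversions, not as a replacement for a proper parser.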

I wouldn't be surprised if I've reinvented the wheel here but I did plenty of searches and couldn't really find anything like this.

I'll run this for a while and see whether I stumble across bugs or other inconsistencies, which I haven't quite done yet. The one thing I'm really unhappy about is the way I extract the body from the BeautifulSoup parse object. I really couldn't find a better way in the few minutes I had to spare on this.

Feel free to comment on things you think are pressing bugs.

You can download the script here html2plaintext.py version 0.1

UPDATE

I should take a second look at Aaron Swartz's html2text.py script the next time I work on this. His script seems a lot more mature and Aaron is a brilliant Python developer.

I'm Prolog

May 1, 2007
1 comment Python

Like many fellow Python geeks on Planet Python, I too took the Which Programming Language Are You? quiz. Apparently I'm Prolog.

I've never used Prolog and I barely know how it works or what its syntax looks like. Well, I guess I'll just erase all my current projects and recode them in Prolog from now on. Impractical but necessary.

Spellcorrector

April 18, 2007
3 comments Python

I think a lot of Python people have seen Peter Norvig's beautiful article about How to Write a Spelling Corrector. So have I, and I couldn't wait to write my own little version of it to fit my needs.

The changes I added were:

Python 2.4 compatible
Uses a pickleable dict instead of a collections.defaultdict
Compiled a huge list of Swedish words
Skips edit distance 2 for words longer than 10 characters
Added a function suggestions()
All strings are Unicode
A class instead of a function
Ability to train on your own words and to save that training persistently
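Several of the changes listed above can be pictured with a small sketch in the spirit of Norvig's corrector. This is not the Spellcorrector source: apart from suggestions(), every name here (load_text, train, correct, the attributes) is an assumption for illustration, it is written for modern Python rather than 2.4, and persistence of the training is left out.

```python
import re

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


class Spellcorrector:
    """Sketch of a class-based, Norvig-style corrector (names are assumptions)."""

    def __init__(self):
        self.counts = {}        # word -> frequency; a plain dict pickles fine
        self.own_words = set()  # trained words, always treated as correct

    def load_text(self, text):
        for word in re.findall(r"[a-z]+", text.lower()):
            self.counts[word] = self.counts.get(word, 0) + 1

    def train(self, word):
        self.own_words.add(word.lower())

    def _known(self, words):
        return {w for w in words if w in self.counts or w in self.own_words}

    @staticmethod
    def _edits1(word):
        # Every string one delete, transpose, replace or insert away from word.
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
        inserts = [a + c + b for a, b in splits for c in ALPHABET]
        return set(deletes + transposes + replaces + inserts)

    def suggestions(self, word):
        word = word.lower()
        if word in self.own_words or word in self.counts:
            return [word]
        candidates = self._known(self._edits1(word))
        # Skip the expensive distance-2 search for words over 10 characters.
        if not candidates and len(word) <= 10:
            candidates = self._known(
                e2 for e1 in self._edits1(word) for e2 in self._edits1(e1))
        # Trained words first, then the most frequent known words.
        return sorted(candidates, key=lambda w: (w not in self.own_words,
                                                 -self.counts.get(w, 0), w))

    def correct(self, word):
        found = self.suggestions(word)
        return found[0] if found else word
```

After loading some text that contains "the", correct("teh") returns "the", and a word added with train() is offered for near misses even though it never appeared in the loaded text.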


setAttribute('style', ...) workaround for IE

January 8, 2007
41 comments JavaScript

I knew I had heard it before, but I must have completely missed it anyway and forgotten to test my new JavaScript widget in IE. None of the styles worked in IE and it didn't make any sense. Here's how I did it first:


var closer = document.createElement('a');
closer.setAttribute('style', 'float:left; font-weight:bold');
closer.onclick = function() { ...

That worked in Firefox of course but not in IE. The reason is that apparently IE doesn't support this. This brilliant page says that IE is "incomplete" on setAttribute(). Microsoft sucked again! Let's now focus on the workaround I put in place.

First I created a function that would take "font-weight:bold;..." as input and convert that to "element.style.fontWeight='bold'" etc:


function rzCC(s){
  // thanks http://www.ruzee.com/blog/2006/07/\
  // retrieving-css-styles-via-javascript/
  for(var exp=/-([a-z])/; 
      exp.test(s); 
      s=s.replace(exp,RegExp.$1.toUpperCase()));
  return s;
}

function _setStyle(element, declaration) {
  if (declaration.charAt(declaration.length-1)==';')
    declaration = declaration.slice(0, -1);
  var k, v;
  var splitted = declaration.split(';');
  for (var i=0, len=splitted.length; i<len; i++) {
     k = rzCC(splitted[i].split(':')[0]);
     v = splitted[i].split(':')[1];
     eval("element.style."+k+"='"+v+"'");

  }
}

I hate having to use eval() but I couldn't think of another way of doing it. Anybody?

Anyhow, now using it is done like this:


var closer = document.createElement('a');
//closer.setAttribute('style', 'float:left; font-weight:bold');
_setStyle(closer, 'float:left; font-weight:bold');
closer.onclick = function() { ...

and it works in IE!

is is not the same as equal in Python

December 1, 2006
8 comments Python

Don't make the silly mistake that I made today. I improved my code to better support unicode by replacing all plain strings with unicode strings. In there I had code that looked like this:


if type_ is 'textarea':
   do_something()

This was changed to:


if type_ is u'textarea':
   do_something()

And it no longer matched, since type_ was a normal ASCII string and is compares object identity rather than value. The correct way to do these things is like this:


if type_ == u'textarea':
    do_something()
elif type_ is None:
    do_something_else()

Remember:


>>> "peter" is u"peter"
False
>>> "peter" == u"peter"
True
>>> None is None
True
>>> None == None
True
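The same rule carries over to modern Python, where every str is already Unicode and the u"..." versus "..." contrast above no longer shows up. A small sketch, assuming Python 3 and using a hypothetical describe() helper, builds an equal string at run time to show the difference:

```python
def describe(type_):
    """Classify a field type using == for values and `is` only for None."""
    if type_ is None:
        return "no type given"
    if type_ == "textarea":
        return "multi-line field"
    return "other field"


# Build an equal string at run time so it is a separate object from `expected`.
expected = "textarea"
built = "".join(["text", "area"])

print(built == expected)   # True: same characters
print(built is expected)   # usually False in CPython: two different objects
print(describe(built))     # multi-line field
print(describe(None))      # no type given
```

Whether two equal strings happen to be the same object is an implementation detail, which is exactly why is should be kept for singletons like None.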

Fastest way to uniqify a list in Python

August 14, 2006
92 comments Python

SEE UPDATE BELOW

Suppose you have a list in Python that looks like this:


['a','b','a']
# or like this:
[1,2,2,2,3,4,5,6,6,6,6]

and you want to remove all duplicates so you get this result:


['a','b']
# or
[1,2,3,4,5,6]

How do you do that, and what's the fastest way? I wrote a couple of alternative implementations and ran a quick benchmark loop over them to find out which was the fastest. (I haven't looked at memory usage.) The slowest function was 78 times slower than the fastest function.
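For orientation, here are two common ways to do it. They are a sketch only and not necessarily among the implementations benchmarked in the full post; the function names are made up, and both assume the items are hashable.

```python
def uniqify_ordered(seq):
    """Remove duplicates while keeping the first occurrence of each item."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def uniqify_unordered(seq):
    """Remove duplicates without promising any particular order."""
    return list(set(seq))


print(uniqify_ordered(['a', 'b', 'a']))                    # ['a', 'b']
print(uniqify_ordered([1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6]))  # [1, 2, 3, 4, 5, 6]
print(sorted(uniqify_unordered(['a', 'b', 'a'])))          # ['a', 'b']
```

The set-only version is shorter, but it may reorder the result, so which one counts as "the" answer depends on whether order matters to you.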


Unicode strings to ASCII ...nicely

August 8, 2006
20 comments Python

This has been a problem for me for a long time. Whenever someone enters a title in my CMS, the id of the document is derived from the title. Spaces are replaced with '-', '&' is replaced with 'and', etc. The final thing I wanted to do was to make sure the id is ASCII encoded when it's saved. My original attempt looked like this:


>>> title = u"Klüft skräms inför på fédéral électoral große"
>>> print title.encode('ascii','ignore')
Klft skrms infr p fdral lectoral groe

But as you can see, a lot of the characters are gone. I'd much rather that a word like "Klüft" is converted to "Kluft" which will be more human readable and still correct. My second attempt was to write a big table of unicode to ascii replacements.

It looked something like this:


u'\xe4': u'a',
u'\xc4': u'A',
etc...
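As an alternative to maintaining such a table by hand, the standard library's unicodedata module can decompose most accented letters into a base letter plus a combining mark, which the 'ignore' encoding step then drops. This is a sketch in modern Python and not necessarily where the full post ends up; the to_ascii name and the small SPECIAL table are assumptions for the example.

```python
import unicodedata

# Characters without a decomposition still need explicit replacements.
# This tiny table is illustrative, not complete.
SPECIAL = {"ß": "ss", "ø": "o", "Ø": "O", "æ": "ae", "Æ": "AE"}


def to_ascii(text):
    """Return an ASCII approximation of text."""
    text = "".join(SPECIAL.get(ch, ch) for ch in text)
    # NFKD splits e.g. "ü" into "u" plus a combining diaeresis ...
    decomposed = unicodedata.normalize("NFKD", text)
    # ... and the combining marks are non-ASCII, so 'ignore' drops them.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Klüft skräms inför på fédéral électoral große"))
# Kluft skrams infor pa federal electoral grosse
```

Anything that neither decomposes nor appears in the table is still silently dropped, so a few hand-written replacements remain necessary for letters like ß.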
