Google PageRank algorithm in Python

March 21, 2004
27 comments Python, Mathematics

There are many articles on the net about how the PageRank algorithm works, and they all copy from the original paper written by the founders of Google themselves, Larry Page and Sergey Brin. Google itself also has a very good article that explains it with no formulas or numerical explanations. Basically, PageRank works like a social network: if you're mentioned by someone important, your importance increases, and the people you mention get upped as well.

We recently had a coursework in discrete mathematics to calculate PageRank values for all web pages in a web matrix. To do this "by hand" you have to make many simplifications and keep the web small. I wrote a little program that calculates the PageRank for any web with no simplifications. The outcome is that I can quickly calculate the PageRank values for each page.

Here's how to use it:


from PageRank import PageRanker
web = ((0, 1, 0, 0),
       (0, 0, 1, 0),
       (0, 0, 0, 1),
       (1, 0, 0, 0))

pr = PageRanker(0.85, web)
pr.improve_guess(100)
print(pr.getPageRank())
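
The PageRanker class itself isn't shown in this excerpt. To give a rough idea of what such a calculation involves, here is a minimal sketch; it is not the author's module. It assumes that web[i][j] == 1 means page i links to page j, it uses the normalised form of the formula (so the ranks sum to 1), and it ignores the special case of pages with no outgoing links. The name page_rank and its parameters are made up for illustration.

```python
def page_rank(web, damping=0.85, iterations=100):
    """Estimate PageRank by repeatedly improving an initial uniform guess.

    web is a square matrix where web[i][j] == 1 means page i links to
    page j (an assumption about the matrix layout).
    """
    n = len(web)
    ranks = [1.0 / n] * n                         # start with equal ranks
    out_degree = [sum(row) for row in web]        # outgoing links per page

    for _ in range(iterations):
        new_ranks = []
        for j in range(n):
            # Each page i that links to j passes on an equal share of its rank.
            incoming = sum(ranks[i] / out_degree[i]
                           for i in range(n) if web[i][j])
            new_ranks.append((1 - damping) / n + damping * incoming)
        ranks = new_ranks
    return ranks


web = ((0, 1, 0, 0),
       (0, 0, 1, 0),
       (0, 0, 0, 1),
       (1, 0, 0, 0))

ranks = page_rank(web, damping=0.85, iterations=100)
print(ranks)
```

In the four-page ring above every page links to exactly one other page, so all four pages end up with the same rank.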

To readline() or readlines()

March 12, 2004
30 comments Python

UPDATE 2 (November 2017)

Sorry for not having updated this in so many years. 2004 was a different Peter and I'm sorry if people landed on this blog post and got the wrong idea.

To read lines from a file do this:


with open('somefile.txt') as f:
   for line in f:
       print(line)

This works universally in Python 2 and Python 3. It reads one line at a time by iterating till it finds line breaks.

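When you create a file object in Python you can read from it in several different ways. Not until today did I understand the difference between readline() and readlines(). The answer is in the name. readline() reads a single line each time you call it, while readlines() reads in the whole file at once and returns it as a list of lines.

These would then be roughly equivalent (readlines() keeps the newline at the end of each line, while split('\n') drops it and leaves a trailing empty string if the file ends with a newline):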


f = open('somefile.txt','r')
for line in f.readlines():
    print(line)
f.close()

# ...and...

f = open('somefile.txt','r')
for line in f.read().split('\n'):
    print(line)
f.close()
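
To make the difference between the calls concrete, here's a small sketch that writes a throwaway three-line file and reads it back in each way. The file contents and variable names are invented for the example.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\nsecond\nthird\n')
    path = tmp.name

with open(path) as f:
    one_line = f.readline()    # just the first line, newline included
    the_rest = f.readlines()   # everything that is left, as a list of lines

with open(path) as f:
    split_lines = f.read().split('\n')   # newlines removed by the split

os.remove(path)

print(repr(one_line))
print(the_rest)
print(split_lines)
```

Note that readlines() continues from wherever the file position currently is, so after one readline() call it only returns the remaining lines.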

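The xreadlines() method should be used for big files because it reads lazily instead of loading everything into memory (it only exists in Python 2; in Python 3, iterating over the file object as shown at the top does the same job):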


f = open('HUGE.log','r')
for line in f.xreadlines():
   print(line)
f.close()

Python UnZipped

March 11, 2004
0 comments Python

Zipping and unzipping a file in Python is child's play. It can't get much easier than this. Well, in Windows you can highlight a couple of files and right-click and select from the WinZip menu. Here's how to do it in Python:


>>> import zipfile
>>> zip = zipfile.ZipFile('Python.zip', 'w')
>>> zip.write('file1.txt')
>>> zip.write('file2.gif')
>>> zip.close()

Still confused? Then read this article for Dev Shed (http://www.devshed.com/). The article includes some more advanced uses as well.
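
The session above only covers writing an archive. Going the other way might look something like the sketch below, which zips a file and then extracts it again using the standard zipfile module. The helper names zip_files and unzip_files, the file contents and the temporary directory layout are all made up for the example.

```python
import os
import tempfile
import zipfile


def zip_files(archive_path, file_paths):
    """Write the given files into a new zip archive, stored by base name."""
    with zipfile.ZipFile(archive_path, 'w') as archive:
        for path in file_paths:
            archive.write(path, arcname=os.path.basename(path))


def unzip_files(archive_path, target_dir):
    """Extract every member into target_dir and return the member names."""
    with zipfile.ZipFile(archive_path, 'r') as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, 'file1.txt')
    with open(source, 'w') as f:
        f.write('hello zip')

    archive_path = os.path.join(workdir, 'Python.zip')
    zip_files(archive_path, [source])

    out_dir = os.path.join(workdir, 'out')
    os.makedirs(out_dir)
    names = unzip_files(archive_path, out_dir)

    with open(os.path.join(out_dir, 'file1.txt')) as f:
        restored = f.read()

print(names, restored)
```

Using the archive in a with block closes it automatically, which replaces the explicit close() call.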

PSP - Python Server Pages

March 9, 2004
2 comments Python

"Python Server Pages (PSP), as you probably guessed already, is a way to inline Python in HTML or XML documents. The server interprets this inlined code to produce the final HTML sent to the client. This approach has become popular with tools such as JSP, PHP, and ColdFusion."

I really hope this catches on. It'll be the perfect alternative to PHP. PHP is very popular because it's easy to use, but its syntax and execution are poor compared to Python. I guess PHP will stay strong for quite some time still, but if PSP can get the tight Apache and MySQL integration that PHP has, we have a winner here.

Recon - Regular Expression Test Console

January 14, 2004
0 comments Python

This is a fantastic little Python GUI using Tkinter, for testing your regular expressions in Python. You first paste or write in some text, then you doodle some regular expressions to see the outcome. What I do miss is a way to export actual code. Usually when I write my regular expressions I fire up the interactive shell, from which I can copy code when I'm happy with it. Like this:


>>> import re
>>> e=re.compile(r'\?q=(.*?)&', re.I)
>>> print(e.findall("http://www.google.com.br/search?q=paper+plane&hl=pt-BR"))
['paper+plane']

From silly code like that I can actually copy and paste the actual syntax. Ah well, I still like this Recon thing.
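
One thing worth noticing about the pattern above is that it only matches when q comes straight after the "?" and is followed by an "&". A slightly more forgiving variant, as a sketch, could look like this. The names Q_PARAM and find_query are invented for the example.

```python
import re

# Match q either right after "?" or after "&", and stop at the next "&" or "#".
Q_PARAM = re.compile(r'[?&]q=([^&#]*)', re.I)


def find_query(url):
    """Return the raw (still URL-encoded) values of every q parameter."""
    return Q_PARAM.findall(url)


first = find_query("http://www.google.com.br/search?q=paper+plane&hl=pt-BR")
last = find_query("http://www.google.com.br/search?hl=pt-BR&q=paper+plane")
print(first, last)
```

The values are returned as they appear in the URL, so the "+" is not turned into a space.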

PythonPoint

January 10, 2004
0 comments Python

PythonPoint is a piece of open source software that converts XML to PDFs that look like presentation slides.

"Essentially, it converts slides in an XML format to PDF."

Here's a 270Kb sample PDF document.

Case study where Python was the final choice

December 3, 2003
0 comments Python

This is an article about a group of developers who had a task to do but hadn't chosen what programming language to use. The alternatives they had to choose from were C, PHP, Perl, Java and Python, and the requirements were these:

  • We planned to prototype on a remote device and anticipated numerous changes. We needed a language that was designed with change in mind.
  • We wanted to avoid the added step of code compilation in order to minimize the overhead associated with a change. An interpreted language seemed pragmatic.
  • We wanted a language with good introspection capability.
  • We needed to do a lot of string manipulation and file I/O. Whatever language we chose had to excel in both of these areas.

It's so suited for Python that it almost sounds as if they wrote the requirements after the choice. But that's just how Python is.

Py2TeX

November 29, 2003
0 comments Python

This is a module worth remembering. It converts Python code into TeX nicely. Unfortunately not colour coded, but "=" equal signs become "<--" arrows and negations like "!=" become an equal sign with a dash across it. This will be good for formatting little sections of Python code into serious documentation.

Zolera SOAP Infrastructure 1.4

November 14, 2003
1 comment Python

Nice! Now there's a decent Python SOAP module that I need to find time to explore. I want to set up some web services on my site that can be interfaced in different ways on other servers/clients. Just need to think of something that can be useful.

Any ideas anyone?

(more links on pywebsvcs.sourceforge.net)