Technomancy

Entries tagged “python”

osm2sqlite - A programme to convert from OSM files to SQLite

written by rory, on Nov 29, 2009 7:50:00 PM.

I've created osm2sqlite, a programme to convert OSM files (OpenStreetMap's XML file format) into a SQLite database. It's not very good for displaying OpenStreetMap data in a GIS programme, but it's good if you want to muck around with raw OpenStreetMap data in a familiar SQL environment.

I'm using git to store the code; you can get it here: http://repo.or.cz/w/osm2sqlite.git

Sample usage:

  • Check out the code
    git clone git://repo.or.cz/osm2sqlite.git .
  • For this example, we'll download the OSM data for Ireland.
    wget http://download.geofabrik.de/osm/europe/ireland.osm.bz2
  • Unzip the file. Currently it has to be unzipped. Support for converting bzipped files is planned.
    bunzip2 ireland.osm.bz2
  • And finally run the script to convert the OpenStreetMap file to a database called ireland.sqlite.
    ./convert_osm_to_sqlite.sh -o ireland.osm -d ireland.sqlite
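
Once the database exists, a first step in mucking around with it is seeing what tables the conversion created. The sketch below is not part of osm2sqlite and assumes nothing about its schema; it only reads SQLite's built-in sqlite_master catalogue. The function name list_tables is made up for this example.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's own catalogue of tables, indexes and views.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables("ireland.sqlite") on the file produced above should give you the table names to start writing queries against.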

PyLint plugin to catch if/elif blocks that don't have an 'else' clause

written by rory, on Mar 30, 2009 10:00:00 PM.

Recently I was tracking down a bug in some code. The source of the problem was an 'if' block that had an elif this, elif that, but no 'else' clause, and it should have had one. The code was looping over rows from a database and checking for certain conditions, setting a variable based on that if block. Since there was no else block, the variable was not updated in one iteration of the loop and kept its value from the previous iteration, causing a duplicate of the code. If there had been an 'else: assert False', it wouldn't have been a problem, because we would have caught it. After fixing the code, I started to wonder whether any other potential bugs like this were lurking elsewhere. It would be cool if I could analyse the code and see whether there were any more if blocks without an else clause.
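
To make the failure concrete, here is a small reconstruction of that kind of bug. It is not the original code: the function names, the "kind" values and the labels are all invented for illustration.

```python
def label_rows_buggy(kinds):
    """if/elif with no else: an unexpected kind silently reuses the old label."""
    labels = []
    label = None
    for kind in kinds:
        if kind == "a":
            label = "alpha"
        elif kind == "b":
            label = "beta"
        # No else: for any other kind, label keeps its value from the
        # previous iteration, so a duplicate gets appended.
        labels.append(label)
    return labels


def label_rows_strict(kinds):
    """Same loop, but an unexpected kind fails loudly instead."""
    labels = []
    for kind in kinds:
        if kind == "a":
            label = "alpha"
        elif kind == "b":
            label = "beta"
        else:
            # Note: asserts are skipped when Python runs with -O.
            assert False, "unexpected kind: %r" % (kind,)
        labels.append(label)
    return labels
```

With the input ["a", "c"], the buggy version quietly returns the label for "a" twice, while the strict version stops with an AssertionError on "c".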

pylint is a programme that will scan your Python source code for coding-standard problems (variable names, unused imports, line length, etc.). It's also extensible, so you can write your own plugins for it. This looked like the perfect tool to use, and it turned out to be very straightforward to write a plugin that scans for this kind of thing.

The code is stored in a git repository here. Anonymous users can push to the 'mob' branch, so I welcome all submissions. It's quite easy to use: just run this command from the same directory as the missing_else.py file from the above repository:

    pylint --load-plugins=missing_else sample.py

It'll generate warning messages for if blocks that have elif blocks but no else block.

There are two ways to run it. By default it will warn you if you have an if that has an elif and doesn't have an else clause. If you add the option "--warn_if_no_else=y" then it'll also warn you if you have a bare if clause that has no else clause. I don't enable this by default.
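
If you're curious how a check like this can work, here is a rough sketch of the same idea using only the standard library's ast module. This is not the plugin's code and doesn't use pylint at all; find_missing_else and its warn_if_no_else parameter (named after the option above) are made up for this example.

```python
import ast


def find_missing_else(source, warn_if_no_else=False):
    """Return line numbers of if statements whose chain has no final else.

    Caveat: the ast module represents `elif` as an `if` nested alone inside
    the `else` branch, so a hand-written `else:` containing only an `if`
    is treated as an `elif` here.
    """
    tree = ast.parse(source)
    elif_nodes = set()  # ids of If nodes that are the elif part of a chain
    lines = []
    # ast.walk visits a parent before its children, so the head of a chain
    # is always processed before its elif nodes.
    for node in ast.walk(tree):
        if not isinstance(node, ast.If) or id(node) in elif_nodes:
            continue
        has_elif = False
        current = node
        while len(current.orelse) == 1 and isinstance(current.orelse[0], ast.If):
            current = current.orelse[0]
            elif_nodes.add(id(current))
            has_elif = True
        if current.orelse:
            continue  # the chain ends in a real else clause
        if has_elif or warn_if_no_else:
            lines.append(node.lineno)
    return sorted(lines)


SAMPLE = """\
if a == 1:
    x = 1
elif a == 2:
    x = 2

if b:
    y = 1

if c:
    z = 1
elif d:
    z = 2
else:
    z = 3
"""

print(find_missing_else(SAMPLE))                        # [1]
print(find_missing_else(SAMPLE, warn_if_no_else=True))  # [1, 6]
```

The first call only flags the if/elif chain on line 1. The second also flags the bare if on line 6, and the chain that ends in an else is never reported.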

A powerset generator in python

written by rory, on Mar 17, 2009 2:09:00 PM.

The powerset of a set is the set of all its possible subsets. For example, for the list [1, 2, 3], the powerset is [ [], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3] ] (Wikipedia has more).

Generators in Python are a powerful concept: they let you produce the items of a sequence as you go along, so you don't need to calculate all the items before using them. As you can see from the example above, the number of elements in the powerset of a list is much larger than the number of elements in the original list. If there are n items in the original list, then there are 2^n items in the powerset. For example, the powerset of a 5 element list has 32 items, the powerset of a 10 element list has 1,024 items, and so forth.

Since powersets are so large, generators are very helpful here: you don't have to generate (and store) a 1,024 element list before doing your calculation. Here is a simple generator function in Python that will return (or 'yield') all the lists in the powerset of a list.
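def powerset(seq):
    """
    Yields all the subsets of this list, one at a time. This is a generator.
    """
    if not seq:
        # The only subset of an empty list is the empty list itself.
        yield []
    else:
        # Each subset of the tail appears twice: with and without seq[0].
        for item in powerset(seq[1:]):
            yield [seq[0]] + item
            yield item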
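
For comparison, the standard library's itertools module can produce the same subsets lazily. This is a sketch of an alternative, not a replacement for the function above: it yields tuples rather than lists, and powerset_iter is just a name picked for this example.

```python
from itertools import chain, combinations


def powerset_iter(seq):
    """Lazily yield every subset of seq as a tuple, smallest subsets first."""
    items = list(seq)
    # combinations(items, size) yields all subsets of the given size;
    # chaining sizes 0..n covers the whole powerset without storing it.
    return chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )


for subset in powerset_iter([1, 2, 3]):
    print(subset)
```

This prints the eight subsets in order of size: (), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3).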