Articles by Carissa Brittain

Django, contests and weekly voting

I’ve written before about how OpenDataPhilly uses a ratings module to drive a nomination system. Recently, we added a contest to the site to determine what kinds of data local non-profits and the public would like to see made available. Contests generally have a winner and, in this case, we’re letting the public vote on data sets nominated by non-profits. At first glance this isn’t much different from our current nomination system, but there’s one catch: we wanted users to be able to vote for one entry once a week. Turns out this was more novel than it sounds.

Django has a few modules for rating or voting on content, one of which we’re using for the nomination and comments systems. The inner workings of the module boil down to the following rules:

  1. A user must be logged in to rate/vote
  2. A user can rate/vote for any number of items
  3. A user can only rate/vote for any particular item once (though they may change their rating/vote later)

Compare this with the rules we wanted to enforce for the contest:

  1. A user must be logged in to vote
  2. A user can only vote once per 7 day period
  3. A user can vote for an item multiple times, so long as rule 2 is preserved

Aside from the first rule, we were trying to do almost exactly the opposite of what our rating module enforced. Rather than retrofit the existing module to allow additional and sometimes contradictory behaviour, we decided to write a very small voting module of our own.

The code revolves around two decision points: is voting allowed, and can a specific user vote now? The first question is answered by the contest object itself. A contest knows what its start and end dates are, so if today is after the start date and before the end date, then voting is allowed.

The second question is a bit more complicated, but not by much. Because of rule 2 above, we need to know when a user last voted to know if they’re currently allowed to vote. The database storage for a vote contains a datetime object, a foreign key to the user object and a foreign key to the contest entry, so if we sort a user’s votes by time we can retrieve their latest vote.
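As a rough sketch of that first check, assuming a contest that stores its start and end as datetimes (the Contest class, the voting_open name and the dates below are illustrative, not the project's actual model):

```python
import datetime


class Contest:
    """Illustrative stand-in for the contest model (names are assumptions)."""

    def __init__(self, start_date, end_date):
        self.start_date = start_date
        self.end_date = end_date

    def voting_open(self, now=None):
        # Voting is allowed only while "now" falls inside the contest window.
        now = now or datetime.datetime.today()
        return self.start_date <= now <= self.end_date


# Arbitrary example dates, chosen only to exercise both outcomes.
contest = Contest(datetime.datetime(2012, 3, 1), datetime.datetime(2012, 3, 31))
during = contest.voting_open(datetime.datetime(2012, 3, 15))
after = contest.voting_open(datetime.datetime(2012, 4, 2))
```

Passing the current time in as an optional argument keeps the check easy to test without touching the system clock.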

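def user_can_vote(self, user):
    # Assumes "import datetime" at module level and that this method
    # lives on the contest model, so self.end_date is the contest's end.
    increment = datetime.timedelta(days=7)
    now = datetime.datetime.today()
    votes = user.vote_set.order_by('-timestamp')  # latest on top
    if votes:
        next_date = votes[0].timestamp + increment
        # Too soon since the last vote, or the contest is already over
        if now < next_date or now > self.end_date:
            return False
    return True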

The above code gets a user’s votes and orders them by time with the most recent first. If a user has ever voted, we need to check if they’re allowed to vote again yet or if they have to wait. We calculate the earliest time that a user can vote next and check it against the date and time now. We also check the end of the contest against the date and time now. If “now” is before the next time the user can vote or “now” is after the contest’s end date, we return false; the user can’t vote now. If a user has never voted before, or the dates are all ok then the user can vote. This check is done after a user clicks the “vote” button but before a vote is saved to the database. We also display a message saying why this check failed and when a user will be able to vote again.
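The message step could look something like the sketch below. It is a plain-Python stand-in rather than the site's code: the vote_status name, the message wording and the example dates are all assumptions, and unlike the method above it checks the contest end date even for users who have never voted.

```python
import datetime

VOTE_INTERVAL = datetime.timedelta(days=7)


def vote_status(last_vote, contest_end, now):
    """Return (allowed, message); last_vote is None if the user never voted."""
    if now > contest_end:
        return False, "The contest has ended."
    if last_vote is not None:
        next_date = last_vote + VOTE_INTERVAL
        if now < next_date:
            # Tell the user exactly when the 7 day wait is over.
            return False, "You can vote again on %s." % next_date.strftime("%Y-%m-%d %H:%M")
    return True, ""


# Arbitrary example dates.
end = datetime.datetime(2012, 3, 31)
now = datetime.datetime(2012, 3, 10, 12, 0)
too_soon = vote_status(datetime.datetime(2012, 3, 5, 9, 30), end, now)
first_vote = vote_status(None, end, now)
```

Returning the reason alongside the yes/no answer lets the view show the message without repeating the date arithmetic.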

So we’re taking advantage of all of the spam protection built into Django’s user registration process and running a contest on surprisingly little code: 3 database tables, 200 lines of Python (blank lines included) and a few templates were all we needed!

Pending edit system using Django

A common concern when we talk to people about OpenTreeMap is how much to trust the public with an organisation’s tree inventory. Every implementation of this open source system has a different answer. The original site, UrbanForestMap.org, allows a logged-in user to edit almost every bit of information they gather about a tree. PhillyTreeMap.org requires a certain level of reputation before a user can edit everything, but even a new user has considerable edit capabilities. The most recent implementation (still a work-in-progress) introduces a bit of oversight to public edits. The managing group wanted to double-check changes to officially inventoried trees, but didn’t want to get in the way of people adding and editing their own trees.

Let’s look at how this changes the user story first:

A logged-in user makes an edit to a tree. The system needs to decide if these changes are applied to the tree or placed in a pending queue. If this is a publicly-entered tree, the changes are applied to the tree. (Start new requirements) If this is an inventory tree, and the user isn’t a member of a management group, add the change to the pending queue. Display any pending changes reasonably near the appropriate current value. (End new requirements)

Most of this happens behind the scenes in the saving logic. I added a bit of code to the top of our tree updater to check if the pending system is active, the user’s permissions and the tree’s origins. If everything checks out, the change goes straight into the updater code. For changes that go into the pending queue, the path to becoming an official change is a little more tortuous.
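The decision at the top of the updater boils down to three checks. A minimal sketch, with hypothetical argument names standing in for whatever the real settings, tree and user objects expose:

```python
def should_pend(pending_active, tree_is_inventory, user_is_manager):
    """Return True if an edit should wait in the pending queue."""
    if not pending_active:
        return False  # pending system switched off: apply everything
    if not tree_is_inventory:
        return False  # publicly-entered trees are edited directly
    # Inventory trees: only members of a management group skip review.
    return not user_is_manager


public_edit = should_pend(True, False, False)
inventory_edit = should_pend(True, True, False)
manager_edit = should_pend(True, True, True)
```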

Since we’re storing these changes for later review, they have to go into the database. I created a new table to hold onto the original tree’s id, the field being changed and the new value as well as the user who submitted it, a date/time stamp and a status field. Each pending change is stored separately; even if the user makes more than one change to the tree, each ‘pend’ can be applied individually.
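To make the one-row-per-change idea concrete, here is a small sketch using a dataclass in place of the real database table. The field names, the status values and the dict standing in for a tree are all assumptions for illustration:

```python
import datetime
from dataclasses import dataclass


@dataclass
class PendingEdit:
    tree_id: int
    field_name: str
    new_value: str
    submitted_by: str
    submitted_at: datetime.datetime
    status: str = "pending"  # assumed values: "pending", "approved", "rejected"


def approve(pend, tree):
    """Apply one pending edit to a tree (a plain dict here) and mark it approved."""
    tree[pend.field_name] = pend.new_value
    pend.status = "approved"


def reject(pend):
    """Leave the tree untouched and mark the edit rejected."""
    pend.status = "rejected"


tree = {"id": 7, "species": "Unknown", "dbh": "10"}
when = datetime.datetime(2012, 1, 1)  # arbitrary timestamp
species_pend = PendingEdit(7, "species", "Red maple", "some_user", when)
dbh_pend = PendingEdit(7, "dbh", "14", "some_user", when)

# Two edits from the same user on the same tree are judged separately.
approve(species_pend, tree)
reject(dbh_pend)
```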

The rest of the pending system is eye-candy and a bit of slightly tedious templating. Almost every field on a tree’s detail page now needs to check two new things: are there any pending changes for this field, and does this user have permission to approve/disapprove pending edits. If there are pending edits, the new values are added below the current official value. When a managing user views the page, small approve and disapprove buttons also appear next to each pending change. Throw in a management-access-only page for some bulk evaluation and the system is complete!

Using the CQL_FILTER parameter with OpenLayers WMS layers

I’ve used OpenLayers’ Marker layer in several projects and have always just accepted that I can’t display more than around 500 markers at a time for a given query. Recently, I found another way. We’re using GeoServer as a WMS tile server for the tree and municipal boundary layers in PhillyTreeMap.org. GeoServer’s WMS implementation allows an additional parameter in the URL called CQL_FILTER. This parameter allows you to use a little language called Common Query Language, or CQL, to apply data filters to the tiles that GeoServer generates. CQL is a plain text, human readable query language created by the OGC, but I like to think of it as an extremely limited third-cousin-by-adoption of SQL. I haven’t found too much in the way of documentation on this obscure little gem, so here’s a rundown of how we used it to display search results in PhillyTreeMap.org.

If you look through the CQL and ECQL page in GeoServer’s documentation, there are several examples but they don’t cover everything you can do with CQL. Basically, a CQL filter is a list of phrases similar to where clauses in SQL, separated by the combiner words AND or OR. You can use the following operators in a CQL phrase:

  • Comparison operations: =, <, >, and combinations
  • Math expressions: +, -, *, /
  • NOT
  • IS, EXISTS
  • BETWEEN, BEFORE, AFTER
  • IN
  • LIKE
  • Geometric operators: CONTAINS, BBOX, DWITHIN, CROSS(ES), INTERSECT(S)

Some of these operations have examples in the GeoServer documentation, and others can be inferred from the GeoTools documentation (the stuff in their CQL.toFilter() calls). CQL can also call any of GeoServer’s filter functions. You can add parentheses as needed to affect the order the filters are evaluated in, just like in SQL.

CQL has a lot of power for such a short spec, but it has one very large deficiency that requires some database designing to avoid: the utter lack of join support. This makes sense when you consider that GeoServer doesn’t know about joins either. Ultimately, you’re using CQL against the GeoServer layer, not the underlying database structure. Building views or adding reference columns to the table GeoServer is accessing can help get around this.
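Parentheses matter most when AND and OR are mixed. One way to keep the grouping explicit when building the string in code is a tiny helper like the one below; the any_of name and the second species id are made up for illustration:

```python
def any_of(*clauses):
    """Join clauses with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


cql = " AND ".join([
    any_of("species_id = 212", "species_id = 213"),
    "height IS NULL",
])
```

Here cql comes out as "(species_id = 212 OR species_id = 213) AND height IS NULL", so the null check applies to both species rather than only the second one.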

In PhillyTreeMap.org, we use 4 types of CQL filters: id lookups using =, null checks using IS, date and integer ranges using BETWEEN and text searches using LIKE. Here are examples of those uses along with some array joining to get a valid CQL filter string at the end:

filter_list = []
filter_list.append("species_id = 212")
filter_list.append("height IS NULL")
filter_list.append("dbh BETWEEN 10 and 20")
filter_list.append("neighborhood_id_list LIKE '%42%' ")
cql = ' AND '.join(filter_list)
# should look like this: "species_id = 212 AND height IS NULL AND dbh BETWEEN 10 and 20 AND neighborhood_id_list LIKE '%42%' "

The above CQL filter would locate trees that belong to a specific species, have no height value, have dbh values between 10 and 20 inches and are in a particular neighborhood. The neighborhood_id_list filter would have been a join if this were written in standard SQL, since neighborhoods and trees have a many-to-many relationship in our database. Since we can’t do joins, any time a tree is added or its location is updated, all of its related geographies’ ids are added to a reference column on the tree that is used specifically for this type of query.

CQL is passed to GeoServer in the same way as any other WMS variable. We’re using OpenLayers, so most of the WMS configuration variables are already set when we create the layer. The WMS layer has a little method called mergeNewParams that lets us change those parameters after the layer has been initialized. It also automatically redraws the layer, so the changes take place immediately. To add CQL to the WMS call, just add the CQL_FILTER variable to the layer’s parameters and the layer should update.
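One thing to watch with a LIKE search on an id list: a bare '%42%' pattern also matches ids such as 142 or 420. The exact format of the reference column isn’t described here, so the following is only a sketch of one possible convention, wrapping every id in commas so the pattern can match whole ids. The helper names are hypothetical:

```python
def to_id_list(ids):
    """Store ids as ',4,42,107,' so every id is bounded by commas."""
    return "," + ",".join(str(i) for i in sorted(ids)) + ","


def id_filter(column, wanted_id):
    """Build a LIKE clause that only matches the whole id."""
    return "%s LIKE '%%,%d,%%'" % (column, wanted_id)


stored = to_id_list([42, 4, 107])
clause = id_filter("neighborhood_id_list", 42)
```

With this layout the stored value is ",4,42,107," and the clause is neighborhood_id_list LIKE '%,42,%', which no longer matches a tree whose list only contains 142 or 420.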

wms_layer.mergeNewParams({'CQL_FILTER': "species_id = 212 AND height IS NULL AND dbh BETWEEN 10 and 20 AND neighborhood_id_list LIKE '%42%' "});
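OpenLayers takes care of URL-encoding the filter for you. If you ever need to build the request by hand, say for a quick test from a script, the filter has to be encoded like any other query string value. A minimal sketch using Python’s standard library; the host and layer name are placeholders, and a real GetMap request also needs BBOX, WIDTH, HEIGHT, SRS, FORMAT and STYLES, which are left out here:

```python
from urllib.parse import urlencode, parse_qs

params = {
    "SERVICE": "WMS",
    "VERSION": "1.1.1",
    "REQUEST": "GetMap",
    "LAYERS": "trees",  # hypothetical layer name
    "CQL_FILTER": "species_id = 212 AND height IS NULL",
}

# urlencode escapes the spaces and the "=" inside the filter value.
query = urlencode(params)
url = "http://example.com/geoserver/wms?" + query  # placeholder host

# Decoding the query string gives back the original filter text.
decoded = parse_qs(query)["CQL_FILTER"][0]
```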

You can remove any filters by deleting the parameter from the layer as if it were a normal JavaScript object. You’ll need to redraw the layer yourself before the change will be visible.

delete wms_layer.params.CQL_FILTER;
wms_layer.redraw();

Using django_sorting without text anchors

In creating the OpenDataPhilly website, we knew we needed to pay extra attention to the usability and features available on the search page. After all, what use is a data catalog if you can’t rely on the search results? While designing the page, we decided we wanted to include several ways to sort and filter results along with the standard text search. I’d used directeur’s django_sorting module before, and was highly impressed at how well it integrated with other info already in search parameters, so I decided to use it again. The only thing keeping me from seamlessly dropping in this module was our desire to use images instead of words for the “click on this to sort” link. Django_sorting was built with table headers in mind; so much so that the examples are all about table header tags with a link inside them. We had a slightly different implementation in mind.

The first hurdle that you might think of would be to not require a table structure. Thankfully, django_sorting doesn’t care how you display the data, it only cares about the fields you want to sort on. You can put the sorting links anywhere and it just works. So far, so good.

The second possible hurdle here is that the anchor template tag specification calls for two parameters: the field to sort on, and a string for the link. Since we didn’t want a text link, we really didn’t care about the second parameter. To my surprise, neither did django_sorting! So our anchor template tags look something like this:

<li id="sort_rating_score">{% anchor rating_score %}</li>

This winds up creating a link that still has something for the title attribute and for the inner text:

<li ...><a title="Rating_score" href="/blah/blah/?sort=rating_score">Rating_score</a></li>

So our template tag is nice and clean, but we still have to deal with some text in the link. Using either straight-up JavaScript, or some smaller and nicer jQuery, removing the innerHTML is easy so long as the DOM can uniquely identify our sort links. I just gave the link’s parent container an id and cleared the innerHTML of each link. At the same time, I added a class and some CSS to define the image and size.

So now we have a django_sorting anchor with an image instead of the default text link. All done, right? Not quite. We didn’t just want to use an image here, we wanted some mouseover and focus sprite action, too. Another chunk of jQuery, and a querystring plugin, helps us out again:

$("#sort .sort_image").each(function () {
    $(this).hover(function() {
        this.style.backgroundPosition="0 -89px"; //the hover image location
    }, function () {
        var filter_split = this.parentNode.id.split('sort_');
        if ($.query.get('sort') && $.query.get('sort') == filter_split[1]) {
            this.style.backgroundPosition="0 -45px";  //the active image location
        } else {
            this.style.backgroundPosition="0 0"; //the non-active image location
        }
    });
});

 

jQuery Star Rating plugin as a frontend for Django-ratings

My current Django project, OpenDataPhilly, needed a ratings system and, guess what? Django has a ratings module! All done, right? Weeelllll.. the thing is, jQuery also has a beautiful and well-behaved ratings widget that is very easy to re-skin. How do you choose between two stable plugins? In this case there was no need. I decided to use the Django-ratings module for everything but the visual element and the jQuery Star Rating Plugin for some flashy effect.

So why blog about it then? The implementation was not quite as easy as it sounds.

First things first: download and install the Django-ratings module the same way you would any other module. Follow their instructions to add a ratings field to your model and fix any import or database issues. You should wind up with something like this in your models file:

from djangoratings.fields import RatingField

class MyClass(models.Model):
    ....
    rating = RatingField(range=5)
    ...

Since we’re using jQuery for the front end, that’s all we need to do in the models file.

Next, download the jQuery plugin and put the files somewhere accessible by your app. I put them in my project’s static folder, but I think anywhere is fine. The plugin worked right out of the box for me, no tweaking necessary.

On to the slightly hard part! If you look in the jQuery plugin’s documentation, it expects a pile of radio buttons to skin. No divs or lists or other tricks of nice formatting, because the plugin does all that. Unfortunately, Django’s radio button widgets think they need to do all that, too. Someone has to be told ‘No’ very firmly. Since the whole point of using the jQuery plugin is to have a pretty and easily skinnable front end, we’re going to tell Django to stop rendering radio buttons quite as nicely as it does. And that’s going to take some doing.

Somewhere in your project, create a widgets.py file and hunt down Django’s django/forms/widgets.py file so you can copy a few things. The two classes we’re interested in are RadioInput and RadioFieldRenderer. Copy those two classes from Django’s file into your new one and close the Django file. We don’t want to accidentally mess something up in there. First, we’ll need a few imports at the top of our file:

from django.forms.util import flatatt

from django.utils.encoding import StrAndUnicode, force_unicode
from django.utils.html import conditional_escape
from django.utils.safestring import mark_safe

Next, rename the classes to something not in Django. I chose StarsRadioInput and StarsRadioFieldRenderer. Also the StarsRadioFieldRenderer makes a few references to the RadioInput class. Change these to StarsRadioInput, too. Now that the classes are safely renamed, there are two methods in charge of the html coming out of our custom widget: StarsRadioInput.tag and StarsRadioFieldRenderer.render. We basically need to strip out all of the labels and list tags so that the jQuery plugin can do its css magic.

Change the StarsRadioInput.tag method to:

def tag(self):
  # Give each radio button its own id when the widget has one
  if 'id' in self.attrs:
    self.attrs['id'] = '%s_%s' % (self.attrs['id'], self.index)
  # Build the attributes whether or not an id was supplied
  final_attrs = dict(self.attrs, type='radio', name=self.name, value=self.choice_value)
  if self.is_checked():
    final_attrs['checked'] = 'checked'
  return mark_safe(u'<input%s />' % flatatt(final_attrs))

And change the StarsRadioFieldRenderer.render method to:

def render(self):
  return mark_safe(u'\n%s\n' % u'\n'.join([u'%s'
                   % force_unicode(w) for w in self]))
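To see what the stripped-down widget is aiming for, here is a plain-Python stand-in that produces the same kind of markup without needing Django at all. It is an illustration only, not the renderer itself; the bare_radio_inputs name and its arguments are made up for this sketch:

```python
from html import escape


def bare_radio_inputs(name, choices, attrs=None, checked=None):
    """Emit one plain radio <input> per choice, with no <ul>, <li> or <label>."""
    attrs = attrs or {}
    tags = []
    for index, (value, _label) in enumerate(choices):
        final = dict(attrs, type="radio", name=name, value=str(value))
        if "id" in attrs:
            # Mirror the per-button id suffix used in the tag method.
            final["id"] = "%s_%s" % (attrs["id"], index)
        if str(value) == str(checked):
            final["checked"] = "checked"
        flat = "".join(' %s="%s"' % (k, escape(str(v))) for k, v in final.items())
        tags.append("<input%s />" % flat)
    return "\n".join(tags)


markup = bare_radio_inputs("rating", ((1, 1), (2, 2)), {"class": "star"}, checked=2)
```

The result is just a run of input tags carrying the star class, which is the shape the jQuery plugin expects to find.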

Now, all you have to do is make your form add our special renderer to the field and push the rating value into the right data model when the form is submitted.

In the forms.py file, add a field to your form. Don’t forget to create the rating choices tuple and add the ‘star’ class to the field.

RATING_CHOICES = ((1,1), (2,2), (3,3), (4,4), (5,5),)
# Inside your form class; the field is named rating to match the save method below
rating = forms.CharField(widget=forms.RadioSelect(renderer=StarsRadioFieldRenderer, attrs={'class':'star'}, choices=RATING_CHOICES))

In your form’s model, add this to the save method to connect the form data to the actual model’s data (if you need to):

def save(self, *args, **kwargs):
  myobject.rating.add(score=self.rating, user=self.user, ip_address=self.ip_address)
  super(CommentWithRating, self).save(*args, **kwargs)

Here’s what it looks like attached to a comments form:

django-rating and jquery star rating together

 

jQuery, OpenLayers.Layer.Vector and IE8

One of the very first things you learn about OpenLayers is the importance of initializing maps in either the body tag’s onload event or in script tags at the bottom of the page. The basic reason for this is that many of the OpenLayers components need a page’s DOM to be complete before they start adding their own DOM structure. The onload event is fired after the browser is finished loading everything else, and the bottom of the page naturally gets rendered last, so these two places tend to work the best. Makes sense, right?

If you’ve been using additional frameworks, like jQuery (or EXT or Django or Drupal or whatever), you’ve learned to rely on that framework’s onready event, or its rough equivalent. This event tends to be fired before the onload event so that the frameworks don’t need to wait for every last image and tag to be loaded. In theory, this could be a very problematic place to put OpenLayers map initialization code. The DOM elements that OpenLayers is looking for may or may not be there, and there’s no way to tell. In practice, most browsers, and most OpenLayers components, work fine when initialized in a framework’s onready event. One layer in particular (OpenLayers.Layer.Vector) will not play nice with one particular browser (IE8) if initialized in a framework’s onready event. The vector layer depends on the document object’s namespace attribute, and this attribute simply isn’t always available before IE’s onload event (which is called after the onready event). As you can imagine, this was one headache I wish I could have avoided.

Thankfully, there is a fairly simple fix. Instead of the usual convention of:

$(document).ready(function() { init_my_map(); });

Add a stand-alone script tag to the extreme end of the body content on the pages where you need a vector layer:

<script>
    init_my_map();
</script>

Geocoding The Centennial Expo… Without A Geocoder!

Set your wayback machine to 1876, the place: the Philadelphia Centennial Exhibition. Wandering around the festival grounds, you would have seen several cameramen snapping away. Some of those images have survived admirably as the Free Library of Philadelphia’s Centennial Exhibition Collection. Over at PhillyHistory.org, we recently added a selection of those images, and the geocoding presented quite a challenge.

The first hurdle to be overcome was that the vast majority of the Centennial Expo’s buildings (and some of the streets) don’t exist anymore! The Expo took place in a swath of Philly’s Fairmount Park, most of which has turned back into trails and fields over the years.  Some of the buildings were torn down just after the Expo closed; some of them still exist, but in other places. Only a very few buildings and landmarks still occupy the same space they did back in 1876. But do not despair for there was a map! Included in the Free Library’s records was a map of the entire fair, complete with all of the buildings and roads built for the Expo, along with neat things like fountains and plazas. Dana Bauer, one of Azavea’s employees and a geo-referencing expert, took charge of the map and managed to line up enough of the old landmarks with existing geography to give us very accurate coordinates for all of the Centennial Expo’s old buildings! Hooray!

So now we have coordinates, but we need to be able to tie those coordinates to the image records in our system. Here the data was slightly uncooperative. Many of the image records had a building mentioned in some fashion and in one field or another. The challenge here was that the text in the record didn’t always match the official name of the building from the map. For example: the Main Exhibit Hall was mentioned in records as Main Exhibition Building, Main Hall, Main Bldg and several other permutations. We spent some time figuring out around 20 aliases for some of the more popular buildings in the collection.

The last step was to build a parsing-and-geocoding module to find the building references in the record data, match them up with the coordinates we found and add that information to our database. Easy, right? Not so fast. All 150-some coordinate pairs and their building text keys needed to be accessible somehow to our data importer. We thought for a while about how best to do that. 150 entries isn’t a huge list, but it’s not short either. We didn’t want to have to mess with it at all once we set something up. This list also wasn’t likely to change over time, so the list didn’t need to be very accessible or easily editable by people. However, we were also going to need it again to return to the Free Library in some format or another at the end of the project. The coordinates were in an Excel spreadsheet by this point, but we didn’t want to deal with the overhead of accessing it directly. We wound up creating a data dictionary inside our importer; kind of a mini geocoder table just for this swath of Fairmount Park circa 1876. We left the parent Excel spreadsheet alone as both a backup archive and the basis for the document we’ll be returning to the Free Library.

While looking through the data for building aliases, we also noted the fields where these names were found. So all we had to do was write a small function to loop through these fields, do some text matching on our data dictionary and add the coordinates to the record in our system. Out of over 1500 images, we wound up geocoding over 90% using this method, and most of the rest simply didn’t have a building reference to find. The entire import, images and all, took under 10 minutes. Not bad at all!
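For the curious, the matching step can be sketched in a few lines of Python. This is not the importer itself: the coordinates are dummy placeholder values, the field names are hypothetical, and only the Main Exhibit Hall aliases mentioned above are included.

```python
# Placeholder coordinates: dummy values only, not the georeferenced results.
BUILDING_COORDS = {
    "main exhibit hall": (1.0, 2.0),
}

# Alternate spellings found in the records, mapped to the official name.
ALIASES = {
    "main exhibition building": "main exhibit hall",
    "main hall": "main exhibit hall",
    "main bldg": "main exhibit hall",
}

SEARCH_FIELDS = ("title", "description")  # hypothetical field names


def geocode_record(record):
    """Return (x, y) for the first known building name in the record, else None."""
    # Try longer names first so the most specific match wins.
    names = sorted(list(BUILDING_COORDS) + list(ALIASES), key=len, reverse=True)
    for field in SEARCH_FIELDS:
        text = record.get(field, "").lower()
        for name in names:
            if name in text:
                return BUILDING_COORDS[ALIASES.get(name, name)]
    return None


found = geocode_record({"title": "View of Main Bldg, north side"})
missing = geocode_record({"title": "Crowd on opening day"})
```

Records with no recognisable building name simply come back as None, which matches the small share of images that could not be geocoded this way.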