<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Azavea Labs</title>
	<atom:link href="http://www.azavea.com/blogs/labs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.azavea.com/blogs/labs</link>
	<description>Insight on what our engineers are doing</description>
	<lastBuildDate>Thu, 26 Jan 2012 16:14:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Introducing python-sld and django-sld</title>
		<link>http://www.azavea.com/blogs/labs/2011/12/introducing-python-sld-and-django-sld/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/12/introducing-python-sld-and-django-sld/#comments</comments>
		<pubDate>Wed, 21 Dec 2011 18:23:04 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Posts]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1830</guid>
		<description><![CDATA[python-sld python-sld is a simple python library that enables some basic manipulation of StyledLayerDescriptor (SLD) documents. What are SLD documents?  SLD is a standard defined by the Open Geospatial Consortium, or OGC. In their words: The OpenGIS® Styled Layer Descriptor (SLD) Profile of the OpenGIS® Web Map Service (WMS) Encoding Standard defines an encoding that extends the [...]]]></description>
			<content:encoded><![CDATA[<h1>python-sld</h1>
<p><a title="python-sld on pypi.python.org" href="http://pypi.python.org/pypi/python-sld/">python-sld</a> is a simple python library that enables some basic manipulation of StyledLayerDescriptor (SLD) documents.</p>
<p>What are SLD documents?  <a title="SLD standard" href="http://www.opengeospatial.org/standards/sld">SLD</a> is a standard defined by the <a href="http://www.opengeospatial.org/">Open Geospatial Consortium</a>, or OGC. In their words:</p>
<blockquote><p>The OpenGIS® Styled Layer Descriptor (SLD) Profile of the OpenGIS® Web Map Service (WMS) Encoding Standard defines an encoding that extends the WMS standard to allow user-defined symbolization and coloring of geographic feature and coverage data.</p></blockquote>
<p>In layman&#8217;s terms, SLD is a common way to style your own maps that come from any map server that speaks <a title="WMS standard" href="http://www.opengeospatial.org/standards/wms">WMS</a> (another standard by OGC). Of all the GIS tools available, the WMS server ecosystem is exceptionally rich and diverse. There are <a href="http://www.intergraph.com/sgi/products/productFamily.aspx?family=10">many</a> <a href="http://resources.arcgis.com/content/arcims/10.0/about">proprietary</a> <a href="http://www.pbinsight.com/products/location-intelligence/developer-tools/desktop-mobile-and-internet-offering/mapxtreme-2008/">choices</a>, as <a href="http://www.qgis.org/">well</a> <a href="http://goworldwind.org/server/">as a </a>plethora <a href="http://www.resc.rdg.ac.uk/trac/ncWMS/">of</a> <a href="http://www.mapserver.org/">open</a> <a href="http://mapguide.osgeo.org/">source</a> <a href="http://mapnik.org/">options</a>.</p>
<h2>State of the Art</h2>
<p>Recently in the course of developing new features for <a href="http://www.districtbuilder.org/">DistrictBuilder</a>, we arrived at a point where we needed to generate SLDs dynamically. Looking around at the existing python libraries, we examined:</p>
<ul>
<li><a href="https://github.com/opengeogroep/pySLD">pySLD</a></li>
<li><a href="http://www.webrian.ch/2011/10/save-as-sld-030-released.html">qGIS plugin &#8220;Save as SLD&#8221;</a></li>
<li><a href="http://geoscript.org/py/">Geoscript</a></li>
</ul>
<p>What we were looking for was pure object-model access to the components of an SLD, as well as XML validation, with very few dependencies. None of the above projects really fit the bill, so we started working on our own.</p>
<h2>Introducing python-sld</h2>
<p>python-sld is an open source (<a href="http://www.apache.org/licenses/LICENSE-2.0.html">Apache 2.0</a>) library for dynamic SLD creation and manipulation. The project is hosted over on <a href="https://github.com/azavea/python-sld/">github</a>, and the packages are in <a href="http://pypi.python.org/pypi/python-sld">pypi</a> (including generated inline <a href="http://packages.python.org/python-sld/sld.CssParameter-class.html">documentation</a>).</p>
<h3><strong>Features</strong></h3>
<p>With python-sld, creating new SLD documents is as easy as creating a new instance of a <a href="http://packages.python.org/python-sld/sld.StyledLayerDescriptor-class.html">StyledLayerDescriptor</a> object:</p>
<pre>&gt;&gt;&gt; from sld import *
&gt;&gt;&gt; sld_doc = StyledLayerDescriptor()</pre>
<p>With this SLD document, all descendants are accessed as properties, and most child objects are created off the parent with &#8220;create_xxx()&#8221; methods:</p>
<pre>&gt;&gt;&gt; sld_doc.NamedLayer is None
True
&gt;&gt;&gt; nl = sld_doc.create_namedlayer('My Layer')
&gt;&gt;&gt; nl.Name
'My Layer'</pre>
<p>For most complex types, the parent&#8217;s property is an instance of the class. In our example:</p>
<pre>&gt;&gt;&gt; isinstance(nl, NamedLayer)
True
&gt;&gt;&gt; us = nl.create_userstyle()
&gt;&gt;&gt; us.Title = 'Style Title'
&gt;&gt;&gt; us.Title
'Style Title'
&gt;&gt;&gt; isinstance(us, UserStyle)
True</pre>
<p>A couple pythonic classes break up the monotony, too. For elements that contain collections of items (a <a href="http://packages.python.org/python-sld/sld.FeatureTypeStyle-class.html">FeatureTypeStyle</a> element may contain many <a href="http://packages.python.org/python-sld/sld.Rule-class.html">Rule</a> elements, and <a href="http://packages.python.org/python-sld/sld.Fill-class.html">Fill</a>, <a href="http://packages.python.org/python-sld/sld.Stroke-class.html">Stroke</a>, and <a href="http://packages.python.org/python-sld/sld.Font-class.html">Font</a> elements may contain many <a href="http://packages.python.org/python-sld/sld.CssParameter-class.html">CssParameter</a> elements), they behave as pythonic lists.</p>
<pre>&gt;&gt;&gt; fts = us.create_featuretypestyle()
&gt;&gt;&gt; len(fts.Rules)
0
&gt;&gt;&gt; r1 = fts.create_rule('Criteria 1')
&gt;&gt;&gt; len(fts.Rules)
1
&gt;&gt;&gt; fts.Rules[0].Title == r1.Title
True</pre>
<p>Another bit of pythonic syntactic sugar is the combination of <a href="http://packages.python.org/python-sld/sld.Filter-class.html">Filter</a>s. By constructing filters (with the <a href="http://packages.python.org/python-sld/sld.Rule-class.html">Rule</a> as a parent) and combining them with &#8220;+&#8221; or &#8220;|&#8221;, they create logical &#8220;AND&#8221; and &#8220;OR&#8221; filters, respectively.</p>
<pre>&gt;&gt;&gt; f1 = Filter(r1)
&gt;&gt;&gt; f1.PropertyIsGreaterThan = PropertyCriterion(f1, 'PropertyIsGreaterThan')
&gt;&gt;&gt; f1.PropertyIsGreaterThan.PropertyName = 'number'
&gt;&gt;&gt; f1.PropertyIsGreaterThan.Literal = '-10'
&gt;&gt;&gt;
&gt;&gt;&gt; f2 = Filter(r1)
&gt;&gt;&gt; f2.PropertyIsLessThanOrEqualTo = PropertyCriterion(f2, 'PropertyIsLessThanOrEqualTo')
&gt;&gt;&gt; f2.PropertyIsLessThanOrEqualTo.PropertyName = 'number'
&gt;&gt;&gt; f2.PropertyIsLessThanOrEqualTo.Literal = '10'
&gt;&gt;&gt;
&gt;&gt;&gt; r1.Filter = f1 + f2</pre>
<p>When the SLD object is serialized, it will render an &#8220;ogc:And&#8221; element that contains both property comparisons. You may have noticed that both the &#8220;PropertyIsGreaterThan&#8221; and &#8220;PropertyIsLessThanOrEqualTo&#8221; properties are assigned an instance of a <a href="http://packages.python.org/python-sld/sld.PropertyCriterion-class.html">PropertyCriterion</a> class. This is the common class for all property comparators. The name of the comparator determines its logical comparison (less than, greater than, equal to, etc.), and the class has a PropertyName and a Literal property, which control which property gets compared and which value it is compared against.</p>
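<p>To make the operator sugar concrete, here is a minimal, self-contained sketch of how &#8220;+&#8221; and &#8220;|&#8221; can be overloaded to build AND/OR expressions. It is a toy model, not python-sld&#8217;s implementation: the Criterion class, the COMPARATORS table and the evaluate() helper are hypothetical names made up for this illustration, and expressions are plain tuples rather than XML nodes.</p>
<pre>import operator

# Hypothetical lookup from comparator name to a Python comparison function
COMPARATORS = {
    'PropertyIsGreaterThan': operator.gt,
    'PropertyIsLessThanOrEqualTo': operator.le,
    'PropertyIsEqualTo': operator.eq,
}


class Criterion:
    """Toy filter expression (illustration only, not sld.Filter)."""

    def __init__(self, expr):
        self.expr = expr

    def __add__(self, other):
        # "+" combines two criteria into a logical AND
        return Criterion(('And', self.expr, other.expr))

    def __or__(self, other):
        # "|" combines two criteria into a logical OR
        return Criterion(('Or', self.expr, other.expr))

    def matches(self, feature):
        return evaluate(self.expr, feature)


def evaluate(expr, feature):
    kind, first, second = expr
    if kind == 'And':
        return evaluate(first, feature) and evaluate(second, feature)
    if kind == 'Or':
        return evaluate(first, feature) or evaluate(second, feature)
    # Leaf expression: (comparator name, property name, literal)
    return COMPARATORS[kind](feature[first], second)


f1 = Criterion(('PropertyIsGreaterThan', 'number', -10))
f2 = Criterion(('PropertyIsLessThanOrEqualTo', 'number', 10))
both = f1 + f2
either = f1 | f2</pre>
<p>In this sketch, both.matches({'number': 5}) is true and both.matches({'number': 20}) is false, which mirrors the range that the combined filter in the example above describes.</p>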
<p>Finally, serialization is performed on the main StyledLayerDescriptor object, with options to &#8216;prettify&#8217; the output:</p>
<pre>&gt;&gt;&gt; content = sld_doc.as_sld(pretty_print=True)</pre>
<h3><strong>Dependencies</strong></h3>
<p>The <a href="http://lxml.de/">lxml</a> library is required by python-sld. This is the library that provides the underlying parsing and serializing of the XML document, as well as the validation steps against the canonical SLD schema.</p>
<h3><strong>Limitations</strong></h3>
<p>At the current time, only a subset of the entire SLD specification is implemented. All SLD elements are parsed and stored, but only the following elements may be manipulated as objects in python-sld:</p>
<ul>
<li>StyledLayerDescriptor</li>
<li>NamedLayer</li>
<li>Name (of NamedLayer)</li>
<li>UserStyle</li>
<li>Title (of UserStyle and Rule)</li>
<li>Abstract</li>
<li>FeatureTypeStyle</li>
<li>Rule</li>
<li>ogc:Filter (implicit ogc:And and ogc:Or)</li>
<li>ogc:PropertyIsNotEqualTo</li>
<li>ogc:PropertyIsLessThan</li>
<li>ogc:PropertyIsLessThanOrEqualTo</li>
<li>ogc:PropertyIsEqualTo</li>
<li>ogc:PropertyIsGreaterThanOrEqualTo</li>
<li>ogc:PropertyIsGreaterThan</li>
<li>ogc:PropertyIsLike</li>
<li>ogc:PropertyName</li>
<li>ogc:Literal</li>
<li>PointSymbolizer</li>
<li>LineSymbolizer</li>
<li>PolygonSymbolizer</li>
<li>TextSymbolizer</li>
<li>Mark</li>
<li>Graphic</li>
<li>Fill</li>
<li>Stroke</li>
<li>Font</li>
<li>CssParameter</li>
</ul>
<p>All other SLD elements cannot be directly manipulated in python-sld, but are accessible (from a parsed SLD that is perhaps more complex) via the parent object&#8217;s _node property. This is the lxml.Element that the python-sld class represents.</p>
<h1>django-sld</h1>
<p>django-sld builds upon the capabilities in python-sld by enabling quick SLD generation from geographic models. This library is separate from the python-sld library because of the dependencies on <a href="https://www.djangoproject.com/">django</a> and <a href="http://code.google.com/p/pysal/">pysal</a>, the Python Spatial Analysis Library.</p>
<h2>Primer on Geographic Models</h2>
<p>I gave a quick background to geographic models in django to the <a href="http://www.meetup.com/djangoboston/events/43145722/">Boston django meetup</a> last week, and the slides of my presentation are <a href="https://docs.google.com/present/view?id=ddpq33ft_104cdq773cs">available online</a> as a presentation in Google Docs. The slides are embedded here for your convenience:</p>
<p><iframe src="https://docs.google.com/present/embed?id=ddpq33ft_104cdq773cs" frameborder="0" width="410" height="342"></iframe></p>
<h2>Introducing django-sld</h2>
<p>django-sld is an open source (<a href="http://www.apache.org/licenses/LICENSE-2.0.html">Apache 2.0</a>) library for generating SLD documents from geographic querysets. The project is hosted over on <a href="https://github.com/azavea/django-sld/">github</a>, and the packages are in <a href="http://pypi.python.org/pypi/django-sld">pypi</a> (including generated inline <a href="http://packages.python.org/django-sld/">documentation</a>).</p>
<h3><strong>Features</strong></h3>
<p>django-sld enables quick classification of geographic querysets by passing the data distribution of an individual model field into the classification algorithms built into pysal. Not all classification methods in pysal are available, however. At the current version (1.0.3), the following classification algorithms are supported:</p>
<ul>
<li>Equal Interval</li>
<li>Fisher Jenks</li>
<li>Jenks Caspall</li>
<li>Jenks Caspall Forced</li>
<li>Jenks Caspall Sampled</li>
<li>Max P Classifier</li>
<li>Maximum Breaks</li>
<li>Natural Breaks</li>
<li>Quantiles</li>
</ul>
<p>To classify a django queryset, use any of the as_xxx() methods in the djsld.generator module.</p>
<pre>&gt;&gt;&gt; from djsld import generator
&gt;&gt;&gt; qs = MySpatialModel.objects.all()
&gt;&gt;&gt; sld = generator.as_quantiles(qs, 'population', 10)</pre>
<p>The above example assumes that you have a model named &#8220;MySpatialModel&#8221; in django&#8217;s models.py file. The result is a sld.<a href="http://packages.python.org/python-sld/sld.StyledLayerDescriptor-class.html">StyledLayerDescriptor</a> object, which may be serialized to a string with &#8220;as_sld()&#8221;:</p>
<pre>&gt;&gt;&gt; sld_content = sld.as_sld(pretty_print=True)</pre>
<p>The &#8220;pretty_print&#8221; option is available to format the SLD in a fashion that is more readable by us humans.</p>
<p>In addition to simple models, django&#8217;s support for related fields really shines, as it&#8217;s possible to classify the distribution on any related field, using the &#8220;__&#8221; (double underscore) format preferred by django:</p>
<pre>&gt;&gt;&gt; sld = generator.as_quantiles(qs, 'city__population', 10)</pre>
<p>The one caveat is that the PropertyName in the criteria will be set to this field name (which is not the way most mapping packages refer to related fields). To accommodate this difference, you may use the &#8216;propertyname&#8217; keyword to control the output PropertyName:</p>
<pre>&gt;&gt;&gt; sld = generator.as_quantiles(qs, 'city__population', 10,
... propertyname='population')</pre>
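<p>If you have not worked with classification before, the idea behind a quantile classification can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: pysal&#8217;s own Quantiles classifier may pick its break values differently (for example by interpolating), and the function names quantile_breaks() and class_index() are hypothetical, not part of django-sld or pysal.</p>
<pre>import bisect


def quantile_breaks(values, k):
    """Return k upper class bounds so each class holds about len(values)/k items."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError('cannot classify an empty distribution')
    breaks = []
    for i in range(1, k + 1):
        # Position of the last value in the i-th class: ceil(i * n / k) - 1
        idx = -(-i * n // k) - 1
        breaks.append(ordered[idx])
    return breaks


def class_index(value, breaks):
    """Index of the first class whose upper bound is not below the value."""
    return bisect.bisect_left(breaks, value)


populations = [120, 450, 800, 1500, 2300, 5200, 9100, 15000, 40000]
breaks = quantile_breaks(populations, 3)   # [800, 5200, 40000]
middle = class_index(1500, breaks)         # 1, the second class</pre>
<p>Each class would then map naturally onto one styling rule whose filter covers the range between two neighbouring breaks.</p>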
<h3><strong>Dependencies</strong></h3>
<p>django-sld requires python-sld and the pysal library.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/12/introducing-python-sld-and-django-sld/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Putting the Fun in FOSS</title>
		<link>http://www.azavea.com/blogs/labs/2011/09/putting-the-fun-in-foss/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/09/putting-the-fun-in-foss/#comments</comments>
		<pubDate>Fri, 23 Sep 2011 15:38:42 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[foss4g]]></category>
		<category><![CDATA[GeoServer]]></category>
		<category><![CDATA[i2maps]]></category>
		<category><![CDATA[mapnik]]></category>
		<category><![CDATA[nodejs]]></category>
		<category><![CDATA[opensource]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1814</guid>
		<description><![CDATA[I went to the State of the Map (SotM) and Free and Open Source Software for Geospatial (FOSS4G) Conference in Denver, CO last week, where I was surrounded by geospatial users, developers, and architects. I had the opportunity to attend some workshops and learn about a slew of awesome projects &#8212; I&#8217;m itching to start [...]]]></description>
			<content:encoded><![CDATA[<p>I went to the State of the Map (<a href="http://stateofthemap.org/">SotM</a>) and Free and Open Source Software for Geospatial (<a href="http://2011.foss4g.org/">FOSS4G</a>) Conference in Denver, CO last week, where I was surrounded by geospatial users, developers, and architects. I had the opportunity to attend some workshops and learn about a slew of awesome projects &#8212; I&#8217;m itching to start incorporating many of these new tools and techniques into our solutions.</p>
<h2>Node.js</h2>
<p>I was able to attend some of the workshops: &#8220;You&#8217;ve got Javascript in your backend&#8221; with <a href="http://nodejs.org/">Node.js</a> and <a href="http://polymaps.org/">Polymaps</a> was a great beginner workshop, introducing lightweight servers and client mapping libraries. I was amazed that a basic web server in node.js is only 5 lines of code. Equally amazing was seeing what capabilities Polymaps had when it weighed in at only 32K (minified) vs. <a href="http://www.openlayers.org/">OpenLayers</a> at 1.2M (minified default build).</p>
<h2>i2maps + pico</h2>
<p>Some exciting visualization tools are coming out of the <a href="http://ncg.nuim.ie/">National Center for Geocomputation</a> at the <a href="http://www.nuim.ie/">National University of Ireland</a>, in the form of <a href="http://ncg.nuim.ie/i2maps/docs/">i2maps</a>. While it&#8217;s relatively immature (not much in the form of documentation), most of the basic functionality builds on OpenLayers. Since I&#8217;ve already learned the OpenLayers library, I had a short learning curve and was able to get up to speed pretty quickly. Their library incorporates some awesome features like dynamically loading and evaluating rasters via canvas (this only works on modern browsers), and even agent-based modeling. I could have stayed in that workshop for a week.</p>
<p>A byproduct of the i2maps project is <a href="https://github.com/fergalwalsh/pico">pico</a>. Pico is a bridge between <a href="http://www.python.org/">Python</a> and Javascript, enabling you to call native Python methods directly from Javascript. It performs all the plumbing for you, allowing you to write a simple callback to handle your method&#8217;s return value. It also takes care of converting Python objects into Javascript objects, allowing you to pass all sorts of data back and forth (including rasters!).</p>
<h2>mod-geocache</h2>
<p>Another new project from a contributor to the MapServer project is <a href="http://code.google.com/p/mod-geocache/">mod-geocache</a>, a tile caching service implemented as an <a href="http://httpd.apache.org/">Apache</a> module. Running inside the web server skips a lot of overhead (no proxying, no interpreters, no CGI), and the C implementation is very fast. You can perform on-the-fly tile merging, quantization, and recompression. I&#8217;m excited about this module, and the promise of caching with an Apache server (it looks like it has more features than <a href="http://wiki.openstreetmap.org/wiki/Mod_tile">mod_tile</a>).</p>
<h2>Geoserver</h2>
<p><a href="http://geoserver.org/display/GEOS/Welcome">Geoserver</a>&#8217;s next release is also going to include some great features. The ones that really jumped out at me:</p>
<ul>
<li>Time and elevation filters &#8212; e.g. storm tracking, where you can limit the features by a time field.</li>
<li>Styling SLDs in data units &#8212; e.g. &#8220;road is 5m wide&#8221;, and changes dynamically with scale. This greatly simplifies scale-dependent renderers.</li>
<li>Georeferencing of layers can be done in the admin interface.</li>
<li>Layers can be view definitions &#8212; you don&#8217;t have to roll your own views prior to creating the layer.</li>
<li>Virtual Services &#8212; partition the data layers by workspace.</li>
</ul>
<p>These aren&#8217;t all the new features; take a look at the <a href="http://geoserver.org/display/GEOS/State+of+GeoServer+2011">laundry list</a> yourself, and prepare to be impressed.</p>
<h2>Mapnik 2</h2>
<p>I think the reason for calling it <a href="http://trac.mapnik.org/">Mapnik2</a> is that it is literally twice as awesome as it was before. I learned about the new features in Mapnik2 in the lightning talks at SotM, and I think this was one of the few talks that made you feel like you were actually struck by lightning. I can&#8217;t remember half the slides in the talk, but the supported formats, reprojection, styling, and speed improvements left me with my head spinning.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/09/putting-the-fun-in-foss/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Django, contests and weekly voting</title>
		<link>http://www.azavea.com/blogs/labs/2011/09/django-contests-and-weekly-voting/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/09/django-contests-and-weekly-voting/#comments</comments>
		<pubDate>Thu, 15 Sep 2011 13:13:14 +0000</pubDate>
		<dc:creator>Carissa Brittain</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[Django]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1800</guid>
		<description><![CDATA[I&#8217;ve written before about how OpenDataPhilly uses a ratings module to drive a nomination system. Recently, we added a contest to the site to determine what kinds of data local non-profits and the public would like to see made available. Contests generally have a winner and, in this case, we&#8217;re letting the public vote on [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve written before about how <a title="OpenDataPhilly" href="http://www.opendataphilly.org" target="_blank">OpenDataPhilly</a> uses a <a title="jQuery Star Rating with Django-ratings" href="http://www.azavea.com/blogs/labs/2011/04/jquery-star-rating-with-django-ratings/" target="_blank">ratings module</a> to drive a nomination system. Recently, we added a <a title="The OpenDataRace is on!" href="http://www.opendataphilly.org/contest/" target="_blank">contest</a> to the site to determine what kinds of data local non-profits and the public would like to see made available. Contests generally have a winner and, in this case, we&#8217;re letting the public vote on data sets nominated by non-profits. At first glance this isn&#8217;t much different from our current nomination system, but there&#8217;s one catch; we wanted users to be able to vote for one entry once a week. Turns out this was more novel than it sounds.</p>
<p>Django has a few modules for rating or voting on content, one of which we&#8217;re using for the nomination and comments systems. The inner-workings of the module boil down to the following rules:</p>
<ol>
<li>A user must be logged in to rate/vote</li>
<li>A user can rate/vote for any number of items</li>
<li>A user can only rate/vote for any particular item once (though they may change their rating/vote later)</li>
</ol>
<p>Compare this with the rules we wanted to enforce for the contest:</p>
<ol>
<li>A user must be logged in to vote</li>
<li>A user can only vote once per 7 day period</li>
<li>A user can vote for an item multiple times, so long as <a title="Though not this rule 2" href="http://en.wikipedia.org/wiki/Three_Laws_of_Robotics" target="_blank">rule 2</a> is preserved</li>
</ol>
<p>Aside from the first rule, we were trying to do almost <a title="An... evil opposite? Nope." href="http://tvtropes.org/pmwiki/pmwiki.php/Main/EvilCounterpart?from=Main.EvilOpposite" target="_blank">exactly the opposite</a> of what our rating module enforced. Rather than retrofit the existing module to allow additional and sometimes <a title="Wisely avoided" href="http://en.wikipedia.org/wiki/Liar!" target="_blank">contradictory behaviour</a>, we decided to write a very small voting module of our own.</p>
<p>The code revolves around two decision points: is voting allowed, and can a specific user vote now? The first question is answered by the contest object itself. A contest knows what its start and end dates are, so if today is after the start date and before the end date, then voting is allowed.</p>
<p>The second question is a bit more complicated, but not by much. Because of rule 2 above, we need to know when a user last voted to know if they&#8217;re currently <a title="A legal mess in some systems" href="http://en.wikipedia.org/wiki/Voting_system" target="_blank">allowed to vote</a>. The database storage for a vote contains a datetime object, a foreign key to the user object and a foreign key to the contest entry so if we sort a user&#8217;s votes by time we can retrieve their latest vote.</p>
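<pre>import datetime

def user_can_vote(self, user):
    # Assumes this method lives on the contest model, so self.end_date exists
    now = datetime.datetime.today()
    # No votes are accepted once the contest has ended
    if now &gt; self.end_date:
        return False
    increment = datetime.timedelta(days=7)
    votes = user.vote_set.order_by('-timestamp')  # latest on top
    if votes:
        # Earliest moment this user may vote again
        next_date = votes[0].timestamp + increment
        if now &lt; next_date:
            return False
    return True</pre>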
<p>The above code gets a user&#8217;s votes and orders them by time with the most recent first. If a user has ever voted, we need to check if they&#8217;re allowed to vote again yet or if they have to wait. We calculate the earliest time that a user can vote next and check it against the date and time <a title="Now!" href="http://www.geekologie.com/2007/07/16/now-watch.jpg" target="_blank">now</a>. We also check the end of the contest against the date and time now. If &#8220;<a title="What am I looking at?" href="http://28.media.tumblr.com/5iKOt5ZqV55qd8c0MTPKKh4e_500.jpg" target="_blank">now</a>&#8221; is before the next time the user can vote or &#8220;<a title=".. or later." href="http://www.flickr.com/photos/jcorduroy/3725077603/" target="_blank">now</a>&#8221; is after the contest&#8217;s end date, we return false; the user can&#8217;t vote <a title="Now hiring! The board game!" href="http://citycyclops.com/1.04.10.php" target="_blank">now</a>. If a user has never voted before, or the dates are all ok then the user can vote. This check is done after a user clicks the &#8220;vote&#8221; button but before a vote is saved to the database. We also display a message saying why this check failed and when a user will be able to vote again.</p>
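<p>The message that tells a user when they can vote again needs little more than the timestamp of their latest vote. Here is a rough, framework-free sketch of that calculation; the names VOTE_INTERVAL, time_until_next_vote() and wait_message() are hypothetical and are not taken from our contest module.</p>
<pre>import datetime

# Rule 2: one vote per 7 day period
VOTE_INTERVAL = datetime.timedelta(days=7)


def time_until_next_vote(last_vote, now):
    """Time left before the user may vote again, clamped at zero."""
    return max(last_vote + VOTE_INTERVAL - now, datetime.timedelta(0))


def wait_message(last_vote, now):
    remaining = time_until_next_vote(last_vote, now)
    if not remaining:
        # A zero timedelta is falsy: the waiting period is over
        return 'You can vote now.'
    return 'You can vote again on {:%Y-%m-%d %H:%M}.'.format(now + remaining)


last_vote = datetime.datetime(2011, 9, 10, 12, 0)
message = wait_message(last_vote, datetime.datetime(2011, 9, 15, 12, 0))</pre>
<p>With the sample dates in the sketch, the message reads &#8220;You can vote again on 2011-09-17 12:00.&#8221;</p>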
<p>So we&#8217;re taking advantage of all of the spam protection built into Django&#8217;s user registration process and running a contest on surprisingly little code: 3 database tables, 200 lines of python (blank lines included) and a few templates is all we needed!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/09/django-contests-and-weekly-voting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Restricting Zoom with Multiple OL Basemaps</title>
		<link>http://www.azavea.com/blogs/labs/2011/09/restricting-zoom-with-multiple-ol-basemap/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/09/restricting-zoom-with-multiple-ol-basemap/#comments</comments>
		<pubDate>Thu, 01 Sep 2011 18:41:42 +0000</pubDate>
		<dc:creator>Kenny Shepard</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[Bing]]></category>
		<category><![CDATA[DistrictBuilder]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[openlayers]]></category>
		<category><![CDATA[openstreetmap]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1595</guid>
		<description><![CDATA[As David recently posted, our team has been hard at work implementing DistrictBuilder, where we&#8217;ve been investing a great deal of effort on both performance and usability. One feature we added in the spring was the addition of basemaps to the user interface. Before this addition, users labored over drawing the perfect district configurations without a whole [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.azavea.com/districtbuilder"><img class="alignleft size-full wp-image-1783" style="border: 0px;" title="DistrictBuilder_logo" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/07/DistrictBuilder_logo.png" alt="DistrictBuilder logo" width="206" height="66" align="left" hspace="4" vspace="4" /></a>As David <a title="Building Districts in Web-Time" href="http://www.azavea.com/blogs/labs/2011/08/building-districts-in-web-time/">recently posted</a>, our team has been hard at work implementing <a title="DistrictBuilder on GitHub" href="https://github.com/PublicMapping/DistrictBuilder">DistrictBuilder</a>, where we&#8217;ve been investing a great deal of effort on both performance and usability. One feature we added in the spring was the addition of basemaps to the user interface. Before this addition, users labored over drawing the perfect district configurations without a whole lot of context of the surrounding environment (e.g. roads, water boundaries, etc.). When the time came to add a basemap to the application, it didn&#8217;t feel right restricting it to a single type of map, or even a single provider. We wanted to allow for users to have the choice to select the best map for the task at hand. Could an application promoting democracy really have it any other way?</p>
<p>We set out to support several base map options as well as any combination of options, including:</p>
<ul>
<li>Bing Maps (satellite, roads and hybrid)</li>
<li>GoogleMaps (satellite, roads and hybrid)</li>
<li>ArcGIS Online (any of several maps)</li>
<li>OpenStreetMap</li>
</ul>
<p>Since DistrictBuilder needs to be flexible enough to meet the needs of users and administrators in a variety of situations, we decided on a two step approach to basemap configuration. First, the administrator specifies, in the configuration file, which of the combinations of map providers and map types are allowed to be selected. Then DistrictBuilder presents all of the configured options to the user, who can switch among them at any time while viewing or editing a plan.</p>
<p>Here&#8217;s an example of an instance configured with an <a title="OpenStreetMap" href="http://www.openstreetmap.org/" target="_blank">OpenStreetMap</a> road layer, a <a title="Bing Maps" href="http://www.bing.com/maps/">Bing</a> hybrid layer, and a <a title="Google Maps" href="http://lmgtfy.com/?q=google+maps">Google</a> satellite layer:</p>
<p style="text-align: center;"><img class="size-full wp-image-1629 aligncenter" title="Road, Hybrid, and Satellite" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/07/rhs.png" alt="Road, Hybrid, and Satellite" width="253" height="216" /></p>
<p>Here&#8217;s another example with only road layers &#8212; one for each of the three configured providers:</p>
<p style="text-align: center;"><img class="size-full wp-image-1632 aligncenter" title="Roads for three vendors" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/07/rrr.png" alt="Roads for three vendors" width="302" height="202" /></p>
<p>DistrictBuilder currently allows the configuration of basemaps using permutations of each of the three vendors and three map types described above. Adding more options is a relatively easy task, however. With the launch of <a title="Fix Philly Districts" href="http://fixphillydistricts.com">Fix Philly Districts</a>, we wanted the basemap colors to be slightly more muted than the above options, and ended up adding support for the <a title="ArcGIS Online" href="http://www.azavea.com/blogs/labs/2011/02/openlayers-and-arcgis-com/" target="_blank">ArcGIS Online</a> World Topographic Map. We also experimented with the Google Maps V3 custom <a title="Google Styled Maps " href="http://code.google.com/apis/maps/documentation/javascript/styling.html">styling API</a>, which looked great, but introduced performance problems when panning and zooming (animations).</p>
<p>There were, of course, some hoops that needed to be jumped through in order to get all of these basemaps behaving correctly on the same map, which will be discussed below. I&#8217;ve extracted the logic required to do so into a small demo that can be viewed/downloaded <a href="http://s3.azavea.com.s3.amazonaws.com/com.azavea.www/blogs/multiple_base_layers/demo.html">here</a>. The demo has also been embedded into this post, and can be interacted with without going anywhere:</p>
<p><iframe src="http://s3.azavea.com.s3.amazonaws.com/com.azavea.www/blogs/multiple_base_layers/demo.html" frameborder="0" width="100%" height="370"></iframe></p>
<h1>Zoom Levels</h1>
<p>Many of the challenges that needed to be overcome to get this working correctly were brought about because we needed to restrict the zoom levels to the area at hand. We wanted to eliminate superfluous zoom levels to ensure the user was always operating within the appropriate boundaries (note: it is not done in this demo, but in DistrictBuilder we also restrict the extent with the &#8216;restrictedExtent&#8217; map parameter, so users can&#8217;t even pan outside of the area).</p>
<p>One difficulty with setting zoom levels on the different layers is that the layers don&#8217;t use zoom parameters consistently. In Bing (the VirtualEarth layer), minZoomLevel and maxZoomLevel are needed. In Google, minZoomLevel is needed, but it requires numZoomLevels instead of maxZoomLevel. And in OpenStreetMap (the OSM layer), well&#8230;no combination of those seems to work &#8212; we needed to slightly modify the XYZ layer (OSM&#8217;s base class) to allow maxResolution to be changed based on the minZoomLevel. To see how this is done, view the demo source. With that change in place, the list of required layer parameters is as follows:</p>
<ul>
<li><strong>Bing</strong> &#8211; minZoomLevel, maxZoomLevel, projection, sphericalMercator, maxExtent</li>
<li><strong>Google</strong> &#8211; numZoomLevels, minZoomLevel, projection, sphericalMercator, maxExtent</li>
<li><strong>OpenStreetMap</strong> &#8211; numZoomLevels, minZoomLevel, projection</li>
</ul>
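<p>As a rough sketch, the three parameter lists can be built from one pair of zoom limits. The helper name <tt>buildLayerOptions</tt>, the example zoom limits, and the &#8220;EPSG:3785&#8221; projection string are illustrative assumptions, not taken from the demo source. The sketch also assumes that numZoomLevels counts every level from the minimum to the maximum, inclusive. The maxExtent value is passed through untouched; in OpenLayers it would normally be a Bounds object.</p>
<pre>// Hypothetical helper: builds one options object per vendor from shared limits.
// Assumption: numZoomLevels = number of levels from minZoom to maxZoom, inclusive.
function buildLayerOptions(minZoom, maxZoom, projection, maxExtent) {
  var numZoomLevels = maxZoom - minZoom + 1;
  return {
    bing: {
      minZoomLevel: minZoom,
      maxZoomLevel: maxZoom,
      projection: projection,
      sphericalMercator: true,
      maxExtent: maxExtent
    },
    google: {
      numZoomLevels: numZoomLevels,
      minZoomLevel: minZoom,
      projection: projection,
      sphericalMercator: true,
      maxExtent: maxExtent
    },
    // Only meaningful with the modified XYZ layer described above.
    osm: {
      numZoomLevels: numZoomLevels,
      minZoomLevel: minZoom,
      projection: projection
    }
  };
}

// Example values only: zoom 10 through 17 over a placeholder extent.
var layerOptions = buildLayerOptions(10, 17, "EPSG:3785", [0, 0, 1, 1]);</pre>
<p>Keeping the limits in one place means a change to the minimum zoom reaches all three layers at once.</p>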
<h1>Coordinate Systems</h1>
<p>We also faced some problems related to coordinate systems. DistrictBuilder uses <a title="GeoServer" href="http://geoserver.org">GeoServer</a> and <a title="GeoWebCache" href="http://geowebcache.org/">GeoWebCache</a> to serve up WMS layers. The coordinate system of our data is one version of the <a title="GIS Standards gone crazy" href="http://viswaug.wordpress.com/2009/04/01/gis-standards-gone-crazy-epsg-especially/">ever-changing</a> &#8220;Popular Visualization CRS / Mercator&#8221; projection. We needed to match up the OpenLayers projection to the one used on our data; otherwise we saw slight offsets on our overlays. Unfortunately, the &#8216;projection&#8217; layer parameter isn&#8217;t always used within the layers correctly. For example, any layer using the SphericalMercator class gets its projection automatically hardcoded to 900913. We needed to make a slight modification to the SphericalMercator class to allow the &#8216;projection&#8217; parameter to carry through. This can be seen by viewing the demo source.</p>
<h1>Bonus: Math Time!</h1>
<p>One interesting part about implementing zoom restriction was that we needed it to work in any instance of DistrictBuilder &#8212; from large states to small towns, which may have vastly different extents. Instead of having an administrator figure out the proper minimum zoom level, we calculate it automatically based on the extent, which requires a little bit of basic algebra.</p>
<p>For Philadelphia, the extent of our area is:</p>
<pre>[-8397913.926216, 4842467.609439, -8329120.600772, 4895973.529229]</pre>
<p>In DistrictBuilder, we calculate this dynamically on the server side (using Django) by filtering all of the geounits in the database and calling the &#8216;extent&#8217; function on the query set. For the demo, this is hardcoded. Here&#8217;s how to transform this extent into a Spherical Mercator minZoomLevel:</p>
<ul>
<li>Find the width of the area in meters.</li>
</ul>
<pre>var studyWidthMeters = extent[2] - extent[0];</pre>
<ul>
<li>Find the width of the map in pixels. In the demo, this is hardcoded, because we are setting the div size of the map. In DistrictBuilder, the map takes up the whole screen, and this value is calculated on the fly based on the size of the div that the map occupies.</li>
</ul>
<pre>var mapWidthPixels = 450;</pre>
<ul>
<li>Find the map resolution, or meters per pixel.</li>
</ul>
<pre>var resolution = studyWidthMeters / mapWidthPixels;</pre>
<ul>
<li>Find the maximum map resolution. In Spherical Mercator, the maximum resolution is one 256&#215;256 tile spanning the entire circumference of the Earth. So dividing the circumference of the Earth (~40,000km) by 256 gives us the maximum meters per pixel, which is a constant.</li>
</ul>
<pre>var maxResolution = 156543.033928;</pre>
<ul>
<li>Spherical Mercator zoom levels work like a pyramid. Each zoom breaks the current tile up into a 2&#215;2 group of 256&#215;256 tiles, essentially halving the resolution each time. Therefore, finding the resolution at a given zoom level looks like this:</li>
</ul>
<pre>maxResolution / 2^zoom = resolution</pre>
<ul>
<li>We know the resolution and max resolution already, and need to find the zoom:</li>
</ul>
<pre>zoom = log(maxResolution/resolution)/log(2)</pre>
<ul>
<li>Or in javascript:</li>
</ul>
<pre>var minZoom = Math.log(maxResolution / resolution) / Math.LN2;</pre>
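<p>Put together, the steps above can be sketched as a single function. The name <tt>zoomForExtent</tt> is illustrative and does not come from the demo source. The final rounding step is also an assumption: the formula usually yields a fractional value, and rounding down is one way to make sure the whole study area still fits in the map.</p>
<pre>// Meters per pixel when one 256x256 tile covers the whole world (zoom 0).
var MAX_RESOLUTION = 156543.033928;

// Hypothetical helper: returns the (usually fractional) zoom level at which
// the extent's width exactly fills a map that is mapWidthPixels wide.
// The extent is [minX, minY, maxX, maxY] in Spherical Mercator meters.
function zoomForExtent(extent, mapWidthPixels) {
  var studyWidthMeters = extent[2] - extent[0];
  var resolution = studyWidthMeters / mapWidthPixels;
  return Math.log(MAX_RESOLUTION / resolution) / Math.LN2;
}

var phillyExtent = [-8397913.926216, 4842467.609439, -8329120.600772, 4895973.529229];
var zoom = zoomForExtent(phillyExtent, 450);

// Assumption: round down so the full width of the area stays visible.
var minZoomLevel = Math.floor(zoom);</pre>
<p>For the Philadelphia extent and a 450 pixel wide map, the computed zoom comes out very close to 10. Note that this only considers width; a tall, narrow study area would need the same calculation applied to its height.</p>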
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/09/restricting-zoom-with-multiple-ol-basemap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Building Districts in Web-Time</title>
		<link>http://www.azavea.com/blogs/labs/2011/08/building-districts-in-web-time/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/08/building-districts-in-web-time/#comments</comments>
		<pubDate>Mon, 22 Aug 2011 18:07:18 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[DistrictBuilder]]></category>
		<category><![CDATA[GeoServer]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[openlayers]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1581</guid>
		<description><![CDATA[Most recently, the Politics, Redistricting and Elections team has been working closely with the Public Mapping Project to build DistrictBuilder, an open source, web-based application that enables regular citizens to use powerful tools to draw their own legislative districts. If you&#8217;ve seen how badly the professionals can mangle districts (Exhibit A, Exhibit B, etc), it&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.azavea.com/districtbuilder"><img class="alignleft size-full wp-image-1783" style="border: 0px;" title="DistrictBuilder_logo" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/07/DistrictBuilder_logo.png" alt="DistrictBuilder logo" width="206" height="66" align="left" hspace="4" vspace="4" /></a>Most recently, the Politics, Redistricting and Elections team has been working closely with the <a title="The Public Mapping Project" href="http://www.publicmapping.org/">Public Mapping Project</a> to build <a title="District Builder" href="https://sourceforge.net/projects/publicmapping/">DistrictBuilder</a>, an open source, web-based application that enables regular citizens to use powerful tools to draw their own legislative districts. If you&#8217;ve seen how badly the professionals can mangle districts (<a title="Illinois Congressional District #4" href="http://www.redistrictingthenation.com/search.aspx?type=NATIONAL_LOWER&amp;id=4&amp;state=IL">Exhibit A</a>, <a title="Pennsylvania Congressional District #12" href="http://www.redistrictingthenation.com/search.aspx?type=NATIONAL_LOWER&amp;id=12&amp;state=PA">Exhibit B</a>, <a title="Top 10 Congressional Districts" href="http://www.redistrictingthenation.com/top10.aspx">etc</a>), it&#8217;s easy to imagine that any given citizen, given the right tools, could do it better.</p>
<p>We spent quite a bit of time making the application easy to use and responsive in modern desktop web browsers.  The &#8220;easy to use&#8221; part was tackled by our excellent UI/UX design team. The &#8220;responsive&#8221; part was the domain of  our engineers.  That&#8217;s where the fun began for me.</p>
<p>DistrictBuilder is designed to use any polygon shapefile, transform it into an internal data model, then make that accessible via map tiles and geometric features.  When serving map tiles, we use <a title="Geoserver" href="http://geoserver.org/display/GEOS/Welcome">GeoServer</a> and <a title="GeoWebCache" href="http://geowebcache.org/">GeoWebCache</a> to generate the tiles and cache them, respectively. This performance is great &#8212; pre-generated map tiles are the best we can aim for with respect to the base map tiles. Serving geometric features at full resolution, however, introduces a slew of problems. A few that stood out right away:</p>
<ul>
<li>Web Browser Limitations &#8212; 9 out of 10 experts agree: too many map features have a significant performance impact on web browsers, with the greatest impact on the Microsoft Internet Explorer browser.</li>
<li>Excessive Coordinates &#8212; delivering lots of polygon coordinate pairs that the user may never see consumes valuable bandwidth and rendering time.</li>
<li>Server Processing Time &#8212; recalculating state-wide geometric features consumes valuable CPU time.</li>
</ul>
<h1>Web Browser Limitations</h1>
<p>First, we tackled the browser performance issues. A sluggish browser is the kiss of death in the web world, and we had to make the application experience as fast as possible before looking at the server processing time.</p>
<p>We originally gave users the power to create highly detailed districts at the statewide level, but realized that no modern web browser could handle the volume of polygon features that would need to be served to represent an entire state.  In order to mitigate this limitation, we limited the size and number of features sent to the browser. With some scale-dependent logic, a user zoomed in to a detail of a district can finely tune the boundary by moving smaller geographic features (e.g. census blocks), and a user zoomed out to the state-wide level can manipulate the districts by moving large geographic features only (e.g. counties). In addition, when editing the finest details, we limit the number of features a user can move in a single edit.</p>
<h1>Excessive Coordinates</h1>
<p>The next thing to go was the set of full resolution geometries. In DistrictBuilder, users never actually see the full geometries, but an <a href="http://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm">adaptively simplified</a> (sometimes called generalized) geometry; depending on the scale of the map view, the server will deliver geometries with appropriate coordinate resolutions. Simply put: as you zoom in on the map, you get more detail in the geometries.</p>
<p>By simplifying counties, the geometries are reduced from 166,958 points to 4,821. When a user is zoomed out, there is no noticeable difference between these geometries!  However, as the user is interacting with higher resolution maps, DistrictBuilder loads in higher-resolution geometries on demand. The following images demonstrate the difference in the geometry detail:</p>
<div id="attachment_1689" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-1689 " title="Low Resolution Transition" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/08/county_vtd_transition-low1.png" alt="Low Resolution Transition" width="500" height="298" /><p class="wp-caption-text">The zoomed in County layer, with a low resolution district overlay (orange line). There are currently 1,414 coordinates in this view of the district overlay.</p></div>
<div id="attachment_1686" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-1686  " title="High Resolution Transition" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/08/county_vtd_transition-high.png" alt="High Resolution Transition" width="500" height="298" /><p class="wp-caption-text">The zoomed in VTD layer, with a high resolution district overlay (orange line). There are currently 3,253 coordinates in this view of the district overlay.</p></div>
<p>You can notice the differences in the district detail if you look closely at the orange district boundary. This transition happens seamlessly in the application, loading in the higher resolution geometries as web users zoom in to areas of interest.</p>
<p>We also eliminated coordinates that you never see.  It made no sense to serve coordinates located on the opposite side of the state from where a user was editing, just like you wouldn&#8217;t expect to get an <a title="World Book (tm) Encyclopedia" href="http://www.worldbook.com/encyclopedias/201072_basic_research_package_2010--classic.html" class="broken_link" rel="nofollow">encyclopedia</a> in the mail when releasing an <a title="Request for Information" href="http://en.wikipedia.org/wiki/Request_for_information">RFI</a>. With the <a title="OpenLayers" href="http://www.openlayers.org/">OpenLayers</a> library, Strategies came in handy here, particularly <a href="http://dev.openlayers.org/apidocs/files/OpenLayers/Strategy/BBOX-js.html">BBOX</a>.</p>
<h1>Server Processing Time</h1>
<p>After we had optimized the performance of the user interface, we shifted our focus to the server-side processing.  One of the features that makes DistrictBuilder such a powerful tool is the accuracy of the underlying data and constant feedback of important district statistics. In order to calculate all these statistics on the fly, it is necessary to leverage some tricks already mentioned with respect to map tiles: caching and generalizing.</p>
<p>Computation of the district statistics must happen every time a district boundary is changed. A naive solution to this problem would be to aggregate the values within the boundary every time a change is made.  This approach results in horrible performance. Instead, we just determine what has changed &#8212; which areas were added, which areas were removed &#8212; and recompute the delta, or change, on the previous district value.</p>
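<p>The delta idea can be illustrated with a small sketch. The function name <tt>applyDelta</tt> and the <tt>population</tt> field are hypothetical, and the real computation runs on the server rather than in the browser, so this only shows the arithmetic involved.</p>
<pre>// Hypothetical sketch: update a cached district total using only the
// units that were added or removed, instead of re-aggregating everything.
function applyDelta(previousTotal, addedUnits, removedUnits) {
  var sum = function (units) {
    return units.reduce(function (total, unit) {
      return total + unit.population;
    }, 0);
  };
  return previousTotal + sum(addedUnits) - sum(removedUnits);
}

// A district of 1000 people gains two units and loses one.
var updatedTotal = applyDelta(
  1000,
  [{ population: 50 }, { population: 25 }],
  [{ population: 10 }]
);
// updatedTotal is 1065</pre>
<p>The cost of an edit then depends on the size of the change, not on the size of the district.</p>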
<p>Another trick to optimizing performance is in the way we determine the changing boundaries.  I&#8217;ll describe the problem using the census geographies of counties, tracts, and blocks. The structure and detail of the underlying data yielded computationally expensive queries against the block geometries.  We came up with a method of searching for the geographies in a hierarchical fashion &#8212; searching the counties first, then continuing to the next smallest-scale geography only if there was any remaining geometry left in the query.  We did the same for the tracts, and took a shortcut at the block level to exclude the block geometries.  This increased server side performance considerably.</p>
<div id="attachment_1690" class="wp-caption aligncenter" style="width: 374px"><a href="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/08/heirarchy-lg.png"><img class="size-full wp-image-1690" title="King William County" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/08/heirarchy-sm.png" alt="King William County" width="364" height="258" /></a><p class="wp-caption-text">King William county is comprised of 22 Voter Tabulation Districts and 1,527 Census Blocks.</p></div>
<p>Consider the following scenario: a user wants to move King William County (highlighted in yellow) from District 1, which is overpopulated, to District 3, which is underpopulated. Changing the boundaries with all the blocks in King William County would require testing at least 4,000 blocks for spatial intersections, then aggregating 1,527 data values, and recomputing the spatial aggregate (union) of those 1,527 geometries. With our hierarchical approach, we can change the boundary of the district with the county boundary, and change the population totals by the county&#8217;s population. A few orders of magnitude fewer operations to perform, and much faster from the user&#8217;s perspective.</p>
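<p>One way to picture the hierarchical search is as a top-down walk over nested units. This is only a sketch under stated assumptions: the unit shape, the function name <tt>collectUnits</tt>, and the three-valued <tt>relation</tt> test are hypothetical stand-ins for the spatial queries that DistrictBuilder runs in the database.</p>
<pre>// Hypothetical unit shape: { id: "...", children: [ ...smaller units... ] }.
// relation(unit) stands in for a spatial test against the user's selection
// and returns "inside", "partial" or "outside".
function collectUnits(unit, relation, found) {
  var result = relation(unit);
  if (result === "inside") {
    // Take the whole unit; its children are never examined.
    found.push(unit);
  } else if (result === "partial") {
    // Only now descend to the next, finer level of geography.
    (unit.children || []).forEach(function (child) {
      collectUnits(child, relation, found);
    });
  }
  return found;
}

var county = {
  id: "county",
  children: [
    { id: "tract-a", children: [{ id: "block-1" }, { id: "block-2" }] },
    { id: "tract-b", children: [{ id: "block-3" }, { id: "block-4" }] }
  ]
};

// Made-up answers for a selection that covers tract-a and block-3.
var relations = {
  "county": "partial",
  "tract-a": "inside",
  "tract-b": "partial",
  "block-3": "inside",
  "block-4": "outside"
};

var selected = collectUnits(county, function (unit) {
  return relations[unit.id];
}, []);
var selectedIds = selected.map(function (unit) { return unit.id; });
// selectedIds is ["tract-a", "block-3"]</pre>
<p>In this example the blocks under tract-a are never tested, which is where the savings come from when a whole county or tract is moved.</p>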
<h1>Lessons Learned</h1>
<p>Throughout the DistrictBuilder development process, the same core performance challenge has arisen: the volume of data must be reduced. This applies to all aspects of the application:</p>
<ul>
<li>Map Tiles: pre-render tiles to keep the number of rendered tiles to a minimum at runtime.</li>
<li>Map Features: deliver to the browser only as much information as you can see (perhaps even less).</li>
<li>Database Queries: do anything possible to ensure that geometric operations are performed on simplified geometries.</li>
<li>Aggregating Statistics: cache whatever you can, and only compute the difference from the last cache state.</li>
</ul>
<p>The above steps reduced the sheer number of operations and volume of processing that both the server and browser need to complete when creating new districts. These are lessons that translate well to <em>any</em> &#8220;big data&#8221; problem, and are crucial in bringing sophisticated GIS operations to the web.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/08/building-districts-in-web-time/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Pending edit system using Django</title>
		<link>http://www.azavea.com/blogs/labs/2011/08/pending-edit-system-using-django/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/08/pending-edit-system-using-django/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 21:27:17 +0000</pubDate>
		<dc:creator>Carissa Brittain</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[opensource]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1680</guid>
		<description><![CDATA[A common concern when we talk to people about OpenTreeMap is how much to trust the public with an organisation&#8217;s tree inventory. Every implementation of this open source system has a different answer. The original site, UrbanForestMap.org, allows a logged-in user to edit almost every bit of information they gather about a tree. PhillyTreeMap.org requires [...]]]></description>
			<content:encoded><![CDATA[<p>A common concern when we talk to people about <a title="Open source, wiki-style tree inventory site" href="http://www.azavea.com/products/opentreemap/" target="_blank">OpenTreeMap</a> is how much to trust the public with an organisation&#8217;s tree inventory. Every implementation of this open source system has a different answer. The original site, <a title="Originally built by Urban-Ecos for San Francisco, CA" href="http://www.urbanforest.org/" target="_blank">UrbanForestMap.org</a>, allows a logged-in user to edit almost every bit of information they gather about a tree. <a title="13 counties around Philadelphia, PA" href="http://www.phillytreemap.org" target="_blank">PhillyTreeMap.org</a> requires a certain level of reputation before a user can edit everything, but even a new user has considerable edit capabilities. The most recent implementation (still a work-in-progress) introduces a bit of oversight to public edits. The managing group wanted to double-check changes to officially inventoried trees, but didn&#8217;t want to get in the way of people adding and editing their own trees.</p>
<p>Let&#8217;s look at how this changes the <a title="One way to gather requirements" href="http://en.wikipedia.org/wiki/User_story" target="_blank">user story</a> first:</p>
<p><em>A logged-in user makes an edit to a tree. The system needs to decide if these changes are applied to the tree or placed in a pending queue. If this is a publicly-entered tree, the changes are applied to the tree. (Start new requirements) If this is an inventory tree, and the user isn&#8217;t a member of a management group, add the change to the pending queue. Display any pending changes reasonably near the appropriate current value. (End new requirements)</em></p>
<p>Most of this happens behind the scenes in the saving logic. I added a bit of code to the top of our tree updater to check whether the pending system <a title="It's alive!!!" href="http://www.youtube.com/watch?v=xos2MnVxe-c" target="_blank">is active</a>, what permissions the user has, and where the tree came from. If everything checks out, the change goes straight into the updater code. For changes that go into the pending queue, the path to becoming an official change is a little more tortuous.</p>
<p>Since we&#8217;re storing these changes for later review, they have to go into the database. I created a <a title="Not from old wood though.." href="http://topdesigninterior.com/wp-content/uploads/2010/12/Unique-Tree-Table-Coffe-by-Dylan-Gold-stink_tree.jpg" target="_blank">new table</a> to hold onto the original tree&#8217;s id, the field being changed and the new value as well as the user who submitted it, a date/time stamp and a status field. Each pending change is stored separately; even if the user makes more than one change to the tree, each &#8216;pend&#8217; can be applied individually.</p>
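<p>The decision and the stored records might look roughly like the sketch below. It is written in JavaScript purely for illustration; the real system is a Django application, and every name here (<tt>shouldPend</tt>, <tt>makePends</tt>, <tt>isInventoryTree</tt>, <tt>isManager</tt>, and the record fields) is a hypothetical stand-in, not the actual OpenTreeMap schema.</p>
<pre>// Hypothetical check: does this edit go to the pending queue?
function shouldPend(pendingActive, tree, user) {
  if (!pendingActive) { return false; }        // pending system switched off
  if (!tree.isInventoryTree) { return false; } // public trees are edited directly
  return !user.isManager;                      // managers skip the queue
}

// Hypothetical record builder: one pending record per changed field,
// so that each change can be approved or rejected on its own.
function makePends(treeId, changes, username, submittedAt) {
  return Object.keys(changes).map(function (field) {
    return {
      treeId: treeId,
      field: field,
      value: changes[field],
      submittedBy: username,
      submittedAt: submittedAt,
      status: "pending"
    };
  });
}

var pends = makePends(7, { height: 12, species: "oak" }, "alice", "2011-08-19T21:27:17Z");
// pends holds two records, one for "height" and one for "species"</pre>
<p>Storing one record per field is what lets a reviewer accept a height correction while rejecting a species change from the same edit.</p>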
<p>The rest of the pending system is <a title="yummm..." href="http://www.lilsugar.com/Delilicious-Monster-Candy-Eyeballs-5756935" target="_blank">eye-candy</a> and a bit of slightly tedious templating. Almost every field on a tree&#8217;s detail page now needs to check two new things: are there any pending changes for this field, and does this user have permission to approve/disapprove pending edits. If there are pending edits, the new values are added below the current official value. When a managing user views the page, small approve and disapprove buttons also appear next to each pending change. Throw in a management-access-only page for some bulk evaluation and the system is complete!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/08/pending-edit-system-using-django/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bring on the data focused basemaps, Esri</title>
		<link>http://www.azavea.com/blogs/labs/2011/08/bring-on-the-data-focused-basemaps-esri/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/08/bring-on-the-data-focused-basemaps-esri/#comments</comments>
		<pubDate>Fri, 12 Aug 2011 18:33:16 +0000</pubDate>
		<dc:creator>Jeremy Heffner</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[ArcGIS.com]]></category>
		<category><![CDATA[Basemap]]></category>
		<category><![CDATA[ESRI]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1660</guid>
		<description><![CDATA[It&#8217;s great to read that Esri is working on features and basemaps to include within ArcGIS.com to support data visualization.     Sometimes the map isn&#8217;t the focus; sometimes the data is the focus.  A few weeks ago, Bern Szukalski wrote an article for the Esri Insider blog that spoke about Esri&#8217;s efforts to create [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s great to read that <a href="http://www.esri.com">Esri</a> is working on features and basemaps to include within <a href="http://www.arcgis.com">ArcGIS.com</a> to support data visualization.     Sometimes the map isn&#8217;t the focus; sometimes the data is the focus.  A few weeks ago, Bern Szukalski wrote an article for the Esri Insider blog <a href="http://blogs.esri.com/Info/blogs/esri-insider/archive/2011/07/25/inside-new-basemaps.aspx">that spoke about Esri&#8217;s efforts to create new basemaps including basemaps for data visualization purposes</a>.   I think this is a great move for Esri.   Last fall, <a href="http://ideas.arcgis.com/ideaView?id=087300000008FU7AAM">I suggested such a muted basemap</a> via the <a href="http://ideas.arcgis.com">ideas.arcgis.com</a> portal, so I was quite excited to hear it is in the works.</p>
<p>A post today, also by Bern, explained a new feature within <a href="http://blogs.esri.com/Support/blogs/arcgisonline/archive/2011/08/11/using-basemap-transparency.aspx">ArcGIS.com to allow the user to mute the basemap by adjusting its display transparency</a>.    The <a href="http://www.azavea.com/products/hunchlab/">HunchLab</a> team stumbled upon this idea a few months back and it&#8217;s been a great way to use the existing topographic basemap.    In our demo instance of HunchLab, we are using the ArcGIS.com Topographic tiles set to a transparency of 60%.   You can see what it looks like below.</p>
<p>Kudos to Esri &#8212; keep the basemap options coming!</p>
<p><img class="aligncenter size-full wp-image-1665" title="2011-08-11_1834--basemaptransparency" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2011/08/2011-08-11_1834-basemaptransparency.png" alt="" width="450" height="331" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/08/bring-on-the-data-focused-basemaps-esri/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scala&#8217;s Numeric type class (Pt. 2)</title>
		<link>http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-2/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-2/#comments</comments>
		<pubDate>Fri, 24 Jun 2011 18:53:38 +0000</pubDate>
		<dc:creator>Erik Osheim</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[fractional]]></category>
		<category><![CDATA[generics]]></category>
		<category><![CDATA[integral]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[numeric]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[specialization]]></category>
		<category><![CDATA[type class]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1513</guid>
		<description><![CDATA[In my last blog post I introduced the idea of type classes (and scala.math.Numeric in particular). I demonstrated basic usage but also alluded to some problems. In this blog post I will explain the problems, talk about specialization, and introduce a version of Numeric that solves many of the problems (com.azavea.math.Numeric). I will also present [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-1/">last blog post</a> I introduced the idea of type classes (and <tt>scala.math.Numeric</tt> in particular). I demonstrated basic usage but also alluded to some problems. In this blog post I will explain the problems, talk about specialization, and introduce a version of Numeric that solves many of the problems (<a href="https://github.com/azavea/numeric"><tt>com.azavea.math.Numeric</tt></a>). I will also present some benchmarking numbers.</p>
<h1>Icky Syntax</h1>
<p>The most noticeable problem when using Numeric (at least in 2.8) is the syntax. As Kevin Wright alluded to in a comment on the last post, you can include the Numeric type class in two ways: via a context bound on your type parameter (<tt>A</tt>) or as an implicit argument.</p>
<p>In my earlier examples I used the implicit argument because I thought it was easier to follow (especially for those who are new to implicits) and it bound the instance of the type class to a named variable (<tt>n</tt>). But you can also use the context bound syntax, which makes the function declaration a bit cleaner looking.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #008000; font-style: italic;">// this makes it obvious that n is an instance of Numeric[A]</span>
<span style="color: #008000; font-style: italic;">// and gives us a reference to use.</span>
<span style="color: #0000ff; font-weight: bold;">def</span> square<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">implicit</span> n<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #F78811;">&#123;</span>
  n.<span style="color: #000000;">times</span><span style="color: #F78811;">&#40;</span>a, a<span style="color: #F78811;">&#41;</span>
<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #008000; font-style: italic;">// this definition is a bit cleaner looking.</span>
<span style="color: #008000; font-style: italic;">// implicitly[] searches for an implicit Numeric[A] instance.</span>
<span style="color: #008000; font-style: italic;">// unfortunately the body of the function is less clean.</span>
<span style="color: #0000ff; font-weight: bold;">def</span> square<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #F78811;">&#123;</span>
  implicitly<span style="color: #F78811;">&#91;</span>Numeric<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#93;</span>.<span style="color: #000000;">times</span><span style="color: #F78811;">&#40;</span>a, a<span style="color: #F78811;">&#41;</span>
<span style="color: #F78811;">&#125;</span></pre></div></div>

<p>Either way, you&#8217;re still writing things like <tt>n.times(a, a)</tt> rather than <tt>a * a</tt>. Fortunately, Scala 2.9 introduced some nice new implicit conversions that allow us to use infix operators:</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #0000ff; font-weight: bold;">import</span> Numeric.<span style="color: #000000;">Implicits</span>.<span style="color: #000080;">_</span>
<span style="color: #0000ff; font-weight: bold;">def</span> square<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> a <span style="color: #000080;">*</span> a</pre></div></div>

<h1>Confusing Conversions</h1>
<p>The implicits providing infix operators are definitely a big improvement! But while they are nice for simple cases like the previous one they can end up confusing the issue a little bit:</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #0000ff; font-weight: bold;">def</span> twice<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> a <span style="color: #000080;">*</span> <span style="color: #F78811;">2</span>
<span style="color: #008000; font-style: italic;">//error: could not find implicit value for parameter num:</span>
<span style="color: #008000; font-style: italic;">// scala.math.Numeric[Any]</span></pre></div></div>

<p>What&#8217;s going on here? The problem is that <tt>*</tt> is mapped to <tt>Numeric.times(a:A, b:A)</tt> but the second argument we provided isn&#8217;t an <tt>A</tt>, it&#8217;s an <tt>Int</tt>. In Java (and Scala) we are used to being able to mix numeric types like <tt>double</tt> and <tt>int</tt> without worrying: the result will be promoted to <tt>double</tt> (the type with the widest range). Unfortunately, as written Numeric isn&#8217;t able to do these kinds of things and instead types must be converted explicitly. Numeric does provide two special methods, <tt>zero</tt> and <tt>one</tt> which allow you to get those values for all its types.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #008000; font-style: italic;">// use Numeric.one to get the correct type</span>
<span style="color: #0000ff; font-weight: bold;">def</span> once<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span>A <span style="color: #000080;">=</span> a <span style="color: #000080;">*</span> implicitly<span style="color: #F78811;">&#91;</span>Numeric<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#93;</span>.<span style="color: #000000;">one</span>
&nbsp;
<span style="color: #008000; font-style: italic;">// convert 2 to be the correct type</span>
<span style="color: #0000ff; font-weight: bold;">def</span> twice<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span>A <span style="color: #000080;">=</span> a <span style="color: #000080;">*</span> implicitly<span style="color: #F78811;">&#91;</span>Numeric<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#93;</span>.<span style="color: #000000;">fromInt</span><span style="color: #F78811;">&#40;</span><span style="color: #F78811;">2</span><span style="color: #F78811;">&#41;</span>
&nbsp;
<span style="color: #008000; font-style: italic;">// convert a to be an Int</span>
<span style="color: #0000ff; font-weight: bold;">def</span> thrice<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span>Int <span style="color: #000080;">=</span> a.<span style="color: #000000;">toInt</span> <span style="color: #000080;">*</span> <span style="color: #F78811;">3</span></pre></div></div>

<p>There&#8217;s also no good way to combine two different generic Numeric types, other than converting to a common concrete type (e.g. <tt>Double</tt>). You could imagine Numeric providing something like <tt>fromType[B]</tt> (as well as things like <tt>fromDouble</tt>) but currently it doesn&#8217;t.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #0000ff; font-weight: bold;">def</span> combine<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric, B<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A, b<span style="color: #000080;">:</span>B<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> a + b
<span style="color: #008000; font-style: italic;">//error: could not find implicit value for parameter num:</span>
<span style="color: #008000; font-style: italic;">// scala.math.Numeric[Any]</span>
&nbsp;
<span style="color: #008000; font-style: italic;">// this works</span>
<span style="color: #0000ff; font-weight: bold;">def</span> combine<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric, B<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A, b<span style="color: #000080;">:</span>B<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> a.<span style="color: #000000;">toDouble</span> + b.<span style="color: #000000;">toDouble</span>
&nbsp;
<span style="color: #008000; font-style: italic;">// this would be nice (for some definition of nice)</span>
<span style="color: #008000; font-style: italic;">// but it doesn't work</span>
<span style="color: #0000ff; font-weight: bold;">def</span> combine<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric, B<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>A, b<span style="color: #000080;">:</span>B<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span>A <span style="color: #000080;">=</span> <span style="color: #F78811;">&#123;</span>
  a + implicitly<span style="color: #F78811;">&#91;</span>Numeric<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#93;</span>.<span style="color: #000000;">fromType</span><span style="color: #F78811;">&#91;</span>B<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>b<span style="color: #F78811;">&#41;</span>
<span style="color: #F78811;">&#125;</span></pre></div></div>

<h1>Performance</h1>
<p>The most disappointing thing (in my opinion) about using <tt>scala.math.Numeric</tt> is its performance. In some cases it performs adequately, but in most cases it is very slow: actual algorithms written with Numeric tend to be 6-30 times slower than their direct counterparts (depending on what exactly they do). Some rough observations are that the comparison operators (e.g. <tt>&gt;</tt>) seem to be slower than the math operators (e.g. <tt>*</tt>), and performance does depend on which type you end up using: <tt>Numeric[Double]</tt> does relatively well (compared to the same code written for <tt>Double</tt>). <tt>Numeric[Int]</tt> does very badly compared to code written for <tt>Int</tt>.</p>
<h1>Specialization to the Rescue!</h1>
<p>Fortunately, Scala itself contains the tool to fix Numeric&#8217;s performance problems: specialization. Specialization was first described in the paper <a href="http://lamp.epfl.ch/~dragos/files/scala-spec.pdf">Compiling Generics Through User-Directed Type Specialization</a> by Iulian Dragos and Martin Odersky and was introduced in Scala 2.8 via the <tt>@specialized</tt> annotation. <a href="http://www.scala-notes.org/2011/04/specializing-for-primitive-types/">Other articles</a> have been written explaining specialization, but I will give another brief one here.</p>
<p>When writing generic functions in Scala (as well as Java), the compiler generates one implementation that supports all possible argument types (e.g. <tt>java.lang.Object</tt>). This is fine for Java [1] since all the valid types you could use with your generic function are guaranteed to be subclasses of <tt>java.lang.Object</tt>. If you remember my first post, you&#8217;ll recall how we had to create boxed instances of <tt>java.lang.Integer</tt> to hold <tt>int</tt> values when using generics.</p>
<p>Unfortunately for Scala it means that using generic functions with value types (e.g. <tt>Int</tt>) is much slower than functions implemented directly in terms of the value type itself, since there is a single implementation which requires its arguments to be objects (i.e. boxed).</p>
<p>The <tt>@specialized</tt> annotation allows the Scala compiler to generate separate implementations of the generic function: one for all the reference types (inheriting from <tt>AnyRef</tt>) and one for each of the value types we choose to specialize over (e.g. <tt>Char</tt>, <tt>Double</tt>, <tt>Int</tt>). So if we have a generic function <tt>foo[A](a:A): A</tt> the compiler will generate other functions on primitive types, for instance <tt>fooInt(a:Int): Int</tt>, <tt>fooDouble(a:Double): Double</tt>, etc. And when we write <tt>foo(33)</tt> the compiler will translate that code into <tt>fooInt(33)</tt>.</p>
<p>Since Scala hides boxing/unboxing, specialization doesn&#8217;t make the code look nicer. But it makes a huge difference in performance: Dragos and Odersky report a 22-40x speedup in micro-benchmarks!</p>
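<p>To make that concrete, here is a minimal sketch. The names <tt>SpecializedSketch</tt>, <tt>generic</tt> and <tt>special</tt> are invented for illustration, and the names of the variants the compiler actually emits are an implementation detail, so treat the comments as a description of the idea rather than of the bytecode.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;">object SpecializedSketch {
  // Generic version: a single implementation, so an Int argument
  // is boxed into a java.lang.Integer before the call.
  def generic[A](a: A): A = a
&nbsp;
  // Specialized version: the compiler also emits Int and Double
  // variants that take and return primitives.
  def special[@specialized(Int, Double) A](a: A): A = a
&nbsp;
  def main(args: Array[String]): Unit = {
    println(generic(33))  // goes through the boxed, generic code
    println(special(33))  // can use the Int variant
    println(special("x")) // reference types use the generic code
  }
}</pre></div></div>

<p>Listing only the types you care about in <tt>@specialized(...)</tt> keeps the number of generated variants down, which matters once a function has more than one type parameter.</p>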
<h1>Specializing Numeric</h1>
<p>Soon after specialization was added to Scala 2.8, people began discussing specializing Numeric. Searching Google I find names like Jason Zaugg, Iulian Dragos, and others talking about it. Andreas Flierl emailed the scala-language mailing list on January 3, 2011 with some early tests which got me interested in working on this. So I don&#8217;t want to claim any credit for the amazing work done on implementing <tt>@specialized</tt> or even the idea of specializing Numeric.</p>
<p>Starting with Andreas&#8217; code I began working on building a specialized Numeric, with the goal of making all its operations as fast as direct implementations. There were a bunch of ways that the code needed to be restructured (which I may discuss in a later article) and I learned a ton about implicits, typeclasses, and profiling Scala code. The code is available at: https://github.com/azavea/numeric along with the tests I&#8217;m running. The results follow, but the short story is that with the exception of implicit infix operators all <tt>com.azavea.math.Numeric</tt> code runs just as fast as the direct implementations.</p>
<h1>The Numbers</h1>
<style type="text/css">
    td { text-align: right; padding: 4px; border: 1px solid black; }
    thead { background-color: lightgrey; }
    .na { border: 0px; }
    .base { border: 0px; }
    .sep { border: 0px; }
    .name { background-color: lightgrey; }
    .great { background-color: #99ccff; }
    .good { background-color: #99ff99; }
    .ok { background-color: #ccff99; }
    .poor { background-color: #ffff99; }
    .bad { background-color: #ffcc99; }
    .awful { background-color: #ff9999; }
  </style>
<table>
<thead>
<tr>
<td>test</td>
<td>direct (ms)</td>
<td colspan="2">new (ms)</td>
<td colspan="2">old (ms)</td>
</tr>
</thead>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>from-int-to-int</td>
<td class='base'>22.3</td>
<td class='good'>22.0</td>
<td class='good'> 0.99x</td>
<td class='bad'>169.1</td>
<td class='bad'> 7.60x</td>
</tr>
<tr>
<td class='name'>from-int-to-long</td>
<td class='base'>38.0</td>
<td class='good'>37.3</td>
<td class='good'> 0.98x</td>
<td class='bad'>244.5</td>
<td class='bad'> 6.43x</td>
</tr>
<tr>
<td class='name'>from-int-to-float</td>
<td class='base'>23.1</td>
<td class='good'>23.0</td>
<td class='good'> 0.99x</td>
<td class='bad'>202.5</td>
<td class='bad'> 8.76x</td>
</tr>
<tr>
<td class='name'>from-int-to-double</td>
<td class='base'>40.0</td>
<td class='good'>37.9</td>
<td class='good'> 0.95x</td>
<td class='bad'>271.4</td>
<td class='bad'> 6.78x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>adder-int</td>
<td class='base'>12.6</td>
<td class='good'>13.5</td>
<td class='good'> 1.07x</td>
<td class='awful'>145.3</td>
<td class='awful'>11.50x</td>
</tr>
<tr>
<td class='name'>adder-long</td>
<td class='base'>21.1</td>
<td class='good'>21.4</td>
<td class='good'> 1.01x</td>
<td class='bad'>153.8</td>
<td class='bad'> 7.28x</td>
</tr>
<tr>
<td class='name'>adder-float</td>
<td class='base'>27.5</td>
<td class='good'>27.8</td>
<td class='good'> 1.01x</td>
<td class='good'>27.5</td>
<td class='good'> 1.00x</td>
</tr>
<tr>
<td class='name'>adder-double</td>
<td class='base'>27.3</td>
<td class='good'>26.0</td>
<td class='good'> 0.95x</td>
<td class='good'>27.5</td>
<td class='good'> 1.01x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>array-total-int</td>
<td class='base'>8.3</td>
<td class='good'>8.3</td>
<td class='good'> 1.00x</td>
<td class='awful'>246.4</td>
<td class='awful'>29.86x</td>
</tr>
<tr>
<td class='name'>array-total-long</td>
<td class='base'>11.5</td>
<td class='good'>11.1</td>
<td class='good'> 0.97x</td>
<td class='awful'>310.8</td>
<td class='awful'>27.02x</td>
</tr>
<tr>
<td class='name'>array-total-float</td>
<td class='base'>16.1</td>
<td class='good'>16.6</td>
<td class='good'> 1.03x</td>
<td class='awful'>217.1</td>
<td class='awful'>13.47x</td>
</tr>
<tr>
<td class='name'>array-total-double</td>
<td class='base'>17.3</td>
<td class='good'>16.8</td>
<td class='good'> 0.97x</td>
<td class='awful'>216.8</td>
<td class='awful'>12.57x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>array-rescale-int</td>
<td class='base'>27.1</td>
<td class='good'>26.8</td>
<td class='good'> 0.99x</td>
<td class='awful'>357.8</td>
<td class='awful'>13.19x</td>
</tr>
<tr>
<td class='name'>array-rescale-long</td>
<td class='base'>24.6</td>
<td class='good'>24.6</td>
<td class='good'> 1.00x</td>
<td class='awful'>540.6</td>
<td class='awful'>21.95x</td>
</tr>
<tr>
<td class='name'>array-rescale-float</td>
<td class='base'>59.3</td>
<td class='good'>59.6</td>
<td class='good'> 1.01x</td>
<td class='bad'>366.9</td>
<td class='bad'> 6.19x</td>
</tr>
<tr>
<td class='name'>array-rescale-double</td>
<td class='base'>91.4</td>
<td class='good'>91.1</td>
<td class='good'> 1.00x</td>
<td class='poor'>394.3</td>
<td class='poor'> 4.31x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>find-max-int</td>
<td class='base'>16.6</td>
<td class='good'>16.9</td>
<td class='good'> 1.02x</td>
<td class='awful'>166.0</td>
<td class='awful'> 9.98x</td>
</tr>
<tr>
<td class='name'>find-max-long</td>
<td class='base'>12.1</td>
<td class='good'>12.1</td>
<td class='good'> 1.00x</td>
<td class='awful'>199.6</td>
<td class='awful'>16.46x</td>
</tr>
<tr>
<td class='name'>find-max-float</td>
<td class='base'>22.6</td>
<td class='good'>22.4</td>
<td class='good'> 0.99x</td>
<td class='bad'>187.8</td>
<td class='bad'> 8.30x</td>
</tr>
<tr>
<td class='name'>find-max-double</td>
<td class='base'>23.9</td>
<td class='good'>23.5</td>
<td class='good'> 0.98x</td>
<td class='bad'>195.6</td>
<td class='bad'> 8.19x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>quicksort-int</td>
<td class='base'>155.9</td>
<td class='bad'>820.1</td>
<td class='bad'> 5.26x</td>
<td class='bad'>929.0</td>
<td class='bad'> 5.96x</td>
</tr>
<tr>
<td class='name'>quicksort-long</td>
<td class='base'>1105.0</td>
<td class='ok'>1326.4</td>
<td class='ok'> 1.20x</td>
<td class='ok'>1238.6</td>
<td class='ok'> 1.12x</td>
</tr>
<tr>
<td class='name'>quicksort-float</td>
<td class='base'>228.6</td>
<td class='bad'>1493.0</td>
<td class='bad'> 6.53x</td>
<td class='bad'>1364.5</td>
<td class='bad'> 5.97x</td>
</tr>
<tr>
<td class='name'>quicksort-double</td>
<td class='base'>251.0</td>
<td class='bad'>1470.3</td>
<td class='bad'> 5.86x</td>
<td class='bad'>1306.3</td>
<td class='bad'> 5.20x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>array-allocator-int</td>
<td class='base'>68.4</td>
<td class='good'>73.3</td>
<td class='good'> 1.07x</td>
<td class='great'>42.6</td>
<td class='great'> 0.62x</td>
</tr>
<tr>
<td class='name'>array-allocator-long</td>
<td class='base'>101.9</td>
<td class='good'>102.4</td>
<td class='good'> 1.00x</td>
<td class='ok'>143.3</td>
<td class='ok'> 1.41x</td>
</tr>
<tr>
<td class='name'>array-allocator-float</td>
<td class='base'>87.4</td>
<td class='good'>82.5</td>
<td class='good'> 0.94x</td>
<td class='ok'>146.8</td>
<td class='ok'> 1.68x</td>
</tr>
<tr>
<td class='name'>array-allocator-double</td>
<td class='base'>82.9</td>
<td class='good'>83.0</td>
<td class='good'> 1.00x</td>
<td class='ok'>162.4</td>
<td class='ok'> 1.96x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>insertion-sort-int</td>
<td class='base'>47.5</td>
<td class='good'>48.6</td>
<td class='good'> 1.02x</td>
<td class='awful'>1463.0</td>
<td class='awful'>30.80x</td>
</tr>
<tr>
<td class='name'>insertion-sort-long</td>
<td class='base'>49.9</td>
<td class='good'>50.3</td>
<td class='good'> 1.01x</td>
<td class='awful'>1440.6</td>
<td class='awful'>28.88x</td>
</tr>
<tr>
<td class='name'>insertion-sort-float</td>
<td class='base'>50.0</td>
<td class='good'>50.9</td>
<td class='good'> 1.02x</td>
<td class='awful'>1658.6</td>
<td class='awful'>33.17x</td>
</tr>
<tr>
<td class='name'>insertion-sort-double</td>
<td class='base'>49.3</td>
<td class='good'>49.9</td>
<td class='good'> 1.01x</td>
<td class='awful'>1850.5</td>
<td class='awful'>37.57x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>merge-sort-int</td>
<td class='base'>275.6</td>
<td class='good'>293.0</td>
<td class='good'> 1.06x</td>
<td class='poor'>870.9</td>
<td class='poor'> 3.16x</td>
</tr>
<tr>
<td class='name'>merge-sort-long</td>
<td class='base'>288.3</td>
<td class='great'>215.0</td>
<td class='great'> 0.75x</td>
<td class='poor'>842.9</td>
<td class='poor'> 2.92x</td>
</tr>
<tr>
<td class='name'>merge-sort-float</td>
<td class='base'>186.8</td>
<td class='ok'>206.5</td>
<td class='ok'> 1.11x</td>
<td class='bad'>901.3</td>
<td class='bad'> 4.83x</td>
</tr>
<tr>
<td class='name'>merge-sort-double</td>
<td class='base'>197.6</td>
<td class='ok'>218.8</td>
<td class='ok'> 1.11x</td>
<td class='bad'>895.5</td>
<td class='bad'> 4.53x</td>
</tr>
<tr>
<td class='sep' colspan='6'></td></tr>
<tr>
<td class='name'>infix-adder-int</td>
<td class='base'>20.5</td>
<td class='poor'>57.1</td>
<td class='poor'> 2.79x</td>
<td class='bad'>125.9</td>
<td class='bad'> 6.14x</td>
</tr>
<tr>
<td class='name'>infix-adder-long</td>
<td class='base'>20.8</td>
<td class='poor'>62.3</td>
<td class='poor'> 3.00x</td>
<td class='awful'>250.6</td>
<td class='awful'>12.08x</td>
</tr>
<tr>
<td class='name'>infix-adder-float</td>
<td class='base'>20.9</td>
<td class='ok'>33.9</td>
<td class='ok'> 1.62x</td>
<td class='bad'>176.4</td>
<td class='bad'> 8.45x</td>
</tr>
<tr>
<td class='name'>infix-adder-double</td>
<td class='base'>28.8</td>
<td class='ok'>32.9</td>
<td class='ok'> 1.14x</td>
<td class='bad'>189.0</td>
<td class='bad'> 6.57x</td>
</tr>
</table>
<p><br/></p>
<h1>Discussion</h1>
<p>Profiling this was a little bit annoying: I had to write 4 direct implementations (one for each of the types I was interested in) plus two generic implementations (one using the old <tt>scala.math.Numeric</tt> and one using the new <tt>com.azavea.math.Numeric</tt>). I used <tt>scala.testing.Benchmark</tt>, although Yuvi Masory recently suggested using <a href="https://github.com/sirthias/scala-benchmarking-template" rel="nofollow">Caliper</a> (which I still need to go check out).</p>
<p>Some things to note:</p>
<ol>
<li>Old Numeric does much better on floating-point numbers than on integral ones. Why?</li>
<li>For some reason old Numeric is able to beat the direct implementation of the <i>array-allocator</i> test for <tt>Int</tt> (it is slower for the other three types).</li>
<li>Since <tt>scala.util.Sorting[T]</tt> uses <tt>scala.math.Ordering[T]</tt> (which isn&#8217;t specialized) it&#8217;s not possible to make Numeric any faster in the <i>quicksort</i> tests.</li>
<li><tt>scala.util.Sorting[Long]</tt> is over 4x slower than <tt>Int</tt>, <tt>Float</tt> and <tt>Double</tt> in the direct implementation.</li>
<li>Infix operators are still slow enough that I have mixed feelings about endorsing them.</li>
</ol>
<p>One additional gotcha which I haven&#8217;t discussed at length is what writing your own specialized numeric code (currently) looks like. Unfortunately, there&#8217;s a lot of boiler-plate. While users of your library will just say <tt>doSomethingComplicated(Array(0, 1, 2), 18, 99)</tt> you will be stuck writing:</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #0000ff; font-weight: bold;">def</span> doSomethingComplicated<span style="color: #F78811;">&#91;</span><span style="color: #000080;">@</span>specialized A<span style="color: #000080;">:</span>Numeric<span style="color: #000080;">:</span>Manifest<span style="color: #F78811;">&#93;</span>
<span style="color: #F78811;">&#40;</span>ns<span style="color: #000080;">:</span>Array<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span>, x<span style="color: #000080;">:</span>A, y<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #F78811;">&#123;</span>
  <span style="color: #008000; font-style: italic;">// implementation here</span>
<span style="color: #F78811;">&#125;</span></pre></div></div>

<p>And that&#8217;s only one type parameter; God forbid you want A and B to both be Numeric!</p>
<p>I don&#8217;t know of any easy way around this. You have to say <tt>@specialized</tt> if you want your code to be specialized. If you fail to do this anywhere in the call chain you will use generic implementations from that point down, losing any benefit of specialization. You need to specify <tt>Numeric</tt> because that&#8217;s the point, right? And when writing code that needs to use the type parameter <tt>A</tt> to allocate arrays (among other things) you have to provide a <tt>Manifest</tt>. This gets ugly really fast.</p>
<p>Given my ignorance of the Scala compiler, I can imagine various magical things that might fix this. For instance, I could imagine some kind of synthetic type bound that combines others, e.g. <tt>FastNumeric = @specialized Numeric:Manifest</tt> or something. Is this possible (or even a good idea)? I don&#8217;t know.</p>
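<p>For reference, here is a small sketch of how the three pieces discussed above fit together in one signature. It is written against the built-in <tt>scala.math.Numeric</tt> on Scala 2, and <tt>RescaleSketch</tt> and <tt>scaleAll</tt> are hypothetical names. With the built-in Numeric the operators themselves are not specialized, so this shows the shape of the boilerplate, not the speedup.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;">import scala.math.Numeric.Implicits._
&nbsp;
object RescaleSketch {
  // @specialized: ask for primitive variants of this method.
  // Numeric: gives us * on values of type A.
  // Manifest: lets us allocate an Array[A].
  def scaleAll[@specialized(Int, Double) A: Numeric: Manifest](ns: Array[A], x: A): Array[A] = {
    val out = new Array[A](ns.length)
    var i = 0
    while (i &lt; ns.length) {
      out(i) = ns(i) * x
      i += 1
    }
    out
  }
&nbsp;
  def main(args: Array[String]): Unit = {
    // Callers never see the annotations or context bounds.
    println(scaleAll(Array(0, 1, 2), 18).mkString(", "))
    println(scaleAll(Array(0.5, 1.5), 2.0).mkString(", "))
  }
}</pre></div></div>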
<h1>Changes</h1>
<p>This section is a bit rushed&#8211;I may expand it into another post later, but I just wanted to get the information out there.</p>
<p>One of the biggest changes I made to Numeric was tearing out <tt>scala.math.Ordering</tt>&#8211;I didn&#8217;t want to try to specialize &#8220;everything at once&#8221; but <tt>scala.math.Numeric</tt> inherits all of its (incredibly slow) comparison methods from Ordering. Specializing Ordering would solve this problem.</p>
<p>I also created a specialized typeclass with the humorous name &#8220;Convertable&#8221; which applies to all the value types which represent numbers (<tt>Byte</tt>, <tt>Short</tt>, <tt>Int</tt>, <tt>Long</tt>, <tt>Float</tt> and <tt>Double</tt>). The great thing about this is that you can avoid doing boxing/unboxing during conversion between types. It also allows you to implement the <tt>fromType[T]</tt> method I discussed earlier.</p>
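<p>As a rough illustration of the idea (not the actual definitions in <tt>com.azavea.math</tt>), a conversion typeclass might be split into a &#8220;from&#8221; half and a &#8220;to&#8221; half. Every name below (<tt>ConvertSketch</tt>, <tt>From</tt>, <tt>To</tt>, <tt>convert</tt>) is invented for this sketch, and only <tt>Int</tt> and <tt>Double</tt> are covered to keep it short.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;">object ConvertSketch {
  // How to read a B as one of the primitive number types.
  trait From[@specialized(Int, Double) B] {
    def toInt(b: B): Int
    def toDouble(b: B): Double
  }
&nbsp;
  // How to build an A, either from a primitive or from any B with a From[B].
  trait To[@specialized(Int, Double) A] {
    def fromInt(n: Int): A
    def fromDouble(d: Double): A
    def fromType[B](b: B)(implicit f: From[B]): A
  }
&nbsp;
  implicit object IntFrom extends From[Int] {
    def toInt(b: Int): Int = b
    def toDouble(b: Int): Double = b.toDouble
  }
  implicit object DoubleFrom extends From[Double] {
    def toInt(b: Double): Int = b.toInt
    def toDouble(b: Double): Double = b
  }
&nbsp;
  implicit object IntTo extends To[Int] {
    def fromInt(n: Int): Int = n
    def fromDouble(d: Double): Int = d.toInt
    // The target type knows which conversion to ask the source for.
    def fromType[B](b: B)(implicit f: From[B]): Int = f.toInt(b)
  }
  implicit object DoubleTo extends To[Double] {
    def fromInt(n: Int): Double = n.toDouble
    def fromDouble(d: Double): Double = d
    def fromType[B](b: B)(implicit f: From[B]): Double = f.toDouble(b)
  }
&nbsp;
  // Convert a B into an A; the caller chooses A.
  def convert[A, B](b: B)(implicit to: To[A], from: From[B]): A = to.fromType(b)
&nbsp;
  def main(args: Array[String]): Unit = {
    val d: Double = convert[Double, Int](3)
    val i: Int = convert[Int, Double](2.9) // truncates, like toInt
    println(d)
    println(i)
  }
}</pre></div></div>

<p>In this sketch <tt>fromType</tt> itself is left unspecialized for simplicity; avoiding boxing on that path would require specializing its type parameter as well.</p>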
<p>I left out <tt>Short</tt> and <tt>Byte</tt> after Paul Phillips pointed out that neither class&#8217;s operations behave as the direct implementations do. In Scala (as in Java) <tt>Byte + Byte -> Int</tt>, whereas in Numeric <tt>A + A -> A</tt> (so <tt>Byte + Byte -> Byte</tt>). This causes all kinds of bugs. Since <tt>Short</tt> and <tt>Byte</tt> (and <tt>Char</tt>) are converted to <tt>Int</tt> when you actually do things with them, I decided it was unlikely users wanted to be able to use them in generic numeric functions.</p>
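<p>A quick way to see the mismatch, using only the built-in <tt>scala.math.Numeric</tt> (<tt>ByteSketch</tt> is just a name for this example):</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;">object ByteSketch {
  def main(args: Array[String]): Unit = {
    val a: Byte = 100
    val b: Byte = 100
&nbsp;
    // Direct arithmetic widens both operands to Int.
    val direct: Int = a + b
    println(direct) // 200
&nbsp;
    // Numeric[Byte].plus has to return a Byte, so the sum wraps around.
    val generic: Byte = implicitly[Numeric[Byte]].plus(a, b)
    println(generic) // -56
  }
}</pre></div></div>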
<p>A (potentially) controversial change I made was not including <tt>scala.math.Integral</tt> and <tt>scala.math.Fractional</tt>, and unifying support for the division and mod operators. Coming from dynamically-typed languages, I expected to be able to do the following:</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #008000; font-style: italic;">// this does not work with the built-in Numeric</span>
<span style="color: #0000ff; font-weight: bold;">import</span> Numeric.<span style="color: #000000;">Implicits</span>.<span style="color: #000080;">_</span>
<span style="color: #0000ff; font-weight: bold;">def</span> scale<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>n<span style="color: #000080;">:</span>A, num<span style="color: #000080;">:</span>A, denom<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #F78811;">&#40;</span>n <span style="color: #000080;">*</span> num<span style="color: #F78811;">&#41;</span> / denom
&nbsp;
<span style="color: #008000; font-style: italic;">// this *does* work</span>
<span style="color: #0000ff; font-weight: bold;">import</span> scala.<span style="color: #000000;">math</span>.<span style="color: #000000;">Integral</span>.<span style="color: #000000;">Implicits</span>.<span style="color: #000080;">_</span>
<span style="color: #0000ff; font-weight: bold;">def</span> scale<span style="color: #F78811;">&#91;</span>A<span style="color: #000080;">:</span>Integral<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>n<span style="color: #000080;">:</span>A, num<span style="color: #000080;">:</span>A, denom<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #F78811;">&#40;</span>n <span style="color: #000080;">*</span> num<span style="color: #F78811;">&#41;</span> / denom</pre></div></div>

<p>I understand that for someone who&#8217;s excited about creating user-defined numeric types this split isn&#8217;t a big deal. But from my perspective I want a generic numeric type that abstracts across all the useful value-types in Scala. I&#8217;m not opposed to having Integral and Fractional, but I would rather that Numeric not be defined in terms of them.</p>
<p>Finally I did some restructuring to avoid using inner traits and classes, since I had heard that those don&#8217;t specialize well.</p>
<h1>Conclusion</h1>
<p>While I still think there&#8217;s some room to make Numeric better, at this point it&#8217;s possible to write fast generic Numeric code in Scala. Stepping back, I think it&#8217;s incredibly exciting that I can avoid code duplication <i>and</i> maintain the speed that I&#8217;m used to from direct implementations.</p>
<p>I&#8217;m not sure what the best path is to add this capability to <tt>scala.math.Numeric</tt>. I think the trait lacks some important features (more type conversions, division, etc.) and some of the restructuring is currently necessary to get these numbers. Since binary compatibility is important moving forward for Scala and the Typesafe team, it&#8217;s not entirely clear what the best plan is. Hopefully some discussion and work at <a href="http://scalathon.org/">Scalathon</a> can clarify things a bit.</p>
<p>Please respond with any questions or comments. Do let me know if you find problems in the profiling, or try it yourself and get dramatically different numbers. I spent some time trying not to fall into obvious micro-benchmark pitfalls, but there&#8217;s still a lot I don&#8217;t know about the JVM. Thanks for reading!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-2/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Using the CQL_FILTER parameter with OpenLayers WMS layers</title>
		<link>http://www.azavea.com/blogs/labs/2011/06/using-the-cql_filter-parameter-with-openlayers-wms-layers/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/06/using-the-cql_filter-parameter-with-openlayers-wms-layers/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 20:33:20 +0000</pubDate>
		<dc:creator>Carissa Brittain</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[cql]]></category>
		<category><![CDATA[GeoServer]]></category>
		<category><![CDATA[openlayers]]></category>
		<category><![CDATA[wms]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1528</guid>
		<description><![CDATA[I&#8217;ve used Openlayer&#8217;s Marker layer in several projects and have always just accepted that I can&#8217;t display more than around 500 markers at a time for a given query. Recently, I found another way. We&#8217;re using GeoServer as a WMS tile server for the tree and municipal boundary layers in PhillyTreeMap.org. GeoServer&#8217;s WMS implementation allows [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve used Openlayer&#8217;s <a title="The docs" href="http://dev.openlayers.org/apidocs/files/OpenLayers/Layer/Markers-js.html" target="_blank">Marker layer</a> in several projects and have always just accepted that I can&#8217;t display more than around <a title="... or 50 for IE" href="http://trac.osgeo.org/openlayers/wiki/FrequentlyAskedQuestions#Markers" target="_blank">500 markers at a time</a> for a given query. Recently, I found another way. We&#8217;re using <a href="http://docs.geoserver.org/latest/en/user/index.html" target="_blank">GeoServer</a> as a WMS tile server for the tree and municipal boundary layers in <a title="Is your tree in our database?" href="http://www.phillytreemap.org" target="_blank">PhillyTreeMap.org</a>. GeoServer&#8217;s WMS implementation allows an additional parameter in the url called <a title="With an example even!" href="http://docs.geoserver.org/latest/en/user/googleearth/features/filters.html?highlight=cql_filter#cql-filter" target="_blank">CQL_FILTER</a>. This parameter allows you to use a little language called Common Query Language, or CQL, to <a href="http://docs.geoserver.org/latest/en/user/tutorials/cql/cql_tutorial.html" target="_blank">apply data filters to the tiles</a> that GeoServer generates.  CQL is a plain text, human readable query language created by the OGC, but I like to think of it as an extremely limited third-cousin-by-adoption of SQL. I haven&#8217;t found too much in the way of documentation on this <a title="though maybe not this obscure..." href="http://www.farlang.com/gemstones/burnham-precious-stones/page_001" target="_blank">obscure little gem</a>, so here&#8217;s a rundown of how we used it to display search results in PhillyTreeMap.org.</p>
<p>If you look through the <a title="A tutorial with holes" href="http://docs.geoserver.org/latest/en/user/tutorials/cql/cql_tutorial.html" target="_blank">CQL and ECQL</a> page in GeoServer&#8217;s documentation, there are several examples but they don&#8217;t cover everything you can do with CQL. Basically, a CQL filter is a list of phrases similar to where clauses in SQL, separated by the combiner words AND or OR. You can use the following operators in a CQL phrase:</p>
<ul>
<li>Comparison operations: =, &lt;, &gt;, and combinations</li>
<li>Math expressions: +, -, *, /</li>
<li>NOT</li>
<li>IS, EXISTS</li>
<li>BETWEEN, BEFORE, AFTER</li>
<li>IN</li>
<li>LIKE</li>
<li>Geometric operators: CONTAINS, BBOX, DWITHIN, CROSS(ES), INTERSECT(S)</li>
</ul>
<p>Some of these operations have examples in the GeoServer documentation, and others can be inferred from the <a title="Fewer holes, but more obfuscated" href="http://docs.geotools.org/latest/userguide/library/cql/cql.html" target="_blank">GeoTools documentation</a> (the stuff in their CQL.toFilter() calls). CQL can also call any of GeoServer&#8217;s <a href="http://docs.geoserver.org/latest/en/user/filter/function_reference.html#filter-function-reference" target="_blank">filter functions</a>. You can add parentheses as needed to affect the order the filters are evaluated in, just like in SQL.</p>
<p>CQL has a <a title="Just like strapping a rocket to..." href="http://www.alexross.com/CJ063.html" target="_blank">lot of power</a> for such a short spec, but it has one very large deficiency that requires some database designing to avoid: the utter lack of join support. This makes sense when you consider that GeoServer doesn&#8217;t know about joins either. Ultimately, you&#8217;re using CQL against the GeoServer layer, not the underlying database structure. Building views or adding reference columns to the table GeoServer is accessing can help get around this.</p>
<p>In PhillyTreeMap.org, we use 4 types of CQL filters: id lookups using =, null checks using IS, date and integer ranges using BETWEEN and text searches using LIKE. Here are examples of those uses along with some array joining to get a valid CQL filter string at the end:</p>
<p><code>filter_list = []<br />
filter_list.append("species_id = 212")<br />
filter_list.append("height IS NULL")<br />
filter_list.append("dbh BETWEEN 10 and 20")<br />
filter_list.append("neighborhood_id_list LIKE '%42%' ")<br />
cql = ' AND '.join(filter_list)<br />
# should look like this: "species_id = 212 AND height IS NULL AND dbh BETWEEN 10 and 20 AND neighborhood_id_list LIKE '%42%' "<br />
</code></p>
<p>The above CQL filter would locate trees in a specific species, have no height value, only have dbh values between 10 and 20 inches and are in a particular neighborhood. The neighborhood_id_list filter would have been a join if this were written in standard SQL since neighborhoods and trees have a many-to-many relationship in our database. Since we can&#8217;t do joins, any time a tree is added or the location is updated, all of it&#8217;s related geographies&#8217; ids are added to a reference column on the tree, and used specifically for this type of query.</p>
<p>CQL is passed to GeoServer in the same way as any other WMS variable. We&#8217;re using OpenLayers, so most of the WMS configuration variables are already set when we create the layer. The WMS layer has a little method called <a href="http://dev.openlayers.org/docs/files/OpenLayers/Layer/WMS-js.html#OpenLayers.Layer.WMS.mergeNewParams" target="_blank">mergeNewParams</a> that lets us change those parameters after the layer has been initialized. It also automatically redraws the layer, so the changes take effect immediately. To add CQL to the WMS call, just add the CQL_FILTER variable to the layer&#8217;s parameters and the layer should update.</p>
<p><code>wms_layer.mergeNewParams({'CQL_FILTER': "species_id = 212 AND height IS NULL AND dbh BETWEEN 10 and 20 AND neighborhood_id_list LIKE '%42%' "});<br />
</code></p>
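<p>Outside of OpenLayers, the filter is just one more key in the WMS query string. As a rough sketch, a GetMap URL could be assembled like this; the endpoint, layer name, bounding box and image size are all made-up placeholders, and only the <code>CQL_FILTER</code> key comes from the discussion above.</p>

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name, used only for illustration.
GEOSERVER_WMS = "http://example.com/geoserver/wms"


def wms_url(cql):
    """Build a GetMap URL that carries the CQL string as an ordinary WMS parameter."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": "treemap:trees",             # placeholder layer
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": "-75.28,39.87,-74.96,40.14",   # placeholder extent
        "WIDTH": "256",
        "HEIGHT": "256",
        "FORMAT": "image/png",
        "CQL_FILTER": cql,
    }
    # urlencode percent-escapes the spaces, quotes and % signs in the filter
    return GEOSERVER_WMS + "?" + urlencode(params)


url = wms_url("species_id = 212 AND height IS NULL")
print(url)
```

<p>OpenLayers does the equivalent encoding for you when it requests tiles, which is why the filter can be handed to mergeNewParams as a plain string.</p>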
<p>You can remove any filters by <a href="http://www.javascriptkit.com/jsref/other_operators.shtml" target="_blank">deleting the parameter</a> from the layer&#8217;s params, as you would from any normal JavaScript object. You&#8217;ll need to redraw the layer yourself before the change will be visible.</p>
<p><code>delete wms_layer.params.CQL_FILTER;<br />
wms_layer.redraw();<br />
</code></p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/06/using-the-cql_filter-parameter-with-openlayers-wms-layers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Scala&#8217;s Numeric type class (Pt. 1)</title>
		<link>http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-1/</link>
		<comments>http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-1/#comments</comments>
		<pubDate>Thu, 16 Jun 2011 16:39:37 +0000</pubDate>
		<dc:creator>Erik Osheim</dc:creator>
				<category><![CDATA[Posts]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[numeric]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[scalathon]]></category>
		<category><![CDATA[type class]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=1412</guid>
		<description><![CDATA[So I&#8217;ve been spending my R&#038;D time recently working on improving Scala&#8217;s Numeric type class. We&#8217;re about to release an improved version of Numeric, and there&#8217;s also an incubator project aimed at getting it integrated into the Scala project. To explain the project (and in preparation for Scalathon in July) I have embarked on writing [...]]]></description>
			<content:encoded><![CDATA[<p>So I&#8217;ve been spending my R&#038;D time recently working on improving Scala&#8217;s Numeric type class. We&#8217;re about to release an improved version of Numeric, and there&#8217;s also an incubator project aimed at getting it integrated into the Scala project. To explain the project (and in preparation for <a href="http://scalathon.org">Scalathon</a> in July) I have embarked on writing some blog posts to explain what Numeric is, what it does, and how it can be made better.</p>
<p>These posts will assume you know a bit of Scala, but I will try to keep the discussion as general as I can.</p>
<h1>The Goal</h1>
<p>The goal seems simple: to implement an algorithm but without choosing in advance which numeric type (e.g. <tt>int</tt>) to use, and without paying a performance penalty over a &#8220;direct&#8221; implementation.</p>
<p>Most dynamically-typed languages get this &#8220;for free&#8221; (in the sense that they pay a performance penalty regardless) and some statically-typed languages can accomplish this with macros, but I will focus on another approach: type classes. I will explain how Scala is able to transcend its Java roots and accomplish this task using the Numeric type class.</p>
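<p>As a minimal sketch of what &#8220;for free&#8221; means here, consider a dynamically-typed language such as Python (chosen only for illustration; the function name <tt>square</tt> is hypothetical). The function below never names a numeric type, because the multiplication is resolved at run time from whatever value is passed in:</p>

```python
from fractions import Fraction


def square(x):
    # No numeric type is chosen in advance: `*` is looked up on the value at run time.
    return x * x


print(square(3))               # called with an int
print(square(1.5))             # called with a float
print(square(Fraction(2, 3)))  # called with an exact rational
```

<p>The flexibility comes from run-time dispatch on every call, which is the performance cost mentioned above.</p>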
<h1>Java background</h1>
<p>If you work with Java, you probably never write functions which can operate on both integral and floating point values.</p>
<p>Or at least, you probably never support both with one function&#8211;in that case you&#8217;d have to use <tt>java.lang.Number</tt>, which is extended by Java&#8217;s boxed primitive types (<tt>java.lang.Integer</tt>, <tt>java.lang.Double</tt>) and which only gives you accessors for the various primitive types. It&#8217;s a mess:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> Foo <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #003399;">Number</span> add<span style="color: #009900;">&#40;</span><span style="color: #003399;">Number</span> t, <span style="color: #003399;">Number</span> u<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Double</span><span style="color: #009900;">&#40;</span>t.<span style="color: #006633;">doubleValue</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">+</span> u.<span style="color: #006633;">doubleValue</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #003399;">Number</span> a <span style="color: #339933;">=</span> Foo.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Integer</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span>, <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Integer</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>a.<span style="color: #006633;">intValue</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>This code won&#8217;t work with some <tt>long</tt> values, and it basically forces the implementation to choose a primitive type that it hopes will work with the values it will be given (or to use <tt>java.math.BigDecimal</tt> for everything and incur even more slowness).</p>
<p>You might just support <tt>double</tt> and call it a day. Or you might write one version to deal with <tt>int</tt> and then copy-paste the same code but change all the <tt>int</tt> types to <tt>double</tt>, etc. No matter what, the whole exercise feels icky.</p>
<h1>Scala background</h1>
<p><a href="http://www.scala-lang.org/">Scala</a> tries to cover up some of Java&#8217;s warts by combining the notion of primitive types (<tt>int</tt>) and boxed types (<tt>java.lang.Integer</tt>) into a single unified type (<tt>Int</tt>). Scala&#8217;s <tt>Int</tt> type uses <tt>int</tt> when possible, but will transform values into <tt>java.lang.Integer</tt> instances behind the scenes if necessary. This makes boxing less annoying to do, but doesn&#8217;t make it any faster.</p>
<p>It also doesn&#8217;t directly help us accomplish our goal. <tt>Int</tt> and <tt>Double</tt> don&#8217;t share an interface which exposes things like <tt>+</tt> so we can&#8217;t usefully abstract across their types. In fact, they don&#8217;t share any superclass short of <tt>AnyVal</tt> so we can&#8217;t even implement something like our Java solution. (This isn&#8217;t totally correct&#8211;we can use Java&#8217;s types from Scala&#8211;but you get the idea.) So what gives? Is Scala even worse than Java when it comes to generic numbers?</p>
<p>No. Scala has more tools at its disposal than Java, and we can get around this problem using one of them: type classes.</p>
<h1>Type classes in Scala</h1>
<p>In Java, if you want a class to satisfy an interface, the class must be written with knowledge of the interface. When you create new interfaces, you have to go back and modify any classes which should implement them. If you&#8217;re using someone else&#8217;s classes and they don&#8217;t know about your interface, too bad!</p>
<p>This makes using other people&#8217;s code harder, and also locks you into someone else&#8217;s (possibly poorly-thought-out) type hierarchies. While I am not claiming that the type hierarchy for Scala&#8217;s numeric types is poorly-thought-out, it is certainly inconvenient for what we want to accomplish.</p>
<p><a href="http://en.wikipedia.org/wiki/Type_class">Type classes</a> (introduced in the <a href="http://en.wikipedia.org/wiki/Haskell_(programming_language)">Haskell</a> programming language) offer an alternative approach. Type classes are not interfaces which classes must implement. Instead a type class requires an instance to &#8220;glue&#8221; each member to the type class. When I say &#8220;glue&#8221; the member classes to the type class, I mean that each instance will provide the methods that the type class requires in terms of a particular member class. For example, if I have a <tt>Flammable</tt> type class and a <tt>House</tt> class, I can create a <tt>HouseIsFlammable</tt> instance that specifies how the <tt>House</tt> class satisfies <tt>Flammable</tt>.</p>
<p>We can implement type classes in Scala using <a href="http://www.scala-lang.org/node/114">implicit parameters</a>. The basic idea is that we create a trait (<tt>Flammable[T]</tt>), along with implicit objects which &#8220;glue&#8221; the supported classes to our new type class (e.g. <tt>WoodIsFlammable</tt>). Each object corresponds to a single supported class (e.g. <tt>Wood</tt>). It implements the trait in terms of that class (e.g. <tt>Flammable[Wood]</tt>), and implements the trait&#8217;s methods (e.g. <tt>burn</tt>) in terms of the supported class.</p>
<p>Then when we want to use the type class, we make sure the instances are in scope and Scala&#8217;s compiler will automatically pick up those instances for us.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #0000ff; font-weight: bold;">case</span> <span style="color: #0000ff; font-weight: bold;">class</span> Alcohol<span style="color: #F78811;">&#40;</span>liters<span style="color: #000080;">:</span>Double<span style="color: #F78811;">&#41;</span>
<span style="color: #0000ff; font-weight: bold;">case</span> <span style="color: #0000ff; font-weight: bold;">class</span> Water<span style="color: #F78811;">&#40;</span>liters<span style="color: #000080;">:</span>Double<span style="color: #F78811;">&#41;</span>
&nbsp;
<span style="color: #0000ff; font-weight: bold;">case</span> <span style="color: #0000ff; font-weight: bold;">class</span> Fire<span style="color: #F78811;">&#40;</span>heat<span style="color: #000080;">:</span>Double<span style="color: #F78811;">&#41;</span>
<span style="color: #0000ff; font-weight: bold;">trait</span> Flammable<span style="color: #F78811;">&#91;</span>A<span style="color: #F78811;">&#93;</span> <span style="color: #F78811;">&#123;</span>
  <span style="color: #0000ff; font-weight: bold;">def</span> burn<span style="color: #F78811;">&#40;</span>fuel<span style="color: #000080;">:</span>A<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span> Fire
<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #0000ff; font-weight: bold;">implicit</span> <span style="color: #0000ff; font-weight: bold;">object</span> AlcoholIsFlammable <span style="color: #0000ff; font-weight: bold;">extends</span> Flammable<span style="color: #F78811;">&#91;</span>Alcohol<span style="color: #F78811;">&#93;</span> <span style="color: #F78811;">&#123;</span>
  <span style="color: #0000ff; font-weight: bold;">def</span> burn<span style="color: #F78811;">&#40;</span>fuel<span style="color: #000080;">:</span>Alcohol<span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> Fire<span style="color: #F78811;">&#40;</span><span style="color: #F78811;">120.0</span><span style="color: #F78811;">&#41;</span>
<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #0000ff; font-weight: bold;">def</span> setFire<span style="color: #F78811;">&#91;</span>T<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>fuel<span style="color: #000080;">:</span>T<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">implicit</span> f<span style="color: #000080;">:</span>Flammable<span style="color: #F78811;">&#91;</span>T<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> f.<span style="color: #000000;">burn</span><span style="color: #F78811;">&#40;</span>fuel<span style="color: #F78811;">&#41;</span>
&nbsp;
setFire<span style="color: #F78811;">&#40;</span>Alcohol<span style="color: #F78811;">&#40;</span><span style="color: #F78811;">1.0</span><span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#41;</span> <span style="color: #008000; font-style: italic;">// ok</span>
setFire<span style="color: #F78811;">&#40;</span>Water<span style="color: #F78811;">&#40;</span><span style="color: #F78811;">1.0</span><span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#41;</span> <span style="color: #008000; font-style: italic;">// FAIL</span></pre></div></div>

<p>This is great! Notice that our <tt>Alcohol</tt> class doesn&#8217;t have to know that it is flammable; we just need to make sure that <tt>AlcoholIsFlammable</tt> is in scope whenever we want to pass <tt>Alcohol</tt> to <tt>setFire()</tt>.</p>
<p>This is the exact approach that Scala takes: it defines a <tt>Numeric</tt> type class along with implicit objects that glue various types (like <tt>Int</tt>) to the type class.</p>
<p>For instance in Scala 2.8 (and later) this function will square the given number:</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #0000ff; font-weight: bold;">def</span> square<span style="color: #F78811;">&#91;</span>T<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>a<span style="color: #000080;">:</span>T<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">implicit</span> n<span style="color: #000080;">:</span>Numeric<span style="color: #F78811;">&#91;</span>T<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=</span> n.<span style="color: #000000;">times</span><span style="color: #F78811;">&#40;</span>a, a<span style="color: #F78811;">&#41;</span></pre></div></div>

<p>Notice that we call the methods we&#8217;re interested in (e.g. <tt>times()</tt>) via the instance of our type class (<tt>n</tt>) and not the actual values that belong to the type class (<tt>a</tt>). Also, we don&#8217;t use a traditional type bound on <tt>T</tt>, but rather require that an implicit object of type <tt>Numeric[T]</tt> can be found.</p>
<p><a href="http://www.scala-lang.org/api/current/index.html#scala.math.Numeric">Numeric</a> is part of the Standard Scala Library but unfortunately it does not seem to be widely used. I believe there are several reasons for this:</p>
<ol>
<li>You can&#8217;t do this stuff in Java so people are used to working around the problem.</li>
<li>Using type classes in Scala can be a little bit confusing and cumbersome.</li>
<li>Scala&#8217;s Numeric lacks important features (e.g. division).</li>
<li>Scala&#8217;s Numeric is slow.</li>
</ol>
<p>In my <a href="http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-2/">next blog post</a>, I will explain how to use the Numeric type class in more detail, point out the problems I&#8217;m alluding to, and present our improved version along with some benchmarking numbers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2011/06/scalas-numeric-type-class-pt-1/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

