<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Azavea Labs</title>
	<atom:link href="http://www.azavea.com/blogs/labs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.azavea.com/blogs/labs</link>
	<description>Insight on what our engineers are doing</description>
	<lastBuildDate>Wed, 01 Sep 2010 22:38:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>How We Launched a New CMS (301 redirects and counting)</title>
		<link>http://www.azavea.com/blogs/labs/2010/08/how-we-launched-a-new-cms-301-redirects-and-counting/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/08/how-we-launched-a-new-cms-301-redirects-and-counting/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 20:15:10 +0000</pubDate>
		<dc:creator>Jeremy Heffner</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cms]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[Varnish]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=820</guid>
		<description><![CDATA[When we decided to give our website a brand new look, we also evaluated our content management system (CMS) and decided to migrate from DotNetNuke to Concrete5.  DotNetNuke had served us well, but we wanted to move to a more modern and lightweight framework with a better administrative interface. [...]]]></description>
			<content:encoded><![CDATA[<p>When we decided to give our website <a href="http://www.azavea.com/blogs/atlas/2010/08/a-brand-new-look/">a brand new look</a>, we also evaluated our content management system (CMS) and decided to migrate from <a href="http://www.dotnetnuke.com/">DotNetNuke</a> to <a href="http://www.concrete5.org/">Concrete5</a>.  DotNetNuke had served us well, but we wanted to move to a more modern and lightweight framework with a better administrative interface.  This migration presented us with a few challenges to overcome.</p>
<p>While we redesigned our company website, we didn&#8217;t want to delay its launch until we had migrated all of our product websites to Concrete5.  To combine the two systems under one domain name, we leveraged <a href="http://varnish-cache.org/">Varnish</a> as a proxy to split requests between our LAMP stack hosting Concrete5 and our Windows server hosting DotNetNuke.  We&#8217;ve found Varnish to be a robust yet lightweight tool for proxying, caching, and load balancing.  You can read more about how we used <a href="http://www.azavea.com/blogs/labs/2009/12/scaling-walkshed-org-with-varnish-and-amazon-web-services/">Varnish to host Walkshed on Amazon Web Services in this related blog post</a>.</p>
<p>Another concern when migrating between CMSs is making sure that the old URLs continue to function and issue the right redirects to users and search engines.  <a href="http://www.azavea.com/about-us/staff-profiles/andrew-jennings/">Andrew Jennings</a> took this project on and whipped up a URL redirection system for us using Apache&#8217;s <a href="http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html">mod_rewrite</a> module.</p>
<ul>
<li>As requests come into Varnish, we first test to see if Concrete5 can respond to the URL.</li>
<li>If Concrete5 isn&#8217;t managing the content for the URL, the request bounces into our redirection system that maps old URLs to new URLs using a <a href="http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritemap">binary hashtable</a> and issues <a href="http://en.wikipedia.org/wiki/URL_redirection#HTTP_status_codes_3xx">301 permanent redirects</a> to the client.   We found that this approach is faster than simply relying on <a href="http://httpd.apache.org/docs/1.3/howto/htaccess.html">Apache .htaccess</a> files since we have several thousand legacy URLs to support.</li>
<li>If the URL request is not in the mapping file, we simply forward the request to the Windows server to be served by DotNetNuke.</li>
<li>If DotNetNuke returns a 404 error, we send the request back to Concrete5, which responds with its own 404 error, so that our redirection is invisible to the user.</li>
</ul>
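<p>The decision order in the list above can be sketched in a few lines of Python.  This is only an illustration of the logic, not the Varnish or mod_rewrite configuration itself: the function name route, the three lookup tables, and every sample path are hypothetical stand-ins.</p>

```python
# Hypothetical sketch of the routing order described above.
# All names and sample paths are illustrative assumptions, not the
# actual Varnish or Apache configuration.

# Pages that the new CMS (Concrete5) already serves.
CONCRETE5_PAGES = {"/", "/about-us/", "/products/"}

# Legacy-to-new URL map; stands in for the mod_rewrite hash map file.
LEGACY_REDIRECTS = {
    "/Home.aspx": "/",
    "/AboutUs.aspx": "/about-us/",
}

# Pages still served by the old CMS (DotNetNuke).
DOTNETNUKE_PAGES = {"/Products/DecisionTree/Home.aspx"}


def route(path):
    """Return (status, backend_or_location) for a request path."""
    if path in CONCRETE5_PAGES:
        return 200, "concrete5"
    if path in LEGACY_REDIRECTS:
        # A dictionary lookup is constant time on average, which is the
        # property that makes a hash map attractive for thousands of URLs.
        return 301, LEGACY_REDIRECTS[path]
    if path in DOTNETNUKE_PAGES:
        return 200, "dotnetnuke"
    # Neither system knows the URL: let Concrete5 render the 404 page.
    return 404, "concrete5"


for sample in ["/about-us/", "/AboutUs.aspx", "/missing"]:
    print(sample, route(sample))
```

<p>The point of the sketch is the ordering: the new CMS is asked first, the redirect map second, and the legacy server last, so a visitor never sees which system answered.</p>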
<p>A final goal of the CMS migration was to improve page load times for our visitors.  Having pages that load quickly not only <a href="http://www.hitwise.com/news/au201001.html#body_third">improves the experience for our web visitors</a>, but also has an <a href="http://searchengineland.com/google-now-counts-site-speed-as-ranking-factor-39708">impact on search engine rankings</a>.  Concrete5 suggested we set up <a href="http://pecl.php.net/package/APC">Alternative PHP Cache (APC)</a> to speed up the pages it is serving, which has definitely improved performance on our new pages.</p>
<p>For WordPress, which runs our blogs, we found a great plugin called <a href="http://wordpress.org/extend/plugins/w3-total-cache/">W3 Total Cache</a> that can also leverage APC.  It provides a host of other functionality that we&#8217;re exploring, including HTML, JS, and CSS minification as well as pushing design assets to Amazon&#8217;s CDN.</p>
<p>I always find it amusing how something as straightforward as a new website launch has so much happening behind the scenes.  To find out more <a href="http://www.azavea.com/blogs/atlas/2010/09/whats-in-a-new-website-beyond-a-new-look-content-that-defines-a-whole-brand/">about the content that went into the new Azavea website, read Rachel&#8217;s related blog post</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/08/how-we-launched-a-new-cms-301-redirects-and-counting/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>GPU Occupancy and Idling</title>
		<link>http://www.azavea.com/blogs/labs/2010/07/gpu-occupancy-and-idling/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/07/gpu-occupancy-and-idling/#comments</comments>
		<pubDate>Wed, 07 Jul 2010 14:05:03 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[raster]]></category>
		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=590</guid>
		<description><![CDATA[As our ongoing research into raster processing for GIS on the GPU progresses, we have gone through various stages in the development of each Map Algebra operation.  Having converted a given operation to the GPU, we are finding that there are many potential ways to optimize, and this optimization process brings with it a host [...]]]></description>
			<content:encoded><![CDATA[<p>As our ongoing research into raster processing for GIS on the GPU progresses, we have gone through various stages in the development of each Map Algebra operation.  Having converted a given operation to the GPU, we are finding that there are many potential ways to optimize, and this optimization process brings with it a host of issues that highlight the differences between sequential CPU programming and GPGPU parallel programming.</p>
<p>During the optimization process, we&#8217;ve found (and been told) that the single most important optimization is to ensure memory coalescence.  I blogged about that <a href="http://www.azavea.com/blogs/labs/2010/06/gpu-memory-bandwidth/">before</a>, so if you haven&#8217;t seen it yet, it might be worth reading before you continue on.</p>
<p>Once memory coalescence has been maximized, it is possible to focus on two additional metrics: occupancy and idling.</p>
<h2>Occupancy</h2>
<p>The occupancy metric is defined as the number of active thread groups per processor divided by the maximum number of thread groups per processor.  It&#8217;s a value in the range of 0-100%.</p>
<p>In other words, occupancy measures how many thread groups (NVidia calls them &#8216;warps&#8217;, ATI calls them &#8216;wavefronts&#8217;) are active at one time, relative to the most the hardware can keep active.  At any one time, some thread groups may be processing data, and some may be accessing global memory.  The thread groups that are accessing global memory are effectively stalled for hundreds of instructions, while the other thread groups continue on.</p>
<p>Internally, the GPU has a thread group scheduler which controls when thread groups are executed.  This is extremely useful, since highly parallel operations will utilize many thread groups to perform calculations.  The GPU is highly parallel, but even it has its limits.  This is where the thread group scheduler comes in &#8212; it can execute some of the thread groups, while other thread groups are idle, either completed or queued.  This scheduling enables some thread groups to perform memory access, while other thread groups perform calculations.</p>
<p>Understanding the scheduler makes it possible to &#8216;hide&#8217; these global memory accesses by performing ~100 arithmetic instructions between each global memory access.  Hypothetically, if the GPU executed a kernel that accessed global memory, performed a heavy-duty calculation, then saved that result, the occupancy would probably be pretty high.  The thread group scheduler would schedule one set of thread groups for accessing global memory while scheduling another set for heavy-duty calculation.  This effectively &#8216;hides&#8217; the memory access, since the GPU can perform computation instructions while accessing memory.  Interestingly, there will be a point when increases to occupancy won&#8217;t improve your performance.  At that point, all global memory accesses are &#8216;hidden&#8217; by the computation, and it becomes time to look elsewhere for optimization.</p>
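<p>As a rough illustration, the occupancy ratio and the idea of hiding memory stalls behind arithmetic can be written as two small Python functions.  The second function is a deliberately simplified model that I am assuming for the sake of the example (one stalled group is covered by the arithmetic of the others); the function names and the sample numbers are hypothetical, and real schedulers are more subtle.</p>

```python
import math


def occupancy(active_groups, max_groups):
    """Occupancy as defined above: active thread groups / hardware maximum."""
    if not (max_groups > 0 and active_groups >= 0 and max_groups >= active_groups):
        raise ValueError("expected 0 to max_groups active groups, max_groups positive")
    return active_groups / max_groups


def groups_to_hide_latency(memory_latency, compute_per_access):
    """Simplified estimate of the thread groups needed so that, while one
    group waits memory_latency instruction slots on global memory, the other
    groups have enough arithmetic (compute_per_access each) to fill the gap."""
    return math.ceil(memory_latency / compute_per_access) + 1


# Illustrative numbers only: a stall of 400 instruction slots and
# 100 arithmetic instructions between global memory accesses.
print(occupancy(16, 32))
print(groups_to_hide_latency(400, 100))
print(groups_to_hide_latency(400, 25))
```

<p>Under this model, a kernel with less arithmetic per memory access needs more resident thread groups to stay busy, and once the estimate is met, adding more groups no longer helps.</p>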
<h2>Idling</h2>
<p>The idling metric is defined as the amount of time the GPU is idle divided by the overall execution time of the computation.  It&#8217;s a value in the range of 0-100%.</p>
<p>Idling is something that we have discovered to be critical to the performance of a calculation.  The reference and training documentation instructs GPGPU developers to keep the GPU as busy as possible for as long as possible, and stops there.  By creating this metric, we were able to measure just how much this idling was affecting our computation.</p>
<p>As it turns out, our initial experiments showed that our GPU was idle during periods of memory transfer to and from the CPU.  This idling of the GPU was extending the overall time for computation.  Minimizing this idling through asynchronous kernel execution and memory transfer resulted in a significant and immediate performance improvement.</p>
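<p>The effect of overlapping transfers with kernel execution can be estimated with a little arithmetic.  The sketch below assumes the work is split into equal chunks and that upload, compute, and download can each proceed independently once overlapped (a classic three-stage pipeline); the function names and the timings are hypothetical, not measurements from our experiments.</p>

```python
def idle_fraction(gpu_busy_time, total_time):
    """Idling as defined above: time the GPU is idle / overall execution time."""
    return 1.0 - gpu_busy_time / total_time


def serialized_total(chunks, upload, compute, download):
    """Every chunk is uploaded, computed, and downloaded one step at a time."""
    return chunks * (upload + compute + download)


def pipelined_total(chunks, upload, compute, download):
    """Assumed ideal overlap: after the first chunk fills the pipeline,
    one chunk finishes every max(stage) time units."""
    slowest = max(upload, compute, download)
    return upload + compute + download + (chunks - 1) * slowest


# Illustrative timings in milliseconds for 8 chunks.
chunks, upload, compute, download = 8, 2.0, 5.0, 1.0
busy = chunks * compute

print(idle_fraction(busy, serialized_total(chunks, upload, compute, download)))
print(idle_fraction(busy, pipelined_total(chunks, upload, compute, download)))
```

<p>With these made-up numbers the GPU sits idle for more than a third of the serialized run, and for well under a tenth of the overlapped one, which is the kind of difference the idling metric is meant to expose.</p>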
<h2>Coalescence, Occupancy, Idling</h2>
<p>To summarize, the best way to optimize your GPU computations is to investigate and optimize these three steps (and in this order):</p>
<ol>
<li>Memory coalescence</li>
<li>Thread group occupancy</li>
<li>GPU Idling</li>
</ol>
<p>There are a number of smaller optimizations that can be done as well, but we&#8217;ve found these to be the big 3.  Of course, you can continue this process forever, and demonstrate to your boss the law of <a href="http://www.google.com/images?hl=en&amp;q=diminishing+returns+graph">diminishing returns</a>.</p>
<p><span id="more-590"></span><br />


</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/07/gpu-occupancy-and-idling/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>GPU Memory Bandwidth and Coalescing</title>
		<link>http://www.azavea.com/blogs/labs/2010/07/gpu-memory-bandwidth/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/07/gpu-memory-bandwidth/#comments</comments>
		<pubDate>Thu, 01 Jul 2010 09:40:54 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[gpgpu]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=579</guid>
		<description><![CDATA[When one begins to work with GPGPU, the parallel processing benefits can be substantial, if you know how to work with coalesced memory. This fits in with a parallel algorithm approach, incorporating the following: thinking about your computation in a data-parallel fashion. transferring working data into a local memory cache. scrutinizing how your [...]]]></description>
			<content:encoded><![CDATA[<p>When one begins to work with <a href="http://www.azavea.com/blogs/labs/2010/06/what-the-heck-is-gpgpu/">GPGPU</a>, the parallel processing benefits can be substantial, if you know how to work with coalesced memory. This fits in with a parallel algorithm approach, incorporating the following:</p>
<ol>
<li>thinking about your computation in a data-parallel fashion.</li>
<li>transferring working data into a local memory cache.</li>
<li><span style="text-decoration: line-through;">considering</span> scrutinizing how your code performs global memory accesses.</li>
</ol>
<p>The first item almost goes without saying.  If you are hoping to leverage a massively parallel computing device, you obviously have to break your problem or computation down into discrete units that can be operated on in parallel.</p>
<p>It&#8217;s the second and third points that I am going to focus on in this post, since they are the most important factors when optimizing your GPGPU code.  The reason is that local memory is so much faster at reading and writing than global memory, and the memory module in modern GPUs can perform concurrent reads to sequential global memory positions for an entire thread group.</p>
<h2>Local Memory Caching</h2>
<p>Use of a local memory cache may seem counter-intuitive to a programmer coming from CPU land.  The best analogy would be: storing your working data in RAM instead of on disk.  While not a perfect analogy, a CPU programmer understands perfectly the ramifications of such a design decision &#8212; any data accessed from disk will be retrieved more slowly than data accessed from RAM.  Likewise for local and global memory.  Local memory is on-chip memory that is exceptionally fast.  Global memory is off-chip memory that is often used to transfer data to/from the host (often the CPU).  I&#8217;m talking about a 100x speed difference when using local memory instead of global memory.</p>
<p>In addition to the differences between global and local memory, the memory bandwidth between the graphics card (which contains its own memory and processors) and the motherboard (which contains RAM and one or more CPUs) is another bottleneck.  Data transfer rates across the <a href="http://www.pcisig.com/specifications/pciexpress/specifications">PCI Express 2.0</a> bus are about 8 GB/s.  Data transfer rates within the graphics card are around 141 GB/s.  So not only is the place in which you store your working data important, but also when and how you transfer that data to/from the GPU device itself.</p>
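<p>To put those two bandwidth figures side by side, here is a back-of-the-envelope calculation in Python.  The 1 GB raster size is a hypothetical example, and the calculation ignores latency and protocol overhead, so treat the results as upper bounds on speed rather than measurements.</p>

```python
PCIE_GB_PER_S = 8.0       # bus between motherboard and graphics card, as quoted above
ON_CARD_GB_PER_S = 141.0  # memory bandwidth on the graphics card, as quoted above


def transfer_seconds(gigabytes, gb_per_second):
    """Idealized transfer time: size divided by bandwidth."""
    return gigabytes / gb_per_second


raster_gb = 1.0  # hypothetical working set

print(transfer_seconds(raster_gb, PCIE_GB_PER_S))     # 0.125
print(transfer_seconds(raster_gb, ON_CARD_GB_PER_S))
print(ON_CARD_GB_PER_S / PCIE_GB_PER_S)               # 17.625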
<h2>Sequential Global Memory a.k.a. Coalescence</h2>
<p>And &#8220;sequential global memory positions&#8221;? What is that?  Inside a GPGPU kernel, when accessing a portion of global memory, all threads in that group (NVidia calls them &#8216;warps&#8217;, and ATI calls them &#8216;wavefronts&#8217;) access a bank of memory at one time.  For example, if there are 16 threads executing with the same kernel, 16 sequential positions in global memory (1 position per thread) can be accessed in the same time that it would take 1 thread to read 1 position in memory.  If all memory accesses are performed this way, performance can speed up by a factor of 16 (in the memory access code).</p>
<p>That&#8217;s a wonderful way to speed up data-intensive operations, especially when one is working with raster data, and a given block of cells is accessed multiple times.  It is in this scenario that our research has recently landed us.</p>
<p>Another thing worth noting is that the coalescence concept applies to global memory on the GPU only; local memory does not suffer the same performance hit, so it does not need this technique.  But global memory access on the GPU takes about 100x as many instructions as local memory access.  This means that if you have coalesced global memory access, you are saving hundreds of instructions per thread.  This starts to add up when you consider that processing a raster may require hundreds or thousands of threads.</p>
<p>Armed with this knowledge, parallel algorithm implementations begin to have similar structures with regard to memory access.  The resulting code can be highly complex and is not trivial to debug, but some new tools from <a href="http://developer.nvidia.com/object/visual-profiler.html">NVidia</a> and <a href="http://developer.amd.com/gpu/StreamProfiler/Pages/default.aspx">ATI</a> are enabling developers to profile and visualize the work performed by the GPU. In my next post, I&#8217;ll discuss latency and occupancy, two metrics that one can use to help optimize GPU kernels.</p>
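<p>The 16-thread example can be modeled in Python by counting how many aligned blocks of memory a thread group touches.  This is a simplified model that I am assuming for illustration (one memory transaction per aligned 16-position segment); actual coalescing rules differ between hardware generations, and the names below are hypothetical.</p>

```python
GROUP_SIZE = 16  # threads per group, matching the example above


def transactions(addresses, segment=GROUP_SIZE):
    """Count the aligned segments touched by one thread group.
    In this simplified model, each segment costs one memory transaction."""
    return len({address // segment for address in addresses})


# Each thread reads the position matching its own index: fully coalesced.
sequential = [thread for thread in range(GROUP_SIZE)]

# Each thread reads a position 16 apart from its neighbor: nothing coalesces.
strided = [thread * GROUP_SIZE for thread in range(GROUP_SIZE)]

print(transactions(sequential))  # 1
print(transactions(strided))     # 16
```

<p>The sequential pattern is served by a single transaction, while the strided pattern needs one per thread, which is where the factor of 16 comes from.</p>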
<p><span id="more-579"></span><br />


</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/07/gpu-memory-bandwidth/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>GPUs and Parallel Computing Architectures</title>
		<link>http://www.azavea.com/blogs/labs/2010/06/parallel-computing-architectures/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/06/parallel-computing-architectures/#comments</comments>
		<pubDate>Tue, 29 Jun 2010 14:45:14 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=565</guid>
		<description><![CDATA[I&#8217;ve been blogging about GPUs recently, and I think you can tell it&#8217;s because I&#8217;m excited about the technology.  General Purpose Computing on the GPU (GPGPU) promises great performance increases in computationally heavy software, which we find immensely useful.  In the past, we&#8217;ve managed to engineer web-based applications (see: SmartConservation) that could run complex models [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been blogging about GPUs recently, and I think you can tell it&#8217;s because I&#8217;m excited about the technology.  <strong>G</strong>eneral <strong>P</strong>urpose Computing on the <strong>GPU</strong> (<a href="http://www.azavea.com/blogs/labs/2010/06/what-the-heck-is-gpgpu/">GPGPU</a>) promises great performance increases in computationally heavy software, which we find immensely useful.  In the past, we&#8217;ve managed to engineer web-based applications (see: <a href="http://www.azavea.com/clients/smartconservationgreenplan.aspx">SmartConservation</a>) that could run complex models by implementing a process queuing architecture.  While these systems run on the web, processing may still take several minutes, so they can neither provide a responsive user experience nor support many users.  We&#8217;ve also engineered a system that can perform fast, distributed raster calculations (see: <a href="http://www.walkshed.org/">Walkshed</a>, powered by <a href="http://www.azavea.com/Products/DecisionTree/Home.aspx">DecisionTree</a>).</p>
<p>One of the reasons that GPGPU is so promising is the increasing number of processing cores available on affordable graphics cards.  This increases the computation capacity by leveraging many processors running in parallel. What&#8217;s interesting is that this technique is not new.  <a href="http://blogs.intel.com/research/authors#timothy_mattson">Timothy Mattson</a>, blogging at <a href="http://blogs.intel.com/research/">Intel</a>, has been doing this since the mid-1980s.  The Library of Congress contains a book on <a href="http://lccn.loc.gov/72133318">parallel computing structures and algorithms</a>, dating back to 1969.</p>
<p>As we delve deeper into our work improving Map Algebra operations, important differences in algorithm approaches and implementations become apparent: not all parallel architectures are the same.  One might be tempted to think that, when switching from single-threaded CPU logic to multithreaded/parallel logic, there would be one universal model of parallel computing.  This is definitely not the case.</p>
<p>Three of the most popular types of parallel computing today are:</p>
<ul>
<li><strong>S</strong>hared-memory <strong>M</strong>ulti<strong>-P</strong>rocessors (SMP)</li>
<li>Distributed-memory <strong>M</strong>assively <strong>P</strong>arallel <strong>P</strong>rocessors (MPP)</li>
<li>Cluster computing</li>
</ul>
<p>Each type of parallel computing has its benefits and drawbacks.  It really just depends on what kind of computing you need to do. I&#8217;ll describe these common computing types in detail, starting with the &#8216;traditional&#8217; CPU model.</p>
<p><span id="more-565"></span></p>
<h2>General Purpose</h2>
<p>The &#8216;traditional&#8217; processor used in many computers up until a few years ago was a single-core <strong>C</strong>entral <strong>P</strong>rocessing <strong>U</strong>nit (CPU).  The CPU was able to access a large, general-purpose memory bank, called <strong>R</strong>andom <strong>A</strong>ccess <strong>M</strong>emory (RAM).</p>
<div id="attachment_790" class="wp-caption aligncenter" style="width: 485px"><img class="size-full wp-image-790" title="CPU Architecture" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/cpu-drawing.png" alt="Simplified CPU Architecture" width="475" height="54" /><p class="wp-caption-text">Simplified CPU Architecture</p></div>
<p>In contemporary computers, the CPU often contains more than one core, making CPUs capable of executing more than one instruction at a time.  Together with superscalar instruction processing, this makes modern CPUs much faster than their single-core, scalar predecessors.</p>
<h2>Shared-memory Multiprocessors</h2>
<p>An SMP architecture is probably the parallel computing architecture most like the general purpose architecture with which we are familiar.  An SMP system is a set of processors that each have their own local memory.  These memory banks are shared within a thread group, but not between thread groups.  However, each processor also has access to a global memory bank, which is shared between all processors.</p>
<div id="attachment_791" class="wp-caption aligncenter" style="width: 485px"><img class="size-full wp-image-791" title="SMP" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/simd-drawing.png" alt="Simplified Shared-memory Multiprocessor Architecture" width="475" height="256" /><p class="wp-caption-text">Simplified Shared-memory Multiprocessor Architecture</p></div>
<p>This is the parallel architecture that NVidia and AMD/ATI both use in their GPUs.  Likewise, it&#8217;s also the model enforced in the OpenCL specification.</p>
<h2>Distributed-memory Massively Parallel Processors</h2>
<p>The most complicated and flexible architecture type is MPP.  MPP systems isolate memory and processors together, and as such, have no common or shared memory.  Each processor has a dedicated block of local memory, and communicates with other processors via a bus or network.  By varying the number of processors each processor is connected to, different types of MPP systems can be created.  For example:</p>
<ul>
<li>Linear array: if the processors were arranged in a line, each processor is connected to up to 2 neighboring processors (1 at each end of the line)
<p><div id="attachment_793" class="wp-caption aligncenter" style="width: 385px"><img class="size-full wp-image-793" title="Linear Array Architecture" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/linear-array.png" alt="Simplified Linear Array Architecture" width="375" height="53" /><p class="wp-caption-text">Simplified Linear Array Architecture</p></div></li>
<li>Linear ring: if the processors were arranged in a circle, each processor is connected to 2 neighboring processors (a linear array, with the ends connected)</li>
<li>Mesh: if the processors were arranged in a grid, each processor is connected to up to 4 neighbors (3 on the edges, and 2 in the corners)
<p><div id="attachment_794" class="wp-caption aligncenter" style="width: 385px"><img class="size-full wp-image-794" title="Mesh Architecture" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/mesh.png" alt="Simplified Mesh Architecture" width="375" height="130" /><p class="wp-caption-text">Simplified Mesh Architecture</p></div></li>
<li>Tree: if the processors were arranged in a hierarchical manner, with each processor connected to the processor above it, and two processors below it.</li>
<li>Pyramid: if the processors were arranged similar to a tree, but in three dimensions, with each processor connecting to four processors below it.
<p><div id="attachment_795" class="wp-caption aligncenter" style="width: 385px"><img class="size-full wp-image-795" title="Pyramid Architecture" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/pyramid.png" alt="Simplified Pyramid Architecture" width="375" height="202" /><p class="wp-caption-text">Simplified Pyramid Architecture</p></div></li>
<li>Cube: if the processors were arranged similar to a mesh, but in three dimensions, with each processor connected to up to 6 neighbors.</li>
<li>Hypercube: if the processors were arranged similar to a cube, but in four dimensions, with each processor connected to up to 8 neighbors.</li>
</ul>
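<p>The neighbor counts quoted for the line, mesh, and cube can be checked with a short Python sketch.  It follows the grid-style description used in the list, where each additional dimension adds up to two more neighbors; the function names are hypothetical.  Note that textbooks often use &#8216;hypercube&#8217; for a binary hypercube, in which each processor has exactly one neighbor per dimension, so this is a model of the description above rather than of every topology that goes by that name.</p>

```python
from itertools import product


def neighbor_count(position, shape):
    """Number of axis-aligned neighbors of one processor in an n-dimensional grid."""
    count = 0
    for axis, size in enumerate(shape):
        if position[axis] > 0:
            count += 1  # neighbor on the lower side of this axis
        if size - 1 > position[axis]:
            count += 1  # neighbor on the upper side of this axis
    return count


def max_neighbors(shape):
    """Largest neighbor count over every processor in the grid."""
    positions = product(*(range(size) for size in shape))
    return max(neighbor_count(position, shape) for position in positions)


print(max_neighbors((4,)))          # linear array: 2
print(max_neighbors((4, 4)))        # mesh: 4
print(neighbor_count((0, 1), (4, 4)))  # mesh edge: 3
print(neighbor_count((0, 0), (4, 4)))  # mesh corner: 2
print(max_neighbors((3, 3, 3)))     # cube: 6
print(max_neighbors((3, 3, 3, 3)))  # four-dimensional grid: 8
```

<p>Each interior processor gains two links per dimension, so the wiring grows with both the number of processors and the dimensionality of the layout.</p>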
<p>As you can see, the processors in MPP systems can proliferate quite rapidly with more complex processor network topologies.   We haven&#8217;t worked with any MPP systems for our GPU research, so I&#8217;ll let you ponder that while I return to the GPU architecture.</p>
<h2>GPU Memory &#8211; Not Your Father&#8217;s RAM</h2>
<p>As I mentioned above, GPUs and OpenCL implementations are based on the SMP architecture.  As such, GPUs have multiple types of memory, with different implications for each type.</p>
<ul>
<li><strong>Global memory: </strong>this is often the big number on the graphics card packaging.  512 MB DDR, etc.  This is the amount of global memory that is available to the GPU processors.  This memory is essentially used as a fast cache to the motherboard RAM, since it&#8217;s used to transfer raw data to the GPU for processing, and to store computation results before they are read back to motherboard RAM.</li>
<li><strong>Local shared memory:</strong> this is a much smaller bank of memory that is extremely fast.  On the hardware that we&#8217;re using, it&#8217;s limited to 16KB. With some smart memory management, this local memory can really speed up computations, since the instruction cost of accessing this memory is 1% of that required to access global memory.  Also, this memory is shared between all threads in a work-group.</li>
<li><strong>Private thread memory: </strong>this is an extremely small bank of memory that can be used within each thread for variables and temporary storage during your computation. Interestingly, in the NVidia implementation, this uses registers for a certain amount, then starts using global memory when registers are exhausted.</li>
</ul>
<p>The differences in memory types are probably the first thing a general purpose GPU programmer will run into. Another thing to keep in mind is the method by which the GPU achieves such high throughput, and that&#8217;s thread parallelism.</p>
<h2>Single Instruction Multiple Threads</h2>
<p>In OpenCL, each parallel code path (thread) executes one instance of a kernel.  The best possible outcome (with regard to thread synchronicity) is when each thread executes the exact same instructions as all other threads.  With each thread managing a different nugget of data, this results in extremely fast execution.  However, if the kernel code diverges or branches, there is a performance penalty: that section of your code will execute serially (think 16x to 32x slower).</p>
<p>NVidia implements this architecture, and has called it <strong>S</strong>ingle <strong>I</strong>nstruction <strong>M</strong>ultiple <strong>T</strong>hreads (SIMT). It&#8217;s kind of like <a href="http://en.wikipedia.org/wiki/Line_dance">line-dancing</a> for threads.  All threads that execute the same instructions can perform together.  If a thread diverges or branches, then the line-dance is broken, and each thread processes a divergent section one after another.  What&#8217;s kind of cool, though, is that the threads will join back up after diverging, and continue on together.</p>
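<p>To make the line-dance idea concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not how the hardware is implemented: the names <code>simt_passes</code> and <code>parity_branch</code> are made up for illustration, and the model simply counts one serialized pass per distinct code path taken by a group of lock-stepped threads.</p>

```python
def simt_passes(thread_data, branch_of):
    """Count serialized passes for one lock-step group of threads.

    Toy model: threads that take the same branch run together in one pass,
    so the group needs one pass per distinct branch taken.
    """
    branches_taken = {branch_of(value) for value in thread_data}
    return len(branches_taken)


def parity_branch(value):
    # Hypothetical kernel containing an if/else on odd vs. even input.
    return "odd" if value % 2 else "even"


uniform_group = [2, 4, 6, 8]      # every thread takes the "even" path
divergent_group = [1, 2, 3, 4]    # threads split across both paths

uniform_passes = simt_passes(uniform_group, parity_branch)
divergent_passes = simt_passes(divergent_group, parity_branch)
print(uniform_passes, divergent_passes)
```

<p>In this simplified model, the cost grows with the number of distinct paths. The worst case, where every thread takes its own path, corresponds to the fully serial behavior described above.</p>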
<h2>Wrapping it up</h2>
<p>With a solid understanding of how the GPU operates in addition to the limitations of memory and threading, it&#8217;s relatively easy to start computing on the GPU.  Many common operations are easily parallelizable, such as sorting and basic mathematical operations.  When you start performing serious number crunching, or if you are porting a beefy algorithm from serial CPU code, that&#8217;s when the real fun begins.<br />
<!--more--><br />


</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/06/parallel-computing-architectures/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>CUDA, Stream, and OpenCL</title>
		<link>http://www.azavea.com/blogs/labs/2010/06/cuda-stream-and-opencl/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/06/cuda-stream-and-opencl/#comments</comments>
		<pubDate>Fri, 25 Jun 2010 14:30:27 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[gpgpu]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[opencl]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=558</guid>
		<description><![CDATA[Computing on the GPU, or GPGPU, is a steadily maturing technology.  There are many technologies out in the wild that will enable you to use GPU&#8217;s for computation, but there&#8217;s a catch: the vendors are still vying for the lead.  The two market leaders are currently NVidia and AMD/ATI. That means that NVidia is pushing [...]]]></description>
			<content:encoded><![CDATA[<p>Computing on the GPU, or <a href="http://www.azavea.com/blogs/labs/2010/06/what-the-heck-is-gpgpu/">GPGPU</a>, is a steadily maturing technology.  There are many technologies out in the wild that will enable you to use GPUs for computation, but there&#8217;s a catch: the vendors are still vying for the lead.  The two market leaders are currently <a href="http://www.nvidia.com/">NVidia</a> and <a href="http://ati.amd.com/">AMD/ATI</a>.</p>
<p>That means that NVidia is pushing their GPGPU API, which is named <strong>C</strong>ompute <strong>U</strong>nified <strong>D</strong>evice <strong>A</strong>rchitecture, or <a href="http://www.nvidia.com/object/cuda_what_is_new.html">CUDA</a>.  Their rival, AMD/ATI, is pushing <a href="http://www.amd.com/US/PRODUCTS/TECHNOLOGIES/STREAM-TECHNOLOGY/Pages/stream-technology.aspx">Stream</a>. Stream incorporates <a href="http://graphics.stanford.edu/projects/brookgpu/">BrookGPU</a>, a compiler and data-parallel language developed at Stanford University, which predates CUDA.</p>
<div id="attachment_705" class="wp-caption aligncenter" style="width: 335px"><img class="size-full wp-image-705   " style="border: 5px solid white;" title="GPGPU APIs" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/apis.png" alt="NVidia &amp; CUDA, ATI &amp; Stream, or OpenCL" width="325" height="100" /><p class="wp-caption-text">NVidia &amp; CUDA, ATI &amp; Stream, or OpenCL</p></div>
<p>Both of these vendor APIs are proprietary, and each runs only on its vendor&#8217;s hardware.  This makes sense if a developer can control what hardware the computations will run on. Realistically, a developer rarely has such control. So what are the options? At the current time, there are only a couple:  <a href="http://www.khronos.org/opencl/">OpenCL</a> and Microsoft’s <a href="http://en.wikipedia.org/wiki/DirectCompute">DirectCompute</a> technology.  Microsoft&#8217;s technology is limited to Windows Vista and Windows 7, though, so we are focusing on OpenCL.</p>
<p>OpenCL is the <strong>Open</strong> <strong>C</strong>omputing <strong>L</strong>anguage, a language that extends the <a href="http://en.wikipedia.org/wiki/C99">C99</a> standard (a modern dialect of the C programming language) and compiles into device-specific binaries. OpenCL was originally developed by <a href="http://developer.apple.com/technologies/mac/snowleopard/opencl.html">Apple</a>, and handed over to the <a href="http://www.khronos.org/">Khronos Group</a>. The OpenCL standard was ratified by the consortium in December of 2008.  The Khronos Group consortium includes all the major players in the field, including NVidia, AMD/ATI, Apple, and <a href="http://www.intel.com/">Intel</a>.  The list is much more extensive, but those are the four to be happy about.  Intel doesn&#8217;t support OpenCL in their multicore CPUs, but I&#8217;m optimistic that they will release an OpenCL API to leverage CPU cores as well as GPU cores as computing devices.</p>
<p>OpenCL was created to address the <a href="http://www.youtube.com/watch?v=ZwNWviK5z0Q">need for speed</a> in current desktop systems that contain GPU processors.  The language was created to address computing on heterogeneous systems, which, when you think about it, can include many other types of computing devices.  If OpenCL is adopted by <a href="http://developer.android.com/index.html">Android</a>, then you could optimize code to run on Android devices, too. While this may not be the fastest approach, it would potentially let you distribute work among devices.</p>
<p>One caveat to heterogeneous systems, though: OpenCL kernels that are written and optimized for one hardware platform probably won&#8217;t perform the same on another hardware platform.  While OpenCL enables developers to write code that can run on multiple hardware devices, the hardware implementations may vary.  For example, the number of processor cores, and thus the number of parallel threads, may vary widely.</p>
<p>If you can&#8217;t tell already, we are sold on the promise of OpenCL for GPGPU.  The language is easy to use (if you already know C), and it supports the two biggest players in the GPU market, NVidia and AMD/ATI.  We are hoping that Intel releases their OpenCL drivers for CPUs, too, so that we can squeeze out every last drop of computing power for our computations.</p>
<p><span id="more-558"></span><br />


</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/06/cuda-stream-and-opencl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What the heck is &#8230; GPGPU?</title>
		<link>http://www.azavea.com/blogs/labs/2010/06/what-the-heck-is-gpgpu/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/06/what-the-heck-is-gpgpu/#comments</comments>
		<pubDate>Wed, 16 Jun 2010 16:00:48 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ati]]></category>
		<category><![CDATA[gpgpu]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[nvidia]]></category>
		<category><![CDATA[parallel]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=553</guid>
		<description><![CDATA[General Purpose Computing on the Graphics Processing Unit, or GPGPU, is a technology that enables software to use the computing power of the multiple processing units that come with modern graphics cards.  The benefits of using these processing units is that computational work gets done by many workers at once, instead of one CPU or [...]]]></description>
			<content:encoded><![CDATA[<p>General Purpose Computing on the Graphics Processing Unit, or GPGPU, is a technology that enables software to use the computing power of the multiple processing units that come with modern graphics cards.  The benefit of using these processing units is that computational work gets done by many workers at once, instead of one CPU or a few threads on a multi-core CPU.</p>
<p>At first, it may sound like multi-threading, and in a way it is.  What&#8217;s key, though, is that the GPU doesn&#8217;t just multi-thread: it performs synchronized, lock-stepped multi-threading.   It&#8217;s called Single Instruction, Multiple Data, or <a href="http://en.wikipedia.org/wiki/SIMD">SIMD</a> in <a href="http://en.wikipedia.org/wiki/Flynn%27s_taxonomy">Flynn&#8217;s Taxonomy</a> (NVidia&#8217;s hardware uses a variant called Single Instruction, Multiple Threads, or SIMT).  This refers to the control flow in each thread on the GPU: to maximize performance, it&#8217;s important to get all threads running through the same logic at the same time.  In practice, some threads will eventually branch, and keeping the branching to a minimum is critical.</p>
<p>When we have processing tasks that can be highly discrete, the SIMT nature of the GPU enables computations to really fly when using GPGPU.  Structuring computations in this fashion will result in operations that are highly parallelized.  Coincidentally, they will also be easily distributed to other computation engines.  But unlike other distributed computing methods, GPGPU is tightly coupled, and has very low latency &#8212; you get immediate results.</p>
<p><img class="size-full wp-image-606 alignnone" style="margin-left: 150px; margin-right: 150px;" title="OpenCL Logo" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/06/OpenCL_Logo_RGB_172_square.png" alt="OpenCL Logo" width="172" height="172" /></p>
<p>To facilitate all of the above, GPGPU requires a video card that supports programmability.  While the major GPU manufacturers have released proprietary GPGPU languages (NVidia’s CUDA framework, for example), a cross-platform, device-independent alternative, called <a href="http://www.khronos.org/opencl/">OpenCL</a>, has recently become sufficiently robust that we can begin to use it for implementing GPGPU processing.  While originally developed by Apple, OpenCL is now overseen by a non-profit organization, the <a href="http://www.khronos.org/">Khronos Group</a>.</p>
<p>If you&#8217;ve got a GPU on hand, it&#8217;s quite likely that there is an OpenCL API available in your programming language of choice.  Here&#8217;s a short list from the <a href="http://www.khronos.org/developers/resources/opencl/">Khronos Group&#8217;s</a> website:</p>
<ul>
<li><a href="http://mathema.tician.de/software/pyopencl">Python</a></li>
<li><a href="http://ruby-opencl.rubyforge.org/">Ruby</a></li>
<li><a href="http://www.opentk.com/project/opentk">C</a></li>
<li><a href="http://www.opentk.com/project/opentk">C++</a></li>
<li><a href="http://www.opentk.com/project/opentk">C#</a></li>
<li><a href="http://www.opentk.com/project/opentk">VB.Net</a></li>
<li><a href="http://planet.plt-scheme.org/display.ss?package=opencl.plt&amp;owner=jaymccarthy">Scheme</a></li>
</ul>
<p>By the way, Khronos just released the spec for <a href="http://www.khronos.org/news/press/releases/khronos-group-releases-opencl-1-1-parallel-computing-standard">OpenCL 1.1</a> on June 14th, 2010.</p>
<h2>GPGPU for GIS</h2>
<p>Our interest in GPGPU is motivated by our desire to speed up complicated raster math calculations, in particular those that would enable us to make more sophisticated geoprocessing available on web and mobile platforms.  Our vision is to improve geospatial data processing by an order of magnitude or better and thereby make web-based analysis tools more responsive and compelling.  To support our research we applied for and were awarded a <a href="http://www.nsf.gov/">National Science Foundation</a> (NSF) <a href="http://www.nsf.gov/eng/iip/sbir/">Small Business Innovation Research</a> (SBIR) grant to benchmark some traditional GIS operations against GPU-accelerated versions of the same operations.  Our objective is to speed up these operations by about 20x, and it looks promising.</p>
<p>We&#8217;re sampling the wide array of <a href="http://www.azavea.com/blogs/newsletter/v3i1/what-the-heck-ismap-algebra/">Map Algebra</a> operations that are in standard GIS toolkits, focusing on a couple examples of each of the following types of operations: <a href="http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Operators_and_functions_of_Spatial_Analyst">local, focal, zonal, and global</a>.  Some of these types of operations are easily parallelized, such as local and focal operations.  The other operations require a fair amount of algorithm wrangling before they show any improvement on the GPU&#8217;s parallel architecture.</p>
<p>Our research is ongoing, so it&#8217;s still a bit too early to tell, but we&#8217;re excited to already see some impressive performance improvements.</p>
<h2>GPGPU Lessons</h2>
<p>Migrating our chosen Map Algebra operations to the GPU was trivial in some cases, but quite challenging in others.  For example, local Map Algebra operations are quite easy to parallelize.  Each raster cell is combined with or compared to the cell at the same location in another raster.  This requires very little extra memory and few intermediate steps.</p>
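<p>As a rough illustration of why local operations parallelize so easily, here is a minimal serial sketch in plain Python (not our GPU code). The function name <code>local_add</code> and the sample rasters are made up for this example. Each output cell depends only on the input cells at the same row and column, so on a GPU each cell could be handed to its own thread.</p>

```python
def local_add(raster_a, raster_b):
    """Local operation: add two rasters cell by cell.

    Every output cell depends only on the cells at the same position in the
    inputs, so all cells can be computed independently of one another.
    """
    return [
        [cell_a + cell_b for cell_a, cell_b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]


# Hypothetical 2x2 rasters, stored as lists of rows.
elevation = [[1, 2], [3, 4]]
offset = [[10, 20], [30, 40]]

result = local_add(elevation, offset)
print(result)
```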
<p>Focal (or &#8220;neighborhood&#8221;) operations are not simple, particularly for large neighborhoods, but with some intelligent partitioning, it&#8217;s relatively easy to work with different sections of the rasters.  This requires some intermediate buffers, but performs well using local memory on the GPU.</p>
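<p>For comparison, a focal operation reads a window of cells around each location. The sketch below is a serial Python illustration under assumed conventions (a square window, clipped at the raster edges), with a made-up function name, <code>focal_mean</code>. A GPU version would load each partition of the raster, plus a border of neighboring cells, into local memory before computing.</p>

```python
def focal_mean(raster, radius=1):
    """Focal operation: mean of the square neighborhood around each cell.

    The window is clipped at the raster edges, so edge and corner cells
    average over fewer neighbors.
    """
    rows, cols = len(raster), len(raster[0])
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Clip the window to the raster bounds.
            r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            window = [raster[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            out_row.append(sum(window) / len(window))
        output.append(out_row)
    return output


# Hypothetical 3x3 raster.
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
smoothed = focal_mean(grid)
print(smoothed[1][1], smoothed[0][0])
```

<p>Neighboring windows overlap heavily, which is why staging each partition in fast local memory pays off: the same cell is read many times.</p>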
<p>Zonal operations summarize the cells in one raster data set using zones identified in a second one.  These operations require a couple of scans across a raster, and a few intermediate stages; those were complicated to port.</p>
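<p>Here is a minimal serial sketch of one zonal operation, a zonal sum, in plain Python. The name <code>zonal_sum</code> and the sample data are assumptions for illustration. The serial version is a single loop, but note that many cells update the same running total; on a GPU, that shared update is what forces the extra scans and intermediate stages.</p>

```python
def zonal_sum(values, zones):
    """Zonal operation: sum the cells of `values` within each zone.

    `zones` is a raster of the same shape whose cells hold zone identifiers.
    Returns a dict mapping each zone identifier to the sum of its cells.
    """
    totals = {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            # Many cells contribute to the same total; in parallel code this
            # shared update would need a reduction step.
            totals[zone] = totals.get(zone, 0) + value
    return totals


# Hypothetical value raster and zone raster of the same shape.
values = [[1, 2, 3], [4, 5, 6]]
zones = [[1, 1, 2], [1, 2, 2]]

totals = zonal_sum(values, zones)
print(totals)
```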
<p>What&#8217;s crazy difficult, though, are the global operations.  These operations are challenging because they operate on the entire dataset, with one cell on one side of the raster potentially affecting a cell on the other side of the raster. Examples of global operations include Euclidean distance, cost-weighted distance, viewshed analysis and others.  These operations are non-trivial to convert to GPU processing in a performant way, but it is definitely possible.</p>
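<p>To show what &#8220;global&#8221; means in practice, here is a deliberately naive Euclidean distance sketch in plain Python. It is a brute-force illustration with a made-up function name, <code>euclidean_distance</code>, not the approach we use on the GPU. Every output cell has to consider every source cell, no matter how far away it is, which is exactly the dependency that makes these operations hard to partition.</p>

```python
import math


def euclidean_distance(sources):
    """Global operation: distance from each cell to the nearest source cell.

    `sources` is a raster where truthy cells are sources. Assumes at least
    one source cell exists. Distances are measured in cell units.
    """
    rows, cols = len(sources), len(sources[0])
    source_cells = [
        (r, c) for r in range(rows) for c in range(cols) if sources[r][c]
    ]
    return [
        [
            # Brute force: compare this cell against every source cell.
            min(math.hypot(r - sr, c - sc) for sr, sc in source_cells)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 3x3 raster with a single source in the top-left corner.
sources = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
distance = euclidean_distance(sources)
print(distance[0][2], distance[2][2])
```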
<p><span id="more-553"></span><br />


</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/06/what-the-heck-is-gpgpu/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>GPU Computing for GIS</title>
		<link>http://www.azavea.com/blogs/labs/2010/06/gpu-computing-for-gis/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/06/gpu-computing-for-gis/#comments</comments>
		<pubDate>Mon, 14 Jun 2010 15:01:17 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[computing]]></category>
		<category><![CDATA[gpgpu]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=543</guid>
		<description><![CDATA[We live in exciting times. Computing power continues to grow at an exponential rate, and is well characterized by Moore&#8217;s Law (if you are looking for a graph more recent than 1965, try Wikipedia).  This means that computing power is moving in many directions.  The rise of laptops, notebooks, tablets, and smartphones are a testament [...]]]></description>
			<content:encoded><![CDATA[<p>We live in exciting times.</p>
<p>Computing power continues to grow at an exponential rate, and is well characterized by <a href="http://www.intel.com/pressroom/kits/events/moores_law_40th/index.htm?iid=tech_mooreslaw+body_presskit">Moore&#8217;s Law</a> (if you are looking for a graph more recent than 1965, try <a href="http://en.wikipedia.org/wiki/Moore%27s_law">Wikipedia</a>).  This means that computing power is moving in many directions.  The rise of <a href="http://tuxmobil.org/">laptops</a>, <a href="http://www.datamancer.net/steampunklaptop/steampunklaptop.htm">notebooks</a>, <a href="http://thefastertimes.com/nottruenews/2010/01/26/apple-fans-report-disappointment-with-new-tablet-computer/">tablets</a>, and <a href="http://www.openmoko.com/freerunner.html">smartphones</a> is a testament to the increasing computing power of microprocessors.  They are getting faster, smaller, lighter, more power efficient, and sprouting more cores.</p>
<p>Despite this accelerating computing power, we&#8217;ve seen on some of our projects that many <a href="http://www.azavea.com/Clients/smartconservationgreenplan.aspx">heavy-duty</a> analytical computing tasks remain too costly (in terms of computing time) to be run on the web with more than a small number of users.  However, by distributing the computation across multiple processors and machines, we have found it is possible to improve both the scalability and speed of some geographic data processing tasks.  For one such task, a weighted raster overlay operation, we have been able to accelerate the process enough to make a scalable web application possible.  That work is the basis of Azavea’s <a title="DecisionTree" href="http://www.azavea.com/decisiontree/">DecisionTree</a> framework, which was developed with support from an <a href="../../newsletter/v2i6/what-the-heck-isan-sbir/">SBIR</a> grant from the <a href="http://www.csrees.usda.gov/funding/sbir/sbir.html">US Department of Agriculture</a>.</p>
<p>With this experience developing distributed geoprocessing algorithms, we have recently been taking a look at technologies that will enable us to make similar types of performance and scalability improvements.  One technology that we believe has great promise for bringing these processes to the web is General Purpose Computing on the Graphics Processing Unit (<a href="http://en.wikipedia.org/wiki/GPGPU">GPGPU</a>).</p>
<p>GPGPU leverages the microprocessors that power many modern graphics cards.  <a href="http://www.nvidia.com/object/what_is_cuda_new.html">NVidia</a> and <a href="http://www.amd.com/US/PRODUCTS/TECHNOLOGIES/STREAM-TECHNOLOGY/Pages/stream-technology.aspx">ATI</a> are the largest players in the high performance video adapter field, and they both have GPU computing libraries that run on their video adapter  hardware.</p>
<h2>GPUs are accelerating everything.</h2>
<p>GPUs are powerful for general purpose computing not just because of their clock speed, but because there are just so many multiprocessors on today&#8217;s graphics cards.  While a quad-core CPU is a high-end processor for most servers, today&#8217;s high-end graphics cards have 100, 200 and 500 or more cores and are capable of hundreds of <a href="http://en.wikipedia.org/wiki/FLOPS" target="_blank">gigaFLOPS</a> of double precision processing power (<a href="http://www.nvidia.com/object/product_tesla_C2050_C2070_us.html" target="_blank">NVidia</a>, <a href="http://www.amd.com/us/products/desktop/graphics/ati-radeon-hd-5000/hd-5970/Pages/ati-radeon-hd-5970-overview.aspx" target="_blank">ATI</a>, respectively).  And these numbers are doing nothing but going up.</p>
<p>A few ways of comparing just what that means:</p>
<ul>
<li>a handheld calculator runs at about 10 FLOPS (not giga-, just plain FLOPS; one FLOPS is one billionth of a gigaFLOPS).</li>
<li>by the time you blink your eye, about 154 billion floating-point operations have occurred on the NVidia Tesla C2070.</li>
<li>by the time a hummingbird flaps its wings once, about 10.3 billion floating-point operations have occurred on the same card.</li>
<li>by the time one floating-point operation has occurred on the same card, your voice has only traveled through 0.64 nm of air (for scale, human hair ranges from 17 to 181 μm thick).</li>
</ul>
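<p>The figures of roughly 154 billion and 10.3 billion operations can be reproduced with some back-of-the-envelope arithmetic. The sketch below uses assumed inputs that are consistent with those figures: a peak double precision rate of 515 gigaFLOPS for the card, a blink lasting about 0.3 seconds, and a hummingbird beating its wings about 50 times per second. Peak rates are theoretical maximums, so treat the results as upper bounds.</p>

```python
# Assumed inputs (not measurements): peak rate and rough real-world timings.
PEAK_FLOPS = 515e9           # assumed peak double precision rate, ops per second
BLINK_SECONDS = 0.3          # assumed duration of one blink
WINGBEATS_PER_SECOND = 50    # assumed hummingbird wingbeat rate

# Operations completed during one blink and during one wingbeat.
ops_per_blink = PEAK_FLOPS * BLINK_SECONDS
ops_per_wingbeat = PEAK_FLOPS / WINGBEATS_PER_SECOND

# Time taken by a single operation, in picoseconds.
picoseconds_per_op = 1e12 / PEAK_FLOPS

print(round(ops_per_blink / 1e9, 1), "billion operations per blink")
print(round(ops_per_wingbeat / 1e9, 1), "billion operations per wingbeat")
print(round(picoseconds_per_op, 2), "picoseconds per operation")
```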
<p>In addition to processors and processing speed, GPU cards have fast, specialized memory access.  They have a limited amount of local memory, but if you can figure out a way to use it efficiently, your memory access is on the order of 100x faster than conventional memory.</p>
<p>The combination of more processors and faster memory means that if you can discretize or parallelize the type of work that you want to perform, you can get radical speed improvements.</p>
<h2><strong>GIS on the GPU.</strong></h2>
<p>That&#8217;s all well and good, but how can GPGPU be used for GIS?  We are <a href="http://www.directionsmag.com/article.php?article_id=3418" target="_blank">not the only ones thinking about this</a>, but the answer depends on what kind of analysis you want to do.  We have been focusing our research on a few types of MapAlgebra operations, and our preliminary investigations have shown that all types of MapAlgebra operations can benefit from processing on the GPU.  In addition, we believe substantial improvements can be made in some types of vector processing.  A few likely candidates would be:</p>
<ul>
<li>Vector-to-raster and raster-to-vector conversion</li>
<li>Network analysis</li>
<li>Network routing</li>
<li>Transformations of geometric collections</li>
</ul>
<p>All of these optimizations have the potential of reducing the computing time for heavy duty GIS operations from hours to minutes, and from minutes to seconds.  With that kind of speedup, the <a href="http://www.websiteoptimization.com/speed/1/">&#8220;attention threshold&#8221;</a> of the web can be achieved.  It now becomes possible to run more complex GIS tasks in a web environment, bringing more computing power to the masses.</p>
<p>These changes won&#8217;t change the world right away, but they will make GIS analysis more interactive, responsive, and efficient.  Just imagine if you could complete any given task in your day in 1/10th the time (think <a href="http://www.imdb.com/name/nm1714016/">Dash</a>, from the <a href="http://www.imdb.com/title/tt0317705/">Incredibles</a>).<br />
<span id="more-543"></span><br />


</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/06/gpu-computing-for-gis/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Geocoding The Centennial Expo&#8230; Without A Geocoder!</title>
		<link>http://www.azavea.com/blogs/labs/2010/05/geocoding-the-centennial-expo-without-a-geocoder/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/05/geocoding-the-centennial-expo-without-a-geocoder/#comments</comments>
		<pubDate>Tue, 04 May 2010 19:19:56 +0000</pubDate>
		<dc:creator>Carissa Brittain</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=472</guid>
		<description><![CDATA[Set your wayback machine to 1876, the place: the Philadelphia Centennial Exhibition. Wandering around the festival grounds, you would have seen several cameramen snapping away. Some of those images have survived admirably as the Free Library of Philadelphia&#8216;s Centennial Exhibition Collection. Over at PhillyHistory.org, we recently added a selection of those images, and the geocoding [...]]]></description>
			<content:encoded><![CDATA[<p>Set your <a title="Peabody here..." href="http://www.toonopedia.com/peabody.htm" target="_blank">wayback machine</a> to 1876, the place: the <a title="The first official World's Fair in the US" href="http://en.wikipedia.org/wiki/Centennial_Exposition" target="_blank">Philadelphia Centennial Exhibition</a>. Wandering around the festival grounds, you would have seen several <a title="With all of their equipment, too" href="http://pubs.acs.org/doi/abs/10.1021/ie50201a011" target="_blank">cameramen</a> snapping away. Some of those images have survived admirably as the <a title="Yep, it's free!" href="http://www.freelibrary.org/index.htm" target="_blank">Free Library of Philadelphia</a>&#8216;s <a title="Hidden gems in the library..." href="http://libwww.freelibrary.org/CenCol/" target="_blank">Centennial Exhibition Collection</a>. Over at PhillyHistory.org, we recently added a selection of those images, and the geocoding presented quite a challenge.</p>
<p>The first hurdle to be overcome was that the vast majority of the Centennial Expo&#8217;s buildings (and some of the streets) don&#8217;t exist anymore! The Expo took place in a swath of Philly&#8217;s <a title="Just a little piece of it" href="http://www.fairmountpark.org/" target="_blank">Fairmount Park</a>, most of which has turned back into trails and fields over the years.  Some of the buildings were torn down just after the Expo closed; some of them still exist, but in other places. Only a very few buildings and landmarks still occupy the same space they did back in 1876. But do not despair for there was <a title="A google map? no.." href="http://www.google.com/images?q=A+Map!&amp;oe=utf-8&amp;rls=org.mozilla:en-US:official&amp;client=firefox-a&amp;um=1&amp;ie=UTF-8&amp;source=univ&amp;ei=nnLgS4yNFJXy9ASfku3KCQ&amp;sa=X&amp;oi=image_result_group&amp;ct=title&amp;resnum=1&amp;ved=0CBMQsAQwAA">a map!</a> Included in the Free Library&#8217;s records was a map of the entire fair, complete with all of the buildings and roads built for the Expo, along with neat things like fountains and plazas. Dana Bauer, one of Azavea&#8217;s employees and a <a title="A very precise science." href="http://en.wikipedia.org/wiki/Georeference" target="_blank">geo-referencing</a> expert, took charge of the map and managed to line up enough of the old landmarks with existing geography to give us very accurate coordinates for all of the Centennial Expo&#8217;s old buildings! Hooray!</p>
<p>So now we have coordinates, but we need to be able to tie those coordinates to the image records in our system. Here the data was slightly uncooperative. Many of the image records had a building mentioned in some fashion and in one field or another. The challenge here was that the text in the record didn&#8217;t always match the official name of the building from the map. For example: the Main Exhibit Hall was mentioned in records as Main Exhibition Building, Main Hall, Main Bldg and several other permutations. We spent some time <a title="Not quite to this scale." href="http://en.wikipedia.org/wiki/Data_mining" target="_blank">figuring out</a> around 20 aliases for some of the more popular buildings in the collection.</p>
<p>The last step was to build a parsing-and-geocoding module to find the building references in the record data, match them up with the coordinates we found and add that information to our database. Easy, right? Not so fast. All 150-some coordinate pairs and their building text keys needed to be accessible somehow to our data importer. We thought for a while about how best to do that. 150 entries isn&#8217;t a huge list, but it&#8217;s not short either. We didn&#8217;t want to have to mess with it at all once we set something up. This list also wasn&#8217;t likely to change over time, so it didn&#8217;t need to be very accessible or easily editable by people. However, we were also going to need it again to return to the Free Library in some format or another at the end of the project. The coordinates were in an <a title="Copyright Microsoft" href="http://office.microsoft.com/en-us/excel/default.aspx" target="_blank">Excel</a> spreadsheet by this point, but we didn&#8217;t want to deal with the overhead of accessing it directly. We wound up creating a <a title="This one over here" href="http://dotnetperls.com/dictionary-keys" target="_blank">data dictionary</a> inside our importer; kind of a mini geocoder table just for this swath of Fairmount Park circa 1876. We left the parent Excel spreadsheet alone as both a backup archive and the basis for the document we&#8217;ll be returning to the Free Library.</p>
<p>While looking through the data for building aliases, we also noted the fields where these names were found. So all we had to do was write a small function to loop through these fields, do some text matching on our data dictionary and add the coordinates to the record in our system. Out of over 1500 images, we wound up geocoding over 90% using this method, and most of the rest simply didn&#8217;t have a building reference to find. The entire import, images and all, took under 10 minutes. Not bad at all!</p>
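<p>For the curious, here is a minimal sketch of that lookup idea in Python. It is not our importer code: the field names, the alias list and the coordinates are placeholders made up for illustration, and the real table held roughly 150 entries taken from the georeferenced map.</p>

```python
# Hypothetical mini geocoder table: official building name -> (lat, lon).
# These coordinates are illustrative placeholders, NOT the georeferenced values.
CANONICAL_COORDS = {
    "Main Exhibit Hall": (39.9780, -75.2060),
    "Memorial Hall": (39.9795, -75.2093),
}

# Alias text (lower case) -> official building name.
ALIASES = {
    "main exhibit hall": "Main Exhibit Hall",
    "main exhibition building": "Main Exhibit Hall",
    "main hall": "Main Exhibit Hall",
    "main bldg": "Main Exhibit Hall",
    "memorial hall": "Memorial Hall",
}

# Assumed names for the record fields where building names tended to show up.
SEARCH_FIELDS = ("title", "description", "notes")


def geocode_record(record):
    """Return (building, (lat, lon)) for the first alias found, else None."""
    # Check longer aliases first so a more specific name wins over a short one.
    aliases_by_length = sorted(ALIASES, key=len, reverse=True)
    for field in SEARCH_FIELDS:
        text = (record.get(field) or "").lower()
        for alias in aliases_by_length:
            if alias in text:
                building = ALIASES[alias]
                return building, CANONICAL_COORDS[building]
    return None


if __name__ == "__main__":
    sample = {"title": "View of Main Bldg from the north", "description": ""}
    print(geocode_record(sample))
```

<p>Plain substring matching is the simplest thing that could work here. A record with no match just returns nothing, which mirrors the images that had no building reference to find.</p>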
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/05/geocoding-the-centennial-expo-without-a-geocoder/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Azavea’s Coffee Helper: Caféduino</title>
		<link>http://www.azavea.com/blogs/labs/2010/03/azaveas-coffee-helper-cafeduino/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/03/azaveas-coffee-helper-cafeduino/#comments</comments>
		<pubDate>Mon, 29 Mar 2010 14:03:51 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[arduino]]></category>
		<category><![CDATA[coffee]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=460</guid>
		<description><![CDATA[Azavea has a clearly defined symbiosis with coffee. We have a designated Minister of Coffee, an incredible coffee grinder, and we get selections of coffee from around the world; some of it hand-delivered, and some of it hand-crafted. One of the problems (or rather, opportunities) that I observed was that as coffee was brewed and consumed in [...]]]></description>
			<content:encoded><![CDATA[<p>Azavea has a clearly defined symbiosis with coffee. We have a designated Minister of Coffee, an incredible coffee grinder, and we get selections of coffee <a href="http://www.flickr.com/photos/tags/coffee/map?&amp;fLat=-14.2429&amp;fLon=-51.4122&amp;zl=14">from</a> <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=&amp;q=coffee+near:tasmania&amp;sll=-42.000519,146.703146&amp;sspn=4.061753,3.532104&amp;ie=UTF8&amp;hq=coffee&amp;hnear=Tasmania,+Australia&amp;t=h&amp;z=8">around</a> the <a href="http://en.wikipedia.org/wiki/File:DirkvdM_orosi_valley_bird.jpg">world</a>; some of it <a href="http://www.rainforestours.com/products.htm">hand-delivered</a>, and some of it <a href="http://www.azavea.com/blogs/labs/2009/08/fueling-the-software-engineer/">hand-crafted</a>.</p>
<p>One of the <span style="text-decoration: line-through;">problems</span> opportunities that I observed was that as coffee was brewed and consumed in the office (about 6 brews a day), there would often be an unlucky staffer who picked up the (opaque) coffee pot to find it empty.  I don’t personally drink coffee, so I was unaware of how much anticipation one would have when approaching the pot: not knowing how much coffee was in there, and worrying whether the mug you pour will be <a href="http://www.google.com/search?q=half+empty">half empty</a> or <a href="http://www.google.com/search?q=half+full">half full</a>.</p>
<p>I slowly set to work in my free time to solve this conundrum.  To me, it seemed like an individual would want to know if there was coffee in the coffee pot before approaching the coffee maker (whom some of us address reverently as <a href="http://www.zojirushi.com/ourproducts/breadmakers/ec_bd15.html">Zojirushi-san</a>).  This could be 1) a web page, 2) a desktop app, or 3) an <a href="http://en.wikipedia.org/wiki/Internet_Relay_Chat">IRC</a> (not <a href="http://en.wiktionary.org/wiki/IIRC">IIRC</a>) bot.</p>
<p>I drew up some schematics, took some measurements, and retreated to my home lab to build an <a href="http://www.arduino.cc/">Arduino</a>-based, web-enabled measurement system tailored for the coffee pot. I used an <a href="http://www.arduino.cc/en/Main/ArduinoBoardDiecimila">Arduino Diecimila</a>, an <a href="http://www.arduino.cc/en/Main/ArduinoEthernetShield">Ethernet Shield</a>, a couple of piezoelectric sensors, a 3-color LED, a couple of buttons, and an awesome hand-crafted wooden base.</p>
<p>A <span style="text-decoration: line-through;">short</span> while later, I had a working prototype ready for testing. This device now sits in Azavea’s kitchen, measuring the weight of the coffee maker, and reporting the measurements to <a href="http://www.pachube.com/">pachube</a>.  After doing some internal evaluation, the name ‘Caféduino’ stuck, and I developed a couple methods of viewing the coffee pot status.</p>
<ol>
<li>Direct      web access
<div id="attachment_462" class="wp-caption aligncenter" style="width: 324px"><img class="size-full wp-image-462 " title="Caféduino Webpage" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/03/cafeduino-1.png" alt="The web page generated by the Cafeduino" width="314" height="118" /><p class="wp-caption-text">The web page generated by the Caféduino</p></div>
<p>Using this method, it’s possible to directly address the Caféduino.  This gives one direct access to the      measurement values, but is more useful for other applications that are      polling the data frequently.</li>
<li>Caféduino      Notification
<div id="attachment_463" class="wp-caption aligncenter" style="width: 231px"><img class="size-full wp-image-463 " title="Caféduino Notifier" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/03/cafeduino-2.png" alt="The system tray notification app." width="221" height="82" /><p class="wp-caption-text">The system tray notification app.</p></div>
<p>Using this method, the Caféduino is polled continuously, and the tiny      coffee mug in the system tray is updated as the coffee level changes.  This is the most aggressive method of      monitoring the Caféduino which, mysteriously, is the most comforting for      users.</p>
<div id="attachment_464" class="wp-caption aligncenter" style="width: 419px"><img class="size-full wp-image-464 " title="Caféduino History" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/03/cafeduino-3.png" alt="Visualizing the Cafeduino history." width="409" height="210" /><p class="wp-caption-text">Visualizing the Caféduino history.</p></div>
<p>When the coffee mug is clicked, the history of the Caféduino is charted in      the window.  What you are seeing is      a Google visualization applet that is consuming the historical data,      stored on <a href="http://www.pachube.com/">http://www.pachube.com/</a>.</li>
<li>IRC      bot integration
<div id="attachment_465" class="wp-caption aligncenter" style="width: 244px"><img class="size-full wp-image-465 " title="Caféduino IRC Conversation" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/03/cafeduino-4.png" alt="A sample IRC conversation with the IRC bot." width="234" height="90" /><p class="wp-caption-text">A sample IRC conversation with the IRC bot.</p></div>
<p>Lastly, the most interactive method of polling the Caféduino is through      our internal IRC channel. The above screenshot shows the conversation that I      initiated with the IRC bot, and its response.  It has reassured me that there is      indeed 77.78% of a pot of coffee left.</li>
</ol>
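<p>All three views boil down to the same step: turning a raw weight reading into "how full is the pot". The Python sketch below shows one way to do that conversion. It is not the Caféduino firmware, and the calibration weights (empty and full, in grams) are invented numbers used only for illustration.</p>

```python
def pot_fill_percent(weight_g, empty_g, full_g):
    """Convert a weight reading into a fill percentage between 0 and 100."""
    if empty_g >= full_g:
        raise ValueError("full_g must be greater than empty_g")
    # Linear interpolation between the two calibration points.
    fraction = (weight_g - empty_g) / (full_g - empty_g)
    # Clamp so a lifted pot or a noisy reading never reports under 0 or over 100.
    fraction = min(1.0, max(0.0, fraction))
    return round(100 * fraction, 2)


if __name__ == "__main__":
    # Assumed calibration: 2000 g with an empty pot, 3800 g with a full one.
    print(pot_fill_percent(3400, empty_g=2000, full_g=3800))
```

<p>With those made-up calibration values, a reading of 3400 g works out to 7/9 of a pot, or 77.78%.</p>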
<p>Now our staff can check in on the coffee pot, to ensure that their next visit to the kitchen will be without disappointment.  While this system works well for monitoring the coffee level, the next steps may be more involved – building a machine to <a href="http://gizmodo.com/5015735/the-top-10-rube-goldberg-machines-featured-on-film">automatically brew coffee</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/03/azaveas-coffee-helper-cafeduino/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Azavea Enters MassDOT Developers Real Time Challenge</title>
		<link>http://www.azavea.com/blogs/labs/2010/03/azavea-massdot-dev-challenge/</link>
		<comments>http://www.azavea.com/blogs/labs/2010/03/azavea-massdot-dev-challenge/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 16:20:58 +0000</pubDate>
		<dc:creator>David Zwarg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.azavea.com/blogs/labs/?p=445</guid>
		<description><![CDATA[We here at Azavea like playing with data. It&#8217;s a fact. All sorts of data (geographically aggregated, processed and hunched, walkable, etc) are useful, but not always accessible. So when MassDOT (the Massachusetts Department of Transportation) announced their Developers Real Time Challenge, we knew it was an opportunity that we didn&#8217;t want to pass up. [...]]]></description>
			<content:encoded><![CDATA[<p>We here at Azavea like playing with data. It&#8217;s a fact. All sorts of data (<a href="http://www.azavea.com/products/kaleidocade/home.aspx">geographically aggregated</a>, processed and <a href="http://www.azavea.com/Products/hunchlab/home.aspx">hunched</a>, <a href="http://www.azavea.com/News.aspx?itemid=1132&amp;full=true">walkable</a>, etc) are useful, but not always accessible. So when <a href="http://www.massdot.state.ma.us/main/main.aspx">MassDOT</a> (the Massachusetts Department of Transportation) announced their <a href="http://www.eot.state.ma.us/developers/default.asp?pgid=content/RealtimeChallenge_0210&amp;sid=about">Developers Real Time Challenge</a>, we knew it was an opportunity that we didn&#8217;t want to pass up.</p>
<p>MassDOT has been making steady progress opening transportation data to developers over the past few months.  In September of 2009, they first released 2 full days&#8217; worth of passenger data, and <a href="http://www.eot.state.ma.us/developers/default.asp?pgid=content/developer_VizChallenge&amp;sid=about">challenged</a> the developer community to play with it to develop a visualization and application that used that data. Then in November of 2009, they hosted a <a href="http://www.massdotdevelopersconference09.com/">developer conference</a>, announced the winners of their first challenge, and announced the release of a <a href="http://www.eot.state.ma.us/developers/realtime/">real-time XML feed</a>. They kept that ball rolling in February 2010 with a <a href="http://c4mbtahackathon.eventbrite.com/">Hackathon</a> at <a href="http://www.mit.edu/">MIT</a> with the <a href="http://civic.mit.edu/">Center for Future Civic Media</a>, and then the announcement of the Real Time Challenge.</p>
<p style="text-align: center;">
<div id="attachment_449" class="wp-caption aligncenter" style="width: 485px"><a href="http://sandbox.azavea.com/mbta/"><img class="size-medium wp-image-449 " title="BusMinder Application" src="http://www.azavea.com/blogs/labs/wp-content/uploads/2010/03/busminder-snap-475x475.png" alt="BusMinder screenshot" width="475" height="475" /></a><p class="wp-caption-text">Screenshot of BusMinder in action.</p></div>
<p>My response to the real time challenge: BusMinder (<a href="http://sandbox.azavea.com/mbta/">http://sandbox.azavea.com/mbta/</a>).  BusMinder is an experiment, designed to enable users to create bus reminders (&#8216;busminders&#8217;) for their favorite bus stop(s).  Users walk through the process of:</p>
<ol>
<li>Selecting a bus stop</li>
<li>Selecting a reminder type (sms or email)</li>
<li>Selecting a time window</li>
</ol>
<p>It&#8217;s that simple to get started. The application will save those settings and remind the user when an MBTA vehicle is approaching that stop, and how far away it is in minutes.  Once a user registers, they can set up multiple reminders (commuting to work and commuting to home, for example).</p>
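<p>The core decision behind a busminder can be sketched in a few lines of Python. This is an illustration rather than the BusMinder source: the function name and the <code>lead_minutes</code> threshold are assumptions, and the predicted arrival time is taken as an input instead of being read from the MassDOT feed.</p>

```python
from datetime import time


def should_remind(now, window_start, window_end, minutes_away, lead_minutes=10):
    """Return True when a reminder should be sent for one saved busminder.

    now, window_start and window_end are datetime.time values on the same day;
    minutes_away is the predicted arrival of the next vehicle at the stop.
    lead_minutes is a hypothetical cutoff for how close the bus must be.
    """
    in_window = window_start <= now <= window_end
    bus_is_close = 0 <= minutes_away <= lead_minutes
    return in_window and bus_is_close


if __name__ == "__main__":
    # A morning commute window from 8:00 to 9:00, bus predicted 7 minutes out.
    print(should_remind(time(8, 15), time(8, 0), time(9, 0), minutes_away=7))
```

<p>A real service would also need to remember which reminders it has already sent, so that one approaching bus does not trigger a message on every poll.</p>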
<p>I introduce BusMinder as an experiment because MassDOT has published this real time XML data feed on a &#8216;trial&#8217; basis. Currently, only 5 bus lines are supported, and it&#8217;s understood that the data is experimental.  In fact, <a href="http://www.eot.state.ma.us/developers/downloads/RelationshipPrinciples_11-12-2009.pdf">one of the bullet points</a> for developers working with MassDOT data is &#8220;Expect change&#8221;.  My hope is that this application is found to be useful, and it helps MassDOT release more of their real time transit information.</p>
<p>I also want to give a shout out to Robert Cheetham, for Azavea&#8217;s generous <a href="http://www.azavea.com/Research.aspx">Research</a> program.  I was able to shift around my projects a bit and take on this challenge under the auspices of my independent research.  Thank you, Robert.</p>
<p>Ok, ok, enough already!  Go to <a href="http://sandbox.azavea.com/mbta/">http://sandbox.azavea.com/mbta/</a> and start playing!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.azavea.com/blogs/labs/2010/03/azavea-massdot-dev-challenge/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
