Last month, I was proud to release our first official language-specific “wrapper” for Cicero, our API for elected official data, district matching, and geocoding. “python-cicero,” as it’s called, is now available to all on GitHub or the Python Package Index (also known as PyPI). January also happened to be when the brand new GeoPhilly Meetup group was having its first joint meeting with the Philly Python User’s Group, and I was excited to have such a perfect nexus event, with both Python and GIS nerds in the audience, to give a talk about this project. In the words of one of our attendees, John Ashmead (who also has some background in science fiction writing), my talk did a good job of conveying the struggle and conflict between “man and machine” inherent in the process of releasing a Python package.
Yes, it’s sad but true: a certain dose of “man vs machine” conflict is inherent because the state of Python packaging is a total mess and has been for a long time. All newcomers, like myself or my colleague Steve Lamb (with his recently packaged django-queryset-csv project), soon discover this when they embark on distributing their first package, and even CPython core contributors admit it without hesitation. The crooked, winding, poorly documented road to a finished Python package is even more mind-boggling when you consider that there are nearly 40,000 of these packages on PyPI. This is not a rare, obscure process. Python packages only seem easy at face value.
The packaging process is a lot to cover though, so I’ll be writing a separate tutorial on that and my findings in an upcoming Azavea Labs post later this week. Stay tuned!
Designing a Wrapper
For this post, we’ll examine the wrapper itself, along with another face-value assumption: that API wrappers are “small potatoes” projects. Searching Google or GitHub for “api wrapper” will give you an idea of how common these things are – and frequently the same API will have duplicate wrappers written in the same language by different authors. And sure, when compared to large software projects like Azavea’s recent Coastal Resilience mapping application, or our veteran Homelessness Analytics visualization site, the 300 KB python-cicero library is tiny.
However, within the relatively small charge of a library intended to make HTTP requests to an API easier, there is a deceptively sizable number of design considerations to take into account. Netherland points out a few of these in the previous link, particularly around “wrapping” versus “abstraction.” As when designing all software, especially when it’s intended to be used by others at a technical level, you have to think about how your users will use your tool and anticipate their needs and desires. Who uses your API? What for? Are your users technical enough that your wrapper is just saving them repeated calls to “urllib2.urlopen()”? Or would they appreciate some guidance and hand-holding in the form of additional abstraction? The answers to those questions inform the interface you design for your wrapper library. Not the most monumental task, but not the smallest either.
Some of our Cicero API users are very technical, and dive straight into the API. But often, our Cicero API clients come to us from smaller, nonprofit political advocacy groups. Sometimes the people who sign up for Cicero accounts at these organizations have a limited technical background – web development skills they’ve picked up on the side for specific projects here and there. It was this type of user that was in my mind as I designed python-cicero, and why I decided to lean towards more abstraction.
Cicero is a paid service, so we’ve implemented a system of authentication to verify that users querying the API have an account in good standing. Users send us their account username and password in the payload of a POST request, and we return an authentication token and numeric user ID that they place in the query string of the URLs for their subsequent calls to the API (which, incidentally, are all GET requests).
In the wrapper, I decided to abstract all of that. We have a class, “CiceroRestConnection”, which is instantiated with a username and password. That’s it! You are now ready to make all your API calls with this new class instance without ever having thought about tokens or POST requests or anything beyond remembering your login details.
Under the hood, the __init__ method of the CiceroRestConnection class takes the username and password info, encodes it into a payload, makes the request to Cicero’s /token/new.json endpoint, parses the token and user ID out of the successful response, and assigns these to class attributes so they’re available for use in other class methods for accessing other API endpoints. Authentication tokens expire roughly every 24 hours, and Cicero will respond to any later call that uses an expired token with 401 Unauthorized. If necessary, users can build logic into their Python applications to check for this response and, if it is received, either call __init__ again to reset their token or re-instantiate the class.
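To make that flow concrete, here is a minimal sketch of the login step and the retry-on-401 logic. It is not python-cicero’s actual source: the class name SketchConnection, the helper with_reauth, the placeholder base URL, and the “token” and “user” field names in the response are all assumptions made for illustration, and it is written for Python 3 (urllib.request rather than urllib2).

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder base URL for this sketch, not the real Cicero address.
BASE_URL = "https://cicero.example.com/v3.1"


def http_post(url, data):
    """POST form-encoded bytes and return the decoded JSON body."""
    with urlopen(url, data=data) as response:  # passing data makes it a POST
        return json.loads(response.read().decode("utf-8"))


class SketchConnection:
    """Hypothetical stand-in for a connection class that hides token handling."""

    def __init__(self, username, password, post=http_post):
        payload = urlencode({"username": username, "password": password})
        body = post(BASE_URL + "/token/new.json", payload.encode("utf-8"))
        # The "token" and "user" keys are assumed names for this sketch.
        self.token = body["token"]
        self.user_id = body["user"]
        self._credentials = (username, password, post)

    def refresh(self):
        """Request a fresh token by running the login step again."""
        self.__init__(*self._credentials)


def with_reauth(connection, call):
    """Run call(connection); on a 401, refresh the token and retry once."""
    try:
        return call(connection)
    except HTTPError as error:
        if error.code != 401:
            raise
        connection.refresh()
        return call(connection)
```

Passing the HTTP function in as the post argument is purely a convenience for this sketch: it lets you exercise the login logic with a fake function instead of a live account.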
Taking an example CiceroRestConnection instance named “cicero”, we can make a request to the API’s /official endpoint. All endpoints in Cicero aside from requesting new tokens are HTTP GET requests, so I adopted this as my naming scheme for CiceroRestConnection class methods (“get_official()”, “get_nonlegislative_district()”, “get_election_event()”, etc). The user passes however many keyword arguments (all identical to those described in the Cicero API docs) they need to execute their query to the endpoint they’ve chosen (in this case, we kept it simple with one “search_loc” argument to geocode Azavea’s HQ address). The wrapper makes the request and parses the response into another set of classes that can be easily navigated with Python’s dot notation, all with proper error handling. The user doesn’t have to fiddle with JSON, Python dictionaries, or anything.
Getting a specific official, district, or election event by its unique ID – in proper ReST fashion – requires placing this numeric ID directly in the root URL, not the query string as another keyword argument – i.e., /official/123, not /official?id=123. This makes sense to someone familiar with ReST – you’re requesting a specific resource, and that should be part of the Uniform Resource Locator – but has easily tripped up beginners in the past who expect ID to be just another query string parameter. python-cicero resolves this by having all queries be composed of keyword arguments passed to any of our wrapper methods, including ID. We check for its presence and construct the URL appropriately without burdening the user.
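A minimal sketch of that check might look like the following. The helper name build_url and the placeholder base URL are hypothetical, and this is not the wrapper’s actual code; it only shows the idea of lifting an “id” keyword argument out of the query string and into the path.

```python
from urllib.parse import urlencode


def build_url(base_url, endpoint, **params):
    """Build a request URL, moving an "id" keyword argument into the path."""
    resource_id = params.pop("id", None)
    url = base_url + "/" + endpoint
    if resource_id is not None:
        # A specific resource is addressed by the URL itself: /official/123
        url += "/" + str(resource_id)
    if params:
        # Everything else stays in the query string (sorted for stable output).
        url += "?" + urlencode(sorted(params.items()))
    return url


BASE_URL = "https://cicero.example.com/v3.1"  # placeholder for this sketch

by_id = build_url(BASE_URL, "official", id=123, token="abc", user=7)
by_search = build_url(BASE_URL, "official", search_loc="123 Main St")
print(by_id)
print(by_search)
```

The caller writes id=123 exactly as they would any other keyword argument, which is the behavior beginners expect.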
Documentation Is Important
A key part of all developer-focused software is having good documentation. You won’t be around to explain how to use it to everyone, so you’d better write that down and write it down clearly. A stalwart in the Python world is the Sphinx system for generating docs. It’s a great tool, but I feel it’s a bit bloated for smaller projects. Also, I don’t like writing in reStructuredText as Sphinx requires and find Markdown to be a bit more intuitive. Furthermore, I personally really appreciate being able to see code alongside my docs, following along in each. Pycco, a Python port of Docco, fits all of those preferences: you write your documentation as Markdown in the comments and docstrings of your source code.
Then, run Pycco against your Python source files with one command:
$ pycco cicero/cicero_response_classes.py
And beautiful, fancy-font, syntax-highlighted HTML documentation pages pop out – code on one side, docs on the other. Easy!
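For a sense of what Pycco consumes, here is a minimal sketch of a source file written in that style. The module and its distance function are made up for illustration and are not part of python-cicero.

```python
# Pycco passes comments like these through **Markdown**, so emphasis,
# `inline code`, and lists all render in the documentation column.

import math


# Return the straight-line distance between two `(x, y)` points.
#
# * `a`, `b`: two-element sequences of numbers
def distance(a, b):
    return math.hypot(b[0] - a[0], b[1] - a[1])
```

Each comment block ends up beside the code that follows it, so the explanation and the function it explains stay in step.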
Try it Out
If you’d like to give Cicero a try, python-cicero is now one of the easiest ways to do it. Either use Python’s “easy_install” utility or the (superior, if you have it) pip to install the wrapper:
$ easy_install python-cicero
$ #OR
$ pip install python-cicero
Take a look at the docs, available at http://azavea.github.io/python-cicero/, to get a sense of the methods available to you, as well as the “cicero_examples.py” file in the package.
And again, keep an eye out for my upcoming Labs post – we’ll dive into the more-complex-than-necessary world of creating Python packages and submitting them to the Python Package Index, as I did with python-cicero, with a full tutorial! It should be ready to go this week.