Tuesday, March 16, 2010

Adventures in grading student blogs using gdata python

This post is about how Google's gdata python API helped convince me to use blogger for my student class blog. The screenshot, simple as it is, shows a blog evaluation framework available on no other platform (see the page updated daily here). I created it in about a week, starting from scratch, with only intermediate python development skills. Read on for a narrative about my high-level strategy for getting it all to work and where I think I'm going with it.

Introduction

I've been incorporating blogging in my courses since 2004. A typical blogging pattern is for students to generate three to five posts per week, with their blogging activity accounting for 30% of their grade. There have been three persistent issues with student blogging:

  • Hosting support
  • Student tracking
  • Analyzing student posts

Motivation for starting with the blogger API

I've used all of the major platforms with a preference toward sixapart's offerings, mainly because I knew them, and movabletype seemed to offer significant customization features. However, about a year ago, I decided I simply had to stop hosting the blogs on my own server. If nothing else, combatting spam and bad guys was getting beyond me. Since then, I've been itinerant across a couple of hosting services.

In my classes, we currently use a group blog (in the end, it's just easier) with between 50 and 70 participants. There is no hosted blogging service that offers a convenient way to track the activities of that many group blog participants (blogger is particularly egregious in this regard but also the easiest for adding participants). Plus, my needs were unique. I wanted to know how many posts students had written and whether these posts met certain criteria.

I had long heard about the blogger api and was becoming increasingly impressed with the gdata python api in general through my brushes with it in various youtube projects. So, last December, as the holiday break was nearing, I decided to find out whether the gdata python api would allow me to overcome the limitations of the blogger platform and perhaps go beyond anything I could do on any platform.

Initial Results

My timeframe was one week while my wife and kids were away visiting her father. I downloaded the python gdata V2 archive and started in. I would describe myself as a novice/intermediate python programmer.

Within the week, I had basically achieved what you see here:

http://biggerbuybutton.com/university/results.html

that I'm using to track this blog:

http://winter2010.biggerbuybutton.com/

I'll admit the current approach is rather basic, but it fulfills the role of tracking where students are and indicating posts they may want to work to improve. As shown in this screenshot providing the detail for my own posts, edit links are provided in cases where posts are not up to snuff, and the posts that need work are clearly highlighted with little messages as to what's missing.
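
For readers wondering what "not up to snuff" could look like in code, here is a minimal sketch. It is not the code behind the results page. The Post class, the three criteria (a minimum word count, at least one label, at least one link), and the threshold are all hypothetical, and the posts are assumed to have already been fetched from the blog feed and reduced to plain Python objects.

```python
from collections import defaultdict
from dataclasses import dataclass, field
import re


@dataclass
class Post:
    """Hypothetical stand-in for a post already pulled from the blog feed."""
    author: str
    title: str
    body: str
    labels: list = field(default_factory=list)


MIN_WORDS = 5  # hypothetical threshold; a real course would set this higher


def check_post(post, min_words=MIN_WORDS):
    """Return short messages describing what the post is missing."""
    problems = []
    if len(post.body.split()) < min_words:
        problems.append(f"fewer than {min_words} words")
    if not post.labels:
        problems.append("no labels")
    if not re.search(r"https?://", post.body):
        problems.append("no link")
    return problems


def summarize(posts):
    """Per author: total post count plus the posts that need work."""
    report = defaultdict(lambda: {"count": 0, "needs_work": []})
    for post in posts:
        entry = report[post.author]
        entry["count"] += 1
        problems = check_post(post)
        if problems:
            entry["needs_work"].append((post.title, problems))
    return dict(report)


if __name__ == "__main__":
    posts = [
        Post("ann", "A", "See http://example.com for the full story here", ["news"]),
        Post("ann", "B", "too short"),
    ]
    for author, entry in summarize(posts).items():
        print(author, entry["count"], entry["needs_work"])
```

Keeping the checks in one small function means each new criterion is one more if statement and one more message on the results page.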

Future Plans

What I find most intellectually engaging, and what I spent spring break working on, is finding useful ways to summarize the content of student posts, both individually and across the group:

  • Using semantic analysis services from the likes of opencalais, zemanta, and evri (to give just a sample), what are an individual student's most common themes?
  • What are the themes shared across students?
  • What is the evolution of themes over time?
  • Who is connecting the most with their peers, and over what topics? (Post content is supposed to be 80% class related, and true personal posting is vanishingly infrequent.)
  • Who is discovering the most new external resources?
  • What are the biggest search keywords and how do they relate to themes raised by the group?
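
The first three questions above come down to counting once each post carries a list of themes. The sketch below assumes that tagging has already happened (it makes no calls to opencalais, zemanta, or evri) and that each post has been reduced to an (author, week, themes) tuple. The tuple layout, the sample data, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample: (author, ISO week label, themes assigned to one post)
SAMPLE = [
    ("ann", "2010-W10", ["pricing", "search"]),
    ("ann", "2010-W11", ["pricing"]),
    ("bob", "2010-W10", ["search", "video"]),
    ("bob", "2010-W11", ["pricing", "video"]),
]


def themes_by_student(tagged_posts):
    """Count how often each student touches each theme."""
    counts = defaultdict(Counter)
    for author, _week, themes in tagged_posts:
        counts[author].update(themes)
    return counts


def shared_themes(tagged_posts, min_students=2):
    """Themes that at least min_students different students have written about."""
    students = defaultdict(set)
    for author, _week, themes in tagged_posts:
        for theme in themes:
            students[theme].add(author)
    return sorted(t for t, who in students.items() if len(who) >= min_students)


def themes_over_time(tagged_posts):
    """Count themes per week to see how they rise and fall."""
    counts = defaultdict(Counter)
    for _author, week, themes in tagged_posts:
        counts[week].update(themes)
    return counts


if __name__ == "__main__":
    print(themes_by_student(SAMPLE)["ann"].most_common(1))
    print(shared_themes(SAMPLE))
    print(dict(themes_over_time(SAMPLE)["2010-W11"]))
```

The harder part, deciding what counts as a theme and reconciling the different vocabularies of the tagging services, is not addressed here.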

From an operational perspective:

  • How can I share this code?
  • When should I share it?
  • What is the best approach for dealing with unreliable third-party services, and how should I bulletproof against them?
  • I need to think about adding unit tests as this grows beyond a quick hacking experiment.
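
One common answer to the unreliable-services question is to retry with an increasing delay and fall back to a default when the service stays down. The sketch below is a generic pattern, not something taken from my code or from the gdata library. The function name, its parameters, and the flaky example service are hypothetical.

```python
import time


def call_with_retries(func, *args, attempts=3, base_delay=1.0,
                      fallback=None, sleep=time.sleep):
    """Call func(*args), retrying on any exception with exponential backoff.

    Returns the fallback value if every attempt fails. The sleep function is
    injectable so the behavior can be unit tested without real waiting.
    """
    for attempt in range(attempts):
        try:
            return func(*args)
        except Exception:  # deliberately broad; narrow this for a real service
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return fallback


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_service(text):
        """Hypothetical service that fails twice before answering."""
        calls["n"] += 1
        if calls["n"] < 3:
            raise IOError("service unavailable")
        return ["theme-for-" + text]

    # A tiny delay keeps the demo quick.
    print(call_with_retries(flaky_service, "post", base_delay=0.01, fallback=[]))
```

Returning a fallback instead of raising lets a daily report still render when one analysis service is down, at the cost of silently missing data, so a real version would probably log each failure as well.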

This has been long. Thanks for reading this far if you actually made it. I'm interested to hear your thoughts in comments.