
The Anti-Day of Digital Humanities

March 18th, 2009 at 9:25 am MDT
Antimatter


If I’m entitling this post the anti-day of digital humanities it’s not because I’m opposed to this community publication project to bring together digital humanists from around the world to document what they do today. Quite the contrary, I think this is a fantastic initiative and I’m excited about it. But my typical professional day during a semester as a digital humanist is filled with teaching activities (class prep, lecturing, tutorials, marking, meeting with students, etc.), administrative duties (more meetings, writing reports, etc.) and – time permitting – maybe a bit of time for research (which usually involves cycling through various projects on which I’m behind and about which I have mixed feelings of guilt and anxiety). But today isn’t typical since I’m on sabbatical this year; in fact it’s almost the opposite of typical: (without wanting to rub it in) I get to spend all day doing research. For me as a digital humanist, this is an anti-day.

My usual routine these days after waking up is to help prepare the kids for school/daycare and then wander over to the espresso machine for my first shot. Once it’s ready, I savour it as I open the lid of my laptop and check in on my email. I usually do a quick skim of email and only answer the more interesting or more urgent stuff (sorry if I didn’t answer you in round one this morning ;-). Then I skim quickly through my RSS feeds, especially the news, my colleagues’ blogs, and a few tech blogs that I read though I’m not always sure why (the Mac ones in particular feel more like an addiction than anything else). Then I often choose some music to listen to before I really get to work. The music might be from my collection, from an internet stream, or espace musique on Radio-Canada (also streamed because radio reception is terrible in our house for some reason). Then I get to work (and I’ll post on that in a couple of hours….).

Day(190) of Digital(217) Humanities(158)

March 18th, 2009 at 12:10 pm MDT

Blogging is such a pervasive activity (among digital humanists and beyond) and such a rich source of information. Wouldn’t it be nice if we had text analysis tools that would allow us to ingest blog content and work with it? We have useful web-based text analysis tools like HyperPo and Taporware that allow users to provide their own texts in real-time, but those tools are ill-suited for dealing with larger corpora and for working with blogging content. For instance, if you wanted to pass a blog feed to HyperPo, you’d have to parse the document and strip away the parts you don’t really want (or deal with a lot of superfluous content). My work project for today is to make some progress on improving how Voyeur handles RSS feeds, and especially the Day of Digital Humanities blogs.

Voyeur is a project that Geoffrey Rockwell and I are collaborating on: the idea is to combine our experiences building web-based tools. I won’t take up space here describing the project in detail (see this draft of a more detailed description), but here are some of the things we’re focusing on:

  • scale: working with much larger collections than is possible with our current tools
  • portability and citeability: allowing tools to be embedded remotely (like YouTube clips) and making it easier to reference tools and results in scholarly work
  • flexibility: allowing users to provide a wide range of texts and working with them appropriately (making the most of what’s there, from plain text to deeply encoded XML)

Voyeur is currently in what I’d call pre-Alpha – it works, but it’s fragile. In the spirit of extreme programming/analysis, we’ve been focusing only on what we need next, and we’re putting off thorough testing until later. For today’s work, here’s what was needed:

  • an adapter for ingesting RSS
    • ability to treat a feed as one document or count each post as a document
    • be flexible and forgiving in parsing HTML included in the feed
    • identify essential metadata (author, title, time of posts)
  • update tool interfaces to use metadata (for display, grouping, sorting)
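To make the adapter requirements above a bit more concrete, here is a minimal sketch in Python (not Voyeur’s actual code). It assumes an RSS 2.0 feed in which authors appear in `dc:creator` and post bodies in `content:encoded` or `description`; the function names, the dictionary keys and the tiny sample feed are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespaced element names commonly seen in blog feeds (assumed here).
DC_CREATOR = "{http://purl.org/dc/elements/1.1/}creator"
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


class _TextExtractor(HTMLParser):
    """Collects text nodes; html.parser tolerates unclosed or stray tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(fragment):
    """Return the text of an HTML fragment with whitespace normalized."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered after the last tag
    # Joining with a space keeps words apart across tags such as <br>.
    return " ".join(" ".join(parser.chunks).split())


def read_feed(rss_xml, split_documents=True):
    """Return a list of documents (dicts with author, title, time, text)."""
    channel = ET.fromstring(rss_xml).find("channel")
    posts = []
    for item in channel.findall("item"):
        body = item.findtext(CONTENT_ENCODED) or item.findtext("description") or ""
        posts.append({
            "author": item.findtext(DC_CREATOR) or item.findtext("author") or "",
            "title": item.findtext("title") or "",
            "time": item.findtext("pubDate") or "",
            "text": strip_html(body),
        })
    if split_documents:
        return posts  # each post counts as its own document
    # Otherwise treat the whole feed as a single document.
    return [{
        "author": "",
        "title": channel.findtext("title") or "",
        "time": "",
        "text": " ".join(post["text"] for post in posts),
    }]


# A made-up two-post feed with deliberately sloppy HTML in the descriptions.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel><title>Sample feed</title>
<item><title>First</title><dc:creator>A. Author</dc:creator>
<pubDate>Wed, 18 Mar 2009 09:25:00 -0600</pubDate>
<description>&lt;p&gt;Hello &lt;b&gt;world&lt;/p&gt;</description></item>
<item><title>Second</title><dc:creator>A. Author</dc:creator>
<pubDate>Wed, 18 Mar 2009 12:10:00 -0600</pubDate>
<description>&lt;p&gt;More text&lt;br&gt;here</description></item>
</channel></rss>"""
```

Calling `read_feed(SAMPLE)` yields one record per post, while `read_feed(SAMPLE, split_documents=False)` merges everything into a single record. Joining text chunks with a space is a simplification: it keeps words apart across block tags, at the cost of splitting a word that has inline markup in the middle of it.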

Below is a screenshot from the Corpus Types tool in Voyeur – running on my computer – with the aggregated RSS feed as input. This afternoon I hope to provide a link to a live version.

[Screenshot: day-digital-humanities]

Lunch with a Librarian

March 18th, 2009 at 3:01 pm MDT

Ok, so it’s not all work and no play today: I had a leisurely lunch with Jeff Trzeciak, the Chief Librarian at McMaster. Jeff has an excellent reputation for effective innovation and, thanks in large part to his stewardship, our libraries have won awards, like being named the top academic library in North America for 2008 by the ACRL. Like many of his colleagues, Jeff is endeavouring to respond to new realities for libraries, including plummeting circulation numbers and the library’s new role as essentially a middle player between purchasing consortia and users (some librarians may be less involved now in purchasing and subscription decisions since much more of that happens at a consortial level). So what’s a library to do? Here are some of the things we discussed:

  • many campuses have classrooms on the one hand and social areas (like the student centre) on the other, but not a whole lot to offer in between: libraries are well positioned to provide social study space (not just quiet space) where students can interact and learn (and preferably sip good coffee)
  • the libraries can provide space for research collaborations, especially for humanists who are less likely to have their own research lab space; these labs can share resources and share technical expertise; moreover, by centralizing some research projects it may be easier to showcase humanities research to students (who are sipping good coffee as they admire and interact with our work)
  • many of the digital humanists I know accumulate projects the way others accumulate shoes or computer cables: new ones keep coming in but old ones are never discarded; as such, we are struggling to innovate with new projects while doing our best to sustain older projects – the libraries can become partners in creating and preserving digital projects (not just the data but the delivery software and architecture as well); of course, the libraries aren’t currently funded for that

At many institutions digital humanities and the libraries form a relatively coherent unit. That’s not the case (yet?) at McMaster, but regardless, I’m convinced that our fortunes as digital humanists are closely bound to the transformations happening in the libraries: success for both involves advocating for the value of creating, enhancing, providing and preserving digital resources within a humanistic perspective.

Voyeur: Seeing through Digital Humanities Blogging

March 18th, 2009 at 11:24 pm MDT

Surprise, surprise, the afternoon and evening of work didn’t go as smoothly as anticipated; in other words this was an entirely typical day. The plan was to provide a not-overly-buggy version of Voyeur that would play nicely with the RSS feeds from the DayOfDH. I did some coding to that end after lunch, but then opted to spend the second half of the afternoon playing with my daughters in our yard (sunny and 18 Celsius, thanks for asking). I thought I could make a few last changes after getting the girls to bed (I almost always work after we put them to bed), but instead spent most of the time wrestling with Tomcat (and if you’ve ever wrestled with web applications in Tomcat, you know you never really win, you just find the best way to surrender). At the moment HyperPo and Voyeur are trying to share the same container, but that’s obviously going to have to change. In the meantime, HyperPo will have to rest a bit (my apologies to the countless masses who depend on HyperPo) – I’ll sleep better tonight knowing that Voyeur is sort of working as I’d hoped.

Voyeur is actually designed to allow web authors (such as bloggers) to embed results in their pages, much like YouTube videos. Unfortunately this installation of WordPress isn’t configured to permit the tags needed to make embedding work smoothly, but you can see an example on my blog. I built two corpora (these are links to a live version of Voyeur):

  1. each author has a separate document (93 documents)
  2. each of the posts is a separate document (about 450 documents as of this writing)

Both formats can be useful, though Voyeur isn’t yet well suited to display hundreds of documents (the second corpus).
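The difference between the two corpora comes down to how posts are grouped. As a rough sketch only (this is not how Voyeur does it), the by-author corpus could be derived from the by-post one as follows. The sketch assumes each post is a dictionary with hypothetical `author`, `time` and `text` keys, and that `time` is an ISO 8601 string so that plain string sorting gives chronological order.

```python
from collections import defaultdict


def group_by_author(posts):
    """Merge per-post records into one document per author, oldest post first."""
    by_author = defaultdict(list)
    for post in posts:
        by_author[post["author"]].append(post)

    documents = []
    for author in sorted(by_author):
        # Assumes ISO 8601 timestamps, which sort correctly as strings.
        ordered = sorted(by_author[author], key=lambda post: post["time"])
        documents.append({
            "author": author,
            "posts": len(ordered),
            "text": "\n\n".join(post["text"] for post in ordered),
        })
    return documents
```

The by-post corpus is then simply the original list of posts, and the by-author corpus is `group_by_author(posts)`.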

[Screenshot: full-dayofdh-byauthor1]

As I mentioned earlier, Voyeur is pre-Alpha – it may work for you, it may not. If I’m lucky the server won’t have crashed by tomorrow. There are a ton of things to do on Voyeur, but please do leave a comment if you have any suggestions. In the meantime, if you want to analyze your own DayOfDH posts, adapt this URL: http://voyeur.hermeneuti.ca/?inputFormat=RSS2&splitDocuments=true&input=http://ra.tapor.ualberta.ca/~dayofdh/StefanSinclair/feed/ (replace StefanSinclair with your own username).
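If you would rather generate that link than edit it by hand, here is a small Python sketch. The parameter names and values (`inputFormat`, `splitDocuments`, `input`) are copied from the URL above; the function name and constants are hypothetical, and the `safe` argument is there only so that the result matches the unencoded form of the URL as written.

```python
from urllib.parse import urlencode

VOYEUR_BASE = "http://voyeur.hermeneuti.ca/"
FEED_TEMPLATE = "http://ra.tapor.ualberta.ca/~dayofdh/{username}/feed/"


def voyeur_url(username):
    """Build a Voyeur link for one DayOfDH blogger's RSS feed."""
    params = {
        "inputFormat": "RSS2",
        "splitDocuments": "true",
        "input": FEED_TEMPLATE.format(username=username),
    }
    # Leave ':', '/' and '~' unescaped so the link looks like the one above.
    return VOYEUR_BASE + "?" + urlencode(params, safe=":/~")
```

For example, `voyeur_url("StefanSinclair")` reproduces the URL given above.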