A new way for data journalists to thwart newsroom IT: the Raspberry Pi

One of my old jokes is that newsroom IT puts the No in Innovation, so I’m always on the lookout for ways to get around them. And I’ve been playing around with a good one: The Raspberry Pi.

Unfamiliar with the Pi? The Model B Pi is a $35 computer that’s about the size of a deck of cards. It has an Ethernet port, and you supply everything else: an SD card that stands in for the hard drive, plus a keyboard, mouse and monitor. Now, for $35, you’re not getting a ton of horsepower, but for simple, repetitive tasks it works great.

What kind of simple, repetitive tasks? Let’s pretend for a second that you wanted to set up a scraper that dumped data into a database every hour. Ideally, you’d have a server somewhere and you’d set up a scheduled task on it (I like using *nix’s cron for things like this) and off it would go, mindlessly gathering data for you and putting it into a database. You could then go about your life, stopping by from time to time to get that data and do whatever you’re going to do with it. So you ask newsroom IT for this and, of course, the answer is no. And no, we won’t give you the money to run this in the cloud for a few bucks a month either.

Enter the Pi.

For $35, you can write your scripts, put them in a cron job and off it’ll go, gathering your data for you. No need for a server, no need for a server administrator, no need to make sure your work computer stays on and running the whole time. All it takes is some elbow grease to get the script running and an Ethernet connection to the internet.
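To make that concrete, here is a minimal sketch of what such a script might look like. It is not the script I’m running. It assumes the source you’re scraping returns a JSON list of records, and it stores them in a SQLite file, which needs no database server. The file name scraper.py, the table name records and the function names are all made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of an hourly scraper: fetch a JSON list and append it to SQLite.

Assumptions (not from any real setup): the feed returns a JSON array,
and the URL and database path are passed on the command line.
"""
import json
import sqlite3
import sys
import urllib.request
from datetime import datetime, timezone


def open_db(path):
    """Open the SQLite file and create the (hypothetical) records table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "fetched_at TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def fetch_records(url):
    """Download the feed and parse it as JSON."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def store_records(conn, records):
    """Insert each record as a JSON string, stamped with the fetch time (UTC)."""
    fetched_at = datetime.now(timezone.utc).isoformat()
    rows = [(fetched_at, json.dumps(r, sort_keys=True)) for r in records]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO records (fetched_at, payload) VALUES (?, ?)", rows
        )
    return len(rows)


def main(url, db_path):
    conn = open_db(db_path)
    try:
        count = store_records(conn, fetch_records(url))
        print(f"stored {count} records")
    finally:
        conn.close()


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
    else:
        print("usage: scraper.py FEED_URL DB_PATH")
```

To run it at the top of every hour, a crontab entry (added with crontab -e) along the lines of `0 * * * * /usr/bin/python3 /home/pi/scraper.py FEED_URL /home/pi/scraper.db >> /home/pi/scraper.log 2>&1` should do it. Swap FEED_URL for your source and adjust the paths, which are guesses at a typical Pi setup. Use absolute paths, since cron doesn’t start in the directory you’d expect.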

I’ve had my Pi running a repetitive task for two weeks now and it’s plugging along without issue, having gathered 50,000 records without me having to do anything. In a month, I’ll have a dataset worth analyzing, and it will only ever cost me $35. And I can use it for other things as well. 

A cheap scraper bot. Useful!