Looking Outwards – Dataset of GitHub projects’ activities (and swear words by programming language)

by jsinclai @ 9:12 pm 19 January 2010


This guy has a really cool data set of over 5000 “active” Git repositories. He pretty much covered the first three of Ben Fry’s steps for making an info-vis: acquire, parse, and filter. He produced some basic statistics on the data, but there’s probably a lot more interesting information hiding in there!

My favorite is the “Number of swear words per 1000 commits by language.” I remember an old JavaScript/PHP web app I made where I didn’t have a dev environment set up, so I had to commit to the actual server to see results. Every time I had to debug something, I’d end up with 50 or so commits just on that issue… and many of the comments were filled with cursing 🙂
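For the curious, here’s a rough sketch of how a number like “swear words per 1000 commits by language” could be computed. This is just my guess at the arithmetic, not the method the data set’s author used: the word list, the sample commits, and the function name `swears_per_1000_commits` are all made up for illustration.

```python
import re
from collections import defaultdict

# Placeholder word list (an assumption; the data set's real list isn't given here).
SWEAR_WORDS = {"damn", "crap", "hell"}


def swears_per_1000_commits(commits, swear_words=SWEAR_WORDS):
    """Return {language: swear words per 1000 commits}.

    `commits` is an iterable of (language, commit_message) pairs.
    """
    commit_counts = defaultdict(int)
    swear_counts = defaultdict(int)
    for language, message in commits:
        commit_counts[language] += 1
        # Split the message into lowercase words and count the ones on the list.
        words = re.findall(r"[a-z']+", message.lower())
        swear_counts[language] += sum(1 for word in words if word in swear_words)
    # Normalize to a rate per 1000 commits so languages of different sizes compare.
    return {
        language: 1000 * swear_counts[language] / commit_counts[language]
        for language in commit_counts
    }


# Made-up sample data, only to show the shape of the input.
sample_commits = [
    ("JavaScript", "fix the damn login form"),
    ("JavaScript", "add tests"),
    ("PHP", "crap, typo again"),
    ("PHP", "update readme"),
    ("PHP", "bump version"),
    ("PHP", "refactor session handling"),
]

rates = swears_per_1000_commits(sample_commits)
for language, rate in sorted(rates.items()):
    print(f"{language}: {rate:.0f} swear words per 1000 commits")
```

Normalizing per 1000 commits is what makes a language with thousands of repositories comparable to one with only a handful.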

1 Comment

  1. Nice!
    Interesting to see Java doing pretty well there. Quite surprising!

    Comment by Karl DD — 19 January 2010 @ 9:23 pm

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
(c) 2016 Special Topics in Interactive Art & Computational Design