Shan Huang

11 Feb 2014

Demo available here

Wikipainting is like a wikipedia of paintings. It is a really lovely online resource because it provides tons of high resolution images – and everyone loves high res images! I really enjoy browsing the site in my spare time and running into unexpected works for artists I’ve never heard of. I also secretly really want to scrape all the beautiful paintings on the site and save them somewhere on my disk. So for my data scraping project, I scraped all self-portraits found on wikipaintings.org and visualized them based on a timeline:

screen1

Technology wise I used the beautiful soup library which turned out to be really handy for interpreting HTML pages – as many have recommended. I did not run into much trouble to make the scraper work. I scraped all the image urls found in Wikipainting‘s self-portrait genre, along with some other information like artist names, styles of the paintings and urls to the work detail page. But I overlooked a detail – the page displays a scroll bar for artists with more than five self portraits but I god damn forgot to parse it. As a result I missed a good amount of the data and only got 400+ entries in the end. I plan to fix it and rescrape the site in the near future.

Anyways, the process went pretty smoothly overall. I made a grid layout in HTML/JS and fed data to it using the D3 library. I haven’t used D3 before but picking it up wasn’t that challenging. After some tutorial reading, fiddling around with D3 sample code and debugging, I have my visualizer working: http://shan-huang.com/600SelfPortraits/index.html. (Layout design shamelessly stolen from REI1440 project website. )

I think the result is pretty interesting. It’s super fun to dive in and see how (crazy/talented) human being’s self perception evolves over time. Here are screen shots of some amazing self-portraits I discovered through my visualizer:

7^ Self-portrait by Alexander Shilov. Though finished in 1997 it weirdly has the Renaissance look and feel as if it were a portrait of someone from the 16th century…

6 ^ I really, really like this piece for many reasons, color being the most apparent one.

5

^ Another nice black and white piece.

4

^ Another gorgeous piece from the old masters.

Github: https://github.com/yemount/proj2-portraitscraper