I chose to collect the 781 images that result from searching “beautiful” on Google Images. I am interested in the way that these images consistently violate my personal sense of beauty, despite being among the most powerful (visible) beautiful images on the web. My first attempt used the Google Image Search API, but I soon discovered that this API limited me to 64 images, when simple inspection of the results page showed many more images. I then tried downloading the .html of a given results page, and scraping that file for image urls. What I didn’t realize at the time was that a given .html file only shows 100 image urls at a time. David Newbury explained to me that the image urls are dynamically loaded as the user scrolls, so it isn’t possible to download them in one go. Together, we inspected the network activity of the results page, and found requests of the form:
From there, I modified the values in ‘n=1’ and ‘start=100’ eight times to download 781 total image urls. I used the ‘cat’ unix command to combine these 8 .txt files into one file, and then used Python to scrape this file for the urls.
Here is a sketch that identifies categories of normative beauty by grouping related images.
EDIT: What I’d like to do with these images is display them in a grid, and use the Eyetribe eye tracker to log the time spent looking at the images. Then I will change their scale to match gazing time. This way a person can imprint their subjectivity onto the collection of images. (I’m writing this here so I remember to do it.)