guy

03 Apr 2016

I chose the tiny image database for this assignment. It’s a set of over 78 million 32×32 images. I was drawn to it because of the sheer number of images I would have to work with, and the fact that I had simply no idea what was in the dataset.

It’s kind of a black box: the dataset is one 227GB binary file that can only be accessed via a Matlab package, so I had to figure out Matlab to get the images. Having done that, I ran t-SNE on the first 5000 images:
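For readers without Matlab, here is a minimal sketch of how one image could be pulled out of a flat binary file like this. It assumes (this is not confirmed by the post) that each image is a fixed 3072-byte record (32 × 32 pixels × 3 channels) stored in Matlab-style column-major order, one channel plane after another. The names `read_record` and `to_rows` are hypothetical helpers, not part of the Matlab package.

```python
IMAGE_SIDE = 32
CHANNELS = 3
# Assumed record size: 32 * 32 * 3 = 3072 bytes per image.
RECORD_BYTES = IMAGE_SIDE * IMAGE_SIDE * CHANNELS


def read_record(path, index):
    """Return the raw bytes of the index-th fixed-size record (0-based)."""
    with open(path, "rb") as f:
        # Seek straight to the record, so the huge file is never loaded whole.
        f.seek(index * RECORD_BYTES)
        data = f.read(RECORD_BYTES)
    if len(data) != RECORD_BYTES:
        raise IndexError("record %d is past the end of the file" % index)
    return data


def to_rows(record):
    """Convert one record into 32 rows of 32 (r, g, b) tuples.

    Assumes column-major layout: within a channel plane, the pixel at
    (row, col) sits at offset row + 32 * col.
    """
    plane = IMAGE_SIDE * IMAGE_SIDE
    rows = []
    for r in range(IMAGE_SIDE):
        row = []
        for c in range(IMAGE_SIDE):
            offset = r + IMAGE_SIDE * c
            row.append(tuple(record[offset + plane * k] for k in range(CHANNELS)))
        rows.append(row)
    return rows
```

Flattening the first 5000 records into 3072-element vectors would give an input suitable for a t-SNE implementation such as scikit-learn's `sklearn.manifold.TSNE`.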

tsne_grid_gdb_2

It seems that the dataset contains almost anything, albeit compressed to 32 by 32 pixels. Faced with so much choice, I was unsure what direction to take the images in. One avenue I pursued was trying to reconstruct the original images using super-resolution, a fairly advanced upscaling method that gives better results than most other approaches.

This is about the best quality I was able to get out of a tiny picture of an atom bomb:

bomb-228

The results weren’t fantastic. One interesting effect was that, when running ofxCCV on these images, the library became more confident in its analysis of pictures that were first super-resolved.

The second thing I tried was to find some way of visualizing the similarities between images. For example, in my t-SNE there’s a large swath of pictures of bombs that all look very similar. I decided to try and sequence the images by similarity, to see what would come up. This is the output (I recommend opening the videos in a new frame since that will automatically resize them for you):

Most of the sequence is nonsense, but there are these nice little runs that I decided to pull out and put in a separate video:
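One simple way to sequence images by similarity is a greedy nearest-neighbour walk: start from one image, then repeatedly jump to the closest image that hasn't been used yet. The sketch below is an illustration under that assumption, not necessarily the method used for the videos. The function names and the sum-of-squared-differences distance are hypothetical choices, and each image is assumed to be a flat list of pixel values.

```python
def distance(a, b):
    """Sum of squared differences between two equal-length pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sequence_by_similarity(images, start=0):
    """Return image indices ordered by a greedy nearest-neighbour walk."""
    if not images:
        return []
    remaining = set(range(len(images)))
    remaining.remove(start)
    order = [start]
    while remaining:
        last = images[order[-1]]
        # Pick the closest unused image; ties are broken by the lower index.
        nearest = min(remaining, key=lambda i: (distance(last, images[i]), i))
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Toy example with 2-value "images": two dark ones and two bright ones.
toy = [[0, 0], [10, 10], [1, 1], [9, 9]]
print(sequence_by_similarity(toy))  # [0, 2, 3, 1]
```

A greedy walk like this uses up each image's best neighbours early and is later forced into arbitrary jumps, which would be consistent with short coherent runs surrounded by noise. It also compares every remaining image at each step, so it only scales to a subset of the dataset.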

All in all I’m not fantastically happy with the outcome of this project, mainly because I lacked direction. I actually came up with a few other ideas for things I could have done later in the timeline, but at that point it was too late to switch gears.

All the code I used for extracting images from the binary, for using the super-resolution package, for doing CCV tests on the super-resolved images, and for image sequencing can be found here.