DMGordon – FINAL!

I expanded upon my event project, which used a generative neural network to ‘rot’ and ‘unrot’ images.

diagram of information flow through a standard neural network

Summary of Neural Networks
Neural networks are artificial collections of equations which take some form of data as input and produce some form of data as output. In the above image, each of the leftmost circles represents an input node, which accepts a single decimal number. These inputs can represent anything, from statistical information to pixel colors. Each input node then passes its value on to every node in the next layer. The nodes in that layer combine every number they receive from the input nodes into a new number, which is passed along to the NEXT layer. This continues until we reach the output layer, where the numbers contained in the output nodes represent the ‘final answer’ of the network. We then compare this final answer with some intended output, and use a magical method known as backpropagation to adjust every node in the network to produce an output closer to the intended one. If we repeat this process several million times, we can ‘train’ the neural network to transform data in all sorts of astonishing ways.
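For the curious, here is a toy sketch of that loop in Python: a made-up two-layer network trained on random numbers. It is nothing like the real pix2pix model, just the forward-pass-then-backpropagate cycle described above, with every size and value invented for illustration.

```python
import numpy as np

# Toy two-layer network: 4 inputs -> 8 hidden nodes -> 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)) * 0.1, np.zeros(2)

X = rng.normal(size=(64, 4))   # fake "input nodes"
Y = rng.normal(size=(64, 2))   # fake "intended output"

lr = 0.01
for step in range(1000):
    # Forward pass: each layer combines all numbers from the previous layer.
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2

    # Compare the 'final answer' with the intended output.
    err = out - Y

    # Backpropagation: push the error backwards to adjust every weight.
    d_out = 2 * err / len(X)
    dW2, db2 = h.T @ d_out, d_out.sum(0)
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    dW1, db1 = X.T @ d_h, d_h.sum(0)

    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= lr * grad   # nudge every node toward a better answer
```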

pix2pix
pix2pix is a deep (meaning many hidden layers) neural network which takes images as input and produces modified images as output. While I can barely grasp the conceptual framework of deep learning, this GitHub repository implements the entire network, such that one can feed it a set of image pairs and it will learn to transform one element of each pair into the other. The repository gives examples such as turning pictures of landscapes during the day into those same landscapes at night, or turning black and white images into full color images.
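For a rough sense of the shape of such a network (and only the shape: the real pix2pix generator is a U-Net with skip connections, trained against a PatchGAN discriminator using an L1 plus adversarial loss), here is a heavily simplified, hypothetical encoder-decoder sketched in Keras:

```python
import tensorflow as tf
from tensorflow.keras import layers

# 256x256 image in, 256x256 image out. None of pix2pix's actual details
# (skip connections, discriminator, losses) are reproduced here.
def tiny_generator():
    inp = layers.Input(shape=(256, 256, 3))
    x = inp
    # Encoder: repeatedly halve the resolution while adding channels.
    for filters in (64, 128, 256):
        x = layers.Conv2D(filters, 4, strides=2, padding="same", activation="relu")(x)
    # Decoder: upsample back to the original resolution.
    for filters in (128, 64, 32):
        x = layers.Conv2DTranspose(filters, 4, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2D(3, 1, activation="tanh")(x)  # pixel values in [-1, 1]
    return tf.keras.Model(inp, out)

model = tiny_generator()
model.summary()
```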

I decided to see if the pix2pix neural network could grasp the idea of decay, by training it on image pairs of lifeforms in various stages of putrefaction.

My dataset
I originally wanted to do my own time lapse photography of fruits and vegetables rotting, but quickly realized that I had neither the time to wait for decay to occur, nor a location where I could safely let hundreds of pieces of produce decay. Instead, I opted to get my data from YouTube, where people have been uploading decay time lapses for years. I took most of my image pairs from Temponaut Timelapse and Timelapse of decay, both of whom do a good job of controlling lighting and background to minimize noise in the data. By taking screenshots at the beginning and end of their videos, I produced a dataset of 3850 image pairs.
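To give an idea of the preprocessing, here is a rough sketch of how screenshot pairs can be stitched into side-by-side training images. The folder names are hypothetical, and the side-by-side layout reflects my understanding of the paired-image format the pix2pix port expects:

```python
import os
from PIL import Image

# Hypothetical layout: fresh/0001.png pairs with rotten/0001.png.
# Each pair is resized to 256x256 and stitched into one 512x256 image.
FRESH, ROTTEN, OUT = "fresh", "rotten", "pairs"
os.makedirs(OUT, exist_ok=True)

for name in sorted(os.listdir(FRESH)):
    a = Image.open(os.path.join(FRESH, name)).convert("RGB").resize((256, 256))
    b = Image.open(os.path.join(ROTTEN, name)).convert("RGB").resize((256, 256))
    pair = Image.new("RGB", (512, 256))
    pair.paste(a, (0, 0))      # left half: fresh
    pair.paste(b, (256, 0))    # right half: rotten
    pair.save(os.path.join(OUT, name))
```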

rotting watermelon
rotting bowl of strawberries
rotting pig head
rotting pig head close up

I trained two neural networks: one to ‘rot’, and the other to ‘unrot’. After training each network for 18 hours, they were surprisingly effective at their respective grotesque transformations. Here is an example of the unrotter puffing me up in some weird ways.
Normal:

Unrotted:

However, the pix2pix network can only operate on images with a maximum size of 256 x 256 pixels, far too small to be any real fun. To fix this, I tried both downsampling and splitting every image into mosaics of subimages, which could be passed through the network, then put back together, resized, and layered on top of each other to produce larger images:

Paul ‘Pauly D’ DelVecchio (photo by Frazer Harrison/Getty Images)

rotten

unrotten
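For reference, here is a rough Python sketch of the split-and-reassemble step. My actual pipeline is Java plus batch files, and run_through_network below is only a stand-in for the trained rotter/unrotter:

```python
from PIL import Image

TILE = 256  # pix2pix's maximum working size

def split_into_tiles(img):
    """Chop an image into 256x256 tiles (the image is assumed to be a
    multiple of 256 on each side for simplicity)."""
    tiles = []
    for y in range(0, img.height, TILE):
        for x in range(0, img.width, TILE):
            tiles.append(((x, y), img.crop((x, y, x + TILE, y + TILE))))
    return tiles

def reassemble(tiles, size):
    """Paste the (network-processed) tiles back into one image."""
    canvas = Image.new("RGB", size)
    for (x, y), tile in tiles:
        canvas.paste(tile, (x, y))
    return canvas

def run_through_network(tile):
    """Stand-in for the trained rot/unrot network (hypothetical)."""
    return tile

img = Image.open("input.png").convert("RGB")
tiles = [(pos, run_through_network(t)) for pos, t in split_into_tiles(img)]
reassemble(tiles, img.size).save("mosaic.png")
```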

However, the jarring borders between images had to go. To remedy this, I created 4 separate mosaics, each offset from the others such that every image border can be covered by a continuous section from a different mosaic:




We then combine these 4 mosaics and use procedural fading between them to create a continuous image:
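In code, the blend looks roughly like this sketch (again a Python stand-in for my Java pipeline, assuming each mosaic is weighted by its pixels' distance from that mosaic's own tile borders):

```python
import numpy as np
from PIL import Image

TILE = 256
OFFSETS = [(0, 0), (128, 0), (0, 128), (128, 128)]  # half-tile shifts

def border_weight(length, offset):
    """1D weight that is high mid-tile and falls toward this mosaic's
    tile borders, so an offset mosaic can take over there."""
    coords = (np.arange(length) + offset) % TILE
    return np.minimum(coords, TILE - coords) + 1.0  # +1 avoids all-zero weights

def blend(mosaics, size):
    """mosaics: list of (offset, PIL image) pairs, all the same size."""
    w, h = size
    acc = np.zeros((h, w, 3), dtype=np.float64)
    total = np.zeros((h, w, 1), dtype=np.float64)
    for (ox, oy), img in mosaics:
        weight = np.outer(border_weight(h, oy), border_weight(w, ox))[..., None]
        acc += np.asarray(img, dtype=np.float64) * weight
        total += weight
    return Image.fromarray(np.uint8(acc / total))

# Hypothetical usage: rot_mosaic(img, offset) would tile the image on a grid
# shifted by `offset`, push each tile through the network, and reassemble.
# mosaics = [(off, rot_mosaic(img, off)) for off in OFFSETS]
# blended = blend(mosaics, img.size)
```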

Doing this at multiple resolutions creates multiple continuous images…




…which we can then composite into a single image that contains information from all resolution levels:
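A minimal sketch of that compositing step, assuming simple per-pixel averaging of the upscaled results (process_at_resolution is a hypothetical stand-in for the whole mosaic-and-blend pipeline at one working size):

```python
import numpy as np
from PIL import Image

def composite_resolutions(img, process_at_resolution, scales=(1024, 512, 256)):
    """Run the pipeline at several working resolutions, upscale every result
    back to the original size, and average them so fine detail and
    large-scale structure both survive."""
    results = []
    for s in scales:
        small = img.resize((s, s))
        result = process_at_resolution(small).resize(img.size)
        results.append(np.asarray(result, dtype=np.float64))
    return Image.fromarray(np.uint8(sum(results) / len(results)))
```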

Using this workflow, we can create some surreal results using high resolution inputs:

tyson beckford

rotten

unrotten

—-
from Prada Pre-Fall 2015 catalog



—-



—-

Golan Levin provided me with Ikea’s furniture catalog, which produced interesting, albeit lower-res, results:






Next Steps
The entire process of downsampling, splitting, transforming, and recompositing the images is automated using Java and Windows batch files. I plan to create a Twitter bot which will automatically rot and unrot images in posts that tag the bot’s handle. This would be both an interesting way to see what other people think to give the network and a great way to get publicity.
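One possible sketch of such a bot using tweepy (the keys, file paths, and rot_image call are placeholders, and the exact Twitter/tweepy details vary by API version):

```python
import tweepy

# Placeholder credentials; the real bot would read these from a config file.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

def rot_image(path):
    """Placeholder for the downsample/split/transform/recomposite pipeline."""
    return path

def reply_to_mentions(api, last_seen_id=None):
    for mention in api.mentions_timeline(since_id=last_seen_id):
        if "media" not in mention.entities:
            continue  # only answer tweets that actually attach a photo
        # Downloading the attached photo is omitted here; assume it has been
        # saved to incoming.jpg before this point.
        out_path = rot_image("incoming.jpg")
        uploaded = api.media_upload(out_path)
        api.update_status(
            status="@" + mention.user.screen_name + " rotted",
            in_reply_to_status_id=mention.id,
            media_ids=[uploaded.media_id],
        )
```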

The training dataset, while effective, is actually pretty noisy and erratic. Some of the training images have watermarks, YouTube pop-up links, and the occasional squirrel, which confuse the training algorithm and lead to less cohesive results. I would love to use this project as a springboard for a grant in which I set up my own time lapse photography of rotting plants and animals using high definition cameras, many different lighting conditions, and more pristine, controlled environments. I think these results could be SIGNIFICANTLY improved upon to create a ‘cleaner’ and more compelling network.

Special thanks to:
Phillip Isola et al. for their research into Image-to-Image translation
Christopher Hesse for his open source tensorflow implementation
Golan Levin for providing Ikea images and suggesting a Twitter bot
Ben Snell for suggesting multi-resolution compositing
Aman Tiwari for general deep learning advice and helping me through tensorflow code

DMGordon – Event

An artificial neural network is a collection of interconnected equations which takes data as input and transforms it, layer by layer, into some output; by comparing that output with an intended result and adjusting the network accordingly, we can then train it to perform the transformation we want.

Now that we’re clear on what a neural network is, we can look at what neural networks can do. The field of deep learning is integral to emerging technologies such as autonomous vehicles, computer vision, statistical analysis and artificial intelligence.
Recently, a research paper was published which details using neural networks to manipulate images. The basic process is as follows: the network is given an image as input, which it then tries to change to match a second image.

I trained a neural network on images of fresh fruit matched with images of rotten fruit. The network is thus trained to rot or unrot any image it is given. I then fed the network images that are not fruit. The results were of mixed effectiveness:

Net #2:

indreams from David Gordon on Vimeo.

Net #3:

training images:


which is ‘unrotted’ to:

sometimes it makes mistakes:

which is ‘rotted’ to:

actual rotten cucumber:

mistakenly ‘unrotted’ into a strawberry:

but it really knew how to rot watermelons:

‘unrotting’ MJ’s face

Fresh MJ from David Gordon on Vimeo.

DMGordon – Event Proposal

For my event, I plan to train a neural net to ‘undecay’ images. I will use a Generative Adversarial Network. The dataset is made up of image pairs taken from YouTube time lapse videos of rotting food. I will train a discriminator to recognize fresh food. The generator will be fed images of rotten food, and its output will be judged by the fresh-food-recognizing discriminator. After sufficient training, we can feed any image into the generator for an ‘undecayed’ output.
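For concreteness, here is a hedged sketch of what those two objectives could look like in code, pix2pix-style with an added L1 term; the variable names are mine, not from any particular implementation:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(disc_on_real, disc_on_fake):
    # Reward the discriminator for calling real fresh food real
    # and generated "undecayed" food fake.
    return bce(tf.ones_like(disc_on_real), disc_on_real) + \
           bce(tf.zeros_like(disc_on_fake), disc_on_fake)

def generator_loss(disc_on_fake, fresh, generated, l1_weight=100.0):
    # Reward the generator for fooling the discriminator, plus an L1 term
    # (as in pix2pix) pulling its output toward the real fresh image.
    fool = bce(tf.ones_like(disc_on_fake), disc_on_fake)
    l1 = tf.reduce_mean(tf.abs(fresh - generated))
    return fool + l1_weight * l1
```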
While I’ve started compiling my dataset, I only have around 15 image pairs, and will need at least 20 times that to get any sort of interesting generator output. Also, to generate high-resolution images I will either need a gigantic network or some form of invertible feature extractor, neither of which I have experience with.

DMGordon – Place

For my place project, I made a point cloud compositing system which condenses multiple point clouds into a single three dimensional pixel grid.

I began this project with the intent of capturing the freeway I live next to and a creek in Frick Park. I started by attempting to generate point clouds using a stereoscopic video camera and Kyle McDonald’s ofxCv OpenFrameworks addon. The resulting images were too noisy for compositing, however, and I had to switch capture devices to the Kinect. While the Kinect provides much cleaner point clouds, it requires an AC outlet for additional power, tethering me to wall sockets. Others have found portable power supplies to remedy this, and my next step is to follow their lead in making a mobile capture rig.

The Kinect automatically filters its infrared and visible color data through its own computer vision algorithms to produce denoised depth data, which can then be projected onto the visible color image to create colored, depth-mapped images. I could then take each pixel of these images and use the depth data as a z-coordinate to populate a three-dimensional color grid. By adding multiple frames of a depth-mapped video into a single color grid, we treat the grid like a piece of photo paper during a long exposure. The resulting images contain the familiar effects of long exposure photography in a three-dimensional vessel:
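A minimal sketch of that accumulation idea (not my actual code; points are assumed to be normalized to the unit cube, and the grid size is arbitrary):

```python
import numpy as np

GRID = 128  # resolution of the 3D color grid along each axis

class ColorGrid:
    """Every frame contributes colored points, quantized into a shared 3D
    grid; summing many frames acts like a long exposure in three dimensions."""

    def __init__(self):
        self.color_sum = np.zeros((GRID, GRID, GRID, 3), dtype=np.float64)
        self.hits = np.zeros((GRID, GRID, GRID), dtype=np.int64)

    def add_frame(self, points, colors):
        """points: (N, 3) x, y, z in [0, 1); colors: (N, 3) RGB values."""
        idx = np.clip((points * GRID).astype(int), 0, GRID - 1)
        for (x, y, z), c in zip(idx, colors):   # unvectorized for clarity
            self.color_sum[x, y, z] += c
            self.hits[x, y, z] += 1

    def average_colors(self):
        """Mean color of every occupied cell (unoccupied cells stay black)."""
        out = np.zeros_like(self.color_sum)
        occupied = self.hits > 0
        out[occupied] = self.color_sum[occupied] / self.hits[occupied][:, None]
        return out
```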

In addition to the extra dimension of space, we can run point-by-point analysis on the grid to split and extract subsections of the image based on various factors. Here are some tests experimenting with simple color-range partitioning:

Full Spectrum:

Filtered for Greenness:

Filtered for Not-Greenness:
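The greenness split itself is just a per-point dominance test; a minimal sketch, with a made-up threshold:

```python
import numpy as np

def greenness_mask(colors, margin=20):
    """colors: (N, 3) array of 0-255 RGB values. A point counts as 'green'
    when its green channel beats both red and blue by `margin`."""
    r, g, b = colors.astype(int).T
    return (g > r + margin) & (g > b + margin)

# Hypothetical usage against point/color arrays like the ones above:
# green_points = points[greenness_mask(colors)]
# not_green_points = points[~greenness_mask(colors)]
```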

Going forward, I see several ways to expand these methods into provocative territory. With a portable capture rig, we could capture more organic and active locations, where movement and changes in light could lead to more dynamic composites. More intense analytics on the composites, such as filtering blocks of the images based on facial recognition or feature classification, would also produce more compelling cross sections of the data. Adding multiple Kinects that composite their point clouds into multi-perspective frames would open the door to volumetric analysis. Even rigging a single Kinect up to a Vive controller or an IMU could provide a global coordinate space for expanding the composites beyond a single frustum of reference.

Here are a couple more images of backyard experiments:

BBQ + House true-color:

BBQ + House 360:

More BBQ related pictures to come

DMGordon-PlaceProposal

I want to use point cloud data to describe a space in multiple moments simultaneously. I can do this using Unity as a display environment and a Kinect as the tool of capture. I want to capture a couple of spots of constant motion that are dear to me. One is the freeway next to my house, and the other is a creek in Frick Park. The code portion is based on a previous project I’ve done, so it will not take as long as it would if I were starting from scratch.
The main problem I’m dealing with is how to get the Kinect to the locations I want to capture, which are far from any electrical outlet. I will most likely end up making a DIY battery.

DMGordon-portrait

My portrait is a collection of VR environments which describe different aspects of caro. From our discussions, I plucked three ideas she shared and created virtual environments from them. I did this using Maya and Mudbox for modeling and animation, then imported the assets into Unity, which provided the underlying framework for interaction. The three ideas eventually developed into their own scenes, connected by a hub world containing portals that transport the subject to and from each scene.

The most difficult portion of this project was definitely the scene based upon caro’s drive to achieve and compete. I chose to represent this drive using procedurally generated staircases, where steps are constantly added to the top while the bottom steps fall away, forcing someone on the stairway to climb at a certain pace or fall. Making the staircase assemble itself along a smooth yet random path involved both calculating the position of new steps in relation to the ones before them, and having them materialize some distance away and fly to their assigned positions to give the effect of being assembled out of the ether. I also had a lot of trouble writing a shader that would fade the steps to and from transparency in an efficient manner. These challenges all provided good learning experiences which will help me in future projects.

The final product is semi-successful in my opinion. I succeeded in creating an immersive VR experience which interfaced with the Vive. I am happiest with having created my own song for the piece, as sound and music are integral to immersion. However, much of the piece (the hub world in particular) has nothing to do with caro, resulting in what seems to be only a half-portrait. If I were to redo this project, I would involve caro much more in content creation, rather than building an infrastructure which is then embellished with details about her.

Portrait from David Gordon on Vimeo.

DMGordon-PortraitPlan

My portrait project is an interactive virtual reality experience primarily developed with the Vive in mind. However, I am also making a non-VR version that will be playable on Mac and Windows operating systems.
Conceptually, the portrait addresses achievement, and the flow of achievement’s value. My portrait’s subject draws strength from past achievements and events in time. The power of past events can exist abstractly within thought and memory, and concretely within mementos.
Technically, I am using this assignment to experiment with different forms of immersion in VR. The subject moves through an interactive ‘hub world’ space which presents doorways into more cinematic environments. Each of these environments experiments with the capabilities of VR in its own way (one is largely based upon Mark Leckey’s GreenScreenRefrigeratorAction). Below are some images of my current progress:

DMGordon-SEM

I chose a square of a holographic book cover to scan. Under slight magnification, it quickly became clear that this square was extremely smooth.

While I was hoping to find the microridges that create the hologram effect on many book covers, these ridges are covered by a transparent polymer coating. However, it is transparent only to light, and the electrons bounced right off it without revealing any of the forms below. The coating is smooth and relatively featureless. We had to zoom in until the sharp edges of the square became a rolling incline before any distinct surface features were visible.

 

It looks a lot like Enceladus, Saturn’s icy satellite. However, instead of hosting a subsurface ocean, the coating gives way to the tangled fibers of paper.

 

Also visible are sections where the coating has chipped off, revealing the tiny holographic grid beneath (center right). We tried to zoom in on that section, but couldn’t get a clean resolution before our time was up.

Who I am

I am a scraggly dog, and this class is my caring owner. We will embark on great journeys, summit peaks most harrowing. Every Tuesday and Thursday you will awake to me, licking your face. I am hungry, and you must feed me!

Woof woof! This dogger is house trained