Daily Archives: 14 May 2014

Haris Usmani

14 May 2014


Tagline
ofxCorrectPerspective: makes parallel lines parallel. An OF addon for automatic 2D rectification.

Abstract
ofxCorrectPerspective is an openFrameworks add-on that performs automatic 2D rectification of images. It is based on the work in “Shape from Angle Regularity” by Zaheer et al., ECCV 2012. Unlike previous methods of perspective correction, it requires no user input (provided the image has EXIF data). Instead, it relies on the geometric constraint of ‘angle regularity’, which leverages the fact that man-made designs are dominated by the 90-degree angle. It solves for the camera tilt and pan that maximize the number of right angles, resulting in the fronto-parallel view of the most dominant plane in the image.


2D image rectification involves finding the homography that maps the current view of an image to its fronto-parallel view. It is usually required as an intermediate step in a number of applications: for example, to create disparity maps from stereo camera images, or to correct projections onto planes non-orthogonal to the projector. Current techniques for 2D image rectification require the user either to manually input corresponding points between stereo images, or to adjust tilt and pan until the desired image is obtained. ofxCorrectPerspective aims to change all this.
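
To make the idea concrete, here is a minimal sketch of the final warping step in Python with OpenCV. The addon itself works in openFrameworks/C++ and estimates the homography for you; the file name and the values in H below are placeholders.

    # Minimal sketch of the warping step: once a 3x3 rectifying homography
    # H is known, the fronto-parallel view is one warp away. The image
    # name and the entries of H here are placeholders.
    import cv2
    import numpy as np

    img = cv2.imread("facade.jpg")
    H = np.array([[1.02, 0.15, -40.0],
                  [0.01, 1.10, -25.0],
                  [1e-4, 3e-4,   1.0]])
    h, w = img.shape[:2]
    rectified = cv2.warpPerspective(img, H, (w, h))  # fronto-parallel view
    cv2.imwrite("facade_rectified.jpg", rectified)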

How it Works
ofxCorrectPerspective automatically solves for the fronto-parallel view without requiring any user input (provided EXIF data is available, for focal length and camera model). Based on the work by Zaheer et al., it uses ‘angle regularity’ to rectify images. Angle regularity is a geometric constraint relying on the fact that in the structures around us (buildings, floors, furniture, etc.), straight lines meet at a particular angle, and predominantly that angle is 90 degrees. If we know the pairs of lines that meet at this angle, we can use the distortion of this angle under projection as a constraint to solve for the camera tilt and pan that yield the fronto-parallel view of the image.
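
As a rough illustration of how that constraint can be scored, here is a hedged Python sketch assuming a pure-rotation camera model: a candidate (tilt, pan) gives the homography H = K·R·K⁻¹, with intrinsics K built from the EXIF focal length. All names here are illustrative, not the addon's actual API.

    # Sketch of the 'angle regularity' cost: lines are mapped through the
    # candidate homography, and the cost measures how far supposedly
    # orthogonal pairs deviate from 90 degrees after rectification.
    import numpy as np

    def rotation(tilt, pan):
        ct, st = np.cos(tilt), np.sin(tilt)
        cp, sp = np.cos(pan), np.sin(pan)
        Rx = np.array([[1, 0, 0], [0, ct, -st], [0, st, ct]])  # tilt
        Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])  # pan
        return Ry @ Rx

    def warp(H, p):
        q = H @ np.array([p[0], p[1], 1.0])
        return q[:2] / q[2]

    def angle_cost(pairs, tilt, pan, K):
        # pairs: [((p1, p2), (q1, q2)), ...] segment endpoints for line
        # pairs hypothesized to meet at a right angle in the world
        H = K @ rotation(tilt, pan) @ np.linalg.inv(K)
        cost = 0.0
        for (p1, p2), (q1, q2) in pairs:
            d1 = warp(H, p2) - warp(H, p1)
            d2 = warp(H, q2) - warp(H, q1)
            cos_a = abs(d1 @ d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))
            cost += cos_a  # zero when the rectified pair is orthogonal
        return cost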

To find these pairs of lines, ofxCorrectPerspective starts by detecting line segments using LSD (the Line Segment Detector of R. G. von Gioi et al.). It then extends these lines, for robustness against noise, and computes an adjacency matrix that tells us which pairs of lines are likely to be orthogonal to each other. From these candidate pairs, ofxCorrectPerspective uses RANSAC to separate inlying from outlying pairs: an inlier is a pair whose right angle is restored by the candidate tilt and pan that minimize the angle distortion over the sampled pairs. Finally, the best RANSAC solution gives the tilt and pan required for rectification.
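
The RANSAC stage might look roughly like the following sketch. It reuses angle_cost() from the sketch above; the sample size, iteration count, and inlier threshold are assumptions, not the addon's actual values.

    # Sketch of the RANSAC stage: fit tilt/pan to a small sample of
    # 'probably orthogonal' pairs, then score the fit by how many of the
    # remaining pairs have their right angle restored.
    import random
    from scipy.optimize import minimize

    def ransac_tilt_pan(pairs, K, iters=500, thresh=0.05):
        best, best_inliers = (0.0, 0.0), -1
        for _ in range(iters):
            seed = random.sample(pairs, 3)            # minimal-ish sample
            res = minimize(lambda x: angle_cost(seed, x[0], x[1], K),
                           x0=[0.0, 0.0])             # fit tilt/pan to it
            tilt, pan = res.x
            # inliers: pairs consistent with this tilt/pan
            n = sum(angle_cost([pr], tilt, pan, K) < thresh for pr in pairs)
            if n > best_inliers:
                best, best_inliers = (tilt, pan), n
        return best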


Possible Applications
ofxCorrectPerspective can be used on photos much like a tilt-shift lens. It can compute the rectifying homography between stereo images, speeding up the computation of disparity maps. The same homography can be used to correct an image cast by a projector that is non-orthogonal to the screen. ofxCorrectPerspective can also very robustly remove perspective from planar images, such as a paper scan attempted with a phone camera. And it produces some interesting artifacts: for example, it turns a camera tilt or pan into a zoom (as shown in the demo video).


Limitations & Future Work
ofxCorrectPerspective works best on images that have a dominant plane with a set of lines or patterns on it. It also works on multi-planar images, but usually ends up rectifying only one of the visible planes, since ‘angle regularity’ is a local constraint. One way to extend this would be to apply some form of segmentation to the image before running it through the add-on (as done by Zaheer et al.). Another would be to let the user select a particular area of the image as the plane to be rectified.

About the Author
Haris Usmani is a grad student in the M.S. Music & Technology program at Carnegie Mellon University. He did his undergrad in Electrical Engineering at LUMS, Pakistan. In his senior year at LUMS, he worked at the CV Lab, where he came across this publication.
www.harisusmani.com

Special Thanks To
Golan Levin
Aamer Zaheer
Muhammad Ahmed Riaz

Chanamon Ratanalert

14 May 2014

tl;dr
– this class is awesome, you should take it
– nothing is impossible/just try it regardless
– be proud of your projects

As my junior year draws to a close, I look back on my semester with a heavy heart: not because I hated this semester, but because I am saddened by its finish. I never would have thought that I could come so far in the three years I’ve been in college. I thought my “genius” in high school was my peak. I thought it was good enough—or as good as it was going to get. This course has pushed that potential much higher—way beyond what I could have ever imagined I was capable of. And I don’t mean this in terms of intelligence. The experiences I’ve had in the past three years have brought my perseverance light years beyond where it would have been if I hadn’t come to this school and pushed my way into this class’s roster.

What have I learned from this course, you may ask? The main takeaway I received from this class, one I’m sure was intentionally bludgeoned into all of us, is that whether or not something seems impossible, you should just try it. The possibility of achieving your goals, believe it or not, increases when you actually reach for them. This mentality has produced amazing projects from the course’s students, projects I never thought I’d witness right before my eyes. I always felt that those amazing tech and design projects I saw online were like celebrities: you knew they came from regular people, but you never thought you’d see one in the same room as you.

I will also forever take with me the idea that how you talk about your project is just as important as the project itself. This seems obvious, and you may think you’ve talked up your project enough, but, as could be seen with some of my peers’ projects this semester, the right documentation and publicity can make a world of difference. How you present a project will determine how it is perceived. That self-deprecating thing most people do when they talk about themselves, to garner “oh psh, you’re great”s and “what are you talking about, it’s wonderful”s, doesn’t work too well for most projects. Looking down on your own project will make others look down on it too, and keep them from seeing it for what it is at that moment. Sure, oftentimes your project might actually stink, but you don’t want others to know that.

You also have to be careful about how much you reveal about your project. You may think that the technical details of how many programs are running in the backend or how many wires you had to solder together are interesting, but they’re really not. If someone looking at your project cares about that level of detail, they’ll ask. You have to present your project for what it is at the moment someone sees it, not what it was a couple of hours ago while you were booting it up. It is important to say how you created the project (especially if it was in a new and interesting way), but the extra details might be better left out. But I digress.

I value this course in more ways than I can describe. Let’s just say that I’m very thankful to have been scrolling through CMU’s schedule of classes on the day just before Golan picked his students. Luck could not have been more on my side. Of course, after the class started, luck was no longer a factor—just hard work. And I’m glad to have needed to put in all that hard work this semester. Without it, I would not have realized what great potential there is inside of me, and inside of everyone, for that matter. You always feel like there’s a limit within you, so when you think you’ve hit it, you never dare push past it for fear of breaking. This course has annihilated that fear, because I’ve realized that the limit only exists within our minds. Okay, maybe I’m getting a little carried away here, but even so, limitless or not, if there is anything you want to try, just try it.

Chanamon Ratanalert

14 May 2014


Tweet: Tilt, shake, and discover environments with PlayPlanet, an interactive app for the iPad

Overview:
PlayPlanet is a web (Chrome) application made for the iPad, designed for users to interact with it in ways other than the traditional touch method. PlayPlanet offers a variety of interactive environments for users to choose from and explore: they tilt and shake the iPad to discover reactions in each biome. The app was created so that users must trigger events in each biome themselves, unfolding the world around them through their own actions.

Go to PlayPlanet
PlayPlanet Github

My initial idea had been to create an interactive children’s book. Now, you may think that idea is pretty different from my final product. And you’re right. But PlayPlanet is much more faithful to the sprout of an idea that first led to the children’s book concept. What I ultimately wanted to create was an experience that users unfold for themselves: nothing triggered by a computer system, just pure user input directed into the iPad via shakes and tilts to create a response.

After many helpful critiques and consultations with Golan and peers (muchas gracias to Kevyn, Andrew, Joel, Jeff, and Celine in particular), I landed on the idea of interactive environments. What could be more directly driven by user input than a shakable world that flips and turns at every move of the iPad? With this idea in hand, I had to make sure it grew into a better project than my book was shaping up to be.

The issue with my book was that it had been too static, too humdrum. Nothing was surprising, or too interesting, for that matter. I needed the biomes to be exploratory, discoverable, and all-in-all fun to play with. That is where the balance between what was already moving on the screen and what could be moved came into play. The environments had to be interesting on their own while the iPad was still, yet just mundane enough that the user would want to explore more and uncover what else the biome contained. This curiosity would lead the user to unleash these secrets through physical movement of the iPad.

After many hours behind a sketchbook, Illustrator, and code, this is my final result. I’m playing it pretty fast and loose with the word “final” here because, while it is what I am submitting as a final product for the IACD capstone project, this project has much greater potential. I hope to continue to expand PlayPlanet, creating more biomes, features, and interactions for the user to explore. Nevertheless, I am proud of the result I’ve achieved and am thankful to have had the experience with this project and this studio.

Shan Huang

14 May 2014

One-sentence tweetable version: Stitching sky views from all over Manhattan into an infinite sky stream.

While searching for inspiration for my final project, I was fascinated by the abundance of data in Google Street View, a web service that grants you instant access to views from all over the globe. I really enjoyed taking Street View tours of places I had never been to, or even heard of, like a lonely shore on the northern coast of Iceland. But as I rolled the camera upwards, I saw buildings and sky intersecting at the skyline, and the skyline extending way beyond the view itself: beyond the street, the city, even the country and the continent. So I was inspired to make some sort of collective sky that creates a journey along a physical or psychological path.

Scraping

Everything starts with scraping. I wrote some Python scripts to query images from the Google Street View Image API and stored metadata such as latitude, longitude, and panorama id in the filenames. An advantage of the Street View Image API over Google’s other street panorama services is that it auto-corrects the spherical distortion in panorama images. I found this really handy because I could skip implementing my own pano unwrapper. But I had to face its downsides too: the maximum resolution I could get was 640×640, and I had to scrape strategically to avoid exceeding the 25k-images-per-day query quota.
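
A stripped-down version of such a script might look like this. The pitch/fov values and the API key are assumptions; pitch=90 points the camera straight up at the sky, and the coordinates are baked into the filename so later stages can recover them.

    # Sketch of a sky-scraping script against the Street View Image API.
    import urllib.request

    BASE = "https://maps.googleapis.com/maps/api/streetview"

    def fetch_sky(lat, lng, key):
        url = (f"{BASE}?size=640x640&location={lat},{lng}"
               f"&pitch=90&fov=90&key={key}")
        # metadata lives in the filename: "<lat>~<lng>.jpg"
        urllib.request.urlretrieve(url, f"{lat}~{lng}.jpg")

    fetch_sky(40.7553245, -73.96346984, key="YOUR_API_KEY")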

(A typical query result; the filename 40.7553245~-73.96346984 encodes its latitude and longitude)

The sky image I got from each query typically looks like the one above. I tried scraping several cities around the world, including Hong Kong, Pittsburgh, Rome and New York, but ultimately I settled on Manhattan, New York, because the area had the most variation in kinds of skies (the skyline of a cluster of skyscrapers can look very different from that of a two-floor building or a highway). Besides, the contours of the Manhattan sky had the simplest geometric shapes, making shape parsing a whole lot easier. In total I scraped around 25k usable images of Manhattan sky.


Shape parsing

I was most interested in the orientation of the sky area. More specifically, for each image I wanted to know where the sky exits on the four sides of the image. With the color contour detector in ofxOpenCv, I was able to get pretty nice contours of skies like this:

(Contour marked by red dotted line)
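
Roughly the same step, redone with OpenCV’s Python bindings rather than ofxOpenCv (OpenCV ≥ 4); the HSV threshold below is an assumed, per-dataset tuning value.

    # Segment sky-colored pixels, then trace the outer contour.
    import cv2

    img = cv2.imread("40.7553245~-73.96346984.jpg")
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (90, 20, 120), (130, 255, 255))  # bluish sky
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    sky = max(contours, key=cv2.contourArea)  # largest blob = the sky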

From the full contours I marked the exits on the four sides and computed the direction of each exit. This step gave results like this:

(Exits marked by colored lines)

These exits helped me decide along which axis I should chop each image up. For instance, if a sky only had exits on the left and right, I’d certainly subdivide it horizontally. If an image had three exits, it would be subdivided along the axis that had exits on both sides. For four-exit images it didn’t really matter which way to subdivide. And finally, images with one or zero exits were discarded. The decision rule is sketched below.
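
In code, that rule boils down to something like this; the side names and set representation are illustrative, not the project’s actual data format.

    # Choose a subdivision axis from the sides where the sky exits.
    def chop_axis(exits):
        if len(exits) <= 1:
            return None                          # discard the image
        if {"left", "right"} <= exits and not {"top", "bottom"} <= exits:
            return "horizontal"                  # sky flows left-right
        if {"top", "bottom"} <= exits and not {"left", "right"} <= exits:
            return "vertical"                    # sky flows top-bottom
        return "horizontal"                      # four exits: either works

    print(chop_axis({"left", "right", "top"}))   # -> horizontal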


The above steps resulted in a huge collection of slices of the original images, with metadata marking the exits on each slice. I also computed the average colors of the sky and non-sky regions and recorded them in the metadata file. The collection was tremendously fun to play with, because each strip essentially became a paint stroke with known features: I had the power to control the width and color of my sky brush at will. My first experiment with this collection was to align all skies of a specific width along a line.
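
A sketch of how such a ‘brush’ might be driven from the slice metadata; the file name, the field names, and place_strip() are all hypothetical.

    # Sort slices by average sky color and lay them out along a line.
    import json

    def place_strip(path, x, y):
        print(f"placing {path} at ({x}, {y})")  # stub for the real canvas draw

    with open("slices.json") as f:  # per-slice metadata from the steps above,
        slices = json.load(f)       # e.g. {"file": ..., "sky_rgb": [r, g, b],
                                    #       "width": ...}

    slices.sort(key=lambda s: sum(s["sky_rgb"]))  # darkest to brightest sky
    x = 0
    for s in slices:
        place_strip(s["file"], x, 0)
        x += s["width"]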


In my final video I sorted the images by color and aligned them along arcs instead of a flat, boring straight line. The result is a semi-dreamy, surreal ball composed of skies that leads people on a psychological journey through New York.