Category Archives: CapstoneProposal

John Mars

28 Apr 2015

ibldi by @john_a_mars is a web app that creates customizable 3D printed models of urban areas.

Using extracted 3D tiles, i.e., buildings, textures, and terrain, from the service née Nokia Maps, ibldi is a web app that allows users to select custom areas within cities (or anywhere that has 3D data available) and have them printed via Shapeways.

There are two competitors in this general area, Terrafab and The Terrainator, but both of them are focused on terrain instead of buildings.

The current iteration of the project is still skeletal, awaiting linking of all the parts and a useful skin (UI/UX) over it all.

Focus Areas:

  • (COMPLETE) Scraping 3D tiles of various cities

    All of the tiles were downloaded via a Python script leveraging migurski/NokiaWebGL, a Nokia Maps scraper (with improvements to produce better OBJ files).

  • (COMPLETE) Creating a web application capable of 3D visualization and API calls

    The backend of the website is running on Node.js with Express as the web framework. Visualization and all mesh editing is done with Three.js.

  • (COMPLETE) Integrating with the Shapeways API

    Shapeways has graciously provided a Node.js wrapper for their API, but neglected to provide any meaningful documentation to its intricacies.

  • (COMPLETE) Creating manifold meshes from the city tiles

    The downloaded tiles are only single surfaces, and fairly messy ones at that. In order for a model to be 3D-printable, it needs to be manifold, or watertight; stray single surfaces and open groups of surfaces are not allowed.

    I eventually ended up finding the edges of the tiles (via vertex hashing and a traversal algorithm), duplicating each edge so all of its points had a z-height of 0, then triangulating the vertices and making faces.

    This was by far the biggest problem/time-sink of the entire project.

  • Creating the X3D file format required for printing in color

    As has been a running issue throughout the project, keeping individual vertices and faces matched up with UV coordinates for textures has proven a challenge. I’ve been working with the OBJ file format for the length of the project, thanks to its ubiquity and ease of use and creation, but Shapeways will only accept VRML or its successor X3D for color prints.

  • Merging it all together

    All (well, most) of the parts work on their own, but the pipeline connecting them all together needs to be finished.

  • Making it work well and look good

    The front end of the website needs to be designed and developed.
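The mesh-closing step described in the manifold bullet above (finding boundary edges via vertex hashing, then extruding them down to z = 0) can be sketched as follows. This is an illustrative reconstruction, not the project’s actual code, and it leaves out the base plate and consistent face winding a real print would need:

```python
# Sketch (not the project's code): close an open "terrain skin" mesh by
# finding its boundary edges and extruding them down to z = 0.
# Faces are triples of indices into `vertices`; winding is not normalized.
from collections import Counter

def boundary_edges(faces):
    """Edges that appear in exactly one triangle lie on the open border."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[tuple(sorted((u, v)))] += 1
    return [e for e, n in counts.items() if n == 1]

def close_to_ground(vertices, faces):
    """For each boundary edge, add two triangles forming a vertical wall
    down to a z = 0 duplicate of each boundary vertex."""
    vertices, faces = list(vertices), list(faces)
    ground = {}  # original vertex index -> index of its z = 0 duplicate

    def grounded(i):
        if i not in ground:
            x, y, _ = vertices[i]
            vertices.append((x, y, 0.0))
            ground[i] = len(vertices) - 1
        return ground[i]

    for u, v in boundary_edges(faces):
        gu, gv = grounded(u), grounded(v)
        faces.append((u, v, gv))
        faces.append((u, gv, gu))
    return vertices, faces
```

A full solution would also cap the bottom (triangulating the z = 0 loop) to make the solid truly watertight.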

Alex Sciuto

28 Apr 2015


Tweetable summary: Explore your thoughts and the words we use. Call 314 222 0893.

Let’s Consider is a phone-based service that users can call to explore what they are thinking about at the moment. Side effects may include some humorous chuckles and a bit of head-scratching. With Let’s Consider, a user starts with ten broad categories and can then explore more than 30,000 English-language concepts and words.

Let’s Consider is a mashup of Twilio’s communication services for phone connectivity, WordNet for hierarchical concepts, a library that simplifies WordNet queries, and a natural language processing module for Node.js.


Let’s Consider is the result of multiple iterations.

Originally, the plan was to visualize State of the Union text data using data visualization and text-analysis. Without an interesting question to ask the data, no interesting concepts arose.

Next, I looked at taking State of the Union audio and mashing it together into super-cuts based on particular phrases. Using detailed and precise transcripts to modify video clips is a very interesting idea with lots of applications. Dozens of audio or video clips could be layered all with a common phrase; video clips could be appended for an infinite State of the Union. Sadly, the technical challenge of getting precise audio transcripts proved insurmountable.

I did some brainstorming with Golan and others about other things we could do with audio. Eventually we were discussing new APIs and services coming out for phone-based applications. I started thinking about connecting people over the phone under different contexts—complaint line, self-help line, sharing experiences line. We were concerned that there might be issues connecting people in such an open-ended way.

Inspired by Jorge Luis Borges’ short story “The Library of Babel”, I decided to create my own Library of Babel of all the concepts in English, in combination with an audio phone system. Borges’ short story describes an unimaginably large library that contains 410-page books of every possible combination of characters. Librarians, deluged with so much information, try to find different ways to deal with all the books, but there is no way to order them.

Instead of every combination of characters, Let’s Consider takes ten broad categories of concepts and then allows the user to go through progressively more specific menus to find what they are interested in.
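The progressive drill-down can be illustrated with a toy hierarchy; the real system walks WordNet’s hyponym tree of roughly 30,000 concepts, and the concept names below are just stand-ins:

```python
# Toy sketch of the progressive-menu idea. The real system walks WordNet's
# hyponym tree; this hand-made dict just illustrates the drill-down.
HIERARCHY = {
    'entity': ['living thing', 'artifact'],
    'living thing': ['vascular plant', 'animal'],
    'vascular plant': ['rose bush', 'elm tree', 'basil'],
}

def menu_for(concept):
    """The caller's next menu: children of the current concept, or None
    if we've hit a leaf (time to ask about synonyms and valence)."""
    return HIERARCHY.get(concept)

def drill_down(choices):
    """Simulate a caller pressing a sequence of menu choices (0-based)."""
    concept = 'entity'
    for pick in choices:
        children = menu_for(concept)
        if children is None:
            break
        concept = children[pick]
    return concept
```

In the real service, each leaf would then trigger the synonym and positive/negative prompts before the call ends.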

Description of Let’s Consider

“The Library of Babel” was published in the collection of short stories called The Garden of Forking Paths. “The garden of forking paths” would be an appropriate subtitle for Let’s Consider. Starting with broad categories, the user steps through increasingly specific menus. Finally they arrive at a concept and are asked to decide which synonym they are thinking about. Then they tell whether this concept makes them feel positive or negative. Then the call ends.

There is a website that visualizes all the calls that have been received. The website serves as a repository for every interaction that people have with the system.

There were two main challenges with making this concept work: how to frame the interaction broadly and how to design an audio-only system.

Framing the concept was a challenge. Some suggested this concept could be used to actually generate interesting information about the person. If a person was going towards sad or melancholy ideas, the system could suggest they do something fun, etc. Others suggested that it could be an absurd experience combining random prompts as the user goes down a rabbit hole of concepts.

I initially used a framing of complaining or grousing: users were prompted to figure out what was bothering them at the moment (working system name: Let’s Kvetch). I changed this to a psychotherapist framing, in which the system would ask questions indefinitely until arriving at a definitive solution humorously at odds with the probing questions.

Based on feedback, I reduced the framing and instead prompt people to try to discover what they are thinking about. I like this framing because it encourages centeredness and reflection using a rather absurd system.

The other challenge was designing the audio-only system. Most of the menus the user encounters contain lists of complex concepts. For example, finding common house plants requires the user to recognize that they are “vascular plants or tracheophytes.” I tried a few ways to get around this, including ordering the labels in different ways. I ended up surfacing examples for every choice, so “vascular plant or tracheophyte” also gets “such as rose bushes, elm trees, or basil.” This still isn’t perfect, because some of the references are obscure, and three examples don’t give an accurate overview of 1,000+ concepts. Still, it’s a start.

Timing and word usage were other important details. Twilio allows inserting pauses only in whole-second increments; I wish it allowed millisecond pauses, because I would have used those too. Also, the text I send to Twilio includes periods where commas should be, to slow down Twilio’s voice.
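Those pause and punctuation tricks amount to generating TwiML like the sketch below. Twilio’s `<Say>`/`<Pause>` verbs and the whole-second `length` attribute match Twilio’s documented behavior, but the prompt text and function name here are made up, and real code would normally use Twilio’s helper library rather than raw XML:

```python
# Sketch: emit a TwiML response with <Say> and whole-second <Pause> verbs,
# swapping commas for periods to slow the synthesized voice.
import xml.etree.ElementTree as ET

def twiml_prompt(lines, pause_seconds=1):
    root = ET.Element('Response')
    for text in lines:
        say = ET.SubElement(root, 'Say')
        # Periods instead of commas make Twilio's voice pause longer.
        say.text = text.replace(',', '.')
        pause = ET.SubElement(root, 'Pause')
        pause.set('length', str(pause_seconds))  # Twilio: whole seconds only
    return ET.tostring(root, encoding='unicode')
```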

Conclusion and Thoughts

Let’s Consider is a middling success as an art piece. It requires a lot of work for the user to engage with the service, and once they do, it is a challenge to accurately navigate the menu system to the concept they are thinking about. Where Let’s Consider succeeds is as a playful exploration of words and concepts using a synthesized voice instead of text. The voice is especially effective in spite of the higher cognitive load it places on the user: it often mispronounces words comedically, and it reminds us that these words and concepts are often used in spoken contexts, not just on screen.

There are two areas I wish I could have improved for this final critique. One is to incorporate popularity rankings into the labeling system; by showing popular concepts more prominently, I think navigability would improve. I also think that the physical component of the system could better set the stage for the user. Right now, the system connects to any touch-tone phone. Madeline Gannon suggested using an old rotary phone on a pedestal to make the experience of using the system unique and set apart from the everyday. I agree with that. I think this would make it a nice “art” piece that people interact with more conscientiously.



28 Apr 2015


Camera Kinetics



Camera Kinetics is a piece of hardware that allows people to obtain informed and stylized camera movements with one click.




One moving subject placed at center frame. Real-time tracking for dynamic shots that move beyond a single frame.


Static Sweep.

One stationary subject at center frame. The camera completes a multi-axis pan around the subject, with a subtle effect of a rotating background.


Follow Thru.

Follows the subject’s approach and then pans to the subject’s view. The cadence of the choreography is real-time and adaptive.


Inner Workings 



Linear Motion Tracking Algorithm. (Scaled Motion)

As the pan-tilt hits its rotation lock and the subject leaves center, the rail motion will compensate.
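A minimal sketch of that scaled-motion idea, with assumed geometry, limits, and names (this is not the actual firmware):

```python
# Sketch of the compensation idea: pan to keep the subject centered; once
# the pan hits its rotation lock, translate the rail to make up the rest.
# The 45-degree limit and the flat rail geometry are assumptions.
import math

PAN_LIMIT = math.radians(45)  # assumed rotation lock

def track(subject_x, subject_dist, rail_x):
    """Return (pan_angle, new_rail_x) keeping the subject at center frame.
    subject_x and rail_x lie along the rail axis; subject_dist is the
    perpendicular distance from the rail to the subject."""
    desired = math.atan2(subject_x - rail_x, subject_dist)
    if abs(desired) <= PAN_LIMIT:
        return desired, rail_x  # pan alone can center the subject
    pan = math.copysign(PAN_LIMIT, desired)
    # Slide the rail so that, at the locked pan angle, the subject
    # sits exactly at center frame again.
    new_rail_x = subject_x - math.tan(pan) * subject_dist
    return pan, new_rail_x
```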



Pan Tilt Integration (Pending Hardware)

A Kinect point cloud enables 3-dimensional positioning, which, coupled with rudimentary CV, allows for proper handling of complex motion.

Amy Friedman

28 Apr 2015

Eye Tracker Display Interface

Where do people look when given different images dealing with basketball? Does this relate to where one looks during a real basketball game? How do people change where they look based on their knowledge of the game of basketball?

What I’ve done:

Part 1: Created a program to save data from the eye tracker for several participants, and created the display of images shown to participants on a timer. Finally, created a survey to better understand the demographics and basketball background of each participant.

Part 2: Created a program to aggregate all of the participants’ eye-tracker data and create different analysis charts based on which images were shown on the current screen. Currently I have a scatter plot, a lined scatter plot, and a heat map, and I am working towards a reveal map and other comparisons.

Scatter Plot Map

Lined Scatter Plot Map

Heat Map

Reveal Map
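The heat-map aggregation from Part 2 can be sketched like this; the grid size and the sample format are assumptions, not the project’s actual data layout:

```python
# Sketch: aggregate gaze samples from all participants into a coarse
# heat-map grid for one displayed image.
def heat_map(samples, width, height, bins=10):
    """samples: iterable of (x, y) gaze points in pixel coordinates.
    Returns a bins x bins grid of sample counts; out-of-frame points
    (blinks, off-screen glances) are dropped."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in samples:
        if 0 <= x < width and 0 <= y < height:
            col = int(x * bins / width)
            row = int(y * bins / height)
            grid[row][col] += 1
    return grid
```

A plotting layer (e.g., a color ramp over the counts) would render this grid as the heat-map overlay.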

Next Steps

Next steps include finishing a usable interface that people can click through to compare images and the different mapping systems I created. I need to fix the reveal map, cluster map, and real-time animation, which currently have issues. Another open issue is how long to show the heat maps.

Designs of Interface To Be Created


– suggestions on the best way to reveal information comparisons

– what other comparisons should be made?

– what else should be shown?




28 Apr 2015

For my final project, I’ve been working on creating a mashup of different panels from the Dilbert comic strip. Over its run of more than 25 years, author Scott Adams has created close to 10,000 individual strips. To create the mashup, I needed the text from each individual comic-strip panel.

As part of the earlier data-scraping assignment, I wrote some Python code to scrape the text of each Dilbert comic strip as embedded in the page source. However, the text only applies to an entire strip, and does not specify the words associated with each particular panel.

For this assignment, I wrote additional Python code to scrape and download each individual comic strip that has been published online. I then had code go through each strip to separate it into individual panels so they could be mashed up with other panels.
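One simple way to find the panel cuts, sketched here on a toy grayscale image, is to look for vertical “gutter” columns that are entirely background; the real strips also have borders and scanner noise that this ignores:

```python
# Sketch: split a strip into panels by finding fully-white gutter columns.
# `image` is a list of rows of grayscale pixel values (0-255).
def panel_spans(image, white=255):
    """Return (start, end) column spans of non-gutter (panel) regions."""
    h, w = len(image), len(image[0])
    is_gutter = [all(image[r][c] == white for r in range(h))
                 for c in range(w)]
    spans, start = [], None
    for c, gutter in enumerate(is_gutter):
        if not gutter and start is None:
            start = c                      # panel begins
        elif gutter and start is not None:
            spans.append((start, c))       # panel ends at the gutter
            start = None
    if start is not None:
        spans.append((start, w))           # panel runs to the right edge
    return spans
```

Each span can then be cropped out as its own panel image.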



In order to isolate dialogue to specific panels, I intended to perform optical character recognition (OCR) on each of the panels. However, using the Tesseract OCR library by itself showed that it is not intelligent enough to separate dialogue from two separate characters (such as the right-most panel, above) — the OCR library treats it as a single block of text.

To work around this, Golan provided me with clever suggestions and valuable software support for using OpenCV and openFrameworks to perform a sequence of image manipulations. First, a flood fill from the black upper-left corner is performed to take out the characters, who are always touching the bottom edge. The image is then converted to grayscale, and a median filter is applied to remove pixel noise. The dialogue letters are then dilated to consolidate them into defined, isolatable blobs so that bounding rectangles can be formed.
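The last two steps of that pipeline, dilation and blob bounding rectangles, can be sketched in pure Python at toy scale (the actual project uses OpenCV for both):

```python
# Sketch: dilate the letter pixels, then find bounding rectangles of the
# resulting connected blobs. `mask` is a binary image as a list of rows.
from collections import deque

def dilate(mask):
    """3x3 dilation: a pixel is set if any neighbor (or itself) is set."""
    h, w = len(mask), len(mask[0])
    return [[1 if any(mask[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2)))
             else 0
             for c in range(w)]
            for r in range(h)]

def blob_boxes(mask):
    """4-connected components -> (top, left, bottom, right) boxes."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                q, box = deque([(r, c)]), [r, c, r, c]
                seen[r][c] = True
                while q:  # breadth-first flood of one blob
                    y, x = q.popleft()
                    box = [min(box[0], y), min(box[1], x),
                           max(box[2], y), max(box[3], x)]
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append(tuple(box))
    return boxes
```

Dilation merges nearby letter fragments, so each speech bubble’s text ends up as one blob with one rectangle.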


The coordinates of the bounding rectangles in each panel were stored in an XML file, which another Python program then used to perform OCR on each of the blobs and, using the scraped dialogue as ground truth, determine which OCR’ed text is associated with a specific panel via the Levenshtein distance algorithm.
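The matching step can be sketched with a plain dynamic-programming Levenshtein distance; the project may well use a library implementation instead:

```python
# Sketch: match noisy OCR output to the scraped strip dialogue by picking
# the candidate line with the smallest Levenshtein edit distance.
def levenshtein(a, b):
    """Classic edit distance, one DP row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,             # deletion
                           cur[j - 1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def best_match(ocr_text, candidates):
    """Pick the scraped dialogue line closest to the OCR output."""
    return min(candidates,
               key=lambda c: levenshtein(ocr_text.lower(), c.lower()))
```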


Once the dialogue for specific panels was captured, I used the CMU rhyming dictionary to randomly assemble a set of panels that rhyme with each other. I’m still working on a Python web-server interface, which would allow the user to initiate the generation of a new strip.
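The rhyme grouping can be sketched against a tiny hand-made pronunciation table in the CMU dictionary’s ARPAbet style (phones with stress digits); the real project uses the full CMU Pronouncing Dictionary, and these four entries are just illustrations:

```python
# Sketch: group words that rhyme, using a toy CMU-style pronunciation
# table. Two words rhyme if their phones match from the last stressed
# vowel onward.
TOY_DICT = {
    'CAT': ['K', 'AE1', 'T'],
    'HAT': ['HH', 'AE1', 'T'],
    'DOG': ['D', 'AO1', 'G'],
    'FOG': ['F', 'AO1', 'G'],
}

def rhyme_key(word):
    """Phones from the last stressed vowel (stress digit 1 or 2) onward."""
    phones = TOY_DICT[word.upper()]
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in '12':
            return tuple(phones[i:])
    return tuple(phones)

def rhyme_groups(words):
    """Bucket words by rhyme key; keep only buckets with real rhymes."""
    groups = {}
    for w in words:
        groups.setdefault(rhyme_key(w), []).append(w)
    return [g for g in groups.values() if len(g) > 1]
```

With the full dictionary loaded in place of `TOY_DICT`, the last word of each panel’s dialogue could be bucketed the same way to find rhyming panels to assemble.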