For my final project, I wanted to string together clips from various movies in interesting ways. To do so, I built a tool that extracts clips from video files using timing data from a matching subtitle file. After extracting the metadata from each film, I imposed rules to select specific clips in a specific order. For example: all clips that use the word “yes” or “yeah” as an interjection, all clips that end in a question mark, or all clips containing a particular word that occurs frequently throughout the film (determined by term frequency-inverse document frequency, or TF-IDF). Eventually I decided on pairing phrases that have the same number of syllables and rhyming last words.
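A minimal sketch of the extraction step described above: reading timing data from a subtitle file and applying one of the selection rules (lines that end in a question mark). It assumes the subtitles are in SubRip (.srt) format, which is not stated in the write-up. The names `parse_srt`, `to_seconds`, `SAMPLE`, and `questions` are hypothetical, and cutting the video itself is left out.

```python
import re

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:04,000".
# Assumption: subtitles use the SubRip (.srt) layout.
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, subtitle_text) tuples."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                # Everything after the timing line is the subtitle text.
                body = " ".join(l.strip() for l in lines[i + 1:]).strip()
                if body:
                    cues.append((to_seconds(*parts[:4]), to_seconds(*parts[4:]), body))
                break
    return cues


# Hypothetical sample data, for illustration only.
SAMPLE = """1
00:00:01,000 --> 00:00:02,500
Are you coming?

2
00:00:03,000 --> 00:00:04,250
Yes, I am.
"""

cues = parse_srt(SAMPLE)

# Example rule: keep only the cues whose text ends in a question mark.
questions = [cue for cue in cues if cue[2].endswith("?")]
```

Each tuple carries the start and end times needed to cut the matching clip from the video file, so other rules (a particular interjection, a high TF-IDF word) can be written as further filters over the same list.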
With around 35 movie and subtitle pairings, and in the spirit of iambic pentameter, I chose phrases that had around 10 syllables. Within those pools of subtitles, I determined which subtitles rhymed and created a massive dictionary mapping each subtitle to a list of the subtitles it rhymed with. Although it was possible to filter even further and remove all subtitles not in iambic pentameter (since I had syllable stress data for each word), I chose to keep all phrases, since filtering would have left me with too few results. (One can see, however, how simply increasing the number of videos would make creating a Shakespearean sonnet feasible.) I then randomly selected a subtitle and a random rhyming subtitle from my dictionary, found the files they originated from, loaded the clips into memory, and saved them into a list. Because the size of the resulting video was constrained by the amount of RAM I had, I later used professional video editing software instead.
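The rhyme dictionary described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original code: `PRON` is a tiny hand-written pronunciation table using ARPAbet-style phonemes with stress digits, standing in for whatever pronunciation data the project used. Two words are treated as rhyming when their phonemes match from the last primary-stressed vowel onward. The `tolerance` parameter is one guess at what "around 10 syllables" means, and the example uses 4-syllable phrases to keep the table small.

```python
import random
import re
from collections import defaultdict

# Hypothetical toy pronunciation table; a trailing digit marks a vowel and its stress.
PRON = {
    "stay": ["S", "T", "EY1"], "for": ["F", "AO1", "R"], "the": ["DH", "AH0"],
    "day": ["D", "EY1"], "he": ["HH", "IY1"], "ran": ["R", "AE1", "N"],
    "away": ["AH0", "W", "EY1"], "turn": ["T", "ER1", "N"], "on": ["AA1", "N"],
    "light": ["L", "AY1", "T"], "night": ["N", "AY1", "T"], "good": ["G", "UH1", "D"],
}


def tokenize(phrase):
    """Lowercase the phrase and split it into words."""
    return re.findall(r"[a-z']+", phrase.lower())


def syllable_count(phrase):
    """Count vowel phonemes; return None if a word is missing from the table."""
    words = tokenize(phrase)
    if not words or any(w not in PRON for w in words):
        return None
    return sum(p[-1].isdigit() for w in words for p in PRON[w])


def rhyme_key(word):
    """Phonemes from the last primary-stressed vowel to the end of the word."""
    phones = PRON[word]
    stressed = [i for i, p in enumerate(phones) if p.endswith("1")]
    return tuple(phones[stressed[-1] if stressed else 0:])


def build_rhyme_map(phrases, target=10, tolerance=1):
    """Map each phrase near `target` syllables to the phrases it rhymes with."""
    groups = defaultdict(list)
    for phrase in phrases:
        count = syllable_count(phrase)
        if count is None or abs(count - target) > tolerance:
            continue
        groups[rhyme_key(tokenize(phrase)[-1])].append(phrase)
    rhyme_map = {}
    for group in groups.values():
        for phrase in group:
            last = tokenize(phrase)[-1]
            # Skip partners ending in the same word: a repeat, not a rhyme.
            partners = [p for p in group if tokenize(p)[-1] != last]
            if partners:
                rhyme_map[phrase] = partners
    return rhyme_map


phrases = ["Stay for the day", "He ran away", "Turn on the light",
           "Stay for the night", "Good night"]
rhyme_map = build_rhyme_map(phrases, target=4, tolerance=0)

# Pick a random subtitle and one of its rhyming partners.
first = random.choice(list(rhyme_map))
second = random.choice(rhyme_map[first])
```

Stress digits are also what a stricter iambic filter would inspect: it would check that the stress pattern of the whole phrase alternates unstressed and stressed syllables before adding the phrase to a group.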
Here are some results: