Alexa Practice Skill Feedback

A summary of the feedback I got for the Alexa Practice Skill:

  1. Most people think it is very practical and useful
  2. The skill could offer musicians more feedback for improvement
  3. The practice flow could be improved
  4. There should be more documentation
  5. The tutoring aspect works well for all ages

Let’s Practice: Alexa Music Practice Tool

  1. With Abby Lannan, a graduate student in euphonium performance at CMU, I made an Alexa skill that helps you practice music in various ways. It lets you play along with other people’s recordings or upload your own music. It helps you with ear training, intonation, and rhythm accuracy. We pitched this project at Amazon’s Alexa Day as a future Alexa project.

It tracks users’ practice progress and helps them achieve their practice goals. It synthesizes sounds on a cloud server based on voice commands.

Basic Functions:

Skill Development: It has a metronome and drone that offer a variety of sounds. It also allows users to upload recordings to play music with each other.

Tutoring: It has an ear-training game that teaches music theory.

Record Keeping: It saves user info in a cloud database and allows users to track long-term practice progress.

Detailed slides are attached here:

2. Also, I attempted to make an Alexa skill that emails you a PDF of pictures that ‘will make you happy.’ First, you tell Alexa what makes you feel happy or grateful. Then she sends you a PDF to remind you of the happy things in the world. The results look pretty much like trash.

[hands, anime, butt, beach]

[spaghetti, good weather, chris johanson, music]



I visited a Funky Forest installation at One Dome in San Francisco. This multi-user installation invites people to join in an immersive virtual space to create trees with their bodies, and interact with the forest creatures. It uses multiple Kinects.

Tiles of Virtual Space is an “infinite mirror”-like space that visualizes sound patterns that are generated by movements. It uses Kinect to capture multiple people’s movements.





My original goal was to create a visual generator that uses a keyboard interface as its only input and takes advantage of a MIDI sequencer to sequence visuals. I started by exploring Jitter in Max/MSP and ended up creating two projects that fall somewhat short of that goal.

The first project, “keyboard oscilloscope,” captures my effort to associate MIDI input with simple geometric shapes. Each additional note increases the number of cubes that form a ring; the ring’s x position is tied to a low-frequency oscillator modulating an oscillator, and its y position to another LFO modulating a low-pass filter. What I found interesting about this oscilloscope is that, as we can see and hear in the video, when the modulation rate increases we start to hear a beating effect, and the changing visuals align with the frame rate and appear “static.” Since I generated everything algorithmically (the colors, the positions of the cubes, the number of cubes), it became challenging to go on and implement the sequencer functionality.
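The “static” appearance is ordinary temporal aliasing: once the LFO’s rate approaches the rendering frame rate, successive frames sample nearly the same phase. A minimal Python sketch of the idea (the frame rate and frequencies here are assumptions for illustration, not values from the Max/MSP patch):

```python
import math

FRAME_RATE = 30.0  # assumed display frame rate, frames per second

def sampled_lfo(freq_hz, n_frames):
    """Sample a sine LFO once per rendered frame."""
    return [math.sin(2 * math.pi * freq_hz * (i / FRAME_RATE))
            for i in range(n_frames)]

# A slow LFO moves visibly from frame to frame...
slow = sampled_lfo(0.5, 5)
# ...but an LFO at exactly the frame rate hits the same phase
# every frame, so the rendered ring looks frozen.
aliased = sampled_lfo(30.0, 5)
print(max(aliased) - min(aliased))  # ≈ 0: no visible motion
```

The audible beating and the visual freeze are two faces of the same phenomenon: the ear hears the difference frequency, while the display samples it.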

However, I really wanted to build a visual machine that could potentially become a visual sequencer. So I created “weird reality,” a virtual space containing 64 floating spheres, each corresponding to a different sine wave. “weird reality” has three modes:

1. Manual mode: manually drag the spheres around and hear the rising and falling of sine waves;

2. Demon mode: the spheres automatically go up and down;

3. Weird mode: the world has an invisible force field that can only be traced by spheres that move around it. The world sometimes changes its perspective and rotates around, and that is when the force field is traced by all the spheres.
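As a sketch of how each of the 64 spheres could drive its own sine wave, here is a hypothetical height-to-frequency mapping in Python. The frequency range and the exponential curve are my assumptions, not the project’s actual values:

```python
N_SPHERES = 64
FREQ_MIN, FREQ_MAX = 110.0, 880.0  # assumed frequency range in Hz

def sphere_to_freq(y, y_min=0.0, y_max=1.0):
    """Map a sphere's vertical position to a sine-wave frequency.
    Exponential mapping, so equal drags feel like equal pitch steps."""
    t = (y - y_min) / (y_max - y_min)
    return FREQ_MIN * (FREQ_MAX / FREQ_MIN) ** t

# Spread the spheres evenly over the height range:
freqs = [sphere_to_freq(i / (N_SPHERES - 1)) for i in range(N_SPHERES)]
print(round(freqs[0]), round(freqs[-1]))  # 110 880
```

An exponential rather than linear mapping is the usual choice for pitch, since pitch perception is logarithmic in frequency.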


I made a “texting” app that reads your text out loud based on your facial expression. I was prompted by the thought that texting software allows people to be emotionally absent. But what if texting apps required users to be emotionally present all the time, by reading your text out loud, or even sending your text along with the face you made while typing behind the screen?

I started by exploring the iPhone X face-tracking ARKit. Then I combined facial features with the speech synthesizer by manipulating the pitch and rate of each sentence. Things about your face that change the sound include:

rounder eyes – slower speech

squintier eyes – faster speech

more smiley – higher pitch

more frowny – lower pitch

jaw wide open – inserts “haha”

tongue out – inserts “hello what’s up”

wink – inserts “hello sexy”
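The mapping above can be sketched as a single function. This is a hypothetical Python reconstruction; the real app runs in Swift on ARKit face tracking and a speech synthesizer, and the thresholds and scale factors here are illustrative guesses:

```python
def speech_params(eye_wide, eye_squint, smile, frown,
                  jaw_open, tongue_out, wink):
    """Map facial blend-shape values (0.0-1.0, as ARKit reports them)
    to a speech rate/pitch and inserted phrases. All constants are
    illustrative, not the app's actual tuning."""
    rate = 1.0 - 0.5 * eye_wide + 0.5 * eye_squint   # round eyes slow it down
    pitch = 1.0 + 0.5 * smile - 0.5 * frown          # smiling raises pitch
    prefix = []
    if jaw_open > 0.7:
        prefix.append("haha")
    if tongue_out > 0.7:
        prefix.append("hello what's up")
    if wink > 0.7:
        prefix.append("hello sexy")
    return rate, pitch, prefix

# Wide round eyes, big smile, winking:
rate, pitch, prefix = speech_params(0.9, 0.0, 0.8, 0.0, 0.0, 0.0, 0.9)
print(rate, pitch, prefix)
```

Keeping the mapping in one pure function like this makes it easy to flip a direction (as I did with the eye-roundness rule) without touching the rendering or speech code.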


Through lots of trial and error, I changed many things. For example, I initially had: rounder eyes – faster speech, and vice versa. But during testing, I found that the opposite mapping felt more natural…

My performance is a screen recording of me using the app.


In the final version, I added a correlation between the hue of the glasses and expression: the sum of pitch and rate changes the hue.


Credits to Gray Crawford, who helped me extensively with the Xcode visual elements!


Silk is an interactive work of generative art.

It is a website that allows users to create organic shapes with minimal mouse control.

What I appreciate about this generative drawing tool is how ACCESSIBLE (easy to use, easy to access) and how well BALANCED it is: it turns simple strokes into mesmerizing, complex, colorful visuals, yet it gives me enough freedom that I do not feel restrained by its power. I like how this tool represents the immense power of simple ideas. It matches my personal goal of delivering powerful messages with simple concepts. It is a simple concept WELL DONE.

However, what I do not like about this tool (though I do not yet have a solution) is how little personal connection I feel toward “my creation”: all products look pretty much the same (same style, same feel, same mechanics). Although there’s much more to explore, I quickly get bored with it.

It is a tool created 8 years ago by Yuri Vishnevsky with sound designer Mat Jarvis.


Inspired by the pendulum-wave effect, I made an iPhone app in Unity that simulates it, using the phone’s accelerometer.

My goals were to recreate the visual effect of the diverging/converging patterns of pendulums swinging at different frequencies, and to introduce interaction through the phone’s accelerometer. The final product does not recreate the diverging/converging patterns, yet it presents a mesmerizing wave pattern through the threads of the pendulums and produces sound that corresponds to the pattern.
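For reference, the classic pendulum-wave effect comes from tuning each pendulum to complete exactly one more oscillation per cycle than its neighbor, so the row drifts out of phase and then snaps back into alignment. A Python sketch of that tuning (the pendulum count, cycle length, and base oscillation count are assumed values, not the app’s):

```python
import math

N = 15          # number of pendulums (assumed)
CYCLE = 60.0    # seconds until all pendulums realign (assumed)
BASE_OSC = 51   # oscillations of the slowest pendulum per cycle (assumed)

def angle(i, t, amplitude=0.3):
    """Angle of pendulum i at time t: pendulum i completes
    BASE_OSC + i full swings per CYCLE, so neighbors drift
    apart and realign every CYCLE seconds."""
    freq = (BASE_OSC + i) / CYCLE          # oscillations per second
    return amplitude * math.cos(2 * math.pi * freq * t)

# At t = 0 and t = CYCLE, every pendulum is at the same angle:
aligned = [angle(i, CYCLE) for i in range(N)]
print(max(aligned) - min(aligned))  # ≈ 0
```

In a physical rig the periods are set by adjusting string lengths; in a simulation you can simply assign each pendulum its frequency directly, as above.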

My first iteration:

First iteration

My second iteration:

Second iteration

Although not originally planned, I connected the pendulums and the anchor with gradient threads, which created a very mesmerizing wave-like effect. For a more interesting interactive experience, I added sound, such that each pendulum makes a click when it returns to the center axis.

The final version:

Final version

Jacqui Fashimpaur and Alexander Woskob helped me with Unity.


ANIMA II by Nick Verstand

ANIMA II GIF animation

ANIMA II (2017) by Nick Verstand is the second version of an earlier work, ANIMA (2014). ANIMA II is inspired by the four-thousand-year-old Chinese philosophy of “Wu Xing,” the “Five Elements” of the universe, also understood as the ever-evolving “Five Stages” through which the universe moves: metal, wood, water, fire, and earth. The system of “Wu Xing” describes the interactions and relationships between phenomena, whether natural phenomena or the interplay between the internal and external self. By balancing the five qualities, one is able to actualize their inner self.

ANIMA II by Nick Verstand

I was at the premiere exhibition of this piece after reading about it. This audio-visual piece strikes me as extremely organic, peaceful, and engaging. The globe has an internal hemispherical projector that projects algorithmically generated fluid visuals transitioning between the five stages. The visuals are accompanied by a spatial sound composition constructed from recordings of the corresponding five elements in nature. The globe also responds to people approaching: three Kinect sensors track visitors’ locations and make the fluid diffuse faster or slower accordingly.

The work was created by a group of people and studios over several years, using a projector, a hemispherical lens, an 8.1 speaker system, 4DSOUND software, and openFrameworks.


1. If we define “game” as a formal, closed-form system with a deliberate set of rules and mechanics, the first proposition suggests that critical play exposes and examines dominant values through a game’s internal design: rules, environments, messages, culture, and so on.

2. The second proposition suggests an approach to critical play that challenges the traditional “forms” of games, to evoke surprises that lead to discussion of larger social issues.

3. The third proposition suggests an even more extreme approach: building a counter-intuitive game mechanism.

I found the first two propositions relevant to my goals: creating novel, fun, and meaningful interactive experiences that are accessible and intimate. I do not wish to create counterintuitive games merely for the sake of provoking thought (the third proposition). However, for my experiences to be novel and fun, they can take unusual forms that bring positive surprises (the second proposition). The first proposition is crucial. I don’t want any of my work to become a technical demonstration, so it’s important that my work is built on messages and values I want to convey. By designing a novel set of rules within an experience that I create, for example, generating music from subtle body movements, I can bring an intimate experience where the translation from movement to sound can be felt and enjoyed.