Understanding Music Visually - Feedback and Feedforward

Introduction and background

Human emotions are expressed through musical performances, both for the musician and for their audience (Gabrielsson & Juslin, 1996). Musical performances typically combine instruments played and vocals sung by people. Traditional instruments take the form of physical artefacts of varied shapes and sizes that are played by striking, strumming, or blowing to produce different sounds, and a performance usually comprises multiple instruments played in harmony. While playing traditional instruments such as the guitar or the piano requires skill, technology looks to narrow that gap by transforming traditional instruments into digital equivalents.

Technology has enabled people to create music once generated only by traditional instruments using computers and even handheld devices in the form of applications. In HCI terms, this technology relied on the traditional interaction paradigms of 'WIMP' and mobile interfaces. The evolution of technology has given rise to more novel ways of interacting with it, forming newer paradigms. NIME (New Interfaces for Musical Expression) looks to produce such a paradigm shift by enabling music creation in unique interactive ways. These new interfaces take varied forms and are bound to no strict platform. Reactable (Jordà et al., 2017) is an example that uses physical blocks and a table to interactively create music. Another example is a project by Cinimod Studio (Cinimod Studio, 2017a) called 'Aurora', which uses gestures without any physical artefacts to create music while providing a dynamic visual experience.

SoundScape is a new musical interface that uses Lycra® as its surface: people create music by pushing and gesturing on the fabric, and these gestures produce visualizations scaled to the extent of each push or gesture. SoundScape is an iteration of SoundGarden, a concept created for the Physical Computing course at the University of Queensland in the first half of 2017.

Traditional instruments keep us constantly aware of what we can do next, whether the next key to press, the next fret to hold, or the next string to strike, by maintaining a sense of visible awareness. Playing traditional instruments produces organic sounds and haptic feedback, while newer forms of technology such as NIME fail to reproduce haptics as accurately as traditional instruments (Jordà, 2003). While feedback mechanisms are a challenge for NIME, there are also challenges with feedforward mechanisms (Dobrian & Koppelman, 2006), that is, with understanding what one can do next. Dobrian and Koppelman (2006) highlight this by explaining the visual information traditionally used to show people 'what can be done' and 'what just happened'. Jordà (2003) highlights the value of visualization in performances as a means to enhance the musician's and the audience's experiences. It has also been found that visualizations positively improve experiences when displayed in accordance with emotions (Chen, Weng, Jeng, & Chuang, 2008).

This report illustrates a few examples of existing projects, compares them to SoundScape, and explains their feedback and feedforward mechanisms in terms of visualizations. People's feedback will also illustrate the extent to which SoundScape succeeded in implementing both mechanisms.


Existing designs

Fire & Ice by Cinimod

Fire & Ice is an installation created by Cinimod (2017a) that allows people to use gestures to create dynamic visualisations. The sounds created in the installation depict fire and ice. A maximum of two people can simultaneously take the roles of fire and ice respectively, moving their arms to create dynamic patterns out of the visualizations. Fire & Ice uses a Kinect sensor to track people's arm movements and manipulate the visualizations, while the visualizations themselves are displayed on a flat LED screen. The main aspect of this installation is to provide dynamic visualizations both to the people who create them and to those who watch. To attain maximum system efficiency, it was important to produce the visualizations with minimum latency.
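The report does not describe how Cinimod achieved this low latency, but a common way to keep gesture-driven visuals responsive while taming tracker noise is a lightweight exponential smoothing filter. The sketch below is an illustration of that general technique only, assuming hypothetical normalized 2-D hand coordinates from a Kinect-style tracker; the SmoothedJoint class and the alpha value are inventions for the example.

```python
# Illustrative sketch only: exponential smoothing of noisy hand positions.
# A small alpha smooths heavily but adds perceived lag; a large alpha
# follows the hand closely at the cost of jitter. Coordinates are
# hypothetical normalized (0..1) values from a Kinect-style tracker.

class SmoothedJoint:
    def __init__(self, alpha: float = 0.6):
        self.alpha = alpha          # responsiveness/latency trade-off
        self.x = None
        self.y = None

    def update(self, raw_x: float, raw_y: float) -> tuple[float, float]:
        if self.x is None:          # first sample: no history to blend
            self.x, self.y = raw_x, raw_y
        else:
            self.x += self.alpha * (raw_x - self.x)
            self.y += self.alpha * (raw_y - self.y)
        return self.x, self.y

# Example: feed raw tracker samples, draw visuals at the smoothed point.
hand = SmoothedJoint(alpha=0.6)
for raw in [(0.50, 0.50), (0.52, 0.49), (0.70, 0.55)]:
    print(hand.update(*raw))
```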

The feedback mechanism of this system is audio-visual but predominantly visual, as people gave more weight to what they saw when interacting with the installation. Research has identified people's reliance on visual information when making judgements about music (Tsay, 2013). A limitation of Fire & Ice as an installation was the lack of cues on how to interact with the system.

Figure: Fire & Ice by Cinimod
Wood wall for inside Rolls-Royce by Cinimod

The wood wall is an interactive installation created by Cinimod for Rolls-Royce to portray the wooden interiors of Rolls-Royce cars. The interface comprised a flexible screen that could be pushed and gestured on to produce a multitude of visual patterns. The installation afforded more than two people interacting on the same screen, producing different patterns based upon the number of simultaneous interactions. Pushes were mapped using a Kinect sensor behind the screen, and a projector created the visualizations. In addition to the visualizations, the system provided an orchestral soundtrack, allowing people to manipulate the sounds while interacting with the visuals.
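The report does not detail Cinimod's tracking, but one plausible way to sense pushes into a flexible screen with a depth camera behind it is to flag pixels that move closer than a rest-depth threshold. The sketch below illustrates that idea only; the rest depth, threshold, and the tiny depth frame are all invented values, not measurements from the installation.

```python
# Illustrative only: detecting pushes into a flexible screen from a
# depth camera behind it. A push moves fabric closer to the camera, so
# pixels nearer than a rest-depth threshold are treated as touch points.

REST_DEPTH_MM = 1500     # assumed distance to the fabric at rest
PUSH_THRESHOLD_MM = 40   # how far fabric must deform to count as a push

def find_pushes(depth_frame):
    """Return (row, col, push depth) for every pixel past the threshold."""
    pushes = []
    for r, row in enumerate(depth_frame):
        for c, d in enumerate(row):
            if d < REST_DEPTH_MM - PUSH_THRESHOLD_MM:
                pushes.append((r, c, REST_DEPTH_MM - d))
    return pushes

frame = [[1500, 1500, 1500, 1500],
         [1500, 1430, 1390, 1500],   # a push deforming two pixels
         [1500, 1500, 1500, 1500],
         [1500, 1500, 1500, 1500]]
print(find_pushes(frame))            # -> [(1, 1, 70), (1, 2, 110)]
```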

The feedback the installation provided was somewhat ambiguous because the patterns generated by pushes were unpredictable. However, the objective of the design itself could have been to make people interact and explore newer patterns. In terms of limitations, the system did not allow people to clearly understand what they could do or what they had done.

Figure: Wood wall for inside Rolls-Royce by Cinimod
Aurora by Cinimod

Aurora is an installation developed by Cinimod that allows people to experience music creation in a gestural format. While the objective of this installation was to create an open-ended playful experience, Aurora affords only one person interacting at a time: the person stands in the centre of the installation and moves their arms around to create music. The visuals react to their arm movements to give them a sense of feedback, and along with the visualizations, the generated music is manipulated.

The feedback to people is audio-visual, but the visual cues supersede the audio cues as people use the visualization to understand 'what they did'. While the interaction modes of this installation are instructing, exploring, and manipulating, the installation does little to show 'what can possibly be done', highlighting its limitation in terms of feedforward mechanisms.

Figure: Aurora by Cinimod
Reactable

A more famous and more successfully implemented example of a NIME is the Reactable. The Reactable is a tangible interface that allows people to control music using physical artefacts on a table. Placing a physical object on the table creates a visualization and a sound. Connections can be created by placing multiple objects on the table to create compound sounds; the connections are shown as different waveforms to differentiate between them, and they can also be manipulated visually. To manipulate the music, people follow the visual cues and change them by moving the objects. The success of the Reactable has resulted in it being recreated for more commercial purposes in the form of applications for iPads and iPhones.

The affordances created by using tangible objects, combined with visualizations for adjusting aspects of the sound, are a good example of both feedback and feedforward. Placing an object creates a ring of controls around it, and turning the object adjusts the different filters, levels, and sound frequencies. People can maintain a sense of control over the system with the support of the visualizations.
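As a conceptual sketch of this place-and-turn interaction (not the Reactable's actual implementation), the example below models a table object whose rotation sets a filter cutoff; the class, the cutoff range, and the angle mapping are invented for illustration.

```python
# Conceptual sketch of Reactable-style feedforward/feedback: placing an
# object exposes a ring of controls, and rotating it adjusts a parameter.
# The angle-to-cutoff mapping here is invented for the example.

import math

class TangibleFilter:
    """A table object whose rotation sets a low-pass filter cutoff."""
    MIN_HZ, MAX_HZ = 80.0, 8000.0

    def __init__(self, x: float, y: float):
        self.x, self.y = x, y       # position on the table surface
        self.angle = 0.0            # rotation in radians

    def rotate(self, delta: float) -> float:
        self.angle = (self.angle + delta) % (2 * math.pi)
        # Map one full turn onto the cutoff range; the ring drawn around
        # the object would reflect this value visually as feedback.
        t = self.angle / (2 * math.pi)
        return self.MIN_HZ + t * (self.MAX_HZ - self.MIN_HZ)

obj = TangibleFilter(x=0.3, y=0.7)
print(f"cutoff after a quarter turn: {obj.rotate(math.pi / 2):.0f} Hz")
```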

Figure: Reactable

Visual feedback for musical performance

Figure: People interacting with SoundScape at the El Grande Christmas Party

While visualizations provide feedback to people in different ways, they all enhance the overall experience for those composing the soundtrack. However, the success of SoundScape depends upon people's feedback. For this purpose, feedback was obtained from people to understand how well these mechanisms aided them in producing coherent music. The feedback obtained is divided into pre-performance and post-performance feedback to detail how the feedback mechanisms evolved through the process, along with their constraints, success factors, and limitations.


People’s feedback and future directions

The feedback mechanisms of SoundScape were both auditory and visual. Prior to the prototyping process, initial discussions with participants indicated the need to migrate from an iPad to a larger, more dynamic interface. This was highlighted when participants performed in previous ensembles looking down at the iPad rather than at their audiences. During the initial prototyping stage, participants shared the need for visualizations that felt organic; in addition, the students associated the colours of the visualizations with the instrument and the sound. Students' feedback and previous literature identified the need for visualizations to create a better experience both for performers and for audiences.

Later in the development process, the students were asked questions regarding the different system states to ascertain feedback and feedforward mechanisms. Questionnaires were constructed to determine the different states, and students explained the need for the display to show nothing in its blank state. This was found to be similar to the light installation Sobaine by Cinimod Studio (Cinimod Studio, 2017b). Touching or otherwise interacting with the screen then activated it. Students wanted to maintain a sense of mystery towards the audience by hiding certain visualizations and making the interactions 'magical', thereby deliberately removing feedforward mechanisms. Students also wanted visualizations that scaled as they pushed, with tiny particles emitted from the parent particle, and that created a trail when they gestured across the interface, quite similar to Sobaine.
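These behaviours can be summarised as a small set of system states. The sketch below is an assumption-laden illustration of the behaviour described above; the state names and transition conditions are not drawn from SoundScape's actual code.

```python
# Hedged sketch of the system states described above: a blank idle state,
# activation on first touch, and particle trails while gesturing.

from enum import Enum, auto

class State(Enum):
    BLANK = auto()      # nothing shown, preserving the 'magical' reveal
    ACTIVE = auto()     # touch detected: parent particle visible
    TRAILING = auto()   # sustained gesture: emit a particle trail

def next_state(touch: bool, moving: bool) -> State:
    """In this simple sketch the state depends only on the current input."""
    if not touch:
        return State.BLANK
    if moving:
        return State.TRAILING
    return State.ACTIVE

# Walking through a touch-then-gesture-then-release sequence:
for touch, moving in [(True, False), (True, True), (False, False)]:
    print(next_state(touch, moving).name)   # ACTIVE, TRAILING, BLANK
```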

The final prototype of SoundScape displayed small visualizations at each sensor to give performers a sense of feedback about their gestures. The scaling of each visual's size enabled them to understand that changes had been made, consistent with the literature on people's reliance on visual rather than auditory cues to ascertain system status. Three of the four participants felt that the visuals helped tell them where they had pressed, how much they had pressed, and what the mapped effects were, while some audio effects were not clearly audible.
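A minimal sketch of this press-to-scale mapping is given below, assuming a hypothetical 10-bit analog sensor (0 to 1023) behind each point of the Lycra surface; SoundScape's actual sensor range, baseline, and mapping curve are not documented in this report.

```python
# Minimal sketch: map a hypothetical 10-bit push reading to the diameter
# of the on-screen particle, so a harder push yields a larger visual.

def visual_scale(reading: int, baseline: int = 60,
                 max_reading: int = 1023,
                 min_px: float = 8.0, max_px: float = 120.0) -> float:
    """Map a push reading to the particle diameter in pixels."""
    depth = max(0, reading - baseline)           # ignore sensor noise at rest
    t = min(1.0, depth / (max_reading - baseline))
    return min_px + t * (max_px - min_px)

for r in (40, 300, 1023):                        # rest, light push, hard push
    print(r, round(visual_scale(r), 1))
```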

After the performance, during the innovation showcase, further feedback was obtained from people who had no previous knowledge of the interface. Participants were found looking at the screen wondering what needed to be done. While some participants hovered their hands over the interface expecting something to happen, others touched the interface expecting a response. One participant could not clearly hear the audio, was only able to ascertain changes visually, and asked for better audio-visual feedback. The performer who played at the ensemble, however, highlighted the value of the visuals for the interface both as feedback and as a visual spectacle, showing that the feedback mechanism worked for the performer.

To make this installation more publicly viable, instructions must be made clearer by establishing a more solid visual feedforward mechanism, building on feedback from participants at the innovation showcase. The next iteration of SoundScape intends to build on existing projects and people's feedback to produce trails where people interact, change the visualizations to a more fluid style, and establish visual cues for what can be done. In the future, SoundScape may incorporate ultrahaptics to create a new sensory experience of tangibly feeling sound through touch (Long, Seah, Carter, & Subramanian, 2014; Obrist, Subramanian, Gatti, Long, & Carter, 2015).


References

Chen, C.-H., Weng, M.-F., Jeng, S.-K., & Chuang, Y.-Y. (2008). Emotion-based music visualization using photos. Advances in Multimedia Modeling, 358–368.

Cinimod Studio. (2017a). Cinimod Studio. Retrieved November 14, 2017, from http://www.cinimodstudio.com

Cinimod Studio. (2017b). Sobaine - Cinimod Studio. Retrieved November 20, 2017, from http://www.cinimodstudio.com

Dobrian, C., & Koppelman, D. (2006). The “E” in NIME: Musical Expression with New Computer Interfaces. In Proceedings of the 2006 Conference on New Interfaces for Musical Expression (pp. 277–282). Paris, France: IRCAM — Centre Pompidou. Retrieved from http://dl.acm.org/citation.cfm?id=1142215.1142283

Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer’s intention and the listener’s experience. Psychology of Music, 24(1), 68–91.

Jordà, S. (2003). Interactive music systems for everyone: exploring visual feedback as a way for creating more intuitive, efficient and learnable instruments. In Proceedings of the Stockholm Music Acoustics Conference (SMAC03), Stockholm, Sweden (p. 44).

Jordà, S., Geiger, G., Kaltenbrunner, M., & Alonso, M. (2017). Reactable. Retrieved November 14, 2017, from http://reactable.com/

Long, B., Seah, S. A., Carter, T., & Subramanian, S. (2014). Rendering volumetric haptic shapes in mid-air using ultrasound. ACM Transactions on Graphics (TOG), 33(6), 181.

Obrist, M., Subramanian, S., Gatti, E., Long, B., & Carter, T. (2015). Emotions mediated through mid-air haptics. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (pp. 2053–2062). ACM.

Tsay, C.-J. (2013). Sight over sound in the judgment of music performance. Proceedings of the National Academy of Sciences, 110(36), 14580–14585. https://doi.org/10.1073/pnas.1221454110