Aesthetic Considerations
in the Use of “Virtual” Music Instruments
Christopher Dobrian
Music Department
303 Music and Media Building
University of California
Irvine, CA 92697-2775 USA
+1 949 824 7288
[email protected]
ABSTRACT
Computer-mediated music control devices compel us to reexamine the relationship between performer and sound, the
nature and complexity of which is theoretically unlimited.
This essay attempts to formulate some of the key aesthetic
issues raised by the use of new control interfaces in the
development of new musical works and new performance
paradigms: mapping the gesture-sound relationship,
identifying successful uses of “virtual” instruments,
questioning the role of “interactivity” in performance, and
positing future areas of exploration.
Keywords
Music, instruments, control interfaces, interactivity
INTRODUCTION
Much of the work being done with computers and music
involves experimentation with the design and use of new
“controllers”—new interfaces to computer-controlled instruments. This experimentation can be divided roughly into
two types of activity: design of new instruments, and
adaptation of non-music technology for musical use.
In the case of most earnest inventors/builders of new
instruments, the design grows from the urge to “build a
better mousetrap”, to overcome the limitations of
traditional instruments. Most artists who have worked
extensively performing with, and composing for, new
instruments, however, have come to realize that this quest
for the ultimate instrument is a modern-day search for the
mythical holy grail. Any instrument requires considerable
dedicated practice to achieve mastery of it, any instrument
has its own immanent limitations, and any instrument is
only useful to the extent that it serves in producing “good”
music, which still has to emanate from the creative human
spirit.
Those who adapt non-music technology for the control of
music are usually attracted to the
novelty of the interface for musical applications, and the
potential for new relationships between human activity and
resultant sound. Nearly anything that can produce a voltage
is fair game for experimentation as a control interface to a
computer-mediated instrument. In some cases the very
unconventionality of the instrument becomes a theatrical
element in a performance.
In both cases the fundamental technical issue is the
conversion of analog or digital electrical signals into
control data useful for a computerized sound generator.
This requires thoughtful mapping of gesture to
sonic/musical meaning, and ultimately requires
consideration of the creative utility and aesthetic value of
any such mapping.
This article will deal with some of the aesthetic
considerations I have encountered while working with
alternative methods of music control, specifically
referencing the use of video-tracking (using such software
as BigEye and VNS) and motion capture (with a Vicon 8
system) to produce dancer-controlled music. An unfettered
dancer who is directly producing and controlling the
musical events can be considered an extreme case of an
alternative gestural interface: a human moving unrestricted
to perform a “virtual” instrument, one with no tangible
physical interface.
THE GESTURE-SOUND RELATIONSHIP
The Performer-Instrument Relationship
Much of our appreciation of music is in its performance:
the contributions the interpreter makes in terms of
dynamics, rubato, timbre, ornamentation, in some cases
even improvisational decision-making, and notably in this
context, the performer's virtuosity and mastery of the
instrument. Indeed it can be said that a major part of the
drama of, for example, the Bach Chaconne is witnessing
the violinist's mastery of the technical challenges that the
written music presents. We are aware of the skills and
maneuvering required, and when the music flows elegantly
from the instrument we are impressed and enthralled by the
technical success of the performer. Our knowledge of the
instrument also contributes to our appreciation of the
timbres, intonation, and effects the player produces.
When we witness a music performance with a new,
unknown instrument, especially one for which the player-instrument
relationship is obscured by the effects of
software and electronics, the drama of the player's control
of the instrument is different. In this case—as in the case of
witnessing for the first time a performance on an
instrument from a foreign culture—our sense of the
performer-instrument relationship is primarily based on
how we perceive the relationship between the performer's
gestures and the sounds (which we presume to be a
direct result of the gestures).
In traditional instruments—especially percussion and
keyboard instruments—the relationship between gesture
and sound is usually one-to-one: a single action triggers a
single sound, embodied notationally as a single dot on a
page. Computer musicians are too frequently restricted by
this traditional notion when experimenting with new
controllers. In computer-mediated instruments, however, in
which a computer controls the relationship between gesture
and sound generator, a single trigger can have any result.
(The extreme
case is the compact disc player, which permits an entire
Beethoven symphony to be triggered by the flick of a
finger.)
The situation undergoes an interesting reversal when we
witness dancer-controlled music. Traditionally dancers
move in response to a rhythmic stimulus, and the music is
viewed as the independent generator of rhythm which
activates—and to a degree controls—the movement of the
dancer. In some styles, such as flamenco, the musician
reacts to the dancer such that the music is created by a
complex realtime interaction between dancer and musician;
nevertheless, even in this context it is clear that the
musician generates the rhythmic impulses to which the
dancer's movement is synchronized. When the music is
produced by video tracking, motion capture, or another form
of motion sensing, however, these roles are completely
reversed. This raises several new aesthetic questions, which
will be discussed later in this section.
Control Data
Most data received from controllers is one of two types: 1)
individual discrete triggers at specific moments, in
response to a specific action or passage of a threshold (e.g.,
a button is pressed, a contact is made, etc.), or 2) streams
of discrete data representing a sampling of a continuous
phenomenon (e.g., a measurement of the movement of a
potentiometer).
Trigger data is most commonly used to act as a toggle
switch from one state to another, or to enact “note” events
in music. The data may contain descriptors of the number
and type of trigger, as in a MIDI note message which
contains channel, key number, and velocity information.
As noted above, this trigger need not actuate only a single
note or sonic event; it can have any result. It is fairly easy
to obtain trigger data from a control device. The only real
challenges are a) the technical question of how to discern
different types of triggers from a single interface, and b) the
aesthetic question of what the triggers should do sonically.
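As a minimal sketch of how trigger data might be derived from a sensor reading, the following Python fragment emits one trigger each time a signal rises through a threshold. The `NoteTrigger` record, the function name, and the threshold values are illustrative assumptions, not part of any particular controller or of the MIDI specification; the record merely mirrors the channel, key number, and velocity fields mentioned above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class NoteTrigger:
    """Hypothetical trigger record, loosely modeled on a MIDI note message."""
    channel: int   # 0-15
    key: int       # 0-127
    velocity: int  # 1-127
    index: int     # position of the triggering sample in the stream


def detect_triggers(samples: Iterable[float], on_level: float = 0.6,
                    off_level: float = 0.4, channel: int = 0,
                    key: int = 60) -> List[NoteTrigger]:
    """Emit one trigger each time the signal rises through on_level.

    The detector re-arms only after the signal falls below off_level
    (hysteresis), so jitter around the threshold does not retrigger.
    Velocity is taken from the size of the jump since the previous sample.
    """
    triggers = []
    armed = True
    previous = 0.0
    for i, value in enumerate(samples):
        if armed and value >= on_level:
            jump = value - previous
            velocity = max(1, min(127, round(jump * 127)))
            triggers.append(NoteTrigger(channel, key, velocity, i))
            armed = False
        elif not armed and value < off_level:
            armed = True
        previous = value
    return triggers
```

What each trigger then does sonically is left open here, since that is the aesthetic question rather than the technical one.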
If one thinks of a note not as a single static event, but as a
complex evolving sound with its own internal shape—as in
fact almost all notes are, contrary to their simplified
notation—then one realizes that the majority of expressive
potential comes from the continuous control of the note's
timbre and dynamics after its initial trigger. This is one of
the principal values of the use of continuous control data.
Continuous control of electronics can give access to sound
parameters not traditionally available, such as filtering and
modulation, panning and reverberation for localization
effects, and simultaneous realtime control of other related
media in performance such as lighting, animation, or video
processing. Continuous control can also be used over
longer periods of time—over the course of many notes—for
shaping larger formal parameters such as crescendi,
accelerandi, note density, etc.
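The scaling of a continuous control stream onto a sound parameter can be sketched roughly as follows. This is only an illustration under stated assumptions: the input is taken to be already normalized to the range 0 to 1, and the function name and the three curve names are hypothetical rather than drawn from any particular software.

```python
def scale_control(x: float, out_min: float, out_max: float,
                  curve: str = "linear") -> float:
    """Map a normalized control value x in [0, 1] onto a parameter range.

    curve selects the shape of the mapping:
      "linear"      - output rises in direct proportion to x
      "inverse"     - output falls as x rises
      "exponential" - equal steps in x give equal ratios in the output,
                      which suits parameters such as a filter cutoff in Hz
    """
    x = min(max(x, 0.0), 1.0)  # clamp stray sensor values into range
    if curve == "linear":
        return out_min + x * (out_max - out_min)
    if curve == "inverse":
        return out_min + (1.0 - x) * (out_max - out_min)
    if curve == "exponential":
        if out_min <= 0 or out_max <= 0:
            raise ValueError("exponential curve needs positive bounds")
        return out_min * (out_max / out_min) ** x
    raise ValueError(f"unknown curve: {curve}")
```

For example, `scale_control(x, 100.0, 10000.0, "exponential")` would sweep a hypothetical filter cutoff from 100 Hz to 10 kHz as the control moves through its range.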
New Issues in Dancer-Controlled Music
As noted earlier, dancer-controlled music reverses some
traditional roles: dance generates music instead of music
generating dance, and the dancer controls musical
performance (and potentially musical structure and content)
instead of the musician. This raises interesting new aesthetic
questions for designing the dance-music relationship, for
designing the mediating software, and for composing the
music.
—Choreographic conventions and styles have always
developed without one ever needing to be concerned about
their effect on the music. But when the choreography is
concerned with performing the music as well as the dance,
how does (must) traditional choreography change?
Could/should the prospect of dancer-controlled music lead
to a new vocabulary of movement?
—Given that, in the case of video motion-tracking, the
dancer's movement in the two-dimensional video image is
what controls the musical sound, what is the meaningful
language (i.e., the most useful data to be derived) in the 2D
space? Location? Velocity? Acceleration? Proximity? Size?
—At what structural level of the sound does one want the
dancer's control to be oriented? At the “microcosmic”
timbral level, giving subtle expression to sounds by
continuously controlling sonic parameters? At the
“middleground” level, providing pitch, rhythm, and
dynamic information? At the “macrocosmic” level,
providing input parameters for automated algorithmic
music or shaping the formal structure of the piece (note
density, tempo, etc.)?
—In the case of multi-dimensional data input, such as the
Vicon 8's multiple points in 3D space, how does one
manage and map so many simultaneous control parameters?
As a single progression through a multi-dimensional
musical parameter space? As multiple agents in a 3D
parameter space? Can one use this wealth of data to derive
higher-level information about the character of the motion
of the dancer(s), which might give a more direct
interpretation of the intended expressivity of the movement?
These are some of the questions with which one grapples
when designing software and composing music for dancer-
controlled instruments. The next section provides a few
basic observations and suggestions.
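On the question of which data to derive from the 2D image, velocity and acceleration can be estimated from successive tracked locations by finite differences. The sketch below assumes that the tracking software already supplies one (x, y) position per video frame at a known frame rate; the function names and the default of 30 frames per second are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def finite_difference(points: Sequence[Point], dt: float) -> List[Point]:
    """Rate of change between successive 2D points (one fewer item than input)."""
    return [((x1 - x0) / dt, (y1 - y0) / dt)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def motion_features(positions: Sequence[Point],
                    fps: float = 30.0) -> Dict[str, list]:
    """Derive velocity, acceleration, and speed from per-frame positions."""
    dt = 1.0 / fps
    velocity = finite_difference(positions, dt)
    acceleration = finite_difference(velocity, dt)
    speed = [math.hypot(vx, vy) for vx, vy in velocity]  # magnitude only
    return {"velocity": velocity,
            "acceleration": acceleration,
            "speed": speed}
```

Real tracking data is noisy, so in practice some smoothing would probably be needed before differencing; that step is omitted here.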
TENETS AND GUIDELINES FOR VIRTUAL INSTRUMENTS
Simplicity
Because there is no established standard for the relationship
between movement in a virtual space and the musical
results of that movement, mappings of gesture to sound
(programmed into the computer that mediates between the
controller and the sound generator) must be simple and
direct in order for the audience to perceive the cause-effect
relationship.
Other Relationships
The relationship between gesture and sound need not
always be directly proportional. Inverse proportionality,
exponential relationships, and slightly distorted or
not-strictly-linear relationships can also be perceived easily,
and such divergence from the expected direct proportionality
can be satisfying.
Variety
As with anything in art, things that are overly predictable
quickly become tedious. If one is working with simple
gesture-sound relationships, one must recognize that those
relationships can become boring for the audience very
quickly, and must therefore be frequently varied or
changed. The form of the music/dance piece will thus be
influenced to some degree by the nature of the relationships
established by the virtual instrument.
Multiple Simplicities
A single simple gesture-sound relationship may soon seem
simplistic to an audience, but two or more simple
simultaneous relationships established by the mediating
computer can be considerably more engaging. The audience
follows not only the direct correlations, but also the
counterpoint of mappings, that is, the interaction between
the correlations.
Multiple Performers
The complexity for both performers and audience in
perceiving and understanding the workings of the virtual
instrument seems to grow quickly as soon as a second
performer is introduced. Part of the audience's appreciation
of the work is discovering the nature of the virtual
interface, and this is complicated by the uncertainty of
which dancer is causing which sonic result. The issue is
again one of managing counterpoint. For example,
separating the dancers in space (avoiding “voice-crossing”
in counterpoint terminology), and giving the dancers
contrasting movements (independence of contrapuntal
elements) can enhance clarity of understanding. And of
course, when obfuscation is the desired goal one can do the
opposite.
“Intelligent” Mappings
Directly mapping motion to sound (for example, mapping
a dancer's position onto pitches of the chromatic scale) can
be unsatisfying musically because of the lack of musical
“sophistication”, the lack of stylistic reference. One can
lend some measure of “musical culture” to the instrument
by mapping the gesture in a non-linear relationship to the
intended musical material. A linear movement can be
mapped onto a familiar non-linear musical structure (such
as a diatonic scale) and/or onto a non-linear contour. (This
is most easily achieved with table-lookup, or random
perturbations of input data.) Similarly, events triggered by
the dancer can be realtime-quantized to a metric grid or a
desired rhythmic pattern (i.e., a “groove”, to which all
events must conform or be usefully syncopated). With
these techniques one can read through a table of
possibilities that form an inherently strong sequence, and
that can be presented in any rhythm.
Music is not just Pitches
Too frequently computer musicians are content with the
simplest and most banal first-choice mapping
(location-to-pitch) and do not explore more complex and
interesting relationships sufficiently. Continuous control (of
portamento, timbre, dynamics, etc.) is often more
expressive and more satisfying dramatically than simply
mapping motion to pitch.
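As a rough illustration of the table-lookup and quantization techniques mentioned under “Intelligent” Mappings, the following sketch maps a normalized position onto a diatonic scale and delays event times to a metric grid. The scale table, the MIDI-style key numbers, the grid size, and both function names are assumptions chosen for the example, not a description of any existing system.

```python
import math

C_MAJOR = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets within one octave


def position_to_pitch(x: float, low_key: int = 48, octaves: int = 2,
                      scale=C_MAJOR) -> int:
    """Map a normalized position x in [0, 1] onto a scale by table lookup.

    A linear movement across the space yields only pitches of the scale,
    rather than every chromatic step.
    """
    x = min(max(x, 0.0), 1.0)
    steps = len(scale) * octaves             # scale degrees available
    degree = min(int(x * steps), steps - 1)  # x == 1.0 maps to the top degree
    octave, index = divmod(degree, len(scale))
    return low_key + 12 * octave + scale[index]


def quantize_time(t: float, grid: float = 0.25) -> float:
    """Delay an event time (in seconds) to the next point on a metric grid.

    In real time an event can only be postponed, never moved earlier, so
    the time is rounded up; an event already on the grid is left alone.
    """
    return math.ceil(t / grid) * grid
```

Replacing `C_MAJOR` with another table (a mode, a chord, or an arbitrary contour) changes the stylistic reference without changing the gesture.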
Time is Malleable
Introducing delay between gesture and result retains the
simplicity of the correspondence but offsets it in time, to
potentially interesting effect. This can be achieved with
computer scheduling, delay buffers, or even storing input
data and accessing it algorithmically or probabilistically in
the future. Extreme delay, reverse delay, and the capture and
storage of data are potent tools for dealing with
relationships over longer periods of time, to create form in
a composition or improvisation. Combining these
techniques helps one create works with an “open”
(indeterminate) form.
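A delay buffer of the kind mentioned above might be sketched as follows. The class name is hypothetical, and the delay is measured in frames of control data (an assumption) rather than in seconds.

```python
from collections import deque


class ControlDelay:
    """Delay a stream of control values by a fixed number of frames."""

    def __init__(self, delay_frames: int, initial=None):
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        # Pre-fill the buffer so the first outputs are the initial value.
        self._buffer = deque([initial] * delay_frames)

    def process(self, value):
        """Store the newest value and return the one from delay_frames ago."""
        self._buffer.append(value)
        return self._buffer.popleft()
```

Feeding each incoming gesture value through `process` leaves the gesture-sound correspondence intact but shifts it in time; a much longer buffer, read back selectively rather than in order, would move toward the capture-and-storage techniques described above.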
THE QUESTION OF INTERACTIVITY
What is Interactivity?
Interactivity is a term too often employed to describe any
use of a computer in live performance or installation. A
computer might act independently, or might react to
human actions (responding slavishly to triggers, or tracking
continuous input), but this is not interactivity. The prefix
inter- implies that both human and computer can act
independently and react responsively to the actions of the
other. Thus, true interactivity must involve mutual
influence, and cannot be entirely deterministically programmed.
In a truly interactive instrument, the computer will have the
capability to act independently and to react indeterminately
to input. These characteristics are inherently contrary to an
attempt to produce a fully controlled, determinate,
predictable work of music. One can program an instrument
that responds in a known manner to all likely input data,
but that is just reactive, not interactive.
A truly interactive instrument must have the capability to
respond to input that is not previously known to it (i.e., is
not pre-programmed in a knowledge base, nor handled with
a fully deterministic algorithm), and must be capable of
producing results that are not fully predictable. In other
words, the computer must be able to respond appropriately
to improvisation, and must itself be able to improvise.
This implies that the instrument must not only receive
data, but must have at least rudimentary cognitive ability,
in order to make “musical sense” of the data it receives.
The logical conclusion is that interactive instruments
encourage, and indeed are most appropriate for,
improvisatory music; it is therefore almost anachronistic
to think of using an interactive system in a fixed piece.
If, as asserted earlier, the drama of musical performance
depends at least in part on the interaction of performer and
instrument, and if an interactive instrument must contain
elements of unpredictability, the performer must have
worked sufficiently with the instrument to be able to
improvise with it in an interesting way. Obviously, then,
working with an interactive instrument requires no less
virtuosity and no less rehearsal than any other sort of
improvisation.
Audience Participation
Some have argued that it is less interesting to watch a
performance on an interactive instrument, because the
gesture-sound relationship can be so complex as to be
incomprehensible, and in such a case it becomes an
improvisation that is interesting only to the performer. Part
of the answer to this charge is for the composer,
programmer, and performer to find the appropriate balance
of complexity and comprehensibility (as in any musical
work). But also, the ability of an interactive instrument to
respond to unforeseen input makes such an instrument ideal
for works which incorporate audience participation rather
than passive audience observation. This is already being
actively explored by installation artists. The potential for
participatory musical performance has been insufficiently
explored in the computer music community.
In conceiving works that incorporate audience participation,
the problem for the composer/programmer is how to create
an open form in which the music or dance can be varied
freely within certain parameters, providing a compelling
experience of interactivity for the audience, but in a manner
that can somehow still be “guaranteed” to work artistically.
If the audience controls the piece, one might wonder, how
can you “guarantee” that it will still be artistically
compelling? Composers may be afraid to relinquish full
control of a piece by allowing improvisation to play a large
role in it, and it's difficult to conceive of composing a
piece that successfully incorporates interactive control by an
unknown audience. But first of all, how certain are we that
compositional determinism of form and content is the main
reason for the success of a music performance? We have
certainly all witnessed bad, lifeless performances of well-written
music, and we have also witnessed plenty of
compelling improvisations. The conditions that frame a
performance, and the expressive and creative input of the
performers, can be enough to create good music in a variety
of forms and with a wide variety of content. And why
should we apply traditional criteria of what constitutes a
rewarding artistic experience for an audience, in this new
case of audience interaction? The old model is based on the
audience as passive observers of music-making. This new
model proposes audience members as active participants in
the music-making, interacting with intelligent control
systems.
To summarize, true interactivity demands that both human
and computer engage in both original action and responsive
reaction, to create mutual influence. The computer's ability
to do these things in real time demands that the human
performer also do them in real time, that is, improvise. An
improvisation with an interactive instrument may be more
interesting to do than to watch; this implies that audience
participation may be in order.
AREAS FOR FUTURE EXPLORATION
In addition to the possibilities for virtual instruments and
the new exigencies of interactivity outlined above, working
with virtual instruments, video motion-tracking, and
dancer-controlled music provides many other new avenues
of exploration.
—Often discussion of alternative controllers and interactive
instruments focuses too narrowly on control of pitch
material and traditional music constructs. But digital sound
generators open up the music to a whole world of sound.
Digital sampling of recorded sound (pre-recorded or
captured in real time) allows one to explore other
relationships, such as gesture-to-text.
—Given that a virtual instrument (or any computer-mediated
instrument) is just a controller of numerical data,
the controller can be used to shape other digital media. One
can thus explore other relationships, such as gesture-to-object
movement (video-controlled animation), and even
gesture-to-image/video (which, in the case of motion
tracking, is video-controlled video).
—New inexpensive wireless cameras present many
promising possibilities, such as dancers carrying cameras or
wearing cameras attached to their bodies. In this way the
interface to a sensing program such as VNS can move
about the space, personally directed by the performers.
—As noted earlier, controllers can influence not just notes,
but internal aspects of notes (timbre, dynamics, etc.) and
new musical parameters unique to electronic music
(modulation, filtering, spatial location, granular note
density, etc.). Employing scheduling and storage
techniques (extreme delay, capture and storage of data,
reordering of events, etc.) one can shape a larger formal
structure in real time.
CONCLUSION
The discourse regarding the design of new interfaces for
music mostly focuses on technical issues and engineering
challenges. A fascination with novelty drives not only the
design of the instruments, but also the way they are used.
But this is no longer such a new field that novelty alone
can suffice. It is necessary to analyze the instruments'
effectiveness in terms of their artistic usage, and it is time for
the musicians who work with them to discuss what they
have learned up to this point. This article reflects my
attempt to categorize and represent some of my recent
confrontations with compositional and programming
problems while working with interactive virtual computer-mediated
instruments.