
Interactive Music in Videogames: We look at the options, the methods and the impact

By West B. Latta

Whether you’re a game developer, a game player, or someone with only a passing interest, the growth and advancement of the video game industry over the past decade is plain to see. Robust graphics systems, ample disc space, bountiful system memory, and dedicated DSP hardware have all become increasingly common on today’s game platforms.

While this continues to drive the look, feel, gameplay, and sound of games, high-profile, large-budget games have, to a large degree, increasingly looked to film as their benchmark for quality. Achieving a truly ‘cinematic’ feel seems to be the hallmark of what we now consider ‘AAA’ games.

As game technology progresses, it is useful to look not only at the ways in which the technology itself has improved, but also at how design and artistic approaches have changed along with it. With regard to music, what is it, specifically, about cinematic music that works so well? In this brief article, we’ll take a look at how changing technology has altered our perception and application of what music in games should be.

Where We’ve Been

In the early years of games, music was predominantly relegated to relatively short background loops, generated by on-board synthesizer chips and various systems of musical ‘control data’ that triggered pre-scripted sequences. While not unlike our use of MIDI today, these systems were typically proprietary, and learning their language and programming was no mean feat for a workaday composer.

And yet, these were the ‘iconic’ years for video game music – when the Super Mario jingle, the Zelda theme, and many other melody-heavy tunes were indelibly imprinted on the minds of a generation. The limitations of the sound systems in these consoles were, in themselves, a barrier to creating anything other than relatively simple, catchy tunes.

As we progressed into the mid and late 1990s, technology afforded us higher quality sound – with higher voice counts, FM synthesis and even sample playback through the use of wavetable soundcards. Though the sounds were often highly compressed, the playback of real, recorded audio was a leap forward for home consoles and computer games. PCs and even some consoles moved to MIDI-based or tracker-based musical systems, which were somewhat easier to compose for than their predecessors. Even so, musical soundtracks didn’t advance drastically beyond the simple background-loop modality for quite some time.

In the mid to late 1990s, however, we began to hear a shift in game soundtracks. While simple backgrounds were still the norm, there was a sort of “mass-exodus toward pre-recorded background music” (1), and a few higher-profile titles were afforded a greater percentage of budget, disc space, and system resources. This all added up to a slow but perceptible shift toward the elusive ‘cinematic’ feel of film. I still remember watching the opening cinematic for Metal Gear Solid 2: Sons of Liberty and thinking to myself, “This can’t be a videogame!” The quality of the voice acting, the soundtrack – the entire game felt, to me, like a dramatic leap forward. It was but one example among the many titles that set out to push the boundaries of audio in games.

During the past 10 years, we have seen rapid and dramatic changes in the technology, artistry, and application of music in video games. Disc-based game platforms came to the fore with the release of the Sony PlayStation 2 and Nintendo GameCube early in the decade, and higher-powered consumer PCs became increasingly affordable. As a result, we heard a definite shift in musical scores, with significantly longer runtimes, more complexity, more robust instrumentation and arrangement, higher quality samples, and even CD-quality orchestral recordings.

Where Are We Now?

At present, we’re steeped in the current generation of gaming systems. Xbox 360, PS3, Nintendo Wii, and PC gaming have grown to include full HD video resolution and high-quality 5.1 surround sound. Low-fidelity, synthesized or sample-based soundtracks have given way to fully arranged and orchestrated scores, recorded by world-class symphonies. While they haven’t yet become household names like Zimmer, Williams or Goldsmith, well-known game composers are highly sought after as developers continue to strive for a more cinematic feel to their games. Truly, some game soundtracks rival those of major motion pictures in quality, scope and performance. This trend has even given rise to a small ‘video game soundtrack’ industry, with record labels devoted specifically to releasing and promoting game soundtracks to the mass market via CD and digital download.

Moreover, the sounds of classic and contemporary video games have increasingly gained mainstream popularity, as the synthesizers of old platforms such as the Game Boy, C64, and NES have made their way into popular music by some of today’s biggest artists. Likewise, game soundtracks are increasingly being presented to the public in unique ways. Bands such as The 1-Ups, The Minibosses and Contraband perform re-arranged versions of old game tunes on live instruments, while live orchestras perform soundtracks at events such as Video Games Live.

While it is undeniable that the quality and scope of game music have, in some cases, grown to match that of film, that alone isn’t enough. Games are an interactive medium, and as such, the presentation of musical soundtracks must also be able to adapt to changing gameplay. To create a truly immersive experience, the music in games must change on the fly according to what is happening in the game, while still retaining a cinematic quality. Rigidly scripted musical background sequences can’t impart the same level of depth as music that truly matches the moment-to-moment action.

Surprisingly, adaptive and interactive music schemes have been used in games for longer than we realize. Even the original Super Mario Brothers music changed tempo as the player’s time was running out. Yet making high-quality orchestral scores highly interactive adds a layer of complexity that few game developers attempt. Instead, many continue to rely on simple geographic and ‘event’ triggers for accompaniment, rather than a truly adaptive music system.

While some developers have attempted to tackle this issue themselves, many of their solutions are proprietary. To go a bit deeper into interactive music, we will instead turn our attention to two middleware developers: Firelight Technologies, makers of the FMOD Ex audio system, and Audiokinetic, makers of Wwise – the two premier audio middleware providers for today’s most popular AAA titles.

FMOD Ex

Firelight has taken a unique approach to dealing with interactive or adaptive music. Their FMOD Designer system allows two distinctly different approaches. Through their Event system, the composer can utilize multichannel audio files, or ‘stems’. This allows individual instruments or sections to be added or subtracted based on game states, or on any other dynamic information fed into the FMOD engine, such as player health, location, or proximity to certain objects or enemies. This technique was used to great effect in Splinter Cell: Chaos Theory, where, depending on the player’s level of ‘stealth and stress’, different intensities of music would be brought in. This type of layering is often called a ‘vertical’ approach to music system design.
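
To make the ‘vertical’ idea concrete, here is a minimal sketch of how synchronized stems might be faded in and out by a single game-driven parameter. The class and parameter names (VerticalMusicLayer, a ‘stress’ value) are invented for this illustration and are not part of the FMOD Designer API.

```cpp
// Hypothetical sketch of 'vertical' layering: a piece of music is authored
// as synchronized stems (pads, percussion, brass, ...), and a game parameter
// such as 'stress' decides how much of each stem is audible.
#include <algorithm>
#include <string>
#include <vector>

struct Stem {
    std::string name;     // e.g. "pads", "percussion", "brass"
    float fadeInStart;    // stress level where this stem starts to fade in
    float fadeInEnd;      // stress level where this stem reaches full volume
    float volume = 0.0f;  // current linear volume, 0..1
};

class VerticalMusicLayer {
public:
    void addStem(const std::string& name, float start, float end) {
        stems_.push_back({name, start, end, 0.0f});
    }

    // Called every frame with a game parameter (stress, proximity, health...).
    // Each stem's volume follows where the parameter falls in its fade range,
    // so rising intensity gradually brings in more of the arrangement.
    void update(float stress) {
        for (auto& s : stems_) {
            float t = (stress - s.fadeInStart) / (s.fadeInEnd - s.fadeInStart);
            s.volume = std::clamp(t, 0.0f, 1.0f);
            // In a real engine this volume would be applied to the stem's
            // playing voice; here we simply store it.
        }
    }

private:
    std::vector<Stem> stems_;
};
```

In practice, the game would feed the same ‘stealth and stress’ value it already tracks into update() once per frame, and the mix would swell or recede without ever interrupting playback.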

The second approach FMOD takes is through their Interactive Music system. This system takes a more ‘logic-based’ approach, allowing the designer to define various cues, segments and themes that transition to other cues, segments or themes based on any user-defined set of parameters. Moreover, this particular system allows for beat-matched transitions and time-synchronized ‘flourish’ segments. In this way, a designer or composer might break their various musical themes down into groups of smaller components. From there, they would devise the logic that determines when a given theme, for example “explore”, is allowed to transition to a “combat” theme. This segment- and transition-based approach is often referred to as a ‘horizontal’ approach.

A system of this kind was used in the successful Tomb Raider: Legend. For that particular project, composer Troels Folmann used a system of his own devising called ‘micro-scoring’, crafting a vast number of small musical phrases and themes that were then strung together in a logical way based on the player’s actions throughout the course of the game. For example, the player may explore a jungle area with an ambient soundtrack playing; as they interact with an artifact or puzzle, a seamless transition is made to a micro-score specific to that game event.
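
A minimal sketch of this ‘horizontal’ concept might look like the following: named segments, composer-defined transition rules, and a pending change that only takes effect on the next bar line so the switch stays musical. The names here (HorizontalMusicSystem, "explore", "combat") are hypothetical and do not reflect FMOD’s Interactive Music API or Folmann’s actual implementation.

```cpp
// Hypothetical sketch of a 'horizontal' music system: segments plus
// transition rules, with changes quantized to the next downbeat.
#include <map>
#include <set>
#include <string>

class HorizontalMusicSystem {
public:
    // The composer declares which segment-to-segment moves are musical.
    void allowTransition(const std::string& from, const std::string& to) {
        allowed_[from].insert(to);
    }

    // Game code requests a theme; it is only queued if the rules allow it
    // from the segment currently playing.
    void requestSegment(const std::string& target) {
        if (allowed_[current_].count(target)) pending_ = target;
    }

    // Driven by the audio clock. 'beatInBar' counts 0..beatsPerBar-1; the
    // pending segment is swapped in on the downbeat of the next bar,
    // giving a beat-matched transition.
    void onBeat(int beatInBar) {
        if (!pending_.empty() && beatInBar == 0) {
            current_ = pending_;  // start playback of the new segment here
            pending_.clear();
        }
    }

    const std::string& current() const { return current_; }

private:
    std::string current_ = "explore";
    std::string pending_;
    std::map<std::string, std::set<std::string>> allowed_;
};
```

In use, the rules might allow "explore" to move to "combat" or to an artifact micro-score, while forcing "combat" to resolve through a short transition segment before returning to "explore".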

Audiokinetic Wwise

Wwise is a relative newcomer to game development, gaining popularity over the past several years after its major debut in FASA Interactive’s Shadowrun. Since that time, Audiokinetic has rapidly enhanced the system, and its interactive music functionality takes a ‘best of both worlds’ approach. With Wwise, it is possible to combine multichannel stems with a logic-based approach to music design. A composer can create a series of themes with time-synchronized transitions based on game events or states, while simultaneously allowing other parameters to fade various stems in and out of the mix. This system incorporates both a horizontal and a vertical approach to music design, and the result is an incredibly powerful toolset for composers and audio designers.
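
Conceptually, combining the two hypothetical sketches above captures this ‘best of both worlds’ idea: one mechanism decides which segment is playing, while the other decides how much of that segment’s arrangement is audible. This is not the Wwise API – it simply shows the two mechanisms running side by side.

```cpp
// Hypothetical combination of the two earlier sketches: horizontal segment
// logic plus vertical stem fading, driven together once per game frame.
struct InteractiveScore {
    HorizontalMusicSystem segments;  // which theme is playing
    VerticalMusicLayer    layers;    // how much of its arrangement is audible

    void update(int beatInBar, float intensity, const std::string& requested) {
        segments.requestSegment(requested);  // e.g. "combat" when enemies spot the player
        segments.onBeat(beatInBar);          // transitions land on the bar line
        layers.update(intensity);            // stems swell or recede continuously
    }
};
```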

What’s Next?

The term ‘videogames’ now seems to encompass an entire spectrum of interactive entertainment in all shapes and sizes: casual web-based games, mobile phone games, multiplayer online games, and all manner and scope of console and PC games. It seems impossible to predict the future of interactive music for such a variety of forms, and yet we have some clues and ideas about what might be next for those AAA titles.

First and foremost, we can be sure that the huge orchestras and big-name composers aren’t going away any time soon. In fact, as more games use the ‘film approach’ to scoring, it seems likely to remain the standard for what we consider blockbuster games. Fortunately, middleware like FMOD and Wwise has given composers and audio designers robust tools to adapt and modify their scoring approach to be truly interactive with the game environment. As this generation of consoles reaches maturity, I’m sure we will yet see some of the finest and most robust implementations of interactive music.

Even so, pre-recorded orchestral music – however well designed – will still have some static elements that cannot be changed or made truly interactive. Once a trumpet solo is ‘printed to tape’, it cannot easily be changed. Yet the technological leaps of the next generation of consoles may present another option. It isn’t unreasonable to think that we may see a return to a hybrid approach to composing, using samples and some form of MIDI-like control data. While at first this may seem like a step backward, consider this: with the increasing quality of commercial sample libraries of all types, and the extremely refined file compression schemes used on today’s consoles, the next Xbox or PlayStation could well offer enough RAM and CPU power to load a robust (and highly compressed) orchestral sample library. The composer, then, would be hired to design a truly interactive score in a format akin to MIDI – note data, controller data, and realtime DSP effects. This score would not only adapt in the ways we’ve described above (fading individual tracks in and out, and logically transitioning to new musical segments) – but because the performance is now separated from the sample data, we would have control over each individual note played. The possibilities are nearly endless: realtime pitch and tempo modulation, transference of musical themes to new instruments based on game events, and even aleatoric or generative composition, which ensures that a musical piece conforms to a given set of musical rules yet never plays the same theme twice.
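
As a rough illustration of that last idea, a rule-based phrase selector could be as simple as the sketch below: the composer defines which phrases may follow which, and the engine picks among the allowed options at random, so the piece always obeys the rules but seldom repeats exactly. The class and phrase names are invented for this example.

```cpp
// Hypothetical sketch of rule-based (aleatoric) phrase selection for a score
// delivered as note/controller data plus a sample library.
#include <map>
#include <random>
#include <string>
#include <vector>

class GenerativePhrasePlayer {
public:
    // The composer declares which phrases are allowed to follow 'phrase'.
    void addRule(const std::string& phrase, std::vector<std::string> nextAllowed) {
        rules_[phrase] = std::move(nextAllowed);
    }

    // Choose the next phrase uniformly from the options the rules permit;
    // if nothing is allowed, keep repeating the current phrase.
    std::string next(const std::string& current) {
        const auto& options = rules_[current];
        if (options.empty()) return current;
        std::uniform_int_distribution<size_t> pick(0, options.size() - 1);
        return options[pick(rng_)];
    }

private:
    std::map<std::string, std::vector<std::string>> rules_;
    std::mt19937 rng_{std::random_device{}()};
};
```

Because the phrases themselves are just control data, the same selection logic could also re-voice a theme on different instruments or adjust tempo and pitch in realtime.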

Indeed, these possibilities and more are surely coming, and it is an exciting time for composers, audio designers, and gamers alike. For now, we can enjoy a new level of attention and awareness around game music. We are treated to truly orchestral experiences, if not completely adaptive and interactive ones. And in the coming years, interactive music technology will continue to mature, and we will assuredly hear more sophisticated implementations of these technologies across the full spectrum of games. I encourage you to listen closely to the games you or your friends play over the next few years. The tunes you hear today are helping to shape a musical revolution for tomorrow.

 
Footnote: 1 – GamePro, “Next Gen Audio Will Rely on MIDI”

About the author: West Latta has been making strange noises for over 30 years. He has spent the last several years developing his craft in the game industry as composer, sound designer, and integration specialist. He is currently a Sound Supervisor for Microsoft Game Studios, as well as a freelance writer, composer and audio designer.
