
Introducing the Interactive XMF Format ("iXMF")

Excerpted From: "Interactive XMF" (2003, Gamasutra)
By Linda Law, Chair, IASIG iXMF Working Group

Interactive XMF ("iXMF") is a special type of XMF file being developed by the iXMF Working Group ("IXWG") of the Interactive Audio Special Interest Group ("IASIG").

A single iXMF file contains all of the information needed for a game soundtrack, or a level, or a character, or any other scope. This includes all media and all information necessary to play that media as the audio artist intended. An iXMF file has its own internal folder tree, starting at a Root folder. This Root folder contains metadata fields, a Cues folder, a MediaChunks folder, a MediaFiles folder, a Transitions folder, a PositionRules folder, and a Callbacks folder. The metadata fields at this root level may contain artist notes and other general information about the soundtrack, and default values for variables.
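The folder layout described above can be pictured as a small tree. The following is a minimal sketch in Python, not the actual iXMF file encoding: the dictionary layout, the `build_root` helper, and the `ArtistNotes` field name are assumptions made for illustration. Only the folder names come from the text.

```python
# Illustrative model of the Root folder described in the text.
# This is not the on-disk iXMF format.
ROOT_FOLDERS = (
    "Cues",
    "MediaChunks",
    "MediaFiles",
    "Transitions",
    "PositionRules",
    "Callbacks",
)


def build_root(metadata=None):
    """Return a dict standing in for an empty Root folder (hypothetical helper)."""
    root = {"metadata": dict(metadata or {})}
    for name in ROOT_FOLDERS:
        root[name] = {}  # each folder maps resource names to resources
    return root


# "ArtistNotes" is an invented field name, used only as an example.
root = build_root({"ArtistNotes": "Level 1 soundtrack"})
print(sorted(name for name in root if name != "metadata"))
```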

Using no custom metadata fields and no custom scripts, iXMF is intended to address eighty percent of an audio artist’s needs, eighty percent of the time. Scripting capabilities and artist-defined metadata fields allow the basic functionality to be extended in any manner that is desired by the audio team.

The Cues folder contains all of the cue description resources for the soundtrack and may contain files that provide information for setup and teardown of the soundtrack. Each cue file has a tagged or indexed list of one or more links to media chunk files called a chunk pool, and metadata necessary for using those media chunk files. The cue metadata, like the root node metadata, may contain general information and default values for variables. These variable settings take precedence over their root node values and apply to all media chunks in the cue’s chunk pool. The MediaChunks folder contains all of the media chunk resources for the soundtrack. A media chunk file may contain some of the same metadata fields as a cue. Its variable settings take precedence over those of the cue that refers to it in its chunk pool and apply only to itself. Examples of standard metadata fields are MediaFileID, MediaType, DefaultMediaHandling, DefaultSyncGroup, DefaultTempoMapID, DefaultTransition, DefaultMixGroup, GainTriminDB, and RAM_Usage. In all cases, there will be standard metadata fields, but there is always the option to add custom ones, allowing audio teams to extend the functionality of iXMF in any way they choose.
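The precedence rule described above (media chunk settings override cue settings, which override root settings) can be sketched as a layered lookup. This is only an illustration under assumptions: the field names are taken from the text, but the values and the `effective_metadata` helper are invented, and a real Soundtrack Manager may resolve defaults differently.

```python
from collections import ChainMap

# Field names come from the text; the values are invented for illustration.
root_meta = {"DefaultMixGroup": "music", "DefaultTransition": "crossfade"}
cue_meta = {"DefaultTransition": "butt_edit"}
chunk_meta = {"DefaultMixGroup": "ambience"}


def effective_metadata(chunk, cue, root):
    """Layer the three levels so the most specific one wins (hypothetical helper).

    ChainMap searches its mappings left to right: chunk, then cue, then root.
    """
    return ChainMap(chunk, cue, root)


meta = effective_metadata(chunk_meta, cue_meta, root_meta)
print(meta["DefaultMixGroup"])    # value set on the media chunk
print(meta["DefaultTransition"])  # value set on the cue
```

A layered view like this leaves the three metadata sets untouched, so a cue's defaults still apply unchanged to the other chunks in its chunk pool.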

The MediaFiles folder contains all playable media files for the soundtrack, or pointers to playable media files that exist external to the iXMF file. Playable media files can be any type of audio asset file such as WAV, standard MIDI, AIFF, SDII, etc.

The Transitions folder contains all transition definition resources, the PositionRules folder contains all position rule definition resources, and the Callbacks folder contains all callback definition resources that will be used by the soundtrack. Some of these files will be predefined; some will be scripts created by the sound artist using a simple scripting language. Much of what the audio artist will need in the way of transitions, position rules, and callback definitions will be predefined. Examples of predefined transitions are cross-fades and butt edits. Predefined position rules will include start at chunk beginning and start at next bar, and predefined callbacks will include cue end and chunk end. Scripting will allow more exotic and case-specific media control.
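As an illustration of what a "start at next bar" position rule has to compute, here is a minimal sketch. It assumes a constant tempo, a fixed number of beats per bar, and a first bar that begins at time zero; the function name and parameters are hypothetical and are not taken from the iXMF work.

```python
import math


def next_bar_time(now_s, tempo_bpm, beats_per_bar=4):
    """Return the time in seconds of the first bar line at or after now_s.

    Assumes constant tempo and a bar line at time 0 (illustrative only).
    """
    bar_length_s = beats_per_bar * 60.0 / tempo_bpm
    return math.ceil(now_s / bar_length_s) * bar_length_s


# At 120 beats per minute in 4/4, each bar lasts 2 seconds.
print(next_bar_time(3.1, 120))
```

A tempo map with tempo changes would need a lookup rather than this single division, which is presumably why the text lists a tempo map among the standard metadata fields.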

The Soundtrack Manager

Since iXMF is a cross-platform solution, some platform-independent middleware is needed. This middleware is called the Soundtrack Manager. Its job is to manage the performance of the soundtrack and all of the audio content resources that combine to create the soundtrack. The Soundtrack Manager can be specific to a single game, or a group of games, or a development house, ad infinitum. It supports the same advanced interactive audio feature set on any platform, while also allowing for access to platform-specific features.

The Soundtrack Manager receives high-level requests for interactive audio services from the game and handles them by coordinating the operation of multiple, platform-specific, low-level media players. It supplies these players with sound media stored in the iXMF media files, and controls the players via a small set of simple audio commands that are passed to system-specific Playback APIs via an Adapter Layer. It can also send information back to the game via callbacks or shared variables.

For each platform that will host the game, an Adapter Layer for that platform must be written to communicate between the Soundtrack Manager and the platform’s native APIs. So the Adapter Layer code is platform-specific, while the Soundtrack Manager code and audio content are platform-independent.
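The split between platform-independent and platform-specific code can be sketched with an abstract interface. Everything here is hypothetical: the `AdapterLayer` interface, its two methods, and the "console" and "desktop" native calls are invented stand-ins, since the text does not define the actual command set.

```python
from abc import ABC, abstractmethod


class AdapterLayer(ABC):
    """Hypothetical interface between the Soundtrack Manager and a native API."""

    @abstractmethod
    def play(self, media_id: str, gain_db: float) -> str:
        """Start playback and return the native call that was issued."""

    @abstractmethod
    def stop(self, media_id: str) -> str:
        """Stop playback and return the native call that was issued."""


class ConsoleAdapter(AdapterLayer):
    # Pretend console API: wants linear gain, so the adapter converts from dB.
    def play(self, media_id, gain_db):
        linear = round(10 ** (gain_db / 20), 3)
        return f"console_start({media_id}, vol={linear})"

    def stop(self, media_id):
        return f"console_halt({media_id})"


class DesktopAdapter(AdapterLayer):
    # Pretend desktop API: accepts decibels directly.
    def play(self, media_id, gain_db):
        return f"desktop_play({media_id}, db={gain_db})"

    def stop(self, media_id):
        return f"desktop_stop({media_id})"


def start_theme(adapter: AdapterLayer) -> str:
    """Platform-independent code: identical for every adapter."""
    return adapter.play("main_theme", -6.0)


for adapter in (ConsoleAdapter(), DesktopAdapter()):
    print(start_theme(adapter))
```

Only the two adapter classes would need rewriting for a new platform; `start_theme` and the content it refers to stay the same.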

At this point some terms that are used in conjunction with iXMF should be defined. These are media chunk, cue request, and cue. A media chunk is any piece of playable media data. It can be an entire audio file, or a defined contiguous region of an audio file, or a Standard MIDI File, or a defined contiguous region within a Standard MIDI file. The continuous soundtrack is built by stringing media chunks together, and sometimes by layering them. A cue request is an event that the game signals to the Soundtrack Manager, and to which the Soundtrack Manager responds with a corresponding action designed by the audio artist at authoring time. That action is called a cue. A cue can contain any combination of services or operations that the Soundtrack Manager can perform. In most cases a cue will contain a playable soundtrack element but it may also be used to perform other Soundtrack Manager functions that don’t result in something audible, such as setting a variable, loading media, or executing a callback to the game.
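The definition of a media chunk as either a whole file or a contiguous region of one maps naturally onto a small record type. The sketch below is an assumption-laden illustration: the `MediaChunk` fields, the `sequence_length` helper, and the file names and durations are all invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaChunk:
    """A whole media file, or a contiguous region of one (hypothetical model)."""
    media_file: str
    start_s: Optional[float] = None  # None means "from the beginning"
    end_s: Optional[float] = None    # None means "to the end of the file"

    def is_whole_file(self) -> bool:
        return self.start_s is None and self.end_s is None


def sequence_length(chunks, file_lengths):
    """Total duration in seconds when chunks are strung end to end."""
    total = 0.0
    for chunk in chunks:
        start = 0.0 if chunk.start_s is None else chunk.start_s
        end = file_lengths[chunk.media_file] if chunk.end_s is None else chunk.end_s
        total += end - start
    return total


# Invented example: two regions of one file followed by a whole second file.
intro = MediaChunk("theme.wav", 0.0, 8.0)
loop = MediaChunk("theme.wav", 8.0, 24.0)
sting = MediaChunk("sting.wav")
lengths = {"theme.wav": 30.0, "sting.wav": 2.5}
print(sequence_length([intro, loop, sting], lengths))
```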

The Soundtrack Manager controls the audio playback by providing, at a minimum, the following functionality in response to cue requests:

  • responding to game sound requests by playing appropriate sound media, sometimes influenced by game state
  • constructing continuous soundtrack elements from discrete media chunks, whether via static playlists or dynamic rules
  • dynamically ordering or selecting which media chunks get played, sometimes influenced by game state, sometimes to reduce repetition
  • mixing and/or muting parallel tracks within media chunks
  • providing continuous, dynamic control of DSP parameters such as volume, pan, and 3D spatial position, sometimes influenced by game state, sometimes to reduce repetition
  • controlling how media is handled, including how it is stored and how it is played back
  • handling callbacks
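One item in the list above, dynamically selecting media chunks to reduce repetition, can be illustrated with a simple rule: pick at random from a cue's chunk pool, but never play the same chunk twice in a row. This is one possible approach and not a rule taken from the iXMF work; the `ChunkPool` class and its method are hypothetical.

```python
import random


class ChunkPool:
    """Random chunk selection that avoids immediate repeats (illustrative)."""

    def __init__(self, chunk_ids, rng=None):
        if not chunk_ids:
            raise ValueError("a chunk pool needs at least one chunk")
        self._ids = list(chunk_ids)
        self._rng = rng or random.Random()
        self._last = None

    def next_chunk(self):
        # Exclude the chunk that just played, unless it is the only one.
        candidates = [c for c in self._ids if c != self._last] or self._ids
        self._last = self._rng.choice(candidates)
        return self._last


pool = ChunkPool(["footstep_a", "footstep_b", "footstep_c"], random.Random(0))
picks = [pool.next_chunk() for _ in range(50)]
```

Game state could influence the same choice by filtering or weighting the candidate list before the random pick.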

While a game is running, the flow will look something like this. An event will happen, or a condition will arise, in the game and the game will recognize that it needs to send a cue request to the Soundtrack Manager. The Soundtrack Manager will access the appropriate playable sound media along with its interactivity data and play the media according to its artist-specified playback parameters. It does this by passing instructions to the Adapter Layer, which will in turn pass instructions through to the playback API. The interactivity data associated with the media that just played may also include instructions for the Soundtrack Manager to pass data back to the game, which the obedient Soundtrack Manager dutifully performs. Bless its little heart.
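The round trip described above (game event, cue request, playback through the Adapter Layer, data passed back to the game) can be traced in a few lines. This is a toy sketch, not the iXMF design: the class names, the cue table layout, and the cue and callback names are all assumptions.

```python
class RecordingAdapter:
    """Stand-in Adapter Layer that records the instructions it receives."""

    def __init__(self):
        self.log = []

    def play(self, media_id):
        self.log.append(("play", media_id))


class SoundtrackManager:
    """Toy manager: looks up a cue, plays its media, then notifies the game."""

    def __init__(self, adapter, cues, on_callback):
        self._adapter = adapter
        self._cues = cues                # cue name -> interactivity data
        self._on_callback = on_callback  # function supplied by the game

    def request_cue(self, name):
        cue = self._cues[name]
        if cue.get("media"):             # a cue need not be audible
            self._adapter.play(cue["media"])
        if cue.get("callback"):
            self._on_callback(cue["callback"])


# Invented cue table standing in for artist-authored interactivity data.
cues = {"boss_appears": {"media": "boss_theme", "callback": "boss_music_started"}}
game_events = []
adapter = RecordingAdapter()
manager = SoundtrackManager(adapter, cues, game_events.append)

manager.request_cue("boss_appears")  # the game signals a cue request
print(adapter.log, game_events)
```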

Impact of iXMF

With widespread use of iXMF, an audio artist would be able to create audio content and save it in a format that can be read by any game audio playback engine. This could allow a composer to choose and master whichever tools he likes, hopping between them to use his favorite features of each without losing or damaging any file information along the way. Workflow would no longer need to be reinvented for every game, every developer, every platform. An audio artist’s experience would be leveraged because the skills he develops and the terminology he uses would be consistent from one work situation to the next. The rework and file management associated with porting a game to another platform would be eliminated. The existence and acceptance of a standard file format would also encourage development of new and better tools because the user base would be large enough to support this development. And all of this would combine to create an audio development environment in which interactive sound designs can be better sounding, more interesting, and created more rapidly.

Composer George ("Fatman") Sanger excitedly pictures an iXMF future: “Imagine what happens when, instead of just three or four people at each company having access to these tools, millions of people gain that access. College students and home tinkerers and accomplished musicians, engineers and artists will begin not only using the tools, but also creating their own tools and their own ways of writing to and reading this file format. Soon the most avant-garde creative dreams will be realized with ease. As happened with MIDI, as the bank of available tools mounts up, it will become easy to write efficiently, easily, and freely for games, for websites, for amusement parks, for “can’t-play-a-wrong-note” toys, for interactive movies, songs you can change as they play, instruments you can wear like clothes, dogs skywriting with rocket packs, and, before I embarrass myself further, other hitherto unheard-of situations. The only predictable thing is that it will be unpredictable.”

And from the perspective of the game audio programmer, audio engine developer Martin Wilde says, “a game audio engine should afford the audio artist the ability to create a seamless, interactive game score with as little programming intervention as possible.” Traditionally, the generation of a good game audio engine has required a great deal of programming effort. And that effort generally results in an audio engine that runs on only one platform and is focused on implementing what Wilde refers to as low-level operators, which are “bits of code or functions that load and play digital audio or MIDI, but have no higher musical purpose or understanding.” Also, the audio artist must usually turn his creations over to the programmer for integration into the game, which to some extent puts artistic decisions, schedules, and priorities in the programmer’s hands.

Wilde says, “iXMF fixes both those problems. First, we have a very comprehensive description and specification of the high-level audio behaviors audio artists wish to have at their disposal. This is very important and significant! The members of the IXWG, representing literally decades of game audio-making experience, have collaborated on the description and specification of the methods, intelligence and building blocks to make interactive game audio soundtracks. This includes a description of a scripting language that allows audio artists to directly control the integration and presentation of their content with minimal impact on the programming team. Secondly, I as an audio programmer now have an outline from which to build a high-level game audio engine. I don't have to guess what kinds of things might be useful and how they should work in a musical context, I have it all in front of me. All I have to do is code it once, and I'm done. Well, almost. The iXMF spec also includes an Adapter Layer. This is the code that takes the high-level commands issued from the Soundtrack Manager and translates them into the low-level commands of the specific platform on which the game is running. All I have to do is write (or get the platform manufacturers to write!) the various adapter layers to the low-level, platform specific operators I talked about before. Then audio content can truly be authored once, and published many times across platforms.”

Once the initial tasks of writing Adapter Layers and the Soundtrack Manager are completed, the role of the audio engine programmer will change from what it usually is today. Instead of rewriting engines for every new platform and dealing with artistic audio implementation issues, the programmer can focus on what Wilde calls “clarification and refinement.” He adds, “I believe that as iXMF standards and sensibilities proliferate in the marketplace, audio artists and programmers will see a wealth of opportunities we can't even imagine today.”

iXMF can also encourage a much clearer division of labor by providing a means for conveying audio content and interactivity information from the audio artist to the audio programmer in a well thought out, well defined, standardized format. Sanger, describing it from the perspective of an audio designer, put it this way, “Here’s the sweet part. I think that the interaction between sound designer and programmer will become less like the push-and-pull that happens between a control freak having his dream house built and the drunken, arrogant, monomaniacal, but incredibly talented architect whom he has hired, and more like the interaction between a creatively gifted control freak sending a letter and the drunken, arrogant, monomaniacal, but incredibly talented mailman.”

Use of iXMF can also decrease time to market for a game’s release on its initial platform. This might not happen the first time a developer uses it, because, of course, there will be a learning curve associated with utilizing this format. But in subsequent games, time to market will be reduced due to the audio team’s familiarity with a format and workflow that does not change from game to game or platform to platform, and also due to the inherent simplicity and efficiency of a system that focuses artistic considerations on artists and programming considerations on programmers. And, of course, the amount of time and money needed to move that content to other platforms would be, at the very least, tremendously reduced.

If iXMF is widely adopted, there will be a reduction in the amount of training that a new audio team member will require, and an increase in the talent pool from which a developer may select when choosing a new member for that team, since format and workflow will be much more consistent from one development house to the next. Yet developers can always add custom features to enhance their games and distinguish them from those of other developers.

Ultimately, game audio quality itself stands to improve. This is not to say that it will make one of those linear audio snippets sound better, but the implementation of the audio can be vastly improved, resulting in a corresponding perceived improvement in audio quality by the game player. Wilde says, “As the flow of game audio becomes more seamlessly and interactively integrated into games, nay all multimedia presentations, the perceived overall quality of that experience will be greatly enhanced. iXMF wraps up all the bits you need to present that kind of audio experience.”

Wilde believes that widespread use of iXMF is inevitable. He says, “One thing that hasn't been talked about much is the tremendous explosion of content we will see packaged in iXMF files. iXMF will revolutionize and standardize the delivery format of all manner of music and multimedia content, and iXMF players will be the de facto standard everywhere. Content providers won't have to think twice about how to package their creations. Everyone will use iXMF, and then, watch out!” Sanger agrees, saying, “Developers who depend on maintaining their own technical edge over their competitors will be left behind a huge, rapidly advancing community. Developers would be well advised to implement iXMF rather than a proprietary tool, just as they would be well advised not to attempt to create their own internet.”

It all seems to lead to decreases in development time and money, and an increase in game audio quality. The audio artists, the engine programmers, and the development houses all stand to gain from the widespread acceptance and usage of the iXMF standard. If control of the art is kept in the artists’ hands, the result can benefit everyone.

For more information about iXMF, including the project completion schedule and work items, please visit the IASIG Web Site.

All materials, graphics, and text copyright © 1995-2008 MIDI Manufacturers Association Incorporated.
Use is prohibited without written permission.