Sending audio to a speaker directly in Unity

If you develop games using Unity, you will probably never need to route audio to a single speaker in a surround setup, but in the world of interactive installations this is something you will come across sooner or later. For instance, in the spiral sculpture projection mapping project I did in Moscow, four touch screen kiosks get their audio from sound showers mounted above them. Because it isn’t really possible to run a cable from the kiosks directly to the sound showers, the audio comes from the server, which sits in the rack with all the audio equipment.

Each kiosk is completely independent: games are played on each kiosk individually, so the sounds should only be audible at the kiosk that triggers them. Unity has only a single audio listener, however, and it alone is in charge of sending audio to the driver, while the server application needs to send audio to each client’s speaker separately. There is no built-in way to send audio directly to a specific speaker apart from stereo panning, which will not do you much good if you need more than two speakers.

The audio listener is spatially aware, so my first solution was simply to set up audio sources in such a way that each is only audible from a single corner, with its spatial blend set to fully 3D. If you then change the project’s audio settings to 5.1 or 7.1 surround (not quad!), Unity will output your sounds to the targeted speaker. It does the job for four speakers, but with any more you’ll likely run into trouble, as center speakers muscle in between your lefts and rights.
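
A minimal sketch of that first approach, assuming the speaker mode has already been set to 5.1 or 7.1 in the project’s audio settings. The CornerSpeakerSource component and its fields are hypothetical names, and how cleanly the sound stays in one speaker will depend on your scene and source settings:

```csharp
using UnityEngine;

// Hypothetical helper: places a fully 3D audio source in one corner around the listener.
public class CornerSpeakerSource : MonoBehaviour
{
    [SerializeField] private AudioListener listener;
    [SerializeField] private AudioSource source;

    // Offset in the listener's local space; (-1, 0, -1) is behind and to the left.
    [SerializeField] private Vector3 cornerOffset = new Vector3(-1f, 0f, -1f);

    private void Start()
    {
        // Fully 3D, so Unity pans the sound based on where the source
        // sits relative to the listener.
        source.spatialBlend = 1f;
        source.transform.position = listener.transform.TransformPoint(cornerOffset);
    }
}
```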

Figure: Targeted speaker surround setup in Unity. Audio sources are set up so that each one plays from a specific speaker; the camera represents the audio listener.

While this worked as desired in the test setup using the onboard sound chipset’s surround output, it stopped working when an external ASIO sound card was introduced. The card has 4 outputs, and therefore doesn’t present itself as a 5.1 surround output to the operating system. Unity would see it as a 3.1 system, and always send audio to the fourth speaker as well as the targeted one, regardless of the project or device audio settings.

I got around this by changing the audio clips directly, creating a more robust solution to this problem in the process. While you typically work with mono or stereo sound effects, Unity can also play sounds with more channels. So, the solution is to create audio clips that have sound only on the channels you desire, leaving the others blank. The audio source’s spatial blending must be entirely 2D in this case, as Unity will otherwise change the output speakers based on the position of the source and the listener.

There are two ways you can go about this: changing your existing audio clips at runtime, or making them manually ahead of time. It depends on your project’s requirements – if you need the same sounds in different speakers, as I did, it makes far more sense to do so at runtime, so let’s start with that.

You will need Unity’s AudioClip.Create() function. Don’t be fooled by the [deprecated] attribute in IntelliSense: only the last two overloads, the ones with the _3D argument, are deprecated (they were superseded by the spatial blend system). This function creates our new audio clip with the required number of channels, one for each speaker in your setup.
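
As a rough illustration of the non-deprecated overload, the sketch below allocates a clip with room for one second of audio and one channel per speaker. The EmptySurroundClip class and the clip name are made up for this example:

```csharp
using UnityEngine;

// Hypothetical helper around AudioClip.Create().
public static class EmptySurroundClip
{
    public static AudioClip Create(int channelCount, int sampleRate)
    {
        // lengthSamples counts sample frames per channel, so one second of
        // audio needs exactly `sampleRate` frames.
        // The last argument (stream) is false: the data will be supplied up front.
        return AudioClip.Create("EmptySurroundClip", sampleRate, channelCount, sampleRate, false);
    }
}
```

Calling EmptySurroundClip.Create(4, 48000) would give a four-channel clip at 48 kHz, ready to be filled with samples.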

Next, we need to copy the audio samples from the original clip to the target channel(s) in the new clip. Unity interleaves the samples for all channels into a single one-dimensional array. For a four-channel audio clip it looks a little like this, where S represents the sample index and C the channel index:

S0C0 S0C1 S0C2 S0C3 S1C0 S1C1 S1C2 S1C3 S2C0 S2C1 S2C2 S2C3 ...
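
In other words, the value for sample S on channel C sits at index S × (number of channels) + C in the interleaved array. A minimal sketch of that lookup; the Interleaving class and IndexOf method are illustrative names, not part of Unity:

```csharp
// Hypothetical helper for addressing an interleaved sample buffer.
public static class Interleaving
{
    // Returns the position of the given sample frame and channel
    // inside a buffer that stores all channels of a frame side by side.
    public static int IndexOf(int sampleIndex, int channelIndex, int channelCount)
    {
        return sampleIndex * channelCount + channelIndex;
    }
}
```

For a four-channel clip, sample 2 on channel 3 lands at index 2 × 4 + 3 = 11.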

This is a code sample for creating an audio clip for a single target speaker. I leave it as an exercise for the reader to fill in more target channels if needed. The mapping of channel index to target speaker is provided at the bottom of this article.
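
A minimal sketch of such a routine, under a few assumptions: the names SpeakerClipUtility and CreateSpeakerClip are hypothetical, the source clip is mixed down to mono by averaging its channels, and the clip’s import settings allow AudioClip.GetData() to read its samples (typically “Decompress On Load”):

```csharp
using UnityEngine;

// Hypothetical utility: builds a multi-channel clip that only has sound on one channel.
public static class SpeakerClipUtility
{
    public static AudioClip CreateSpeakerClip(AudioClip source, int targetChannel, int channelCount)
    {
        if (targetChannel < 0 || targetChannel >= channelCount)
            throw new System.ArgumentOutOfRangeException(nameof(targetChannel));

        int frames = source.samples;          // sample frames per channel
        int sourceChannels = source.channels;

        // Read the interleaved samples of the original clip.
        var sourceData = new float[frames * sourceChannels];
        if (!source.GetData(sourceData, 0))
        {
            Debug.LogWarning("Could not read samples from " + source.name);
            return null;
        }

        // All channels start out silent (zero); only the target channel is filled.
        var targetData = new float[frames * channelCount];
        for (int s = 0; s < frames; s++)
        {
            // Assumption: average the source channels into a single mono value.
            float mixed = 0f;
            for (int c = 0; c < sourceChannels; c++)
                mixed += sourceData[s * sourceChannels + c];

            targetData[s * channelCount + targetChannel] = mixed / sourceChannels;
        }

        AudioClip clip = AudioClip.Create(
            source.name + "_ch" + targetChannel, frames, channelCount, source.frequency, false);
        clip.SetData(targetData, 0);
        return clip;
    }
}
```

With a four-output setup, SpeakerClipUtility.CreateSpeakerClip(clip, 2, 4) would return a copy of the clip that only carries sound on channel index 2.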

That’s all there is to it: playing these 2D, multi-channel audio clips will send your audio to that speaker only. A word of warning: it is your responsibility to destroy these audio clips once you no longer need them, lest you leak memory. Be sure to check the memory profiler and keep a close eye on the AudioClip counter.
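
One way to handle the cleanup is to destroy each generated clip as soon as it has finished playing. The sketch below assumes one AudioSource per playing clip and that playback is not paused along the way; SpeakerClipPlayer and its members are hypothetical names:

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component: plays a generated clip in 2D and destroys it afterwards.
public class SpeakerClipPlayer : MonoBehaviour
{
    [SerializeField] private AudioSource source;

    public void PlayAndDestroy(AudioClip generatedClip)
    {
        StartCoroutine(PlayRoutine(generatedClip));
    }

    private IEnumerator PlayRoutine(AudioClip generatedClip)
    {
        // Fully 2D, so Unity leaves the channel layout of the clip untouched.
        source.spatialBlend = 0f;
        source.clip = generatedClip;
        source.Play();

        // Wait until the source reports that playback has ended.
        while (source.isPlaying)
            yield return null;

        // Release the runtime-created clip so it does not linger in memory.
        source.clip = null;
        Destroy(generatedClip);
    }
}
```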

This method also creates a fair bit of garbage, as large float arrays are allocated for both the original audio clip and the new surround one. Whether that is a problem depends on your project. As always, profile first, then optimize only if need be; you could create these audio clips once and cache them, for instance. With long sounds, caching might also be necessary to avoid performance spikes caused by long loops. Neither the garbage collection nor the loops were a problem in my case, so I ended up creating a new audio clip every time a sound is played, but your project might be vastly different.
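
If profiling does point to the clip creation, a small cache is one option. The sketch below is only an illustration: SpeakerClipCache is a hypothetical class, and the createClip delegate stands in for whatever routine you use to build a clip for a given source and target channel:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical cache: one generated clip per (source clip, target channel) pair.
public class SpeakerClipCache
{
    private readonly Dictionary<AudioClip, AudioClip[]> cache =
        new Dictionary<AudioClip, AudioClip[]>();
    private readonly System.Func<AudioClip, int, AudioClip> createClip;
    private readonly int channelCount;

    public SpeakerClipCache(int channelCount, System.Func<AudioClip, int, AudioClip> createClip)
    {
        this.channelCount = channelCount;
        this.createClip = createClip;
    }

    public AudioClip Get(AudioClip source, int targetChannel)
    {
        AudioClip[] perChannel;
        if (!cache.TryGetValue(source, out perChannel))
        {
            perChannel = new AudioClip[channelCount];
            cache.Add(source, perChannel);
        }

        // Build the clip the first time it is requested, then reuse it.
        if (perChannel[targetChannel] == null)
            perChannel[targetChannel] = createClip(source, targetChannel);

        return perChannel[targetChannel];
    }

    // Destroys every cached clip, for example when the scene is torn down.
    public void Clear()
    {
        foreach (AudioClip[] perChannel in cache.Values)
            foreach (AudioClip clip in perChannel)
                if (clip != null)
                    Object.Destroy(clip);

        cache.Clear();
    }
}
```

The trade-off is memory: cached clips stay alive until Clear() is called, so this suits a small set of short sounds better than a large library of long ones.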

If you know ahead of time that a sound only ever needs to play from specific speakers, you can also create the multi-channel clips manually. You could save the audio clip created with the code above from Unity using an editor extension, for instance (all hail the mighty ScriptableWizard!), or create them in your favorite audio software, as I will discuss next.

I’ll be using Audacity as an example, since it’s free and widely adopted. First off, you need to change its export options. Under Edit > Preferences > Import / Export, tick “use custom mix” instead of the default “mix down to stereo or mono”. As the name indicates, this will allow you to export surround sounds.

Figure: Audacity settings for surround export.

Next, import the clip you wish to edit. If it’s a stereo clip, you will need to split it to two mono tracks.

Figure: Splitting a stereo track to mono in Audacity.

Using Ctrl+Shift+N, create as many tracks as you need for your speaker setup. Now move the audio to the track that corresponds to the target channel index; see the table below to find the right one. Finally, export the audio and import it into Unity. All done!

Here’s hoping this article will be of use to other Unity devs out there, seeking direct speaker control.

And here’s the mapping of channel index to speaker, courtesy of Wikipedia:

Channel index   Speaker
0               Front left
1               Front right
2               Center
3               Bass
4               Rear left (5.1) | Surround left (7.1)
5               Rear right (5.1) | Surround right (7.1)
6               Rear left (7.1)
7               Rear right (7.1)