40-channel surround sound at a museum

riredale wrote on 12/6/2013, 10:58 AM
The Wall Street Journal has an interesting article today about a recording of "Spem in Alium" by Thomas Tallis, a choral work for 40 singers, recorded on 40 individual tracks. Playback is from 40 speakers, arranged in a large oval inside New York's Cloisters Museum's stone chapel measuring perhaps 50x150 feet. The museum visitors are able to walk among the speakers (on 5-foot-tall stands) during the performance, and can hear each singer's contribution. According to the article, the experience brings out strong emotions, including tears and embracing.

Now, just how can I get my laptop to record 40 channels? I'm pretty sure Vegas could easily handle 40 audio tracks.

Comments

musicvid10 wrote on 12/6/2013, 11:56 AM
Focusrite makes 40- and 56-channel audio interfaces for Firewire/Thunderbolt. A laptop should have little issue keeping up.
Small studios used to record that many channels on Pentium.
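As a back-of-the-envelope check on the claim that a laptop can keep up, the sketch below estimates the raw data rate of 40 uncompressed tracks. The 48 kHz sample rate and 24-bit depth are assumptions; the thread does not say what format the museum recording used.

```python
# Rough data-rate estimate for uncompressed multitrack PCM.
# Assumed values (not from the thread): 48 kHz sample rate, 24-bit samples.

def pcm_bytes_per_second(channels, sample_rate_hz=48_000, bits_per_sample=24):
    """Bytes per second of uncompressed PCM for the given channel count."""
    return channels * sample_rate_hz * bits_per_sample // 8

rate = pcm_bytes_per_second(40)
print(f"40 channels: {rate / 1e6:.2f} MB/s")
print(f"per hour:    {rate * 3600 / 1e9:.1f} GB")
```

Under those assumptions the sustained write rate is a few megabytes per second, which is modest for a laptop drive; the interface and its drivers are more likely to be the limiting factor than the disk.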
bdg wrote on 12/6/2013, 12:25 PM
You don't need to record 40 channels at all, just 4.
The trick is to use an Ambisonic mic such as the TetraMic.
Those 4 channels then become your masters and you can produce from them as many channels as you have physical playback channels.
What is also nice is you can continue to produce different playback formats, for instance stereo or 5.1, at any time in the future, as long as you keep the original 4-channel master (or A format, as it's called).
But.
It's not as easy as it sounds.
For one thing - if you like close miking (as I do) you can't get more than four or five people (playing acoustic instruments and doing vocals) around the mic in the forward half of the mic - so it sounds good on a 5.1 or stereo system.
Your 40 people would need to be, oh, perhaps 20 or 30 feet from the mic in a circle. You'd need good acoustics for that to work.
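To make the idea of producing any number of playback channels from one master concrete, here is a minimal sketch of a horizontal-only, first-order encode and decode. It is a simplification under stated assumptions: three channels (W, X, Y) instead of four, a single source, speakers in an evenly spaced ring, and a gain normalization chosen for readability rather than taken from any particular decoder. The function names are made up for illustration.

```python
import math

def encode(sample, azimuth_deg):
    """Encode one source sample into horizontal first-order channels (W, X, Y).

    Assumed normalization: W carries the sample unscaled.
    """
    a = math.radians(azimuth_deg)
    return (sample, sample * math.cos(a), sample * math.sin(a))

def decode(b_format, n_speakers):
    """Basic decode to a ring of evenly spaced speakers; speaker 0 is at 0 degrees."""
    w, x, y = b_format
    feeds = []
    for i in range(n_speakers):
        phi = 2 * math.pi * i / n_speakers
        feeds.append((w + 2 * (x * math.cos(phi) + y * math.sin(phi))) / n_speakers)
    return feeds

# One master, several playback layouts.
master = encode(1.0, 90.0)  # a source at 90 degrees
for n in (4, 5, 40):
    feeds = decode(master, n)
    loudest = max(range(n), key=lambda i: feeds[i])
    print(n, "speakers: loudest feed at", round(360 * loudest / n), "degrees")
```

The same three-channel master feeds 4, 5, or 40 speakers. This only shows that the feeds can be computed; it does not show how well separated the resulting feeds are.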
riredale wrote on 12/7/2013, 12:52 AM
I think the whole point of this museum setup was that during the recording session every single choir member had their own microphone and exclusive recording track. So if you stood in front of just that loudspeaker during playback all you would hear (for the most part) was just that one singer. Kind of a cool idea.
Rob Franks wrote on 12/7/2013, 6:37 PM
"The trick is to use an Ambisonic mic such as the TetraMic. "

Discovering a "trick" was not the goal.

musicvid10 wrote on 12/7/2013, 7:16 PM
Synthetic 4 channel surround != 40 discrete channels.
bdg wrote on 12/7/2013, 9:55 PM
The 4 channel surround sound captured by a classic Ambisonic mic is *not* synthetic. The 4 channels contain the globe of sound that exists at the single point in space in the centre of the capsules (the software takes care of the fact you cannot physically have the 4 capsules in that one point.)
Note: it is not 4 channels at that point in space. All the 4 channels are is the information (audio) that correctly describes *every* sound at that point in space together with its direction. (Actually 4 capsules is the minimum required.)
In theory this is the perfect way to capture surround sound. In practice, as I said before, it's a lot harder than it seems on the surface.
Whether you want one playback channel or, golly I don't know what the practical limit is, 59? 199? 40? - you can get it from that point in space (the 4 recorded channels). And your playback will be perfect surround sound, technically. From an artistic point of view it may well not be perfect.
So your statement: Synthetic 4 channel surround != 40 discrete channels. may well be correct when applied to some other subject. However your statement, when applied to Ambisonics, is just plain incorrect.
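As a rough illustration of how four capsule signals can carry direction, the sketch below models four ideal, perfectly coincident cardioid capsules in a tetrahedral layout, converts their outputs to W, X, Y, Z with a sum/difference matrix, and recovers the direction of a single source. Real microphones also need frequency-dependent correction for capsule spacing, which is left out here. The function names and the scaling constant are specific to this idealized model.

```python
import math

# Unit vectors for an idealized tetrahedral capsule layout:
# front-left-up, front-right-down, back-left-down, back-right-up.
S = 1 / math.sqrt(3)
CAPSULES = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]

def capsule_signals(sample, direction):
    """What four ideal coincident cardioids pick up from one source.

    `direction` is a unit vector (x front, y left, z up) pointing at the source.
    """
    return [0.5 * sample * (1 + sum(u * d for u, d in zip(cap, direction)))
            for cap in CAPSULES]

def a_to_b(flu, frd, bld, bru):
    """Sum/difference matrix from the capsule signals to W, X, Y, Z."""
    w = flu + frd + bld + bru
    x = flu + frd - bld - bru
    y = flu - frd + bld - bru
    z = flu - frd - bld + bru
    return w, x, y, z

def recovered_direction(w, x, y, z):
    """Direction of a single source, using the scaling of this idealized model."""
    k = math.sqrt(3) / w
    return (k * x, k * y, k * z)

source_direction = (0.0, 1.0, 0.0)  # hard left
b = a_to_b(*capsule_signals(0.8, source_direction))
print([round(v, 6) for v in recovered_direction(*b)])
```

With several sources sounding at once, each of the four channels holds a sum over all of them, so this single-source recovery no longer applies directly.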
musicvid10 wrote on 12/7/2013, 10:12 PM
Ambisonic is synthetic in that a number of discrete channels are extrapolated into a larger number of synthetic channels. Reminiscent of early Dolby matrix surround and PL. The rest is jabberwocky. This topic is about 40 discrete audio channels. Ambisonic doesn't do that, nor anything remotely close. If what you suggest was true, Dolby and DTS would be gripped in a bidding war for control of the technology. They are not.

Bottom line: one will not recreate 40 pristine individual voices from 4 single-point sources, no matter how the mics are positioned or matriced.
tanstaafl
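One way to put numbers on the separation question is to look at how a basic first-order decode spreads a single voice across a 40-speaker ring. This is a sketch under assumptions: horizontal-only, evenly spaced speakers, and a simple decode where the gain into a speaker depends only on the angle between speaker and source. Other decoders would give different figures.

```python
import math

N = 40  # speakers in an evenly spaced ring (assumed layout)

def speaker_gain(source_deg, speaker_deg, n=N):
    """Gain from a source into one speaker for a basic first-order decode."""
    diff = math.radians(speaker_deg - source_deg)
    return (1 + 2 * math.cos(diff)) / n

def level_db(gain, reference):
    """Level of `gain` relative to `reference`, in decibels."""
    return 20 * math.log10(abs(gain) / abs(reference))

# A singer encoded exactly at speaker 0; how much of that voice lands elsewhere?
target = speaker_gain(0, 0)
for offset in (1, 2, 5, 10):
    g = speaker_gain(0, offset * 360 / N)
    print(f"speaker {offset} away: {level_db(g, target):6.2f} dB relative to target")
```

In this model the speakers next to the target carry the voice at almost the same level as the target itself, whereas with 40 discrete tracks a neighbouring speaker carries none of that singer's own track.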

bdg wrote on 12/8/2013, 11:45 AM
What you say about recording with an Ambisonic mic and the associated playback is wrong.
musicvid10 wrote on 12/8/2013, 1:15 PM
"You don't need to record 40 channels at all, just 4."
Eagerly awaiting your demonstration project.
farss wrote on 12/8/2013, 1:28 PM
The Ambisonic mic does record all the data required to reconstruct the entire sound field at that point in that space.

There are two problems.
How do you put the sound field back? There's no loudspeaker equivalent of the Ambisonic mic.
You'd also need to know everything about that space so the sound goes back into the exact same space, or have all the data to remodel the space, which would still be a massive computational problem.
Think about it, you have x number of people in a space. You take the people out of that space after you've made the recording, the space has changed.

The theory behind the Ambisonic mic is solid and the recording technique is widely used; however, there are very real practical and theoretical limitations to how far it can be pushed. You can get great stereo and surround from the recordings, but the placement of the mic is critical. If the theory really held up, it wouldn't be.

Bob.
musicvid10 wrote on 12/8/2013, 2:17 PM
Reconstructing a "sound field" and isolating 40 discrete point audio sources are entirely different considerations, and I think bdg knows that.

Had a student tell me there was midi software to rewrite every instrumental part exactly from an orchestral recording, recreating the original score. Of course it doesn't work that way because multitimbral polyphony is incredibly rich and complex, same as a "sound field."

I am, however, working on a way to put the toothpaste back in the tube!

Rob Franks wrote on 12/9/2013, 6:41 AM
"The 4 channel surround sound captured by a classic Ambisonic mic is *not* synthetic. The 4 channels contain the globe of sound that exists at the single point in space in the centre of the capsules (the software takes care of the fact you cannot physically have the 4 capsule in that one point.)"

And in theory you would be correct. It's a lot like these on-board 5.1 mics (most of which only have 3 mic capsules). All you need are three channels and the others can be simply calculated. They produce good consumer-type 5.1... but you can hear the spatial errors involved on playback if you listen carefully enough.

However in practice it falls apart to a certain degree because of the error involved. If there is error in the mic system (and there always is), the density and temperature of the air.... blah, blah, blah, then that error is reproduced with increasing inaccuracy with each successive calculation. Like copying a copied key. If I keep copying the latest copied key then sooner or later the compounded error will produce a key that no longer works in the lock.

With 40 channels and 40 independent mics there is no calculating involved and therefore it is simply the individual error of each mic. The end result of course is a much more accurate reproduction.... and THAT is the goal.... NOT trying to skimp on channels and mics.
Chienworks wrote on 12/9/2013, 8:26 AM
"The 4 channel surround sound captured by a classic Ambisonic mic is *not* synthetic. The 4 channels contain the globe of sound that exists at the single point in space in the centre of the capsules (the software takes care of the fact you cannot physically have the 4 capsule in that one point.)"

The flaw in that is that you only get the correct reconstruction based on that single point. The entire appeal of the museum exhibit is that you can wander around and approach the site of each individual singer. The playback has to work for a moving point, and Ambisonic doesn't do this at all. Moving around the space may make one or two quadrants slightly louder while others diminish, but it cannot duplicate the effect of arbitrarily walking up to a single performer.

It's like the difference between 3D TV and a hologram. 3D lets you see which objects are closer and in front of those farther away. A hologram lets you turn the image around at whim and look at it from any direction and position you wish, something that simply can't be done with 3D, unless the camera operator knows where you want to go and moves there for you. With the Ambisonic system, the only way to get this exhibit experience is if the mics moved around the room at the time of the recording to go wherever the visitors wanted to move during playback, which is of course impractical even for one visitor, and impossible for multiple visitors wandering around independently.

So, it is indeed true that 4 channel surround simply cannot duplicate the experience of 40 discrete channels.

I recall going to a Civil War battle surround display once as a child. The movie screen wrapped 345° around the audience. While all the seats faced "forward", there was action all around. There were also speakers strewn around the auditorium under various seats and sound effects and voices would come out of these speakers in different places at different times. It really heightened the feeling of being "there" in the middle of the battle, in a way that wouldn't have come across as vividly with only 4 speaker surround sound.
Chienworks wrote on 12/9/2013, 8:32 AM
Re: MIDI - I read an article in PC World magazine ages ago in which a computer guru was explaining the essentials of recording digital audio to his various readers. Keep in mind this was a time before even SoundBlaster cards had hit the scene, and having anything other than "beep" in your computer audio was a novel thing for anyone with a sub-$4000 PC. The author admonished several times that everything needed to be recorded in MIDI only, because any other method would use up more drive space than us mere mortals could afford. He explained this was ok because MIDI could reproduce every sound an instrument could make, so there could never be any reason in any situation to ever record sampled audio.

The next issue's letters section had a reader comment on the article consisting of one single word. "Vocals?"
musicvid10 wrote on 12/9/2013, 12:26 PM
Of course, MIDI is not sounds. It is a series of off-on switch banks that send signals to a controller, which could be anything, such as lighting automation or game controller. The sounds come from a slaved synthesizer, which is of course, sampled or generated audio loops.
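For the curious, this is roughly what those switch signals look like on the wire. A Note On or Note Off is a three-byte message: a status byte carrying the channel, then a note number and a velocity. The helper names below are made up for illustration.

```python
def note_on(channel, note, velocity):
    """Three-byte MIDI Note On message (channel 0-15, note and velocity 0-127)."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Three-byte MIDI Note Off message with release velocity 0."""
    return bytes([0x80 | channel, note, 0])

middle_c = 60
events = note_on(0, middle_c, 100) + note_off(0, middle_c)

# For comparison: one second of 16-bit stereo PCM at 44.1 kHz.
one_second_of_cd_audio = 44_100 * 2 * 2  # samples/s * channels * bytes/sample
print(events.hex(" "))
print(len(events), "bytes of MIDI vs", one_second_of_cd_audio, "bytes of PCM per second")
```

The messages say which key went down and how hard, nothing about what it sounds like, which is why the drive-space argument in the article only holds if a synthesizer can stand in for the performer.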
Rob Franks wrote on 12/9/2013, 3:17 PM
"The flaw in that is that you only get the correct reconstruction based on that single point. The entire appeal of the museum exhibit is that you can wander around and approach the site of each individual singer"
But you would be able to walk around which is the point he's trying to make.

In theory you could use that one point to physically reconstruct each individual channel and their location relative to one another, and play that back on its own speaker as if it were recorded by 40 mics on 40 channels.
Consider the whole thing compared to GPS: with as few as 3 satellites and triangulation, you can come close to pinpointing an exact location. You can do the same thing with sound, but the calculations are quite a bit more complicated. Dolby does this with their 5.1 mic systems on consumer cameras. With only three mic capsules, these cams put out 5.1 channels with the other two being calculated (then of course the .1 channel).

The various errors involved however (mic errors... the density of the air and how much further the sound must travel through it... blah, blah, blah) would prevent anything nearly as accurate.
Chienworks wrote on 12/9/2013, 5:13 PM
I can't quite agree. One of the goals with surround sound is to give the entire audience a fairly similar experience no matter where they sit in the sound field. Of course, people on the far left are going to hear the left speakers a little more loudly than those on the right, but not necessarily overwhelmingly so. Surround sound will maintain your screen-based orientation when you move your position.

The point of this exhibit is the opposite, in that when someone moves to the spot where a singer had been, they hear THAT singer as if they were standing right next to him/her. With 40 different singers in 40 different spots, there's no way that fewer than 40 channels can recreate that experience.
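A quick free-field sketch of why proximity matters with discrete speakers: direct sound falls off with distance, so a listener close to one speaker hears it well above its neighbours. The layout is hypothetical (a 30-foot-radius circle standing in for the oval), and reverberation, which a stone chapel would add plenty of, is ignored.

```python
import math

N = 40
RADIUS_FT = 30.0  # assumed; a circle standing in for the museum's oval

def speaker_position(i):
    """Position of speaker i on the ring, in feet."""
    t = 2 * math.pi * i / N
    return (RADIUS_FT * math.cos(t), RADIUS_FT * math.sin(t))

def relative_levels_db(listener):
    """Direct-sound level of each speaker at the listener; loudest is 0 dB.

    Uses the inverse-distance law, with distances clamped to 0.5 ft.
    """
    dists = [max(math.dist(listener, speaker_position(i)), 0.5) for i in range(N)]
    levels = [-20 * math.log10(d) for d in dists]
    top = max(levels)
    return [lv - top for lv in levels]

near_speaker_0 = (RADIUS_FT - 2.0, 0.0)  # two feet inside speaker 0
centre = (0.0, 0.0)
for name, spot in (("near speaker 0", near_speaker_0), ("centre", centre)):
    levels = sorted(relative_levels_db(spot), reverse=True)
    print(f"{name}: next-loudest speaker is {abs(levels[1]):.1f} dB below the loudest")
```

At the centre every speaker arrives at the same level; close to one stand, that speaker stands out, and more so the closer the listener gets. That only helps if each speaker carries a different singer.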
ChristoC wrote on 12/9/2013, 5:21 PM
> With 40 different singers in 40 different spots, there's no way that fewer than 40 channels can recreate that experience.

Precisely! Strangely, I am creating the exact same sort of thing (with 40 voices) for an arty festival in a couple of months; the audio is the easy part with the right gear (soundcard with 128 inputs and 128 outputs) - also there are 40 Ultra High Def videos to go along with it - now that part is complicated!
Rob Franks wrote on 12/9/2013, 9:55 PM
"Surround sound will maintain your screen-based orientation when you move your position."
No it won't... and doesn't.

You can maximize the surround experience at one point or GENERALIZE it over several different points, but the more you generalize it the more inaccurate it becomes.

Not sure if you have heard of the Audyssey auto EQ system but I have it on my 5.1 surround receiver. The idea is to plug in a mic and set it at the central-most seat and then press the go button. A series of audio tests are done coupled with some calculation time and the end result is a perfect 5.1 balance (EQ and gain) for that seating position.

Now the Audyssey system does allow for averaging and adjusting for up to 6 or 7 different seating positions (which I tested just for the heck of it) and it does work... but then when you go back to the original (center) seating position you can hear a difference. In other words the center position becomes less accurate with each additional position you ask Audyssey to consider.

The bottom line is that you can adjust surround sound for a single point and have it extremely accurate, or you can adjust it for a larger set of points at the cost of accuracy, and that accuracy of course will get progressively worse the further away you get from that central focal point. It's one of the reasons I don't worry too much about surround sound when I'm in the theaters because it's best in the center and I like sitting down in front. The surround at home however I can't live without because I can perfect it to my particular seating position.
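The single-seat versus multi-seat trade-off can be shown with a toy level-trim calculation. This is not Audyssey's algorithm, which also handles EQ and delay; it is just an inverse-distance sketch in a hypothetical room, with speaker and seat coordinates invented for the example.

```python
import math

# Hypothetical 5-speaker layout, coordinates in feet.
SPEAKERS = {"L": (-6.0, 8.0), "C": (0.0, 8.0), "R": (6.0, 8.0),
            "Ls": (-7.0, -3.0), "Rs": (7.0, -3.0)}

def trims_db(seats):
    """Per-speaker gain trim that equalizes the average direct level over the seats."""
    trims = {}
    for name, pos in SPEAKERS.items():
        levels = [-20 * math.log10(math.dist(pos, seat)) for seat in seats]
        trims[name] = -sum(levels) / len(levels)
    return trims

def spread_db(seat, trims):
    """Loudest minus quietest speaker level at one seat after applying the trims."""
    levels = [trims[name] - 20 * math.log10(math.dist(pos, seat))
              for name, pos in SPEAKERS.items()]
    return max(levels) - min(levels)

centre = (0.0, 0.0)
one_seat = trims_db([centre])
three_seats = trims_db([centre, (-3.0, 0.0), (3.0, 0.0)])
print(f"centre seat, tuned for centre only: {spread_db(centre, one_seat):.2f} dB spread")
print(f"centre seat, tuned for three seats: {spread_db(centre, three_seats):.2f} dB spread")
```

Tuning for the centre seat alone gives a perfectly even balance there; averaging in two more seats leaves a small mismatch at the centre, which is the effect described above in miniature.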

"The point of this exhibit is the opposite, in that when someone moves to the spot where a singer had been, they hear THAT singer as if they were standing right next to him/her."
And again, in theory you can do that with one 3 or 4 channel mic in one location, by adding/subtracting volumes, phases and gains from each mic. You can do it with your own ears to a certain extent and you only have two. You can work out basic depth with two ears (when somebody is standing behind you, you know whether they are far or close... left or right). The errors involved however with 4 mics (vs 40) will make it much less accurate.
musicvid10 wrote on 12/9/2013, 10:51 PM
"Eagerly awaiting your demonstration project."

Rather than carry this discussion past the point of absurdity, perhaps Mr. Franks would like to join bdg in taking up this as-yet unanswered challenge. As easy as you guys make it sound, it should be easy to whip up a proof of concept say, by the first of the year? 40 discrete voices from 4 stationary audio recording points is what I understand you are proposing (or twenty voices, if you can only muster a chamber choir). No cheating by switching the sources and the output.
farss wrote on 12/10/2013, 2:09 AM
I have no idea how you would evaluate such a proof of concept as a pass or fail.

On the other hand, several years ago now at IBC I heard "surround sound" from a single speaker and it most certainly worked and was a commercially available product, a quite expensive one. Of course it wasn't just one driver; there were around 300 piston-style drivers in the thin box. It was really nothing more than an extension of the phased array speaker concept used in large venues.

I still say the problem is being able to recreate the sound field because you need a point source and no such speaker exists. I think this was the reason Prof Bose used spark gaps as the transient sound source in his early work at MIT.

I'd also ask: if this were as easy as some imply, why are we now talking about 10.2 sound systems?

Bob.
Rob Franks wrote on 12/10/2013, 6:00 AM
"perhaps Mr. Franks would like to join bdg in taking up this as-yet unanswered challenge"
I think you have completely missed the point I was driving at.

Although it is entirely possible in theory... it FAILS in practice. Perhaps you should reread my posts.
musicvid10 wrote on 12/10/2013, 7:41 AM
With respect for your thoughts and standing, I think you missed my point.

"Although it is entirely possible in theory..."
Just the theory being tossed around fails the test of physical acoustics on several fundamental levels -- making the thought of a test project (or continued speculation) silly. Acoustic propagation theory says it doesn't work on paper, not even if there was such a thing as a perfect, linear medium (not some benign "error").

With that, I loved the video, having done something similar 18 years ago with one of my choirs for a performance of the Kyrie. The audience was encouraged to stroll among and around the antiphonal choir lines to experience the full effect of the piece. Best of the holidays.
Rob Franks wrote on 12/10/2013, 6:02 PM
"Just the theory being tossed around fails the test of physical acoustics on several fundamental levels -- making the thought of a test project (or continued speculation) silly"

So why then did you suggest a test project in the first place? You enjoy being silly? (your word, not mine)