[HN Gopher] The human ear detects half a millisecond delay in sound
       ___________________________________________________________________
        
       The human ear detects half a millisecond delay in sound
        
       Author : pizza
       Score  : 216 points
       Date   : 2021-08-05 03:56 UTC (19 hours ago)
        
 (HTM) web link (www.aalto.fi)
 (TXT) w3m dump (www.aalto.fi)
        
       | abeppu wrote:
       | I'm confused about why the methodology that they talk about is
       | innovative. They talk about shifting when a sound event of a
       | given frequency arrives, relative to another sound. But with
       | digital audio, especially audio you can prepare in advance, isn't this
       | pretty simple? Like, given signals A and B, apply a band-pass
       | filter to A (for the desired frequency window), truncate off the
       | first k samples based on the time shift of interest, and add to
       | B. Even if you do need it in real time, why isn't this equivalent
       | to delaying B by k samples?
       | 
       | I must be missing something, because the researcher makes this
       | sound like a cool new technique.
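
A minimal sketch of the offline approach described in the comment above, not the researchers' actual method. It assumes a crude band-pass that zeroes DFT bins (a naive O(n^2) DFT, to stay in the standard library); the names `band_pass_dft` and `advance_and_mix` are hypothetical. A whole-sample shift also limits the time resolution to 1/fs.

```python
import cmath
import math

def band_pass_dft(x, fs, lo_hz, hi_hz):
    """Crude band-pass: zero every DFT bin outside [lo_hz, hi_hz]."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    for f in range(n):
        hz = min(f, n - f) * fs / n  # bins f and n-f carry the same frequency
        if not (lo_hz <= hz <= hi_hz):
            spectrum[f] = 0.0
    return [
        (sum(spectrum[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n).real
        for t in range(n)
    ]

def advance_and_mix(a, b, fs, lo_hz, hi_hz, k):
    """Band-pass a, drop its first k samples (advancing it), and add it to b."""
    band = band_pass_dft(a, fs, lo_hz, hi_hz)
    advanced = band[k:] + [0.0] * k  # zero-pad the tail to keep the length
    return [p + q for p, q in zip(advanced, b)]
```

At an assumed 48 kHz sample rate, k = 24 samples corresponds to the 0.5 ms in the headline.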
        
       | vbphprubyjsgo wrote:
       | For some reason I thought this was about reaction time. _Of
       | course_ you can detect small variations in time. Even visually
       | the eye will see differences between a light on for 1ms and one
       | on for 10ms.
        
       | NDizzle wrote:
       | I remember way back when the original Xbox was out, and modded.
       | Whatever video player it had let you adjust the sound delay for
       | the (new at the time) mpeg4 rips of whatever movie or show you
       | downloaded.
       | 
       | It was like a light switch went off when you got the audio delay
       | synced up perfectly. I guess I recall this vividly because 1) it
       | was a huge problem and 2) the adjustment to audio delay was on
       | the shoulder buttons, while fast forward and rewind were on the
       | triggers. Such a large problem that the adjustment wasn't buried
       | in menus, but right there on the controller. Prime real estate.
        
         | Aerroon wrote:
         | Humans can ignore the desync between audio and video to about
         | -125 ms to +45 ms. Ideally you don't want to be off by more
         | than a frame or two though.
         | 
         | If the desync goes out of these bounds then watching someone
         | speak becomes uncomfortable.
        
           | nitrogen wrote:
           | Using a rough rule of thumb that one millisecond equals one
           | foot, it kind of makes sense that we would be able to follow
           | speech coming from someone up to 45 feet away. But the
           | tolerance on the negative side is surprising.
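
The rule of thumb can be checked with a short calculation. This is a sketch assuming a speed of sound of 343 m/s (dry air at about 20 °C); the function name is hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C
METERS_PER_FOOT = 0.3048

def acoustic_delay_ms(distance_ft):
    """Time for sound to travel distance_ft feet, in milliseconds."""
    return distance_ft * METERS_PER_FOOT / SPEED_OF_SOUND_M_S * 1000.0
```

Under that assumption one foot is about 0.89 ms, so the rule slightly overstates the delay: 45 feet comes out near 40 ms rather than 45 ms.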
        
         | Guillaume86 wrote:
         | It might have been XBMC (xbox media center). Great software and
         | it still exists today on several platforms (PC, Android TV,
         | etc) under the name of Kodi.
        
       | BoxOfRain wrote:
       | I spent lockdown coming up with a music podcast with a friend,
       | it's amazing how much even a little delay can throw you off. I
       | have a hybrid digital/analogue setup where an analogue mixer is
       | fed either from a digital source by a DAC or an analogue source
       | via my microphone or a turntable. The analogue audio then goes
       | through some analogue processing equipment and is then fed into
       | the ADC and recorded/streamed digitally. It's not the most
       | efficient setup in the world, but I have a real thing for
       | flashing lights and physical switches!
       | 
       | If you talk into the mic with your headphones plugged into the
       | analogue side then switch to monitoring the software instead, you
       | can really notice the latency even though it's practically not
       | that much. I read about a technology that's supposed to prevent
       | angry customers from giving poor call centre employees a tirade
       | of abuse by echoing their voices back at them on a very slight
       | delay which is apparently quite intolerable, and I could believe
       | it!
        
         | Cyykratahk wrote:
         | This is known as Delayed Auditory Feedback
         | 
         | https://en.wikipedia.org/wiki/Delayed_Auditory_Feedback
        
         | toast0 wrote:
         | > echoing their voices back at them on a very slight delay
         | which is apparently quite intolerable, and I could believe it!
         | 
         | Sometimes I get this effect on my normal calls, and it is
         | pretty awful. Echo wasn't nearly so bad on circuit switched
         | calls, but now that everything is digital and has sampling and
         | codec delays, the echoes come back so much slower.
        
       | AYBABTME wrote:
       | Something I've been wondering is, assuming that the brain works
       | at a couple dozen hertz frequency, how can we perform any
       | sub-10ms task?
        
         | chousuke wrote:
         | Wouldn't the brain send the signals pre-emptively such that
         | they trigger the action at the correct time, accounting for
         | delays? I don't think it's possible to perform any _conscious_
         | action that quickly, but if you've trained to perform some
         | pattern, your brain would learn to compensate for any
         | processing delays.
        
       | archibaldJ wrote:
       | I was playing with my synthesiser the other day and realized what
       | I was really doing is using my eardrum to explore the search
       | space of oscillation patterns, and their combinatorial sets.
       | 
       | Music in this sense is the class of all time series of
       | combinations of such patterns that we find resonating as we are
       | able to encode and decode emotions and feelings from these sounds
       | (as well as appreciation for them).
       | 
       | I wonder what other cool things we can do with this signal
       | processing ability of ours as we enter the age of brain computer
       | interfaces and psychedelics.
        
       | bob1029 wrote:
       | It's amazing how much difference 0.5 ms can make in stereo imaging.
       | Any decent home audio receiver has the ability to set the
       | distance of each speaker individually so that you can compensate
       | for physical placement constraints.
       | 
       | If you get a stereo pair perfectly locked in on delay to
       | eardrums, you can produce an extremely compelling listening
       | experience for those in a very specific region of the room.
       | 
       | Finding out about all this can be revolutionary for some music
       | enthusiasts. Once you get a good listening setup (or headphones),
       | you start going through old things to see how the "stage
       | presence" sounds, or if you are now able to physically place each
       | instrument in the virtual space.
        
         | function_seven wrote:
         | My dad bought an Isuzu Impulse in 1985. It had a fancy OEM
         | stereo system with Technics branding, and a button for the
         | driver to toggle the imaging. The button was on the 7-band
         | graphic EQ that came with the car. (I don't think I've ever
         | seen another factory stereo with physical EQ sliders like
         | that.)
         | 
         | It was magic, if you were sitting in the driver's seat.
         | Toggling that button made the sound switch from "okay" to
         | "magically spatial."
         | 
         | I'm pretty sure that led to a conversation about how we locate
         | sounds. Also wonder how it was implemented in 1985. I doubt
         | there were DSP chips in there. What's the "simple" way of
         | adding delay to some speakers?
        
           | djtriptych wrote:
           | My friend had an Acura in the 90s with physical EQ sliders.
           | Can't remember if it was OEM or not but my guess is that it
           | was.
        
           | TheOtherHobbes wrote:
           | The simple way is to add some cross-channel antiphase to both
           | channels. So R = (R - cL), and L = (L - cR) where c << 1.
           | 
           | This is very cheap and easy with opamps.
           | 
           | It does indeed sound magical, and makes the stereo image
           | expand beyond the speakers.
           | 
           | It also does weird things to a mono mixdown and if c is too
           | big you get a hole in the middle. But if you keep c small and
           | use it as the final effect just before the speakers that's
           | not a problem.
           | 
           | You could also use BBD[1] chips to add a ms or so of analog
           | delay, but that's less likely because it would have been more
           | complex and expensive.
           | 
           | Digital audio delays had appeared in studios by the late
           | 1970s, but they were still more expensive than BBDs in the
           | mid-80s, so unlikely for in-car use.
           | 
           | [1] Bucket Brigade Delay
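
The cross-channel antiphase formula from the comment above, written out on sample lists. This is a digital sketch of what the comment describes as an op-amp circuit; the function name and the default c = 0.1 are assumptions.

```python
def widen(left, right, c=0.1):
    """Apply L' = L - c*R and R' = R - c*L sample by sample (c << 1)."""
    new_left = [l - c * r for l, r in zip(left, right)]
    new_right = [r - c * l for l, r in zip(left, right)]
    return new_left, new_right
```

Content common to both channels is scaled by (1 - c) and the difference between them by (1 + c), which is consistent with the remarks about the mono mixdown and the hole in the middle when c is large.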
        
           | jasonwatkinspdx wrote:
           | > I'm pretty sure that led to a conversation about how we
           | locate sounds.
           | 
           | So, psychoacoustics is incredibly complicated. There's
           | something like 13 different mechanisms that co-operate in
           | sound localization.
           | 
           | However, the bulk of it was known quite a bit before 1985,
           | and had nothing to do with the "spatialize me" button on a
           | specific car stereo.
           | 
           | There's no simple way to add a delay to some speakers unless
           | you're working in the digital domain. In analog you have two
           | basic choices. With passive components, you build a ladder
           | filter, which is as the name suggests, just a long chain of
           | low pass or all pass filters. Each "rung" only adds group
           | delay on the scale of a couple usec, so these get very big
           | and expensive fast. They also suffer from accumulated
           | imprecision issues. With active components you can create a
           | feedback loop through an op amp. This is how guitar delay
           | pedals work, but the more delay you have the more distortion
           | you introduce.
           | 
           | Technically there's a 3rd way: extremely long wires, but
           | that's basically never practical.
           | 
           | Thankfully these days everything starts out in the digital
           | domain, so you just need a controllable fifo before the DAC.
           | Entry level home theater receivers have had this since circa
           | 2000.
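
A minimal sketch of the "FIFO before the DAC" idea, assuming one sample is processed at a time. The class and helper names are hypothetical, and the 48 kHz rate and 343 m/s speed of sound are assumed defaults, not values from any particular receiver.

```python
from collections import deque

class SampleDelay:
    """Fixed delay line: each sample comes out delay_samples calls later."""

    def __init__(self, delay_samples):
        if delay_samples < 0:
            raise ValueError("delay_samples must be non-negative")
        self._fifo = deque([0.0] * delay_samples)  # pre-filled with silence

    def process(self, sample):
        self._fifo.append(sample)
        return self._fifo.popleft()

def distance_to_samples(distance_m, sample_rate_hz=48000, speed_m_s=343.0):
    """Delay in whole samples that compensates for distance_m of extra path."""
    return round(distance_m / speed_m_s * sample_rate_hz)
```

A receiver-style "speaker distance" setting would map to something like `SampleDelay(distance_to_samples(extra_distance_m))` on the nearer speaker.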
        
             | monocasa wrote:
             | There's also ultrasonic delay lines. That's how they did
             | reverb back in the analog only days.
        
             | foobarian wrote:
             | The oldschool way is to convert the signal to the physical
             | domain. https://anasounds.com/analog-spring-reverb-how-it-
             | works/
             | 
             | Edit: re: extremely long wires, this is how some of the
             | physical layer testing is done for networking equipment.
             | Have rolls of hundreds of miles of fiber sitting on the
             | ground to simulate large distances between switches.
             | 
             | Edit2: https://www.m2optics.com/products/fiber-test-
             | boxes/multi-spo... :-)
        
               | gugagore wrote:
               | Here are a variety of techniques:
               | https://en.m.wikipedia.org/wiki/Analog_delay_line
               | 
               | I think a better word than "physical" domain is
               | "mechanical" domain. Mechanical waves propagate much more
               | slowly than electromagnetic waves.
        
             | canadianfella wrote:
             | >Technically there's a 3rd way: extremely long wires, but
             | that's basically never practical.
             | 
             | Has this ever been done?
        
             | noir_lord wrote:
             | > Technically there's a 3rd way: extremely long wires, but
             | that's basically never practical.
             | 
             | Did the math, ~170km (assuming 300,000 km/s and speed of
             | light in copper being 90%) - that's a _long_ wire, would
             | also have to be superconducting or very high voltage ;).
             | 
             | Whatever room temperature superconductors end up costing,
             | audiophiles will be the early adopters ;).
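
The wire-length arithmetic as a sketch. The comment does not state the target delay, so the 0.5 ms from the headline is assumed here; the 300,000 km/s and 90% figures are the comment's own, and the function name is hypothetical.

```python
C_KM_S = 300_000.0  # rounded speed of light, as used in the comment

def wire_length_km(delay_s, velocity_factor=0.9):
    """Length of wire whose propagation time equals delay_s."""
    return delay_s * C_KM_S * velocity_factor
```

With those assumptions a 0.5 ms delay needs about 135 km of wire. Dividing by 0.9 instead of multiplying gives roughly 167 km, which is possibly where the ~170 km figure comes from.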
        
             | function_seven wrote:
             | Hell, even if the premise of the conversation was wrong, I
             | still learned something that day :)
             | 
             | DSPs were a thing in the 80s, right? I guess the question
             | is: were they so expensive that it was unlikely to find one
             | in an OEM stereo of a mid-priced car? A sibling comment
             | mentions that this button may have just been a stereo
             | expando kind of thing, rather than localizing the sound
             | stage through signal delays. I'm thinking that may be
             | right, and that I'm misremembering the feature. It would be
             | awesome to find some original owners manuals for the 1985
             | Impulse that have any mention of this feature. My DDG-fu is
             | failing me right now.
             | 
             | Now, which one of those 13 mechanisms is failing on me when
             | the damn cricket keeps "moving" around the room as I try to
             | follow the chirp?
        
               | serf wrote:
               | Here's a picture of the head-unit, maybe?[0]
               | 
               | or an earlier one? [1]
               | 
               | [0]: https://i0.wp.com/www.curbsideclassic.com/wp-
               | content/uploads...
               | 
               | [1]: https://www.thetruthaboutcars.com/wp-
               | content/uploads/2014/10...
        
               | jasonwatkinspdx wrote:
               | DSPs were around, but there's no way that car stereo was
               | digitizing the signal, delaying it, then converting it
               | back to analog.
               | 
               | I get it was an impressive experience, but it's
               | essentially certain it's what the other poster said: just
               | boosting the out of phase content between the channels.
               | This was a very in vogue effect at the time. I remember
               | listening to the top 40 on the radio one time and Madonna
               | had some new song where they were hyping it as surround
               | sound and turning it into a whole event. In any case this
               | effect can be more effective than you might assume. After
               | all, the first consumer version of Dolby Surround was
               | just this out of phase content run through a bandpass
               | filter and sent to surround speakers.
               | 
               | I knew a family friend with the 90s version of the
               | Impulse. Neat quirky car from what I remember. As a kid I
               | definitely thought it was very cool.
        
               | gugagore wrote:
               | Do you mean that they simply amplify the difference
               | between the two signals?
        
               | jasonwatkinspdx wrote:
               | Yup, that's the basic idea. It's a very simple circuit,
               | which was the appeal before the digital everything era we
               | live in now.
        
               | wiredfool wrote:
               | Sounds like QSound. There was a Roger Waters album
               | mastered with it (Amused to Death) that has a dog barking
               | way outside the normal sound stage.
        
           | wgj wrote:
           | At that time, the most common way to do stereo enhancement
           | was to decrease the L + R component of the stereo signal. [0]
           | L+R/L-R was (and still is) a common way to encode stereo
           | signals, including for FM radio. [1]
           | 
           | The impact on the stereo field by just changing the mix of
           | these two components is profound. No signal delay needed.
           | 
           | For signals that are simple L and R, sum them to get L+R and
           | difference to get L-R. So you can use this technique on any
           | stereo source.
           | 
           | [0] https://www.sweetwater.com/insync/stereo-enhancement-
           | work-mo... [1]
           | https://en.wikipedia.org/wiki/FM_broadcasting#Stereo_FM
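
A sketch of the sum/difference technique described above: derive L+R and L-R from plain left/right samples, rescale them, and convert back. The function name and default gains are assumptions chosen for illustration.

```python
def adjust_width(left, right, mid_gain=0.7, side_gain=1.0):
    """Re-mix a stereo signal by scaling its sum (mid) and difference (side)."""
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r) * mid_gain    # the L+R component, reduced
        side = 0.5 * (l - r) * side_gain  # the L-R component
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right
```

With both gains at 1.0 the signal passes through unchanged; lowering `mid_gain` is the "decrease the L + R component" step.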
        
             | function_seven wrote:
             | I may be misremembering--I was a kid at the time--but I
             | distinctly remember the "wow" effect only working if you
             | were sitting in the driver's seat. To me that suggests that
             | the signals were delayed to the nearer speakers to center
             | the sound stage around the driver rather than some
             | arbitrary point above the center console.
             | 
             | But looking up photos online, I see the button I was
             | talking about labeled as "Ambience", which kind of suggests
             | the method you're describing.
             | 
             | It also turns out that using the Internet to find technical
             | info about a 40-year-old car that was never popular to
             | begin with is very hard!
        
           | jareklupinski wrote:
           | a reaaaaaaaalllllllyyy long wire :P
           | 
           | pro audio equipment sometimes used 'bucket brigade' chips to
           | implement a delay line (shame that you can't get them
           | anymore)
        
             | wgj wrote:
             | Good news. Bucket brigade (BBD) chips are still produced
             | and available from various sources. [0][1] They are used
             | today in a lot of guitar/synth gear.
             | 
             | [0] https://www.electrosmash.com/mn3007-bucket-brigade-
             | devices [1] https://www.coolaudio.com/features-
             | page.php?product=V3205SD
             | 
             | The way they work by design, clock noise needs to be
             | filtered out of the final signal, so relatively heavy low
             | pass filtering is standard. The result isn't very hi-fi.
        
               | jareklupinski wrote:
               | whoa ty! just unblocked a 5-year old project :)
        
             | Tempest1981 wrote:
             | Also called https://en.wikipedia.org/wiki/Analog_delay_line
        
               | bellyfullofbac wrote:
               | And they even have this digitally (fiber optic). Relevant
               | Tom Scott: https://www.youtube.com/watch?v=d8BcCLLX4N4
        
         | JKCalhoun wrote:
         | I built a pair of full-range speakers and was therefore
         | introduced to that effect -- an effect I had only experienced
         | before with headphones.
         | 
         | No special modern receiver in my case, just simple speakers. It
         | seems the 2-way, 3-way speakers I grew up with kill stereo
         | imaging. (Never mind the crossovers eat power and diminish the
         | efficiency of the speaker -- requiring a higher current amp,
         | etc... Lovely what a small 1/2 Watt tube amp and a pair of
         | full-range drivers can sound like ... and throw in a sub.)
        
           | bob1029 wrote:
           | > it seems the 2-way, 3-way speakers I grew up with kill
           | stereo imaging.
           | 
           | Yeah there are ways to build crossover networks that can
           | minimize these issues (phase shift) across the frequency
           | range. The most ideal crossover would dissipate 100% of the
           | undesired acoustic power as heat rather than storing it as
           | reactive energy in inductors, but the frequency domain can be a
           | tricky beast to dance with.
           | 
           | The best overall approach is probably the 4th order Linkwitz-
           | Riley filter:
           | 
           | https://en.wikipedia.org/wiki/Linkwitz%E2%80%93Riley_filter#.
           | ..
        
         | Saris wrote:
         | This is why both sets of speaker cables to my amplifier are the
         | same length.
         | 
         | It does make me curious about the recording process and effects
         | if one microphone has, let's say, 50 feet of cable and another on
         | a different musician has 100 feet of cable.
        
           | mohaba wrote:
           | It's easy to do the math and show this is nonsense.
           | 
           | If a foot is a nanosecond for c, then a 50 ft difference is 50 ns,
           | or about 10,000 times smaller than the 0.5 ms threshold.
           | Even if the speed in a cable was 1%c, it's still under the
           | proposed threshold.
           | 
           | Even so, try convincing my dad that it does not matter.
        
           | canadianfella wrote:
           | >This is why both sets of speaker cables to my amplifier are
           | the same length.
           | 
           | Why?
        
       | cesaref wrote:
       | It's not clear from the article, but I think what they are doing
       | is delaying part of the frequency range of a signal. All this
       | talk of negative delays is really confusing, it's just that they
       | are delaying frequencies outside the range they are considering
       | by a larger amount than within their range of interest. The
       | overall effect is the same, there is going to have to be latency
       | through the system.
       | 
       | So, I presume, their listeners are hearing a timbral shift caused
       | by the harmonics being advanced/retarded. This is not the same
       | effect as feeding a delayed signal into one ear and a non-delayed
       | into the other (the relative phase being used for locating a
       | sound).
       | 
       | They are I guess exploring how accurately they need to reproduce
       | an impulse response to make an accurate transducer, and from the
       | article, I believe they have concluded that they need to be more
       | accurate than they previously thought. Given Genelec make a range
       | of speakers with DSP for this sort of thing, it's I guess partly
       | a marketing campaign to convince people that there is some
       | benefit from their DSP corrected monitors.
       | 
       | Of course, for a producer to have an accurate monitor is useful,
       | but if the listening public have non-aligned drivers, the sonic
       | benefits from worrying about this stuff are of somewhat limited
       | value.
        
         | Ballas wrote:
         | Yes, the way they describe it with "time traveling" is
         | especially confusing. I still don't understand exactly what
         | their claim is or what they intend on selling.
        
         | xen2xen1 wrote:
         | Takes me a minute to remember monitors and speakers can be the
         | same exact thing, and not a monitor with speakers built in
         | (like a TV).
        
         | dr_dshiv wrote:
         | Right, it seems people can detect the .5ms phase difference in
         | the timbre. It is a noticeable difference, but they aren't
         | detecting the delay, per se.
        
         | nitrogen wrote:
         | The claim of novelty for time-travel filtering seems odd, so I
         | wonder if this was a mistake in transcribing the quotation for
         | the article. Any FIR filter can effectively time travel if you
         | consider the peak to be time zero. Negative delay of one part
         | of the sound is positive delay of the rest. It's not new to
         | this article, but this is something that is new with DSP that
         | is not so easy without DSP.
         | 
         | The DRC room correction software could achieve excellent
         | results with frequency-dependent temporal correction back in
         | the early/mid 2000s: http://drc-fir.sourceforge.net/
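
A small sketch of the "peak as time zero" point, using a toy FIR rather than any real room-correction filter. The helper names are hypothetical. Taps that sit before the largest tap look like negative delay once the output is realigned to that peak, while the system as a whole still has `peak` samples of latency.

```python
def fir(x, h):
    """Causal FIR filter: y[n] = sum over k of h[k] * x[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

def realign_to_peak(y, h, n_out):
    """Treat the largest tap of h as time zero by discarding that much latency."""
    peak = max(range(len(h)), key=lambda k: abs(h[k]))
    return y[peak:peak + n_out]
```

For example, with a main tap at index 7 and a smaller tap at index 3, a click at sample 10 comes out at sample 10 after realignment, accompanied by a copy at sample 6, four samples "early".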
        
         | TrackerFF wrote:
         | I believe that the BBE Sonic Maximizer
         | (http://www.bbesound.com/products/sonic-
         | maximizers/default.as...) products fix that. They (time) align
         | the components in the frequency spectrum, for lack of a better
         | description. edit: Or rather, it works as a dynamic EQ to
         | dampen/amplify certain frequencies.
         | 
         | Most guitarists that have a rack-setup, probably have or have
         | tried these - in practice, the effect is just a more smoothed
         | sound.
        
       | erdewit wrote:
       | > The sounds easiest to identify were a castanet, a percussion
       | instrument, and short clicks.
       | 
       | Not coincidentally, this is exactly the type of music where lossy
       | encodings such as MP3, Ogg or AAC fail.
        
       | tuatoru wrote:
       | > castanets and short clicks
       | 
       | I was doing just this experiment three days ago. I have a new
       | audio interface, and used Audacity to try to establish its round-
       | trip delay. I created a rhythm track using a click sound,
       | connected one channel's playback to an input, and recorded it.
       | The latency was about 35.1 ms.
       | 
       | Audacity (the 2.x series) allows adjustment of latency
       | compensation in whole milliseconds.
       | 
       | Edit: I set the latency adjustment for -35 ms so there was only
       | the 0.1 ms-ish residual latency.
       | 
       | Playing one channel of the original track and the recording
       | together (after adjusting the channel balance for equal loudness)
       | was quite disconcerting.
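
One way to estimate a loopback delay like this from the two recordings is brute-force cross-correlation. This is a sketch only, not what Audacity does internally; the function names are hypothetical and the 48 kHz rate in the usage note is an assumption.

```python
def estimate_lag(played, recorded, max_lag):
    """Return the lag (in samples) at which recorded best matches played."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(
            p * recorded[i + lag]
            for i, p in enumerate(played)
            if i + lag < len(recorded)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def lag_to_ms(lag_samples, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return lag_samples / sample_rate_hz * 1000.0
```

At 48 kHz, a lag of 1685 samples is about 35.1 ms. A strictly periodic click track can produce several similar correlation peaks, so a single click or a short noise burst is probably a safer test signal.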
        
         | FelipeCortez wrote:
         | Your buffer size is probably too high. Try 128 or 64
        
           | jacquesm wrote:
           | That is always going to be a compromise between the chance of
           | a buffer underrun and annoying latency. Compared to analog
           | digital audio solves some problems but introduces a whole
           | raft of new ones.
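
The latency contributed by a single buffer is just its length divided by the sample rate. A sketch with a hypothetical function name; a real round trip adds further buffers and converter delay on top of this.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Duration of one audio buffer in milliseconds."""
    return buffer_frames / sample_rate_hz * 1000.0
```

At an assumed 48 kHz, 128 frames is about 2.7 ms and 64 frames about 1.3 ms per buffer.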
        
       | LatteLazy wrote:
       | Isn't this how we hear directionally? At 300 m/s, 0.5 ms is 150 mm
       | of travel. Ears are only about 250 mm apart, so if you couldn't
       | hear a 0.5 ms diff you'd have no idea where sound was coming from,
       | right?
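
A rough sketch of the arrival-time difference between the ears for a distant source, using a simple path-difference model that ignores the head itself. The 250 mm spacing is the comment's figure; the 343 m/s speed of sound and the function name are assumptions.

```python
import math

def interaural_delay_ms(azimuth_deg, ear_spacing_m=0.25, speed_m_s=343.0):
    """Arrival-time difference between the ears for a far source.

    azimuth_deg is 0 straight ahead and 90 directly to one side.
    """
    path_difference_m = ear_spacing_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / speed_m_s * 1000.0
```

Under these assumptions the largest possible difference, for a source directly to one side, is about 0.73 ms, so 0.5 ms is already a large fraction of the available range.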
        
         | _Microft wrote:
         | A timing difference is not the only way to tell from which side
         | a sound came. Since the ears are at opposite sides of the head
         | and the external ears have particular (but unchanging [0])
         | shapes, a sound will be changed in a different but predictable
         | way depending on the direction it comes from. This effect also
         | helps with detecting whether a sound comes from the front/back
         | or top/bottom of the listener.
         | 
         | https://en.wikipedia.org/wiki/Head-related_transfer_function
         | 
         | [0] Change the environment around the ear a bit, e.g. by
         | putting a hand relatively close to the external ear and listen
         | how doing that changes sounds.
        
           | LatteLazy wrote:
           | Thanks! That was a good read!
        
       | vernie wrote:
       | There's a saying in real-time audio processing that "the ears
       | don't blink"
        
       | nabla9 wrote:
       | >This work demonstrates how the group-delay response of
       | headphones and loudspeakers can be perceptually tested, and leads
       | to a better understanding of how audio systems should be
       | equalized to avoid audible group-delay distortion.
        
         | szszrk wrote:
         | I find all of this hilarious as what I've learned about audio
         | production is that delays are used as equalizers, comb
         | filtering effect used in practice. Not just on analogue
         | devices, I was told once by (AFAIR?) Drumgizmo devs that
         | digital as well!
         | 
         | Btw. Drumgizmo is an amazing opensource drum plugin, that you
         | mix like you would mix any real drum.
        
           | nabla9 wrote:
           | You don't want your loudspeakers to have their own
           | distortions.
           | 
           | You want to transmit the distortions audio production made
           | without add-ons.
        
       | [deleted]
        
       | tintt wrote:
       | Given the speed of sound, a 0.5 ms delay occurs when you talk to
       | anybody more than half a foot away
        
       | phreeza wrote:
       | Humans can do way better than that, somewhere in the 20
       | microsecond range. See for example Mills (1958).
       | https://scholar.google.com/scholar?cluster=11401696136259886...
       | 
       | Edit: in fact, humans are among the best animals at all at this
       | task, I think it is not really clear why. They are on par with
       | highly specialist auditory hunters like barn owls. May be that
       | humans can use their higher cognitive faculties to somehow get
       | better at the task than the average animal but it's not clear to
       | me how that would happen.
        
         | chiefalchemist wrote:
         | I would imagine this might not be an ear mechanics issue, but a
         | brain processing issue. Some animals have a dedicated "processor"
         | and others do not. While in the larger more agile human brain
         | it's simply one of many features. Maybe?
        
         | andi999 wrote:
         | Since humans rely on groups and communication it is probably to
         | hear/locate who in a group is talking.
        
       | zwieback wrote:
       | So what does that mean for a large symphony where musicians are
       | spread out much farther than sound travels in a millisecond?
       | Let's say it's 10m, that would be about 29ms. I've heard really
       | good musicians and conductors compensate for that but I'm
       | doubtful.
       | 
       | So we must be talking about more subtle effects, like phase
       | shifts, and not about how the waveform envelopes line up.
        
         | temporallobe wrote:
         | It's the same for organists in large cathedrals where there is
         | a huge delay and subsequent reverb, as the pipes are often
         | quite far from the player and sound bounces off every
         | conceivable hard surface before making its way back to the
         | organist. They learn to compensate for this mostly by just
         | being very accurate players and trusting that the outcome is
         | correct.
        
         | herendin2 wrote:
         | It's one reason that a conductor sends visual cues to the whole
         | orchestra at the speed of light. That's the base clock source
        
           | oscardssmith wrote:
           | one interesting thing is that you will fairly frequently see
           | professional orchestras be out of phase with the conductor,
           | but everything still works out.
        
             | zwieback wrote:
             | Does a really good conductor train each section to come in
             | on different parts of the visual cue then, e.g. double bass
             | would play before the violins in the front? I guess you
             | can't precisely solve this problem for more than a single
             | listener.
        
               | toast0 wrote:
               | As a (former) bass player, you need to start a bit early,
               | yeah. But that's not so much for distance but because it
               | takes longer for the sound to start. Pipe organ is of
               | course much worse. I would do one concert a year sitting
               | next to an organist... The keyboard noises and motion
               | from him playing way before me were pretty distracting.
               | And when we stopped during rehearsal, he'd stop on the
               | keys right away, but the organ would keep going for a
               | bit.
        
               | zwieback wrote:
               | Yeah, I noticed that when I started playing upright bass,
               | compared to the bass guitar the sound starts with a
               | noticeable delay.
        
               | InitialLastName wrote:
               | If the listener is relatively further away from the
               | orchestra than the orchestra members are to each other,
               | it mostly all balances out, especially when you add the
               | room and bandshell reverberation (which tends to wash the
               | note onsets to some degree).
               | 
               | This is also helped by the instruments in the back of the
               | orchestra favoring the low frequencies, where the
               | listener has less access to precise timing information.
        
         | Tushon wrote:
         | Since light travels a lot faster than sound, I imagine the
         | visual cues of conductor and/or lead of your section and
         | practice of staying on beat matter more than hearing someone
         | across the symphony.
        
         | Jordrok wrote:
         | To add on to what the other commenters have already said - this
         | is especially true for marching bands where you are even
         | further spread out and your distance to various sections
         | changes over the course of the performance. You have to learn
         | to (at least somewhat) disregard what your ears are telling you
         | and focus solely on the drum majors for staying in time.
         | Listening to the drumline to keep yourself on beat can be a
         | very bad idea when they're half a football field away.
         | 
         | This effect is amplified even more for enclosed stadiums where
         | there tends to be a large amount of echo, and can really be
         | quite disorienting the first time you experience it.
        
       | 1-6 wrote:
       | There goes my ambition in syncing musicians remotely. I guess
       | introducing a little bit of delay and using GNSS clocks would
       | sort of help.
        
       | matchagaucho wrote:
       | Hearing predators approaching from behind is the evolutionary
       | justification for this trait.
       | 
       | Oculus, and other 3D platforms, utilize HRTF (head related
       | transfer functions) with subtle millisecond phase shifts to
       | localize objects.
        
       | hoseja wrote:
       | Yes, in fact, human ear can detect up to ~1/20000s delays in
       | sound...
        
       | irjustin wrote:
       | VLC and adding/removing the sound delay. Not half-millisecond
       | timing, but man it got super annoying +/-5ms.
        
         | globular-toast wrote:
         | I initially thought about synchronisation with video too (aka
         | lip sync). However, I don't think they are talking about video
         | but merely whether a difference is detectable (rather than
         | acceptable). I suspect the threshold for lip sync acceptability
         | is a lot higher than what they measured here. I would have
         | thought the threshold was higher than 5ms, but I haven't done
         | any rigorous testing.
        
           | drmpeg wrote:
           | See Figure 2 on page 4.
           | 
           | https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-
           | BT.1359-1-...
        
             | globular-toast wrote:
             | Thanks, that's really interesting, especially how video lag
             | is more detectable than audio lag. I recently set up a
             | display with a 130ms lag and this explains why it was so
             | bad before I corrected the audio!
        
         | gsich wrote:
         | Delay compared to what? Video? I doubt normal stereo audio
         | files are out of sync.
        
           | jacquesm wrote:
           | The article is quite fluffy, but from what I gather it is
           | about the difference in arrival time when a sound is
           | perceived through multiple filters for different frequency
           | bands, each of which has a programmable delay.
           | 
           | One possible application for something like this would be to
           | design a loudspeaker where the time-of-flight of the
           | soundwave travelling from a tweeter to the listener would be
           | the same as the same sound sent from a midrange speaker. Most
           | designs do not adjust for such differences.
           | 
           | The more interesting version - to me, at least - would be
           | where you compare between the two arrivals at your left and
           | right ear, which can do very small fractions of that even
           | with relatively high pitched sounds allowing for sound source
           | location based on phase.
        
             | gsich wrote:
             | I think most speaker setups have a microphone that you
             | place where the listener is to calibrate. Don't know if
             | that involves delay measurements.
        
               | jacquesm wrote:
               | There was the Philips concept called MFB which integrated
               | an accelerometer into the low frequency speakers in order
               | to match the desired waveform with the one that actually
               | went out. Quite a neat little concept, they are still
               | somewhat popular in the HiFi scene.
               | 
               | The microphone systems mostly calibrate for frequency
               | loss across the spectrum, time-of-flight adjustments I
               | haven't seen but it's quite possible they're out there
               | somewhere, but they would require a pretty nifty delay
               | mechanism or a digitization step and then turning the
               | signal yet again back into analog which would probably
               | cause more problems than it solved.
        
             | tuatoru wrote:
             | > One possible application for something like this would be
             | to design a loudspeaker where the time-of-flight of the
             | soundwave travelling from a tweeter to the listener would
             | be the same as the same sound sent from a midrange speaker.
             | Most designs do not adjust for such differences.
             | 
             | Most designs, true. Very high-end speakers do sometimes
             | incorporate all-pass filters in parts of their crossovers
             | to compensate for the set-back of the apparent sound source
             | in woofers. But very few people tilt their speakers so that
             | the distance from the midrange and woofer to their ears is
             | exactly the same as that from the tweeter at the listening
             | position, and getting complicated crossovers to be reliable
             | at 100 watt power levels is very expensive.
             | 
             | Time-of-flight compensation is trivial in the digital
             | domain, so some active crossovers (which operate on line
             | level signals, before the power amplifiers) also do this.
             | The downside, of course, is needing one power amplifier per
             | speaker cone, not just one per box, and the extra wires if
             | using separate amplifier blocks rather than active
             | speakers.
             | 
             | The consensus seems to be that it's not worth the trouble
             | in ordinary listening situations.
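
To illustrate why time-of-flight compensation is simple in the digital domain, here is a minimal sketch: convert the path-length difference between two drivers into a whole number of samples and delay the nearer driver's signal by that amount. The 3 cm offset, the 48 kHz rate, the 343 m/s speed of sound and both function names are hypothetical; real active crossovers may use fractional delays or all-pass filters instead.

```python
def compensation_delay_samples(offset_m: float, sample_rate_hz: int,
                               speed_m_s: float = 343.0) -> int:
    """Whole-sample delay matching an extra acoustic path of offset_m."""
    return round(offset_m / speed_m_s * sample_rate_hz)


def delay_signal(samples, n: int):
    """Delay a signal by n samples, zero-padding the start and keeping the length."""
    samples = list(samples)
    n = max(0, min(n, len(samples)))
    return [0.0] * n + samples[:len(samples) - n]


if __name__ == "__main__":
    # Hypothetical: the tweeter sits 3 cm closer to the listener than the woofer.
    n = compensation_delay_samples(0.03, 48000)
    tweeter_feed = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(n, delay_signal(tweeter_feed, n))
```

Note that one sample at 48 kHz is about 21 microseconds, so whole-sample delays only resolve path differences in steps of about 7 mm.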
        
               | jacquesm wrote:
               | The one-amp-per-device is actually quite common in
               | monitor speakers.
        
       | webdevtway87 wrote:
       | Meanwhile, Android Chrome deliberately introduces a ~300
       | millisecond audio delay to make web games impossible and force
       | game devs to use the Play Store so that Google gets its pound of
       | flesh. The superpowered[1] database has many measurements showing
       | how bad Android is.
       | 
       | Those of you with Android devices can try this ancient web audio
       | demo to see that the issue remains unfixed all these years later,
       | except on certain select premium devices (certain Samsungs,
       | etc.).
       | 
       | https://webaudiodemos.appspot.com/TouchPad/index.html
       | 
       | You should hear sound the instant a rectangle is touched/clicked
       | and you will with a desktop browser but not with most Androids.
       | 
       | Apple also employs dark patterns on iOS Safari to make web games
       | impossible:
       | 
       | 1. No full screen. ("But the user wouldn't know how to exit full-
       | screen - unless they buy from the App Store! We also plan to make
       | full screen PWAs impossible because of, like, Reasons and totally
       | not because of our 30% App Store cut!")
       | 
       | 2. Slow WebGL and no WebGL 2.0. ("But we need to run extra shader
       | validation security checks every single frame even though we
       | technically only need to do it once up front! But this extra
       | Privacy and Security[TM] is totally unnecessary if you buy from
       | the App Store!")
       | 
       | 3. No device motion tilt control. ("Allowing the user to consent
       | to tilt control would violate the user's Privacy and Security[TM]
       | - unless they buy from the App Store!")
       | 
       | 4. No JS debug console unless you register an Apple developer
       | account, download 10+ gig Xcode on a Mac-only computer, and ask
       | Apple for authorization to debug javascript on your own freakin
       | iPhone you paid for with your own damn money.
       | 
       | I was forced to give up using "open" web technologies for games
       | and instead go with native code game engines (Godot, Unreal,
       | Unity, ...).
       | 
       | [1] https://superpowered.com/latency
        
       | JoeAltmaier wrote:
       | Oh I knew this. Doing the Sococo media mixer, we understood that
       | audio sensitivity is orders of magnitude more critical than
       | video. Heck, video could stall and delay and be fuzzy and folks
       | didn't say much. But miss a single sample of audio at 44KHz and
       | folks would speak up.
        
       | alex_smart wrote:
       | This is news? :o
        
       | avnigo wrote:
       | I guess that makes sense considering we use the delay of sounds
       | reaching each ear separately to determine the direction sound
       | comes from.
       | 
       | Assuming the speed of sound is around 340 m/s, in half a
       | millisecond it travels about 17 cm, which, I would guess, is
       | larger than the average distance between our ears.
       | 
       | So, using that rough estimation, I would cautiously extrapolate
       | that we probably detect a delay under 0.5 ms, but I'd be
       | interested to see what "detect" means exactly.
        
         | [deleted]
        
         | jacquesm wrote:
         | We do much better than that, we use a combination of phase
         | (arrival time of the sound-front if you wish at low
         | frequencies) and amplitude to determine direction.
         | 
         | Good article on this:
         | 
         | http://alumni.media.mit.edu/~araz/sss/Sound_Localization.htm...
        
           | sudosysgen wrote:
           | Don't forget the transformation of the signal by the head,
           | chest and outer ear! It's quite amazing.
        
           | jcims wrote:
           | Yeah I'm thinking the relative phasing of the group delay is
           | essentially having a comb filter effect on the audio, and the
           | difference is perceived qualitatively rather than anything
           | approximating a time delay.
           | 
           | For anyone not familiar, here's an example of comb filtering,
           | where reflections interfere and you get wobbles in your
           | frequency response (shown in an fft mid video).
           | 
           | https://www.youtube.com/watch?v=Amj4UevyRfU
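
A small sketch of the comb-filter arithmetic mentioned above. It assumes the simplest case, a signal summed with an equal-level copy of itself delayed by a fixed time; whether that is what listeners in the study actually perceived is the commenter's speculation. Function names are illustrative.

```python
import cmath
import math


def comb_gain(freq_hz: float, delay_s: float) -> float:
    """Magnitude response of y(t) = x(t) + x(t - delay_s) at freq_hz."""
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s: float, max_hz: float):
    """Frequencies up to max_hz where the direct and delayed copies cancel."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)  # odd multiples of 1 / (2 * delay)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


if __name__ == "__main__":
    delay = 0.0005  # the half-millisecond figure from the headline
    print(notch_frequencies(delay, 6000.0))
    for f in (500.0, 1000.0, 2000.0):
        print(f, round(comb_gain(f, delay), 3))
```

Under this model a 0.5 ms delay places notches near 1 kHz, 3 kHz and 5 kHz, well inside the audible range, which is why such a delay could be heard as a change in timbre.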
        
         | jasonwatkinspdx wrote:
         | "Detect" for experiments like this is almost certainly an ABX
         | test. That is you get to listen to samples A, B, and X
         | repeatedly and are asked to decide whether X is A or B.
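
For readers unfamiliar with ABX scoring, here is a minimal sketch of how a run of trials might be evaluated: a one-sided binomial test against pure guessing. The thread does not confirm that the study used ABX, and the 12-of-16 figures are purely illustrative.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    # Illustrative run: 12 correct identifications of X out of 16 trials.
    print(f"p = {abx_p_value(12, 16):.4f}")
```

A small p-value suggests the listener really can tell A from B; it says nothing about what cue (time delay, timbre change, or something else) they used to do so.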
        
         | amelius wrote:
         | > in half a millisecond it travels about 17 cm, which, I would
         | guess, is larger than the average distance between our ears
         | 
         | What is the speed and variability of neural signals traveling
         | through the brain?
        
         | robwwilliams wrote:
         | The headline is odd. There are populations of neurons in the
         | auditory system--the medial superior olive--that receive input
         | from both ears. Even in humans these MSO neurons are
         | exquisitely sensitive to binaural differences. This is how the
         | owl catches the mouse at night. In some species delays of 10-20
         | microseconds can be detected and encoded by MSO neurons even
         | tracking up to frequencies well above 40 kHz. This is amazing
         | when you realize that neurons cannot fire at a rate above 1
         | kHz. Phase locking and ensemble encoding is used.
         | 
         | For example, adult mice have small heads, and ear separation is
         | merely 5-7 mm; yet this is sufficient to locate the position of
         | an ultrasonic squeak generated by a mouse pup at 40 kHz. This
         | is a computational feat that requires extreme temporal
         | precision in binaural auditory processing across comparatively
         | noisy wetware (transduction noise, phase locking error,
         | synaptic release noise, conduction velocity smear, dendritic
         | integration in MSO).
         | 
         | And doubly impressive given that the brain has no "given" time
         | base or oscillator to define a compute cycle. We must all build
         | and refine our own internal set of pseudo-clocks for sensory
         | and motor systems, in order to define the cumulative temporal
         | context in which we are embedded.
         | 
         | This is crucial for the mouse to quickly avoid the talons of
         | the owl.
         | 
         | More on timing in brain: @robwilliamsiii (see pinned tweet).
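
The 5-7 mm ear separation quoted above can be turned into a rough interaural time difference. This sketch uses a deliberately simplified model that is an assumption here, not the commenter's: ITD = (d / c) * sin(azimuth), which ignores the sound's path around the head. The 343 m/s speed of sound is likewise assumed.

```python
import math


def itd_us(ear_separation_m: float, azimuth_deg: float, speed_m_s: float = 343.0) -> float:
    """Interaural time difference in microseconds under a simple sine model."""
    return ear_separation_m * math.sin(math.radians(azimuth_deg)) / speed_m_s * 1e6


if __name__ == "__main__":
    for d_mm in (5.0, 7.0):
        for az in (10.0, 45.0, 90.0):
            print(f"d = {d_mm} mm, azimuth = {az:4.0f} deg -> {itd_us(d_mm / 1000.0, az):5.1f} us")
```

Under this model the largest possible delay for a mouse-sized head is about 15-20 microseconds, in line with the 10-20 microsecond range mentioned in the comment.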
        
           | randlet wrote:
           | Thanks for this super interesting comment. The complexity of
           | life never ceases to boggle the mind.
        
         | lebuffon wrote:
         | It's more involved than just left/right delay. We localize
         | sound in the vertical direction and front and back as well.
         | 
         | This is accomplished by the delays created between direct into
         | the ear sound and reflections off the outer ear folds.
         | 
         | These reflections create "comb" filters in the audio spectrum
         | which we learn to associate with direction. It's remarkable.
         | 
         | A test to prove this was so was to fill the outer ear with
         | plasticine and perform localization tests on subjects. They
         | could not localize sound in that condition.
         | 
         | The early work as I recall was at the Heinrich Hertz Institute
         | in the mid 1970s.
         | 
         | I am suspicious that part of what this article is reporting is
         | due to phase cancellation effects causing similar filtration
         | that people can hear as timbre change rather than actually
         | detecting the time delay.
         | 
         | (Source: My recording engineering final paper)
        
           | EForEndeavour wrote:
           | Now I really want some plasticine / modeling clay to attach
           | to parts of my pinnae and experiment with my ability to
           | localize sounds.
           | 
           | Have video games already emulated the spectral effects of
           | sound direction for players using headphones, or even speaker
           | systems with known spatial distribution? I can imagine
           | modulating the sound coming from an in-game object to match
           | its perceived source direction to its location relative to
           | the player.
        
             | kaoD wrote:
             | I think this is called HRTF. The bad news is there are many
             | of them and only a few work for each person.
             | 
             | This often results in being unable to tell front/back
             | apart, frontal sounds perceived as coming from above, etc.
        
             | chiefgeek wrote:
             | I recently purchased a system from Dr Jeffery Thompson for
             | sound healing. It creates custom binaural beats based on
             | the body's stress response (as monitored by heart rate
             | variability). The system includes a pair of sophisticated
             | 3D microphones that go over the ear with the pickups being
             | located as close to the ear canal as possible. This allows
             | them to record the sound as close as possible to what the
             | ear hears. We use them to record the client singing their
             | tone. It then gets disguised and wrapped back into the
             | binaural beat so you are essentially singing to yourself.
             | I'm eager to go out and record some natural sounds in 3D
             | for use as background.
        
               | jmole wrote:
               | Link? I'm trying to picture the apparatus you're
               | describing but it's not making sense. Are there
               | headphones involved or is it just a wearable microphone?
        
             | sudosysgen wrote:
             | Some of them do, yes. For example CS:GO has an HRTF model
             | that does this. It has variability though because each
             | one's ear is slightly different, so these models are
             | general and may work better or worse for different people.
        
             | noir_lord wrote:
             | Some games have it, with good headphones and when it's
             | implemented well it's great otherwise couldn't tell if it
             | was on or off.
             | 
             | There's even a "brand" for it: THX Spatial Audio
        
             | kd5bjo wrote:
             | The problem is that each person's pinnae have a unique
             | geometry, which makes the notches of the comb filter lie at
             | different frequencies for everyone. As far as I know,
             | there's no good way to determine this other than a direct
             | measurement, which requires specialist equipment.
        
               | PaulKeeble wrote:
               | Creative (Sound Blaster) now has a phone app that lets you
               | take a picture of your ear and uses a machine-learned
               | model to produce a somewhat better head-related transfer
               | function based on it. It is an improvement on SBX and
               | CMSS before it, but it's not perfect either.
        
               | nitrogen wrote:
               | Video games, VR, and other headphone systems do use
               | generalized HRTFs for directional audio.
               | 
               |  _Edit:_ I'm currently working on a series of videos to
               | explain sound direction and perception. Some of the code
               | I wrote for my demos is or will be on GitHub.
        
         | [deleted]
        
         | smcl wrote:
         | That's a nice estimation, but I would have imagined direction
         | is down to whether a sound is more muffled or quiet in one ear
         | than the other.
        
           | whiddershins wrote:
           | No it is difference in phase.
        
             | jacquesm wrote:
             | Only for LF.
        
           | vitus wrote:
           | To piggyback on the sibling comments:
           | 
           | If you only determined direction based on relative volume
           | between your ears, you wouldn't be able to distinguish
           | between sounds in front vs behind you.
        
             | ryandvm wrote:
             | You also wouldn't be able to do the cocktail party trick.
             | Your brain's ability to ignore everything that isn't of the
             | desired ITD phase shift is what allows you to selectively
             | ignore most of the noise in the room and focus on a single
             | conversation.
        
             | smcl wrote:
             | Yep that's why I added "muffled" (i.e. not 100% clear).
             | Maybe it's worth clarifying that I wasn't trying to correct
             | the commenter, just adding my original (naive? wrong?)
             | belief about how this worked :)
        
               | Ensorceled wrote:
               | Since you sound sincere here ... I read your comment as
               | disagreeing with and correcting the original comment.
        
               | smcl wrote:
               | Oops, it wasn't intended! I think maybe the "but" in my
               | comment makes it sound that way?
        
               | Cederfjard wrote:
               | Yeah, to me it reads "that's a nice theory you have, but
               | now I'll tell you mine, which I think is the correct
               | one".
        
               | smcl wrote:
               | I just liked how they related the speed of sound to the
               | distance between ears to explain the (possibly less than)
               | ~0.5ms thing, and how it was nicer than the naive
               | explanation I had previously assumed
        
           | mattkrause wrote:
           | It's a combination of both.
           | 
           | The relevant keywords are "Interaural Time Difference" (ITD;
           | this phenomenon) and "Interaural Intensity (or Level)
           | Difference" (IID/ILD; i.e., volume).
           | 
           | In fact, there are a few other mechanisms too. The shape of
           | the pinna (external ear) does some filtering that allows you
           | to distinguish sounds that produce identical ITD/IID.
           | 
           | The neuroscience of this is really fascinating, and the
           | circuits have been worked out pretty well.
        
           | Griffinsauce wrote:
           | That doesn't need to be the case when there are reflections.
        
             | smcl wrote:
             | I'm not sure I follow
        
               | jacquesm wrote:
               | A reflection can easily be louder than the original
               | sound, but the longer time-of-flight allows the ear to
               | distinguish between the two. That's how even in a hall
               | with echoing walls you can still pinpoint a soundsource
               | with relatively high accuracy.
        
               | datameta wrote:
               | The first chapter of The Sound Book by Trevor Cox goes
               | into great detail about this (but without getting deep
               | into the math of it). Well worth the read for those
               | interested in the acoustics of architecture and our
               | perception of how it modifies the soundscape.
        
               | jacquesm wrote:
               | Another interesting one is 'the acoustical foundations of
               | music'.
               | 
               | ISBN 0393090965
        
         | frankus wrote:
         | I remember back in the late 90s I had a BeBox and there was
         | this cool nodes-and-connections audio program where you could
         | add various effects. My friend mentioned this delay effect and
         | so we rigged up a graph where one stereo channel was delayed by
         | some number of milliseconds. The effect was pretty uncanny.
         | Wearing headphones, the sound seemed to be coming exclusively
         | from the non-delayed side, until you removed that headphone and
         | the delayed side was clearly still playing.
        
           | out_of_protocol wrote:
           | That's a built-in echo-cancelling module, allowing you to
           | easily get the correct direction to the true source of sound
        
             | nitrogen wrote:
             | AKA the Haas effect or precedence effect, making the first
             | sound to arrive the most important for localization, even
             | if echoes are louder.
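
The one-channel-delayed experiment described above is easy to reproduce. The sketch below builds a short stereo WAV in memory in which the right channel is the left channel delayed by a chosen number of milliseconds. The click train, sample rate, amplitude and function name are all assumptions chosen for illustration; it is not a reconstruction of the BeOS program mentioned in the thread.

```python
import io
import struct
import wave


def haas_stereo_wav(delay_ms: float, sample_rate: int = 44100,
                    duration_s: float = 1.0) -> bytes:
    """Return WAV bytes: clicks on the left, the same clicks delayed on the right."""
    n = int(sample_rate * duration_s)
    shift = round(delay_ms / 1000.0 * sample_rate)  # delay in whole samples
    click_period = sample_rate // 4                 # four clicks per second
    mono = [1.0 if i % click_period == 0 else 0.0 for i in range(n)]

    frames = bytearray()
    for i in range(n):
        left = mono[i]
        right = mono[i - shift] if i >= shift else 0.0
        # 16-bit signed samples at half of full scale, interleaved L/R.
        frames += struct.pack("<hh", int(left * 16383), int(right * 16383))

    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(bytes(frames))
    return buf.getvalue()


if __name__ == "__main__":
    with open("haas_demo.wav", "wb") as f:
        f.write(haas_stereo_wav(5.0))
```

Played over headphones, a file like this should let you try different delays and hear for yourself when the image shifts toward the undelayed side.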
        
       | jacquesm wrote:
       | It can do a whole lot better than that, it can detect phase
       | changes at a very small fraction of a millisecond. It has to
       | because that's how we determine direction of a sound source at
       | low frequencies.
        
         | quickthrower2 wrote:
         | That it can do that with 2 ears and the latency between them
         | signalling through the brain is amazing
        
           | da_chicken wrote:
           | Well, it's not _just_ latency in many cases. If there's
           | someone typing at the desk to your left, that noise has
           | latency but it has different loudness in your left and right
           | ears. The sound reaching your right ear has to make its way
           | around your head, so it sounds different not merely more
           | delayed. Your brain has learned what all of that means.
        
             | phreeza wrote:
             | In addition to loudness and latency differences, there is
             | even a frequency-specific effect due to the shape of the
             | head and in particular your ear (your pinna to be precise).
             | This is actually probably the reason why our ears have the
             | weird shape they do. This is different for every person,
             | which means that headphone-based simulated spatial audio
             | can never be perfect.
        
           | tiagod wrote:
           | You can even hear the interference between two sine waves,
           | without them actually "mixing" in the air, by feeding one to
           | each ear. This doesn't work for every frequency, but when it
           | does it sounds the same as it would if they were both played
           | to both ears, but it's happening in your brain!
           | 
           | https://en.wikipedia.org/wiki/Beat_(acoustics)#Binaural_beat.
           | ..
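
For reference, here is the arithmetic behind ordinary acoustic beats, as a minimal sketch. It shows that the sum of two sines equals a carrier at the mean frequency multiplied by a slow envelope at half the difference frequency, so the loudness pulses at |f1 - f2|. In the binaural case described above the two tones never physically sum; the code only shows what mixing in the air would produce. The 440 Hz and 444 Hz tones are illustrative.

```python
import math


def summed(f1: float, f2: float, t: float) -> float:
    """Two sine tones added together at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def product_form(f1: float, f2: float, t: float) -> float:
    """The same signal written as carrier times slow envelope."""
    return 2 * math.sin(math.pi * (f1 + f2) * t) * math.cos(math.pi * (f1 - f2) * t)


def beat_frequency_hz(f1: float, f2: float) -> float:
    """Rate at which the loudness of the mixed tones pulses."""
    return abs(f1 - f2)


if __name__ == "__main__":
    f1, f2 = 440.0, 444.0
    print("beat frequency:", beat_frequency_hz(f1, f2), "Hz")
    for t in (0.0, 0.0625, 0.125):
        print(t, round(summed(f1, f2, t), 6), round(product_form(f1, f2, t), 6))
```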
        
             | jacquesm wrote:
             | I've done this and it makes you tired extremely quickly for
             | some reason. It's worse than tuning up a piano from
             | scratch.
        
       | junon wrote:
       | Was just thinking about this the other day. In music production
       | you can determine this; you have to be able to hear just a
       | millisecond of difference sometimes to align things well,
       | especially when you're working with instruments that have
       | considerable delay (e.g. almost all Kontakt instruments).
        
         | jacquesm wrote:
         | Essentially anything that is MIDI based or has an audio buffer
         | somewhere.
        
           | junon wrote:
           | Yes but what I meant is that even though you place a midi
           | note directly on a beat, many instruments still have a pretty
           | considerable delay, on the order of 10-100ms.
        
             | jacquesm wrote:
             | Not only that, those delays themselves tend to change
             | depending on how 'busy' the instrument is, for instance,
             | whether you are using layered patches, multiple notes at
             | once, different instruments at once and so on. It can
             | change from one note to the next, sometimes without any
             | clear hint as to why that is the case.
        
       | mauvehaus wrote:
       | Highly recommended: Seeing The Visitors at ICA Boston. It's a
       | dozen-ish musicians playing in separate rooms of a big house
       | where they can't see each other and can only hear each other over
       | headphones. It's presented as nine channels of video, each with
       | their own audio.
       | 
       | The effect is pretty wild and magical. The music, I suspect, is
       | intentionally written to not require precise timing, and part of
       | the charm of the piece is the musicians feeling each other out as
       | they play. It definitely plays with your expectations of what
       | constitutes music and how you hear timing in music.
       | 
       | The ICA owns the piece (a copy of the piece?), but it isn't
       | currently on display :-(
       | 
       | https://www.icaboston.org/exhibitions/ragnar-kjartansson-vis...
        
         | bazeblackwood wrote:
         | I saw it at the Broad in LA, and it was mind-blowing.
        
       ___________________________________________________________________
       (page generated 2021-08-05 23:03 UTC)