[HN Gopher] The human ear detects half a millisecond delay in sound
___________________________________________________________________
The human ear detects half a millisecond delay in sound
Author : pizza
Score : 216 points
Date : 2021-08-05 03:56 UTC (19 hours ago)
(HTM) web link (www.aalto.fi)
(TXT) w3m dump (www.aalto.fi)
| abeppu wrote:
| I'm confused about why the methodology that they talk about is
| innovative. They talk about shifting when a sound event of a
| given frequency arrives, relative to another sound. But with
| digital audio, especially audio you can prepare in advance, isn't this
| pretty simple? Like, given signals A and B, apply a band-pass
| filter to A (for the desired frequency window), truncate off the
| first k samples based on the time shift of interest, and add to
| B. Even if you do need it in real time, why isn't this equivalent
| to delaying B by k samples?
|
| I must be missing something, because the researcher makes this
| sound like a cool new technique.
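|
| A minimal sketch of the simple approach described above, assuming
| plain lists of float samples. The band-pass step is left out, and
| the names shift_in_samples and delay_and_mix are hypothetical, not
| taken from the article:

```python
def shift_in_samples(shift_seconds, sample_rate_hz):
    """Convert a time shift in seconds to a whole number of samples."""
    return round(shift_seconds * sample_rate_hz)


def delay_and_mix(a, b, k):
    """Return b plus a copy of a that starts k samples later (k >= 0)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    delayed_a = [0.0] * k + list(a)  # prepend k samples of silence
    length = max(len(delayed_a), len(b))
    delayed_a += [0.0] * (length - len(delayed_a))  # zero-pad to equal length
    padded_b = list(b) + [0.0] * (length - len(b))
    return [x + y for x, y in zip(delayed_a, padded_b)]
```

| At an assumed 48 kHz sample rate, a 0.5 ms shift works out to 24
| samples, so sample-level shifts are already finer than the
| threshold in the headline.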
| vbphprubyjsgo wrote:
| For some reason I thought this was about reaction time. _Of
| course_ you can detect small variations in time. Even visually
| the eye will see differences between a light on for 1ms and one
| on for 10ms.
| NDizzle wrote:
| I remember way back when the original Xbox was out, and modded.
| Whatever video player it had let you adjust the sound delay for
| the (new at the time) mpeg4 rips of whatever movie or show you
| downloaded.
|
| It was like a light switch went off when you got the audio delay
| synced up perfectly. I guess I recall this vividly because 1) it
| was a huge problem and 2) the adjustment to audio delay was on
| the shoulder buttons, while fast forward and rewind were on the
| triggers. Such a large problem that the adjustment wasn't buried
| in menus, but right there on the controller. Prime real estate.
| Aerroon wrote:
| Humans can ignore the desync between audio and video to about
| -125 ms to +45 ms. Ideally you don't want to be off by more
| than a frame or two though.
|
| If the desync goes out of these bounds then watching someone
| speak becomes uncomfortable.
| nitrogen wrote:
| Using a rough rule of thumb that one millisecond equals one
| foot, it kind of makes sense that we would be able to follow
| speech coming from someone up to 45 feet away. But the
| tolerance on the negative side is surprising.
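|
| The rule of thumb can be checked with a short calculation. This is
| a sketch that assumes sound travels at about 343 m/s (dry air near
| 20 degrees C); the function name is hypothetical:

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C
METERS_PER_FOOT = 0.3048


def sound_delay_ms(distance_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Time in milliseconds for sound to cover distance_m."""
    return 1000.0 * distance_m / speed_m_s


one_foot_ms = sound_delay_ms(METERS_PER_FOOT)
forty_five_feet_ms = sound_delay_ms(45 * METERS_PER_FOOT)
```

| Under that assumption one foot is a little under 0.9 ms, and 45
| feet comes to roughly 40 ms, so the rule of thumb slightly
| overstates the delay.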
| Guillaume86 wrote:
| It might have been XBMC (xbox media center). Great software and
| it still exists today on several platforms (PC, Android TV,
| etc) under the name of Kodi.
| BoxOfRain wrote:
| I spent lockdown coming up with a music podcast with a friend,
| it's amazing how much even a little delay can throw you off. I
| have a hybrid digital/analogue setup where an analogue mixer is
| fed either from a digital source by a DAC or an analogue source
| via my microphone or a turntable. The analogue audio then goes
| through some analogue processing equipment and is then fed into
| the ADC and recorded/streamed digitally. It's not the most
| efficient setup in the world, but I have a real thing for
| flashing lights and physical switches!
|
| If you talk into the mic with your headphones plugged into the
| analogue side then switch to monitoring the software instead, you
| can really notice the latency even though it's practically not
| that much. I read about a technology that's supposed to prevent
| angry customers from giving poor call centre employees a tirade
| of abuse by echoing their voices back at them on a very slight
| delay which is apparently quite intolerable, and I could believe
| it!
| Cyykratahk wrote:
| This is known as Delayed Auditory Feedback
|
| https://en.wikipedia.org/wiki/Delayed_Auditory_Feedback
| toast0 wrote:
| > echoing their voices back at them on a very slight delay
| which is apparently quite intolerable, and I could believe it!
|
| Sometimes I get this effect on my normal calls, and it is
| pretty awful. Echo wasn't nearly so bad on circuit switched
| calls, but now that everything is digital and has sampling and
| codec delays, the echoes come back so much slower.
| AYBABTME wrote:
| Something I've been wondering is, assuming that the brain works
| at a couple dozen hertz frequency, how can we perform any
| sub-10ms task?
| chousuke wrote:
| Wouldn't the brain send the signals pre-emptively such that
| they trigger the action at the correct time, accounting for
| delays? I don't think it's possible to perform any _conscious_
| action that quickly, but if you've trained to perform some
| pattern, your brain would learn to compensate for any
| processing delays.
| archibaldJ wrote:
| I was playing with my synthesiser the other day and realized what
| I was really doing is using my eardrum to explore the search
| space of oscillation patterns, and their combinatorial sets.
|
| Music in this sense is the class of all time series of
| combinations of such patterns that we find resonating as we are
| able to encode and decode emotions and feelings from these sounds
| (as well as appreciation for them).
|
| I wonder what other cool things we can do with this signal
| processing ability of ours as we enter the age of brain computer
| interfaces and psychedelics.
| bob1029 wrote:
| It's amazing how much difference .5ms can make in stereo imaging.
| Any decent home audio receiver has the ability to set the
| distance of each speaker individually so that you can compensate
| for physical placement constraints.
|
| If you get a stereo pair perfectly locked in on delay to
| eardrums, you can produce an extremely compelling listening
| experience for those in a very specific region of the room.
|
| Finding out about all this can be revolutionary for some music
| enthusiasts. Once you get a good listening setup (or headphones),
| you start going through old things to see how the "stage
| presence" sounds, or if you are now able to physically place each
| instrument in the virtual space.
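|
| A sketch of how a per-speaker distance setting could translate
| into delays, assuming sound travels at about 343 m/s and that the
| nearer speakers are held back to match the farthest one. The
| function name and speaker labels are hypothetical, not any
| receiver's actual method:

```python
def speaker_delays_ms(distances_m, speed_m_s=343.0):
    """Delay (ms) per speaker so every arrival lines up with the farthest one."""
    farthest = max(distances_m.values())
    return {
        name: 1000.0 * (farthest - distance) / speed_m_s
        for name, distance in distances_m.items()
    }


# Example: the right speaker sits 17.15 cm farther from the listener.
delays = speaker_delays_ms({"left": 2.0, "right": 2.1715})
```

| In this example the left speaker would be delayed by about 0.5 ms
| and the right by nothing, so a path difference of roughly 17 cm is
| already enough to reach the half-millisecond figure.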
| function_seven wrote:
| My dad bought an Isuzu Impulse in 1985. It had a fancy OEM
| stereo system with Technics branding, and a button for the
| driver to toggle the imaging. The button was on the 7-band
| graphic EQ that came with the car. (I don't think I've ever
| seen another factory stereo with physical EQ sliders like
| that.)
|
| It was magic, if you were sitting in the driver's seat.
| Toggling that button made the sound switch from "okay" to
| "magically spatial."
|
| I'm pretty sure that led to a conversation about how we locate
| sounds. Also wonder how it was implemented in 1985. I doubt
| there were DSP chips in there. What's the "simple" way of
| adding delay to some speakers?
| djtriptych wrote:
| My friend had an Acura in the 90s with physical EQ sliders.
| Can't remember if it was OEM or not but my guess is that it
| was.
| TheOtherHobbes wrote:
| The simple way is to add some cross-channel antiphase to both
| channels. So R = (R - cL), and L = (L - cR) where c << 1.
|
| This is very cheap and easy with opamps.
|
| It does indeed sound magical, and makes the stereo image
| expand beyond the speakers.
|
| It also does weird things to a mono mixdown and if c is too
| big you get a hole in the middle. But if you keep c small and
| use it as the final effect just before the speakers that's
| not a problem.
|
| You could also use BBD[1] chips to add a ms or so of analog
| delay, but that's less likely because it would have been more
| complex and expensive.
|
| Digital audio delays had appeared in studios by the late
| 1970s, but they were still more expensive than BBDs in the
| mid-80s, so unlikely for in-car use.
|
| [1] Bucket Brigade Delay
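|
| The cross-channel antiphase formula above, R = (R - cL) and
| L = (L - cR), can be sketched digitally as follows. The function
| name widen is hypothetical, and the original circuit would have
| been analog:

```python
def widen(left, right, c):
    """Subtract a fraction c of the opposite channel from each channel."""
    new_left = [l - c * r for l, r in zip(left, right)]
    new_right = [r - c * l for l, r in zip(left, right)]
    return new_left, new_right


# Centered (identical) content is attenuated by a factor of (1 - c) ...
center_left, center_right = widen([1.0], [1.0], 0.2)
# ... while out-of-phase content is boosted by a factor of (1 + c).
side_left, side_right = widen([1.0], [-1.0], 0.2)
```

| This also shows why a large c leaves a hole in the middle: the
| content common to both channels shrinks as c grows.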
| jasonwatkinspdx wrote:
| > I'm pretty sure that led to a conversation about how we
| locate sounds.
|
| So, psychoacoustics is incredibly complicated. There's
| something like 13 different mechanisms that co-operate in
| sound localization.
|
| However, the bulk of it was known quite a bit before 1985,
| and had nothing to do with the "spatialize me" button on a
| specific car stereo.
|
| There's no simple way to add a delay to some speakers unless
| you're working in the digital domain. In analog you have two
| basic choices. With passive components, you build a ladder
| filter, which is as the name suggests, just a long chain of
| low pass or all pass filters. Each "rung" only adds group
| delay on the scale of a couple usec, so these get very big
| and expensive fast. They also suffer from accumulated
| imprecision issues. With active components you can create a
| feedback loop through an op amp. This is how guitar delay
| pedals work, but the more delay you have the more distortion
| you introduce.
|
| Technically there's a 3rd way: extremely long wires, but
| that's basically never practical.
|
| Thankfully these days everything starts out in the digital
| domain, so you just need a controllable fifo before the DAC.
| Entry level home theater receivers have had this since circa
| 2000.
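|
| A minimal sketch of the "controllable fifo before the DAC" idea,
| assuming samples arrive one at a time. The class name DelayLine is
| hypothetical:

```python
from collections import deque


class DelayLine:
    """Fixed-length FIFO that returns each sample delay_samples calls later."""

    def __init__(self, delay_samples):
        if delay_samples < 0:
            raise ValueError("delay_samples must be non-negative")
        # Pre-fill with silence so the first outputs are zeros.
        self._buffer = deque([0.0] * delay_samples)

    def process(self, sample):
        """Push one input sample and pop the oldest one."""
        self._buffer.append(sample)
        return self._buffer.popleft()


line = DelayLine(2)
delayed = [line.process(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

| Changing the delay then only means changing the FIFO length, which
| is why this is cheap once the signal is already digital.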
| monocasa wrote:
| There's also ultrasonic delay lines. That's how they did
| reverb back in the analog only days.
| foobarian wrote:
| The oldschool way is to convert the signal to the physical
| domain. https://anasounds.com/analog-spring-reverb-how-it-
| works/
|
| Edit: re: extremely long wires, this is how some of the
| physical layer testing is done for networking equipment.
| Have rolls of hundreds of miles of fiber sitting on the
| ground to simulate large distances between switches.
|
| Edit2: https://www.m2optics.com/products/fiber-test-
| boxes/multi-spo... :-)
| gugagore wrote:
| Here are a variety of techniques:
| https://en.m.wikipedia.org/wiki/Analog_delay_line
|
| I think a better word than "physical" domain is
| "mechanical" domain. Mechanical waves propagate much more
| slowly than electromagnetic waves.
| canadianfella wrote:
| >Technically there's a 3rd way: extremely long wires, but
| that's basically never practical.
|
| Has this ever been done?
| noir_lord wrote:
| > Technically there's a 3rd way: extremely long wires, but
| that's basically never practical.
|
| Did the math, ~170km (assuming 300,000 km/s and speed of
| light in copper being 90%) - that's a _long_ wire, would
| also have to be superconducting or very high voltage ;).
|
| Whatever room temperature superconductors end up costing
| Audiophiles will be the early adopters ;).
| function_seven wrote:
| Hell, even if the premise of the conversation was wrong, I
| still learned something that day :)
|
| DSPs were a thing in the 80s, right? I guess the question
| is: were they so expensive that it was unlikely to find one
| in an OEM stereo of a mid-priced car? A sibling comment
| mentions that this button may have just been a stereo
| expando kind of thing, rather than localizing the sound
| stage through signal delays. I'm thinking that may be
| right, and that I'm misremembering the feature. It would be
| awesome to find some original owners manuals for the 1985
| Impulse that have any mention of this feature. My DDG-fu is
| failing me right now.
|
| Now, which one of those 13 mechanisms is failing on me when
| the damn cricket keeps "moving" around the room as I try to
| follow the chirp?
| serf wrote:
| Here's a picture of the head-unit, maybe?[0]
|
| or an earlier one? [1]
|
| [0]: https://i0.wp.com/www.curbsideclassic.com/wp-
| content/uploads...
|
| [1]: https://www.thetruthaboutcars.com/wp-
| content/uploads/2014/10...
| jasonwatkinspdx wrote:
| DSPs were around, but there's no way that car stereo was
| digitizing the signal, delaying it, then converting it
| back to analog.
|
| I get it was an impressive experience, but it's
| essentially certain it's what the other poster said: just
| boosting the out of phase content between the channels.
| This was a very in vogue effect at the time. I remember
| listening to the top 40 on the radio one time and Madonna
| had some new song where they were hyping it as surround
| sound and turning it into a whole event. In any case this
| effect can be more effective than you might assume. After
| all, the first consumer version of Dolby Surround was
| just this out of phase content run through a bandpass
| filter and sent to surround speakers.
|
| I knew a family friend with the 90s version of the
| Impulse. Neat quirky car from what I remember. As a kid I
| definitely thought it was very cool.
| gugagore wrote:
| Do you mean that they simply amplify the difference
| between the two signals?
| jasonwatkinspdx wrote:
| Yup, that's the basic idea. It's a very simple circuit,
| which was the appeal before the digital everything era we
| live in now.
| wiredfool wrote:
| Sounds like QSound. There was a Roger Waters album
| mastered with it (Amused to Death) that has a dog barking
| way outside the normal sound stage.
| wgj wrote:
| At that time, the most common way to do stereo enhancement
| was to decrease the L + R component of the stereo signal. [0]
| L+R/L-R was (and still is) a common way to encode stereo
| signals, including for FM radio. [1]
|
| The impact on the stereo field by just changing the mix of
| these two components is profound. No signal delay needed.
|
| For signals that are simple L and R, sum them to get L+R and
| difference to get L-R. So you can use this technique on any
| stereo source.
|
| [0] https://www.sweetwater.com/insync/stereo-enhancement-
| work-mo... [1]
| https://en.wikipedia.org/wiki/FM_broadcasting#Stereo_FM
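|
| A sketch of the sum/difference technique described above. The
| division by two is one common scaling convention chosen here so
| the round trip is exact, and all function names are hypothetical:

```python
def to_mid_side(left, right):
    """Encode L/R as mid (L+R)/2 and side (L-R)/2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Decode mid/side back to L = M + S and R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def enhance_stereo(left, right, mid_gain):
    """Scale the L+R component by mid_gain (values below 1 widen the image)."""
    mid, side = to_mid_side(left, right)
    mid = [mid_gain * m for m in mid]
    return from_mid_side(mid, side)
```

| With mid_gain set to 0.5, content identical in both channels
| drops to half level while fully out-of-phase content passes
| unchanged, which is the widening effect with no delay involved.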
| function_seven wrote:
| I may be misremembering--I was a kid at the time--but I
| distinctly remember the "wow" effect only working if you
| were sitting in the driver's seat. To me that suggests that
| the signals were delayed to the nearer speakers to center
| the sound stage around the driver rather than some
| arbitrary point above the center console.
|
| But looking up photos online, I see the button I was
| talking about labeled as "Ambience", which kind of suggests
| the method you're describing.
|
| It also turns out that using the Internet to find technical
| info about a 40-year-old car that was never popular to
| begin with is very hard!
| jareklupinski wrote:
| a reaaaaaaaalllllllyyy long wire :P
|
| pro audio equipment sometimes used 'bucket brigade' chips to
| implement a delay line (shame that you can't get them
| anymore)
| wgj wrote:
| Good news. Bucket brigade (BBD) chips are still produced
| and available from various sources. [0][1] They are used
| today in a lot of guitar/synth gear.
|
| [0] https://www.electrosmash.com/mn3007-bucket-brigade-
| devices [1] https://www.coolaudio.com/features-
| page.php?product=V3205SD
|
| The way they work by design, clock noise needs to be
| filtered out of the final signal, so relatively heavy low
| pass filtering is standard. The result isn't very hi-fi.
| jareklupinski wrote:
| whoa ty! just unblocked a 5-year old project :)
| Tempest1981 wrote:
| Also called https://en.wikipedia.org/wiki/Analog_delay_line
| bellyfullofbac wrote:
| And they even have this digitally (fiber optic). Relevant
| Tom Scott: https://www.youtube.com/watch?v=d8BcCLLX4N4
| JKCalhoun wrote:
| I built a pair of full-range speakers and was therefore
| introduced to that effect -- an effect I had only experienced
| before with headphones.
|
| No special modern receiver in my case, just simple speakers. It
| seems the 2-way, 3-way speakers I grew up with kill stereo
| imaging. (Never mind the crossovers eat power and diminish the
| efficiency of the speaker -- requiring a higher current amp,
| etc... Lovely what a small 1/2 Watt tube amp and a pair of
| full-range drivers can sound like ... and throw in a sub.)
| bob1029 wrote:
| > it seems the 2-way, 3-way speakers I grew up with kill
| stereo imaging.
|
| Yeah there are ways to build crossover networks that can
| minimize these issues (phase shift) across the frequency
| range. The most ideal crossover would dissipate 100% of the
| undesired acoustic power as heat rather than storing it as
| reactive energy in inductors, but the frequency domain can be a
| tricky beast to dance with.
|
| The best overall approach is probably the 4th order Linkwitz-
| Riley filter:
|
| https://en.wikipedia.org/wiki/Linkwitz%E2%80%93Riley_filter#.
| ..
| Saris wrote:
| This is why both sets of speaker cables to my amplifier are the
| same length.
|
| It does make me curious about the recording process and effects
| if one microphone has, let's say, 50 feet of cable and another on
| a different musician has 100 feet of cable.
| mohaba wrote:
| It's easy to do the math and show this is nonsense.
|
| If a foot is a nanosecond for c, then 50ft difference is 50ns
| or about 10000 times smaller than the smallest difference.
| Even if the speed in a cable was 1%c, it's still under the
| proposed threshold.
|
| Even so, try convincing my dad that it does not matter.
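|
| The arithmetic above can be sketched as follows, assuming the
| signal travels at a fixed fraction of the speed of light in
| vacuum. The velocity_factor parameter and function name are
| hypothetical:

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
METERS_PER_FOOT = 0.3048
THRESHOLD_S = 0.5e-3  # the half-millisecond figure from the headline


def cable_delay_s(length_ft, velocity_factor=1.0):
    """Propagation delay in seconds over length_ft of cable."""
    return length_ft * METERS_PER_FOOT / (SPEED_OF_LIGHT_M_S * velocity_factor)


extra_delay = cable_delay_s(50)       # 50 ft difference at the speed of light
ratio = THRESHOLD_S / extra_delay     # how far below the threshold it sits
slow_delay = cable_delay_s(50, 0.01)  # the 1% of c worst case
```

| Under these assumptions the 50 ft difference adds about 51 ns,
| close to 10000 times below the threshold, and even at 1% of c the
| delay is about 5 microseconds.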
| canadianfella wrote:
| >This is why both sets of speaker cables to my amplifier are
| the same length.
|
| Why?
| cesaref wrote:
| It's not clear from the article, but I think what they are doing
| is delaying part of the frequency range of a signal. All this
| talk of negative delays is really confusing, it's just that they
| are delaying frequencies outside the range they are considering
| by a larger amount than within their range of interest. The
| overall effect is the same, there is going to have to be latency
| through the system.
|
| So, I presume, their listeners are hearing a timbral shift caused
| by the harmonics being advanced/retarded. This is not the same
| effect as feeding a delayed signal into one ear and a non-delayed
| into the other (the relative phase being used for locating a
| sound).
|
| They are I guess exploring how accurately they need to reproduce
| an impulse response to make an accurate transducer, and from the
| article, I believe they have concluded that they need to be more
| accurate than they previously thought. Given Genelec make a range
| of speakers with DSP for this sort of thing, it's I guess partly
| a marketing campaign to convince people that there is some
| benefit from their DSP corrected monitors.
|
| Of course, for a producer to have an accurate monitor is useful,
| but if the listening public have non-aligned drivers, the sonic
| benefits from worrying about this stuff are of somewhat limited
| value.
| Ballas wrote:
| Yes, the way they describe it with "time traveling" is
| especially confusing. I still don't understand exactly what
| their claim is or what they intend on selling.
| xen2xen1 wrote:
| Takes me a minute to remember monitors and speakers can be the
| same exact thing, and not a monitor with speakers built in
| (like a TV).
| dr_dshiv wrote:
| Right, it seems people can detect the .5ms phase difference in
| the timbre. It is a noticeable difference, but they aren't
| detecting the delay, per se.
| nitrogen wrote:
| The claim of novelty for time-travel filtering seems odd, so I
| wonder if this was a mistake in transcribing the quotation for
| the article. Any FIR filter can effectively time travel if you
| consider the peak to be time zero. Negative delay of one part
| of the sound is positive delay of the rest. It's not new to
| this article, but this is something that is new with DSP that
| is not so easy without DSP.
|
| The DRC room correction software could achieve excellent
| results with frequency-dependent temporal correction back in
| the early/mid 2000s: http://drc-fir.sourceforge.net/
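|
| A small sketch of the "peak as time zero" point, assuming a plain
| direct-form FIR filter. The tap values are made up for
| illustration, and the function names are hypothetical:

```python
def fir_filter(signal, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k]."""
    out = []
    for n in range(len(signal) + len(taps) - 1):
        acc = 0.0
        for k, tap in enumerate(taps):
            if 0 <= n - k < len(signal):
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def peak_index(samples):
    """Index of the sample with the largest magnitude."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))


taps = [0.1, 0.2, 1.0, 0.2, 0.1]       # illustrative values only
response = fir_filter([1.0], taps)      # impulse response equals the taps
latency_samples = peak_index(response)  # the peak arrives 2 samples late
```

| Measured from the peak, the first two output samples sit at times
| -2 and -1, so part of the response appears to arrive early. The
| cost is that the whole signal is delayed by latency_samples.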
| TrackerFF wrote:
| I believe that the BBE Sonic Maximizer
| (http://www.bbesound.com/products/sonic-
| maximizers/default.as...) products fix that. They (time) align
| the components in the frequency spectrum, for lack of a better
| description. edit: Or rather, it works as a dynamic EQ to
| dampen/amplify certain frequencies.
|
| Most guitarists that have a rack-setup, probably have or have
| tried these - in practice, the effect is just a more smoothed
| sound.
| erdewit wrote:
| > The sounds easiest to identify were a castanet, a percussion
| instrument, and short clicks.
|
| Not coincidentally, this is exactly the type of music where lossy
| encodings such as MP3, Ogg or AAC fail.
| tuatoru wrote:
| > castanets and short clicks
|
| I was doing just this experiment three days ago. I have a new
| audio interface, and used Audacity to try to establish its round-
| trip delay. I created a rhythm track using a click sound,
| connected one channel's playback to an input, and recorded it.
| The latency was about 35.1 ms.
|
| Audacity (the 2.x series) allows adjustment of latency
| compensation in whole milliseconds.
|
| Edit: I set the latency adjustment for -35 ms so there was only
| the 0.1-ish residual latency.
|
| Playing one channel of the original track and the recording
| together (after adjusting the channel balance for equal loudness)
| was quite disconcerting.
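|
| One way to estimate such a round-trip delay from the two tracks is
| a brute-force cross-correlation. This is a sketch under the
| assumption that both tracks are lists of samples at the same rate;
| it is not how Audacity does it, and the function names are
| hypothetical:

```python
def estimate_lag(reference, recorded, max_lag):
    """Return the lag in samples (0..max_lag) with the highest correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(
            reference[i] * recorded[i + lag]
            for i in range(len(reference))
            if i + lag < len(recorded)
        )
        if score > best_score:  # keep the earliest lag on ties
            best_lag, best_score = lag, score
    return best_lag


def samples_to_ms(samples, sample_rate_hz):
    """Convert a sample count to milliseconds."""
    return 1000.0 * samples / sample_rate_hz


click = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
recorded = [0.0] * 7 + click  # the same click, arriving 7 samples late
lag = estimate_lag(click, recorded, 10)
```

| At an assumed 44.1 kHz sample rate, a 35.1 ms latency corresponds
| to about 1548 samples.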
| FelipeCortez wrote:
| Your buffer size is probably too high. Try 128 or 64
| jacquesm wrote:
| That is always going to be a compromise between the chance of
| a buffer underrun and annoying latency. Compared to analog,
| digital audio solves some problems but introduces a whole
| raft of new ones.
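|
| The latency side of that compromise is simple arithmetic. This
| sketch assumes a 48 kHz sample rate, which the thread does not
| state, and counts a single buffer only; the function name is
| hypothetical:

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Time in milliseconds needed to fill one buffer of buffer_frames."""
    return 1000.0 * buffer_frames / sample_rate_hz


latency_128 = buffer_latency_ms(128, 48000)
latency_64 = buffer_latency_ms(64, 48000)
latency_1024 = buffer_latency_ms(1024, 48000)
```

| Under that assumption a 128-frame buffer adds about 2.7 ms, a
| 64-frame buffer about 1.3 ms, and a 1024-frame buffer about 21 ms
| per buffer in the chain.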
| LatteLazy wrote:
| Isn't this how we hear directionally? At 300m/s, 0.5ms of delay
| corresponds to 150mm of travel. Ears are only about 250mm apart, so
| if you couldn't hear a 0.5ms diff you'd have no idea where sound was
| coming from
| right?
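|
| A sketch of that estimate using a simple straight-line model that
| ignores the head itself. The model, the function name, and the
| angle convention (0 degrees straight ahead, 90 degrees directly to
| one side) are assumptions:

```python
import math


def interaural_delay_ms(ear_spacing_m, angle_deg, speed_m_s=343.0):
    """Arrival-time difference between the ears for a distant source."""
    path_difference_m = ear_spacing_m * math.sin(math.radians(angle_deg))
    return 1000.0 * path_difference_m / speed_m_s


# The comment's own figures: 250 mm spacing and 300 m/s.
max_delay_ms = interaural_delay_ms(0.25, 90, speed_m_s=300.0)
```

| With the comment's figures the largest possible difference is
| about 0.83 ms, for a source directly to one side, and it shrinks
| to zero for a source straight ahead.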
| _Microft wrote:
| A timing difference is not the only way to tell from which side
| a sound came. Since the ears are at opposite sides of the head
| and the external ears have particular (but unchanging [0])
| shapes, a sound will be changed in a different but predictable
| way depending on the direction it comes from. This effect also
| helps with detecting whether a sound comes from the front/back
| or top/bottom of the listener.
|
| https://en.wikipedia.org/wiki/Head-related_transfer_function
|
| [0] Change the environment around the ear a bit, e.g. by
| putting a hand relatively close to the external ear and listen
| how doing that changes sounds.
| LatteLazy wrote:
| Thanks! That was a good read!
| vernie wrote:
| There's a saying in real-time audio processing that "the ears
| don't blink"
| nabla9 wrote:
| >This work demonstrates how the group-delay response of
| headphones and loudspeakers can be perceptually tested, and leads
| to a better understanding of how audio systems should be
| equalized to avoid audible group-delay distortion.
| szszrk wrote:
| I find all of this hilarious as what I've learned about audio
| production is that delays are used as equalizers, comb
| filtering effect used in practice. Not just on analogue
| devices, I was told once by (AFAIR?) Drumgizmo devs that
| digital as well!
|
| Btw. Drumgizmo is an amazing opensource drum plugin, that you
| mix like you would mix any real drum.
| nabla9 wrote:
| You don't want your loudspeakers to have their own
| distortions.
|
| You want to transmit the distortions audio production made
| without add-ons.
| tintt wrote:
| Given the speed of sound, 0.5ms delay occurs when you talk to
| anybody more than half a foot away
| phreeza wrote:
| Humans can do way better than that, somewhere in the 20
| microsecond range. See for example Mills (1958).
| https://scholar.google.com/scholar?cluster=11401696136259886...
|
| Edit: in fact, humans are among the best animals at all at this
| task, I think it is not really clear why. They are on par with
| highly specialist auditory hunters like barn owls. May be that
| humans can use their higher cognitive faculties to somehow get
| better at the task than the average animal but it's not clear to
| me how that would happen.
| chiefalchemist wrote:
| I would imagine this might not be an ear mechanics issue, but a
| brain processing issue. Some animals have dedicated "processor"
| and others do not. While in the larger more agile human brain
| it's simply one of many features. Maybe?
| andi999 wrote:
| Since humans rely on groups and communication it is probably to
| hear/locate who in a group is talking.
| zwieback wrote:
| So what does that mean for a large symphony where musicians are
| spread out much farther than sound travels in a millisecond?
| Let's say it's 10m, that would be about 29ms. I've heard really
| good musicians and conductors compensate for that but I'm
| doubtful.
|
| So we must be talking about more subtle effects, like phase
| shifts, and not about how the waveform envelopes line up.
| temporallobe wrote:
| It's the same for organists in large cathedrals where there is
| a huge delay and subsequent reverb, as the pipes are often
| quite far from the player and sound bounces off every
| conceivable hard surface before making its way back to the
| organist. They learn to compensate for this mostly by just
| being very accurate players and trusting that the outcome is
| correct.
| herendin2 wrote:
| It's one reason that a conductor sends visual cues to the whole
| orchestra at the speed of light. That's the base clock source
| oscardssmith wrote:
| one interesting thing is that you will fairly frequently see
| professional orchestras be out of phase with the conductor,
| but everything still works out.
| zwieback wrote:
| Does a really good conductor train each section to come in
| on different parts of the visual cue then, e.g. double bass
| would play before the violins in the front? I guess you
| can't precisely solve this problem for more than a single
| listener.
| toast0 wrote:
| As a (former) bass player, you need to start a bit early,
| yeah. But that's not so much for distance but because it
| takes longer for the sound to start. Pipe organ is of
| course much worse. I would do one concert a year sitting
| next to an organist... The keyboard noises and motion
| from him playing way before me were pretty distracting.
| And when we stopped during rehearsal, he'd stop on the
| keys right away, but the organ would keep going for a
| bit.
| zwieback wrote:
| Yeah, I noticed that when I started playing upright bass,
| compared to the bass guitar the sound starts with a
| noticeable delay.
| InitialLastName wrote:
| If the listener is relatively further away from the
| orchestra than the orchestra members are to each other,
| it mostly all balances out, especially when you add the
| room and bandshell reverberation (which tends to wash the
| note onsets to some degree).
|
| This is also helped by the instruments in the back of the
| orchestra favoring the low frequencies, where the
| listener has less access to precise timing information.
| Tushon wrote:
| Since light travels a lot faster than sound, I imagine the
| visual cues of conductor and/or lead of your section and
| practice of staying on beat matter more than hearing someone
| across the symphony.
| Jordrok wrote:
| To add on to what the other commenters have already said - this
| is especially true for marching bands where you are even
| further spread out and your distance to various sections
| changes over the course of the performance. You have to learn
| to (at least somewhat) disregard what your ears are telling you
| and focus solely on the drum majors for staying in time.
| Listening to the drumline to keep yourself on beat can be a
| very bad idea when they're half a football field away.
|
| This effect is amplified even more for enclosed stadiums where
| there tends to be a large amount of echo, and can really be
| quite disorienting the first time you experience it.
| 1-6 wrote:
| There goes my ambition in syncing musicians remotely. I guess
| introducing a little bit of delay and using GNSS clocks would
| sort of help.
| matchagaucho wrote:
| Hearing predators approaching from behind is the evolutionary
| justification for this trait.
|
| Oculus, and other 3D platforms, utilize HRTF (head related
| transfer functions) with subtle millisecond phase shifts to
| localize objects.
| hoseja wrote:
| Yes, in fact, human ear can detect up to ~1/20000s delays in
| sound...
| irjustin wrote:
| VLC and adding/removing the sound delay. Not half-millisecond
| timing, but man it got super annoying at +/-5ms.
| globular-toast wrote:
| I initially thought about synchronisation with video too (aka
| lip sync). However, I don't think they are talking about video
| but merely whether a difference is detectable (rather than
| acceptable). I suspect the threshold for lip sync acceptability
| is a lot higher than what they measured here. I would have
| thought the threshold was higher than 5ms, but I haven't done
| any rigorous testing.
| drmpeg wrote:
| See Figure 2 on page 4.
|
| https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-
| BT.1359-1-...
| globular-toast wrote:
| Thanks, that's really interesting, especially how video lag
| is more detectable than audio lag. I recently set up a
| display with a 130ms lag and this explains why it was so
| bad before I corrected the audio!
| gsich wrote:
| Delay compared to what? Video? I doubt normal stereo audio
| files are out of sync.
| jacquesm wrote:
| The article is quite fluffy, but from what I gather it is
| about the difference in arrival time when a sound is
| perceived through multiple filters for different frequency
| bands, each of which has a programmable delay.
|
| One possible application for something like this would be to
| design a loudspeaker where the time-of-flight of the
| soundwave travelling from a tweeter to the listener would be
| the same as the same sound sent from a midrange speaker. Most
| designs do not adjust for such differences.
|
| The more interesting version - to me, at least - would be
| where you compare between the two arrivals at your left and
| right ear, which can do very small fractions of that even
| with relatively high pitched sounds allowing for sound source
| location based on phase.
| gsich wrote:
| I think most speaker setups have a microphone that you
| place where the listener is to calibrate. Don't know if
| that involves delay measurements.
| jacquesm wrote:
| There was the Philips concept called MFB which integrated
| an accelerometer into the low frequency speakers in order
| to match the desired waveform with the one that actually
| went out. Quite a neat little concept, they are still
| somewhat popular in the HiFi scene.
|
| The microphone systems mostly calibrate for frequency
| loss across the spectrum, time-of-flight adjustments I
| haven't seen but it's quite possible they're out there
| somewhere, but they would require a pretty nifty delay
| mechanism or a digitization step and then turning the
| signal yet again back into analog which would probably
| cause more problems than it solved.
| tuatoru wrote:
| > One possible application for something like this would be
| to design a loudspeaker where the time-of-flight of the
| soundwave travelling from a tweeter to the listener would
| be the same as the same sound sent from a midrange speaker.
| Most designs do not adjust for such differences.
|
| Most designs, true. Very high-end speakers do sometimes
| incorporate all-pass filters in parts of their crossovers
| to compensate for the set-back of the apparent sound source
| in woofers. But very few people tilt their speakers so that
| the distance from the midrange and woofer to their ears is
| exactly the same as that from the tweeter at the listening
| position, and getting complicated crossovers to be reliable
| at 100 watt power levels is very expensive.
|
| Time-of-flight compensation is trivial in the digital
| domain, so some active crossovers (which operate on line
| level signals, before the power amplifiers) also do this.
| The downside, of course, is needing one power amplifier per
| speaker cone, not just one per box, and the extra wires if
| using separate amplifier blocks rather than active
| speakers.
|
| The consensus seems to be that it's not worth the trouble
| in ordinary listening situations.
| jacquesm wrote:
| The one-amp-per-device is actually quite common in
| monitor speakers.
| webdevtway87 wrote:
| Meanwhile, Android Chrome deliberately introduces a ~300
| millisecond audio delay to make web games impossible and force
| game devs to use the Play Store so that Google gets its pound of
| flesh. The superpowered[1] database has many measurements showing
| how bad Android is.
|
| Those of you with Android devices can try this ancient web audio
| demo to see that the issue remains unfixed all these years later,
| except on certain select premium devices (certain Samsungs,
| etc.).
|
| https://webaudiodemos.appspot.com/TouchPad/index.html
|
| You should hear sound the instant a rectangle is touched/clicked
| and you will with a desktop browser but not with most Androids.
|
| Apple also employs dark patterns on iOS Safari to make web games
| impossible:
|
| 1. No full screen. ("But the user wouldn't know how to exit full-
| screen - unless they buy from the App Store! We also plan to make
| full screen PWAs impossible because of, like, Reasons and totally
| not because of our 30% App Store cut!")
|
| 2. Slow WebGL and no WebGL 2.0. ("But we need to run extra shader
| validation security checks every single frame even though we
| technically only need to do it once up front! But this extra
| Privacy and Security[TM] is totally unnecessary if you buy from
| the App Store!")
|
| 3. No device motion tilt control. ("Allowing the user to consent
| to tilt control would violate the user's Privacy and Security[TM]
| - unless they buy from the App Store!")
|
| 4. No JS debug console unless you register an Apple developer
| account, download 10+ gig Xcode on a Mac-only computer, and ask
| Apple for authorization to debug javascript on your own freakin
| iPhone you paid for with your own damn money.
|
| I was forced to give up using "open" web technologies for games
| and instead go with native code game engines (Godot, Unreal,
| Unity, ...).
|
| [1] https://superpowered.com/latency
| JoeAltmaier wrote:
| Oh I knew this. Doing the Sococo media mixer, we understood that
| audio sensitivity is orders of magnitude more critical than
| video. Heck, video could stall and delay and be fuzzy and folks
| didn't say much. But miss a single sample of audio at 44 kHz and
| folks would speak up.
| alex_smart wrote:
| This is news? :o
| avnigo wrote:
| I guess that makes sense considering we use the delay of sounds
| reaching each ear separately to determine the direction sound
| comes from.
|
| Assuming the speed of sound is around 340 m/s, in half a
| millisecond it travels about 17 cm, which, I would guess, is
| larger than the average distance between our ears.
|
| So, using that rough estimation, I would cautiously extrapolate
| that we probably detect a delay under 0.5 ms, but I'd be
| interested to see what "detect" means exactly.
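|
| The back-of-the-envelope numbers above can be checked with a few
| lines. This is only a sketch: the 340 m/s figure is the rough
| one used above, the 15 cm ear spacing is an assumed round
| number, and the straight-line path ignores the sound having to
| bend around the head.

```python
SPEED_OF_SOUND_M_S = 340.0  # rough figure used in the estimate above


def distance_travelled_m(delay_s, speed=SPEED_OF_SOUND_M_S):
    """Distance sound covers in delay_s seconds."""
    return speed * delay_s


def max_interaural_delay_s(ear_spacing_m, speed=SPEED_OF_SOUND_M_S):
    """Largest arrival-time gap for a straight-line path between the ears."""
    return ear_spacing_m / speed


# Half a millisecond of travel, in metres (about 0.17).
print(distance_travelled_m(0.0005))

# Assumed 15 cm ear spacing: the largest delay, in seconds (under 0.0005).
print(max_interaural_delay_s(0.15))
```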
| [deleted]
| jacquesm wrote:
| We do much better than that, we use a combination of phase
| (arrival time of the sound-front if you wish at low
| frequencies) and amplitude to determine direction.
|
| Good article on this:
|
| http://alumni.media.mit.edu/~araz/sss/Sound_Localization.htm...
| sudosysgen wrote:
| Don't forget the transformation of the signal by the head,
| chest and outer ear! It's quite amazing.
| jcims wrote:
| Yeah I'm thinking the relative phasing of the group delay is
| essentially having a comb filter effect on the audio, and the
| difference is perceived qualitatively rather than anything
| approximating a time delay.
|
| For anyone not familiar, here's an example of comb filtering,
| where reflections interfere and you get wobbles in your
| frequency response (shown in an fft mid video).
|
| https://www.youtube.com/watch?v=Amj4UevyRfU
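|
| As a rough illustration of why a short delay can show up as a
| change in timbre, here is a sketch of the comb response you get
| when a signal is summed with a delayed copy of itself. The
| equal-level mix and the 0.5 ms delay are assumptions chosen for
| the example, not the setup used in the study.

```python
import cmath
import math


def comb_gain(freq_hz, delay_s, mix=1.0):
    """Magnitude response of y(t) = x(t) + mix * x(t - delay_s) at freq_hz."""
    return abs(1 + mix * cmath.exp(-2j * math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s, max_hz):
    """Frequencies up to max_hz where the delayed copy is exactly out of phase."""
    notches = []
    m = 0
    while (2 * m + 1) / (2 * delay_s) <= max_hz:
        notches.append((2 * m + 1) / (2 * delay_s))
        m += 1
    return notches


# With a 0.5 ms delay the notches fall at odd multiples of 1 kHz.
print(notch_frequencies(0.0005, 6000))
print(comb_gain(1000, 0.0005), comb_gain(2000, 0.0005))
```

| The gain swings between roughly 0 at the notches and 2 at the
| peaks in between, which is the "wobble" in the frequency
| response.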
| jasonwatkinspdx wrote:
| "Detect" for experiments like this is almost certainly an ABX
| test. That is you get to listen to samples A, B, and X
| repeatedly and are asked to decide whether X is A or B.
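|
| If it is an ABX test, one common way to score a run is to ask
| how likely the result would be from pure guessing. A minimal
| sketch, assuming independent trials with a 50% chance each; the
| function name is hypothetical and the article does not say how
| the researchers actually scored their listeners.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    favourable = sum(comb(trials, i) for i in range(correct, trials + 1))
    return favourable / 2 ** trials


# Example: 12 correct answers out of 16 trials.
print(abx_p_value(12, 16))
```

| A small value suggests the listener really can tell A from B; a
| value near 0.5 or above is what guessing looks like.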
| amelius wrote:
| > in half a millisecond it travels about 17 cm, which, I would
| guess, is larger than the average distance between our ears
|
| What is the speed and variability of neural signals traveling
| through the brain?
| robwwilliams wrote:
| The headline is odd. There are populations of neurons in the
| auditory system--the medial superior olive--that receive input
| from both ears. Even in humans these MSO neurons are
| exquisitely sensitive to binaural differences. This is how the
| owl catches the mouse at night. In some species delays of 10-20
| microseconds can be detected and encoded by MSO neurons even
| tracking up to frequencies well above 40 kHz. This is amazing
| when you realize that neurons cannot fire at a rate above 1
| kHz. Phase locking and ensemble encoding is used.
|
| For example, adult mice have small heads, and ear separation is
| merely 5-7 mm; yet this is sufficient to locate the position of
| an ultrasonic squeak generated by a mouse pup at 40 kHz. This
| is a computational feat that requires extreme temporal
| precision in binaural auditory processing across comparatively
| noisy wetware (transduction noise, phase locking error,
| synaptic release noise, conduction velocity smear, dendritic
| integration in MSO).
|
| And doubly impressive given that the brain has no "given" time
| base or oscillator to define a compute cycle. We must all build
| and refine our own internal set of pseudo-clocks for sensory
| and motor systems, in order to define the cumulative temporal
| context in which we are embedded.
|
| This is crucial for the mouse to quickly avoid the talons of
| the owl.
|
| More on timing in brain: @robwilliamsiii (see pinned tweet).
| randlet wrote:
| Thanks for this super interesting comment. The complexity of
| life never ceases to boggle the mind.
| lebuffon wrote:
| It's more involved than just left/right delay. We localize
| sound in the vertical direction and front and back as well.
|
| This is accomplished by the delays between sound travelling
| directly into the ear and reflections off the outer ear folds.
|
| These reflections create "comb" filters in the audio spectrum
| which we learn to associate with direction. It's remarkable.
|
| A test to prove this was so was to fill the outer ear with
| plasticine and perform localization tests on subjects. They
| could not localize sound in that condition.
|
| The early work as I recall was at the Heinrich Hertz Institute
| in the mid 1970s.
|
| I am suspicious that part of what this article is reporting is
| due to phase cancellation effects causing similar filtration
| that people can hear as timbre change rather than actually
| detecting the time delay.
|
| (Source: My recording engineering final paper)
| EForEndeavour wrote:
| Now I really want some plasticine / modeling clay to attach
| to parts of my pinnae and experiment with my ability to
| localize sounds.
|
| Have video games already emulated the spectral effects of
| sound direction for players using headphones, or even speaker
| systems with known spatial distribution? I can imagine
| modulating the sound coming from an in-game object to match
| its perceived source direction to its location relative to
| the player.
| kaoD wrote:
| I think this is called HRTF. The bad news is there are many
| of them and only a few work for each person.
|
| This often results in being unable to tell front/back
| apart, frontal sounds perceived as coming from above, etc.
| chiefgeek wrote:
| I recently purchased a system from Dr Jeffery Thompson for
| sound healing. It creates custom binaural beats based on
| the body's stress response (as monitored by heart rate
| variability). The system includes a pair of sophisticated
| 3D microphones that go over the ear with the pickups being
| located as close to the ear canal as possible. This allows
| them to record the sound as close as possible to what the
| ear hears. We use them to record the client singing their
| tone. It then gets disguised and wrapped back into the
| binaural beat so you are essentially singing to yourself.
| I'm eager to go out and record some natural sounds in 3D
| for use as background.
| jmole wrote:
| Link? I'm trying to picture the apparatus you're
| describing but it's not making sense. Are there
| headphones involved or is it just a wearable microphone?
| sudosysgen wrote:
| Some of them do, yes. For example CS:GO has an HRTF model
| that does this. It has variability though because each
| one's ear is slightly different, so these models are
| general and may work better or worse for different people.
| noir_lord wrote:
| Some games have it. With good headphones, and when it's
| implemented well, it's great; otherwise I couldn't tell if it
| was on or off.
|
| There's even a "brand" for it: THX Spatial Audio.
| kd5bjo wrote:
| The problem is that each person's pinnae have a unique
| geometry, which makes the notches of the comb filter lie at
| different frequencies for everyone. As far as I know,
| there's no good way to determine this other than a direct
| measurement, which requires specialist equipment.
| PaulKeeble wrote:
| Creative (Sound Blaster) now have a phone app that lets you
| take a picture of your ear and uses a machine-learned
| model to produce a somewhat better head-related transfer
| function based on it. It is an improvement on SBX and
| CMSS before it, but it's not perfect either.
| nitrogen wrote:
| Video games, VR, and other headphone systems do use
| generalized HRTFs for directional audio.
|
| _Edit:_ I'm currently working on a series of videos to
| explain sound direction and perception. Some of the code
| I wrote for my demos is or will be on GitHub.
| [deleted]
| smcl wrote:
| That's a nice estimation, but I would have imagined direction
| is down to whether a sound is more muffled or quiet in one ear
| than the other.
| whiddershins wrote:
| No it is difference in phase.
| jacquesm wrote:
| Only for LF.
| vitus wrote:
| To piggyback on the sibling comments:
|
| If you only determined direction based on relative volume
| between your ears, you wouldn't be able to distinguish
| between sounds in front vs behind you.
| ryandvm wrote:
| You also wouldn't be able to do the cocktail party trick.
| Your brain's ability to ignore everything that isn't of the
| desired ITD phase shift is what allows you to selectively
| ignore most of the noise in the room and focus on a single
| conversation.
| smcl wrote:
| Yep that's why I added "muffled" (i.e. not 100% clear).
| Maybe it's worth clarifying that I wasn't trying to correct
| the commenter, just adding my original (naive? wrong?)
| belief about how this worked :)
| Ensorceled wrote:
| Since you sound sincere here ... I read your comment as
| disagreeing with and correcting the original comment.
| smcl wrote:
| Oops, it wasn't intended! I think maybe the "but" in my
| comment makes it sound that way?
| Cederfjard wrote:
| Yeah, to me it reads "that's a nice theory you have, but
| now I'll tell you mine, which I think is the correct
| one".
| smcl wrote:
| I just liked how they related the speed of sound to the
| distance between ears to explain the (possibly less than)
| ~0.5ms thing, and how it was nicer than the naive
| explanation I had previously assumed
| mattkrause wrote:
| It's a combination of both.
|
| The relevant keywords are "Interaural Time Difference" (ITD;
| this phenomenon) and "Interaural Intensity (or Level)
| Difference" (IID/ILD; i.e., volume).
|
| In fact, there are a few other mechanisms too. The shape of
| the pinna (external ear) does some filtering that allows you
| to distinguish sounds that produce identical ITD/IID.
|
| The neuroscience of this is really fascinating, and the
| circuits have been worked out pretty well.
| Griffinsauce wrote:
| That doesn't need to be the case when there are reflections.
| smcl wrote:
| I'm not sure I follow
| jacquesm wrote:
| A reflection can easily be louder than the original
| sound, but the longer time-of-flight allows the ear to
| distinguish between the two. That's how even in a hall
| with echoing walls you can still pinpoint a sound source
| with relatively high accuracy.
| datameta wrote:
| The first chapter of The Sound Book by Trevor Cox goes
| into great detail about this (but without getting deep
| into the math of it). Well worth the read for those
| interested in the acoustics of architecture and our
| perception of how it modifies the soundscape.
| jacquesm wrote:
| Another interesting one is 'the acoustical foundations of
| music'.
|
| ISBN 0393090965
| frankus wrote:
| I remember back in the late 90s I had a BeBox and there was
| this cool nodes-and-connections audio program where you could
| add various effects. My friend mentioned this delay effect and
| so we rigged up a graph where one stereo channel was delayed by
| some number of milliseconds. The effect was pretty uncanny.
| Wearing headphones, the sound seemed to be coming exclusively
| from the non-delayed side, until you removed that headphone and
| the delayed side was clearly still playing.
| out_of_protocol wrote:
| That's the built-in echo-cancelling module, allowing you to
| easily get the correct direction to the true source of sound
| nitrogen wrote:
| AKA the Haas effect or precedence effect, making the first
| sound to arrive the most important for localization, even
| if echoes are louder.
| jacquesm wrote:
| It can do a whole lot better than that, it can detect phase
| changes at a very small fraction of a millisecond. It has to
| because that's how we determine direction of a sound source at
| low frequencies.
| quickthrower2 wrote:
| That it can do that with 2 ears and the latency between them
| signalling through the brain is amazing
| da_chicken wrote:
| Well, it's not _just_ latency in many cases. If there's
| someone typing at the desk to your left, that noise has
| latency but it has different loudness in your left and right
| ears. The sound reaching your right ear has to make its way
| around your head, so it sounds different, not merely more
| delayed. Your brain has learned what all of that means.
| phreeza wrote:
| In addition to loudness and latency differences, there is
| even a frequency-specific effect due to the shape of the
| head and in particular your ear (your pinna to be precise).
| This is actually probably the reason why our ears have the
| weird shape they do. This is different for every person,
| which means that headphone-based simulated spatial audio
| can never be perfect.
| tiagod wrote:
| You can even hear the interference between two sine waves,
| without them actually "mixing" in the air, by feeding one to
| each ear. This doesn't work for every frequency, but when it
| does it sounds the same as it would if they were both played
| to both ears, but it's happening in your brain!
|
| https://en.wikipedia.org/wiki/Beat_(acoustics)#Binaural_beat...
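|
| A small sketch of the setup described above: one pure sine per
| ear, kept in separate channels so they never mix in the air.
| The 440 Hz and 444 Hz tones and the 44100 Hz sample rate are
| arbitrary example values, and the function names are
| hypothetical.

```python
import math


def binaural_pair(f_left_hz, f_right_hz, seconds, sample_rate_hz=44100):
    """Return (left, right) sample lists, one pure sine per ear, never summed."""
    n = round(seconds * sample_rate_hz)
    left = [math.sin(2 * math.pi * f_left_hz * i / sample_rate_hz)
            for i in range(n)]
    right = [math.sin(2 * math.pi * f_right_hz * i / sample_rate_hz)
             for i in range(n)]
    return left, right


def beat_frequency_hz(f1_hz, f2_hz):
    """Beat rate heard when two close tones interfere: their difference."""
    return abs(f1_hz - f2_hz)


left, right = binaural_pair(440.0, 444.0, seconds=0.5)
print(len(left), len(right), beat_frequency_hz(440.0, 444.0))
```

| Writing the two lists to the left and right channels of a
| stereo file and listening on headphones is one way to try it.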
| jacquesm wrote:
| I've done this and it makes you tired extremely quickly for
| some reason. It's worse than tuning up a piano from
| scratch.
| junon wrote:
| Was just thinking about this the other day. In music production
| you can determine this; you have to be able to hear just a
| millisecond of difference sometimes to align things well,
| especially when you're working with instruments that have
| considerable delay (e.g. almost all Kontakt instruments).
| jacquesm wrote:
| Essentially anything that is MIDI based or has an audio buffer
| somewhere.
| junon wrote:
| Yes, but what I meant is that even though you place a MIDI
| note directly on a beat, many instruments still have a pretty
| considerable delay, on the order of 10-100ms.
| jacquesm wrote:
| Not only that, those delays themselves tend to change
| depending on how 'busy' the instrument is, for instance,
| whether you are using layered patches, multiple notes at
| once, different instruments at once and so on. It can
| change from one note to the next, sometimes without any
| clear hint as to why that is the case.
| mauvehaus wrote:
| Highly recommended: Seeing The Visitors at ICA Boston. It's a
| dozen-ish musicians playing in separate rooms of a big house
| where they can't see each other and can only hear each other over
| headphones. It's presented as nine channels of video, each with
| their own audio.
|
| The effect is pretty wild and magical. The music, I suspect, is
| intentionally written to not require precise timing, and part of
| the charm of the piece is the musicians feeling each other out as
| they play. It definitely plays with your expectations of what
| constitutes music and how you hear timing in music.
|
| The ICA owns the piece (a copy of the piece?), but it isn't
| currently on display :-(
|
| https://www.icaboston.org/exhibitions/ragnar-kjartansson-vis...
| bazeblackwood wrote:
| I saw it at the Broad in LA, and it was mind-blowing.
___________________________________________________________________
(page generated 2021-08-05 23:03 UTC)