[HN Gopher] Using ASCII waveforms to test real-time audio code
___________________________________________________________________
Using ASCII waveforms to test real-time audio code
Author : jwosty
Score : 70 points
Date : 2021-10-13 18:30 UTC (4 hours ago)
(HTM) web link (goq2q.net)
(TXT) w3m dump (goq2q.net)
| spicybright wrote:
| Why would you use ascii for something like a waveform, something
| that's inherently a graph?
|
| Sure, maybe you don't need that much resolution for what the use
| case is. But it's the equivalent of looking at a graph and
| squinting your eyes to blur it.
| jwosty wrote:
| In short, because text is much easier to deal with than
| bitmaps, and there is much more tooling that "just works" for
| text than actual graphics, like Expecto's textual diffing in
| assertions. @MayeulC said it well:
| https://news.ycombinator.com/item?id=28856884
| [deleted]
| bitwize wrote:
| That's so cool, and reminds me of how I used Gnuplot as a
| makeshift oscilloscope to test and evaluate some (not real time)
| software synthesis I was doing.
| munchler wrote:
| This is great. People are doing very cool things with F# these
| days.
| jwosty wrote:
| Thanks, I like to think so! I didn't see other people doing
| much audio programming in F#, so I figured someone would be
| interested in seeing what it can look like.
| brianberns wrote:
| FWIW, you might like this:
| https://github.com/brianberns/FYampaSynth
| jwosty wrote:
| That looks right up my alley, thanks for the link!
| phab wrote:
| This approach is neat for observability, but it's worth noticing
| that it essentially quantises all of your samples down to the
| vertical resolution of your graph. If you somehow introduced a
| bug that caused an error that was smaller than the step size then
| these tests wouldn't catch it.
|
| (e.g. if you somehow managed to introduce a constant DC-offset of
| +0.05, with the shown step size of 0.2, these tests would
| probably never pick it up, modulo rounding.)
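|
| A minimal sketch of that effect, assuming a renderer that simply
| rounds each sample to the nearest graph row. The function name
| and the rounding rule are illustrative assumptions, not the
| article's code:

```python
def to_rows(samples, step=0.2):
    """Map each sample to the index of the nearest graph row (assumed rounding rule)."""
    return [round(s / step) for s in samples]

clean = [0.0, 0.2, 0.4, -0.2]
offset = [s + 0.05 for s in clean]      # constant DC offset below the step size

# Samples that sit mid-row land on the same rows, so a text diff sees nothing.
same_rows = to_rows(clean) == to_rows(offset)

# "Modulo rounding": a sample near a row boundary can still flip rows.
flips = to_rows([0.08]) != to_rows([0.08 + 0.05])
```

| Here same_rows is true, so the offset is invisible to a textual
| diff, while flips shows the exception for samples that happen to
| sit near a row boundary.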
|
| That said, these tests are great for asserting that specific
| functionality does broadly what it says on the tin, and making it
| easy to understand why not if they fail. We'll likely start using
| this technique at Fourier Audio (shameless plug) as a more
| observable functionality smoke test to augment finer-grained
| analytic tests that assert properties of the output waveform
| samples directly.
| PaulDavisThe1st wrote:
| A more accurate and only slightly more complex process for this
| is to generate numerical text representations of the desired
| test waveforms and then feed them through sox to get actual
| wave files. The numerical text representations are likely even
| easier to generate programmatically than the ascii->audio
| transformation.
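|
| A rough sketch of the numerical-text route, assuming sox's text
| data (.dat) layout of two ";" header lines followed by one
| "time value" pair per line for mono audio. That layout is stated
| from memory of the sox format documentation and is worth checking
| there; the function name and file names are hypothetical:

```python
import math

def to_sox_dat(samples, sample_rate=44100):
    """Serialize mono samples in the text layout sox reads as .dat (assumed layout)."""
    lines = [f"; Sample Rate {sample_rate}", "; Channels 1"]
    for i, s in enumerate(samples):
        lines.append(f"{i / sample_rate:.8f} {s:.8f}")
    return "\n".join(lines) + "\n"

# One cycle of a 441 Hz sine at 44.1 kHz is exactly 100 samples.
sine = [math.sin(2 * math.pi * 441 * n / 44100) for n in range(100)]
dat_text = to_sox_dat(sine)

# To get a wave file, write dat_text to e.g. sine.dat and run:
#   sox sine.dat sine.wav
```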
| jwosty wrote:
| It's true that it quantizes (aka bins) the samples, so it
| isn't right for tests that need to be 100% sample-perfect, at
| least vertically speaking. I suppose it is a compromise between
| a few tradeoffs - easy readability just from looking at the
| code itself (you could do images, but then there's a separate
| file you have to keep track of, or you're looking at binary
| data as a float[]) vs strict correctness. The evaluation of
| these tradeoffs would definitely depend on what you're doing,
| and in my case, most of the potential bugs are going to relate
| to horizontal time resolution, not vertical sample depth
| resolution.
|
| If the precise values of these floats are important in your
| domain (which it very well may be), a combination of approaches
| would probably be good! Would love to hear how well this
| approach works for you guys. Keep me updated :)
| phab wrote:
| I'm not sure it makes sense to separate "vertical"
| correctness from "horizontal" correctness when it comes to
| "did the feature behave" though; to extend the example in
| TFA, if your fade progress went from 0->0.99 but then stopped
| before it actually reached 1 for some reason, you might find
| that you still had a (small, but still present) signal on the
| output, which, if the peak-peak amplitude was < 0.1, the test
| wouldn't catch.
|
| Obviously any time you're working with floating-point sample
| data the precise values of floats will almost never be
| bit-accurate against what your model predicts (sometimes even
| if that model is a previous run of the same system with the
| same inputs as in this case); it's about defining an
| acceptable deviation. I guess what I'm saying is that for
| audio software, a peak-peak error of 0.1 equates to a signal
| at -20 dBFS (ref dBFS@1.0) (which of course is quite a large
| amount of error for an audio signal), so perhaps using
| higher-resolution graphs would be a good idea.
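|
| For reference, the arithmetic behind that figure can be sketched
| as below, assuming the usual 20*log10 definition with full scale
| at 1.0. The helper name is hypothetical. Note that the result
| depends on whether 0.1 is read as a peak amplitude or as a
| peak-to-peak swing (a peak of 0.05):

```python
import math

def dbfs(amplitude, full_scale=1.0):
    """Level of a linear amplitude relative to full scale, in dB."""
    return 20 * math.log10(amplitude / full_scale)

level_at_0_1 = dbfs(0.1)    # the -20 dBFS figure quoted above
level_at_0_05 = dbfs(0.05)  # roughly -26 dBFS, if 0.1 is read as peak-to-peak
```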
|
| (Has anyone made a tool to diff sixels yet? /s)
| jwosty wrote:
| Fair points here. Unfortunately adding more vertical
| resolution starts to get a little unwieldy to navigate
| through. Maybe it could start using different characters to
| multiply the resolution to something sufficiently less
| forgiving of errors. If it could choose between even 3
| chars, for example, it would effectively squash 3 possible
| values into one line, tripling the resolution.
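|
| One way the multi-character idea could look, as a sketch only:
| each row is split into three bands and the glyph says which band
| the sample falls in. The glyph set "_-^" and the floor-based
| binning are assumptions made for illustration:

```python
def encode(sample, step=0.2, chars="_-^"):
    """Return (row, char): the graph row plus a glyph for the sub-row level.

    Each row of height `step` is split into len(chars) bands, lowest first.
    """
    band = step / len(chars)
    level = int(sample // band)          # index of the fine-grained band
    return level // len(chars), chars[level % len(chars)]

# Three samples that share row 0 now render with three different glyphs.
low, mid, high = encode(0.03), encode(0.10), encode(0.17)
```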
| necubi wrote:
| This is such a great idea! I've really struggled with how to test
| real-time audio code in the live looper I've been working on [0].
| Most of my tests use either very small, hand-constructed arrays,
| or arrays generated by some function.
|
| This is both tedious and makes it very hard to debug test
| failures (especially with cases like crossfades, pan laws, and
| looping). I love the idea of having a visual representation that
| lets me see what's going wrong in the test output, and I'm
| definitely going to try to implement some similar tests.
|
| I'm also curious what the state-of-the-art is for these sorts of
| tests. Does anyone have insight into what e.g., Ableton's test
| suite looks like?
|
| [0] http://github.com/mwylde/loopers
| jwosty wrote:
| > I'm also curious what the state-of-the-art is for these sorts
| of tests. Does anyone have insight into what e.g., Ableton's
| test suite looks like?
|
| I don't know, but if I were to make an educated guess, maybe
| rendering stuff to actual audio files is a common approach?
| That way when something goes wrong, they can inspect it in a
| standard waveform editor?
| rbanffy wrote:
| Am I the only one almost offended by Braille not being ASCII?
|
| edit: Yes. I miscalculated the dot density.
|
| /me slaps forehead
| thewakalix wrote:
| Aren't those asterisks?
| rbanffy wrote:
| Oh... The shame...
|
| Yes. I miscalculated the dot density. :-(
| rbanffy wrote:
| If we go beyond ASCII, Unicode has had 2x2 mosaics since forever
| (they were present in DEC terminals) and 2x3 mosaics (from
| Teletext and the TRS-80) since version 13. Some more enlightened
| terminals (such as VTE) implement those symbols without needing
| font support.
|
| Or you can use Braille to get 2x4 mosaics, but they usually look
| terrible.
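|
| A small sketch of the Braille variant: the Unicode Braille
| Patterns block starts at U+2800 and each of the eight dots maps
| to one bit, so a 2x4 block of pixels packs into one character.
| The function and table names are hypothetical:

```python
# Bit for each dot in a Braille cell, indexed [row][column] (Unicode dot numbering).
DOT_BITS = [
    [0x01, 0x08],
    [0x02, 0x10],
    [0x04, 0x20],
    [0x40, 0x80],
]

def braille_cell(dots):
    """Pack a 4-row x 2-column grid of booleans into one Braille character."""
    bits = 0
    for r in range(4):
        for c in range(2):
            if dots[r][c]:
                bits |= DOT_BITS[r][c]
    return chr(0x2800 + bits)

# A descending diagonal squeezed into a single character cell.
diagonal = braille_cell([
    [True, False],
    [True, False],
    [False, True],
    [False, True],
])
```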
| jwosty wrote:
| I just might have to try this next.
| focom wrote:
| Would love to use it as a library! Is it open source?
| jwosty wrote:
| I've added an fssnip for the ASCII renderer. It uses NAudio.
| Should be pretty easy to use. http://www.fssnip.net/85g
| jwosty wrote:
| Not yet, but it certainly could be. Would it be useful to
| publish the helper classes that render the waves out to ASCII?
| That's really the guts of the thing. After that, you just use
| whatever testing framework you want to do the actual diffing
| (in my case Expecto for F#).
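|
| For readers who want the gist without the F# snippet, here is a
| rough Python sketch of what such a renderer might do. It is not
| the author's implementation (that one is the fssnip linked above
| and uses NAudio); the grid layout, labels, and rounding rule are
| assumptions:

```python
def render(samples, step=0.2, lo=-1.0, hi=1.0):
    """Draw one '*' per sample on a grid with one text row per `step` of amplitude."""
    top = round(hi / step)
    bottom = round(lo / step)
    rows = []
    for level in range(top, bottom - 1, -1):     # highest amplitude first
        cells = ["*" if round(s / step) == level else " " for s in samples]
        rows.append(f"{level * step:+.1f} |" + "".join(cells))
    return "\n".join(rows)

ramp = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
picture = render(ramp)
```

| A test would then compare render(actual) against an expected
| multi-line string and let the framework's text diff show where
| the two pictures disagree.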
| robotsteve2 wrote:
| Once you've got the waveforms as arrays, what do you need the
| ASCII rendering for?
|
| Instead of diffing ASCII-rendered waveforms, save the arrays and
| diff the arrays (and then use any kind of numerical metric on the
| residual). Scientist programmers have all sorts of techniques for
| testing and debugging software that processes sampled signals.
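|
| A minimal sketch of the array-based alternative: compute the
| residual between actual and expected samples and assert on a
| numerical metric. The choice of metrics (peak and RMS error) and
| the tolerance are arbitrary examples, not a recommendation:

```python
import math

def residual_stats(actual, expected):
    """Return (peak absolute error, RMS error) of the sample-wise residual."""
    if len(actual) != len(expected):
        raise ValueError("signals must have the same length")
    residual = [a - e for a, e in zip(actual, expected)]
    peak = max(abs(r) for r in residual)
    rms = math.sqrt(sum(r * r for r in residual) / len(residual))
    return peak, rms

expected = [0.0, 0.5, 1.0, 0.5, 0.0]
actual = [s + 0.05 for s in expected]        # a small constant DC offset
peak, rms = residual_stats(actual, expected)
passes = peak <= 1e-3                        # tolerance is an arbitrary example
```

| With these inputs the check fails, so a DC offset of 0.05 is
| caught regardless of any graph resolution.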
| user-the-name wrote:
| Imagine if we had terminals that could handle graphical data. We
| wouldn't have to do weird kludges like this, we could just plot
| the waveforms in the output of our tools.
|
| But it's 2021, and not only is this not possible, there is not
| even a path forward to a world where this would be possible. It's
| just not an option. Nobody is working on this, nobody is trying
| to make this happen. We're just sitting here with our text
| terminals, and we can't even for a second imagine that there
| could be anything else.
|
| It's sad, is what it is.
| HPsquared wrote:
| Notebook interfaces are basically that, e.g. Jupyter or
| Mathematica.
| voldacar wrote:
| In TempleOS you can mix text, images, hyperlinks, and 3d models
| in the terminal. This is true for the whole system: you could
| literally have a spinning 3d model of a tank as a comment in a
| source file. That's right, it took a literal schizophrenic to
| make an OS with a feature that should have been standard
| decades ago.
|
| Nobody tries to make actually interesting new operating systems
| anymore. OS research today is just "let's implement unix with
| $security_feature", nobody is actually trying to make computers
| more powerful or fun to use, or design a system based off of a
| first-principles understanding of what a computer should be.
|
| God I wish I was born in the lisp machine timeline
| woodrowbarlow wrote:
| the downside of rich terminal output is that media formats
| become the system's responsibility. applications can't output
| media in formats that aren't provided by the system, because
| then the terminal wouldn't know how to display it and interop
| with other applications (e.g. piping) wouldn't work either.
| voldacar wrote:
| You could let a program create an API for manipulating a
| new type of data and inform the system about it so that
| other programs could use it. This is more or less what
| AmigaOS did; you installed a datatype for e.g. a PSD file,
| then all your programs that worked with images could read
| PSD files. I think it's a nice idea.
| PeterisP wrote:
| The features you describe belong to the app ecosystem, not to
| the OS - IMHO the OS is about hardware and drivers, and what
| kind of graphics is supported by your terminal and source
| file editor is orthogonal to the OS and could be done in any
| of the current OSes; but that would require a
| rewrite/redesign/reimagining of the whole standard
| application package which seems a much larger project than
| "merely" an OS.
| voldacar wrote:
| "There are more things in heaven and earth, Horatio, than
| are dreamt of in your philosophy"
|
| An OS facilitates communication between programs running on
| a computer. Unix lets those programs communicate by sending
| characters of text to each other. You could just as easily
| imagine an OS that lets them communicate by sending images,
| audio/video, 3d models, etc. An OS can be way more than
| what you think it is. To detox your brain from this unix
| worldview, spend some time in a VM and play around with
| AmigaOS or Open Genera. Those were actual coherent OSes with
| an actual view of what a computer should be and how it
| should behave. Unix isn't.
|
| > reimagining of the whole standard application package
| which seems a much larger project than "merely" an OS.
|
| By OS, I don't mean kernel. I mean the base set of software
| that lets you interact with your computer and do
| interesting stuff with it.
| rbanffy wrote:
| The line between app platform and OS is a blurry one. The
| Amiga OS, for instance, has libraries for specific file
| types that expose standardized entry points. This way, if
| you install the library for Photoshop files, all graphics
| programs that adhere to that protocol will be able to read
| and write Photoshop PSD files. Microsoft had DDE and,
| later, OLE, for embedding objects from one program into
| data from another in a standard way all programs were
| supposed to share. It was a pain.
|
| This blurry line is present in other environments as well.
| In the Apple Lisa, installing a program resulted in new
| templates in the Stationery folder. In Smalltalk,
| installing a program adds its class definitions to the
| system as independent entities you could use in your own
| programs.
|
| Not all operating systems are the children of Unix and VMS.
| bitwize wrote:
| Smalltalk's components were so tightly interdependent
| that their integration smoke test was 4+3, because
| evaluating a simple addition expression exercised like
| 3/4 of the entire system.
| outworlder wrote:
| Some terminals can.
|
| https://iterm2.com/documentation-images.html
|
| That's iTerm2's own implementation. There's also sixel, as
| pointed out by another comment.
| MayeulC wrote:
| In truth, it's because text is quite easy to handle. It's easy
| to make a program that handles text, too.
|
| And so we have a lot of text editors, diff tools, efficient
| compression, tools like sort and uniq: the whole unix
| ecosystem.
|
| So if you transform sound to text, you can then use text tools
| to compare the output to catch differences. A simple
| serialization of numerical sample values would have caught the
| bug, but I agree that having a way of visualizing the output is
| nice.
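|
| The "simple serialization" route can be sketched like this: one
| fixed-precision sample per line, then any text diff tool. Here
| Python's difflib stands in for diff(1); the three-decimal format
| is an arbitrary assumption:

```python
import difflib

def serialize(samples, digits=3):
    """One fixed-precision sample per line, so text tools can diff it."""
    return [f"{s:+.{digits}f}" for s in samples]

baseline = serialize([0.0, 0.5, 1.0, 0.5])
current = serialize([0.0, 0.5, 0.99, 0.5])   # one sample is slightly off

diff = list(difflib.unified_diff(baseline, current, "baseline", "current", lineterm=""))
```

| The resulting diff contains the lines "-+1.000" and "++0.990",
| which pinpoints the sample that changed but says little about
| the overall shape of the waveform.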
|
| Command line input, programming, etc. is also still mostly done
| with text, because it's easy to transform. Of course, you can
| imagine working at a higher level with objects (like PowerShell
| does IIRC), mimetypes, etc.
| thanatos519 wrote:
| Maybe you would like to support https://ctx.graphics/
| zokier wrote:
| > Imagine if we had terminals that could handle graphical data.
|
| We have. They are called "browsers". You might be even using
| one right now!
| charlesdaniels wrote:
| I would point out that sixels[0] exist. There is a nice
| library, libsixel[1] for working with it, which includes
| bindings into many languages. If the author of sixel-tmux[2][3]
| is to be believed[4], the relative lack of adoption is a result
| of unwillingness on the part of maintainers of some popular
| open source terminal libraries to implement sixel support.
|
| I can't comment on that directly, but I will say, it's pretty
| damn cool to see GnuPlot generating output right into one's
| terminal. lsix[5] is pretty handy as well.
|
| But yeah, I agree, I'm not a fan of all the work that has gone
| into "terminal graphics" that are based on unicode. It's a
| dead-end, as was clear to DEC even back in '87 (and that's
| setting aside that the VT220[6] had its own drawing
| capabilities, though they were more limited). Maybe sixel isn't
| the best possible way of handling this, but it does have the
| benefit of 34 years of backwards-compatibility, and with the
| right software, you can already use it _now_.
|
| 0 - https://en.wikipedia.org/wiki/Sixel
|
| 1 - https://saitoha.github.io/libsixel/
|
| 2 - https://github.com/csdvrx/sixel-tmux
|
| 3 - https://news.ycombinator.com/item?id=28756701
|
| 4 - https://github.com/csdvrx/sixel-tmux/blob/main/RANTS.md
|
| 5 - https://github.com/hackerb9/lsix
|
| 6 - https://en.wikipedia.org/wiki/VT220
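|
| To give a feel for how little is involved in the sixel format
| referenced above, here is a rough monochrome encoder. It is a
| sketch based on a general understanding of the format (six-pixel
| bands, one character per column, offset 63) and has not been
| checked against any particular terminal; the function name is
| hypothetical:

```python
def to_sixel(bitmap):
    """Encode a monochrome bitmap (list of rows of 0/1) as a sixel escape sequence."""
    height, width = len(bitmap), len(bitmap[0])
    bands = []
    for top in range(0, height, 6):              # each band covers six pixel rows
        chars = []
        for x in range(width):
            bits = 0
            for k in range(6):
                y = top + k
                if y < height and bitmap[y][x]:
                    bits |= 1 << k               # least significant bit is the top pixel
            chars.append(chr(63 + bits))
        bands.append("#1" + "".join(chars))
    # ESC P q starts sixel data; "#1;2;100;100;100" defines colour 1 as white
    # (RGB percentages); "-" moves to the next band; ESC \ ends the sequence.
    return "\x1bPq" + "#1;2;100;100;100" + "-".join(bands) + "\x1b\\"

block = to_sixel([[1, 1]] * 6)                   # a 2x6 block of lit pixels
```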
| user-the-name wrote:
| That's a protocol that's a good forty years old, and even
| that is not supported. And I can see why, why on earth would
| you want to be adding support for that in 2021? What a
| ridiculous state of affairs.
| jwosty wrote:
| That's interesting. Do you think sixels could work for the
| baseline tests? Would it be feasible to have them display
| nicely in an IDE, like VS Code or Visual Studio?
| MayeulC wrote:
| I find kitty's graphics protocol to be a superior
| implementation of the idea:
| https://sw.kovidgoyal.net/kitty/graphics-protocol/
| rbanffy wrote:
| The venerable xterm and a lot of later physical terminals
| (those things with CRTs) can emulate Tektronix graphics
| (Tektronix, which today makes instruments, also made computer
| terminals with fancy storage CRTs that were kind of
| e-paper-like, but with a green - and sometimes yellow - screen).
| iTerm2 and some others, as pointed out, can do Sixel graphics
| (a format originally designed for DEC dot-matrix printers that
| some DEC terminals also implement).
| user-the-name wrote:
| I mean, yes, that is how sad the current state is.
| rbanffy wrote:
| VTE and, with it, almost every Linux distro, will get Sixel
| support soon. I volunteered to add Tektronix graphics to it
| too, but this is neither a dire need, nor something I have
| done before, so it'll take some time.
| user-the-name wrote:
| It's forty years old. Why on earth would you be adding
| that in 2021?
|
| Why are we not focusing our energy on making something
| that is actually up to date?
| rbanffy wrote:
| Because things that existed 40 years ago are useful,
| already have software written for them, are compatible in
| sometimes unforeseen ways (a DEC dot-matrix graph can be
| printed as is on a Sixel-compatible terminal!) and have
| been battle tested for ages.
|
| There is a reason the Unix way of bytestream-based shell
| and pipes is still useful and present these days to the
| point that That Other OS is now embedding Linux in it.
|
| Also, these ancient terminals often had some interesting
| typography options that are encoded in the ANSI standard but
| that most modern terminals don't bother with (line attributes
| that generate wider and taller cells are one such example).
|
| These formats may be more desirable than more modern and
| complete ones such as PostScript for other reasons. I
| wouldn't advise implementing a terminal capable of
| rendering PostScript graphics because it's one more way
| to infiltrate malware into your computer by rendering
| untrusted inputs (There are a lot of RCE opportunities in
| exploiting vulnerable decoders).
| gwbas1c wrote:
| > It's sad, is what it is.
|
| With graphics being everywhere in 2021, I wouldn't call this
| situation "sad," I'd think a lot more critically about _why._
|
| To start with, fixed-width text is significantly easier to work
| with than graphics.
|
| Nothing's stopping anyone from writing a CI tool that outputs
| to HTML with embedded images. The bigger question is why it's
| uncommon.
___________________________________________________________________
(page generated 2021-10-13 23:00 UTC)