[HN Gopher] Emulating an emulator inside itself. Meet Blink
___________________________________________________________________
Emulating an emulator inside itself. Meet Blink
Author : 0xhiro
Score : 86 points
Date : 2023-01-04 19:42 UTC (3 hours ago)
(HTM) web link (hiro.codes)
(TXT) w3m dump (hiro.codes)
| asciii wrote:
| > Me: How small can an emulator be? Blink: Yes.
|
| My favorite FAQ
| mtlynch wrote:
| Blink sounds cool, but this blog post is pretty thin. It's just
| restating a handful of tweets about Blink by its author.
| jart wrote:
| Author of Blink here. Ask me anything :-)
| monocasa wrote:
| What do you attribute the perf win over Qemu to? A bunch
| of micro-optimizations, fewer abstraction layers, or
| something more systemic?
| saagarjha wrote:
| QEMU TCG is not particularly optimized for performance.
| It's not all that hard to do better than it, especially if
| you target only one architecture.
| bonzini wrote:
| It's also not that easy though, and blink's code
| generation is comparable to QEMU's in 2007 or so.
|
| I suspect the reason why blink is faster is _not_
| related to code generation, as I mentioned in another
| comment.
| jart wrote:
| Blink is like a Tesla sports car whereas Qemu is like a
| locomotive. I think what may be happening is that Qemu has a
| lot of heavy hitting optimizations that benefit long-
| running compute intensive programs. But if you just want to
| run a big program like GCC ephemerally as part of your
| build system, the cost of the locomotive gaining speed
| doesn't pay off, since there's nothing to amortize over.
| Blink's JIT also accelerates quickly because it uses a
| printf-style DSL and it doesn't relocate. The tradeoff is
| that JIT path construction sometimes fails and needs to be
| retried.
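|
| To make "printf-style DSL" a bit more concrete: the idea is
| that the JIT appends machine code driven by a format string
| instead of going through a heavyweight IR. A minimal sketch
| in C (the function name and directives below are made up
| for illustration, not Blink's actual API):
|
|   #include <stdarg.h>
|   #include <stddef.h>
|   #include <stdint.h>
|   #include <string.h>
|
|   /* Append encoded instructions to a fixed buffer. Returns
|      bytes written, or -1 if the buffer is full, in which
|      case the caller retries with a fresh block (the "path
|      construction sometimes fails" tradeoff). */
|   static int EmitJit(uint8_t *buf, size_t cap, const char *fmt, ...) {
|     va_list ap;
|     size_t n = 0;
|     va_start(ap, fmt);
|     for (; *fmt; ++fmt) {
|       switch (*fmt) {
|         case 'b':  /* one literal byte, e.g. an opcode */
|           if (n + 1 > cap) goto overflow;
|           buf[n++] = (uint8_t)va_arg(ap, int);
|           break;
|         case 'q': {  /* 64-bit immediate, little endian */
|           uint64_t imm = va_arg(ap, uint64_t);
|           if (n + 8 > cap) goto overflow;
|           memcpy(buf + n, &imm, 8);
|           n += 8;
|           break;
|         }
|         default:
|           break;  /* ignore unknown directives */
|       }
|     }
|     va_end(ap);
|     return (int)n;
|   overflow:
|     va_end(ap);
|     return -1;
|   }
|
|   /* x86-64 "movabs rax, imm64" encodes as 48 B8 <imm64>:   */
|   /* EmitJit(block, sizeof(block), "bbq", 0x48, 0xB8,        */
|   /*         (uint64_t)imm);                                 */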
|
| Another great example of this tinier-is-better phenomenon
| would be v8 vs. quickjs. Fabrice Bellard singlehandedly
| wrote a JavaScript interpreter that runs the Test262 suite
| something like 20x faster than Google's flagship V8 software,
| because once again, tests are ephemeral. It's amazing how
| much quicker QuickJS is. But if you wanted to do something
| like write a JS MPEG decoder to show television
| advertisements without a <video> tag then v8 is going to be
| faster, since it's a locomotive.
|
| Fabrice Bellard wrote Qemu too. But I suspect his Tiny Code
| Generator has gotten a lot heftier over the years as so
| many people everywhere contributed to it. I really want to
| examine his original source code, since I'd imagine what he
| originally did probably looked a lot more like Blink than
| it looks like modern Qemu.
| [deleted]
| JoshTriplett wrote:
| Would it be fair to describe Blink's JIT as more of a
| "baseline JIT" to QEMU's "optimized JIT", or does that
| analogy not accurately capture what you mean in the first
| paragraph?
| saagarjha wrote:
| > Fabrice Bellard singlehandedly wrote a JavaScript
| interpreter that runs the Test262 suite something like
| 20x faster than Google's flagship V8 software
|
| Something is wrong here. How did you test this? QuickJS
| might start up faster on very small testcases but V8 is
| not _that_ slow; it needs to have very low latency on a
| webpage too. Did you run a debug build or something?
| rcme wrote:
| I have no knowledge of what allows QuickJS to run the
| tests faster, or if it even does run the tests faster,
| but QuickJS does have one big speed advantage over V8 in
| some circumstances: QuickJS allows ahead-of-time
| compilation of JS to byte code. This removes the need to
| parse the JS at execution time. It's a pretty nifty
| feature.
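|
| Concretely, with the embedding API you can compile once and
| reuse the bytecode (this is roughly what the bundled qjsc
| tool does). A trimmed sketch, error handling omitted:
|
|   #include <stdint.h>
|   #include <string.h>
|   #include "quickjs.h"
|
|   /* Compile source to bytecode once, ahead of time. The
|      returned buffer is owned by the caller (js_free). */
|   static uint8_t *CompileToBytecode(JSContext *ctx, const char *src,
|                                     size_t *out_len) {
|     JSValue obj = JS_Eval(ctx, src, strlen(src), "<aot>",
|                           JS_EVAL_TYPE_GLOBAL | JS_EVAL_FLAG_COMPILE_ONLY);
|     uint8_t *buf = JS_WriteObject(ctx, out_len, obj, JS_WRITE_OBJ_BYTECODE);
|     JS_FreeValue(ctx, obj);
|     return buf;  /* store on disk or embed in a binary */
|   }
|
|   /* Later: run the bytecode without re-parsing the source. */
|   static void RunBytecode(JSContext *ctx, const uint8_t *buf, size_t len) {
|     JSValue obj = JS_ReadObject(ctx, buf, len, JS_READ_OBJ_BYTECODE);
|     JSValue ret = JS_EvalFunction(ctx, obj);  /* consumes obj */
|     JS_FreeValue(ctx, ret);
|   }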
| andai wrote:
| Fascinating. Most JS code is ephemeral, i.e. rarely is
| something as intensive as video encoding done in the
| browser (and even then WebAssembly would usually be
| preferred).
|
| It seems to me like browsers would benefit from running
| most code in QuickJS, and then spinning up V8 only for
| those rare cases of long-running JS?
| nightpool wrote:
| "Ephemeral" is relative. Most JS code in the browser runs
| for at least 30 seconds, if not longer, as the user
| interacts with the page. That's plenty of time to spend
| spare cycles on JITing in the background to make
| responsiveness better without worrying about 100s of
| milliseconds of startup / shutdown latency.
| saagarjha wrote:
| V8 is optimized for real-world use cases, not benchmarks.
| Any modern browser will blow QuickJS out of the water for
| anything that's non-trivial.
| bonzini wrote:
| Hi Justine, QEMU developer here. Great job on Blink! You
| have done a lot of cool work and it's been fun to follow.
| I enjoyed looking at different choices you made in the
| frontend, for example flags handling is very different
| from QEMU.
|
| QEMU's code generator is actually pretty fast and
| shouldn't really be expensive. It's a handful of passes
| that are run on individual basic blocks; certainly not
| optimal when a lot of code runs only once, as is the case
| for a very short compile, but it's nothing like v8.
|
| I suspect an even sillier reason--startup time might be
| the biggest factor, because I think qemu-user's
| startup has never been optimized. I assume both QEMU and
| blink binaries are statically linked (or both dynamically
| linked, alternatively)?
|
| Anyhow these theories should be pretty easy to disprove
| just by compiling something larger than hello world, so I
| will do it in case there's some low-hanging fruit left.
| pwdisswordfish9 wrote:
| > Fabrice Bellard singlehandedly wrote a JavaScript
| interpreter that
|
| No he didn't.
| [deleted]
| googlryas wrote:
| "Charlie Gordon was involved too", I suppose, would be
| a more constructive comment.
| Y_Y wrote:
| Why bother making this? (even if it is really cool)
| trashburger wrote:
| https://justforfunnoreally.dev/
| jart wrote:
| We do what we must because we can.
| sidewndr46 wrote:
| Where is the getting started guide for this?
| jart wrote:
| Here's a gentle introduction. https://github.com/jart/blink
| /tree/master/third_party/sector... See also
| https://justine.lol/sectorlisp2/
| gabcoh wrote:
| The comparison with QEMU is with KVM disabled, right?
| Assuming this is true, how does it compare with KVM enabled?
| fwsgonzo wrote:
| KVM allows you to run guests directly on the CPU and has
| native performance
| monocasa wrote:
| Well, not quite 'native'. TLB refills are 4x to 5x as
| expensive, and anything that needs a context switch tends
| to be at a minimum twice as expensive, and it's common to
| balloon even farther from there.
| fwsgonzo wrote:
| I guess that's mostly if you are running a full operating
| system inside it, generally in Qemu. It doesn't have to
| be - could just be a program. Tiny programs running in
| KVM can use big pages and never cause or require any
| pagetable changes.
|
| For simple workloads it can even be faster than native
| unless you dynamically load something that uses bigger
| pages for your native program, e.g.
| https://easyperf.net/blog/2022/09/01/Utilizing-Huge-
| Pages-Fo...
| monocasa wrote:
| It's harder to force huge pages on a guest than it is to
| just use them in regular user space where you can simply
| mmap them in.
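|
| On Linux that's a one-liner with MAP_HUGETLB; a minimal
| sketch, assuming huge pages have been reserved (e.g. via
| /proc/sys/vm/nr_hugepages):
|
|   #define _GNU_SOURCE
|   #include <stdio.h>
|   #include <sys/mman.h>
|
|   int main(void) {
|     size_t len = 2 * 1024 * 1024;  /* one 2 MiB huge page */
|     void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
|                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
|     if (p == MAP_FAILED) {
|       perror("mmap(MAP_HUGETLB)");  /* no huge pages reserved? */
|       return 1;
|     }
|     /* ... touch p; TLB refills are far rarer than with 4k pages ... */
|     munmap(p, len);
|     return 0;
|   }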
|
| And none of that accounts for the increased context
| switch time.
| fwsgonzo wrote:
| The guest is not in control - sure, there's a few pages at
| the beginning of each section that have to be 4k until you
| reach the first 2MB-multiple.
|
| What context switch time? It takes 5 micros to enter and
| leave the guest. The rest is just "workload".
|
| The point is: KVM is native speed if you never have to
| leave. I don't need to prove this; anyone can see it has
| to be true.
| monocasa wrote:
| > The guest is not in control
|
| The guest has its own page tables above the nested guest
| phys->host phys tables.
|
| > What context switch time? It takes 5 micros to enter
| and leave the guest. The rest is just "workload".
|
| And then the kernel doesn't know what to do with nearly
| every guest exit on KVM, so then you trap out to host
| user space, which then probably can't do much without the
| host kernel so you transition back to kernel space to
| actually perform whatever IO is needed, then back to host
| user, then back to host kernel to restart the guest, then
| back from host kernel to guest. So six total context
| swaps on a good day: guest->host_kern->host_user->
| host_kern->host_user->host_kern->guest.
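|
| For reference, the host-user-space half of that round trip
| is the classic KVM_RUN loop; a bare sketch against the
| /dev/kvm ioctl API, with vcpu and memory setup omitted:
|
|   #include <linux/kvm.h>
|   #include <stdio.h>
|   #include <sys/ioctl.h>
|   #include <sys/mman.h>
|
|   /* Every exit the kernel can't resolve ends up back here
|      in host user space, then usually back in the host
|      kernel to do the actual I/O, then back into the guest. */
|   int run_vcpu(int kvmfd, int vcpufd) {
|     int sz = ioctl(kvmfd, KVM_GET_VCPU_MMAP_SIZE, NULL);
|     if (sz < 0) return -1;
|     struct kvm_run *run = mmap(NULL, (size_t)sz, PROT_READ | PROT_WRITE,
|                                MAP_SHARED, vcpufd, 0);
|     if (run == MAP_FAILED) return -1;
|     for (;;) {
|       if (ioctl(vcpufd, KVM_RUN, NULL) < 0) return -1;  /* enter guest */
|       switch (run->exit_reason) {
|         case KVM_EXIT_HLT:
|           return 0;                /* guest is done */
|         case KVM_EXIT_IO:
|         case KVM_EXIT_MMIO:
|           /* emulate the access, then loop back into the guest */
|           break;
|         default:
|           fprintf(stderr, "unhandled exit %u\n", run->exit_reason);
|           return -1;
|       }
|     }
|   }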
| fwsgonzo wrote:
| Right, that's very true! It's clear that you know what
| you're talking about when it comes to KVM and maybe even
| the internal structure in Linux. However, I/O can be
| avoided. Imagine a guest that needs no I/O, doesn't have
| any interrupts enabled, and simply runs a workload
| straight on the CPU (given that it has all the bits it
| needs). That is what I have made for $COMPANY, which is
| in production, and serves a ... purpose. I can't really
| elaborate more than I already have. But you get the gist
| of it. It works great. It does the job, and it sandboxes
| a piece of code at native speed. Lots of ifs and buts and
| memory sharing and tricks to get it to be fast and low
| latency. No need for JIT, which is a security and
| complexity nightmare.
|
| The topic of this thread is about Blink, which happens to
| be a userspace emulator. Hence my comment.
| jart wrote:
| I usually measure the functions I write in picoseconds
| per byte, so 5 microseconds is an eternity.
| [deleted]
| bonzini wrote:
| 10 ps/byte is equivalent to 100 GB/sec; unless you
| routinely write functions that run in the tens of GB/sec
| range, you probably mean nanoseconds?
| monocasa wrote:
| I think this is a user mode emulator, so qemu with kvm
| isn't a great comparison.
| jart wrote:
| Blink is primarily a user mode emulator, but it does
| support real mode BIOS programs. It can even bootstrap
| Cosmopolitan Libc bare metal programs into long mode.
| Here's a video of Blink doing just that. https://storage.
| googleapis.com/justine/sectorlisp2/sectorlis...
| [deleted]
| gabcoh wrote:
| Is this true? Why can't qemu use kvm for user mode
| emulation?
| monocasa wrote:
| Nobody's really set it up to do that as it's easier to
| use Linux's sandboxing features if you're looking to run
| user code of the same cpu ISA. GVisor has an
| (experimental last time I checked) backend that uses KVM
| to run user mode code, but there you have the win of the
| sandboxing code being written in a memory safe language
| and giving you a real privilege boundary as opposed to
| the sieve that qemu-user is. In just about every other
| instance just running code natively in regular user space
| (even if sandboxed with seccomp or a ptrace jail)
| achieves the underlying goals better.
| jart wrote:
| It depends on whether you're more afraid of language bugs
| or hardware bugs. One potentially nice thing about having
| a tool like Blink that can fully virtualize the memory of
| existing programs is that it's sort of like an extreme
| version of ASLR. In order to virtualize a fixed address
| space, you have to break apart memory into pieces and
| shuffle them around into things like radix tries, and
| that might provide enough obfuscation of the actual
| memory to protect you from someone rowhammering your
| system. I don't know if it's true but it'd be fun to
| test.
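|
| The lookup itself is just a pointer chase. A hypothetical
| sketch of the shape of it (illustrative only, not Blink's
| actual data structure):
|
|   #include <stdint.h>
|
|   /* Guest pages are scattered across the host heap and found
|      through a 4-level radix trie keyed on the guest address,
|      much like a hardware page table walk. */
|   #define PAGE_BITS  12
|   #define LEVEL_BITS 9   /* 512 slots per level, 48-bit space */
|   #define LEVELS     4
|
|   struct Node {
|     void *slot[1 << LEVEL_BITS];  /* next level, or host page at leaf */
|   };
|
|   static void *LookupGuestPage(struct Node *root, uint64_t vaddr) {
|     struct Node *n = root;
|     for (int level = LEVELS - 1; level > 0; --level) {
|       unsigned i = (vaddr >> (PAGE_BITS + level * LEVEL_BITS)) &
|                    ((1u << LEVEL_BITS) - 1);
|       n = n->slot[i];
|       if (!n) return 0;  /* unmapped guest address */
|     }
|     return n->slot[(vaddr >> PAGE_BITS) & ((1u << LEVEL_BITS) - 1)];
|   }
|
| Where each guest page actually lands in host memory is up to
| the allocator, which is what gives the ASLR-like scattering.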
| fathyb wrote:
| KVM requires additional privileges. A Linux container
| would need privileged rights and access to /dev/kvm to
| run QEMU with KVM, for example, whereas any container
| should be able to run it in user mode.
| monocasa wrote:
| That's not really an issue, as there's a lot of
| infrastructure around optionally giving device file
| access to containers. That's why
| SECCOMP_IOCTL_NOTIF_ADDFD exists.
| 0xhiro wrote:
| Blink is a new CPU emulator written in C, made by Justine Tunney.
| Besides having a really cool name, blink has a lot of impressive
| features and some of them will blow your mind!
| sitkack wrote:
| Is it not too late to RiiR?
| [deleted]
| pdntspa wrote:
| This guy got Doom running inside of Doom using a code execution
| exploit
|
| https://www.youtube.com/watch?v=c6hnQ1RKhbo
| zamadatix wrote:
| "blinkenlights" put a smile on my face.
|
| Looks like Blink itself can't yet be compiled with
| Cosmopolitan Libc (though it emulates programs compiled
| with it), but that's planned - very cool!
| jart wrote:
| Author here. I'm planning to get Blink to compile with
| Cosmopolitan Libc as soon as possible. There's just a few
| switch statements that need to be refactored. There's a really
| nice `cosmocc` toolchain that makes building POSIX software
| with Cosmo easier than ever. See
| https://github.com/jart/cosmopolitan/blob/master/tool/script...
| and
| https://github.com/jart/cosmopolitan/blob/master/tool/script...
| jvolkman wrote:
| Will compiling with Cosmopolitan enable it to run on Windows?
| jart wrote:
| Absolutely. If you download last year's release of
| Blinkenlights, you can actually use this software on
| Windows today. It works great in the Windows 10 command
| prompt or powershell.
| https://justine.lol/blinkenlights/download.html
| jvolkman wrote:
| Awesome. I actually started trying to get it to build
| against mingw-w64 earlier today, but I guess I'll just
| wait for you. :)
|
| I'm not a windows user, but super interested in using
| Blink to ship pre-compiled binaries as part of various
| Bazel rule sets.
| [deleted]
| mhh__ wrote:
| Is it actually 2x faster or 2x faster at starting up? QEMU does
| so much stuff, running cc1 on hello world isn't really a stress
| test of the interpreter IMO as much as of all the crap that goes
| around it.
| jart wrote:
| Blink actually does run the GCC9 CC1 command from start to
| finish twice as fast. Qemu takes 600ms to run it and Blink
| takes 300ms. Both Qemu and Blink use a JIT approach. Since GCC
| CC1 is a 33mb binary, a lot of the time it takes to run
| is spent stressing the JIT pretty hard.
| https://twitter.com/JustineTunney/status/1610276286269722629
| mhh__ wrote:
| That's partly what I meant though, how fast is it at a longer
| running process? C doesn't require all that much semantic
| analysis so there usually isn't all that much hot code in the
| compiler, so it would suit a simple-fast JIT whereas QEMU
| does do some basic optimizations.
|
| I've only ever really skimmed the TCG source code but it
| wouldn't surprise me if a newer JIT could smack its arse
| given that with these old C codebases (it's probably one of
| Bellard's few flaws) it's pretty hard to actually make true
| architectural changes.
|
| The Java/script (I think more Javascript but I'm hedging my
| bets by including jvms too) JITs are probably the cutting
| edge but I'd imagine still quite beatable for a few cases.
| muricula wrote:
| At a glance, the debugger user interface looks much nicer than
| gdb's terminal ui. How tightly coupled is the debugger interface
| to the emulator/debugger engine? How much work would it be to
| plug in a different debugger, say lldb or gdb, into the ui
| instead of blink?
|
| I think the user experience of cli debuggers is generally
| somewhat dreadful when compared to their gui cousins -- they seem
| to display a much narrower view of what's going on. Could the big
| blinkenlights debugger view be useful outside of blink itself?
| saagarjha wrote:
| Ideally Blink would just support the GDB RSP, so you could
| directly use GDB or LLDB on the emulator itself.
| yurymik wrote:
| You can go the other way around and use other TUIs for GDB:
|
| * https://github.com/pwndbg/pwndbg
| * https://github.com/longld/peda
| * https://github.com/hugsy/gef
| AaronFriel wrote:
| This is really neat, but not to be confused with Blink, the name
| of the browser engine underlying Google Chrome, Chromium, and
| derivative browsers.
| rcarr wrote:
| The problem is compounded further considering the most popular
| terminal emulator on iOS is also called Blink.
| jart wrote:
| Blink is short for Blinkenlights. See
| https://github.com/jart/blink#blinkenlights and
| https://justine.lol/blinkenlights/
| saagarjha wrote:
| Pretty sure there are several SSH clients that are several
| times more popular than Blink :)
| thriftwy wrote:
| I was thinking they compiled Blink to Javascript and are
| rendering web pages with it.
| jart wrote:
| We just managed to compile Blink to a 300kb javascript file
| today. Follow https://github.com/jart/blink/issues/8 for
| updates on our progress.
| thriftwy wrote:
| But I was thinking of HTML renderer...
| chazeon wrote:
| There is also an iOS/iPadOS SSH Client called Blink [1], short
| for Blink Shell, which I use almost daily.
|
| [1]: https://blink.sh/
| [deleted]
| duxup wrote:
| Video conferencing is mentioned.
|
| Anyone know of any other use cases they have in mind?
|
| I always hear tech spec rumors but never about anything I would
| want to do with this type of thing... outside say gaming?
| bowmessage wrote:
| I think you're looking for the Apple headset thread:
| https://news.ycombinator.com/item?id=34250929
___________________________________________________________________
(page generated 2023-01-04 23:00 UTC)