[HN Gopher] Dynamic translation of Smalltalk to WebAssembly
       ___________________________________________________________________
        
       Dynamic translation of Smalltalk to WebAssembly
        
       Author : lioeters
       Score  : 141 points
       Date   : 2024-07-09 10:34 UTC (1 days ago)
        
 (HTM) web link (thiscontext.com)
 (TXT) w3m dump (thiscontext.com)
        
       | Retr0id wrote:
       | > I don't expect the WASM translations to be much (or any) faster
       | at the moment, but I do expect them to get faster over time, as
       | the WASM engines in web browsers improve (just as JS engines
       | have).
       | 
       | I haven't been following WASM progress, do people generally share
       | this optimism for WASM performance improvement over time,
       | relative to JS? I was under the vague impression that all the
       | "obvious" optimizations had already been done (i.e. JIT, and more
       | recently, SIMD support)
        
         | afavour wrote:
         | Yeah, I'd be surprised. I'm sure there are improvements to be
         | made out there (e.g. maybe the sandbox could be faster) but the
         | crazy leaps we've seen in JS performance are primarily because
         | the complexity in the language makes for a very complex
         | implementation with plenty of inefficiencies. By comparison
         | WASM is already pretty close to the metal.
        
         | dgb23 wrote:
         | There seems to be a lot of work going into WASM itself, much
         | of it performance-related:
         | 
         | https://github.com/WebAssembly/proposals
        
           | mjhay wrote:
           | If only you could directly access the DOM or other browser
           | APIs. I understand the GC proposal might help with this (?).
        
             | dgb23 wrote:
             | I found a discussion with some explanations to why that is
             | and what other solutions could be looked at:
             | 
             | https://github.com/WebAssembly/design/issues/1184
        
               | mjhay wrote:
               | Thank you. That helps with my confusion a lot.
        
               | andsoitis wrote:
               | _"The main reason is that direct access to the DOM
               | requires the ability to pass references to DOM/JS
               | objects through Wasm. Consequently, when GC happens for
               | JavaScript, the collector must be able to find and update
               | such references on the live Wasm stack, in Wasm globals,
               | or in other places controlled by the Wasm engine. Hence
               | the engine effectively needs to support GC in some form.
               | 
               | However, the new proposal for reference types that we
               | split off from the GC proposal tries to give a more
               | nuanced answer to that. It introduces reference types
               | without any functionality for allocating anything within
               | Wasm itself. In an embedding where host references are
               | garbage-collected that still requires a Wasm
               | implementation to understand GC. But in other embeddings
               | it does not need to."_
        
             | mason_mpls wrote:
             | As someone who wants WASM to free us all from JS: is this
             | even worth it if DOM bindings for WASM will never get
             | released? I've been following this since 2018 and it seems
             | like we're still at square 1 with DOM bindings. Incredibly
             | frustrating.
        
         | tracker1 wrote:
         | I think it depends on what you are doing... I think there are
         | huge opportunities for interop performance as well as just
         | optimized process paths for certain things. I'm not that deep
         | on the technical side, more of an avid observer. It just seems
         | that as long as many things in WASM are slower than the
         | equivalent in, say, Java or C#, there is definitely room to
         | improve things.
         | 
         | As an example, the built-in garbage collection support that's
         | being fleshed out will improve languages that rely on GC (C#,
         | Java, Go, etc.) without each having to ship its own GC
         | implementation in its runtime.
         | 
         | Another area with potential for massive gains is browser UI
         | interop. I think there's been a lot of effort to work
         | within/around current limitations, but obviously there's room
         | to improve.
        
         | Oreb wrote:
         | > I was under the vague impression that all the "obvious"
         | optimizations had already been done (i.e. JIT, and more
         | recently, SIMD support)
         | 
         | Aren't threads still missing? That seems like a pretty major
         | optimization, now that almost any CPU you can buy has multiple
         | cores.
        
           | aseipp wrote:
           | Yes, but they're very close to being standardized and have
           | multiple existing implementations among different browser
           | engines and also non-browser runtimes. (Of course, I don't
           | think the threading proposal is exactly what OP had in mind
           | for the case of general WASM perf improvements, but you are
           | right it is in practice a big performance barrier in the
           | bigger scheme.)
        
             | garaetjjte wrote:
             | Shared memory is standardized, you can run WASM threads in
             | the browser through web workers. WASI threads
             | standardization is in limbo because apparently component
             | model is very important and nobody knows how threads will
             | interact with that.
        
               | aseipp wrote:
               | Yes, I should have been more explicit: the raw WASM
               | proposal only really defines how shared memory and cross-
               | thread atomics work where it's assumed each thread runs a
               | wasm module (with shared memory regions that are
               | appropriately mapped.) It does not specify how the host
               | environment actually spawns or manages threads or what
               | hostcalls are available for that.
               | 
               | That said I think in the browser something like
               | emscripten can do something akin to "use web workers with
               | shared array buffers" to back it all up so that threading
               | APIs roughly work, but yes, WASI currently has nothing
               | for this.
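
A minimal sketch of the primitives this sub-thread is describing: a `SharedArrayBuffer` plus `Atomics` operations, which is what a worker-based threading layer would build on. To stay self-contained it runs in a single thread, with two typed-array views standing in for two workers that had each received the same buffer. The `bump` helper and the variable names are hypothetical.

```typescript
// One 32-bit slot of shared memory. In a real setup this buffer would be
// sent to each web worker with postMessage; every worker then sees the
// same bytes rather than a copy.
const sharedBuffer = new SharedArrayBuffer(4);

// Hypothetical stand-in for the work one thread would do: increment the
// shared slot atomically so concurrent increments are never lost.
function bump(view: Int32Array, times: number): void {
  for (let i = 0; i < times; i++) {
    Atomics.add(view, 0, 1);
  }
}

// Two independent views over the same buffer, as two threads would hold.
const viewA = new Int32Array(sharedBuffer);
const viewB = new Int32Array(sharedBuffer);

bump(viewA, 1000);
bump(viewB, 1000);

// Both views wrote to the same memory, so the counter reflects all updates.
const sharedTotal = Atomics.load(viewA, 0);
console.log(sharedTotal);
```

How threads are spawned and scheduled is left to the host, which is the gap the comments above point out for WASI.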
        
         | connicpu wrote:
         | From my understanding, most of the WASM performance improvement
         | expectations have been around the cost of calling browser APIs
         | from within WASM. The performance itself is basically near
         | native speed with a small overhead for the cost of bounds
         | checking/wrapping the memory block. Last I checked, most WASM
         | apps are calling DOM APIs through a javascript middleman, which
         | obviously sucks for performance. But native importing of DOM
         | APIs is something that I believe was being worked on and could
         | be here soon?
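
A rough sketch of what the "JavaScript middleman" looks like at the API level, assuming a tiny hand-assembled module. The module imports one function and exports one function, so every call out of WASM lands in a JavaScript closure first. The names `env`, `log`, and `run` are made up for this example, and a real application would forward to a DOM API inside the imported function instead of recording the value.

```typescript
// Hand-assembled module, equivalent to:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") (call $log (i32.const 42))))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic, version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,       // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, // import "env" "log"
  0x00, 0x00,                                                       //   as func of type 0
  0x03, 0x02, 0x01, 0x01,                                           // one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,             // export "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,       // body: i32.const 42; call 0
]);

// The middleman: WASM cannot touch host objects here, it can only call
// functions that JavaScript hands over in the import object.
const seenValues: number[] = [];
const importObject = {
  env: {
    log: (value: number): void => {
      seenValues.push(value);
    },
  },
};

const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, importObject);

// Calling the export crosses JS -> WASM, and the import call crosses back.
(wasmInstance.exports.run as () => void)();
console.log(seenValues);
```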
        
       | stevedekorte wrote:
       | Great to see work like this being done. Javascript is often a
       | "good enough" language, but an efficient Smalltalk (or Self)
       | language with support for things like system images, become:,
       | coroutines, and other advanced features would open up a lot of
       | advanced programming techniques like fast portable images,
       | transparent futures, cooperative concurrency without async/await
       | on every call, etc.
        
         | davexunit wrote:
         | On a similar note, you can now run Scheme in the browser via
         | wasm: https://spritely.institute/hoot/
         | 
         | The current release can do coroutines via the delimited
         | continuation support, but the next release will have ready-to-
         | use lightweight threads (aka "fibers") that integrate with JS
         | promises. No async/await marked functions/calls.
        
           | OnlyMortal wrote:
           | As an Obj-C guy, if I declare a method as input only and no
           | return, it can run async in the run loop.
           | 
           | I _assume_ Smalltalk can do the same?
        
           | epolanski wrote:
           | Effect-ts is also built on the same principles (fibers) if
           | one wants to stay in TS land.
           | 
           | https://effect.website/docs/guides/runtime
        
       | DonHopkins wrote:
       | Craig Latta's Caffeine work live coding with Smalltalk and
       | SqueakJS is amazing.
       | 
       | https://observablehq.com/@ccrraaiigg/caffeine
       | 
       | >Caffeine integrates SqueakJS, a JavaScript implementation of the
       | Squeak Smalltalk virtual machine, with several JavaScript runtime
       | environments, including web frontends (web browsers, with DOM,
       | DevTools, and Observable integration), backends (Node.js), and Web
       | Workers.
       | 
       | https://github.com/ccrraaiigg
       | 
       | Craig Latta - Caffeine - 26 May 2021:
       | 
       | https://vimeo.com/591827638
       | 
       | >Caffeine ( caffeine.js.org ) is a livecoded integration of the
       | SqueakJS Smalltalk virtual machine with the Web platform and its
       | many frameworks. Craig Latta will show the current state of
       | Caffeine development through live manipulation and combination of
       | those frameworks. The primary vehicle is a Caffeine app called
       | Worldly, combining the A-Frame VR framework, screen-sharing, and
       | the Chrome Debugging Protocol into an immersive virtual-reality
       | workspace.
       | 
       | >Craig Latta ( blackpagedigital.com ) is a livecoding composer
       | from California. He studied music at Berkeley, where he learned
       | Smalltalk as an improvisation practice. He has worked as a
       | research computer scientist at Atari Games, IBM's Watson lab, and
       | Lam Research. In 2016 he began combining Smalltalk technologies
       | with the Web platform, with an emphasis on spatial computing. He
       | is currently exploring spatial audio for immersive workspaces.
       | 
       | SqueakJS - A Squeak VM in JavaScript (squeak.js.org), 115 points
       | by gjvc on Oct 27, 2021, 24 comments
       | 
       | https://news.ycombinator.com/item?id=29018465
       | 
       | DonHopkins on Oct 27, 2021:
       | 
       | One thing that's amazing about SqueakJS (and one reason this VM
       | inside another VM runs so fast) is the way Vanessa Freudenberg
       | elegantly and efficiently created a hybrid Smalltalk garbage
       | collector that works with the JavaScript garbage collector.
       | 
       | SqueakJS: A Modern and Practical Smalltalk That Runs in Any
       | Browser
       | 
       | https://freudenbergs.de/vanessa/publications/Freudenberg-201...
       | 
       | >The fact that SqueakJS represents Squeak objects as plain
       | JavaScript objects and integrates with the JavaScript garbage
       | collection (GC) allows existing JavaScript code to interact with
       | Squeak objects. This has proven useful during development as we
       | could re-use existing JavaScript tools to inspect and manipulate
       | Squeak objects as they appear in the VM. This means that SqueakJS
       | is not only a "Squeak in the browser", but also that it provides
       | practical support for using Smalltalk in a JavaScript
       | environment.
       | 
       | >[...] a hybrid garbage collection scheme to allow Squeak object
       | enumeration without a dedicated object table, while delegating as
       | much work as possible to the JavaScript GC, [...]
       | 
       | >2.3 Cleaning up Garbage
       | 
       | >Many core functions in Squeak depend on the ability to enumerate
       | objects of a specific class using the firstInstance and
       | nextInstance primitive methods. In Squeak, this is easily
       | implemented since all objects are contiguous in memory, so one
       | can simply scan from the beginning and return the next available
       | instance. This is not possible in a hosted implementation where
       | the host does not provide enumeration, as is the case for Java
       | and JavaScript. Potato used a weak-key object table to keep track
       | of objects to enumerate them. Other implementations, like the
       | R/SqueakVM, use the host garbage collector to trigger a full GC
       | and yield all objects of a certain type. These are then
       | temporarily kept in a list for enumeration. In JavaScript,
       | neither weak references, nor access to the GC is generally
       | available, so neither option was possible for SqueakJS. Instead,
       | we designed a hybrid GC scheme that provides enumeration while
       | not requiring weak pointer support, and still retaining the
       | benefit of the native host GC.
       | 
       | >SqueakJS manages objects in an old and new space, akin to a
       | semi-space GC. When an image is loaded, all objects are created
       | in the old space. Because an image is just a snapshot of the
       | object memory when it was saved, all objects are consecutive in
       | the image. When we convert them into JavaScript objects, we
       | create a linked list of all objects. This means, that as long as
       | an object is in the SqueakJS old-space, it cannot be garbage
       | collected by the JavaScript VM. New objects are created in a
       | virtual new space. However, this space does not really exist for
       | the SqueakJS VM, because it simply consists of Squeak objects
       | that are not part of the old-space linked list. New objects that
       | are dereferenced are simply collected by the JavaScript GC.
       | 
       | >When full GC is triggered in SqueakJS (for example because the
       | nextInstance primitive has been called on an object that does not
       | have a next link) a two-phase collection is started. In the first
       | pass, any new objects that are referenced from surviving objects
       | are added to the end of the linked list, and thus become part of
       | the old space. In a second pass, any objects that are already in
       | the linked list, but were not referenced from surviving objects
       | are removed from the list, and thus become eligible for ordinary
       | JavaScript GC. Note also, that we append objects to the old list
       | in the order of their creation, simply by ordering them by their
       | object identifiers (IDs). In Squeak, these are the memory offsets
       | of the object. To be able to save images that can again be opened
       | with the standard Squeak VM, we generate object IDs that
       | correspond to the offset the object would have in an image. This
       | way, we can serialize our old object space and thus save binary
       | compatible Squeak images from SqueakJS.
       | 
       | >To implement Squeak's weak references, a similar scheme can be
       | employed: any weak container is simply added to a special list of
       | root objects that do not let their references survive. If, during
       | a full GC, a Squeak object is found to be only referenced from
       | one of those weak roots, that reference is removed, and the
       | Squeak object is again garbage collected by the JavaScript GC.
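
The two-phase scheme quoted above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not SqueakJS source: all names are hypothetical, a single root object stands in for the VM's real roots, and a plain counter stands in for the image-offset object IDs. Old space is a linked list that keeps its members alive, while new objects are held only by ordinary references and are left to the host GC.

```typescript
interface Obj {
  id: number;             // creation order, standing in for the image offset
  refs: Obj[];            // outgoing references to other objects
  nextObject: Obj | null; // old-space link; null for new objects and the tail
  inOld: boolean;
  marked: boolean;
}

class ObjectMemory {
  private nextId = 1;
  readonly root: Obj; // head of the old-space list and the only GC root
  private last: Obj;

  constructor() {
    this.root = this.create();
    this.root.inOld = true;
    this.last = this.root;
  }

  // New objects are not linked into the list, so only host-GC
  // reachability keeps them alive.
  create(): Obj {
    return { id: this.nextId++, refs: [], nextObject: null, inOld: false, marked: false };
  }

  fullGC(): void {
    // Mark everything reachable from the root, noting new-space survivors.
    const survivors: Obj[] = [];
    const stack: Obj[] = [this.root];
    this.root.marked = true;
    while (stack.length > 0) {
      const obj = stack.pop()!;
      if (!obj.inOld) survivors.push(obj);
      for (const ref of obj.refs) {
        if (!ref.marked) { ref.marked = true; stack.push(ref); }
      }
    }
    // Pass 1: append surviving new objects to old space in creation order.
    survivors.sort((a, b) => a.id - b.id);
    for (const obj of survivors) {
      obj.inOld = true;
      this.last.nextObject = obj;
      this.last = obj;
    }
    // Pass 2: unlink unmarked old objects so the host GC can reclaim them.
    let prev = this.root;
    prev.marked = false;
    let cur = prev.nextObject;
    while (cur !== null) {
      const next: Obj | null = cur.nextObject;
      if (cur.marked) {
        cur.marked = false;
        prev = cur;
      } else {
        prev.nextObject = next;
        cur.nextObject = null;
        cur.inOld = false;
      }
      cur = next;
    }
    this.last = prev;
  }

  // Enumeration walks the list, which is what nextInstance-style primitives need.
  oldSpaceIds(): number[] {
    const ids: number[] = [];
    for (let o: Obj | null = this.root; o !== null; o = o.nextObject) ids.push(o.id);
    return ids;
  }
}

const memory = new ObjectMemory();
const objA = memory.create(); // id 2
const objB = memory.create(); // id 3, never referenced
const objC = memory.create(); // id 4
memory.root.refs.push(objA);
objA.refs.push(objC);

memory.fullGC();
const idsAfterFirstGC = memory.oldSpaceIds();

objA.refs.length = 0; // drop the only reference to objC
memory.fullGC();
const idsAfterSecondGC = memory.oldSpaceIds();
console.log(idsAfterFirstGC, idsAfterSecondGC, objB.inOld);
```

Note that the sketch never frees anything itself: unlinking an object from the list is enough, because the host collector reclaims whatever is no longer referenced.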
       | 
       | DonHopkins on Oct 27, 2021:
       | 
       | Also: The Evolution of Smalltalk: From Smalltalk-72 through
       | Squeak. DANIEL INGALLS, Independent Consultant, USA
       | 
       | https://smalltalkzoo.thechm.org/papers/EvolutionOfSmalltalk....
       | 
       | >A.5 Squeak
       | 
       | >Although Squeak is still available for most computers, SqueakJS
       | has become the easiest way to run Squeak for most users. It runs
       | in just about any web browser, which helps in schools that do not
       | allow the installation of non-standard software.
       | 
       | >The germ of the SqueakJS project began not long after I was
       | hired at Sun Microsystems. I felt I should learn Java; casting
       | about for a suitable project, I naturally chose to implement a
       | Squeak VM. This I did; the result still appears to run at
       | http://weather-dimensions.com/Dan/SqueakOnJava.jar .
       | 
       | >This VM is known in the Squeak community as "Potato" because of
       | some difficulty clearing names with the trademark people at Sun.
       | Much later, when I got the Smalltalk-72 interpreter running in
       | JavaScript, Bert and I were both surprised at how fast it ran.
       | Bert said, "Hmm, I wonder if it's time to consider trying to run
       | Squeak in JavaScript." I responded with "Hey, JavaScript is
       | pretty similar to Java; you could just start with my Potato code
       | and have something running in no time."
       | 
       | >"No time" turned into a bit more than a week, but the result was
       | enough to get Bert excited. The main weakness in Potato had been
       | the memory model, and Bert came up with a beautiful scheme to
       | leverage the native JavaScript storage management while providing
       | the kind of control that was needed in the Squeak VM. Anyone
       | interested in hosting a managed-memory language system in
       | JavaScript should read his paper on SqueakJS, presented at the
       | Dynamic Languages Symposium [Freudenberg et al. 2014].
       | 
       | >From there on Bert has continued to put more attention on
       | performance and reliability, and SqueakJS now boasts the ability
       | to run every Squeak image since the first release in 1996. To run
       | the system live, visit this url:
       | https://smalltalkzoo.thechm.org/HOPL-Squeak.html?launch
       | 
       | codefrau on Nov 5, 2021:
       | 
       | Dan published an updated version of that paper here:
       | 
       | https://smalltalkzoo.thechm.org/papers/EvolutionOfSmalltalk....
       | 
       | Would be great if you could cite that one next time. The main
       | improvement for me is not being deadnamed. There are other
       | corrections as well.
        
         | DonHopkins wrote:
         | Here's some stuff Vanessa and I discussed about Self and her
         | SqueakJS paper:
         | 
         | DonHopkins 6 months ago, on: Croquet: Live, network-transparent
         | 3D gaming
         | 
         | Excellent article -- Liam Proven does it again! Speaking of a
         | big Plate of Shrimp --
         | https://www.youtube.com/watch?v=rJE2gPQ_Yp8 ...
         | 
         | The incredible Smalltalk developer Vanessa Freudenberg -- who
         | besides being Croquet's devops person, also developed Squeak
         | Smalltalk, EToys, Croquet, and the SqueakJS VM written in
         | JavaScript, and worked extensively with Alan Kay -- was just
         | tweeting (yeah, it's ok to deadname Twitter!) about reviving
         | Croquet from 20 years ago:
         | 
         | https://twitter.com/codefrau/status/1738778761104068754
         | 
         | Vanessa Freudenberg @codefrau
         | 
         | I've been having fun reviving the Croquet from 20 years ago
         | using @SqueakJS . It's not perfect yet, but a lot of the old
         | demos work (sans collaboration, so far). This is pretty close
         | to the version Alan Kay used to give his Turing Award lecture
         | in 2004:
         | 
         | https://github.com/codefrau/jasmine
         | 
         | Live version: https://codefrau.github.io/jasmine
         | 
         | This is a version of Croquet Jasmine running on the SqueakJS
         | virtual machine. Here is an early demo of the system from 2003.
         | Alan Kay used it for his Turing Award lecture in 2004. While
         | working on that demo, David Smith posted some blog entries (1,
         | 2, 3, 4, 5), with screenshots uploaded to his Flickr album.
         | 
         | This is work-in-progress. Contributions are very welcome.
         | 
         | -- Vanessa Freudenberg, December 2023
         | 
         | Dan Ingalls @daningalls
         | 
         | Yay Vanessa! This is awesome. These are mileposts in our
         | history that now live again!
         | 
         | https://twitter.com/codefrau/status/1526618670134308864
         | 
         | Vanessa Freudenberg @codefrau 7:40 PM * May 17, 2022
         | 
         | My company @CroquetIO announced #MicroverseBuilder today.
         | 
         | Each microverse is "just" a static web page that you can deploy
         | anywhere, but it is fully 3D multiplayer, and can be live-
         | coded. Portals show and link to other developer's worlds.
         | 
         | This is our vision of the #DemocratizedMetaverse as opposed to
         | the "Megaverses" owned by Big Tech.
         | 
         | It runs on #CroquetOS inside your browser, which provides the
         | client-side real-time synchronized JS VMs that you already know
         | from my other posts.
         | 
         | #MicroverseBuilder is in closed alpha right now because we
         | don't have enough #devrel people yet (we're hiring!) but you
         | can join our Discord in the mean time and the open beta is not
         | far away.
         | 
         | We are also looking for summer interns! #internships
         | 
         | https://www.youtube.com/watch?v=CvvuAbjh11U
         | 
         | And of course #CroquetOS itself is already available for you to
         | build multiplayer apps, as is our #WorldcoreEngine, the game
         | engine underlying #MicroverseBuilder.
         | 
         | Learn more at https://croquet.io/docs/ and let's get hacking :)
         | 
         | And as of today, #MicroverseBuilder is Open Source!
         | 
         | lproven 6 months ago:
         | 
         | Thanks Don! This is my original submission from back at the
         | time:
         | 
         | https://news.ycombinator.com/item?id=35302162
         | 
         | HN really needs a better automatic-deduplication engine. E.g.
         | If the same link is posted again months later, mark the
         | original post as new again with an upvote, and the caption (if
         | changed) as a comment...
         | 
         | codefrau 6 months ago:
         | 
         | Haha, thanks for the plug, Don!
         | 
         | I just fleshed out the README for my Croquet resurrection
         | yesterday so others may have an easier time trying it. And
         | maybe even contribute :)
         | 
         | https://github.com/codefrau/jasmine
         | 
         | DonHopkins 6 months ago:
         | 
         | Vanessa, it has always amazed me how you managed to square the
         | circle and pull a rabbit out of a hat by the way you got
         | garbage collection to work efficiently in SqueakJS, making
         | Smalltalk and JavaScript cooperate without ending up with two
         | competing garbage collectors battling it out. (Since you can't
         | enumerate "pointers" with JavaScript references by just
         | incrementing them.)
         | 
         | https://freudenbergs.de/vanessa/publications/Freudenberg-201...
         | 
         | >The fact that SqueakJS represents Squeak objects as plain
         | JavaScript objects and integrates with the JavaScript garbage
         | collection (GC) allows existing JavaScript code to interact
         | with Squeak objects. [...]
         | 
         | >* a hybrid garbage collection scheme to allow Squeak object
         | enumeration without a dedicated object table, while delegating
         | as much work as possible to the JavaScript GC,
         | 
         | Have you ever thought about implementing a Smalltalk VM in
         | WebAssembly, and how you could use the new reference types for
         | that?
         | 
         | https://bytecodealliance.org/articles/reference-types-in-was...
         | 
         | codefrau 6 months ago:
         | 
         | I would like to speed up some parts of SqueakJS using web
         | assembly. For example BitBlt would be a prime target. For the
         | overall VM, however, I'll leave that to others (I know Craig
         | Latta has been making progress).
         | 
         | I just love coding and debugging in a dynamic high-level
         | language. The only thing we could potentially gain from WASM is
         | speed, but we would lose a lot in readability, flexibility, and
         | to be honest, fun.
         | 
         | I'd much rather make the SqueakJS JIT produce code that the
         | JavaScript JIT can optimize well. That would potentially give
         | us more speed than even WASM.
         | 
         | Peep my brain dumps and experiments at
         | https://squeak.js.org/docs/jit.md.html
         | 
         | DonHopkins 6 months ago:
         | 
         | >Where this scheme gets interesting is when the execution
         | progressed somewhat deep into a nested call chain and we then
         | need to deal with contexts. It could be that execution is
         | interrupted by a process switch, or that the code reads some
         | fields of thisContext, or worse, writes into a field of
         | thisContext. Other "interesting" occasions are garbage
         | collections, or when we want to snapshot the image. Let's look
         | at these in turn.
         | 
         | This sounds similar to Self's "dynamic deoptimization" that it
         | uses to forge virtual stack frames representing calls into
         | inlined code, for the purposes of the debugger showing you the
         | return stack that you would have were the functions not
         | inlined.
         | 
         | I always thought that should be called "dynamic pessimization".
         | 
         | Debugging Optimized Code with Dynamic Deoptimization. Urs
         | Hölzle, Craig Chambers, and David Ungar, SIGPLAN Notices 27(7),
         | July, 1992.
         | 
         | https://bibliography.selflanguage.org/dynamic-deoptimization...
         | 
         | That paper really blew my mind and cemented my respect for
         | Self, in how they were able to deliver on such idealistic
         | promises of simplicity and performance, and then oh by the way,
         | you can also debug it too.
         | 
         | codefrau 6 months ago:
         | 
         | Absolutely. And you know Lars Bak went from Self to Strongtalk
         | to Sun's Java Hotspot VM to Google's V8 JavaScript engine. My
         | plan is to do as little as necessary to leverage the enormous
         | engineering achievements in modern JS runtimes.
         | 
         | DonHopkins 6 months ago:
         | 
         | Glad I asked! Fun holiday reading to curl up with a cat to
         | read. Thanks!
         | 
         | I love Caffeine, and I use Craig's table every day! Not a look-
         | up table, more like a big desk, which I bought from him when he
         | left Amsterdam. ;)
         | 
         | ---
         | 
         | Vanessa> Our guiding principle will be to keep our own
         | optimizations to a minimum in order to have quick compiles, but
         | structure the generated code in a way so that the host JIT can
         | perform its own optimizations well.
         | 
         | Don> That's the beautiful thing about layering the SqueakJS VM
         | on top of the JS VM: you've already paid for it, it works
         | really well, so you might as well use it to its full extent!
         | 
         | Very different set of trade-offs than implementing Self in C++.
         | 
         | Vanessa> Precisely. My plan is to do as little as necessary to
         | leverage the enormous engineering achievements in modern JS
         | runtimes.
        
         | stevedekorte wrote:
         | "In JavaScript, neither weak references... is generally
         | available". I think that was true with the old weak collection
         | classes, but doesn't the newer JS WeakRef provide proper weak
         | references?
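
For context on the question above, here is a small sketch of an enumerable weak object table built on `WeakRef`, roughly the kind of structure the quoted paper says was not an option at the time. The class and its method names are hypothetical and are not taken from SqueakJS or Potato.

```typescript
// Hypothetical weak table: it remembers objects for enumeration without
// keeping them alive, unlike WeakMap and WeakSet, which cannot be iterated.
class WeakObjectTable<T extends object> {
  private entries: WeakRef<T>[] = [];

  add(obj: T): T {
    this.entries.push(new WeakRef(obj));
    return obj;
  }

  // Return the objects that are still alive and prune cleared entries.
  live(): T[] {
    const alive: T[] = [];
    this.entries = this.entries.filter((ref) => {
      const obj = ref.deref(); // undefined once the target has been collected
      if (obj !== undefined) alive.push(obj);
      return obj !== undefined;
    });
    return alive;
  }
}

const weakTable = new WeakObjectTable<{ label: string }>();
const keptObject = weakTable.add({ label: "kept" }); // strongly held below
weakTable.add({ label: "temporary" });               // no strong reference

// "kept" is always present; whether "temporary" still shows up depends on
// when the host GC runs, which the program cannot control.
const liveLabels = weakTable.live().map((o) => o.label);
console.log(liveLabels.includes(keptObject.label));
```

The nondeterminism in the last step is the main design cost: enumeration results depend on host GC timing, whereas the linked-list scheme in the paper controls exactly which objects are considered live.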
        
       ___________________________________________________________________
       (page generated 2024-07-10 23:01 UTC)