[HN Gopher] Write Your Own Virtual Machine (2022)
       ___________________________________________________________________
        
       Write Your Own Virtual Machine (2022)
        
       Author : sebg
       Score  : 314 points
       Date   : 2024-12-26 19:30 UTC (1 day ago)
        
 (HTM) web link (www.jmeiners.com)
 (TXT) w3m dump (www.jmeiners.com)
        
       | saithound wrote:
       | Alas, educational architectures like the Brookshear Machine and
       | Little Computer look nothing like real ones, making them worse
       | than useless: in my experience students who take courses using
       | these often end up with a more distorted understanding of
       | computers than those who take no classes at all.
       | 
       | Most people who want to learn a bit about how their machines work
       | would be better served by taking an operating systems course.
       | Same goes here: if you only have time for a short tutorial, I
       | recommend "Writing my own bootloader" instead. [1]
       | 
       | (This is not meant to say that the Write your own VM tutorial is
       | a bad tutorial; only that, in my experience, most people who'd do
       | it would be best served by a different subject)
       | 
       | [1] https://dev.to/frosnerd/writing-my-own-boot-loader-3mld
        
         | markus_zhang wrote:
         | I just did the LC-3 one and look forward to learning a bit
         | about dynamic recompilation using LC-3 as an (inappropriate)
         | target machine in the current project.
         | 
         | Can you please elaborate why it is bad to use LC-3 for studying
         | computer architecture? I do understand it is completely
         | different from real hardware, and too simplified. But from the
         | perspective of writing CPU emulators, is it bad for me?
        
           | saithound wrote:
           | I answered a similar question by anyfoo downthread, the point
           | about implementing a PIT should explain what I mean, and why
           | I think that emulating a simple real architecture might be
           | more useful while requiring similar effort.
        
             | c0wb0yc0d3r wrote:
             | Link in case the order changes.
             | 
             | https://news.ycombinator.com/item?id=42519788
        
             | anyfoo wrote:
             | Again, you are coming at this from a software engineering
             | direction, not a computer architecture/engineering one.
        
               | saithound wrote:
                | Since I have the exact same thought (that _you_ are
                | coming at this from a software engineering direction,
                | not a computer architecture/engineering one), this is
                | unlikely to be further productive :(
               | 
                | You say you want students to be able to write
                | emulators/simulators etc. (you list a bunch of
                | software engineering tasks). I say writing an LC-3
                | emulator won't give you any transferable skills for
                | writing, say, a Gameboy emulator (still a 30-year-old
                | system), because these educational architectures are
                | bad and abstract away precisely the things that you
                | have to get right to do that, leaving you with
                | something trivial to implement: a C array for
                | "memory" and a few bitwise operations.
               | 
               | Since we're largely looking at the same things and coming
               | to orthogonal conclusions, I suggest we agree to disagree
               | on this one.
        
           | anyfoo wrote:
            | When studying _computer architecture_, I think the
           | implementation details of a real architecture, even a simple
           | one, might actually slow you down if your goal is to learn
           | about dynamic recompilation. The commenter you replied to
           | seems to be coming from a software engineering perspective,
           | i.e. learning how to program a computer (which someone
           | interested in computer architecture probably knows about
           | already).
        
         | userbinator wrote:
         | My recommendation is to use an old 8-bit architecture like 6502
         | or Z80.
         | 
         | Apparently a lot of CS courses in India still use the 8086/8088
         | too.
        
           | markus_zhang wrote:
           | Back in China my first comp arch class used 8051, it was fun.
           | I remember everyone getting a development board and playing
           | with assembly.
        
             | userbinator wrote:
             | 8051-core SoCs seem to be found in tons of cheap Chinese
             | electronics. MP3 players, toys, SD card/USB drive
             | controllers, touchscreen drivers etc.
        
               | kragen wrote:
               | Yeah, the 8051 family is still pretty popular, unlike,
                | for example, the 68HC11/08/12 etc. I think it's not as
               | popular as the PIC, ARM, AVR, or even MSP430, but it
               | seems to be more popular than STM8, M16C, SH-4, MAXQ,
               | MIPS, or 8086. I'm not sure how to assess the Z80 (8080
               | variant), Z8, RL78, and RX families, all of which seem to
               | be showing disturbing signs of life. Fortunately the 8048
               | seems to be dead.
        
               | userbinator wrote:
                | _I think it's not as popular as the PIC, ARM, AVR, or
                | even MSP430_
               | 
               | Far more 8051 cores are shipped than all of those
               | combined, excluding ARM.
        
               | kragen wrote:
               | Even today? That's plausible, but I'd like to know what
               | leads you to believe it. I was mostly looking at Digi-Key
               | stock numbers, which are at best a very imperfect
               | measure.
        
             | rramadass wrote:
             | For an excellent (and free) 8051 book with lots of C code
             | see Michael Pont's _Patterns for Time-Triggered Embedded
             | Systems_ - https://www.safetty.net/publications/pttes
        
               | markus_zhang wrote:
               | Thanks, I have heard a lot of good words about this book.
        
           | whartung wrote:
           | I wrote a simple 6502 simulator. Just made it a byte code
           | interpreter.
           | 
           | No cycle accurate anything. Simple I/O. Since I/O on the 6502
           | is all memory accesses, adding console is as simple as
           | echoing whenever a character is stored in a particular memory
            | address. No reason to simulate a VIA chip or UART or
           | anything that low level.
           | 
           | Fun project. You can take it as deep as you like.
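A minimal sketch of the memory-mapped console idea described in the comment above; the I/O address and harness are invented for illustration, not a real 6502 machine map:

```python
# A sketch of memory-mapped console output on a 6502-style machine.
# The I/O address is invented for illustration, not a real machine map.
CONSOLE_OUT = 0xF001           # hypothetical console output address

memory = bytearray(64 * 1024)  # flat 64 KiB address space
output = []

def store(addr, value):
    """Stores hit RAM, but a store to CONSOLE_OUT also 'echoes' the
    character -- no VIA or UART emulation needed."""
    memory[addr] = value & 0xFF
    if addr == CONSOLE_OUT:
        output.append(chr(value & 0x7F))

for ch in b"HI":
    store(CONSOLE_OUT, ch)
```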
        
         | PaulHoule wrote:
          | I think of Knuth's old MIX, a decimal machine that might
          | have gotten built in the 1960s but that nobody has built
          | since the 1970s. A system like that can teach you many
         | fundamentals but not the tricks here
         | 
         | https://en.wikipedia.org/wiki/Hacker%27s_Delight
         | 
         | which mainly depend on conventional numeric representations.
        
           | kragen wrote:
           | MIX wasn't decimal, but rather abstract over the choice of
           | base, a decision Knuth changed in later volumes of the
           | series, even before switching to MMIX.
           | 
           | There _have_ been decimal and hybrid computers built since
           | the 01970s (especially for pocket calculators, but new
            | decimal instructions were still being added to Intel's
           | lineup in the 8086 with AAM and AAD, and I think decimal
           | continued to be a consideration for the AS/400 into the 90s),
           | but MIX is really kind of an 01950s design, with:
           | 
           | - word-addressed memory
           | 
           | - five bytes per word (plus a sign bit)
           | 
           | - one word per instruction
           | 
           | - six bits per byte (in binary realizations)
           | 
           | - possible decimal
           | 
           | - a single accumulator
           | 
           | - sign-magnitude
           | 
           | - no stack, and
           | 
           | - a nonrecursive subroutine call convention relying on self-
           | modifying code.
           | 
           | You can find one or another of these characteristics in
           | machines designed since 01960, but the combination looked
           | old-fashioned already when it was introduced.
        
         | anyfoo wrote:
         | Can you elaborate what you don't like about this one, LC-3, in
         | particular? I'm not familiar with it, but just had a look at it
          | on Wikipedia. After your comment, I was expecting something
         | weird, but upon a quick glance, it doesn't seem too jarring. A
         | bit like a mixture of s/360, some x86, and a tiny bit of ARM
         | (or other RISCy architectures). With lots of omissions and some
         | weirdness of course, but the goal seems to be to quickly come
         | to a working implementation. I'm curious what exactly you think
         | makes it "worse than useless" for teaching.
        
           | saithound wrote:
           | I'm not talking about the instruction set, or teaching basic
           | assembly (probably anything except Malbolge is suitable for
           | that).
           | 
           | Let's look at just one thing every programmer has to deal
           | with, memory.
           | 
           | On an LC-3, the address space is exactly 64KiB. There is no
           | concept of missing memory, all addresses are assumed to
           | exist, no memory detection is needed or possible, and memory
           | mapped IO uses fixed addresses.
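The fixed-address I/O described above can be sketched as follows; KBSR and KBDR are the LC-3's actual keyboard status/data register addresses (as used in the linked tutorial), while the one-pending-key harness is a simplification for illustration:

```python
# Sketch of the LC-3's fixed memory map: all 2**16 addresses "exist",
# and keyboard I/O is just two well-known locations. KBSR/KBDR are the
# real LC-3 addresses; the one-key input model is a simplification.
KBSR, KBDR = 0xFE00, 0xFE02    # keyboard status / data registers

memory = [0] * (1 << 16)
pending_key = ord("A")         # pretend one key press is waiting

def mem_read(addr):
    global pending_key
    if addr == KBSR:           # status: high bit set when a key waits
        return 0x8000 if pending_key is not None else 0
    if addr == KBDR:           # data: reading consumes the key
        key, pending_key = pending_key, None
        return key if key is not None else 0
    return memory[addr]        # everything else is plain RAM
```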
           | 
           | There are no memory management capabilities on the LC-3, no
           | MMU, no paging, no segmentation. In turn there are no memory-
           | related exceptions, page faults or protection faults.
           | 
           | When an x86 machine boots with 1MB of RAM, the 4GB address
           | space still exists in full, but accessing certain addresses
            | will cause bus timeouts or crashes. One must track and
            | manage available memory. There's a BIOS, and manually
            | probing memory locations may trash its critical
            | structures. There's INT 0x15.
           | 
           | I picked memory arbitrarily but you run into the same
            | limitations no matter what you pick. Would a student who
            | was educated on LC-3 know how a computer keeps time? Of
            | course not, there's no PIT, there's no CMOS clock. Would
            | they have thought about caches? Nope.
           | 
           | Oh, but wouldn't a student who implements a timer emulation
           | extension for LC-3 learn more about timers than somebody who
           | just learned to use an x86 PIT? Alas, no. There are 20
           | equally easy and reasonable mathematical ways to implement a
           | timer abstraction. A good 15 of these are physically
            | impossible on real hardware; of the remaining 5, two would
            | be prohibitively expensive due to electrical engineering
           | reasons, one has never been implemented in real hardware due
           | to historical accidents, and two are designs that are
           | actually in use. So to write timer emulation that teaches you
           | anything at all about how actual timers work, you'll have to
           | look at and understand a real architecture anyway.
           | 
            | That's why educational architectures are so counterproductive.
           | They abstract away exactly the things that make modern
           | computers modern computers. One comes away with fundamentally
           | wrong ideas about what computers do and how they actually
           | work, or could work.
           | 
           | It's like learning to drive in GTA: in principle, there could
           | be plenty of skills that transfer to the real thing, but in
           | practice you'll prefer to teach how to drive to the person
           | who didn't play GTA at all.
        
             | upghost wrote:
             | Interesting. How would you advocate actually gaining that
             | knowledge then?
             | 
             | It seems like a student would need to know a significant
             | amount of coding in order to learn those abstractions in an
             | interactive manner.
             | 
             | And by learn them I mean _learn_ them (not just following a
             | tutorial), organizing the code for a fully working x86
             | architecture is no joke.
             | 
             | But a student with that level of skill probably doesn't
             | _need_ to learn the x86 architecture so intensively, they
             | are probably already employable.
             | 
             | I am asking this seriously, by the way, not trying to
             | nitpick. I'm trying to put together a free course based on
             | the video game Turing Complete[1] but from what you're
             | saying it sounds like it might not be very effective. (to
             | be clear the goal is to teach programming, not Computer
             | Engineering)
             | 
             | [1]:
             | https://store.steampowered.com/app/1444480/Turing_Complete/
        
               | saithound wrote:
               | Very good question.
               | 
               | My working assumption throughout was that the people in a
               | computer architecture class already had 1 or 2 semesters
               | of other programming courses where they worked in a high-
               | level language, and are looking to learn how computers
               | work "closer to the hardware". And educational
               | architectures create a completely false impression in
               | this domain.
               | 
               | If I had to teach assembly programming to people who
               | never programmed before, I'd _definitely_ not want to
               | start with x86 assembly. I'd start by teaching them
               | JavaScript so that they can program the computers that
               | they themselves, and other people, actually use. At that
                | point they'd be ready to learn computer architecture
               | through an x86 deep dive, but would no longer need to
               | learn it, since, as you said, they'd probably already be
               | employable. But the same goes for learning LC-3, and much
               | more so.
               | 
               | To be honest, my opinion is only that educational
               | architectures are a poor way to learn what modern
               | computers actually do, and while I think I have good
               | reasons for holding that particular opinion, I don't have
               | the breadth of experience to generalize this specific
               | observation into an overarching theory about teaching
               | programming and/or compsci. I hope your course will be a
               | useful resource for many people, but I doubt listening to
               | me will make it better: my experience does not generalize
               | to the domain you're targeting.
        
               | TuringTest wrote:
                | The thing is that you are conflating CPU architectures
                | with computer architectures; in academia they are
                | treated as two different educational topics, for good
                | reason.
               | The first one covers the lowest level of how logic gates
               | are physically wired to produce a Turing-equivalent
               | computing machine. For that purpose, having the simplest
               | possible instruction set is a pedagogical must. It may
               | also cover more advanced topics like parallel and/or
               | segmented instruction pipelines, but they're described in
               | the abstract, not as current state-of-the-art industry
               | practice.
               | 
               | Then, for actually learning how modern computers work you
               | have another separate one-term course for whole machine
               | architecture. There you learn about data and control
               | buses, memory level abstractions, caching, networking,
               | parallel processing... taking for granted a previous
               | understanding of how the underlying electronics can be
               | abstracted away.
        
               | upghost wrote:
               | I appreciate the candid response. I have noticed there is
               | a class of very intelligent, well-educated adult learners
               | who have nevertheless been unexposed to software
               | education until adulthood who are now looking for a
               | career change. I've found that there is a lot of
               | difficulty initially with combining abstractions, i.e.,
               | "a variable holds a value, a function is also a value, a
               | function is also a thing that sometimes takes values and
               | sometimes returns values, therefore a variable can hold a
               | function, a function can be passed as an argument to a
               | function, and a function can return a function".
               | 
               | Reasonable adults might have reasonable questions about
               | those facts, such as, "what does any of that have to do
               | with a computer?"
               | 
               | To my embarrassment, I realized they were completely
               | right and my early exposure to software made me overlook
               | some extremely important context.
               | 
               | So for these adults, the expectation of struggling
               | through a few semesters/years of javascript is not an
               | optimal learning route.
               | 
               | My hope was that working from the logic gate level up
               | would at least provide the intuition about the
               | relationship between computers (Turing Machines, really,
               | not modern computers) and software.
               | 
               | However, I think based on your excellent critique I will
               | be sure to include a unit on how "educational
               | architectures are very different from modern
               | architectures and I may have ruined your brain by
               | teaching you this" haha.
        
               | tonyarkles wrote:
               | Many years ago I taught CMPT215 - Intro to Computer
               | Architecture. This was a second year course and taught
               | MIPS assembler. There's a chain of follow on courses
               | culminating in a 400-level course where you implement a
               | basic OS from scratch on x86 (when I took it) or ARM
               | (present day, I believe).
               | 
               | MIPS was, I think, a decent compromise between academic
               | architecture and industrial architecture. There was a
               | decent amount of practical stuff the students had to deal
               | with without having to eg do the dance from x86 real mode
               | to protected mode.
               | 
               | One of the most memorable lectures though was on the
               | second last day. As it turned out I had gotten through
                | all of my material one day early. We had a review/Q&A
               | session scheduled for the last day but needed 1h30 of
               | filler. I ended up putting together a lecture on JVM
               | bytecode. Throughout the course I would often use C <->
                | MIPS ASM in examples, because 214, which was a
               | prerequisite, was a C course. All of their first year
               | work had been in Java. Anyway, we took real Java programs
               | and disassembled them down to bytecode and walked through
               | it, showing how it's very similar to hardware assembly
               | but with extra instructions for eg placing objects on the
               | stack or calling methods on objects. The class ended up
               | going almost an hour over because everyone was oddly
               | engaged and had a ton of questions.
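A sketch of the stack-machine flavour that lecture contrasted with MIPS: the JVM bytecode for `return a + b` really is iload_1, iload_2, iadd, ireturn; the tiny operand-stack interpreter below is invented for illustration:

```python
# JVM-style bytecode for `return a + b`. The mnemonics are real JVM
# ones; this toy operand-stack interpreter is invented for illustration.
def interp(code, local_vars):
    stack = []
    for op in code:
        if op.startswith("iload_"):        # push a local variable
            stack.append(local_vars[int(op[-1])])
        elif op == "iadd":                 # pop two ints, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "ireturn":              # pop and return the result
            return stack.pop()

# local 0 would be `this` in an instance method; locals 1 and 2 are a, b
result = interp(["iload_1", "iload_2", "iadd", "ireturn"], [None, 20, 22])
```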
        
             | anyfoo wrote:
             | There seems to be a misunderstanding what fields those
             | learning architectures are geared towards.
             | 
             | They are usually not for learning about application or even
             | systems programming, in assembly and otherwise. They are
             | about CPU architectures (and some surrounding concepts)
             | themselves. The goal is to be able to quickly build your
             | own emulator/simulator (like the VM behind the link), and
              | maybe an assembler. _Or_, coming from the other side,
             | given a working emulator/simulator, implement a high level
             | language compiler, such as for a simplified C language.
             | 
             | In both those goals the CPU architecture is geared towards
             | quickly converging towards a working system. So of course
             | they don't require you to implement (as someone learning
             | architecture engineering) or even use (as someone learning
             | to build a compiler) an entire MMU, much like your first
             | project car likely wouldn't contain a computerized double
             | clutch transmission. Instead, they are simplified CPU
             | architectures that allow you to make steady progress,
             | without being slowed down by the tedious complexity that a
             | real architecture brings for reasons that are not relevant
             | yet when starting out.
             | 
             | Then, once that is achieved, you can freely add things like
             | an MMU, a cache controller, maybe shift towards a pipelined
             | or even superscalar architectures...
             | 
             | Besides, a very large class of computers don't have many of
             | the things that you mentioned. For one, even very advanced
             | microcontrollers explicitly don't even have an MMU, because
             | that would destroy latency guarantees (very bad for
              | automotive controllers). For the rest, I've got to say that
             | there is a certain irony in complaining that computer
             | architecture students don't know about "modern computers",
             | while in the same breath mentioning things like INT 15h,
             | segmentation, and x86 PITs, as if we were in the 1990s.
        
             | 1000100_1000101 wrote:
             | > On an LC-3, the address space is exactly 64KiB. There is
             | no concept of missing memory, all addresses are assumed to
             | exist, no memory detection is needed or possible, and
             | memory mapped IO uses fixed addresses.
             | 
             | > There are no memory management capabilities on the LC-3,
             | no MMU, no paging, no segmentation. In turn there are no
             | memory-related exceptions, page faults or protection
             | faults.
             | 
             | Sounds an awful lot like a Commodore 64, where I got my
             | start. There's plenty to learn before needing to worry
             | about paging, protection, virtualization, device discovery,
             | bus faults, etc.
             | 
              | It sounds like it's not teaching the wrong things, as in
              | your GTA driving example, but teaching a valid subset,
              | just not the subset you'd prefer.
        
         | hayley-patton wrote:
         | The LC-3 has pretty odd addressing modes - in particular, you
         | can do a doubly indirect load through a PC-relative word in the
         | middle. But you still have to generate subtraction from
         | negation, and negation from NOT and ADD ,,#-1. (I suppose NOT
         | d,s = XOR d,s,#-1 would be a better use of the limited
         | instruction encoding space too.)
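The synthesized subtraction mentioned above (negate via NOT then ADD #1, then add) can be sketched in 16-bit two's-complement arithmetic:

```python
# LC-3-style subtraction synthesized from NOT and ADD, in 16-bit
# two's complement. The instruction comments are illustrative.
MASK = 0xFFFF

def lc3_sub(a, b):
    not_b = ~b & MASK            # NOT r, b
    neg_b = (not_b + 1) & MASK   # ADD r, r, #1   (now r = -b)
    return (a + neg_b) & MASK    # ADD d, a, r    (d = a - b)
```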
        
       | UniverseHacker wrote:
       | As a teenager I took an intro CS class at a community college,
       | and the instructor had us design a simple cpu instruction set,
       | and write our own VM and assembler that worked and let me write
       | and run assembly programs. It was shockingly easy, and it was
        | amazing how much it demystified computers for me.
       | 
       | I feel like one could learn every level of computing this way-
       | from designing a real cpu for a FPGA, to writing a simple OS and
       | programs that run on it. This stuff is all shockingly simple if
       | you just want it to work and don't need all of the extra
       | performance and security modern computing needs.
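A toy of the kind described above can indeed be shockingly small. A sketch with a made-up three-instruction ISA, programs "assembled" as tuples, and a four-register VM loop (everything here is invented for illustration):

```python
# A made-up three-instruction ISA (LOADI, ADD, HALT) with programs
# "assembled" as tuples and a four-register VM loop.
def run(program):
    regs = [0] * 4
    pc = 0
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "LOADI":        # LOADI rd, imm
            regs[args[0]] = args[1]
        elif op == "ADD":        # ADD rd, ra, rb
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "HALT":
            return regs

regs = run([("LOADI", 0, 2), ("LOADI", 1, 3), ("ADD", 2, 0, 1), ("HALT",)])
```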
        
         | tralarpa wrote:
         | That's what the nand2tetris course does, I think (I only looked
         | at the first lessons)
        
         | spit2wind wrote:
         | That sounds like a fun class! It sounds very similar to
         | https://www.nand2tetris.org/ or Charles Petzold's book "Code".
        
           | NooneAtAll3 wrote:
           | or https://nandgame.com/
        
             | dimava wrote:
             | Or Turing Complete [0] the game
             | 
             | [0]
             | https://store.steampowered.com/app/1444480/Turing_Complete/
        
         | markus_zhang wrote:
         | I think once you move from early fantasy CPUs to early CPUs in
         | production such as 80286, the complexity immediately moves up
          | drastically. IIRC it involves at least memory segmentation
          | and protected mode (MMU).
        
           | remexre wrote:
           | the 80286 has its own problems/inessential complexity
           | 
           | if you look at this from the riscv angle, moving from "u-mode
           | only vm that doesn't use paging under the hood" to "u+s-mode
           | vm with sv39" isn't an enormous jump in complexity imo
           | 
           | i think i might teach it starting as like, "sv21" (page
           | tables aren't nested), then pose real sv39 and the tree
           | structure as the solution to making a sparse mapping over
           | 512GiB
           | 
           | then moving on to the idea of having a TLB is simple,
           | especially if students have already been introduced to
           | hashtables
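For reference, an Sv39 virtual address splits into a 12-bit page offset plus three 9-bit VPN fields, one index per level of the page table tree the comment describes; a sketch:

```python
# Split an Sv39 virtual address into its page offset and the three
# 9-bit VPN fields that index successive levels of the page table tree.
def sv39_split(va):
    offset = va & 0xFFF                                    # 4 KiB pages
    vpns = [(va >> (12 + 9 * level)) & 0x1FF for level in range(3)]
    return vpns, offset

vpns, offset = sv39_split(0x12345678)
```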
        
             | PaulHoule wrote:
             | Sometimes I dream of a 24-bit generation of computers (in
             | terms of address space and the space of index math) of
             | which
             | 
             | https://en.wikipedia.org/wiki/Zilog_eZ80
             | 
             | may be the best realization. You can _almost_ run a real
             | operating system on that chip in the sense of a 24 bit
              | supervisor that could run multiple 16 bit processes (if
              | you could get CP/M to run, you have one heck of a
              | userspace). Unfortunately you can't trap the instruction
              | that switches
             | back to 24 bit mode.
             | 
              | It would be nice too if a 24 bit supervisor could
              | contain a 24 bit application; that requires some kind
              | of memory management, and I'd imagine something a
              | little lighter
             | weight than the usual paging system, say something that
             | maps (0..N) in logical space to (i+0..i+N) in physical
             | address space. I like the "access register" concept
             | 
             | https://en.wikipedia.org/wiki/Access_register
             | 
             | but would probably have a small number of memory banks such
             | as 16 or 64. In a load-store architecture it doesn't seem
             | like much of a burden to add a bank id to stores and loads.
             | In return you get not just more RAM but also separation of
             | code and data, video memory, file mmap(ing) and such.
             | 
             | What bugs me is how to control the mapping of memory banks
             | to physical memory, on one hand you want the supervisor in
             | charge so it can be used for memory protection, on the
             | other hand some programming techniques would want the speed
             | of changing the memory banks from user space.
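The base+bound mapping described above (logical 0..N in a bank mapping to i+0..i+N physically) can be sketched as follows; the bank table values are made up:

```python
# Base+bound translation per bank: logical addresses 0..N-1 in a bank
# map to base..base+N-1 physically, with the bound check standing in
# for memory protection. The bank table values are made up.
banks = {0: (0x10000, 0x4000),   # bank id -> (physical base, length)
         1: (0x20000, 0x1000)}

def translate(bank, logical):
    base, length = banks[bank]
    if not 0 <= logical < length:
        raise MemoryError("bank bound exceeded")
    return base + logical
```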
             | 
             | The 80286 was a turkey because it didn't meet the minimal
             | viable product level of being a target for a 24-bit OS that
             | could virtualize DOS applications. It was already crazy
             | fast and became affordable pretty quickly but it seemed
             | tragic that it couldn't do that.
        
               | kazinator wrote:
                | Also the PIC24 family of microcontrollers.
        
               | Narishma wrote:
               | > The 80286 was a turkey because it didn't meet the
               | minimal viable product level of being a target for a
               | 24-bit OS that could virtualize DOS applications. It was
               | already crazy fast and became affordable pretty quickly
               | but it seemed tragic that it couldn't do that.
               | 
                | That's mainly because it was designed before the IBM
                | PC was a success and backwards compatibility with DOS
                | applications became important.
        
               | monocasa wrote:
               | Even better than that was an Z280 which did have a proper
               | user/supervisor mode and simple MMU. Really reminiscent
               | of a PDP-11 in terms of supervisor state, just strapped
               | to a z80 instruction set otherwise. Also added a nice SP
               | relative load to the instruction set.
        
               | fentonc wrote:
               | I took a different approach by just making an FPGA-based
               | multi-core Z80 setup. One core is dedicated to running
               | 'supervisor' CP/NET server, and all of the applications
               | run on CP/NET clients and can run normal CP/M software. I
               | built a 16-core version of this, and each CPU gets its
               | own dedicated 'terminal' window, with all of the
               | windowing handled by the display hardware (and ultimately
               | controlled by the supervisor CPU). It's a fun 'what-if'
               | architecture that works way better than one might expect
               | in practice. It would have made an amazing mid-to-late
               | 1980s machine.
        
           | sunday_serif wrote:
           | True enough, but having designed a "fantasy cpu" gives you a
           | better frame of reference for understanding the more complex
           | features of a cpu (privilege levels, memory segmentation,
           | virtual addresses, cache hierarchy, etc.)
           | 
           | I often feel like those who haven't done the exercise of
           | understanding the ISA of a "fantasy cpu" have a really hard
           | time understanding those more advanced features.
           | 
           | I guess all I am saying is that learning the "fantasy cpu"
           | still has value even if everything else in the real world is
           | more complex.
           | 
           | Walking before running and all that.
        
             | PaulHoule wrote:
             | I've been doing some planning for a 24-bit fantasy CPU, my
             | plan is to make it pretty baroque. For instance it has some
             | instructions to do things like unpack UTF-8 strings into
             | chars, do alpha compositing, etc. The CPU part looks like a
             | strange mainframe that didn't quite get built into the
             | 1970s and it is coupled to a video system that would make a
             | Neo-Geo blush.
        
               | notjoemama wrote:
               | "Neo Geo[a] is a brand of video game hardware developed
               | by SNK.
               | 
               | It was launched with the Neo Geo, an arcade system
               | (called MVS) with a home console counterpart (AES). Games
               | on the Neo Geo (MVS and AES) were well received and it
               | spawned several long-running and critically acclaimed
               | series, mostly 2D fighters. Later, SNK released the Neo
               | Geo CD, a more cost-effective console with games released
               | on compact discs, which was met with limited success. A
               | new arcade system, Hyper Neo Geo 64, was released in
               | 1997, but it did not fare well. SNK also released a
               | handheld console under the brand, the Neo Geo Pocket,
               | which was quickly succeeded by the Neo Geo Pocket Color,
               | which have been given praise despite its short lifetime.
               | 
               | SNK encountered various legal and financial issues
               | resulting in a sale of the company in 2001. Despite that,
               | the original Neo Geo arcade and console continued
               | receiving new games under new ownership until 2004. The
               | Neo Geo brand was revived in 2012 with the release of the
               | Neo Geo X[1] handheld. Since then, a number of other Neo
               | Geo products have been released based on the original Neo
               | Geo."
               | 
               | -- Wikipedia
               | 
               | Seems to be from about 30 years ago in the 1990s.
        
             | markus_zhang wrote:
              | Agreed! A fantasy CPU is good for a first project.
        
             | jeffrallen wrote:
             | Forget advanced features... Without understanding a CPU
             | it's easy to never really understand pointers, and without
             | pointers, it's hard to understand lots of data structures.
             | 
             | I was easily 12 months ahead of other students in my CS
             | education because I learned 6502 assembly in high school. I
             | wish all CS courses started with "make a VM".
        
               | wkjagt wrote:
               | This is very true. I had tried learning C multiple times
               | but pointers were always kind of hard. I kind of
               | understood them but not really. I later spent a lot of
               | time making an OS in assembly for my homebrew 6502
               | computer, and after that pointers made so much sense. It
               | actually took me a little while to realize that these
               | addresses I was passing around were pointers, and I had
               | started to understand them without realizing it.
        
               | mtzet wrote:
               | I don't get this take. Is it so hard to understand that a
               | computer operates on a giant array of bytes?
               | 
               | I think the hard thing to understand is that C's pointer
               | syntax is backwards (usage follows declaration is weird).
               | 
               | I also think understanding how arrays silently decay to
               | pointers and how pointer arithmetic works in C is hard:
               | ptr+1 is not address+1, but address+sizeof(*ptr)!
               | 
               | Pointers are not hard. C is just confusing, but happens
               | to be the lingua franca for "high level" assembly.
        
           | sitkack wrote:
           | 80286 is the PHP of CPUs.
           | 
           | A wonderfully different early CPU with plenty of existing
           | software is the https://en.wikipedia.org/wiki/RCA_1802 which
           | was a target of the https://en.wikipedia.org/wiki/CHIP-8
           | interpreter.
        
           | flohofwoe wrote:
           | Or you could pick the 68000 which is easier to emulate than a
           | 286 because it was 'properly designed' (and not a stopgap
           | solution like the x86 line until the 386 arrived).
        
           | UniverseHacker wrote:
           | There are plenty of real CPUs that are useful in the real
           | world but are much simpler than a 286.... 8 bit RISC
           | microcontrollers like Atmel AVRs for example. I've done a
           | small amount of programming for those in assembly, and they
           | are not a whole lot more complex than the 'toy' design I used
           | for the class.
        
         | whartung wrote:
         | Our CS 101 class had such a system. Simple computer/assembler
         | written in BASIC on the PDP.
         | 
         | One of the assignments was to do a simple multiply (i.e. add in
         | a loop).
         | 
         | Rather than do that, my friend simply altered the program and
         | created a new MUL command.
         | 
         | The teacher was not amused.
        
           | JoeAltmaier wrote:
           | Teachers don't want extra work.
           | 
           | My first 360 assembler class our first assignment: add two
           | numbers, fault, print the crashdump and circle the answer to
           | the ADD in the printout.
           | 
           | I wrote an RPN calculator, had it do a series of calculations
           | and print the result. Turned that in.
           | 
           | The teacher wrote on it "You have 24 hours to turn in the
           | required assignment"
        
             | MortyWaves wrote:
             | "Don't let education get in the way of your learning" seems
             | appropriate here.
             | 
             | Never underestimate how few fucks teachers give about
             | anything beyond their small bubble.
        
               | fuzztester wrote:
               | I never let schooling get in the way of my education.
               | 
               | - Mark Twain.
        
             | kubb wrote:
             | Sounds like you needed to learn to follow basic
             | instructions.
        
               | JoeAltmaier wrote:
               | Said like a true teacher.
        
             | aleph_minus_one wrote:
             | > My first 360 assembler class our first assignment: add
             | two numbers, fault, print the crashdump and circle the
             | answer to the ADD in the printout.
             | 
             | > I wrote an RPN calculator, had it do a series of
             | calculations and print the result. Turned that in.
             | 
             | The teacher wanted you to learn:
             | 
             | 1. how to write a (very simple) assembly program for 360 to
             | add to numbers
             | 
             | 2. how to create a fault
             | 
             | 3. how to print a crashdump
             | 
             | 4. how to read a crashdump
             | 
             | By using a RPN calculator, you learned none of these
             | objectives.
        
               | JoeAltmaier wrote:
               | Not 'using'. Writing. In assembler. A full RPN
               | calculator.
        
             | coreyp_1 wrote:
             | I was a professor. I would have given you a zero.
             | 
             | From the instructor's perspective, it sounds like you
             | didn't even try to do the intentional assignment (which the
             | fault/crashdump/find info in the crashdump was the most
             | important part), and either did your own thing or sent in
             | someone else's assignment because you didn't look closely
             | enough at what it was. Yes, this is a big accusation, but
             | it happens every semester.
             | 
             | The professor just wanted you to learn what you were
             | supposed to from the assignment, so that you could build
             | off it.
             | 
             | Your cynical take on it now shows that you still don't
             | understand how to teach or the importance of that
             | fundamental skill.
             | 
             | Is there a place for fun, exploratory projects? Of course,
             | and I always incorporated them into my syllabus. But
             | there's also a place for structure. If you want to explore
             | along the way, then that is even better! But you still have
             | to meet the incremental checkpoints along the way,
             | otherwise you have not demonstrated that you have met the
             | core competencies of the subject, as reflected by your
             | grade.
             | 
             | There's a lot of shortcomings in modern university
             | education, but I loved teaching, and would still be doing
             | it today if it paid reasonably. My biggest headaches were
             | the cheaters and the know-it-alls, mostly because neither
             | of them knew enough to respect the subject matter and its
             | importance. The students that made me the most proud,
             | though, were those that trusted the roadmap, and in the end
             | were able to do bigger and better things than they thought
             | that they were capable of.
             | 
             | It's not about extra work. We did plenty of it, btw. We
             | just didn't always tell you about it. The University system
              | refers to it as "service", and you, the student, are the
             | beneficiary of it, either directly or indirectly.
             | 
             | "Teachers don't want extra work." I'm sorry, but this
             | sounds like a toddler stomping their foot and saying "Mommy
             | and Daddy don't love me because they went to work today
             | instead of staying home to play with me."
             | 
             | I'm glad that you wrote the RPN calculator. That's cool!
             | I'm disappointed that you use this as an opportunity to
             | bash the educator who agreed to take a low salary so that
             | they could help you learn important things that are
             | fundamental to your craft and chosen field of study.
        
               | JoeAltmaier wrote:
               | I didn't bash anybody - that's words in my mouth.
               | 
               | I understand that it's a sensitive subject, students that
               | stray from the boundaries of the lesson plan, so often to
               | no good effect. Making work for an overworked teacher, I
               | have complete sympathy.
               | 
               | But understand that there's also room for the teacher
               | that doesn't stomp their own foot and petulantly demand
               | every student stay in the herd like good little scholars.
               | 
               | Maybe even encourage exceptional students, maybe suggest
               | something more appropriate for them, another class
               | perhaps. But no, that certainly didn't happen, just the
               | deliberate ignoring of the effort spent, because it
               | didn't serve their tiny objective.
               | 
               | Just that one weary, autocratic bland statement,
               | telegraphing as sure as a digital signal "I am the
               | teacher, and you will not do anything I don't sanction,
               | like learning way more than I demand for today's lesson"
        
         | chii wrote:
         | > This stuff is all shockingly simple
         | 
         | it is, but the thing is that these simple building blocks end
         | up quite far away from an actual production level outcome that
         | someone looking at a computer might see or interact with. It's
         | hundreds of levels deep.
         | 
         | Someone with curiosity and eager to learn will be able to
         | easily learn these foundational layers. Someone looking to "get
         | rich quick" and be ready and employable ASAP won't.
        
       | ajross wrote:
       | To be That Guy: this is an emulator, not a VM. While the term can
       | apply in a descriptive sense, and in the pre-hardware-
       | virtualization past there was some ambiguity, the overwhelmingly
       | common usage of "Virtual Machine" in the modern world refers to
       | an environment making use of hardware virtualization features
        | like VT-x et al.
        
         | kragen wrote:
         | I don't think that terminology is "overwhelmingly common", and
         | I'd argue that it isn't even entirely correct. The JVM is
         | widely deployed, the Ethereum VM is called the "EVM",
         | https://www.linuxfoundation.org/hubfs/LF%20Research/The_Stat...
         | describes BPF and eBPF repeatedly as "virtual machines",
         | https://webassembly.org/ begins by saying, "WebAssembly
         | (abbreviated Wasm) is a binary instruction format for a stack-
         | based virtual machine," etc. "Virtual machine" is still the
         | most common term for fictional machines. (Myself, I think I'd
          | prefer "fictive machine", "fictitious machine", "imaginary
         | computer", or "fantastic automaton", but I doubt these terms
         | will gain adoption.)
         | 
         | You can't always use the term "emulator" instead of "virtual
         | machine", because while you could say wasmtime was an
         | "emulator", you can't correctly say that WebAssembly itself is
         | an "emulator". Rather, WebAssembly is the virtual machine which
         | wasmtime emulates. It's also common to call emulators "virtual
         | machines". (The Wasmtime web site mostly calls wasmtime a
         | "runtime" and WebAssembly an "instruction format", FWIW.) And
         | of course a running instance of an emulator is also a "virtual
         | machine" in a different sense.
         | 
         | I think it's also reasonable to use "virtual machine" in the
         | way you are describing, and it has some overlap with this last
         | sense of "virtual machine". Perhaps in your current environment
         | that is the overwhelmingly most common usage, but that is
         | definitely not true elsewhere.
        
           | dataflow wrote:
           | Disagree with both of you. JVM was always a misnomer, it just
           | stuck because that's just the name they stuck with. Putting
           | "virtual machine" in its name in defiance of the meaning
           | doesn't suddenly make it a virtual machine any more than
           | calling your dog "my son" makes your dog your son. And
            | virtual machines have used software virtualization for the
            | longest time; they aren't always hardware based.
           | 
           | Really, a virtual machine is literally what it says on the
           | tin: something that isn't a physical computer (machine), but
           | presents itself as one. You know, with a virtual CPU or a
           | virtual disk or a virtual keyboard etc., whatever a computer
           | would have. If it doesn't present itself as a machine, it's
           | not a virtual machine.
           | 
           | Calling JVM a virtual machine just because it interprets some
           | bytecode the same way a machine does is like calling me a
           | virtual doctor just because I read text the same way a doctor
           | does.
        
             | kragen wrote:
             | The JVM _is_ a VM in that sense. It isn 't a physical
             | computer, but presents itself as one. It's true that that
             | computer doesn't have a disk or a keyboard, but neither do
             | the vast majority of physical computers, including,
             | specifically, the set-top boxes the JVM was originally
             | designed to run on.
        
               | dataflow wrote:
               | > The JVM _is_ a VM in that sense. It isn 't a physical
               | computer, but presents itself as one.
               | 
               | No, it isn't and doesn't. What in the world does it have
               | that makes you think it pretends to be a computer? A
               | bunch of pretend registers and a memory limit?
        
               | kragen wrote:
               | It doesn't have registers, pretend or otherwise, or a
               | memory limit. It sounds like you aren't very familiar
               | with the JVM or computers in general. On the JVM in
                | particular, I suggest reading
                | https://docs.oracle.com/javase/specs/jvms/se23/html/index.ht....
        
               | dataflow wrote:
               | What are you talking about? It literally has a program
                | counter register, it's right there in your own link?
               | 
                | https://docs.oracle.com/javase/specs/jvms/se23/html/jvms-2.h...
               | 
               | I just mixed it up with Dalvik having multiple registers,
               | but that makes it even worse. At least normal machines
               | have more registers.
               | 
               | And literally here:
               | 
               |  _The following section provides general guidelines for
               | tuning VM heap sizes: The heap sizes should be set to
               | values such that the maximum amount of memory used by the
               | VM does not exceed the amount of available physical RAM._
               | 
                | https://docs.oracle.com/cd/E21764_01/web.1111/e13814/jvm_tun...
               | 
               | Maybe instead of telling me I'm clueless, tell me how
               | you're right? What do you see in it that is actually
                | virtualizing a computer in your mind? Because I'm pretty
               | sure that if someone sold you a VM service and it turned
               | out to be just them running your programs under the JVM,
               | you'd be kind of mad.
        
             | rramadass wrote:
              | The JVM is what is called a "Process Virtual Machine" -
              | https://en.wikipedia.org/wiki/Virtual_machine#Process_virtua...
        
         | troad wrote:
         | > To be That Guy: this is an emulator, not a VM. While the term
         | can apply in a descriptive sense, and in the pre-hardware-
         | virtualization past there was some ambiguity, the
         | overwhelmingly common usage of "Virtual Machine" in the modern
         | world refers to an environment making use of hardware
          | virtualization features like VT-x et al.
         | 
         | I don't think the distinction you're advocating for actually
         | exists. 'Virtual machine' is commonly used for any software
         | that executes machine- or bytecode, irrespective of the reason.
         | This can include virtualisation, but you also commonly see the
         | term used for language runtimes: e.g. Java's JVM, Ruby's YARV
         | (Yet Another Ruby VM).
         | 
         | The one area you don't actually hear the term that often is in
         | emulation, and this is in part because most modern emulators
         | have tended away from emulating entire systems, and towards
         | techniques like dynamic recompilation (dynarec) of emulated
         | software.
        
           | asimpletune wrote:
           | Exactly, emulation is "make sure the input and output of what
           | you're emulating are the same", whereas simulation is "model
           | the actual internal mechanisms of the thing you're
           | simulating, so as to achieve the same inputs and outputs".
        
         | LeFantome wrote:
         | This is a VM like in JVM ( Java Virtual Machine ). I think you
         | will find Java qualifies for "overwhelmingly common usage".
        
           | ajross wrote:
           | Not in this context, no. If you tell your boss you're
           | deploying the solution "in a VM" and it turns out to be a
           | .jar file, you're likely to be fired.
           | 
           | Context and convention are important. This subthread is
           | filled with people who are technically right, but choosing to
           | cheer for an unclear and idiosyncratic terminology. A VM
           | means hardware virtualization. Choosing other meanings is at
           | best confusing and at worst a terrible security bug.
        
         | asimpletune wrote:
         | I have to politely disagree. In the most pure sense a VM is a
         | made up computer, without any implications being expressed to
         | what it's intended to be used for or how it works.
         | 
          | In the article the author even gives emulating a classic
          | console as one example, but it's clear from the article that
          | many other possible VMs fit the definition they provide.
         | 
         | Anyways the point is VM is abstract, and there are many many
         | types of them. Simulators, emulators, hypervisors, etc... are
          | all VMs, but there are also VMs that don't quite have a name
          | yet because they're strange.
         | 
         | I don't mean to be rude at all - quite the contrary, I want to
         | be respectful - but I also want to be clear about these terms
         | for people who are interested in learning.
        
       | rramadass wrote:
        | Some books that I was pointed to:
       | 
       | 1) _Virtual Machines: Versatile Platforms for Systems and
       | Processes_ by Smith and Nair - Seems to be a comprehensive
       | subject overview book.
       | 
       | 2) _Virtual Machines_ by Iain Craig - Seems like a more hands-on
       | book with languages and VMs.
       | 
        | 3) _Virtual Machine Design and Implementation in C/C++_ by Bill
        | Blunden - Seems like a hands-on implementation guide.
       | 
        | If somebody who has read any of the above can add some comments,
        | that would be helpful to everybody.
        
         | kragen wrote:
         | I'm not convinced that the subject is narrow enough to overview
         | in a single book. The considerations involved in writing a
         | Nintendo emulator, a hypervisor using VT-x, a conventional
         | multitasking operating system, an interpreter for a new
         | scripting language, a SQL query optimizer, a regular expression
         | matcher, a security monitor for untrusted player code running
         | on a game server, etc., would seem to have very little overlap.
         | But these are all "virtual machines". There's even a stack-
         | based virtual machine in the terminfo format for specifying
         | character-cell terminal escape sequences!
         | 
         | In a deep sense, virtual machines are what make computers
         | _computers_ , as we use the term today. Turing's 01936 paper
         | about the Entscheidungsproblem hinged on virtual machines being
         | able to emulate one another.
        
           | rramadass wrote:
           | You are thinking about "Virtual Machines" in a more
           | philosophical/abstract sense. Here the definition is in a
            | much narrower and more conventional sense. Surprisingly I
           | can't find too many books on this subject. The Iain Craig
           | book looks pretty interesting since it seems to link language
           | to the VM, so maybe one can learn both language design and a
           | VM to run it on.
           | 
            | Other than the above, I can only think of studying "QEMU
            | Internals" for more knowledge -
            | https://airbus-seclab.github.io/qemu_blog/
        
             | kragen wrote:
             | I'm mostly talking about the sense of the term used in the
             | title of the article we're commenting on, not a more
             | philosophical sense, though I did draw the connection in my
             | second paragraph. It's hard to get less philosophical than
             | terminfo!
        
       | mystified5016 wrote:
       | After watching Ben Eater's breadboard CPU series, I want nothing
       | more than to design and emulate my own CPU.
       | 
       | I just wish I could find the time to sit down and design the damn
       | thing.
        
       ___________________________________________________________________
       (page generated 2024-12-27 23:01 UTC)