[HN Gopher] A 23-byte "hello, world" program assembled with DEBU...
___________________________________________________________________
A 23-byte "hello, world" program assembled with DEBUG.EXE in MS-DOS
Author : susam
Score : 68 points
Date : 2022-10-30 20:07 UTC (2 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| marginalia_nu wrote:
| DEBUG.EXE is some necronomicon-tier dark magic.
| Narishma wrote:
| What do you mean? It's a simple straightforward debug tool, or
| a monitor as it used to be called in 8-bit systems.
| userbinator wrote:
| PC magazines used to publish source code listings for little
| utilities in the form of DEBUG scripts.
| blueflow wrote:
| The debugger and the 8086 instruction set are well documented,
| much better than the "modern" software that I have to work with
| at my day job. It's not magic.
| pizza234 wrote:
| It's interesting how ASM nowadays appears as dark magic,
| since all but a small fraction of programmers are (rightfully)
| very far from that level.
|
| Curiously, I found learning Rust way more challenging than
| learning 16-bit assembly (it was way simpler back then; no
| complex instructions, less baggage, simpler processors... and
| fewer expectations :)).
| int_19h wrote:
| I've learned x86 16-bit assembly originally, and I find that
| most of that knowledge is still applicable when looking at
| assembly listings while debugging C/C++ today (which is the
| most likely area where one might have to deal with it in
| production these days; few people get to write asm from
| scratch).
|
| For x86, at least, I wouldn't even say that it was less
| complex. Segmented memory alone is a huge complication, and
| then on top of that there was all the legacy CISC stuff like
| the BCD helpers or ENTER/LEAVE; x64 is comparatively
| streamlined.
| mtrower wrote:
| An 8 year old, using the ring bound manual that came with their
| computer, can figure out the basics - enough to write a program
| like this, and examine it. I'm not being dismissive here, but
| speaking literally from experience. Personally, it's my opinion
| that computers were a lot simpler back then.
|
| Nowadays we're very, very far from the hardware - hardware that
| has grown very complicated in comparison.
| userbinator wrote:
| You can just use ret at the end, saving 3 bytes. Also, the
| initial value of bp is 09xx on every version of MS-DOS since 4.0,
| so you can also start off with an xchg ax,bp to save another
| byte.
|
|     xchg ax,bp
|     mov dx,107
|     int 21
|     ret
|     db "hello, world" 0d 0a '$'
|
| 22 bytes, of which 15 is the message.
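The byte count above can be checked by hand-assembling the listing.
The following is a minimal sketch in Python, not part of the original
comment: the names ORG, code, message and program are made up for
illustration, and the opcode bytes are the standard 8086 encodings
for these five instructions.

```python
# Hand-assemble the 22-byte variant and check the arithmetic.
# ORG, code, message and program are illustrative names only.
ORG = 0x100  # DOS loads a COM image at offset 100h

message = b"hello, world\r\n$"  # 12 characters + CR + LF + '$' terminator

code = bytes([
    0x95,              # xchg ax, bp   (relies on bp = 09xx at startup)
    0xBA, 0x07, 0x01,  # mov dx, 0107h (16-bit immediate, little-endian)
    0xCD, 0x21,        # int 21h
    0xC3,              # ret
])

program = code + message
string_address = ORG + len(code)  # where the message ends up in memory

print(len(program), len(message), hex(string_address))  # 22 15 0x107
```

The immediate operand of mov dx has to equal the address the message
lands on, which is why the listing uses 107.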
| EvanAnderson wrote:
| I was going to suggest the RET but I didn't remember the BP
| being set to save the additional byte. Nice, albeit sacrificing
| some compatibility.
|
| For anybody interested, some DOS default register values are
| documented here: http://www.fysnet.net/yourhelp.htm
| userbinator wrote:
| It's an old demoscene trick, so perhaps a bit obscure but
| somewhat common in the sizecoding community.
| jart wrote:
| Wow from feedback to commit in 30 minutes.
| https://github.com/susam/hello/commit/36fa08e7cafb7c5268b651...
| susam wrote:
| Updated the repository to use your RET suggestion. Thanks!
| ithinkso wrote:
| With no attribution nevertheless :) (/s)
|
| I love how, without the 'hello, world' message itself, 25% of
| your entire HELLO.ASM codebase is from a random HN comment
| colejohnson66 wrote:
| Impressive. Is there a reason to jump over the string instead of
| just having the string _after_ the program? Seems like one could
| save two bytes doing that.
| donio wrote:
| debug.exe is single pass and doesn't do labels so by having the
| string first you know its address later.
| vore wrote:
| You can compute the address of the string yourself like a
| two-pass assembler would, though, so that shouldn't be
| limiting.
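Computing the address by hand amounts to summing the encoded sizes of
the instructions that precede the string and adding the COM load
offset. A rough sketch in Python follows; the function name
string_address and the size table are hypothetical, and the sizes
assume the usual 8086 encodings (2 bytes for mov ah,imm8 and int,
3 bytes for mov dx,imm16, 1 byte for ret).

```python
# First "pass": add up instruction sizes to find where the string will sit.
# string_address and the lists below are illustrative, not from DEBUG.
ORG = 0x100  # COM programs are loaded at offset 100h

def string_address(instructions, org=ORG):
    """Return the address of data placed right after the given instructions."""
    return org + sum(size for _mnemonic, size in instructions)

with_int21_exit = [
    ("mov ah, 9", 2),
    ("mov dx, <string>", 3),
    ("int 21", 2),
    ("mov ah, 0", 2),
    ("int 21", 2),
]

with_ret_exit = [
    ("mov ah, 9", 2),
    ("mov dx, <string>", 3),
    ("int 21", 2),
    ("ret", 1),
]

print(f"{string_address(with_int21_exit):X}")  # 10B
print(f"{string_address(with_ret_exit):X}")    # 108
```

The second "pass" is then just typing that number into the MOV DX
operand when entering the program in DEBUG.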
| pizza234 wrote:
| This is an ASM program with a very standard structure
| (including the standard printing API:
| http://spike.scu.edu.au/~barry/interrupts.html#ah09) using a
| very standard tool (DEBUG.exe, common at the time for quick
| debugging); I'm confused why this is impressive.
| Agingcoder wrote:
| The COM executable gets loaded by DOS at address 100h, so the
| first bytes have to be executable code, if memory serves me
| well?
| vore wrote:
| I think OP is saying what if you wrote it as:
|
|     mov ah, 9
|     mov dx, offset helloworld
|     int 21
|     mov ah, 0
|     int 21
|     .helloworld: db 'hello, world', d, A, '$'
| dmitrygr wrote:
| You are right and sister comment to this one is wrong. Thusly:
|
|     MOV AH, 9
|     MOV DX, str
|     INT 21
|     MOV AH, 0
|     INT 21
|     Str:
|     DB 'hello, world', d, A, '$'
| q-big wrote:
| This program can be simplified further:
|
|     MOV AH, 9
|     MOV DX, str
|     INT 21
|     RET
|     str:
|     DB 'hello, world', d, A, '$'
|
| Why can
|
|     MOV AH, 0
|     INT 21
|
| be replaced by RET? Here is the answer:
| https://stackoverflow.com/a/60805758
|
| UPDATE: Under https://news.ycombinator.com/item?id=33398592
| userbinator posted an additional possible optimization.
| ralferoo wrote:
| Not a size optimisation, but a performance optimisation...
|
|     INT 21
|     RET
|
| can be replaced with JP 5
| susam wrote:
| I wrote this about 20 years ago during my university days. I
| happened to stumble upon it today in my archives and thought of
| sharing it on GitHub. I was still learning microprocessors back
| then. While browsing the C:\Windows directory, I fortuitously
| happened to discover the DEBUG.EXE program. Turned out it was
| available on any standard installation of MS-DOS as well as
| Windows 98. That chance encounter helped me to dive into the
| world of assembly language programming long before the
| coursework introduced me to more popular assemblers.
|
| Since I was still learning the x86 CPUs, the intention here was
| not to save bytes but instead to have something working. I
| believe I picked up the style of having the string at the top
| and jumping over it from other similar code I had come across
| in those days.
|
| You are right of course. Here is a complete example that moves
| the string to the bottom and saves two bytes:
|
|     C:\>DEBUG
|     -A
|     1165:0100 MOV AH, 9
|     1165:0102 MOV DX, 10B
|     1165:0105 INT 21
|     1165:0107 MOV AH, 0
|     1165:0109 INT 21
|     1165:010B DB 'hello, world', D, A, '$'
|     1165:011A
|     -G
|     hello, world
|     Program terminated normally
|     -N HELLO.COM
|     -R CX
|     CX 0000
|     :1A
|     -W
|     Writing 0001A bytes
|     -Q
|     C:\>HELLO
|     hello, world
|     C:\>
|
| I have now updated the GitHub repository with this updated
| source code and binary. Thank you for the nice comment!
| Narishma wrote:
| DEBUG.EXE is present in all versions of MS-DOS since the very
| first.
| owl57 wrote:
| Curious. These other programmers probably learned the habit
| from some even older code. Jumping over data isn't a very
| obvious way of organising code, so probably it served some
| purpose many years ago.
|
| Maybe someone here knows what that purpose was?
| [deleted]
| userbinator wrote:
| I remember seeing that in old Asm books too. My best guess
| is that it avoids having too many forward references, which
| would take up precious memory in the systems of the time
| and perhaps reach the limit of the assembler sooner.
| _the_inflator wrote:
| Maybe also better for linking the files. At least this
| was the reason I did it on Amiga when using absolute
| addresses. I only had to remember the start of the
| address area even when recompiling.
| jstanley wrote:
| I wrote a compiler for a small machine that did this so
| that it could output the string content straight away
| without having to buffer it in memory.
| jmole wrote:
| you can hard-code values if you know their address in
| memory. If the data section comes first, then it doesn't
| move around if your code size changes.
| secondcoming wrote:
| This brings back memories!
|
| Many years ago I used debug.exe to create a bootable floppy that did
| nothing but display my name on the screen when booted from. I
| peed a little when it finally worked.
|
| Why did MS stop shipping it???
| ivoras wrote:
| Crazy that once upon a time just sequences of machine code were
| written to files, without any headers, or checksums, or basically
| any modern metadata. Just plain machine code, to be directly fed
| into the CPU.
|
| Cue lamentations of today's complexity mixed with feelings of
| life being great because of it.
| pizza234 wrote:
| Yes, although this was for COM files only, which were limited
| to (64k - 0x100) bytes. EXEs didn't have this limitation, but
| they indeed had a header.
| Lt_Riza_Hawkeye wrote:
| And before that, they were toggled directly into the machine
| using physical switches!
| 13of40 wrote:
| MS-DOS v1 didn't have an assembler in debug.exe (or was it
| .com?) so the only way to author machine code on a 5150 with no
| extra dev tools was to code it on paper, translate it by hand,
| and enter it in hex.
| susam wrote:
| You are right. It was DEBUG.COM. Indeed it did not have an
| assembler. It had a disassembler though. An archived copy
| from MS-DOS v1.25 can be found here:
| https://github.com/microsoft/MS-DOS/blob/master/v1.25/bin/DE...
|
|     C:\>DEBUG.COM
|     -A
|      ^ Error
|     -N HELLO.COM
|     -L
|     -U 100 107
|     0340:0100 B409     MOV AH,09
|     0340:0102 BA0801   MOV DX,0108
|     0340:0105 CD21     INT 21
|     0340:0107 C3       RET
|     -
|
| There is another copy of the debugger for MS-DOS v2.0 here:
| https://github.com/microsoft/MS-DOS/blob/master/v2.0/bin/DEB... .
| This one does have an assembler.
|
|     C:\>DEBUG.COM
|     -A
|     0482:0100 MOV AH, 2
|     0482:0102 MOV DL, 41
|     0482:0104 INT 21
|     0482:0106 RET
|     0482:0107
|     -N A.COM
|     -R CX
|     CX 0000
|     :7
|     -W
|     Writing 0007 bytes
|     -Q
|     C:\>A
|     A
|     C:\>
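For readers without a DOS machine at hand, the same 7-byte image can
be put together from its opcode bytes. This is only a sketch in
Python; build_a_com is a hypothetical helper name, and the bytes are
the standard 8086 encodings of the four instructions entered in the
session above.

```python
# Build the 7-byte A.COM image from raw opcode bytes.
# build_a_com is an illustrative name, not an existing tool or API.
def build_a_com() -> bytes:
    return bytes([
        0xB4, 0x02,  # mov ah, 2   (INT 21h function 02h: print character in DL)
        0xB2, 0x41,  # mov dl, 41h (ASCII 'A')
        0xCD, 0x21,  # int 21h
        0xC3,        # ret
    ])

image = build_a_com()
print(len(image), image.hex())  # 7 b402b241cd21c3
```

Writing the result out with pathlib.Path("A.COM").write_bytes(image)
should give a file equivalent to the one DEBUG saved with the W
command, since a COM file is nothing more than these bytes.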
| pjc50 wrote:
| If you're programming a microcontroller, that time is now.
|
| (OK, so it'll often be the output of a C compiler, but there's
| usually some work to be done in asm to get the system in a
| state with a stack, clocks, RAM etc to run a C program!)
___________________________________________________________________
(page generated 2022-10-30 23:00 UTC)