[HN Gopher] Write Your Own Terminal
___________________________________________________________________
Write Your Own Terminal
Author : ingve
Score : 120 points
Date : 2023-11-10 08:15 UTC (14 hours ago)
(HTM) web link (flak.tedunangst.com)
(TXT) w3m dump (flak.tedunangst.com)
| Instantnoodl wrote:
| I wrote a small terminal emulator a while ago to have a portable
| terminal for my terminal based game. It's very specific but I had
| great fun with it.
|
| https://github.com/bigjk/crt
| clemailacct1 wrote:
| That terminal and even the associated game look incredible!
| sotix wrote:
| This is awesome! Excited to see what you do with the project.
| Definitely keep us updated.
| billconan wrote:
| I wanted to write a terminal emulator, the biggest hurdle is
| understanding the escape sequences. all documents seem to be
| unreadable, including those mentioned in the post, and those
| often referenced in projects, like
| https://vt100.net/emu/dec_ansi_parser
|
| the second difficulty is handling reflow. a real terminal can't
| resize its screen, but an emulator can. how to implement that
| correctly with cursor movements?
|
| the third difficulty is handling font fallback and rendering
| emojis and other combinatory glyphs correctly.
| SoftTalker wrote:
| Some real terminals had a few different row/column modes, e.g.
| 80x24 or 132x40 (IIRC). I don't recall if text was reflowed
| when switching.
| jws wrote:
| The early DEC terminals did not reflow on switching.
| wyan wrote:
| Usually the screen was cleared when switching modes, so no
| text reflow.
| dilap wrote:
| haven't tried, but i'd guess chatgpt (especially 4) could help
| a ton w/ this.
| jws wrote:
| I've written ANSI terminal handling before. The original VT100
| paper manual that came with the terminal is a nice start,
| because the complexity hadn't really happened yet and people
| used to write useful documentation. With that as a starting
| point for understanding it isn't hard to extend into handling
| the full spec.
|
| That diagram you linked is actually quite nice, but visually
| intimidating. If you think of it as a handful of regular
| expressions exploded into a state diagram that helps. For
| instance, the entire left half of the diagram is just the CSI
| code acceptor, see "CSI (Control Sequence Introducer)
| sequences" on the "ANSI escape code" wikipedia page.
|
| You can write a regular expression to match a CSI and carry on.
| This is 2023 and you aren't using an 8080 with 3k of RAM.
| (probably) The only tiny trick is that you have to handle the
| "incomplete trailing regex" and wait for more data to arrive
| and try again.
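|
| A minimal sketch of that regex approach in Python. The function
| name and event tuples are hypothetical, and the byte ranges
| assume the ECMA-48 layout of a CSI sequence (parameter bytes,
| intermediate bytes, one final byte). It is not a full parser:
| anything that is not a CSI is passed through or dropped.

```python
import re

# CSI = ESC [ then parameter bytes (0x30-0x3F), intermediate bytes
# (0x20-0x2F), and exactly one final byte (0x40-0x7E).
CSI = re.compile(r"\x1b\[([0-?]*)([ -/]*)([@-~])")
# A CSI prefix that runs to the end of the buffer without a final byte.
PARTIAL = re.compile(r"\x1b(\[[0-?]*[ -/]*)?\Z")


def split_input(buffer):
    """Split buffered text into events plus an unconsumed tail.

    Hypothetical helper. Events are ("text", str) or
    ("csi", params, intermediates, final). The caller prepends the
    returned tail to the next chunk of input and calls again.
    """
    events = []
    pos = 0
    while pos < len(buffer):
        esc = buffer.find("\x1b", pos)
        if esc == -1:
            events.append(("text", buffer[pos:]))
            return events, ""
        if esc > pos:
            events.append(("text", buffer[pos:esc]))
        match = CSI.match(buffer, esc)
        if match:
            events.append(("csi",) + match.groups())
            pos = match.end()
        elif PARTIAL.match(buffer, esc):
            # Incomplete trailing sequence: keep it and wait for more data.
            return events, buffer[esc:]
        else:
            # Not a CSI (or malformed). This sketch just drops the ESC.
            pos = esc + 1
    return events, ""
```

| Feeding "ab\x1b[1;31mcd\x1b[2" yields the text and one complete
| CSI, and hands back "\x1b[2" as the tail to retry once the next
| read arrives.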
|
| As for handling reflow. I wouldn't call that an
| "implementation" problem. I'd call it a "specification"
| problem. I'd approach it by seeing what Apple's Terminal
| program does, write that down, and call that my specification.
| alpaca128 wrote:
| I always used Wikipedia's article on ANSI escape sequences. A
| few details could be explained a bit better but overall I found
| it useful. The diagram you linked is probably a more complete
| and compact overview of all possible combinations, but I don't
| find it very intuitive either.
| sureglymop wrote:
| I think it lacks some important escape sequences. E.g. how do
| programs like vim and tmux switch to another buffer and then
| restore the buffer? I vaguely know about it but never saw an
| actually complete documentation.
| vidarh wrote:
| The diagram is complete. It shows the collection of the raw
| sequences, which includes a bunch of parameters that you
| then need to process separately to determine what to
| actually do.
|
| To switch to and from the alternate screen, send \e[?47h and
| \e[?47l. "\e[?" introduces a DEC private mode sequence. The
| number selects which setting to switch on or off, and the
| final "h" or "l" determines whether you're setting or
| clearing it, respectively.
|
| The parsing of those are handled by the escape, csi entry,
| and csi param boxes in the diagram.
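|
| A rough Python sketch of acting on those codes, assuming the
| parser has already split out the parameter string (e.g. "?47")
| and the final byte. The Screen class and its attribute names are
| hypothetical; only mode 47 is acted on here, although xterm also
| documents the related 1047 and 1049 variants.

```python
class Screen:
    """Toy screen model with a primary and an alternate buffer."""

    def __init__(self, rows=24, cols=80):
        self.primary = [[" "] * cols for _ in range(rows)]
        self.alternate = [[" "] * cols for _ in range(rows)]
        self.active = self.primary
        self.private_modes = set()

    def handle_csi(self, params, final):
        """Apply CSI ? Pm h (set) and CSI ? Pm l (reset); ignore the rest."""
        if not params.startswith("?") or final not in ("h", "l"):
            return
        # Several modes may be given at once, separated by ";".
        for field in params[1:].split(";"):
            if not field.isdigit():
                continue
            mode = int(field)
            if final == "h":
                self.private_modes.add(mode)
            else:
                self.private_modes.discard(mode)
            if mode == 47:
                # Mode 47: switch between primary and alternate screen.
                self.active = self.alternate if final == "h" else self.primary
```

| Unknown mode numbers are simply recorded in private_modes, so
| support for more of them can be added one at a time.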
| vidarh wrote:
| Ignore all of this, and start simple.
|
| You can get _something_ going with only the most rudimentary
| escape handling by dumping whatever programs write to your
| terminal as debug output in the terminal you launched your new
| terminal from, and adding a proper parser a bit later.
|
| You can totally ignore reflow. It'll look ugly. It doesn't
| matter. When running full screen applications you need to
| handle width/height reporting, that's all.
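|
| For the width/height reporting, a minimal sketch in Python,
| assuming a Unix pty: the emulator sets the window size on the
| pty with the TIOCSWINSZ ioctl, and programs on the other side
| read it back with TIOCGWINSZ. The two function names are
| hypothetical.

```python
import fcntl
import struct
import termios


def set_window_size(fd, rows, cols):
    """Tell the pty how big the emulator's screen is."""
    # struct winsize: rows, cols, x pixels, y pixels (unsigned shorts).
    winsize = struct.pack("HHHH", rows, cols, 0, 0)
    fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)


def get_window_size(fd):
    """Read the size back, as a full-screen program would."""
    packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", packed)
    return rows, cols
```

| Calling set_window_size on the master side whenever the window
| is resized should be enough for a first version; on Linux the
| kernel then signals the foreground program with SIGWINCH.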
|
| Font fallback and nice font handling is a detail to worry about
| well down the line. There are libraries that can do a lot of
| the lifting for you depending on language/platform. Just pick a
| font with reasonable coverage and worry about the rest later.
| blueflow wrote:
| Ignore all that ANSI stuff, implement escape sequences in
| whatever way you find easy to implement, and then write a
| terminfo file for that so applications know how to use it.
| er4hn wrote:
| Mitchell Hashimoto, HashiCorp's longest-serving IC, has been
| working on his own terminal emulator as a side project:
| https://mitchellh.com/ghostty . It's been interesting to read
| through his logs and see how it develops along with the gnarly
| bugs he gets to work through.
| mtlynch wrote:
| Agreed! I started reading not understanding anything about
| terminal emulators, and it's been interesting following his
| progress.
|
| > _Mitchell Hashimoto, HashiCorp's longest-serving IC_
|
| Small correction: I don't think this is right. He only became
| an IC two years ago.[0]
|
| [0] https://www.hashicorp.com/blog/mitchell-s-new-role-at-hashic...
| keithwinstein wrote:
| FWIW, I wouldn't try to parse escape sequences directly from the
| input bytestream -- it's easy to end up with annoying edge cases.
| :-/ In my experience you'll thank yourself if you can separate
| the logic into something like:
|
| - First step (for a UTF-8-input terminal) is interpreting the
| input bytestream as UTF-8 and "lexing" into a stream of Unicode
| Scalar Values
| (https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf#P.12...
| ; https://github.com/mobile-shell/mosh/blob/master/src/termina...).
|
| - Second step is "parsing" the scalar values by running them
| through the DEC parser/state machine. This is independent of the
| escape sequences (https://vt100.net/emu/dec_ansi_parser ;
| https://github.com/mobile-shell/mosh/blob/master/src/termina...).
|
| - And then the third step is for the terminal to execute the
| dispatch/execute/etc. actions coming from the parser, which is
| where the escape sequences and control chars get implemented
| (https://www.vt100.net/docs/vt220-rm/ ;
| https://invisible-island.net/xterm/ctlseqs/ctlseqs.html ;
| https://github.com/mobile-shell/mosh/blob/master/src/termina...).
|
| Without this separation, it's easier to end up with bugs where,
| e.g., a UTF-8 sequence or an ANSI escape sequence is treated
| differently if it's split between read() calls
| (https://bugs.chromium.org/p/chromium/issues/detail?id=212702),
| or invalid input isn't correctly recovered-from, etc.
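|
| A small sketch of the first step in Python, assuming the
| standard library's incremental UTF-8 decoder is acceptable as
| the "lexer". The make_lexer name is hypothetical. The decoder
| keeps an unfinished multi-byte sequence between calls, so a
| character split across two read() calls still comes out whole,
| and invalid bytes become U+FFFD instead of raising.

```python
import codecs


def make_lexer():
    """Return a feed(chunk) function that turns bytes into text.

    State is kept between calls, so chunk boundaries do not matter.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def feed(chunk):
        # Returns only the characters that are complete so far.
        return decoder.decode(chunk)

    return feed


feed = make_lexer()
first = feed(b"caf\xc3")   # the two-byte "é" is cut in half here
second = feed(b"\xa9!")    # the rest of it arrives in the next read
```

| The output of feed() is what would then go to the escape
| sequence state machine, which never sees raw bytes.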
| azinman2 wrote:
| The comments here don't seem to reflect what I think is the most
| interesting point here: quick loops of satisfaction. So much of
| programming often takes forever to get any real utility or see
| progress. That can really be depressing, especially for a side
| project. That's what I love about cooking or sewing; you quickly
| see the process come together. I wish programming was like that
| more often.
| vidarh wrote:
| Yeah, I'm using my own terminal, and my own editor. The
| terminal also relies on a font-engine I have heavily modified
| (I converted the original from C to Ruby). So I "control" the
| whole pipeline from the editor to the actual pixels, and on one
| hand it has all kinds of quirks I wouldn't wish on someone
| else, on the other hand they all have bits and pieces that are
| custom-written to fit exactly what I want, and which features
| get implemented is decided almost entirely based on which
| little change feels like it'll immediately improve my life
| right now (and I'm not joking - I spend enough time in front of
| my terminal that fixing small aspects of the terminal or my
| editor does feel like it is making an actual improvement in my
| happiness).
| norir wrote:
| > It's also possible to write a terminal in a terminal, like
| tmux, but I'd save this for my second attempt. It's very helpful
| to have a place to dump logging info that's not also the screen
| we're writing to.
|
| I don't fully understand what the author is saying here or
| precisely what they mean by writing a terminal in a terminal.
| From my perspective though, it is easier to write a hosted
| terminal that runs inside of an existing terminal. Writing the
| full thing from scratch is a much harder problem. A terminal has
| many subproblems that are best attacked separately in my opinion.
|
| At its heart, a terminal reads formatted text from standard input
| and writes formatted text to standard output. It is essentially a
| REPL. So the first step is to write a (R)ead function. Then you
| pass the result of read to the (E)valuate function which will
| process the input and finally pass it in to the (P)rint function.
| If you start with a hosted terminal, the read and print functions
| can be modeled with POSIX read and write so you can devote most
| of your time to the evaluate function.
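|
| The loop described above could be sketched in Python like this,
| assuming plain file descriptors for input and output. The
| function names are hypothetical, and evaluate is a placeholder
| that passes data through unchanged; the real work of a terminal
| would go there.

```python
import os


def evaluate(data):
    """Placeholder (E)valuate step: a real terminal would interpret
    escape sequences and update its screen state here."""
    return data


def run(in_fd, out_fd, evaluate=evaluate, bufsize=4096):
    """Read, evaluate, print, and loop until end of input."""
    while True:
        data = os.read(in_fd, bufsize)      # (R)ead
        if not data:                        # empty read means EOF
            break
        output = evaluate(data)             # (E)valuate
        view = memoryview(output)           # (P)rint
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(out_fd, view)
            view = view[written:]
```

| Because run() only sees file descriptors, the same loop can be
| tried with pipes first and pointed at a pty later.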
|
| Once you have a good evaluate function and the terminal works as
| you like in the hosted environment, then it makes sense to go
| back and write new implementations of read and write that target
| a new host environment. This is when it makes sense to switch to
| Qt or OpenGL: when you have already implemented the core logic of
| the terminal and want better io performance. But it also might
| make sense to target html/js for maximum portability. You can
| either reuse or rewrite the backend that was used in the
| bootstrap terminal depending on how it was written. Even if you
| are changing languages, the rewrite should be much easier than
| the initial implementation since you already know what
| functionality is necessary and how to do it.
|
| If you start with Qt or OpenGL, you might never even get to a
| useful terminal because you get so bogged down in the incidental
| details.
|
| What I am describing is essentially quite similar to
| bootstrapping a new programming language. The initial
| implementation should be done in the most convenient
| language/environment possible for the author. People commonly
| make the mistake of implementing a bootstrap compiler in a low
| level language, which is almost always a premature optimization
| and forces you to take on accidental complexity (such as memory
| management) that is secondary to your primary goals. Remember
| Fred Brooks's advice to plan to throw the first implementation
| away. It is so much easier to do something that you have already
| done before than something new.
| vidarh wrote:
| The entire backend rendering to raw X11 calls for my personal
| terminals is ~160 lines of Ruby, and that includes support for
| oddities like double width/double height, and optimizations you
| can drop at first like scroll up/down (as opposed to taking the
| slow approach of redrawing, which is enough for a first
| approximation). You need very little to do the bare minimum
| graphical output.
| norir wrote:
| Exactly. So start there and build up.
| winstonrc wrote:
| I built a fake terminal on my website[0]. I've been planning on
| building an actual one that is compiled to WASM, but it was fun
| building the little features such as a memory of entered commands
| that can be navigated by pressing up and down with the arrow
| keys. This looks like a great resource for me to take it to the
| next level. Are there any concerns I should be aware of if I were
| to deploy a working terminal on a website?
|
| [0] https://www.winstoncooke.com/terminal
| c-smile wrote:
| Just in case, Sciter has built-in element <terminal> that can be
| used for various purposes.
|
| Escape codes are supported, see:
| https://sciter.com/wp-content/uploads/2022/10/terminal.png
|
| Docs/API:
| https://docs.sciter.com/docs/behaviors/behavior-terminal
___________________________________________________________________
(page generated 2023-11-10 23:00 UTC)