[HN Gopher] ht: Headless Terminal
___________________________________________________________________
ht: Headless Terminal
Author : tosh
Score : 169 points
Date : 2024-06-02 07:53 UTC (2 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| m1keil wrote:
| That's pretty interesting. What would be an example use case of
| this?
| npace12 wrote:
| could be useful for llm tools using the command line
| andyk wrote:
| Yep this was my main reason for wanting it, though lots of
| other good ideas in these HN comments.
| m1keil wrote:
| Ah, that makes sense, cheers.
| telotortium wrote:
| I'm guessing it's mostly useful for writing a terminal emulator
| in the browser. In theory I think all input and output on the
| terminal is done via text, which may contain control codes, so
| you'd still have to write code to render the text, support
| mouse input and selection, etc.
| guyman50 wrote:
| It looks like the use case is if you want to script the usage
| of a tui program. In most cases it would be best to script the
| operation yourself with sed rather than ht/nano. I could see
| this being useful for scripting internal tui tools without
| access to the source code.
| craftkiller wrote:
| My first thought was using it as a compatibility layer for
| VT100-style CLI programs. Hypothetically, if we wanted to
| finally replace VT100 emulation and move to some new legacy-
| free protocol for terminals, we would need some sort of shim
| for running legacy VT100 programs and displaying them in our
| shiny new VT-next terminal. Similar to the `vt` program from
| plan 9.
| andyk wrote:
| I shared the motivating use case for why Marcin and I built
| this (LLM agents using terminals) in a diff comment but I'll
| also expand the readme to give examples of use cases.
| pzmarzly wrote:
| I think it is a really good idea to separate vt100 emulation
| "backend" from the UI. Then all terminal emulators could use a
| common implementation and just focus on displaying the text,
| instead of emulating quirks of 50-year old devices.
|
| Using JSON as RPC also seems like a good idea, especially when
| SSH-ing in, as the language server protocol has shown.
|
| That being said, I don't see this project doing anything (yet?)
| about escape sequences (for colors, clickable links, mouse and
| clipboard integrations, setting title and cwd, etc) and handling
| shortcuts (e.g. translating ctrl-c to sigint). There are very few
| terminal emulators that get all of that right (I think Kitty,
| iTerm, WezTerm), so it would be great if this project could lead
| towards more of them being written.
| mananaysiempre wrote:
| Is there an actually convincing simple protocol for a 2D
| character buffer? (As opposed to a jumped-up typescript like
| the DEC terminals.) I've tried looking at the IBM tradition,
| and while it's certainly _different_ , I wouldn't say it's
| better on this particular point.
|
| (There is also the part where a VT100-style display is a bit
| more than just a character buffer--there's rewrapping on
| resize, for example, which IIUC can be on for some lines and off
| for others--but let's assume we're willing to postpone updating
| window contents until the resize is concluded so can afford to
| do that on the application/VT100-processor side. Even though
| that feels like 1995.)
| ku1ik wrote:
| Most sequences used by CLI apps that are supported by popular
| (widely used) terminal emulators are supported here, i.e.
| there's good compatibility with VT100/VT220/VT520/etc.
|
| At the moment there's no support for mouse and clipboard - ht
| uses asciinema's avt for terminal emulation, where this was
| never needed (although may be added to avt if ht will need it).
|
| Regarding the colors: internally there's full support for
| standard indexed colors (palette) and RGB. The "getView" call
| currently returns a plain text version of the screen buffer,
| stripped of all color information (plain text is what andyk's
| headlong project, which uses ht, needs), but we can easily make
| it return color attributes for each cell or segment of text
| (either by modifying "getView", or adding complementary
| "getRichView" or sth like that).
| Y_Y wrote:
| How does this relate to classic tools like `expect`?
| yjftsjthsd-h wrote:
| For one, this looks like a client/server model instead of one
| program directly controlling another.
| blueflow wrote:
| Also there is script(1).
| Zambyte wrote:
| I don't think you can reasonably use expect with TUIs, but you
| can use this with TUIs.
| ComputerGuru wrote:
| fish-shell does for its integration tests, and yes, it's
| brutal :)
| andyk wrote:
| I tried to contrast to `expect` in a couple of my other
| responses, but yeah this is my sense too after looking
| briefly at `expect` - that ht always transparently sets up a
| terminal for you under the hood and you interact with that so
| you can always grab a screenshot of any terminal UI.
|
| I don't think `expect` is targeted at this use case (though I
| am only learning about `expect` right now so could be wrong)
| xuhu wrote:
| "getView command allows obtaining a textual view of a terminal
| window."
| electric_mayhem wrote:
| First impression: nifty.
|
| Having sat with it a few minutes though, why would I choose this
| over the classic:
|
| https://core.tcl-lang.org/expect/index
|
| or any of its analogs such as:
| https://pkg.go.dev/github.com/google/goexpect
| https://pexpect.readthedocs.io/en/stable/
| https://www.rubydoc.info/gems/ruby_expect/1.6.0/RubyExpect/E...
|
| edit: ah. it preserves the UI. This has potential.
| andyk wrote:
| thanks for surfacing `expect` to our attention. I'll add a
| compare/contrast to the ht readme
| MobiusHorizons wrote:
| Reading the readme, I find myself wondering what problems this
| solves for. The example of wrapping nano strikes me as
| particularly odd, since editing files is already fairly easy to
| do programmatically either via direct file operations or with
| tools like sed. Aside from editors, most tools that offer a tui
| also expose the functionality for programmatic access (typically
| with some additional flags to the same binary). So it just
| strikes me as the wrong abstraction to interact with most things.
|
| The other potential goal I could imagine was automation. If that
| is the case, I would recommend the author to make that more clear
| in the examples and also to describe it in relation to `expect`,
| which would probably be my go to tool for such use cases.
|
| Whatever the case it does look like a fun project, even if I
| don't see any cases where it would be the right tool for the job.
| woodrowbarlow wrote:
| to me this seems essentially like 'screen/tmux without the
| multiplexing features' which is useful because most of us do
| 'terminal multiplexing' via our window manager and we're really
| just using screen because we want to detach the process from
| the terminal session (e.g. as a glorified nohup wrapper).
| another similar tool is `nq`.
| andrewshadura wrote:
| Try dtach
| mariocesar wrote:
| I had this issue of needing to control Docker containers in a
| VPS, without sharing access to the server itself. It seems like
| it will be easy to create a simple web service that can
| communicate with the ht API, list my containers, show me the
| stats, and restart containers if I want to. I can manage all
| security in the web service.
|
| This could be a nice case.
| shanemhansen wrote:
| That would work but wouldn't making http requests to the
| docker socket itself be a little easier?
|
| Example: curl --silent --unix-socket
| /run/user/1000/docker.sock http://v1.41/version
|
| From: https://dev.to/smac89/curl-to-docker-through-sockets-1mhe
|
| You could even do something like a reverse proxy to very
| limited paths although I tend to think that would ultimately
| be a bad idea and making your own http calls is probably
| better.
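
A minimal Python sketch of the same request as the curl example above, using only the standard library. The socket path is the one from the comment and will differ per machine; `UnixHTTPConnection` and `docker_version` are hypothetical helper names, not part of any Docker client library.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # The host is only used for the Host header; "localhost" is a placeholder.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        # Replace the default TCP connect with an AF_UNIX connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def docker_version(socket_path: str = "/run/user/1000/docker.sock") -> dict:
    """GET /version from the Docker daemon and decode the JSON body."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/version")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

A web service along these lines could expose only the handful of calls it needs and do its own authentication in front of them, which is roughly the "make your own http calls" option described above.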
| mbreese wrote:
| You probably don't want to expose that service to the
| internet...
|
| I see this as something like the console management for
| VPS. Back in the day, I remember reading about how
| prgmr.com had set up a console that you'd directly SSH into.
| That's now this interface [1] (and a company name change),
| but I could see how programmatically working with this
| would be helpful.
|
| [1] https://tornadovps.com/documentation/vps-console
| johnmaguire wrote:
| The comment you're replying to mounted the Docker daemon
| as a local socket, accessible only on the machine. (It
| exposes an HTTP server still.)
|
| I don't see why one would be any more comfortable
| exposing a shell to the internet than the Docker daemon.
| It grants _more_ capabilities. Either should likely be
| protected by authentication.
| mbreese wrote:
| My understanding was that there was a server running
| Docker containers where the admin wanted to allow others
| to control/start containers without giving them access to
| the machine through a local login. The idea proposed was
| to make the docker port accessible to the outside world
| (authenticated, somehow).
|
| I'm not sure I'd want to expose the Docker port to the
| outside world (or outside of a strictly firewalled
| subnet). Even if it is wrapped, this seems too dangerous
| to me.
|
| The service I talked about is not a shell. It's a command
| line program that operates as the shell when you login
| via SSH. Instead of bash/zsh/etc, this program runs
| instead. The purpose is to give the VM admin access for
| out-of-band management (serial console, reinstalling the
| OS, etc). I'm a big fan of this approach, where you don't
| necessarily get full access to the host, but you do have
| enough access to do the work that's needed (and still SSH
| encrypted). No more, no less. To me, this seems like a
| great approach for something like restricted VPS or
| container admin.
|
| I ended up doing something similar for an SSH jump box a
| few jobs back where you could set up some basic admin
| things (like uploading SSH keys) using a CLI program that
| was used as an SSH shell.
|
| To bring it back to the original post -- like others, I
| had a hard time seeing what the OP could be used for.
| Until I thought about this OOB CLI. It would be great for
| scripting access to something like this.
| shanemhansen wrote:
| I think I was very unclear. I didn't mount anything, the
| docker daemon by default is accessible over http (over a
| unix domain socket rather than the typical TCP).
|
| I was proposing that this person's web app not do any sort
| of subprocess automation via something like ht, and
| instead take in requests and talk to the docker daemon on
| the client's behalf, since that allows any sort of
| authentication or filtering that needs to happen.
|
| I wasn't really seriously proposing the straight reverse
| proxy setup. That's one of those layer violations like
| PostgREST that is either genius or lunacy. I haven't
| figured out which one.
| colinsane wrote:
| here's a terrible script which runs as root on all my boxes (as
| `redirect-tty /dev/tty1 unl0kr`):
| https://git.uninsane.org/colin/nix-files/src/commit/9189f18c...
|
| none of the Linux greeters meet all my needs, so i fall back to
| `login`. but i still need a graphical program for actually
| entering in my password -- particularly because some of my
| devices don't have a physical keyboard (i.e. my phone). so i
| take the output of a framebuffer-capable on-screen-keyboard [1]
| and pipe that into `login`. but try actually doing that. try
| `cat mypassword.txt | login MobiusHorizons`. it doesn't work:
| `login` does some things on its stdin which only work on vtty.
| so instead i run login on /dev/tty1, and pipe the password into
| /dev/tty1 for the auth.
|
| yes, this solution is terrible. a lot of things would make it
| less terrible. i could fix one of the greeters to work the way
| i need it (tried that). i could patch `login` (where it
| probably won't ever be upstreamed). i could integrate the OSK
| into the same input system the ttys use... or i could reach for
| `ht`. everything except the last one is a day or more of work.
|
| 1:
| https://gitlab.com/postmarketOS/buffybox/-/tree/master/unl0k...
| hitchstory wrote:
| I wrote an integration testing framework which I wanted to
| integrate with a tool _exactly_ like this so it could be used
| to, e.g. test a command line app like vim.
|
| Expect is what I tried to integrate with first. It falls over
| quite quickly with any kind of app that does anything mildly
| complicated with the terminal.
| andyk wrote:
| Interesting. When we decided to build ht we didn't compare it
| to expect (which I hadn't heard of or used) but I'm comparing
| the two now as they seem related.
|
| How exactly did `expect` fall over?
|
| From what I can tell, expect does not provide the
| functionality of a stateful terminal server/client under the
| hood for you so it isn't as easy to grab "text" screenshots
| of a Terminal User Interface, which is one of the main
| motivations behind ht (will update the readme to make this
| main use-case more clear)
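
A toy illustration of why a "text screenshot" needs a stateful terminal rather than matching on the raw output stream. This is a sketch under heavy assumptions: it interprets only carriage return, newline, and two CSI sequences (`H` for cursor position, `K` for erase to end of line). It is not how ht or avt are implemented, and `Screen` is a hypothetical class.

```python
import re

# CSI sequences look like ESC [ <params> <final letter>.
CSI = re.compile(r"\x1b\[([0-9;]*)([A-Za-z])")


class Screen:
    def __init__(self, cols: int = 20, rows: int = 4):
        self.cols, self.rows = cols, rows
        self.cells = [[" "] * cols for _ in range(rows)]
        self.row = self.col = 0

    def feed(self, data: str) -> None:
        i = 0
        while i < len(data):
            m = CSI.match(data, i)
            if m:
                self._csi(m.group(1), m.group(2))
                i = m.end()
                continue
            ch = data[i]
            i += 1
            if ch == "\r":
                self.col = 0
            elif ch == "\n":
                self.row = min(self.row + 1, self.rows - 1)
            elif ch.isprintable() and self.col < self.cols:
                self.cells[self.row][self.col] = ch
                self.col += 1

    def _csi(self, params: str, final: str) -> None:
        if final == "H":  # cursor position, 1-based "row;col"
            nums = [int(p) if p else 1 for p in params.split(";")] + [1]
            self.row = min(max(nums[0], 1), self.rows) - 1
            self.col = min(max(nums[1], 1), self.cols) - 1
        elif final == "K" and params in ("", "0"):  # erase to end of line
            for c in range(self.col, self.cols):
                self.cells[self.row][c] = " "
        # Every other sequence is silently ignored in this sketch.

    def view(self) -> str:
        return "\n".join("".join(r).rstrip() for r in self.cells)


screen = Screen()
# The program overwrites its first line, then jumps to row 3 to draw a footer.
screen.feed("count: 1\r\x1b[Kcount: 2\x1b[3;1H[q]uit")
print(screen.view())
```

The raw stream still contains "count: 1", but the view shows only what a person would see on screen, which is the kind of snapshot the comment above is describing.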
| andyk wrote:
| Hey, project lead here. I had a very specific use case in mind:
| I'm playing with using LLM agent frameworks for software
| engineering - like MemGPT, swe-agent, Langchain and my own
| hobby project called headlong
| (https://github.com/andyk/headlong). Headlong is focused on
| making it easy for a human to edit the thought history of an
| agent via a webapp. The longer term goal of headlong is
| collecting large-ish human curated datasets that intermix
| actions/observations/inner-thoughts and then use those data to
| fine-tune models to see if we can improve their reasoning.
|
| While working on headlong I tried out and implemented a variety
| of 'tools' (i.e., functions) like editFile(), findFile(),
| sendText(), checkTime(), searchWeb(), etc., which the agents
| call using LLM function calling.
|
| A bunch of these ended up being functions that interacted with
| an underlying terminal. This is similar to how swe-agent works
| actually.
|
| But I figured instead of writing a bunch of functions that sit
| between the LLM and the terminal, maybe let the LLM use a
| terminal more like a human does, i.e., by "typing" input into
| it and looking at snapshots of the current state of it. Needed
| a way to get those stateful text snapshots though.
|
| I first tried using tmux and also looked to see if any existing
| libs provide the same functionality. Couldn't find anything so
| teamed up with Marcin to design and make ht.
|
| playing with the agent using the terminal directly has evolved
| into a hypothesis that I've been exploring: the terminal may be
| the "one tool to rule them all" - i.e., if an agent learns to
| use a terminal well it can do most of what humans do on our
| computers. Or maybe terminal + browser are the "two tools to
| rule them all"?
|
| Not sure how useful ht will be for other use cases, but maybe!
| MobiusHorizons wrote:
| This makes a lot of sense. I would call that out, because
| it's really surprising out of context. Hopefully you can see
| how unusual it would be to try to use human interfaces from
| code for which in at least the majority of cases, there are
| programmatic interfaces for each task that already exist, and
| would be much less bug prone / finicky. I guess the analogy
| would be choosing to use Webdriver to interact with a service
| for which there is already an API.
| wolrah wrote:
| The immediate thought I had upon reading the description was
| "this would be great for Minecraft servers".
|
| Most of us running Minecraft servers on Linux have it wrapped
| in screen or tmux because the CLI is the only way to issue
| certain commands including stopping it properly.
|
| This could provide an alternative.
| tstack wrote:
| As others have kinda alluded to, it could be useful for testing
| TUI applications. I develop a logfile viewer for the terminal
| (https://lnav.org) and have a similar application[1] for
| testing, but it's a bit flaky. It produces/checks snapshots
| like [2]. I think the problems I run into are more around
| different versions of ncurses producing slightly different
| outputs.
|
| [1] - https://github.com/tstack/lnav/blob/master/test/scripty.cc
| [2] - https://github.com/tstack/lnav/blob/master/test/tui-captures...
| hexsprite wrote:
| Seems like it could be useful for e2e testing of command line
| applications
| cpendery wrote:
| You could always try https://github.com/microsoft/tui-test too.
| It could still use some more polishing on my part though
| lostmsu wrote:
| > npm i -D @microsoft/tui-test
| pama wrote:
| This could simplify RL on ncurses codes.
| blueflow wrote:
| What is RL referring to?
| theblazehen wrote:
| Potentially readline?
| https://en.wikipedia.org/wiki/GNU_Readline
| throwanem wrote:
| More likely reinforcement learning, I think.
| jauntywundrkind wrote:
| Combined with nohup, this could probably be useful for
| detaching/reattaching to long running processes across user
| sessions?
|
| I'm back to using tmux again, but for a while I was using another
| program dtach, to start vim sessions that I could disconnect and
| reattach to. Inside neovim I'd have a bunch of terminals &
| buffers & what not, so it felt redundant having tmux also hosting
| an even higher level environment.
|
| Dtach is super super lightweight. Tmux is keeping copies of each
| screen in memory, is doing some reprocessing. Dtach is basically
| a pipe wrapping a program's input and output, data in data
| out.
|
| I even wrote a little shell script to let me very quickly 'dta
| my-proje' which will go autocomplete my-project name and either
| open the existing dtach session in that project or go create a
| dtach session with vim in it.
| https://github.com/jauntywunderkind/dtachment/blob/master/dt...
|
| It would be interesting to see something like dtach used for
| automation or scripting, as it seems targeted for. The idea of
| being able to relay around the input and output feels like it
| should have some neat uses. There isn't really a protocol, afaik,
| and there's definitely no retained state for a getView like ht.
| But it should in many ways function similarly?
| nxobject wrote:
| Every legacy IBM mainframe application from the 60s with a web
| interface would like to have a word... s/IBM 3270/VT100.
| broknbottle wrote:
| Hmm this may have some use cases for macOS automation where
| there's no MDM.
| sigmonsays wrote:
| i don't see any way of actually typing key presses, like
| modifiers.
|
| This project looks pretty interesting, maybe i'm missing
| something.
| andyk wrote:
| Include in your input json the ascii control character that the
| keyboard combo would generate (e.g., \x03 for ctrl-c).
|
| To send control-c to the terminal, for example, you'd send the
| following JSON message to ht:
|
| { "type": "input", "payload": "\x03" }
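
A small sketch of building that message from Python. The `type` and `payload` field names are taken from the comment above; `input_message` is a hypothetical helper. One thing worth knowing: strict JSON has no `\x` escape, so a conforming encoder writes the control character as `\u0003`. Whether ht tolerates other spellings isn't stated in this thread.

```python
import json


def input_message(payload: str) -> str:
    """Serialize an input message with the field names used in the comment above."""
    return json.dumps({"type": "input", "payload": payload})


# In Python source, "\x03" is the single control character ETX (ctrl-c).
ctrl_c = input_message("\x03")
print(ctrl_c)

# Typing a command followed by Enter works the same way.
print(input_message("ls\r"))
```

Decoding `ctrl_c` with any JSON parser gives back the one-byte string, so the receiving side sees the same control character either way.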
| ku1ik wrote:
| To expand on what andyk wrote in the sibling comment:
|
| Programs running in a terminal don't get individual components
| of composite key presses such as ctrl+a, shift+b, so they don't
| see "a with ctrl modifier" or "b with shift modifier". The
| modifier keys are handled by terminal emulator before sending
| the key's ascii value to the program, modifying the regular
| ascii letters appropriately. So when "a" (ascii value 0x61) is
| pressed while holding shift, its ascii value is ... shifted
| (down) by a constant 0x20, making it ascii 0x41, which
| represents "A". Similar with ctrl key, which shifts down the
| ascii value by 0x60, turning "a" into 0x01. So to send "ctrl+d"
| you send input with a single byte of value 0x04 ("d" ascii 0x64
| minus 0x60). ht uses PTY under the hood, and this is how you
| send keyboard input into a program via a PTY. This is kinda
| low level though, and there's definitely a possibility of
| implementing a high level input method in ht, which would parse
| a string such as "<ctrl+d>" and automatically turn it into 0x04
| before sending it to the process.
|
| In other words, the way input in ht works right now was the
| easiest, simplest way of implementing this to get it out the
| door.
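
The arithmetic above can be sketched as a tiny translator. This is only an illustration of the idea, not ht's input method: `encode_keys` is a hypothetical name, the `<shift+a>` form is an assumption added alongside the `<ctrl+d>` form mentioned in the comment, and only lowercase ASCII letters are handled.

```python
import re

# Matches specs such as "<ctrl+d>" or "<shift+a>" (lowercase letters only).
KEY_SPEC = re.compile(r"<(ctrl|shift)\+([a-z])>")


def encode_keys(text: str) -> bytes:
    """Replace key specs with the bytes a terminal would send to the program."""

    def replace(match: re.Match) -> str:
        modifier, letter = match.group(1), match.group(2)
        # ctrl shifts the ASCII value down by 0x60, shift by 0x20.
        offset = 0x60 if modifier == "ctrl" else 0x20
        return chr(ord(letter) - offset)

    # Raises UnicodeEncodeError on non-ASCII input, which this sketch ignores.
    return KEY_SPEC.sub(replace, text).encode("ascii")


print(encode_keys("<ctrl+d>"))       # "d" is 0x64, minus 0x60 is 0x04
print(encode_keys("ls\r<ctrl+c>"))   # plain text passes through untouched
```

Real terminals send multi-byte escape sequences for arrow keys, function keys and so on, so a fuller version would need a lookup table rather than one subtraction.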
| rank0 wrote:
| Super cool! Also super difficult to secure if used server-side...
| remram wrote:
| I was wondering about this, is there a tmux API that is not
| command-line? A way to access the tmux socket directly?
|
| I would rather not multiply the number of tools I use, even
| though ht might be more appropriate in a clean-room environment.
| TickleSteve wrote:
| libtmux https://github.com/tmux-python/libtmux
| andyk wrote:
| Oh this is cool. I looked at using tmux before we built ht
| because I've used screen and tmux forever. I didn't find
| libtmux though. Will def check it out.
| remram wrote:
| This is a cool wrapper, but it calls tmux on the command
| line.
|
| I'm really confused as to why the tmux protocol can't be used
| directly, and entirely different systems like ht have to be
| created.
| jeroenjanssens wrote:
| I've created tmuxr, an R package to manage tmux [0].
|
| [0] https://github.com/jeroenjanssens/tmuxr
| no-dr-onboard wrote:
| seems really interesting for fuzzing.
| doubloon wrote:
| Would be awesome for ui testing.
| andyk wrote:
| andyk here. it's clear our readme is lacking use cases! adding
| some now. When we introduced ht on twitter I gave a little more
| context -- https://x.com/andykonwinski/status/1796589953205584234
| -- but that should have been in the project readme.
|
| Also a few people comparing to `expect`. I haven't used `expect`
| before, but it looks very cool. Their docs/readme seem only
| slightly more fleshed out than ours :-D Looks like the main way
| to use expect is via: spawn ... expect ...
| send ... expect ... etc.
|
| so, the expect syntax seems targeted more towards testing where
| you simultaneously get the output from the underlying binary and
| then check if it's what you expect (thus the name I guess). I
| can't see if there is a way to just get the current terminal
| "view" (aka text screenshot) via an expect command?
|
| ht is more geared towards scripting (or otherwise
| programmatically accessing) the terminal as a UI (aka Terminal
| UI). So ht always runs a terminal for you and gives you access to
| the current terminal state. Need to try out expect myself, but
| from what I can tell, it doesn't seem to always transparently run
| a Terminal for you.
|
| There might already be some other existing tool that overlaps
| with the ht functionality, but we couldn't find it when we looked
| around a bunch before building ht.
| m0shen wrote:
| `expect` is absolutely geared towards scripting, as it's an
| extension of TCL. Though as far as getting a "current terminal
| view" `expect` has `term_expect`:
| https://core.tcl-lang.org/expect/file?name=example/term_expe...
| metadat wrote:
| Expect is The Original Way, and has been the standard since
| before I learned to program more than 20 years ago. :-D
|
| Expect is also extra cool because of `autoexpect`:
| "generate an Expect script from observing a (shell) session"
|
| https://manpages.ubuntu.com/manpages/focal/en/man1/autoexpec...
___________________________________________________________________
(page generated 2024-06-04 23:00 UTC)