[HN Gopher] Preventing Flash of Incomplete Markdown when streaming AI responses
___________________________________________________________________
Preventing Flash of Incomplete Markdown when streaming AI responses
Author : biot
Score : 26 points
Date : 2025-06-04 17:06 UTC (5 hours ago)
(HTM) web link (engineering.streak.com)
(TXT) w3m dump (engineering.streak.com)
| sim7c00 wrote:
| Fun read. It's weird interacting with ChatGPT around markdown
| sometimes.
|
| It formats its own output with markdown, so if I ask it for
| markdown but don't explicitly request a downloadable file, it
| produces valid markdown up to the point where it conflicts with
| its own formatting, and then the output gets choppy and chunked.
|
| "It's an issue with your prompting" is what I'm sure some
| customer service rep would be told to tell me :p because there's
| money to be made in courses on prompting skills, perhaps, idk.
| (Cynical view.)
|
| Sure is enjoyable to struggle together with the AI to format its
| responses correctly :'D
| porridgeraisin wrote:
| You can ask it to put the markdown in a code block. That works
| well for me. It also works with LaTeX.
| kherud wrote:
| Is there a general solution to this problem? I assume you can
| only start buffering tokens once you see a construct with
| continuations that, once completed, would cause the preceding
| text to be rendered differently. Of course, you don't want to
| keep buffering for too long, since that would defeat the purpose
| of streaming, and you never know whether the potential construct
| will actually be completed. The solution probably also has to be
| context sensitive: within code blocks, for example, you never
| want to render []() constructs as links.
|
| EDIT: One library I found is
| https://github.com/thetarnav/streaming-markdown which seems to
| combine incremental parsing with optimistic rendering; that
| works well enough in practice, I guess.
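|
| Roughly, the optimistic approach might look like this (my own
| sketch of the idea, not streaming-markdown's actual API, and
| the names are made up). Committed text is final and rendered as
| markdown; the optimistic tail is shown as plain text and
| replaced once the construct resolves:
|
|   type View = { committed: string; optimistic: string };
|
|   // Feed chunks in; after each push, render `committed` as
|   // markdown and `optimistic` as plain text. Simplified: it
|   // only tracks "[...](...)" links and ignores nesting.
|   function makeStreamView() {
|     let committed = "";
|     let tail = ""; // text from an unmatched "[" onward
|     return {
|       push(chunk: string): View {
|         for (const ch of chunk) {
|           if (tail === "") {
|             if (ch === "[") tail = ch; // a link may be starting
|             else committed += ch;      // safe to commit
|           } else {
|             tail += ch;
|             // ")" completes the link; "\n" rules one out.
|             if (ch === ")" || ch === "\n") {
|               committed += tail;
|               tail = "";
|             }
|           }
|         }
|         return { committed, optimistic: tail };
|       },
|       flush(): View { // stream ended: commit what's left
|         committed += tail;
|         tail = "";
|         return { committed, optimistic: "" };
|       },
|     };
|   }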
| biot wrote:
| There are a few things in our implementation that make a more
| general solution unnecessary. First, we only need the output to
| support a limited set of markdown: typically text, bullet
| points, and links. So we don't need code blocks (yet).
|
| Second (and not mentioned in the post), we are not rendering
| the markdown to HTML on the server, so []() markdown is sent to
| the client as []() markdown, not converted into
| <a href=...>. So even if a []()-style link exists in a code
| block, that text will still be sent to the client as []() text,
| just in a single chunk and perhaps with the link URL replaced.
| The client has its own library to render the markdown to HTML
| in React.
|
| Also, the answers are typically short, so even if OpenAI
| outputs some malformed markdown links, the worst case is that
| we end up buffering more than we need to and the user
| experiences a pause, after which the entire response is visible
| at once (the last step is to flush any buffered text to the
| client).
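|
| A rough sketch of that split (renderMarkdown here stands in for
| the client's own markdown library, not a real API):
|
|   // Client side: accumulate the raw markdown chunks the server
|   // forwards, and re-render the full text each time.
|   declare function renderMarkdown(md: string): string;
|
|   function makeClientView(el: { innerHTML: string }) {
|     let md = "";
|     return (chunk: string) => {
|       md += chunk;                       // []() arrives as-is
|       el.innerHTML = renderMarkdown(md); // re-render everything
|     };
|   }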
| kristopolous wrote:
| This exact problem is why I wrote Streamdown:
| https://github.com/day50-dev/Streamdown
|
| Almost every model has a slightly, but meaningfully, different
| opinion on what markdown is and how creative it can be with it.
|
| Doing it well is a non-trivial problem.
| woah wrote:
| Could this result in edge cases with [ where, due to
| misformatting or intentional syntax that looks like the start
| of a markdown link, the entire response is hidden from the
| user?
|
| (This comment, when subjected to this processing, could look
| like: "Could this result in edge cases with ")
| biot wrote:
| If you buffer starting with the ( character, then you'd still
| send the [text] part of the link, and the worst case, with no
| matching ) character to close the link, is that you end up
| buffering the remainder of the response. Even then, the last
| step is "flush any buffered text to the client", so the
| remainder of the response is eventually transmitted in a single
| chunk.
|
| There are some easy wins that could improve this further: line
| endings within links are generally not valid markdown, so if
| the code ever sees a \n, it can just flush the buffered text to
| the client and reset the state to TEXT.
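|
| Sketched in TypeScript (simplified, and not our actual
| implementation), those rules might look like:
|
|   // Pass text through until "](" is seen, then buffer the URL
|   // part. Flush on ")" (link complete), on "\n" (links don't
|   // span lines), or when the stream ends.
|   function makeLinkBuffer(send: (s: string) => void) {
|     let state: "TEXT" | "URL" = "TEXT";
|     let buf = "";
|     let prev = "";
|     return {
|       push(chunk: string) {
|         for (const ch of chunk) {
|           if (state === "TEXT") {
|             if (ch === "(" && prev === "]") {
|               state = "URL"; // possible link URL: buffer it
|               buf = ch;
|             } else {
|               send(ch); // in practice you'd batch, not send chars
|             }
|           } else {
|             buf += ch;
|             if (ch === ")" || ch === "\n") { // complete or invalid
|               send(buf);
|               buf = "";
|               state = "TEXT";
|             }
|           }
|           prev = ch;
|         }
|       },
|       end() { // last step: flush any buffered text
|         if (buf) send(buf);
|         buf = "";
|         state = "TEXT";
|       },
|     };
|   }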
| impure wrote:
| I do something like this too because links in emails are insanely
| long. It's worse in marketing emails. So I shorten the links to
| save on tokens and expand them again when I get the response back
| from the LLM.
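|
| Roughly (a simplified sketch of what I do; the placeholder
| scheme here is made up, and it assumes the model echoes the
| short tokens back intact):
|
|   const urlRe = /https?:\/\/\S+/g;
|
|   // Replace each long URL with a short stable token before
|   // prompting, remembering the mapping for later.
|   function shortenLinks(text: string) {
|     const map = new Map<string, string>();
|     let i = 0;
|     const shortened = text.replace(urlRe, (url) => {
|       const token = `https://x/${i++}`;
|       map.set(token, url);
|       return token;
|     });
|     return { shortened, map };
|   }
|
|   // Expand the tokens in the model's response back into the
|   // original URLs.
|   function expandLinks(text: string, map: Map<string, string>) {
|     let out = text;
|     for (const [token, url] of map) out = out.split(token).join(url);
|     return out;
|   }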
___________________________________________________________________
(page generated 2025-06-04 23:01 UTC)