[HN Gopher] Preventing Flash of Incomplete Markdown when streaming AI responses
       ___________________________________________________________________
        
       Preventing Flash of Incomplete Markdown when streaming AI responses
        
       Author : biot
       Score  : 26 points
       Date   : 2025-06-04 17:06 UTC (5 hours ago)
        
 (HTM) web link (engineering.streak.com)
 (TXT) w3m dump (engineering.streak.com)
        
        | sim7c00 wrote:
        | Fun read. It's weird interacting with ChatGPT around markdown
        | sometimes.
        | 
        | It formats its own output with markdown, so if I ask it for
        | markdown but don't explicitly specify a downloadable file, it
        | will produce valid markdown up to the point where it conflicts
        | with its own formatting, and then the output gets choppy and
        | chunked.
        | 
        | "It's an issue with your prompting" is what I'm sure some
        | customer service rep would be told to tell me :p because
        | there's money to be made in courses on prompting skills,
        | perhaps, idk. (Cynical view.)
        | 
        | It sure is enjoyable to struggle together with the AI to
        | format its responses correctly :'D
        
          | porridgeraisin wrote:
          | You can ask it to put the markdown in a code block. That
          | works well for me. It also works with LaTeX.
        
        | kherud wrote:
        | Is there a general solution to this problem? I assume you can
        | only start buffering tokens once you see a construct that has
        | continuations which, once completed, would cause the preceding
        | text to be rendered differently. Of course, you don't want to
        | keep buffering for too long, since that would defeat the
        | purpose of streaming. And you never know whether the potential
        | construct will actually be completed. The solution probably
        | also has to be context sensitive: within code blocks, for
        | example, you'll never want to render links for []() constructs.
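        | 
        | To sketch what I mean by context sensitivity (hypothetical
        | code, not from any particular library): before treating [ as
        | a potential link start, track whether you're inside a fenced
        | code block and skip buffering there.
        | 
        |     type Ctx = { inCodeBlock: boolean; ticks: number };
        | 
        |     // Feed characters one at a time; returns true only when
        |     // a "[" should start buffering, i.e. we are outside any
        |     // fenced code block.
        |     function isLinkStart(ch: string, ctx: Ctx): boolean {
        |       if (ch === "`") {
        |         ctx.ticks += 1;
        |         if (ctx.ticks === 3) {
        |           // Saw ``` -> toggle fence state.
        |           ctx.inCodeBlock = !ctx.inCodeBlock;
        |           ctx.ticks = 0;
        |         }
        |       } else {
        |         ctx.ticks = 0;
        |       }
        |       return ch === "[" && !ctx.inCodeBlock;
        |     }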
       | 
        | EDIT: One library I found is
        | https://github.com/thetarnav/streaming-markdown which seems to
        | combine incremental parsing with optimistic rendering, which
        | works well enough in practice, I guess.
        
          | biot wrote:
          | There are a few things in our implementation that make a
          | more general solution unnecessary. We only need the output
          | to support a limited set of markdown: typically text, bullet
          | points, and links. So we don't need code blocks (yet).
          | 
          | The second thing (not mentioned in the post) is that we are
          | not rendering the markdown to HTML on the server, so []()
          | markdown is sent to the client as []() markdown, not
          | converted into <a href=...>. So even if a []()-style link
          | exists inside a code block, that text will still be sent to
          | the client as []() text, only in a single chunk and perhaps
          | with the link URL replaced. The client has its own library
          | to render the markdown to HTML in React.
          | 
          | Also, the answers are typically short, so even if OpenAI
          | outputs some malformed markdown links, the worst case is
          | that we buffer more than we need to and the user experiences
          | a pause, after which the entire response is visible at once
          | (the last step is to flush any buffered text to the client).
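          | 
          | A minimal sketch of the buffering idea (not our production
          | code; the names are made up): emit text immediately, start
          | buffering on [, and release the buffer as one chunk when the
          | link closes or turns out not to be a link.
          | 
          |     type State = "TEXT" | "LINK_TEXT" | "LINK_URL";
          | 
          |     class LinkBuffer {
          |       private state: State = "TEXT";
          |       private buffer = "";
          | 
          |       constructor(private emit: (s: string) => void) {}
          | 
          |       write(chunk: string): void {
          |         for (const ch of chunk) {
          |           if (this.state === "TEXT") {
          |             if (ch === "[") {
          |               this.state = "LINK_TEXT";
          |               this.buffer = "[";
          |             } else {
          |               this.emit(ch);
          |             }
          |           } else if (this.state === "LINK_TEXT") {
          |             this.buffer += ch;
          |             if (ch === "]") this.state = "LINK_URL";
          |           } else {
          |             // LINK_URL: "(" must follow "]", or it
          |             // wasn't a link after all.
          |             if (this.buffer.endsWith("]") && ch !== "(") {
          |               this.flush();
          |               this.emit(ch);  // ignores the ch === "[" edge case
          |               continue;
          |             }
          |             this.buffer += ch;
          |             // Complete link: release it as a single chunk.
          |             if (ch === ")") this.flush();
          |           }
          |         }
          |       }
          | 
          |       private flush(): void {
          |         if (this.buffer) this.emit(this.buffer);
          |         this.buffer = "";
          |         this.state = "TEXT";
          |       }
          | 
          |       // Last step of the stream: flush any buffered text.
          |       end(): void {
          |         this.flush();
          |       }
          |     }
          | 
          | Feed it each streamed delta via write() and call end() when
          | the stream closes; that gives the "flush whatever is left"
          | behavior described above.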
        
          | kristopolous wrote:
          | This exact problem is why I wrote Streamdown:
          | https://github.com/day50-dev/Streamdown
          | 
          | Almost every model has a slightly but meaningfully different
          | opinion on what markdown is and how creative it can be with
          | it.
          | 
          | Doing it well is a non-trivial problem.
        
        | woah wrote:
        | Could this result in edge cases with [ where, due to some
        | misformatting or intentional syntax that looks like the start
        | of a markdown link, the entire response is hidden from the
        | user?
        | 
        | (This comment, when subjected to this processing, could look
        | like: "Could this result in edge cases with ")
        
          | biot wrote:
          | If you buffer starting at the ( character, then you'd still
          | send the [text] part of the link, and the worst case is
          | that, with no matching ) character to close the link, you
          | end up buffering the remainder of the response. Even then,
          | the last step is "flush any buffered text to the client",
          | so the remainder of the response will eventually be
          | transmitted in a single chunk.
          | 
          | There are some easy wins that could improve this further:
          | line endings within links are generally not valid markdown,
          | so if the code ever sees \n, just flush the buffered text
          | to the client and reset the state to TEXT.
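          | 
          | In terms of a character-by-character state machine, that's
          | a small guard at the top of the loop (hypothetical code):
          | 
          |     // Newlines never belong inside a link, so give up on
          |     // buffering: emit the buffer and reset the state to
          |     // TEXT, letting the \n itself pass through as text.
          |     if (ch === "\n" && this.state !== "TEXT") {
          |       this.flush();
          |     }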
        
       | impure wrote:
       | I do something like this too because links in emails are insanely
       | long. It's worse in marketing emails. So I shorten the links to
       | save on tokens and expand them again when I get the response back
       | from the LLM.
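        | 
        | Roughly like this (hypothetical helpers, and the placeholder
        | domain is made up): replace each long URL with a short
        | numbered stand-in before prompting, then restore them in the
        | response.
        | 
        |     // Swap long URLs for short placeholders to save tokens.
        |     function shortenLinks(text: string): {
        |       text: string; map: Map<string, string>;
        |     } {
        |       const map = new Map<string, string>();
        |       let n = 0;
        |       const shortened = text.replace(/https?:\/\/\S+/g,
        |         (url) => {
        |           const key = `https://l.ink/${n++}`;  // made-up domain
        |           map.set(key, url);
        |           return key;
        |         });
        |       return { text: shortened, map };
        |     }
        | 
        |     // Expand the placeholders in the LLM's response.
        |     function expandLinks(
        |       text: string, map: Map<string, string>
        |     ): string {
        |       let out = text;
        |       for (const [key, url] of map) {
        |         out = out.split(key).join(url);
        |       }
        |       return out;
        |     }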
        
       ___________________________________________________________________
       (page generated 2025-06-04 23:01 UTC)