[HN Gopher] Progressive JSON
___________________________________________________________________
Progressive JSON
Author : kacesensitive
Score : 445 points
Date : 2025-06-01 00:58 UTC (22 hours ago)
(HTM) web link (overreacted.io)
(TXT) w3m dump (overreacted.io)
| behnamoh wrote:
| I think the pydantic library has something similar that involves
| validating streaming JSON from large language models.
| polyomino wrote:
| We encountered this problem when converting audio only LLM
| applications to visual + audio. The visuals would increase
| latency by a lot since they need to be parsed completely before
| displaying, whereas you can just play audio token by token and
| wait for the LLM to generate the next one while audio is playing.
| nixpulvis wrote:
| I've always liked the idea of putting latency requirements in to
| API specifications. Maybe that could help delimit what is and is
| not automatically inlined as the author proposes.
| xtajv wrote:
| Choose APIs that offer SLAs. <3
|
| It's not about being picky. It's about communicating needs, and
| setting boundaries that are designed to satisfy those needs
| _without_ overwhelming anybody's system to the point of
| saturation and degraded performance.
| nixpulvis wrote:
| Right, but what if some of that SLA information actually
| directed the code itself.
|
| In the context of this blog post, what if the SLA was <100ms
| for an initial response, with some mandatory fields, but then
| any additional information which happens to be loaded within
| that 100ms automatically is included. With anything outside
| the 100ms is automatically sent in a followup message?
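| A minimal sketch of that idea (names hypothetical, using Python's
| asyncio): resolve whatever field futures finish within the deadline,
| and report the rest as pending for a follow-up message:

```python
import asyncio

async def respond(field_coros, deadline=0.1):
    # Resolve as many fields as finish within `deadline`; everything
    # else is reported as pending for a follow-up message.
    tasks = {name: asyncio.ensure_future(coro)
             for name, coro in field_coros.items()}
    await asyncio.wait(tasks.values(), timeout=deadline)
    initial = {n: t.result() for n, t in tasks.items() if t.done()}
    followup = [n for n, t in tasks.items() if not t.done()]
    for n in followup:
        # a real server would keep these running and stream them later
        tasks[n].cancel()
    return initial, followup

async def main():
    async def fast():
        return "Welcome to my blog"
    async def slow():
        await asyncio.sleep(1)
        return ["first", "second", "third"]
    # header resolves within the 50ms budget, comments do not
    return await respond({"header": fast(), "comments": slow()},
                         deadline=0.05)
```

| Mandatory fields would simply be awaited unconditionally before the
| deadline logic kicks in.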
| uncomplete wrote:
| jsonl is json objects separated by newline characters. Used in
| Bedrock batch processing.
| markerz wrote:
| ndjson is extremely similar, Splunk uses it for exporting logs
| as json
| lucb1e wrote:
| From a quick lookup, aren't "newline-delimited json" and
| "json lines" identical? Different name for the same thing?
| Izkata wrote:
| Came up at work a few weeks ago when a co-worker used
| "ndjson" which I'd never heard of before, but I knew
| "jsonl" which he'd never heard of before: As far as I could
| tell with some searching, they are basically the same thing
| and have two different names because they came from two
| different places. "ndjson" was a full-on spec, while
| "jsonl" was more informal - kind of like an enterprise vs
| open source, that converged on the same idea.
|
| From wikipedia, "ndjson" used to include single-line
| comments with "//" and needed custom parsers for it, but
| the spec no longer includes it. So now they are the same.
| yencabulator wrote:
| ndjson has an actual spec (however bitrotted), everything
| else in that space makes rookie mistakes like not
| specifying that a newline is a required message terminator
| -- consider receiving "13\n42", is that truncated or not?
|
| https://github.com/ndjson/ndjson.github.io/issues/1#issueco
| m...
|
| None of the above is actually good enough to build on, so a
| thousand little slightly-different ad hoc protocols bloom.
| For example, is empty line a keepalive or an error? (This
| might be perfectly fine. They're trivial to program, not
| like you need a library.)
| xelxebar wrote:
| Very cool point, and it applies to any tree data in general.
|
| I like to represent tree data with parent, type, and data vectors
| along with a string table, so everything else is just small
| integers.
|
| Sending the string table and type info as upfront headers, we can
| follow with a stream of parent and data vector chunks, batched N
| nodes at a time. The depth- or breadth-first streaming becomes a
| choice of ordering on the vectors.
|
| I'm gonna have to play around with this! Might be a general way
| to get snappier load time UX on network bound applications.
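| A rough sketch of that vector representation (all names
| illustrative): the string table and type info go upfront, then the
| parent and data vectors can stream in chunks of any size:

```python
# string table and type table, sent upfront as headers
strings = ["Welcome", "first", "second"]
types   = ["post", "comment"]

# per-node vectors, streamable in chunks of N nodes; parent -1 = root
parents = [-1, 0, 0]   # nodes 1 and 2 hang off node 0
type_ix = [0, 1, 1]    # node 0 is a "post", nodes 1-2 are "comment"s
data_ix = [0, 1, 2]    # index into the string table

def children(node):
    # recover structure from the parent vector alone
    return [i for i, p in enumerate(parents) if p == node]
```

| Reordering the vectors is then all it takes to switch between
| depth-first and breadth-first delivery.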
| x-complexity wrote:
| ... It might be a pursuit worth making a small library for.
| thethimble wrote:
| You can even alternate between sending table and node chunks!
| This will effectively allow you to reveal the tree in any order
| including revealing children before parents as well as
| representing arbitrary graphs! Could lead to some interesting
| applications.
| xelxebar wrote:
| Good point! The parent vector rep is what allows arbitrary
| node order, but chunking the table data off chunks of node
| IDs is a brilliant idea. Cheers!
| dmkolobov wrote:
| If you send the tree in preorder traversal order with known
| depth, you can send the tree without node ids or parent ids!
| You can just send the level for each node and recover the tree
| structure with a stack.
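| That stack recovery can be sketched in a few lines (rebuild is a
| made-up name; the stream is just each node's depth, in preorder):

```python
def rebuild(levels):
    """Recover parent links from a preorder stream of depths,
    using a stack -- no node or parent ids on the wire."""
    parents = []
    stack = []  # stack[d] = index of the most recent node at depth d
    for i, depth in enumerate(levels):
        del stack[depth:]                 # pop back to this node's depth
        parents.append(stack[-1] if stack else -1)
        stack.append(i)
    return parents
```

| For the stream [0, 1, 2, 1] (root, child, grandchild, second child)
| this recovers the parent vector [-1, 0, 1, 0].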
| xelxebar wrote:
| Well the whole point is to use a breadth first order here. I
| don't think there's a depth vector analogue for breadth first
| traversals. Is there?
|
| But, indeed, depth vectors are nice and compact. I find them
| harder to work with most of the time, though, especially
| since insertions and deletions become O(n), compared to
| parent vector O(1).
|
| That said, I do often normalize my parent vectors into dfpo
| order at API boundaries, since a well-defined order makes
| certain operations, like finding leaf siblings, much nicer.
| ummonk wrote:
| I'm not familiar with depth vectors, but wouldn't the
| breadth first traversal analogue of each entry specifying
| its depth (in a depth first format) be each entry
| specifying the number of immediate children it has?
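| That does seem to work; a quick sketch (rebuild_bfs is a made-up
| name), where the stream is each node's immediate child count in
| breadth-first order:

```python
from collections import deque

def rebuild_bfs(child_counts):
    """Recover parent links from a breadth-first stream where each
    node carries only its number of immediate children."""
    parents = [-1]
    queue = deque([(0, child_counts[0])])  # (node index, child count)
    next_index = 1
    while queue and next_index < len(child_counts):
        node, count = queue.popleft()
        for _ in range(count):
            parents.append(node)
            queue.append((next_index, child_counts[next_index]))
            next_index += 1
    return parents
```

| [2, 1, 0, 0] (root with two children, first child with one) recovers
| the parent vector [-1, 0, 0, 1].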
| dmkolobov wrote:
| Yeah, it has its limits for sure. I like it for the
| streaming aspect.
|
| I think you can still have the functionality described in
| the article: you would send "hole" markers tagged with
| their level. Then, you could make additional requests when
| you encounter these markers during the recovery phase,
| possibly with buffering of holes. It becomes a sort of
| hybrid DFS/BFS approach where you send as much tree
| structure at a time as you want.
| aljow wrote:
| If it has to be mangled to such an extent to do this, then it
| seems reasonable to assume JSON is the wrong format for the task.
|
| Better to rethink it from scratch instead of trying to put a
| square peg in a round hole.
| danabramov wrote:
| I'm being a bit coy about it but the article aims to describe
| key ideas in the RSC wire protocol, which is an implementation
| detail of React and isn't actually beholden to JSON itself.
| JSON is just a nice starting point to motivate it. However, I
| think reusing JSON for object notation kind of makes sense (and
| allows native JSON.parse calls for large objects).
| KronisLV wrote:
| I feel like in an ideal world, this would start in the DB: your
| query referencing objects and in what order to return them (so
| not just a bunch of wide rows, nor multiple separate queries) and
| as the data arrives, the back end could then pass it on to the
| client.
| powgpu wrote:
| Many DBs have done that, 4d.com is one that comes to mind. It is
| kinda like socket.io + PostgreSQL + node/ruby/php (middleware
| layer) all in one. In DBs there is also the concept of a cursor, etc.
|
| Seems like it is never about the merit of technological design. As
| some CS professor put it, tech is more about fashion than tech
| nowadays. IMHO that is true, and it often also comes down to the
| technological context surrounding the industry at the time, and
| nowadays whether the code is open sourced/FOSS.
| pigbearpig wrote:
| That's not going to make the front page of HN though.
| Velorivox wrote:
| 99.9999%* of apps don't need anything nearly as 'fancy' as this,
| if resolving breadth-first is critical they can just make
| multiple calls (which can have very little overhead depending on
| how you do it).
|
| * I made it up - and by extension, the status quo is 'correct'.
| echelon wrote:
| We technically didn't need more than 640K either.
|
| Having progressive or partial reads would dramatically speed up
| applications, especially as we move into an era of WASM on the
| frontend.
|
| A proper binary encoded format like protobuf with support for
| partial reads and well defined streaming behavior for sub
| message payloads would be incredible.
|
| It puts more work on the engineer, but the improvement to UX
| could be massive.
| pigbearpig wrote:
| Sure, if you're the 0.0001% that need that. It's going to be
| over engineering for most cases. There are so many simpler
| and easier to support things that can be done before trying
| this sort of thing.
|
| Following the example, why is all the data in one giant
| request? Is the DB query efficient? Is the DB sized
| correctly? How about some caching? All boring, but I'd rather
| support and train someone on boring stuff.
| danabramov wrote:
| To be clear, I wouldn't suggest that anyone implement this
| manually in their app. I'm just describing at a high level
| how the RSC wire protocol works, but narratively I wrapped it
| in a "from first principles" invention because it's more
| fun to read. I don't necessarily try to sell you on using RSC
| either but I think it's handy to understand how some tools are
| designed, and sometimes people take ideas from different tools
| and remix them.
| conartist6 wrote:
| I'm already thinking about whether there are any ideas here I
| might take for CSTML -- designed as a streaming format for
| arbitrary data but particularly for parse trees
| Velorivox wrote:
| I get that. Originally my comment was a response to another
| but I decided to delete and repost it at the top level --
| however I failed to realize that not having that context
| makes the tone rather snarky and/or dismissive of the article
| as a whole, which I didn't intend.
| danabramov wrote:
| Np, fair enough!
| xtajv wrote:
| There's nothing wrong with "accidentally-overengineering" in
| the sense of having off-the-shelf options that are actually
| nice.
|
| There _is_ something wrong with adding a "fancy" feature to an
| off-the-shelf option, if said "fancy" feature is realistically
| "a complicated engineering question, for which we can offer a
| leaky abstraction that will ultimately trip up anybody who
| doesn't have the actual mechanics in mind when using it".
| motorest wrote:
| > There's nothing wrong with "accidentally-overengineering"
| in the sense of having off-the-shelf options that are
| actually nice.
|
| Your comment focuses on desired outcomes (i.e., "nice"
| things), but fails to acknowledge the reality of tradeoffs.
| Over engineering a solution always creates problems. Systems
| become harder to reason about, harder to maintain, harder to
| troubleshoot. For example, in JSON, arrays are ordered lists.
| If you onboard an overengineered tool that arbitrarily
| reorders elements in a JSON array, things can break in non-
| trivial ways. And they often do.
| neRok wrote:
| Multiple calls?! That sounds like n*n+1. Gross :P
|
| I think the issue with the example json is that it's sent in
| OOP+ORM style (ie nested objects), whereas you could just send
| it as rows of objects, something like this;
|
|     {
|       header: "Welcome to my blog",
|       post_content: "This is my article",
|       post_comments: [21, 29, 88],   # the numbers are the comment IDs
|       footer: "Hope you like it",
|       comments: {21: "first", 29: "second", 88: "third"}
|     }
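| Client-side, stitching those rows back into the nested shape the UI
| wants is a one-liner per relation; a sketch in Python:

```python
# the flat "rows of objects" response from above
response = {
    "header": "Welcome to my blog",
    "post_content": "This is my article",
    "post_comments": [21, 29, 88],
    "footer": "Hope you like it",
    "comments": {21: "first", 29: "second", 88: "third"},
}

# rebuild the nested view by following the comment IDs
post = {
    "content": response["post_content"],
    "comments": [response["comments"][cid]
                 for cid in response["post_comments"]],
}
```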
|
| But then you may as well just go with protobufs or something,
| so your endpoints and stuff are all typed and defined,
| something like this;
|
|     syntax = "proto3";
|
|     service DirectiveAffectsService {
|       rpc Get(GetPageWithPostParams) returns (PageWithPost);
|     }
|     message GetPageWithPostParams {
|       string post_id = 1;
|     }
|     message PageWithPost {
|       string page_header = 1;
|       string page_footer = 2;
|       string post_content = 3;
|       repeated string post_comments = 4;
|       repeated CommentInPost comments_for_post = 5;
|     }
|     message CommentInPost {
|       string comment_id = 1;
|       string comment_text = 2;
|     }
|
| And with this style, you don't necessarily need to embed the
| comments in 1 call like this, and you could cleanly do it in 2
| like parent-comment suggests (1 to get page+post, second to get
| comments), which might be aided with `int32 post_comment_count
| = 4;` instead (so you can pre-render n blocks).
| jimmcslim wrote:
| I understand that GraphQL has fallen out of favour somewhat, but
| wasn't it intended to solve for this?
| delichon wrote:
| For serialization GraphQL uses ... JSON.
|
| GraphQL could use Progressive JSON to serialize subscriptions.
| Spivak wrote:
| I think the point is that GraphQL solves the problem, a
| client only actually needing a subset of the data, by
| allowing the client to request only those fields.
| owebmaster wrote:
| It can't fall out of favor if it was never really in favor to
| begin with. GraphQL was a quite brief hype then a big technical
| debt.
| tunesmith wrote:
| So if not graphql, then what's the latest "in-favor" thinking
| to solve the problem of underfetching and overfetching?
| Especially in an environment with multiple kinds of
| frontends?
| cluckindan wrote:
| What do you mean by technical debt here?
| owebmaster wrote:
| Everywhere I worked with GraphQL it was always a pain for
| the backend team to keep the graphql server updated and
| also a pain to use in the frontend, simple REST apis or
| JSON-RPC are much better.
| RexM wrote:
| Interesting take considering graphql adoption is growing and
| generally in favor at my company.
| danabramov wrote:
| From what I recall, GraphQL has a feature that's similar
| (@defer) but I'm not familiar enough to compare them. RSC was
| definitely inspired by GraphQL among other things.
| roxolotl wrote:
| This feels very similar to JSON API links[0]. This is a great way
| to implement handling resolving those links on the frontend
| though.
|
| 0: https://jsonapi.org/format/#document-links
| mdaniel wrote:
| In the recent Web 2.0 2.0 submission
| <https://news.ycombinator.com/item?id=44073785> there was some
| HATEOAS poo-poo-ing, but maybe this is the delivery mechanism
| which makes that concept easier to swallow, 'cause JSON
| ChrisMarshallNY wrote:
| This would be good.
|
| I got really, _really_ sick of XML, but one thing that XML
| parsers have always been good at, is realtime decoding of XML
| streams.
|
| It is infuriating, waiting for a big-ass JSON file to completely
| download, before proceeding.
|
| Also JSON parsers can be memory hogs (but not all of them).
| hsbauauvhabzb wrote:
| Json is just a packing format that does have that limitation.
| If you control the source and the destination, could you
| possibly use a format that supports streaming better like
| Protobuf?
| ChrisMarshallNY wrote:
| Yes, but that also interferes with portability.
|
| I've written a lot of APIs. I generally start with CSV,
| convert that to XML, then convert _that_ to JSON.
|
| CSV is extremely limited, and there's a lot of stuff that can
| only be expressed in XML or JSON, but starting with CSV
| usually enforces a "stream-friendly" structure.
| sopooneo wrote:
| I've heard two sides to the Protobuf/streaming idea. On my
| first introduction, it seemed you could. But later reading
| leads me to believe it is only _almost_ streamable:
| https://belkadan.com/blog/2023/12/Protobuf-Is-Almost-
| Streama....
|
| * I do acknowledge you qualified the question with "better".
| zzo38computer wrote:
| I had invented a variant of DER called DSER (Distinguished
| Streaming Encoding Rules), which is not compatible with DER
| (nor with BER) but is intended for when streaming is needed.
|
| The type and value are encoded the same as DER, but the
| length is different:
|
| - If it is constructed, the length is omitted, and a single
| byte with value 0x00 terminates the construction.
|
| - If it is primitive, the value is split into segments of
| lengths not exceeding 255, and each segment is preceded by a
| single byte 1 to 255 indicating the length of that segment
| (in bytes); it is then terminated by a single byte with value
| 0x00. When it is in canonical form, the length of segments
| other than the last segment must be 255.
|
| Protobuf seems to not do this unless you use the deprecated
| "Groups" feature, and this is only as an alternative of
| submessages, not for strings. In my opinion, Protobuf also
| seems to have many other limits and other problems, that DER
| (and DSER) seems to do better anyways.
| hombre_fatal wrote:
| What stops you from parsing tokens from a stream like a SAX
| parser for JSON?
|
|     [ ["aaa", "bbb"], { "name": "foo" } ]
|
|     Start array, Start array, String aaa, String bbb, End array,
|     Start object, Key name, String foo, End object, End array
| ChrisMarshallNY wrote:
| Nothing, really, but I don't have the bandwidth to write
| JSAX. I wonder why it hasn't already been done by someone
| more qualified than I am. I suspect that I'd find out, if I
| started doing it.
|
| You can do that, in a specialized manner, with PHP, and
| Streaming JSON Parser[0]. I use that, in one of my server
| projects[1]. It claims to be JSON SAX, but I haven't really
| done an objective comparison, and it specializes for file
| types. It works for my purposes.
|
| [0] https://github.com/salsify/jsonstreamingparser
|
| [1] https://github.com/LittleGreenViper/LGV_TZ_Lookup/blob/
| main/...
| hombre_fatal wrote:
| Streaming JSON parsers certainly exist. I'm just pointing
| out there's nothing about JSON that makes it inherently
| harder to stream than an XML tree.
|
| In response to "Json is just a packing format that does
| have that [streaming] limitation".
| okasaki wrote:
| Reinventing pagination
|
| ?page=3&size=100
| 3cats-in-a-coat wrote:
| I'll try to explain why this is a solution looking for a problem.
|
| Yes, breadth-first is always an option, but JSON is a
| heterogenous structured data source, so assuming that breadth-
| first will help the app start rendering faster is often a poor
| assumption. The app will need _a subset_ of the JSON, but it's
| not simply the depth-first or breadth-first first chunk of the
| data set.
|
| So for this reason what we do is include URLs in JSON or other
| API continuation identifiers, to let the caller _choose_ where in
| the data tree /graph they want to dig in further, and then the
| "progressiveness" comes from simply spreading your fetch
| operation over multiple requests.
|
| Also often times JSON is deserialized to objects so depth-first or
| breadth-first doesn't matter, as the object needs to be "whole"
| before you can use it. Hence again: multiple requests, smaller
| objects.
|
| In general when you fetch JSON from a server, you don't want it
| to be so big that you need to EVEN CONSIDER progressive loading.
| HTML needs progressive loading because a web page can be,
| historically especially, rather monolithic and large.
|
| But that's because a page is (...was) static. Thus you load it as
| a big lump and you can even cache it as such, and reuse it. It
| can't intelligently adapt to the user and their needs. But JSON,
| and by extension the JavaScript loading it, can adapt. So use
| THAT, and do not over-fetch data. Read only what you need. Also,
| JSON is often not cacheable as the data source state is always in
| flux. One more reason not to load a whole lot in big lumps.
|
| Now, I have a similar encoding with references, which results in
| a breadth-first encoding. Almost by accident. I do it for another
| reason and that is structural sharing, as my data is shaped like
| a DAG not like a tree, so I need references to encode that.
|
| But even though I have breadth-first encoding, I never needed to
| progressively decode the DAG as this problem should be solved in
| the API layer, where you can request exactly what you need (or
| close to it) when you need it.
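| The continuation-identifier style described above can look something
| like this (field names and URL purely illustrative):

```python
# a response where the caller chooses where to dig in further
page = {
    "header": "Welcome to my blog",
    "post": {
        "content": "This is my article",
        # not inlined: the caller fetches this only if it needs it
        "comments": {"href": "/api/posts/42/comments?page=1"},
    },
    "footer": "Hope you like it",
}

def is_continuation(node):
    # a value is a continuation if it's a dict holding only an href
    return isinstance(node, dict) and set(node) == {"href"}
```

| The "progressiveness" then lives in the API layer: each href is a
| separate, smaller request issued on demand.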
| danabramov wrote:
| _> The app will need a subset of the JSON, but it's not simply
| the depth-first or breadth-first first chunk of the data set._
|
| Right. Closer to the end of the article I slightly pivot to
| talk about RSC. In RSC, the data _is_ the UI, so the outermost
| data literally corresponds to the outermost UI. That's what
| makes it work.
|
| It's encoded like progressive JSON but conceptually it's more
| like HTML. Except you can also have your own "tags" on the
| client that can receive object attributes.
| owebmaster wrote:
| > Closer to the end of the article I slightly pivot to talk
| about RSC.
|
| Not again, please.
| danabramov wrote:
| The best part about someone else's writing is you can just
| ignore it.
| its-summertime wrote:
| why send the footer above the comments? Maybe it's not a footer
| then but a sidebar, and should be treated as one? Besides, this
| could all kinda be solved by still using plain streaming json
| and sending .comments last?
| danabramov wrote:
| Part of the point I'm making is that an out-of-order format is
| more efficient because we _can_ send stuff as it's ready (so
| footer can go as soon as it's ready). It'll still "slot in" the
| right place in the UI. What this lets us do, compared to
| traditional top-down streaming, is to progressively reveal
| _inner_ parts of the UI as more stuff loads.
| goranmoomin wrote:
| Seems like some people here are taking this post literally, as in
| the author (Dan Abramov) is proposing a format called Progressive
| JSON -- it is not.
|
| This is more of a post on explaining the idea of React Server
| Components where they represent component trees as javascript
| objects, and then stream them on the wire with a format similar
| to the blog post (with similar features, though AFAIK it's
| bundler/framework specific).
|
| This allows React to have holes (that represent loading states)
| on the tree to display fallback states on first load, and then
| only display the loaded component tree afterwards when the server
| actually can provide the data (which means you can display the
| fallback spinner and the skeleton much faster, with more fine
| grained loading).
|
| (This comment is probably wrong in various ways if you get
| pedantic, but I think I got the main idea right.)
| danabramov wrote:
| Yup! To be fair, I also don't mind if people take the described
| ideas and do something else with them. I wanted to describe
| RSC's take on data serialization without it seeming too React-
| specific because the ideas are actually more general. I'd love
| if more ideas I saw in RSC made it to other technologies.
| tough wrote:
| hi dan! really interesting post.
|
| do you think a new data serialization format built around
| easier generation/parseability, and that also happened to be
| streamable because it's line-based like jsonl, could be useful
| for some?
| danabramov wrote:
| I don't know! I think it depends on whether you're running
| into any of these problems and have levers to fix them. RSC
| was specifically designed _for_ that so I was trying to
| explain its design choices. If you're building a serializer
| then I think it's worth thinking about the format's
| characteristics.
| dgb23 wrote:
| I've used React in the past to build some applications
| and components. Not familiar with RSC.
|
| What immediately comes to mind is using a uniform
| recursive tree instead, where each node has the same
| fields. In a funny way that would mimic the DOM if you
| squint. Each node would encode it's type, id, name,
| value, parent_id and order for example. The engine in
| front can now generically put stuff into the right place.
|
| I don't know whether that is feasible here. Just a
| thought. I've used similar structures in data driven
| react (and other) applications.
|
| It's also efficient to encode in memory, because you can
| put this into a flat, compact array. And it fits nicely
| into SQL dbs as well.
| tough wrote:
| Awesome, thanks! I do keep running into these issues, but the
| levers, as you say, make it harder to implement.
|
| As of right now, I could only replace the JSON tool
| calling on LLMs with something I fully control like vLLM,
| and the big labs are probably happy to over-charge the
| 20-30% extra tokens for each tool call, so they wouldn't really
| be interested in replacing json any time soon.
|
| also it feels like battling against a giant which is
| already a standard; maybe there's a place for it in
| really specialized workflows where those savings make the
| difference (not only money, but you also gain a 20-30%
| extra token window if you don't waste it on quotes and
| braces and what not)
|
| Thanks for replying!
| vinnymac wrote:
| I already use streaming partial json responses (progressive
| json) with AI tool calls in production.
|
| It's become a thing, even beyond RSCs, and has many practical
| uses if you stare at the client and server long enough.
| tough wrote:
| how do you do that exactly?
| richin13 wrote:
| Not the original commenter but I've done this too with
| Pydantic AI (actually the library does it for you). See
| "Streaming Structured Output" here
| https://ai.pydantic.dev/output/#streaming-structured-output
| tough wrote:
| Thanks yes! Im aware of structured outputs, llama.cpp has
| also great support with GBNF and several languages beyond
| json.
|
| I've been trying to create go/rust ones but it's way
| harder than just json due to all the context/state they
| carry over
| danenania wrote:
| One way is to eagerly call JSON.parse as fragments are
| coming in. If you also split on json semantic boundaries
| like quotes/closing braces/closing brackets, you can detect
| valid objects and start processing them while the stream
| continues.
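| In Python the same idea can lean on json.JSONDecoder.raw_decode,
| which parses one complete value off the front of a buffer and
| reports where it ended (extract_objects is a made-up helper):

```python
import json

decoder = json.JSONDecoder()

def extract_objects(buffer):
    """Pull every complete top-level JSON value off the front of
    the buffer; return (values, remaining_partial_text)."""
    values = []
    idx = 0
    while idx < len(buffer):
        # skip whitespace/newlines between concatenated values
        while idx < len(buffer) and buffer[idx] in " \t\r\n":
            idx += 1
        if idx >= len(buffer):
            break
        try:
            value, end = decoder.raw_decode(buffer, idx)
        except ValueError:
            break   # incomplete value -- wait for more chunks
        values.append(value)
        idx = end
    return values, buffer[idx:]
```

| Call it on the accumulated buffer each time a chunk arrives and
| process whatever complete objects fall out.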
| tough wrote:
| Interesting approach! thanks for sharing
| motorest wrote:
| Can you offer some detail into why you find this approach
| useful?
|
| From an outsider's perspective, if you're sending around JSON
| documents so big that it takes so long to parse them to the
| point reordering the content has any measurable impact on
| performance, this sounds an awful lot like you are batching
| too much data when you should be progressively fetching child
| resources in separate requests, or even implementing some
| sort of pagination.
| Wazako wrote:
| Slow llm generation. A progressive display of a progressive
| json is mandatory.
| krzat wrote:
| Am I the only person that dislikes progressive loading?
| Especially if it involves content jumping around.
|
| And the most annoying antipattern is showing empty state UI
| during loading phase.
| Szpadel wrote:
| the alternative is to stare at a blank page without any indication
| that something is happening
| leptons wrote:
| I'm sure that isn't the only alternative.
| ahofmann wrote:
| Or, you could use caches and other optimizations to serve
| content fast.
| withinboredom wrote:
| It's better than moving the link or button as I'm clicking
| it.
| danabramov wrote:
| Right -- that's why the emphasis on intentionally designed
| loading states in this section:
| https://overreacted.io/progressive-json/#streaming-data-
| vs-s...
|
| Quoting the article:
|
| _> You don't actually want the page to jump arbitrarily as
| the data streams in. For example, maybe you never want to
| show the page without the post's content. This is why React
| doesn't display "holes" for pending Promises. Instead, it
| displays the closest declarative loading state, indicated by
| <Suspense>._
|
| _> In the above example, there are no <Suspense> boundaries
| in the tree. This means that, although React will receive the
| data as a stream, it will not actually display a "jumping"
| page to the user. It will wait for the entire page to be
| ready. However, you can opt into a progressively revealed
| loading state by wrapping a part of the UI tree into
| <Suspense>. This doesn't change how the data is sent (it's
| still as "streaming" as possible), but it changes when React
| reveals it to the user._
|
| _[...]_
|
| _> In other words, the stages in which the UI gets revealed
| are decoupled from how the data arrives. The data is streamed
| as it becomes available, but we only want to reveal things to
| the user according to intentionally designed loading states._
| sdeframond wrote:
| You might be interested in the "remote data" pattern (for
| lack of a better name)
|
| https://www.haskellpreneur.com/articles/slaying-a-ui-
| antipat...
| hinkley wrote:
| Ember did something like this but it made writing Ajax
| endpoints a giant pain in the ass.
|
| It's been so long since I used ember that I've forgotten the
| terms, but essentially they rearranged the tree structure so
| that some of the children were at the end of the file. I
| believe it was meant to handle DAGs more efficiently but I may
| have hallucinated that recollection.
|
| But if you're using a SAX style streaming parser you can start
| making progress on painting and perhaps follow-up questions
| while the initial data is still loading.
|
| Of course in a single threaded VM, you can snatch Defeat from
| the jaws of Victory if you bollocks up the order of operations
| through direct mistakes or code evolution over time.
| bob1029 wrote:
| > We can try to improve this by implementing a streaming JSON
| parser.
|
| In .NET land, Utf8JsonReader is essentially this idea. You can
| parse up until you have everything you need and then bail on the
| stream.
|
| https://learn.microsoft.com/en-us/dotnet/standard/serializat...
| yen223 wrote:
| I've never really thought about how all the common ways we
| serialise trees in text (JSON, s-expressions, even things like
| tables of content, etc), serialise them depth-first.
|
| I suppose it's because doing it breadth-first means you need to
| come up with a way to reference items that will arrive many lines
| later, whereas you don't have that need with depth-first
| serialisation.
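| A toy version of such a reference scheme (mimicking the article's
| "$1" placeholders, not its exact wire format): each row can point
| at rows that arrive later in the stream, and the client patches
| them in once everything is parsed:

```python
import json

# each line is a self-contained JSON row; "$n" strings reference row n
rows = [
    '{"header": "Welcome", "post": "$1", "footer": "Hope you like it"}',
    '{"content": "This is my article", "comments": "$2"}',
    '["first", "second", "third"]',
]

def resolve(rows):
    # parse every row, then splice "$n" placeholders once targets exist
    parsed = [json.loads(r) for r in rows]
    def patch(node):
        if isinstance(node, dict):
            return {k: patch(v) for k, v in node.items()}
        if isinstance(node, list):
            return [patch(v) for v in node]
        if isinstance(node, str) and node.startswith("$"):
            return patch(parsed[int(node[1:])])
        return node
    return patch(parsed[0])
```

| The forward references are exactly the extra machinery depth-first
| serialisation gets to skip.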
| NoahZuniga wrote:
| Also it makes memory allocation easier
| turtlebits wrote:
| Progressive JPEG makes sense, because it's a media file and by
| nature is large. Text/HTML on the other hand, not so much. Seems
| like a solution to a self-inflicted problem where JS bundles are
| giant and now we're creating more complexity by streaming it.
| danabramov wrote:
| Things can be slow not because they're large but because they
| take latency to produce or to receive. The latency can be on
| the server side (some things genuinely take long to query, and
| might be not possible or easy to cache). Some latency may just
| be due to the user having poor network conditions. In both
| cases, there's benefits to progressively revealing content as
| it becomes available (with intentional loading stages) instead
| of always waiting for the entire thing.
| whilenot-dev wrote:
| Agree with everything you're saying here, but to be fair I
| think the analogy with Progressive JPEG doesn't sit quite
| right with your concept. What you're describing sounds more
| like "semantic-aware streaming" - it's as if a Progressive
| JPEG would be semantically aware of its blob and load any
| objects that are in focus first before going after data for
| things that are out of focus.
|
| I think that's a very contemporary problem and worth
| pursuing, but I also somehow won't see that happening in
| real-time (with the priority to reduce latency) without
| necessary metadata.
| danabramov wrote:
| It's not an exact analogy but streaming outside-in (with
| gradually more and more concrete visual loading states)
| rather than top-down feels similar to a progressive image
| to me.
| whilenot-dev wrote:
| It's data (JPEG/JSON) VS software (HTML/CSS/JS)... you
| can choose to look at HTML/CSS/JS as just some chunks of
| data, or you can look at it as a serialized program that
| wants to be executed with optimal performance. Your blog
| post makes it seem like your focus is on the latter (and
| it's just quite typical for react applications to fetch
| their content dynamically via JSON), and that's where
| your analogy to the progressive mode of JPEGs falls a bit
| flat and "streaming outside-in" doesn't seem like all you
| want.
|
| Progressively loaded JPEGs just apply some type of
| "selective refinement" to chunks of data, and for
| _Progressive selective refinement_ to work it's
| necessary to "specify the location and size of the region
| of one or more components prior to the scan"[0][1]. If
| you don't know what size to allocate, then it's quite
| difficult(?) to optimize the execution. This doesn't seem
| like the kind of discussion you'd like to have.
|
| Performance aware web developers are working with
| semantic awareness of their content in order to make
| tweaks to the sites loading time. YouTube might prefer
| videos (or ads) to be loaded before any comments, news
| sites might prioritize text over any other media, and a
| good dashboard might prioritize data visualizations
| before header and sidebar etc.
|
| The position of the nodes in any structured tree tells
| you very little about the preferred loading priority,
| wouldn't you agree?
|
| [0] https://jpeg.org/jpeg/workplan.html
|
| [1] https://www.itu.int/ITU-T/recommendations/rec.aspx?id=3381
| (see D.2 in the PDF)
|
| EDIT: Btw thanks for your invaluable contributions to
| react (and redux back then)!
| danabramov wrote:
| I used this analogy more from the user's perspective (
| _as a user_, a gradually sharpening image feels similar
| to a website with glimmers gradually getting replaced by
| revealing content). I don't actually know how JPEG is
| served under the hood (and the spec is too dense for me)
| so maybe if you explain the point a bit closer I'll be
| able to follow. I do believe you that the analogy doesn't
| go all the way.
|
| RSC streams outside-in because that's the general shape
| of the UI -- yes, you might want to prioritize the video,
| but you _have to_ display the shell _around_ that video
| first. So "outside-in" is just that common sense -- the
| shell goes first. Other than that, the server will
| prioritize whatever's ready to be written to the stream
| -- if we're not blocked on IO, we're writing.
|
| The client does some selective prioritization on its own
| as it receives stuff (e.g. as it loads JS, it will
| prioritize hydrating the part of the page that you're
| trying to interact with).
| efitz wrote:
| What if we like, I don't know, you know, _separate data from
| formatting_?
| jarym wrote:
| This appears conceptually similar to something like line-
| delimited JSON with JSON Patch[1].
|
| Personally I prefer that sort of approach - parsing a line of
| JSON at a time and incrementally updating state feels easier to
| reason and work with (at least in my mind)
|
| [1] https://en.wikipedia.org/wiki/JSON_Patch
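That line-at-a-time approach might look roughly like this (a minimal sketch, not a full JSON Patch library: `applyOp` handles only the "add" and "replace" operations from RFC 6902 and ignores "~" escaping, and the stream lines are made up):

```javascript
// Fold each incoming line -- one JSON Patch operation -- into a growing state.
function applyOp(state, op) {
  // Split "/comments/0" into path segments; "-" means "append to array".
  const keys = op.path.split("/").slice(1);
  const last = keys.pop();
  const parent = keys.reduce((node, key) => node[key], state);
  if (op.op === "add" && Array.isArray(parent)) {
    parent.splice(last === "-" ? parent.length : Number(last), 0, op.value);
  } else {
    // "replace", and "add" on an object key, both assign in place.
    parent[last] = op.value;
  }
  return state;
}

// Each line of the stream is one self-contained, independently parseable op.
const lines = [
  '{"op":"add","path":"/post","value":{"title":"Hello"}}',
  '{"op":"add","path":"/comments","value":[]}',
  '{"op":"add","path":"/comments/-","value":"First!"}',
];

const state = {};
for (const line of lines) applyOp(state, JSON.parse(line));
```

Because every line is a complete JSON document, the client can update its state after each newline without any incremental parser.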
| slt2021 wrote:
| if you ever feel the need to send progressive JSON - just zip it
| and don't bother solving a fake problem at the wrong abstraction
| layer
| aloha2436 wrote:
| The article doesn't advocate sending it progressively to make
| it smaller on the wire. The motivating example is one where
| some of the data (e.g. posts) is available before the rest of
| the data in the response (e.g. comments). Rather than:
|
| - Sending a request for posts, then a request for comments,
| resulting in multiple round trips (a.k.a. a "waterfall"), or,
|
| - Sending a request for posts and comments, but having to wait
| until the comments have loaded to get the posts,
|
| ...you can instead get posts and comments available as soon as
| they're ready, by progressively loading information. The
| message, though, is that this is something a full-stack web
| framework should handle for you, hence the revelation at the
| end of the article about it being a lesson in the motivation
| behind React's Server Components.
| harrall wrote:
| I don't think progressive loading is innovative.
|
| What is innovative is trying to build a framework that does it
| for you.
|
| Progressive loading is easy, but figuring out which items to
| progressively load and in which order without asking the
| developer/user to do much extra config is hard.
| hobs wrote:
| That's because it's basically cache invalidation.
| motorest wrote:
| > Progressive loading is easy, but figuring out which items to
| progressively load and in which order without asking the
| developer/user to do much extra config is hard.
|
| Do developers even control the order in which stuff is loaded?
| That depends on factors beyond a developer's control, such as
| the user's network speed, the origin server's response speed,
| which resources are already cached, how much data each request
| fetches for user A or user B, etc.
| danabramov wrote:
| Right, which is why I describe a framework that does it for you
| (RSC) at the end of the article. The article itself is meant as
| an explanation of how RSC works under the hood.
| yawaramin wrote:
| > I'd like to challenge more tools to adopt progressive streaming
| of data.
|
| It's a solved problem. Use HTTP/2 and keep the connection open.
| You now have effectively a stream. Get the top-level response:
| { header: "/posts/1/header",
| post: "/posts/1/body",
| footer: "/posts/1/footer" }
|
| Now reuse the same connection to request the nested data, which
| can all have more nested links in them, and so on.
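A sketch of that link-following approach (the URLs are hypothetical, and `fetchJson` stubs out requests multiplexed over one open HTTP/2 connection so the example is self-contained):

```javascript
// Stand-in for fetch() calls reused over a single HTTP/2 connection.
const responses = {
  "/posts/1": { header: "/posts/1/header", body: "/posts/1/body" },
  "/posts/1/header": { title: "Progressive JSON" },
  "/posts/1/body": { text: "..." },
};
const fetchJson = async (url) => responses[url];

// Fetch the top-level skeleton, then follow every nested link in parallel.
async function loadPost(url) {
  const links = await fetchJson(url);
  const entries = await Promise.all(
    Object.entries(links).map(async ([key, href]) => [key, await fetchJson(href)])
  );
  return Object.fromEntries(entries);
}
```

The trade-off, as the replies note, is one extra round trip per level of nesting, since each level's links are only known after the previous response arrives.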
| aloha2436 wrote:
| > Now reuse the same connection to request the nested data,
| which can all have more nested links in them, and so on.
|
| This still involves multiple round-trips though. The approach
| laid out in the article lets you request exactly the data you
| need up-front and the server streams it in as it becomes
| available, e.g. cached data first, then data from the DB, then
| data from other services, etc.
| yawaramin wrote:
| When you have an HTTP/2 connection already open a 'round-
| trip' is not really a gigantic concern performance-wise. And
| it gives the client application complete control over what
| nested parts it wants to get and in what order. Remember that
| the article said it's up to the server what order to stream
| the parts? That might not necessarily be a good idea on the
| client side though. It would probably be better for the
| client to decide what it wants and when. Eg, it can request
| the header and footer, then swap in a skeleton facade in the
| main content area, then load the body and swap it in when
| loaded.
| jlokier wrote:
| Round trips for parallel requests work fine over HTTP/2.
| (As long as there aren't vast numbers of tiny requests, for
| example every cell in a spreadsheet).
|
| However, _sequentially-dependent_ requests are about as
| slow with HTTP/2 as HTTP/1.1. For example, if your client
| side, after loading the page, requests data to fill a form
| component, and then that data indicates a map location, so
| your client side requests a map image with pins, and then
| the pin data has a link to site-of-interest bubble content,
| and you will be automatically expanding the nearest one, so
| your client side requests the bubble content, and
| the bubble data has a link to an image, so the client
| requests the image...
|
| Then over HTTP/2 you can either have 1 x round trip time
| (server knows the request hierarchy all the way up to the
| page it sends with SSR) or 5 x round trip time (client side
| only).
|
| When round trip times are on the order of 1 second or more
| (as they often are for me on mobile), >1s versus >5s is a
| very noticeable difference in user experience.
|
| With lower latency links of 100ms per RTT, the UX
| difference between 100ms and 500ms is not a problem but it
| does feel different. If you're on <10ms RTT, then 5
| sequential round trips are hardly noticeable, though it
| depends more on client-side processing time affecting back-
| to-back delays.
| aperturecjs wrote:
| I previously wrote a prototype of streaming a JSON tree this way:
|
| https://github.com/rgraphql/rgraphql
|
| But it was too graphql-coupled and didn't really take off, even
| for my own projects.
|
| But it might be worth revisiting this kind of protocol again
| someday, it can tag locations within a JSON response and send
| updates to specific fields (streaming changes).
| Aeolun wrote:
| I think the problem with this is that it makes a very simple
| thing a lot harder. I don't want to try and debug a JSON stream
| that can fail at any point. I just want to send a block of text
| (which I generate in 2ms anyway) and call it a day.
| jlokier wrote:
| 2ms to generate, 1 second for basic text to appear and 20 more
| seconds to receive the whole page on my phone in the centre of
| town, due to poor service.
|
| Compared with waiting on a blank page for ages, sometimes it's
| nice to see text content if it's useful, and to be able to
| click navigation links early. It's much better than pages which
| look like they have finished loading but important buttons and
| drop-downs are broken without any visible indication because
| there's JS still loading in the background. I'm also not fond
| of pages where you can select options and enter data, and then
| a few seconds after you've entered data, all the fields reset
| as background loading completes.
|
| All the above are things I've experienced in the last week.
| hyfgfh wrote:
| The thing I have seen in performance is people trying to shave ms
| loading a page, while they fetch several mbs and do complex
| operations in the FE, when in the reality writing a BFF,
| improving the architecture and leaner APIs would be a more
| productive solution.
|
| We tried to do that with GraphQL, HTTP/2... and arguably failed.
| Until we can properly evolve web standards we won't be able to
| fix the main issue. Novel frameworks won't do it either.
| kristianp wrote:
| What's a BFF in this context? Writing an AI best friend isn't
| all that rare these days...
| continuational wrote:
| BFF (pun intended?) in this context means "backend for
| frontend".
|
| The idea is that every frontend has a dedicated backend with
| exactly the api that that frontend needs.
| xiphias2 wrote:
| At least this post explains why when I load a Facebook page the
| only thing that really matters (the content) is what loads last
| globalise83 wrote:
| When I load a Facebook page the content that matters doesn't
| even load.
| onion2k wrote:
| Doesn't that depend on what you mean by "shave ms loading a
| page"?
|
| If you're optimizing for time to first render, or time to
| visually complete, then you need to render the page using as
| little logic as possible - sending an empty skeleton that then
| gets hydrated with user data over APIs is fastest for a user's
| perception of loading speed.
|
| If you want to speed up time to first input or time to
| interactive you need to actually build a working page using
| user data, and that's often fastest on the backend because you
| reduce network calls which are the slowest bit. I'd argue most
| users actually prefer that, but it depends on the app.
| Something like a CRUD SAAS app is probably best rendered server
| side, but something like Figma is best off sending a much more
| static page and then fetching the user's design data from the
| frontend.
|
| The idea that there's one solution that will work for
| everything is wrong, mainly because what you optimise for is a
| subjective choice.
|
| And that's before you even get to Dev experience, team
| topology, Conway's law, etc that all have huge impacts on tech
| choices.
| motorest wrote:
| > If you're optimizing for time to first render, or time to
| visually complete, then you need to render the page using as
| little logic as possible - sending an empty skeleton that
| then gets hydrated with user data over APIs is fastest for a
| user's perception of loading speed.
|
| I think that OP's point is that these optimization strategies
| are completely missing the elephant in the room. Meaning,
| sending multi-MB payloads creates the problem, and shaving a
| few ms here and there with more complexity while not looking
| at the performance impact of having to handle multi-MB
| payloads doesn't seem to be an effective way to tackle the
| problem.
| MrJohz wrote:
| > sending an empty skeleton that then gets hydrated with user
| data over APIs is fastest for a user's perception of loading
| speed
|
| This is often repeated, but my own experience is the
| opposite: when I see a bunch of skeleton loaders on a page, I
| generally expect to be in for a bad experience, because the
| site is probably going to be slow and janky and cause
| problems. And the more of the site is being skeleton-loaded,
| the more my spirits worsen.
|
| My guess is that FCP has become the victim of Goodhart's Law
| -- more sites are trying to optimise FCP (which means that
| _something_ needs to be on the screens ASAP, even if it's
| useless) without optimising for the UX experience. Which
| means delaying rendering more and adding more round-trips so
| that content can be loaded later on rather than up-front.
| That produces sites that have worse experiences (more
| loading, more complexity), even though the metric says the
| experience should be improving.
| Bjartr wrote:
| > the experience should be improving
|
| I think it's more the bounce rate is improving. People may
| recall a worse experience later, but more will stick around
| for that experience if they see _something_ happen sooner.
| PhilipRoman wrote:
| It also breaks a bunch of optimizations that browsers have
| implemented over the years. Compare how back/forward
| history buttons work on reddit vs server side rendered
| pages.
| MrJohz wrote:
| It is possible to get those features back, in fairness...
| but it often requires more work than if you'd just let
| the browser handle things properly in the first place.
| danabramov wrote:
| RSC, which is described at the end of this post, _is_
| essentially a BFF (with the API logic componentized). Here's my
| long post on this topic: https://overreacted.io/jsx-over-the-
| wire/ (see BFF midway in the first section).
| elcomet wrote:
| Too many acronyms, what's FE, BFF?
| holoduke wrote:
| Front end and a backend for a frontend. In which you
| generally design apis specific for a page by aggregating
| multiple other apis, caching, transforming etc.
| aeinbu wrote:
| I was asking the same questions.
|
| - FE is short for the Front End (UI)
|
| - BFF is short for Backend For Frontend
| jatins wrote:
| I have seen Dan's "2 computers" talk and read some of his recent
| posts trying to explore RSC and their benefits.
|
| Dan is one of the best explainers in the React ecosystem, but IMO
| if one has to work this hard to sell/explain a tech there are 2
| possibilities: 1/ there is no real need for the tech, 2/ it's a
| flawed abstraction.
|
| #2 seems somewhat true because most frontend devs I know still
| don't "get" RSC.
|
| Vercel has been aggressively pushing this on users and most of
| the adoption of RSC is due to Nextjs emerging as the default
| React framework. Even among Nextjs users most devs don't really
| seem to understand the boundaries of server components and are
| cargo culting
|
| That coupled with fact that React wouldn't even merge the PR that
| mentions Vite as a way to create React apps makes me wonder if
| the whole push for RSC is really meant for users/devs or is just
| a way for vendors to push their hosting platforms. If you could
| just ship an SPA from S3 fronted with a CDN, clearly that's not
| great for the Vercels and Netlifys of the world.
|
| In hindsight Vercel just hiring a lot of OG React team members
| was a way to control the future of React and not just a talent
| play
| kenanfyi wrote:
| I find your analysis very good and agree on why companies like
| Vercel are pushing hard on RSC.
| foo42 wrote:
| [flagged]
| tomhow wrote:
| Please don't do this here. If a comment seems unfit for HN,
| please flag it and email us at hn@ycombinator.com so we can
| have a look.
| Garlef wrote:
| I think there's a world where you would use the code
| structuring of RSCs to compile a static page that's broken down
| into small chunks of html, css, js.
|
| Basically: If you replace the "$1" placeholders from the
| article with URIs you wouldn't need a server.
|
| (In most cases you don't need fully dynamic SSR)
|
| The big downside is that you'd need a good pipeline to also
| have fast builds/updates in case of content changes: Partial
| streaming of the compiled static site to S3.
|
| (Let's say you have a newspaper with thousands of prerendered
| articles: You'd want to only recompile a single article in case
| one of your authors edits the content in the CMS. But this
| means the pipeline would need to smartly handle some form of
| content diff)
| danabramov wrote:
| RSC is perfectly capable of being run at the build-time,
| which is the default. So that's not too far from what you're
| describing.
| danabramov wrote:
| You're wrong about the historical aspects and motivations but I
| don't have the energy to argue about it now and will save it
| for another post. (Vercel isn't setting React's direction;
| rather, they're the ones who funded person-decades of work
| under the direction set by the React team.)
|
| I'll just correct the allegation about Vite -- it's being
| worked on but the ball is largely in the Vite team's court
| because it can't work well without bundling in DEV (and the
| Vite team knows it and will be fixing that). The latest work in
| progress is here: https://github.com/facebook/react/pull/33152.
|
| Re: people not "getting" it -- you're kind of making a circular
| argument. To refute it I would have to shut up. But I like
| writing and I want to write about the topics I find
| interesting! I think even if you dislike RSC, there's enough
| interesting stuff there to be picked into other technologies.
| That's really all I want at this point. I don't care to
| convince _you_ about anything but I want people to also think
| about these problems and to steal the parts of the solution
| that they like. Seems like the crowd here doesn't mind that.
| andrewingram wrote:
| I also appreciate that you're doing these explainers so that
| people don't have to go the long way round to understand what
| problems exist that call for certain shapes of solutions --
| especially when those solutions can feel contrived or
| complicated.
|
| As someone who's been building web UI for nearly 30 years
| (scary...), I've generally been fortunate enough that when
| some framework I use introduces a new feature or pattern, I
| know what they're trying to do. But the only reason I know
| what they're trying to do is because I've spent some amount
| of time running into the problems they're solving. The first
| time I saw GraphQL back in 2015, I "got" it; 10 years later
| most people using GraphQL don't really get it because they've
| had it forced upon them or chose it because it was the new
| shiny thing. Same was true of Suspense, server functions,
| etc.
| metalrain wrote:
| While RSC as technology is interesting, I don't think it makes
| much sense in practice.
|
| I don't want to have a fleet of Node/Bun backend servers that
| have to render complex components. I'd rather have static pages
| and/or React SPA with Go API server.
|
| You get a similar result with much smaller resources.
| pas wrote:
| It's convenient for integrating with backends. You can use
| async/await on the server, no need for hooks (callbacks) for
| data loading.
|
| It allows for dynamism (user only sees the menus that they
| have permissions for), you can already show those parts that
| are already loaded while other parts are still loading.
|
| (And while I prefer the elegance and clean separation of
| concerns that come with a good REST API, it's definitely more
| work to maintain both the frontend and the backend for it.
| Especially in cases where the backend-for-frontend integrates
| with more backends.)
|
| So it's the new PHP (with ob_flush), good for dashboards and
| big complex high-traffic webshop-like sites, where you want
| to spare no effort to be able to present the best options to
| the dear customer as soon as possible. (And also it should be
| crawlable, and it should work on even the lowest powered
| devices.)
| robertoandred wrote:
| RSCs work just fine with static deployments and SPAs. (All
| Next sites are SPAs.)
| throwingrocks wrote:
| > IMO if one has to work this hard to sell/explain a tech
| there's 2 possibilities 1/ there is no real need of tech 2/
| it's a flawed abstraction
|
| There's of course a third option: the solution justifies the
| complexity. Some problems are hard to solve, and the solutions
| require new intuition.
|
| It's easy to say that, but it's also easy to say it should be
| easier to understand.
|
| I'm waiting to see how this plays out.
| liamness wrote:
| You can of course still just export a static site and host it
| on a basic CDN, as you say. And you can self host Next.js in
| the default "dynamic" mode, you just need to be able to run an
| Express server, which hardly locks you into any particular
| vendor.
|
| Where it gets a little more controversial is if you want to run
| Next.js in full fat mode, with serverless functions for render
| paths that can operate on a stale-while-revalidate basis.
| Currently it is very hard for anyone other than Vercel to
| properly implement that (see the opennextjs project for
| examples), due to undocumented "magic". But thankfully Next.js
| / Vercel have proposed to implement (and dogfood) adapters that
| allow this functionality to be implemented on different
| platforms with a consistent API:
|
| https://github.com/vercel/next.js/discussions/77740
|
| I don't think the push for RSC is at all motivated by the shady
| reasons you're suggesting. I think it is more about the
| realisation that there were many good things about the way we
| used to build websites before SPA frameworks began to dominate.
| Mostly rendering things on the server, with a little
| progressive enhancement on the client, is a pattern with a lot
| of benefits. But even with SSR, you still end up pushing a lot
| of logic to the client that doesn't necessarily belong there.
| lioeters wrote:
| > thankfully Next.js / Vercel have proposed to implement (and
| dogfood) adapters that allow this functionality to be
| implemented on different platforms with a consistent API:
|
| Seeing efforts like this (started by the main dev of Next.js
| working at Vercel) convinces me that the Vercel team is
| honestly trying to be a good steward with their influence on
| the React ecosystem, and in general being a beneficial
| community player. Of course as a VC-funded company its
| purpose is self-serving, but I think they're playing it
| pretty respectably.
|
| That said, there's no way I'm going to run Next.js as part of
| a server in production. It's way too fat and complicated.
| I'll stick with using it as a static site generator, until I
| replace it with something simpler like Vite and friends.
| tills13 wrote:
| Holy the pomp in this thread. It would perhaps help for some
| people here to have the context that this isn't some random
| person on the internet but Dan Abromov -- probably one of the
| most influential figures in building React (if not one of the
| creators, iirc)
| kolme wrote:
| He got famous because of "redux" and "hot module reload" and
| then he got hired by Meta and started working on react.
|
| This was before the hook era.
| yard2010 wrote:
| Dan is hands down THE best captain to steer this ship - he
| manages to push react forward even though it changed a lot (and
| faced many growth pains and challenges) in the last few years.
| He is doing it in his own special way - he is kind, thoughtful,
| patient and visionary. He is the best kind of master teacher
| there is - although he has many many years of experience, he
| understands exactly what newbies don't understand. That's
| inspiring.
|
| Read a few of his many comments in any React issue and see what
| I mean. We are truly gifted. Dan you are my idol!
| sriku wrote:
| Would a stream where each entry is a list of kv-pairs work just
| as well? The parser is then expected to apply the kv pairs to the
| single json object as it is receiving them. The key would
| describe a json path in the tree - like 'a.b[3].c'.
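A sketch of how such a parser might apply those kv-pairs as they arrive (the path syntax and the `setPath` helper are assumptions for illustration, not an existing library):

```javascript
// Write one [path, value] stream entry into a single growing object.
function setPath(root, path, value) {
  // Split 'a.b[3].c' into ['a', 'b', 3, 'c']; numeric segments index arrays.
  const keys = path
    .split(/[.[\]]+/)
    .filter(Boolean)
    .map((k) => (/^\d+$/.test(k) ? Number(k) : k));
  let node = root;
  for (let i = 0; i < keys.length - 1; i++) {
    const key = keys[i];
    // Create intermediate containers on demand: array if the next key
    // is numeric, object otherwise.
    if (node[key] == null) {
      node[key] = typeof keys[i + 1] === "number" ? [] : {};
    }
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
}

// Each stream entry updates the same document in place.
const doc = {};
setPath(doc, "post.title", "Progressive JSON");
setPath(doc, "comments[0].text", "First!");
```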
| dejj wrote:
| Reminds me of Aftertext, which uses backward references to apply
| markup to earlier parts of the data.
|
| Think about how this could be done recursively, and how scoping
| could work to avoid spaghetti markup.
|
| Aftertext: https://breckyunits.com/aftertext.html
| philippta wrote:
| This sounds suspiciously similar to CSV.
| anonzzzies wrote:
| So is there a library / npm to do this? Even his not good cases
| example; just making partial JSON to parse all the time. I don't
| care if it's just the top and missing things, as long as it
| _always_ parses as legal json.
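One way such a tool could work is to track open strings and brackets while reading, so that any prefix can be closed off into legal JSON. A rough sketch (not an existing npm package; a production version would also have to trim dangling commas, colons, and half-written numbers or keys before closing):

```javascript
// Close off a JSON prefix so it always parses: append a quote for an
// unterminated string, then the closers for any still-open arrays/objects.
function completePartialJson(prefix) {
  const closers = [];
  let inString = false;
  let escaped = false;
  for (const ch of prefix) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  return prefix + (inString ? '"' : "") + closers.reverse().join("");
}

// A prefix cut anywhere except a dangling comma/colon now parses:
// completePartialJson('{"title":"Hel')  ->  '{"title":"Hel"}'
```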
| tarasglek wrote:
| People put so much effort into streaming JSON parsing, whereas we
| have a format called YAML which takes up fewer characters on the
| wire and happens to work incrementally out of the box, meaning
| you can reparse the stream as it's coming in without having to
| actually do any incremental parsing.
| motorest wrote:
| > People put so much effort into streaming Json parsing whereas
| we have a format called (...)
|
| There are many formats out there. If payload size is a concern,
| everyone is far better off enabling HTTP response compression
| instead of onboarding a flavor-of-the-month language.
| croes wrote:
| YAML is more complex and harder to parse
| rk06 wrote:
| YAML makes JSON appear user-friendly by comparison.
|
| The last thing one wants in a wire format is whitespace
| sensitivity and ambiguous syntax. Besides, if you are really
| transferring that much JSON data, there are ways to achieve it
| that solve the issues.
| filoeleven wrote:
| If we're talking about niche data protocols, edn is hard to
| beat. Real dates and timestamps and comments, namespaced
| symbols, tagged elements, oh my!
|
| https://github.com/edn-format/edn
| Existenceblinks wrote:
| It's useless, as data is not just some graphical semantics: it
| has relations and business rules on top, and it isn't ready to
| interact with until all of it is loaded.
| danabramov wrote:
| It's definitely not useless. You're right that it requires the
| interpreting layer to be able to handle missing info. The use
| case at the end of the article is streaming UI. UI, unlike
| arbitrary data, is actually self-describing -- and we have
| meaningful semantics for incomplete UI (show the closest
| loading state placeholder). That's what makes it work, as the
| article explains in the last section.
| Existenceblinks wrote:
| Thanks Dan. Yes, I agree on the UI part; it seems to work in
| most cases. Some HTML tags have relations, like `<datalist>` or
| the `[popover]` attribute, but if we make all kinds of relations
| trivial then it's a benefit for sure.
| danabramov wrote:
| Yea, and also to clarify by "UI", I don't necessarily mean
| HTML -- it could be your own React components and their
| props. In idiomatic React, you generally don't have these
| kinds of "global" relations between things anyway. (They
| could appear _inside_ components but then presumably they
| 'd be bound by matching IDs.)
| metalrain wrote:
| It feels like this idea needs Cap'n'Proto style request inlining
| so the client can choose what parts to stream instead of getting
| everything asynchronously.
|
| https://capnproto.org/
| inglor wrote:
| I am not sure the wheel can be rediscovered many more times but
| definitely check out Kris's work from around 2010-2012 around
| q-connection and streaming/rpc of chunks of data. Promises
| themselves have roots in this and there are better formats for
| this.
|
| Check out Mark Miller's E stuff and thesis - this stuff goes all
| the way back to the 80s.
| inglor wrote:
| @dang - I hit "reply" once (I am sure of that) and I see my
| (identical) comment twice in the UI. Not sure what sort of
| logging/tracing/instrumentation you have in place - I am not
| delete'ing this so you have a chance to investigate but if
| that's not useful by all means feel free to do so.
| inglor wrote:
| I am not sure the wheel can be rediscovered many more times but
| definitely check out Kris's work from around 2010-2012 around
| q-connection and streaming/rpc of chunks of data. Promises
| themselves have roots in this and there are better formats for
| this.
|
| Check out Mark Miller's E stuff and thesis - this stuff goes all
| the way back to the 80s.
| inglor wrote:
| Not to disrespect Dan here, each discovery is impressive on its
| own but I wish we had a better way to preserve this sort of
| knowledge.
| vanderZwan wrote:
| > _I wish we had a better way to preserve this sort of
| knowledge._
|
| It's called "being part of the curriculum" and apparently the
| general insights involved aren't, so far.
| camgunz wrote:
| I don't mean to be dismissive, but haven't we solved this by
| using different endpoints? There's so many virtues: you avoid
| head of line blocking; you can implement better filtering (eg
| "sort comments by most popular"); you can do live updates; you
| can iterate on the performance of individual objects (caching,
| etc).
|
| ---
|
| I broadly see this as the fallout of using a document system as
| an application platform. Everything wants to treat a page like a
| doc, but applications don't usually work that way, so lots of
| code and infra gets built to massage the one into the other.
| danabramov wrote:
| Sort of! I have two (admittedly long) articles on this topic,
| comparing how the code tends to evolve with separate endpoints
| and what the downsides are:
|
| - https://overreacted.io/one-roundtrip-per-navigation/
|
| - https://overreacted.io/jsx-over-the-wire/
|
| The tldr is that endpoints are not very fluid -- they kind of
| become a "public" API contract between two sides. As they
| proliferate and your code gets more modular, it's easy to hurt
| performance because it's easy to introduce server/client
| waterfalls at each endpoint. Coalescing the decisions on the
| server as a single pass solves that problem and also makes the
| boundaries much more fluid.
| techpression wrote:
| Reading this makes me even happier I decided on Phoenix LiveView
| a while back. React has become a behemoth requiring vendor
| specific hosting (if you want the bells and whistles) and even a
| compiler to overcome all the legacy.
|
| Most of the time nobody needs this, make sure your database
| indexes are correct and don't use some under powered serverless
| runtime to execute your code and you'll handle more load than
| most people realize.
|
| If you're Facebook scale you have unique problems, most of us
| don't.
| gavinray wrote:
| > React has become a behemoth requiring vendor specific hosting
|
| This is one of the silliest things I've read in a while.
|
| React is sub-3kB minified + gzip'ed [0], and the grand majority
| of React apps I've deployed are served as static assets from a
| fileserver.
|
| My blog runs off of Github Pages, for instance.
|
| People will always find a way to invent problems for
| themselves, but this is a silly example.
|
| [0] https://bundlephobia.com/package/react@19.1.0
| owebmaster wrote:
| > This is one of the silliest things I've read in a while.
|
| You know that the author of this post is the creator of React
| and that he's been pushing for RSC/Vercel relentlessly,
| right?
|
| btw reactdom is ~30kb gzipped so React minimal bundle is
| around 35kb
| whilenot-dev wrote:
| Dan Abramov isn't "the creator of React"; he's just been an
| evangelist for React ever since he joined the team at
| Facebook through his work on Redux. He is pushing for RSC
| (as that's where react's future seems to be), but what
| makes you think he's pushing for Vercel?
| gavinray wrote:
| If you really want to bikeshed over size, you can use
| Preact which is a genuine 3kB full drop-in for React.
| owebmaster wrote:
| Why would I? React is just bad, the change from
| classes/function components to hooks abstraction was
| terrible but the current push to RSC made me quit 2 years
| ago with zero regrets. Life is great when you don't need
| to debug zillions of useless component re-renders.
| danabramov wrote:
| I'm not the creator of React (that would be Jordan). I've
| also never taken any money from Vercel and I don't care
| about it. I do think RSC is an interesting technology and I
| like writing about it while I'm on my sabbatical.
| techpression wrote:
| There's more to it than size, the framework itself and its
| execution speed and behaviors. Look at a flame graph of any
| decent React app for example.
|
| Sure, I could've been clearer, but you did forget react-dom.
| And good luck getting RSC going on GH pages.
| danabramov wrote:
| RSC is perfectly capable of producing static sites. My site
| is hosted for free on Cloudflare with their static free
| plan.
| techpression wrote:
| That's not what I said though. You can generate a static
| site using Spring in Java too; that doesn't mean it actually
| runs Java.
| MrJohz wrote:
| I find the Krausest benchmarks[0] to be useful for these
| sorts of comparisons. There are always flaws in benchmarks,
| and this one particularly is limited to the performance for
| DOM manipulation of a relatively simple web application (the
| minimal VanillaJS implementation is about 50 lines of code).
| That said, Krausest and the others who work on it do a good
| job of ensuring the different apps are well-optimised but
| still idiomatic, and it works well as a test of what the
| smallest meaningful app might look like for a given
| framework.
|
| I typically compare Vanilla, Angular, SolidJS, Svelte, Vue
| Vapor, Vue, and React Hooks, to get a good spread of the
| major JS frameworks right now. Performance-wise, there are
| definitely differences, but tbh they're all much of a
| muchness. React famously does poorly on "swap rows", but also
| there's plenty of debate about how useful "swap rows"
| actually is as a benchmark.
|
| But if you scroll further down, you get to the memory
| allocation and size/FCP sections, and those demonstrate what
| a behemoth React is in practice. 5-10x larger than SolidJS or
| Svelte (compressed), and approximately 5x longer FCP scores,
| alongside a significantly larger runtime memory than any
| other option.
|
| React is consistently more similar to a full Angular
| application in most of the benchmarks there than to one of
| the more lightweight (but equally capable) frameworks in that
| list. And I'm not even doing a comparison with
| microframeworks like Mithril or just writing the whole thing
| in plain JS. And given the point of this article is about
| shaving off moments from your FCP by delaying rendering,
| surely it makes sense to look at one of the most significant
| contributors to FCP, namely bundle size?
|
| [0]: https://krausest.github.io/js-framework-benchmark/2025/table...
| PetahNZ wrote:
| Reminds me of Oboe.js
|
| https://oboejs.com/
| atombender wrote:
| You could stream incrementally like this without explicitly
| demarcating the "holes". You can simply send the unfinished JSON
| (with empty arrays as the holes), then compute the next iteration
| and send a delta, then compute the next and send a delta, and so
| on.
|
| A good delta format is Mendoza [1] (full disclosure: I work at
| Sanity where we developed this), which has Go and JS/TypeScript
| [2] implementations. It expresses diffs and patches as very
| compact operations.
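The snapshot-then-delta idea can be sketched with JSON Merge Patch (RFC 7386) semantics. Mendoza's actual format is more compact and different in structure; this only illustrates the shape of the approach:

```javascript
// Minimal snapshot-then-delta sketch using JSON Merge Patch (RFC 7386)
// semantics. Mendoza's real format is more compact; this only shows
// the shape of the approach. Arrays are replaced wholesale.
function applyMergePatch(target, patch) {
  if (patch === null || typeof patch !== "object" || Array.isArray(patch)) {
    return patch; // scalars and arrays replace the target outright
  }
  const result =
    typeof target === "object" && target !== null && !Array.isArray(target)
      ? { ...target }
      : {};
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) delete result[key]; // null means "remove this key"
    else result[key] = applyMergePatch(result[key], value);
  }
  return result;
}
```

Each streamed delta is applied to the previous snapshot to produce the next one, so the client only ever keeps the deserialized value in memory.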
|
| Another way is to use binary diffing. For example, zstd has some
| nifty built-in support for diffing where you can use the previous
| version as a dictionary and then produce a diff that can be
| applied to that version, although we found Mendoza to often be as
| small as zstd. This approach also requires treating the JSON as
| bytes and keeping the previous binary snapshot in memory for the
| next delta, whereas a Mendoza patch can be applied to a
| JavaScript value, so you only need the deserialized data.
|
| This scheme would force you to compare the new version for what's
| changed rather than plug in exactly what's changed, but I believe
| React already needs to do that? Also, I suppose the Mendoza
| applier could be extended to return a list of keys that were
| affected by a patch application.
|
| [1] https://github.com/sanity-io/mendoza
|
| [2] https://github.com/sanity-io/mendoza-js
| __mattya wrote:
| They want to know where the holes are so that they can show a
| loading state.
| atombender wrote:
| You don't need templating ($1 etc.) for that as long as you
| can describe the holes somehow, which can be done out-of-
| band.
|
| If we imagine a streaming protocol of key/value pairs that
| are either snapshots or deltas:
|
|     event: snapshot
|     data: {"topPosts": [], "user": {"comments": []}}
|     pending: topPosts,user.comments
|
|     event: delta
|     data: [17,{"comments":[{"body":"hello world"}]},"user"]
|     pending: topPosts
| andrewingram wrote:
| For the use case of streaming data for UI, I don't think empty
| arrays and nulls are sufficient information. At any moment
| during the stream, you need the ability to tell what data is
| pending.
|
| If pending arrays are just returned as empty arrays, how do I
| know if it's empty because it's actually empty, or empty
| because it's pending?
|
| GraphQL's streaming payloads try to get the best of both
| worlds: at any point in time you have a valid payload according
| to the GraphQL schema - so it's possible to render some valid UI,
| but it also communicates what paths contain pending data, and
| then subsequent payloads act as patches (though not as
| sophisticated as Mendoza's).
| atombender wrote:
| As I commented in
| https://news.ycombinator.com/item?id=44150238, all you need
| is a way to express what is pending, which can be done using
| JSON key paths.
|
| Of course, you could do it in-band, too:
| {"comments": {"state": "pending", "values": []}}
|
| ...at the cost of needing your data model to be explicit
| about it. But this has the benefit of being diffable, of
| course, so once the data is available, the diff is just the
| new state and the new values.
| andrewingram wrote:
| Yes, hence the last paragraph in my comment :)
| pjungwir wrote:
| Does this scheme give a way to progressively load slices of an
| array? What I want is something like this:
| ["foo", "bar", "$1"]
|
| And then we can consume this by resolving the Promise for $1 and
| splatting it into the array (sort of). The Promise might resolve
| to this:
|
|     ["baz", "gar", "$2"]
|
| And so on.
|
| And then a higher level is just iterating the array, and doesn't
| have to think about the promise. Like a Python generator or Ruby
| enumerator. I see that Javascript does have async generators, so
| I guess you'd be using that.
|
| The "sort of" is that you can stream the array contents without
| literally splatting. The caller doesn't have to reify the whole
| array, but they could.
|
| EDIT: To this not-really-a-proposal I propose adding a new spread
| syntax, ["foo", "bar", "...$1"]. Then your progressive JSON layer
| can just deal with it. That would be awesome.
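The consumption model described above could be sketched with an async generator. The "$n" placeholder convention and the fetchChunk loader are assumptions taken from the comment, not part of any actual spec:

```javascript
// Sketch of consuming a progressively-loaded array. A trailing "$n"
// string is treated as a continuation reference that fetchChunk
// resolves to the next slice (both are assumptions, not a real spec).
async function* streamArray(firstChunk, fetchChunk) {
  let chunk = firstChunk;
  while (true) {
    const last = chunk[chunk.length - 1];
    const isRef = typeof last === "string" && last.startsWith("$");
    // Yield everything except the trailing continuation reference.
    for (const item of isRef ? chunk.slice(0, -1) : chunk) yield item;
    if (!isRef) return; // no continuation: the array is complete
    chunk = await fetchChunk(last); // resolve the Promise for "$n"
  }
}
```

The higher level just iterates with `for await...of` and never sees the placeholders, much like a Python generator or Ruby enumerator.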
| danabramov wrote:
| From what I understand of the RSC protocol which the post is
| based on (might be wrong since I haven't looked closely at this
| part), this is supported:
| https://github.com/facebook/react/pull/28847.
|
| _> The format is a leading row that indicates which type of
| stream it is. Then a new row with the same ID is emitted for
| every chunk. Followed by either an error or close row._
| izger wrote:
| Interesting idea. Another way to implement the same without
| breaking the JSON protocol framing is to just send:
|
|     {progressive: "true"}
|     {a: "value"}
|     {b: "value b"}
|     {c: "value c"}
|     ..
|     {progressive: "false"}
|
| and have
|
| { progressive: "false", a:"value", b:"value b", .. }
|
| on top of that add some flavor of message_id or message_no (or
| whatever suits your taste) and you will have a protocol for
| consistently updating multiple objects at a time.
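A sketch of merging such a frame sequence on the client. The frame shape is assumed from the comment above and is not a real protocol:

```javascript
// Sketch: fold a sequence of partial-object frames into one object,
// stopping at the frame that carries progressive: "false". The frame
// shape follows the comment above; it is not a real protocol.
function mergeProgressive(frames) {
  const result = {};
  for (const frame of frames) {
    const { progressive, ...fields } = frame;
    Object.assign(result, fields); // later frames overwrite earlier keys
    if (progressive === "false") break; // final frame seen
  }
  return result;
}
```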
| jerf wrote:
| There are at least two other alternatives I'd reach for before
| this.
|
| Probably the simplest one is to refactor the JSON to not be one
| large object. A lot of "one large objects" have the form
| {"something": "some small data", "something_else": "some other
| small data", results: [vast quantities of identically-structured
| objects]}. In this case you can refactor this to use JSON lines.
| You send the "small data" header bits as a single object. Ideally
| this incorporates a count of how many other objects are coming,
| if you can know that. Then you send each of the vast quantity of
| identically-structured objects, one line each. Each of them may
| have to be parsed in one shot but many times each individual one
| is below the size of a single packet, at which point streamed
| parsing is of dubious helpfulness anyhow.
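The refactoring described here might look like the following sketch; the header and row shapes are illustrative, not a standard:

```javascript
// Sketch of the JSON Lines refactor: a small header object first
// (with a row count when known), then one result object per line.
// The shapes here are illustrative, not a standard.
function toJsonLines(header, results) {
  const lines = [JSON.stringify({ ...header, count: results.length })];
  for (const row of results) lines.push(JSON.stringify(row));
  return lines.join("\n");
}

// The receiver can parse and act on each line as it arrives.
function* fromJsonLines(text) {
  for (const line of text.split("\n")) yield JSON.parse(line);
}
```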
|
| This can also be applied recursively if the objects are then
| themselves large, though that starts to break the simplicity of
| the scheme down.
|
| The other thing you can consider is guaranteeing order of
| attributes going out. JSON attributes are unordered, and it's
| important to understand that when no guarantees are made you
| don't have them, but nothing stops you from specifying an API in
| which you, the server, guarantee that the keys will be in some
| order useful for progressive parsing. (I would always shy away
| from specifying _incoming_ parameter order from clients, though.)
| In the case of the above, you can guarantee that the big array of
| results comes at the end, so a progressive parser can be used and
| you will guarantee that all the "header"-type values come out
| before the "body".
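In JavaScript this guarantee is cheap to provide, since JSON.stringify emits string keys in insertion order; constructing the object with the large array last means a progressive parser sees the header fields first. The payload shape below is only an example:

```javascript
// JSON.stringify preserves insertion order for string keys, so a
// server can guarantee "header" fields serialize before the big
// "results" array just by constructing the object in that order.
// The payload shape here is only an example.
function serializeOrdered(header, results) {
  return JSON.stringify({ ...header, results });
}

const body = serializeOrdered({ status: "ok", total: 2 }, [
  { id: 1 },
  { id: 2 },
]);
```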
|
| Of course, in the case of a truly large pile of structured data,
| this won't work. I'm not pitching this as The Solution To All
| Problems. It's just a couple of tools you can use to solve what
| is probably the _most common case_ of very large JSON documents.
| And both of these are a lot simpler than any promise-based
| approach.
| defraudbah wrote:
| nah, I got enough with TCP
| 65 wrote:
| Here's a random, crazy idea:
|
| What if instead of streaming JSON, we streamed CSV line by line?
| That'd theoretically make it way easier to figure out what byte
| to stream from and then parse the CSV data into something
| usable... like a Javascript object.
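A naive sketch of that idea, consuming one line at a time; quoted fields and embedded commas are deliberately not handled:

```javascript
// Naive sketch of the CSV idea: the first line is a header, and each
// subsequent line becomes one object as soon as it arrives. Quoted
// fields and embedded commas are deliberately not handled.
function* csvRows(lines) {
  let header = null;
  for (const line of lines) {
    const cells = line.split(",");
    if (header === null) {
      header = cells; // first line names the columns
    } else {
      yield Object.fromEntries(header.map((h, i) => [h, cells[i]]));
    }
  }
}
```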
| geokon wrote:
| This is outside my realm of experience, but isn't this kind of
| thing part of the utility of a triple-store? Isn't that the
| canonical way to flatten tree data to a streamable sequence?
|
| I think you'd also need to have some priority mechanism for which
| order to send your triple store entries (so you get the same
| "breadth first" effect) .. and correctly handle missing entries..
| but that's the data structure that comes to mind to build off of
| jongjong wrote:
| This is an interesting idea. I solved this problem in a different
| way by loading each resource/JSON individually, using foreign
| keys to link them on the front end. This can add latency/delays
| with deeply nested child resources but it was not a problem for
| any of the use cases I came across (pages/screens rarely display
| parent/child resources connected by more than 3 hops; and if they
| do, they almost never need them to be loaded all at once).
|
| But anyway this is a different custom framework which follows the
| principle of resource atomicity and a totally different direction
| than GraphQL approach which follows the principle of aggregating
| all the data into a big nested JSON. The big JSON approach is
| convenient but it's not optimized for this kind of lazy loading
| flexibility.
|
| IMO, resource atomicity is a superior philosophy. Field-level
| atomicity is a great way to avoid conflicts when supporting real-
| time updates. Unfortunately nobody has shown any interest or is
| even aware of its existence as an alternative.
|
| We have yet to figure out that maybe the real issue with REST is
| that it's not granular enough (should be field granularity, not
| whole resource)... Everyone knows HTTP has heavy header
| overheads, hence you can't load fields individually (there would
| be too many heavy HTTP requests)... This is not a limitation for
| WebSockets however... But still, people are clutching onto HTTP;
| a transfer protocol originally designed for hypertext content, as
| their data transport.
| rictic wrote:
| If you've got some client side code and want to parse and render
| JSON progressively, try out jsonriver:
| https://github.com/rictic/jsonriver
|
| Very simple API, takes a stream of string chunks and returns a
| stream of increasingly complete values. Helpful for parsing large
| JSON, and JSON being emitted by LLMs.
|
| Extensively tested and performance optimized. Guaranteed that the
| final value emitted is identical to passing the entire string
| through JSON.parse.
| jto1218 wrote:
| typically if we need to lazy load parts of the data model we make
| multiple calls to the backend for those pieces. And our redux
| state has indicators for loading/loaded so we can show
| placeholders. Is the idea that that kind of setup is inefficient?
| nesarkvechnep wrote:
| In my opinion, REST, proper, hypertext driven, solves the same
| problems. When you have small, interlinked, cacheable resources,
| the client decides how many relations to follow.
| fdoifdois wrote:
| Related: https://stackoverflow.com/q/79454372/320615
| yencabulator wrote:
| SvelteKit has something like this to facilitate loading data
| where some of the values are Promises. I don't think the format
| is documented for external consumption, but it basically does
| this: placeholders for values where the JSON value at that point
| is still loading, replaced by streaming the results as they
| complete.
|
| https://svelte.dev/docs/kit/load#Streaming-with-promises
| bilater wrote:
| This is something I've been thinking about ever since I saw BAML.
| Progressive streaming for JSON should absolutely be a first class
| thing in Javascript-land.
|
| I wonder if Gemini Diffusion (and that class of models) really
| popularize this concept as the tokens streamed in won't be from
| top to bottom.
|
| Then we can have a skeleton response that checks these chunks,
| updates those values and sends them to the UI.
| aaronvg wrote:
| You might also find Semantic Streaming interesting. It's the
| same concept but applied to LLM token streaming. It's used in
| BAML (the ai framework).
| https://www.boundaryml.com/blog/semantic-streaming
|
| I'm one of the developers of BAML.
| creatonez wrote:
| This HN thread is fascinating. A third of the commenters here
| only read 1/3 of the article, another third read 2/3 of the
| article, and another third actually read the whole article. It's
| almost like the people in this thread linearly loaded the article
| and stopped at random points.
|
| Please, don't be the next clueless fool with a "what about X" or
| "this is completely useless" response that is irrelevant to the
| point of the article and doesn't bother to cover the use case
| being proposed here.
| EugeneOZ wrote:
| Just don't resurrect HATEOAS monster, please.
___________________________________________________________________
(page generated 2025-06-01 23:00 UTC)