[HN Gopher] Gojq: Pure Go Implementation of Jq
___________________________________________________________________
Gojq: Pure Go Implementation of Jq
Author : laqq3
Score : 62 points
Date : 2022-08-21 18:05 UTC (4 hours ago)
| simonw wrote:
| "gojq does not keep the order of object keys" is a bit
| disappointing.
|
| I care about key order purely for cosmetic reasons: when I'm
| designing JSON APIs I like to put things like the "id" key first
| in an object layout, and when I'm manipulating JSON using jq or
| similar I like to maintain those aesthetic choices.
|
| I know it's bad to write code that depends on key order, but it's
| important to me as a way of keeping JSON as human-readable as
| possible.
|
| After all, human readability is one of the big benefits of JSON
| over various binary formats.
| haasted wrote:
| I bet it's an artifact of Go having a randomized iteration
| order over maps [0]. Getting a deterministic ordering requires
| extra work.
|
| [0] https://stackoverflow.com/questions/9619479/go-what-
| determin...
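|
| A minimal sketch of the "extra work" mentioned above, assuming a
| plain map[string]int and a hypothetical helper named sortedKeys
| (neither is taken from gojq): collect the keys, sort them, and
| iterate over the sorted slice instead of ranging over the map.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper. It returns the keys of m in
// ascending order so callers can iterate deterministically instead of
// relying on Go's randomized map iteration order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"id": 1, "name": 2, "email": 3}
	// Ranging over m directly may visit keys in a different order on
	// each run; ranging over the sorted slice does not.
	for _, k := range sortedKeys(m) {
		fmt.Printf("%s=%d\n", k, m[k])
	}
}
```

| Note that this gives sorted order, not the insertion order that
| the parent comment asks for.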
| vips7L wrote:
| Does Go not have more than one Map implementation in the
| standard library?
| [deleted]
| esprehn wrote:
| It does not. Maps are not even a real interface you can
| implement, it's compiler magic encoded in the language
| spec: https://dave.cheney.net/2018/05/29/how-the-go-
| runtime-implem...
|
| This is all fallout of not having generics.
| [deleted]
| simonw wrote:
| I used to have the exact same problem with Python, until
| Python 3.7 made maintaining insertion order a feature of the
| language: https://softwaremaniacs.org/blog/2020/02/05/dicts-
| ordered/
| c2h5oh wrote:
| Go actually went in the other direction for a bunch of
| reasons (e.g. hash collision dos) and made key order quasi-
| random when iterating. Small maps used to maintain order,
| but a change was made to randomize that so people didn't
| rely on that and get stung when their maps got larger:
| https://github.com/golang/go/issues/6719
| tialaramex wrote:
| Right, the startling thing about Python's previous dict
| was that it was _so_ terrible that the ordered dict was
| actually significantly faster.
|
| It's like if you did such a bad job making a drag racer
| that the street legal model of the same car was
| substantially faster over a quarter mile despite also
| having much better handling and reliability.
|
| In some communities the reaction would have been to write
| a good _unordered_ dict which would obviously be even
| faster, but since nobody is exactly looking for the best
| possible performance from Python, they decided that
| ordered behaviour was worth the price, and it's not as
| though existing Python programmers could complain since
| it was faster than what they'd been tolerating
| previously.
|
| Randomizing is the other choice if you actually want your
| maps to be fast and want to resist Hyrum's law, but see
| the absl experience - they initially didn't bother to
| randomize tiny maps but then the order of those tiny maps
| changed for technical reasons and... stuff broke. Because
| hey, in testing I made six of this tiny map, they always
| had the same order therefore (ignoring the documentation
| imploring me not to) I shall assume the order is always
| the same...
| alecthomas wrote:
| > Right, the startling thing about Python's previous dict
| was that it was so terrible that the ordered dict was
| actually significantly faster.
|
| I've never heard that before and it would be really
| surprising, given that Python's builtin dict is used for
| everything from local symbol to object field lookup. Do
| you have more information?
| aaronbee wrote:
| Here's a description of the new map implementation and
| why it's more efficient:
| https://www.pypy.org/posts/2015/01/faster-more-memory-
| effici...
| cerved wrote:
| imo it's not about code, it's about predictable and consistent
| layout so that you can easily diff
| zxcvbn4038 wrote:
| Yeah, this is a deal breaker. While technically the key order
| doesn't matter, in the real world it really does matter. People
| have to read this stuff. People have to be able to
| differentiate between actual changes and stuff moving around
| just because. Luckily it's a solved problem and you can write
| marshalers that preserve order, but it's extra work and
| generally specific to an encoding format. It would be nice to
| have ordered maps in the base library as an option.
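|
| One way an order-preserving reader might look in Go, as a sketch
| only: it uses the token API of encoding/json to walk a top-level
| object and record keys in input order. The function name
| topLevelKeys is hypothetical, and nested objects are skipped
| rather than walked. This is not how gojq or jq handle it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// topLevelKeys is a hypothetical helper. It returns the keys of a
// top-level JSON object in the order they appear in the input.
func topLevelKeys(input string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))

	// The first token must be the opening brace of an object.
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object")
	}

	var keys []string
	for dec.More() {
		// Inside an object, tokens alternate between key and value.
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected a string key, got %v", tok)
		}
		keys = append(keys, key)

		// Consume the value without interpreting it.
		var skip json.RawMessage
		if err := dec.Decode(&skip); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

func main() {
	keys, err := topLevelKeys(`{"id": 7, "name": "x", "email": "y"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys)
}
```

| A full ordered marshaler would pair each key with its decoded
| value (for example in a slice of key/value structs) and write them
| back in that order, which is the format-specific extra work the
| comment refers to.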
| silverwind wrote:
| Agree, this is deterring me from this tool. Many
| languages/tools nowadays guarantee object key order which is
| convenient in many ways.
| lapser wrote:
| For what it's worth, JSON Objects are not guaranteed to be
| ordered. Maps in many different languages are implemented
| without an order.
| fwip wrote:
| Not implementing key-sorting is a curious decision:
|
| > gojq does not keep the order of object keys. I understand this
| might cause problems for some scripts but basically, we should
| not rely on the order of object keys. Due to this limitation,
| gojq does not have keys_unsorted function and --sort-keys (-S)
| option. I would implement when ordered map is implemented in the
| standard library of Go but I'm less motivated.
|
| I feel like --sort-keys is most useful when it is producing
| output for tools that do not understand JSON - for example,
| generating diffs or hashes of the JSON string. There is value in
| the output formatting being deterministic for a given input.
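|
| A sketch of the hashing use case, assuming Go's encoding/json is
| used to normalize the document: json.Marshal writes map keys in
| sorted order, so two inputs that differ only in key order produce
| the same bytes. The helper name stableHash is hypothetical, and
| this is not an implementation of any canonicalization standard
| (numbers pass through float64, for instance).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// stableHash is a hypothetical helper. It decodes a JSON document,
// re-encodes it with encoding/json (which sorts map keys), and
// returns the SHA-256 of the re-encoded bytes as a hex string.
func stableHash(doc string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(out)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a, _ := stableHash(`{"b": 1, "a": {"d": 2, "c": 3}}`)
	b, _ := stableHash(`{"a": {"c": 3, "d": 2}, "b": 1}`)
	// Both documents normalize to the same bytes, so the hashes match.
	fmt.Println(a == b)
}
```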
| renewiltord wrote:
| Could pipe through gron and sort to re-sort
| Someone wrote:
| That helps when you want to sort by key, but not when you
| want to keep the order of object keys as in the input file.
| eropple wrote:
| I agree with you that there's value to sorted keys from a
| presentational standpoint (we are not beep-boop robots, humans
| have to read this stuff too), but now there also exists a JSON
| canonicalization RFC that tools can/should follow (with all the
| usual caveats about canonicalization being fraught):
| https://www.rfc-editor.org/rfc/rfc8785
| mdaniel wrote:
| I guess "Informational" is better than /dev/null, but unless
| everyone adopts it, doesn't that run the risk of it just being
| My Favorite Canonicalization™?
|
| Either way, I'm guessing if the gojq author has _that much_
| heartburn about implementing --sort-keys, --canonical is just
| absolutely off the table :-(
| fwip wrote:
| Thank you for letting me know! I hadn't thought to look.
| spullara wrote:
| i neither know nor care what language the original jq was
| implemented in.
| brundolf wrote:
| I can think of two reasons it matters here:
|
| - Can be used as a library in Go projects
|
| - Memory-safe (could be relevant when processing foreign data,
| esp as a part of some automated process)
| donio wrote:
| Yep, Benthos is an example of a cool project that uses gojq
| for its jq syntax support.
| lapser wrote:
| I have actually fully replaced my jq installation with gojq
| (including an `ln -s gojq jq`) for a few years, and no script has
| broken so far. I'm super impressed by the jq compatibility.
|
| If you are going down this route, do be careful with performance.
| I don't know which is more performant as I've never really had to
| work with large data sets, but I can't help but feel jq will be
| faster than gojq in such cases. I have no benchmarks backing this
| up, but who knows, maybe someone will benchmark both.
|
| One of my favourite features is the fact that error messages are
| actually legible, unlike jq.
| brundolf wrote:
| It's very possible it could be faster; jq seems to actually be
| fairly unoptimized. This implementation in OCaml was featured
| on HN a while back and it trashes the original jq in
| performance: https://github.com/davesnx/query-json
|
| After seeing that one I did my own (less-complete) version in
| Rust and managed to squeeze out even more performance in the
| operations it supports: https://github.com/brundonsmith/jqr
| cube2222 wrote:
| This looks quite cool! I'm not sure though why I would use this
| over the original jq. However, I can definitely see the value in
| embedding this into my own applications, to provide jq scripting
| inside of them.
|
| Shameless plug: As I'm not a fan of the jq syntax, I've created
| jql[0] as an alternative to it. It's also written in Go and
| presents a lispy continuation-based query language (it sounds
| much scarier than it really is!). This way it has much less
| special syntax than jq and is - at least to me - much easier to
| compose for common day-to-day JSON manipulation (which is the use
| case it has been created for; there are definitely many features
| of jq that aren't covered by it).
|
| It might seem dead, as it hasn't seen any commit in ages, but to
| me it's just finished, I still use it regularly instead of jq on
| my local dev machine. Check it out if you're not a fan of the jq
| syntax.
|
| [0]: https://github.com/cube2222/jql
| laqq3 wrote:
| One reason to prefer gojq is that gojq's author is one of the
| most knowledgeable people about the original jq (as seen by
| GitHub PRs and issues), and his gojq fixes many long-standing
| issues in jq.
|
| Plus, for my use cases, gojq beats jq by a fair margin.
___________________________________________________________________
(page generated 2022-08-21 23:00 UTC)