[HN Gopher] Zq: An easier and faster alternative to jq
___________________________________________________________________
Zq: An easier and faster alternative to jq
Author : mccanne
Score : 354 points
Date : 2022-04-26 13:02 UTC (9 hours ago)
(HTM) web link (www.brimdata.io)
(TXT) w3m dump (www.brimdata.io)
| hbbio wrote:
| jq is awesome, last time I used it is... today :)
|
| Or rather the pure Go rewrite https://github.com/itchyny/gojq
| which is a better faster implementation, with bugs fixed
| kitd wrote:
| The better error messages alone make this an improvement over
| jq IMHO.
| mdaniel wrote:
| And if it's maintained, that's also a plus, since I didn't
| realize jq was unmaintained, I thought it just didn't have
| any bugs to fix
| kryptozinc wrote:
| Is there a universal json normalizer (to csv for example) that
| doesn't require learning a terse language?
| omaranto wrote:
| There is gron [1], which prints json as a series of assignment
| statements that recreate the json value. It's pretty handy.
|
| [1] https://github.com/TomNomNom/gron
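A minimal Python sketch of the idea behind gron as described above: turn a JSON value into one assignment statement per node, so that line-oriented tools can work on it. This is not gron's implementation. The function name `gron_lines` is hypothetical, and for simplicity it always uses bracket notation for keys and does not sort them, which may differ from the real tool's output.

```python
import json

def gron_lines(value, path="json"):
    """Yield one assignment statement per node (hypothetical helper, not gron's code)."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Simplification: always use bracket notation for object keys.
            yield from gron_lines(child, f"{path}[{json.dumps(key)}]")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from gron_lines(child, f"{path}[{index}]")
    else:
        # Scalars (strings, numbers, booleans, null) are emitted as JSON literals.
        yield f"{path} = {json.dumps(value)};"

doc = json.loads('{"name": "foo", "vals": [1, 2]}')
flat = list(gron_lines(doc))
for line in flat:
    print(line)
```

Each printed line carries its full path, which is what makes the flattened form easy to filter with grep-like tools.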
| ilyash wrote:
| In Next Generation Shell (author here), it is not as ergonomic
| (yet?) but on the other hand it's a fully fledged no-nonsense
| programming language... and, I claim, quite a readable one.
|
|     good_data = fetch("openlibrary.json").docs.filter({"author_name": Arr, "publish_year": Arr})
|
|     good_data.map({{"title": A.title, "author_name": A.author_name[0], "publish_year": A.publish_year[0]}}).group("author_name").mapv(len).sortv((>=)).limit(3)
| pm90 wrote:
| It took me a while to grok jq, but now that I do I kinda like it?
| I don't think I want to learn yet another thing.
|
| I do like tools that complement/supplement jq though, like jid:
| https://github.com/simeji/jid
| tus666 wrote:
| The worst thing about JQ is printing out several values from an
| object at once. The syntax is so bad I have to look it up on SO
| every time.
| dimensionc132 wrote:
| Simple json tasks .... read from, write to, read a value and save
| it as a variable in BASH .... where are those examples?
|
| The question is this: can I do with JSON files using Zq what I
| can do with Python?
| eru wrote:
| Jq being secretly a sort-of functional programming language is
| part of what makes it great.
|
| Why would you change that?
| cosmiccatnap wrote:
| I would love to see what jq looks like on something like a 1mil
| line Json vs this. In my experience jq syntax is fine and I've
| not run into a performance issue on any one file but I seem to
| see a jq clone every few months on here so someone seems to need
| that, or maybe it's just the new volume slider problem who knows.
| justinsaccount wrote:
| jq performance is pretty terrible. Here I'm going to do
| something super simple like pull out a single field out of a
| large log file:
|
|     $ wc -l big.log
|     979400 big.log
|     $ du -hs big.log
|     570M    big.log
|
| `count` is a small program that counts lines on stdin. like
| `sort|uniq -c |sort -n`
|
| jq takes 12 seconds:
|
|     $ time cat big.log |jq -cr .method |~/bin/count
|     848000 GET
|     94800 POST
|     34000 HEAD
|     2400 OPTIONS
|     200 null
|
|     real    0m12.381s
|     user    0m12.427s
|     sys     0m0.333s
|
| my tool takes .5 seconds
|
|     $ time cat big.log |~/bin/jj method |~/bin/count
|     848000 GET
|     94800 POST
|     34000 HEAD
|     2400 OPTIONS
|     200
|
|     real    0m0.466s
|     user    0m0.512s
|     sys     0m0.198s
|
| `jj` is a little tool I wrote that uses
| https://github.com/buger/jsonparser
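For readers who want to see the shape of the task being timed above, here is a rough Python sketch: read newline-delimited JSON, pull out one field, and count how often each value occurs. It is not the commenter's `jj` or `count` tool and makes no claim about speed. The function name `count_field` and the sample records are assumptions made for illustration.

```python
import io
import json
from collections import Counter

def count_field(lines, field):
    """Count values of `field` across newline-delimited JSON objects (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # A missing field is counted under None, similar to jq printing null.
        counts[record.get(field)] += 1
    return counts

# Tiny in-memory stand-in for a log file; a real run would pass an open file.
sample = io.StringIO(
    '{"method": "GET"}\n{"method": "POST"}\n{"method": "GET"}\n{"path": "/"}\n'
)
method_counts = count_field(sample, "method")
for value, n in method_counts.most_common():
    print(n, value)
```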
| gzapp wrote:
| I'm sure I'm not the only person that got fed up with
| occasionally needing to do something more advanced and just
| finding the JQ incantations inscrutable.
|
| Also prob not the first to create a project for personal use that
| just wraps evals in another language haha:
| https://www.npmjs.com/package/jsling
| xg15 wrote:
| A bit OT:
|
| The post links to the tutorial "An Introduction to JQ" at [1].
|
| Somewhere inside the tutorial, array operators are introduced
| like this:
|
| > _jq lets you select the whole array [], a specific element [3],
| or ranges [2:5] and combine these with the object index if
| needed._
|
| This is not supposed to be criticism on this particular tutorial
| (I've seen this kind of description quite often), but I could
| imagine this to be a typical "eyes glaze over" moment, where
| people subtly lose track of what is happening.
|
| It appears to make sense on first glance, but leaves open the
| question what "selecting the whole array" actually means -
| especially, since you can write both ".myarray" and ".myarray[]"
| and both will select the whole array in a sense.
|
| I think this is the point where one would really need to learn
| about sequences and about jq's processing model to not get
| frustrated later.
|
| [1] https://earthly.dev/blog/jq-select/
| adamgordonbell wrote:
| Oh, I wrote that. I think I get what you mean. There are two
| different things, and they aren't being delineated. How would
| you explain it?
|
| I don't know how jq works internally and in my mental model []
| maps into the json array and also can wrap things back into an
| array. So that [.[]] unwraps and then rewraps a JSON array,
| sort of like how [.[].title] is the same as map(.title).
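One way to picture the distinction discussed above is a small Python model in which `.myarray` yields a single result (the array itself) while `.myarray[]` yields each element as a separate result. This is only an analogy for jq's stream-of-results model, not how jq is implemented; the helper names `iterate` and `collect` are made up for this sketch.

```python
def iterate(value):
    """Rough analogue of jq's `.[]`: emit each element as a separate result."""
    if isinstance(value, dict):
        yield from value.values()
    else:
        yield from value

def collect(stream):
    """Rough analogue of jq's `[ ... ]`: gather a stream of results into one array."""
    return list(stream)

books = [{"title": "A"}, {"title": "B"}]

identity_results = [books]                # `.`     -> one result: the whole array
iterated_results = list(iterate(books))   # `.[]`   -> two results: one per element
rewrapped = collect(iterate(books))       # `[.[]]` -> unwrap, then rewrap

# `[.[].title]` and `map(.title)` describe the same computation in this model.
titles_via_iterate = collect(book["title"] for book in iterate(books))
titles_via_map = [book["title"] for book in books]

print(len(identity_results), len(iterated_results))
print(rewrapped == books, titles_via_iterate == titles_via_map)
```

In this model, "selecting the whole array" with `.myarray` keeps one value flowing through the pipeline, while `.myarray[]` fans it out so that later steps run once per element.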
| henrydark wrote:
| I have recently started to use jq massively, and I love it.
|
| Zq looks cool, but the fact that this piece doesn't contain a
| single instance of the word "map" tells me the authors still
| haven't gotten jq. Especially with the running strawman example
| of adding numbers.
| ducaale wrote:
| In the theme of jq alternatives, there is fx[1] which has an
| interactive view and supports querying JSON in Javascript, Python
| and Ruby. It used to be a node CLI but was recently rewritten in
| golang[2]
|
| [1] https://github.com/antonmedv/fx
|
| [2] https://twitter.com/antonmedv/status/1515429017582809090
| stblack wrote:
| Why all the hate HN?
|
| I feel the author makes his case clearly, then presents an
| alternative. Underneath all this is a ton of work, for which I
| applaud OP.
|
| It may not scratch your particular itch, but come on!
|
| Being an ass on HN is a choice. It happens far too often, and I
| wish everyone would just dial it back.
| dimitrios1 wrote:
| Do not confuse critique with hate.
|
| This place has a high standard for new tools and libraries,
| particularly one that claims to be better in any stretch
| ("faster" and "easier"). If this was say, a college student
| learning programming and presenting it as "hey I made a jq
| alternative and I believe it's easier and faster" I imagine it
| would solicit more softened feedback.
|
| Come prepared, and ready to defend your stance. If you can't
| take the heat, don't come in the kitchen.
| eatonphil wrote:
| I don't see hate for the project here.
|
| I see criticism for the way they're trying to position it as
| _easier_ than jq when it's just different than jq.
|
| It looks like a cool project on its own and doesn't need to
| describe jq as confusing to make that point.
| skybrian wrote:
| But it is easier, for them.
|
| Easier, as a universal claim, is hard to establish - you'd
| need to do user studies. Easier in the author's opinion is
| normal usage, and their opinion is as good as anyone else's.
| They gave a reasonable justification.
|
| I kind of think you'd need to use both tools to have an
| informed opinion about which you think is easier. But most of
| us aren't going to do that, which is fine.
|
| I think having strong opinions about which is easier without
| trying them both is weird, though.
| pessimizer wrote:
| > But it is easier, for them.
|
| As they wrote it, it would be surprising if it weren't.
| diehunde wrote:
| Pardon my ignorance, but why would I spend time learning something
| like jq or zq when it only takes me a couple of minutes to
| develop a script using some high-level language? I've had to
| process complex JSON files in the past, and a simple Python
| script gets the job done, and the syntax is much more familiar
| and easier to memorize. Is there a use case I'm missing?
| vlunkr wrote:
| jq is great for shell scripts. Say your script hits an API that
| returns JSON, and you want to retrieve a single field. This can
| be difficult to do correctly with grep or other text matching
| tools, but is trivial with jq. You just pipe it in like "curl
| XYZ | jq '.path.to.your.data'"
|
| I imagine this is how it's used 90% of the time, but can do
| lots more advanced stuff as described in the article.
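For comparison with the one-liner above, here is a hedged Python sketch of the same "pull one nested field out of a JSON response" task. The helper name `get_path` and the sample response body are assumptions for illustration. Unlike jq, which yields null for a missing key, this version raises `KeyError`.

```python
import json
from functools import reduce

def get_path(document, dotted_path):
    """Follow a jq-style dotted path such as '.path.to.your.data' (hypothetical helper)."""
    keys = [key for key in dotted_path.split(".") if key]
    # Walk down one key at a time; raises KeyError if a key is missing.
    return reduce(lambda node, key: node[key], keys, document)

# Stand-in for the body returned by the API call.
response_body = '{"path": {"to": {"your": {"data": 42}}}}'
value = get_path(json.loads(response_body), ".path.to.your.data")
print(value)
```

The script works, but it is several lines plus an interpreter dependency for what the shell pipeline expresses in one expression, which is the trade-off the comment describes.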
| johnday wrote:
| Suppose you write a shell script which is intended for use
| among colleagues as part of a pipeline.
|
| In many cases, the most appropriate and useful tool for the job
| would be jq - one line in the shell script corresponding to the
| required data transform, calling out to `jq`, which already has
| a reasonable user base and documentation, and could be
| trivially replaced by anyone if the business needs change.
| orthecreedence wrote:
| You can spend a few days getting to know jq or you can happily
| live with your 100+ purpose-built scripts. I know which one I
| prefer.
|
| I don't even process complex JSON...it's usually pretty basic.
| But being able to quickly select parts out of streams of JSON
| data on the CLI is incredibly useful to me, and learning even
| just the basics of jq has paid for itself a hundred times over
| by now.
|
| Granted, a lot of my job right now is data forensics stuff, so
| I breathe this kind of stuff. You might never need jq.
| johnthuss wrote:
| There is certainly a learning curve with jq that can put people
| off. The attraction is that the end result is a very small
| amount of code that does only one thing: parse a JSON file,
| rather than invoking an external script that might send many
| HTTP requests or launch a missile.
|
| As the complexity of the input JSON grows or the complexity of
| your processing, it does make sense to leave jq behind for a
| higher level language.
| eru wrote:
| I agree with most of what you say.
|
| I disagree with 'leaving for a higher level language'. Jq is
| an extremely high level language.
|
| What it is _not_ is a general purpose language.
| ris wrote:
| 1. The High Level Language of your choice may not be the
| flavour liked by other members of your team. Ruby? ew please
| use Python - unnecessary discussion ensues...
|
| 2. Your High Level Language of choice would probably require a
| non-trivial container image, which requires extra decisions to
| be made about sourcing, which is something you'd rather not
| think about if this is just e.g. a step in a CD pipeline. jq is
| tiny and a very simple addition to an existing image. It's even
| present by default in GitHub Actions' `ubuntu-latest`.
|
| 3. Your High Level Language of choice may require dependencies
| to do the same job. How are those dependencies going to be
| defined, pinned, who's going to be responsible for bumping
| them...?
|
| I used to 100% agree with you, but these days I understand why
| so much stuff ends up being bash and jq.
| [deleted]
| meowface wrote:
| If you're doing a lot of JSON munging every day and have good
| mastery of something like jq or zq, you can probably get things
| done faster.
|
| Like you, I almost always just write Python scripts for such
| tasks because it's a lot easier for me to reason through it and
| debug it, but it's definitely slower-going than what I might do
| if I were very adept in a terse language like jq. I don't do
| this too often, so it makes little difference to me, but if
| someone is doing this multiple times a day, every day, it'll
| add up. As you say, it takes a few minutes; with jq, it could
| be a few seconds.
| aftbit wrote:
| This is how I felt about regular expressions when I was first
| learning them. Now I feel that they're one of the most powerful
| text-processing tools that I know. I also felt similarly about
| SQL at the very beginning. IMO if you find yourself doing a
| _lot_ of JSON processing, learning at least basic jq gives you
| superpowers.
| folkrav wrote:
| Honestly, I've only really used `jq` to quickly parse JSON
| structures in interactive sessions, e.g.
|
|     curl -s http://foo.bar | jq .some.nested.value
|
| Anything more complicated I would indeed go for writing a
| proper script.
| eru wrote:
| Don't tell anyone, but jq is secretly a pretty well thought
| out functional programming language.
| pantulis wrote:
| I am also on the same side of the discussion, but I'm a
| programmer by trade. Most of the non-trivial jq uses I've seen
| are by people doing command line or shell script magic. In
| this context I guess it's easier to write the jq expression
| language than to whip up a fully fledged Python/Ruby/Perl
| script without having to debug pretty basic stuff once you know
| the syntax. Pretty much like awk.
| eatonphil wrote:
| I'm a programmer by trade. I use jq. :)
| meepmorp wrote:
| The same thing could be said for grep, or really any other
| utility that can have its functionality reproduced in a
| programming language.
| eru wrote:
| Indeed! Jq is basically something like grep for JSON.
|
| It might actually make sense to embed jq functionality into
| your favourite language (as a library or so), as it is quite
| a nice and well-chosen set of functionality.
| preferjq wrote:
| I would love to see jq libraries become as common as regex
| libraries so I could use jq directly in whatever stack or
| environment I'm working on.
| lichtenberger wrote:
| I'm working on a JSONiq based implementation to jointly process
| JSON data and XML. The compiler uses set-oriented processing (and
| thus uses hash joins for instance wherever applicable) and is
| meant to provide a base for JSON based database systems with
| shared common optimizations (but can also be used as a standalone
| in-memory query processor):
|
| http://brackit.io
|
| The language itself borrows a lot of concepts from functional
| languages, such as higher-order functions, closures... you can also
| develop modules with functions for easy reuse...
|
| A simple join for instance looks like this:
|
|     let $stores := [
|       { "store number" : 1, "state" : "MA" },
|       { "store number" : 2, "state" : "MA" },
|       { "store number" : 3, "state" : "CA" },
|       { "store number" : 4, "state" : "CA" }
|     ]
|     let $sales := [
|       { "product" : "broiler", "store number" : 1, "quantity" : 20 },
|       { "product" : "toaster", "store number" : 2, "quantity" : 100 },
|       { "product" : "toaster", "store number" : 2, "quantity" : 50 },
|       { "product" : "toaster", "store number" : 3, "quantity" : 50 },
|       { "product" : "blender", "store number" : 3, "quantity" : 100 },
|       { "product" : "blender", "store number" : 3, "quantity" : 150 },
|       { "product" : "socks", "store number" : 1, "quantity" : 500 },
|       { "product" : "socks", "store number" : 2, "quantity" : 10 },
|       { "product" : "shirt", "store number" : 3, "quantity" : 10 }
|     ]
|     let $join :=
|       for $store in $stores, $sale in $sales
|       where $store=>"store number" = $sale=>"store number"
|       return {
|         "nb" : $store=>"store number",
|         "state" : $store=>state,
|         "sold" : $sale=>product
|       }
|     return [$join]
|
| Of course you can also group by, count, order by, nest FLWOR
| clauses...
| preferjq wrote:
| Here is a straightforward jq translation
|
|     def stores: [
|       { "store number" : 1, "state" : "MA" },
|       { "store number" : 2, "state" : "MA" },
|       { "store number" : 3, "state" : "CA" },
|       { "store number" : 4, "state" : "CA" }
|     ];
|     def sales: [
|       { "product" : "broiler", "store number" : 1, "quantity" : 20 },
|       { "product" : "toaster", "store number" : 2, "quantity" : 100 },
|       { "product" : "toaster", "store number" : 2, "quantity" : 50 },
|       { "product" : "toaster", "store number" : 3, "quantity" : 50 },
|       { "product" : "blender", "store number" : 3, "quantity" : 100 },
|       { "product" : "blender", "store number" : 3, "quantity" : 150 },
|       { "product" : "socks", "store number" : 1, "quantity" : 500 },
|       { "product" : "socks", "store number" : 2, "quantity" : 10 },
|       { "product" : "shirt", "store number" : 3, "quantity" : 10 }
|     ];
|     [ {store: stores[], sale: sales[]}
|       | select(.store."store number" == .sale."store number")
|       | { nb: .store."store number", state: .store.state, sold: .sale.product }
|     ]
|
| Try it online -
| https://tio.run/##rZPPUsMgEMbP5Sl2ctIZmklbe6HTg@PZJ8jkkD84Rh...
| lichtenberger wrote:
| The difference might be that Brackit uses sophisticated join
| algorithms for these kinds of implicit joins as known from
| relational query processing.
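To illustrate the difference being pointed at, here is a Python sketch contrasting a nested-loop join (compare every store with every sale, roughly what the cross-product-plus-select formulation describes) with a hash join (index one side by the join key first). It uses a subset of the data from the examples above. It is a generic textbook illustration and does not claim to show how Brackit or jq actually execute the query; the function names are made up.

```python
from collections import defaultdict

stores = [
    {"store number": 1, "state": "MA"},
    {"store number": 2, "state": "MA"},
    {"store number": 3, "state": "CA"},
    {"store number": 4, "state": "CA"},
]
sales = [
    {"product": "broiler", "store number": 1, "quantity": 20},
    {"product": "toaster", "store number": 2, "quantity": 100},
    {"product": "shirt", "store number": 3, "quantity": 10},
]

def nested_loop_join(stores, sales):
    """Compare every pair: O(len(stores) * len(sales)) comparisons."""
    return [
        {"nb": st["store number"], "state": st["state"], "sold": sa["product"]}
        for st in stores
        for sa in sales
        if st["store number"] == sa["store number"]
    ]

def hash_join(stores, sales):
    """Build a hash table on the join key, then probe it once per store."""
    sales_by_store = defaultdict(list)
    for sa in sales:
        sales_by_store[sa["store number"]].append(sa)
    return [
        {"nb": st["store number"], "state": st["state"], "sold": sa["product"]}
        for st in stores
        for sa in sales_by_store.get(st["store number"], [])
    ]

joined = hash_join(stores, sales)
print(joined)
```

Both functions return the same rows; the hash join just avoids looking at pairs that cannot match, which matters as the inputs grow.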
| knowsuchagency wrote:
| jq is a great tool, but my favorite alternative, by far, is jello
| and the libraries the author has created around it
| https://blog.kellybrazil.com/2020/03/25/jello-the-jq-alterna...
| bradwood wrote:
| Nothing beats gron in my view.
|
| That plus good old fashioned sed/grep/awk give me everything I
| need to do on the cli.
|
| If I want more, it's python or node.
| tzury wrote:
| yq uses jq like syntax but works with YAML, JSON and XML.
|
| https://github.com/mikefarah/yq
| ctur wrote:
| It takes a while to get to the point so I'll save others some
| time and tldr this very lengthy and agenda-driven blog post:
|
| jq had a tough learning curve so you should switch to zq which is
| a (closed source?) wrapper around an obscure language you've
| never heard of that we promise is easier because reasons. Also
| coincidentally it's the language of an ecosystem we were funded
| to build.
|
| Edit: mea culpa, turns out you can download the source (revealed
| half way through the article).
| loeg wrote:
| Closed source?
| https://github.com/brimdata/zed/blob/main/runtime/query.go
|
| Yes, it's an obscure query language. But if you were interested
| in jq, that clearly wasn't a barrier to entry.
|
| I agree the author is happy to show off their tool, but
| disagree that that is somehow disqualifying. They made a cool
| thing, they're allowed to be proud about it.
| mdaniel wrote:
| And it's BSD 3 Clause, for those interested:
| https://github.com/brimdata/zed/blob/v1.0.0/LICENSE.txt
| [deleted]
| mccanne wrote:
| Hi, all. Author here. Thanks for all the great feedback.
|
| I've learned a lot from your comments and pointers.
|
| The Zed project is broader than "a jq alternative" and my bad for
| trying out this initial positioning. I do know there are a lot of
| people out there who find jq really confusing, but it's clear if
| you become an expert, my arguments don't hold water.
|
| We've had great feedback from many of our users who are really
| productive with the blend of search, analytics, and data
| discovery in the Zed language, and who find manipulating eclectic
| data in the ZNG format to be really easy.
|
| Anyway, we'll write more about these other aspects of the Zed
| project in the coming weeks and months, and in the meantime, if
| you find any of this intriguing and want to kick the tires, feel
| free to hop on our slack with questions/feedback or file GitHub
| issues if you have ideas for improvements or find bugs.
|
| Thanks a million!
|
| https://github.com/brimdata/zed
| https://www.brimdata.io/join-slack/
| [deleted]
| preferjq wrote:
| "cobbled-together" jq as it often appears in the wild will
| often compare badly with crafted solutions because the writer's
| goal is usually GSD and not write pretty code.
|
| People with the time and inclination to slow down and think a
| little more about how the tools work will produce cleaner
| solutions.
|
| In your example to convert
|
|     {"name":"foo","vals":[1,2,3]}
|
| to
|
|     {"name":"foo","val":1}
|     {"name":"foo","val":2}
|     {"name":"foo","val":3}
|
| All you need is this jq filter
|
|     {name:.name, val:.vals[]}
|
| To me this is much better than the proposed zq or jq solution
| you're using as a basis for comparison. You could almost use
| the shorter
|
|     .vals = .vals[]
|
| if the name in the output didn't change.
|
| These filters take advantage of how jq's [] operator converts
| a single result into separate results. For people new to jq
| this behavior is often confusing unless they've seen things
| like Cartesian products.
|
| .[] -
| https://stedolan.github.io/jq/manual/#Array/ObjectValueItera...
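The Cartesian-product behavior mentioned above can be modelled in a few lines of Python: each field expression in the object constructor produces a list of results, and one output object is built per combination. This is an analogy for the jq filter `{name:.name, val:.vals[]}`, not a description of jq's internals; the variable names are chosen for this sketch.

```python
from itertools import product

record = {"name": "foo", "vals": [1, 2, 3]}

# `.name` yields exactly one result; `.vals[]` yields one result per element.
name_results = [record["name"]]
val_results = list(record["vals"])

# One output object per (name, val) combination: 1 x 3 = 3 objects.
expanded = [
    {"name": name, "val": val}
    for name, val in product(name_results, val_results)
]

for obj in expanded:
    print(obj)
```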
| 1vuio0pswjnm7 wrote:
| Thank you for your work on tcpdump, (original) bpf and the pcap
| library. I benefit from those projects every day.
|
| ZSON looks way better than JSON. I pray that the Zed project
| becomes more popular.
| mccanne wrote:
| Wow, thanks.
|
| Coincidentally, after hearing of a friend's woes dealing with
| massive amounts of CSV coming from a BPF-instrumented kernel,
| I played around a bit with integrating Zed and BPF. Just an
| experimental toy (and the repo is already out of date)...
|
| https://github.com/brimdata/zbpf
|
| The nice thing about Zed here is any value can be a group-by
| key so it's easy, for example, to use kernel stacks (an array
| of strings) in a grouping aggregate.
|
| (p.s. for the record, the only thing I have to do with the
| modern linux BPF system is the tiny vestige of origin story
| it shares with the original work I did in the BSD kernel
| around 1990)
| rienko wrote:
| Ever since my team started using Splunk (circa 2012), we
| have clamored for a more open version we could tinker with that would not
| cost an arm and a leg to ingest multiple terabytes of daily
| data.
|
| Positioning as an opensource Splunk would be an interesting
| play. Going through your docs the union() function looks like
| it returns a set, akin to splunk values(), is there the
| equivalent to list()?
|
| Elastic is great in its lane, but it requires more resources
| and has a monolith weight, that has left a sour taste from our
| internal testing. Doing a minimal ElasticSearch compatible API
| would open up your target audience. Are there any plans to do
| it in a short term horizon (< 1 year)?
| mccanne wrote:
| That's a cool idea. We've had many collaborators using Zed
| lakes for search at smallish scale and we are still building
| the breadth of features needed for a serious search platform,
| but I think we have a nice architecture that holds the
| promise to blend the best of both worlds of warehouses and
| search.
|
| As for list() and values() functions, Zed has native arrays
| and sets so there's no need for a "multi-value" concept as in
| splunk. If you want to turn a set into an array, a cast will
| do the trick, e.g.,
|
| echo '1 2 2 3 3' | zq 'u:=union(this) | cast(u,<[int64]>) ' -
|
| [1,2,3]
|
| (Note that <[int64]> is a type value that represents array of
| int64.)
| gauravphoenix wrote:
| there is Dassana[1] if someone wants to try out a JSON-native,
| index-free, schema-less solution built on top of
| ClickHouse.
|
| ShowHN post(FAQ)[2]
|
| disclaimer- I'm founder/CEO of Dassana.
|
| [1] https://lake.dassana.io/
|
| [2] https://news.ycombinator.com/item?id=31111432
| harbor11012 wrote:
| btw, in case you don't know, you can actually run jq using a curl
| command:
|
| https://xbin.io/w/tool/jq
| algesten wrote:
| I don't get it. "Instead of learning jq DSL, learn zq DSL".
|
| To me they look similarly complicated and the examples stress
| certain aggregation operations that are harder to do in jq (due
| to it being stateless).
| enriquto wrote:
| > "Instead of learning jq DSL, learn zq DSL".
|
| A saner approach is to gron the damn json and just use regular
| unix tools on the data.
| p5a0u9l wrote:
| Yes, but fortunately, your efforts will pay dividends when
| parsing all the 'z*' boutique formats that it supports, zson,
| zst, zng, the list goes on. /s
| mattnibs wrote:
| Not sure if this came across in the article, but all the
| "boutique" z* formats are all representations of the same zed
| model https://zed.brimdata.io/docs/formats/zed/
| loeg wrote:
| > "Instead of learning jq DSL, learn zq DSL"
|
| I think you got it -- that's exactly the idea. They claim
| (reasonably?) that it's a more intuitive DSL; and it supports
| state. They also make some performance claims towards the end
| of the article.
| jerrysievert wrote:
| > They also make some performance claims towards the end of
| the article.
|
| essentially a marginal speed increase they think on json, but
| a much bigger speed increase (5x-100x they claim) if you
| switch to their native format ZNG.
|
| if I'm switching formats completely, I'm not sure why I care
| about jq vs zq in json performance ...
| loeg wrote:
| Marginally faster is better than marginally slower, at
| least. I agree the JSON use is probably more compelling
| than their ZNG thing.
| jerrysievert wrote:
| > I agree the JSON use is probably more compelling than
| their ZNG thing.
|
| considering how much data I can already get via json (or
| converted to json via other json related standards such
| as geojson), there doesn't seem to be much of a
| compelling case to use ZNG.
|
| I'd love to hear different though!
| xg15 wrote:
| So, admitted jq fanboy here, but I found a lot of the criticism
| from the article really sensible.
|
| I think jq has a pretty elegant data model, but the syntax is
| often very clunky to work with.
|
| So here is a half thought-out idea how you might improve the
| syntax for the "stateful operations" usecase the OP outlined:
|
| I think it's not quite true that different elements of a sequence
| can never interact. The OP mentioned reduce/foreach, but it's
| also what any function that takes arguments does:
|
| If you have an expression 'foo | bar', then bar is called once
| for every element foo emits. However, foo could also be a function
| that takes arguments. Then you can specify bar as an argument of
| foo like this: 'foo(bar)'. In this situation, execution of bar is
| completely controlled by foo. In particular, foo gets to see
| _all_ elements that bar emits, not just one each. I believe this
| is how e.g. [x] can collect all elements of x into an array.
|
| In the same way, you could write a function 'add_all(x)' which
| calls x and adds up all emitted elements to a sum.
|
| However, this wouldn't help you with collecting all input lines,
| as there is nothing for your function to "wrap around". Or at
| least, there _used_ to be nothing, but I think in one of the
| recent builds, an "inputs" function was added, which emits all
| remaining inputs. So now, you can write e.g. '[., inputs]' to
| reimplement slurp. In the same way, you could sum up all input
| lines by writing 'add_all(., inputs)'.
|
| However, this is still ugly and unintuitive to write, so I think
| introducing some syntactic sugar for this would be useful. E.g.,
| you could imagine a "collect operator", e.g. '>>' which treats
| everything left of it as the first argument to the function to
| the right of it.
|
| e.g., writing 'a >> b' would desugar to 'b(a)'.
|
| Writing 'a | b >> c' would desugar to 'c(a | b)'.
|
| Any steps further to the right are not affected:
|
| 'a | b >> c | d' would desugar to 'c(a | b) | d'.
|
| Scope to the left could be controlled with parentheses:
|
| 'a | (b >> c)' would desugar to 'a | c(b)'.
|
| To make this more useful for aggregating on input lines, you
| could add a special rule that, if the operator is used with no
| parentheses, it will implicitly prepend '(., inputs)' as the
| first step.
|
| So if the entire top-level expression is 'a | b >> c', it would
| desugar to 'c((., inputs) | a | b)'.
|
| This would make many usecases that require keeping state much
| more straight-forward. E.g. collecting all the "baz" fields into
| an array could be written as '.baz >> []' which would desugar to
| '[(., inputs) | .baz]'
|
| Summing up all the bazzes could be written as '.baz >> add_all'
| which would desugar to 'add_all((., inputs) | .baz)'
|
| ...and so on.
|
| On the other hand, this could also lead to new confusion, as you
| could also write stuff like '... | (.baz >> map) | ...' which
| would really mean 'map(.baz)' or 'foo >> bar >> baz' which would
| desugar to the extremely cryptic expression 'baz((., inputs) |
| bar((., inputs) | foo))'. So I'm not quite sure.
|
| Any thoughts about the idea?
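The proposal above can be mimicked in Python to make the desugaring concrete. Both the `>>` operator and `add_all` are the commenter's hypothetical constructs, not existing jq features; here `inputs` is a small generator standing in for jq's stream of input documents, and the sample lines are invented.

```python
import json

def inputs(lines):
    """Stand-in for jq's '(., inputs)': yield every input document in order."""
    for line in lines:
        if line.strip():
            yield json.loads(line)

def add_all(stream):
    """Hypothetical aggregate: it sees *all* elements its argument emits."""
    total = 0
    for item in stream:
        total += item
    return total

raw_lines = ['{"baz": 1}', '{"baz": 2}', '{"baz": 3}']

# '.baz >> []' would desugar to '[(., inputs) | .baz]'
collected = [doc["baz"] for doc in inputs(raw_lines)]

# '.baz >> add_all' would desugar to 'add_all((., inputs) | .baz)'
total = add_all(doc["baz"] for doc in inputs(raw_lines))

print(collected, total)
```

The key point is that the function on the right of `>>` receives the whole stream as its argument, rather than being invoked once per element as it would be after a plain `|`.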
| jrm4 wrote:
| Okay, so I'm a big scripter and not much of a programmer and I
| definitely have found jq to be mostly worthless to me; but it
| also looks like zq doesn't much help?
|
| Seems to me that if you're in a shell, then you should be "shell-
| like." There should not be much of a learning curve at all, and
| when in doubt, try to behave like other shell tools, in a Unix
| way. Make pipe behavior generally predictable, especially for
| those who aren't deep into json et al.
|
| And if you're not going to do that, say so on "the box?"
|
| (Disclaimer, it could be that I'm an idiot when it comes to all
| of this and I'm missing something big. Kind of feels that way,
| and I welcome correction)
| dymk wrote:
| Could you help me understand what your usecases are, and where
| jq/zq fall short? I find the tools useful for e.g. curl'ing an
| endpoint that returns JSON, and then
| mapping/filtering/reducing the content into what I want. It
| seems pretty unix-y to me, but I'm curious what the
| shortcomings are. For instance, could you give an example where
| the pipe behavior is unpredictable?
| brushfoot wrote:
| The name of its corporate progenitor may leave a bad taste in
| some mouths, but I highly recommend PowerShell for this sort of
| thing. It's cross platform, MIT licensed, and comes with
| excellent JSON parsing and querying capabilities. Reading,
| parsing, and querying JSON to return all red cars:
|
|     Get-Content cars.json | ConvertFrom-Json | ? { $_.color -eq 'red' }
|
| The beauty of this is that the query syntax applies not just to
| JSON but to every type of collection, so you don't have to learn
| a specific syntax for JSON and another for another data type. You
| can use Get-Process on Linux to get running processes and filter
| them in the same way. The same for files, HTML tags, etc. I think
| nushell is doing something similar, though I haven't tried it
| yet.
|
| I prefer this approach to another domain-specific language, as
| interesting as jq's and zq's are.
| klysm wrote:
| I want to learn powershell, but I have an internal ick bias
| because I've been using bash for so many years. The tab
| behavior is the exact opposite of what I expect and it short
| circuits my brain every single time I press it. Having
| structured data in the pipes seems very useful and powerful
| though so I should probably just bite the bullet.
| vips7L wrote:
| Tab behavior is configurable. I have mine set to menu
| expansion.
|
|     Set-PSReadLineKeyHandler -Key Tab -Function MenuComplete
| icedchai wrote:
| Thanks for that! I've recently been learning PowerShell.
| After 30 years of bash it is interesting.
| [deleted]
| tl wrote:
| Powershell's object pipes are more inspectable than any of
| Bourne's text-based descendants. But the tool itself occupies a
| niche between "write shell, dealing with esoterica of
| grep/sed/awk/jq/etc" and "write python getting constructs that
| handle complexity better than pipes".
|
| Looking at the popularity of VSCode, I don't think Microsoft
| hatred blocks its adoption.
| bblb wrote:
| PowerShell is "Python interactive done right". It's too bad it
| has a bad rap in the open source community and it might never get
| the traction it really deserves. Sure it has its downsides,
| which tech doesn't, but PowerShell has solved so many issues
| and annoyances with the shells that we've been used to, that it
| still comes out as the winner.
|
| I've been using it since day one from 2006, every single day.
| It has come a long way and the current PS7 is the best shell
| experience there is. Hands down no contest.
|
| Snover's passionate early presentation about the PS pipeline is
| a pretty cool tech video.
| https://www.youtube.com/watch?v=325kY2Umgw8
| Arnavion wrote:
| jq can not only process JSON input but also emit JSON output.
| So on that note, has ConvertTo-Json stopped mangling your JSON
| yet? https://news.ycombinator.com/item?id=25500632
| vips7L wrote:
| > The beauty of this is that the query syntax applies not just
| to JSON but to every type of collection,
|
| This is the best part of pwsh. Everything is standardized,
| you're not guessing at the idioms of each command, and you're
| working with objects instead of parsing strings!
|
| My second favorite part is having access to the entire C#
| standard library.
| pxc wrote:
| Agreed-- PowerShell is really nice for this, as are some of the
| other shells it has inspired.
| ilyash wrote:
| .. or you can try Next Generation Shell (author here):
|
| fetch("cars.json").filter({"color": "red"})
|
| # or
|
| echo(fetch("cars.json").filter({"color": "red"}))
| ptx wrote:
| PowerShell "sends basic telemetry data to Microsoft [...] about
| the host running PowerShell, and information about how
| PowerShell is used" [1].
|
| And since it relies on .NET, that also requires its own
| separate opt-out for its telemetry. There might be other
| components, now or in the future, that also send data to
| Microsoft by default and would have to be separately discovered
| and disabled.
|
| [1] https://docs.microsoft.com/en-
| us/powershell/module/microsoft...
| brushfoot wrote:
| To me a telemetry opt-out is a small price to pay for what
| PowerShell brings to the table, but to each their own.
|
| > There might be other components, now or in the future, that
| also send data to Microsoft
|
| Of course. Do your due diligence on whatever you install. No
| tool should be exempt from that.
| mschuster91 wrote:
| > Do your due diligence on whatever you install. No tool
| should be exempt from that.
|
| That's a ridiculous take. 99% of users don't understand
| what all that technobabble in a typical EULA means, they
| will just go for the option they are nudged to (which is
| why first the courts and now enforcement agencies are
| stepping up their game against that practice [1]).
|
| The way that the GDPR expects stuff to be handled is by
| getting _explicit_ user consent, the consent must be a
| reasonably free choice (i.e. deals like "give me your
| personal data and the app is free, otherwise pay" are
| banned), and there must not be any exchange of GDPR-
| protected data without that consent unless technically
| required to perform the service the user demands. Clearly,
| a telemetry _opt-out_ is completely against the spirit of
| the GDPR and I seriously hope for Microsoft to get
| flattened by the courts for the bullshit they have been
| pulling for way too long now.
|
| What I would _actually_ expect of Microsoft is to follow
| the Apple way: have one single central place, ideally at
| setup and later in the System Preferences, where tracking,
| analytics and other optional crap can be disabled system-
| wide.
|
| [1] https://www.hiddemann.de/allgemein/lg-rostock-bejaht-
| unterla...
| jodrellblank wrote:
| The GDPR applies to personal data. PowerShell telemetry
| isn't personal data, so it's not covered by the GDPR.
| What is reported is documented here:
|
| https://docs.microsoft.com/en-
| us/powershell/module/microsoft...
|
| and is "_anonymized information about the host running
| PowerShell, and information about how PowerShell is used_".
| It sucks that it has telemetry, but anonymised
| information about whether a computer ran 10 .exe or 10
| cmdlets pales into insignificance against Windows and
| Edge and OneDrive slurping up names, addresses, files,
| moving logins to Microsoft accounts, sending browser
| history to Microsoft, checking downloads with Microsoft,
| keeping a history of all programs run in Windows for
| timeline and trying to send that to Microsoft to sync it
| between devices, moving OneNote to the cloud, having the
| start menu search be a Bing web search, defaulting to
| Cortana being a cloud based voice search, sending pen and
| ink data to Microsoft, and etc. etc.
| mschuster91 wrote:
| Even the fact that a particular piece of software is used
| by a specific IP address _is_ enough PII that it's
| covered under GDPR by most viewpoints. The fact that
| Microsoft is collecting even more data doesn't excuse
| telemetry in PowerShell _at all_.
|
| I would simply wish for _no_ telemetry to happen at all
| without user consent. If Microsoft wants information
| about how people use their software or how stable it is
| and not enough people opt in, _they should fucking pay
| people money_ for market research and QA.
| brushfoot wrote:
| > That's a ridiculous take
|
| Then it befits a ridiculous state of affairs. It would be
| great to have the standards you suggest, and it's a shame
| that we don't. But that doesn't change the fact that we
| don't, and because we don't, we need to do due diligence
| on the tools we install.
| ElectricalUnion wrote:
| > What I would actually expect of Microsoft is to follow
| the Apple way: have one single central place, ideally at
| setup and later in the System Preferences, where
| tracking, analytics and other optional crap can be
| disabled system-wide.
|
| This is still GDPR non-compliant, you should have a
| central place to _opt-in_ _tracking, analytics and other
| optional crap_ if you so desire.
| mschuster91 wrote:
| So what? You can opt-in to tracking in the macOS System
| Preferences, pane "security and data protection", tab
| "Privacy" _at any time_ you wish should you not have done
| so during the macOS onboarding process.
|
| In Debian, you can opt-in at setup time or any later time
| with a simple "dpkg-reconfigure popularity-contest" (even
| though that one isn't fully GDPR-compliant as you can't
| easily read what exactly is being done from the same
| screen).
| ElectricalUnion wrote:
| > So what? You can opt-in to tracking in the macOS System
| Preferences, pane "security and data protection", tab
| "Privacy" at any time you wish should you not have done
| so during the macOS onboarding process.
|
| You cannot _opt-in_. You can go to `System Preferences >
| Security & Privacy > Analytics & Improvements` and
| _opt-out_, but the default _is not opt-in_.
| sandyarmstrong wrote:
| > And since it relies on .NET, that also requires its own
| separate opt-out for its telemetry.
|
| Building a program with .NET does NOT cause that program to
| send telemetry to Microsoft.
|
| You're thinking of the .NET SDK itself. Using PowerShell does
| not trigger any use of the .NET SDK.
|
| Disclaimer: I work for Microsoft.
| ptx wrote:
| Ah, yes, my mistake. Although PowerShell sends its own
| telemetry, the additional telemetry from the .NET platform
| is only sent when you use the _dotnet_ command [1] and, as
| a special case, not when you very carefully invoke it only
| "in the following format: dotnet [path-to-app].dll" and
| never e.g. "dotnet help".
|
| However, presumably PowerShell requires at least the .NET
| Runtime if not the .NET SDK, doesn't it? The docs [2]
| suggest running "dotnet --list-runtimes" to "see which
| versions of the .NET runtime are currently installed", so
| it sounds like the Runtime also includes the _dotnet_
| command. Does running the recommended "dotnet --list-
| runtimes" command send telemetry, like most of the
| commands? Or are you saying that the Runtime, unlike the
| SDK, doesn't include telemetry at all?
|
| [1] https://docs.microsoft.com/en-
| us/dotnet/core/tools/telemetry
|
| [2] https://docs.microsoft.com/en-
| us/dotnet/core/install/how-to-...
| sandyarmstrong wrote:
| > However, presumably PowerShell requires at least the
| .NET Runtime if not the .NET SDK, doesn't it?
|
| Nope, these days .NET programs (like PowerShell) bundle
| the runtime. But even if they did a lighter distribution
| that depended on the runtime already being installed,
| there would be no .NET telemetry sent.
|
| > Does running the recommended "dotnet --list-runtimes"
| command send telemetry, like most of the commands?
|
| This is still an SDK command. I don't personally know if
| this one sends any telemetry.
|
| > Or are you saying that the Runtime, unlike the SDK,
| doesn't include telemetry at all?
|
| The runtime does not send telemetry.
| ptx wrote:
| So the "dotnet" command is only in the SDK, not in the
| separately downloadable Runtime? Does the Runtime have
| some other command to launch an executable?
|
| Edit: Actually, the ".NET Runtime 6.0.4" [1] (not the
| SDK) definitely has a "dotnet" command included.
| Presumably with the telemetry?
|
| [1] https://dotnet.microsoft.com/en-
| us/download/dotnet/6.0
| sandyarmstrong wrote:
| When I say "the runtime", I'm referring to everything
| that would be bundled into a published .NET program. The
| base class libraries, the bootstrapper, etc. There is no
| telemetry here.
|
| Yes, if you download a .NET Runtime distribution, it will
| include the `dotnet` command from the SDK so that basic
| commands like `dotnet --list-runtimes` and `dotnet
| --list-sdks` are available. These commands may send
| telemetry. But as you probably saw on
| https://docs.microsoft.com/en-
| us/dotnet/core/tools/telemetry , using `dotnet
| path/to/program.dll` to run an unbundled .NET program
| will never send telemetry.
| mdaniel wrote:
| I conceptually like pwsh, but even as your example shows, I
| don't have the RSI budget left to spend on typing that
| extremely verbose expression every day
|
| jq and its unix-y friends allow me to trade off expressiveness
| against having to memorize arcane invocations
| [deleted]
| brushfoot wrote:
| I hear that, I use and like *nix too. PowerShell aliases help
| a lot. It comes with some predefined, like `gc` for `Get-
| Content`. The above example could be rewritten:
| gc cars.json | ConvertFrom-Json | ? color -eq 'red'
|
| `ConvertFrom-Json` doesn't have a default alias, but you can
| define one in your PowerShell profile. I do that for commands
| I find myself using frequently. Say we pick convjson:
| gc cars.json | convjson | ? color -eq 'red'
|
| That's more like what my typical pipelines look like.
|
| The nice thing about aliases is you can always switch back to
| the verbose names when clarity is more important than
| brevity, like in long-term scripts.
|
| Edit: Seems I've been using too many braces and dollar signs
| all these years. Thanks to majkinetor for the tip.
| majkinetor wrote:
| You don't need $_ for immediate properties, which looks much
| cleaner:
|
|     gc cars.json | convjson | ? color -eq 'red'
| sandyarmstrong wrote:
| TIL! Thanks!
| taude wrote:
| I'm surprised no one mentioned rq [1] yet. It's come up before in
| older HN threads [2] whenever the discussion on jq comes up...
|
| [1] https://github.com/dflemstr/rq
|
| [2] https://news.ycombinator.com/item?id=13090604
| marmada wrote:
| I see a lot of JQ experts on this thread, so I'll bite the bullet
| here as a novice.
|
| The purpose of life is not to know JQ. I just want to process the
| JSON so I can move on and do whatever is actually important.
| Ideally, I'd just be able to tell GPT-codex to do what I want to
| do to the JSON in English.
|
| We're not there yet, but in the meantime if there's another tool
| that allows me to know less in exchange for doing more, I'll
| gladly use it.
| phil294 wrote:
| This is one of the purposes I think Deno should have been built
| for: Use JavaScript for oneliners in the command line. We had
| ... | deno xeval '...stdin processing code using special var $'
|
| which was close to xargs in terms of conciseness.
| Unfortunately, it was removed as being considered "too niche"
| [1].
|
| [1] https://github.com/denoland/deno/issues/3230
| [deleted]
| dotopotoro wrote:
| > know less in exchange for doing more
|
| That is very rare event with established tooling.
|
| Most of the time complexity is just shifted around.
| boyter wrote:
| Same boat here. I ended up finding gron
| https://github.com/tomnomnom/gron which resolved that issue for
| me. Now I don't have to look up how to use jq each time I want
| to quickly find something in some JSON.
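|
| For readers who have not seen gron, the idea of flattening JSON
| into greppable assignment lines can be sketched in Python. This
| is only an illustration of the approach, not gron's actual code
| or exact output format; `flatten` is a hypothetical helper, and
| keys that are not plain identifiers are not specially quoted.

```python
import json


def flatten(value, path="json"):
    """Yield one 'path = value;' line per node, similar in spirit to gron."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Assumption: keys are simple identifiers, so dot access is enough.
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # json.dumps renders scalars (strings, numbers, booleans, null) as JSON.
        yield f"{path} = {json.dumps(value)};"


if __name__ == "__main__":
    doc = json.loads('{"name": "foo", "vals": [1, 2]}')
    for line in flatten(doc):
        print(line)
```

| Because every line carries its full path, plain grep is enough
| to find a value without remembering any query syntax.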
| 1vuio0pswjnm7 wrote:
| Perhaps it is not so much the "tool" that is the impediment to
| progress. Perhaps it is the format. In many cases, the JSON
| format leads to slower processing. To add insult to injury, it
| is not human readable. I have written programs for myself using
| flex^1 to extract JSON from HTML and make it line-delimited and
| readable for me.^2 For me, these programs are easier to use and
| work faster than jq or any other program I have tried,
| irrespective of the size of the JSON input.^3 IMHO, the real
| issue is not the programs available, it is the format. (Or
| maybe I am just too stupid to see the greatness of JSON.)
|
| 1. I think jq uses flex, too, but its usage is way more
| complicated.
|
| 2. I dislike excessive indentation and nesting.
|
| 3. jq provides a --stream option for situations where the size
| of the input may impact the processing speed.
| preferjq wrote:
| I completely agree: when your goal is GSD, just use the tools
| you have.
|
| When you have time to sharpen the saw come back and dig into
| the details of how jq and tools like it work and where their
| limits are. Looking at the jq builtins[1] can be very
| enlightening.
|
| If you get to the point where your goal is to increase your jq
| skills I'd recommend looking at the jq questions on Stack
| Overflow and posting your own solution. Contributing a solution
| to https://rosettacode.org/wiki/Category:Jq is also good.
|
| 1- https://github.com/stedolan/jq/blob/master/src/builtin.jq
| nixpulvis wrote:
| No, not ideally.
|
| English descriptions will never be completely unambiguous and
| unique keys into a JSON data structure. There is a very good
| reason programming languages (and other forms of languages)
| exist.
| psacawa wrote:
| Since no one seems to know about it, jq is described in _great_
| detail on the github wiki page [0]. That flattens the learning
| curve a lot. It's not as arcane as it seems.
|
| The touted claim that jq is fundamentally stateless is not true. jq
| is also stateful in the sense that it has variables. If you want,
| you can write regular procedural code this way. Some examples [1]
|
| The real problem of jq is that it is currently lacking a
| maintainer to assess a number of PRs that have accumulated since
| 2018.
|
| [0] https://github.com/stedolan/jq/wiki/jq-Language-Description
|
| [1]
| https://github.com/fadado/JBOL/blob/master/fadado.github.io/...
| j1elo wrote:
| Sadly very few authors seem to acknowledge or even know that
| _github wiki pages are not indexed by search engines_ so if it
| wasn't for third-party sites like github-wiki-see.page (which
| could stop working at any time) their contents would be
| undiscoverable by the very same people they are usually
| intended for...
| oblio wrote:
| What? That's crazy! Does Github block indexing?
| bckygldstn wrote:
| There's more details on https://github-wiki-see.page/ and
| https://github.com/github/feedback/discussions/4992#discussi...
|
| > we have also introduced an x-robots-tag: none in the http
| response header of Wiki pages
|
| > Abusive behavior in Wikis had a negative impact on our
| search engine ranking
|
| > GitHub is currently permitting a select criteria of
| GitHub Wikis to be indexed
| beembeem wrote:
| https://github.com/robots.txt
|
| I don't see anything here about wiki specifically but maybe
| one of the rules hits wiki pages?
| dilap wrote:
| From that page:
|
| > The jq documentation is written in a style that hides a lot
| of important detail because the hope is that the language feels
| intuitive.
|
| Yeah, not so much boys! Also, that disclaimer should really be
| at the top of the _manual_, with a link to the wiki, rather
| than vice-versa, as it is now.
|
| The wiki is like secret information -- "oh, hey, here's the
| page that _actually_ tells you how it works!"
| klysm wrote:
| I didn't realize jq was missing a maintainer, it's one of my
| most used CLI tools.
| ethanwillis wrote:
| It really is a fundamental problem where lots of these
| important projects aren't maintained simply because the
| reality is the maintainers can't beat the economics of a lot
| of rich freeloaders having no real short term incentive to
| compensate these maintainers..
| avgcorrection wrote:
| > can't beat the economics
|
| This makes it sound like this is some antagonistic
| relationship where the OSS maintainer loses. But the
| idealistic scenario that you are alluding to[1] is about a
| developer who develops free OSS in their free time. And
| then, yes, very few end up paying or donating anything. But
| how is a predictable chain of events a _loss_? What is the
| "economics" of it?
|
| [1] Some OSS developers do it as their day job.
| ethanwillis wrote:
| This is unrelated to the argument but using references
| that aren't references made that really confusing to
| read.
|
| In any case, what I meant by the "economics" of it is
| that in general a person can only afford to work for free
| for so long before they need to pay bills, eat, have
| and/or acquire a standard of living that isn't poverty.
| If they have a day job where they are writing this
| software in their free time, how long can they do this
| before burning out?
| avgcorrection wrote:
| You say that this is unrelated yet your follow-up
| reinforces my initial impression.
|
| How does one afford to work for free? One has a day job.
| How does someone who volunteers for search-and-rescue
| afford it? That's obviously a ridiculous question--they
| are volunteers so they necessarily must do something from
| nine to five. Or be independently wealthy.
|
| But how does one avoid burnout as a double-worked
| programmer? I think we have ourselves to blame on that
| point since we have put the double-worked programmer on a
| pedestal. So we can either:
|
| 1. Not work on things both professionally and in our free
| time; or
|
| 2. Force ourselves to do just that because we gain
| something extrinsic from it that we might need, like
| simply keeping up with the Joneses (having an answer for
| "where's your private GitHub" in interviews...)
| jahewson wrote:
| It's not though, because in this case the (ex-)maintainer
| works at a Wall St firm.
| ethanwillis wrote:
| This is exactly my point. Will he quit the paying job to
| work for free? How long could he maintain this for free
| with no other job, or with a job and additional free
| hours without running out of money or burning out?
| carlhjerpe wrote:
| But everyone can't realistically live a wealthy life off
| FLOSS tools either. The people who write these things are
| usually very talented and will make killer pay anywhere
| they work. Usually a cool thing is when companies sponsor
| them to work X% for them and Y% on the FLOSS tool.
| lenkite wrote:
| Pity he quit before Github opened up sponsorships.
| skybrian wrote:
| In this case it doesn't seem too critical? It means jq
| remains stable, which is probably what should happen once a
| tool like this gets a lot of users.
| Beltalowda wrote:
| > It's not as arcane as it seems.
|
| The issue with jq is that I use it maybe once a month, or even
| less. The syntax is "arcane enough" that I keep forgetting how
| to use it because I use it so sporadically.
|
| In comparison awk - which I also don't use _that_ often - has a
| much easier syntax that I can mostly remember.
|
| Not entirely convinced by the zq syntax either though; it also
| seems "arcane enough" that I would keep forgetting it.
| hiram112 wrote:
| Bingo.
|
| There are at least a dozen tools and languages and syntaxes
| that I've used sporadically over the years - awk, sed, bash,
| Mongo, perl, etc. I don't use them often enough to remember
| exactly how they work, and so I always have to spend a few
| hours reviewing manuals or old code repos or an O'Reilly
| book.
|
| But if I do end up using it for a few days in a row, it
| starts to make sense, and I improve each time I use it.
|
| But not with jq.
|
| It just does not make sense to my brain, no matter how many
| times I've had to use it. Every single time I need to use it,
| it requires finding some Stack Exchange or blog and just
| copying and pasting. Even after seeing the solution, rarely
| do I then really understand why or how it works. Nor can I
| often take that knowledge and apply it to similar problems.
|
| About the only other syntax or language that gives me such
| problems is Elastic Search DSL.
| silon42 wrote:
| Same for me... every time I have to look up the basics... and
| I love awk, perl and xpath/xslt.
| ar_lan wrote:
| This is ironic - I use `awk` so infrequently, I have _no
| idea_ how to use it without reading its man page or using
| Google. But I use `jq` often and find it simple.
| ts0000 wrote:
| Interesting, for me it's the exact opposite.
|
| I've tried a couple of times to get into awk, but still find
| the syntax arcane.
| Beltalowda wrote:
| I don't know; I wouldn't presume to tell you what _you_ do
| or don't find arcane, but once I understood the somewhat
| unusual flow of awk ("for every line, check if the line
| matches this condition, and if it does run this block of
| code") I found it's quite easy to work with. It's "arcane"
| in the sense that it has an implicit loop and that it's a
| specialized language for a very limited class of problems,
| but I found that for this limited class of problem it's
| surprisingly effective.
| dotancohen wrote:
| > an implicit loop
|
| As an occasional awk user, I'd love if you expand on
| this. Maybe it will help clear things up for me. You're
| not referring to the fact that awk operates on every line
| independently, are you?
| Beltalowda wrote:
| My mental image of awk has always been something along
| these lines:
|
|     for line in readfile()
|       for block in script:
|         if block.match(line)
|           run_block(block)
|         end
|       endfor
|     endfor
|
| Where the "for line in readfile()" is the "implicit
| loop", and the blocks are the "condition { .. }" blocks.
|
| The actual flow is a little bit more complex and has some
| exceptions (e.g. BEGIN/END), but this is about the gist
| of it.
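|
| The mental model above can be made runnable as a small Python
| sketch. It is not how awk is implemented; `run_awk_like` and the
| (condition, action) pairs are hypothetical names for
| illustration, and BEGIN/END handling is left out.

```python
import re


def run_awk_like(lines, rules):
    """Apply every (condition, action) rule to every line, awk style."""
    results = []
    for line in lines:                   # the implicit loop awk runs for you
        for condition, action in rules:  # each block is tried, in script order
            if condition(line):
                results.append(action(line))
    return results


# Roughly the awk script:  /error/ { print "E: " $0 }   NF > 2 { print $1 }
rules = [
    (lambda line: re.search(r"error", line) is not None,
     lambda line: "E: " + line),
    (lambda line: len(line.split()) > 2,
     lambda line: line.split()[0]),
]

if __name__ == "__main__":
    for out in run_awk_like(["a b c", "error here", "ok"], rules):
        print(out)
```

| Note that a line can trigger more than one block, which is the
| part of awk's flow that the pseudocode captures and that a
| plain if/else chain would not.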
| taude wrote:
| Same issue. However, I do successfully rely on using ctrl-r a
| lot to search prior invoked commands. And have a few core
| aliases that I've cobbled together....
| rgoodwintx wrote:
| Here because.... I didn't know of ctrl-R. What a life
| changer (although I had an alias for "hg" to "history |
| grep" :) )
| kalev wrote:
| Please check FZF [1] and its integration with ctrl-r.
| It's a huge productivity boost and I cannot live without
| it.
|
| [1] https://github.com/junegunn/fzf
| carlhjerpe wrote:
| There's also McFly[1] that does interactive history
| search.
|
| [1]: https://github.com/cantino/mcfly
| laurent123456 wrote:
| I wonder if someone tried to use plain JS as a filtering
| language? It would be more verbose but it would be easy to
| remember. For example:
|
|     [1,2,3] | js "out = 0; for (const n of this) out += n"
|
| That would print "6". `out` would be a special variable you
| write to to print the result, and `this` would be the input.
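|
| The same idea works with any general-purpose language. A
| minimal sketch in Python rather than JS, assuming the names
| `this` and `out` from the proposal above; `run_filter` is a
| hypothetical helper, and `exec` makes this suitable only for
| code you typed yourself.

```python
import json


def run_filter(json_text, code):
    """Run `code` with `this` bound to the parsed input; return `out` as JSON."""
    namespace = {"this": json.loads(json_text), "out": None}
    # exec runs arbitrary code: acceptable for hand-typed one-liners only.
    exec(code, namespace)
    return json.dumps(namespace["out"])


if __name__ == "__main__":
    print(run_filter("[1,2,3]", "out = 0\nfor n in this: out += n"))
```

| The trade-off is the one described above: more typing than a
| dedicated query language, but nothing new to memorize.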
| Beltalowda wrote:
| A few of the tools listed here seem to work like that, or
| roughly similar: https://ilya-sher.org/2018/04/10/list-of-
| json-tools-for-comm...
|
| I didn't check any of them out though.
| mechanicalpulse wrote:
| I've used trentm's json (formerly known as jsontool)
| package from npm as my default tool for command-line
| manipulation of JSON for many years now. It provides CLI
| arguments for passing JavaScript code for filtering and
| executing on input. I have resisted investing the time into
| becoming fluent in jq because I've found that many of the
| common use cases I have are readily handled by jsontool.
|
| https://www.npmjs.com/package/json
|
| Edit: added more information
| lgas wrote:
| My hope was to one day add JS eval support to
| https://github.com/SuperpowersCorp/refactorio but as you
| can tell by the timestamps I haven't found any time to work
| on it in the last 4 years.
| adamgordonbell wrote:
| I found it hard to approach at first, but I think it was just
| the lack of material that worked through simple examples step
| by step.
|
| I ended up writing my own guide to it, that in my unbiased
| opinion makes it easier to get to the point where in-depth
| examples and language descriptions are easier to understand.
|
| Edit: Oh, wow, it's even mentioned in this article. Maybe I
| should read before commenting.
|
| https://earthly.dev/blog/jq-select/
| sfink wrote:
| I discovered jq after I wrote my own (extremely limited)
| version of it. I need it quite often, and yet I've never
| managed to get up the activation energy to learn enough for
| it to be useful. I need to have some notion of the
| computation model before anything is going to make sense to
| me. I hate learning things in completely disparate pieces
| that I need to memorize in hopes that someday it will just
| click together and I'll derive the underlying principles.
|
| Your guide was great for this. It stepped me through enough
| of the bare basics in a way that the underlying model was
| obvious. It didn't get me nearly far enough for many of the
| tasks that I need jq for, but it got me started and that's
| all I really needed. Everything additional that I need to
| learn becomes obvious in retrospect--"of course there's an
| operator for this, there kind of has to be!".
|
| Thank you!
| jdnier wrote:
| Here's a podcast interview with the creator of jq about what he's
| been working on at Jane Street:
| https://signalsandthreads.com/memory-management/
| [deleted]
| sfink wrote:
| The thing that I find myself wanting, which is lacking in both jq
| and zq afaik, is interactive exploration. I want to move around
| in a large JSON file, narrow my context to the portion I'm
| interested in, and do specialized queries and transformations on
| just the data I care about.
|
| I wrote a tool to do this -- https://github.com/hotsphink/sfink-
| tools/blob/master/bin/jso... -- but I do not recommend it to
| anyone other than as perhaps a source of inspiration. It's slow
| and buggy, the syntax is cryptic and just matches whatever I came
| up with when I had a new need, etc. It probably wouldn't exist if
| I had heard of jq sooner.
|
| But for what it does, it's _awesome_. I can do things like:
|
|     % json somefile.json
|     > ls
|     0/
|     1/
|     2/
|     > cd 0
|     > ls
|     info/
|     files/
|     timings/
|     version
|     > cat version
|     1.2b
|     > cat timings/*/mean
|     timings/firstPaint/mean = 51
|     timings/loadEventEnd/mean = 103
|     timings/timeToContentfulPaint/mean = 68
|     timings/timeToDomContentFlushed/mean = 67
|     timings/timeToFirstInteractive/mean = 658
|     timings/ttfb/mean = 6
|
| There are commands for searching, modifying data, aggregating,
| etc., but those would be better done in a more principled, full-
| featured syntax like jq's.
|
| I see ijq, and it looks really nice. But it doesn't have the
| context and restriction of focus that I'm looking for.
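|
| The `cat timings/*/mean` style of lookup shown above can be
| sketched in a few lines of Python. This is not the linked tool's
| code; `walk` and `cat` are hypothetical names, and only `*` as a
| whole path segment is supported.

```python
def walk(value, parts, prefix=()):
    """Yield (path, value) pairs for a slash-split pattern; '*' matches any key."""
    if not parts:
        yield "/".join(prefix), value
        return
    head, rest = parts[0], parts[1:]
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # List elements are addressed by their index, as in the `ls` output.
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        return  # scalar reached before the pattern was consumed
    for key, child in items:
        if head == "*" or head == key:
            yield from walk(child, rest, prefix + (key,))


def cat(doc, pattern):
    """Return every (path, value) pair in doc that matches pattern."""
    return list(walk(doc, pattern.split("/")))


if __name__ == "__main__":
    doc = {"version": "1.2b",
           "timings": {"firstPaint": {"mean": 51}, "ttfb": {"mean": 6}}}
    for path, value in cat(doc, "timings/*/mean"):
        print(f"{path} = {value}")
```

| Keeping a "current directory" would then just mean holding on
| to a sub-document and running the same lookup against it.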
| dan-robertson wrote:
| One solution I've seen is basically to hijack fzf to
| interactively input a jq query, add closing brackets in a naive
| way, run jq -C ... | head on an input file, and display the
| result as a fzf 'preview'. fzf ends up handling things like the
| preview command and display and line-editing logic but it may
| be slow if you don't get early results.
| lichtenberger wrote:
| That's one of the main steps forward for Brackit, a
| retargetable JSONiq query engine/compiler (http://brackit.io)
| and the append-only data store SirixDB (https://sirix.io) and a
| new web frontend. My vision is not only to explore the most
| recent revision but also any other older revisions, to display
| the diffs, to display the results of time travel queries...
| help is highly welcome as I'm myself a backend engineer and
| working on the query engine and the data store itself :-)
|
| Detect changes of a specific node and the whole
| subtree/subtree:
|
|     let $node := jn:doc('mycol.jn','mydoc.jn')=>fieldName[[1]]
|     let $result := for $node-in-rev in jn:all-times($node)
|                    return
|                      if ((not(exists(jn:previous($node-in-rev))))
|                           or (sdb:hash($node-in-rev) ne sdb:hash(jn:previous($node-in-rev)))) then
|                        $node-in-rev
|                      else
|                        ()
|     return [
|       for $jsonItem in $result
|       return { "node": $jsonItem, "revision": sdb:revision($jsonItem) }
|     ]
|
| Get all diffs between all revisions and serialize the output in
| an array:
|
|     let $maxRevision := sdb:revision(jn:doc('mycol.jn','mydoc.jn'))
|     let $result := for $i in (1 to $maxRevision)
|                    return
|                      if ($i > 1) then
|                        jn:diff('mycol.jn','mydoc.jn',$i - 1, $i)
|                      else
|                        ()
|     return [
|       for $diff at $pos in $result
|       return {"diffRev" || $pos || "toRev" || $pos + 1: jn:parse($diff)=>diffs}
|     ]
|
| Open a specific revision
|
| By datetime:
|
|     jn:open('mycol.jn','mydoc.jn',xs:dateTime('2022-03-01T00:00:00Z'))
|
| By revision number:
| jn:doc('mycol.jn','mydoc.jn',5)
|
| And a view of an outdated frontend:
|
| https://github.com/sirixdb/sirix/raw/master/Screenshot%20fro...
| ratorx wrote:
| I really like fx (https://github.com/antonmedv/fx) for
| interactive stuff. It does exactly what I think you want. You
| can expand individual fields and explore the schema.
|
| However, I really do like jq for queries and scripting, so I
| keep both around.
| endgame wrote:
| It feels a lot like the FP idea of a zipper coupled to an
| interactive shell.
| eloh wrote:
| You could take a look at jless [1], it allows interactive
| selections/browsing in JSON documents.
|
| [1] https://jless.io/
| ggm wrote:
| This is almost exactly how I think about the problem of
| deciding how to deep-key to a specific field of a nested json
| structure.
|
| If you can emit the syntactic form as a Python or perl ref, or
| a jq array ref, then I could use your tool to find the
| structure and the other ones to stream.
|
| Great example! Thanks for posting this.
| [deleted]
| abledon wrote:
| There is also "JP" https://github.com/jmespath/jp
|
| which follows the jmespath standard
| mdaniel wrote:
| My heartburn with jmespath is that it lacks pipelines, only
| projections, so doing _crazy_ stuff to the input structure is
| damn near impossible
| remram wrote:
| From a computer science point of view, what kind of
| transformations are impossible to express in jmespath but are
| possible in jq?
| mdaniel wrote:
| I dunno how to speak to your "computer science" part, but
| pragmatically anything that requires a "backreference",
| because unlike with JSONPath (and, of course, jq) there are
| no "root object" references
|
|     $ printf '{"a": {"b":"c", "d":["d0","d1"]}}' | jq -r '[ .a as $a | $a.d[] | {x: ., y: $a.b}]'
|     [
|       {
|         "x": "d0",
|         "y": "c"
|       },
|       {
|         "x": "d1",
|         "y": "c"
|       }
|     ]
|
| and I realize this isn't as pure CS-y as you were asking,
| but this syntax is hell on quoting
|
|     $ printf '["a","b"]' | jp -u 'join(`"\n"`, @)'
|     # vs
|     $ printf '["a","b"]' | jq -r 'join("\n")'
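|
| For anyone unsure what the "backreference" in the first jq
| example buys you, here is the same transformation as a Python
| sketch. `expand` is a hypothetical helper; the local variable
| `a` plays the role of jq's `$a`, which is the kind of binding
| the comment says JMESPath lacks.

```python
import json


def expand(doc):
    """Pair each element of a.d with the sibling value a.b.

    Mirrors the jq filter: [ .a as $a | $a.d[] | {x: ., y: $a.b} ]
    """
    a = doc["a"]  # the binding that stays reachable inside the loop
    return [{"x": item, "y": a["b"]} for item in a["d"]]


if __name__ == "__main__":
    doc = json.loads('{"a": {"b": "c", "d": ["d0", "d1"]}}')
    print(json.dumps(expand(doc), indent=2))
```

| Inside the comprehension the current item has moved down to an
| element of `d`, yet `a["b"]` is still in reach; that is what the
| variable binding provides.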
| remram wrote:
| I see. The need to quote JSON values and the need for @
| seem like a high price to pay for removing the . in field
| accesses.
|
| I also find jq more intuitive but I really dislike that
| we have three standards each used by a number of tools,
| e.g. jsonpath, jmespath, and jq.
| NateEag wrote:
| I suspect the JMESPath people would argue that if you want to
| do major transformations to the input, you should write a
| proper program, and that a CLI query tool should focus on,
| well, querying.
|
| I'm personally trying to move away from jq and towards jp,
| because
|
| - there's a standard defining it, not just an implementation,
| decreasing the odds of being stuck with an unmaintained tool
|
| - there are libraries supporting the syntax for most of the
| major programming languages
|
| - JMESPath's relative simplicity compared to jq is a good
| thing, IMO - Turing-completeness is a two-edged sword
|
| - JMESPath is the AWS CLI query language, which is a
| convenient bonus
| mdaniel wrote:
| > JMESPath is the AWS CLI query language, which is a
| convenient bonus
|
| And in ansible, too, FWIW, but yes it's my hand-to-hand
| combat with the language in both of those circumstances
| that has formed my opinion about it
|
| Regrettably, "kubectl get -o jsonpath" is _almost_ the
| same, but just different enough to trip me up :-(
| arwineap wrote:
| I've never found jq to be particularly hard, or slow
| micimize wrote:
| Their syntax comparison under "So you like chocolate or vanilla?"
| is disingenuous. You can do variable assignment and array
| expansion in jq:
|
|     expand_vals_into_independent_records='
|       .name as $name | .vals[] | { name: $name, val: . }
|     '
|     echo '{"name":"foo","vals":[1,2,3]} {"name":"bar","vals":[4,5]}' |
|       jq "$expand_vals_into_independent_records"
|
| Also, generally, not a fan of the tone of this article.
| lilyball wrote:
| Your `.name as $name` was my immediate attempt too, but it
| turns out you can go even simpler with
|
|     jq '{name, val: .vals[]}'
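|
| The shorter form works because an object constructor containing a
| multi-output expression emits one object per output. A rough
| Python sketch of the same expansion, for illustration only (the
| function name `expand_vals` is made up here, not part of jq):

```python
import json


def expand_vals(record):
    """Yield one {name, val} object per element of record["vals"].

    Hypothetical helper mirroring what `{name, val: .vals[]}` produces.
    """
    for val in record["vals"]:
        yield {"name": record["name"], "val": val}


if __name__ == "__main__":
    # Same two concatenated input records as the jq example in this thread.
    stream = '{"name":"foo","vals":[1,2,3]} {"name":"bar","vals":[4,5]}'
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(stream):
        # raw_decode parses one JSON value and returns where it stopped.
        record, pos = decoder.raw_decode(stream, pos)
        for out in expand_vals(record):
            print(json.dumps(out))
        # Skip whitespace between concatenated JSON values.
        while pos < len(stream) and stream[pos].isspace():
            pos += 1
```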
| 29athrowaway wrote:
| "Easier" is subjective. For simple use-cases, zq is harder to
| understand than jq.
|
| I also have never seen jq as a performance bottleneck.
|
| jq is stable, I have never encountered a bug with it and I have
| never seen it getting stuck after years of usage. It is
| dependable and practical.
|
| jq has helped me put out countless fires throughout my career. I
| should donate to it one day.
| politelemon wrote:
| > HomeBrew for Mac or Linux
|
| Please do not recommend HomeBrew for Linux. A binary download is
| safer compared to how HomeBrew clobbers a Linux machine. If you
| do not wish to use a Linux package manager, simply point at the
| binary download. It is much safer and less intrusive.
| xenophonf wrote:
| Homebrew isn't any better on macOS. Why people use it instead
| of MacPorts is beyond me.
| ilyash wrote:
| While we are at it, I have a list of JSON tools for command line
| here - https://ilya-sher.org/2018/04/10/list-of-json-tools-for-
| comm...
| AcerbicZero wrote:
| I'm pretty new to jq (maybe 2 years of exposure) but from my
| perspective - on some level, jq does to json what powershell does
| to everything windows, except powershell gives me the get-member
| cmdlet, so when I don't know what is even in my object, I can
| explore.
|
| Sometimes jq -r '.[]' works, but it's all just trial and error. I
| use plenty of jq in my scripts, but I can never seem to visualize
| how jq looks at the data. I just have to toss variations of
| '.[whateveriwant].whatever[.want.]' until something works... I
| suppose the root of my complaint is that jq does not do a good
| job of teaching you to use jq. It either works, or gives you
| nothing, and while I've learned to work around that, I'll try
| anything that claims to be even 1% better than jq.
| quotemstr wrote:
| As an aside --- isn't the traditional flat namespace of unix
| command names getting a _bit_ crowded nowadays?
| eatonphil wrote:
| If jq is getting too slow for you (that's never happened for me),
| it really seems like it's time to put your data in a database
| like sqlite or duckdb at least.
|
| Incidentally there are many tools that help you do this like dsq
| [0] (which I develop), q [1], textql [2], etc.
|
| [0] https://github.com/multiprocessio/dsq
|
| [1] https://github.com/harelba/q
|
| [2] https://github.com/dinedal/textql
| jeffbee wrote:
| I don't agree. There is a great deal of room for improvement in
| jq performance. I profiled one invocation and it spent the
| majority of its time asserting that the stack depth was lower
| than some amount, which is crazy. I rebuilt it with NDEBUG
| defined and it was seriously ten times faster, but it's not
| safe to run it that way because it has asserts with side
| effects, which is also crazy.
|
| Rewriting all or parts of it in C++ would make it dramatically
| faster. I would start by ripping out the asserts and using a
| different strtod which they spend an awful lot of time in.
| eatonphil wrote:
| Fair point! I don't mean to say jq performance can't or
| shouldn't be improved.
|
| Just that jq does two things: 1) ingest and 2) query.
|
| If you're doing a bunch of exploration on a single dataset in
| one period of time or if the dataset is large enough and
| you're selecting subsets of it, you can ingest the data into
| a database (and optionally toggle indexes).
|
| Then you can query as many times as you want and not worry
| about ingest again until your data changes.
|
| All three of the tools I listed have variations of this sort
| of caching of data built in. For dsq and q with caching
| turned on, repeat queries against files with the same hashsum
| only do queries against data already in SQLite, no ingestion.
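|
| The "ingest once, query many times" idea can be sketched with
| Python's built-in sqlite3 module. This is only an illustration of
| the pattern, not how dsq, q, or textql are implemented; the table
| name `records` and its two columns are assumptions made up for the
| example.

```python
import json
import sqlite3


def ingest(conn, records):
    """Load flat JSON objects into a hypothetical `records` table (done once)."""
    conn.execute("CREATE TABLE IF NOT EXISTS records (name TEXT, location TEXT)")
    conn.executemany(
        "INSERT INTO records (name, location) VALUES (?, ?)",
        [(r["name"], r["location"]) for r in records],
    )
    conn.commit()


def count_by_location(conn, location):
    """Run a query against already-ingested data; no JSON parsing happens here."""
    row = conn.execute(
        "SELECT COUNT(*) FROM records WHERE location = ?", (location,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    data = json.loads(
        '[{"name":"a","location":"Stockholm"},'
        '{"name":"b","location":"Oslo"},'
        '{"name":"c","location":"Stockholm"}]'
    )
    conn = sqlite3.connect(":memory:")
    ingest(conn, data)                 # pay the parsing cost once
    for place in ("Stockholm", "Oslo"):
        print(place, count_by_location(conn, place))  # repeat queries are cheap
```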
| jeffbee wrote:
| I have a large GeoJSON dataset I analyze to answer local
| government questions. It is of course loaded into a
| database for common questions but I also find myself doing
| ad hoc queries that aren't suited to the database
| structure, and that's where I find myself waiting for jq.
| Also I use jq as the ETL for that database.
| gcmeplz wrote:
| I like using `jq` to create line-delimited JSON and then using a
| language I know well (Node) to process it after that point. I
| find `jq '.[] | select(.location=="Stockholm")'` less readable
| than something like `nq --filter '({location}) => location ===
| "Stockholm"'` because I'm much more used to Node syntax.
|
| - https://github.com/thisredone/rb is a widely used ruby version
| of this idea
|
| - https://github.com/KelWill/nq#readme is something similar that
| I wrote for my own use
| eru wrote:
| By Node, you mean JavaScript?
|
| If yes, it's fascinating to me, that jq is so powerful, it's
| even useful when handling JavaScript Object Notation in
| JavaScript.
| msluyter wrote:
| Whenever jq comes up I feel obligated to mention 'gron'[1]. If
| all you're doing is trying to grep some deeply nested field, it's
| way easier with gron, IMHO.
|
| [1] https://github.com/tomnomnom/gron
| RulerOf wrote:
| Gron and jq are complementary tools IMO. I frequently use gron
| to trim down large json files such that I can determine what my
| ultimate jq query is going to look like.
| zimpenfish wrote:
| Used it only this morning to find out if/where the JSON for a
| tweet mentioned the verification status of the poster and/or
| retweetee[1]. Quick and easy to dump it through `gron | grep
| verif` to find out the paths.
|
| [1] "the person who was retweeted" in lieu of a better word.
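|
| For anyone who hasn't seen it, the flattening idea behind that
| workflow can be approximated in a few lines of Python. This is a
| simplified sketch of the concept, not gron's actual code, and its
| output format is only meant to resemble gron's; `flatten` is a
| name chosen for the example.

```python
import json


def flatten(value, path="json"):
    """Yield one assignment-style line per node, so plain grep can find paths."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Use dot access for identifier-like keys, bracket access otherwise.
            if key.isidentifier():
                sub = f"{path}.{key}"
            else:
                sub = f"{path}[{json.dumps(key)}]"
            yield from flatten(child, sub)
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars (strings, numbers, booleans, null) are printed as JSON literals.
        yield f"{path} = {json.dumps(value)};"


if __name__ == "__main__":
    doc = {"user": {"verified": True, "tags": ["a"]}}
    # Rough equivalent of piping the flattened output through `grep verif`.
    for line in flatten(doc):
        if "verif" in line:
            print(line)
```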
| radicality wrote:
| For a moment I thought that this is `glom`, which is also a
| tool I can recommend if you need to be doing any json
| processing in python (comes with a cli too). It does have a
| relatively steep learning curve for the advanced features, but
| does allow you to do interesting things like concisely write
| recursive parsers in the mini-dsl Glom provides.
|
| https://glom.readthedocs.io/en/latest/
| knome wrote:
| These guys must really hate functional programming.
|
| I can see where jq might confuse someone new to it, but their
| replacement is irregular, stateful, still difficult, and I don't
| even see variable binding or anything.
|
| jq requires you to understand that `hello|world` will run world
| for each output of hello, with world's output values going to the
| next piped expression, the wrapping value-collecting list, or
| stdout.
|
| it's a bit unintuitive if you come in thinking of them as regular
| pipelines, but it's a constant in the language that once learned
| always applies.
|
| this zed thing has what appears to be a series of workarounds for
| its own awkwardness, where they kept tacking on new forms to try
| to bandaid those that came before.
|
| additionally, since they made attribute selectors barewords where
| jq would require a preceding reference to a variable or the
| current value (.), I'm not sure where they'll go for variables
| should they add them.
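|
| The `hello|world` rule can be modelled with Python generators. A
| loose sketch, assuming each filter is a function that takes one
| value and yields zero or more values; `pipe`, `iterate`, and
| `double` are names invented for the example, not jq internals.

```python
def pipe(first, second):
    """Compose two filters: run `second` once for every output of `first`."""
    def composed(value):
        for intermediate in first(value):
            yield from second(intermediate)
    return composed


def iterate(value):
    """Stand-in for jq's `.[]`: emit each element of an array."""
    yield from value


def double(value):
    """Stand-in for jq's `. * 2`: emit exactly one output per input."""
    yield value * 2


if __name__ == "__main__":
    # Roughly `.[] | . * 2` applied to [1, 2, 3].
    program = pipe(iterate, double)
    for out in program([1, 2, 3]):
        print(out)
```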
| thayne wrote:
| I think their main complaint is that you can't iteratively
| operate on a stream as a whole without first converting it to
| an array, which besides sometimes requiring awkward syntax, can
| require a lot of memory for large datasets.
| [deleted]
| johnday wrote:
| No kidding!
|
| This part in particular jumped out at me:
|
| > To work around this statelessness, you can wrap a sequence of
| independent values into an array, iterate over the array, then
| wrap that result back up into another array so you can pass the
| entire sequence as a single value downstream to the "next
| filter".
|
| This is literally just describing a map. A technique so
| generally applicable and useful that it's made its way into
| every modern imperative/procedural programming language I can
| think of. The idea that this person fails to recognise such a
| common multiparadigmatic programming idiom doesn't fill me with
| confidence about the design of zq.
| aarchi wrote:
| In fact, jq already has `map`, which would replace the
| article's pattern of `[.[]|add]` with `map(add)`. It is
| defined as such:
|
|     def map(f): [.[] | f];
|
| Many built-in functions in jq are implemented in jq, in terms
| of a small set of core primitives. The implementations can be
| inspected in builtin.jq.
|
| https://github.com/stedolan/jq/blob/master/src/builtin.jq#L3
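|
| Read in Python terms, that definition says "iterate the array,
| apply f to each element, collect every output into a new array".
| A minimal sketch under the assumption that a filter is a function
| yielding zero or more outputs; `jq_map` and this numeric-only
| `add` are illustrative stand-ins, not jq's implementation.

```python
def jq_map(f, array):
    """Mirror `def map(f): [.[] | f];` by collecting all outputs of f per element."""
    return [out for item in array for out in f(item)]


def add(values):
    """Simplified stand-in for jq's `add`, handling numbers only."""
    yield sum(values)


if __name__ == "__main__":
    # Roughly `map(add)`, i.e. `[.[] | add]`, on an array of arrays.
    print(jq_map(add, [[1, 2], [3, 4]]))
```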
| mattnibs wrote:
| Variables exist in zq, "this" is a reserved word:
|
|     echo {x:1} | zq 'x := x+1' -
| [deleted]
| aarchi wrote:
| I find the stateless streaming paradigm in jq very pleasing.
|
| Results can be emitted iteratively using generators, which are
| implemented as tail-recursive streams [0]. Combined with the
| `input` built-in filter, which yields the next item in the
| input stream, jq can handle real-time I/O and function as a
| more general-purpose programming language.
|
| I built an interpreter for the Whitespace programming language
| in jq using these concepts [1] and it's easily one of the most
| complex jq programs out there.
|
| [0]:
| https://stedolan.github.io/jq/manual/#Generatorsanditerators
|
| [1]: https://github.com/andrewarchi/wsjq
| weinzierl wrote:
| jq is incredibly powerful and I'm using it more and more. Even
| better, there is a whole ecosystem of tools that are similar or
| work in conjunction with jq:
|
| * jq (a great JSON-wrangling tool)
|
| * jc (convert various tools' output into JSON)
|
| * jo (create JSON objects)
|
| * yq (like jq, but for YAML)
|
| * fq (like jq, but for binary)
|
| * htmlq (like jq, but for HTML)
|
| List shamelessly stolen from Julia Evans[1]. For live links see
| her page.
|
| Just a few days ago I needed to quickly extract all JWT token
| expiration dates from a network capture. This is what I came up
| with:
|
|     fq 'grep("Authorization: Bearer.*" ) | print' server.pcap | grep -o 'ey.*$' | sort | uniq | \
|       jq -R '[split(".") | select(length > 0) | .[0],.[1] | gsub("-";"+") | gsub("_";"/") | @base64d | fromjson]' | \
|       jq '.[1]' | jq '.exp' | xargs -n1 -I! date '+%Y-%m-%d %H:%M:%S' -d @!
|
| It's not a beauty but I find the fact that you can do it in one
| line, with proper parsing and no regex trickery, remarkable.
|
| [1] https://jvns.ca/blog/2022/04/12/a-list-of-new-ish--
| command-l...
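|
| The decoding step in the middle of that pipeline (base64url to
| JSON, then read `exp`) looks roughly like this in Python. It is a
| sketch of that one step only: it assumes the token string has
| already been extracted from the capture, does no signature
| verification, and the function names are made up for the example.

```python
import base64
import json
from datetime import datetime, timezone


def decode_segment(segment):
    """Decode one base64url JWT segment (header or payload) into a dict."""
    # JWT segments drop the trailing '=' padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def jwt_expiry(token):
    """Return the `exp` claim as a UTC datetime. The signature is NOT checked."""
    _header, payload, _signature = token.split(".")
    claims = decode_segment(payload)
    return datetime.fromtimestamp(claims["exp"], tz=timezone.utc)


if __name__ == "__main__":
    # Build a throwaway unsigned token purely to exercise the functions.
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    token = ".".join([encode({"alg": "none"}), encode({"exp": 1650978000}), "sig"])
    print(jwt_expiry(token).strftime("%Y-%m-%d %H:%M:%S"))
```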
| kitd wrote:
| Also highly recommended is gron [0], to make json easily
| searchable
|
| [0] https://github.com/TomNomNom/gron
| spudlyo wrote:
| Most of the time I can get what I need with gron and
| traditional UNIX tools, without needing to reach for jq, and
| without having to re-learn its somewhat arcane syntax.
| zikduruqe wrote:
| I came here looking for a gron recommendation. I use this
| very often.
| toxik wrote:
| Is your example not easier to write and read as a 10-something
| line Python script? I never understood the appeal of jq etc
| because of this very reason.
| [deleted]
| hoherd wrote:
| I would definitely add dasel to that list. It's become my de
| facto serialized data converter, and I regularly use it to
| convert between csv, toml, yaml, json, and xml using jq-ish
| syntaxes.
|
| https://github.com/tomwright/dasel
| chriswarbo wrote:
| The yq tool also provides 'xq', which works on XML :)
___________________________________________________________________
(page generated 2022-04-26 23:00 UTC)