[HN Gopher] Parsing Protobuf Definitions with Tree-sitter
       ___________________________________________________________________
        
       Parsing Protobuf Definitions with Tree-sitter
        
       Author : PaulHoule
       Score  : 35 points
       Date   : 2024-08-03 19:13 UTC (3 hours ago)
        
 (HTM) web link (relistan.com)
 (TXT) w3m dump (relistan.com)
        
       | cyberax wrote:
       | I don't get it. Why not just use a better Protobuf model? Go's
       | serialization format for protobufs is not the most brilliant one,
       | but it's reasonable.
       | 
       | E.g. just use `string` instead of `StringValue`.
        
       | MathMonkeyMan wrote:
       | I need to get around to playing with tree-sitter. The approach in
       | this article is neat.
       | 
       | Here's another approach. The AST of a .proto file is itself a
       | protobuf. That's how the codegen plugins work. Protobuf also has
       | a canonical mapping to JSON, so...
       | 
       | What you can do is use protoc to parse the .proto file, spit it
       | out as JSON, and then process that data using your favorite
       | pattern matching language. I wrote a [tool][1] that helps with
       | that. For example, here's some [js code][2] that translates
       | protobuf message definitions into "types" for use in an ORM.
       | 
       | [1]: https://github.com/dgoffredo/protojson
       | 
       | [2]:
       | https://github.com/dgoffredo/okra/blob/master/lib/proto2type...
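
A minimal sketch of the "parse with protoc, then process the JSON" idea described above. It assumes the JSON follows the canonical JSON mapping of a `FileDescriptorProto` (camelCase keys such as `messageType` and `typeName`). The sample document and the `list_fields` helper are hypothetical, made up for illustration, and are not the output or API of the linked tool.

```python
import json

# Hypothetical sample: roughly what one FileDescriptorProto looks like
# when rendered with the canonical protobuf-to-JSON mapping.
SAMPLE = """
{
  "name": "user.proto",
  "package": "example",
  "messageType": [
    {
      "name": "User",
      "field": [
        {"name": "id", "number": 1,
         "label": "LABEL_OPTIONAL", "type": "TYPE_INT64"},
        {"name": "nickname", "number": 2,
         "label": "LABEL_OPTIONAL", "type": "TYPE_MESSAGE",
         "typeName": ".google.protobuf.StringValue"}
      ]
    }
  ]
}
"""


def list_fields(file_descriptor):
    """Return (message, field, number, type) tuples for top-level messages.

    Nested messages are ignored to keep the sketch short.
    """
    rows = []
    for message in file_descriptor.get("messageType", []):
        for field in message.get("field", []):
            # Message and enum fields carry a fully qualified typeName;
            # scalar fields only have the TYPE_* enum value.
            type_name = field.get("typeName") or field["type"]
            rows.append((message["name"], field["name"],
                         field["number"], type_name))
    return rows


fields = list_fields(json.loads(SAMPLE))
for row in fields:
    print(row)
```

Once the descriptor is plain data, ordinary dictionary and list handling is enough to walk it; no grammar or parser is involved on the consumer side.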
        
         | superb_dev wrote:
         | Oh my god... this might've made some tools I'm developing a lot
         | easier
        
         | fizx wrote:
         | Writing a protoc plugin would have been 5x easier, but it's
         | harder to get a blog article out of it.
         | 
         | Also, this reads like they might not have seen the newer proto3
         | optional keyword, or known about the well-known wrapper types.
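
A rough sketch of how the two mechanisms mentioned above (the proto3 `optional` keyword and the well-known wrapper types) could be detected in descriptor data. It assumes field descriptors are available as dictionaries in the canonical JSON shape of `FieldDescriptorProto`, where `proto3Optional` marks an `optional` field. The `is_nullable` helper, the term "nullable", and the sample fields are hypothetical.

```python
# Fully qualified names of the wrapper messages defined in
# google/protobuf/wrappers.proto.
WRAPPER_TYPES = {
    ".google.protobuf.DoubleValue",
    ".google.protobuf.FloatValue",
    ".google.protobuf.Int64Value",
    ".google.protobuf.UInt64Value",
    ".google.protobuf.Int32Value",
    ".google.protobuf.UInt32Value",
    ".google.protobuf.BoolValue",
    ".google.protobuf.StringValue",
    ".google.protobuf.BytesValue",
}


def is_nullable(field):
    """Treat a field as nullable if it is proto3 `optional` or a wrapper."""
    if field.get("proto3Optional", False):
        return True
    return field.get("typeName") in WRAPPER_TYPES


# Hypothetical field descriptors for illustration.
sample_fields = [
    {"name": "id", "type": "TYPE_INT64"},
    {"name": "nickname", "type": "TYPE_STRING", "proto3Optional": True},
    {"name": "bio", "type": "TYPE_MESSAGE",
     "typeName": ".google.protobuf.StringValue"},
]

nullable = [f["name"] for f in sample_fields if is_nullable(f)]
print(nullable)
```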
        
       | grumbles wrote:
       | Huh. tree-sitter seems neat, but I don't really get why the
       | author thinks processing the descriptor set is so hard. Seems
       | equally difficult to learn a bunch of new abstractions in the
       | form of tree-sitter vs just learning protobuf's own ones.
       | 
       | Also, if you're parsing .proto files directly, you have to deal
       | with a bunch of annoying issues like include paths, how you
       | package sets of them to move around, etc. Descriptor sets seem
       | like a better solution to me.
        
       | pcj-github wrote:
       | From the docs: "The protocol compiler can output a
       | FileDescriptorSet containing the .proto files it parses."
       | (https://github.com/protocolbuffers/protobuf/blob/main/src/go...)
       | 
       | I don't understand the point of using tree-sitter to repeat that
       | work (almost certainly having bugs doing so). Am I missing
       | something?
        
       | Arainach wrote:
       | Like others, I don't understand the author's issues getting the
       | stock proto reflection behavior to extract this information.
       | 
       | I'm not as familiar with the Go reflection tools, but getting the
       | information the author wants is trivial in Java reflection.
        
       ___________________________________________________________________
       (page generated 2024-08-03 23:00 UTC)