[HN Gopher] Writing a simple JSON library from scratch: a tour t...
       ___________________________________________________________________
        
       Writing a simple JSON library from scratch: a tour through modern
       C++
        
       Author : eatonphil
       Score  : 40 points
       Date   : 2021-08-26 21:10 UTC (1 hours ago)
        
 (HTM) web link (notes.eatonphil.com)
 (TXT) w3m dump (notes.eatonphil.com)
        
       | trzeci wrote:
       | Title: > a tour through modern C++
       | 
       | Then:
       | 
       | ```
       | #ifndef JSON_H
       | #define JSON_H
       | // ...
       | #endif
       | ```
       | 
       | I still don't get why #pragma once never made it into the
       | standard when every major compiler supports it.
        
         | kitkat_new wrote:
         | I thought modules would be modern C++ now? Am I missing
         | something?
        
       | henning wrote:
       | Why do people make websites that are so narrow that the code
       | snippets have a horizontal scrollbar? I have a 4k monitor. Modern
       | web design is almost as stupid as modern C++.
        
       | cpp2013wow wrote:
       | Noobs writing C++ tutorials are a crime against humanity.
       | Compilers too, but I digress. Are you a time traveler from ten
       | years ago or something? What is this even supposed to be?
        
       | nwellnhof wrote:
       | I know it's only a tutorial but recursive function calls are a
       | bad idea for a general purpose JSON parser. At the very least,
       | limit the recursion depth. Otherwise, it's trivial to make the
       | parser crash with a stack overflow.
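
A minimal sketch of the depth limit suggested in this comment. It is not the article's parser: the hypothetical `skip_array` only walks nested arrays made of brackets and commas, and `max_depth` is an assumed parameter. The point is that the recursive call carries a depth counter and refuses to go deeper than the limit.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical helper: consumes one (possibly nested) array such as
// "[[],[[]]]" and returns the position just past its closing bracket.
// Throws once nesting exceeds max_depth, so hostile input cannot exhaust
// the call stack. It is a sketch, not a JSON validator.
std::size_t skip_array(std::string_view src, std::size_t pos,
                       std::size_t depth, std::size_t max_depth) {
  if (depth > max_depth) throw std::runtime_error("nesting too deep");
  if (pos >= src.size() || src[pos] != '[')
    throw std::runtime_error("expected '['");
  ++pos;  // consume '['
  while (pos < src.size() && src[pos] != ']') {
    if (src[pos] == '[') {
      pos = skip_array(src, pos, depth + 1, max_depth);  // one level deeper
    } else if (src[pos] == ',') {
      ++pos;
    } else {
      throw std::runtime_error("unexpected character");
    }
  }
  if (pos >= src.size()) throw std::runtime_error("unterminated array");
  return pos + 1;  // consume ']'
}

// Returns true when the whole input is one array nested at most max_depth deep.
bool nesting_ok(std::string_view src, std::size_t max_depth) {
  try {
    return skip_array(src, 0, 1, max_depth) == src.size();
  } catch (const std::runtime_error&) {
    return false;
  }
}
```

With a limit of 64, an input of 100000 opening brackets is rejected after 65 calls instead of recursing 100000 frames deep.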
        
         | cpp2013wow wrote:
         | That would require knowing how to iterate a tree without
         | recursion, tough stuff. Hi dang, love our private chat.
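
For reference, walking a tree without recursion mostly means keeping the pending nodes in a container instead of on the call stack. A small sketch, assuming a hypothetical `Node` type that stands in for a parsed array or object:

```cpp
#include <vector>

// Hypothetical tree node standing in for a parsed JSON array or object.
struct Node {
  int value = 0;
  std::vector<Node> children;
};

// Depth-first sum of all values using an explicit stack on the heap, so the
// traversal depth is bounded by available memory rather than the call stack.
long sum_iterative(const Node& root) {
  long total = 0;
  std::vector<const Node*> pending{&root};
  while (!pending.empty()) {
    const Node* node = pending.back();
    pending.pop_back();
    total += node->value;
    for (const Node& child : node->children) pending.push_back(&child);
  }
  return total;
}
```

Note that only the traversal is iterative here; destroying a deeply nested `Node` still recurses through the `std::vector` destructors.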
        
       | jcelerier wrote:
       | That
       | 
       |     struct JSONValue {
       |       std::optional<std::string> string;
       |       std::optional<double> number;
       |       std::optional<bool> boolean;
       |       std::optional<std::vector<JSONValue>> array;
       |       std::optional<std::map<std::string, JSONValue>> object;
       |       JSONValueType type;
       |     };
       | 
       | _really_ ought to be a variant; that would simplify things a lot.
       | It'd just be
       | 
       |     struct JSONValue;
       |     using variant_type =
       |       std::variant<
       |           std::monostate
       |         , std::string
       |         , double
       |         , bool
       |         , std::vector<JSONValue>
       |         , std::map<std::string, JSONValue>
       |       >;
       |     struct JSONValue : variant_type {
       |       using variant::variant;
       |     };
       | 
       | then instead of switches you'd do:
       | 
       |     std::string_view print_type(const variant_type& jtt) {
       |       using namespace std::literals;
       |       struct {
       |         auto operator()(const std::string&) const noexcept { return "String"sv; }
       |         auto operator()(double) const noexcept { return "Number"sv; }
       |         auto operator()(bool) const noexcept { return "Bool"sv; }
       |         auto operator()(const std::vector<JSONValue>&) const noexcept { return "Array"sv; }
       |         auto operator()(const std::map<std::string, JSONValue>&) const noexcept { return "Dict"sv; }
       |         auto operator()(std::monostate) const noexcept { return "Null"sv; }
       |       } vis;
       |       return std::visit(vis, jtt);
       |     }
       | 
       | which is much safer than switch/cases
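
A self-contained C++17 sketch of the variant approach from this comment, written so it can be compiled on its own. Two assumptions are worth flagging: `std::monostate` is listed first so that a default-constructed value is null, matching the visitor's "Null" case, and the recursive definition relies on `std::map` tolerating an incomplete value type, which common standard libraries accept but the standard only promises for `std::vector`, `std::list` and `std::forward_list`. The visitor is a named struct (`TypeName`, a name chosen here) rather than a local one.

```cpp
#include <map>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

struct JSONValue;

using variant_type =
    std::variant<std::monostate, std::string, double, bool,
                 std::vector<JSONValue>, std::map<std::string, JSONValue>>;

struct JSONValue : variant_type {
  using variant_type::variant_type;  // inherit std::variant's constructors
};

// One overload per alternative. Removing an overload makes std::visit fail
// to compile, which is the safety argument over a switch with a default.
struct TypeName {
  std::string_view operator()(std::monostate) const { return "Null"; }
  std::string_view operator()(const std::string&) const { return "String"; }
  std::string_view operator()(double) const { return "Number"; }
  std::string_view operator()(bool) const { return "Bool"; }
  std::string_view operator()(const std::vector<JSONValue>&) const { return "Array"; }
  std::string_view operator()(const std::map<std::string, JSONValue>&) const { return "Dict"; }
};

// Takes the base variant by reference so std::visit sees a plain std::variant.
std::string_view print_type(const variant_type& value) {
  return std::visit(TypeName{}, value);
}
```

Constructing from a string literal is best avoided in code like this: on older compilers a `const char*` argument can select the `bool` alternative, so the sketch is meant to be fed `std::string` values.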
        
         | jcelerier wrote:
         | Since you're interested in readability, you may also like
         | std::format:
         | 
         |     std::string format_parse_error(std::string base, JSONToken token) {
         |       std::ostringstream s;
         |       s << "Unexpected token '" << token.value << "', type '"
         |         << JSONTokenType_to_string(token.type) << "', index ";
         |       s << std::endl << base;
         |       return format_error(s.str(), *token.full_source, token.location);
         |     }
         | 
         | becomes
         | 
         |     std::string format_parse_error(std::string_view base, JSONToken token) {
         |       return format_error(
         |         std::format(
         |             "Unexpected token '{}', type '{}', index '{}'\n{}"
         |           , token.value
         |           , JSONTokenType_to_string(token.type)
         |           , index
         |           , base)
         |         , *token.full_source
         |         , token.location
         |       );
         |     }
        
           | eatonphil wrote:
           | I was excited to use it! But I don't think it's available in
           | Clang 12 (noted in this post), which is the latest version
           | that ships with the latest Fedora.
           | 
           | I didn't want to require a newer compiler version than the
           | one that comes with the distro.
           | 
           | Unless it's just a flag I did not turn on.
        
         | eatonphil wrote:
         | I do get that it's the type-safe approach but your resulting
         | code looks more complicated than what I did (to me).
         | 
         | Still, thanks for sharing so that other people can see for
         | themselves.
         | 
         | One of the biggest turnoffs about std::variant to me (which
         | isn't relevant here) is that you can't name the types so the
         | hacks for having multiple of the same type in std::variant look
         | even hairier.
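
One common workaround for the "can't name the types" complaint is to wrap each same-typed payload in a small named struct, so the wrapper type becomes the name. A sketch with hypothetical token payloads (`StringLiteral`, `Identifier` and `Token` are illustrative names, not from the article):

```cpp
#include <string>
#include <variant>

// Both payloads carry a std::string, but each wrapper gives its alternative
// a distinct type that std::get_if and std::visit can tell apart.
struct StringLiteral { std::string text; };
struct Identifier { std::string text; };

using Token = std::variant<StringLiteral, Identifier, double>;

// Reports which alternative is stored, using the wrapper types as names.
std::string describe(const Token& token) {
  if (const auto* lit = std::get_if<StringLiteral>(&token))
    return "string:" + lit->text;
  if (const auto* id = std::get_if<Identifier>(&token))
    return "identifier:" + id->text;
  return "number";
}
```

The alternative is a `std::variant<std::string, std::string, double>` addressed by index with `std::in_place_index`, which is presumably the hairier hack referred to above.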
        
           | gravypod wrote:
           | A nice follow up would also be looking at std::string_view as
           | this has the potential to save a lot of memory.
        
             | eatonphil wrote:
             | Since I'm unfamiliar with the underlying concept, could you
             | give an example of how you could see that being used?
        
               | gravypod wrote:
               | `std::string_view` is essentially something like this:
               | 
               |     struct string_view {
               |       const char *data;
               |       size_t length;
               |     };
               | 
               | It is a non-owning pointer to a subset of an existing string.
               | It has most of the read-only interface of `std::string`, but
               | passing it around is zero copy. As long as the original
               | memory allocation exists and is not reallocated, your
               | `string_view`s are still valid. Also, `const std::string`
               | and `const char *` both convert to it implicitly, allowing
               | you to define a single function like
               | `Thing ParseThing(std::string_view line)` and accept
               | `char *` and `std::string` as an input.
               | 
               | More examples are here: https://abseil.io/tips/1
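
A short sketch of the single-function idea described in this comment. `key_of` is a hypothetical helper, not something from the article: it accepts both `std::string` and string literals through one `std::string_view` parameter and returns a slice of the caller's buffer rather than a copy.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: returns the text before the first ':' without
// copying. The result points into the caller's buffer, so it is only valid
// while that buffer is alive and unchanged.
std::string_view key_of(std::string_view line) {
  const std::size_t colon = line.find(':');
  return colon == std::string_view::npos ? line : line.substr(0, colon);
}
```

Called with a `std::string owned = "name: value"`, the returned view's `data()` is the same address as `owned.data()`, which is where the memory saving comes from. The flip side is that returning such a view from a function whose argument was a temporary `std::string` leaves it dangling.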
        
               | jcelerier wrote:
               | In every function where you read from a string without
               | writing to it or saving it, you can use std::string_view.
               | Same for vector / array: anything non-owning and read-only
               | should use std::span (C++20) instead.
        
           | jcelerier wrote:
           | It's not only type-safe, it also uses much less memory, takes
           | less time to construct, and does not duplicate the "null"
           | state:
           | 
           | - your JSONValue struct takes 160 bytes (on 64-bit systems
           | with libstdc++), the variant one takes 56, almost three times
           | smaller.
           | 
           | - it needs std::optional<T> for each member to avoid
           | constructing it, while std::variant does that by default. But
           | the compiler still has to initialize all these optionals to
           | their empty state.
           | 
           | - you need two branches on every access:
           | 
           | * one for the type switch
           | 
           | * one for the optional
           | 
           | the variant only has one level of indirection.
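
The size comparison can be checked with `sizeof`. The sketch below assumes enumerator names for `JSONValueType` and renames the two layouts (`OptionalJSON`, `VariantJSON`) so they can coexist in one file; the exact byte counts depend on the standard library and platform, so only the ordering is relied on.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Enumerator names are assumed for illustration.
enum class JSONValueType { String, Number, Boolean, Array, Object, Null };

// Layout quoted from the article: one optional per kind plus a tag.
struct OptionalJSON {
  std::optional<std::string> string;
  std::optional<double> number;
  std::optional<bool> boolean;
  std::optional<std::vector<OptionalJSON>> array;
  std::optional<std::map<std::string, OptionalJSON>> object;
  JSONValueType type;
};

// Variant layout: storage for the largest alternative plus an index.
struct VariantJSON;
using VariantStorage =
    std::variant<std::monostate, std::string, double, bool,
                 std::vector<VariantJSON>, std::map<std::string, VariantJSON>>;
struct VariantJSON : VariantStorage {
  using VariantStorage::VariantStorage;
};

constexpr std::size_t optional_size = sizeof(OptionalJSON);
constexpr std::size_t variant_size = sizeof(VariantJSON);
```

Printing `optional_size` and `variant_size` on a given toolchain shows the actual numbers; the struct pays for the sum of its members while the variant pays only for the largest one.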
        
             | eatonphil wrote:
             | That's fair. I assumed it would be clear to anyone who
             | knows C++ better than I that the point of this post wasn't
             | exactly performance.
             | 
             | Actually I was most interested in just how simple and
             | readable I could make it. But I didn't say that in the post
             | anywhere. Will add.
        
               | jcelerier wrote:
               | > I assumed it would be clear to anyone who knows
               | 
               | It's just not possible to assume that at all on the
               | internet. Every year I have students who learn from blog
               | posts like this for instance.
        
               | bialpio wrote:
               | Performance aside, using variant has an added benefit of
               | making illegal states unrepresentable. Optionals + an
               | enum still allow users to either nullopt all the
               | members, or have more than one member that is not
               | nullopt.
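
A simplified illustration of that point, using a cut-down two-kind version of the struct (`Loose`, `Strict`, `Kind` and `populated` are names made up for this sketch):

```cpp
#include <optional>
#include <string>
#include <variant>

enum class Kind { String, Number };

// Cut-down stand-in for the article's struct: a tag plus one optional per kind.
struct Loose {
  std::optional<std::string> string;
  std::optional<double> number;
  Kind type;
};

// The variant equivalent always holds exactly one alternative (setting aside
// the rare valueless_by_exception state).
using Strict = std::variant<std::string, double>;

// Counts how many payload members of a Loose value are populated.
// Any result other than 1 is a state the type should not have allowed.
int populated(const Loose& v) {
  return (v.string.has_value() ? 1 : 0) + (v.number.has_value() ? 1 : 0);
}
```

Nothing stops `Loose{std::string("x"), 1.0, Kind::Number}` or `Loose{std::nullopt, std::nullopt, Kind::String}` from compiling, whereas assigning a `double` to a `Strict` necessarily replaces the string it held.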
        
               | gravypod wrote:
               | As someone who has been doing quite a bit of C++ recently:
               | for most uses readability is king.
               | 
               | If you did want to squeeze out the extra performance you
               | could also do something where you store `std::variant<>`
               | inside a struct and provide an API like this:
               | 
               |     if (auto int_value = value.i()) {
               |       // *int_value == your int
               |     }
               | 
               | and expose a `std::variant<> &variant();` for use cases
               | where you want the more complex/faster stuff.
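
One way the wrapper described here could look, as a sketch only. The class name `Value`, the choice of alternatives and the `s()` accessor are assumptions; only `i()` and `variant()` come from the comment.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Hypothetical wrapper: friendly accessors on top, with the raw variant
// still reachable for callers who want std::visit.
class Value {
 public:
  using Storage = std::variant<std::monostate, int, std::string>;

  Value() = default;
  Value(int i) : storage_(i) {}
  Value(std::string s) : storage_(std::move(s)) {}

  // Returns the int if that is what is stored, otherwise std::nullopt.
  std::optional<int> i() const {
    if (const int* p = std::get_if<int>(&storage_)) return *p;
    return std::nullopt;
  }

  // Pointer accessor avoids copying the string; null when the type differs.
  const std::string* s() const { return std::get_if<std::string>(&storage_); }

  // Escape hatch for the more complex or faster code paths.
  Storage& variant() { return storage_; }
  const Storage& variant() const { return storage_; }

 private:
  Storage storage_;
};
```

With this, `if (auto int_value = value.i()) { ... }` reads as in the comment, and `*int_value` is the stored int inside the branch.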
        
               | tcbawo wrote:
               | Using variant as the storage type while providing a nice
               | public accessor API makes sense given the huge difference
               | in semantics between a struct containing optional fields
               | and a variant.
        
       ___________________________________________________________________
       (page generated 2021-08-26 23:00 UTC)