json2tsv.1 - json2tsv - JSON to TSV converter
       ---
       json2tsv.1 (3531B)
       ---
            1 .Dd April 17, 2023
            2 .Dt JSON2TSV 1
            3 .Os
            4 .Sh NAME
            5 .Nm json2tsv
            6 .Nd convert JSON to TSV or separated output
            7 .Sh SYNOPSIS
            8 .Nm
            9 .Op Fl n
           10 .Op Fl r
           11 .Op Fl u
           12 .Op Fl F Ar fs
           13 .Op Fl R Ar rs
           14 .Sh DESCRIPTION
           15 .Nm
           16 reads JSON data from stdin.
           17 It outputs each JSON type to a TAB-Separated Value format per line.
           18 .Pp
           19 The options are as follows:
           20 .Bl -tag -width Ds
           21 .It Fl n
           22 Show the indices for array types (by default off).
           23 .It Fl r
           24 Show all control-characters (by default off).
           25 .It Fl u
           26 Unbuffered: flush output after printing each value (by default off).
           27 .It Fl F Ar fs
           28 Use
           29 .Ar fs
           30 as the field separator.
           31 The default is a TAB character.
           32 .It Fl R Ar rs
           33 Use
           34 .Ar rs
           35 as the record separator.
           36 The default is a newline character.
           37 .El
           38 .Sh SEPARATOR CHARACTERS
           39 The
           40 .Ar fs
           41 or
           42 .Ar rs
           43 separators can be specified in the following formats:
           44 .Pp
           45 .Bl -item -compact
           46 .It
           47 \e\e for a backslash character.
           48 .It
           49 \en for a newline character.
           50 .It
           51 \er for a carriage return character.
           52 .It
           53 \et for a TAB character.
           54 .It
           55 \exXX for a character specified in the hexadecimal format as XX.
           56 .It
           57 \eNNN for a character specified in the octal format as NNN.
           58 .El
           59 .Pp
            60 Otherwise, if a single character is specified, that character is used.
            61 If more than one character is specified, it is parsed as a number using the
            62 format supported by
            63 .Xr strtol 3
            64 with base set to 0; the result is the index of the character in the ASCII table.
           65 .Sh OUTPUT FORMAT
           66 The output format per node is:
           67 .Bd -literal
           68 nodename<FIELD SEPARATOR>type<FIELD SEPARATOR>value<RECORD SEPARATOR>
           69 .Ed
           70 .Pp
           71 Control-characters such as a newline, TAB and backslash (\en, \et and \e\e) are
           72 escaped in the nodename and value fields unless a
           73 .Fl F
           74 or
           75 .Fl R
           76 option is specified.
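The default escaping described above can be reversed on the consumer side. The following is a minimal sketch, not part of json2tsv: the helper name unescape_values is hypothetical, and it assumes the only escapes present are \n, \t and \\ as listed above. Any other backslash sequence is passed through unchanged.

```shell
# Hypothetical helper (not part of json2tsv): read default TSV output on
# stdin and print the value field of each record with the escapes
# \n, \t and \\ expanded again.
unescape_values() {
    awk -F '\t' '
    function unescape(s,    out, i, c, n) {
        out = ""
        n = length(s)
        for (i = 1; i <= n; i++) {
            c = substr(s, i, 1)
            if (c == "\\" && i < n) {
                # Look at the character after the backslash.
                i++
                c = substr(s, i, 1)
                if (c == "n") c = "\n"
                else if (c == "t") c = "\t"
                else if (c == "\\") c = "\\"
                else c = "\\" c   # unknown escape: keep it as-is
            }
            out = out c
        }
        return out
    }
    { print unescape($3) }'
}

# Usage, assuming json2tsv is installed and input.json exists:
#   json2tsv < input.json | unescape_values
```

Scanning character by character avoids the ordering problems of chained substitutions, where an escaped backslash followed by the letter n could be mistaken for an escaped newline.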
           77 .Pp
           78 When the
           79 .Fl F
           80 or
           81 .Fl R
           82 option is specified then the separator characters are removed from the output.
           83 TABs or newlines are printed unless they are set as a separator.
           84 Other control-characters are removed, unless the option
           85 .Fl r
           86 is set.
           87 .Pp
           88 The type field is a single byte and can be:
           89 .Pp
           90 .Bl -item -compact
           91 .It
           92 a for array
           93 .It
           94 b for bool
           95 .It
           96 n for number
           97 .It
           98 o for object
           99 .It
          100 s for string
          101 .It
          102 ? for null
          103 .El
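As an illustration of the type field, the sketch below tallies records per type and prints the long name from the list above. The helper name count_types is hypothetical and it assumes the default TAB field separator and newline record separator.

```shell
# Hypothetical helper (not part of json2tsv): count records per type
# using the single-byte type field ($2) of the default TSV output.
count_types() {
    awk -F '\t' '
    BEGIN {
        name["a"] = "array";  name["b"] = "bool";   name["n"] = "number"
        name["o"] = "object"; name["s"] = "string"; name["?"] = "null"
    }
    { count[$2]++ }
    END {
        # Print in a fixed order so the result is reproducible.
        n = split("a b n o s ?", order, " ")
        for (i = 1; i <= n; i++)
            if (order[i] in count)
                printf "%s\t%d\n", name[order[i]], count[order[i]]
    }'
}

# Usage, assuming json2tsv is installed and input.json exists:
#   json2tsv < input.json | count_types
```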
          104 .Sh EXIT STATUS
          105 .Nm
           106 exits with status 0 on success, 1 on a parse error, 2 when out of
           107 memory or on a read/write error, or 3 on a usage error.
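A calling script can branch on these status codes. The following is a small sketch: the function name json2tsv_status is hypothetical, and the messages simply restate the meanings given in this section.

```shell
# Hypothetical helper (not part of json2tsv): translate an exit status
# into the meaning documented in EXIT STATUS.
json2tsv_status() {
    case "$1" in
    0) echo "success" ;;
    1) echo "parse error" ;;
    2) echo "out of memory or read/write error" ;;
    3) echo "usage error" ;;
    *) echo "unknown status: $1" ;;
    esac
}

# Usage, assuming json2tsv is installed and input.json exists:
#   json2tsv < input.json > output.tsv
#   echo "json2tsv: $(json2tsv_status "$?")" >&2
```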
          108 .Sh EXAMPLES
          109 .Bd -literal
          110 json2tsv < input.json | awk -F '\et' '$1 == ".url" { print $3 }'
          111 .Ed
          112 .Pp
          113 To filter without having to unescape characters the
          114 .Fl F
          115 and
          116 .Fl R
          117 options can be used.
           118 The example below uses the ASCII character 0x1f (Unit Separator) as the
          119 field separator and the ASCII character 0x1e (Record Separator) as the record
          120 separator.
          121 Additionally the
          122 .Fl r
          123 option is used so control-characters are printed.
          124 .Bd -literal
          125 json2tsv -r -F '\ex1f' -R '\ex1e' < input.json | \e
          126         awk '
          127         BEGIN {
          128                 FS = "\ex1f"; RS = "\ex1e";
          129         }
          130         $1 == ".url" {
          131                 print $3;
          132         }'
          133 .Ed
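The awk side of this pipeline can be tried without json2tsv by feeding it hand-written records. In the sketch below the helper name filter_url and the sample data are made up for illustration and are not real json2tsv output. The separators are written as the octal escapes \037 and \036, on the assumption that these are more widely supported by awk implementations than hexadecimal escapes.

```shell
# Hypothetical helper: the same filter as above, with the separators
# written as octal escapes (\037 = 0x1f, \036 = 0x1e).
filter_url() {
    awk '
    BEGIN {
        FS = "\037"; RS = "\036";
    }
    $1 == ".url" {
        print $3;
    }'
}

# Hand-written sample records (not real json2tsv output): the first
# value contains a raw TAB, which needs no escaping with these separators.
#   printf '.title\037s\037a\tb\036.url\037s\037https://example.org/\036' | filter_url
```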
          134 .Pp
           135 The example can be simplified using the convenience wrapper shell script
           136 .Xr jaq 1 :
          137 .Bd -literal
          138 jaq '$1 == ".url" { print $3 }' < input.json
          139 .Ed
          140 .Sh SEE ALSO
          141 .Xr awk 1 ,
          142 .Xr jaq 1
          143 .Sh AUTHORS
          144 .An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
          145 .Sh CAVEATS
          146 .Bl -item
          147 .It
           148 Characters in object keys such as a dot or brackets are not escaped in the TSV
           149 output; this can change the meaning of the nodename field.
          150 .It
          151 The JSON parser handles all valid JSON.
           152 It also accepts some invalid JSON: it does not fully validate numbers
           153 and is not strict about Unicode input.
           154 See also RFC 8259, section 9 (Parsers).
          155 .It
           156 The maximum nesting depth of objects and arrays is hard-coded to 64 levels.
          157 .El