json.c: fix utf-16 surrogate pair range - json2tsv - JSON to TSV converter
git clone git://git.codemadness.org/json2tsv
---
commit b65bd5139faec35430d342dbce6c3b4bf802f4a8
parent 8128e7fa5d865ac7adff28e7ffb732f0b3b61f58
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 22 Jan 2021 00:23:19 +0100
json.c: fix utf-16 surrogate pair range
Test-case of a high codepoint: U+10FFFD.
Previously incorrect:
    printf '%s' '["\udbff\udffd"]' | json2tsv | hexdump -C
    00000000  09 61 09 0a 5b 5d 09 73  09 ed af bf ed bf bd 0a  |.a..[].s........|
    00000010
Now correct:
    printf '%s' '["\udbff\udffd"]' | ./json2tsv | hexdump -C
    00000000  09 61 09 0a 5b 5d 09 73  09 f4 8f bf bd 0a        |.a..[].s......|
    0000000e
Diffstat:
M json.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
---
diff --git a/json.c b/json.c
@@ -152,8 +152,8 @@ escchr:
 		cp |= (hexdigit(c) << i);
 	}
 	/* RFC8259 - 7. Strings - surrogates.
-	 * 0xd800 - 0xdb7f - high surrogates */
-	if (cp >= 0xd800 && cp <= 0xdb7f) {
+	 * 0xd800 - 0xdbff - high surrogates */
+	if (cp >= 0xd800 && cp <= 0xdbff) {
 		if ((c = GETNEXT()) != '\\') {
 			len += codepointtoutf8(cp, &str[len]);
 			goto chr;