utf8pad, colw: fix byte-seek issue with negative width codepoints in the range >= 127 - sfeed_curses - sfeed curses UI (now part of sfeed, development is in sfeed)
git clone git://git.codemadness.org/sfeed_curses
---
commit fe6690c9b66d1eb956d8effc5b0a305b24725db5
parent 7f13213a355aba904f12a595b322909ce630fbe1
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sat, 9 Jan 2021 16:00:51 +0100
utf8pad, colw: fix byte-seek issue with negative width codepoints in the range >= 127
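For example: "\xef\xbf\xb7" (codepoint 0xfff7) is a valid multibyte
sequence according to mbtowc(), but wcwidth(wc) returns -1 for it.
In that case the loop advanced by one byte instead of skipping the whole
sequence, so the next byte was seeked incorrectly.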
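
The effect of the byte-seek change can be illustrated with a small, self-contained model of the loop. This is only a sketch, not the code from sfeed_curses.c: `decode()` and `width()` are hypothetical stand-ins for mbtowc() in a UTF-8 locale and for wcwidth(), and `colw_model()` with its `fixed` flag is an invented name that switches between the seeking logic before and after this commit. The stand-in `width()` simply assumes that 0xfff7 has width -1, as the commit message states, and that everything else has width 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mbtowc() in a UTF-8 locale: decodes one
 * sequence of 1 to 3 bytes, returns its length or -1 if invalid. */
static int
decode(unsigned long *cp, const char *s, size_t n)
{
	const unsigned char *u = (const unsigned char *)s;

	if (n >= 1 && u[0] < 0x80) {
		*cp = u[0];
		return 1;
	}
	if (n >= 2 && (u[0] & 0xe0) == 0xc0 && (u[1] & 0xc0) == 0x80) {
		*cp = ((unsigned long)(u[0] & 0x1f) << 6) | (u[1] & 0x3f);
		return 2;
	}
	if (n >= 3 && (u[0] & 0xf0) == 0xe0 &&
	    (u[1] & 0xc0) == 0x80 && (u[2] & 0xc0) == 0x80) {
		*cp = ((unsigned long)(u[0] & 0x0f) << 12) |
		      ((unsigned long)(u[1] & 0x3f) << 6) | (u[2] & 0x3f);
		return 3;
	}
	return -1;
}

/* Hypothetical stand-in for wcwidth(): assumes -1 for 0xfff7, else 1. */
static int
width(unsigned long cp)
{
	return cp == 0xfff7 ? -1 : 1;
}

/* Column width of s. fixed == 0 models the seeking before the commit,
 * fixed != 0 models the seeking after it. */
size_t
colw_model(const char *s, int fixed)
{
	unsigned long cp;
	size_t col = 0, i, slen = strlen(s);
	int inc, rl, w;

	for (i = 0; i < slen; i += inc) {
		inc = 1; /* next byte */
		if ((unsigned char)s[i] < 32)
			continue;
		rl = decode(&cp, &s[i], slen - i);
		if (fixed)
			inc = rl; /* after: seek past the whole sequence */
		if (rl < 0) {
			inc = 1; /* invalid, seek next byte */
			w = 1; /* replacement char is one width */
		} else if ((w = width(cp)) == -1) {
			continue; /* before the fix, inc is still 1 here */
		} else if (!fixed) {
			inc = rl; /* before: only set for printable codepoints */
		}
		col += w;
	}
	return col;
}
```

With the input `"\xef\xbf\xb7" "a"`, the pre-commit logic steps into the two continuation bytes, treats each as an invalid sequence of width 1 and reports 3 columns. The post-commit logic skips all three bytes of the sequence and reports 1 column for the trailing "a".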
Diffstat:
M sfeed_curses.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
---
diff --git a/sfeed_curses.c b/sfeed_curses.c
@@ -315,19 +315,18 @@ colw(const char *s)
slen = strlen(s);
for (i = 0; i < slen; i += inc) {
- inc = 1;
+ inc = 1; /* next byte */
if ((unsigned char)s[i] < 32) {
continue;
} else if ((unsigned char)s[i] >= 127) {
rl = mbtowc(&wc, &s[i], slen - i < 4 ? slen - i : 4);
+ inc = rl;
if (rl < 0) {
mbtowc(NULL, NULL, 0); /* reset state */
- inc = 1; /* next byte */
+ inc = 1; /* invalid, seek next byte */
w = 1; /* replacement char is one width */
} else if ((w = wcwidth(wc)) == -1) {
continue;
- } else {
- inc = rl;
}
col += w;
} else {
@@ -355,19 +354,18 @@ utf8pad(char *buf, size_t bufsiz, const char *s, size_t len, int pad)
slen = strlen(s);
for (i = 0; i < slen; i += inc) {
- inc = 1;
+ inc = 1; /* next byte */
if ((unsigned char)s[i] < 32)
continue;
rl = mbtowc(&wc, &s[i], slen - i < 4 ? slen - i : 4);
+ inc = rl;
if (rl < 0) {
mbtowc(NULL, NULL, 0); /* reset state */
- inc = 1; /* next byte */
+ inc = 1; /* invalid, seek next byte */
w = 1; /* replacement char is one width */
} else if ((w = wcwidth(wc)) == -1) {
continue;
- } else {
- inc = rl;
}
if (col + w > len || (col + w == len && s[i + inc])) {