[HN Gopher] Every bug/quirk of the Windows resource compiler (rc...
___________________________________________________________________
Every bug/quirk of the Windows resource compiler (rc.exe), probably
Author : nektro
Score : 209 points
Date : 2024-10-11 22:47 UTC (1 days ago)
(HTM) web link (www.ryanliptak.com)
(TXT) w3m dump (www.ryanliptak.com)
| squeek502 wrote:
| I'm the author if anyone has questions
| 1f60c wrote:
| Resinator's error messages look amazing! I also feel like I've
| gained a lot of cursed but useless (to me) knowledge, so thanks
| for that. :-)
|
| I don't have a horse in this race, but regarding FONT
| resources, I would like to humbly suggest not supporting them
| at all. Radical, but from what you wrote, they do seem pretty
| weird and ripe for accidental misuse. Plus, they are obsolete
| and it seems like Resinator already intentionally diverges from
| rc.exe in a few cases anyway.
| squeek502 wrote:
| Thanks!
|
| I'm actually pretty okay with where I've landed with FONT
| resources. The legwork has already been done in figuring
| things out, and with the strategy I've chosen, resinator
| doesn't even need to parse .fnt files at all, so the
| implementation is pretty simple (I wrote a .fnt parser, but
| it's now unused[1]).
|
| [1] https://github.com/squeek502/resinator/blob/master/src/fnt.z...
| kcbanner wrote:
| Thanks again for resinator! I recently used it to ship a small
| win32 utility (https://github.com/kcbanner/multi-mouse) and it
| worked perfectly.
| InvisibleUp wrote:
| The quote escaping seems to be identical to that of Visual
| Basic 6 (and likely QBASIC as well, although I haven't tested
| that.)
| bramhaag wrote:
| Fun article, thank you! One nitpick: some of the side-by-side
| code blocks overflow on mobile (Pixel 8 Pro, Firefox)
| Arnavion wrote:
| I was thinking `NOT (1|2)` and `NOT () 2` could make sense if the
| parser just has a `not_in_effect` flag that gets set to true when
| a `NOT` is encountered and then applies to the next integer as
| soon as one is parsed. So `NOT (1|2)` sets the flag, then starts
| parsing `(1|2)`. Once it's parsed the `1`, it notices a NOT is in
| effect so it applies it to the 1 as if it had just parsed `NOT 1`
| (which leaves 0 unchanged), then parses `| 2`, so the result is
| 2.
|
| `NOT () 2` would be the same logic. `)` signifies the end of an
| expression and thus evaluates to the current integral result,
| which is 0 (for the same reason that unary - is zero), and a NOT
| is in effect so it's treated as `NOT 0` which is a no-op ("unset
| no bits"). Then the next `2` makes the result `2`. This assumes
| that `x y` is parsed the same as `x | y` (maybe only if a `NOT`
| has been parsed at any point first) or as `y` (the same stack-
| like "the last number that was parsed becomes the result"
| behavior described in other items).
|
| This doesn't explain the `7 NOT NOT 4 NOT 2 NOT NOT 1 = 2` case
| though. If the parser just *sets* the `not_in_effect` flag when
| it encounters a `NOT` (instead of *toggling* it), then this would
| be `7 | NOT 4 | NOT 2 | NOT 1` which would be 0. If the parser
| does toggle the flag, this would be `7 | 4 | NOT 2 | 1` which
| would be 1 or 5. If the parser treats a `NOT` as ending the
| previous expression (if any), this would be `7 | NOT 0 | NOT 4 |
| NOT 2 | NOT 0 | NOT 1` which would be 0.
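The flag hypothesis above can be sketched in a few lines. This is only a toy model of the comment's guess, not rc.exe's actual parser: the function name `evaluate`, the `toggle` switch, and the choice to treat juxtaposition as `|` and to let `)` simply consume a pending NOT are all assumptions made for illustration.

```python
import re

def evaluate(src: str, toggle: bool = False) -> int:
    """Toy evaluator for the hypothesised `not_in_effect` flag (illustrative only)."""
    tokens = re.findall(r"NOT|\d+|[()|]", src)
    result = 0
    not_in_effect = False
    for tok in tokens:
        if tok == "NOT":
            # Either always set the flag, or toggle it on every NOT.
            not_in_effect = (not not_in_effect) if toggle else True
        elif tok.isdigit():
            value = int(tok)
            if not_in_effect:
                result &= ~value   # `NOT x` unsets the bits of x
                not_in_effect = False
            else:
                result |= value    # assumed: `x y` behaves like `x | y`
        elif tok == ")":
            # Assumed: `)` yields 0, so a pending NOT becomes the no-op `NOT 0`.
            not_in_effect = False
        # "(" and "|" need no action in this simplified model.
    return result

print(evaluate("NOT (1|2)"))
print(evaluate("NOT () 2"))
print(evaluate("7 NOT NOT 4 NOT 2 NOT NOT 1"))
print(evaluate("7 NOT NOT 4 NOT 2 NOT NOT 1", toggle=True))
```

Under these assumptions the first two expressions evaluate to 2, while neither variant yields 2 for the `7 NOT NOT 4 NOT 2 NOT NOT 1` case, which is the gap the comment points out.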
| immibis wrote:
| That's a crazy amount of work and a crazy amount of quirks
| indeed. Very much illustrates a mindset where the user is at
| fault if they provide bad input - and development effort for
| everything was multiplied compared to today. In 1985, of course,
| nobody cared about things like security from untrusted inputs,
| and reproducible builds.
|
| My favourite bug from this list is that the compiler expands tabs
| to spaces in string literals and puts them at tab stops based on
| the string literal's horizontal position in the source file.
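To make the tab behaviour concrete, here is a rough sketch of column-dependent tab expansion. The helper `expand_tabs`, the 0-based column, and the tab width of 8 are assumptions for illustration; the comment itself does not state the tab width.

```python
def expand_tabs(literal: str, start_col: int, tab_width: int = 8) -> str:
    """Expand tabs to spaces, aligning to tab stops counted from the source column.

    start_col is the (assumed 0-based) column where the literal's first
    character sits in the source file.
    """
    out = []
    col = start_col
    for ch in literal:
        if ch == "\t":
            pad = tab_width - (col % tab_width)  # distance to the next tab stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)

# The same literal compiles to different bytes depending on its indentation.
print(repr(expand_tabs("a\tb", start_col=0)))
print(repr(expand_tabs("a\tb", start_col=4)))
```

The point of the sketch is that the output depends on `start_col`, so re-indenting the source line changes the compiled string.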
|
| I think that being able to directly define resource type 6 is not
| a bug. You got exactly what you asked for - an invalid resource.
| Crashing when loading it isn't a bug, either.
|
| I suppose that style flag arguments are parsed as |-separated
| lists of numeric or NOT expressions, rather than single
| expressions where | serves as bitwise-or.
|
| > If the truncated value is >= 0x80, add 0xFF00 and write the
| result as a little-endian u32. If the truncated value is < 0x80
| but not zero, write the value as a little-endian u32.
|
| This is sign-extension: s8 -> s16 -> u16 -> u32. The examples
| below this also seem to have reversed the order of the input byte
| and the FF.
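The equivalence claimed above can be checked directly. This sketch compares the quoted rule with an explicit s8 -> s16 -> u16 -> u32 sign-extension; the function names are made up for illustration, and the zero case is left out because the quoted text does not cover it.

```python
import struct

def quoted_rule(byte: int) -> bytes:
    """The rule as quoted: add 0xFF00 when >= 0x80, then write a little-endian u32."""
    value = byte + 0xFF00 if byte >= 0x80 else byte
    return struct.pack("<I", value)

def sign_extend(byte: int) -> bytes:
    """s8 -> s16 -> u16 -> u32, written as a little-endian u32."""
    s8 = struct.unpack("<b", bytes([byte]))[0]  # reinterpret the byte as signed
    u16 = s8 & 0xFFFF                           # widen to 16 bits, view as unsigned
    return struct.pack("<I", u16)

# Both formulations agree for every non-zero byte value.
assert all(quoted_rule(b) == sign_extend(b) for b in range(1, 256))
print(quoted_rule(0x80).hex())
```

For an input byte of 0x80 both give the bytes 80 FF 00 00, with the input byte first and the FF second.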
|
| Visual C++ 6, at least, includes a toolbar resource editor. IIRC
| it shows the toolbar metadata and the bitmap together in one
| editor, and you edit each button's image individually even though
| they are concatenated into one bitmap in the resource file.
|
| "GROUPBOX can only be used in DIALOGEX" might refer to some
| limitation other than the resource compiler. For example, perhaps
| Windows versions that don't support DIALOGEX also don't support
| GROUPBOX.
|
| A lot of them could be caused by memory safety errors. For
| example the fact that "1 ICON {" treats "ICON" as the filename is
| probably because the tokenizer doesn't set the Microsoft
| equivalent of yytext for tokens where it's not supposed to be
| relevant. Maybe it would even crash (null pointer) if { could be
| the first token (which it can't).
| squeek502 wrote:
| Appreciate the added context!
|
| > |-separated lists of numeric or NOT
|
| Note that | is not the only operator that can be used in style
| parameters, & + and - are all allowed too.
|
| > perhaps Windows versions that don't support DIALOGEX also
| don't support GROUPBOX
|
| Seems possible for sure. From [1]:
|
| > The 16-bit extended dialog template is purely historical. The
| only operating systems to support it were the Windows 95/98/Me
| series.
|
| [1]
| https://devblogs.microsoft.com/oldnewthing/20040622-00/?p=38...
|
| > The examples below this also seem to have reversed the order
| of the input byte and the FF.
|
| Good catch, fixed
| pilif wrote:
| What an amazing article. And what an amazing analysis of this
| 30-year-old blob. This was super enjoyable to read.
|
| My only tiny gripe is that, with the first quirk, the author, who
| insists that his implementation is bug-for-bug compatible, lists
| the other implementations' behavior and explains that they are
| very different from the original.
|
| And then they proceed with additional quirks where the supposedly
| bug-for-bug compatible implementation _also_ is different in the
| same way as the first example in that it produces an error
| message rather than the quirky output.
|
| Don't get me wrong: errors rather than quirks is much better
| behavior, but then don't claim to be bug-for-bug compatible, nor
| roast other implementations for doing the same thing.
| squeek502 wrote:
| Apologies if I got the tone wrong there, I definitely wasn't
| trying to roast the other projects.
|
| In terms of what I prioritized bug-for-bug compatibility on, I
| tailored it to getting
| https://github.com/squeek502/win32-samples-rc-tests passing
| 100%, and then also tried to take into account how likely a
| bug/quirk was to be used in a real .rc file (ultimately this is
| just a judgement call, though). The results of that test suite
| (provided in the readme) are also a better indication of how
| rc.exe-compatible the various resource compilers are in
| practice (i.e. on 'real' .rc files).
| urbandw311er wrote:
| I liked this article. I would suggest having a new category of
| "validation" for some of these. It's not particularly fair to
| call something a bug, for example, when it's just that rc.exe
| doesn't play nicely with things it never expected to receive,
| like non-numeric characters etc.
| rwmj wrote:
| Very brave author. I have contributed a few patches to WINDRES to
| fix some bugs and it's a strange tool / concept.
|
| I'm going to guess that Microsoft won't wish to fix any bugs in
| RC.EXE, since that would break some existing resource scripts,
| and at this point backwards compatibility is much more important
| than dealing with quirks.
|
| Edit: Reminds me a bit of my adventures with the Registry,
| another ill-conceived part of Windows:
| https://rwmj.wordpress.com/2010/02/18/why-the-windows-regist...
| squeek502 wrote:
| I think everything labeled 'miscompilation' could be fixed
| without breaking backwards compatibility, since triggering them
| always leads to an unusable/broken .res file. No clue how
| likely it is they'll be fixed, though.
| o11c wrote:
| You're assuming that anything will _notice_ if the .res file
| is broken, though. It might add a .res because "that's what
| Windows programs are supposed to do", or "only old versions
| of the program actually used that", or "everybody knows it
| crashes if you use that menu option, so don't use it."
|
| But the build still depends on the _compilation_ of resources
| succeeding.
| layer8 wrote:
| The concept is actually great, having declarative GUI
| components compiled into efficient binary representations, and
| allowing named values to be shared in a single-point-of-
| definition style with the code that will be handling those
| components at runtime.
| oefrha wrote:
| > My resource compiler implementation, resinator, has now reached
| relative maturity and has been merged into the Zig compiler (but
| is also maintained as a standalone project),
|
| I was going to say scope creep, but then I remembered I've
| replaced the cross toolchain with `zig cc` in a few small cgo
| projects. Does zig intend to become the busybox of compilers?
| squeek502 wrote:
| See
| https://www.ryanliptak.com/blog/zig-is-a-windows-resource-co...
| for an example of using the Zig build system to cross-compile an
| existing Windows GUI program written in C from any supported
| host system.
| oefrha wrote:
| Yeah I have used windres for that purpose in the past, open
| to replacing that with a zig invocation.
| mananaysiempre wrote:
| > Somehow, the filename { causes rc.exe to think the filename
| token is actually the preceding token, so it's trying to
| interpret ICON as both the resource type and the file path of the
| resource. Who knows what's going on there.
|
| > [...]
|
| > Strangely, rc.exe will treat FOO [in place of the resource
| type, followed by EOF] as both the type of the resource and as a
| filename (similar to what we saw earlier in "BEGIN or { as
| filename").
|
| Having written a stupid lexer recently, I'm almost certain I know
| what's going on there. The lexer has a lexeme type global that it
| always sets, and a separate lexeme text global that it only sets
| for text-like tokens (numbers, identifiers, strings, and bare
| filenames, which it does not interpret other than to tell where
| they end, that being the easiest way to deal with ANSI C's
| _pp-number_s) but not for punctuation or EOF. Now have the code that
| looks for the resource filename blindly reach into the text
| global without first checking the type global (not even to check
| if it's EOF), and you get exactly the behaviour above.
|
| (Alternatively, the type could instead be returned by the next-
| lexeme function--that's what my stupid lexer currently does,
| anyway, though I'm considering changing it. The result is the
| same.)
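The stale-text hypothesis above is easy to reproduce in miniature. The following is a toy model only, not rc.exe's code: the `Lexer` class, its `type`/`text` fields, and `parse_resource` are hypothetical names, and whitespace splitting stands in for real tokenization.

```python
class Lexer:
    """Toy lexer: always sets `type`, but only sets `text` for text-like tokens."""

    def __init__(self, src: str):
        self.words = src.split()
        self.pos = 0
        self.type = None
        self.text = None

    def next(self) -> None:
        if self.pos >= len(self.words):
            self.type = "EOF"      # text is deliberately left untouched
            return
        word = self.words[self.pos]
        self.pos += 1
        if word in ("{", "}"):
            self.type = "PUNCT"    # text is deliberately left untouched
        else:
            self.type = "TEXT"
            self.text = word

def parse_resource(src: str):
    lx = Lexer(src)
    lx.next()
    res_id = lx.text
    lx.next()
    res_type = lx.text
    lx.next()
    filename = lx.text  # bug being modelled: lx.type is never checked here
    return res_id, res_type, filename

print(parse_resource("1 ICON app.ico"))
print(parse_resource("1 ICON {"))
print(parse_resource("1 FOO"))
```

In this model both `1 ICON {` and a resource type followed by EOF end up reusing the previous token's text as the filename, which matches the two behaviours quoted from the article.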
|
| > For whatever reason, rc.exe will just take the last number
| literal in the expression and try to read from a file with that
| name [...].
|
| I think I'm skilled enough to fuck that up too: the code to read
| the filename calls the expression parser (which is either not
| supposed to be called at EOF, or returns EOF that you're supposed
| to check for in case that happens) and then blindly reaches into
| the lexeme text variable.
| phaedrus wrote:
| I recently wrote an ANTLR4 parser for RC files, as part of
| software archeology on a legacy codebase I support. Considering
| how many programs over the last 30 (35?) years exist that use the
| Windows resource compiler, it's surprising how little in-depth
| information and how few open source alternative tools exist for
| it. So I'm really glad to see both the information and the
| project in this post.
| layer8 wrote:
| It's impressive that, with all those shenanigans going on, more
| crashing scenarios weren't uncovered.
___________________________________________________________________
(page generated 2024-10-13 22:01 UTC)