Post ATF7othgfEgpugUuye by boredzo@mastodon.social
(DIR) Post #ATF7oEmYoDE6zvSjNg by boredzo@mastodon.social
2022-11-24T07:14:53Z
0 likes, 0 repeats
You know what would be cool? If Alsoft would bring back PlusMaker. Now that macOS can't read legacy HFS volumes, it'd be great to have a thing I could drop a disk image on and have it spit out a new, HFS+ disk image with the same contents.
(DIR) Post #ATF7oFY3xbDxNFgfQG by boredzo@mastodon.social
2022-11-27T17:25:46Z
0 likes, 0 repeats
So I've been working on this. It is a hairy yak indeed, but I'm making progress. But the part I've just finished—translating the volume header—is the easiest part. The catalog format is different in HFS+, so catalog entries need to be translated as well. That might mean rebuilding the entire catalog; at any rate, it will take up more space. I *should* be able to translate or rebuild it in-memory, and then be able to tell whether the destination device is big enough for the new catalog.
(DIR) Post #ATF7oG65v4Rj4nbqT2 by boredzo@mastodon.social
2022-11-27T18:28:02Z
0 likes, 0 repeats
The need for the new catalog to be bigger means that attempting to keep it in the same extent(s) as the old catalog is pretty much pointless. Best case, one extent grows; worst case, one or more new extents are needed. Seems like the easiest way might be just putting it in one big new extent and shifting everything else forward by the difference in size between the old first extent (if it was used by the catalog) and the new one. Or even shoving the catalog down at the end.
(DIR) Post #ATF7oGmdMuTRCjVom0 by boredzo@mastodon.social
2022-11-27T18:33:07Z
0 likes, 0 repeats
On the flip side, the extents overflow file might get *smaller*. For one thing, adjacent extents could be consolidated, even if they were previously impossible to consolidate under HFS because of number limits (HFS+ uses bigger numbers). But more significantly, extent records hold more extents in HFS+ (8, up from 3), so a file with up to 5 overflow extents (or more in cases of consolidation) would be able to hold all of them in the catalog record.
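To make the arithmetic concrete, here is a minimal C sketch that counts how many of a fork's extents spill into the extents overflow file. The function and constant names are made up for illustration; only the 3-per-record and 8-per-record figures come from the post above.

```c
#include <assert.h>
#include <stddef.h>

/* Extents held directly in a catalog record, as stated in the post. */
enum { HFS_EXTENTS_PER_RECORD = 3, HFSPLUS_EXTENTS_PER_RECORD = 8 };

/* Hypothetical helper: how many of a fork's extents do not fit in the
 * catalog record and so have to live in the extents overflow file. */
static size_t overflow_extent_count(size_t total_extents, size_t extents_in_record)
{
    assert(extents_in_record > 0);
    return total_extents > extents_in_record
        ? total_extents - extents_in_record
        : 0;
}
```

A fork with 8 extents needs 5 overflow extents under HFS and none under HFS+, which is the "up to 5 overflow extents" case described above.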
(DIR) Post #ATF7oHTsm74JMrkMBU by boredzo@mastodon.social
2022-11-27T18:35:01Z
0 likes, 0 repeats
I'm trying to avoid making too many changes to the data (I want a converter, not necessarily an optimizer) but if it frees up space for the growth of the catalog file, it might be worth it.
(DIR) Post #ATF7oI42bfzZB0fEXo by boredzo@mastodon.social
2022-11-27T18:44:25Z
0 likes, 0 repeats
The other thing I'm debating is whether to just slurp the entire source volume into memory. *Technically*, it is possible to have an HFS volume of multiple GB, for which this would cause problems. But *practically*, I want this for converting CD-ROMs, and modern Macs have enough memory that you can have a short-lived 700 MB buffer (or even two) and it's fine.
(DIR) Post #ATF7oIggI0tt6qk5lw by boredzo@mastodon.social
2022-11-27T21:18:58Z
0 likes, 0 repeats
Current rough plan: https://mastodon.social/@boredzo/109417732067787130
- block-copy the entire volume verbatim for throughput's sake;
- mmap the source volume to translate from;
- selectively overwrite free space in the new volume with the translated data.
(I hope nobody was hoping to use this as a pipeline filter.)
(DIR) Post #ATF7oJHY4wOIxBzXEm by boredzo@mastodon.social
2022-11-27T23:22:49Z
0 likes, 0 repeats
A consequence of HFS's extents scheme is that it scales more by fragmentation or proliferation of files than sheer numbers of bytes. For example, the Quake 3 Arena for Mac CD-ROM is 517.6 MB, not a byte of it free. Its catalog… is 11 blocks long, or about 90 K. It helps that this volume has a block size of 8 K, so very few extents (which are measured in blocks) can cover relatively large files. In fact, one extent of 65,535 blocks at 8 K each could hold this entire volume.
(DIR) Post #ATF7oJwJdN06zd45mS by boredzo@mastodon.social
2022-11-28T02:29:59Z
0 likes, 0 repeats
Questions I have not found an answer for in Inside Macintosh: Files, nor in any technote: If an extent record has a non-empty first extent, an empty second extent, and a non-empty third extent, what is the correct interpretation of that record? (“correct”, absent documentation, would have to mean “what does Classic Mac OS do”)
(DIR) Post #ATF7oKV3YCn2jNJpvk by boredzo@mastodon.social
2022-11-28T02:33:41Z
0 likes, 0 repeats
I should probably also detect extents that overlap in whole or in part, and refuse to convert the volume in such cases. It would have to be a result of either software or hardware corruption, and I'd certainly be in no position to figure out which bit got flipped.
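One way such a check could look, as a sketch only: collect every extent on the volume, sort by starting block, and look for a start that falls before the furthest end seen so far. The `struct extent` layout and function names are assumptions for illustration, not the author's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative extent: a run of allocation blocks (field names are made up). */
struct extent {
    uint32_t start_block;
    uint32_t block_count;
};

static int compare_by_start(const void *a, const void *b)
{
    const struct extent *x = a, *y = b;
    return (x->start_block > y->start_block) - (x->start_block < y->start_block);
}

/* Sorts the array in place, then reports whether any two non-empty
 * extents share at least one allocation block. */
static bool extents_overlap(struct extent *extents, size_t count)
{
    if (count < 2)
        return false;
    assert(extents != NULL);
    qsort(extents, count, sizeof extents[0], compare_by_start);

    uint64_t furthest_end = 0;  /* one past the last block covered so far */
    for (size_t i = 0; i < count; i++) {
        if (extents[i].block_count == 0)
            continue;  /* empty extents cover nothing */
        if (extents[i].start_block < furthest_end)
            return true;
        uint64_t end = (uint64_t)extents[i].start_block + extents[i].block_count;
        if (end > furthest_end)
            furthest_end = end;
    }
    return false;
}
```

A converter following the post's plan would refuse the volume as soon as this returns true.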
(DIR) Post #ATF7oKxlpRl6AQklge by boredzo@mastodon.social
2022-11-28T00:56:10Z
0 likes, 0 repeats
Been setting up Synalyze It! to enable me to view HFS volume headers outside of my own program. This also enables me to check my program's output (in particular, discovering endianness errors).
(DIR) Post #ATF7oL8PBuGWhPjGGO by boredzo@mastodon.social
2022-11-28T05:36:58Z
0 likes, 0 repeats
Making progress on catalog parsing:
Slurped catalog file: <ImpBTreeFile 0x101320ae0 with estimated 328 nodes>
Leaf node has 768 records
Leaf node has 768 records
Leaf node has 768 records
Leaf node has 768 records
Leaf node has 1280 records
Leaf node has 512 records
Leaf node has 1024 records
Leaf node has 512 records
Leaf node has 1024 records
Saw 328 nodes
(this is skipping over all the non-leaf nodes, including space allocated for nodes that don't exist/aren't reachable from the tree)
(DIR) Post #ATF7oLqiX9i8uqSeKe by boredzo@mastodon.social
2022-11-28T17:19:31Z
0 likes, 0 repeats
For whatever reason, the B*-tree node types are tagged as:
kBTLeafNode = -1,
kBTIndexNode = 0,
kBTHeaderNode = 1,
kBTMapNode = 2
This then means that unused nodes, if properly zeroed out, look like index nodes. Fortunately I've now implemented parsing the header node. So now I know that this catalog file with space for 328 nodes is only using 13 of them. There are 9 leaf nodes, no map nodes, and the one header node, so that means there are 3 real index nodes.
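A small sketch of why a zeroed node is ambiguous. It assumes the node descriptor layout from Inside Macintosh: Files, with the signed one-byte node type at offset 8 after the two 4-byte links. The helper names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Node type values as quoted in the post. */
enum {
    kBTLeafNode   = -1,
    kBTIndexNode  =  0,
    kBTHeaderNode =  1,
    kBTMapNode    =  2
};

/* Assumed offset of the node type byte within the node descriptor. */
enum { NODE_TYPE_OFFSET = 8 };

/* Reads the node type as a signed 8-bit value without relying on
 * implementation-defined narrowing conversions. */
static int node_type(const uint8_t *node)
{
    assert(node != NULL);
    uint8_t raw = node[NODE_TYPE_OFFSET];
    return raw >= 0x80 ? (int)raw - 256 : (int)raw;
}

/* A node that is all zero bytes reads as kBTIndexNode, so the type byte
 * alone cannot tell an unused node from an index node. */
static bool node_is_all_zero(const uint8_t *node, size_t node_size)
{
    for (size_t i = 0; i < node_size; i++)
        if (node[i] != 0)
            return false;
    return true;
}
```

The post resolves the ambiguity through the header node rather than by inspecting node contents; the all-zero check here is only one possible heuristic.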
(DIR) Post #ATF7oMSeG83IoUCwSG by boredzo@mastodon.social
2022-11-28T17:27:28Z
0 likes, 0 repeats
My B*-tree understanding so far: Every node contains records. The header node may be the head of a linked list of map nodes, extending the map record. Its header record includes the node number of the root node of the tree—presumably an index node usually, although it can be a leaf node. Index nodes contain records that descend through the tree. (I don't see a guarantee that the tree mirrors the folder hierarchy.) Each leaf node describes one or more folders and files.
(DIR) Post #ATF7oN3W33XiepSNv6 by boredzo@mastodon.social
2022-11-30T04:41:31Z
0 likes, 0 repeats
Deploying C11 generics in the service of making the byte-swapping easier to do, and to remember to do. (I'm misappropriating the terms “load” and “store” here to mean “retrieve a value from HFS's structures” and “prepare a value to be stored in HFS+'s structures”.)
(DIR) Post #ATF7oNxAi83vRRUpfc by boredzo@mastodon.social
2022-11-30T04:45:21Z
0 likes, 0 repeats
… and I just realized that that definition of the Store macro isn't going to work, because it'll switch on the type of the value to be stored and not the place I need to store it to. Gonna need to refine it to take a destination, and have the generic switch on the destination's type rather than the type of the value.
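The author's actual macros are not shown in the thread, so the following is only a sketch of the idea: a store macro whose `_Generic` selection is driven by the destination field's type. The names `LOAD_BE` and `STORE_BE` and the byte-at-a-time helpers are assumptions. The helpers assemble bytes explicitly instead of byte-swapping, so they behave the same on any host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uint16_t) == 2 && sizeof(uint32_t) == 4,
              "fixed-width types must have no padding");

/* Write a value into a field as big-endian bytes. */
static inline void store_be_u16(uint16_t *dst, uint16_t v)
{
    unsigned char b[2] = { (unsigned char)(v >> 8), (unsigned char)(v & 0xFF) };
    memcpy(dst, b, sizeof b);
}

static inline void store_be_u32(uint32_t *dst, uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)((v >> 16) & 0xFF),
        (unsigned char)((v >> 8) & 0xFF), (unsigned char)(v & 0xFF)
    };
    memcpy(dst, b, sizeof b);
}

/* Read a big-endian field back as a host-order value. */
static inline uint16_t load_be_u16(const uint16_t *src)
{
    unsigned char b[2];
    memcpy(b, src, sizeof b);
    return (uint16_t)((b[0] << 8) | b[1]);
}

static inline uint32_t load_be_u32(const uint32_t *src)
{
    unsigned char b[4];
    memcpy(b, src, sizeof b);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Selection is on the FIELD's type, never on the value's type. */
#define LOAD_BE(field) \
    _Generic((field), uint16_t: load_be_u16, uint32_t: load_be_u32)(&(field))

#define STORE_BE(dst, value) \
    _Generic((dst), uint16_t: store_be_u16, uint32_t: store_be_u32)(&(dst), (value))
```

With this shape, `STORE_BE(header.someField16, 7)` picks the 16-bit store even though the literal `7` has type `int`, which is the failure mode the post describes for a macro that switches on the value.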
(DIR) Post #ATF7oOWyZ0hbEUFQTg by boredzo@mastodon.social
2022-11-30T05:02:22Z
0 likes, 0 repeats
Yeah, that works.
(DIR) Post #ATF7oP6QRD3h0QpjjU by boredzo@mastodon.social
2022-11-30T05:53:49Z
0 likes, 0 repeats
Huh. Apparently some future version of C is going to let you leave out parameter names? What do you use instead? $1, $2, etc.?
(DIR) Post #ATF7oPd2Tx98da5mZE by boredzo@mastodon.social
2022-12-01T00:39:26Z
0 likes, 0 repeats
Today, I discovered that (at least in Clang's implementation) _Generic treats char, signed char, and unsigned char as THREE different types. It does not do this for short, int, long, or long long. Those are all synonymous with signed short, int, etc. I found this out when my generic byte-swapping macros were falling through to the default case when fed a signed char.
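A compact way to see this behavior, as a sketch with made-up names: a `_Generic` selection that lists all three character types separately. Listing both `short` and `signed short` in the same selection would not compile, because they are the same type.

```c
#include <assert.h>

enum char_kind {
    KIND_PLAIN_CHAR,
    KIND_SIGNED_CHAR,
    KIND_UNSIGNED_CHAR,
    KIND_SHORT,
    KIND_OTHER
};

/* char, signed char, and unsigned char each need their own association;
 * omit one and that type falls through to default. */
#define CHAR_KIND(x) _Generic((x),            \
    char:          KIND_PLAIN_CHAR,           \
    signed char:   KIND_SIGNED_CHAR,          \
    unsigned char: KIND_UNSIGNED_CHAR,        \
    short:         KIND_SHORT,                \
    default:       KIND_OTHER)

/* signed short is just another spelling of short. */
static_assert(CHAR_KIND((signed short)0) == KIND_SHORT,
              "short and signed short are the same type");
```

A byte-swapping macro that handles only `char` and `unsigned char` would send a `signed char` argument to its default case, as described above.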
(DIR) Post #ATF7oQFgAI3SZQAdnM by boredzo@mastodon.social
2022-12-01T00:45:00Z
0 likes, 0 repeats
Yup, C11 pretty clearly defines them as three different types. Dunno if that was always true or if it changed in C99 or C11 and it just never affected me before.
(DIR) Post #ATF7oQrbtGOcT3uvuy by boredzo@mastodon.social
2022-12-01T22:58:37Z
0 likes, 0 repeats
I worked out the kinks in the catalog record parsing code and now I'm extracting filenames successfully. Only to run headlong into TextEncodingConverter returning paramErrs for no fucking reason.
(DIR) Post #ATF7oROvtN3E8PVXrE by boredzo@mastodon.social
2022-12-01T22:59:41Z
0 likes, 0 repeats
“Failed to convert filename 'Descent' (length 7) to Unicode: error -50/error in user parameter list” It's fucking ASCII! Seven whole bytes of it! What is your problem???
(DIR) Post #ATF7oS2zUR5s8eFXIO by boredzo@mastodon.social
2022-12-02T00:11:18Z
0 likes, 0 repeats
I finally ran across UnicodeConverter (the fact that Carbon had *two* APIs for this was always hilarious), which has a convenient ConvertFromPStringToUnicode function that works a lot better. Next problem: Somehow my strings are getting cut in half. Probably passing a character count to something that expects bytes.
(DIR) Post #ATF7oSeZEj9S1BpXrk by boredzo@mastodon.social
2022-12-02T00:18:53Z
0 likes, 0 repeats
OK, the actual explanation is much funnier (even as it raises a bunch of questions I can't answer). So, I'm converting the filenames to HFSUniStr255 so I can ultimately write them out to HFS+. HFSUniStr255 is a Pascal-style string, where the first character gives the length, but in UTF-16. And since I ultimately want to write it out to HFS+, I need to encode it in UTF-16 BE (big-endian). Let's say the filename is “Descent”…
(DIR) Post #ATF7oTFR1edrrX4zKa by boredzo@mastodon.social
2022-12-02T00:19:03Z
0 likes, 0 repeats
I was forgetting to byte-swap the length, so my length of (say) 0x0007 was being stored in little-endian as 0x0700. CFStringCreateWithPascalString seems to always take the least significant byte. Without the swap, that's a 7, but CF seems to treat it as a byte count (???) and gives me @"Des". With the swap, it's a zero, and CF returns @"". Seems I can't use the UTF-16 BE to create NSStrings for logging.
(DIR) Post #ATF7oTt8e2OvqfehDU by boredzo@mastodon.social
2022-12-02T00:37:07Z
0 likes, 0 repeats
Ohhhhh it's even worse. Fuck. UnicodeConverter isn't giving me big-endian data. The hex dump is as follows:
07 00 44 00 65 00 73 00
63 00 65 00 6e 00 74 00
Those 00s are supposed to come first! It's giving me little-endian (native-order) data! CFStringCreateWithPascalString is only taking the *first* byte as a length, and interpreting it as a byte count. The off-by-one error makes it “work” and return the right characters, but the misinterpretation makes it return only half of them.
(DIR) Post #ATF7oUSEXYTRbW4iv2 by boredzo@mastodon.social
2022-12-02T00:40:21Z
0 likes, 0 repeats
The other thing I tried was this: https://mastodon.social/@gparker@discuss.systems/109441276832765128
That gets around CFStringCreateWithPascalString's questionable behavior, but since the data is actually little-endian, CFString (which correctly tries to read it as big-endian like I told it to) turns the characters into garbage. I need to fix the HFSUniStr255 contents, or the generation of them, so that the HFSUniStr255 is big-endian like it needs to be.
(DIR) Post #ATF7oV0GV1hDJ3ztxo by boredzo@mastodon.social
2022-12-02T01:22:22Z
0 likes, 0 repeats
Further investigation into the TEC code path revealed that creating the TextEncodingConverter object was failing. Evidently TEC does not think it can convert directly from MacRoman to big-endian HFS+ UTF-16. Looks like one way or the other, I need to convert to native-order HFS+ UTF-16, then byte-swap the code units myself.
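A minimal sketch of that last step, assuming the name is already available as native-order UTF-16 code units. It serializes a 16-bit length followed by the code units, all big-endian. The function name and the flat output buffer are illustrative; this is not the real `HFSUniStr255` type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes a length-prefixed UTF-16 string as big-endian bytes.
 * `out` must have room for 2 * (unit_count + 1) bytes.
 * Returns the number of bytes written. */
static size_t write_unistr_be(const uint16_t *units, uint16_t unit_count,
                              unsigned char *out)
{
    assert(unit_count <= 255);
    size_t n = 0;

    /* The length is a 16-bit value too, so it needs the same treatment. */
    out[n++] = (unsigned char)(unit_count >> 8);
    out[n++] = (unsigned char)(unit_count & 0xFF);

    for (uint16_t i = 0; i < unit_count; i++) {
        out[n++] = (unsigned char)(units[i] >> 8);   /* high byte first */
        out[n++] = (unsigned char)(units[i] & 0xFF);
    }
    return n;
}
```

For “Descent” this produces `00 07 00 44 00 65 …`, the reverse of each byte pair in the little-endian hex dump quoted earlier in the thread.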
(DIR) Post #ATF7oVdc8jAhH6PKIS by boredzo@mastodon.social
2022-12-02T01:32:42Z
0 likes, 0 repeats
Victory! I am now able to inspect catalog entries successfully:
- 📄 “Descent FAQ part 1 of 2”, ID #36 (0x24), type 'TEXT' creator 'ttxt'
- 📄 “Install Descent”, ID #66 (0x42), type 'APPL' creator 'STi0'
- 📄 “Read Me — Descent”, ID #43 (0x2b), type 'ttro' creator 'ttxt'
- 📁 “Descent” with ID #2, 8 items
- 📁 “flightstick pro” with ID #24, 1 items
- 📁 “gravis mousestick” with ID #28, 2 items
(DIR) Post #ATF7oWETvef77RellI by boredzo@mastodon.social
2022-12-02T01:47:51Z
0 likes, 0 repeats
“Inside Macintosh: Files”: In a catalog record, the record type is a single byte, followed by an unused byte which is reserved. The definition of catalog records in hfs_format.h: Record types are two-byte values and the low byte is always zero. This fucking CD: *stashes a caltrop in that low byte*
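One defensive reading that satisfies both descriptions, sketched here with a made-up helper name: take only the first byte of the record as the type and ignore whatever sits in the reserved byte. The constants are the two-byte values from hfs_format.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HFS catalog record types as two-byte values (hfs_format.h). */
enum {
    kHFSFolderRecord       = 0x0100,
    kHFSFileRecord         = 0x0200,
    kHFSFolderThreadRecord = 0x0300,
    kHFSFileThreadRecord   = 0x0400
};

/* Hypothetical helper: derive the record type from the first byte only,
 * so junk in the reserved second byte cannot change the answer. */
static uint16_t catalog_record_type(const unsigned char *record)
{
    assert(record != NULL);
    return (uint16_t)(record[0] << 8);
}
```

Comparing the full big-endian 16-bit value against these constants would reject a record whose reserved byte is non-zero, which appears to be what tripped on this CD.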
(DIR) Post #ATF7oWqldJHr2BZLRA by boredzo@mastodon.social
2022-12-02T05:11:12Z
0 likes, 0 repeats
OK, so now I'm able to crawl the catalog file. What's next?
- Probably should start checking what I've got into version control.
- Figure out why this works with the Descent CD but not the Quake 3 Arena CD (I suspect a particular hack I put in to get the former working).
- Implement discovering the extents for each fork of each file. That will include looking things up in the extents overflow file, the *other* big B*-tree of an HFS volume.
- Start building the code to produce an HFS+ volume.
(DIR) Post #ATF7oXLxlKEyawAG3s by boredzo@mastodon.social
2022-12-02T07:38:01Z
0 likes, 0 repeats
Current mystery: Descent's volume bitmap seems to end one whole block (512 bytes) later than other volumes' bitmaps. Like, there's just a whole empty block past the end of the VBM's last block. That's a problem because I need to start counting allocation blocks from where the VBM ends. I had hacked the math to make it work out right for Descent, but that makes… seemingly everything else not work. What am I missing?
(DIR) Post #ATF7oXv3eqJULmaHlQ by boredzo@mastodon.social
2022-12-02T07:43:09Z
0 likes, 0 repeats
The math that I have is basically that the length of the VBM is (number of allocation blocks) / 8, rounded up to a multiple of 512 bytes. This works for a Mini vMac image, Quake 3 Arena, and Disk Copy 6.3.3, but not Descent. I haven't found anything else in IM:F or TN1150 that suggests some other factor I need to include. I've looked at the allocation block size and the clump size, but haven't found any possible fixes that involve them.
(DIR) Post #ATF7oYZ7FuM8M1KHCa by boredzo@mastodon.social
2022-12-02T08:37:50Z
0 likes, 0 repeats
Aha. Finally found it, after checking IM:F for the nth time: There's a field named drAlBlSt (allocation block start) which indicates where allocation block 0 is. No other math required. I'd missed it several times before because I was looking for “where the bitmap ends” rather than “where the allocation blocks begin”, because I'd assumed they were equivalent. Not necessarily! And with that, the two disk images that weren't working—I discovered Journeyman Project also failed—now are.
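The two calculations side by side, as a sketch. It assumes `drAlBlSt` counts 512-byte sectors from the start of the volume and that the allocation block size (`drAlBlkSiz`) is in bytes. The function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { SECTOR_SIZE = 512 };

/* Length of the volume bitmap: one bit per allocation block,
 * rounded up to whole 512-byte sectors (the math from the earlier post). */
static uint32_t vbm_length_bytes(uint16_t alloc_block_count)
{
    uint32_t bitmap_bytes = ((uint32_t)alloc_block_count + 7u) / 8u;
    return (bitmap_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/* Byte offset of an allocation block, taken from drAlBlSt directly
 * instead of from where the bitmap happens to end. */
static uint64_t allocation_block_offset(uint16_t first_alloc_sector, /* drAlBlSt */
                                        uint32_t alloc_block_size,   /* drAlBlkSiz */
                                        uint16_t block_number)
{
    assert(alloc_block_size > 0 && alloc_block_size % SECTOR_SIZE == 0);
    return (uint64_t)first_alloc_sector * SECTOR_SIZE
         + (uint64_t)block_number * alloc_block_size;
}
```

The bitmap length remains useful for reading the bitmap itself. It just cannot be relied on to locate allocation block 0, since a volume may leave a gap between the two.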
(DIR) Post #ATF7oZ9d49YyBGPR7A by boredzo@mastodon.social
2022-12-02T09:57:56Z
0 likes, 0 repeats
When I start working on the HFS+ generator side of this, I should keep an eye toward being able to generate HFS as well. I think a “zip up this folder as an HFS disk image” tool would be extremely handy for emulator users.
(DIR) Post #ATF7oZraQij0NayXdA by boredzo@mastodon.social
2022-12-03T08:29:24Z
0 likes, 0 repeats
It's a shame that File Manager is all deprecated now (and FSRef is a performance drag on 64-bit apps because of certain 20-year-old tradeoffs that don't work so well now) because late-Carbon File Manager really is such a nice API for working with file I/O. I kind of wish they'd un-deprecate a bunch of these APIs and introduce CFURL counterparts for the ones that directly interact with FSRefs.
(DIR) Post #ATF7oaXltsT8UQiENs by boredzo@mastodon.social
2022-12-03T21:14:56Z
0 likes, 0 repeats
I was debating whether to keep going with the File Manager code (for rehydrating files from the HFS volume out in the real file system) or throw it out and rewrite it with NSURL. The deciding factor: I just remembered that you can't have an FSRef to a file that doesn't exist. You have to get an FSRef to its parent directory, then create the file. Which is even more of a pain in the ass than adding all the error-checking for what I was already doing. NSURL it is, then.
(DIR) Post #ATF7obGnCUTuk3mBYe by boredzo@mastodon.social
2022-12-04T00:09:46Z
0 likes, 0 repeats
Hmmm. I need to use File Manager anyway, though. I just got to the point of copying the metadata, and some of it can only be set through the File Manager. The Locked bit translates to NSURLIsUserImmutableKey, but Finder info (such as icon position) is only in File Manager. So's the stationery bit, presumably, although I haven't found it yet.
(DIR) Post #ATF7obrJ0jgkZIrLTE by boredzo@mastodon.social
2022-12-04T06:04:09Z
0 likes, 0 repeats
Rehydration progress:
- Successfully creating the file.
- The fork lengths of the output are correct, but the MD5 hashes don't line up with what Checksum (app for System 7) gives me. Corrupting the resource fork would also explain why app icons don't show up.
- Also the tool crashes on exit about half the time now? Haven't started investigating that one yet.
(DIR) Post #ATF7ocXqSZiShElJmC by boredzo@mastodon.social
2022-12-04T07:39:23Z
0 likes, 0 repeats
That's better.
(DIR) Post #ATF7odE1vjSao4V0Wu by boredzo@mastodon.social
2022-12-04T07:44:50Z
0 likes, 0 repeats
Wait, no, that resource fork is still wrong. I wonder how? The length is right, and DeRez is able to understand the file.
(DIR) Post #ATF7oe1IyWsLGtYMKm by boredzo@mastodon.social
2022-12-04T08:18:02Z
0 likes, 0 repeats
Huh. Apparently my File Manager code is bobbling the creation and modification dates. (Ironically, the Cocoa version sets the dates correctly. I had thought the File Manager version could be a more-precise straight-across assignment…) Turns out late-Carbon File Manager defines these values as UTCDateTime structures. I guess I can assign the HFS values to the lowSeconds member and zero out the rest?
(DIR) Post #ATF7oefiYHCZIESdKC by boredzo@mastodon.social
2022-12-04T08:41:42Z
0 likes, 0 repeats
That worked fine. (And modern C makes it easy, with designated initializers.) Still mystified by the resource fork issue, though. There's a very real possibility that Checksum, the app I'm using in Mini vMac, is broken. But how to prove that? I'm stumped as to what could be wrong on my end. It's hard to get “read one extent's worth” wrong, and especially for only one fork and not the other.
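What that assignment might look like, as a sketch. The struct below is a local stand-in with the same three members as Carbon's `UTCDateTime` (`highSeconds`, `lowSeconds`, `fraction`), so the example compiles without the Carbon headers. The function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in mirroring the members of Carbon's UTCDateTime. */
struct utc_date_time {
    uint16_t highSeconds;
    uint32_t lowSeconds;
    uint16_t fraction;
};

/* Designated initializer: name the one member that matters and let the
 * language zero the rest. */
static struct utc_date_time date_from_hfs_seconds(uint32_t hfs_seconds)
{
    return (struct utc_date_time){ .lowSeconds = hfs_seconds };
}
```

Members not named in a designated initializer are zero-initialized, which covers the "zero out the rest" part of the post without any explicit assignments.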
(DIR) Post #ATF7ofP5pZUvYxgs3E by boredzo@mastodon.social
2022-12-04T09:35:19Z
0 likes, 0 repeats
Aha. Checksum's manual explains that “For consistency between files, the Finder Information portion of the resource fork (bytes 16 to 127) is ignored in the checksum calculation.” (The Finder stored info in the resource fork? News to me.) So that's why Checksum and OpenSSL don't agree. I validated my output another way: Created two independently-identical resource files with the same contents, then rehydrated them both. Both files' resource forks came out with the same MD5.
(DIR) Post #ATF7ofzbdohlOCm1xo by boredzo@mastodon.social
2022-12-04T09:59:15Z
0 likes, 0 repeats
ResEdit, it seems, does not appear with an icon on my Monterey machine because macOS at some point dropped support for pre-Mac OS 8.5 icon types. If it doesn't have a 'icns' resource or something more modern, no icon. I have a StuffIt updater that was already showing an icon in the modern world, and indeed has a 'icns' resource. MacBinaried it, imported it into Mini vMac, put it on a disk image, rehydrated the updater from the image—and hey, there's that icon.
(DIR) Post #ATF7oghZ0NrnaXL8To by boredzo@mastodon.social
2022-12-04T18:34:18Z
0 likes, 0 repeats
I've had to do some perf work because I threw “Journeyman Project Turbo!” at it again and it spent unreasonably long creating millions of NSDatas and their backings in subdataWithRange:. I had assumed those were basically free—no need to copy the backing, right? Well, it does anyway. Likely going to write a category that adds a “call this block with the bytes and length that such a subdata _would_ have, without creating the subdata” method.
(DIR) Post #ATF7ohFEzAnzGz61yK by boredzo@mastodon.social
2022-12-05T07:34:03Z
0 likes, 0 repeats
Huh. I can't seem to create resource forks via the Cocoa APIs on the internal APFS volume. Works fine on my HFS+ RAM disk, though. Gonna have to do a bunch more of this with File Manager after all, I see.
(DIR) Post #ATF7ohqojSrZ9Wg2Xg by boredzo@mastodon.social
2022-12-05T08:09:30Z
0 likes, 0 repeats
Yup, 100% pure File Manager works just fine. So be it, then. I get to add error checking (and lots of it) tomorrow.
(DIR) Post #ATF7oiTSPnlt5Mktlo by boredzo@mastodon.social
2022-12-06T03:45:35Z
0 likes, 0 repeats
Been refactoring the extents code. It's much cleaner now. Doesn't work anymore, though.
(DIR) Post #ATF7oj0mPuQUkiLVi4 by boredzo@mastodon.social
2022-12-06T09:13:42Z
0 likes, 0 repeats
I'm now able to rehydrate entire directory hierarchies. Even the label colors restore correctly! Technically this should mean I can rehydrate the entire volume into the real world, treating the volume root as a directory.
(DIR) Post #ATF7ok2EbzBTvW2BcG by boredzo@mastodon.social
2022-12-06T09:33:40Z
0 likes, 0 repeats
Here's the closing theme music from “The Journeyman Project” Turbo!, ripped directly from one of the AIFF files in the CD-ROM's “Support Files” directory. Video frame is stolen from an LP of the game.
(DIR) Post #ATF7okkXxEd68wlZgW by boredzo@mastodon.social
2022-12-06T09:47:47Z
0 likes, 0 repeats
Confirmed that I can rehydrate the entire volume from its root directory. (In this case I have to specify it by path, because there's another file or folder somewhere with the same name as the volume.) Takes about 1 second on the internal SSD. Custom icons work—so that's content fidelity, resource forks, and the custom icon bit. Icon positions don't seem to work, though. That might be Monterey's Finder ignoring the pre-.DS_Store position data.
(DIR) Post #ATF7olW36ccwWGzVj6 by boredzo@mastodon.social
2022-12-06T09:54:32Z
0 likes, 0 repeats
This is going to merit some investigation, though. System 7.5.5's Finder and Monterey's Finder give different figures for the total number of bytes used by all of the files. (…It would be hilarious if this was just Monterey failing to count resource forks. There could indeed be only 50 MB of resource forks here.)
(DIR) Post #ATF7omFQNuvIn0DkS8 by boredzo@mastodon.social
2022-12-07T05:44:21Z
0 likes, 0 repeats
Nope, there are in fact files missing! This is why I made listing the HFS catalog the next feature. I implemented that, then wrote a Python program that generates the same output for the real world. Diff the two programs' output and the missing files jump right out.
(DIR) Post #ATF7omoWHQzoXqdm9g by boredzo@mastodon.social
2022-12-07T06:00:04Z
0 likes, 0 repeats
Lol, oops. The filenames that are missing are ones that include slashes. Gotta change those to colons when building POSIX paths!
(DIR) Post #ATF7onPk32loPI3VAm by boredzo@mastodon.social
2022-12-07T06:10:23Z
0 likes, 0 repeats
Oops part 2: Then I have to change it back before passing it to the File Manager (which doesn't want to see *colons* in names), or I get bdNamErr.
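Both directions are the same operation, so a single in-place swap covers them. A minimal sketch with a made-up function name:

```c
#include <assert.h>
#include <stddef.h>

/* Swaps '/' and ':' in place. Applying it to an HFS name yields a name
 * that is safe as a POSIX path component; applying it again restores the
 * original for APIs that expect the HFS spelling. */
static void swap_slash_and_colon(char *name)
{
    assert(name != NULL);
    for (; *name != '\0'; name++) {
        if (*name == '/')
            *name = ':';
        else if (*name == ':')
            *name = '/';
    }
}
```

Because the swap is its own inverse, calling it twice is a no-op. That makes it easy to convert right at each API boundary instead of tracking which spelling a given string currently holds.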
(DIR) Post #ATF7onwi4T8q3XTpYm by boredzo@mastodon.social
2022-12-07T06:28:55Z
0 likes, 0 repeats
The good news is, I fixed the issue of files being missing. The bad news is, modern Finder's count is 30 K higher than mine, and vintage Finder's is nearly 8 MB higher.
- Finder, 7.5.3: 672,376,320 bytes.
- Finder, Monterey: 664,692,315 bytes.
- My own directory listings (both of them): 664,665,679.
My listings are definitely the total of all the logical lengths, data + resource.
(DIR) Post #ATF7ooVnxzDLoNtrGK by boredzo@mastodon.social
2022-12-07T06:37:06Z
0 likes, 0 repeats
I imagine if I knew why Finder (7.5.3)'s count was different, it might explain why they're both different from my own sums.
(DIR) Post #ATF7opBdSSftu7TGSm by boredzo@mastodon.social
2022-12-07T07:18:12Z
0 likes, 0 repeats
Here's Mac OS 9 giving me yet a fourth number!
(DIR) Post #ATF7ophtWWTlWAZ1kG by boredzo@mastodon.social
2022-12-07T22:23:52Z
0 likes, 0 repeats
I deleted and regenerated the extract output and this time Finder (Monterey)'s count matches my own: 664,665,679 bytes. So maybe it created some .DS_Stores or something last time. Still not clear why it's below the older Finders' counts. I tried adding the catalog and extents files (just under 1 MiB each) but that only got me to 666,760,783 bytes.
(DIR) Post #ATF7oqD5eXQt4v9wMy by boredzo@mastodon.social
2022-12-08T02:08:58Z
0 likes, 0 repeats
SheepShaver has a portal to the UNIX world, so I tried pointing the Mac OS 9 Finder at the exact same freshly-extracted copy of “The Journeyman Project”. Monterey's Finder counts up 90 more KB* than Mac OS 9's Finder does. Same bits. Also one additional item (maybe one is counting the folder and the other not). (And of course this is still a smaller number than it came up with for the CD-ROM disk image, so I haven't exactly exonerated my rehydration logic with this experiment.)
*edit: not MB
(DIR) Post #ATF7oqrVEHl76G4DMO by boredzo@mastodon.social
2022-12-08T05:38:28Z
0 likes, 0 repeats
So it turns out Alessandro Levi Montalcini wrote a program eons ago called List Files that crawls a folder hierarchy and generates a report with sizes, which you can set up to include sizes in bytes. Doesn't break out the fork sizes specifically, but that's OK. I summed up the sizes it gave me and… it matches my own output exactly. 664,665,679 bytes. So I still don't know what the Finders are doing, but I'm at least more confident my own program isn't fucking up.
(DIR) Post #ATF7orO7H1qYjPKGC8 by boredzo@mastodon.social
2022-12-08T05:39:21Z
0 likes, 0 repeats
List Files also has an option to include CRC32 checksums. Seems like something I could add easily enough using zlib.
(DIR) Post #ATF7orwrBrdUT9a0LQ by boredzo@mastodon.social
2022-12-08T07:25:01Z
0 likes, 0 repeats
…On the other hand, List Files doesn't document how it feeds the CRC32 function (data fork first? resource fork first?) and I haven't been able to reproduce its output—even with itself. He wrote another app, Verifile, that does CRC32 on each fork of one file. That one shows the same results for both the CD-ROM copy and my rehydrated copy of a selected file. I'm not getting the same digest in my program, though.
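The fork order matters because CRC-32 is computed over a byte stream. The sketch below is a self-contained bitwise CRC-32 using the common reflected polynomial 0xEDB88320, written out so it has no library dependency. It follows the same running-CRC convention as zlib's `crc32()`: start from 0 and feed each result into the next call. Whether List Files or Verifile use this exact variant is not known from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running CRC-32: pass 0 for the first chunk, then pass the previous
 * result to continue over the next chunk. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

/* Hypothetical whole-file digest: one fork, then the other. */
static uint32_t crc32_two_forks(const unsigned char *first, size_t first_len,
                                const unsigned char *second, size_t second_len)
{
    return crc32_update(crc32_update(0, first, first_len), second, second_len);
}
```

Feeding data fork then resource fork generally gives a different digest than the reverse order, so an undocumented ordering is enough on its own to prevent reproducing a tool's whole-file CRC.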
(DIR) Post #ATF7osaYoFOYSI9iEK by boredzo@mastodon.social
2022-12-08T07:36:23Z
0 likes, 0 repeats
OK, Checksum's MD5s (of data forks, at least, given its quirk about hashing resource forks) agree with what I get in the real world. I can give up on CRC32 for now. I also picked out one file (Sound Manager) and hex dumped both my rehydrated version and a copy retrieved through SheepShaver's portal. Identical.
(DIR) Post #ATF7ot5kwGLg12kcr2 by boredzo@mastodon.social
2022-12-08T17:27:45Z
0 likes, 0 repeats
Last night, I got back to thinking about HFS+ conversion. My original plan had been to block-copy the whole volume and then patch select blocks. Problem is, that only works if the allocation blocks are 512 (0x200) bytes. Larger volumes with other block sizes would require shifting everything down. I think I can still block-copy the allocation blocks region, but I need to compute a new offset to copy it to. Or else change everything to a 512-byte block size and recompute every extent.
(DIR) Post #ATF7othgfEgpugUuye by boredzo@mastodon.social
2022-12-10T08:26:09Z
0 likes, 0 repeats
The more I think about this, the more it seems like something that is possible 99% of the time, but not 100%. There's a bunch of adjustments that each consume a little bit of space—a few K here, a few K there. I can free up some space from the extents overflow file. It'll *usually* be enough. (Indeed, that file will usually end up empty if it wasn't already.) In the worst cases, I'll hit the walls of the volume and fail. In some of those, a defragmenting conversion would succeed.
(DIR) Post #ATF7ouDwjIUhWjagG8 by boredzo@mastodon.social
2022-12-10T08:29:04Z
0 likes, 0 repeats
It's also possible that a defragmenting conversion—not attempting to preserve existing locations on disk, just copying the files into one contiguous extent after another, one file at a time—would be simpler and that alone might justify it. No edge-case whack-a-mole. One algorithm that works on everything, except for conversions that are impossible.
(DIR) Post #ATF7oujqog0z7gW9zM by boredzo@mastodon.social
2022-12-10T16:54:04Z
0 likes, 0 repeats
Regardless of the conversion algorithm, the catalog file will need to grow, and I did that math this morning. Looking at two CD-ROMs:
- The size of each node grows by a factor of 8 (0x200 to 0x1000).
- The number of leaf nodes grows by roughly 50%.
- Whether this affects disk footprint depends on utilization of space already allocated. One CD had enough space to grow the catalog into its existing allocation. The other could allocate new blocks or take unused space from the (empty) extents file.
(DIR) Post #ATF7ovHAomfan26lvc by boredzo@mastodon.social
2022-12-11T08:23:33Z
0 likes, 0 repeats
I realized why the Mac OS 9 CD image was stubbornly mounting as read-only: It was because I had copied it from a device to a file. Just had to chmod a+w it. So PlusMaker worked fine. Freed up 20 MB on that volume, presumably by switching to smaller allocation blocks. I'd forgotten that it overtly does a defragmenting conversion. (Indeed, back then, that would have been a significant feature.)
(DIR) Post #ATF7ovkx24UOHO2YLI by boredzo@mastodon.social
2022-12-11T08:34:35Z
0 likes, 0 repeats
I had to rebuild the Desktop file on that volume while booted from the hard drive image, but after doing that, I was able to reattach it as a CD-ROM and boot from it successfully. So that's going to be the standard for my conversion: Can I convert the Mac OS 9 CD-ROM and then (after rebuilding its Desktop file) boot SheepShaver off it?
(DIR) Post #ATF7owOIflxsFQRyfw by boredzo@mastodon.social
2022-12-18T05:54:42Z
0 likes, 0 repeats
Got stuck and had to take a break (in which I did other things) for nearly a week, but I've made some progress on the block allocator and on writing data into blocks designated by extents. I had disliked the change from block #0 being after the volume bitmap to being the block that contains (at least) the first boot block, but I admit it makes the math simpler in computing my pwrite arguments.
(DIR) Post #ATF7oxGXQ7LkxdpIDQ by boredzo@mastodon.social
2022-12-18T23:36:26Z
0 likes, 0 repeats
Today's work has been dismantling the assumption of my B*-tree classes that they're going to be instantiated from already-populated HFS data, so that I can create the objects in order to populate the HFS+ data.
(DIR) Post #ATF7oxlNZS1IVIFvHs by boredzo@mastodon.social
2022-12-20T04:44:19Z
0 likes, 0 repeats
Today in premature optimization: Wrote a method to count the nodes in a linked list in order to compute the size to pass to arrayWithCapacity:.Fortunately I realized that that would not be a net time savings. Better to waste a little extra memory with a less accurate but faster estimate.
(DIR) Post #ATF7oyFVlQ7g0kLzFo by boredzo@mastodon.social
2022-12-23T22:13:34Z
0 likes, 0 repeats
Been mulling over the catalog file. I think I can't necessarily just convert catalog entries straight across in the same order, because the move from 8-bit maybe-MacRoman to Unicode means names won't necessarily sort the same. So I need to repopulate the leaf row in (re-)sorted order, and then regenerate the index rows based on the new order. I still think I don't need to fully implement a B*-tree; I should be able to rebuild the index one row at a time from the bottom up.
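A sketch of what "one row at a time from the bottom up" implies for the shape of the tree. It assumes a fixed number of records per index node, which is a simplification: real index nodes hold a variable number of records depending on key sizes. The function name is made up, and the sketch only plans row sizes rather than writing nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Each index row needs one record per node in the row below it. Keep
 * stacking rows until a single node (the root) remains.
 * row_sizes[0] is the row directly above the leaves.
 * Returns the number of index rows; 0 means the lone leaf is the root. */
static size_t plan_index_rows(size_t leaf_nodes, size_t records_per_index_node,
                              size_t *row_sizes, size_t max_rows)
{
    assert(records_per_index_node >= 2);
    size_t rows = 0;
    size_t nodes_below = leaf_nodes;

    while (nodes_below > 1 && rows < max_rows) {
        /* Ceiling division: a partly filled node still counts. */
        size_t nodes_here = (nodes_below + records_per_index_node - 1)
                          / records_per_index_node;
        row_sizes[rows++] = nodes_here;
        nodes_below = nodes_here;
    }
    return rows;
}
```

In a real rebuild, each record in a row would carry the first key of the node it points to plus that node's number, so every row can be generated from the finished row beneath it without any general B*-tree insertion logic.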
(DIR) Post #ATF7oypfaz2votGrc8 by boredzo@mastodon.social
2022-12-24T22:39:27Z
0 likes, 0 repeats
I've been digging into how HFS compares filenames (partly because I cheated earlier; I've been comparing Unicode instead of HFS names). It's mostly documented, albeit in a few places—two Inside Macintosh books and at least one Developer Q&A. And I might've found a bug in some Apple open-source code. Need to fire up MPW and write a little test app to investigate.
(DIR) Post #ATF7ozSfG0EplpW0OW by boredzo@mastodon.social
2022-12-24T23:41:10Z
0 likes, 0 repeats
Been a long time since I've used these tools.
(DIR) Post #ATF7ozxVPKuNJTwdSy by boredzo@mastodon.social
2022-12-25T00:03:49Z
0 likes, 0 repeats
Fira Code doesn't work very well in MPW.
(DIR) Post #ATF7p0TPUiQeuQs7CC by boredzo@mastodon.social
2022-12-25T00:33:18Z
0 likes, 0 repeats
OK, the Apple open-source code I found does indeed match Mac OS 9's behavior. Bug-for-bug compatibility confirmed.
(DIR) Post #ATF7p0yFe36CS5IkGe by boredzo@mastodon.social
2022-12-25T06:54:47Z
0 likes, 0 repeats
Becoming a big fan of taking research notes while working on a personal project. This is already a great corpus of knowledge and I should refine it down to a blog post or something.
(DIR) Post #ATF7p1lWgqVwuuM64W by boredzo@mastodon.social
2022-12-26T09:31:33Z
0 likes, 0 repeats
… What?
(DIR) Post #ATF7p2M2V5imk9RFz6 by boredzo@mastodon.social
2022-12-26T09:34:38Z
0 likes, 0 repeats
Aha. If I change the format specifiers to %ld (properly matching the long int type underneath), then I get: nameComparison 4294967295 == NSOrderedAscending -1: false. So yeah, my declaration of RelString as returning NSComparisonResult isn't going to work. I need to declare it as returning int32_t or something.
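The two numbers in that log line are the same 32-bit pattern read two ways. A small Python sketch, assuming (this is an inference, not something stated above) that the function produces a 32-bit -1 which is then widened to a 64-bit type without sign extension; the helper names are hypothetical.

```python
import struct

ORDERED_ASCENDING = -1  # the value printed for NSOrderedAscending above


def widen_without_sign(value_32):
    """Reinterpret a signed 32-bit value as unsigned, as a zero-extending widen would."""
    return struct.unpack(">I", struct.pack(">i", value_32))[0]


def narrow_to_signed(value):
    """Keep the low 32 bits and read them back as a signed 32-bit integer."""
    return struct.unpack(">i", struct.pack(">I", value & 0xFFFFFFFF))[0]
```

Under that assumption, declaring the return type as a 32-bit signed integer corresponds to narrow_to_signed: the comparison against -1 succeeds again.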
(DIR) Post #ATF7p2rwaTF4L6MjiK by boredzo@mastodon.social
2022-12-26T09:39:22Z
0 likes, 0 repeats
Yeah, searching a tree goes a lot faster when your comparator actually works. Edit: Hm, no, this is still broken. Less broken, but it's still exploring more of the tree than it ought to. I don't think I have enough brain for this tonight.
(DIR) Post #ATF7p3ROSfbA72x2y8 by boredzo@mastodon.social
2022-12-27T03:26:08Z
0 likes, 0 repeats
I did fix the tree search earlier today. Deleted some overly-clever code that was navigating away from where earlier code had led it. It now takes 12 seconds to unzip a CD-ROM I was testing with, and approximately all of that is file I/O. This code is in need of a good refactor, but I'm in the middle of working on the conversion implementation, so this isn't the time.
(DIR) Post #ATF7p3yMU5yBlINNM8 by boredzo@mastodon.social
2023-01-01T02:27:06Z
0 likes, 0 repeats
Well, that was fun. Xcode spontaneously forgot how to sign build products, so I couldn't run my own damn program anymore until I quit and relaunched Xcode.
(DIR) Post #ATF7p4SqekM9HqdisK by boredzo@mastodon.social
2023-01-01T03:57:49Z
0 likes, 0 repeats
Huh. Possibly related. I think that was when I relaunched Xcode? Weird.
(DIR) Post #ATF7p52IWwiF3nE288 by boredzo@mastodon.social
2023-01-02T22:07:12Z
0 likes, 0 repeats
Interesting. Attempting to list the ClarisWorks and AppleWorks CD-ROMs fails—returns success, but no output. analyze says the catalog file doesn't have a header node at node 0. Could be I'm reading the wrong bits. Looking at a hex dump, the volume header is in the right place, though I haven't inspected it in detail yet. I need to improve my error reporting on multiple fronts, verify that these images mount in SheepShaver, and then figure out why I'm having trouble reading them.
(DIR) Post #ATF7p5YuZgnggwU4xs by boredzo@mastodon.social
2023-01-02T22:49:06Z
0 likes, 0 repeats
This one exhibits the same problems as the ClarisWorks CD, with a novel twist: It actually has the volume header *twice*? Once at 0x400, as expected, and then seemingly again at 0x4ac. (Which is within the same 0x200 block, so it's not wrong necessarily… but odd.) There is supposed to be a second volume header at the end, and I checked and that's where it's supposed to be. So this volume has three volume headers. The main/first volume header differs slightly from the other two.
(DIR) Post #ATF7p6AUJyrGZU45XE by boredzo@mastodon.social
2023-01-02T22:55:39Z
0 likes, 0 repeats
I had the thought that maybe that was immediately after the previous volume header, like the software just wrote it out twice in a row for some reason. Nope! There's ten bytes in between. Seven of them are 1s and the rest are zeroes. Sadly HFS lacks HFS+'s field for “four-byte signature of the software that wrote this volume” because now I wanna know.
(DIR) Post #ATF7p6duYaOU2jpaOe by boredzo@mastodon.social
2023-01-02T23:53:37Z
0 likes, 0 repeats
Confirmed that Disk Copy in Mac OS 9 can't mount the volume either. How odd, but at least it proves my program isn't broken. (Could be more “imaging the wrong sub-device” shenanigans. It's possible I need to re-image those CDs. Sigh.)
(DIR) Post #ATF7p7BaXNKfjBaTtA by boredzo@mastodon.social
2023-01-03T00:51:49Z
0 likes, 0 repeats
I actually have two images of each of these and I thought I (a) was using the newer one for all my experimentation and (b) checked both and found them identical, but evidently not on both counts. The newer of the two has an intact Apple Partition Map and, extracting the volume from that, I'm able to list the volume with my own tool and mount the extracted volume in SheepShaver. So evidently I already re-imaged it. Thank you, Past Me!
(DIR) Post #ATF7p7lkMwFvXKVMFU by boredzo@mastodon.social
2023-01-05T08:56:41Z
0 likes, 0 repeats
Was stuck on a particular step in generating the new catalog B*-tree. I've been letting my brain chew on it in the background for a few days while I've done other things and I think I have the algorithm figured out now. I jotted down notes in English and will translate to Objective-C tomorrow.
(DIR) Post #ATF7p8K8J5lHFyaoqW by boredzo@mastodon.social
2023-01-06T07:04:38Z
0 likes, 0 repeats
I *think* I've finished writing the catalog converter. Still need to test it, which I may or may not attempt to do before implementing the copying of fork contents. I have started building the latter. The destination volume will be able to vend a simple file handle for the converter to write data to, which will smooth over the impedance mismatch between the volumes' block sizes and any source volume fragmentation.
(DIR) Post #ATF7p8w4246R9cL6y8 by boredzo@mastodon.social
2023-01-06T08:40:11Z
0 likes, 0 repeats
The converter is now more or less MVP. No extents overflow support on the output side yet, but that's fine since it's a defragmenting conversion. All that's left is to play bug whack-a-mole until it works. I made a simple test volume in Mini vMac for early testing. An 800K floppy with two files and one folder.
(DIR) Post #ATF7p9a7d8959r56PI by boredzo@mastodon.social
2023-01-07T01:20:16Z
0 likes, 0 repeats
Had to move a bunch of constants to a new header just for constants, because putting them at the top of various class headers and importing them from each other led to an import cycle. This became evident when I tried to use a type declared in one header in another class's interface, and the compiler said the type didn't exist despite my importing the needed header. The compiler hadn't imported that header yet because that header imported this one. 🤦🏻
(DIR) Post #ATF7pAH13eSNIt9MGW by boredzo@mastodon.social
2023-01-13T05:19:00Z
0 likes, 0 repeats
So many things wrong with this:
- Why are the keys in the converted catalog node in reverse order?
- Why is there a record with parent ID #352321536?
- Why is there a record with parent ID 0? (Not a valid catalog item ID.)
- Why did the comparison of 2/Bravo.txt to 2/Bravo.txt return +1 and not 0?
(DIR) Post #ATF7pApOznxj1XEorY by boredzo@mastodon.social
2023-01-13T07:23:22Z
0 likes, 0 repeats
Answers:
- The comparator I use when I sort my item objects before populating the new tree is backwards.
- I was byte-swapping IDs that I should be copying over verbatim (they're already big-endian, but I was storing them as if they were in native order).
- The code that extracts keys from records needed to be taught that HFS+ uses bigger key lengths. It was only taking the first byte of the length.
- It looks like the HFS+ string comparator is misbehaving. More investigation needed.
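Two of those answers can be illustrated in a few lines of Python. The helper names are hypothetical. Note that 352321536 is 0x15000000, which is the value 21 with its four bytes reversed; that fits the byte-swapping explanation, though the original ID being 21 is an inference rather than something stated above. The key-length helper assumes a one-byte key length in HFS and a two-byte big-endian key length in HFS+.

```python
import struct


def swap32(value):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


def read_key_length(record, is_hfs_plus):
    """Read the key length stored at the start of a B*-tree record."""
    if is_hfs_plus:
        # HFS+: two bytes, big-endian.
        return struct.unpack_from(">H", record, 0)[0]
    # HFS: a single byte.
    return record[0]
```

Reading only the first byte of an HFS+ key length, as the old code did, returns just the high byte of the real length.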
(DIR) Post #ATF7pBQclPjisyeXse by boredzo@mastodon.social
2023-01-13T07:37:31Z
0 likes, 0 repeats
The string comparator was missing a couple of byte-swaps (of the string lengths), and I was also missing a check for when keys compared equal in a particular situation. More bugs remain. Such as: Why is block allocation failing for a paltry twelve bytes?
(DIR) Post #ATF7pCkrpdWV038E4W by boredzo@mastodon.social
2023-01-13T07:49:48Z
0 likes, 0 repeats
Apparently the volume header, and thus the block allocator, believes the volume is zero blocks long. This is because the volume's length in blocks needs to be recomputed from the overall length in bytes divided by the block size. (The HFS volume's block count can't be reused as-is.) But that recomputation isn't happening—the part that's supposed to just assumes the block count was copied over. So I need to figure out how to fix that… tomorrow, probably.
(DIR) Post #ATF7pDHTsNbwdCOGuG by boredzo@mastodon.social
2023-01-14T06:06:36Z
0 likes, 0 repeats
I ended up rewriting the block allocator. The one I had written was overengineered for the situation it's in. Like, I was trying to search for the exact perfect opening that would fit the new allocation. But there are no openings because a conversion never deallocates; there's only the single block of free space, which shrinks with each allocation. For now, at least, I've replaced it with an allocator that just takes the first opening that's big enough.
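A first-fit allocator of the kind described might look like the following Python sketch. The class and its free-range list are hypothetical stand-ins; the real allocator works against the volume's allocation bitmap, which is not modelled here.

```python
class FirstFitAllocator:
    """Hand out runs of blocks from the first free range that is big enough."""

    def __init__(self, total_blocks):
        # One free range covering the whole volume: (start_block, block_count).
        self.free = [(0, total_blocks)] if total_blocks > 0 else []

    def allocate(self, count):
        """Return the first block of a run of `count` blocks, or None if none fits."""
        if count <= 0:
            raise ValueError("count must be positive")
        for index, (start, length) in enumerate(self.free):
            if length >= count:
                if length == count:
                    del self.free[index]
                else:
                    # Shrink the range from the front; nothing is ever freed.
                    self.free[index] = (start + count, length - count)
                return start
        return None
```

With a volume that believes it is zero blocks long, the free list starts empty and every request fails, which matches the earlier symptom of allocation failing for a tiny request.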
(DIR) Post #ATF7pDlc4LiK8eUKsC by boredzo@mastodon.social
2023-01-14T06:11:21Z
0 likes, 0 repeats
Next problem: Writing fork contents to the destination volume fails. Returns -1 with no error object. Turns out I forgot to implement that part (there's even a comment that says “TODO: Implement me”). So that'll be the next thing: Implementing my simple write-only file handles.
(DIR) Post #ATF7pEMppxUK05u3tI by boredzo@mastodon.social
2023-01-14T06:22:55Z
0 likes, 0 repeats
I hadn't expected to need to implement file handles but it turns out they're necessary for resolving impedance mismatches between (e.g.) volumes with different block sizes, or differently fragmented allocations. You need something to keep track of how much data has been written and where it's getting written to, and that's what a file handle does.
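As a rough Python sketch of that idea: a write-only handle that remembers how many bytes it has written and maps that position onto a list of extents. ExtentWriter is a hypothetical name, the volume is modelled as an in-memory bytearray, and block numbers are assumed to count from byte 0 as in HFS+.

```python
class ExtentWriter:
    """Write-only handle that spreads sequential writes across a fork's extents."""

    def __init__(self, volume, block_size, extents):
        self.volume = volume          # bytearray standing in for the destination device
        self.block_size = block_size
        self.extents = extents        # list of (start_block, block_count)
        self.position = 0             # bytes written so far

    def write(self, data):
        """Write as much of `data` as fits; return the number of bytes written."""
        written = 0
        skip = self.position          # bytes of extent space already used
        for start_block, block_count in self.extents:
            extent_bytes = block_count * self.block_size
            if skip >= extent_bytes:
                skip -= extent_bytes  # this extent is already full
                continue
            offset = start_block * self.block_size + skip
            room = extent_bytes - skip
            chunk = data[written:written + room]
            self.volume[offset:offset + len(chunk)] = chunk
            written += len(chunk)
            skip = 0                  # later extents are written from their start
            if written == len(data):
                break
        self.position += written
        return written
```

The caller can hand it buffers of any size, read from source blocks of any size, without caring where one destination extent ends and the next begins.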
(DIR) Post #ATF7pEwdgq7zn8eehM by boredzo@mastodon.social
2023-01-14T17:34:12Z
0 likes, 0 repeats
It occurs to me that I should see what Inside Macintosh: Files has to say about File Manager's file control blocks, which were its file handles back in the day. I might be able to learn some lessons from that rather than the hard way.
(DIR) Post #ATF7pFblDx1NqftUnI by boredzo@mastodon.social
2023-01-15T22:54:56Z
0 likes, 0 repeats
I'm at the point where I'm using fsck_hfs and DiskWarrior to tell me what's wrong with my output. (They're not very helpful, unfortunately. fsck in particular is designed to tell you “your disk is fucked, get a new one”, not “here is the specific surgery you need to do to your file system to correct it”. DiskWarrior is only a little better.)
(DIR) Post #ATF7pGBD69NTccTo36 by boredzo@mastodon.social
2023-01-16T02:10:39Z
0 likes, 0 repeats
I have turned to making my HFS+ volume class a subclass of my HFS volume class, overriding in select (and a few new) points, and updating the “analyze” verb to be able to analyze both HFS and HFS+ volumes. I think I've hammered all the bugs out of that, so now I'm able to explore actual defects in the converted volume.
(DIR) Post #ATF7pGmQrl9TU3tX4C by boredzo@mastodon.social
2023-01-18T08:10:34Z
0 likes, 0 repeats
Discovered that fsck_hfs has a cheat code: If you pass -D <some number in hex>, it lets you unlock additional debug options, one per bit. The bits are defined here: https://opensource.apple.com/source/hfs/hfs-226.1.1/fsck_hfs/fsck_debug.h.auto.html In my case, fsck_hfs -d -D 0xc63 turns on everything I might care about—including hex-dumping corrupt records and nodes.
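Since the options are one per bit, a mask like 0xc63 can be broken down into the bit positions it turns on. A tiny Python sketch (set_bits is a hypothetical helper; what each bit means is defined in the linked header and is not reproduced here):

```python
def set_bits(mask):
    """Return the positions of the bits that are set in `mask`, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
```

For 0xc63 this gives bits 0, 1, 5, 6, 10, and 11.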
(DIR) Post #ATF7pHFr6MggxJf1vc by boredzo@mastodon.social
2023-01-18T15:00:13Z
0 likes, 0 repeats
Current status:
(DIR) Post #ATF7pHtYikRkwSEjoW by boredzo@mastodon.social
2023-01-18T15:22:14Z
0 likes, 0 repeats
Huh. Wild. None of the files on my output volume are fragmented, so there are no entries in the extents overflow file. Since there are no entries, my extents overflow file has only one node (besides the header node), which is a leaf node, with no records. fsck apparently considers this invalid. I'm going to need to make a similar volume in the modern world and see what *its* extents overflow file looks like (assuming fsck doesn't fail on that as well).
(DIR) Post #ATF7pIRagDfWe09urI by boredzo@mastodon.social
2023-01-18T15:30:48Z
0 likes, 0 repeats
Huh. Brand new HFS+ volume created using hdiutil create has an extents overflow file with only a header node. I thought that was invalid (I had diagnosed it as the cause of some earlier error) and had added code to create an empty leaf node to “fix” it. Guess I can take 90% of that code back out. Just gotta figure out which 10% to keep.
(DIR) Post #ATF7pIwQpYL4BeaXvk by boredzo@mastodon.social
2023-01-19T06:21:25Z
0 likes, 0 repeats
#GuessTheBug
(DIR) Post #ATF7pJaqPIfICzUovA by boredzo@mastodon.social
2023-01-19T07:00:56Z
0 likes, 0 repeats
Achievement unlocked: Clean bill of health from fsck_hfs. Next step is DiskWarrior. It still has a few complaints about values in my volume header and my file records.
(DIR) Post #ATF7pKIRnBXkODtdsu by boredzo@mastodon.social
2023-01-19T07:18:05Z
0 likes, 0 repeats
Achievement unlocked: It mounts on macOS! I wasn't going to try that yet but DiskWarrior was reporting a problem I couldn't find evidence for (it claims a couple of files have the alias flag set). I still can't; I might try updating DiskWarrior to 5.2.
(DIR) Post #ATF7pKoLsZ41zAp7c8 by boredzo@mastodon.social
2023-01-19T07:57:30Z
0 likes, 0 repeats
Updated to 5.2. No change. DiskWarrior still claims my files' catalog records have the alias flag set. If I mount the volume and run GetFileInfo on the files in question, that confirms they don't. If I let DiskWarrior commit its changes, the files' flags don't appear to have changed (although DiskWarrior's own report is now happy). How odd. Absent some evidence corroborating this part of DW's report, I'm going to ignore it. And I probably should contact Alsoft.
(DIR) Post #ATF7pLLfsfideWPjYO by boredzo@mastodon.social
2023-01-19T22:04:39Z
0 likes, 0 repeats
A bug that my 800K test volume did not surface, but testing with a real CD-ROM does, is that the index nodes in my catalog B*-tree are all fucked up. Leaf nodes: All keys correctly ordered. Root node (which in this case is an index node pointing to two further index nodes): Correctly ordered, though there was a 50-50 chance of that. Each of the two index nodes in the middle: Tripped, spilled its records on the floor, and then hastily scooped them up.
(DIR) Post #ATF7pLz1WNC7cYp9t2 by boredzo@mastodon.social
2023-01-20T00:30:47Z
0 likes, 0 repeats
Well, that failsafe kicked in exactly as planned. Now to figure out why the volume header didn't get written (or got written somewhere else)…
(DIR) Post #ATF7pMiOnfUTtI3Oc4 by boredzo@mastodon.social
2023-01-20T01:47:30Z
0 likes, 0 repeats
Hm. Well, fsck_hfs has no problem with my converted copy of the Mac OS 9 CD-ROM, and DiskWarrior has only its dubious flags complaints. But SheepShaver refuses to even attempt to boot off it, and when I tried running the installer from it, it got a decent way through and then hit this:
(DIR) Post #ATF7pNDwuMjBT8oan2 by boredzo@mastodon.social
2023-01-20T01:49:01Z
0 likes, 0 repeats
Apparently that was from the Internet Access installer. There is a Parts folder in what looks like the right place, so it's not clear to me why the installer fails to find it.
(DIR) Post #ATF7pNlGuTNn8UPCjI by boredzo@mastodon.social
2023-01-20T01:54:00Z
0 likes, 0 repeats
The good news is, that means the Mac OS 9 installer itself finished. Temporarily remove both the CD-ROM image and my regular hard drive image, and… Yay! It boots! Doesn't boot directly from the converted CD-ROM, but the Mac OS 9 I installed from it does boot. (And then Setup Assistant crashed with a type 4 error while I was ignoring it. Dunno whether that's related.)
(DIR) Post #ATF7pOPKVXQR8j9CAS by boredzo@mastodon.social
2023-01-20T04:06:10Z
0 likes, 0 repeats
So while there may be some lingering issues in need of investigation, this seems an excellent time to start breaking up all my uncommitted work into commits. There's a lot of it by this point. Some of it shouldn't be committed (dead code, dead ends), but at this point I'm going to commit first and delete later.
(DIR) Post #ATF7pPA7hYr7Tr2Z6W by boredzo@mastodon.social
2023-01-20T05:15:31Z
0 likes, 0 repeats
Memory usage was a little dire, so I inserted an autorelease pool in a strategic place, and now it peaks at roughly 70 MB by Xcode's count. Generally hovers around 40 MB for the entire conversion of the Mac OS 9.0 CD-ROM. It converts that volume in a little under 4 seconds, pretty much all of it I/O (mainly copying fork contents).
(DIR) Post #ATF7pPfJpZoF2bdTjE by boredzo@mastodon.social
2023-01-29T22:28:35Z
0 likes, 0 repeats
Quick catch-up post for folks who don't want to read two months' worth of toots to figure out what this long thread is about:
- I'm writing a tool for working with HFS volumes on modern macOS.
- Extraction of files, folders, or the whole volume—treating it like a zip archive—works.
- Conversion to HFS+ (like PlusMaker but uglier) also works… mostly. DiskWarrior gives a couple complaints I still need to investigate.
- Where last I left off: Chopping the work done on conversion up into commits.
(DIR) Post #ATF7pQGXbBaEu33CkK by boredzo@mastodon.social
2023-02-06T03:51:50Z
0 likes, 0 repeats
Whoopsie. Found a bug in the catalog converter when adding a pointer record of just the right size to an almost-full index node: It adds the record after the last one, then stomps over its last two bytes with the offset to the top of the (overflowing) offsets stack. Need to adjust that check to reject the addition if that would happen.
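The check in question has to budget for the new record plus the two-byte offset entry that goes with it. A Python sketch under stated assumptions: records grow forward from the front of the node, the offsets stack grows backward from the end with one two-byte entry per record plus one for the start of free space, and all function and parameter names are hypothetical.

```python
OFFSET_ENTRY_SIZE = 2  # each entry in the offsets stack is a 16-bit offset


def bytes_available(node_size, free_space_offset, record_count):
    """Free bytes between the last record and the bottom of the offsets stack."""
    # One entry per record, plus one that points at the start of free space.
    stack_size = (record_count + 1) * OFFSET_ENTRY_SIZE
    return node_size - stack_size - free_space_offset


def can_append(node_size, free_space_offset, record_count, record_size):
    """True only if both the record and its new offset entry fit."""
    needed = record_size + OFFSET_ENTRY_SIZE
    return needed <= bytes_available(node_size, free_space_offset, record_count)
```

A record exactly as large as the free space passes a naive "does the record fit" test but fails this one, which is the case described above.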
(DIR) Post #ATF7pQlNkWFmRhTpom by boredzo@mastodon.social
2023-02-06T04:08:47Z
0 likes, 0 repeats
Ooh, interesting. The node thinks it has 1,947 bytes available. That… does not agree with the math based on the total size of all records already in it. Either the number-of-bytes-available method is broken, or the node's offsets stack has already been smashed (possibly by the previous append).
(DIR) Post #ATF7pRK7fM2iBRjZy4 by boredzo@mastodon.social
2023-02-09T02:46:14Z
0 likes, 0 repeats
Hmmm. Can't tell if the Quake 1 CD-ROM is weird or if I'm misparsing it somehow. Currently leaning toward the former. (I checked and I'm still able to read CDs I've read before, so I don't appear to have broken anything recently.)
(DIR) Post #ATF7pRxpHjnmAaJHqy by boredzo@mastodon.social
2023-02-09T02:50:41Z
0 likes, 0 repeats
Valid B*-tree node types: Leaf (-1), index (0), header (1), map (2). Type of this catalog's first node (where the header node should be): 4. 🤨
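A sanity check for that field could be sketched in Python as below. It assumes the node descriptor begins with two 32-bit links, then a signed one-byte node type, a one-byte height, and a 16-bit record count, all big-endian; node_kind is a hypothetical helper, not part of the tool.

```python
import struct

# Node types as listed above.
NODE_KINDS = {-1: "leaf", 0: "index", 1: "header", 2: "map"}


def node_kind(node_bytes):
    """Return the node's kind as a string, or None if the type byte is invalid."""
    # forward link, backward link, type (signed), height, record count
    _forward, _backward, kind, _height, _count = struct.unpack_from(">IIbBH", node_bytes, 0)
    return NODE_KINDS.get(kind)
```

The type byte has to be read as signed for a leaf node to come out as -1 rather than 255.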
(DIR) Post #ATF7pSaSy4i66QO956 by boredzo@mastodon.social
2023-02-09T03:20:38Z
0 likes, 0 repeats
Aha. My file-reading code is actually broken for files that occupy multiple extents. It doesn't advance the buffer offset, and so overwrites earlier reads with later ones. That's a pretty big bug! Glad I caught it.
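In outline, a multi-extent read has to keep a running offset into the destination buffer. A Python sketch with hypothetical names, treating the volume as a bytes object and numbering blocks from byte 0 for simplicity:

```python
def read_fork(volume, block_size, extents, logical_length):
    """Concatenate a fork's extents into one buffer of `logical_length` bytes."""
    buffer = bytearray(logical_length)
    buffer_offset = 0
    for start_block, block_count in extents:
        if buffer_offset >= logical_length:
            break
        # The last extent may hold more blocks than the fork's logical length needs.
        want = min(block_count * block_size, logical_length - buffer_offset)
        source = start_block * block_size
        buffer[buffer_offset:buffer_offset + want] = volume[source:source + want]
        # Without this step, every extent would land at offset 0 and
        # overwrite the data read from the extents before it.
        buffer_offset += want
    return bytes(buffer)
```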
(DIR) Post #ATF7pT4bA2oTbsUD32 by boredzo@mastodon.social
2023-02-16T18:21:36Z
0 likes, 0 repeats
Back to work on the HFS tool. (See pinned toot for a catch-up.) Fleshing out documentation and realized that extracting aliases might not work. I can extract an alias, but it'll point to a folder/file ID that may not exist or may be any random item. I have a few options when extracting an alias specifically, including rehydrating the real file instead. When extracting the whole volume, I could detect and patch up aliases to reconnect them to their destinations (if on the same volume).
(DIR) Post #ATF7pTbvA9T5HE4ozI by boredzo@mastodon.social
2023-02-16T18:22:31Z
0 likes, 0 repeats
This is one case where conversion is advantageous. Since all the folder/file IDs are preserved in the converted volume, any aliases should Just Work. (…though I haven't tested that.)
(DIR) Post #ATF7pU77IAQCpyfjc0 by boredzo@mastodon.social
2023-02-19T09:07:38Z
0 likes, 0 repeats
Just tested with a new test volume created on Mac OS 9.0.4 (I made the previous one on System 6, so no aliases). Initially, the intra-volume alias on this test volume works fine in the converted volume. However, if I mount the converted volume as read/write and then rename the original file, the alias breaks. So it looks like reference-by-ID gets broken somehow and it ends up effectively a symlink. Dunno if there's anything I can do about the former; I'll need to investigate further.
(DIR) Post #ATF7pUgDBgUiap5lJY by boredzo@mastodon.social
2023-02-21T15:25:00Z
0 likes, 0 repeats
Reading more Inside Macintosh, I have a lead on one possible reason aliases are fragile on the converted volume. When pathname matching fails, the Alias Manager matches volumes by “name, creation date, and type”. Initially tries all three, then creation date+type, then name+type. It's vague on what “type” means but at least one component seems to be the format signature. That alone might be why: the Alias Manager is looking for an HFS volume, which it won't find because it's HFS+ now.
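The fallback order described there can be modelled as a series of passes, each comparing a smaller set of fields. This Python sketch only illustrates that description; it is not the Alias Manager's implementation, and the dictionary keys and the "HFS"/"HFS+" type values are hypothetical.

```python
# Fields compared on each pass, in the order described above.
MATCH_PASSES = [
    ("name", "creation_date", "type"),
    ("creation_date", "type"),
    ("name", "type"),
]


def find_volume(wanted, mounted):
    """Return the first mounted volume that matches on any pass, else None."""
    for fields in MATCH_PASSES:
        for volume in mounted:
            if all(volume[field] == wanted[field] for field in fields):
                return volume
    return None
```

Every pass includes the type, so if the type covers the format signature, a converted volume with the same name and creation date would still fail all three passes.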
(DIR) Post #ATF7pVCTFkIaCsBWb2 by boredzo@mastodon.social
2023-02-21T15:33:35Z
0 likes, 0 repeats
Interestingly, alias files created on modern macOS contain CFURL bookmark data in the data fork, not an 'alis' resource in the resource fork. Kind of wonder whether CFURLCreateBookmarkDataFromAliasRecord will work when the alias points to a volume that isn't mounted. Moreover, if an alias file has both bookmark data and an (old, verbatim, fragile) alias resource, will the bookmark data win?
(DIR) Post #ATF7pVrEoAuOFJG58i by boredzo@mastodon.social
2023-02-21T16:22:34Z
0 likes, 0 repeats
Oh, THAT's interesting. My test for whether each file is an alias file or not is returning true for the original file. The alias bit is set, even though it shouldn't be. That's the problem DiskWarrior was complaining about. Huh…
(DIR) Post #ATF7pWTAX9FY8x0NGK by boredzo@mastodon.social
2023-02-21T17:03:28Z
0 likes, 0 repeats
At least in this case, it was because I was reading the finderInfo member (which is actually the *extended* Finder info), not the “userInfo” member, which is where the alias bit is. The code that copies both Finder infos verbatim doesn't seem to have this problem. It copies the userInfo to the userInfo and the finderInfo to the finderInfo.
(DIR) Post #ATF7pXDxjAgEU4tkCO by boredzo@mastodon.social
2023-02-21T17:17:33Z
0 likes, 0 repeats
Sweet. Converting the alias record to bookmark data and writing that to the data fork of the alias file on the converted volume (alongside the original, unmodified alias resource) works. The resulting hybrid alias file is able to find the original even after renaming it. That's probably going to be off by default, since it's a change to fork data and isn't necessary when the converted volume is treated as read-only. But it'll be an optional way to keep aliases from becoming brittle.
(DIR) Post #ATF7pXs1KEisUJdjdY by boredzo@mastodon.social
2023-02-21T17:32:01Z
0 likes, 0 repeats
… Never mind. I did a few things differently that time and Finder must have transparently fixed the alias. When I retested with a fresh conversion, it did not work. Either Finder is favoring the alias resource, or the bookmark data is still too strict. (It might even have the exact same problem.) I'm not going to worry about this too much; I think I'll just have to note in the documentation that aliases in converted volumes will be brittle.
(DIR) Post #ATF7pYXUq1tqYx2rHk by boredzo@mastodon.social
2023-02-22T08:15:30Z
0 likes, 0 repeats
I've been chasing down the issues identified by DiskWarrior. One is partly a DiskWarrior bug—it claims the problem is the “alias flag”, but it was actually the thread bit not getting set on files that had needed thread records. Fixed. Second problem was with several folders' custom icon bits. Yup, they needed it and didn't have it—and that was true in the original HFS volume as well. Third: Root directory's creation date didn't match the volume's. Also true in the original.
(DIR) Post #ATF7pZ9QZ0F0San9PM by boredzo@mastodon.social
2023-02-22T08:22:29Z
0 likes, 0 repeats
So I had one bug (wasn't setting the thread bit after creating thread records), but the other two were not bugs in the converter. The converter accurately reproduced the values from the input volume. That is, in this context, the opposite of a bug. My goal is conversion and data retrieval, with an emphasis on fidelity. Repair is the domain of fsck and DiskWarrior.
(DIR) Post #ATF7pZjwNFRqHpsJJw by boredzo@mastodon.social
2023-02-22T17:02:01Z
0 likes, 0 repeats
Finally implementing support for looking up files in the source volume's extents overflow file. And ran smack into a mystery: The source volume's extents overflow file seems to be wrong???? IM:F says extents overflow keys are the fork type, file ID number, and first block number. The catalog's extent record for this file says its resource fork starts at block 32. The record in the EO file is for a resource fork that starts at block… 22. (Which is in the middle of the catalog file.)
(DIR) Post #ATF7paJ2GlWM2gIL1U by boredzo@mastodon.social
2023-02-22T17:15:06Z
0 likes, 0 repeats
Aha. I thought I had to be misunderstanding something. IM:F says that the extents key contains the “starting file allocation block number”, the “index of the first allocation block of the first extent descriptor of the extent record”. I had interpreted that to mean the first allocation block of the file. Seems reasonable enough. But it's actually the first block *in that record*—i.e., the total number of blocks before it. Sum of the block counts of the catalog's extent record: 22.
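Put another way, the key's starting block is a position within the fork, not a position on the volume. A Python sketch of how the keys fall out when a fork's extents are split into HFS's three-extent records; the function name is hypothetical and the extent values in the usage example are invented, chosen only so that the first record starts at block 32 and its counts add up to 22 as above.

```python
HFS_EXTENTS_PER_RECORD = 3  # an HFS extent record holds three extent descriptors


def extent_records(extents):
    """Split (start_block, block_count) extents into records keyed by fork position.

    Returns a list of (blocks_before, record). The first record, keyed 0, is the
    one stored in the catalog; the rest belong in the extents overflow file.
    """
    records = []
    blocks_before = 0
    for i in range(0, len(extents), HFS_EXTENTS_PER_RECORD):
        record = extents[i:i + HFS_EXTENTS_PER_RECORD]
        records.append((blocks_before, record))
        blocks_before += sum(count for _start, count in record)
    return records
```

For example, with extents [(32, 10), (50, 8), (70, 4), (100, 5)], the overflow record holding (100, 5) is keyed with starting block 22, because 10 + 8 + 4 blocks of the fork precede it.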
(DIR) Post #ATF7pay9nsPk6DXB7Q by boredzo@mastodon.social
2023-02-22T18:06:40Z
0 likes, 0 repeats
So with all these fixes, I'm down to no known bugs. I still can't boot SheepShaver from the converted Mac OS 9 CD-ROM, for whatever reason, but I am able to install from it. And this time I didn't get that weird error I had before about some missing or corrupted file.
(DIR) Post #ATF7pbU3tFw1hASeqe by boredzo@mastodon.social
2023-02-23T00:10:40Z
0 likes, 2 repeats
Here it is: https://github.com/boredzo/impluse-hfs impluse (pronounced “impulse”, in the finest traditions of open-source software naming) is the HFS tool I've been working on for a few months now. Convert HFS volumes to HFS+. List their contents. Extract select items, or the entire shebang. Use with caution, keep your originals, and please file bugs.
(DIR) Post #ATF7pd9vf51EtJ3bkG by boredzo@mastodon.social
2023-03-01T01:02:27Z
0 likes, 0 repeats
Just pushed an implementation of Apple Partition Map support to a branch. With this, it's possible to target impluse directly at, say, a Mac OS install CD (which is partitioned), and extract files from it. Still need to work on what it does when converting. Ideally, it probably should copy the partition map + other partitions and write the converted volume where the HFS volume used to be.
(DIR) Post #ATLx8sIU1l9n5Dx3iq by boredzo@mastodon.social
2023-03-06T05:40:21Z
0 likes, 1 repeats
Y'ever go to fix a bug and find that it's actually three completely separate bugs in a trenchcoat?
(DIR) Post #ATLx8uWNl3SlyuTBfE by boredzo@mastodon.social
2023-03-06T07:24:55Z
0 likes, 0 repeats
So yeah, that was a journey. https://github.com/boredzo/impluse-hfs/issues/18 “Hexen: Deathkings of the Dark Citadel” required nearly half a dozen fixes to convert successfully. And one of them is a band-aid that will ultimately need to be replaced with a proper fix.