Subj : Re: Windows vs Linux
To   : boraxman
From : tenser
Date : Mon Apr 25 2022 14:34:56

On 25 Apr 2022 at 01:01p, boraxman pondered and said...

bo> te> Ahh, typical ESR foolishness: that book does not have a
bo> te> great reputation for a reason. CSV obviously _is_ a
bo> te> delimited text format. Since you mention /etc/passwd,
bo> te> suppose a site wanted to put a colon in the GECOS field;
bo> te> how would one do it? Or they wanted to put arbitrary
bo> te> commas in fields (say, 'LastName, FirstName' was the
bo> te> local convention), but you still wanted compatibility
bo> te> with tools like `chfn` and `finger`?
bo> te>
bo> te> There's a reason structured data formats have become
bo> te> popular.
bo>
bo> He explains it. You use an escape character.

With /etc/passwd? Nope. That doesn't work, because the parsers are
built into a library. And even on systems where I can hack the
library, I might use something like LDAP or NIS, or even shell
scripts and rsync or rdist to copy those files to machines where I
can't hack the library for some reason. So no, that doesn't work
for /etc/passwd.

Or did you mean for delimited text formats generally? In which case,
don't CSV files support quoted strings?

bo> I've written a CSV parser, and one which is based on the delimited
bo> format. The latter is far easier.

If one is going to appeal to authority or personal experience, it's
best if one checks one's priors. I learned compilers from Al Aho,
and I've written parsers for full programming languages with
context-sensitive grammars. Some of them are Internet facing and
used daily by millions of users. So I think I speak with some
authority when I say that CSV is not significantly harder than
simple delimited lines of text, which are themselves trivial to
parse. However, neither is very extensible. Consider what happens
when one needs to add a new field.
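To make the quoted-strings point concrete, here's a minimal sketch
using Python's standard csv module (the data is invented for
illustration): a field containing the delimiter survives a round
trip because the writer quotes it, RFC 4180 style.

```python
import csv
import io

# Made-up rows: the first field contains a literal comma, the
# "LastName, FirstName" convention from the quote above.
rows = [["Lastname, Firstname", "Room 42"],
        ["plain", "fields"]]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
encoded = buf.getvalue()
# The writer emits "Lastname, Firstname" in quotes, so the embedded
# comma is data, not a delimiter.

decoded = list(csv.reader(io.StringIO(encoded)))
assert decoded == rows  # lossless round trip
```

No escape character needed by the user; the quoting is part of the
format itself, which is exactly what a bare colon-delimited file
lacks.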
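And the extensibility problem can be sketched the same way, with a
hypothetical /etc/passwd line (made-up account data; the naive
split below stands in for what a libc-style parser effectively
does):

```python
# A V7-style passwd line is purely positional: the parser bakes in
# "field 3 is always an integer and it's always the UID".
line = "alice:x:1000:1000:Alice Example,Room 42:/home/alice:/bin/sh"

name, pw, uid, gid, gecos, home, shell = line.split(":")
uid = int(uid)  # the integer-ness of field 3 is implicit, undeclared

# There is no escape mechanism: a colon anywhere in GECOS would
# shift every later field. And a consumer that unpacks exactly
# seven fields breaks the moment an eighth (say, an expiry date)
# is bolted on:
extended = line + ":2030-01-01"
try:
    name, pw, uid, gid, gecos, home, shell = extended.split(":")
except ValueError:
    pass  # too many values to unpack -- the old parser chokes
```

Which is why, as noted below, the extra fields ended up in separate
files (shadow on Linux, master.passwd on BSD) with tools to keep
them in sync, rather than in new columns of the old file.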
To go back to the /etc/passwd example: when this last happened, both
Linux and BSD had to invent a new file format that lived in a
separate file next to the legacy V7 format file, and they had to
develop specialized tools to keep these in sync.

Delimited lines of text are great because they're simple to use and
easy to get going. They work well in Unix pipelines because most
filters evolved to work best with that kind of textual data. They're
not so great because they generally don't evolve gracefully: too
much is implicit in the format itself ("field 3 is always an integer
and it's always the user ID number"). There are no universally
agreed-upon formats to represent the full range of data expressible
on modern machines. This is why schematized, structured formats are
useful, though they are harder to get started with. However, once
you start using those, informally specified things like Unix filters
start to break down because they don't understand the structured
format. This naturally led to the rise of things like PowerShell,
which attempt to fit a much richer data model into the filter
paradigm. Things like nushell, or Michael Greenberg's work on
formally specifying the shell, and smoosh, are more recent advances.

bo> CSV is OK to use from a user's POV (and if you have a parser
bo> already, better than a format only accessible to its parent
bo> application), but if you were making your own format, you
bo> wouldn't use it.

Practically every programming language in common use today has a
high-quality CSV library available.

bo> Steve Ballmer. Sheesh!

I remember that too. So what? You missed the point. Microsoft
invested heavily in the developer experience for Windows, and
developers wanted to use Windows.

bo> te> I'd put this rather differently. Unix wasn't so much designed
bo> te> as it emerged as a reaction to overly complex systems squeezed
bo> te> onto a tiny (but affordable!) machine.
bo> te> That first machine seemed promising and gave way to another
bo> te> small but affordable machine; pipes came a few years later.
bo>
bo> Perhaps, but I find it more pragmatic. Solutions born from people
bo> trying to solve problems have stood the test of time. They may not
bo> be optimal, often aren't, and you could do better if we tried
bo> again, but they are established and understood.

More pragmatic than what, exactly? The interesting thing about a
research system is that it is designed to solve problems that are
interesting at some place and some point in time. Unix is one of
those very rare systems indeed where the research interests
coincided with commercial interests in such a way that it could
_successfully_ make the jump from research to commercial
development. However, that doesn't mean that the system doesn't owe
its origins -- not to mention its major design principles -- to the
research context it was created in. The point is that Unix wasn't
designed as a pragmatic solution to production data processing
problems so much as it evolved to answer interesting research
questions. What's even more interesting is that every system since
has similarly had the benefit of that research.

To bring this back to the original point -- again -- you may prefer
Linux, but truly, there's very little in there that cannot be
implemented on just about any other base system.

--- Mystic BBS v1.12 A47 2021/12/24 (Linux/64)
 * Origin: Agency BBS | Dunedin, New Zealand | agency.bbs.nz (21:1/101)