[HN Gopher] Using files with browsers, in reality
       ___________________________________________________________________
        
       Using files with browsers, in reality
        
       Author : jamghee
       Score  : 65 points
       Date   : 2022-03-20 01:33 UTC (1 days ago)
        
 (HTM) web link (macwright.com)
 (TXT) w3m dump (macwright.com)
        
       | nyanpasu64 wrote:
       | As a native programmer who meddles with the horrors of thread
       | safety and & and &mut, and occasionally dabbles in high-level
       | (JS/Dart) asynchrony, async functions fill me with much of the
       | same fear and caution. await looks like merely a nonblocking
       | function call, but means arbitrary code executes and can and will
       | mutate arbitrary state under your feet. Shared state across
       | coroutines is nearly as dangerous as shared state across threads,
       | and (I conjecture) far more pervasive.
       | 
       | In the code in question
       | (https://web.archive.org/web/20220113153505/https://web.dev/f...,
       | now changed in https://web.dev/file-system-access/#drag-and-drop-
       | integratio...):
       | 
       |     elem.addEventListener('drop', async (e) => {
       |       // Prevent navigation.
       |       e.preventDefault();
       |       // Process all of the items.
       |       for (const item of e.dataTransfer.items) {
       |         // Careful: `kind` will be 'file' for both file
       |         // _and_ directory entries.
       |         if (item.kind === 'file') {
       |           const entry = await item.getAsFileSystemHandle();
       |           if (entry.kind === 'directory') {
       |             handleDirectoryEntry(entry);
       |           } else {
       |             handleFileEntry(entry);
       |           }
       |         }
       |       }
       |     });
       | 
       | I probably wouldn't have guessed that `e.dataTransfer.items` gets
       | cleared at the first await (since I'm not a proficient web
       | developer), but I would've been _extremely_ wary of this code in
       | general. Additionally (not tied to async-await but race
       | conditions in general), is `item.getAsFileSystemHandle()` a
       | TOCTTOU vulnerability where the type of an item can change
       | between folders and files and symlinks etc., while this code is
       | running?
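
One way to sidestep the hazard described above is to start every `getAsFileSystemHandle()` call synchronously and only await afterwards. The following is a minimal sketch, not the code from web.dev; `collectHandles` is a hypothetical helper name, and it assumes only that each item exposes `kind` and `getAsFileSystemHandle()` the way a DataTransferItem does.

```javascript
// Hypothetical helper: gather all handle promises before the first await.
// `items` is any iterable of objects shaped like DataTransferItem.
async function collectHandles(items) {
  const pending = [];
  // Synchronous pass: nothing is awaited here, so the item list is
  // still populated while we iterate over it.
  for (const item of items) {
    if (item.kind === 'file') {
      pending.push(item.getAsFileSystemHandle());
    }
  }
  // Only now yield to the event loop; the list may be cleared after this.
  return Promise.all(pending);
}

// Possible use inside a drop handler (browser only, shown as a comment):
// elem.addEventListener('drop', async (e) => {
//   e.preventDefault();
//   const handles = await collectHandles(e.dataTransfer.items);
//   for (const entry of handles) { /* dispatch on entry.kind */ }
// });
```

The design choice is simply to keep every read of the shared `items` list on the synchronous side of the first `await`.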
       | 
       | Rust's & vs. &mut system largely eliminates shared state hazards
       | in both threading and asynchrony (&mut is exclusive/unaliased
       | and can't be mutated by other threads or event loop jobs, and &
       | is difficult and unidiomatic to mutate), though it doesn't solve
       | async cancellation errors (https://carllerche.com/2021/06/17/six-
       | ways-to-make-async-rus..., discussed at
       | https://news.ycombinator.com/item?id=27542504), or filesystem
       | TOCTTOU (https://blog.rust-
       | lang.org/2022/01/20/cve-2022-21658.html as well as user code).
       | 
       | Qt event loop reentrancy is fun(tm) as well. It looks like a
       | blocking call, but spawns a nested event loop which can do
       | anything (but rarely enough to lull you into a false sense of
       | complacency), resulting in segfaults like
       | https://github.com/Nheko-Reborn/nheko/issues/656 (workaround at
       | https://github.com/Nheko-Reborn/nheko/commit/570d00b000bd558...,
       | I didn't look into it). And Qt lacks "easy" await syntax and a
       | framework based on calling red functions (though I didn't look
       | into C++20 coroutines yet, perhaps
       | https://www.qt.io/blog/asynchronous-apis-in-qt-6 or
       | https://github.com/mhogomchungu/tasks or
       | https://blog.blackquill.cc/asynchronous-qtquick-uis-and-
       | thei...?).
        
       | PaulHoule wrote:
       | Other APIs for async file I/O are outright painful.
        
       | mattdesl wrote:
       | Nice article! It's a great API, hope to see the kinks ironed out.
       | 
       | I am using it to stream PNG frames in real-time into a user's
       | file system, for web-based media tools. This allows the user to
       | generate their own HD video files offline at maximum quality,
       | bringing static web a little closer to pro level media tooling.
       | 
       | One downside aside from browser support is that two permissions
       | must be requested for this: one to select the directory to save
       | to, and then again to give write access.
        
       | ohCh6zos wrote:
       | Should web pages have this kind of API? We seem to be repeating
       | the mistakes of ActiveX but this time with JavaScript.
        
         | dathinab wrote:
         | The gotcha is this (quote):
         | 
         | > Note: What is referred to as the "local file system" in this
         | spec, does not have to strictly refer to the file system on the
         | local device.
         | 
         | So basically a browser can implement this using a KV database
         | or similar. Nothing requires the browser to actually let you
         | "pick" (via a file picker) files in your home directory or
         | similar, especially given that:
         | 
         | > Note: While user agents will typically implement this by
         | persisting the contents of this origin private file system to
         | disk, it is not intended that the contents are easily user
         | accessible. Similarly there is no expectation that files or
         | directories with names matching the names of children of the
         | origin private file system exist.
         | 
         | Also
         | 
         | > The origin private file system is a storage endpoint whose
         | identifier is "fileSystem", types are << "local" >>, and quota
         | is null.
         | 
         | So this "file system" might exist in a complete "parallel
         | universe" to your normal file system.
         | 
         | Also, given that this is not a new storage type, it means that
         | if your browser is set up to e.g. clear local storage for a
         | specific origin if that origin wasn't used for a month, this
         | might still apply.
         | 
         | So have fun with your "files" having disappeared after some
         | longer holidays (e.g. on Safari, at least the way Apple planned
         | to implement it a while ago).
         | 
         | So while it looks like a file system access API, it might end
         | up not being one, depending on browser implementation details.
         | 
         | Also, any access goes through a file picker and you can't
         | ".."-navigate upward, which avoids many security problems. Add
         | in no file links, no fancy operations etc., and this should not
         | be a problem even if it allows access to your files.
         | 
         | Though in the end it depends a lot on the choices the browser
         | makes when implementing it.
        
           | dathinab wrote:
           | Similarly, the limitations, especially the absence of flush,
           | make it kinda useless to run a database in the browser :(
           | 
           | Though there is `persist` on the storage manager, I'm not
           | sure how much this helps. Theoretically it could work like
           | fsync or F_FULLFSYNC, but practically I'm not sure at all.
        
             | slaymaker1907 wrote:
             | You can likely assume fsync is going on once you close a
             | file writer. It's already a super expensive operation, so
             | there would be little reason not to sync. The expensive
             | part is they copy the entirety of the old file to a temp
             | file, let you do changes to the temp file, then copy the
             | temp file to the old file on close.
             | 
             | The implementation creates other problems for DBs though
             | since you can't really do small writes efficiently. One
             | idea I've had would be to implement a virtual paging
             | system, but then you introduce a new layer of abstraction
             | and it's still going to be really slow on NTFS (since it
             | assumes a few large files, not many small ones).
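
To make the cost argument above concrete, here is a toy in-memory model of the copy-on-open, copy-back-on-close behaviour the comment describes. It is not browser code: `SwapFileWriter`, `patchOneByte`, and the `disk` map are all hypothetical names invented for this sketch, and the byte counter only illustrates why small writes are expensive under that scheme.

```javascript
// Toy model only: `disk` is a Map from file name to Uint8Array and
// stands in for the real file system.
class SwapFileWriter {
  constructor(disk, name) {
    this.disk = disk;
    this.name = name;
    // Opening copies the entire existing file into a temporary buffer.
    this.temp = Uint8Array.from(disk.get(name) ?? []);
    this.bytesCopied = this.temp.length;
  }

  // Overwrite `data` at `position`, growing the temp buffer if needed.
  write(position, data) {
    const end = position + data.length;
    if (end > this.temp.length) {
      const grown = new Uint8Array(end);
      grown.set(this.temp);
      this.temp = grown;
    }
    this.temp.set(data, position);
  }

  // Closing copies the temp buffer back over the original file and
  // returns the total number of bytes copied by this writer.
  close() {
    this.disk.set(this.name, Uint8Array.from(this.temp));
    this.bytesCopied += this.temp.length;
    return this.bytesCopied;
  }
}

// Change a single byte and report how many bytes had to be copied.
function patchOneByte(disk, name, position, value) {
  const writer = new SwapFileWriter(disk, name);
  writer.write(position, Uint8Array.of(value));
  return writer.close();
}
```

In this model, patching one byte of a 1000-byte file copies 2000 bytes (once on open, once on close), which is the overhead a database doing many small writes would pay each time.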
        
           | slaymaker1907 wrote:
           | The origin file system is pretty pointless, but it is only
           | one part of the spec. It is definitely intended that the
           | local file system (whether a cloud k/v store or not) at least
           | allows for moving data between completely different websites.
           | 
           | I think you make a good point about Safari's nonsense with
           | browser data. The spec should require implementors to never
           | clear out what they use for "local file system" unless the
           | user explicitly says to, and only for the files selected for
           | deletion. The old APIs like local storage/IndexedDB
           | unfortunately assumed no browser vendor would be as dumb as
           | Safari with their ridiculously short retention policy.
        
         | spansoa wrote:
         | Came here to say this too. Principle of least privilege doesn't
         | exist here, even with the browser prompting for access to
         | files, people are going to make mistakes and upload their
         | entire C: directory by accident. Reminds me of Kazaa where you
         | could essentially browse the contents of a person's hard-drive
         | because they configured the wrong folder.
        
           | AgentME wrote:
           | This API can allow applications that work with files to be
           | made as strongly-sandboxed web apps instead of unsandboxed
           | applications. If the API didn't exist and the user had to
           | download an unsandboxed application to work with a file, then
           | the unsandboxed application will get access to their whole C:
           | drive without the user needing to make any mistakes.
        
           | vbezhenar wrote:
           | It's not possible to grant access to C:
        
             | [deleted]
        
         | kaslai wrote:
         | I think a "web app" should definitely be able to have some sort
         | of more properly integrated filesystem API, but a "web page"
         | has no business having access to such a thing. There just
         | hasn't been a line drawn in the sand between the two, so every
         | "web page" has all the capabilities of a "web app" by default.
         | 
         | Personally I wish there would be a meaningful line drawn
         | between the two so that users could have a nice shorthand for
         | allowing web pages to "upgrade" into apps which have access to
         | things like WebGL and filesystem access. Such a thing would
         | only have any meaning to power users and privacy oriented
         | people though, and the general trend in browser design has been
         | to spurn such users in favor of reducing friction for everyone
         | else at all costs.
        
           | RussianCow wrote:
           | That line was blurred long ago, and the distinction between
           | an "app" and a "page" is irrelevant today. Even news articles
           | these days sometimes contain interactive elements like WebGL
           | visualizations.
           | 
           | Personally, even though every new API inevitably gets abused,
           | I don't think we should throw the baby out with the bathwater
           | because there are tons of legitimate uses here. (In fact, I'm
           | currently building something as a side project that would
           | benefit greatly from these file system APIs.) My own worry is
           | about complexity creep--specifically, the number of things
           | I'll be expected to know and keep track of as a frontend
           | engineer in another 10 years. But that's probably just me
           | getting older. :)
        
           | AgentME wrote:
           | If web apps are fully sandboxed by default as today, then
           | presenting the user a UI for a web page wanting to upgrade to
           | a (still sandboxed permissionless) web app seems like a waste
           | of the user's attention. Why should the user see a prompt
           | just because a webpage wants to do some WebGL visualization
           | (that doesn't put any of the user's data at risk)? It seems
           | like the perfect recipe to lead to user apathy to permission
           | dialogs and users clicking to allow permissions
           | automatically, because most of the dialogs are for nothing,
           | but then the user may be taught to click through actual
           | important dialogs just as automatically. I'm reminded of when
           | IE used to warn the user about secure connections.
        
             | bityard wrote:
             | > If web apps are fully sandboxed by default
             | 
             | Are they, though?
             | 
             | If they were, then tracking users via third-party cookies
             | and other resources wouldn't be possible. Nor would it be
             | possible for a web site in my browser to suddenly start
             | taking up all of my CPU/RAM due to a programming error or
             | malicious site such as a crypto-miner. For the relatively
             | little isolation that does happen, sandbox-escape
             | vulnerabilities seem to be getting discovered all the time.
             | 
             | Also, as a technical user, I want more control over what
             | web sites can do with my computer than a non-technical user
             | might.
             | 
             | The more holes you poke in a sandbox, the worse a sandbox
             | it is.
        
               | AgentME wrote:
               | Third-party cookies seem to be on the way out thankfully.
               | I agree that there should be a permission necessary (or
               | at least some much better heuristic) for allowing a
               | webpage to use too much CPU/memory.
        
             | autoexec wrote:
             | > Why should the user see a prompt just because a webpage
             | wants to do some WebGL visualization (that doesn't put any
             | of the user's data at risk)?
             | 
             | Probably because there's no way to say that it "doesn't put
             | any of the user's data at risk". WebGL has been abused for
             | browser fingerprinting which itself puts user's privacy at
             | risk, but it also has a long history of very nasty
             | vulnerabilities and exploits. It's been fully disabled in
             | my browser for years because of the security issues.
        
           | bityard wrote:
           | Nearly all of the security and privacy problems we have with
           | the World Wide Web today exist because it went from a content-
           | delivery platform (with deliberately limited interactivity)
           | to a fairly complete app-delivery platform.
           | 
           | Javascript isn't the new Java. Web browsers are the new Java.
           | 
           | I would be very much in favor of a way to draw a line between
           | "content" on the web and "apps" delivered by the web. I don't
           | know what form that would take. But it will probably never
           | happen because the FAANGs that run the web these days are
           | actively opposed to any way to deliver content over the web
           | that doesn't also let them include apps to track your
           | activities online.
        
           | zokier wrote:
           | > so every "web page" has all the capabilities of a "web app"
           | by default
           | 
           | Haven't many (if not most?) major expansions of capabilities
           | been behind a permissions prompt, making them not available
           | to every web page by default?
        
           | armchairhacker wrote:
           | We already have this system.
           | 
           | > foobar.com wants to access your location. [Block] [Allow]
           | 
           | > foobar.com wants to access your camera and microphone.
           | [Block] [Allow]
           | 
           | > foobar.com wants to send push notifications. [Block]
           | [Allow]
           | 
           | Ideally these prompts are presented above the line of death,
           | and clicking "Block" prevents future prompts, so you can't
           | get spammed.
           | 
           | Of course users click on these prompts without caring. Of
           | course websites may try to unnecessarily block access if you
           | don't agree. Of course websites make their own obviously fake
           | prompts so you click "block" and then they present the
           | obviously fake prompt again just to waste your time.
           | 
           | But users already download and open random files and grant
           | them admin privileges, and websites already spam you. The
           | current notification system works and extending it to WebGPU
           | and file systems is natural.
        
       | dugmartin wrote:
       | Another interesting use of the File System Access API
       | (https://caniuse.com/?search=File%20System%20Access%20API) is:
       | 
       | https://vscode.dev/
       | 
       | You can grant it access to a directory on your local machine and
       | then edit files like you would in the Electron version.
        
       | ShamelessC wrote:
       | > Google's documentation and the specification itself contained
       | this bug in example code.
       | 
       | Off-topic, but Google's documentation for _everything_ contains
       | subtle bugs such as this.
        
       | brundolf wrote:
       | I wonder how this API will affect Electron apps once it's
       | available there. For some apps the only reason you need non-UI
       | code at all is for working with the file system; seems like it
       | would be great to be able to consolidate more logic into the UI
       | process
        
       | shuntress wrote:
       | I find the way we are slowly converging on Chromium-Powered-
       | Browser as the standard operating system to be both funny and
       | frustrating.
       | 
       | I'm really dissatisfied with Electron apps in general and using
       | an operating system designed to run on top of other operating
       | systems _feels_ wrong.
       | 
       | But, I do really like it when things portably _just work_.
        
       | slaymaker1907 wrote:
       | It's not terrible, but it still has a lot of warts. I wrote a
       | saver plugin for Tiddlywiki that uses this API for a self
       | modifying HTML file.
       | https://slaymaker1907.github.io/tiddlywiki/plugin-library.ht...
       | 
       | One major annoyance is that you can't just show the file picker
       | for "security reasons", it has to be user prompted. This is a
       | useless precaution because you can just add an onclick handler to
       | the document body. It inconveniences non-malware developers while
       | hardly troubling malicious sites at all.
       | 
       | Another goodie is that resolving a path is extremely expensive
       | with no opportunity for caching from the OS. You have to parse
       | paths yourself and individually walk each directory.
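
The manual walk described above might look roughly like the sketch below. `resolvePath` is a hypothetical helper, not part of the API; it assumes `root` behaves like a FileSystemDirectoryHandle, using only its `getDirectoryHandle(name)` and `getFileHandle(name)` methods, and it assumes a file lookup on a directory name rejects with an error named `TypeMismatchError`.

```javascript
// Hypothetical helper: resolve a slash-separated relative path by
// walking one directory handle at a time.
async function resolvePath(root, path) {
  // Drop empty segments and "." so "a//b/./c/" behaves like "a/b/c".
  const parts = path.split('/').filter((p) => p !== '' && p !== '.');
  if (parts.includes('..')) {
    throw new Error('".." is not supported: handles cannot navigate upward');
  }
  if (parts.length === 0) return root;

  let dir = root;
  // Every intermediate segment costs one asynchronous lookup.
  for (const name of parts.slice(0, -1)) {
    dir = await dir.getDirectoryHandle(name);
  }

  const last = parts[parts.length - 1];
  try {
    return await dir.getFileHandle(last);
  } catch (err) {
    // Assumption: a directory found where a file was requested is
    // reported as TypeMismatchError; anything else is a real failure.
    if (err.name !== 'TypeMismatchError') throw err;
    return dir.getDirectoryHandle(last);
  }
}
```

A path with N segments costs at least N awaited lookups here, which is the expense the comment is pointing at; caching intermediate directory handles in a Map keyed by path prefix would be one way to reduce it.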
       | 
       | I've also been working on an equivalent API to the fs module in
       | Node lately, but really such an API should have been the #1
       | priority example for using this API. It would have immediately
       | highlighted how difficult the path problem is. OS/browser engine
       | nerds deal with file descriptors, but most applications work with
       | paths.
       | 
       | It also would be great if Firefox would stop being so hostile to
       | this API. The API has problems, but they aren't inherent to the
       | goals of this API. I also find polyfills for this API to be
       | ridiculous. You really can't polyfill this in a meaningful way
       | for the browser because it is so groundbreaking. Having a non-
       | Google implementation of the API would be way more helpful than
       | polyfills.
        
       ___________________________________________________________________
       (page generated 2022-03-21 23:01 UTC)