[HN Gopher] Using files with browsers, in reality
___________________________________________________________________
Using files with browsers, in reality
Author : jamghee
Score : 65 points
Date : 2022-03-20 01:33 UTC (1 days ago)
(HTM) web link (macwright.com)
(TXT) w3m dump (macwright.com)
| nyanpasu64 wrote:
| As a native programmer who meddles with the horrors of thread
| safety and & and &mut, and occasionally dabbles in high-level
| (JS/Dart) asynchrony, async functions fill me with much of the
| same fear and caution. await looks like merely a nonblocking
| function call, but means arbitrary code executes and can and will
| mutate arbitrary state under your feet. Shared state across
| coroutines is nearly as dangerous as shared state across threads,
| and (I conjecture) far more pervasive.
|
| In the code in question
| (https://web.archive.org/web/20220113153505/https://web.dev/f...,
| now changed in https://web.dev/file-system-access/#drag-and-drop-
| integratio...):
|
|     elem.addEventListener('drop', async (e) => {
|       // Prevent navigation.
|       e.preventDefault();
|       // Process all of the items.
|       for (const item of e.dataTransfer.items) {
|         // Careful: `kind` will be 'file' for both file
|         // _and_ directory entries.
|         if (item.kind === 'file') {
|           const entry = await item.getAsFileSystemHandle();
|           if (entry.kind === 'directory') {
|             handleDirectoryEntry(entry);
|           } else {
|             handleFileEntry(entry);
|           }
|         }
|       }
|     });
|
| I probably wouldn't have guessed that `e.dataTransfer.items` gets
| cleared at the first await (since I'm not a proficient web
| developer), but I would've been _extremely_ wary of this code in
| general. Additionally (not tied to async-await but race
| conditions in general), is `item.getAsFileSystemHandle()` a
| TOCTTOU vulnerability where the type of an item can change
| between folders and files and symlinks etc., while this code is
| running?
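|
| A minimal sketch of the usual workaround for the snippet above, not
| necessarily what the updated web.dev article does: start every
| `getAsFileSystemHandle()` call synchronously, before the first
| await, and only then wait on the results. `collectDroppedHandles`
| is a hypothetical name; `elem`, `handleDirectoryEntry` and
| `handleFileEntry` are taken from the quoted code.
|
```javascript
// Sketch only: collectDroppedHandles is a hypothetical helper name.
// The loop below runs synchronously when the function is called, so every
// getAsFileSystemHandle() call is issued while the item list is still live.
async function collectDroppedHandles(dataTransfer) {
  const pending = [];
  for (const item of dataTransfer.items) {
    // `kind` is 'file' for both file and directory entries.
    if (item.kind === 'file') {
      pending.push(item.getAsFileSystemHandle());
    }
  }
  // Only now do we yield; the promises already exist at this point.
  const handles = await Promise.all(pending);
  // getAsFileSystemHandle() may resolve to null, so drop those results.
  return handles.filter((handle) => handle !== null);
}

// Browser usage, reusing the names from the quoted snippet:
// elem.addEventListener('drop', async (e) => {
//   e.preventDefault();
//   for (const entry of await collectDroppedHandles(e.dataTransfer)) {
//     if (entry.kind === 'directory') handleDirectoryEntry(entry);
//     else handleFileEntry(entry);
//   }
// });
```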
|
| Rust's & vs. &mut system largely eliminates shared state hazards
| in both threading and asynchrony (&mut is exclusive/unaliased
| and can't be mutated by other threads or event loop jobs, and &
| is difficult and unidiomatic to mutate), though it doesn't solve
| async cancellation errors (https://carllerche.com/2021/06/17/six-
| ways-to-make-async-rus..., discussed at
| https://news.ycombinator.com/item?id=27542504), or filesystem
| TOCTTOU (https://blog.rust-
| lang.org/2022/01/20/cve-2022-21658.html as well as user code).
|
| Qt event loop reentrancy is fun(tm) as well. It looks like a
| blocking call, but spawns a nested event loop which can do
| anything (but rarely enough to lull you into a false sense of
| complacency), resulting in segfaults like
| https://github.com/Nheko-Reborn/nheko/issues/656 (workaround at
| https://github.com/Nheko-Reborn/nheko/commit/570d00b000bd558...,
| I didn't look into it). And Qt lacks "easy" await syntax and a
| framework based on calling red functions (though I didn't look
| into C++20 coroutines yet, perhaps
| https://www.qt.io/blog/asynchronous-apis-in-qt-6 or
| https://github.com/mhogomchungu/tasks or
| https://blog.blackquill.cc/asynchronous-qtquick-uis-and-
| thei...?).
| PaulHoule wrote:
| Other APIs for async file I/O are outright painful.
| mattdesl wrote:
| Nice article! It's a great API, hope to see the kinks ironed out.
|
| I am using it to stream PNG frames in real-time into a user's
| file system, for web-based media tools. This allows the user to
| generate their own HD video files offline at maximum quality,
| bringing static web a little closer to pro level media tooling.
|
| One downside aside from browser support is that two permissions
| must be requested for this: one to select the directory to save
| to, and then again to give write access.
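|
| A hedged sketch of that second step, assuming a Chromium-style
| handle that exposes `queryPermission()` and `requestPermission()`;
| `ensureReadWrite` is a hypothetical helper name, and the directory
| handle is assumed to come from `window.showDirectoryPicker()`.
|
```javascript
// Hypothetical helper: resolves to true when `handle` may be written to.
async function ensureReadWrite(handle) {
  const options = { mode: 'readwrite' };
  // If write access was already granted, no further prompt is needed.
  if ((await handle.queryPermission(options)) === 'granted') {
    return true;
  }
  // Otherwise ask. In a browser this has to run from a user gesture.
  return (await handle.requestPermission(options)) === 'granted';
}

// Browser usage (inside a click handler):
// const dir = await window.showDirectoryPicker(); // first prompt
// if (await ensureReadWrite(dir)) { /* second prompt passed, safe to write */ }
```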
| ohCh6zos wrote:
| Should web pages have this kind of API? We seem to be repeating
| the mistakes of activeX but this time with JavaScript.
| dathinab wrote:
| The gotcha is this (quote):
|
| > Note: What is referred to as the "local file system" in this
| spec, does not have to strictly refer to the file system on the
| local device.
|
| So basically a browser can implement this using a KV-Database
| or similar, nothing requires the browser to actually allow you
| to (file picker) "pick" files in your home directory or similar
| especially given that:
|
| > Note: While user agents will typically implement this by
| persisting the contents of this origin private file system to
| disk, it is not intended that the contents are easily user
| accessible. Similarly there is no expectation that files or
| directories with names matching the names of children of the
| origin private file system exist.
|
| Also
|
| > The origin private file system is a storage endpoint whose
| identifier is "fileSystem", types are << "local" >>, and quota
| is null.
|
| So this "file system" might exist in a complete "parallel
| universe" to your normal file system.
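|
| For illustration, the origin private file system is reached through
| `navigator.storage.getDirectory()` with no picker involved, and
| what it returns need not correspond to anything visible on disk.
| A small sketch, assuming a browser that implements `entries()` on
| directory handles; `listOriginPrivateEntries` is a hypothetical
| helper and `storage` stands in for `navigator.storage`.
|
```javascript
// Hypothetical helper: list what the origin private file system contains.
// `storage` is expected to be navigator.storage in a browser.
async function listOriginPrivateEntries(storage) {
  // Root of the origin private file system; no picker or prompt involved.
  const root = await storage.getDirectory();
  const lines = [];
  // entries() is an async iterator of [name, handle] pairs.
  for await (const [name, handle] of root.entries()) {
    lines.push(`${handle.kind}: ${name}`);
  }
  return lines.sort();
}

// Browser usage:
// console.log(await listOriginPrivateEntries(navigator.storage));
```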
|
| Also, given that this is not a new storage type, it means that if
| your browser is set up to e.g. clear local storage from a
| specific origin if that origin wasn't used for a month, this
| might still apply.
|
| So have fun with your "files" having disappeared after some
| longer holidays (e.g. on Safari, at least the way Apple planned
| to implement it a while ago).
|
| So while it looks like a file system access API, it might end
| up not being one, depending on browser implementation details.
|
| Also, any access goes through a file picker and you can't
| navigate up with "..". This avoids many security problems; add
| in no file links, no fancy operations, etc., and this should not
| be a problem even if it allows access to your files.
|
| Though in the end it depends a lot on the choices the browser
| makes when implementing it.
| dathinab wrote:
| Similarly, the limitations, especially the absence of flush,
| make it kinda useless for running a database in the browser :(
|
| Though there is `persist` on the storage manager, I'm not
| sure how much this helps. Theoretically it could work
| like fsync or F_FULLFSYNC, but practically I'm not sure at
| all.
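|
| For reference, a sketch of how `persist` is typically called. As
| far as I know it only asks the browser not to evict the origin's
| data and makes no promise about flushing writes to disk, so treat
| any fsync-like behaviour as unproven. `requestPersistence` is a
| hypothetical helper; `storage` stands in for `navigator.storage`.
|
```javascript
// Hypothetical helper: ask the browser to keep this origin's storage.
// Resolves to true if storage is (or becomes) persistent, else false.
async function requestPersistence(storage) {
  // Older browsers may not expose the StorageManager API at all.
  if (!storage || typeof storage.persist !== 'function') {
    return false;
  }
  // Already persistent: nothing to ask for.
  if (await storage.persisted()) {
    return true;
  }
  // The browser may grant or refuse, with or without showing a prompt.
  return storage.persist();
}

// Browser usage:
// const kept = await requestPersistence(navigator.storage);
```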
| slaymaker1907 wrote:
| You can likely assume fsync is going on once you close a
| file writer. It's already a super expensive operation, so
| there would be little reason not to sync. The expensive
| part is they copy the entirety of the old file to a temp
| file, let you do changes to the temp file, then copy the
| temp file to the old file on close.
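|
| To make that write path concrete, a minimal sketch of a small
| in-place edit. `patchFile` is a hypothetical helper; the copying
| described above is the browser's doing and is not visible in the
| calls themselves.
|
```javascript
// Hypothetical helper: overwrite `data` at byte `offset` in an existing file.
async function patchFile(fileHandle, offset, data) {
  // keepExistingData: true starts the writable from the current contents
  // instead of an empty file.
  const writable = await fileHandle.createWritable({ keepExistingData: true });
  try {
    await writable.write({ type: 'write', position: offset, data });
  } catch (err) {
    // Discard the pending changes; the original file is left untouched.
    await writable.abort();
    throw err;
  }
  // The changes only reach the real file once close() resolves.
  await writable.close();
}

// Browser usage:
// const [handle] = await window.showOpenFilePicker();
// await patchFile(handle, 16, new Uint8Array([1, 2, 3]));
```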
|
| The implementation creates other problems for DBs though
| since you can't really do small writes efficiently. One
| idea I've had would be to implement a virtual paging
| system, but then you introduce a new layer of abstraction
| and it's still going to be really slow on NTFS (since it
| assumes a few large files, not many small ones).
| slaymaker1907 wrote:
| The origin file system is pretty pointless, but it is only
| one part of the spec. It is definitely intended that the
| local file system (whether a cloud k/v store or not) at least
| allows for moving data between completely different websites.
|
| I think you make a good point about Safari's nonsense with
| browser data. The spec should require implementors to never
| clear out what they use for "local file system" unless the
| user explicitly says to, and only for the files selected for
| deletion. The old APIs like local storage/IndexedDB
| unfortunately assumed no browser vendor would be as dumb as
| Safari with their ridiculously short retention policy.
| spansoa wrote:
| Came here to say this too. Principle of least privilege doesn't
| exist here, even with the browser prompting for access to
| files, people are going to make mistakes and upload their
| entire C: directory by accident. Reminds me of Kazaa where you
| could essentially browse the contents of a person's hard-drive
| because they configured the wrong folder.
| AgentME wrote:
| This API can allow applications that work with files to be
| made as strongly-sandboxed web apps instead of unsandboxed
| applications. If the API didn't exist and the user had to
| download an unsandboxed application to work with a file, then
| the unsandboxed application will get access to their whole C:
| drive without the user needing to make any mistakes.
| vbezhenar wrote:
| It's not possible to grant access to C:
| kaslai wrote:
| I think a "web app" should definitely be able to have some sort
| of more properly integrated filesystem API, but a "web page"
| has no business having access to such a thing. There just
| hasn't been a line drawn in the sand between the two, so every
| "web page" has all the capabilities of a "web app" by default.
|
| Personally I wish there would be a meaningful line drawn
| between the two so that users could have a nice shorthand for
| allowing web pages to "upgrade" into apps which have access to
| things like WebGL and filesystem access. Such a thing would
| only have any meaning to power users and privacy oriented
| people though, and the general trend in browser design has been
| to spurn such users in favor of reducing friction for everyone
| else at all costs.
| RussianCow wrote:
| That line was blurred long ago, and the distinction between
| an "app" and a "page" is irrelevant today. Even news articles
| these days sometimes contain interactive elements like WebGL
| visualizations.
|
| Personally, even though every new API inevitably gets abused,
| I don't think we should throw the baby out with the bathwater
| because there are tons of legitimate uses here. (In fact, I'm
| currently building something as a side project that would
| benefit greatly from these file system APIs.) My own worry is
| about complexity creep--specifically, the number of things
| I'll be expected to know and keep track of as a frontend
| engineer in another 10 years. But that's probably just me
| getting older. :)
| AgentME wrote:
| If web apps are fully sandboxed by default as today, then
| presenting the user a UI for a web page wanting to upgrade to
| a (still sandboxed permissionless) web app seems like a waste
| of the user's attention. Why should the user see a prompt
| just because a webpage wants to do some WebGL visualization
| (that doesn't put any of the user's data at risk)? It seems
| like the perfect recipe to lead to user apathy to permission
| dialogs and users clicking to allow permissions
| automatically, because most of the dialogs are for nothing,
| but then the user may be taught to click through actual
| important dialogs just as automatically. I'm reminded of when
| IE used to warn the user about secure connections.
| bityard wrote:
| > If web apps are fully sandboxed by default
|
| Are they, though?
|
| If they were, then tracking users via third-party cookies
| and other resources wouldn't be possible. Nor would it be
| possible for a web site in my browser to suddenly start
| taking up all of my CPU/RAM due to a programming error or
| malicious site such as a crypto-miner. For the relatively
| little isolation that does happen, sandbox-escape
| vulnerabilities seem to be getting discovered all the time.
|
| Also, as a technical user, I want more control over what
| web sites can do with my computer than a non-technical user
| might.
|
| The more holes you poke in a sandbox, the worse a sandbox
| it is.
| AgentME wrote:
| Third-party cookies seem to be on the way out thankfully.
| I agree that there should be a permission necessary (or
| at least some much better heuristic) for allowing a
| webpage to use too much CPU/memory.
| autoexec wrote:
| > Why should the user see a prompt just because a webpage
| wants to do some WebGL visualization (that doesn't put any
| of the user's data at risk)?
|
| Probably because there's no way to say that it "doesn't put
| any of the user's data at risk". WebGL has been abused for
| browser fingerprinting which itself puts user's privacy at
| risk, but it also has a long history of very nasty
| vulnerabilities and exploits. It's been fully disabled in
| my browser for years because of the security issues.
| bityard wrote:
| Nearly all of the security and privacy problems we have with
| the World Wide Web today exist because it went from a content-
| delivery platform (with deliberately limited interactivity)
| to a fairly complete app-delivery platform.
|
| Javascript isn't the new Java. Web browsers are the new Java.
|
| I would be very much in favor of a way to draw a line between
| "content" on the web and "apps" delivered by the web. I don't
| know what form that would take. But it will probably never
| happen because the FAANGs that run the web these days are
| actively opposed to any way to deliver content over the web
| that doesn't also let them include apps to track your
| activities online.
| zokier wrote:
| > so every "web page" has all the capabilities of a "web app"
| by default
|
| Haven't many (if not most?) major expansions of capabilities
| been behind a permissions prompt, making them not available
| to every web page by default?
| armchairhacker wrote:
| We already have this system.
|
| > foobar.com wants to access your location. [Block] [Allow]
|
| > foobar.com wants to access your camera and microphone.
| [Block] [Allow]
|
| > foobar.com wants to send push notifications. [Block]
| [Allow]
|
| Ideally these prompts are presented above the line of death,
| and clicking "Block" prevents future prompts, so you can't
| get spammed.
|
| Of course users click on these prompts without caring. Of
| course websites may try to unnecessarily block access if you
| don't agree. Of course websites make their own obviously fake
| prompts so you click "block" and then they present the
| obviously fake prompt again just to waste your time.
|
| But users already download and open random files and grant
| them admin privileges, and websites already spam you. The
| current notification system works and extending it to WebGPU
| and file systems is natural.
| dugmartin wrote:
| Another interesting use of the File System Access API
| (https://caniuse.com/?search=File%20System%20Access%20API) is:
|
| https://vscode.dev/
|
| You can grant it access to a directory on your local machine and
| then edit files like you would in the Electron version.
| ShamelessC wrote:
| > Google's documentation and the specification itself contained
| this bug in example code.
|
| Off-topic, but Google's documentation for _everything_ contains
| subtle bugs such as this.
| brundolf wrote:
| I wonder how this API will affect Electron apps once it's
| available there. For some apps the only reason you need non-UI
| code at all is for working with the file system; seems like it
| would be great to be able to consolidate more logic into the UI
| process
| shuntress wrote:
| I find the way we are slowly converging on Chromium-Powered-
| Browser as the standard operating system to be both funny and
| frustrating.
|
| I'm really dissatisfied with Electron apps in general and using
| an operating system designed to run on top of other operating
| systems _feels_ wrong.
|
| But, I do really like it when things portably _just work_.
| slaymaker1907 wrote:
| It's not terrible, but it still has a lot of warts. I wrote a
| saver plugin for Tiddlywiki that uses this API for a self
| modifying HTML file.
| https://slaymaker1907.github.io/tiddlywiki/plugin-library.ht...
|
| One major annoyance is that you can't just show the file picker
| for "security reasons", it has to be user prompted. This is a
| useless precaution because you can just add an onclick handler to
| the document body. It inconveniences non-malware developers while
| hardly troubling malicious sites at all.
|
| Another goodie is that resolving a path is extremely expensive
| with no opportunity for caching from the OS. You have to parse
| paths yourself and individually walk each directory.
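|
| A rough sketch of what that walk looks like, assuming '/'-separated
| relative paths and a browser that reports a file/directory mismatch
| as `TypeMismatchError`, as the spec describes. `resolvePath` is a
| hypothetical helper, not part of the API. Every path segment costs
| one asynchronous call.
|
```javascript
// Hypothetical helper: resolve a '/'-separated relative path against a
// FileSystemDirectoryHandle, one segment at a time.
async function resolvePath(root, path) {
  const parts = path.split('/').filter((part) => part !== '' && part !== '.');
  if (parts.includes('..')) {
    // Directory handles offer no way to move to a parent directory.
    throw new Error("'..' segments are not supported");
  }
  if (parts.length === 0) {
    return root;
  }
  let dir = root;
  // Walk every intermediate directory: one async call per segment.
  for (const name of parts.slice(0, -1)) {
    dir = await dir.getDirectoryHandle(name);
  }
  const last = parts[parts.length - 1];
  try {
    return await dir.getFileHandle(last);
  } catch (err) {
    // The final segment exists but is a directory: fetch it as one.
    if (err.name !== 'TypeMismatchError') throw err;
    return dir.getDirectoryHandle(last);
  }
}

// Browser usage:
// const root = await window.showDirectoryPicker();
// const handle = await resolvePath(root, 'src/lib/index.js');
```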
|
| I've also been working on an equivalent API to the fs module in
| Node lately, but really such an API should have been the #1
| priority example for using this API. It would have immediately
| highlighted how difficult the path problem is. OS/browser engine
| nerds deal with file descriptors, but most applications work with
| paths.
|
| It also would be great if Firefox would stop being so hostile to
| this API. The API has problems, but they aren't inherent to the
| goals of this API. I also find polyfills for this API to be
| ridiculous. You really can't polyfill this in a meaningful way
| for the browser because it is so groundbreaking. Having a non-
| Google implementation of the API would be way more helpful than
| polyfills.
___________________________________________________________________
(page generated 2022-03-21 23:01 UTC)