[HN Gopher] Web Browser Engineering
___________________________________________________________________
Web Browser Engineering
Author : djoldman
Score : 331 points
Date : 2021-10-17 17:33 UTC (2 days ago)
(HTM) web link (browser.engineering)
(TXT) w3m dump (browser.engineering)
| butz wrote:
| Neat, but what about integrating Widevine support?
| simonw wrote:
| I was interested to see that this uses the DukPy wrapper around
| Duktape for the JavaScript interpreter:
| https://browser.engineering/scripts.html
|
| This made me start digging into whether this was considered a
| "safe" way of executing untrusted JavaScript in a sandbox.
|
| It's not completely clear to me if DukPy currently attempts safe
| evaluation - it's missing options for setting time or memory
| limits on executed code for example:
| https://github.com/amol-/dukpy
|
| There's a QuickJS Python wrapper here which offers those limits:
| https://github.com/PetterS/quickjs
|
| I'm pretty paranoid though any time it comes to security and
| dependencies written in C, so I'd love to see a Python wrapper
| around a JavaScript engine that has safe sandbox execution as a
| key goal plus an extensive track record to back it up!
| ameliaquining wrote:
| If you want battle-hardened, I figure you can't do better than
| V8. Here's a Python wrapper that I've poked at a bit (it's not
| quite 100% feature-complete but it seems to essentially work):
| https://github.com/sqreen/PyMiniRacer
| simonw wrote:
| That looks really good - especially how they've managed to
| bundle a pre-compiled v8 into a 4MB Python wheel:
| https://blog.sqreen.com/embedding-javascript-into-python/
|
| The time limit and memory limit support looks good too:
| https://github.com/sqreen/PyMiniRacer/blob/f7b9da0d4987ca7d1...
| devwastaken wrote:
| I don't see any specific claims on isolation/memory safety or
| safety in general on Duktape's page. Both V8 and SpiderMonkey
| actively fix new JS vulnerabilities, and V8 isolates are used
| successfully in the wild. Cloudflare Workers is an example.
| ngai_aku wrote:
| Awesome to see this here! The course that accompanied this
| textbook was among my favorites
| eatonphil wrote:
| I can't wait for Browser Engineering to show up as a university
| course a la compilers, operating systems, networks, etc.
| amelius wrote:
| A browser is basically an operating system inside an operating
| system.
|
| The funny thing is we can have the MINIX microkernel
| discussions all over again :)
| gnull wrote:
| Looking at the book's table of contents, I disagree. Browsers
| may resemble OSes by the size of the code base or by the
| amount of optimization involved, or by their importance for
| the modern world, but not by the types of technologies
| involved. I doubt the Browsers course will intersect a lot
| with an OSes course.
|
| EDIT: Reading your comment again I suspect you might have been
| joking :)
| groby_b wrote:
| Which part of OSs do you expect not to be covered?
|
| There's IPC. There's memory management. There's process
| management. There's network management. There's security.
| There's device management.
|
| They all happen at a slightly higher layer, but they all
| exist similar to an OS. (I'm not sure if the higher layer
| makes it easier or harder to understand - but in terms of
| what you need to know, an OS class or three is definitely
| helpful)
| gnull wrote:
| > IPC, memory management, process management, network
| management.
|
| I imagine the non-trivial parts of these are done for the
| JS VM (correct me if I'm wrong), and therefore a VM
| design course would have more intersection with Browsers
| with respect to these disciplines than an OS course.
|
| > security
|
| This one is everywhere, it has no special connection to
| OSes.
|
| > They all happen at a slightly higher layer
|
| Slightly?! That's an understatement of the week! The
| difference in abstraction levels is huge here and the
| specifics of the two levels are very, very different.
|
| > in terms of what you need to know, an OS class or three
| is definitely helpful
|
| Sure. But I think it's as useful as any systems
| programming course. I can agree that systems programming
| is a good preliminary for both Browsers and OSes, and
| learning either of the two will teach you a good deal
| about systems programming, but I doubt they will repeat
| each other.
| groby_b wrote:
| These are all part of the browser outside the VM too.
|
| Multi-process architecture requires you to think deeply
| about IPC.
|
| Memory management is all over the place - there isn't a
| browser without custom allocators, investment into GC,
| etc.
|
| Process management -> see multi-process architecture.
|
| Network management: Browsers need to handle a tremendous
| amount of network issues. I mean... that's what they do.
| Outside of the VM, too.
|
| As for "security is everywhere" - the whole point of an
| OS (and a browser) is to make it possible for security to
| be everywhere. To provide the primitives that you can
| securely build on.
|
| > Slightly?! That's an understatement of the week!
|
| Not really, no. I've worked on embedded networking
| stacks, on full-fledged OSs, and on browsers. I stand by
| "slightly". Yes, granted, a browser doesn't get quite as
| bit-fiddly as an on-the-metal OS, but it's a matter of
| degree, not kind.
|
| Can you work on many areas of a browser without ever
| touching OS-like code? Absolutely. This particular book
| has a good chance of avoiding most, because it focuses on
| the rendering part.
|
| But a browser, as a whole, provides an abstracted
| platform just like an OS. And it echoes many concepts, if
| in slightly different forms.
| chrishtr wrote:
| Author here.
|
| That's exactly what we are hoping for.
|
| http://browser.engineering/preface.html
|
| So far, my co-author Pavel has taught from this book multiple
| times (including this semester). In the spring at least one
| other university will offer a course. We'll list all known
| course offerings on the website.
|
| Also, if anyone would like to teach from this book, please get
| in touch!
| varispeed wrote:
| Many years ago, probably 20, I went on a task of implementing a
| web browser. I remember I gave up at rendering tables. I couldn't
| wrap my head around how to properly size them. It quickly became
| extremely complex to address edge cases, and I eventually
| gave up when I couldn't understand what was going on after a
| two-week break. Probably if I had money and was able to commit
| full time I could eventually get it, but I had to focus on
| commercial work and putting food on my table.
|
| edit: it's a great article! But nothing on rendering tables :-)
| dmitriid wrote:
| > Many years ago, probably 20, I went on a task of implementing
| a web browser. I remember I gave up at rendering tables.
|
| HTML 5 effort has cleaned up a lot of behaviors and specified
| how browser tags should behave. So it is, possibly, an
| approachable task now. Still daunting though.
| pavpanchekha wrote:
| Tables are still not that clean! But luckily tables are way
| way less important than they were in the past, so much so
| that browser differences in table rendering leave most pages
| readable.
| baybal2 wrote:
| It's now nearly impossible to build a web browser from scratch
| because of runaway explosion of web browser features, and
| proprietary API extensions.
|
| W3C here is unfortunately part of the problem.
|
| Standardisation is good, but letting Google pour streams of
| half-assedly written RFCs onto other browser makers is not good.
|
| Non-enforcement of standards is also bad, and it's bad to extend
| W3C privileges to companies who themselves selectively implement
| their own proposals, so others' browsers can't match their
| behaviour.
| jahewson wrote:
| Actually a from-scratch web browser is being built:
|
| https://www.fastcompany.com/90611677/flow-ekioh-web-browser-...
|
| Also you should take a look at the WHATWG because it's far more
| relevant than the W3C nowadays.
| dmitriid wrote:
| "The total word count of the W3C specification catalogue is
| 114 million words at the time of writing. If you added the
| combined word counts of the C11, C++17, UEFI, USB 3.2, and
| POSIX specifications, all 8,754 published RFCs, and the
| combined word counts of everything on Wikipedia's list of
| longest novels, you would be 12 million words short of the
| W3C specifications"
|
| https://drewdevault.com/2020/03/18/Reckless-limitless-scope....
|
| No idea how Flow does it, but building a browser is nearly
| impossible.
| fabrice_d wrote:
| It's well known that Drew Devault count is meaningless
| since it includes dupes, drafts, and unrelated specs.
| Still, the space to cover for a from scratch browser is
| huge.
|
| Flow didn't start "from scratch" recently, it's an
| evolution of a primarily SVG+CSS renderer for set top
| boxes. They also re-use SpiderMonkey as their JavaScript
| engine.
| dmitriid wrote:
| > It's well known that Drew Devault count is meaningless
| since it includes dupes, drafts, and unrelated specs.
|
| It's not meaningless. Because in order to implement a
| browser, you have to figure out which of them are dupes,
| deprecated, drafts etc.
|
| And even that won't help you. Because a huge amount of
| "deprecated" standards are in the browsers. A huge amount
| of stuff in the browsers is still at the "community
| draft" stage, and yes, you have to implement that, too.
|
| Microsoft simply gave up, forked Chromium... And they
| still can't keep up:
| https://web-confluence.appspot.com/#!/confluence
| fabrice_d wrote:
| The Edge graph stops before they switched to Chromium
| though...
| dmitriid wrote:
| Yes, that was my mistake. Doesn't make my words any less
| true.
| carapace wrote:
| > it includes dupes, drafts, and unrelated specs.
|
| Even if it's overblown by, say, three times, that's still
| over thirty million words.
| fabrice_d wrote:
| Note that a large effort has been made to make specs more
| precise, to be easier to implement in an interoperable
| way.
|
| That contributes to "word bloat", but it's not
| necessarily a bad thing. Picking the right metrics is not
| always that easy!
| dmitriid wrote:
| > Note that a large effort has been made to make specs
| more precise, to be easier to implement in an
| interoperable way
|
| This is true for HTML5 which defined full browser
| behaviour, including things like improperly closed and
| improperly nested tags.
|
| Many, many other specs? Not so much. Especially the crap
| that Chrome has been pumping out the past several years.
| carapace wrote:
| Flow browser isn't FOSS.
|
| > WHATWG [is] far more relevant than the W3C nowadays.
|
| Which is arguably part of the problem.
| kizer wrote:
| Love first principles stuff. Great job.
| ofou wrote:
| Does this build a web browser, like the build-your-own-x movement?
| pavpanchekha wrote:
| Author here--the browser the book works through is, uhh, pretty
| limited, so I don't imagine you'd want to actually use it for
| web browsing. It's more like writing your own toy compiler or
| operating system, to learn how they work.
| ofou wrote:
| It's perfect then! I'll read and work the exercises.
|
| Thanks for the answer!
| jturpin wrote:
| This is awesome. I've always wanted to know how the actual layout
| portion works (or at least, can work in a simple way). I think
| these kinds of resources are really valuable and people should be
| empowered to make bespoke-ish web renderers as the need arises.
| pavpanchekha wrote:
| Author here. Layout is my favorite part of the browser--my
| dissertation is largely about formalizing browser layout--and
| so far the book only covers the basics, like the layout tree
| and how layout is computed with multiple tree traversals, but
| even understanding those basics I think gives you a huge
| advantage when thinking about web page layout tasks.
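|
| The idea of computing layout with multiple tree traversals can be
| sketched roughly as follows. This is a minimal illustration, not the
| book's code: the LayoutNode class, its fields, and the three helper
| functions are hypothetical names, and it assumes every node is a
| block that fills its parent's width and stacks its children
| vertically.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LayoutNode:
    """Hypothetical layout-tree node: a block that stacks children vertically."""
    own_height: int = 0  # height of this node's own content (assumed known)
    children: List["LayoutNode"] = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0


def assign_widths(node, x, width):
    # Traversal 1 (top-down): widths flow from parent to child.
    node.x, node.width = x, width
    for child in node.children:
        assign_widths(child, x, width)


def compute_heights(node):
    # Traversal 2 (bottom-up): a node's height depends on its children's heights.
    node.height = node.own_height + sum(compute_heights(c) for c in node.children)
    return node.height


def assign_positions(node, y):
    # Traversal 3 (top-down): place each child below the previous sibling.
    node.y = y
    cursor = y + node.own_height
    for child in node.children:
        assign_positions(child, cursor)
        cursor += child.height


def layout(root, viewport_width):
    assign_widths(root, 0, viewport_width)
    compute_heights(root)
    assign_positions(root, 0)
    return root
```

| Splitting the work this way reflects the data dependencies: widths
| are inherited downward, heights accumulate upward, and positions can
| only be assigned once the heights are known.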
| game_the0ry wrote:
| As a front end developer, I am really happy to see resources like
| this.
|
| Developing for the browser is a real challenge. I think working
| with HTML / CSS / JS has been a neglected skill for a long time -
| most software engineers look down on that type of work and it's
| rarely covered in comp sci course work.
|
| Still, it's good to see a lot of progress has been made, this book
| included.
|
| My only critique - why use Python instead of Node.js?
| pavpanchekha wrote:
| Author here. I wrote up my answer here:
| http://browser.engineering/blog/why-python.html
|
| Basically: server-side JavaScript is just not as widely known
| as Python, and it'd be additionally confusing when our browser
| starts running JavaScript. And in-browser JavaScript is a bit
| too restricted (by things like the same-origin policy) to do
| the whole thing inside a browser.
| gnull wrote:
| Great blog design, btw! I love the way it displays footnotes
| (if you can call them that now) on page margins.
|
| Does it have RSS?
| pavpanchekha wrote:
| Yep! You can point your RSS reader at
| https://browser.engineering/rss.xml
| game_the0ry wrote:
| I see. I tried searching the table of contents as to "why
| python" and could not find it, but that link does more than
| enough to explain the "why."
|
| I am resisting the urge to disagree, but since you did
| (literally) write a book about building a browser, I will
| defer to your expertise and try to learn from you :)
| bogomipz wrote:
| This looks fantastic! Really excited to see this. I'm looking
| forward to reading the updates as well. Cheers.
| tablespoon wrote:
| > Developing for the browser is a real challenge. I think
| working with HTML / CSS / JS has been a neglected skill for a
| long time - most software engineers look down on that type of
| work and it's rarely covered in comp sci course work.
|
| IMHO, those are job skills and not comp sci topics. They
| shouldn't be part of a degree program (except the most
| superficial treatment required to get some ugly UI up that may
| be required for something else). You have your whole career to
| pick them up.
| dgb23 wrote:
| I agree, JS _the language_ would be a more "obvious" choice for
| this, since it is both generally much faster and more popular
| in web development. I assume they are using Python _the
| ecosystem_ here. It probably comes with packages better suited
| for rendering specifically?
|
| I don't think either is a super compelling choice anyways for
| this type of work. I think you want to use a systems language
| here. However Python is completely fine as a teaching language.
| Pretty much anyone who knows a similarly structured language
| can read it. And there is very little noise. So it can serve as
| a good reference if you want to follow along with a different
| language.
| traverseda wrote:
| I think writing this in javascript would add some confusion,
| as people learning about this might have a hard time
| understanding exactly where the javascript is running.
| richm44 wrote:
| This video is good background on KHTML/WebKit, on which Chrome
| was based: https://www.youtube.com/watch?v=Tldf1rT0Rn0
| harry-wood wrote:
| Building a browser as a desktop application would be quite hard,
| but I reckon I could do it as a web application.
| posedge wrote:
| Maybe we should have a SaaS browser.
| amelius wrote:
| Is this written by actual web browser engineers? If so, what
| fields did they specialize in?
| bla3 wrote:
| https://mobile.twitter.com/chrishtr says "Rendering lead for
| Chrome", so at least one of the authors seems to do this
| professionally.
| imajoredinecon wrote:
| More about his team's work:
| https://www.chromium.org/teams/rendering
| chrishtr wrote:
| Author here.
|
| I'm the rendering lead for Chrome, and know quite a lot about
| how it works. I also recently wrote a series of articles about
| the new rendering architecture of Chromium, see here:
|
| https://developer.chrome.com/blog/renderingng/
|
| Pavel is a professor at the University of Utah and has
| extensively studied CSS from an academic point of view. He also
| has a lot of experience teaching the material and making it
| accessible to students.
| tester756 wrote:
| I recommend this:
|
| https://www.html5rocks.com/en/tutorials/internals/howbrowser...
| chrishtr wrote:
| Author here.
|
| That article, along with a number of other resources, is
| listed here:
|
| https://browser.engineering/bibliography.html
|
| In my view, a critical part of really learning how something as
| complicated as a browser works is by trying to build it
| yourself. That's why our book is oriented around building a
| browser as you go.
| bogomipz wrote:
| This also looks great. Might you or anyone else know where to
| find the accompanying video? I get "Sorry This video does not
| exist." for the embedded video link.
| a_c wrote:
| Love the bibliography section. I have always wanted to
| reinterpret HTML into other representations. These resources give
| me a good reference.
___________________________________________________________________
(page generated 2021-10-19 23:01 UTC)