[HN Gopher] Running PHP fast at the edge with WebAssembly
       ___________________________________________________________________
        
       Running PHP fast at the edge with WebAssembly
        
       Author : ecmm
       Score  : 84 points
       Date   : 2024-05-23 17:09 UTC (5 hours ago)
        
 (HTM) web link (wasmer.io)
 (TXT) w3m dump (wasmer.io)
        
       | Twirrim wrote:
       | Whereby fast is "About half the speed of native PHP"
        
         | syrusakbary wrote:
         | Author here. Faster than it has ever been in the Edge via
         | WebAssembly :)
         | 
         | But you are completely right to point out that there's still
         | some room to improve, especially when compared to running PHP
         | natively.
         | 
         | Right now there's some price to pay for the isolation and
         | sandboxing, but we are working hard to reduce the gap to zero.
         | Stay tuned for more updates on this front!
        
           | stracer wrote:
           | Is this tech meant for developers' needs only, or can
           | regular, already existing PHP websites (e-shops...) somehow
           | take advantage of it as well?
        
             | the_duke wrote:
             | It's meant for direct usage, not just for developers.
             | 
             | You can deploy apps on the Wasmer Edge cloud, and also host
             | things yourself if you want to, though in the latter case
             | the setup will be non-trivial.
        
           | maxloh wrote:
           | Given that we run PHP on the edge, what is the point of
           | running the PHP interpreter on top of a WebAssembly
           | interpreter (Wasmer) instead of just running the PHP
           | interpreter directly?
           | 
           | The latter will always be faster.
        
             | Cyberdog wrote:
             | From what I can tell it's because some of these "edge"
             | service providers will expect you to give them a WASM
             | binary instead of a PHP script.
             | 
             | The other caveats about "edge" throughout this discussion
             | aside, if I needed to do this, I'd try to write something
             | in Zig or (gag) JS or something else that compiles to WASM
             | directly rather than writing a script for an interpreter
             | that runs under WASM.
        
           | mike_d wrote:
           | Hey you should really check out fly.io if you want to run
           | stuff on the edge. They have it pretty much figured out.
           | 
           | FrankenPHP is also really good and probably a much smarter
           | play than trying to get your code running under WebAssembly.
        
         | vundercind wrote:
         | Nb PHP is already _really_ fast. Some major things built on it
         | are slow, mostly due to things like poor data access patterns
         | or architecture, but its culture since the beginning has
         | basically been "get out of PHP and into C as fast as possible"
         | and it shows (this is basically the trick to _any_ scripting
         | language "being fast", PHP just embraced it hard from the very
         | beginning).
         | 
         | If you're on PHP and need more speed _in the language itself_,
         | basically every other scripting language (yes, including Node)
         | is off the table immediately. Lateral move at best.
         | 
         | (All that to say, yeah, I entirely expected that the headline
         | would lead to an article about making PHP slower)
        
           | mschuster91 wrote:
           | > PHP just embraced it hard from the very beginning
           | 
           | Which is also the reason why _a lot_ of the PHP standard
           | library functions are so inconsistent. They're straight
           | wrappers around the C libraries.
           | 
           | Upside is, unless you have to do shit with pointer magic or
           | play around the edges of signed/unsigned numbers, it's fairly
           | easy to port C example code to PHP for any of the myriad
           | things PHP has bindings to.
        
         | pjmlp wrote:
         | I don't know which hype is worse, AI or WebAssembly on the
         | server.
        
           | secondcoming wrote:
           | Is JS on the server still a thing?
        
             | pjmlp wrote:
             | Yes, and to use alternatives there is no need for
             | WebAssembly.
        
       | csomar wrote:
       | This is trying to solve a solved problem with lots of difficult
       | technology that doesn't apply here. Most of the PHP websites are
       | WordPress. The solution to have a speedy WordPress site is to
       | compile it to static HTML. Calls to the server should happen with
       | JavaScript. The server will always remain relevant as WordPress
       | uses a Database and thus the "Edge" makes no sense here.
        
         | jw1224 wrote:
         | Getting WordPress running in WASM was a huge milestone. It was
         | one of the first big PHP/WASM achievements, but was never the
         | end goal, just a proof-of-concept. The target market for this
         | tech is _not_ WordPress bloggers.
        
       | ahofmann wrote:
       | I'm still trying to understand what this does and what's the use
       | case. Is the "edge" a server? The browser? Why should I compile
       | WordPress or Laravel to wasm?
        
         | tredre3 wrote:
         | In this context the edge is a fancy word to say "serverless".
         | It just means that your PHP interpreter will be started on-
         | demand on a node closer to your customer's request.
         | 
         | So if your website receives no requests, it costs you nothing.
         | And requests have less latency for the user.
         | 
         | That's the theory anyway, in my experience reality is a lot
         | more nuanced because the serverless node still has to reach a
         | database and so on.
        
           | pjmlp wrote:
           | PHP nowadays already has a JIT compiler and its own bytecode,
           | no need for WebAssembly.
        
             | jw1224 wrote:
             | This seems like a misunderstanding. WebAssembly has nothing
             | to do with PHP's internal performance mechanisms. WASM is a
             | compilation target: you can take code which needs to be
             | compiled (like PHP's core binary) and compile it to run in
             | a browser.
             | 
             | PHP in WASM means developers can run actual, real, native
             | PHP code _in the user's browser_, without the user needing
             | to have PHP installed locally, or nginx, etc...
        
               | pjmlp wrote:
               | Because I always wanted to run PHP in the browser.
        
               | pavlov wrote:
               | You could do that, but the post here is in fact about
               | running PHP on a server on top of a Wasm runtime.
               | 
               | "At the edge" basically means "close to the user", with
               | the details left as an exercise to whoever is selling you
               | their "edge." In this case, it's a Wasm runtime company.
        
           | aeyes wrote:
           | This was solved 10 years ago with services like AppEngine in
           | GCP and similar products in major cloud platforms, they all
           | scale to zero.
        
             | makeitdouble wrote:
             | In GCP land I'd assume Cloud Run is a closer service.
             | 
             | But yes, for full-blown applications usually talking to a
             | cache and a DB there are way more efficient and performant
             | solutions already.
        
         | jw1224 wrote:
         | The "edge" in this case is the browser (from a pure WASM
         | standpoint, though I see these guys also offer a hosted
         | serverless version too).
         | 
         | For most general-purpose applications, there's no point to
         | WASM. But some apps may run specific functions which take a
         | long time (e.g. bulk/batch processing), and being able to
         | execute those tasks securely _on the client side_ provides
         | immediate feedback and better UX.
         | 
         | That's just one use case. Another is that WASM makes PHP
         | portable, in the sense it can run in any web browser without
         | the need for a back-end server. Lots of potential opportunities
         | for distributing software which runs completely locally.
        
           | inezk wrote:
           | I don't think edge in this case means the browser. I think
           | edge in this case is more similar to a CDN node.
        
             | egeozcan wrote:
             | Yes, the browser is where you land when you fall off the
             | edge :)
        
           | ahofmann wrote:
           | Thank you for explaining. As a web dev who has been using PHP
           | since version 4, I'm still very confused why someone would
           | consider running a CMS like WordPress on the client side, or
           | at the "edge". I guess the good thing here is that someone
           | spends (a lot of) energy in giving PHP new ways to get used
           | by developers.
        
             | makeitdouble wrote:
             | This is what is referred to by "edge":
             | https://en.wikipedia.org/wiki/Edge_computing
        
           | csomar wrote:
           | The browser is not the "edge". The browser is the browser.
           | Running WordPress in the browser makes exactly zero sense.
           | The only exception is if you are running a test instance.
        
             | pacifika wrote:
             | https://wordpress.org/playground/
        
       | TekMol wrote:
       | Let me try to understand this "Edge" thing:
       | 
       |   - The user sends an HTTP request to somesite.com
       |   - Their DNS query for somesite.com gets resolved to some
       |     datacenter near them
       |   - The HTTP request arrives at the datacenter where the PHP in
       |     WebAssembly is executed at half the speed of native PHP
       |   - The PHP in WebAssembly sends database queries to a central
       |     DB server over the internet
       |   - The PHP in WebAssembly templates the data and sends it back
       |     to the user
       | 
       | How is that faster than resolving somesite.com to the central
       | server, sending the HTTP query to the central server where PHP
       | runs at full speed and talks to the DB on the same machine (or
       | over the LAN)? Even if PHP ran at full speed at the "Edge", won't
       | the two requests over the internet
       | 
       |     user --1--> edge PHP server --2--> central DB server
       | 
       | take longer than the one HTTP request when the user connects to
       | the central server directly?
       | 
       |     user --1--> central PHP+DB server
       | 
       | In reality, the PHP script on the "Edge" server probably makes
       | not one but multiple queries to the DB server. Won't that make it
       | super slow compared to having it all happening on one machine or
       | one LAN?
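
The comparison in the comment above can be put into rough numbers. This is only a sketch under stated assumptions: all round-trip times are hypothetical, DB queries are assumed to run one after another, and PHP execution time and connection setup are ignored.

```python
def page_latency_ms(user_rtt_ms: float, db_rtt_ms: float, db_queries: int) -> float:
    """One user round trip plus one sequential round trip per DB query."""
    return user_rtt_ms + db_queries * db_rtt_ms


# Hypothetical round-trip times in milliseconds, chosen only for illustration.
USER_TO_EDGE = 10.0        # user is close to the edge node
USER_TO_CENTRAL = 80.0     # user is far from the central server
EDGE_TO_CENTRAL_DB = 70.0  # edge node queries the central DB over the internet
LAN_DB = 0.5               # central PHP talks to the DB over the LAN


def edge_total(db_queries: int) -> float:
    return page_latency_ms(USER_TO_EDGE, EDGE_TO_CENTRAL_DB, db_queries)


def central_total(db_queries: int) -> float:
    return page_latency_ms(USER_TO_CENTRAL, LAN_DB, db_queries)


if __name__ == "__main__":
    for n in (1, 5):
        print(n, "queries: edge", edge_total(n), "ms, central", central_total(n), "ms")
```

With these made-up numbers the two setups are roughly even at one query, and the edge setup falls behind as soon as a page needs several sequential queries.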
        
         | afavour wrote:
         | The post mentions a cache. I think the key here would be _not_
         | going to the central DB and instead going to a distributed
         | cache. Not an unreasonable concept when I assume the vast
         | majority will be read operations.
        
           | Y-bar wrote:
           | That's OPcache they mentioned, a specific language-level
           | cache native to PHP.
           | 
           | https://www.php.net/manual/en/intro.opcache.php
        
             | afavour wrote:
             | Ah. Point still stands though, distributed storage is what
             | really unlocks this concept.
             | 
             | WordPress is a pretty bad example though, surely you'd just
             | CDN cache the pages!
        
               | iruoy wrote:
               | opcache doesn't cache user data. it's for caching
               | bytecode in memory as long as the php script/server is
               | running
               | 
               | i'd like to see a comparison against a firecracker vm and
               | a docker container
        
         | kevincox wrote:
         | Due to connection setup roundtrips, TCP slow start mechanics
         | and quality of network connections it is usually better to
         | terminate the client's TCP connection close to them. But I
         | agree that moving the frontend rendering near the client
         | doesn't really make sense for almost all cases. So the best
         | general purpose setup would probably look like user - CDN -
         | central PHP+DB.
         | 
         | Of course there are always exceptions. Often stale data is ok
         | and you can ship data snapshots to the edge so that they can be
         | served without going back to the central DB. But in many cases
         | some basic cache policies with a CDN can be nearly as
         | effective.
        
           | TekMol wrote:
           | What I usually do is that I use two hostnames. host1 for HTTP
           | requests which can be cached. That host is behind a CDN.
           | host2 for HTTP requests which can't be cached. That host
           | points directly to my server.
           | 
           | Are you saying it would result in a better user experience
           | when I only use host1 which is behind the CDN and add no-
           | cache headers to the request that can't be cached?
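
A minimal sketch of the single-hostname variant asked about above, assuming the CDN honors the origin's `Cache-Control` response header. The `cache_headers` helper and the 300-second lifetime are hypothetical.

```python
def cache_headers(cacheable: bool, max_age_s: int = 300) -> dict:
    """Pick a Cache-Control header for a response served through the CDN."""
    if cacheable:
        # Shared caches such as a CDN may store and reuse this response.
        return {"Cache-Control": f"public, max-age={max_age_s}"}
    # The response must not be stored by any cache and is always
    # fetched from the origin.
    return {"Cache-Control": "no-store"}


if __name__ == "__main__":
    print(cache_headers(True))
    print(cache_headers(False))
```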
        
             | kevincox wrote:
             | It's complicated but typically yes. The simplest reason is
             | that TCP+TLS handshakes require multiple round trips for a
             | fresh connection. The CDN can maintain a persistent
             | connection to the backend that is shared across users. It
             | is also likely that the CDN to backend connection goes over
             | a better connection than the user to backend connection
             | would.
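
The round-trip argument above can be sketched as follows. The RTT values are hypothetical; the model counts one round trip for the TCP handshake, one for a full TLS 1.3 handshake (TLS 1.2 needs two), and one for the request itself.

```python
def fresh_request_ms(rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Request over a new connection: TCP handshake + TLS handshake + request."""
    return rtt_ms * (1 + tls_round_trips + 1)


def warm_request_ms(rtt_ms: float) -> float:
    """Request over an already established connection: one round trip."""
    return rtt_ms


# Hypothetical round-trip times in milliseconds.
USER_TO_ORIGIN = 100.0
USER_TO_CDN = 10.0
CDN_TO_ORIGIN = 90.0

# User connects straight to the origin over a fresh connection.
direct = fresh_request_ms(USER_TO_ORIGIN)

# User handshakes with the nearby CDN; the CDN reuses a warm backend connection.
via_cdn_warm = fresh_request_ms(USER_TO_CDN) + warm_request_ms(CDN_TO_ORIGIN)

# Same path, but the CDN has to open a new backend connection for the request.
via_cdn_cold = fresh_request_ms(USER_TO_CDN) + fresh_request_ms(CDN_TO_ORIGIN)

if __name__ == "__main__":
    print(direct, via_cdn_warm, via_cdn_cold)
```

In this toy model the CDN path only wins when the backend connection is actually reused; with a fresh backend connection per request the advantage disappears.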
        
               | TekMol wrote:
               | Very interesting, thanks. That would make my setup even
               | simpler.
        
               | e1g wrote:
               | > The CDN can maintain a persistent connection to the
               | backend that is shared across users
               | 
               | We considered using Cloudflare Workers as a reverse
               | proxy, and I did extensive testing of this (very
               | reasonable) assumption. Turns out that when calling back
               | to the origin from the edge, CF Workers established a new
               | connection almost every time, and so had to pay the
               | penalty of the TCP and TLS handshake on every request.
               | That killed any performance gains, and was a deal breaker
               | for us. It's rather difficult to predict or monitor
               | network/routing behavior when running on the edge.
        
               | bluk wrote:
               | Were you using a Cloudflare tunnel for your origin?
        
               | kentonv wrote:
               | This didn't sound right to me so I did some investigation
               | and I think I found a bug.
               | 
               | Keep in mind that Cloudflare is a complex stack of
               | proxies. When a worker performs a fetch(), that request
               | has to pass through a few machines on Cloudflare's
               | network before it can actually go to origin. E.g. to
               | implement caching we need to go to the appropriate cache
               | machine, and then to try to reuse connections we need to
               | go to the appropriate egress machine. Point is, the
               | connection to origin isn't literally coming from the
               | machine that called fetch().
               | 
               | So if you call fetch() twice in a row, to the same
               | hostname, does it reuse a connection? If everything were
               | on a single machine, you'd expect so, yes! But in this
               | complex proxy stack, stuff has to happen correctly for
               | those two requests to end up back on the same machine at
               | the other end in order to use the same connection.
               | 
               | Well, it looks like the heuristics involved here aren't
               | currently handling Workers requests the way they should.
               | They are designed more around regular CDN requests
               | (Workers shares the same egress path that regular non-
               | Workers CDN requests use). In the standard CDN use case
               | where you get a request from a user, possibly rewrite it
               | in a Worker, then forward it to origin, you should be
               | seeing connection reuse.
               | 
               | But, it looks like if you have a Worker that performs
               | multiple fetch() requests to origin (e.g. not forwarding
               | the user's requests, but making some API requests or
               | something)... we're not hashing things correctly so that
               | those fetches land on the same egress machine. So... you
               | won't get connection reuse, unless of course you have
               | enough traffic to light up all the egress machines.
               | 
               | I'm face-palming a bit here, and wondering why there
               | hasn't been more noise about this. We'll fix it. Talk
               | about low-hanging fruit...
               | 
               | (I'm the tech lead for Cloudflare Workers.)
               | 
               | (On a side note, enabling Argo Smart Routing will greatly
               | increase the rate of connection reuse in general, even
               | for traffic distributed around the world, as it causes
               | requests to be routed within Cloudflare's network to the
               | location closest to your origin. Also, even if the origin
               | connections aren't reused, the RTT from Cloudflare to
               | origin becomes much shorter, so connection setup becomes
               | much less expensive. However, this is a paid feature.)
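
A toy model of the effect described above. It is not Cloudflare's routing logic: `EgressPool`, `route_by_hash`, and the round-robin fallback are hypothetical stand-ins that only show why requests must land on the same egress machine before a connection can be reused.

```python
import zlib


class EgressPool:
    """Each egress machine keeps its own set of open origin connections."""

    def __init__(self, machines: int):
        self.open = [set() for _ in range(machines)]
        self.new_connections = 0
        self.reused = 0

    def fetch(self, machine: int, hostname: str) -> None:
        if hostname in self.open[machine]:
            self.reused += 1
        else:
            # No connection on this machine yet: pay for a new handshake.
            self.open[machine].add(hostname)
            self.new_connections += 1


def route_by_hash(hostname: str, machines: int) -> int:
    """Stable choice: the same hostname always maps to the same machine."""
    return zlib.crc32(hostname.encode()) % machines


def simulate(requests: int, machines: int, hashed: bool) -> EgressPool:
    pool = EgressPool(machines)
    host = "origin.example"  # placeholder hostname
    for i in range(requests):
        # Without consistent hashing, requests are spread over all machines
        # (round-robin here, as a simple stand-in).
        machine = route_by_hash(host, machines) if hashed else i % machines
        pool.fetch(machine, host)
    return pool


if __name__ == "__main__":
    for hashed in (True, False):
        pool = simulate(10, 8, hashed)
        print(hashed, pool.new_connections, pool.reused)
```

With spread-out routing, reuse only starts once every machine already holds a connection, which corresponds to the "enough traffic to light up all the egress machines" remark.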
        
             | mschuster91 wrote:
             | > Are you saying it would result in a better user
             | experience when I only use host1 which is behind the CDN
             | and add no-cache headers to the request that can't be
             | cached?
             | 
             | Yes, because that way you can leverage the CDN to defend
             | against DDoS issues, and you can firewall the origin server
             | itself so only the CDN is allowed to communicate with it,
             | but no one else.
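
The "only the CDN may talk to the origin" rule can be sketched as an allowlist check. The ranges below are hypothetical placeholders taken from the RFC 5737 documentation blocks; a real setup would use the CDN's published ranges and would normally enforce this in the firewall rather than in application code.

```python
import ipaddress

# Hypothetical CDN egress ranges (documentation address blocks, not real ones).
CDN_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def allowed(client_ip: str) -> bool:
    """Accept a connection only if it comes from one of the CDN ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in CDN_RANGES)


if __name__ == "__main__":
    print(allowed("192.0.2.10"))   # inside the first placeholder range
    print(allowed("203.0.113.5"))  # outside both placeholder ranges
```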
        
         | the_duke wrote:
         | That's a valid concern, and with a classical centralized DB the
         | Edge solution will not be faster in many cases, especially
         | considering how many DB requests some CMSs like Drupal or
         | WordPress need for each page load...
         | 
         | Some other providers have tried to solve the problem with
         | replicated databases that are used for local reads.
         | 
         | I can't go into details yet, but Wasmer will offer a solution
         | for this problem quite soon as well, which hopefully will be
         | much more seamless than existing strategies.
         | 
         | (note: I work at Wasmer)
        
           | lovasoa wrote:
           | https://electric-sql.com/ ?
        
         | meekaaku wrote:
         | Isn't this true of everything that runs on the edge?
         | 
         | Unless you host the database in the same network as your edge
         | provider, application <-> database is where the bottleneck is.
        
         | slaymaker1907 wrote:
         | You're assuming that network latencies follow the triangle
         | inequality, i.e., that A->C is smaller than A->B + B->C.
         | However, that breaks down because of money. It's possible that
         | the user's ISP is a cheapskate without good peering agreements
         | such that while A->B is fast, they haven't set up the necessary
         | agreements to make A->C fast.
         | 
         | I've personally seen this with a game recently and my own ISP.
         | I can ping their West Coast servers in 20-40ms while pinging
         | their Chicago servers takes 200ms+.
         | 
         | This is actually one huge flaw with the current way IP is done.
         | Because your ISP is closely tied with the last-mile of routing,
         | you have very little control over optimizing those backend
         | layers of routing.
         | 
         | This doesn't explain why you wouldn't just run a thin proxy at
         | the edge and instead want to run a full PHP app. For that, my
         | guess is this makes sense when you don't want to run a full
         | server all the time or if you want it to be rapidly scalable.
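
A small worked example of the point about the triangle inequality. The 200 ms direct figure and the 30 ms first hop echo the ping times mentioned in the comment; the 50 ms second hop between the two server locations is an assumption.

```python
def relay_is_faster(direct_ms: float, first_hop_ms: float, second_hop_ms: float) -> bool:
    """True when going A -> B -> C beats going A -> C directly."""
    return first_hop_ms + second_hop_ms < direct_ms


DIRECT_A_TO_C = 200.0  # user to the far server over a poorly peered path
A_TO_B = 30.0          # user to the nearby server
B_TO_C = 50.0          # assumed latency between the two servers

if __name__ == "__main__":
    print(relay_is_faster(DIRECT_A_TO_C, A_TO_B, B_TO_C))
```

If latencies obeyed the triangle inequality this check could never come out true; with bad peering on the direct path it can.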
        
           | paulddraper wrote:
           | Okay....but that's you providing a backbone for the user.
           | Don't you just use a CDN at that point? (One that allows non-
           | cached requests as well.)
        
         | eknkc wrote:
         | I guess if you have a use case with highly cacheable data on
         | edge, it might work great.
         | 
         | For example, a form builder like typeform could cache form
         | definitions on edge and render them close to users. Submissions
         | would require db communication but the entire experience could
         | be better.
         | 
         | Otherwise it is just not worth it.
         | 
         | BTW, I don't even think PHP running slower in wasm would be
         | that important. These things generally depend on IO performance
         | rather than runtime if you are not doing a lot of calculations
         | on the edge. WASM is also pretty fast these days so..
        
       | moomoo11 wrote:
       | Can someone ELI5 what "edge" computing means?
       | 
       | The way I understand it is that it is moving some operations
       | closer to the client to avoid bandwidth costs and improve
       | performance.
       | 
       | I thought of the Tesla car computer as edge computing, as it does
       | a lot of processing within the car that would otherwise add
       | latency and reliance on an internet connection.
       | 
       | But for web browsers? Going to some websites?
       | 
       | What sort of apps need this functionality?
       | 
       | Seems like over-engineering, so I'm looking for someone to
       | explain it to me.
        
         | Joel_Mckay wrote:
         | Hi, traditionally for our purposes it solved a few problems.
         | 
         | 1. latency
         | 
         | 2. intermittent access
         | 
         | 3. distributed "meaningful" data preparation/filters
         | 
         | One may consider routers with squid proxies, VoIP trunks, and
         | p2p caches to be essentially similar "edge" technologies.
         | 
         | There are additional use-cases, but we don't want to educate
         | the lamers stealing resumes off jobs sites... having no clue
         | what they are talking about.
         | 
         | Have a nice day, =)
        
       | flemhans wrote:
       | Can i do it on my own hardware?
        
       | maxloh wrote:
       | Given that we run PHP on the edge, what is the point of running
       | the PHP interpreter on top of a WebAssembly interpreter (Wasmer)
       | instead of just running the PHP interpreter directly?
       | 
       | The latter will always be faster.
        
         | lxgr wrote:
         | One thing WASM runtimes usually do really well is sandboxing.
         | 
         | Various interpreters might or might not have a good
         | capability/permissioning model (Java's is capable but complex
         | and not supported by many applications, for example); even if
         | they do, there might be exploitable bugs in the interpreter
         | itself.
        
           | tambourine_man wrote:
           | But we already have containers and VMs are cheap.
           | 
           | I find WASM interesting from a technical perspective, but not
           | from a practical one.
        
             | lxgr wrote:
             | "Containers are not a sandboxing mechanism", I hear
             | reasonably often (although that seems surmountable at least
             | in theory?).
             | 
             | VMs are cheap, but not "let's run thousands of them on 'the
             | edge' in case we get a request for any of them!" cheap.
        
             | kennethallen wrote:
             | When Cloudflare Workers launched, they said V8 isolates had
             | some great properties for serverless-style compute:
             | 
             | 5 ms cold starts vs 500+ ms for containers
             | 
             | 3 MB memory vs 35 MB for a similar container
             | 
             | No context switch between different tenants' code
             | 
             | No virtualization overhead
             | 
             | I'm sure these numbers would be different today, for
             | instance with Firecracker, but there's probably still a
             | memory and/or cold start advantage to V8 isolates.
             | 
             | https://blog.cloudflare.com/cloud-computing-without-
             | containe...
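
The figures quoted above translate into a rough density comparison. The per-instance numbers come from the comment; the 1024 MB memory budget is an arbitrary assumption, and real deployments have overheads this arithmetic ignores.

```python
MEMORY_BUDGET_MB = 1024  # hypothetical memory set aside for tenant code

ISOLATE_MB = 3
CONTAINER_MB = 35
ISOLATE_COLD_START_MS = 5
CONTAINER_COLD_START_MS = 500

# How many idle tenants fit into the same memory budget.
isolates_per_budget = MEMORY_BUDGET_MB // ISOLATE_MB
containers_per_budget = MEMORY_BUDGET_MB // CONTAINER_MB

# How much longer a container cold start takes than an isolate cold start.
cold_start_ratio = CONTAINER_COLD_START_MS / ISOLATE_COLD_START_MS

if __name__ == "__main__":
    print(isolates_per_budget, containers_per_budget, cold_start_ratio)
```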
        
       | choutianxius wrote:
       | I think this will face the same problems as Next.js edge runtime:
       | your database cannot be moved to the edge
        
       | mike_d wrote:
       | Please stop calling it "at the edge." You have seven locations
       | all in highly developed Equinix datacenters.
       | 
       | Edge means getting embedded into ISP networks, cell towers,
       | smaller metros, etc.
        
       | irq-1 wrote:
       | https://news.ycombinator.com/item?id=38829557#38834787
       | 
       | >> Ask HN: What are your predictions for 2024?
       | 
       | > Server-side WASM takes off with the re-implementation of PHP,
       | Ruby/Rails, Python, and others, and a WASM based virtual server
       | (shell, filesystem, web server, etc..) Cost more but has better
       | security for both the host and user.
       | 
       | Guess I was wrong about it costing more?
       | 
       | > ... we can run PHP safely without the overhead of OS or
       | hardware virtualization.
       | 
       | But it only runs at half the speed of PHP, so you need more
       | resources.
        
       | vander_elst wrote:
       | Trying to better understand the solution. Why isn't process
       | isolation (nsjail, cgroups, docker...) enough to restrict the
       | application, so that wasm is needed instead?
        
       | tambourine_man wrote:
       | This looks interesting, but as feedback, I found the copy a bit
       | repetitive and lacking substance.
        
       ___________________________________________________________________
       (page generated 2024-05-23 23:01 UTC)