[HN Gopher] Running PHP fast at the edge with WebAssembly
___________________________________________________________________
Running PHP fast at the edge with WebAssembly
Author : ecmm
Score : 84 points
Date : 2024-05-23 17:09 UTC (5 hours ago)
(HTM) web link (wasmer.io)
(TXT) w3m dump (wasmer.io)
| Twirrim wrote:
| Where "fast" is "about half the speed of native PHP".
| syrusakbary wrote:
| Author here. Faster than it has ever been on the Edge via
| WebAssembly :)
|
| But you are completely right to point out that there's still
| some room for improvement, especially when compared to running
| PHP natively.
|
| Right now there's some price to pay for the isolation and
| sandboxing, but we are working hard to reduce the gap to zero.
| Stay tuned for more updates on this front!
| stracer wrote:
| Is this tech meant for developers' needs only, or can
| regular, already existing PHP websites (e-shops...) somehow
| take advantage of it as well?
| the_duke wrote:
| It's meant for direct usage, not just for developers.
|
| You can deploy apps on the Wasmer Edge cloud, and also host
| things yourself if you want to, though in the latter case the
| setup will be non-trivial.
| maxloh wrote:
| Given that we run PHP on the edge, what is the point of
| running the PHP interpreter on top of a WebAssembly
| interpreter (Wasmer) instead of just running the PHP
| interpreter directly?
|
| The latter will always be faster.
| Cyberdog wrote:
| From what I can tell it's because some of these "edge"
| service providers will expect you to give them a WASM
| binary instead of a PHP script.
|
| The other caveats about "edge" throughout this discussion
| aside, if I needed to do this, I'd try to write something
| in Zig or (gag) JS or something else that compiles to WASM
| directly rather than writing a script for an interpreter
| that runs under WASM.
| mike_d wrote:
| Hey you should really check out fly.io if you want to run
| stuff on the edge. They have it pretty much figured out.
|
| FrankenPHP is also really good and probably a much smarter
| play than trying to get your code running under WebAssembly.
| vundercind wrote:
| Nb PHP is already _really_ fast. Some major things built on it
| are slow, mostly due to things like poor data access patterns
| or architecture, but its culture since the beginning has
| basically been "get out of PHP and into C as fast as possible"
| and it shows (this is basically the trick to _any_ scripting
| language "being fast", PHP just embraced it hard from the very
| beginning).
|
| If you're on PHP and need more speed _in the language itself_,
| basically every other scripting language (yes, including Node)
| is off the table immediately. Lateral move at best.
|
| (All that to say, yeah, I entirely expected that the headline
| would lead to an article about making PHP slower)
| mschuster91 wrote:
| > PHP just embraced it hard from the very beginning
|
| Which is also the reason why _a lot_ of the PHP standard
| library functions are so inconsistent. They're straight
| wrappers around the C libraries.
|
| Upside is, unless you have to do shit with pointer magic or
| play around the edges of signed/unsigned numbers, it's fairly
| easy to port C example code for any of the myriad things PHP
| has bindings for directly into PHP.
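|
| As a rough illustration (a hypothetical snippet, not from the
| article), a C stdio read loop ports to PHP almost token for
| token:
|
|     <?php
|     // Same shape as the C version built on fopen/fgets/fclose.
|     $fh = fopen('/etc/hostname', 'r');   // illustrative path
|     if ($fh === false) {
|         exit(1);
|     }
|     while (($line = fgets($fh)) !== false) {
|         echo strtoupper($line);
|     }
|     fclose($fh);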
| pjmlp wrote:
| I don't know which hype is worse, AI or WebAssembly on the
| server.
| secondcoming wrote:
| Is JS on the server still a thing?
| pjmlp wrote:
| Yes, and to use alternatives there is no need for
| WebAssembly.
| csomar wrote:
| This is trying to solve a solved problem with a lot of difficult
| technology that doesn't apply here. Most PHP websites are
| WordPress. The way to get a speedy WordPress site is to compile
| it to static HTML. Calls to the server should happen with
| JavaScript. The server will always remain relevant, as WordPress
| uses a database, and thus the "Edge" makes no sense here.
| jw1224 wrote:
| Getting WordPress running in WASM was a huge milestone. It was
| one of the first big PHP/WASM achievements, but was never the
| end goal, just a proof-of-concept. The target market for this
| tech is _not_ WordPress bloggers.
| ahofmann wrote:
| I'm still trying to understand what this does and what the use
| case is. Is the "edge" a server? The browser? Why should I
| compile WordPress or Laravel to wasm?
| tredre3 wrote:
| In this context the edge is a fancy word for "serverless". It
| just means that your PHP interpreter will be started on demand
| on a node closer to your customer.
|
| So if your website receives no requests, it costs you nothing.
| And requests have less latency for the user.
|
| That's the theory, anyway; in my experience reality is a lot
| more nuanced, because the serverless node still has to reach a
| database and so on.
| pjmlp wrote:
| PHP nowadays already has a JIT compiler and its own bytecode;
| there's no need for WebAssembly.
| jw1224 wrote:
| This seems like a misunderstanding: WebAssembly has nothing to
| do with PHP's internal performance mechanisms. WASM is a
| compilation target, which means you can take code that needs to
| be compiled (like PHP's core binary) and compile it to run in a
| browser.
|
| PHP in WASM means developers can run actual, real, native PHP
| code _in the user's browser_, without the user needing to have
| PHP, nginx, etc. installed locally.
| pjmlp wrote:
| Because I always wanted to run PHP in the browser.
| pavlov wrote:
| You could do that, but the post here is in fact about
| running PHP on a server on top of a Wasm runtime.
|
| "At the edge" basically means "close to the user", with
| the details left as an exercise to whoever is selling you
| their "edge." In this case, it's a Wasm runtime company.
| aeyes wrote:
| This was solved 10 years ago with services like App Engine on
| GCP and similar products on the major cloud platforms; they all
| scale to zero.
| makeitdouble wrote:
| In GCP land I'd assume Cloud Run is a closer service.
|
| But yes, for full-blown applications, usually talking to a cache
| and a DB, there are already way more efficient and performant
| solutions.
| jw1224 wrote:
| The "edge" in this case is the browser (from a pure WASM
| standpoint, though I see these guys also offer a hosted
| serverless version too).
|
| For most general-purpose applications, there's no point to
| WASM. But some apps may run specific functions which take a
| long time (e.g. bulk/batch processing), and being able to
| execute those tasks securely _on the client side_ provides
| immediate feedback and better UX.
|
| That's just one use case. Another is that WASM makes PHP
| portable, in the sense it can run in any web browser without
| the need for a back-end server. Lots of potential opportunities
| for distributing software which runs completely locally.
| inezk wrote:
| I don't think edge in this case means the browser. I think edge
| in this case is more similar to a CDN node.
| egeozcan wrote:
| Yes, the browser is where you land when you fall off the
| edge :)
| ahofmann wrote:
| Thank you for explaining. As a web dev who's been using PHP
| since version 4, I'm still very confused about why someone would
| consider running a CMS like WordPress on the client side, or at
| the "edge". I guess the good thing here is that someone is
| spending (a lot of) energy on giving PHP new ways to get used by
| developers.
| makeitdouble wrote:
| This is what is referred to by "edge":
| https://en.wikipedia.org/wiki/Edge_computing
| csomar wrote:
| The browser is not the "edge". The browser is the browser.
| Running WordPress in the browser makes exactly zero sense. The
| only exception is if you are running a test instance.
| pacifika wrote:
| https://wordpress.org/playground/
| TekMol wrote:
| Let me try to understand this "Edge" thing:
|
| - The user sends an HTTP request to somesite.com
| - Their DNS query for somesite.com gets resolved to some
|   datacenter near them
| - The HTTP request arrives at the datacenter, where the PHP in
|   WebAssembly is executed at half the speed of native PHP
| - The PHP in WebAssembly sends database queries to a central DB
|   server over the internet
| - The PHP in WebAssembly templates the data and sends it back
|   to the user
|
| How is that faster than resolving somesite.com to the central
| server, sending the HTTP query to the central server where PHP
| runs at full speed and talks to the DB on the same machine (or
| over the LAN)? Even if PHP ran at full speed at the "Edge", won't
| the two requests over the internet
|
|     user --1--> edge PHP server --2--> central DB server
|
| take longer than the one HTTP request when the user connects to
| the central server directly?
|
|     user --1--> central PHP+DB server
|
| In reality, the PHP script on the "Edge" server probably makes
| not one but multiple queries to the DB server. Won't that make it
| super slow compared to having it all happen on one machine or
| one LAN?
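|
| (A rough back-of-the-envelope with made-up numbers: if user <->
| edge is a 20 ms round trip and edge <-> central DB is 80 ms, a
| page that needs 5 sequential DB queries costs about 20 + 5*80 =
| 420 ms from the edge, versus roughly 100 ms user <-> central
| plus a few milliseconds of LAN-local DB time when everything
| sits next to the database.)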
| afavour wrote:
| The post mentions a cache. I think the key here would be _not_
| going to the central DB and instead going to a distributed
| cache. Not an unreasonable concept when I assume the vast
| majority will be read operations.
| Y-bar wrote:
| That's OPcache they mentioned, a specific language-level
| cache native to PHP.
|
| https://www.php.net/manual/en/intro.opcache.php
| afavour wrote:
| Ah. The point still stands, though: distributed storage is what
| really unlocks this concept.
|
| WordPress is a pretty bad example though; surely you'd just
| CDN-cache the pages!
| iruoy wrote:
| OPcache doesn't cache user data. It's for caching bytecode in
| memory for as long as the PHP process/server is running.
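|
| A quick sketch of the difference (APCu shown only as one common
| option for user-data caching; the key name and DB helper are
| made up):
|
|     <?php
|     // OPcache caches compiled bytecode transparently; there is
|     // nothing to call per request. Caching *data* across requests
|     // needs a separate store, e.g. APCu when that extension is
|     // loaded:
|     $posts = apcu_fetch('front_page_posts', $hit);
|     if (!$hit) {
|         $posts = load_posts_from_db(); // placeholder for the real query
|         apcu_store('front_page_posts', $posts, 60); // keep for 60 seconds
|     }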
|
| I'd like to see a comparison against a Firecracker VM and a
| Docker container.
| kevincox wrote:
| Due to connection-setup round trips, TCP slow-start mechanics
| and the quality of network connections, it is usually better to
| terminate the client's TCP connection close to them. But I agree
| that moving the frontend rendering near the client doesn't
| really make sense in almost all cases. So the best
| general-purpose setup would probably look like user - CDN -
| central PHP+DB.
|
| Of course there are always exceptions. Often stale data is ok
| and you can ship data snapshots to the edge so that they can be
| served without going back to the central DB. But in many cases
| some basic cache policies with a CDN can be nearly as
| effective.
| TekMol wrote:
| What I usually do is use two hostnames. host1 is for HTTP
| requests which can be cached; that host is behind a CDN. host2
| is for HTTP requests which can't be cached; that host points
| directly to my server.
|
| Are you saying it would result in a better user experience when
| I only use host1, which is behind the CDN, and add no-cache
| headers to the requests that can't be cached?
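|
| (Concretely, the uncacheable responses would carry something
| like the following from the origin; a minimal sketch, header
| values illustrative:)
|
|     <?php
|     // Response that must never be stored by the CDN or browser:
|     header('Cache-Control: no-store');
|
|     // A cacheable endpoint would instead send, for example:
|     // header('Cache-Control: public, max-age=3600');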
| kevincox wrote:
| It's complicated but typically yes. The simplest reason is
| that TCP+TLS handshakes require multiple round trips for a
| fresh connection. The CDN can maintain a persistent
| connection to the backend that is shared across users. It
| is also likely that the CDN to backend connection goes over
| a better connection than the user to backend connection
| would.
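|
| (For example, with a hypothetical 80 ms RTT to the origin, a
| cold TCP + TLS 1.3 connection needs about two round trips,
| roughly 160 ms, before the request is even sent, and more with
| TLS 1.2; a warm CDN-to-origin connection skips that entirely.)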
| TekMol wrote:
| Very interesting, thanks. That would make my setup even
| simpler.
| e1g wrote:
| > The CDN can maintain a persistent connection to the
| backend that is shared across users
|
| We considered using Cloudflare Workers as a reverse
| proxy, and I did extensive testing of this (very
| reasonable) assumption. Turns out that when calling back
| to the origin from the edge, CF Workers established a new
| connection almost every time, and so had to pay the
| penalty of the TCP and TLS handshake on every request.
| That killed any performance gains, and was a deal breaker
| for us. It's rather difficult to predict or monitor
| network/routing behavior when running on the edge.
| bluk wrote:
| Were you using a Cloudflare tunnel for your origin?
| kentonv wrote:
| This didn't sound right to me so I did some investigation
| and I think I found a bug.
|
| Keep in mind that Cloudflare is a complex stack of
| proxies. When a worker performs a fetch(), that request
| has to pass through a few machines on Cloudflare's
| network before it can actually go to origin. E.g. to
| implement caching we need to go to the appropriate cache
| machine, and then to try to reuse connections we need to
| go to the appropriate egress machine. Point is, the
| connection to origin isn't literally coming from the
| machine that called fetch().
|
| So if you call fetch() twice in a row, to the same
| hostname, does it reuse a connection? If everything were
| on a single machine, you'd expect so, yes! But in this
| complex proxy stack, stuff has to happen correctly for
| those two requests to end up back on the same machine at
| the other end in order to use the same connection.
|
| Well, it looks like the heuristics involved here aren't
| currently handling Workers requests the way they should.
| They are designed more around regular CDN requests
| (Workers shares the same egress path that regular non-
| Workers CDN requests use). In the standard CDN use case
| where you get a request from a user, possibly rewrite it
| in a Worker, then forward it to origin, you should be
| seeing connection reuse.
|
| But, it looks like if you have a Worker that performs
| multiple fetch() requests to origin (e.g. not forwarding
| the user's requests, but making some API requests or
| something)... we're not hashing things correctly so that
| those fetches land on the same egress machine. So... you
| won't get connection reuse, unless of course you have
| enough traffic to light up all the egress machines.
|
| I'm face-palming a bit here, and wondering why there
| hasn't been more noise about this. We'll fix it. Talk
| about low-hanging fruit...
|
| (I'm the tech lead for Cloudflare Workers.)
|
| (On a side note, enabling Argo Smart Routing will greatly
| increase the rate of connection reuse in general, even
| for traffic distributed around the world, as it causes
| requests to be routed within Cloudflare's network to the
| location closest to your origin. Also, even if the origin
| connections aren't reused, the RTT from Cloudflare to
| origin becomes much shorter, so connection setup becomes
| much less expensive. However, this is a paid feature.)
| mschuster91 wrote:
| > Are you saying it would result in a better user experience
| when I only use host1, which is behind the CDN, and add
| no-cache headers to the requests that can't be cached?
|
| Yes, because that way you can leverage the CDN to defend against
| DDoS attacks, and you can firewall the origin server itself so
| that only the CDN is allowed to communicate with it.
| the_duke wrote:
| That's a valid concern, and with a classical centralized DB the
| Edge solution will not be faster in many cases, especially
| considering how many DB queries CMSes like Drupal or WordPress
| make for each page load...
|
| Some other providers have tried to solve the problem with
| replicated databases that are used for local reads.
|
| I can't go into details yet, but Wasmer will offer a solution
| for this problem quite soon as well, which hopefully will be
| much more seamless than existing strategies.
|
| (note: I work at Wasmer)
| lovasoa wrote:
| https://electric-sql.com/ ?
| meekaaku wrote:
| Isn't this true of everything that runs on the edge?
|
| Unless you host the database in the same network as your edge
| provider, application <-> database is where the bottleneck is.
| slaymaker1907 wrote:
| You're assuming that network latencies follow the triangle
| inequality, i.e., that A->C is smaller than A->B + B->C.
| However, that breaks down because of money. It's possible that
| the user's ISP is a cheapskate without good peering agreements
| such that while A->B is fast, they haven't set up the necessary
| agreements to make A->C fast.
|
| I've personally seen this with a game recently and my own ISP.
| I can ping their West Coast servers in 20-40ms while pinging
| their Chicago servers takes 200ms+.
|
| This is actually one huge flaw with the current way IP is done.
| Because your ISP is closely tied to the last mile of routing,
| you have very little control over optimizing those backend
| layers of routing.
|
| This doesn't explain why you wouldn't just run a thin proxy at
| the edge and instead want to run a full PHP app. For that, my
| guess is this makes sense when you don't want to run a full
| server all the time or if you want it to be rapidly scalable.
| paulddraper wrote:
| Okay....but that's you providing a backbone for the user.
| Don't you just use a CDN at that point? (One that allows non-
| cached requests as well.)
| eknkc wrote:
| I guess if you have a use case with highly cacheable data at the
| edge, it might work great.
|
| For example, a form builder like Typeform could cache form
| definitions at the edge and render them close to users.
| Submissions would require DB communication, but the entire
| experience could be better.
|
| Otherwise it is just not worth it.
|
| BTW, I don't even think PHP running slower in WASM would be that
| important. These things generally depend on I/O performance
| rather than runtime speed if you are not doing a lot of
| calculations at the edge. WASM is also pretty fast these days,
| so...
| moomoo11 wrote:
| Can someone ELI5 what "edge" computing means?
|
| The way I understand it is that it's moving some operations
| closer to the client to avoid bandwidth costs and improve
| performance.
|
| I thought of the Tesla car computer as edge computing, as it does
| a lot of processing within the car that would otherwise add
| latency and reliance on an internet connection.
|
| But for web browsers? Going to some websites?
|
| What sort of apps need this functionality?
|
| Seems like over-engineering, so I'm looking for someone to
| explain it to me.
| Joel_Mckay wrote:
| Hi, traditionally for our purposes it solved a few problems.
|
| 1. latency
|
| 2. intermittent access
|
| 3. distributed "meaningful" data preparation/filters
|
| One may consider routers with Squid proxies, VoIP trunks, and
| P2P caches to be essentially similar "edge" technologies.
|
| There are additional use cases, but we don't want to educate the
| lamers stealing resumes off job sites while having no clue what
| they are talking about.
|
| Have a nice day, =)
| flemhans wrote:
| Can I do it on my own hardware?
| maxloh wrote:
| Given that we run PHP on the edge, what is the point of running
| the PHP interpreter on top of a WebAssembly interpreter (Wasmer)
| instead of just running the PHP interpreter directly?
|
| The latter will always be faster.
| lxgr wrote:
| One thing WASM runtimes usually do really well is sandboxing.
|
| Various interpreters might or might not have a good
| capability/permissioning model (Java's is capable but complex
| and not supported by many applications, for example); even if
| they do, there might be exploitable bugs in the interpreter
| itself.
| tambourine_man wrote:
| But we already have containers, and VMs are cheap.
|
| I find WASM interesting from a technical perspective, but not
| from a practical one.
| lxgr wrote:
| "Containers are not a sandboxing mechanism", I hear
| reasonably often (although that seems surmountable at least
| in theory?).
|
| VMs are cheap, but not "let's run thousands of them on 'the
| edge' in case we get a request for any of them!" cheap.
| kennethallen wrote:
| When Cloudflare Workers launched, they said V8 isolates had
| some great properties for serverless-style compute:
|
| 5 ms cold starts vs 500+ ms for containers
|
| 3 MB memory vs 35 MB for a similar container
|
| No context switch between different tenants' code
|
| No virtualization overhead
|
| I'm sure these numbers would be different today, for
| instance with Firecracker, but there's probably still a
| memory and/or cold start advantage to V8 isolates.
|
| https://blog.cloudflare.com/cloud-computing-without-
| containe...
| choutianxius wrote:
| I think this will face the same problems as Next.js edge runtime:
| your database cannot be moved to the edge
| mike_d wrote:
| Please stop calling it "at the edge." You have seven locations
| all in highly developed Equinix datacenters.
|
| Edge means getting embedded into ISP networks, cell towers,
| smaller metros, etc.
| irq-1 wrote:
| https://news.ycombinator.com/item?id=38829557#38834787
|
| >> Ask HN: What are your predictions for 2024?
|
| > Server-side WASM takes off with the re-implementation of PHP,
| Ruby/Rails, Python, and others, and a WASM based virtual server
| (shell, filesystem, web server, etc..) Cost more but has better
| security for both the host and user.
|
| Guess I was wrong about it costing more?
|
| > ... we can run PHP safely without the overhead of OS or
| hardware virtualization.
|
| But it only runs at half the speed of native PHP, so you need
| more resources.
| vander_elst wrote:
| Trying to understand the solution better: why isn't it possible
| to restrict the application via process isolation (nsjail,
| cgroups, Docker...)? Why is wasm needed instead?
| tambourine_man wrote:
| This looks interesting, but as feedback, I found the copy a bit
| repetitive and lacking substance.
___________________________________________________________________
(page generated 2024-05-23 23:01 UTC)