[HN Gopher] Workers AI Update: Stable Diffusion, Code Llama and ...
___________________________________________________________________
Workers AI Update: Stable Diffusion, Code Llama and Workers AI in
100 Cities
Author : todsacerdoti
Score : 68 points
Date : 2023-11-23 14:01 UTC (9 hours ago)
(HTM) web link (blog.cloudflare.com)
(TXT) w3m dump (blog.cloudflare.com)
| swazzy wrote:
| having a hard time calculating what the pricing is for this
| tmikaeld wrote:
| Like with most serverless functions
| MuffinFlavored wrote:
| having a hard time calculating why anybody needs this/wants
| this
| ushakov wrote:
| I see it more as a convenience feature for people already
| using CF Workers
| krosaen wrote:
| productionizing ai models is a pain, this makes it easy. say
| you were building a d&d app and wanted to generate character
| art, this would make it very easy to get started. aws has
| similar offerings (e.g. SageMaker) but it's not on the edge.
| rafram wrote:
| It seems cheaper than the OpenAI API and is very easy to use
| from a worker.
| civilitty wrote:
| How many dollar bills does it take to make a pile worth
| sleeping in?
| wirelesspotat wrote:
| Oddly, I don't see anything about pricing for Workers AI on the
| Workers pricing page[0] but their Workers AI blog post from
| Sept 2023[1] says the pricing is per 1k "neurons":
|
| > Users will be able to choose from two ways to run Workers AI:
|
| > Regular Twitch Neurons (RTN) - running wherever there's
| capacity at $0.01 / 1k neurons
|
| > Fast Twitch Neurons (FTN) - running at nearest user location
| at $0.125 / 1k neurons
|
| > Neurons are a way to measure AI output that always scales
| down to zero (if you get no usage, you will be charged for 0
| neurons).
|
| Here's the key detail:
|
| > To give you a sense of what you can accomplish with a
| thousand neurons, you can: generate 130 LLM responses, 830
| image classifications, or 1,250 embeddings.
|
| [0] -
| https://developers.cloudflare.com/workers/platform/pricing
|
| [1] - https://blog.cloudflare.com/workers-ai/
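To make the quoted figures concrete, here is a minimal sketch of the arithmetic. It uses only the prices and throughput numbers quoted from the blog post above. The function and constant names are made up for illustration, and it assumes neurons are spent evenly per operation, which the post does not state.

```python
# Prices and throughput figures as quoted from the Sept 2023 blog post.
PRICE_PER_1K_NEURONS = {"RTN": 0.01, "FTN": 0.125}  # USD per 1,000 neurons
OPS_PER_1K_NEURONS = {
    "LLM response": 130,
    "image classification": 830,
    "embedding": 1250,
}


def cost_per_operation(tier: str, operation: str) -> float:
    """USD per single operation, assuming neurons are spent evenly per operation."""
    return PRICE_PER_1K_NEURONS[tier] / OPS_PER_1K_NEURONS[operation]


if __name__ == "__main__":
    for tier in PRICE_PER_1K_NEURONS:
        for operation in OPS_PER_1K_NEURONS:
            usd = cost_per_operation(tier, operation)
            print(f"{tier}: {operation}: ${usd:.6f} each, ${usd * 1000:.4f} per 1,000")
```

Under that even-spend assumption, 1,000 LLM responses would come to roughly $0.08 on RTN and roughly $0.96 on FTN.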
| syntaxing wrote:
| Is this cheaper or more expensive than using OpenAI?
| TekMol wrote:
| Getting started with Workers AI + SDXL (via API) couldn't
| be easier. Check out the example below:
|
|     curl -X POST \
|     "https://api.cloudflare.com/client/v4/accounts/{account-id}/ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0" \
|     -H "Authorization: Bearer {api-token}" \
|     -H "Content-Type:application/json" \
|     -d '{ "prompt": "A happy llama running through an orange cloud" }'
|     -o 'happy-llama.png'
|
| First of all, there is a \ missing before the last line.
|
| Second, what is my "{account-id}"? I can't find it anywhere in
| the Cloudflare dashboard.
|
| I have the feeling it might be my email?
|
| But when I use that, I get this error:
|
|     {"result":null,"success":false,"errors":[{"code":7003,"message":"Could not route to /client/v4/accounts/<my_email>/ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0, perhaps your object identifier is invalid?"}],"messages":[]}
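For readers who would rather not fight shell line continuations, the same request can be sketched with the Python standard library. The endpoint, headers, and JSON body are taken from the curl command quoted above. The function names are hypothetical, the account ID and API token are placeholders you must supply, and saving the raw response body as a PNG is an assumption based on the `-o 'happy-llama.png'` flag.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"
MODEL = "@cf/stabilityai/stable-diffusion-xl-base-1.0"


def build_request(account_id: str, api_token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    url = f"{API_BASE}/accounts/{account_id}/ai/run/{MODEL}"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def generate_image(account_id: str, api_token: str, prompt: str, out_path: str) -> None:
    """Send the request and write the response body to disk, like curl's -o flag."""
    request = build_request(account_id, api_token, prompt)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

A call would look like `generate_image("<account id>", "<api token>", "A happy llama running through an orange cloud", "happy-llama.png")`. An invalid account ID should surface as an `urllib.error.HTTPError` rather than a saved image.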
| Gooblebrai wrote:
| https://developers.cloudflare.com/fundamentals/setup/find-ac...
| TekMol wrote:
| From that link:
|
| > Log in to the Cloudflare dashboard and select your account and
| domain.
|
| What does that mean? Do I have to register a domain with
| Cloudflare first?
| chris_st wrote:
| No, you don't have to register a domain with them.
|
| Sign up (or log in), and you'll be taken to your Dashboard.
|
| On the left is a bunch of options. Click on the one labeled
| "Workers", and on that page in the top right you'll see
| "Account ID". That value should be the one you want.
| dumbo-octopus wrote:
| On any URL from Cloudflare regarding your account, it's the big
| ID in the URL. When you're logged in and navigate to
| https://dash.cloudflare.com/, you'll be redirected to
| https://dash.cloudflare.com/{account-id}
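The "big ID in the URL" tip can be sketched as a small helper that pulls the first path segment out of a dashboard URL. The function name is hypothetical, and the check that an account ID is 32 hexadecimal characters is an assumption, not something stated in the thread.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: account IDs are 32 hexadecimal characters; the thread does not say so.
ACCOUNT_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def account_id_from_dashboard_url(url: str) -> Optional[str]:
    """Return the first path segment of a dash.cloudflare.com URL if it looks like an ID."""
    parsed = urlparse(url)
    if parsed.hostname != "dash.cloudflare.com":
        return None
    segments = [segment for segment in parsed.path.split("/") if segment]
    if segments and ACCOUNT_ID_PATTERN.fullmatch(segments[0]):
        return segments[0]
    return None
```

Anything that does not match the pattern, such as an email address, yields `None`, which is consistent with the routing error reported earlier in the thread.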
| martinald wrote:
| Not really understanding the benefit of running this at the edge
| to be honest? The additional latency for the request is
| absolutely negligible compared to the latency of LLMs/SD.
| tmikaeld wrote:
| Maybe now, but in the future?
| mschuster91 wrote:
| > Not really understanding the benefit of running this at the
| edge to be honest?
|
| It's primarily a benefit for Cloudflare. Instead of huge and
| expensive mega-datacenters that AWS/GCP/Azure operates, they
| can rent cheaper colo space and better distribute workloads.
| The latter is, I think, the key... AWS basically incentivises
| you to stay in a single region as long as possible (mostly
| because the UX of both the web UI and the CLI just _sucks_ when
| dealing with multiple regions), which means that a lot of users
| tend to stick in the AWS region closest to themselves, and the
| services aren't really interconnected between regions because
| that's a headache to set up, while Cloudflare runs "at the
| edge" from the beginning and people don't even think about
| introducing silent dependencies on any specific region. And if
| a Cloudflare DC/region has a massive outage, chances are high
| no one will notice it because the workloads will just silently
| shift to somewhere else.
| weird-eye-issue wrote:
| It's for companies like us that already run almost everything
| directly on Cloudflare Workers
|
| We integrate with Replicate for SDXL, but if this had been
| production ready we would likely have gone with it instead
| fragmede wrote:
| It's a bit of a "if all you have is a hammer, everything looks
| like a nail" situation. It's not about the latency from you to
| the edge node, it's about already being in the Cloudflare
| worker ecosystem as a developer.
|
| For voice recognition, latency absolutely matters.
| intrepidsoldier wrote:
| Trying to understand why a developer would like to call an API to
| generate code rather than use a coding AI assistant within their
| editor? Genuinely curious.
| liamdgray wrote:
| Why? Well, I'm considering using an LLM API to generate per-user
| custom code at runtime -- like a query builder that accepts
| plain English. The application involves filtering a data stream
| by the user's custom criteria.
|
| I'm not yet committed to this because I know that many (most?)
| people cannot express their intentions in plain English
| concisely and precisely enough to be implemented as an
| algorithm. As my first formal instructor of programming taught
| me, a lot of programming is just that: thinking through what one
| wants, with sufficient rigor. Support for such a feature could
| be a nightmare, making it more trouble than it is worth.
| However, I may offer it as an experiment. It might work well
| enough to, say, draft Google Sheets formulas that power users
| could tweak.
| gpderetta wrote:
| How could you possibly make such a thing safe from code
| injection?
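Nobody in the thread answers this. One common way to sidestep the problem is to never execute model output at all: ask the model for a small declarative filter and validate it against an allowlist before applying it. The sketch below assumes that approach. The field names, operators, and JSON shape are all hypothetical, and the LLM call itself is left out.

```python
import json
import operator
from typing import Any, Callable, Dict

# Hypothetical allowlists: only these fields and comparisons are ever accepted.
ALLOWED_FIELDS = {"price", "symbol", "volume"}
ALLOWED_OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "lt": operator.lt,
    "gt": operator.gt,
}


def parse_filter(model_output: str) -> Callable[[Dict[str, Any]], bool]:
    """Turn untrusted JSON like {"field": "price", "op": "gt", "value": 10} into a predicate."""
    spec = json.loads(model_output)  # malformed JSON raises a ValueError subclass
    if not isinstance(spec, dict) or set(spec) != {"field", "op", "value"}:
        raise ValueError("filter must have exactly: field, op, value")
    field, op, value = spec["field"], spec["op"], spec["value"]
    if not isinstance(field, str) or field not in ALLOWED_FIELDS:
        raise ValueError(f"field not allowed: {field!r}")
    if not isinstance(op, str) or op not in ALLOWED_OPS:
        raise ValueError(f"operator not allowed: {op!r}")
    if isinstance(value, bool) or not isinstance(value, (int, float, str)):
        raise ValueError("value must be a number or a string")
    compare = ALLOWED_OPS[op]

    def predicate(record: Dict[str, Any]) -> bool:
        try:
            return field in record and bool(compare(record[field], value))
        except TypeError:  # e.g. comparing a string with a number
            return False

    return predicate
```

The model's text is only ever parsed as data, so a response containing code has nothing to run in. The trade-off is expressiveness: users can only ask for what the allowlists permit.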
| fragmede wrote:
| I get the feeling Cloudflare doesn't know either. But the model
| is freely available via Hugging Face, so why _not_ support it
| as one of the models? Just because you or I can't think of
| something doesn't mean that someone else won't. Maybe someone
| will come up with a genius idea of what to do with it. The
| other models * seem more useful, but adding models is likely
| not that much overhead.
|
| * https://developers.cloudflare.com/workers-ai/models/
| asadm wrote:
| I will give this a shot but does anybody know what inference
| times are like?
| spikey_sanju wrote:
| 16-17 seconds for generating one image.
| Havoc wrote:
| Anybody know how this works out pricing-wise?
|
| I gather $0.01 / 1k neurons. Which apparently is "130 LLM
| responses, 830 image classifications, or 1,250 embeddings."
|
| What's that in sane measurements like dollars per 1k tokens?
|
| As much as I enjoy measuring my car's speed in beard
| seconds...could we fkin not?
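The thread leaves this unanswered, and the quoted figures alone cannot answer it: converting "LLM responses" to tokens needs an average response length that the blog post does not give. A rough sketch that makes that missing number an explicit parameter is shown here. The function name is hypothetical, and any token count passed in is the caller's own assumption.

```python
PRICE_PER_1K_NEURONS_USD = 0.01     # RTN price quoted in the thread
LLM_RESPONSES_PER_1K_NEURONS = 130  # figure quoted in the thread


def usd_per_1k_tokens(
    tokens_per_response: float,
    price_per_1k_neurons: float = PRICE_PER_1K_NEURONS_USD,
) -> float:
    """Estimated USD per 1,000 tokens, given an ASSUMED average response length."""
    if tokens_per_response <= 0:
        raise ValueError("tokens_per_response must be positive")
    usd_per_response = price_per_1k_neurons / LLM_RESPONSES_PER_1K_NEURONS
    return usd_per_response / tokens_per_response * 1000


if __name__ == "__main__":
    # The response lengths below are placeholders, not measured values.
    for assumed_tokens in (100, 256, 1000):
        print(f"{assumed_tokens} tokens/response: ${usd_per_1k_tokens(assumed_tokens):.6f} per 1k tokens")
```

For instance, if a response averaged 256 tokens (a placeholder, not a measured value), the RTN price would work out to about $0.0003 per 1k tokens. The estimate scales inversely with whatever response length you assume.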
___________________________________________________________________
(page generated 2023-11-23 23:02 UTC)