[HN Gopher] Workers AI Update: Stable Diffusion, Code Llama and ...
       ___________________________________________________________________
        
       Workers AI Update: Stable Diffusion, Code Llama and Workers AI in
       100 Cities
        
       Author : todsacerdoti
       Score  : 68 points
       Date   : 2023-11-23 14:01 UTC (9 hours ago)
        
 (HTM) web link (blog.cloudflare.com)
 (TXT) w3m dump (blog.cloudflare.com)
        
       | swazzy wrote:
       | having a hard time calculating what the pricing is for this
        
         | tmikaeld wrote:
         | Like with most serverless functions
        
         | MuffinFlavored wrote:
         | having a hard time calculating why anybody needs this/wants
         | this
        
           | ushakov wrote:
           | I see it more as a convenience feature for people already
           | using CF Workers
        
           | krosaen wrote:
            | productionizing ai models is a pain, this makes it easy.
            | say you were building a d&d app and wanted to generate
            | character art, this would make it very easy to get
            | started. aws has similar offerings (e.g. SageMaker) but
            | it's not on the edge.
        
           | rafram wrote:
           | It seems cheaper than the OpenAI API and is very easy to use
           | from a worker.
        
         | civilitty wrote:
         | How many dollar bills does it take to make a pile worth
         | sleeping in?
        
         | wirelesspotat wrote:
         | Oddly, I don't see anything about pricing for Workers AI on the
         | Workers pricing page[0] but their Workers AI blog post from
         | Sept 2023[1] says the pricing is per 1k "neurons":
         | 
         | > Users will be able to choose from two ways to run Workers AI:
         | 
         | > Regular Twitch Neurons (RTN) - running wherever there's
         | capacity at $0.01 / 1k neurons
         | 
         | > Fast Twitch Neurons (FTN) - running at nearest user location
         | at $0.125 / 1k neurons
         | 
         | > Neurons are a way to measure AI output that always scales
         | down to zero (if you get no usage, you will be charged for 0
         | neurons).
         | 
         | Here's the key detail:
         | 
         | > To give you a sense of what you can accomplish with a
         | thousand neurons, you can: generate 130 LLM responses, 830
         | image classifications, or 1,250 embeddings.
         | 
         | [0] -
         | https://developers.cloudflare.com/workers/platform/pricing
         | 
         | [1] - https://blog.cloudflare.com/workers-ai/
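          | 
          | Back-of-the-envelope, assuming those Sept 2023 figures still
          | apply (a sketch in Python; the only inputs are the quoted
          | numbers above):
          | 
          |     # Rough per-request cost from the quoted neuron pricing.
          |     RTN_PRICE_PER_1K_NEURONS = 0.01   # USD, "Regular Twitch Neurons"
          |     FTN_PRICE_PER_1K_NEURONS = 0.125  # USD, "Fast Twitch Neurons"
          | 
          |     # Quoted capacity of 1,000 neurons:
          |     LLM_RESPONSES = 130
          |     IMAGE_CLASSIFICATIONS = 830
          |     EMBEDDINGS = 1250
          | 
          |     print(f"RTN: ~${RTN_PRICE_PER_1K_NEURONS / LLM_RESPONSES:.5f} per LLM response")
          |     print(f"RTN: ~${RTN_PRICE_PER_1K_NEURONS / EMBEDDINGS:.6f} per embedding")
          |     print(f"FTN: ~${FTN_PRICE_PER_1K_NEURONS / LLM_RESPONSES:.5f} per LLM response")
          | 
          | That works out to well under a hundredth of a cent per LLM
          | response on the slow path, if those figures still hold.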
        
       | syntaxing wrote:
       | Is this cheaper or more expensive than using OpenAI?
        
       | TekMol wrote:
        | Getting started with Workers AI + SDXL (via API) couldn't be
        | easier. Check out the example below:
        | 
        |     curl -X POST \
        |       "https://api.cloudflare.com/client/v4/accounts/{account-id}/ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0" \
        |       -H "Authorization: Bearer {api-token}" \
        |       -H "Content-Type: application/json" \
        |       -d '{ "prompt": "A happy llama running through an orange cloud" }'
        |       -o 'happy-llama.png'
       | 
       | First of all, there is a \ missing before the last line.
       | 
       | Second, what is my "{account-id}"? I can't find it anywhere in
       | the Cloudflare dashboard.
       | 
       | I have the feeling it might be my email?
       | 
        | But when I use that, I get this error:
        | 
        |     {"result":null,"success":false,"errors":[{"code":7003,
        |     "message":"Could not route to /client/v4/accounts/<my_email>
        |     /ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0,
        |     perhaps your object identifier is invalid?"}],"messages":[]}
        
         | Gooblebrai wrote:
         | https://developers.cloudflare.com/fundamentals/setup/find-ac...
        
           | TekMol wrote:
            | From that link:
            | 
            |     Log in to the Cloudflare dashboard and select your
            |     account and domain.
           | 
           | What does that mean? Do I have to register a domain with
           | Cloudflare first?
        
             | chris_st wrote:
             | No, you don't have to register a domain with them.
             | 
             | Sign up (or log in), and you'll be taken to your Dashboard.
             | 
             | On the left is a bunch of options. Click on the one labeled
             | "Workers", and on that page in the top right you'll see
             | "Account ID". That value should be the one you want.
        
         | dumbo-octopus wrote:
         | On any URL from Cloudflare regarding your account, it's the big
         | ID in the URL. When you're logged in and navigate to
         | https://dash.cloudflare.com/, you'll be redirected to
         | https://dash.cloudflare.com/{account-id}
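          | 
          | Once you have that ID plus an API token, here's a minimal
          | sketch of the same SDXL call in Python (the account ID,
          | token and output filename below are placeholders to fill
          | in):
          | 
          |     # Sketch of the SDXL call quoted above, via the REST API.
          |     import requests
          | 
          |     ACCOUNT_ID = "your-account-id"   # the big ID in the dashboard URL
          |     API_TOKEN = "your-api-token"
          | 
          |     url = (
          |         f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
          |         "/ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0"
          |     )
          |     resp = requests.post(
          |         url,
          |         headers={"Authorization": f"Bearer {API_TOKEN}"},
          |         json={"prompt": "A happy llama running through an orange cloud"},
          |     )
          |     resp.raise_for_status()
          | 
          |     # On success the response body is the PNG bytes.
          |     with open("happy-llama.png", "wb") as f:
          |         f.write(resp.content)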
        
       | martinald wrote:
       | Not really understanding the benefit of running this at the edge
       | to be honest? The additional latency for the request is
       | absolutely negligible compared to the latency of LLMs/SD.
        
         | tmikaeld wrote:
         | Maybe now, but in the future?
        
         | mschuster91 wrote:
         | > Not really understanding the benefit of running this at the
         | edge to be honest?
         | 
          | It's primarily a benefit for Cloudflare. Instead of the huge
          | and expensive mega-datacenters that AWS/GCP/Azure operate,
          | they can rent cheaper colo space and better distribute
          | workloads. The latter is, I think, the key... AWS basically
          | incentivises you to stay in a single region as long as
          | possible (mostly because the UX of both the web UI and the
          | CLI just _sucks_ when dealing with multiple regions), so a
          | lot of users tend to stick to the AWS region closest to
          | them, and their services aren't really interconnected
          | between regions because that's a headache to set up.
          | Cloudflare, on the other hand, has run "at the edge" from
          | the beginning, so people don't even think about introducing
          | silent dependencies on any specific region. And if a
          | Cloudflare DC/region has a massive outage, chances are high
          | that no one will notice, because the workloads will just
          | silently shift somewhere else.
        
         | weird-eye-issue wrote:
         | It's for companies like us that already run almost everything
         | directly on Cloudflare Workers
         | 
          | We integrate with Replicate for SDXL, but if this were
          | production-ready we likely would have gone with this instead
        
         | fragmede wrote:
         | It's a bit of a "if all you have is a hammer, everything looks
         | like a nail" situation. It's not about the latency from you to
         | the edge node, it's about already being in the Cloudflare
         | worker ecosystem as a developer.
         | 
         | For voice recognition, latency absolutely matters.
        
       | intrepidsoldier wrote:
        | Trying to understand why a developer would want to call an API to
       | generate code rather than use a coding AI assistant within their
       | editor? Genuinely curious.
        
         | liamdgray wrote:
          | Why? Well, I'm considering using an LLM API to generate per-user
         | custom code at runtime -- like a query builder that accepts
         | plain English. The application involves filtering a data stream
         | by the user's custom criteria.
         | 
         | I'm not yet committed to this because I know that many (most?)
         | people cannot express their intentions in plain English
         | concisely and precisely enough to be implemented as an
         | algorithm. As my first formal instructor of programming taught
          | me, a lot of programming is just that: thinking through what one
         | wants, with sufficient rigor. Support for such a feature could
         | be a nightmare, making it more trouble than it is worth.
         | However, I may offer it as an experiment. It might work well
         | enough to, say, draft Google Sheets formulas that power users
         | could tweak.
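          | 
          | To make that concrete, the shape I have in mind is the model
          | drafting a small, constrained filter spec rather than
          | free-form code; a rough sketch (the field names, operators
          | and example response here are all made up):
          | 
          |     # Rough sketch: the LLM drafts a JSON filter spec, which is
          |     # validated against a whitelist before being applied to the
          |     # data stream. Fields and operators are hypothetical.
          |     import json
          | 
          |     ALLOWED_FIELDS = {"price", "category", "timestamp"}
          |     ALLOWED_OPS = {"eq", "lt", "gt", "contains"}
          | 
          |     def validate_filter(spec: dict) -> bool:
          |         return all(
          |             c.get("field") in ALLOWED_FIELDS and c.get("op") in ALLOWED_OPS
          |             for c in spec.get("clauses", [])
          |         )
          | 
          |     def apply_filter(spec: dict, record: dict) -> bool:
          |         ops = {
          |             "eq": lambda a, b: a == b,
          |             "lt": lambda a, b: a < b,
          |             "gt": lambda a, b: a > b,
          |             "contains": lambda a, b: b in a,
          |         }
          |         return all(
          |             ops[c["op"]](record.get(c["field"]), c["value"])
          |             for c in spec["clauses"]
          |         )
          | 
          |     # Stand-in for whatever the LLM API actually returns.
          |     llm_response = '{"clauses": [{"field": "price", "op": "lt", "value": 20}]}'
          |     spec = json.loads(llm_response)
          |     if validate_filter(spec):
          |         print(apply_filter(spec, {"price": 15, "category": "books"}))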
        
           | gpderetta wrote:
           | How could you possibly make such a thing safe from code
           | injection?
        
         | fragmede wrote:
         | I get the feeling Cloudflare doesn't know either. But the model
          | is freely available via Hugging Face, so why _not_ support
          | it as one of the models? Just because you or I can't think
          | of something doesn't mean that someone else won't. Maybe
          | someone will come up with a genius idea of what to do with
          | it. The other models* seem more useful, but adding models is
          | likely not that much overhead.
         | 
         | * https://developers.cloudflare.com/workers-ai/models/
        
       | asadm wrote:
       | I will give this a shot but does anybody know what inference
       | times are like?
        
         | spikey_sanju wrote:
         | 16-17 seconds for generating one image.
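          | 
          | Presumably that's end-to-end wall-clock time; if you want to
          | measure it yourself, a minimal sketch (account ID and token
          | are placeholders, and timings will vary by model and load):
          | 
          |     # Time one SDXL generation end to end.
          |     import time
          |     import requests
          | 
          |     ACCOUNT_ID = "your-account-id"
          |     API_TOKEN = "your-api-token"
          |     url = (
          |         f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
          |         "/ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0"
          |     )
          | 
          |     start = time.monotonic()
          |     resp = requests.post(
          |         url,
          |         headers={"Authorization": f"Bearer {API_TOKEN}"},
          |         json={"prompt": "A happy llama running through an orange cloud"},
          |     )
          |     print(f"status={resp.status_code}, elapsed={time.monotonic() - start:.1f}s")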
        
       | Havoc wrote:
        | Anybody know how this works out pricing-wise?
       | 
       | I gather $0.01 / 1k neurons. Which apparently is "130 LLM
       | responses, 830 image classifications, or 1,250 embeddings."
       | 
       | What's that in sane measurements like dollars per 1k tokens?
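        | 
        | Best I can tell you can only get to $/1k tokens by guessing
        | how long one "LLM response" is; e.g. assuming 256 output
        | tokens per response (a pure assumption):
        | 
        |     # Converting "neurons" to $/1k tokens requires assuming an
        |     # average response length; 256 tokens is a made-up figure.
        |     RTN_PRICE_PER_1K_NEURONS = 0.01   # USD
        |     LLM_RESPONSES_PER_1K_NEURONS = 130
        |     ASSUMED_TOKENS_PER_RESPONSE = 256
        | 
        |     per_response = RTN_PRICE_PER_1K_NEURONS / LLM_RESPONSES_PER_1K_NEURONS
        |     per_1k_tokens = per_response / ASSUMED_TOKENS_PER_RESPONSE * 1000
        |     print(f"~${per_1k_tokens:.5f} per 1k output tokens under that assumption")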
       | 
        | As much as I enjoy measuring my car's speed in
        | beard-seconds... could we fkin not?
        
       ___________________________________________________________________
       (page generated 2023-11-23 23:02 UTC)