[HN Gopher] Two new Gemini models, reduced 1.5 Pro pricing, incr...
       ___________________________________________________________________
        
       Two new Gemini models, reduced 1.5 Pro pricing, increased rate
       limits, and more
        
       Author : meetpateltech
       Score  : 170 points
       Date   : 2024-09-24 16:08 UTC (6 hours ago)
        
 (HTM) web link (developers.googleblog.com)
 (TXT) w3m dump (developers.googleblog.com)
        
       | mixtureoftakes wrote:
        | TLDR - 2x cheaper, slightly smarter, and they only compare those
        | new models to their own old ones. Does Google have a moat?
        
         | usaar333 wrote:
         | The math score exceeds o1-preview (though not mini or o1 full)
         | fwiw.
        
         | re-thc wrote:
          | > Does Google have a moat?
         | 
         | Potentially (depends if the EU cares)...
         | 
         | E.g. integration with Google search (instead of ChatGPT's Bing
         | search), providing map data, android integration, etc...
        
           | ianbicking wrote:
           | Their Android integration certainly isn't on track to earn
           | them any moats...
           | https://hachyderm.io/@ianbicking/113099247306589777
        
         | cj wrote:
         | Moat could be things like direct integration into Gmail (ask it
         | to find your last 5 receipts from Amazon), Drive (chat with
         | PDF), Slides (create images / flow charts), etc.
         | 
         | Not sure if their models are the moat. But they definitely have
         | an opportunity from the productization perspective.
         | 
         | But so does Microsoft.
        
           | ethbr1 wrote:
           | Doesn't Microsoft also get OpenAI IP, if they run out of
           | money?
        
           | svara wrote:
           | Have you tried the Gemini Gmail integration? I have that
           | enabled in my GSuite account.
           | 
           | It's incredible how bad it is. I've seen it claim I've never
           | received mail from a certain person, while the email was open
           | right next to the chat widget. I've seen it tell me to use
           | the standard search tool, when that wasn't suitable for the
           | query. I've literally never had it find anything that
           | wouldn't have been easier to find with the regular search.
           | 
           | I mean, it's a really obvious thing for them to do, I'm
           | genuinely confused why they released it like that.
        
             | cj wrote:
             | > I'm genuinely confused why they released it like that.
             | 
             | I agree. Right now it's not very useful, but has the
             | potential to be if they keep investing in it. Maybe.
             | 
             | I think Google, Microsoft, etc are all pressured to release
             | _something_ for fear of appearing to be behind the curve.
             | 
             | Apple is clearly taking the opposite approach re: speed to
             | market.
        
       | frankdenbow wrote:
       | Interview with the product lead:
       | https://x.com/rowancheung/status/1838611170061918575?
        
         | hadlock wrote:
         | They put a 42 minute video on twitter? That's brave.
        
           | mh- wrote:
           | https://www.youtube.com/watch?v=WQvMdmk8IkM
        
         | throwup238 wrote:
          | > _What makes the new model so unique?_
          | 
          | >> _Yeah, it's a good question. I think it's maybe less so
          | what makes it unique and more so the general trajectory of
          | the trend that we're on._
         | 
         | Disappointing.
        
       | summerlight wrote:
        | Looks like they are more focused on the economic aspect of
        | these large models? Like 90~95% of the performance of other
        | frontier models at 50~70% of the price.
        
         | TIPSIO wrote:
         | I do like the trend.
         | 
          | Imagine if Anthropic or someone eventually releases a Claude
          | 3.5 but at like a whopping 10x its current speed.
          | 
          | That would be far more useful and game-changing than a slow
          | o1 model that may or may not be x percent smarter.
        
           | sgt101 wrote:
           | We might see that with the inference ASICs later this year I
           | guess?
        
             | xendipity wrote:
             | Ooh, what are these ASICs you're talking about? My
             | understanding was that we'll see AMD/Nvidia gpus continue
             | to be pushed and very competitive as well as have new
             | system architectures like cerebras or grok. I haven't heard
             | about new compute platforms framed as ASICs.
        
               | Workaccount2 wrote:
               | Cerebras has ridiculously large LLM ASICs that can hit
               | crazy speeds. You can try it with llama 8B and 70B:
               | 
               | https://inference.cerebras.ai/
               | 
               | It's pretty fast, but my understanding is that it is
               | still too expensive even accounting for the speed-up.
        
               | throwup238 wrote:
               | Is Cerebras an integrated circuit or more an integrated
               | _wafer_? :-)
               | 
                | And yeah their cost is ridiculous, on the order of high
                | 6 to low 7 figures per wafer. The rack alone looks
                | several times more expensive than the 8x NVIDIA pods [1]
               | 
               | [1] https://web.archive.org/web/20230812020202/https://ww
               | w.youtu...
        
           | bangaladore wrote:
            | Sonnet 3.5 is fast for its quality. But yeah, it's nowhere
            | near Google's flash models. But I assume that is largely
            | just because it's a much smaller model.
        
         | Workaccount2 wrote:
          | They are going for large corporate customers. They are a brand
          | name with deep pockets and a pretty risk-averse model.
         | 
         | So even if Gemini sucks, they'll still win over execs being
         | pushed to make a decision.
        
           | RhodesianHunter wrote:
           | That doesn't seem like much of a plan given their trailing
           | position in the cloud space and the fact that Microsoft and
           | AWS both have their own offerings.
        
             | resters wrote:
              | Maybe Google is holding back its far superior, truly
              | sentient AI until other companies have broken the ice. Not
              | long ago there was a Google AI engineer who rage quit over
              | Google's treatment of sentient AI.
        
           | hadlock wrote:
            | Not even trying to be snarky, but their inability to keep
            | products alive for more than a handful of years does not
            | lend Google towards being chosen by large corporate
            | customers. I know a guy who works in cloud sales and his
           | government customers are PISSED they are sunsetting one of
           | their PDF products and are being forced to migrate that
           | process. The customer was expecting that to work for 10+
           | years and after a ~3 year onboarding process, they have 6
           | months to migrate. If my neck was on the line after buying
           | the google PDF product, I wouldn't even short list them for
           | an AI product.
        
             | jsnell wrote:
             | What Google Cloud pdf product is that? I thought my
             | knowledge of discontinued Google products was near-
             | encyclopedic, but this is the first I've heard of that.
             | 
             | But as an enterprise customer, if you expect X, don't you
             | get X into the contract?
        
       | simonw wrote:
       | This price drop is significant. For <128,000 tokens they're
       | dropping from $3.50/million to $1.25/million, and output from
       | $10.50/million to $2.50/million.
       | 
       | For comparison, GPT-4o is currently $5/million input and
       | $15/million output and Claude 3.5 Sonnet is $3/million input and
       | $15/million output.
       | 
       | Gemini 1.5 Pro was already the cheapest of the frontier models
       | and now it's even cheaper.
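To put the quoted per-million-token prices side by side, here's a rough cost sketch for a hypothetical workload of 500k input and 100k output tokens (numbers as quoted in this thread, not an official calculator):

```python
# Per-million-token prices (input, output) as quoted in this thread.
PRICES = {
    "gemini-1.5-pro (new, <128k)": (1.25, 2.50),
    "gemini-1.5-pro (old)": (3.50, 10.50),
    "gpt-4o": (5.00, 15.00),
    "claude-3.5-sonnet": (3.00, 15.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for one request at the quoted per-1M-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical workload: 500k tokens in, 100k tokens out.
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 500_000, 100_000):.2f}")
```

At that volume the new Gemini pricing works out to roughly a fifth of GPT-4o's list price as quoted above.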
        
         | GaggiX wrote:
          | GPT-4o is $2.50/$10, unless you look at an old checkpoint.
          | GPT-4o was the cheapest frontier model before.
        
           | simonw wrote:
           | I can't see that price on https://openai.com/api/pricing/ -
           | it's listing $5/m input and $15/m output for GPT-4o right
           | now.
           | 
            | No wait, correction - that's confusing: it lists 4o first
            | and then lists gpt-4o-2024-08-06 at $2.50/$10.
        
             | jeffharris wrote:
             | apologies: it's taken us a minute to switch the default
             | `gpt-4o` pointer to the newest snapshot
             | 
             | we're planning on doing that default change next week
                | (October 2nd). And you can get the lower prices now (and
                | the structured outputs feature) by manually specifying
                | `gpt-4o-2024-08-06`
        
               | jiggawatts wrote:
               | > "You can"
               | 
               | No, "I" can't.
               | 
               | Open AI has always trickled out model access, putting
               | their customers into "tiers" of access. I'm not
               | sufficiently blessed by the great Sam to have immediate
               | access.
               | 
                | Oh, and Azure OpenAI especially likes to drag their feet
                | both consistently, _and_ also on a per-region basis.
               | 
               | I live in a "no model for you" region.
               | 
               | Open AI says: "Wait your turn, peasant" while claiming to
               | be about democratising access.
               | 
               | Google and everyone else just gives access, no
               | gatekeeping.
        
         | RhodesianHunter wrote:
            | I wonder if they're pulling the Walmart model: ruthlessly
            | cut costs and sell at or below cost until your competitors
            | go out of business, then ratchet up the prices once you
            | have market dominance.
        
           | entropicdrifter wrote:
           | You think Google would engage in monopolistic practices like
           | that?
           | 
           | Because I do
        
             | 0cf8612b2e1e wrote:
             | I have no idea if this is dumping or not. At
             | Microsoft/Google scale, what does it cost to serve a
             | million LLM tokens?
             | 
             | Tough to disentangle the capex vs opex costs for them. If
             | they did not have so many other revenue streams,
             | potentially dicey as there are probably still many untapped
             | performance optimizations.
        
           | bko wrote:
           | Isn't Walmart still incredibly cheap? They have a net margin
           | of 2.3%
           | 
           | I think that's one of those things competitors complain about
           | that never actually happens (the raising prices part).
           | 
           | https://www.macrotrends.net/stocks/charts/WMT/walmart/net-
           | pr...
        
           | lacker wrote:
           | Probably not. Do they really believe they are going to knock
           | OpenAI out of business, when the OpenAI models are better?
           | 
           | Instead I think they are going after the "Android model".
           | Recognize they might not be able to dethrone the leader who
           | invented the space. Define yourself in the marketplace as the
            | cheaper alternative. "Less good but almost as good." In the
            | end, they hope to be one of a small number of surviving
            | members of a valuable oligopoly.
        
             | GaggiX wrote:
              | Android is more popular than iOS by a large margin, and
              | it's neither less good nor cheaper; it really depends on
              | the smartphone.
        
             | socksy wrote:
             | The latest Google Pixel phone (you know, the one that
             | Google actually set the price for) appears to cost the
             | exact same as the latest iPhone ($999 for pro, $799 for
             | non-pro). And I would argue against the "less good" bit
             | too.
             | 
             | I think this analysis is not in keeping with reality, and I
             | doubt if that's their strategy.
        
               | rajup wrote:
               | I doubt anyone buys Pixel phones at full price. They are
               | discounted almost right out of the gate.
        
             | scarmig wrote:
             | Cheapness has a quality all its own.
             | 
             | Gemini is substantially cheaper to run (in consumer prices,
             | and likely internally as well) than OpenAI's models. You
             | might wonder, what's the value in this, if the model isn't
             | leading? But cheaper inference could potentially be a
             | killer edge when you can scale test-time compute for
             | reasoning. Scaling test-time compute is, after all, what
             | makes o1 so powerful. And this new Gemini doesn't expose
             | that capability at all to the user, so it's comparing
             | apples and oranges anyway.
             | 
             | DeepMind researchers have never been primarily about LLMs,
             | but RL. If DM's (and OAI's) theory is correct--that you can
             | use test-time compute to generate better results, and train
             | on that--this is potentially a substantial edge for Google.
        
               | Dr4kn wrote:
               | In Home Assistant you can use LLMs to control your Home
               | with your voice. Gemini performs similar to the GPT
               | models, and with the cost difference there is little
               | reason to choose OpenAi
        
               | a_wild_dandan wrote:
                | Using _either_ frontier model for basic edge device
                | problems is wasteful. Use something cheap. We're asking
                | "is there a profitable niche between the best and
                | runner-up models?" I believe so.
        
               | zaptrem wrote:
               | Google still has an unbelievable training infrastructure
               | advantage. The second they can figure out how to convert
               | that directly to model performance without worrying about
               | data (as the o1 blog post seemed to imply OAI had)
               | they'll be kings.
        
             | JeremyNT wrote:
             | > _Probably not. Do they really believe they are going to
             | knock OpenAI out of business, when the OpenAI models are
             | better?_
             | 
             | Would OpenAI even _exist_ without Google publishing their
             | research? The idea that Google is some kind of also-ran
             | playing catch up here feels kind of wrong to me.
             | 
              | Sure OpenAI gave us the first _productized_ chatbots, so
              | in that sense they "invented the space," but it's not
              | like Google were over there twiddling their thumbs - they
              | just weren't exposing their models directly outside of
              | Google.
             | 
             | I think we're past the point where any of these tech giants
             | have some kind of moat (other than hardware, but you have
             | to assume that Google is at least at parity with OpenAI/MS
             | there).
        
           | charlie0 wrote:
           | Yes, and it's the exact same thing OpenAI/Microsoft and
           | Facebook are doing. In Facebook's case, they are giving it
           | away for free.
        
           | sangnoir wrote:
           | There's lot of room to cut margins in the AI stack right now
           | (see Nvidia's latest report); low prices are not an sure
           | indication of predatory pricing. Which company do you think
           | is most likely to have the lowest training and inference
           | costs between Anthropic, OpenAI and Google? My bet goes to
           | the one designing,producing and using their own TPUs.
        
         | pzo wrote:
         | whats confusing they have different pricing for output. Here
         | [1] it's $5/million output (starting 1st october) and on vertex
         | AI [2] it's $2.5/1 million (starting 7 october) - but
         | _characters_ - so it 's overall gonna be more expensive if you
         | wanna compare to equivalent 1 million tokens. It's actually
         | even more confusing to know what kind of characters they mean?
         | 1 byte? UTF-8?
         | 
         | [1] https://ai.google.dev/pricing
         | 
         | [2] https://cloud.google.com/vertex-ai/generative-ai/pricing
        
           | Deathmax wrote:
           | They do mention how characters are counted in the Vertex AI
           | pricing docs: "Characters are counted by UTF-8 code points
           | and white space is excluded from the count"
        
         | lossolo wrote:
         | > For comparison, GPT-4o is currently $5/million input and
         | $15/million output and Claude 3.5 Sonnet is $3/million input
         | and $15/million output.
         | 
         | Google is the only one of the three that has its own data
         | centers and custom inference hardware (TPU).
        
       | serjester wrote:
       | Gemini feels like an abusive relationship -- every few months,
       | they announce something exciting, and I'm hopeful that this time
       | will be different, that they've finally changed for the better,
       | but every time, I'm left regretting having spent any time with
       | them.
       | 
       | Their docs are awful, they have multiple unusable SDK's and the
       | API is flaky.
       | 
        | For example, I started bumping into "Recitation" errors - i.e.
        | they issue a flat-out refusal if your response resembles
        | anything in the training data. There's a GitHub issue with
        | hundreds of upvotes and they still haven't published formal
        | guidance on preventing this. Good luck trying to use the 1M
        | context window.
       | 
       | Everything is built the "Google" way. It's genuinely unusable
       | unless you're a total masochist and want to completely lock
       | yourself into the Google ecosystem.
       | 
       | The only thing they can compete on is price.
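Absent official guidance on those recitation refusals, one common workaround is simply retrying with slightly different sampling settings. A hypothetical sketch - the `generate` callable and `RecitationError` are placeholders for however your SDK surfaces the RECITATION finish reason:

```python
class RecitationError(Exception):
    """Placeholder for an SDK error signaling a RECITATION refusal."""

def generate_with_retry(generate, prompt, max_attempts=3):
    """Retry a generation call, nudging temperature up on each attempt
    to reduce the chance of a verbatim training-data match."""
    last_err = None
    for attempt in range(max_attempts):
        try:
            return generate(prompt, temperature=0.7 + 0.1 * attempt)
        except RecitationError as err:
            last_err = err
    raise last_err
```

This is a blunt instrument - it trades determinism for availability - but it reflects the kind of client-side defense people report resorting to.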
        
         | jatins wrote:
          | I think it's unusable if you are trying to use it via GCP.
          | Using it via AI Studio is a decent experience.
        
       | thekevan wrote:
       | Google does not miss one single opportunity to miss an
       | opportunity.
       | 
        | They announced a price reduction but it "won't be available for
        | a few days". By then, the initial hype will be over and the
        | consumer-use side of the opportunity to get new users will be
        | lost in other news.
        
       | phren0logy wrote:
       | As far as I can tell there's still no option for keeping data
       | private?
        
         | diggan wrote:
         | Makes sense, as soon as your data leaves your computer, it's
         | safe to assume it's no longer private, no matter what promises
         | a service gives you.
         | 
         | You want guaranteed private data that won't be used for
         | anything? Keep it on your own computer.
        
           | 999900000999 wrote:
            | Not sure why you're getting downvoted. Anything sent to a
            | cloud-hosted LLM is subject to being publicly released or
            | used in training.
           | 
           | Setting up a local LLM isn't that hard, although I'd probably
           | air gap anything truly sensitive. I like ollama, but it
           | wouldn't surprise me if it's phoning home.
        
             | phren0logy wrote:
              | This is just incorrect. The OpenAI models hosted through
              | Azure are HIPAA-compliant, and Anthropic will also sign a
              | BAA.
        
               | 999900000999 wrote:
                | I'm open to being wrong. However, for many industries
                | you're still running the risk of leaking data via a
                | third-party service.
               | 
               | You can run Llama3 on prem, which eliminates that risk. I
               | try to reduce reliance on 3rd party services when
               | possible. I still have PTSD from Saucelabs constantly
               | going down and my manager berating me over it.
        
               | caseyy wrote:
               | You are not technically wrong because a statement "there
               | is a risk of leaking data" is not falsifiable. But your
               | comment is performative cynicism to display your own high
               | standards. For the very vast majority of people and
               | companies, privacy standards-compliant services (like
               | HIPAA-compliant) are private enough.
        
             | spiralk wrote:
              | This is not true. Both OpenAI's and Google's LLM APIs
              | have a policy of not using the data sent over them. It's
              | no different than trusting Microsoft's or Google's cloud
              | to store private data.
        
               | phren0logy wrote:
               | Can you link to documentation for Google's LLMs? I
               | searched long and hard when Gemma 2 came out, and all of
               | the LLM offerings seemed specifically exempted. I'd love
               | to know if that has changed.
        
               | spiralk wrote:
               | https://ai.google.dev/gemini-api/terms this?
        
               | phren0logy wrote:
               | Thanks very much! I think before I looked at docs for
               | Google AI Studio, but also for Google Workspace, and both
               | made no guarantees.
               | 
               | From the linked document, so save someone else a click:
               | > The terms in this "Paid Services" section apply solely
               | to your use of paid Services ("Paid Services"), as
               | opposed to any Services that are offered free of charge
               | like direct interactions with Google AI Studio or unpaid
               | quota in Gemini API ("Unpaid Services").
        
               | Deathmax wrote:
               | There's some possible confusion because of the Copilot
               | problem where everything in the product stack is called
               | Gemini.
               | 
               | The Gemini API (or Generative Language API) as documented
               | on https://ai.google.dev uses
               | https://ai.google.dev/gemini-api/terms for its terms.
               | Paid usage, or usage from a UK/CH/EEA geolocated IP
               | address will not be used for training.
               | 
               | Then there's Google Cloud's Vertex AI Generative AI
               | offering, which has https://cloud.google.com/vertex-
               | ai/generative-ai/docs/data-g.... Data is not used for
               | training, and you can opt out of the 24 hour prompt cache
               | to effectively be zero retention.
               | 
               | And then there's all the different consumer facing Gemini
               | things. The chatbot at https://gemini.google.com/ (and
               | the Gemini app) uses data for training by default:
               | https://support.google.com/gemini/answer/13594961l,
               | unless you pay for Gemini Enterprise as part of Gemini
               | for Workspace.
               | 
               | Gemini in Chrome DevTools uses data for training (https:/
               | /developer.chrome.com/docs/devtools/console/understan...)
               | .
               | 
               | Enterprise features like Gemini for Workspace (generative
               | AI features in the office suite), Gemini for Google Cloud
               | (generative AI features in GCP), Gemini Code Assist,
               | Gemini in BigQuery/SecOps/etc do not use data for
               | training.
        
         | sweca wrote:
          | If you are on the pay-as-you-go model, your data is exempt
          | from training.
         | 
         | > When you're using Paid Services, Google doesn't use your
         | prompts (including associated system instructions, cached
         | content, and files such as images, videos, or documents) or
         | responses to improve our products, and will process your
         | prompts and responses in accordance with the Data Processing
         | Addendum for Products Where Google is a Data Processor. This
         | data may be stored transiently or cached in any country in
         | which Google or its agents maintain facilities.
         | 
         | https://ai.google.dev/gemini-api/terms
        
       | kendallchuang wrote:
       | Has anyone used Gemini Code Assist? I'm curious how it compares
       | with Github Copilot and Cursor.
        
         | spotlmnop wrote:
         | God, it sucks
        
         | spiralk wrote:
         | The Aider leaderboards seem like a good practical test of
         | coding usefulness: https://aider.chat/docs/leaderboards/. I
          | haven't tried Cursor personally, but I am finding Aider with
          | Sonnet more useful than GitHub Copilot, and it's nice to be
          | able to pick any model API. Eventually even a local model may
          | be viable. This new Gemini model does not rank very high,
          | unfortunately.
        
           | kendallchuang wrote:
           | Thanks for the link. That's unfortunate, though perhaps the
           | benchmarks will be updated after this latest Gemini release.
           | Cursor with Sonnet is great, I'll have to give Aider a try as
           | well.
        
             | spiralk wrote:
             | It is updated actually, gemini-1.5-pro-002 is this new
             | model.
        
               | kendallchuang wrote:
               | That was fast, I missed it!
        
             | KaoruAoiShiho wrote:
             | It's on the leaderboard, it's tied with qwen 2.5 72b and
             | far below SOTA of o1, claude sonnet, and deepseek. (also
             | below very old models like gpt-4-0314 lol)
        
         | dudus wrote:
         | I use it and find it very helpful. Never tried cursor or
         | copilot though
        
           | therein wrote:
           | I tried Cursor the other day. It was actually pretty cool. My
           | thought was, I'll open this open source project and use it to
           | grok my way around the codebase. It was very helpful. After
           | that I accidentally pasted an API secret into the document;
           | had to consider it compromised and re-issued the credential.
        
         | therein wrote:
         | I know you aren't necessarily talking about in-editor code
         | assist but something about in-editor AI cloud code assist makes
         | me super uncomfortable.
         | 
          | It makes sense that I need to be careful not to commit
          | secrets to public repositories, but now I have to avoid not
          | only saving credentials into a file but even pasting them by
          | accident into my editor?
        
         | mil22 wrote:
         | I have used Github Copilot extensively within VS Code for
         | several months. The autocomplete - fast and often surprisingly
         | accurate - is very useful. My only complaint is when writing
         | comments, I find the completions distracting to my thought
         | process.
         | 
         | I tried Gemini Code Assist and it was so bad by comparison that
         | I turned it off within literally minutes. Too slow and
         | inaccurate.
         | 
         | I also tried Codestral via the Continue extension and found it
         | also to be slower and less useful than Copilot.
         | 
         | So I still haven't found anything better for completion than
         | Copilot. I find long completions, e.g. writing complete
         | functions, less useful in general, and get the most benefit
         | from short, fast, accurate completions that save me typing,
         | without trying to go too far in terms of predicting what I'm
         | going to write next. Fast is the key - I'm a 185 wpm on
         | Monkeytype, so the completion had better be super low latency
         | otherwise I'll already have typed what I want by the time the
         | suggestion appears. Copilot wins on the speed front by far.
         | 
         | I've also tried pretty much everything out there for writing
         | algorithms and doing larger code refactorings, and answering
         | questions, and find myself using Continue with Claude Sonnet,
         | or just Sonnet or o1-preview via their native web interfaces,
         | most of the time.
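The latency budget implied by that typing speed is easy to sketch: at 185 wpm (using the usual 5-characters-per-word convention), the gap between keystrokes is only about 65 ms, which is roughly what a completion has to beat to arrive before the next character is typed:

```python
def inter_keystroke_ms(wpm: float, chars_per_word: float = 5.0) -> float:
    """Average milliseconds between keystrokes at a given typing speed."""
    chars_per_second = wpm * chars_per_word / 60.0
    return 1000.0 / chars_per_second

print(f"{inter_keystroke_ms(185):.0f} ms")  # roughly 65 ms per keystroke
```

That's well under typical LLM round-trip latencies, which may explain why only the fastest completion backends feel usable to fast typists.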
        
           | imp0cat wrote:
           | Have you tried Gitlab Duo and if so, what are your thoughts
           | on that?
        
             | mil22 wrote:
             | Not yet, hadn't heard of it. Thanks for the suggestion.
        
           | kendallchuang wrote:
            | I see - perhaps because the Gemini model is larger, it
            | takes longer to generate the completions. I would expect a
            | larger model to perform better on larger codebases. It
            | sounds like for you, it's faster to work with a smaller
            | model giving shorter, more accurate completions rather
            | than letting the model guess what you're trying to write.
        
         | danmaz74 wrote:
         | I tried it briefly and didn't like it. On the other hand, I
         | found Gemini pro better than sonnet or 4o at some more complex
         | coding tasks (using continue.dev)
        
       | TheAceOfHearts wrote:
       | I only use regular Gemini and the main feature I care about is
       | absolutely terrible: summarizing YouTube videos. I'll ask for a
       | breakdown or analysis of the video, and it'll give me a very high
       | level overview. If I ask for timestamps or key points, it begins
       | to hallucinate and make stuff up. It's incredibly disappointing
       | that such a huge company with effectively unlimited access to
       | both money and intellectual resources can't seem to implement a
       | video analysis feature that doesn't suck. Part of me wonders if
       | they're legitimately this incompetent or if they're deliberately
       | not implementing good analysis features because it could eat into
       | their views and advertisement opportunities.
        
         | foota wrote:
         | I imagine you'd be paying more in ML costs than YouTube makes
         | off your views.
        
         | bahmboo wrote:
         | I've had it fail when a video does not have subtitles - I'm
          | guessing that's what it uses. I have had good success having
          | it answer clickbait video titles like "is this the new best
          | thing?"
         | 
         | It's not watching the video as far as I can tell.
        
       | msp26 wrote:
       | Has anyone tried Google's context caching feature? The minimum
       | caching window being 32k tokens seems crazy to me.
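
The 32k-token minimum means caching only pays off for large, frequently reused prompt prefixes. A minimal client-side sketch of that trade-off (the helper name and the reuse heuristic are illustrative, not part of Google's SDK):

```python
# Sketch of gating Gemini context caching on the 32k-token minimum
# mentioned above. should_use_context_cache is a hypothetical helper,
# not a Google SDK function.

MIN_CACHE_TOKENS = 32_768  # minimum caching window per the docs

def should_use_context_cache(prompt_tokens: int, reuse_count: int) -> bool:
    """Caching only helps when the shared prefix meets the minimum
    window AND is reused enough times to amortize cache storage cost."""
    return prompt_tokens >= MIN_CACHE_TOKENS and reuse_count >= 2
```

In practice this is why the minimum feels restrictive: short system prompts or few-shot preambles never qualify, only large documents or long shared contexts do.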
        
       | jzebedee wrote:
       | As someone who actually had to build on Gemini, it was so
       | indefensibly broken that I couldn't believe Google really went to
       | production with it. Model performance changes from day to day and
       | production is completely unstable as Google will randomly decide
       | to tweak things like safety filtering with no notice. It's also
       | just plain buggy, as the agent scaffolding on top of Gemini will
       | randomly fail or break their own internal parsing, generating
       | garbage output for API consumers.
       | 
       | Trying to build an actual product on top of it was an exercise in
       | futility. Docs are flatly wrong, supposed features are vaporware
       | (discovery engine querying, anybody?), and support is
       | nonexistent. The only thing Google came back with was throwing
       | more vendors at us and promising that bug fixes were "coming
       | soon".
       | 
       | With all the funded engagements and credits they've handed out,
       | it's at the point where Google is paying us to use Gemini and
       | it's _still_ not worth the money.
        
         | wewtyflakes wrote:
         | > Docs are flatly wrong
         | 
         | This +999; I couldn't believe how inconsistent and wrong the
         | docs were. Not only that, but once I got something successfully
         | integrated, it worked for a few weeks then the API was changed,
         | so I was back to square one. I gave it a half-hearted try to
         | fix it but ultimately said 'never again'! Their offering would
         | have to be overwhelmingly better than Anthropic and OpenAI for
         | me to consider using Gemini again.
        
         | victor106 wrote:
         | Same experience here.
         | 
         | I had hopes of Google able to compete with Claude and OpenAI.
         | But I don't think that's the case. Unless they come out with a
         | product that's 10x better in the next year or so I think they
         | lost the AI race.
        
         | ldjkfkdsjnv wrote:
          | The engineers at Google are bad; they keep hiring via pure
          | leetcode. Can't ship working products.
        
       | naiv wrote:
       | This sounds interesting:
       | 
       | "We will continue to offer a suite of safety filters that
       | developers may apply to Google's models. For the models released
       | today, the filters will not be applied by default so that
       | developers can determine the configuration best suited for their
       | use case."
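
In the Python google-generativeai SDK, this translates to passing an explicit safety_settings configuration instead of relying on defaults. A minimal sketch (the string category/threshold names follow the API's documented forms; verify against the current docs before relying on them):

```python
# Sketch of explicitly relaxing safety filters for the new models.
# With no filters applied by default, stating thresholds explicitly
# documents the behavior you actually want.

safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
]

# Typically passed as:
#   genai.GenerativeModel("gemini-1.5-pro-002",
#                         safety_settings=safety_settings)
```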
        
         | ancorevard wrote:
         | This is the most important update.
         | 
         | Pricing and speed doesn't matter when your call fails because
         | of "safety".
        
           | CSMastermind wrote:
           | Also Google's safety filters are absolutely awful. Beyond
           | parody levels of bad.
           | 
           | This is a query I did recently that got rejected for "safety"
           | reasons:
           | 
           | Who are the current NFL starting QBs?
           | 
           | Controversial I know, I'm surprised I'd be willing to take
           | the risk with submitting such a dangerous query to the model.
        
             | elashri wrote:
              | No stranger than my experience with OpenAI. I got banned
              | from DALL-E 3 access when it first came out because I
              | asked in the prompt for a particle moving in a magnetic
              | field in the forward direction that decays to two other
              | particles, with a kink angle between the particle and the
              | charged daughter.
              | 
              | I don't recall the exact prompt, but it was something
              | close to that. I really wonder what filters they had
              | about kink tracks and why. Do they have a problem with
              | beyond-Standard-Model searches? /s
        
               | CSMastermind wrote:
               | For what it's worth I run every query I make through all
               | the major models and Google's censorship is the only one
               | I consistently hit.
               | 
                | I think I bumped into Anthropic's once? And I know I
                | hit ChatGPT's a few months back, but I don't even
                | remember what the issue was.
               | 
               | I hit Google's safety blocks at least a few times a week
               | during the course of my regular work. It's actually crazy
               | to me that they allowed someone to ship these
               | restrictions.
               | 
               | They must either think they will win the market no matter
               | the product quality or just not care about winning it.
        
         | KaoruAoiShiho wrote:
          | There are still basic filters even after you turn off all
          | the ones that can be disabled from the UI. It's still not
          | capable of summarizing some YA novels I tried to feed it
          | because of those filters.
        
         | panarky wrote:
         | The "safety" filters used to make Gemini models nearly
         | unusable.
         | 
         | For example, this prompt was apparently unsafe: _" Summarize
         | the conclusions of reputable econometric models that estimate
         | the portion of import tariffs that are absorbed by the
         | exporting nation or company, and what portion of import tariffs
         | are passed through to the importing company or consumers in the
         | importing nation. Distinguish between industrial commodities
         | like steel and concrete from consumer products like apparel and
         | electronics. Based on the evidence, estimate the portion of
         | tariffs passed through to the importing company or nation for
         | each type of product."_
         | 
         | I can confirm that this prompt is no longer being filtered
         | which is a huge win given these new lower token prices!
        
       | stan_kirdey wrote:
       | Google should just offer llama3 405b, maybe slightly fine tuned.
       | Geminis are unusable.
        
         | tiborsaas wrote:
          | AI companies should stop naming models after astrological
          | signs; after a while it will be hard to tell model reviews
          | apart from horoscopes.
        
         | dcchambers wrote:
         | > Geminis are unusable
         | 
         | how so?
        
         | romland wrote:
         | Pretty sure Google's got more than 700 million active users.
         | 
         | In fact, Google is most likely _the_ target for that clause in
         | the license.
        
         | nmfisher wrote:
         | I've found Gemini Pro to be the most reliable for function
         | calling.
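
For readers unfamiliar with the pattern, function calling boils down to the model emitting a structured call (name plus arguments) that the client dispatches to local code. A minimal, SDK-free sketch (the model_response dict is hand-written to show the shape; get_weather is a hypothetical tool):

```python
# Sketch of the client side of a function-calling loop. A real Gemini
# reply carries a function call with the same name/args shape; here the
# response is simulated so the dispatch logic is runnable on its own.

def get_weather(city: str) -> str:
    # Placeholder tool implementation.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(function_call: dict) -> str:
    """Route a structured model call to the matching local function."""
    fn = TOOLS[function_call["name"]]
    return fn(**function_call["args"])

# Simulated model output, not an actual API response:
model_response = {"name": "get_weather", "args": {"city": "Zurich"}}
print(dispatch(model_response))  # prints "Sunny in Zurich"
```

In a real integration, the tool's return value is sent back to the model as a function response so it can compose the final answer.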
        
         | VirusNewbie wrote:
          | Llama seems to be tripped up by simple puzzles that Gemini
          | does not struggle with, in my experience.
        
         | Deathmax wrote:
         | They do: https://cloud.google.com/vertex-ai/generative-
         | ai/docs/partne...
        
       | resource_waste wrote:
        | It's just not as smart as ChatGPT or Llama; it's mind-boggling
        | that Google fell so far behind.
        
         | Mistletoe wrote:
         | Has it changed much since March when this was written? Gemini
         | won, 6-4. And it matches my experiences of just using Gemini
         | instead of ChatGPT because it gives me more useful responses,
         | when I feel like I want an AI answer.
         | 
         | https://www.tomsguide.com/ai/google-gemini-vs-openai-chatgpt
        
           | w23j wrote:
           | _" For this initial test I'll be comparing the free version
           | of ChatGPT to the free version of Google Gemini, that is
           | GPT-3.5 to Gemini Pro 1.0."_
           | 
           | The free version of ChatGPT is 4o now, isn't it? So maybe
           | Gemini has not gotten worse, but the free alternatives are
            | now better? When I compare ChatGPT-4o with Gemini Advanced
            | (which is 1.5 Pro, I believe), the latter is just so much
            | worse.
        
       | ldjkfkdsjnv wrote:
       | They have to drop the price because the model is bad. People will
       | pay almost any cost for a model that is much better than the
       | rest. How this company carries on the facade of competence is
       | laughable. All the money on the planet, and they still cannot win
       | on their core "competency".
        
       | FergusArgyll wrote:
       | Any opinions on pro-002 vs pro-exp-0827 ?
       | 
       | Unlike others here I really appreciate the gemini API, it's free
       | and it works. I haven't done too many complicated things with it
       | but I made a chatbot for the terminal, a forecasting agent (for
       | metaculus challenge) and a yt-dlp auto namer of songs. The point
       | for me isn't really how it compares to openAI/anthropic, it's a
       | free API key and I wouldn't have made the above if I had to pay
       | just to play around
        
       | rkwasny wrote:
        | Can someone explain to me why there is COMPLETELY different
        | pricing for models on Vertex AI vs. Google AI Studio, and
        | OpenRouter has yet another price...
        
         | hiddencost wrote:
         | https://www.businessinsider.com/big-tech-org-charts-2011-6
         | 
         | Google is now looking more like the Microsoft chart.
        
       | sweca wrote:
        | No HumanEval benchmark result?
        
       | kebsup wrote:
       | Is there a good benchmark comparing multilingual and/or
       | translation abilities of most recent LLMs? GPT-4o struggles for
       | some tasks in my language learning app.
        
       | charlie0 wrote:
       | Anyone want to take bets on how long it takes for this to hit the
       | Google Graveyard?
        
       | bn-l wrote:
        | I've used it. The API is incredibly buggy and flaky. A
       | particular pain point is the "recitation error" fiasco. If you're
       | developing a real world app this basically makes the Gemini api
       | unusable. It strikes me as a kind of "Potemkin" service.
       | 
       | Google is aware of the issue and it has been open on google's bug
       | tracker since _March_ 2024:
       | https://issuetracker.google.com/issues/331677495
       | 
       | There is also discussion on GitHub: https://github.com/google-
       | gemini/generative-ai-js/issues/138
       | 
        | It stems from something Google added intentionally to prevent
        | copyrighted material being returned verbatim (a la the NYT/
        | OpenAI fiasco): they dialled up the "recitation" control, which
        | suppresses output that repeats training data (and maybe data
        | they should not legally have trained on).
       | 
       | Here are some quotes from the bug tracker page:
       | 
       | > I got this error by just asking "Who is Google?"
       | 
       | > We're encountering recitation errors even with basic tutorials
       | on application development. When bootstrapping a Spring Boot app,
       | we're flagged for the pom.xml being too similar to some blog
       | posts.
       | 
       | > This error is a deal breaker... It occurs hundreds of times a
       | day for our users and massively degrades their UX.
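
Until Google fixes it, the workaround many report is simply re-sampling when a candidate comes back with a RECITATION finish reason. A hedged sketch (generate is a stand-in for a real API call; the temperature bump is a heuristic, not an official mitigation):

```python
# Sketch of a retry loop for the RECITATION failure mode described
# above. Re-sampling at a slightly higher temperature often breaks a
# verbatim decode; after max_attempts we surface a clear error instead
# of garbage output.
import random

class RecitationError(Exception):
    pass

def generate_with_retry(generate, prompt, max_attempts=3):
    temperature = 0.7
    for _ in range(max_attempts):
        candidate = generate(prompt, temperature=temperature)
        if candidate["finish_reason"] != "RECITATION":
            return candidate["text"]
        # Nudge temperature with a little jitter before retrying.
        temperature = min(1.0, temperature + 0.15 + random.random() * 0.05)
    raise RecitationError(
        f"blocked {max_attempts} times for prompt: {prompt!r}"
    )
```

This doesn't fix the underlying filter; it just converts intermittent failures into either a successful retry or an explicit, loggable error.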
        
         | screye wrote:
         | The recitation error is a big deal.
         | 
         | I was ready to champion gemini use across my organization, and
         | the recitation issue curbed any enthusiasm I had. It's opaque
         | and Google has yet to suggest a mitigation.
         | 
         | Your comment is not hyperbole. It's a genuine expression of how
         | angry many customers are.
        
       | anotherpaulg wrote:
       | The new Gemini models perform basically the same as the previous
       | versions on aider's code editing benchmark. The differences seem
       | within the margin of error.
       | 
       | https://aider.chat/docs/leaderboards/
        
       | ramshanker wrote:
        | One company is buying expensive NVIDIA hardware while the
        | other uses in-house chips. Google has a huge advantage here.
        | They could really undercut OpenAI.
        
         | rty32 wrote:
         | People have said that for many years. Very few companies are
         | choosing Google's TPUs. Everyone wants H100s.
        
       | accumulator wrote:
        | Cool, now all Google has to do is make it easier to onboard
        | new GCP customers and more people will probably use it... it's
        | comical how hard it is to create a new GCP organization and
        | billing account. Also, I think more Workspace customers would
        | try Gemini if it were a usage-based trial as opposed to
        | clicking a "Try for 14 days" CTA to activate a new
        | subscription.
        
       ___________________________________________________________________
       (page generated 2024-09-24 23:01 UTC)