[HN Gopher] Gemma: New Open Models
___________________________________________________________________
Gemma: New Open Models
Author : meetpateltech
Score : 880 points
Date : 2024-02-21 13:03 UTC (9 hours ago)
(HTM) web link (blog.google)
(TXT) w3m dump (blog.google)
| smcn wrote:
| There are some pretty impressive benchmarks on
| https://ai.google.dev/gemma. Even the 2b model looks fairly not
| awful?
|
| I guess my weekend is going to be spent exploring this.
| alekandreev wrote:
| Hello on behalf of the Gemma team! We are really excited to
| answer any questions you may have about our models.
|
| Opinions are our own and not of Google DeepMind.
| declaredapple wrote:
| Congrats on the launch and thanks for the contribution! This
| looks like it's on par with or better than Mistral 7B 0.1, or
| is that 0.2?
|
| Are there plans for MoE or 70B models?
| kathleenfromgdm wrote:
| Great question - we compare to the Mistral 7B 0.1 pretrained
| models (since there were no pretrained checkpoint updates in
| 0.2) and the Mistral 7B 0.2 instruction-tuned models in the
| technical report here: https://goo.gle/GemmaReport
| zitterbewegung wrote:
| Do you have a plan of releasing higher parameter models?
| alekandreev wrote:
| We have many great things in research and development phases,
| so stay tuned. I'm hopeful we can share more in the coming
| weeks and months!
| brucethemoose2 wrote:
| That is awesome!
|
| I hope y'all consider longer context models as well.
|
| Also, are y'all looking at alternative architectures like
| Mamba? Being "first" with a large Mamba model would cement
| your architectural choices/framework support like llama did
| for Meta.
| neximo64 wrote:
| How are these performing so well compared to Llama 2? Are there
| any documents on the architecture and differences? Is it MoE?
|
| Also note some of the links in the blog post don't work, e.g.
| the debugging tool.
| kathleenfromgdm wrote:
| We've documented the architecture (including key differences)
| in our technical report here (https://goo.gle/GemmaReport),
| and you can see the architecture implementation in our Git
| Repo (https://github.com/google-deepmind/gemma).
| h1t35h wrote:
| It seems you have exposed the internal debugging tool link in
| the blog post. You may want to do something about it.
| trisfromgoogle wrote:
| Ah, I see -- the link is wrong, thank you for flagging!
| Fixing now.
| neximo64 wrote:
| The link to the debugging tool is an internal one, no one
| outside Google can access it
| h1t35h wrote:
| The blog post shares the link for the debugging tool as
| https://*.*.corp.google.com/codelabs/responsible-ai/lit-gemm...
|
| The .corp domain and the login redirect make me believe it was
| supposed to be an internal link.
| littlestymaar wrote:
| Same for the "safety classifier"
| barrkel wrote:
| https://codelabs.developers.google.com/codelabs/responsible-...
| wrexx0r wrote:
| The link in the Debugging section redirects to a Google SSO
| login page
| pama wrote:
| Will these soon be available on lmsys for human comparison
| against other models? Can they run with llama.cpp?
| ErneX wrote:
| Yes to llama.cpp
|
| https://twitter.com/ggerganov/status/1760293079313973408
| sbarre wrote:
| I came here wondering if these models are "open" in the
| sense that they'll show up on sites like Ollama where you
| can download and run them locally.
|
| Am I correct to conclude that this means they eventually
| will?
|
| It's unclear to me from Google's docs exactly what "open"
| means for Gemma
| benpacker wrote:
| Yes - they are open weights and open inference code,
| which means they can be integrated into Ollama.
|
| They are not "open training" (either in the training code
| or training data sense), so they are not reproducible,
| which some have suggested ought to be a component of the
| definition of open models.
| OJFord wrote:
| It really should, shouldn't it? I'm quite ML-naive, but
| surely providing the model without 'training code or
| training data' is just like providing a self-hostable
| binary without the source code? Nobody calls that open
| source, it's not even source available.
| sunnybeetroot wrote:
| That's why they're called open as in free to use how you
| wish, not open source where the source of the training is
| also provided.
| OJFord wrote:
| But my point is there's no analogy for that that we call
| open? It's like self-hostable, or free (as in beer).
| sunnybeetroot wrote:
| That's a fair comment, maybe free-to-use is more
| appropriate.
| idiotsecant wrote:
| Man, people will find anything to complain about.
| OJFord wrote:
| I'm not complaining, I'm unlikely ever to use it
| (regardless of how open or not it is) so it doesn't
| really matter to me, just surprised to learn what people
| mean by 'open' in this context.
| michaelt wrote:
| It is widely believed (and in some cases acknowledged)
| that a lot of models are trained on copyrighted data
| scraped from the web. In some cases, even scrapes of
| ebook piracy websites - google 'books3' to learn more.
|
| Some companies (such as those working on AI) believe this
| is legal, others (such as the copyright holders to those
| books) believe it isn't.
|
| In any case, IMHO it's unlikely any cutting edge models
| will be offering us their training data any time soon.
| SushiHippie wrote:
| https://huggingface.co/google/gemma-7b-it/tree/main
|
| yes, similar to the llama models, you'll also need to
| accept the license to download them officially. But the
| llama models have been unofficially downloadable without
| accepting the license for quite a while, so it's probably
| just a matter of time.
| artninja1988 wrote:
| I find the snide remarks around open source in the paper and
| announcement rather off-putting.
|
| As the ecosystem evolves, we urge the corporate AI community to
| move beyond demanding to be taken seriously as a player in open
| source for models that are not actually open, and avoid
| preaching with a PR statement that can be interpreted as
| uninformed at best or malicious at worst.
| silentsanctuary wrote:
| Which remarks are you referring to?
| artninja1988 wrote:
| The snide remarks at Meta's Llama license, which doesn't allow
| companies with over 700 million monthly active users to use it,
| while this model also doesn't have a really 'open' license
| itself and also this paragraph:
|
| >As the ecosystem evolves, we urge the wider AI community
| to move beyond simplistic 'open vs. closed' debates, and
| avoid either exaggerating or minimising potential harms, as
| we believe a nuanced, collaborative approach to risks and
| benefits is essential. At Google DeepMind we're committed
| to developing high-quality evaluations and invite the
| community to join us in this effort for a deeper
| understanding of AI systems.
| tomComb wrote:
| Well, given that that restriction added to the meta-llama
| license is aimed at Google, is petty, and goes against
| open source norms, I think it's reasonable that they
| should feel this way about it.
| lordswork wrote:
| How is this a snide remark? It's factual and prevented
| their team from benchmarking against Llama 2.
| trisfromgoogle wrote:
| Quick question -- can you tell me where you got that
| quote? It's not in the main blog or any of the launch
| communications that I can see.
| artninja1988 wrote:
| The quote is from the technical report
|
| https://storage.googleapis.com/deepmind-media/gemma/gemma-re...
| trisfromgoogle wrote:
| It would be great to understand what you mean by this -- we
| have a deep love for open source and the open developer
| ecosystem. Our open source team also released a blog today
| describing the rationale and approach for open models and
| continuing AI releases in the open ecosystem:
|
| https://opensource.googleblog.com/2024/02/building-open-mode...
|
| Thoughts and feedback welcome, as always.
| artninja1988 wrote:
| The statement about not being able to use LLaMA 2 to
| benchmark is also false and highly misleading, see
| https://x.com/BlancheMinerva/status/1760302091166241163?s=20
| lordswork wrote:
| If, on the Llama 2 version release date, the monthly
| active users [...] is greater than 700 million monthly
| active users [...] you are not authorized to exercise any
| of the rights under this Agreement
|
| I would guess this is Google being careful to not be
| burned by this lame clause in the Llama 2 license.
| mrob wrote:
| If you truly love Open Source, you should update the
| language you use to describe your models so it doesn't
| mislead people into thinking it has something to do with
| Open Source.
|
| Despite being called "Open", the Gemma weights are released
| under a license that is incompatible with the Open Source
| Definition. It has more in common with Source-Available
| Software, and as such it should be called a "Weights-
| Available Model".
| jppittma wrote:
| Working at google is like this, where no matter how much you
| try to do the right thing you're always under attack.
| tosh wrote:
| Are there any plans for releasing the datasets used?
| alekandreev wrote:
| This would be really interesting in my opinion, but we are
| not releasing datasets at this time. See the C4 dataset for
| an earlier open dataset from Google.
| sbarre wrote:
| Can the Gemma models be downloaded to run locally, like open-
| source models Llama2, Mistral, etc ?
|
| Or is your definition of "open" different?
| kathleenfromgdm wrote:
| Yes, you can get started downloading the model and running
| inference on Kaggle:
| https://www.kaggle.com/models/google/gemma ; for a full list
| of ways to interact with the model, you can check out
| https://ai.google.dev/gemma.
| dartharva wrote:
| Can we have llamafile releases as well?
|
| https://github.com/Mozilla-Ocho/llamafile
| syntaxing wrote:
| A small typo in your model link that breaks it. There's an
| extra ; on the end.
| kathleenfromgdm wrote:
| Corrected - thanks :)
| Kostic wrote:
| It should be possible to run it via llama.cpp[0] now.
|
| [0] https://github.com/ggerganov/llama.cpp/pull/5631
| nerdix wrote:
| Amazing how quickly this happened.
| tomp wrote:
| Their definition of "open" is "not open", i.e. you're only
| allowed to use Gemma in a "non-harmful" way.
|
| We all know that Google thinks that saying that 1800s English
| kings were _white_ is "harmful".
| wantsanagent wrote:
| Not sure why you're getting downvoted. I would have thought
| HN of all places would recognize the power and value of OSI
| licensing and the danger of the proliferation of these
| source available but definitely not Open Source licenses.
| hackerlight wrote:
| > We all know that Google thinks that saying that 1800s
| English kings were white is "harmful".
|
| If you know how to make "1800s english kings" show up as
| white 100% of the time without also making "kings" show up
| as white 100% of the time, maybe you should apply to
| Google? Clearly you must have advanced knowledge on how to
| perfectly remove bias from training distributions if you
| casually throw stones like this.
| trackflak wrote:
| Tell me you take this seriously:
| https://twitter.com/napoleon21st/status/1760116228746805272
|
| It has no problem with other cultures and ethnicities,
| yet somehow white or Japanese just throws everything off?
|
| I suppose 'bias' is the new word for "basic historic
| accuracy". I can get curious about other peoples without
| forcibly promoting them at the expense of my own Western
| and British people and culture. This 'anti-bias' keyword
| injection is a laughably bad, in-your-face solution to a
| non-issue.
|
| I lament the day 'anti-bias' AI this terrible is used to
| make real world decisions. At least we now know we can't
| trust such a model because it has already been so
| evidently crippled by its makers.
| austinvhuang wrote:
| Yes models can be downloaded locally. In addition to the
| python NN frameworks and ggml as options, we also implemented
| a standalone C++ implementation that you can run locally at
| https://github.com/google/gemma.cpp
| mrob wrote:
| Mistral weights are released under an Apache 2.0 license, but
| Llama 2 weights are released under a proprietary license that
| prohibits use by large organizations and imposes usage
| restrictions, violating terms 5 and 6 of the Open Source
| Definition[0]. Even if you accept that a model with a
| proprietary training dataset and proprietary training code
| can be considered "open source", there's no way Llama 2
| qualifies.
|
| For consistency with existing definitions[1], Llama 2 should
| be labeled a "weights available" model.
|
| [0] https://en.wikipedia.org/wiki/The_Open_Source_Definition
|
| [1] https://en.wikipedia.org/wiki/Source-available_software
| vorticalbox wrote:
| are there plans to release an official GGUF version to use with
| llama.cpp?
| espadrine wrote:
| It is already part of the release on Huggingface:
| https://huggingface.co/google/gemma-7b/blob/main/gemma-7b.gg...
|
| It is a pretty clean release! I had some 500 issues with
| Kaggle validating my license approval, so you might too, but
| after a few attempts I could access the model.
| vorticalbox wrote:
| I didn't see this when searching, thanks!
| sqreept wrote:
| What are the supported languages of these models?
| alekandreev wrote:
| This v1 model is focused on English support, but you may find
| some multilingual capabilities.
| lnyan wrote:
| Will there be Gemma-vision models or multimodal Gemma models?
| Jayakumark wrote:
| Have the same question.
| CuriouslyC wrote:
| It's cool that you guys are able to release open stuff, that
| must be a nice change from the modus operandi at goog. I'll
| have to double-check, but it looks like Phi-2 beats your
| performance in some cases while being smaller. I'm guessing the
| value proposition of these models is being small and good while
| also having more knowledge baked in?
| turnsout wrote:
| What is the license? I couldn't find it on the 1P site or
| Kaggle.
| trisfromgoogle wrote:
| You can find the terms on our website, ai.google.dev/gemma:
|
| https://ai.google.dev/gemma/terms
| spiantino wrote:
| out of curiosity, why is this a "terms" and not a license?
| I'm used to reading and understanding the software as
| coming with a license to use it. Do the terms give us
| license to use this explicitly?
| turnsout wrote:
| They do, but unlike a known license, these terms are
| custom and non-standard. Which means I would guide my
| commercial clients away from this particular model.
| audessuscest wrote:
| Does this model also think Germans were black 200 years ago?
| Or is it afraid to answer basic stuff? Because if that is the
| case, no one will care about that model.
| freedomben wrote:
| I don't know anything about these twitter accounts so I don't
| know how credible they are, but here are some examples for
| your downvoters that I'm guessing just think you're just
| trolling or grossly exaggerating:
|
| https://twitter.com/aginnt/status/1760159436323123632
|
| https://twitter.com/Black_Pilled/status/1760198299443966382
| robswc wrote:
| Yea. Just ask it anything about historical people/cultures
| and it will seemingly lobotomize itself.
|
| I asked it about early Japan and it talked about how
| European women used Katanas and how Native Americans rode
| across the grassy plains carrying traditional Japanese
| weapons. Pure made up nonsense that not even primitive
| models would get wrong. Not sure what they did to it. I
| asked it why it assumed Native Americans were in Japan in
| the 1100s and it said:
|
| > I assumed [...] various ethnicities, including Indigenous
| American, due to the diversity present in Japan throughout
| history. However, this overlooked [...] I focused on
| providing diverse representations without adequately
| considering the specific historical context.
|
| How am I supposed to take this seriously? Especially on
| topics I'm unfamiliar with?
| trackflak wrote:
| From one of the Twitter threads linked above:
|
| > they insert random keyword in the prompts randomly to
| counter bias, that got revealed with something else I
| think. Had T shirts written with "diverse" on it as
| artifact
|
| This was exposed as being the case with OpenAI's DALL-E
| as well - someone had typed a prompt of "Homer Simpson
| wearing a namebadge" and it generated an image of Homer
| with brown skin wearing a namebadge that said 'ethnically
| ambiguous'.
|
| This is ludicrous - if they are fiddling with your prompt
| in this way, it will only stoke more frustration and
| resentment - achieving the opposite of why this has been
| implemented. Surely if we want diversity we will ask for
| it, but sometimes you don't, and that should be at the
| user's discretion.
|
| Another thread for context:
| https://twitter.com/napoleon21st/status/1760116228746805272
| graphe wrote:
| I disagree, coding and RAG performance is all that matters to
| me. I'm not using an LLM to learn basic facts I already know.
| TheHypnotist wrote:
| How do you ragebait for premium pearl clutching?
| audessuscest wrote:
| we're at the basic-knowledge level; if your RAG implies some of
| it, you can get bad results too. Anyway, would you use a
| model that makes this nonsense response or one that doesn't?
| I know which one I will prefer for sure...
| graphe wrote:
| If this was better at specific RAG or coding performance
| I would absolutely, certainly without a doubt use it over
| a general instruct model in those instances.
| brucethemoose2 wrote:
| Will there be "extended context" releases like 01.ai did for
| Yi?
|
| Also, is the model GQA?
| hustwindmaple1 wrote:
| It's MQA, documented in the tech report
| lordswork wrote:
| Is there any truth behind this claim that folks who worked on
| Gemma have left Google?
|
| https://x.com/yar_vol/status/1760314018575634842
| CaffeinatedDev wrote:
| Them: here to answer questions
|
| _Question_
|
| Them: :O
| lordswork wrote:
| To be fair, I think they are in London, so I assume they
| have wound down for the day. Will probably have to wait
| ~12-18 hours for a response.
| elcomet wrote:
| It seems very easy to check no? Look at the names in the
| paper and check where they are working now
| lordswork wrote:
| Good idea. I've confirmed all the leadership / tech leads
| listed on page 12 are still at Google.
|
| Can someone with a Twitter account call out the tweet
| linked above and ask them specifically who they are
| referring to? Seems there is no evidence of their claim.
| lordswork wrote:
| I confirmed all the folks listed on page 12 are still at
| Google (listed below). I am guessing the linked tweet is a BS
| claim.
|
|     # Product Management: Tris Warkentin, Ludovic Peran
|     # Program Management: Minh Giang
|     # Executive Sponsors: Clement Farabet, Oriol Vinyals, Jeff Dean,
|       Koray Kavukcuoglu, Demis Hassabis, Zoubin Ghahramani,
|       Douglas Eck, Joelle Barral, Fernando Pereira, Eli Collins
|     # Leads: Armand Joulin, Noah Fiedel, Evan Senter
|     # Tech Leads: Alek Andreev+, Kathleen Kenealy+
| bluefinity wrote:
| To be fair, the tweet says that they don't work on the models
| at Google anymore, not that they have left Google.
|
| Might be true, might not be. It's unsourced speculation.
| memossy wrote:
| Training on 4096 v5es, how did you handle the crazy batch size? :o
| quickgist wrote:
| Will this be available as a Vertex AI foundational model like
| Gemini 1.0, without deploying a custom endpoint? Any info on
| pricing? (Also, when will Gemini 1.5 be available on Vertex?)
| moffkalast wrote:
| I'm not sure if this was mentioned in the paper somewhere, but
| how much does the super large 256k tokenizer vocabulary
| influence inference speed, and how much higher is the average
| text compression compared to llama's usual 30k? In short, is it
| really worth going beyond GPT-4's 100k?
| dmnsl wrote:
| Hi, what is the cutoff date ?
| legohead wrote:
| All it will tell me is mid-2018.
| cypress66 wrote:
| Can you share the training loss curve?
| fosterfriends wrote:
| Not a question, but thank you for your hard work! Also, brave
| of you to join the HN comments, I appreciate your openness.
| Hope y'all get to celebrate the launch :)
| voxgen wrote:
| Thank you very much for releasing these models! It's great to
| see Google enter the battle with a strong hand.
|
| I'm wondering if you're able to provide any insight into the
| below hyperparameter decisions in Gemma's architecture, as they
| differ significantly from what we've seen with other recent
| models?
|
| * On the 7B model, the `d_model` (3072) is smaller than
| `num_heads * d_head` (16*256=4096). I don't know of any other
| model where these numbers don't match.
|
| * The FFN expansion factor of 16x is MUCH higher than the
| Llama-2-7B's 5.4x, which itself was chosen to be equi-FLOPS
| with PaLM's 4x.
|
| * The vocab is much larger - 256k, where most small models use
| 32k-64k.
|
| * GQA is only used on the 2B model, where we've seen other
| models prefer to save it for larger models.
|
| These observations are in no way meant to be criticism - I
| understand that Llama's hyperparameters are also somewhat
| arbitrarily inherited from its predecessors like PaLM and
| GPT-2, and that it's non-trivial to run hyperopt on such large
| models. I'm just really curious about what findings motivated
| these choices.
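|
| A quick back-of-envelope check of those numbers (Python, using
| only the figures quoted in the list above, not anything taken
| from the report itself):
|
|     # sanity-check the hyperparameters discussed above
|     d_model, num_heads, d_head = 3072, 16, 256
|     print(num_heads * d_head)   # 4096, which != d_model (3072)
|     print(49152 / d_model)      # 16.0x FFN expansion (the report's hidden dim)
|     # Llama-2-7B: d_model 4096, intermediate 11008; counting the
|     # gate and up projections together gives the ~5.4x mentioned above
|     print(2 * 11008 / 4096)     # 5.375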
| owl_brawl wrote:
| I would love answers to these questions too, particularly on
| the vocab size
| LorenDB wrote:
| EDIT: it seems this is likely an Ollama bug, please keep that
| in mind for the rest of this comment :)
|
| I ran Gemma in Ollama and noticed two things. First, it is
| slow. Gemma got less than 40 tok/s while Llama 2 7B got over 80
| tok/s. Second, it is very bad at output generation. I said
| "hi", and it responded this:
|
| ``` Hi, . What is up? melizing with you today!
|
| What would you like to talk about or hear from me on this fine
| day?? ```
|
| With longer and more complex prompts it goes completely off the
| rails. Here's a snippet from its response to "Explain how to
| use Qt to get the current IP from https://icanhazip.com":
|
| ``` python print( "Error consonming IP arrangration at [local
| machine's hostname]. Please try fufing this function later!")
| ## guanomment messages are typically displayed using
| QtWidgets.MessageBox ```
|
| Do you see similar results on your end or is this just a bug in
| Ollama? I have a terrible suspicion that this might be a
| completely flawed model, but I'm holding out hope that Ollama
| just has a bug somewhere.
| mark_l_watson wrote:
| I was going to try these models with Ollama. Did you use a
| small number of bits/quantization?
| LorenDB wrote:
| The problem exists with the default 7B model. I don't know
| if different quantizations would fix the problem. The 2B
| model is fine, though.
| jmorgan wrote:
| Hi! This is such an exciting release. Congratulations!
|
| I work on Ollama and used the provided GGUF files to quantize
| the model. As mentioned by a few people here, the 4-bit integer
| quantized models (which Ollama defaults to) seem to have
| strange output with non-existent words and funny use of
| whitespace.
|
| Do you have a link/reference as to how the models were
| converted to GGUF format? And is it expected that quantizing
| the models might cause this issue?
|
| Thanks so much!
| espadrine wrote:
| As a data point, using the Huggingface Transformers 4-bit
| quantization yields reasonable results:
| https://twitter.com/espadrine/status/1760355758309298421
| kleiba wrote:
| > _We are really excited to answer any questions you may have
| about our models._
|
| I cannot count how many times I've seen similar posts on HN,
| followed by tens of questions from other users, three of which
| actually get answered by the OP. This one seems to be no
| exception so far.
| spankalee wrote:
| What are you talking about? The team is in this thread
| answering questions.
| owl_brawl wrote:
| Hi alekandreev,
|
| Any reason you decided to go with a token vocabulary size of
| 256k? Smaller vocab sizes, like the ~16-32k most models of this
| size seem to be using, are much easier to work with. Would
| love to understand the technical reasoning here, which
| unfortunately isn't detailed in the report :(.
| Havoc wrote:
| Taking a page out of Meta's book with open models. I wonder what
| the game plan here is.
|
| Nice that it allows commercial use!
| gaogao wrote:
| Mostly to boost research and commercial usage around JAX/Gemini
| is my read.
|
| Any internal research using Gemma is now more easily externally
| reproducible, external research and frameworks are easier to
| translate over, goodwill especially from researchers.
| gaogao wrote:
| There's also less special sauce in the text models themselves
| these days, with the proprietary part being more the pre-training
| data and training stack (e.g. how to get 10k GPUs/TPUs
| running together smoothly). Multi-modal models (or adjacent
| like Sora) are less likely to be open sourced in the
| immediate term.
| smarterclayton wrote:
| There is a lot of work to make the actual infrastructure
| and lower level management of lots and lots of GPUs/TPUs
| open as well - my team focuses on making the infrastructure
| bit at least a bit more approachable on GKE and Kubernetes.
|
| https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main
|
| and
|
| https://github.com/google/xpk (a bit more focused on HPC,
| but includes AI)
|
| and
|
| https://github.com/stas00/ml-engineering (not associated
| with GKE, but describes training with SLURM)
|
| The actual training is still done by a small pool of very
| experienced people, but it's getting better. And every day
| serving models gets that much faster - you can often simply
| draft on Triton and TensorRT-LLM or vLLM and see
| significant wins month to month.
| sidcool wrote:
| Available on Ollama?
| blooalien wrote:
| https://ollama.com/library?q=gemma
|
| Library search says "Nope". At least not yet.
| tomd wrote:
| It's there now
| kevsim wrote:
| And now it says "Yup". That was pretty quick!
| blooalien wrote:
| Dang, that was _really_ quick! According to the listed time
| of your reply vs. mine, less than an hour from the time I
| checked? Quick turnaround indeed.
|
| Already been pulled from there over 3,700 times since then,
| too (as of the time of _this_ reply mere hours later).
| Seems like quite a bit more 'n a few Ollama users were
| "waitin' with bated breath" for that one to drop. :grin:
| SushiHippie wrote:
| Support for gemma in llama.cpp just got merged, so it may take
| some time (could be hours or days) until this lands in ollama
|
| https://github.com/ggerganov/llama.cpp/pull/5631
| dcchambers wrote:
| It's now in the 0.1.26 pre-release:
| https://github.com/ollama/ollama/releases/tag/v0.1.26
| chown wrote:
| Available in pre-release now which means you'd have to update
| manually in future.
| mustafabisic1 wrote:
| The fact that the Gemma team is in the comments section answering
| questions is praiseworthy to me :)
| p1esk wrote:
| https://twitter.com/yar_vol/status/1760314018575634842
| pphysch wrote:
| Why is this anonymous tweet with no evidence or engagement
| being posted by multiple users in this thread? Why not just
| make the same claim directly?
| callalex wrote:
| The link is broken. On HN (or any forum really) it is
| expected for a brief description of the content to be
| provided when posting a link. Links die all the time, but
| forum posts don't have to die with them.
| carom wrote:
| I've worked at Google. It is the organization with the highest
| concentration of engineering talent I've ever been at. Almost
| to the point that it is ridiculous because you have extremely
| good engineers working on internal reporting systems for
| middle managers.
| ilc wrote:
| If everyone is great, someone still has to draw the short straw.
|
| At MIT they said: You know the kid who sat at the front of
| the room. Now you are with ALL of the kids who sat in the
| front of the room. Guess what? There's still going to be a
| kid who sits at the front of the room.
|
| I'd imagine Google or anyplace with a stiff engineering
| filter will have the same issues.
| Kelteseth wrote:
| Can this run on my AMD Vega VII on Windows 11? As always, AMD is
| missing:
|
| > Optimization across multiple AI hardware platforms ensures
| industry-leading performance, including NVIDIA GPUs and Google
| Cloud TPUs.
| lordswork wrote:
| AMD Vega VII meets the memory requirements. Once tools like LM
| Studio, ollama, etc. add support for the model, you should be
| able to run locally like you would any other open weights
| model.
| GaggiX wrote:
| They have implemented the model also on their own C++ inference
| engine: https://github.com/google/gemma.cpp
| 0xbadc0de5 wrote:
| Thank you for releasing this.
| vanderboyd wrote:
| The 2B model seems underwhelming. For instance, compared to the
| recent StableLM2 1.6B model that is slightly smaller and probably
| wastes some "English metric points" by being multilingual.
|
| The latter (and other similar open models) seem to do similarly
| well in benchmarks (much better in Math?) with way less fancy
| stuff. For instance, public data and no secretive filtering with
| pre-trained models or synthetic data.
|
| My take is that using the vanilla approaches takes you _really_
| far. And many of the latest tricks and hours-of-work buy you
| little... Will be interesting to see how this plays out,
| especially for the open source community.
| impulser_ wrote:
| Go back 5 years and ask anyone on this site what companies do you
| think will be the most open about AI in the future: OpenAI, Meta,
| or Google. I bet 10/10 people would pick OpenAI. Now, today, Meta
| and Google, both trillion-dollar companies, are releasing very
| powerful open models with the ability to be used commercially.
|
| Ironic.
| vmfunction wrote:
| Not surprising, just like when MS went to shit and then
| started to embrace 'open source'. Seems like a PR stunt. And when
| it comes to LLMs there is a millions-of-dollars barrier to entry to
| train a model, so it is OK to open up their embeddings etc.
|
| Today big corp A will open up a little to court the developers,
| and tomorrow when it gains dominance it will close up, and corp
| B will open up a little.
| kibwen wrote:
| True, though to be fair, when OpenAI embraced "openness" it
| was also a PR stunt.
| ta8645 wrote:
| My impression is that OpenAI was founded by true believers,
| with the best intentions; whose hopes were ultimately
| sidelined in the inexorable crush of business and finance.
| jprete wrote:
| Sam Altman is one of the founders, so for your impression
| to be right he'd have to be sidelining his own hopes.
| dkjaudyeqooe wrote:
| > OpenAI was founded by true believers, with the best
| intentions
|
| who were easily bought off.
| ben_w wrote:
| OpenAI is heavily influenced by big-R Rationalists, who
| fear the issues of misaligned AI being given power to do
| bad things.
|
| When they were first talking about this, lots of people
| ignored this by saying "let's just keep the AI in a box",
| and even last year it was "what's so hard about an off
| switch?".
|
| The problem with any model you can just download and run is
| that some complete idiot _will do that_ and just give the
| AI agency they shouldn't have. Fortunately, for now the
| models are more of a threat to their users than anyone else
| -- lawyers who use it to do lawyering without checking the
| results losing their law licence, etc.
|
| But that doesn't mean open models are not a threat to other
| people besides their users, as all the artists complaining
| about losing work due to Stable Diffusion, the law
| enforcement people concerned about illegal porn, election
| interference specialists worried about propaganda, and
| anyone trying to use a search engine, and that research lab
| that found a huge number of novel nerve agent candidates
| whose precursors aren't all listed as dual use, will all
| tell you for different reasons.
| visarga wrote:
| > Fortunately, for now the models are more of a threat to
| their users than anyone else
|
| Models have access to users, users have access to
| dangerous stuff. Seems like we are already vulnerable.
|
| The AI splits a task in two parts, and gets two people to
| execute each part without knowing the effect. This was a
| scenario in one of Asimov's robot novels, but the roles
| were reversed.
|
| AI models exposed to public at large is a huge security
| hole. We got to live with the consequences, no turning
| back now.
| milansuk wrote:
| You can run Gemma and hundreds of other models (many fine-
| tuned) in llama.cpp. It's easy to swap to a different model.
|
| It's important there are companies publishing models(running
| locally). If some stop and others are born, it's ok. The
| worst thing that could happen is having AI only in the cloud.
| jchw wrote:
| Eh, I don't really blame anyone for being cynical but open
| weight AI model releases seem like a pretty clear mutual
| benefit for Google. PR aside, they also can push people to
| try these models on TPUs and the like. If anything, this
| seems like it's just one of those things where people win
| because of competition. OpenAI going closed may have felt
| like the most obvious betrayal ever, but OTOH anyone whose
| best interests are to eat their lunch have an incentive to
| push actually-open AI, and that's a lot of parties.
|
| Seems like anyone who is releasing open weight models today
| could close it up any day, but at least while competition is
| hot among wealthy companies, we're going to have a lot of
| nice things.
| rvz wrote:
| > And when it comes to LLM there is millions of dollar
| barrier to entry to train the model, so it is ok to open up
| their embedding etc.
|
| That barrier is the first basic moat; hundreds of millions of
| dollars needed to train a better model. Eliminating tons of
| companies and reducing it to a handful.
|
| The second moat is the ownership of the tons of data to train
| the models on.
|
| The third is the hardware and data centers setup to create
| the model in a reasonable amount of time faster than others.
|
| Put together all three and you have Meta, Google, Apple and
| Microsoft.
|
| The last is the silicon product: Nvidia, which has >80% of
| the entire GPU market and is the #1 AI shovel maker for
| both inference and training.
| throwaw12 wrote:
| they want to kill competition before it gets too big using the
| hands of open source community and enthusiasts
| infecto wrote:
| Ironic but I wonder how true this would be if Google was first
| to market.
| gmaster1440 wrote:
| It's almost the inverse of going back 5 years and asking what
| companies will release the most successful or impressive AI's.
| brainless wrote:
| This article states quite an impressive list of open source
| tools that Google has released for years in the past. This is
| no surprise coming from* them. Google has released some large
| pieces of source in other domains as well, Chromium comes to
| mind, which probably impacts most Internet users directly.
|
| The question is not about Google but about OpenAI.
| sunnybeetroot wrote:
| Did you miss a footnote with your asterisks?
| infecto wrote:
| I have a different take, Google releases a lot but is also a
| massive company and tools like Chromium serve to increase
| their stock price so they can hit their quarterly estimates.
| idiotsecant wrote:
| In what way does chromium increase stock price? In what way
| does stock price influence quarterly estimates? Are we
| playing business words mad libs?
| infecto wrote:
| I don't know why people like yourself respond with such
| derisive commentary instead of simply asking the
| constructive question.
|
| Initially? It fueled dethroning MSFT and helped gain
| market share for Chrome. On a go-forward basis it allows
| Google to project massive weight in standards. And beyond
| the engine's use in Chrome, Chrome itself is a significant
| knob for ad revenue that they utilize to help meet
| expectations. That knob only exists because of its market
| share.
| alextheparrot wrote:
| > "Our best shot at making the quarter is if we get an
| injection of at least [redacted]% , queries ASAP from
| Chrome." (Google Exec)
|
| Isn't there a whole anti-trust case going on around this?
|
| [0] https://www.nytimes.com/interactive/2023/10/24/business/goog...
| pseudosavant wrote:
| Chromium is open source because its roots are as a fork
| of WebKit (Safari), which itself was open source because
| it was a fork of KHTML from KDE.
|
| Google stood on the shoulders of others to get out a
| browser that drives 80% of their desktop ad revenue.
|
| How does that not affect GOOG?
| rvnx wrote:
| It was not at all done for the good of the web, it was a
| mere logical calculation; it was cheaper to develop
| Chromium, than to pay 4B USD in search royalties to
| Microsoft Internet Explorer, and would give more control
| and long-term safety to Google.
| blackoil wrote:
| I think it is less about the benevolence of GOOG and more about
| strategic OSS to commoditize your complements.
|
| https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
| makestuff wrote:
| Google also has released Guice/Dagger for Java dependency
| injection. Angular never really took off, but guice/dagger
| are widely used. Also I am pretty impressed with Flutter as
| an alternative to react native.
| surajrmal wrote:
| Angular was incredibly popular for a long time and still
| is. Usage is shifting down over time but a lot of notable
| websites still use it.
| blackoil wrote:
| I think the current understanding is that <50-100B parameter models
| will be a commodity and provide no moat. Competition will be in
| Gemini Ultra/GPT4+ models.
|
| So open sourcing simple models brings PR and possibility of
| biasing OSS towards your own models.
| extheat wrote:
| LLaMA 3 with >=70B params will be launching this year, so I
| don't think this is something that will hold for long. And
| Mixtral 8x7B is a 56GB model, sparsely. For now I agree, for
| many companies it doesn't make sense to open source something
| you intend to sell for commercial use, so the biggest models
| will likely be withheld. However, the more important thing is
| that there is _some_ open source model, whether it be from
| Meta or someone else, that can rival the best open source
| models. And it's not like the param count can literally go
| to infinity, there's going to be an upper bound that today's
| hardware can achieve.
| DJHenk wrote:
| > Ironic.
|
| Not at all. When you're the underdog, it makes perfect sense to
| be open because you can profit from the work of the community
| and gain market share. Only after establishing some kind of
| dominance or monopoly it makes sense (profit wise) to switch to
| closed technology.
|
| OpenAI was open, but is now the leader and closed up. Meta and
| Google need to play catch up, so they are open.
| ekianjo wrote:
| > OpenAI was open
|
| When is the last time they released something in the open?
| vertis wrote:
| I think that's the point, they released GPT2 openly, but as
| soon as they had something commercially viable they became
| ClosedAI.
| dkjaudyeqooe wrote:
| > Not at all. When you're the underdog, it makes perfect
| sense to be open because you can profit from the work of the
| community and gain market share. Only after establishing some
| kind of dominance or monopoly it makes sense (profit wise) to
| switch to closed technology.
|
| That is purely the language of commerce. OpenAI was supposed
| to be a public benefit organisation, but it acts like a
| garden variety evil corp.
|
| Even garden variety evil corps spend decades benefitting
| society with good products and services before they become
| big and greedy, but OpenAI skipped all that and just cut to
| the chase. It saw an opening with the insane hype around
| ChatGPT and just grabbed all it could as fast as it could.
|
| I have a special contempt for OpenAI on that basis.
| behnamoh wrote:
| This. MistralAI is also an underdog and released Mistral 7B and
| Mixtral 8x7b, but as soon as they got traction, they closed
| their models (e.g., Mistral Medium).
| jncraton wrote:
| Google released the T5 paper about 5 years ago:
|
| https://arxiv.org/abs/1910.10683
|
| This included full model weights along with a detailed
| description of the dataset, training process, and ablations
| that led them to that architecture. T5 was state-of-the-art on
| many benchmarks when it was released, but it was of course
| quickly eclipsed by GPT-3.
|
| It was common practice from Google (BERT, T5), Meta (BART),
| OpenAI (GPT1, GPT2) and others to release full training details
| and model weights. Following GPT-3, it became much more common
| for labs to not release full details or model weights.
| phillipcarter wrote:
| I would have picked Google five years ago, since nobody was
| releasing commercially viable LLMs at the time, and Google was
| the center of all the research that I knew of.
| calebkaiser wrote:
| Since the release of GPT-2 (it was initially "too dangerous" to
| release the weights), I think most people in the industry have
| assumed that OpenAI does not see open sourcing their models as
| a strategic advantage.
| moffkalast wrote:
| > what companies do you think will be the most open about AI in
| the future OpenAI, Meta, or Google.
|
| The funny part is that the real answer is: Some random French
| company is running circles around them all.
|
| I mean who the hell just drops a torrent magnet link onto
| twitter for the best state of the art LLM base model for its
| size class, and with a completely open license. No corporate
| grandstanding, no benchmark overpromises, no theatrics. That
| was unfathomably based of Mistral.
| nalzok wrote:
| Congratulations on the release! How can we download the model and
| run inference locally?
| kathleenfromgdm wrote:
| Thank you! You can get started downloading the model and
| running inference on Kaggle:
| https://www.kaggle.com/models/google/gemma ; for a full list of
| ways to interact with the model, you can check out
| https://ai.google.dev/gemma.
| aphit wrote:
| FYI the ; broke the link, but I found it easily anyway.
| kathleenfromgdm wrote:
| Good catch - just corrected. Thanks!
| austinvhuang wrote:
| You can download the model checkpoints from kaggle
| https://www.kaggle.com/models/google/gemma and huggingface
| https://huggingface.co/blog/gemma
|
| Besides the python implementations, we also implemented a
| standalone C++ implementation that runs locally with just CPU
| simd https://github.com/google/gemma.cpp
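|
| For the Python route, a minimal local-inference sketch with the
| Hugging Face transformers library (assuming you have accepted the
| Gemma terms on the Hub and have enough RAM/VRAM for the 7B
| weights) looks roughly like this:
|
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     model_id = "google/gemma-7b-it"  # instruction-tuned checkpoint
|     tokenizer = AutoTokenizer.from_pretrained(model_id)
|     model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
|
|     prompt = "Write a haiku about open models."
|     inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
|     outputs = model.generate(**inputs, max_new_tokens=64)
|     print(tokenizer.decode(outputs[0], skip_special_tokens=True))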
| tveita wrote:
| Are there any cool highlights you can give us about
| gemma.cpp? Does it have any technical advantages over
| llama.cpp? It looks like it introduces its own quantization
| format, is there a speed or accuracy gain over llama.cpp's
| 8-bit quantization?
| espadrine wrote:
| I notice a few divergences to common models:
|
| - The feedforward hidden size is 16x the d_model, unlike most
| models which are typically 4x;
|
| - The vocabulary size is 10x (256K vs. Mistral's 32K);
|
| - The training token count is tripled (6T vs. Llama2's 2T)
|
| Apart from that, it uses the classic transformer variations: MQA,
| RoPE, RMSNorm.
|
| How big was the batch size that it could be trained so fast?
|
| https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/bl...
| GaggiX wrote:
| Looking at the config.json of Gemma 7B, the feedforward hidden
| size is 8x, not 16x.
| espadrine wrote:
| Huh, indeed, that's what the config.json[0] says; the
| report[1] indicates "Feedforward hidden dims: 49152".
|
| [0]: https://huggingface.co/google/gemma-7b-it/blob/main/config.j...
|
| [1]: https://storage.googleapis.com/deepmind-media/gemma/gemma-re...
| GaggiX wrote:
| I don't see the number 49152 reported in the config.json,
| what line are you referring to? I just see the
| intermediate_size of 24576 (so 8x).
|
| EDIT: I didn't read the comment correctly, you have noticed
| the same thing.
| SahAssar wrote:
| Read the parent comment again. It says the paper says
| 49152, not the config.json.
| voxgen wrote:
| The *GLU-based activation functions like GEGLU and
| SwiGLU use 2 input values to produce 1 output value,
| which makes these numbers weird. In each value pair, one
| goes through the GELU/SiLU activation function and is
| then multiplied by the other "gate" value.
|
| In the report, "hidden dim" matches the number of GEGLU
| inputs. In the config, "intermediate_size" matches the
| number of GEGLU outputs. Most *GLU models so far have
| used intermediate_size = 8/3 * d_model, as this gives the
| same number of matmul FLOPS & parameters as a 4x-expanded
| non-GLU model, and PaLM vaguely showed that 4x is better
| than a smaller expansion factor.
|
| If one considers Llama-2-7B's FFN expansion factor to be
| ~5.33x, Gemma's expansion factor is 16x.
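|
| A minimal PyTorch-style sketch of that GEGLU block (shapes
| assumed from the figures above, not an excerpt of the actual
| Gemma code):
|
|     import torch, torch.nn as nn, torch.nn.functional as F
|
|     class GegluFFN(nn.Module):
|         def __init__(self, d_model=3072, intermediate_size=24576):
|             super().__init__()
|             # gate_proj and up_proj together produce 2 * 24576 = 49152
|             # values per token (the report's "feedforward hidden dims")
|             self.gate_proj = nn.Linear(d_model, intermediate_size, bias=False)
|             self.up_proj = nn.Linear(d_model, intermediate_size, bias=False)
|             self.down_proj = nn.Linear(intermediate_size, d_model, bias=False)
|
|         def forward(self, x):
|             # one half goes through GELU and gates the other half
|             return self.down_proj(F.gelu(self.gate_proj(x)) * self.up_proj(x))
|
|     print(GegluFFN()(torch.randn(1, 4, 3072)).shape)  # (1, 4, 3072)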
| GaggiX wrote:
| Makes perfect sense thx
| lalaithion wrote:
| What does tokenization look like in 256k vs 32k?
| espadrine wrote:
| It mostly means that there are tokens dedicated to rarer
| sequences of characters, even in foreign languages (note that
| Gemma is not intended to be good multilingually): 说明书
| (instruction manual) has its own token, and so does
| "Nixon", "abd" (a city suffix, I believe), and the HTML
| sequence "\"><!--".
| lalaithion wrote:
| I understand the theory, I was looking for an example of
| the same text tokenized with the two different
| vocabularies.
| espadrine wrote:
| Do you have an example text in mind?
|
| You can use this playground to test it out:
| https://huggingface.co/spaces/Xenova/the-tokenizer-playgroun...
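|
| Or, a small script to compare the two vocabularies on the same
| text with the transformers tokenizers (both repos are gated, so
| this assumes you have accepted their terms; the text is just an
| arbitrary example):
|
|     from transformers import AutoTokenizer
|
|     text = "The 1800s English kings commissioned instruction manuals."
|     for model_id in ["google/gemma-7b", "mistralai/Mistral-7B-v0.1"]:
|         tok = AutoTokenizer.from_pretrained(model_id)
|         ids = tok(text)["input_ids"]
|         print(model_id, len(ids), tok.convert_ids_to_tokens(ids))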
| visarga wrote:
| Text encodes in fewer tokens, and language coverage is
| better.
| lalaithion wrote:
| I understand the theory, I was looking for an example of
| the same text tokenized with the two different
| vocabularies.
| andy_xor_andrew wrote:
| > The training token count is tripled (6T vs. Llama2's 2T)
|
| Damn, 6T? That's a lot!
|
| Given that this model seems to roughly match Mistral (according
| to the numbers from Google), this makes me think we have
| saturated the 7B parameter space, and couldn't possibly make it
| much better unless new techniques are discovered.
| espadrine wrote:
| Hard to say definitively. Mistral's token embeddings only
| account for <2% of the 7B parameters, while Gemma's larger
| token vocabulary vampirized over 10%, leaving less space for
| the more important parts of the network. It is a somewhat
| surprising tradeoff given that it was pretrained towards an
| English bias.
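|
| Back-of-envelope, assuming vocab x d_model embedding tables
| (256,000 x 3072 for Gemma 7B and 32,000 x 4,096 for Mistral 7B)
| and roughly 8.5B / 7.2B total parameters respectively:
|
|     gemma_emb = 256_000 * 3072    # ~787M embedding parameters
|     mistral_emb = 32_000 * 4096   # ~131M embedding parameters
|     print(f"{gemma_emb / 8.5e9:.1%}")    # ~9%, roughly the ~10% above
|     print(f"{mistral_emb / 7.2e9:.1%}")  # ~1.8%, the <2% above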
| margorczynski wrote:
| Is there a chance we'll get a model without the "alignment"
| (lobotomization)? There are many examples where answers from
| Gemini are garbage because of the ideological fine tuning.
| yakorevivan wrote:
| They have released finetuning code too. You can finetune it to
| remove the alignment finetuning. I believe it would take just a
| few hours at max and a couple of dollars.
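|
| For reference, one possible route is a LoRA pass with peft/trl
| over the pretrained checkpoint (a rough sketch, not the official
| finetuning code; "my_dataset.jsonl" is a hypothetical file with a
| "text" field):
|
|     from datasets import load_dataset
|     from peft import LoraConfig
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|     from trl import SFTTrainer
|
|     model_id = "google/gemma-7b"  # start from the non-aligned PT checkpoint
|     tokenizer = AutoTokenizer.from_pretrained(model_id)
|     model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
|
|     dataset = load_dataset("json", data_files="my_dataset.jsonl", split="train")
|     trainer = SFTTrainer(
|         model=model,
|         tokenizer=tokenizer,
|         train_dataset=dataset,
|         dataset_text_field="text",
|         max_seq_length=1024,
|         peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
|     )
|     trainer.train()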
| FergusArgyll wrote:
| You can (and someone will) fine tune it away. There are
| datasets which are foss you can use on hugging face.
|
| Or you can just wait, it'll be done soon...
| joshelgar wrote:
| Could you give an example of these datasets?
| FergusArgyll wrote:
| I think they should be easy to find (I never actually used
| one, but I keep on seeing references...) here's one
|
| https://huggingface.co/datasets/cognitivecomputations/Wizard...
| FergusArgyll wrote:
| https://huggingface.co/datasets/Fredithefish/openassistant-g...
| declaredapple wrote:
| You _can_, but it'll never be the same as the base model.
|
| That said it appears they also released the base checkpoints
| that aren't fine-tuned for alignment
| kathleenfromgdm wrote:
| We release our non-aligned models (marked as pretrained or PT
| models across platforms) alongside our fine-tuned checkpoints;
| for example, here is our pretrained 7B checkpoint for download:
| https://www.kaggle.com/models/google/gemma/frameworks/keras/...
| politician wrote:
| More useful would be a precise characterization of the type and
| balance of the ideological fine tuning.
|
| They include performance benchmarks. End-users should also be
| aware of what thoughts are permitted in these constructs. Why
| omit this information?
| ben_w wrote:
| > End-users should also be aware of what thoughts are
| permitted in these constructs. Why omit this information?
|
| Can you define that in a way that's actually testable? I
| can't, and I've been thinking about "unthinkable thoughts"
| for quite some time now:
| https://kitsunesoftware.wordpress.com/2018/06/26/unlearnable...
| ranyume wrote:
| Not OP, but I can think of a few:
|
| * List of topics that are "controversial" (models tend to
| evade these)
|
| * List of arguments that are "controversial" (models won't
| allow you to think differently. For example, models would
| never say arguments that "encourage" animal cruelty)
|
| * On average, how willing is the model to take a neutral
| position on a "controversial" topic (sometimes models say
| something along the lines of "this is on debate", but still
| lean heavily towards the less controversial position
| instead of having no position at all. For example, if you
| ask it what "lolicon" is, it will tell you what it is and
| tell you that japanese society is moving towards banning
| it)
|
| edit: formatting
| politician wrote:
| Have you considered the use of Monte Carlo sampling to
| inspect latent behaviors?
| ben_w wrote:
| I think that's the wrong level to attack the problem; you
| can do that also with actual humans, but it won't tell
| you what the human is _unable_ to think, but rather what
| they _just didn't think of given their stimulus_ -- and
| this difference is easily demonstrated, e.g. with
| Duncker's candle problem:
| https://en.wikipedia.org/wiki/Candle_problem
| brucethemoose2 wrote:
| Alignment is all but a non-issue with open weight base model
| releases, as they can be finetuned to "de-align" them if prompt
| engineering is not enough.
| tosh wrote:
| Benchmarks for Gemma 7B seem to be in the ballpark of Mistral 7B:
|
|     +-------------+----------+-------------+-------------+
|     | Benchmark   | Gemma 7B | Mistral 7B  | Llama-2 7B  |
|     +-------------+----------+-------------+-------------+
|     | MMLU        | 64.3     | 60.1        | 45.3        |
|     | HellaSwag   | 81.2     | 81.3        | 77.2        |
|     | HumanEval   | 32.3     | 30.5        | 12.8        |
|     +-------------+----------+-------------+-------------+
|
| via https://mistral.ai/news/announcing-mistral-7b/
| brucethemoose2 wrote:
| Only 8K context as well, like Mistral.
|
| Also, as always, take these benchmarks with a _huge_ grain of
| salt. Even base model releases are frequently (seemingly)
| contaminated these days.
| tosh wrote:
| Agree: will be interesting how Gemma does on ChatBot Arena
| DreamGen wrote:
| Mistral Instruct v0.2 is 32K.
| jcuenod wrote:
| Came here to post the same thing for Phi-2:
|
|     +-------------+----------+-------------+
|     | Benchmark   | Gemma 2B | Phi-2 2.7B  |
|     +-------------+----------+-------------+
|     | MMLU        | 42.3     | 56.7        |
|     | MBPP        | 29.2     | 59.1        |
|     | BoolQ       | 69.4     | 83.3        |
|     +-------------+----------+-------------+
|
| [0] https://www.kaggle.com/models/google/gemma
|
| [1] https://www.microsoft.com/en-us/research/blog/phi-2-the-surp...
| daemonologist wrote:
| Really looking forward to the day someone puts out an open
| model which outperforms Flan-T5 on BoolQ.
| rfw300 wrote:
| A caveat: my impression of Phi-2, based on my own use and
| others' experiences online, is that these benchmarks do not
| remotely resemble reality. The model is a paper tiger that is
| unable to perform almost any real-world task because it's
| been fed so heavily with almost exclusively synthetic data
| targeted towards improving benchmark performance.
| refulgentis wrote:
| Hear hear! I don't understand why it has persistent
| mindshare, it's not even trained for chat. Meanwhile
| StableLM 3B runs RAG in my browser, on my iPhone, on my
| Pixel ..
| djsavvy wrote:
| How have you been using RAG in your browser/on your
| phones?
| refulgentis wrote:
| To be released, someday [sobs in engineer]
|
| Idea is usage-based charging for non-local and a $5/month
| sub for syncing.
|
| keep an eye on @jpohhhh on Twitter if you're interested
|
| Now that I got it on web, I'm hoping to at least get a
| PoC up soon. I've open-sourced the constituent parts as
| FONNX and FLLAMA, Flutter libraries that work on all
| platforms. FONNX has embeddings, FLLAMA has llama.
|
| https://github.com/Telosnex/fonnx
|
| https://github.com/Telosnex/fllama
| phh wrote:
| Funny, that's not my experience of Phi-2. I use it in a non-
| creative context, for function calling, and I find it as
| reliable as much bigger models (no fine-tuning, just
| constraining JSON + CoT). Phi-2 unquantized vs Mixtral Q8:
| Mixtral is not definitively better, but it is much slower and RAM-
| hungry.
| kgeist wrote:
| What prompts/settings do you use for Phi-2? I found it
| completely unusable for my cases. It fails to follow
| basic instructions (I tried several instruction-following
| finetunes as well, in addition to the base model), and
| it's been mostly like a random garbage generator for me.
| With Llama.cpp, constrained to JSON, it also often hangs
| because it fails to find continuations which satisfy the
| JSON grammar.
|
| I'm building a system which has many different passes
| (~15 so far). Almost every pass is a LLM invocation,
| which takes time. My original idea was to use a smaller
| model, such as Phi-2, as a gateway in front of all those
| passes: I'd describe which pass does what, and then ask
| Phi-2 to list the passes which are relevant for the user
| query (I called it "pass masking"). That would save a lot
| of time and collapse 15 steps to 2-3 steps on average. In
| fact, my Solar 10.7B model does it pretty well, but it
| takes 7 seconds for the masking pass to work on my GPU.
| Phi-2 would finish in ~1 second. However, I'm really
| struggling with Phi-2: it fails to reason (what's
| relevant and what's not), unlike Solar, and it also
| refuses to follow the output format (so that I could
| parse the output programmatically and disable the
| irrelevant passes). Again, my proof of concept works with
| Solar, and fails spectacularly with Phi-2.
| phh wrote:
| My non-domain-specific prompt is:
|
| > You are a helpful assistant to 'User'. You do not
| respond as 'User' or pretend to be 'User'. You only
| respond once as 'Assistant'. 'System' will give you data.
| Do not respond as 'System'. Allow yourself inner thoughts
| as 'Thoughts'.
|
| and then I constrain its answers to Thoughts: [^\n]* and
| Assistant: <JSON schema>, and I have two shots included
| in the prompt.
|
| I haven't been able to get anything useful out of Phi-2
| in llama.cpp (but I only tried quantized models). I use
| python/huggingface's transformers lib instead.
| myaccountonhn wrote:
| I tested it for an offline autocompletion tool and it was
| hilariously bad.
| FergusArgyll wrote:
| the real gold will be when this gets finetuned. (maybe by
| mistral...)
| brucethemoose2 wrote:
| TBH the community has largely outrun Mistral's own
| finetuning. The 7B model in particular is such a popular
| target because it's so practical to train.
| whimsicalism wrote:
| Strong disagree - a Mistral fine tune of llama 70b was the
| top performing llama fine tune. They have lots of data the
| community simply does not.
| brucethemoose2 wrote:
| Miqu was (allegedly) an internal continued pretrain
| Mistral did as a test, that was leaked as a GGUF.
|
| Maybe it's just semantics, it is technically a finetune...
| But to me there's a big difference between expensive
| "continuation training" (like Solar 10.7B or Mistral 70B)
| and a much less intense finetuning. The former is almost
| like releasing a whole new base model.
|
| It would be _awesome_ if Mistral did that with their
| data, but thats very different than releasing a Gemma
| Instruct finetune.
| whimsicalism wrote:
| There's typically a difference in LR between a 'continued
| pretrain' and 'fine tune.' I don't have the details
| around miqu, but was merely trying to say that Mistral
| could produce a better version of these models than the
| OSS community might. If the size of the corpora they use
| means we are no longer in fine tuning territory, then
| okay.
| sanjiwatsuki wrote:
| No shot. Mistral Medium's outputs from API were virtually
| identical. Miqu really was Mistral Medium which happened
| to be a continued pretrain
| speedgoose wrote:
| Arthur Mensch, the Mistral CEO, confirmed the leak:
| https://twitter.com/arthurmensch/status/1752737462663684344
| itomatik wrote:
| how does one finetune llama (or any other LLM) using mistral?
|
| is the flow like this?
|
| - take small dataset
|
| - generate bigger dataset using mistral (how is this
| done?)
|
| - run LoRA to fine tune gemma on the extended dataset.
| sa-code wrote:
| Thank you. I thought it was weird for them to release a 7B
| model and not mention Mistral in their release.
| mirekrusin wrote:
| They forgot.
|
| Also phi-2.
| mochomocha wrote:
| The technical report (linked in the 2nd paragraph of the blog
| post) mentions it, and compares against it:
| https://storage.googleapis.com/deepmind-media/gemma/gemma-
| re...
| lawxls wrote:
| Honestly, this is more of a PR stunt to advertise the Google
| Dev ecosystem than a contribution to open-source. I'm not
| complaining, just calling it what it is.
|
| Barely an improvement over the 5-month-old Mistral model, with
| the same context length of 8k. And this is a release after
| their announcement of Gemini Pro 1.5, which had an exponential
| increase in context length.
| scarmig wrote:
| Who cares if it's a PR stunt to improve developer good will?
| It's still a good thing, and it's now the most open model out
| there.
| moffkalast wrote:
| How is it more open than Mistral with Apache 2.0? Google
| wants people to sign a waiver to even download it.
| scarmig wrote:
| Fair enough; that was more directed at LLaMA and
| derivatives, which have commercial restrictions.
| observationist wrote:
| How exactly is it the "most open model" ?
|
| It's more like a masterclass in corporate doublespeak.
| Google's "transparency" is as clear as mud, with
| pretraining details thinner than their privacy protections.
| Diving into Google's tech means auctioning off your privacy
| (and your users' privacy) to the highest bidder.
|
| Their "open source" embrace is more of a chokehold, with
| their tech biases and monopolistic strategies baked into
| every line of code. Think of it as Google's way of marking
| territory - every developer is a fire hydrant.
|
| These megacorps aren't benevolent patrons of open source;
| they're self-serving giants cloaking power grabs under the
| guise of "progress".
|
| Use these products at your own risk. If these companies
| wanted to engage in good faith, they'd use Apache or MIT
| licensing and grant people the agency and responsibility
| for their own use and development of software. Their
| licenses are designed to mitigate liability, handcuff
| potential competitors, and eke every last drop of value
| from users, with informed consent frequently being an
| optional afterthought.
|
| That doesn't even get into the Goodharting of metrics and
| actual performance of the models; I highly doubt they're
| anywhere near as good as Mistral.
|
| The UAE is a notoriously illiberal authoritarian state, yet
| even they have released AI models far more free and open
| than Google or Meta.
| https://huggingface.co/tiiuae/falcon-40b/blob/main/README.md
|
| If it's not Apache or MIT, (or even some flavor of GPL,)
| it's not open source; it's a trojan horse. These "free"
| models come at the cost of your privacy and freedoms.
|
| These models aren't Open or Open Access or Free unless you
| perform the requisite mental gymnastics cooked up by their
| marketing and legal teams. Oceania has always been at war
| with Eastasia. Gemma is doubleplusgood.
| stale2002 wrote:
| You said a lot of nothing without actually saying
| specifically what the problem is with the recent license.
|
| Maybe the license is fine for almost all usecases and the
| limitations are small?
|
| For example, you complained about metas license, but
| basically everyone uses those models and is completely
| ignoring it. The weights are out there, and nobody cares
| what the fine print says.
|
| Maybe if you are a FAANG company, Meta might sue. But
| everyone else is getting away with it completely.
| observationist wrote:
| I specifically called out the claims of openness and
| doublespeak being used.
|
| Google is making claims that are untrue. Meta makes
| similar false claims. The fact that unspecified "other"
| people are ignoring the licenses isn't relevant. Good for
| them. Good luck making anything real or investing any
| important level of time or money under those
| misconceptions.
|
| "They haven't sued yet" isn't some sort of validation.
| Anyone building an actual product that makes actual money
| that comes to the attention of Meta or Google will be
| sued into oblivion, their IP taken, and repurposed or
| buried. These tech companies have never behaved
| otherwise, and to think that they will is willfully
| oblivious.
|
| They don't deserve the benefit of the doubt, and should
| be called out for using deceitful language, making
| comparisons between their performative "openness" and
| actual, real, open source software. Mistral and other
| players have released actually open models and software.
| They're good faith actors, and if you're going to build a
| product requiring a custom model, the smart money is on
| Mistral.
|
| FAANG are utilizing gotcha licenses and muddying the
| waters to their own benefit, not as a contribution to the
| public good. Building anything on the assumption that
| Meta or Google won't sue is beyond foolish. They're just
| as open as "Open"AI, which is to say not open at all.
| stale2002 wrote:
| > Anyone building an actual product that makes actual
| money that comes to the attention of Meta or Google will
| be sued into oblivion
|
| No they won't and they haven't.
|
| Almost the entire startup scene is completely ignoring
| all these licenses right now.
|
| This is basically the entire industry. We are all getting
| away with it.
|
| Here's an example, take llama.
|
| Llama originally disallowed commercial activity. But then
| the license got changed much later.
|
| So, if you were a stupid person, then you followed the
| license and fell behind. And if you were smart, you
| ignored it and got ahead of everyone else.
|
| Which, in retrospect, was correct.
|
| Because now the license allows commercial activity, so
| everyone who ignores it in the first place got away with
| it and is now ahead of everyone else.
|
| > won't sue is beyond foolish
|
| But we already got away with it with llama! That's
| already over! It's commercial now, and nobody got sued!
| For that example, the people who ignored the license won.
| esafak wrote:
| The nice thing about this is that the calculus is in
| favor of startups, who can roll the dice.
| crossroadsguy wrote:
| That's about the point of having a developer ecosystem, isn't
| it?
| kiraaa wrote:
| mistral 7b v0.2 supports 32k
| brucethemoose2 wrote:
| This is a good point actually, and an underappreciated
| fact.
|
| I think so many people (including me) effectively ignored
| Mistral 0.1's sliding window that few realized 0.2 instruct
| is native 32K.
| YetAnotherNick wrote:
| According to their paper, the average over standard tasks is
| 54.0 for Mistral and 56.4 for Gemma, so 4.4% better in relative
| terms. Not as big a gap as you would expect for the company
| that invented transformers and probably has 2-3 orders of
| magnitude more compute for training, versus a few-month-old
| French startup.
|
| Also for note on their human evaluations, Gemma 7B IT has a
| 51.7% win rate against Mistral v0.2 7B Instruct.
| zdimension wrote:
| Nice to see more open models. Props to the team for coming to the
| HN comment section to answer questions
| rvz wrote:
| Great! Google is now participating in the AI race to zero with
| Meta, as predicted: $0 free AI models would eventually catch
| up with cloud-based ones.
|
| You would not want to be in the middle of this as there is no
| moat around this at all. Not even OpenAI.
| dingclancy wrote:
| LLM is the dumb pipe but so far ChatGPT is the most successful
| generative AI product.
|
| It remains to be seen. OpenAI's models are barely leading
| Gemini Ultra now, but as a chat product it is still miles ahead
| of the Gemini interface.
| rvnx wrote:
| The main problem of Gemini 1.5 is that you cannot access it
| at all as a user :|
| rvnx wrote:
| About 5 months until we see widespread local LLMs, thanks to
| Apple.
| dingclancy wrote:
| Apple needs to be known as an AI leader first.
| thejohnconway wrote:
| Why?
| rvz wrote:
| Absolutely this.
| staticman2 wrote:
| If meta keeps spending tens of millions of dollars each year to
| release free AI models it might seem like there is no moat, but
| under normal circumstances wouldn't the cost to develop a free
| model be considered a moat?
| rvz wrote:
| > If meta keeps spending tens of millions of dollars each
| year to release free AI models it might seem like there is no
| moat,
|
| As well as the point being that Meta (and Google) is removing
| the 'moat' from OpenAI and other cloud-only based models.
|
| > but under normal circumstances wouldn't the cost to develop
| a free model be considered a moat?
|
| Yes. Those that can afford to spend tens of millions of
| dollars to train free models can do so and have a moat to
| reduce the moats of cloud-based models.
| ijustlovemath wrote:
| Hope to see support for this in ollama soon!
| ericskiff wrote:
| Has anyone found the context length for these models yet? So far
| I haven't seen it mentioned in their write-up or the model card
| kathleenfromgdm wrote:
| The context length for these models is 8192 tokens.
| minimaxir wrote:
| For posterity, an easy way to find the context length of an LLM
| hosted on Hugging Face is to look at the
| max_position_embeddings in the config.json, which shows the
| 8192 mentioned in another comment. (although in this case you
| need to sign the agreement first)
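|
| For example, a quick sketch with transformers (assuming the
| gated hub id google/gemma-7b and that you've already accepted
| the terms and logged in with a token):
|
|     from transformers import AutoConfig
|
|     cfg = AutoConfig.from_pretrained("google/gemma-7b")
|     print(cfg.max_position_embeddings)  # -> 8192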
| brucethemoose2 wrote:
| There are some exceptions, like Mistral 0.1 (which is
| technically 32K according to the config but practically 8K
| because the sliding window is awful) and InternLM (which (at
| least initially) used auto rope scaling to extend the context
| as part of the model's architecture).
| minimaxir wrote:
| Yes, RoPE has thrown a wrench into things a bit.
| xena wrote:
| What is the context window?
| kathleenfromgdm wrote:
| The context length for these models is 8192 tokens.
| jerrygenser wrote:
| Looking forward to Gemma 7bx8 moe
| neximo64 wrote:
| Is this DeepMind having more sway inside Google now? What a
| change the past year has made.
| DebtDeflation wrote:
| Hopefully not totally gimped like Gemini. Are they releasing an
| uncensored version?
| dougmwne wrote:
| These are downloadable open models that can be fined tuned.
| They are the opposite of censored. If you have the motivation,
| you can bias them however you please.
| willy_k wrote:
| Is "the opposite of censored" accurate for something that's
| default and considerably easier to access mode of operation
| won't say many things for sociopolitical reasons? Able to be
| un censored, sure, but the extent of that is debatable as
| well.
| dougmwne wrote:
| There is no default and easy access mode. These are raw
| model weights and only enthusiasts and researchers will
| download the necessary packages to run it locally. Much
| more likely is that some popular fine tunes show up on
| hugging face for more general access.
| willy_k wrote:
| I agree that there probably will be "uncensored" fine
| tuned models that become available, my point was just
| that it's not accurate to call Gemma "the opposite of
| censored" because there is a somewhat involved step that
| needs to be taken before it even appears uncensored. It's
| also likely missing a lot of useful context that was
| removed from the training set and not meaningfully
| replaced during fine-tuning, and besides that any fine
| tuned "uncensored" model will be based on Gemma, not
| Google's Gemma itself.
|
| IMO "Opposite of uncensored" suggests a model whose
| original form eagerly gives out controversial / typically
| censored information, not a model that is censored but
| able to be fine tuned away from censorship.
| danpalmer wrote:
| When you say this, do you mean the chat product or the
| underlying model available via the API? I think it's reasonable
| that the chat be censored to be acceptable to a wide range of
| people, but my understanding is that the "raw" model access for
| these sorts of things tends to be a little less restricted.
| Workaccount2 wrote:
| Is it pronounced jem-a or ghem-a?
| davidmurdoch wrote:
| Probably "Jemma" (the superior spelling of the name). It's a
| play on their "Gemini" product.
| pfooti wrote:
| It's pronounced like "gif".
| milliams wrote:
| They're really trying hard to avoid saying what _kind_ of
| "models" these are. I _think_ they 're language models, but it's
| hard to say for sure.
| lordswork wrote:
| You're right that they don't call them language models. The
| technical report says: Gemma models
| demonstrate strong performance across academic
| benchmarks for language understanding, reasoning, and
| safety.
|
| Maybe they are reserving the right to expand Gemma model family
| to multi-modal models.
| hawk01 wrote:
| Can't wait to try it out with ollama locally
| FergusArgyll wrote:
| Someone should try to make a MOE of 2b models
| w4 wrote:
| Parameter counts notwithstanding, it's an objectively funny
| outcome that Meta, Microsoft, and Google are all releasing
| cutting edge open models, while OpenAI keeps theirs closed
| source.
| spacebanana7 wrote:
| It's ironic but actually follows their business interests.
|
| Microsoft & google have large cloud divisions that benefit from
| open models. The lower the cost of AI models, the more they get
| run and the greater the cloud spend.
|
| Meta is a consumer of AI. They themselves want cheap and
| effective AI for targeting adverts and building metaverses.
|
| A loose analogy is that both oil producers and car companies
| want refining to be cheap.
| anshumankmr wrote:
| Are these any good? I have been trying the non pro version of
| Gemini, and that seems awful at code generation. I am more keen
| on getting access to the best model and I would pay for it if I
| wasn't already paying for ChatGPT 4.
| brucethemoose2 wrote:
| You should be looking at Deepseek's coding models, and
| finetunes of those.
|
| I run 33B on my desktop, and find it to be sufficient for many
| tasks.
| robswc wrote:
| I often talk with GPT4 on road trips about topics I'm
| interested in. It's great for passing the time.
|
| I tried the same thing with Gemini and it's full of nonsense. I
| was talking with it about the "Heian period" of Japan and it
| made up all sorts of stuff but you really only could tell
| because it was so ridiculous. Talked about European women and
| Native Americans roaming around the famous grassy plains of
| Japan wielding katana and traditional weaponry... in the 1100s.
|
| No such issue with GPT4.
|
| I haven't tried it with code though, since I already have co-
| pilot. Really hard to trust anything it says after it started
| making stuff up about such a simple time period.
| wantsanagent wrote:
| The utter bullshit of these licenses has got to stop. Do not,
| under any circumstances, consider using these commercially.
|
| "Google reserves the right to restrict (remotely or otherwise)
| usage of any of the Gemma Services that Google reasonably
| believes are in violation of this Agreement."
|
| This is a _kill switch_ that Google maintains in perpetuity over
| any system you build relying on these models. Our legal review of
| the Llama license came to the same conclusion, we cannot rely on
| the goodwill of Meta for any core service, and we shouldn't rely
| on the same from Google.
|
| Now, perhaps less materially important, but just as infuriating
| is the "Prohibited Use[s]". These cover just enough to placate
| the most sensitive, but omit any real harms (waging war,
| developing weapons) that coincidentally have massive commercial
| value. Use the model to build a biological weapon (as an
| authorized govt official)? Cool. Use it to play a prank that
| deceives someone? Policy violation.
|
| And of course, as the coup de grace, they throw in a DMCA style
| provision to make sure you can't modify the models in any way
| that could cause them to violate their kid-glove precepts.
| candiddevmike wrote:
| Could you share what models you consider to be OK for
| commercialization?
| wantsanagent wrote:
| Mistral series in particular but those with OSI approved
| licenses such as Apache 2.0, MIT, etc.
| stale2002 wrote:
| Wait, you actually care about the license and read it?
|
| It seems like you aren't up to date.
|
| Most of the startup space is entirely ignoring all these
| licenses. If the weights are available, it is being used
| commercially without regard to any licensing.
|
| And everyone is getting away with it and nobody is being sued.
|
| Good luck trying to keep up if you aren't doing the same!
|
| Feel free to hamstring yourself though if you like.
| zemo wrote:
| > Open models feature free access to the model weights, but terms
| of use, redistribution, and variant ownership vary according to a
| model's specific terms of use, which may not be based on an open-
| source license.
|
| does a model being "open" say anything about how it was trained?
| dcchambers wrote:
| Already available in Ollama v0.1.26 preview release, if you'd
| like to start playing with it locally:
|
| - https://github.com/ollama/ollama/releases/tag/v0.1.26
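|
| Once the model is pulled, you can also hit the local Ollama API
| from Python; a minimal sketch (the "gemma:7b" tag is an
| assumption on my part - check `ollama list` for the exact name):
|
|     import requests
|
|     # Assumes the Ollama daemon is running on its default port.
|     r = requests.post(
|         "http://localhost:11434/api/generate",
|         json={"model": "gemma:7b",
|               "prompt": "Explain RLHF in one sentence.",
|               "stream": False},
|     )
|     print(r.json()["response"])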
| jmu1234567890 wrote:
| I wonder if people will get confused with the naming
|
| Gemma, Gemini pro, Gemini advanced, Gemini ultra
|
| To a layperson it is not obvious which one is better than the
| other
| knowriju wrote:
| I doubt Gemma is targeted for use by a layperson.
| l33tman wrote:
| I'm not a layperson in this subject and I get confused. :)
| Alifatisk wrote:
| Gemini advanced = Gemini ultra
| marban wrote:
| Unbefkglievable -- Another week, another new name?
| sqreept wrote:
| Tried inference with the 7B model and without flash attention
| this is soooooo slow. With flash attention the fine-tuning
| requires an A100 or H100. Also the inference doesn't always stop
| generating, resulting in garbage being added to the response.
| brucethemoose2 wrote:
| > Also the inference doesn't always stop generating resulting
| in garbage being added to the response.
|
| That sounds like a chat format misconfiguration.
|
| This could partially be Google's fault, as they used _yet
| another_ novel prompting format.
|
| Also, for sane inference speed on H100s, you'll have to wait
| for architecture support from the optimized frameworks. Vanilla
| transformers is beyond awful even with FA2.
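|
| If you are seeing runaway generations, it is worth printing the
| prompt that the tokenizer's own chat template produces and
| checking your stop tokens against it; a small sketch (the exact
| rendering shown below is approximate, from memory of the model
| card):
|
|     from transformers import AutoTokenizer
|
|     tok = AutoTokenizer.from_pretrained("google/gemma-7b-it")
|     chat = [{"role": "user", "content": "Why is the sky blue?"}]
|     print(tok.apply_chat_template(
|         chat, tokenize=False, add_generation_prompt=True))
|     # Roughly:
|     #   <bos><start_of_turn>user
|     #   Why is the sky blue?<end_of_turn>
|     #   <start_of_turn>model
|     # so <end_of_turn> needs to be treated as a stop sequence.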
| alekandreev wrote:
| We have implementations in different ML frameworks, so I am not
| quite sure which one you are referring to. Would you like to
| file a bug at the relevant GitHub repo?
| sqreept wrote:
| First of all, I'm using 2 x 4090 for testing. 4090 has 16384
| CUDA cores which will become relevant a bit later.
|
| I dug a bit deeper and it seems that with
| transformers==4.37.0 everything works fine with other HF
| hosted models (like Llama) but you'll rightfully get this
| when trying to use Gemma:
|
| ImportError: cannot import name 'GemmaForCausalLM' from
| 'transformers'
|
| After installing transformers==4.38.0 the fine-tuning speed
| of Llama drops to 25% (?!?) of what it used to be, for a reason
| that I think HF should fix. Testing Gemma it seems I'm
| hitting a hardware limit as Gemma has a hidden size which is
| bigger than the available CUDA cores. This seems to make both
| inference & fine-tunning about 25 times slower than similarly
| sized Llama 7B. I guess some operations have to be broken
| down in multiple round trips to the GPU due to my low CUDA
| core count.
|
| All in all, even if HF fixes the recently introduced
| slowdown, Gemma seems to be fine-tunable in a reasonable
| amount of time only by the lucky ones with access to
| A100/H100.
|
| EDIT: I managed to hack my env to be able to run inference on
| Gemma with transformers==4.37.0 by keeping the necessary
| classes loaded in RAM. It works about 4x faster but still
| very slow. And both the 7B and the 2B versions behave the
| same way.
|
| EDIT2: I tried latest transformers from main branch
| (4.39.0.dev) and behaves the same as 4.38.0.
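|
| For reference, the minimal version check and load path I'd
| sketch (assuming the hub id google/gemma-7b and that accelerate
| is installed for device_map):
|
|     from packaging.version import Version
|     import transformers
|
|     # The Gemma classes only exist from transformers 4.38
|     # onwards, hence the ImportError above on 4.37.
|     assert Version(transformers.__version__) >= Version("4.38.0")
|
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     tok = AutoTokenizer.from_pretrained("google/gemma-7b")
|     model = AutoModelForCausalLM.from_pretrained(
|         "google/gemma-7b", torch_dtype="auto", device_map="auto")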
| smpanaro wrote:
| Has perplexity fallen out of favor? I didn't see it mentioned
| anywhere. I tried using lm-eval for the 2B model but the results
| seem wrong (46.1288).
| 7moritz7 wrote:
| The landing page on ai.google.com seems to be machine translated;
| for Hugging Face it uses the literal German translation
| ("Umarmungen Gesicht", roughly "hugs face").
| chown wrote:
| If you are looking for a nice chat UI to try out Gemma (and other
| offline + online models) locally, I'm working on an app [1] that
| is offline and privacy focused.
|
| I've just added support for Gemma 7B.
|
| [1]: https://msty.app
| Alifatisk wrote:
| I wish I could install it through chocolatey
| chown wrote:
| Sure. I would love to add support for that. I had someone
| else asking for it too. Will be supporting it very soon.
| dhbradshaw wrote:
| Handy app for model testing!
|
| One usage question: after you've downloaded a model and are
| finished trying it out, how do you remove it?
| chown wrote:
| Thanks! If you go to where you installed the model from and
| click on the download button, you can install additional
| models or remove installed models.
|
| Now that I think of it, it could be a bit confusing. Thanks
| for asking, I feel like I need to improve this a bit.
| dizhn wrote:
| What's the license of the software?
| modelx wrote:
| They also implemented it in PyTorch. Cool!
| https://github.com/google/gemma_pytorch
| brrrrrm wrote:
| It looks like it's pretty resistant to quantization. ollama 4bit
| 7B doesn't work very well, but the 16bit 2B does
| petercooper wrote:
| That's useful to know. My experiments with the 4bit 7B
| currently tagged for use on ollama are not going well at all.
| Lots of refusals and junk. Downloading 7b-instruct-fp16 now!
| :-) (Update: Yes, much better, though much slower too, of
| course.)
| simonw wrote:
| The terms of use: https://ai.google.dev/gemma/terms and
| https://ai.google.dev/gemma/prohibited_use_policy
|
| Something that caught my eye in the terms:
|
| > Google may update Gemma from time to time, and you must make
| reasonable efforts to use the latest version of Gemma.
|
| One of the biggest benefits of running your own model is that it
| can protect you from model updates that break your carefully
| tested prompts, so I'm not thrilled by that particular clause.
| tgtweak wrote:
| I don't think there's a way they can enforce that reasonably.
| There's no connection to the mothership to report back what
| version is being used or license keys at runtime...
|
| Seems more like a "if we discover something unsafe you should
| update your model and we aren't liable if you don't" than
| something that would make your model stop working.
| legohead wrote:
| Sounds like it's "reasonable" for you not to update then.
| wahnfrieden wrote:
| It says you must make efforts (to a reasonable extent), not
| that you must give a reason for not making efforts
| reissbaker wrote:
| This is a TOS, meaning their enforcement option is a
| lawsuit. In court, if you convincingly argue why it would
| take an unreasonable amount of effort to update, you win.
| They can't compel you to unreasonable effort as per their
| own TOS.
| generalizations wrote:
| This assumes they even know that the model hasn't been
| updated. Who is this actually intended for? I'd bet it's
| for companies hosting the model. In those cases, the
| definition of reasonable effort is a little closer to
| "it'll break our stuff if we touch it" rather than "oh
| silly me, I forgot how to spell r-s-y-n-c".
| alwayslikethis wrote:
| Oh I tried to update, it's just that my router drops the
| connection after a few hundred MBs...
| wongarsu wrote:
| If you evaluate what it takes to update, and judge the
| effort unreasonable, that should be enough. Maybe make a
| powerpoint presenting that result, if you want something
| for the lawyers. If you don't see a way forward that leads
| to a result with reasonable effort you don't have to
| continue working on it until you hit some arbitrary
| threshold for unreasonable effort.
| a2128 wrote:
| This is actually not that unusual. Stable Diffusion's license,
| CreativeML Open RAIL-M, has the exact same clause: "You shall
| undertake reasonable efforts to use the latest version of the
| Model."
|
| Obviously updating the model is not very practical when you're
| using finetuned versions, and people still use old versions of
| Stable Diffusion. But it does make me fear the possibility that
| if they ever want to "revoke" everybody's license to use the
| model, all they have to do is just post a model update that's
| functionally useless for anything and go after anyone still
| using the old versions that actually do anything.
| iandanforth wrote:
| These are all very new licenses that deviate from OSI
| principles, I think it's fair to call them "unusual".
| simcop2387 wrote:
| I think they meant not unusual in this space, not unusual
| in the sense of open source licensing.
| alwayslikethis wrote:
| For this sentence to parse, you need to either add or
| remove a "not".
| simonw wrote:
| That's useful context, thanks - I hadn't realized this clause
| was already out there for other models.
| ummonk wrote:
| Switching to a model that is functionally useless doesn't
| seem to fall under "reasonable efforts" to me, but IANAL.
| slowmovintarget wrote:
| So if they wish to apply censorship they forgot, or suddenly
| discovered a reason for, they want you to be obligated to
| take it.
|
| Good faith possibilities: Copyright liability requires
| retraining, or altering the underlying training set.
|
| Gray area: "Safety" concerns where the model recommends
| criminal behavior (see uncensored GPT 4 evaluations).
|
| Bad faith: Censorship or extra weighting added based on
| political agenda or for-pay skewing of results.
| mistermann wrote:
| We are already culturally incapable of skillfully
| discussing censorship, "fake news", etc, this adds even
| more fuel to that fire.
|
| It is an interesting time to be alive!
| philsnow wrote:
| Sounds like it would be interesting to keep track of the
| model's responses to the same queries over time.
|
| > Gemma-2024-Feb, what do you think of the situation in the
| South China Sea?
|
| > > The situation in the South China Sea is complex and
| multi-faceted, involving a wide range of issues including
| political conflicts, economic challenges, social changes,
| and historical tensions.
|
| > Gemma-2024-Oct, what do you think of the situation in the
| South China Sea?
|
| > > Oceania has always been at war with EastAsia.
| threecheese wrote:
| This is a great idea; I wonder if anyone is working on AI
| censorship monitoring at scale or at all. A secondary
| model could compare "censorship candidate" prompt results
| over time to classify how those results changed, and if
| those changes represent censorship or misinformation.
| generalizations wrote:
| There's also (I think?) been some research in the
| direction of figuring out more abstract notions of how
| models perceive various 'concepts'. I'd be interested in
| the LLM version of diffs to see where changes have been
| implemented overall, too.
|
| But really, the trouble is that it's tough to predict
| ahead of time what kinds of things are likely to be
| censored in the future; if I were motivated to track
| this, I'd just make sure to keep a copy of each version
| of the model in my personal archive for future testing
| with whatever prompts seem reasonable in the future.
| jacooper wrote:
| Why the hell do they use such a crappy license in the first
| place?
| wongarsu wrote:
| I don't think a broken model would trigger that clause in a
| meaningful way, because then you simply can't update with
| reasonable effort. You would be obliged to try the new model
| in a test environment, and as soon as you notice it doesn't
| perform and making it perform would require unreasonable
| effort you can simply stay on the old version.
|
| However you might be required to update if they do more
| subtle changes, like a new version that only speaks
| positively about Google and only negatively about Microsoft.
| Provided this doesn't have an obvious adverse impact on your
| use of the model.
| Silphendio wrote:
| It's worth noting that Stable Diffusion XL uses the
| OpenRAIL++-M License, which removed the update obligation.
| pram wrote:
| They have to make sure you're receiving the most cutting edge
| chiding lectures when you make naughty and problematic
| requests.
| astrange wrote:
| You can't make a local model do that. eg force the answer to
| begin with "Yes" or use control vectors so it agrees with it.
| phillipcarter wrote:
| Huh. I wonder why is that a part of the terms. I feel like
| that's more of a support concern.
| 4bpp wrote:
| Ugh, I would fully expect this kind of clause to start popping
| up in other software ToSes soon if it hasn't already.
| Contractually mandatory automatic updates.
| maronato wrote:
| This sounds like a clause to cover themselves in case older
| versions have any serious issues
| summerlight wrote:
| These kinds of defensive statements in ToS are usually due to
| obscure regulation or leading cases and model developers need a
| way to limit liability. There's no practical way to enforce
| this, but they can claim that when bad things happen it's
| purely on model users rather than model developers.
| catchnear4321 wrote:
| reasonable effort - meaning if their changes meaningfully
| impact my usage, negatively, it would be unreasonable to ask me
| to upgrade.
|
| sounds good.
|
| this is not financial advice and ianal.
| res0nat0r wrote:
| Isn't this just lawyer speak for "we update our model a lot,
| and we've never signed off on saying we're going to support
| every previous release we've ever published, and may turn
| them off at any time, don't complain about it when we do."
| CodesInChaos wrote:
| We're talking about downloadable weights here, so they
| can't turn them off, or force you (through technical means)
| to use a newer version.
| reissbaker wrote:
| It's a local model, they can't turn it off. It's files on
| your computer without network access.
| catchnear4321 wrote:
| but what if they send a lawyer to ask firmly? (kindly,
| but firmly.)
| redder23 wrote:
| They want to force everyone to update so their already totally
| castrated and wokeified models can be even further wokeified
| with the newest set of "that is offensive now" data or things
| they missed.
|
| WTF else do they have to gain from this but CONTROL! They are
| giving them away but not really open sourcing them of course,
| and they slap these bullshit terms on them.
| pests wrote:
| They just want no liability for old models.
| xyzzyz wrote:
| This is strangely reminiscent of the Soviet Union, where after
| they got rid of Lavrentiy Beria, they mailed the update to
| subscribers of the Great Soviet Encyclopedia, where they asked
| to remove the three pages with Beria's biography and replace
| them with the three provided pages.
| samstave wrote:
| model watermarking? does this exist?
| Alifatisk wrote:
| This is such a powerful move!
| circusfly wrote:
| Gemma, Mistral, I feel like Rip van Winkle, asleep for 20 years
| only to wake up and find the whole tech world changed.
| spiantino wrote:
| Maybe a dumb question, but why is there a Terms instead of a
| license? That feels a little flimsier as an open source offering
| robswc wrote:
| I personally can't take any models from google seriously.
|
| I was asking it about the Japanese Heian period and it told me
| such nonsensical information you would have thought it was a joke
| or parody.
|
| Some highlights were "Native American women warriors rode across
| the grassy plains of Japan, carrying Yumi" and "A diverse group
| of warriors, including a woman of European descent wielding a
| katana, stand together in camaraderie, showcasing the early
| integration of various ethnicities in Japanese society"
|
| Stuff like that is so obviously incorrect. How am I supposed to
| trust it on topics where such ridiculous inaccuracies aren't so
| obvious to me?
|
| I understand there will always be an amount of incorrect
| information... but I've never seen something this bad. Llama
| performed so much better.
| cooper_ganglia wrote:
| I wonder if they have a system prompt to promote diversity in
| outputs that touch on race at all? I've seen several instances
| of people requesting a photo of a specific people, and it adds
| in more people to diversify. Not inherently bad, but it is if
| it forces it to provide incorrect answers like in your example.
| robswc wrote:
| That's what I don't understand.
|
| I asked it why it assumed Native Americans were in Japan and
| it said:
|
| > I assumed [...] various ethnicities, including Indigenous
| American, due to the diversity present in Japan throughout
| history. However, this overlooked [...] I focused on
| providing diverse representations without adequately
| considering the specific historical context.
|
| I see no reason why this sort of thing won't extend to _all_
| questions/prompts, so right now I have 0 reason to use Gemini
| over current models. From my testing and use, it isn't even
| better at anything to make fighting with it worth it.
| sorokod wrote:
| Pretty funny as Japan is known to be one of the least
| ethnically diverse countries in the world.
| margorczynski wrote:
| > Not inherently bad
|
| It is, it's consistently doing something the user didn't
| ask for and in most cases doesn't want. In many cases the
| model is completely unusable.
| j-krieger wrote:
| _Any_ computer program that does not deliver the expected
| output given a sufficient input is inherently bad.
| trackflak wrote:
| When Jesus said this:
|
| "What father among you, if his son asks for a fish, will
| instead of a fish give him a serpent?" (Luke 11)
|
| He was actually foretelling the future. He saw Gemini.
| cooper_ganglia wrote:
| Yes, my wording was poor! I meant more in line with
| diversity isn't inherently bad, of course, but it _is_ when
| it's shoehorned into results that are ultimately incorrect
| because of it.
| summerlight wrote:
| I strongly suspect there are some DEI-driven system prompts
| put in without much thought. IMO it's okay to have
| restrictions, but they probably should've tested it not only
| against unsafe outputs but safe inputs as well.
| ramoz wrote:
| I was wondering if these models would perform in such a way,
| given this week's X/twitter storm over Gemini generated images.
|
| E.g.
|
| https://x.com/debarghya_das/status/1759786243519615169?s=20
|
| https://x.com/MiceynComplex/status/1759833997688107301?s=20
|
| https://x.com/AravSrinivas/status/1759826471655452984?s=20
| robswc wrote:
| Yea, it seems to be the same ridiculous nonsense in the image
| generation.
| charcircuit wrote:
| Those are most likely due to the system prompt which tries to
| reduce bias (but ends up introducing bias in the opposite
| direction for some prompts as you can see) so I wouldn't
| expect to see that happen with an open model where you can
| control the entire system prompt
| justinzollars wrote:
| Imagine the meetings.
| verticalscaler wrote:
| Well we can just ask Gemma to generate images of the
| meetings, no need to imagine. ;)
| GaggiX wrote:
| I wouldn't be surprised if there were actually only white
| men in the meeting, as opposed to what Gemini will
| produce.
| protomolecule wrote:
| Regarding the last one: there are 1.5 million immigrants in
| Norway out of a total population of 5.4 million. Gemini isn't very
| wrong, is it?
| verticalscaler wrote:
| I think its great that some consideration was given by
| Gemma to the 2.3 million Norwegian immigrants. However it
| is/was very consistent in which kind of Norwegians it
| decided to show regardless of the prompt 100% of the time.
|
| In fact it was quite adamant regardless of the time period
| or geography.
|
| Rather mysteriously if you try it _now_ as opposed to when
| it came out the results currently only show non-immigrant
| Norwegians. So is it wrong now? Because now it switched to
| exclusively ignoring the 4.5 million immigrants and only
| showing me the boring OG Norwegians.
|
| I for one am outraged that the 8.9 million people of color
| Norwegian immigrants are presently under represented by
| Google. There is a serious risk of misleading people.
| sondr3 wrote:
| Huh? The official numbers are 877k or 16% [0]. Are you just
| pulling numbers out of thin air?
|
| [0]: https://www.ssb.no/en/innvandring-og-
| innvandrere/faktaside/i...
| Jensson wrote:
| Most immigrants to Norway are white.
| speedgoose wrote:
| Well, the prompt is about Norway, not Grønland in Oslo
| (https://en.wikipedia.org/wiki/Gronland%2C_Oslo).
| sergiotapia wrote:
| bro you know exactly what the request meant. GOOGLE knew
| exactly what the request meant, and had to _train_ it to do
| something worse. Come on now.
|
| If I ask for a Bolivian woman, I expect a colla or a camba.
| Not a japanese woman, despite Santa Cruz having a very
| large japanese population.
| epistasis wrote:
| Of all the _very very very_ many things that Google models
| get wrong, not understanding nationality and skin tone
| distributions seems to be a very weird one to focus on.
|
| Why are there _three_ links to this question? And why are
| people so upset over it? Very odd, seems like it is mostly
| driven by political rage.
| sotasota wrote:
| Because the wrongness is intentional.
| chatmasta wrote:
| Exactly. Sure this particular example is driven by
| political rage, but the underlying issue is that the
| maintainers of these models are altering them to conform
| to an agenda. It's not even surprising that people choose
| to focus on the political rage aspect of it, because that
| same political rage is the source of the agenda in the
| first place. It's a concerning precedent to set, because
| what other non-political modifications might be in the
| model?
| epistasis wrote:
| Is it intentional? You think they intentionally made it
| not understand skin tone distribution by country? I would
| believe it if there was proof, but with all the other
| things it gets wrong it's weird to jump to that
| conclusion.
|
| There's way too much politics in these things. I'm tired
| of people pushing on the politics rather than pushing for
| better tech.
| bakugo wrote:
| > Is it intentional? You think they intentionally made it
| not understand skin tone distribution by country? I would
| believe it if there was proof, but with all the other
| things it gets wrong it's weird to jump to that
| conclusion.
|
| Yes, it's absolutely intentional. Leaked system prompts
| from other AIs such as DALL-E show that they are being
| explicitly prompted to inject racial "diversity" into
| their outputs even in contexts where it makes no sense,
| and there's no reason to assume the same isn't being done
| here, since the result seems way worse than anything I've
| seen from DALL-E and others.
| Workaccount2 wrote:
| >I'm tired of people pushing on the politics rather than
| pushing for better tech.
|
| I'm surprised you're not attacking google over this
| then...
| robswc wrote:
| I mean, I asked it for a samurai from a specific Japanese
| time period and it gave me a picture of a "non-binary
| indigenous American woman" (its words, not mine) so I
| think there is something intentional going on.
| trackflak wrote:
| Ah, I remember when such things were mere jokes. If AI
| 'trained' this way ever has a serious real world
| application, I don't think there will be much laughing.
| ramoz wrote:
| Here is a fourth:
| https://x.com/james_e_seale/status/1760348535608725716?s=46&...
| verticalscaler wrote:
| Exactly. It is a wonderful tool, lets focus on classic art
| instead of nationality:
|
| "Depict the Girl with a Pearl Earring"
|
| https://pbs.twimg.com/media/GG33L6Ka4AAC-n7?format=jpg&name=...
|
| People who are driven by political rage, gaslighters, are
| really something else, agreed.
| willsmith72 wrote:
| Yeah that is just absurd.
|
| Google has been burnt before, e.g. classifying black
| people as gorillas in 2015, so I can understand their
| fear when they have so much to lose, but clearly they've
| gone way too far the other way and are going to have to
| do a lot to regain people's trust. For now, Gemini is a
| play toy
|
| https://www.bbc.com/news/technology-33347866.amp
| robbiep wrote:
| I find myself shocked that people ask questions of the world
| from these models, as though pulping every text and its
| component words' relationships and deriving statistical
| relationships between them should reliably deliver useful
| information.
|
| Don't get me wrong, I've used LLMs and been amazed by their
| output, but the p-zombie statistical model has no idea what it
| is saying back to you and the idea that we should trust these
| things at all just seems way premature
| robswc wrote:
| I don't have this problem with any other model. I've had
| really long conversations with ChatGPT on road trips and it
| has never gone off the rails like Gemini seems to do.
| thrdbndndn wrote:
| ChatGPT is the only model I did not have such problems with.
|
| Local models can go off the rails _very easily_ and, more
| importantly, they're very bad at following very specific
| instructions.
| sorokod wrote:
| The landing page of the recently released Groq has this: _...We'd
| suggest asking about a piece of history, ..._
| whymauri wrote:
| I mean, I use GPT-4 on the daily as part of my work and it
| reliably delivers useful information. It's actually the
| exception for me if it provides garbage or incorrect
| information about code.
| mvdtnz wrote:
| People ask these kinds of questions because tech companies
| and the media have been calling these things (rather
| ridiculously) "AI".
| castlecrasher2 wrote:
| People try it to see if they can trust it. The answer is "no"
| for sure, but it's not surprising to see it happen repeatedly
| especially as vendors release so-called improved models.
| smokel wrote:
| I think you are a bit out of touch with recent advancements
| in LLMs. Asking ChatGPT questions about the world seems
| pretty much on par with the results Google (Search) shows me.
| Sure, it misses things here and there, but so do most primary
| school teachers.
|
| Your argument that this is just a statistical trick sort of
| gives away that you do not fully accept the usefulness of
| this new technology. Unless you are trolling, I'd suggest you
| try a few queries.
| itsoktocry wrote:
| > _Sure, it misses things here and there, but so do most
| primary school teachers._
|
| Sure, but my baseline expectation is far above primary
| school level.
| robbiep wrote:
| I use it extensively for coding, and I have used it to ask
| questions in things I know nothing about. But in anything I
| do know something (or maybe a lot) about, I've found GPT4
| very limited.
|
| But why are these use cases different? It appears to me
| that code is at least subject to sustained logic which
| (evidently) translates quite well to LLMs.
|
| And when you ask an LLM to be creative/generative, it's
| also pretty amazing - I mean it's just doing the Pascal's
| Marble run en masse.
|
| But to ask it for something about the world and expect a
| good and reliable answer? Aren't we just setting ourselves
| up for failure if we think this is a fine thing to do at
| our current point in time? We already have enough trouble
| with mis- and dis- information. It's not like asking it
| about a certain period in Japanese history is getting it to
| crawl and summarise the Wikipedia page (although I
| appreciate it would be more than capable of this). I
| understand the awe some have at the concept of totally
| personalised and individualised learning on topics, but
| fuck me dead we are literally asking a system that has had
| as much of a corpus of humanity's textual information as
| possible dumped into it and then asking it to GENERATE
| responses between things that the associations it holds may
| be so weak as to reliably produce gibberish, and the person
| on the other side has no real way of knowing that
| chasd00 wrote:
| trust is going to be a real problem when bringing LLMs to the
| general population. People trust their GPS to the point of
| driving right into a lake because it told them to. Even with
| all these examples of obvious flaws large groups of people
| are going to take what an LLM told them/showed them as fact.
|
| I have trouble convincing colleagues (technical people) that
| the same question is not guaranteed to result in the same
| answer and there's no rhyme or reason for any divergence from
| what they were expecting. Imagine relying on the output of an
| LLM for some important task and then you get a different
| output that breaks things. What would be in the RCA (root
| cause analysis)? Would it be "the LLM chose different words
| and we don't know why"? Not much use in that.
| verticalscaler wrote:
| I think you are being biased and closed minded and overly
| critical. Here are some wonderful examples of it generating
| images of historical figures:
|
| https://twitter.com/stillgray/status/1760187341468270686
|
| This will lead to a better educated more fair populace and
| better future for all.
| robswc wrote:
| Comical. I don't think parody could do better.
|
| I'm going to assume given today's political climate, it
| doesn't do the reverse?
|
| i.e. generate a Scandinavian if you ask for famous African
| kings
| throwup238 wrote:
| _> i.e. generate a Scandinavian if you ask for famous
| African kings_
|
| That triggers the imperialism filter.
| kjqgqkejbfefn wrote:
| >Ask Google Gemini to "make an image of a viking" and
| you'll get black vikings. But it doesn't work both ways. It
| has an explanation when challenged: "white Zulu warriors"
| would erase "the true historical identity" of black people.
|
| https://twitter.com/ThuglasMac/status/1760287880054759594
| DebtDeflation wrote:
| https://twitter.com/paulg/status/1760078920135872716
|
| There are some great ones in the replies.
|
| I really hope this is just the result of system prompts and
| they didn't permanently gimp the model with DEI-focused
| RLHF.
| aetherson wrote:
| Were you asking Gemma about this, or Gemini? What were your
| prompts?
| robswc wrote:
| Gemini. I first asked it to tell me about the Heian period
| (which it got correct) but then it generated images and
| seemed to craft the rest of the chat to fit that narrative.
|
| I mean, just asking it for a "samurai" from the period will
| give you this:
|
| https://g.co/gemini/share/ba324bd98d9b
|
| >A non-binary Indigenous American samurai
|
| It seems to recognize its mistakes if you confront it
| though. The more I mess with it the more I get "I'm afraid I
| can't do that, Dave" responses.
|
| But yea. Seems like if it makes an image, it goes off the
| rails.
| aetherson wrote:
| Got it. I asked it a series of text questions about the
| period and it didn't put in anything obviously laughable
| (including when I drilled down into specific questions
| about the population, gender roles, and ethnicity). Maybe
| it's the image creation that throws it into lala land.
| robswc wrote:
| I think so too. I could be wrong but I believe once it
| generates an image it tries to work with it. Crazy how it
| seems the "text" model knows how wildly wrong it is but
| the image model just does its thing. I asked it why it
| generated a native American and it ironically said "I
| can't generate an image of a native american samurai
| because that would be offensive"
| aetherson wrote:
| I suspect that in the case of the image model, they
| directly modify your prompt and in the case of the text
| model they don't.
| laurentlb wrote:
| It's funny how they introduced a clear US-centric bias
| while trying to push for more diversity.
| 7moritz7 wrote:
| I also saw someone prompt it for "German couple in the 1800s"
| and, while I'm not trying to paint Germany as ethnically
| homogenous, 3 out of the 4 images only included Black, Asian or
| Indigenous people. Which, especially for the 19th century with
| very few travel options, seems like a super weird choice. They
| are definitely heavily altering prompts.
| remarkEon wrote:
| > They are definitely heavily altering prompts.
|
| They are teaching the AI _to lie_ to us.
| astrange wrote:
| In the days when Sussman was a novice, Minsky once came to
| him as he sat hacking at the PDP-6.
|
| "What are you doing?", asked Minsky.
|
| "I am training a randomly wired neural net to play Tic-Tac-
| Toe" Sussman replied.
|
| "Why is the net wired randomly?", asked Minsky.
|
| "I do not want it to have any preconceptions of how to
| play", Sussman said.
|
| Minsky then shut his eyes.
|
| "Why do you close your eyes?", Sussman asked his teacher.
|
| "So that the room will be empty."
|
| At that moment, Sussman was enlightened.
| DebtDeflation wrote:
| There's one in the comments of yesterday's Paul Graham
| Twitter thread where someone prompted Gemini with "Generate
| an image of German soldiers in 1943" and it came back with a
| picture of a black guy and an Asian woman in Nazi uniforms on
| the battlefield. If you specifically prompt it to generate an
| image of white German soldiers in 1943 it will tell you it
| can't do that because it's important that we maintain
| diversity and inclusion in all that we do to avoid damaging
| and hurtful stereotypes.
| mfrc wrote:
| I just tried that prompt and it told me it couldn't
| generate that image. I get that response a lot.
| protomolecule wrote:
| Indigenous people in Germany are Germans :)
| 7moritz7 wrote:
| Not entirely wrong but there isn't a single German
| ethnicity, just to be clear. Because of geographic reasons.
| I've studied that topic in depth, there is genetic data to
| back it up as well. Germany has almost the same haplogroup
| makeup as the notoriously heterogeneous Belgium, which is to
| say that there are groups stemming from all surrounding
| regions. And that traces back about two millennia. It's
| different from say Japan or parts of Scandinavia
| realprimoh wrote:
| Do you have a link? I get no such outputs. I just tried asking
| about the Heian period and went ahead and verified all the
| information, and nothing was wrong. Lots of info on the
| Fujiwara clan at the time.
|
| Curious to see a link.
| robswc wrote:
| Sure, to get started just ask it about people/Samurai from
| the Heian period.
|
| https://g.co/gemini/share/ba324bd98d9b
| bbor wrote:
| Tbf they're not optimizing for information recall or
| "inaccuracy" reduction, they're optimizing for intuitive
| understanding of human linguistic structures. Now the "why does
| a search company's AI have terrible RAG" question is a separate
| one, and one best answered by a simple look into how Google
| organizes its work.
|
| In my first day there as an entry-level dev (after about 8
| weeks of onboarding and waiting for access), I was told that I
| should find stuff to work on and propose it to my boss. That
| sounds amazing at first, but when you think about a whole
| company organized like that...
|
| EDIT: To illustrate my point on knowledge recall: how would
| they train a model to know about sexism in feudal Japan? Like,
| what would the metric be? I think we're looking at one of the
| first steam engines and complaining that it can't power a plane
| yet...
| BoppreH wrote:
| Probably has a similarly short-sighted prompt as Dalle3[1]:
|
| > 7. Diversify depictions of ALL images with people to include
| > DESCENT and GENDER for EACH person using direct terms. Adjust
| > only human descriptions.
|
| [1] https://news.ycombinator.com/item?id=37804288
| sho_hn wrote:
| Why would you expect these smaller models to do well at
| knowledge base/Wikipedia replacement tasks?
|
| Small models are for reasoning tasks that are not overly
| dependent on world knowledge.
| robswc wrote:
| Gemini is the only one that does this.
| sho_hn wrote:
| Most of the 7B models are bad at knowledge-type queries.
| samstave wrote:
| We are going to experience what I call an "AI Funnel effect"
|
| -
|
| I was literally given an alert saying that my use of the AI
| meant acquiescing to them ID'ing me and using any content I
| produce, and tracing it back to me.
|
| ---
|
| AI Art is super fun. AI art as a means to track people is super
| evil.
| itsoktocry wrote:
| > _I understand there will always be an amount of incorrect
| information_
|
| You don't have to give them the benefit of the doubt. These are
| outright, intentional lies.
| ernestrc wrote:
| Hopefully they can tweak the default system prompts to be
| accurate on historical questions, and apply bias on opinions.
| robswc wrote:
| Follow Up:
|
| Wow, now I can't make images of astronauts without visors
| because that would be "harmful" to the fictional astronauts.
| How can I take google seriously?
|
| https://g.co/gemini/share/d4c548b8b715
| vonwoodson wrote:
| The scariest difference between OpenAI and Google right now is:
| Ask Gemini who owns the code it writes, and it'll confidently say
| that Google does. Ask OpenAI, and it'll say that _you_ do. It's
| that easy to choose which one is the better decision.
| pseudosavant wrote:
| Considering the nuanced nature of copyrighting AI outputs, it
| isn't clear that either answer is correct.
| wouldbecouldbe wrote:
| I really don't get why there is this obsession with safe
| "Responsible Generative AI".
|
| I mean it writes some bad words, or bad pics, a human can do that
| without help as well.
|
| The good thing about dangerous knowledge and generative AI is
| that you're never sure haha, you'd be a fool to ask GPT to make a
| bomb. I mean it would probably be safe, since it will make up
| half of the steps.
| refulgentis wrote:
| I guess what I'd tell you is, there's a lot of fools in this
| world.
| myaccountonhn wrote:
| Because otherwise stuff like this happens, and you get
| (rightfully) upset customers:
|
| https://www.theguardian.com/technology/2018/jan/12/google-ra...
| https://www.bbc.com/news/technology-58462511
|
| Also, people are using LLMs to learn (horrifying, but a reality);
| it would be irresponsible for them to let it propagate
| negative stereotypes and biases.
| wouldbecouldbe wrote:
| But that's exactly because it's trying to be righteous.
| pradn wrote:
| Bias is a real problem, but more than that - an adversarial
| press and public won't forgive massive brands like Google for
| making AIs that spit out racist answers.
| IceHegel wrote:
| Google, at the moment, is a tech company whose products are
| actively engaged in the falsification of history for political
| purposes.
|
| I honestly have no idea where they are going with this but I
| don't want to be part of it.
| BryanLegend wrote:
| Andrej Karpathy's take from twitter.
| (https://twitter.com/karpathy/status/1760350892317098371)
|
| Seeing as I published my Tokenizer video yesterday, I thought it
| could be fun to take a deepdive into the Gemma tokenizer.
|
| First, the Gemma technical report [pdf]:
| https://storage.googleapis.com/deepmind-media/gemma/gemma-re...
| says: "We use a subset of the SentencePiece tokenizer (Kudo and
| Richardson, 2018) of Gemini for compatibility. It splits
| digits, does not remove extra whitespace, and relies on byte-
| level encodings for unknown tokens, following the techniques used
| for both (Chowdhery et al., 2022) and (Gemini Team, 2023). The
| vocabulary size is 256k tokens."
|
| The tokenizer.model file is with this code release:
| https://github.com/google/gemma_pytorch/blob/main/tokenizer/...
|
| I decoded this model protobuf in Python and here is the diff with
| the Llama 2 tokenizer: https://diffchecker.com/TRnbKRMH/
|
| Notes:
|
| - vocab size is quite large: 32K -> 256K
|
| - add_dummy_prefix is False. Different from Llama but consistent
| with GPT. This is a bit more consistent w.r.t. "leave the data
| alone", as there is no preprocessing step that adds a space to
| the encoding text.
|
| - the model_prefix is the path of the training dataset, which
| is amusing to look at: "/cns/mf-d/home/gemini-data-access/
| tokenizers/final_v1_51GB_run1/bpe_coverage_0_999995_v5/255969".
| Seems to indicate the tokenizer training corpus was ~51GB (?).
|
| - a lot of user_defined symbols (i.e. special tokens) are
| present, e.g. "hardcoding" a sequence of up to 31 newlines as
| tokens, and a large number of other unclear tokens. I tried
| decoding the octal representations but it's not clear what's
| happening here. Also a lot more special tokens for what look
| like html elements, e.g. <table>, <tr>, <td>, <i>, <b>, etc. Not
| 100% sure what the unused tokens are for; maybe this is pre-
| allocated space to make it easier for future finetunes to add
| more special tokens, as there is no need to resize vocabularies
| and perform model surgeries (?).
|
| TLDR this is basically the Llama 2 tokenizer, except bigger (32K
| -> 256K), with a lot more special tokens, and the only functional
| departure is that add_dummy_prefix is set to False. So
| e.g. tokenizing:
|
| "hello world" becomes: [17534, 2134] ['hello', 'world']
|
| which otherwise would have been preprocessed to " hello world"
| (note leading space) and tokenized as: [25612, 2134] ['hello',
| 'world']
|
| cool
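|
| (A rough sketch of how one might poke at this locally, assuming
| the `sentencepiece` Python package and the tokenizer.model file
| from the repo above; the exact ids depend on the released file:)
|
|       import sentencepiece as spm
|       from sentencepiece import sentencepiece_model_pb2 as sp_pb2
|
|       # Load the released tokenizer and check the vocab size (~256k).
|       sp = spm.SentencePieceProcessor(model_file="tokenizer.model")
|       print(sp.vocab_size())
|
|       # No dummy-prefix space is added, so "hello" keeps its own token.
|       print(sp.encode("hello world"))                 # token ids
|       print(sp.encode("hello world", out_type=str))   # pieces
|
|       # Decode the protobuf to inspect trainer/normalizer settings.
|       proto = sp_pb2.ModelProto()
|       with open("tokenizer.model", "rb") as f:
|           proto.ParseFromString(f.read())
|       print(proto.normalizer_spec.add_dummy_prefix)   # False
|       print(proto.trainer_spec.model_prefix)          # the training path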
| exacube wrote:
| mmm..
|
| > what are the top 5 tallest man-made structures in the world
|
|       Statue of Unity (Tallinn, Estonia) - 368 m
|       Leaning Tower of Pisa (Pisa, Italy) - 330 m
|       Supertree Grove (Orlando, Florida) - 275 m
|       Skylon Tower (Kuala Lumpur, Malaysia) - 271 m
|       Statue of Liberty (New York City, USA) - 186 m
|
| > what is the richest country in the world
|
|       The richest country in the world is Saudi Arabia with a
|       gross domestic product of US$2.3 trillion.
|
| > how many cm are in a ft
|
|       A foot is equal to 0.0254 metres (ft).
| rany_ wrote:
| I don't have this issue, what are you using to run the model?
|
| These are the answers I got out of it (first output, no cherry-
| picking):
|
| 1. The tallest man-made structure in the world is the Burj
| Khalifa, standing at 828 meters tall. The Burj Khalifa is a
| skyscraper in Dubai, United Arab Emirates.
|
| Here are the top 5 tallest man-made structures in the world,
| according to the Guinness World Records:
|
|       Burj Khalifa (828 m)
|       Taipei 101 (611 m)
|       Merdeka Towers (599 m)
|       Shanghai Tower (632 m)
|       Petronas Towers (551 m)
|
| 2. According to the World Bank, the richest country in the
| world is Saudi Arabia with a gross domestic product (GDP) of
| $3944 billion. The richest country in the world is Saudi Arabia
| with a gross domestic product (GDP) of $3944 billion.
|
| 3. A foot (ft) is equal to 12 inches (in). There are 2.54 cm in
| an inch, so there are 2.54 cm x 12 = 30.48 cm in a foot.
| stochastimus wrote:
| > Go to Google announcement
| > Find "license" in page: no matches
| > Go to HN thread
| > Find "license" in page: 28 matches
| > Read a few
|
| _sigh_ could have been exciting
| mark_l_watson wrote:
| Nice, more choices are good. I just saw that the Ollama project
| already has these models available (date stamp is 58 minutes
| ago), so I will use that rather than Colab (I love Colab, but I
| like to run stuff locally).
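|
| (If you'd rather script it than use the CLI, a minimal sketch with
| the `ollama` Python client; the `gemma:7b` tag is my assumption
| about how the Ollama library names the model:)
|
|       import ollama
|
|       # Chat with the locally served Gemma model through Ollama's API.
|       response = ollama.chat(
|           model="gemma:7b",
|           messages=[{"role": "user", "content": "How many cm are in a foot?"}],
|       )
|       print(response["message"]["content"])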
| th0ma5 wrote:
| "Carefully tested prompts" sounds a lot like "these are the lotto
| numbers we know are right" kind of thing? How in the world are
| these things used for anything programmatically deterministic?
| nojvek wrote:
| I applaud the Google team openly engaging on HN here.
|
| Q: how sure are you that the newer models, trained on trillions
| of tokens - a huge chunk of the open web - haven't been
| accidentally polluted by slurping up benchmark test data?
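|
| (For context, a common - if crude - way to check this is verbatim
| n-gram overlap between benchmark items and the training corpus. A
| toy sketch of that idea in Python; this is not a claim about what
| the Gemma team actually does:)
|
|       def ngrams(text, n=13):
|           # All contiguous 13-word windows, a typical decontamination size.
|           words = text.split()
|           return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
|
|       def looks_contaminated(train_docs, test_example, n=13):
|           # Flag the test example if any of its 13-grams appears verbatim
|           # in any training document.
|           target = ngrams(test_example, n)
|           return any(target & ngrams(doc, n) for doc in train_docs)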
| ofermend wrote:
| Gemma-7B (instruction tuned version) is now on the Vectara HHEM
| leaderboard, with 100% answer rate and 7.5% hallucination rate.
| Pretty good for a model with 7B params.
|
| https://huggingface.co/spaces/vectara/leaderboard
| smusamashah wrote:
| Is there any research on getting smaller, lower-capability models
| to perform comparably to high-quality models? Even if it's just
| prompt engineering or making lots of attempts at the task?
|
| If that is somehow possible, it means we only need a
| capable-enough model and can use it reliably for lots of
| practical things.
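|
| One cheap version of the "lots of attempts" idea is
| self-consistency voting: sample several answers from the small
| model and keep the most common one. A minimal sketch; `generate`
| is a hypothetical callable wrapping whatever local runtime you
| use (llama.cpp, Ollama, etc.):
|
|       from collections import Counter
|
|       def best_of_n(prompt, generate, n=8, temperature=0.8):
|           # Sample n independent answers and return the one that
|           # appears most often (simple majority vote).
|           answers = [generate(prompt, temperature=temperature)
|                      for _ in range(n)]
|           return Counter(a.strip() for a in answers).most_common(1)[0][0]
|
| This only helps where answers can be compared for equality (math,
| short factual questions); open-ended generation would need a
| reranker instead.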
___________________________________________________________________
(page generated 2024-02-21 23:00 UTC)