[HN Gopher] Falcon 2
       ___________________________________________________________________
        
       Falcon 2
        
       Author : tosh
       Score  : 215 points
       Date   : 2024-05-13 15:17 UTC (7 hours ago)
        
 (HTM) web link (www.tii.ae)
 (TXT) w3m dump (www.tii.ae)
        
       | tosh wrote:
       | huggingface model card: https://huggingface.co/tiiuae/falcon-11B
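        | 
        | For anyone who wants to poke at it, here's a minimal loading
        | sketch (mine, not from the card). It assumes the standard
        | transformers text-generation API, a recent transformers +
        | accelerate install, and enough GPU memory for an 11B model:
        | 
        |   # hypothetical quick test of the tiiuae/falcon-11B checkpoint
        |   from transformers import AutoModelForCausalLM, AutoTokenizer
        | 
        |   model_id = "tiiuae/falcon-11B"
        |   tok = AutoTokenizer.from_pretrained(model_id)
        |   model = AutoModelForCausalLM.from_pretrained(
        |       model_id, device_map="auto", torch_dtype="auto"
        |   )
        |   prompt = "The Falcon 2 series is"
        |   inputs = tok(prompt, return_tensors="pt").to(model.device)
        |   out = model.generate(**inputs, max_new_tokens=50)
        |   print(tok.decode(out[0], skip_special_tokens=True))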
        
         | BryanLegend wrote:
         | "When tested against several prominent AI models in its class
         | among pre-trained models, Falcon 2 11B surpasses the
         | performance of Meta's newly launched Llama 3 with 8 billion
         | parameters (8B), and performs on par with Google's Gemma 7B at
         | first place, with a difference of only 0.01 average performance
         | (Falcon 2 11B: 64.28 vs Gemma 7B: 64.29) according to the
         | evaluation from Hugging Face."
         | 
         | From: https://falconllm.tii.ae/falcon-2.html
        
       | Hugsun wrote:
       | > New Falcon 2 11B Outperforms Meta's Llama 3 8B, and Performs on
       | par with leading Google Gemma 7B Model
       | 
       | I was strongly under the impression that Llama 3 8B outperformed
       | Gemma 7B on almost all metrics.
        
         | 7thpower wrote:
         | I found that curious too.
         | 
         | I don't stay up on the benchmarks much these days though; I've
         | fully dedicated myself to b-ball.
         | 
          | I'm actually a bit better than LeBron btw, who is nowhere near
          | as good as my 3-year-old daughter. I occasionally beat her. At
          | basketball.
        
         | JimDabell wrote:
         | Anecdotal, I know, but in my experience Gemma is absolutely
         | worthless and Llama 3 8b is exceptionally good for its size.
         | The idea that Gemma is ahead of Llama 3 is bizarre to me.
         | Surely there's some contamination or something if Gemma is
         | showing up ahead in some benchmarks!?
        
           | freedomben wrote:
           | Adding more anecdata, but this has been exactly my experience
           | as well. I haven't dug into details about the benchmarks, but
           | just trying to use the things for basic question asking,
            | Llama 3 is so much better it's like comparing my Milwaukee
            | drill to my son's Fisher-Price plastic toy drill.
        
             | hehdhdjehehegwv wrote:
             | Yeah, Llama3 is also smoking Mistral/Mixtral. It's my new
             | model of choice.
        
         | coder543 wrote:
         | Keep in mind that this is a comparison of base models, not chat
         | tuned models, since Falcon-11B does not have a chat tuned model
         | at this time. The chat tuning that Meta did seems better than
         | the chat tuning on Gemma.
         | 
         | Regardless, the Gemma 1.1 chat models have been fairly good in
         | my experience, even if I think the Llama3 8B chat model is
         | definitely better.
         | 
          | CodeGemma 1.1 7B is especially underrated based on my testing
          | of other relevant coding models. The CodeGemma 7B base model
          | is one of the best models I've tested for code completion, and
          | the chat model is one of the best I've tested for writing
          | code. Some other models seem to game the benchmarks better,
          | but in real-world use they don't hold up as well as CodeGemma
          | for me. I look forward to seeing how CodeLlama3 does, but it
          | doesn't exist yet.
        
           | Hugsun wrote:
            | The model type is a good point. It's hard to track all the
            | variables in this very fast-paced field.
           | 
            | Thank you for sharing your CodeGemma experience. I haven't
            | found an Emacs setup I'm satisfied with that uses a local
            | LLM, but it will surely happen one day. Surely.
        
           | attentive wrote:
            | For me, CodeGemma is super slow. I'd say 3-4 times slower
            | than llama3. I am also looking forward to CodeLlama3, but I
            | have a feeling Meta can't improve much on llama3. Was there
            | anything official from Meta?
        
             | coder543 wrote:
             | CodeGemma has fewer parameters than Llama3, so it
             | absolutely should not be slower. That sounds like a
             | configuration issue.
             | 
             | Meta originally released Llama2 and CodeLlama, and
             | CodeLlama vastly improved on Llama2 for coding tasks.
             | Llama3-8B is _okay_ at coding, but I think
             | CodeGemma-1.1-7b-it is significantly better than
             | Llama3-8B-Instruct, and possibly a little better than
             | Llama3-70B-Instruct, so there is plenty of room for Meta to
             | improve Llama3 in that regard.
             | 
             | > Was there anything official from Meta?
             | 
             | https://ai.meta.com/blog/meta-llama-3/
             | 
             | "The text-based models we are releasing today are the first
             | in the Llama 3 collection of models."
             | 
             | Just a hint that they will be releasing more models in the
             | same family, and CodeLlama3 seems like a given to me.
        
       | latchkey wrote:
       | It would be interesting to hear more about the compute used to
       | build this.
        
         | mmoskal wrote:
         | From model card:
         | 
         | Falcon2-11B was trained on 1024 A100 40GB GPUs for the majority
         | of the training, using a 3D parallelism strategy (TP=8, PP=1,
         | DP=128) combined with ZeRO and Flash-Attention 2.
         | 
         | Doesn't say how long though.
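          | 
          | As a quick sanity check on those numbers (my own arithmetic,
          | not from the card), the three parallelism degrees multiply out
          | to the quoted GPU count:
          | 
          |   tp, pp, dp = 8, 1, 128   # tensor-, pipeline-, data-parallel
          |   gpus = tp * pp * dp
          |   assert gpus == 1024      # matches the 1024 A100 40GB figure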
        
           | kouteiheika wrote:
           | It does say how long on Huggingface:
           | 
           | > The model training took roughly two months.
        
       | wg0 wrote:
        | The time is coming when you'll be able to select from hundreds
        | of models, have them downloaded on demand if they aren't there
        | already, and run inference entirely locally and offline, even on
        | your _phone_. I'd guess we'll be there no later than 2030.
        | 
        | I am not so up to date with the hardware landscape, but I don't
        | think the smart people working on it will fail to notice the
        | need.
        
         | treprinum wrote:
          | Cloud models will always have an edge over local models. Maybe
          | in 2030 your iPhone will run GPT-4 locally, but cloud-based
          | GPT-9 will solve all your kids' homework, do 95% of your job,
          | and manage your household.
        
       | simonw wrote:
       | The license is not good: https://falconllm-
       | staging.tii.ae/falcon-2-terms-and-conditio...
       | 
       | It's a modified Apache 2 license with extra clauses that include
       | a requirement to abide by their acceptable use policy, hosted
       | here: https://falconllm-staging.tii.ae/falcon-2-acceptable-use-
       | pol...
       | 
       | But... that modified Apache 2 license says the following:
       | 
       | "The Acceptable Use Policy may be updated from time to time. You
       | should monitor the web address at which the Acceptable Use Policy
       | is hosted to ensure that your use of the Work or any Derivative
       | Work complies with the updated Acceptable Use Policy."
       | 
       | So no matter what you think of their current AUP they reserve the
       | right to update it to anything they like in the future, and
       | you'll have to abide by the new one!
       | 
       | Great example of why I don't like the trend of calling licenses
       | like this "open source" when they aren't compatible with the OSI
       | definition.
        
         | cs702 wrote:
         | Well, that really _sucks_.
         | 
         | Thanks But No Thanks.
        
         | tantalor wrote:
         | It's probably unenforceable
        
           | SiempreViernes wrote:
            | When it is backed by the UAE, the muscle you have to contend
            | with is not simply legal muscle; it also includes armed
            | muscle of questionable moral fibre (see their support for
            | the RSF).
        
             | CityOfThrowaway wrote:
              | Do you have any examples of the UAE using its military
              | force to make foreign companies abide by a contract?
        
               | notatoad wrote:
               | they're not going to send their army after you, but
               | that's not what this means.
               | 
               | friendly middle-eastern countries negotiate all kinds of
               | concessions from western governments in exchange for
               | allowing military operations to stage in their country,
               | and "we want you to enforce our IP laws" is an easy one
               | for western governments to grant.
        
           | Perz1val wrote:
           | Maybe, but would you risk trying?
        
           | davedx wrote:
           | How is changing a software licence unenforceable?
        
             | tantalor wrote:
             | scolson said it best:
             | 
             |  _You can retroactively make a license more open, but you
             | cannot retroactively make it more closed._
             | 
             | https://news.ycombinator.com/item?id=10672751
        
               | jacoblambda wrote:
               | And even then you aren't retroactively making it more
               | open. You are just now offering an additional, more open
               | license as well as the existing one.
               | 
               | You haven't taken the original license away, you just
               | provided a better default option.
               | 
                | The same, weirdly enough, goes in reverse as well. You
                | can provide a more restrictive license retroactively,
                | even if the rights holders don't consent, as long as the
                | existing license is compatible with the new, more
                | restrictive one: e.g. you can promote a work from
                | Apache-2.0 to GPL-3.0-or-later, as the former is fully
                | compatible with the latter. However, you can't stop
                | existing users from using it as Apache-2.0; you can only
                | stop offering it yourself under that license (but anyone
                | who has an existing Apache-2.0 copy, or who is an
                | original rights holder, can freely distribute it).
        
               | selimthegrim wrote:
               | Didn't Oracle try with OpenSolaris?
        
         | binarymax wrote:
         | Not the first time they did some license shenanigans (happened
         | with Falcon 1). I applaud their efforts but it seems they are
         | still trying to figure out if/how to monetize.
        
           | Havoc wrote:
           | They've got Saudi oil money behind them no?
           | 
            | Falcon always struck me more as a regional prestige project
            | than as "how to monetise".
        
             | Cyph0n wrote:
             | Emirati oil money
        
             | baq wrote:
             | The Saudis are on a clock. It's got decades on it, but the
             | person who witnesses the end of their oil has already been
             | born.
             | 
             | It may be a prestige project today but make no mistake
             | there's a long game behind it.
        
           | spacebanana7 wrote:
           | I doubt the Emiratis have much interest in monetisation. The
           | value they're probably looking for is in LLMs as a media
           | asset.
           | 
           | Just like Al Jazeera is valuable for Qataris and sports are
           | valuable for the Saudis. These assets create goodwill
           | domestically and with international stakeholders. Sometimes
           | they make money, but that's secondary.
           | 
           | If people spend a few hours a day talking to LLMs there's
           | some media value there.
           | 
           | They may also fear that Western models would be censored or
           | licensed in a way harmful to UAE security and cultural
           | objectives. Imagine if Llama 4's license prevented military
           | use without approval by some American agency.
        
         | abhorrence wrote:
         | > So no matter what you think of their current AUP they reserve
         | the right to update it to anything they like in the future, and
         | you'll have to abide by the new one!
         | 
          | I'm so curious if this would actually hold up in court. Does
          | anyone know if there's any case law / precedent around this?
        
           | davedx wrote:
            | Of course; projects change their licences all the time, so
            | why wouldn't it be legal? There's a long history of startups
            | that started with open source/open core gradually closing
            | off or commercialising the licence. This isn't anything new
            | at all.
           | 
           | This is why it's good to read licenses before adopting the
           | tech, especially if it's at all core to your
           | business/project.
        
             | tux3 wrote:
             | No. Projects sometimes stop offering the previous license
             | and start using a different one _for new work_.
             | 
             | But if your project is Apache-2, you cannot take away
             | someone's license after the fact. You can only stop giving
             | away new Apache-2 licenses from that point on.
             | 
              | The difference here is that the license itself has mystery
              | terms that can change at any time. That is very much not
              | done all the time.
        
             | mmoskal wrote:
             | Projects change license for new code going forward. The old
             | code remains available under the previous license (and
             | sometimes new). Here, they are able to change the
             | conditions for existing weights.
        
             | wtallis wrote:
             | Releasing a new version with a new license is not the same
             | as retroactively changing the license terms for copies
             | already distributed of existing versions.
        
         | Havoc wrote:
          | The 40b model appears to be pure Apache though
        
           | coder543 wrote:
           | That is Falcon 1, not Falcon 2.
           | 
           | Falcon 1 is entirely obsolete at this point, based on every
           | benchmark I've seen.
        
         | JimDabell wrote:
         | So basically you can never use this for anything non-trivial
         | because they can deny your use-case at any time without even
         | notifying you.
        
           | dclowd9901 wrote:
           | It's the UAE - are we really shocked here?
        
           | worldsayshi wrote:
              | Are licenses that can be altered after the fact even legal?
              | Feels like more companies would use them if they were...
        
             | Domenic_S wrote:
             | A large company I work for declined to buy licenses from a
             | supplier that had this clause. From my understanding it's
             | not really legal - or at worst it's a gray area - but the
             | language is just too risky if you're not looking to be the
             | test case in court.
        
               | taneq wrote:
               | Exactly. It's not about whether it's enforceable, it's
               | about the fact that a sketchy license indicates that they
               | might not always operate in good faith.
        
             | tptacek wrote:
             | Why wouldn't they be? The starting presumption for a new
             | work is that you have no permission to use it at all.
        
               | xmprt wrote:
               | I suppose a better question would be "how drastic can the
               | change be to the license?" because by adding that term,
                | you're basically superseding every other term in the
               | license. How do licenses deal with contradictory terms if
               | that's even possible?
        
               | tptacek wrote:
               | The license is explicit that it can be updated
               | unilaterally. Nobody can adopt this software and claim
               | not to know that's a possibility. There are attorneys
               | specializing in open source licenses who comment on HN
               | regularly, and maybe they'll surprise us, but, as a non-
               | lawyer nerding out on this stuff I put all my chips down
               | on "this is legally fine".
        
               | gitfan86 wrote:
               | Judges don't enforce bad contracts.
               | 
                | Imagine if a landlord sued a tenant after a year for a
                | new roof on the apartment because the contract stated
                | that "tenant will be responsible for all repairs", but
                | the tenant pointed out that the contract also said "the
                | house is in perfect and new condition".
               | 
               | Both things cannot be true, so the judge throws it out.
               | 
                | Same thing here: you can't grant someone a license to use
                | something and then immediately say, "You can't use this
                | without checking with us first". It is contradictory.
        
               | xmprt wrote:
                | The two reasons I think it's not that black and white
                | are:
                | 
                | 1. This brings up the question of being able to agree to
                | a contract before the contract is written, which makes
                | no sense.
                | 
                | 2. If it's legal, then why don't all companies do it?
                | Instead, companies like Google regularly put out updated
                | terms of service which you have to agree to before
                | continuing to use their service. Oftentimes you don't
                | realize it because it's just another checkbox or button
                | to click before signing in.
        
               | tptacek wrote:
               | There are all sorts of contracts that can be unilaterally
               | terminated.
        
               | TrueDuality wrote:
                | It unfortunately isn't worded so that it only affects new
                | usage. If you only needed to check once, before your
                | initial use, that would be shady but unquestionably
                | legal, as the terms of the contract are clear when you
                | are entering into the agreement.
               | 
               | This clause allows them to arbitrarily change the
               | contract with you at will, with no notice. That
               | _shouldn't_ be enforceable but AFAIK that kind of
               | contract has never been tested. It is _likely_
               | unenforceable though.
        
             | gitfan86 wrote:
             | Yes and No. If the full text of the license is "You must
             | contact us for written approval before doing anything with
             | this product" then yes that can be enforced.
             | 
              | But a document like this, which basically has a bunch of
              | words followed by a "Just kidding" line, is not
              | enforceable, because it contradicts the previous language.
              | A judge would throw the whole thing out because it doesn't
              | meet the standard of a contract.
        
               | zarzavat wrote:
                | Which matters because a trained LLM very likely is not
                | protected by copyright.
               | 
               | While for a copyrighted work the default is that you
               | can't use it unless you have a valid license, for an LLM
               | the default is that you _can_ use it unless you have
               | signed a contract restricting what you can do in return
               | for some consideration.
               | 
                | I don't think these contracts are designed to be
                | enforced, because an attempt to enforce them would
                | reveal them to be hot air; they are just there to scare
                | people into compliance.
               | 
               | Better to download LLMs from unofficial sources to avoid
               | attempted contract shenanigans.
        
               | drited wrote:
                | If you download from unofficial sources, do you inspect
                | the files manually or with a tool as verification?
        
           | candiddevmike wrote:
           | How will they know you're using it?
        
             | dartos wrote:
              | Some models may have watermarks.
              | 
              | Though they're easy to work around if you know they're
              | there.
        
         | darby_eight wrote:
         | > Great example of why I don't like the trend of calling
         | licenses like this "open source" when they aren't compatible
         | with the OSI definition.
         | 
          | Open source was _always_ a way to weasel about terms; that's
          | why it's open source and not free.
        
           | croes wrote:
           | More the other way around.
           | 
           | Just because it's free doesn't mean you can change anything
           | or get the source.
           | 
            | Some claim something is open source when it's just free of
            | charge.
        
       | dongobread wrote:
       | Their benchmark results seem roughly on par with Mistral 7B and
       | Llama 3 8B, which hardly seems that great given the increase in
       | model size.
       | 
       | https://huggingface.co/tiiuae/falcon-11B
       | 
       | https://huggingface.co/meta-llama/Meta-Llama-3-8B
       | 
       | https://mistral.ai/news/announcing-mistral-7b/
        
         | hehdhdjehehegwv wrote:
         | Seems to be the case with all their models - really huge in
         | size, no actual performance gains for the effort.
         | 
          | Their RefinedWeb dataset is heavily censored, so maybe that
          | has something to do with it. It's very morally conservative -
          | total exclusion of pornography and other topics.
         | 
          | So I'd not be surprised if some of the issues stem from them
          | just filtering out too much content and adding more of the
          | same instead.
        
         | nabakin wrote:
         | Exactly. Falcon-180b had a lot of hype at first but the
         | community soon realized it was nearly worthless. Easily
         | outperformed by smaller LLMs in the general case.
         | 
         | Now they are back and claiming their falcon-11b LLM outperforms
         | Llama 3 8b. I already see a number of issues with this:
         | 
         | - falcon-11b is like 40% larger than Llama 3 8b so how can you
         | compare them when they aren't in the same size class
         | 
         | - their claim seems to be based on automated benchmarks when it
         | has long been clear that automated benchmarks are not enough to
         | make that claim
         | 
          | - some of their automated benchmarks are _wildly_ lower than
          | Llama 3 8b's scores. It only beats Llama 3 8b on one
          | benchmark, and just barely. I can make an LLM that does the
          | best anyone has ever seen on one benchmark, but that doesn't
          | mean my LLM is good. Far from it.
         | 
         | - clickbait headline with knowingly premature claims because
         | there has been zero human evaluation testing
         | 
         | - they claim their LLM is better than Llama 3 but completely
         | ignore Llama 3 70b
         | 
         | Honestly, it annoys me how much attention tiiuae get when they
         | haven't produced anything useful and continue this misleading
         | clickbait.
        
         | marci wrote:
          | Maybe those aren't the right metrics to compare.
          | 
          | True, the model is bigger, but it required fewer tokens than
          | Llama 3 to train. The issue is that when there are no open
          | datasets, it's hard to really compare and replicate. Is it
          | because of the model's architecture? Dataset quality? Model
          | size? A mixture of those? Something else?
        
       | iLoveOncall wrote:
       | > Outperforming Meta's New Llama 3
       | 
       | I know it's hard to objectively rank LLMs, but those are really
       | ridiculous ways to keep track of performance.
       | 
        | If my reference for performance is (like the vast majority of
        | users) ChatGPT-3.5, I have to first know how Llama 3 compares to
        | that to then understand how this new model compares to what I'm
        | using at the moment.
       | 
       | Now, if I look for the performance of Llama 3 compared to
       | ChatGPT-3.5, I don't find it on the official launch page
       | https://ai.meta.com/blog/meta-llama-3/ where it is compared to
        | Gemma 7B-It, Mistral 7B Instruct, Gemini Pro 1.5 and Claude 3
       | Sonnet.
       | 
       | How does Gemma 7B perform? Well you can only find out how it
       | compares to Llama 2 on the official launch page
       | https://blog.google/technology/developers/gemma-open-models/.
       | 
       | Let's look at the Llama 2 performance on its launch announcement:
       | https://llama.meta.com/llama2/ No GPT-3.5 turbo again.
       | 
       | I get that there are multiple aspects and that there's probably
       | not one overall "performance" metric across all tasks, and I get
        | that you can probably find a comparison between two specific
       | models relatively easily, but there absolutely needs to be a
       | standard by which those performances are communicated. The number
       | of hoops to jump through is ridiculous.
        
         | coder543 wrote:
         | Human preference data from side by side, anonymous comparisons
         | of models: https://leaderboard.lmsys.org/
         | 
          | Llama3 8B significantly outperforms ChatGPT-3.5, and Llama3 70B
          | is significantly better than that. These are Elo ratings, so it
          | would not be accurate to say X is 10% better than Y just
          | because its score is 10% higher.
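          | 
          | Rough illustration (mine, not anything lmsys publishes): Elo
          | maps rating differences to expected win probabilities through
          | a logistic curve, so only the gap matters, not the ratio:
          | 
          |   def expected_win_prob(r_a: float, r_b: float) -> float:
          |       # chance the model rated r_a beats the one rated r_b
          |       return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))
          | 
          |   # e.g. a 50-point Elo gap is roughly a 57% win rate
          |   print(round(expected_win_prob(1150, 1100), 2))  # ~0.57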
         | 
         | Obviously Falcon 2 is too new to be on the leaderboard yet.
         | 
         | Honestly, I don't think _anybody_ should be using ChatGPT-3.5
         | as a chatbot at this point. Google and Meta both offer free
         | chatbots that are significantly better than ChatGPT-3.5, among
         | other options.
        
           | iLoveOncall wrote:
           | > Honestly, I don't think anybody should be using ChatGPT-3.5
           | as a chatbot at this point. Google and Meta both offer free
           | chatbots that are significantly better than ChatGPT-3.5,
           | among other options.
           | 
            | Yet I guarantee you that ChatGPT-3.5 has 95% of the "direct
            | to consumer" market share.
           | 
           | Unless you're a technical user, you haven't even heard about
           | any alternative, let alone used them.
           | 
            | Now, onto the ranking: I fully acknowledged in my original
            | comment that those comparisons exist; my point is that
            | they're not highlighted properly in any launch announcement
            | of any new model.
           | 
           | I haven't used Llama, only ChatGPT and the multiple versions
           | of Claude 2 and 3. How am I supposed to know if this Falcon 2
           | thing is even worth looking at beyond the first paragraph if
           | I have to compare it to a specific model that I haven't used
           | before?
        
             | coder543 wrote:
             | > Unless you're a technical user, you haven't even heard
             | about any alternative, let alone used them.
             | 
             | > How am I supposed to know if this Falcon 2 thing is even
             | worth looking at beyond the first paragraph if I have to
             | compare it to a specific model that I haven't used before?
             | 
             | You're not. These press releases are for the "technical
             | users" that have heard of and used all of these
             | alternatives.
             | 
             | They are not offering a Falcon 2 chat service you can use
             | today. They aren't even offering a chat-tuned Falcon 2
             | model. The Falcon 2 model in question is a base model, not
             | a chat model.
             | 
             | Unless someone is very technical, Falcon 2 is not relevant
             | to them in any way at this point. This is a forum of
             | technical people, which is why it's getting some attention,
             | but I suspect it's still not going to be relevant to most
             | people here.
        
           | imjonse wrote:
            | Human preference does not always favor the model that is
            | best at reasoning, code, accuracy, or whatever. In
            | particular, there's a recent article suggesting that Llama
            | 3's friendly and direct chattiness contributes to its good
            | standing on the leaderboard.
           | 
           | https://lmsys.org/blog/2024-05-08-llama3/
        
             | coder543 wrote:
             | Sure, that's why I called it out as human preference data.
             | But I still think the leaderboard is one of the best ways
             | to compare models that we currently have.
             | 
             | If you know of better benchmark-based leaderboards where
             | the data hasn't polluted the training datasets, I'd love to
             | see them, but just giving up on everything isn't a good
             | option.
             | 
             | The leaderboard is a good starting point to find models
             | worth testing, which can then be painstakingly tested for a
             | particular use case.
        
         | hmage wrote:
         | There's https://chat.lmsys.org/?leaderboard
         | 
          | Not a __full__ list, but big enough to serve as a reference.
        
           | iLoveOncall wrote:
           | Yeah that's my point. Say in the title that it ranks #X on
           | the leaderboards, not that it's "better" than some cherry-
           | picked model.
        
       | vessenes wrote:
       | I welcome open models, although the Falcon model is not super
       | open, as noted here. I will say that the original Falcon did not
       | perform as well as its benchmark stats indicated -- it was pushed
       | out as a significant leap forward, and I didn't find it
       | outperformed competitive open models at release.
       | 
       | The PR stating an 11B model outperforms 7B and 8B models 'in the
       | same class' feels like it might be stretching a bit. We'll see --
       | I'll definitely give this a go for local inference. But, my gut
       | is that finetuned llama 3 8B is probably best in class...this
       | week.
        
         | htrp wrote:
         | > I will say that the original Falcon did not perform as well
         | as its benchmark stats indicated
         | 
          | Yea, I saw that as well. I believe it was undertrained in
          | terms of parameters vs tokens because they really just wanted
          | to have a 40bn parameter model (like pre-Chinchilla-optimal).
        
           | vessenes wrote:
           | It's hard to know if there's any special sauce here, but the
           | internet so far has decided "meh" on these models. I think
            | it's an interesting choice to put it out as competitive tech.
           | Stats say this one was trained on 5T tokens. For reference,
           | Llama 3 so far was reported at 15T.
           | 
            | There is no way you get back what you lost in training by
            | adding 3B more parameters.
           | 
           | If I were in charge of UAE PR and this project, I'd
           | 
           | a) buy a lot more H100s and get the training budget up
           | 
           | b) compete on a regional / messaging / national freedom angle
           | 
           | c) fully open license it
           | 
           | I guess I'm saying I'd copy Zuck's plan, with oil money
           | instead of social money and play to my base.
           | 
            | Overstating capabilities doesn't give you a lot of benefit
            | outside of a local market, unfortunately.
        
       | database_lost wrote:
       | First headline in bold: "Next-Gen Falcon 2 Series launches [...]
       | and is only AI Model with Vision-to-Language Capabilities" ...
        
       | artninja1988 wrote:
       | 11b model outperforms 8b model, news at 11
        
       | jl6 wrote:
       | "only AI Model with Vision-to-Language Capabilities"
       | 
       | What do they mean by this? Isn't this roughly what GPT-4 Vision
       | and LLaVA do?
        
         | tictacttoe wrote:
         | And all Claude models...
        
           | Me1000 wrote:
           | And Gemini.
        
         | Hugsun wrote:
         | At first I thought they were playing some semantic game.
         | 
          | Something like LLaVA being a language-to-vision model, but I
          | can't steelman the idea in a way that makes sense.
         | 
         | Maybe they're just lying?
        
       | bbor wrote:
        | Absurdly biased article, c'mon UAE, be more subtle! "Beats llama
        | 3" is a dubiously helpful summary, and this is just baffling:
        | "and is only AI Model with Vision-to-Language Capabilities"
        
       | pimlottc wrote:
       | For a moment, I thought this might be related to the classic
       | flight sim:
       | 
       | https://en.wikipedia.org/wiki/Falcon_4.0
        
         | nordsieck wrote:
         | Also, SpaceX has the Falcon 1 and Falcon 9 rockets, as well as
         | the proposed but never developed Falcon 5.
        
       | j-pb wrote:
       | These reminders that AI will not only be wielded by democracies
       | with (at least partial attempts at) ethical oversight, but also
       | by the worst of the worst autocrats, are truly chilling.
        
         | logicchains wrote:
         | >but also by the worst of the worst autocrats
         | 
          | MBZ (note MBZ is not MBS; Saudi Arabia and the UAE are two
          | different countries!) is one of the most popular leaders in
          | the world, and his people are among the wealthiest. His
          | country is one
         | of the few developed countries in the world where the economy
         | is still growing steadily, and one of the safest countries in
         | the world outside of East Asia, in spite of having one of the
         | world's most liberal immigration policies. Much more a
         | contender for the best of the best autocrats than the worst of
         | the worst.
        
       | redleader55 wrote:
       | I want to understand something: the model was trained on mostly a
       | public dataset(?), with hardware from AWS, using well-known
        | algorithms and techniques. How is it different from other models
        | that anyone who has the money can train?
       | 
        | My skeptic/hater(?) mentality sees this as only a "flex" and an
        | effort to try to be seen as relevant. Is there more to this kind
        | of effort that I'm not seeing?
        
         | andy99 wrote:
         | A lot of models are in this category. Sovereignty (whether
         | national or corporate) has some value. And the threat of
         | competition is a good thing for everyone. I'm glad people are
         | working on these even if the end result in most cases isn't
         | anything particularly interesting.
        
       | adt wrote:
       | "With the release of Falcon 2 11B, we've introduced the first
       | model in the Falcon 2 series."
       | 
       | https://lifearchitect.ai/models-table/
        
       | GuB-42 wrote:
       | I am a bit disappointed that it isn't about a new, small rocket
       | from SpaceX with two first-stage engines.
        
       | jwblackwell wrote:
        | At the speed things are moving, it feels like we'll get a GPT-4
        | level "small" model really soon.
        
         | cedws wrote:
         | I guess that explains why OpenAI are rushing to make their
         | models free despite having paying users. They don't want to
         | lose market share to local LLMs just yet.
        
       | renonce wrote:
        | So Falcon 2 with 11B params outperforms Llama 3 8B? With more
        | parameters, that isn't a fair comparison. The strongest
        | open-source model seems to be Llama 3 70B; why claim to
        | outperform Llama 3 when you didn't outperform the best model?
        
       | hypertexthero wrote:
       | Sigh, I thought this was going to be about Spectrum Holobyte's
       | Falcon AT. From MyAbandonware.com:
       | 
       | > Essentially Falcon 2 but somehow marketed differently, Falcon
       | AT is the second release in Spectrum Holobyte's revolutionary
       | hard-core flight sim Falcon series. Despite popular belief that
       | Falcon 3.0 was THE dawn of modern flight sims, Falcon AT actually
       | is already a huge leap over Falcon, sporting sharp EGA graphics,
       | and a lot of realistic options and greatly expanded campaigns.
       | The game is still the simulation of modern air combat, complete
       | with excellent tutorials, varied missions, and accurate flight
       | dynamics that Falcon fans have come to know and love. Among its
       | host of innovations is the amazingly playable multiplayer options
       | -- including hotseat and over the modem. Largely forgotten now,
       | Falcon AT serves to explain the otherwise inexplicable gap
       | between Falcon and Falcon 3.0.
        
         | jhbadger wrote:
         | There seems to be a trend of people naming new things (perhaps
         | unintentionally) after classic computer games. We just had a
          | post here on a system called Loom, which apparently isn't the
          | classic adventure game. I'm half expecting someone to come up
         | with an LLM or piece of networking software and name it Zork.
        
       ___________________________________________________________________
       (page generated 2024-05-13 23:00 UTC)