[HN Gopher] Falcon 2
___________________________________________________________________
Falcon 2
Author : tosh
Score : 215 points
Date : 2024-05-13 15:17 UTC (7 hours ago)
(HTM) web link (www.tii.ae)
(TXT) w3m dump (www.tii.ae)
| tosh wrote:
| huggingface model card: https://huggingface.co/tiiuae/falcon-11B
| BryanLegend wrote:
| "When tested against several prominent AI models in its class
| among pre-trained models, Falcon 2 11B surpasses the
| performance of Meta's newly launched Llama 3 with 8 billion
| parameters (8B), and performs on par with Google's Gemma 7B at
| first place, with a difference of only 0.01 average performance
| (Falcon 2 11B: 64.28 vs Gemma 7B: 64.29) according to the
| evaluation from Hugging Face."
|
| From: https://falconllm.tii.ae/falcon-2.html
| Hugsun wrote:
| > New Falcon 2 11B Outperforms Meta's Llama 3 8B, and Performs on
| par with leading Google Gemma 7B Model
|
| I was strongly under the impression that Llama 3 8B outperformed
| Gemma 7B on almost all metrics.
| 7thpower wrote:
| I found that curious too.
|
| I don't stay up on the benchmarks much these days though; I've
| fully dedicated myself to b-ball.
|
| I'm actually a bit better than Lebron btw, who is nowhere near
| as good as my 3 year old daughter. I occasionally beat her. At
| basketball.
| JimDabell wrote:
| Anecdotal, I know, but in my experience Gemma is absolutely
| worthless and Llama 3 8b is exceptionally good for its size.
| The idea that Gemma is ahead of Llama 3 is bizarre to me.
| Surely there's some contamination or something if Gemma is
| showing up ahead in some benchmarks!?
| freedomben wrote:
| Adding more anecdata, but this has been exactly my experience
| as well. I haven't dug into details about the benchmarks, but
| just trying to use the things for basic question asking,
| Llama 3 is so much better it's like comparing my Milwaukee
| drill to my son's Fisher-Price plastic toy drill.
| hehdhdjehehegwv wrote:
| Yeah, Llama3 is also smoking Mistral/Mixtral. It's my new
| model of choice.
| coder543 wrote:
| Keep in mind that this is a comparison of base models, not chat
| tuned models, since Falcon-11B does not have a chat tuned model
| at this time. The chat tuning that Meta did seems better than
| the chat tuning on Gemma.
|
| Regardless, the Gemma 1.1 chat models have been fairly good in
| my experience, even if I think the Llama3 8B chat model is
| definitely better.
|
| CodeGemma 1.1 7B is especially underrated, based on my
| testing of other relevant coding models. The CodeGemma 7B
| base model is one of the best models I've tested for code
| completion, and the chat model is one of the best models I've
| tested for writing code. Some other models seem to game the
| benchmarks better, but in real world use, don't hold up as well
| as CodeGemma for me. I look forward to seeing how CodeLlama3
| does, but it doesn't exist yet.
| Hugsun wrote:
| The model type is a good point. It's hard to track all the
| variables in this very fast paced field.
|
| Thank you for sharing your CodeGemma experience. I haven't
| found an Emacs setup I'm satisfied with, using a local LLM,
| but it will surely happen one day. Surely.
| attentive wrote:
| for me, CodeGemma is super slow. I'd say 3-4 times slower
| than llama3. I am also looking forward to CodeLlama3 but I
| have a feeling Meta can't improve on llama3. Was there
| anything official from Meta?
| coder543 wrote:
| CodeGemma has fewer parameters than Llama3, so it
| absolutely should not be slower. That sounds like a
| configuration issue.
|
| Meta originally released Llama2 and CodeLlama, and
| CodeLlama vastly improved on Llama2 for coding tasks.
| Llama3-8B is _okay_ at coding, but I think
| CodeGemma-1.1-7b-it is significantly better than
| Llama3-8B-Instruct, and possibly a little better than
| Llama3-70B-Instruct, so there is plenty of room for Meta to
| improve Llama3 in that regard.
|
| > Was there anything official from Meta?
|
| https://ai.meta.com/blog/meta-llama-3/
|
| "The text-based models we are releasing today are the first
| in the Llama 3 collection of models."
|
| Just a hint that they will be releasing more models in the
| same family, and CodeLlama3 seems like a given to me.
| latchkey wrote:
| It would be interesting to hear more about the compute used to
| build this.
| mmoskal wrote:
| From model card:
|
| Falcon2-11B was trained on 1024 A100 40GB GPUs for the majority
| of the training, using a 3D parallelism strategy (TP=8, PP=1,
| DP=128) combined with ZeRO and Flash-Attention 2.
|
| Doesn't say how long though.
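
As a quick sanity check on those figures (not from the model card, just arithmetic): the three parallelism degrees should multiply out to the reported GPU count.

```python
# 3D parallelism layout quoted in the Falcon2-11B model card
tp = 8    # tensor parallelism: each layer sharded across 8 GPUs
pp = 1    # pipeline parallelism: a single pipeline stage
dp = 128  # data parallelism: 128 replicas of the sharded model

total_gpus = tp * pp * dp
print(total_gpus)  # 1024, matching the reported 1024 A100s
```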
| kouteiheika wrote:
| It does say how long on Huggingface:
|
| > The model training took roughly two months.
| wg0 wrote:
| The time is coming when you'll be able to select from hundreds
| of models, have them downloaded on demand if not already
| present, and run inference fully locally and offline, even on
| your _phone_. I'd guess we'll be there no later than 2030.
|
| I am not so up to date with the hardware landscape, but I
| don't think smart people would fail to notice the need.
| treprinum wrote:
| Cloud models will always have an edge over local models.
| Maybe by 2030 your iPhone will run GPT-4 locally, but cloud
| GPT-9 will solve all your kids' homework, do 95% of your
| job, and manage your household.
| simonw wrote:
| The license is not good: https://falconllm-
| staging.tii.ae/falcon-2-terms-and-conditio...
|
| It's a modified Apache 2 license with extra clauses that include
| a requirement to abide by their acceptable use policy, hosted
| here: https://falconllm-staging.tii.ae/falcon-2-acceptable-use-
| pol...
|
| But... that modified Apache 2 license says the following:
|
| "The Acceptable Use Policy may be updated from time to time. You
| should monitor the web address at which the Acceptable Use Policy
| is hosted to ensure that your use of the Work or any Derivative
| Work complies with the updated Acceptable Use Policy."
|
| So no matter what you think of their current AUP they reserve the
| right to update it to anything they like in the future, and
| you'll have to abide by the new one!
|
| Great example of why I don't like the trend of calling licenses
| like this "open source" when they aren't compatible with the OSI
| definition.
| cs702 wrote:
| Well, that really _sucks_.
|
| Thanks But No Thanks.
| tantalor wrote:
| It's probably unenforceable
| SiempreViernes wrote:
| When it is backed by the UAE the muscle you have to contend
| with is not simply legal muscle, it also includes armed
| muscle of questionable moral fibre (see support for the RSF).
| CityOfThrowaway wrote:
| Do you have any examples of the UAE using its military
| force to compel foreign companies to abide by a contract?
| notatoad wrote:
| they're not going to send their army after you, but
| that's not what this means.
|
| friendly middle-eastern countries negotiate all kinds of
| concessions from western governments in exchange for
| allowing military operations to stage in their country,
| and "we want you to enforce our IP laws" is an easy one
| for western governments to grant.
| Perz1val wrote:
| Maybe, but would you risk trying?
| davedx wrote:
| How is changing a software licence unenforceable?
| tantalor wrote:
| scolson said it best:
|
| _You can retroactively make a license more open, but you
| cannot retroactively make it more closed._
|
| https://news.ycombinator.com/item?id=10672751
| jacoblambda wrote:
| And even then you aren't retroactively making it more
| open. You are just now offering an additional, more open
| license as well as the existing one.
|
| You haven't taken the original license away, you just
| provided a better default option.
|
| The same weirdly enough goes in reverse as well. You can
| provide a more restrictive license retroactively even if
| the rights holders don't consent as long as the existing
| license is compatible with the new, more restrictive
| license. i.e. you can promote a work from Apache-2.0 to
| GPL-3.0-or-later as the former is fully compatible with
| the latter. However you can't stop existing users from
| using it as Apache-2.0, you can only stop offering it
| yourself with that license (but anyone who has an
| existing Apache-2.0 copy or who is an original rights
| holder can freely distribute it).
| selimthegrim wrote:
| Didn't Oracle try with OpenSolaris?
| binarymax wrote:
| Not the first time they did some license shenanigans (happened
| with Falcon 1). I applaud their efforts but it seems they are
| still trying to figure out if/how to monetize.
| Havoc wrote:
| They've got Saudi oil money behind them no?
|
| Falcon always struck me more as a regional prestige project
| rather than "how to monetise"
| Cyph0n wrote:
| Emirati oil money
| baq wrote:
| The Saudis are on a clock. It's got decades on it, but the
| person who witnesses the end of their oil has already been
| born.
|
| It may be a prestige project today but make no mistake
| there's a long game behind it.
| spacebanana7 wrote:
| I doubt the Emiratis have much interest in monetisation. The
| value they're probably looking for is in LLMs as a media
| asset.
|
| Just like Al Jazeera is valuable for Qataris and sports are
| valuable for the Saudis. These assets create goodwill
| domestically and with international stakeholders. Sometimes
| they make money, but that's secondary.
|
| If people spend a few hours a day talking to LLMs there's
| some media value there.
|
| They may also fear that Western models would be censored or
| licensed in a way harmful to UAE security and cultural
| objectives. Imagine if Llama 4's license prevented military
| use without approval by some American agency.
| abhorrence wrote:
| > So no matter what you think of their current AUP they reserve
| the right to update it to anything they like in the future, and
| you'll have to abide by the new one!
|
| I'm so curious if this would actually hold up in court. Does
| anyone know if there's any case law / precedent around this?
| davedx wrote:
| Of course, projects change their licences all the time, why
| wouldn't it be legal? There's a long history of startups who
| started with open source/open core gradually closing off or
| commercialising the licence. This isn't anything new at all.
|
| This is why it's good to read licenses before adopting the
| tech, especially if it's at all core to your
| business/project.
| tux3 wrote:
| No. Projects sometimes stop offering the previous license
| and start using a different one _for new work_.
|
| But if your project is Apache-2, you cannot take away
| someone's license after the fact. You can only stop giving
| away new Apache-2 licenses from that point on.
|
| The difference here is the license itself has mystery terms
| that can change at any time. That is very much not done
| all the time.
| mmoskal wrote:
| Projects change license for new code going forward. The old
| code remains available under the previous license (and
| sometimes new). Here, they are able to change the
| conditions for existing weights.
| wtallis wrote:
| Releasing a new version with a new license is not the same
| as retroactively changing the license terms for copies
| already distributed of existing versions.
| Havoc wrote:
| The 40b model appears to be pure apache though
| coder543 wrote:
| That is Falcon 1, not Falcon 2.
|
| Falcon 1 is entirely obsolete at this point, based on every
| benchmark I've seen.
| JimDabell wrote:
| So basically you can never use this for anything non-trivial
| because they can deny your use-case at any time without even
| notifying you.
| dclowd9901 wrote:
| It's the UAE - are we really shocked here?
| worldsayshi wrote:
| Are licenses that can be altered after the fact even legal?
| Feels like more companies would use them if it was...
| Domenic_S wrote:
| A large company I work for declined to buy licenses from a
| supplier that had this clause. From my understanding it's
| not really legal - or at worst it's a gray area - but the
| language is just too risky if you're not looking to be the
| test case in court.
| taneq wrote:
| Exactly. It's not about whether it's enforceable, it's
| about the fact that a sketchy license indicates that they
| might not always operate in good faith.
| tptacek wrote:
| Why wouldn't they be? The starting presumption for a new
| work is that you have no permission to use it at all.
| xmprt wrote:
| I suppose a better question would be "how drastic can the
| change be to the license?" because by adding that term,
| you're basically superseding every other term on the
| license. How do licenses deal with contradictory terms if
| that's even possible?
| tptacek wrote:
| The license is explicit that it can be updated
| unilaterally. Nobody can adopt this software and claim
| not to know that's a possibility. There are attorneys
| specializing in open source licenses who comment on HN
| regularly, and maybe they'll surprise us, but, as a non-
| lawyer nerding out on this stuff I put all my chips down
| on "this is legally fine".
| gitfan86 wrote:
| Judges don't enforce bad contracts.
|
| Imagine if a landlord sued a tenant after a year for the
| cost of a new roof because the contract stated that
| "tenant will be responsible for all repairs", but the
| tenant pointed out that the contract also said "the house
| is in perfect and new condition".
|
| Both things cannot be true, so the judge throws it out.
|
| Same thing here: you can't grant someone a license to use
| something and then immediately say, "You can't use this
| without checking with us first." That's contradictory.
| xmprt wrote:
| The two reasons I think it's not that black and white is:
|
| 1. This brings up the question of being able to agree to
| a contract before the contract is written, which makes no
| sense.
|
| 2. If it's legal then why don't all companies do it.
| Instead, companies like Google regularly put out updated
| terms of service which you have to agree to before
| continuing to use their service. Oftentimes you don't
| realize it because it's just another checkbox or button
| to click before signing in.
| tptacek wrote:
| There are all sorts of contracts that can be unilaterally
| terminated.
| TrueDuality wrote:
| It unfortunately isn't worded that it only affects new
| usage. If you needed to check once before your initial
| use, that would be shady but unquestionably legal, as the
| terms of the contract are clear when you are entering
| into the agreement.
|
| This clause allows them to arbitrarily change the
| contract with you at will, with no notice. That
| _shouldn't_ be enforceable but AFAIK that kind of
| contract has never been tested. It is _likely_
| unenforceable though.
| gitfan86 wrote:
| Yes and No. If the full text of the license is "You must
| contact us for written approval before doing anything with
| this product" then yes that can be enforced.
|
| But a document like this, which basically has a bunch of
| words followed by a "Just kidding" line, is not
| enforceable, because it contradicts the previous language.
| A judge would throw the whole thing out because it doesn't
| meet the standard of a contract.
| zarzavat wrote:
| Which matters because an LLM's trained weights are very
| likely not protected by copyright.
|
| While for a copyrighted work the default is that you
| can't use it unless you have a valid license, for an LLM
| the default is that you _can_ use it unless you have
| signed a contract restricting what you can do in return
| for some consideration.
|
| I don't think these contracts are designed to be enforced,
| because an attempt to enforce one would reveal it to be
| hot air; they are just there to scare people into
| compliance.
|
| Better to download LLMs from unofficial sources to avoid
| attempted contract shenanigans.
| drited wrote:
| Do you inspect the files manually or with a tool if you
| download from unofficial sources as a verification?
| candiddevmike wrote:
| How will they know you're using it?
| dartos wrote:
| Some models may have watermarks
|
| Though it's easy to workaround if you know it's there
| darby_eight wrote:
| > Great example of why I don't like the trend of calling
| licenses like this "open source" when they aren't compatible
| with the OSI definition.
|
| Open source was _always_ a way to weasel about terms;
| that's why it's called open source and not free.
| croes wrote:
| More the other way around.
|
| Just because it's free doesn't mean you can change anything
| or get the source.
|
| Some claim something is open source when it's merely free.
| dongobread wrote:
| Their benchmark results seem roughly on par with Mistral 7B and
| Llama 3 8B, which hardly seems that great given the increase in
| model size.
|
| https://huggingface.co/tiiuae/falcon-11B
|
| https://huggingface.co/meta-llama/Meta-Llama-3-8B
|
| https://mistral.ai/news/announcing-mistral-7b/
| hehdhdjehehegwv wrote:
| Seems to be the case with all their models - really huge in
| size, no actual performance gains for the effort.
|
| Their RefinedWeb dataset is heavily censored so maybe that has
| something to do with it. It's very morally conservative - total
| exclusion of pornography and other topics.
|
| So I'd not be surprised if some of the issues are they are just
| filtering out too much content and adding more of the same
| instead.
| nabakin wrote:
| Exactly. Falcon-180b had a lot of hype at first but the
| community soon realized it was nearly worthless. Easily
| outperformed by smaller LLMs in the general case.
|
| Now they are back and claiming their falcon-11b LLM outperforms
| Llama 3 8b. I already see a number of issues with this:
|
| - falcon-11b is like 40% larger than Llama 3 8b, so how can
| you compare them when they aren't in the same size class
|
| - their claim seems to be based on automated benchmarks when it
| has long been clear that automated benchmarks are not enough to
| make that claim
|
| - some of their automated benchmarks are _wildly_ lower than
| Llama 3 8b's scores. It only beats Llama 3 8b on one
| benchmark, and just barely. I can make an LLM that does the
| best anyone has ever seen on one benchmark, but that doesn't
| mean my LLM is good. Far from it
|
| - clickbait headline with knowingly premature claims because
| there has been zero human evaluation testing
|
| - they claim their LLM is better than Llama 3 but completely
| ignore Llama 3 70b
|
| Honestly, it annoys me how much attention tiiuae get when they
| haven't produced anything useful and continue this misleading
| clickbait.
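
The size gap in the first point above checks out with quick arithmetic (taking the rounded 11B and 8B parameter counts at face value):

```python
falcon_params = 11e9  # Falcon 2 11B (approximate)
llama_params = 8e9    # Llama 3 8B (approximate)

relative_increase = (falcon_params - llama_params) / llama_params
print(f"{relative_increase:.0%}")  # 38%, roughly the "40% larger" claimed
```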
| marci wrote:
| Maybe those aren't the right metrics to compare.
|
| True, the model is bigger, but it required fewer tokens than
| Llama 3 to train. The issue is that when there are no open
| datasets, it's hard to really compare and replicate. Is it
| because of the
| model's architecture? Dataset quality? Model size? A mixture of
| those? Something else?
| iLoveOncall wrote:
| > Outperforming Meta's New Llama 3
|
| I know it's hard to objectively rank LLMs, but those are really
| ridiculous ways to keep track of performance.
|
| If my reference of performance is (like the vast majority of
| users) ChatGPT-3.5, I have to first know how Llama 3 compares to
| that to then understand how these new models compare to what I'm
| using at the moment.
|
| Now, if I look for the performance of Llama 3 compared to
| ChatGPT-3.5, I don't find it on the official launch page
| https://ai.meta.com/blog/meta-llama-3/ where it is compared to
| Gemma 7B it, Mistral 7B Instruct, Gemini Pro 1.5 and Claude 3
| Sonnet.
|
| How does Gemma 7B perform? Well you can only find out how it
| compares to Llama 2 on the official launch page
| https://blog.google/technology/developers/gemma-open-models/.
|
| Let's look at the Llama 2 performance on its launch announcement:
| https://llama.meta.com/llama2/ No GPT-3.5 turbo again.
|
| I get that there are multiple aspects and that there's probably
| not one overall "performance" metric across all tasks, and I get
| that you can probably find a comparative between two specific
| models relatively easily, but there absolutely needs to be a
| standard by which those performances are communicated. The number
| of hoops to jump through is ridiculous.
| coder543 wrote:
| Human preference data from side-by-side, anonymous comparisons
| of models: https://leaderboard.lmsys.org/
|
| Llama3 8B significantly outperforms ChatGPT-3.5, and Llama3 70B
| is significantly better than that. These are Elo ratings, so it
| would not be accurate to say X is 10% better than Y just
| because its score is 10% higher.
|
| Obviously Falcon 2 is too new to be on the leaderboard yet.
|
| Honestly, I don't think _anybody_ should be using ChatGPT-3.5
| as a chatbot at this point. Google and Meta both offer free
| chatbots that are significantly better than ChatGPT-3.5, among
| other options.
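
To make the Elo caveat concrete: a rating gap maps to an expected preference rate, not a percentage quality difference. A minimal sketch using the standard Elo expectation formula, with made-up ratings rather than actual leaderboard numbers:

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Expected probability that model A is preferred over model B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# A 100-point Elo gap means A wins ~64% of head-to-head votes,
# even though 1300 is nowhere near "8% better" than 1200.
print(round(elo_win_probability(1300, 1200), 2))  # 0.64
```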
| iLoveOncall wrote:
| > Honestly, I don't think anybody should be using ChatGPT-3.5
| as a chatbot at this point. Google and Meta both offer free
| chatbots that are significantly better than ChatGPT-3.5,
| among other options.
|
| Yet I guarantee you that ChatGPT-3.5 has 95% of the "direct
| to consumer" marketshare.
|
| Unless you're a technical user, you haven't even heard about
| any alternative, let alone used them.
|
| Now onto the ranking: I explicitly acknowledged in my original
| comment that those comparisons exist, just that they're not
| highlighted properly in any launch announcement of any new
| model.
|
| I haven't used Llama, only ChatGPT and the multiple versions
| of Claude 2 and 3. How am I supposed to know if this Falcon 2
| thing is even worth looking at beyond the first paragraph if
| I have to compare it to a specific model that I haven't used
| before?
| coder543 wrote:
| > Unless you're a technical user, you haven't even heard
| about any alternative, let alone used them.
|
| > How am I supposed to know if this Falcon 2 thing is even
| worth looking at beyond the first paragraph if I have to
| compare it to a specific model that I haven't used before?
|
| You're not. These press releases are for the "technical
| users" that have heard of and used all of these
| alternatives.
|
| They are not offering a Falcon 2 chat service you can use
| today. They aren't even offering a chat-tuned Falcon 2
| model. The Falcon 2 model in question is a base model, not
| a chat model.
|
| Unless someone is very technical, Falcon 2 is not relevant
| to them in any way at this point. This is a forum of
| technical people, which is why it's getting some attention,
| but I suspect it's still not going to be relevant to most
| people here.
| imjonse wrote:
| Human preference does not always favor the model that is best
| at reasoning/code/accuracy/whatever. In particular, there's a
| recent article suggesting that Llama 3's friendly and direct
| chattiness contributes to it having a good standing in the
| leaderboard.
|
| https://lmsys.org/blog/2024-05-08-llama3/
| coder543 wrote:
| Sure, that's why I called it out as human preference data.
| But I still think the leaderboard is one of the best ways
| to compare models that we currently have.
|
| If you know of better benchmark-based leaderboards where
| the data hasn't polluted the training datasets, I'd love to
| see them, but just giving up on everything isn't a good
| option.
|
| The leaderboard is a good starting point to find models
| worth testing, which can then be painstakingly tested for a
| particular use case.
| hmage wrote:
| There's https://chat.lmsys.org/?leaderboard
|
| Not a __full__ list, but big enough to have some reference.
| iLoveOncall wrote:
| Yeah that's my point. Say in the title that it ranks #X on
| the leaderboards, not that it's "better" than some cherry-
| picked model.
| vessenes wrote:
| I welcome open models, although the Falcon model is not super
| open, as noted here. I will say that the original Falcon did not
| perform as well as its benchmark stats indicated -- it was pushed
| out as a significant leap forward, and I didn't find it
| outperformed competitive open models at release.
|
| The PR stating an 11B model outperforms 7B and 8B models 'in the
| same class' feels like it might be stretching a bit. We'll see --
| I'll definitely give this a go for local inference. But, my gut
| is that finetuned llama 3 8B is probably best in class...this
| week.
| htrp wrote:
| > I will say that the original Falcon did not perform as well
| as its benchmark stats indicated
|
| Yea I saw that as well. I believe it was undertrained in terms
| of parameters vs tokens because they really just wanted to have
| a 40bn parameter model (i.e., pre-Chinchilla-optimal)
| vessenes wrote:
| It's hard to know if there's any special sauce here, but the
| internet so far has decided "meh" on these models. I think
| it's an interesting choice to put it out as tech competitive.
| Stats say this one was trained on 5T tokens. For reference,
| Llama 3 so far was reported at 15T.
|
| There is no way you get back what you lost in training by
| adding 3B more parameters.
|
| If I were in charge of UAE PR and this project, I'd
|
| a) buy a lot more H100s and get the training budget up
|
| b) compete on a regional / messaging / national freedom angle
|
| c) fully open license it
|
| I guess I'm saying I'd copy Zuck's plan, with oil money
| instead of social money and play to my base.
|
| Overstating capabilities doesn't give you a lot of benefit
| outside of a local market, unfortunately.
| database_lost wrote:
| First headline in bold: "Next-Gen Falcon 2 Series launches [...]
| and is only AI Model with Vision-to-Language Capabilities" ...
| artninja1988 wrote:
| 11b model outperforms 8b model, news at 11
| jl6 wrote:
| "only AI Model with Vision-to-Language Capabilities"
|
| What do they mean by this? Isn't this roughly what GPT-4 Vision
| and LLaVA do?
| tictacttoe wrote:
| And all Claude models...
| Me1000 wrote:
| And Gemini.
| Hugsun wrote:
| At first I thought they were playing some semantic game,
| something like LLaVA being a language-to-vision model, but I
| can't steelman the idea in a way that makes sense.
|
| Maybe they're just lying?
| bbor wrote:
| Absurdly biased article, cmon UAE be more subtle! "Beats llama 3"
| is a dubiously helpful summary, and this is just baffling:
| "and is only AI Model with Vision-to-Language Capabilities"
| pimlottc wrote:
| For a moment, I thought this might be related to the classic
| flight sim:
|
| https://en.wikipedia.org/wiki/Falcon_4.0
| nordsieck wrote:
| Also, SpaceX has the Falcon 1 and Falcon 9 rockets, as well as
| the proposed but never developed Falcon 5.
| j-pb wrote:
| These reminders that AI will not only be wielded by democracies
| with (at least partial attempts at) ethical oversight, but also
| by the worst of the worst autocrats, are truly chilling.
| logicchains wrote:
| >but also by the worst of the worst autocrats
|
| MBZ (note MBZ is not MBS; Saudi Arabia and UAE are two
| different countries!) is one of the most popular leaders in the
| world and his people among the wealthiest. His country is one
| of the few developed countries in the world where the economy
| is still growing steadily, and one of the safest countries in
| the world outside of East Asia, in spite of having one of the
| world's most liberal immigration policies. Much more a
| contender for the best of the best autocrats than the worst of
| the worst.
| redleader55 wrote:
| I want to understand something: the model was trained on mostly a
| public dataset(?), with hardware from AWS, using well-known
| algorithms and techniques. How is it different from other models
| that anyone with the money can train?
|
| My skeptic/hater(?) mentality sees this as only a "flex" and an
| effort to try be seen as relevant. Is there more to this kind of
| effort that I'm not seeing?
| andy99 wrote:
| A lot of models are in this category. Sovereignty (whether
| national or corporate) has some value. And the threat of
| competition is a good thing for everyone. I'm glad people are
| working on these even if the end result in most cases isn't
| anything particularly interesting.
| adt wrote:
| "With the release of Falcon 2 11B, we've introduced the first
| model in the Falcon 2 series."
|
| https://lifearchitect.ai/models-table/
| GuB-42 wrote:
| I am a bit disappointed that it isn't about a new, small rocket
| from SpaceX with two first-stage engines.
| jwblackwell wrote:
| At the speed things are moving, it feels like we'll get a GPT-4
| level "small" model really soon.
| cedws wrote:
| I guess that explains why OpenAI are rushing to make their
| models free despite having paying users. They don't want to
| lose market share to local LLMs just yet.
| renonce wrote:
| So Falcon 2 with 11B params outperforms Llama 3 8B? With more
| parameters, that doesn't make a fair comparison. The strongest
| open source model seems to be Llama 3 70B, so why claim to
| outperform Llama 3 when you didn't beat the best model?
| hypertexthero wrote:
| Sigh, I thought this was going to be about Spectrum Holobyte's
| Falcon AT. From MyAbandonware.com:
|
| > Essentially Falcon 2 but somehow marketed differently, Falcon
| AT is the second release in Spectrum Holobyte's revolutionary
| hard-core flight sim Falcon series. Despite popular belief that
| Falcon 3.0 was THE dawn of modern flight sims, Falcon AT actually
| is already a huge leap over Falcon, sporting sharp EGA graphics,
| and a lot of realistic options and greatly expanded campaigns.
| The game is still the simulation of modern air combat, complete
| with excellent tutorials, varied missions, and accurate flight
| dynamics that Falcon fans have come to know and love. Among its
| host of innovations is the amazingly playable multiplayer options
| -- including hotseat and over the modem. Largely forgotten now,
| Falcon AT serves to explain the otherwise inexplicable gap
| between Falcon and Falcon 3.0.
| jhbadger wrote:
| There seems to be a trend of people naming new things (perhaps
| unintentionally) after classic computer games. We just had a
| post here on a system called Loom which apparently isn't the
| classic adventure game. I'm half expecting someone to come up
| with an LLM or piece of networking software and name it Zork.
___________________________________________________________________
(page generated 2024-05-13 23:00 UTC)