[HN Gopher] Kimi K1.5: Scaling Reinforcement Learning with LLMs
___________________________________________________________________
Kimi K1.5: Scaling Reinforcement Learning with LLMs
Author : noch
Score : 171 points
Date : 2025-01-21 08:53 UTC (14 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| NitpickLawyer wrote:
| Really unfortunate timing with Deepseek-R1 and the distills
| coming out at basically the same time. Hard for people to pay
| attention to, and plus open source > API, even if the results are
| a bit lower.
| miohtama wrote:
| DeepSeek is not open source, as the source of its 14.8T high
| quality training tokens is not disclosed.
| NitpickLawyer wrote:
| Yawn. This has been debated ad nauseam. If you want to feel
| that way it's up to you, but disclosing means and hidden
| information has never been a requirement for open source. As
| long as the thing is licensed under a permissive license (MIT
| in this case), and you can see the data, change the data and
| re-publish the data, it's open source.
| cuuupid wrote:
| I really, really dislike when companies use GitHub to promote
| their product by posting a "research paper" and a code sample.
|
| It's not even an SDK, library, etc., it's just advertising.
|
| I've noticed a number of China-based labs do this; they will
| often post a really cool demo, some images, and then either an
| API or just nothing except advertising for their company (e.g.
| model may not even exist). Often they will also promise in some
| GitHub issue that they will release the weights, and never do.
|
| I'd love to see some sort of study here, I wonder what % of "omg
| really cool AI model!!!" hype papers [1] never provide an API,
| [2] cannot be reproduced at all, and/or [3] promise but never
| provide weights. If this was any other field, academics would be
| up in arms about likely fraud, false advertising, etc.
| diggan wrote:
| It's not just Chinese labs that do this, lots of companies
| upload a README to a GitHub repository then link that
| repository from the website, I guess so they can have a GitHub
| icon somewhere on the website?
|
| Submission is basically a form for requesting access to their
| closed API (which ironically is called "OpenPlatform" for some
| reason).
| rfoo wrote:
| > which ironically is called "OpenPlatform" for some reason
|
| This is pretty weird, the original text is Kai Fang Ping Tai (开放平台),
| but it basically is another name for "API" in China.
|
| Not sure who started this, but it's really popular, for
| example, WeChat has an "Open Platform":
| https://open.weixin.qq.com/. AliPay too:
| https://open.alipay.com/. And peak strangeness, Alibaba Cloud
| (whose API is largely an AWS clone): https://open.aliyun.com/
| diggan wrote:
| Same thing in English, you have huge enterprises which
| basically operate on the complete opposite end of the
| spectrum, and end up calling themselves things like
| "OpenAI".
|
| It even bleeds into marketing pages, go to the Llama
| website and you see "open source model" plastered all over
| the place, completely misusing both the "open" and "source"
| parts of it.
| v3ss0n wrote:
| How about OpenAI?
| llm_trw wrote:
| You mean the charity* foundation** Open***AI?
| visarga wrote:
| That is unfortunate but they do present some theoretical
| insights about scaling context length and probably a more
| efficient way to do RL. Even knowledge about it can have an
| effect on next iterations from other labs.
| prjkt wrote:
| These types of "repositories" should contain some kind of
| flag/indication that it contains no source code, similar to
| when a repo is archived
| whimsicalism wrote:
| really? it takes like 1 second looking at the file structure
| to see what it is, maybe like 2 seconds if you're hopeful
| "images" somehow refers to a dockerfile or something
| whimsicalism wrote:
| ...but they do provide an API.
|
| HN is really not beating the bikeshedding allegations
| ensignavenger wrote:
| Do they? I see a note that says it will be "available soon"?
| cuuupid wrote:
| They don't, it's just promised in the future(tm). And even
| then, it should be a webpage on their website or API
| documentation, not a GitHub repo.
|
| It's not bikeshedding to expect a source code repository to
| have source code...
| sgtpepper13 wrote:
| I'm one of the authors of the paper. Thanks for raising a good
| point. It would be better if we uploaded the paper to arXiv but
| it's MLK Day in the US so submissions will be delayed by a couple
| of days. And we just can't wait to share some of the knowledge
| we gained from our experiments. Hope they will be useful for
| the community. Would much appreciate it if you have an idea
| about a better site for this. That said our API requests are
| open and we'll roll out more in the next few days depending on
| our server resources.
| asah wrote:
| The set of math/logic problems behind AIME 2024 appears to be...
| https://artofproblemsolving.com/wiki/index.php/2024_AIME_I_P...
|
| Impressive stuff! But unclear to me if it's literally just these
| 15 or if there's a large problem set...
| whimsicalism wrote:
| doesn't seem too hard to me, shame i was never exposed to this
| stuff in highschool
|
| e: oh i see, they get progressively harder
| codelion wrote:
| The full dataset is here -
| https://huggingface.co/datasets/AI-MO/aimo-validation-aime you can use
| the eval script I have in optillm to benchmark on it -
| https://github.com/codelion/optillm/blob/main/scripts/eval_a...
| zurfer wrote:
| Is it fair to say that 2 of the 3 leading models are from Chinese
| labs? It's really incredible how fast China has caught up.
| idiotsecant wrote:
| It's not all that surprising that the country with 20% of the
| population of earth has some smart people in it. What is, I
| think, surprising and fascinating is how China has been
| focusing on doing more with less - their underdog position
| w.r.t. hardware has pushed a huge focus on model efficiency and
| distillation, to the benefit of us all.
|
| I think it's a distinct possibility that while the first AGI to
| say 'hello world' might do it in English, the first open source
| AGI running on consumer hardware will probably say it in
| Mandarin.
| whimsicalism wrote:
| > It's not all that surprising that the country with 20% of
| the population of earth has some smart people in it.
|
| where's india's reasoning models? what about entire continent
| of africa? i'd be curious if they even have a single h100 on
| the continent
| airstrike wrote:
| I guess central planning can be very effective when playing
| catch up on greenfield projects of massive scale
| eunos wrote:
| Well DeepSeek hadn't really caught the eye of the
| leadership before they released R1.
| joaohkfaria wrote:
| But wait, which LLM models were used to train Kimi? It wasn't
| clear in the report.
___________________________________________________________________
(page generated 2025-01-21 23:01 UTC)