[HN Gopher] Yi 1.5
       ___________________________________________________________________
        
       Yi 1.5
        
       Author : tosh
       Score  : 105 points
       Date   : 2024-05-12 16:23 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | xhevahir wrote:
       | "Yi-1.5 is an upgraded version of Yi" is not a very informative
       | beginning.
        
         | kkzz99 wrote:
         | "It is continuously pre-trained on Yi with a high-quality
         | corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning
         | samples.
         | 
         | Compared with Yi, Yi-1.5 delivers stronger performance in
         | coding, math, reasoning, and instruction-following capability,
         | while still maintaining excellent capabilities in language
         | understanding, commonsense reasoning, and reading
         | comprehension.
         | 
         | Yi-1.5 comes in 3 model sizes: 34B, 9B, and 6B. For model
         | details and benchmarks, see Model Card."
         | 
         | Literally after that...
        
           | Jaxan wrote:
           | So it's a large language model?
        
       | gardnr wrote:
        | Yi is developed by 01.AI, which is led by Dr. Kai-Fu Lee.
        | 
        | They have been releasing a lot of really good models over the
        | last ~6 months. Their previous (1.0?) Yi-34B-Chat model ranks
        | similarly to GPT-3.5 on Chatbot Arena. [1] A quantized version
        | of that model can be run on a single consumer video card like
        | the RTX 4090.
       | 
        | This new set of models should raise the bar again by adding more
        | options to the open-source LLM ecosystem. If you inspect the
        | config.json[2] in the model repo on HuggingFace, you can see
        | that the model architecture is LlamaForCausalLM (the same as
        | Meta's Llama); a quick way to check this is sketched below. The
        | difference between the Yi models and a simple fine-tune is that
        | the Yi models use their own data, configuration, and training
        | process, going all the way back to the pre-training stage.
       | 
       | Their models perform well in Chinese and in English.
       | 
       | There are a lot of good models coming out of China, some of which
       | are only published to ModelScope. I haven't spent much time on
       | ModelScope because I don't have a Chinese mobile number to use to
        | create an account. Fortunately, Yi publishes to HuggingFace as
        | well.
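        | 
        | A minimal sketch of that config check, assuming the Python
        | "transformers" library is installed (the repo ID matches [2]):
        | 
        |     # Sketch: fetch only the model's config.json from
        |     # HuggingFace and print the declared architecture.
        |     from transformers import AutoConfig
        | 
        |     cfg = AutoConfig.from_pretrained("01-ai/Yi-1.5-34B-Chat")
        |     print(cfg.architectures)            # ['LlamaForCausalLM']
        |     print(cfg.max_position_embeddings)  # context window size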
       | 
       | [1] https://huggingface.co/spaces/lmsys/chatbot-arena-
       | leaderboar...
       | 
       | [2]
       | https://huggingface.co/01-ai/Yi-1.5-34B-Chat/blob/fa695ee438...
        
         | option wrote:
          | Try asking their "chat" variants about topics sensitive to
          | the CCP, like what happened in Tiananmen Square. The same
          | goes for the Baichuan models.
         | 
         | What other values and biases have been RLHFed there and for
         | what purpose?
        
           | polygamous_bat wrote:
           | This is an interesting question. Is there a "controversy-
           | benchmark" perhaps, to measure this?
        
           | ekianjo wrote:
           | the American models are similarly censored for specific
           | topics...
        
       | tosh wrote:
       | Benchmark charts on model card:
       | https://huggingface.co/01-ai/Yi-1.5-34B-Chat#benchmarks
       | 
       | Yi 34b with results similar to Llama 3 70b and Mixtral 8x22b
       | 
       | Yi 6b and 9b with results similar to Llama 3 8b
        
         | GaggiX wrote:
          | We need to wait for the LMSYS Chatbot Arena results to see
          | the model's actual performance.
        
           | tosh wrote:
            | I had good results with the previous Yi-34B and its fine-
            | tunes like Nous-Capybara-34B. It will be interesting to
            | see what Chatbot Arena thinks, but my expectations are
            | high.
           | 
           | https://huggingface.co/NousResearch/Nous-Capybara-34B
        
           | zone411 wrote:
           | No, Lmsys is just another very obviously flawed benchmark.
        
         | qeternity wrote:
         | Pretraining on the test set is all you need.
         | 
         | LLM benchmarks are horribly broken. IMHO there is better signal
         | in just looking at parameter counts.
        
       | mountainriver wrote:
       | Is it the same bad license?
        
         | tosh wrote:
         | It looks like they switched to Apache 2.0 for the weights.
        
       | Havoc wrote:
       | Never had any luck with the Yi family of models. They tend to get
       | sidetracked and respond in Chinese. Maybe my setup is somehow
        | flawed.
        
         | segmondy wrote:
         | Your setup is flawed.
        
           | qeternity wrote:
           | No, it's not. This is a common issue with Yi models.
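            | 
            | A common workaround is pinning the output language
            | in the system prompt. A sketch, assuming the
            | standard "transformers" chat-template API (the
            | model ID is illustrative):
            | 
            |     from transformers import (AutoModelForCausalLM,
            |                               AutoTokenizer)
            | 
            |     mid = "01-ai/Yi-1.5-9B-Chat"
            |     tok = AutoTokenizer.from_pretrained(mid)
            |     llm = AutoModelForCausalLM.from_pretrained(
            |         mid, device_map="auto")
            | 
            |     # Pin the reply language up front.
            |     msgs = [
            |         {"role": "system",
            |          "content": "Always answer in English."},
            |         {"role": "user",
            |          "content": "Explain KV caching."},
            |     ]
            |     ids = tok.apply_chat_template(
            |         msgs, add_generation_prompt=True,
            |         return_tensors="pt").to(llm.device)
            |     out = llm.generate(ids, max_new_tokens=200)
            |     print(tok.decode(out[0][ids.shape[-1]:],
            |                      skip_special_tokens=True))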
        
       | 999900000999 wrote:
        | Is 16 GB of RAM enough to run these locally?
        | 
        | I'm considering a new laptop later this year and the RAM is
        | fixed at 16 GB on most of them.
       | 
       | I plan on digging deep into ML during my coming break from paid
       | work.
        
         | coolestguy wrote:
          | No. 16 GB of RAM is barely enough to run regular
          | applications if you're a power user, let alone the most
          | computationally heavy workloads ever invented.
        
           | 999900000999 wrote:
            | The price difference is about $150, give or take, for the
            | laptops I'm looking at.
           | 
           | I'll keep this in mind!
        
         | tosh wrote:
          | 16 GB is enough to run quantized versions of the 9B and 6B
          | models.
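          | 
          | A rough sketch of the arithmetic (quantization sizes are
          | approximate; real usage adds KV cache and overhead):
          | 
          |     # Approximate bytes per parameter at common
          |     # GGUF-style quantization levels.
          |     BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0,
          |                        "q4_k_m": 0.56}
          | 
          |     for params in (6e9, 9e9):
          |         for name, b in BYTES_PER_PARAM.items():
          |             print(f"{params / 1e9:.0f}B {name}: "
          |                   f"~{params * b / 1e9:.1f} GB")
          | 
          |     # A 9B model at ~4.5 bits/param is ~5 GB of
          |     # weights, leaving headroom within 16 GB.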
        
       | adt wrote:
       | https://lifearchitect.ai/models-table/
        
         | Hugsun wrote:
         | This page is confusing to me. How is it useful to you? I can
         | see some utility but am curious if there's something I'm
         | missing.
        
       | smcleod wrote:
       | While interesting, Yi 1.5 only has a 4K context window, which
       | means it's not going to be useful for a lot of use cases.
        
       ___________________________________________________________________
       (page generated 2024-05-12 23:00 UTC)