[HN Gopher] RWKV RNN: Better than ChatGPT?
___________________________________________________________________
RWKV RNN: Better than ChatGPT?
Author : pffft8888
Score : 72 points
Date : 2023-03-23 20:45 UTC (2 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jacobn wrote:
| From the project page: pronounced as "RwaKuv"
|
| That is still quite challenging to pronounce, maybe one of "rwkv"
| -> "raw-kv" -> "rawk-v" -> "rock-v"?
| dragonwriter wrote:
| "RwaKuv" seems like it would pretty closely match "Rock of"
| tjr wrote:
| Rocky V?
| adeon wrote:
| I've followed updates on this project on r/machinelearning, and
| for me the existence of projects like this is good evidence that
| the OpenAI moat is not that strong. It gives some hope that you
| are not going to need massive computers and GPUs to run decent
| language models.
|
| I hope this project will thrive.
| serverholic wrote:
| [dead]
| GaggiX wrote:
| The best thing about this model is that it has O(T) time and
| O(1) memory during inference, versus the O(T^2) time and O(T)
| memory (with FlashAttention) of a GPT model, yet it can still
| be trained in parallel like a GPT model.
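|
| A minimal sketch (Python/NumPy; a simplified stand-in, not the
| exact RWKV formulation, and the decay value is made up) of why
| the inference state is O(1): each step folds the new token into
| two fixed-size accumulators instead of appending to a growing
| KV cache.
|
|     import numpy as np
|
|     def rwkv_like_step(state, k_t, v_t, w):
|         """One token of a simplified WKV-style recurrence.
|         `state` is (num, den), both shape (d,): constant size
|         no matter how many tokens came before."""
|         num, den = state
|         num = np.exp(-w) * num + np.exp(k_t) * v_t  # decayed weighted value sum
|         den = np.exp(-w) * den + np.exp(k_t)        # decayed weight sum
|         out = num / (den + 1e-8)                    # attention-like readout
|         return out, (num, den)
|
|     d = 8
|     state = (np.zeros(d), np.zeros(d))
|     w = 0.5 * np.ones(d)       # per-channel decay (assumed value)
|     for t in range(1000):      # T steps: O(T) time, O(1) memory
|         k_t, v_t = np.random.randn(d), np.random.randn(d)
|         out, state = rwkv_like_step(state, k_t, v_t, w)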
| pffft8888 wrote:
| In addition,
|
| 1) it's open source.
|
| 2) you can run it yourself, so the rug won't be pulled out from
| under you when they decide to shut it down and move users on to
| the next version or another product, as they did with the older
| text-davinci models.
|
| 3) you get to align it yourself (using RLHF), as opposed to a
| corporation dictating what is "aligned" and what is "safe."
|
| 4) you won't have to deal with government-led censorship. For
| example, instead of the FBI using JIRA to manage a list of URLs
| to be censored (as they did, according to the latest
| revelations), they can train the AI to self-censor, as Bing has
| done.
|
| 5) you won't be using the product of a company that was started
| as a non-profit with a $100M donation (from Elon Musk) to
| promote transparent AI, only to take that money, turn into a
| for-profit company, and close-source the AI.
| pffft8888 wrote:
| What test cases do folks here recommend for measuring this new
| model's ability to reason? Specifically, can it reason about
| code with performance similar to (or better than!) GPT-4? Has
| anyone managed to get it running locally?
| gooseus wrote:
| OpenAI has been collecting a ton of evals here
| https://github.com/openai/evals with many of them including
| some comments about how well GPT-4 does vs GPT-3.5.
|
| You could clone that repo, adapt the oaieval script to run
| against different APIs, then run the evals against both and
| compare the results.
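|
| A minimal sketch of that comparison (hypothetical names
| throughout: `run_rwkv`, `run_openai`, and `samples.jsonl` are
| placeholders, with the JSONL format simplified from the evals
| repo's {"input": ..., "ideal": ...} samples):
|
|     import json
|
|     # Hypothetical completion functions -- replace the bodies
|     # with a real local RWKV call and a real OpenAI API call.
|     def run_rwkv(prompt: str) -> str:
|         return "stub answer"
|
|     def run_openai(prompt: str) -> str:
|         return "stub answer"
|
|     def exact_match_accuracy(completion_fn, samples):
|         """Fraction of samples whose completion equals the ideal."""
|         hits = sum(completion_fn(s["input"]).strip() == s["ideal"]
|                    for s in samples)
|         return hits / len(samples)
|
|     with open("samples.jsonl") as f:  # one JSON object per line
|         samples = [json.loads(line) for line in f]
|
|     for name, fn in [("rwkv", run_rwkv), ("openai", run_openai)]:
|         print(name, exact_match_accuracy(fn, samples))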
| macrolocal wrote:
| The author claims 61.0% on WinoGrande vis-a-vis GPT-4's 87.5%.
| pffft8888 wrote:
| "you can fine-tune RWKV into a non-parallelizable RNN (then
| you can use outputs of later layers of the previous token) if
| you want extra performance."
|
| Is that 61% using the non-parallelizable RNN mode or the
| standard mode? I wonder if it's the latter.
|
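| A toy illustration (not RWKV's actual layer math; `layer` here
| is just a stand-in mixing function) of the difference the quote
| is describing: in the trainable mode, each layer depends only
| on the layer below it, so all T tokens can be processed at once
| per layer; in the RNN-only mode, a token may also read what the
| top layer produced for the previous token, which forces
| strictly token-by-token execution.
|
|     import numpy as np
|
|     n_layers, T, d = 4, 16, 8
|     W = [np.random.randn(d, d) / np.sqrt(d) for _ in range(n_layers)]
|     emb = np.random.randn(T, d)
|
|     def layer(l, x_in, extra=None):
|         h = x_in if extra is None else x_in + extra
|         return np.tanh(h @ W[l])
|
|     # Trainable pattern: vectorized over all T tokens per layer.
|     x = emb
|     for l in range(n_layers):
|         x = layer(l, x)
|
|     # RNN-only pattern: token t also consumes the top layer's
|     # output for token t-1, so tokens must be computed in order.
|     top_prev = np.zeros(d)
|     for t in range(T):
|         h = emb[t]
|         for l in range(n_layers):
|             h = layer(l, h, extra=top_prev)
|         top_prev = h
|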
| This new model may be a viable alternative to ChatGPT, which
| is not only closed source but can be shut down in the future,
| just as happened with the older text-davinci models.
|
| Plus, the alignment and safety work has rendered ChatGPT
| useless for helping with areas such as critical analysis of
| social issues, and with any critical thinking that goes against
| the aligned views of those who own and program ChatGPT. This
| could be a viable free (as in freedom) alternative.
| macrolocal wrote:
| I think the Cambrian explosion is just beginning.
| MaxikCZ wrote:
| I can't seem to find it in the GitHub repo. Do you know the
| value for ChatGPT before it switched to GPT-4?
| macrolocal wrote:
| Here are a few benchmarks:
|
| https://paperswithcode.com/sota/common-sense-reasoning-on-wi...
| pffft8888 wrote:
| Imagine having ChatGPT-level AI running on an ASIC inside
| earphones. It could be like an always-on buddy, available
| offline and able to access resources when you're connected.
|
| Or in Google Glass. The README states that it's more
| ASIC-friendly than the transformer architecture used by
| ChatGPT.
___________________________________________________________________
(page generated 2023-03-23 23:00 UTC)