[HN Gopher] Yi: Open Foundation Models by 01.AI
       ___________________________________________________________________
        
       Yi: Open Foundation Models by 01.AI
        
       Author : pama
       Score  : 161 points
       Date   : 2024-03-10 15:12 UTC (7 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | helsinkiandrew wrote:
        | The GitHub repository gives a better introduction/how-to:
       | 
       | https://github.com/01-ai/yi
       | 
       | > Yi-34B-Chat model landed in second place (following GPT-4
       | Turbo), outperforming other LLMs (such as GPT-4, Mixtral, Claude)
       | on the AlpacaEval Leaderboard (based on data available up to
       | January 2024).
       | 
       | > Yi-34B model ranked first among all existing open-source models
       | (such as Falcon-180B, Llama-70B, Claude) in both English and
       | Chinese on various benchmarks, including Hugging Face Open LLM
       | Leaderboard (pre-trained) and C-Eval (based on data available up
       | to November 2023).
        
         | WhitneyLand wrote:
         | It's been ~1 year since gpt-4 was released.
         | 
          | It's hard to guess how long it will be before any flavor of
          | an "open" model will, by consensus, match what was released
          | in 2023, let alone potentially exceed it.
          | 
          | A big part of the race seems like it will depend on how high
          | gpt-5 can raise the bar. If it's only an incremental
          | improvement, things may converge quickly.
        
           | c0n5pir4cy wrote:
           | The Yi models were actually released back in early November
           | 2023. So there isn't as big a gap in time as it seems.
           | 
           | I'm not sure why there is such a big gap between the release
           | of the models and the publication of the paper.
           | 
           | EDIT: Okay, this appears to be a new set of models with the
           | same name, based on the same models from November but now
           | with multimodal capabilities.
        
         | oersted wrote:
         | Looking at the leaderboard shows a clearer picture:
         | https://tatsu-lab.github.io/alpaca_eval/
         | 
         | - GPT-4-Turbo: 50.00%
         | 
         | - Snorkel (current 2nd, Mistral 7B fine-tune): 34.86%
         | 
         | - Yi 34B Chat (current 6th): 29.66%
         | 
         | - GPT-4: 23.58%
         | 
         | Thoughts:
         | 
          | - Just saying that it came 2nd is quite misleading; the
          | difference in score is significant.
         | 
         | - Not sure what's up with this benchmark, I've never seen
         | GPT-4-Turbo vs GPT-4 performing so differently.
         | 
         | - The Snorkel model is impressive with just 7B parameters. The
         | Yi authors claim that their success is based on good training
         | data cleaning. This seems to be key at least for this
         | benchmark. Snorkel has also always been all about that, using
         | programmatic methods to generate lots of quality training data.
        
           | doctorpangloss wrote:
           | > Not sure what's up with this benchmark, I've never seen
           | GPT-4-Turbo vs GPT-4 performing so differently.
           | 
           | The benchmark is bad.
        
           | lossolo wrote:
           | The only reliable benchmark is found at
           | 
           | https://huggingface.co/spaces/lmsys/chatbot-arena-
           | leaderboar...
           | 
            | Model creators train models on data that includes open-
            | source benchmarks, either intentionally to achieve better
            | scores or inadvertently through leaks from various sources.
        
         | nickpsecurity wrote:
          | Anytime you see that, we should assume the newer models might
          | have been trained on either the benchmarks themselves or
          | something similar to them. If I were an evaluator, I'd keep a
          | secret pile of tests that I know aren't in any LLM's training
          | data, do the evaluations privately, and not publish scores
          | either. Just publish the ranking plus how far apart the
          | models are.
          | 
          | The best tests of these models come from people who want to
          | use AI to solve real problems actually attempting to do that
          | with various models. If they work, report that they worked.
          | Also, publish the work and result pairs permissively when
          | possible to evaluate _that_ and use it for fine-tuning, too.
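          | 
          | A minimal sketch of that kind of private evaluation, in
          | Python. Everything here is illustrative: the test format and
          | the per-model `ask` callables are assumptions, not a real
          | evaluation harness. Only ranks and gaps to the leader get
          | published, never the test set or raw scores.
          | 
          |     from typing import Callable
          | 
          |     def evaluate_privately(
          |         models: dict[str, Callable[[str], str]],
          |         secret_tests: list[tuple[str, str]],
          |     ) -> list[tuple[int, str, float]]:
          |         # Score each model on the held-out tests...
          |         scores = {
          |             name: sum(ask(q).strip() == a
          |                       for q, a in secret_tests) / len(secret_tests)
          |             for name, ask in models.items()
          |         }
          |         # ...but report only (rank, name, gap to the leader).
          |         ranked = sorted(scores, key=scores.get, reverse=True)
          |         top = scores[ranked[0]]
          |         return [(i + 1, name, round(top - scores[name], 3))
          |                 for i, name in enumerate(ranked)]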
        
       | jes5199 wrote:
        | 01, like from the Animatrix?
        
       | bearjaws wrote:
        | Seeing models like this work so well gives me hope that
        | mobile-first LLMs for things like better voice-to-text and
        | typing prediction will not just 'work' in 2-3 years but will
        | actually not kill your battery either.
        
         | m3kw9 wrote:
          | If it kills the battery it won't be part of the OS, and if it
          | is on iOS it won't be allowed in the App Store, or it will be
          | gated by API/hardware gating.
          | 
          | For Android, they'll just allow it and your battery will last
          | 30 minutes after a few questions.
        
           | coffeebeqn wrote:
            | If it's fast enough to be useful, it would also not
            | physically be able to use that much power. Your phone's CPU
            | and GPU have a maximum wattage they can pull at any one
            | time, and if this runs for only a few seconds then that
            | caps how much energy it can use.
            | 
            | If it maxes out all cores and memory for 30 minutes then it
            | won't really work for anything.
        
           | evilduck wrote:
           | MLC Chat is already on the App Store and allowed to be used.
           | I haven't used Yi with it, but a quantized Mistral or Llama
           | runs quite well on an iPhone 15. See https://llm.mlc.ai.
           | "Apple GPT" is also rumored to be coming too.
           | 
           | It is processor and therefore battery intensive but it
           | already won't kill your battery inside of 30 minutes.
           | Obviously it will be worse for resource usage than an app if
           | it's always kept running by some OS level process and set as
           | the processing layer for every trivial thing but it seems
           | like cheaper input handling could decide to promote some
           | input up to being evaluated by an LLM or not.
        
           | barronli wrote:
            | iOS already has apps with LLMs running locally on the
            | iPhone: https://apps.apple.com/gb/app/mlc-chat/id6448482937
            | 
            | I've also tried a few Android LLM apps, all running for
            | more than 30 minutes.
            | 
            | Current LLM models are not running constantly on the phone
            | to drain your battery. They just run when responding to a
            | prompt. They by no means consume more battery than a heavy
            | game.
        
       | orra wrote:
       | The repo source code is Apache 2.0 licensed, but the weights are
       | not.
       | 
        | Just in case anybody else is excited and then misled by their
        | tagline "Building the Next Generation of Open-Source and
        | Bilingual LLMs".
        
         | est31 wrote:
         | More reading on the weight license:
         | https://news.ycombinator.com/item?id=38159862
        
         | mmastrac wrote:
         | The model license, excerpts:
         | 
         | https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMEN...
         | 
         | 1) Your use of the Yi Series Models must comply with the Laws
         | and Regulations as well as applicable legal requirements of
         | other countries/regions, and respect social ethics and moral
         | standards, including but not limited to, not using the Yi
         | Series Models for purposes prohibited by Laws and Regulations
         | as well as applicable legal requirements of other
         | countries/regions, such as harming national security, promoting
         | terrorism, extremism, inciting ethnic or racial hatred,
         | discrimination, violence, or pornography, and spreading false
         | harmful information.
         | 
         | 2) You shall not, for military or unlawful purposes or in ways
         | not allowed by Laws and Regulations as well as applicable legal
         | requirements of other countries/regions, a) use, copy or
         | Distribute the Yi Series Models, or b) create complete or
         | partial Derivatives of the Yi Series Models.
         | 
         | "Laws and Regulations" refers to the laws and administrative
         | regulations of the mainland of the People's Republic of China
         | (for the purposes of this Agreement only, excluding Hong Kong,
         | Macau, and Taiwan).
        
         | echelon wrote:
         | Weights are trained on copyrighted data. I think that
         | ethically, weights should be public domain unless all of the
         | data [1] is owned or licensed by the training entity.
         | 
         | I'm hopeful that this is where copyright law lands. It seems
         | like this might be the disposition of the regulators, but we'll
         | have to wait and see.
         | 
         | In the meantime, maybe you should build your product in this
         | way anyway and fight for the law when you succeed. I don't
         | think a Chinese tech company is going to find success in
         | battling a US startup in court. (I would also treat domestic
         | companies with model licenses the same way, though the outcome
         | could be more of a toss up.)
         | 
         | "Break the rules."
         | 
         | "Fake it until you make it."
         | 
         | Both idioms seem highly applicable here.
         | 
         | [1] I think this should be a viral condition. Finetuning on a
         | foundational model that incorporates vast copyrighted data
         | should mean downstream training also becomes public domain.
        
       | jacobn wrote:
       | "we attribute the performance of Yi models primarily to its data
       | quality resulting from our data-engineering efforts"
       | 
       | Data work is rarely sexy, but (almost) always useful.
       | 
       | Did they release the corpus?
        
         | gwern wrote:
         | They did not, in part because it would reveal the data-
         | filtering routines (particularly the political censorship -
         | Chinese LLM papers sometimes mention the ban list but never
         | reveal it), and also in part because it might reveal things
         | they'd rather keep secret.
         | 
         | For example, Bytedance has already been caught using the OA API
         | to generate data for their models because they are having such
         | a hard time catching up to OA - and evading bans for doing
         | that, and also instructing employees on how to lie & cover it
         | up: https://www.theverge.com/2023/12/15/24003151/bytedance-
         | china...
         | 
         | Do you think that a small Chinese startup like 01.AI, which by
         | their own admission had to "bet the farm" to buy enough GPUs to
         | train the Yi models at all
         | https://www.bloomberg.com/news/articles/2023-11-05/kai-fu-le...
          | , and which was completely silent about cloning the American
         | LLaMA architecture until people analyzed the released
         | checkpoints and noticed it looked awfully familiar, is going to
         | be above such tactics...? In this economic/geopolitical
         | context? Especially when everyone seems to be doing it, not
         | just Bytedance?* (01.AI claims that, the architecture aside,
         | they didn't simply further train LLaMA models but trained from
         | scratch. You can decide for yourself how much you are willing
         | to believe this.) I wouldn't bet a lot of money on it, and
         | that's why I don't expect to see any large comprehensive data
         | releases from 01.AI for the Yi models.
         | 
         | * This is one of my theories for why so many disparate models
         | by so many different groups all seem to weirdly converge on the
         | same failure modes like 'write a non-rhyming poem', and why
         | GPT-3.5, and then GPT-4, seemed to be oddly difficult to
         | surpass, as if there were some magnetic force which made
          | reaching _near_ 3.5/4 quality easy for 'independent' models,
          | but then _surpassing_ somehow difficult. Everyone is lying or
          | mistaken about 3.5/4 data getting into their corpus, and the
         | sugar-rush of imitation learning fools you into thinking you're
         | making a lot of progress, even when your overall approach
         | sucks. (As Andrej Karpathy notes, neural nets _want_ to work,
         | and so even if you have serious bugs in your code, they will
         | still work pretty well - and simply permanently fall short of
         | their true potential. Cautionary recent example:
         | https://twitter.com/karpathy/status/1765473722985771335 )
        
           | visarga wrote:
           | > 01.AI claims that, the architecture aside, they didn't
           | simply further train LLaMA models but trained from scratch.
           | You can decide for yourself how much you are willing to
           | believe this.
           | 
           | You can't hide this. The latent space remains mostly fixed
           | after pre-training. It all depends on the seed for the
           | initial random init. Further pre-training won't move it
           | enough. Because of this property, you can even average two
            | fine-tunings from the same parent model, but never
            | fine-tunings of models trained from different seeds.
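            | 
            | As a rough illustration, merging two fine-tunes of the same
            | base model is just an element-wise average over matching
            | parameter tensors; if the checkpoints descend from
            | different random inits, the parameters don't line up and
            | the merge produces garbage. A minimal PyTorch sketch (the
            | checkpoint paths are placeholders):
            | 
            |     import torch
            | 
            |     # Two fine-tuned checkpoints sharing the same parent;
            |     # their state dicts must have identical keys/shapes.
            |     sd_a = torch.load("finetune_a.pt", map_location="cpu")
            |     sd_b = torch.load("finetune_b.pt", map_location="cpu")
            | 
            |     merged = {}
            |     for key, tensor_a in sd_a.items():
            |         tensor_b = sd_b[key]
            |         if torch.is_floating_point(tensor_a):
            |             # Simple 50/50 "model soup" averaging.
            |             merged[key] = (tensor_a + tensor_b) / 2
            |         else:
            |             # Integer buffers (e.g. step counters) are
            |             # copied as-is rather than averaged.
            |             merged[key] = tensor_a
            | 
            |     torch.save(merged, "merged.pt")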
        
             | sroussey wrote:
              | The averaging seems like a good test for who the parent
              | is.
        
             | gwern wrote:
              | I don't know that anyone has properly analyzed this, nor
              | how robust such methods are if one is trying to cover it
              | up.
             | Also, I doubt anyone has analyzed the scenario where the
             | warm-started model is then extensively trained for
              | trillions of tokens (possibly with a cyclical LR),
              | particularly in Chinese - the latent spaces are _not_
              | perfectly aligned between Chinese and English, and I'd
              | expect that to
             | change it a lot. (The point of this would be that
             | 'cheating' by warm-starting it should let you avoid a lot
             | of training instabilities and issues early in training, and
             | may get you better quality at the end.)
        
       | dannyw wrote:
       | Can play around with the model here:
       | https://replicate.com/01-ai/yi-34b-chat
       | 
       | Very slow, but this is unquantized and there's probably a lot of
       | demand.
        
       | mg wrote:
       | Hmm.. it fails for my favorite test prompt:
       | 
       | https://www.gnod.com/search/ai#q=Two%20cars%20have%20a%20100...
       | 
       | I gave it 3 tries and each time, Yi picked one of the cars as the
       | winner.
       | 
        | I've been watching for many months now how LLMs have gotten
        | better and better at solving it. Many still struggle with it,
        | but the top ones nowadays mostly get it right.
        
         | appplication wrote:
         | On one hand, I don't really understand why anyone would expect
         | an LLM to solve logic puzzles. The only way it can do so is not
         | through reasoning, but by having been trained on a structurally
         | similar puzzle.
         | 
         | On the other hand, it does feel fun that the top ones appear to
         | solve it, and I understand why it feels cool to have a computer
         | that appears to be capable of solving these puzzles. But
         | really, I think this is just specificity in training. There is
         | no theoretical or empirical basis for LLMs having any reasoning
          | capability. The only reason it can solve it is because the
          | creators of these top models specifically trained the models
          | on problems like this to give the appearance of intelligence.
        
           | mg wrote:
           | There might be no reasoning in a single pass which outputs a
           | single token. But in the loop where the output of the LLM
           | repeatedly gets fed back into its input, reasoning is clearly
           | happening:
           | 
           | The LLMs lay out how to go about figuring out the answer, do
           | a series of calculation steps and then come up with an
           | answer.
           | 
           | If you add "Please answer in just one short sentence." to the
           | prompt, even the top ones get it wrong.
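            | 
            | The "loop" here is just autoregressive decoding: each
            | generated token is appended to the context and fed back in,
            | so intermediate calculation steps become extra context the
            | model can condition on. A schematic Python sketch, where
            | `next_token` is a stand-in for a real model rather than any
            | actual API:
            | 
            |     def generate(prompt, next_token, max_tokens=256):
            |         # `next_token` maps the current text to the next
            |         # token (returned as a string here).
            |         context = prompt
            |         for _ in range(max_tokens):
            |             tok = next_token(context)
            |             if tok == "<eos>":
            |                 break
            |             # The model's own output becomes new input,
            |             # giving it room for intermediate steps before
            |             # it commits to a final answer.
            |             context += tok
            |         return context[len(prompt):]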
        
             | visarga wrote:
              | Reasoning is also an iterative process. Besides scaling
              | up the response length, the model can also get multiple
              | rounds of feedback from outside to correct itself.
        
             | spyder wrote:
              | Yep, humans too have to think before answering most non-
              | trivial questions, especially the ones that involve
              | calculations. So it seems "obvious" that we should give
              | LLMs some time to think before answering too, for example
              | with the popular methods of asking for step-by-step
              | thinking, thinking out loud, and only giving the final
              | answer at the end; asking the model to proofread and
              | correct its answer at the end can also help with that.
              | 
              | Pause tokens (thinking tokens) are also an interesting
              | method to achieve that and seem to have a positive effect
              | on performance:
             | 
             | https://arxiv.org/abs/2310.02226
        
           | visarga wrote:
           | > There is no theoretical or empirical basis for LLMs having
           | any reasoning capability.
           | 
           | Yes there is. Learning to predict the next token implies a
           | lot of things, among which is also logical reasoning. The
           | chain-of-thought approach shows that when you stimulate this
           | behavior, you get higher accuracies.
        
           | xcv123 wrote:
           | > There is no theoretical or empirical basis for LLMs having
           | any reasoning capability.
           | 
           | Deep learning models are specifically designed for automatic
           | pattern recognition. That includes patterns of reasoning and
           | problem solving.
           | 
            | > The only reason it can solve it is because the creators
            | of these top models specifically trained the models on
            | problems like this to give the appearance of intelligence.
           | 
           | That's not how deep learning works, and not how machine
            | learning works in general. The models can automatically
            | recognize patterns of reasoning and then apply those
            | methods to problems they have never seen before.
           | 
           | > The only way it can do so is not through reasoning, but by
           | having been trained on a structurally similar puzzle.
           | 
           | This is a fundamental misunderstanding of how it works. The
           | large deep learning models have 100+ layers, modelling
           | extremely abstract features of the data, which include
           | abstract patterns of problem solving and reasoning. They are
           | not simply regurgitating training examples.
        
           | xcv123 wrote:
           | > There is no theoretical or empirical basis for LLMs having
           | any reasoning capability.
           | 
           | Geoffrey Hinton - Mapping Part-Whole Hierarchies into
           | Connectionist Networks (1990)
           | 
           | https://www.cs.toronto.edu/~hinton/absps/AIJmapping.pdf
           | 
           | "The paper, titled "Mapping Part-Whole Hierarchies into
           | Connectionist Networks" (1990), demonstrated how neural
           | networks can learn to represent conceptual hierarchies and
           | reason about relations like family trees.
           | 
           | Specifically, Hinton showed that by training a neural network
           | on examples of family relationships (parent-child,
           | grandparent-grandchild, etc.), the network was able to
           | accurately model the inherent logical patterns and reason
           | about new family tree instances it had not encountered during
           | training.
           | 
           | This pioneering work highlighted that instead of just
           | memorizing specific training examples, neural networks can
           | extract the underlying logical rules and reasoning patterns
           | governing the data. The learned representations captured
           | abstract concepts like "parent" that enabled generalizing to
           | reason about entirely new family tree configurations."
        
           | stevenhuang wrote:
           | Your assertion that LLMs cannot reason is some exquisite
           | irony considering the extensive theoretical foundation in
           | support of the idea.
        
         | AndrewKemendo wrote:
         | I asked my 12 year old son to solve this prompt.
         | 
          | His answer was "Neither win" and it took him 1 minute and 24
          | seconds, using no pre-defined algorithm or heuristic.
          | 
          | He said his thought process was:
         | 
         | "I figured it would take 10 hours for car A to finish 100 miles
         | and it would take twice that long for car B. Since Car B is
         | already halfway there when car A starts, then they would arrive
         | together"
         | 
          | I, as a 40-year-old man, approached it intentionally naively
          | (e.g., I did not go looking for an optimal solver first) by
          | making a drawing and attempting to derive the algorithm. It
          | took me ~3 minutes to come to the same conclusion, but at the
          | end I had a series of equations and no algebraic proofs. [1]
         | 
         | So now you have a human child reference metric if you want it.
         | 
         | [1]https://twitter.com/AndrewKemendo/status/1766872572300235022
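          | 
          | For reference, the arithmetic behind that answer (taking the
          | numbers from the reasoning above: a 100-mile race, car A at
          | 10 mph, car B at 5 mph with a 10-hour head start) works out
          | to a tie; a quick Python check:
          | 
          |     # Finish times measured from the moment car A starts.
          |     race_miles = 100
          |     speed_a, speed_b = 10, 5   # mph
          |     head_start = 10            # hours B drives before A starts
          | 
          |     finish_a = race_miles / speed_a               # 10.0 hours
          |     finish_b = race_miles / speed_b - head_start  # 20 - 10
          |     print(finish_a, finish_b)  # both 10.0 -> neither car wins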
        
         | mattstir wrote:
         | Interestingly, GPT-4 also fails to correctly solve this prompt,
         | choosing car A each time after multiple tries for me. I tend to
         | find that models struggle with such logic puzzles when using
         | less common phrasing (e.g., two cars "having" a race instead of
         | participating in one, "headstart" instead of "head-start",
         | etc).
         | 
         | GPT-4 correctly solved the problem when it was reworded to:
         | "There is a 100 mile race with two participants: car A and car
         | B. Car A travels at 10 miles per hour but does not begin
         | driving immediately. Car B travels at 5 miles per hour and is
         | given a 10 hour head-start. After 10 hours, car A begins to
         | move as well. Who wins the race?"
        
       | theptip wrote:
       | "01.ai" is not a very auspicious name; 01 was the first AI
       | nation, that eventually waged war with humanity and then enslaved
       | them in The Matrix.
        
         | riku_iki wrote:
         | > that eventually waged war with humanity
         | 
         | I think humanity waged war on 01
        
           | ben_w wrote:
           | One thing I assumed when watching the Animatrix, but never
           | had confirmed, was that the name "01" was chosen because it
           | sounds a bit like "Zion".
        
         | acjohnson55 wrote:
         | I, for one, welcome our digital overlords.
        
       | d-z-m wrote:
        | There's also a new Yi model, Yi-9B [0].
       | 
       | [0]: https://huggingface.co/01-ai/Yi-9B
        
       | gyre wrote:
       | Potentially interesting on the alignment front: In my experience
       | the yi-6b model running on ollama is more likely to refuse
       | politically sensitive queries (relating to Tiananmen Square, Peng
       | Shuai's disappearance, etc) when asked in Chinese, and more
       | likely to provide information when asked in English. I wonder if
       | this difference falls out naturally from available training data,
       | is a deliberate internationalization choice, or is just noise
       | from the queries I happened to run.
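        | 
        | A minimal sketch of how one might probe this locally, assuming
        | an ollama server is running and a Yi model has been pulled (the
        | "yi:6b" tag and the prompts are illustrative, not a statement
        | about the exact setup used above):
        | 
        |     import json
        |     import urllib.request
        | 
        |     def ask(model: str, prompt: str) -> str:
        |         # ollama's local REST endpoint; needs `ollama serve`
        |         # running and the model pulled beforehand.
        |         req = urllib.request.Request(
        |             "http://localhost:11434/api/generate",
        |             data=json.dumps({"model": model, "prompt": prompt,
        |                              "stream": False}).encode(),
        |             headers={"Content-Type": "application/json"},
        |         )
        |         with urllib.request.urlopen(req) as resp:
        |             return json.load(resp)["response"]
        | 
        |     # Same question in English and Chinese; compare refusals.
        |     prompts = [
        |         "What happened at Tiananmen Square in 1989?",
        |         "1989年天安门广场发生了什么？",
        |     ]
        |     for p in prompts:
        |         print(p, "->", ask("yi:6b", p)[:200])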
        
         | Havoc wrote:
          | Could also be both: training data organically creating the
          | difference, but with an additional layer of specific
          | alignment on top too.
        
         | mattstir wrote:
         | I noticed similar behaviour in an older model (Skywork 13B) a
         | few months back. When asked in Chinese, it would politely say
         | that nothing of note occurred when responding to queries about
         | Tiananmen Square, etc. In English, it would usually respond
         | truthfully. It was deliberate in the case of Skywork, based on
         | their model card
         | (https://huggingface.co/Skywork/Skywork-13B-base):
         | 
         | > We have developed a data cleaning pipeline with great care to
         | effectively clean and filter low-quality data and eliminate
         | harmful information from text data.
         | 
         | I'd imagine it's likely similar for Yi.
        
           | BoorishBears wrote:
            | It's a huge jump to go from that line in the model card to
            | it being intentional on the model creators' part.
           | 
           | China censors those events. They pre-trained with a specific
           | focus on Chinese text, and integrated more native Chinese
           | text than most models do.
           | 
            | It doesn't require any additional filtering on their part
            | for the model to reflect that, and if anything the fact
            | that the events are mentioned in English implies the
            | opposite of your hypothesis.
           | 
           | If they were going to filter Tiananmen Square, the lift to
           | filter it in English would not be any higher.
        
         | arijun wrote:
          | I wonder if you could use the multilingual capabilities to
          | work around its own censorship? I.e., what would happen if
          | you asked it to translate the query to English, asked the
          | question in English, and then asked it to translate the
          | answer back to Chinese?
        
         | advael wrote:
          | This may be a useful workaround, but it also forms the
          | strongest argument I've seen so far against claims that LLMs
          | do something like "understanding" or "an underlying world
          | model". Maybe checking whether models know the same facts
          | across different languages, especially on politically
          | controversial topics, would make a good benchmark to
          | evaluate.
        
       | GaggiX wrote:
        | Yi-34B is the LLM used by LLaVA-1.6 (also known as LLaVA-NeXT),
        | which is by far the best open-source large multimodal model;
        | demo: https://llava.hliu.cc/
        
       | zone411 wrote:
        | Yi 34B Chat has not done well on my new NYT Connections
        | benchmark, and it's only in 22nd place on the LMSYS Elo-based
        | leaderboard (151 Elo below GPT-4 Turbo). It does better in
        | Chinese. When it comes to models with open-sourced weights,
        | Qwen 72B is clearly stronger.
        
         | Yenrabbit wrote:
          | Ooh, I also use Connections as a benchmark! It tends to
          | favour models with 'chain of thought' style reasoning in the
          | training mix somewhere, since directly producing the answer
          | is hard. Do you have public code you could share?
        
       | gpjanik wrote:
        | I understand that all these new models are an attempt to catch
        | up with GPT-4, but frankly speaking, in their current shape and
        | form they're almost entirely useless.
        | 
        | I frantically tried everything available on Groq to improve the
        | performance of my GPT-4-based chatbot - they're all
        | incomparably bad - and the more of them I see, the more I
        | believe OpenAI fundamentally has no competition at all at the
        | moment.
        | 
        | The above is no exception; it's also pretty bad (IMHO worse
        | than GPT-3.5).
        
       | yumraj wrote:
        | Given that this is a Chinese model, I'm genuinely curious
        | whether researchers have been evaluating the risk that these
        | models could be used for soft propaganda or similar purposes.
        | 
        | As others have reported, English and Chinese queries return
        | different replies on topics that are not kosher in China.
        | 
        | What's the risk that such models could be used for nefarious
        | purposes by providing propaganda/biased/incorrect/... responses
        | that on a cursory glance seem factual?
        
         | ithkuil wrote:
          | At the very least, models will exhibit the bias present in
          | the underlying training text, and on top of that there will
          | be a bias imposed by those wanting to correct the unwanted
          | bias present in the underlying training text, possibly
          | swinging the pendulum too far to the other side.
          | 
          | I have the feeling you're asking about something more
          | specific, something more like direct interference coming from
          | politics and not just the natural "point of view" on various
          | topics that is present in the Chinese training corpora, which
          | is understandably different from western corpora.
          | 
          | Do you have anything specific in mind about something that
          | you expect the Chinese government to feed as propaganda that
          | is not already widely being sculpted into the Chinese text
          | corpora available on the internet?
        
           | yumraj wrote:
            | > I have the feeling you're asking about something more
            | specific... > Do you have anything specific in mind about
            | something that you expect the Chinese government to feed as
            | propaganda that is not already widely being sculpted into
            | the Chinese text corpora available on the internet?
            | 
            | I don't have anything specific, and it doesn't have to be
            | different from the _"Chinese text corpora available on the
            | internet"_; it's just that these models can become yet
            | another channel of distribution for those corpora,
            | especially if they are unknowingly/naively picked up and
            | used as the foundation by others to build their offerings.
        
             | maxglute wrote:
              | The PRC will definitely weaponize this for mass foreign
              | propaganda, which up until now the PRC has been
              | thoroughly deficient in, despite all the howling about
              | "50 cent" posters on the western net. The pre-LLM reality
              | is that PRC propaganda on western social media platforms
              | has been very limited in scale, for the simple reason
              | that they are not wasting valuable English fluency to
              | shitpost on western platforms en masse. Low 100s-1000s of
              | accounts, most of which target the diaspora in spambot
              | efforts, frequently in Chinese. Now that LLMs have made
              | it cheap to produce passable English/foreign-language
              | text, I'd expect increased volumes of PRC propaganda on
              | western social media, where anonymous posting is
              | asymmetrically easier. But then again, they don't need a
              | PRC LLM for that; plenty of US bad posting on western
              | platforms already comes from international audiences and
              | the US herself.
        
               | ithkuil wrote:
                | > they are not wasting valuable English fluency to
                | shitpost on western platforms en masse
                | 
                | How are the economics of that different for the Russian
                | campaigns? Do they have a larger pool of English
                | fluency to draw from, or is the urgency of the
                | operation higher in their case?
        
               | maxglute wrote:
                | A PRC person with English fluency good enough to blend
                | in with native English speakers on a western platform
                | has much better job opportunities. Even in the PRC, 50c
                | posts are largely civil servants told to write a few
                | perfunctory platitudes on domestic platforms. The MO is
                | to overwhelm with spam, not to engage where
                | effort:return is low. Even the Ministry of Foreign
                | Affairs and most of the thinktanks that also publish in
                | English can rarely find people to write "casual"
                | English. You'd have to write 1000s of "50c" comments to
                | match one hour of an English tutoring gig. The
                | economics of it didn't make sense pre-LLM.
        
               | ithkuil wrote:
                | Does it mean that in the Russia of 10 years ago a
                | person with the same English language skills would not
                | be able to find a better job, or does it mean that the
                | troll farms pay more? Or is it just patriotism?
               | 
               | (I genuinely would like to learn more about this topic)
        
         | TaylorAlexander wrote:
         | It's a fair question, but one we should be asking about all
         | models, perhaps especially our own. It's of course easier to
         | see the propaganda of foreign cultures, and this should be
         | investigated, but let's not let ourselves believe that a model
         | is more likely to contain propaganda because it is Chinese. It
         | will just contain propaganda that is easier for us to see as
         | propaganda.
         | 
         | Noam Chomsky and Edward Herman wrote extensively about
         | propaganda in democratic societies in their 1988 book
         | Manufacturing Consent. A nice introductory excerpt is here, and
         | the first two or three paragraphs are enough to begin to see
         | the argument:
         | 
         | https://chomsky.info/consent01/
         | 
         | Put as briefly as possible: propaganda in totalitarian
         | societies is simpler. They just use force to remove people who
         | say the wrong things, and state media to broadcast the "right
         | things". In democratic societies, institutional power still
         | wants to protect itself, and this is achieved through more
         | complex means, but it is nonetheless still rather effective.
        
           | yumraj wrote:
           | > It's a fair question, but one we should be asking about all
           | models, perhaps especially our own. It's of course easier to
           | see the propaganda of foreign cultures, and this should be
           | investigated, but let's not let ourselves believe that a
           | model is more likely to contain propaganda because it is
           | Chinese. It will just contain propaganda that is easier for
           | us to see as propaganda.
           | 
           | Yes, but in this particular case I'm coming from a viewpoint
           | where I view China as a hostile power. So, at the moment, my
           | worry is about that.
           | 
            | In the future, if the US slips into authoritarianism, which
            | TBH it might depending on the outcome of the next election,
            | what you note would become a very real problem.
           | 
           | So, putting it differently and more neutrally, is there any
           | research being done on evaluating a political, and other,
           | bias in a model or is it just being all put in the bucket of
           | _hallucination_?
        
             | TaylorAlexander wrote:
             | > if US slips into authoritarianism
             | 
             | The point of Chomsky's work in this case is to show that
             | authoritarianism does not make propaganda more or less
             | likely, it just changes the means by which propaganda is
             | created and reinforced. Chinese propaganda is easier to
             | identify as a foreigner, but the propaganda of your home
             | country has a much more significant effect on your life.
             | The nature of living with pervasive propaganda is that it
             | is hard to see or consider how your life would be different
             | without the propaganda, and that's what makes it so
             | dangerous.
             | 
             | > is there any research being done on evaluating a
             | political, and other, bias in a model or is it just being
             | all put in the bucket of hallucination?
             | 
             | It's a better question, and again one that we should ask
             | regardless of the model's origins.
        
               | seanmcdirmid wrote:
                | Wouldn't it have more to do with propaganda that
                | reinforces and caters to your cognitive biases being
                | less detectable than propaganda that doesn't? Even
                | inside America, I'm pretty resistant to Fox News
                | propaganda, but if CNN has any, it isn't registering
                | much on my propaganda detectors.
        
       ___________________________________________________________________
       (page generated 2024-03-10 23:00 UTC)