[HN Gopher] YaLM-100B: Pretrained language model with 100B param...
___________________________________________________________________
YaLM-100B: Pretrained language model with 100B parameters
Author : f311a
Score : 674 points
Date : 2022-06-23 09:00 UTC (14 hours ago)
(HTM) web link (github.com)
| londons_explore wrote:
| For those of us without 200GB of GPU RAM available... How
| possible is it to do inference loading it from SSD?
|
| Would you have to scan through all 200GB of data once per
| character generated? That doesn't actually sound too painful - 1
| minute per character seems kinda okay.
|
| And I guess you can easily do lots of data parallelism, so you
| can get 1 minute per character on _lots_ of inputs and outputs at
| the same time.
| julienfr112 wrote:
| What about using 250 GB of RAM and a CPU?
| hnechochamber2 wrote:
| $ dd if=/dev/zero of=/swapfile bs=1G count=250 status=progress
| $ chmod 600 /swapfile
| $ mkswap -U clear /swapfile
| $ swapon /swapfile
| jstimpfle wrote:
| Is there a reason why it is required to fill the swapfile
| with zeroes here? Normally you'd see something like "dd
| of=/swapfile bs=1G seek=3 count=0", creating a file of size
| 3G but with no space allocated (yet). It's much quicker to
| complete the setup this way.
| wongarsu wrote:
| I assume if you force the file system to allocate inodes
| you are likely to have a less fragmented file than if you
| create a sparse file that gets inodes assigned over time
| when each part is used.
| olddustytrail wrote:
| Interesting guess but wrong I'm afraid :)
|
| It's simply because it's an easy way to create a file of
| a certain size that most Linux users would be familiar
| with.
|
| The quicker way (and possibly more "proper" way) is to
| use fallocate, but who has even heard of that vs dd ?
| theblazehen wrote:
| Which won't matter on SSDs
| wongarsu wrote:
| On all the benchmarks of SSDs I've seen they perform 1.5
| to 4 times better on sequential reads than on random
| reads. That's a much better ratio than HDDs, but still
| enough to care about it.
|
| You're also likely to get less write amplification if
| your swap file is continuous.
|
| Of course with all the layers of indirection it's a
| numbers game, you don't know if your file system
| allocates adjacent inodes, and you don't know how your
| SSD will remap the blocks. But all else being equal,
| trying to make the file as sequential as possible seems
| preferable.
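The difference between the sparse file (`dd ... seek=N count=0`) and a file whose blocks are reserved up front (`fallocate`) can be seen directly. The following is a minimal sketch, assuming Linux and a filesystem that supports preallocation; the 1 MiB size and the helper names are made up for illustration, and `st_blocks` is treated as 512-byte units, which is how Linux reports it.

```python
import os
import tempfile

SIZE = 1 << 20  # 1 MiB stand-in for the 250 GB discussed in the thread


def make_sparse(path, size=SIZE):
    """Set the file length without writing any data (like dd seek=N count=0)."""
    with open(path, "wb") as f:
        f.truncate(size)


def make_preallocated(path, size=SIZE):
    """Ask the filesystem to reserve the blocks up front (like fallocate)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)


def allocated_bytes(path):
    """Bytes actually backing the file, assuming 512-byte st_blocks units."""
    return os.stat(path).st_blocks * 512


def compare():
    """Return (apparent size, allocated bytes) for both kinds of file."""
    with tempfile.TemporaryDirectory() as d:
        sparse = os.path.join(d, "sparse")
        full = os.path.join(d, "full")
        make_sparse(sparse)
        make_preallocated(full)
        return {
            "sparse": (os.path.getsize(sparse), allocated_bytes(sparse)),
            "preallocated": (os.path.getsize(full), allocated_bytes(full)),
        }
```

Both files report the same apparent size, but the sparse one typically has few or no blocks behind it until something writes to it. The swapon(8) man page also warns that swap files with holes are a problem, which is a practical reason to prefer a fully written or preallocated file.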
| pflanze wrote:
| If you bother to set the permissions, I suggest doing it in
| a way that doesn't leave a time window during which it
| is still unprotected (note that non-privileged processes
| just need to open the file during that window; they can
| keep reading even after your chmod has been run). Also, not
| sure what the point of `-U clear` was, that's setting the
| uuid for the swap, better leave it at the default random
| one?
|
| $ ( umask 077; dd if=/dev/zero of=/swapfile bs=1G count=250 status=progress )
| $ mkswap /swapfile
| $ swapon /swapfile
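The same "no unprotected window" idea can be expressed programmatically by passing the restrictive mode at creation time instead of running chmod afterwards. This is only a sketch: the function name, file name, and 4096-byte size are placeholders, and a real 250 GB file would be written in chunks.

```python
import os
import stat
import tempfile


def create_private_file(path, size):
    """Create `path` readable and writable by the owner only, then zero-fill it."""
    # The mode is applied when the file is created, so it is never world-readable.
    # O_EXCL makes the call fail if something already exists at `path`.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, bytes(size))  # small demo size; chunk the writes for large files
    finally:
        os.close(fd)


def demo_mode():
    """Create a throwaway file and return its permission bits."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "swapfile-demo")
        create_private_file(path, 4096)
        return stat.S_IMODE(os.stat(path).st_mode)
```

The process umask can only remove bits from the requested mode, so the file cannot end up more permissive than 0600.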
| Aardwolf wrote:
| Way too slow on CPU unfortunately
|
| But this does make me wonder if there's any way to allow a
| graphics card to use regular RAM in a fast way? AFAIK built-in
| GPUs inside CPUs can, but those GPUs are not powerful enough.
| julienfr112 wrote:
| Slow, but is it still practical? Taking minutes to
| generate a few words can still be useful for testing or for
| certain low-usage use cases.
| easytiger wrote:
| I thought cuda had a unified memory system? Maybe I
| misunderstood
| Cu3PO42 wrote:
| Unified memory exists, but it's not a magic bullet. If a
| page is accessed that doesn't reside on device memory
| (i.e. on the GPU), a memcpy is issued to fetch the page
| from main RAM. While the programming model is nicer, it
| doesn't fundamentally change the fact that you need to
| constantly swap data out to main RAM and while not as bad
| as loading it from the SSD or HDD, that's still quite
| slow.
|
| Integrated GPUs that use a portion of system memory are
| an exception to this and do not require memcpys when
| using unified memory. However, I'm not aware of any
| powerful iGPUs from Nvidia these days.
| easytiger wrote:
| Sure. Makes sense. So I guess for discrete GPUs the
| unified memory stuff provides a universal address space
| but merely abstracts the copying/streaming of the data.
|
| There does seem to be a zero copy concept as well and
| I've certainly used direct memory access over pcie before
| on other proprietary devices.
|
| https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/ind...
| yarandex wrote:
| Assuming running on CPU is memory-bandwidth limited, not
| CPU-limited, it should take about 200GB / (50GB/sec) = 4
| seconds per character. Not too bad.
| lostmsu wrote:
| That's per token. And you can generate quite a few per
| pass.
| [deleted]
| toxik wrote:
| These models are not character-based, but token-based. The
| problem with CPU inference is the need for random access to 250
| GiB of parameters, meaning immense paging and orders of
| magnitude slower than normal CPU operation.
|
| I wonder how bad it comes out with something like Optane?
| amelius wrote:
| It's not really random access. I bet the graph can be
| pipelined such that you can keep a "horizontal cross-section"
| of the graph in memory all the time, and you scan through the
| parameters from top to bottom in the graph.
| toxik wrote:
| Fair point, but you'll still be bounded by disk read speed
| on an SSD. The access pattern itself matters less than the
| read cache being << the parameter set size.
| lostmsu wrote:
| Top SSDs do over 4GB/s so you can infer in 50 seconds if
| disk bound.
|
| You can also infer a few tokens at once, so it will be
| more than 1 char a minute. Probably more like sentence a
| minute.
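The back-of-envelope estimates in this subthread all follow from one division: time per forward pass is roughly model size over read bandwidth, if the pass is bandwidth-bound. A small sketch, assuming the 200 GB model size and the 50 GB/s (RAM) and 4 GB/s (SSD) figures quoted by commenters; the batch size of 8 is a made-up example.

```python
GB = 10**9
MODEL_BYTES = 200 * GB  # model size used in the thread's estimates

# Bandwidth figures quoted by commenters, not measurements.
BANDWIDTHS = {
    "RAM at 50 GB/s": 50 * GB,
    "SSD at 4 GB/s": 4 * GB,
}


def seconds_per_pass(model_bytes, bytes_per_second):
    """Time to stream every parameter once, assuming a bandwidth-bound pass."""
    return model_bytes / bytes_per_second


def tokens_per_minute(batch_size, pass_seconds):
    """One pass yields one new token per sequence in the batch."""
    return 60 * batch_size / pass_seconds


def estimates():
    """Seconds per pass for each quoted bandwidth."""
    return {name: seconds_per_pass(MODEL_BYTES, bw) for name, bw in BANDWIDTHS.items()}
```

Under these assumptions `estimates()` gives 4 seconds per pass from RAM and 50 seconds from SSD, and batching 8 sequences on the SSD-bound setup gives about 9.6 tokens per minute in total. Compute time and cache effects are ignored.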
| toxik wrote:
| You can read bits at that rate yes, but keep in mind that
| it's 250 GiB /parameters/, and matrix-matrix
| multiplication is typically somewhere between quadratic
| and cubic in complexity. Then you get to wait for the
| page out of your intermediate result etc etc.
|
| It's difficult to estimate how slow it would be, but I'm
| guessing unusably slow.
| lostmsu wrote:
| The intermediate result will all fit into a relatively
| small amount of memory.
|
| During inference you only need to keep layer outputs
| until the next layer's outputs are computed.
|
| If we talk about memory bandwidth, it is space
| requirements that are important, not so much time
| complexity.
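The "scan through the parameters while keeping only the current activations" idea can be illustrated with a toy network. This is a hedged sketch, not how YaLM is actually loaded: the layer count, hidden size, file layout, and ReLU layers are all invented for the example, and only the standard library is used.

```python
import array
import os
import tempfile

DIM = 4        # hypothetical hidden size
N_LAYERS = 3   # hypothetical number of layers
LAYER_FLOATS = DIM * DIM


def write_toy_weights(path):
    """Store the layers back to back on disk as row-major matrices of doubles."""
    with open(path, "wb") as f:
        for layer in range(N_LAYERS):
            # Layer k is (k + 1) times the identity, so the result is easy to check.
            weights = array.array("d", [0.0] * LAYER_FLOATS)
            for i in range(DIM):
                weights[i * DIM + i] = float(layer + 1)
            weights.tofile(f)


def streamed_forward(path, activations):
    """Run the network while holding only one layer's weights in memory."""
    with open(path, "rb") as f:
        for _ in range(N_LAYERS):
            weights = array.array("d")
            weights.fromfile(f, LAYER_FLOATS)  # sequential read of the next layer
            # Matrix-vector product followed by ReLU; the previous activations
            # are discarded as soon as the new ones are computed.
            activations = [
                max(0.0, sum(weights[r * DIM + c] * activations[c] for c in range(DIM)))
                for r in range(DIM)
            ]
    return activations


def demo():
    """Write the toy weights to a temporary file and run one streamed pass."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_toy_weights(path)
        return streamed_forward(path, [1.0, 2.0, 0.0, -1.0])
    finally:
        os.remove(path)
```

The reads are strictly sequential, and peak memory is one layer plus one activation vector, which is why the discussion centers on read bandwidth and not on random access.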
| guywhocodes wrote:
| I wonder if you can't do that LSH trick to turn it into a
| sparse matrix problem and run it on CPU that way.
| nmfisher wrote:
| That's pretty much what SLIDE [0] does. The driver was
| achieving performance parity with GPUs for CPU training,
| but presumably the same could apply to running inference
| on models too large to load into consumer GPU memory.
|
| [0] https://github.com/RUSH-LAB/SLIDE
| egorfine wrote:
| I have huge respect for developers at Yandex. It's kind of sad
| that achievements like these are tainted by the fact that they
| come from Russia (and I speak as a Ukrainian). I wonder if the
| permissive license is able to mitigate that.
| f6v wrote:
| The achievements aren't in any way tainted by their
| nationality, citizenship, sex, sexual orientation, age, etc.
| toyg wrote:
| Well... I'm sorry if I reach for the reductio ad Hitlerum,
| but any achievements Nazi scientists might have reached in
| concentration camps are definitely tainted. Similarly,
| achievements in the field of online consumer analysis in a
| country where consumer-privacy protections are nonexistent,
| surely should be considered tainted...?
| f6v wrote:
| > consumer-privacy protections are nonexistent
|
| And you can back this up how?
| pastacacioepepe wrote:
| Wow. Yes you should have refrained from this. You are
| comparing Nazi scientists who killed many innocents to some
| software engineers working on a cool project and releasing
| it for free to the world.
|
| What is your problem?
| f6v wrote:
| Just FYI regardless of your stance on any of the recent
| conflicts. De-humanization is a primary tool in
| information warfare these days.
| p1anecrazy wrote:
| Wernher von Braun would disagree
| amelius wrote:
| I suppose the question should be: did the malevolence help
| in reaching the results?
| aaaaaaaaata wrote:
| > definitely tainted
|
| Operation Paperclip...?
| FpUser wrote:
| That is why the US imported Nazi scientists in bulk to work
| in their labs. Starting with Wernher von Braun who had
| become the heart of the US space program. Soviets did the
| same at the time.
|
| If you are so conscious about consuming tainted fruits the
| only way to escape is to be living on some deserted island
| catching your own food.
| kamray23 wrote:
| They are and we use them all the same. Rockets fly almost
| every week now, jet engines are the most common form of
| propulsion, tons of medicine forcibly tested on innocent
| people is on the market, and to pass up any of that
| technology would be pure idiocy.
| dTal wrote:
| >tons of medicine forcibly tested on innocent people is
| on the market
|
| What medicine?
| skrebbel wrote:
| And Yandex's AI work got helped by the Russian invasion of
| Ukraine how, exactly? Did they train the bots on Ukrainian
| captives first?
| The_Colonel wrote:
| It's the other way round. It's quite possible that e.g.
| this model helped to spread Kremlin propaganda in various
| discussion boards.
| Svoka wrote:
| Yandex search, which works on top of their AI is straight
| out propaganda machine for Russian government. Every time
| you go to yandex.ru, you're greeted with curated happy
| news about how Ukrainians are killing themselves, and
| russians are not fascists at all.
|
| Their government does. They empower it. Just to be clear,
| it is their army, and their government doing the deed.
| They elected, they pay salaries to.
| lotusmars wrote:
| Its developers stayed complicit with a company with a
| biggest propagandist pro-war resource in a country
| (Yandex News).
| trasz wrote:
| I'd guess they've been getting a whole lot more traffic
| lately, thanks to ban on Western services.
| geek_at wrote:
| the age old question if the art should be linked or
| disconnected from the artist
| somenameforme wrote:
| This is nothing like that, because the question is not one
| of their own personal actions - but of their nationality or
| ethnicity. That, until about 4 months ago, would have been
| widely acknowledged as racism.
|
| The difference between holding values, and holding values
| when convenient rather sums up the entirety of human
| history in one phrase.
| trasz wrote:
| Yeah, it's certainly about ethnicity, not at all about
| being controlled by a government which is in the process
| of perpetrating genocide.
| 5e92cb50239222b wrote:
| I'm ethnically Russian (mostly), although I've never been
| to that country and have less influence on their foreign
| policy than your average European (who at least has some
| say in how his own country behaves towards Russia -- and
| we've seen how well they managed that). I don't know how
| this would translate to the real world if I lived in "the
| West", but from what I'm seeing on the internet for the
| past few months, it definitely is about ethnicity. I've
| been called many things and blamed for everything bad
| that has happened since 1945, and not many seem to care
| that half of Putin's army consists of people of Asian and
| Caucasian ethnicities, and there are many Russians in the
| Ukrainian army. If you go to places like r/worldnews,
| there are open calls for violence that have strong
| fascist overtones, and those seem to be getting more
| popular.
|
| Can't say the same about HN, it's one of the few places
| that seems to have kept its sanity (for now?)
| trasz wrote:
| How is your Russian ethnicity different from average
| Ukrainian?
|
| But yes, there is, sadly, some discrimination. It's got
| nothing to do with ethnicity though; it's the same thing
| that happened to ordinary Germans in 1939-1945, and for
| the same reasons.
| rapnie wrote:
| and if technology is neutral.
| [deleted]
| egorfine wrote:
| Some people see it differently.
| danuker wrote:
| Coming from Russia doesn't mean you agree with government
| policy. If you saw people get arrested as soon as they start
| protesting, what would you do?
| vegai_ wrote:
| > If you saw people get arrested as soon as they start
| protesting, what would you do?
|
| Probably move if that's anywhere near a possibility, and if
| not, cowardly stay as unnoticeable as possible. I know that
| at least in Finland Russian refugees are mostly welcome
| (although the border might be closed right now), even if they
| probably will face a lot of scrutiny from various authorities
| for obvious reasons. Most certainly it's nothing like the
| attention such people would face in Russia.
|
| We fondly remember even the smallest acts of defiance that
| ordinary Germans acted out against their regime during
| 1933-1945. We all would like to be those people in times of
| crisis, but obviously most of us are not. They were probably
| ultimately pretty futile acts during that time, though, but
| put together with all the other actions that happened against
| the Nazis played a significant grander role. And we know that
| more than a few significant Jewish scientists and engineers
| fled from Nazi Germany and made significant contributions to
| the war effort. For instance, one guy called Einstein.
| 5e92cb50239222b wrote:
| Move where, exactly? I have a few friends in Russia and
| they sit on their asses because there's nowhere for them to
| go. You don't exactly qualify as a refugee unless the state
| is after you (which you can trigger very easily, but then
| you may not be able to leave the country), and even then
| it's not a given.
| martin_a wrote:
| It would probably fit better to compare "small actions" to
| the people of the DDR which were protesting against their
| regime and ultimately helped to bring it down.
| artemonster wrote:
| But as a company, they do. They are filtering alternative
| media from their search.
| hnechochamber2 wrote:
| Not much they can do if the alternative is go to jail.
| guerrilla wrote:
| They're just following the law of their host country, like
| DuckDuckGo and Google have to... What's the alternative?
| Open rebellion against the state?
| chupasaurus wrote:
| They've started to do so before the law was adjusted
| guerrilla wrote:
| Like Facebook, Twitter and every other forum did? I
| wonder if there were any consequences if they didn't.
| chupasaurus wrote:
| Basically they've taken a step further with censorship of
| all media not controlled by government which at the time
| (2014) couldn't be penalized whatsoever.
| guerrilla wrote:
| And what if they hadn't?
| chupasaurus wrote:
| "Couldn't be penalized whatsoever" wasn't enough? In
| broader view, government could start taking hostages like
| they did with Google last December, but that wouldn't
| help reaching the goal even a small bit back then since
| there was no leverage on technical side of things.
| lotusmars wrote:
| No, they just removed all but pro-Putin or state media
| out of Yandex News. It's just only pro-war Kremlin
| propaganda now.
| iillexial wrote:
| Yandex is under full control of Russian government. Pretty
| sure FSB can access anything.
| lotusmars wrote:
| Yes, similar to VK where low-level police can retrieve
| all of your location, messaging, IP data and sell it to
| mafia or thugs.
|
| Russian police is notorious for selling logs, private
| data or access and also selling databases on a black
| market.
| mojuba wrote:
| Genuinely curious: which media outlets are banned from
| Yandex Search? I just tried Meduza for example, comes up in
| search just fine.
| aliher1911 wrote:
| Yandex has a news block right in the middle of its front
| page. Only "approved" news sources could be shown there.
| For people not specifically searching for recent updates
| on a topic or particular media outlets it represents the
| news. As far as I remember they wanted to remove this
| block completely instead of censoring it, but
| they were not allowed to. I think it was described in one
| of the documentaries about Yandex, but my memories are
| vague now.
| mojuba wrote:
| I know, but the OP meant Search, not News. Google and
| even DDG can downrank certain news sources and I wouldn't
| be very surprised if Yandex did too, but I'd really like
| to see examples.
| [deleted]
| Thorentis wrote:
| Doesn't mean you can't be punished for the actions of the
| government though. See: Western companies and government
| pulling out of Russia or issuing sanctions on private
| individuals. I think it's even worse to say "we don't think
| you're guilty, but we do think you should be punished."
| baxtr wrote:
| "We want to punish you" and "We don't want to make business
| with you" are two very different things IMHO.
| pastacacioepepe wrote:
| If you go out of your way to not do business with
| someone, in order to cripple them economically, then it
| is a punishment.
| layer8 wrote:
| The purpose is deterrence, not punishment.
| pastacacioepepe wrote:
| And yet sanctions never work as a deterrent. Cuba is
| still socialist after 60 years of sanctions. Great
| deterrent! No, sanctions just punish generation after
| generation of innocent people and serve no other purpose.
|
| If you still maintain that the purpose is deterrence, then
| you must be a fool or worse, since it never works! Can't
| you learn from the past?
|
| Edit:
|
| Before you even dig out some article like this:
| https://www.washingtonpost.com/news/worldviews/wp/2014/04/28...
|
| I will make an important point: sanctions CAN work only
| to prevent something from happening. Once that happened,
| e.g. the war started, there's no sanction that can stop
| it and that's exactly when sanctions stop being a
| deterrent and start being a punishment.
| The_Colonel wrote:
| > No, sanctions just punish generation after generation
| of innocent people and serve no other purpose.
|
| They are pretty effective at preventing future wars. This
| war happened only because Russia wasn't crippled enough
| after 2014.
| layer8 wrote:
| Obviously you have a much stronger opinion on that than I
| do, but at the very least, sanctions should deter other
| countries from acting in a similar fashion. For example,
| if -- hypothetically -- China is considering invading
| Taiwan, they will have to factor in that the Western
| world will stop doing business with them. If the West
| hadn't put sanctions in place for Russia, that would
| lessen the concern for China. Maybe you think that isn't
| worth it -- that's a valid personal judgement of course.
| trasz wrote:
| And the fact that Russia is not capable of manufacturing
| tanks anymore has nothing to do whatsoever with
| sanctions?
| baxtr wrote:
| If you want to define punishment like that, it is your
| call.
|
| In my opinion it is not punishment to stop a relationship
| if the basis of that relationship was destroyed
| deliberately by one side.
| pastacacioepepe wrote:
| I don't define it, it's exactly the official definition,
| like it or not:
|
| > punishment: the infliction or imposition of a penalty
| as retribution for an offence.
|
| Then you can play all you want with language to make it
| say what you'd like but it's pointless.
| baxtr wrote:
| An "offence" as defined in a criminal code? Two countries
| don't share a criminal code AFAIK.
| trhway wrote:
| > Western companies and government pulling out of Russia
|
| you can't really blame the companies for not wanting to be
| associated in any way with a nazist regime genociding a
| neighboring country.
|
| And Starbucks or Mercedes pulling out of Russia isn't
| punishment. It is freedom of association and economic
| activity. Russians are whining about "punishment" because
| they have no idea about freedom, and that is them getting a
| bit of taste of it. They think they can plaster whole
| country with their swastika - "Z" - in enthusiastic support
| of the bloody genocide while the whole world shouldn't be
| able to express its disgust at those happenings.
|
| Especially funny how Russians are cry-baby style whining
| about supposed violation of their property rights by the
| West sanctions while Russians have been violating property
| rights of more than 40 million of Ukrainians (even if we
| don't consider all the mass killing and raping of civilians
| that Russians have been doing there). The deep and profound
| disintegration of any morals in my old country is stunning.
|
| >or issuing sanctions on private individuals.
|
| due to the size of their wealth and the de-facto rules of
| economic activity in Russia those aren't private wealth of
| private individuals - they are integral part of that nazist
| regime, and thus they are guilty too.
|
| And for the sanctioned Russian government officials - that
| is for example the Roskosmos CEO Rogozin, who is one of the
| main founders of the Russian Nazi movement "Motherland"
| (people from which have since taken prominent roles across
| the Russian government and the ruling Party) and who is one
| of the most prominent voices around Putin and Putin's
| favorite, giving a Nazi salute at the end of his Nazi
| speech at the Russian Nazi march in Moscow. The specific
| phrase they all give Nazi salute to is "Glory to Russia!".
|
| https://youtu.be/xkXVVcPWSU8?t=87
| lotusmars wrote:
| It was hilarious when people from Moscow wrote "they took
| away our ability to buy Chanel bags, so much for European
| tolerance!"
|
| Not even seeing the irony. What did they expect in
| response to bombing, pillaging, mass rape? Friendly hug?
| cpursley wrote:
| Just want to point out the mass rape claims were
| fabricated. Ukraine even fired Lyudmila Denisova over the
| ordeal:
|
| https://greekcitytimes.com/2022/06/01/ukrainian-official-fir...
| lotusmars wrote:
| No, there are credible and quite horrifying reports[1].
| Also some of the actual rapists were found out and
| victims stepped forward. Please don't spew Russian
| propaganda.
|
| [1] https://meduza.io/en/feature/2022/04/18/i-can-do-whatever-i-...
| cpursley wrote:
| How is it Russian propaganda? It's been widely reported
| by western sources that Denisova was fired by Ukrainian
| officials for lying specifically about the mass rape
| claims.
|
| I'm not suggesting there have been no rapes or that
| Russians are the good guys. Just that the reports of mass
| rapes were fabricated (Denisova admitted she thought it
| would help Ukraine obtain more sympathy and weapons from
| the west).
| pastacacioepepe wrote:
| If I counted every time a NATO member committed worse
| atrocities but wasn't held accountable at all I'd
| probably stop after Ukraine is conquered.
|
| What did they expect? Probably indifference, same
| reaction to any war crime committed by the west so far.
| lotusmars wrote:
| Say what you want, most people in Eastern Europe, Ukraine
| and even lots of people in Russia would prefer to be
| under protection of NATO.
|
| If it were not for NATO, Russian rapists would already be
| in Tallinn, Vilnius, Helsinki. Claiming they were
| offended by historical injustices therefore women should
| be raped and men shot dead ("denazified").
| lotusmars wrote:
| Yandex is arguably the biggest censoring and propaganda
| machine in Russia.
|
| Yandex News is IIRC the biggest news media in Russia.
|
| It filtered all results on protests and opposition resources
| leaving only government propaganda. Same with war. Filtering
| not meaning downranking. Just straight up not showing.
|
| Editors were fired for not staying in line until it was
| completely sterilized and filled with pro-war propaganda.
|
| Working in Yandex is being complicit with it.
| Svoka wrote:
| Every russian citizen pays for death and destruction in
| Ukraine. With taxes, with national wealth.
|
| What should russians do you ask? Fight. I did it in Ukraine
| in 2004. Then in 2014. I didn't run from cops, I didn't let
| them take my friends. But regardless, now we pay with our
| lives, being subjected to genocide because of russian
| cowardliness.
|
| because so far they are all just paying for the genocide.
| f6v wrote:
| They might as well agree with it. Or agree with some of it.
| It just strikes me how everyone wants to put everyone else
| into these well-defined black-and-white boxes. I get that
| it's simpler, but it's often at odds with reality.
| FpUser wrote:
| >"It just strikes me how everyone ..."
|
| Because most of those "everyone's" are not facing the
| choices themselves and are basically keyboard warriors.
| Let's see what they say when they'll be asked to sacrifice
| their own well being to be on a "high moral ground".
| 5e92cb50239222b wrote:
| I think we're already seeing that in recent French
| elections. I have a feeling this is only the beginning.
|
| As a relatively neutral party in all of this whose
| country hasn't tainted itself (but who nevertheless spent
| all my life under a similar autocracy), I can't help but
| shake my head at keyboard revolutionaries who
| _definitely_ would have overthrown the regime, if only
| they lived in Russia. You just have no freaking idea what
| you're talking about. Guard your democracy as best you
| can so you don't have to find out.
| varispeed wrote:
| This is a stupid argument. Ukrainians are being _killed_ and
| you compare that to fear of being arrested? It's a nice
| excuse. "Oh yes I don't support my government, but you know
| these arrests, I'd rather stay in my cosy home and enjoy my
| tea. Now could you please lift the sanctions? I already said
| I don't support my government, why normal people like me
| should suffer?" etc etc
| FpUser wrote:
| >""Oh yes I don't support my government, but you know these
| arrests, I'd rather stay in my cosy home and enjoy my tea."
|
| Why are you so surprised? This is exactly how most of the
| population behaves everywhere. People go about their
| business and "support" criminal actions of their
| governments all the time. This includes the West. Our
| governments have no problems exterminating, starving and
| displacing people (as long as they're the "right" people to
| mess with) while the majority of the population is going on
| merrily about their business. And Europe keeps buying
| Russian stuff even now while "Ukrainians are being killed".
| Where are the mass protests and fights with the police?
|
| Things might change when "messing" with people will ALWAYS
| have the consequences for ANY country. But this is not what
| is happening and is unlikely to change. We have no extra
| terrestrial entity to police us in impartial way.
| lotusmars wrote:
| > Our governments have no problems exterminating,
| starving and displacing people
|
| It's high time to bring "but US bombed Iraq". Classic
| playbook.
| dmpk2k wrote:
| And yet it is still true.
| lotusmars wrote:
| Americans just love to talk about themselves. Who cares
| about Russians under Putin's oppression or Ukrainians
| being exterminated. Let's talk about your government,
| Bush, Trump and Google.
| mardifoufs wrote:
| This is not true. I can assure you that tons of Muslims
| and people from the middle east also care about the fact
| that the same actors who gleefully engineered wars on
| terror that led to a million people dying and entire
| countries getting devastated, with absolutely 0
| consequences for them, are now so very keen to hold other
| people accountable for illegitimate invasions.
|
| No one likes hypocrisy, especially when it is coming from
| the same westerners that at most protested for a few
| weeks back in 2003 when their own countries bombed us for
| 2 decades, that are now calling for other people to get
| arrested and possibly tortured/executed by putin's regime
| because that's just the right thing(tm) to do to stop the
| war. It would be laughable if it wasn't despicable.
| FpUser wrote:
| It is as classic as your own standard script. I've just
| explained what is going on. I did not want to single out
| the US as it happens everywhere. But if you are so touchy
| maybe you should not have "supported" that particular
| subject. Remind me what was your punishment?
| oleg_antonyan wrote:
| Yeah, by being arrested you're doing so much more for the
| Ukrainians
| varispeed wrote:
| More people having to deal with prisoners, potentially
| fewer people going to Ukraine. Certainly better than
| doing nothing.
| aaaaaaaaata wrote:
| Wait until you hear about the folks your country is
| killing!
| varispeed wrote:
| https://en.wikipedia.org/wiki/And_you_are_lynching_Negroes
| mardifoufs wrote:
| Yes posting that Wikipedia link isn't a magic way to
| deflect from the fact that the iraq war led to a million
| people dead. And that people are still dying from the war
| on terror. It's amazing that you just said that people in
| ukraine are still dying, and that just saying that you
| don't support your government from the comfort of your
| couch isn't enough... and then you proceeded to link an
| article specifically so that you can ignore/deflect the
| deaths that are also happening now and that should be
| (according to your own argument) much more important than
| any of your own comfort or even liberty?
|
| "Yes hundreds of thousands of Muslims died and are still
| dying, but bringing it up or asking me to do anything
| about is fallacious! Checkmate"
|
| As you said, who cares about debate tricks when people in
| the middle east are still dying from the war on terror as
| we speak? Why are you holding other people to standards
| that you don't even pretend to hold yourself to? You are
| expecting people to get arrested to prevent deaths and
| talk about the situation in ukraine, but I guess making
| you uncomfortable with "whataboutism" is the limit?
| egorfine wrote:
| I have been asking myself this question for 120 days and I
| still don't have an answer.
| lotusmars wrote:
| Maybe not be "apolitical" for 20 years that led to this?
|
| Russian IT and media were showered with relatively high
| wages to stay quiet while last semblance of elections was
| finally destroyed, independent media was taken over by pro-
| Putin oligarchs and activists were crushed or murdered.
|
| As was the sarcastic saying coined recently, "if you are
| apolitical then bullets don't hit you".
| 2Gkashmiri wrote:
| https://github.com/yandex/YaLM-100B/blob/main/LICENSE
|
| apache license
|
| >The model is published under the Apache 2.0 license that
| permits both research and commercial use
|
| you should be fine
| pfortuny wrote:
| The achievements cannot be tainted.
|
| Kolmogorov complexity is (I hope) untainted.
|
| Also, Hilbert's problems are not untainted (and he never flew
| Nazi Germany!).
| ask_b123 wrote:
| I think the correct word might be fled -> he never fled.
| pfortuny wrote:
| Yes, sorry.
| puranjay wrote:
| What percentage of American inventions and scientific
| developments post WW2 were led or influenced by former Nazi
| scientists?
| Svoka wrote:
| Probably about same as share as of USSR's. They just were
| open about it. Also, crucial word here is "former". Like,
| there is a big difference between being of a former fascist
| state, and carrying on ongoing genocide.
| niek_pas wrote:
| Are American developers' achievements tainted by the fact they
| come from the United States?
| toyg wrote:
| Some of them, like the invasive analysis tools that exploit
| the Patriot Act, yes for sure.
| martin_a wrote:
| Yeah. US developers might create great software but due to
| the Cloud Act, Patriot Act and whatnot you don't want to use
| it for anything that's not public data at first. It's just
| not protected against unauthorized access.
| kamray23 wrote:
| Often, yes. In many places we're even wary of using US-based
| services at all. The EU has been having a bit of a back and
| forth with the US and many companies because EU law prohibits
| foreign states gaining access to personal information whereas
| US law requires foreign personal information transfers to CC
| the NSA on request. There's some big legislative deadlocks
| where American companies simply cannot operate fully in
| countries other than the US because the US laws require wild
| violations of basic rights to privacy of anyone who isn't a
| US citizen.
|
| Lots of things have to be cleared for backdoors, Intel and
| AMD are scary with their built-in ME and whatever AMD had, I
| can't exactly remember, proprietary hardware in general is
| very scary outside the US due to surveillance and possible
| backdoors, it's kind of weird. Same goes for China, though
| they don't surveil foreigners exactly as hard, at least here
| they don't. It's not exactly an ideal situation and I think
| there should be agreements done internationally on stuff like
| this to keep the US and China out of our devices or allowing
| them to kindly fuck off entirely.
|
| Not exactly the same kind of taint though, your products
| aren't as much morally tainted as they are simply dangerous
| to use, like little telescreens you have to carry around.
| MrBuddyCasino wrote:
| Underrated comment, considering the history of NATO
| expansion, color revolutions and the hundreds of thousands
| killed in pursuit of reckless ideological overt or covert
| warfare.
| bloqs wrote:
| Yikes
| aaaaaaaaata wrote:
| Unsure what this comment means.
| throwaway4good wrote:
| What sort of machine can run this model?
| f311a wrote:
| Nvidia DGX
| throwaway4good wrote:
| I can see it is about 150.000 USD for such a machine. Is this
| the cheapest option out there?
| kamray23 wrote:
| Well you can custom-build a suitable system for the middle
| five digits. It's still not something every idiot can run,
| but most medium to large companies can set up their own for
| sure.
| SXX wrote:
| First of all, regardless of the political situation, this is a great
| step in making ML research actually open. So huge thanks to
| those developers who pushed to make it public. Still...
|
| Yandex does in fact share responsibility for the Russian government's
| actions. While it is impossible to fight censorship, they could
| certainly shut down their News service completely.
|
| Yandex could also certainly move more of their company and staff
| out of the country. It was their deliberate choice to stay in Russia
| and gain advantages in the local market by using their political
| weight.
| vbezhenar wrote:
| If that would make you happier, Yandex is selling its News
| service to Mail.ru.
| SXX wrote:
| This only happened now, after the war began.
| tomp wrote:
| How much responsibility does Google share for US wrecking
| Afghanistan, Iraq, Libya and Syria?
| lostmsu wrote:
| Google doesn't censor antiwar propaganda.
| gambler wrote:
| Blatantly incorrect. Google engages in egregious political
| censorship all the time. Including censorship for Russian
| government and censorship of US anti-war voices.
|
| https://reclaimthenet.org/youtube-responds-to-cpac-
| censorshi...
|
| https://reclaimthenet.org/google-expanded-its-censorship-
| of-...
|
| https://reclaimthenet.org/russia-continues-to-order-
| google-t...
|
| In US they pretend to "decide" to censor things "on their
| own" because 1st amendment prevents the government from
| officially demanding censorship.
| lostmsu wrote:
| None of your links show Google censoring anti-war
| propaganda.
| gambler wrote:
| https://medium.com/dan-sanchez/don-t-see-
| evil-148ae18bc9fe
|
| https://citizenactionmonitor.wordpress.com/2017/08/02/goo
| gle...
|
| https://www.protocol.com/bulletins/google-censors-war
| lostmsu wrote:
| Still no. First case I would not even consider
| censorship. The third one was temporary until Google
| stopped operating in Russia altogether.
|
| A quote from the second one: "cumulative 45 percent
| decrease in traffic from Google searches"
| gambler wrote:
| There is a difference between "Google does not censor
| anti-war content" and "Google does censor anti-war
| content, but usually has an excuse I find acceptable".
|
| When a company puts John Lennon's Happy Xmas (War Is Over)
| behind an age-restriction banner[1], the question stops
| being "Is there censorship?" and becomes about the logic
| of such censorship.
|
| _> The third one was temporary until Google stopped
| operating in Russia altogether._
|
| They've censored other things at the behest of the Russian
| government _for years_ [2]. Again, I cannot fathom how
| people on a tech website like HN can be unaware of such
| things. This is common knowledge broadly covered on
| mainstream websites.
|
| ---
|
| [1] https://reclaimthenet.org/youtube-john-lennon-war-is-
| over-wa...
|
| [2] https://www.rferl.org/a/google-censors-search-
| results-after-...
| lostmsu wrote:
| Precisely zero of what you mentioned so far is censoring
| anti-war content.
|
| Even in the translation case (which I assume you mean by
| your "excuse" remark) the original source is still
| available as is. I am not even sure from the description
| what translation team it was talking about and what does
| it have to do with Google exactly. "translate company
| text for the Russian market" this passage sounds like it
| talks about translating Google's own interfaces, help
| pages, press releases, or support articles to Russian.
| E.g. no external voice is being censored.
| dang wrote:
| > This is either astounding ignorance or blatant
| gaslighting.
|
| Can you please edit name-calling / swipes like that out
| of your HN comments? It breaks the site guidelines and
| weakens your point.
|
| https://news.ycombinator.com/newsguidelines.html
| lostmsu wrote:
| Considering the importance of the topic, and provided the
| linked articles actually contained examples of Google
| censoring anti-war propaganda, I believe the swipe would
| have been fully justified.
|
| Highly emotional tone changes how the data affects the
| reader. If he is right, I would surely better remember
| next time that Google is in the same ballpark due to the
| insult hitting hard. If he is wrong, I will know better
| to ignore such claims in the future without a direct
| quote or something else that consumes less time than
| reading an entire linked article.
| [deleted]
| kgeist wrote:
| There are no laws in the US punishing the spread of antiwar
| propaganda that Google needs to comply with.
| lostmsu wrote:
| There's no law that says Yandex must operate in Russia.
| kgeist wrote:
| Yandex has offices in 8 countries. I wonder if they
| censor news everywhere or only in Russia.
| [deleted]
| SXX wrote:
| Google is a US company which pays taxes in the US. The company, just
| like everyone in the US, obviously does share responsibility for
| what the US government does. Fortunately Google and its
| leadership actually do have political positions even if you
| don't like them.
|
| In any case, as the unfortunate owner of a Russian passport with
| friends and colleagues in Ukraine, I am more affected by Putin's
| war than by anything the US does.
|
| So I want Yandex to be seen as part of the Kremlin propaganda
| machine and treated accordingly. This company grew monopolies
| in many markets in Russia and directly benefited from the
| Putin regime. Since Ilya Segalovich died, the company started to
| be "out of politics", and this complete lack of any political
| activity led the country to these terrible events.
| idealmedtech wrote:
| In the download script, it skips parts of the model (02 and 83);
| any ML people have ideas why you'd do that?
| hansonw wrote:
| It appears the indexing for the model parts is deliberately not
| contiguous; the 03-82 range represents the main 80 transformer
| layers.
| https://github.com/yandex/YaLM-100B/blob/main/megatron_lm/me...
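|
| A small sketch of the index arithmetic in this reply (the variable
| names are illustrative and not taken from the download script):
| the range 03-82 does hold exactly 80 part indices, and the two
| skipped indices fall just outside it.

```python
# Part indices said to hold the main transformer layers: 03 through 82, inclusive.
transformer_parts = list(range(3, 83))

# Indices the download script reportedly skips.
skipped_parts = [2, 83]

layer_count = len(transformer_parts)
skipped_are_outside = all(p not in transformer_parts for p in skipped_parts)

print(f"{layer_count} transformer parts; skipped indices outside range: {skipped_are_outside}")
```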
| idealmedtech wrote:
| That makes sense, thanks for clearing it up!
| lukestateson wrote:
| memorable wrote:
| Source for all the claims?
| sgc wrote:
| Those are trivial to verify. If you can't do your 5 minutes
| of research, OP should not feel obliged to humor your attempt
| to create busy work for them.
| Vaslo wrote:
| Plenty of people cite their work and arguments in
| contentious discussions.
| trasz wrote:
| https://twitter.com/timsoulo/status/1510955352267063296
| OmicronCeti wrote:
| 2) https://www.businessinsider.com/yandex-russia-former-news-
| di...
|
| >The ex-head of news at Russia's largest internet company has
| advice for his former colleagues: quit.
|
| >Lev Gershenzon worked at Yandex in various roles for four
| years, according to his LinkedIn profile. He took to Facebook
| early Tuesday morning to warn people still working at the
| company -- which is one of the largest search engines in
| Russia -- that it was contributing to the censorship of the
| country's invasion into Ukraine.
|
| >"The fact that a significant part of the Russian population
| may believe that there is no war is the basis and driving
| force of this war," Gershenzon wrote, also tagging six of his
| former coworkers. "Today, Yandex is a key element in hiding
| information about war. Every day and hour of such "news"
| costs human lives. And you, my former colleagues, are also
| responsible for this."
|
| 2) https://techcrunch.com/2022/03/16/russia-yandex-news-vk/
|
| >Yandex's former head of news accused the company of being a
| 'key element in hiding information' from Russians about the
| war in Ukraine.
|
| 3) Result of Yandex's slower crawler and default display
| mode, although the effect is as described: https://twitter.co
| m/maryilyushina/status/1510930537187319813...
| jeroenhd wrote:
| Russian company follows Russian propaganda rules, that's hardly
| news. It's pretty clear that concepts like "free press" and
| "freedom of information" aren't compatible with the Russian
| regime and expecting such features from a company operating
| mostly in Russia is kind of pointless. It should be obvious
| that anything from Yandex (or any company targeting Russia, really)
| should be met with a good deal of scepticism. Companies like
| Yandex and Baidu can still deliver usable research, though, as
| long as you realise with what kind of perspective their code
| was written and their algorithm trained.
|
| In a similar vein, Microsoft has censored "tank man" from their
| image search (and that of all their image search customers,
| such as DuckDuckGo). Google is more transparent about their
| censorship, usually showing a link or explanation why they
| remove certain information at the bottom of the page, but it
| still reflects the values of western civilisation, for example
| by delisting Russian propaganda such as RT.
|
| These biases are everywhere in all research into this field.
| The Russian situation is obviously worse than that in many
| other countries, but you should never forget the bias that AI
| models from free countries have been trained with either.
| lizardactivist wrote:
| Do you use Google, Bing, Twitter, or Facebook?
| wongarsu wrote:
| Now we just need someone to figure out how to compress the model
| to get similar performance in 10B parameters.
|
| I assume some of the services that offer GPT-J APIs will pick
| this up, but it doesn't look cheap or easy to get this running.
| alexb_ wrote:
| I have to wonder if 10 years down the line, everyone will be able
| to run models like this on their own computers. Have to wonder
| what the knock-on effects of that will be, especially if the
| models improve drastically. With so much of our social lives
| being moved online, if we have the easy ability to create fake
| lives of fake people one has to wonder what's real and what
| isn't.
|
| Maybe the dead internet theory will really come true; at least,
| in some sense of it.
| https://www.theatlantic.com/technology/archive/2021/08/dead-...
| Jimmy wrote:
| There's a very simple solution, of course: turn off the
| computer and physically interact with real people.
| espadrine wrote:
| > _I have to wonder if 10 years down the line, everyone will be
| able to run models like this on their own computers._
|
| Isn't that already the case? Sure, it costs $60K, but that is
| accessible to a surprisingly large minority, considering the
| potency of this software.
| alexb_ wrote:
| ...what? 60 thousand dollars for a dedicated computer that
| you can't use is not everyone, not on their own computers,
| and is also a crazy large amount of money for nearly
| everyone. Sure there are some that could, but that's not what
| I said.
| H8crilA wrote:
| Indeed. What "everyone" can use is a ~$200 smartphone, so
| there's a ~300x gap to be bridged.
| wjnc wrote:
| log(300) / log(2) = only 8.2 doublings away. That's near
| future material.
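|
| The arithmetic here can be checked with a few lines. This is only a
| sketch: the $60K and $200 figures come from the comments above, and
| the two-year doubling period is an assumption chosen for
| illustration, not a forecast.

```python
import math


def doublings_needed(gap: float) -> float:
    """Number of price/performance doublings required to close a given gap."""
    return math.log2(gap)


price_today = 60_000   # machine price quoted upthread, in USD
price_target = 200     # smartphone price quoted upthread, in USD

gap = price_today / price_target
doublings = doublings_needed(gap)

# Assumed doubling period in years (hypothetical, for illustration only).
assumed_years_per_doubling = 2.0
years = doublings * assumed_years_per_doubling

print(f"gap: {gap:.0f}x, doublings: {doublings:.2f}, years at assumed rate: {years:.1f}")
```

| How "near future" this is depends entirely on the doubling period
| you assume, which is what the replies below argue about.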
| thfuran wrote:
| Maybe at 90s hardware growth rates, but not now.
| rictic wrote:
| The dream of the 90s is alive in the GPU market:
| https://aiimpacts.org/2019-recent-trends-in-gpu-price-
| per-fl...
|
| Moore's law didn't stop, just Dennard scaling. Expect
| graphics and AI to continue to improve radically in
| performance/price, while more ordinary workloads see only
| modest improvements.
| thfuran wrote:
| GPU TDP seems on the verge of going exponential, cost per
| transistor isn't really decreasing so much at the very
| latest nodes, and even that article seems to suggest it'd
| likely be decades before 300x flops/$
| ben_w wrote:
| Most of the cost of a phone isn't the processor, so
| probably closer to x1000. Hardware may get that much
| cheaper, but it was never guaranteed, and we're not
| making progress as fast as we used to.
| px43 wrote:
| Eh, 60k is just a bit more expensive than your average car,
| and lots of people have cars, and that's just how things
| are today. I imagine capabilities will be skyrocketing and
| prices will fall drastically at the same time.
| fuzzer37 wrote:
| > 60k is just a bit more expensive than your average car
|
| If by "A bit" you mean about 30-40k
| paganel wrote:
| Plus, there are the energy costs involved when running a
| computer now worth 60k, I'm pretty sure that in the current
| socio-economic climate those power costs will surpass the
| initial acquisition cost (those 60k, that is) pretty
| easily.
| colechristensen wrote:
| An 80GB nvidia A100 goes for $20k and uses 300 watts, the
| energy costs of using one (or three) isn't going to
| surpass the hardware costs for... a while.
| paganel wrote:
| I wanted to add that I was writing it metaphorically in a
| way, as in, seeing as how high those energy bills will be
| they might as well all add up to 60k.
|
| Not sure about most of the people in here, but I would
| get really nervous at the thought of running something
| that draws 3x300 watts, 24/7, just as part
| of a personal/hobby project. The incoming power bills
| would be too high, you have to be in the wage-percentile
| for which dropping 60k on a machine just to carry out
| some hobby project is ok, i.e. you'd have to be "high-
| ish" middle-class at least.
|
| The recent increases in consumer power prices are a heavy
| blow for most of the middle-class around Europe (not sure
| about how things are in the States), so a project like
| this one is just a no-go for most of middle-class
| European programmers/computer people.
| colechristensen wrote:
| At full power 3 of those would cost me ~$3.50 per day
| ($0.15 per kWh is what I paid for last month's
| electricity, though I could pay less if I made some
| different choices). I occasionally have a more expensive
| coffee order, or have a cocktail worth three times as
| much.
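|
| A rough sketch of that estimate, using only figures quoted in this
| subthread (three GPUs at 300 W, $0.15 per kWh, $20k per card). It
| ignores the host system and cooling, which is presumably why it
| comes out a little under the ~$3.50 quoted.

```python
def daily_energy_cost(num_gpus: int, watts_per_gpu: float,
                      price_per_kwh: float, hours: float = 24.0) -> float:
    """Electricity cost in dollars of running the GPUs at full power for `hours`."""
    kwh = num_gpus * watts_per_gpu / 1000.0 * hours
    return kwh * price_per_kwh


cost_per_day = daily_energy_cost(num_gpus=3, watts_per_gpu=300, price_per_kwh=0.15)

# Days of continuous full-power use before electricity matches the hardware price.
hardware_cost = 3 * 20_000
break_even_years = hardware_cost / cost_per_day / 365

print(f"${cost_per_day:.2f} per day, break-even after about {break_even_years:.0f} years")
```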
|
| Things are getting more expensive here but nothing like
| the situation in Europe (essentially none of our energy
| was imported from Russia, historically ~10% of oil
| imports but that was mostly to refine and re-export, we
| have all the natural gas locally that we need). The US
| crossed the line into being a net hydrocarbon energy
| exporter a while ago (unsure what the case is recently
| but it is at worst about at parity).
| golem14 wrote:
| You must not have a pool :)
| wellthisisgreat wrote:
| What kind of computer would they be?
|
| Can you spec it out roughly?
| joshvm wrote:
| You could just run this on a desktop CPU, there's nothing
| stopping you in principle, you just need enough RAM. A big
| memory (256GB) machine is definitely doable at home. It's
| going to cost 1-2k on the DIMMs alone, less if you use
| 8x32GB, but that'll come down. You could definitely do it for
| less than $5k all in.
|
| Inference latency is a lot higher in relative terms, but even
| for things like image processing running a CNN on a CPU isn't
| particularly bad if you're experimenting, or even for low
| load production work.
|
| But for really transient loads you're better off just renting
| seconds-minutes on a VM.
| sascha_sl wrote:
| From the readme, it looks like you need that RAM on your
| GPU.
| joshvm wrote:
| There isn't any reason you can't run a neural net on a
| CPU. It's still just a bunch of big matrix operations.
| The advantage of the GPU is it's a lot faster, but "a
| lot" might be 1 second versus 10 seconds, and for some
| applications 10 seconds of inference latency is just fine
| (I have no idea how long this model would take). All the
| major ML libraries will operate in CPU-only mode if you
| request it.
| visarga wrote:
| They are pretty slow even on GPU. The problem is that
| it's an autoregressive model. So it needs to do a forward
| pass for each token.
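|
| A toy sketch of why that is: in autoregressive decoding each new
| token depends on all the tokens before it, so generating N tokens
| costs N full forward passes. `toy_forward` is a hypothetical
| stand-in for the model, not anything from YaLM.

```python
from typing import Callable, List, Tuple


def toy_forward(tokens: List[int]) -> int:
    """Stand-in for one full forward pass: returns the next token id.

    Hypothetical rule for illustration: (sum of the context) mod 10.
    """
    return sum(tokens) % 10


def generate(prompt: List[int], steps: int,
             forward: Callable[[List[int]], int] = toy_forward) -> Tuple[List[int], int]:
    """Greedy autoregressive loop: one forward pass per generated token."""
    tokens = list(prompt)
    passes = 0
    for _ in range(steps):
        next_token = forward(tokens)  # needs the whole context so far
        passes += 1
        tokens.append(next_token)     # the output becomes part of the next input
    return tokens, passes


output, forward_passes = generate([1, 2, 3], steps=5)
print(output, forward_passes)
```

| The loop cannot be parallelised across output positions, because
| step k+1 cannot start until step k has produced its token.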
| kamray23 wrote:
| You're grossly overestimating. People who make 60k annually
| are getting a bit rarer nowadays, it's not like everyone can
| afford it. For the majority of people it'd be a multi-decade
| project, for a few it might only take 7 years, very few
| people could buy it all at once.
| uniqueuid wrote:
| Nitpick: This uses 8x A100 which are at least $10k a piece to
| my knowledge. Add in the computer and you're closer to $100k.
| sascha_sl wrote:
| And also, NVIDIA does not sell them to the consumer market
| whatsoever. Linus Tech Tips could only show one because
| someone in the audience sent theirs over for review.
| taink wrote:
| I believe you're confusing the amount of A100 graphics
| cards used to _train_ the model (the cluster was actually
| made up of 800 A100s), and the amount you need to _run_ the
| model :
|
| > The model [...] is supposed to run on multiple GPUs with
| tensor parallelism.
|
| > It was tested on 4 (A100 80g) and 8 (V100 32g) GPUs, [but
| should work] with ≈200GB of GPU memory.
|
| I don't know what the price of a V100 is, but given $10k a
| piece for A100s we would be closer to the $60k estimate.
| uniqueuid wrote:
| The $10k price is for an A100 with 40GB ram, so you need
| 8 of those. If you can get your hands on the 80GB
| variant, 4 are enough.
|
| Also, if you want to have a machine with eight of these
| cards, it will need to be a pretty high-spec rack-mounted
| or large tower. To feed these GPUs, you will want to have
| a decent amount of PCIe-4 lanes, meaning EPYC are the
| logical choice. So that's $20k for an AMD EPYC server
| with at least 1.6kw PSUs etc etc.
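|
| The GPU counts quoted in this subthread can be reproduced with a
| back-of-the-envelope sketch. Two assumptions that are not from the
| README: weights are stored at 2 bytes per parameter (fp16), and the
| GPU count is rounded up to a power of two as a simple stand-in for
| tensor parallelism needing an even split. Activation memory is
| ignored.

```python
import math


def model_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the weights alone, in GB (assumes fp16 by default)."""
    return num_params * bytes_per_param / 1e9


def gpus_needed(total_gb: float, gb_per_gpu: float) -> int:
    """Smallest power-of-two GPU count whose combined memory holds the weights."""
    minimum = math.ceil(total_gb / gb_per_gpu)
    count = 1
    while count < minimum:
        count *= 2
    return count


weights_gb = model_memory_gb(100e9)       # 100B parameters
a100_80 = gpus_needed(weights_gb, 80)     # A100 80GB
a100_40 = gpus_needed(weights_gb, 40)     # A100 40GB
v100_32 = gpus_needed(weights_gb, 32)     # V100 32GB

print(weights_gb, a100_80, a100_40, v100_32)
```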
| taink wrote:
| Do you happen to know the cost of the 80GB variant?
| uniqueuid wrote:
| The PNY variant is pretty much the only one you can try
| to buy as an individual part, and those go for ~$15k. If
| you can get them.
|
| Note that A100 like other datacenter GPUs are passively
| cooled. You need a strong airflow and duct in any case
| that would house them.
| colechristensen wrote:
| Dell sells them for $20k
| uniqueuid wrote:
| Or ~25k in Euro. Ouch.
| riku_iki wrote:
| There is also $5k A6000 with 48GB
| justinlloyd wrote:
| Which will work just fine with NVIDIA SWITCH and a decent
| GPU compute case from ASUS or IBM or even building your
| own out of an off-the-shelf PCIe switch and consumer
| motherboard.
| justinlloyd wrote:
| You don't need a "decent amount" of PCIe-4 lanes. You
| just need 16 of them. And they can be PCIe 3.0 and will
| work just fine. Deep learning compute boxes predominantly
| use a PCIe switch. e.g. the ASUS 8000 box, which handles
| eight cards just fine. You only need a metric tonne of
| PCIe bandwidth if you are constantly shuttling data in
| and out of the GPU, e.g. in a game or exceedingly large
| training sets of computer vision data. A little latency
| of a few hundred milliseconds moving data to your GPU in
| a training session that will take hours if not days to
| complete is neither here nor there. I suspect this model,
| with a little tweaking, will run just fine on an eight
| way RTX A5000 setup, or a five-way A6000 completely
| unhindered. That puts the price around $20,000 to
| $30,000. If I put two more A5000s in my machine, I
| suspect I could figure out how to get the model to load.
|
| It also sounds like they haven't optimized their model,
| or done any split on it, but if they did, I suspect they
| could load it up and have it infer slower on fewer GPUs,
| by using main memory.
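|
| To put rough numbers on the bandwidth point, here is a sketch of
| how long a one-off copy of ~200GB of weights would take over a
| single x16 link. The rates are approximate theoretical peaks for
| PCIe 3.0 and 4.0; real transfers are slower, and a multi-GPU box
| would split the copy across several links.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to move `size_gb` over a link with the given sustained bandwidth."""
    return size_gb / bandwidth_gb_per_s


PCIE3_X16_GB_PER_S = 15.75   # approximate theoretical peak
PCIE4_X16_GB_PER_S = 31.5    # approximate theoretical peak

weights_gb = 200.0
t_pcie3 = transfer_seconds(weights_gb, PCIE3_X16_GB_PER_S)
t_pcie4 = transfer_seconds(weights_gb, PCIE4_X16_GB_PER_S)

# Share of a one-hour session spent on the initial copy over PCIe 3.0.
share_of_hour = t_pcie3 / 3600

print(f"PCIe 3.0: {t_pcie3:.1f}s, PCIe 4.0: {t_pcie4:.1f}s, share of 1h: {share_of_hour:.2%}")
```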
| arathore wrote:
| If by running models you mean just the inference phase, then
| even today you can run a large family of ML models on commodity
| hardware (with some elbow grease, of course). The training
| phase is generally the one not easily replicated by non-
| corporations.
| pgt wrote:
| The Move to the Edge is one of the strongest trends in
| technology. So, yes. I would never bet against it.
|
| (applies to computing and other technologies like power
| production and agriculture)
| user3939382 wrote:
| When I see AWS, cloud, and server side rendering frameworks
| it seems like we're moving the other way in some sense.
| ubercore wrote:
| There's a strong trend to push to the edge of the cloud
| though -- cloudfront workers, deno.deploy, etc
| Comevius wrote:
| That's definitely the future, personalized entertainment and
| social interactions will be big. I could watch a movie made for
| me, and discuss it with a bunch of chat bots. The future will
| be bubbly as hell, people will be decaying in their safe places
| as the hellscape rages on outside.
| pydry wrote:
| I get the feeling that creative sci fi used to kind of help
| inoculate us against these kinds of future but it seems like
| there's much less of it than there used to be.
|
| "Black mirror" was good but it's not nearly enough.
| Peritract wrote:
| > I could watch a movie made for me
|
| We're a long, long way from this. Stringing words/images
| together into a coherent sequence is arguably the easy bit of
| creating novels/films, and computers still lag a long way
| behind humans in this regard.
|
| _Structuring_ a narrative is a harder, subtler step. Our
| most advanced ML solutions are improving rapidly, but often
| struggle with coherence over a single paragraph; they're not
| going to be doing satisfying foreshadowing and emotional
| beats for a while.
| [deleted]
| axg11 wrote:
| > We're a long, long way from this.
|
| We're probably 18 months away from this. We're probably
| less than 5 years away from being able to do this on local
| hardware. AI/ML is advancing faster than most people
| realise.
| thatwasunusual wrote:
| > Structuring a narrative is a harder, subtler step.
|
| You can say that about many movies/series made entirely by
| humans today. :)
| fumblebee wrote:
| Maybe. But I think a lot of folks have a short term memory;
| it was not so long ago that Word2Vec and AlexNet were SOTA.
| Remember when the thought of a computer besting a world-class
| player at Go was impossible? Me too.
|
| We've come ludicrously far since then. That progress
| doesn't guarantee that innovation in the space will
| continue at its current pace, but it sure does feel like
| it's possible.
| natly wrote:
| We're probably a long way away from narrative, but dall-e
| for video is probably only a year or two away from now
| (they're probably training the model as we speak).
| importantbrian wrote:
| I actually wouldn't be surprised if the technology catches
| up to this faster than we realize. I think the actual
| barrier to large scale adoption of it will be financial and
| social incentives.
|
| A big reason all the major studios are moving to big
| franchises is that the real money is in licensing the
| merch. The movies and TV shows are really just there to
| sell more merch. Maybe this will work when we all have high
| quality 3d printers at our desks and we can just print the
| merch they sell us.
|
| The other big barrier is social. A lot of what people
| watch, they watch because it was recommended to them by
| friends or colleagues, and they want to talk about what
| other people are talking about. I'm sure that there will be
| many people who will get really into watching custom movies
| and discussing those movies with chatbots, but I bet most
| people will still want to socialize and discuss the movies
| they watch with other humans. FOMO is an underestimated
| driver of media consumption.
| jb_s wrote:
| For many movies, sure.
|
| I'm pretty sure the Marvel franchise is shat out by an
| algorithm.
| orbital-decay wrote:
| You jest, but it really is the case. When your movie has
| a goddamn board of directors, you can be 100% sure it
| will be A/B tested until it transmutes the surrounding
| air into gold.
| rasz wrote:
| You really don't want to live in Mindwarp (1992 Bruce Campbell
| movie) or in this !114! year old short story
| https://en.wikipedia.org/wiki/The_Machine_Stops
| dTal wrote:
| The Machine Stops is _eerily_ prescient - or perhaps just
| keenly observant of trends visible even at the time - but
| in fairness the humans in it are not _socially_ isolated,
| as such; they do not converse with bots, but rather with
| each other. The primary social activity in The Machine
| Stops is the Zoom meeting.
|
| I do not look forward to the day when that story becomes an
| _optimistic_ view of the future.
| mrguyorama wrote:
| That story is already an optimistic view compared to our
| own: They have no ads
| unixhero wrote:
| Yes, the vision is that everyone has an AI cube in their house.
| shaky-carrousel wrote:
| Then, we'll hack all those cubes to build an AGI.
| psychoslave wrote:
| I don't know about you, but most of my online interactions are
| text based. The context of interpretation matters far more than the
| form of the content. If you know it's easy to fake text
| exchanges, you might be more careful about text origin and
| other contextual hints. Even if the syntax imitates your
| children's verbal oddities, you won't necessarily rush to comply
| thoughtlessly with an unusual demand you just received by SMS from
| their phone number. Trust and check.
| Byamarro wrote:
| It could be possible with analog chips. I.e. ones that Mythic
| works on.
| redox99 wrote:
| I'm not sure why you got downvoted. Yes, ASICs (either analog
| or digital) that have some model hardcoded in would probably
| make it feasible, but it won't be programmable which is the
| interesting part.
| trasz wrote:
| Totally not my field, but why wouldn't they be
| programmable? Analog FPGA's already exist.
| redox99 wrote:
| Yes, true. I was referring to the Mythic ones the other
| comment mentioned which are only for inference of a
| specific model.
| dav_Oz wrote:
| The bots/machine vs human reminds me of that famous experiment
| from the 30s in which Winthrop Kellogg[0], a comparative
| psychologist, and his wife decided to raise their human baby
| (Donald) simultaneously with a chimpanzee baby (Gua) in an
| effort to "humanize the ape". It was set out to last 5 years
| but was aborted relatively quickly, after only 9 months. The
| explicit reason wasn't stated, only that it had successfully shown
| the hereditary limits of a chimpanzee within the "nature vs
| nurture" debate; the reticent statement reads as follows:
|
| > _Gua, treated as a human child, behaved like a human child
| except when the structure of her body and brain prevented her.
| This being shown, the experiment was discontinued_
|
| There has been a lot of speculation as to other reasons for
| ending the experiment so prematurely. Maybe exhaustion. One
| thing which seemed to dawn on the parents - if one reads
| carefully - is that a human baby is far superior at _imitating_
| than the chimpanzee baby, frighteningly so, that they decided
| to abort the experiment early on in order to prevent any
| irreversible damage in the development to their human child
| which at that point had become far more similar to the
| chimpanzee than the chimpanzee to the human.
|
| So, I would rephrase "the internet is dead" into "the internet
| becomes increasingly undead" because humans condition
| themselves in a far more accelerated way to behave like bots
| than bots are potentially able to do. From the wrong side this
| could be seen as progress when in fact it's _opposite
| progress_. It sure feels that way for a lot of people
| and is a crucial _reciprocal_ element often overlooked
| /underplayed (mostly in a benign effort to reduce _unnecessary_
| complexities) when analyzing human behaviour in interactions
| with the environment.
|
| [0]https://en.m.wikipedia.org/wiki/Winthrop_Kellogg#The_Ape_and
| ...
| r3trohack3r wrote:
| A tangentially related thought:
|
| Actors attempt to imitate humans. "Good acting" is
| convincing; the audience believes the actor is giving a
| reasonable response to the portrayed situation.
|
| But the audience is also trying to imitate the actors to some
| degree. Like you point out, humans imitate. For some subset
| of the population, I'd imagine the majority of social
| situations they are exposed to, and the responses to
| situations they observe, are portrayed by actors.
|
| At what point are actors defining the social responses that
| they then try to imitate? In other words, at what point does
| acting beget acting and how much of our daily social
| interactions actually are driven by actors? And is this world
| of actors creating artificial social responses substantially
| different than bots doing the same?
| dvirsky wrote:
| Someone wrote once about how Wall Street people started
| behaving like the slick image projected of them in movies
| in the 80s, namely of Michael Douglas; before that they
| were more like the "boring accountant" type.
| jdsully wrote:
| This is a common phenomenon where the fake is more
| believable than the real thing due to over exposure of the
| imitation.
|
| Famously the bald eagle sounds nothing like it does in tv
| and the movies and explosions are rarely massive fireballs.
| For human interaction it's much harder to pin down cause
| and effect but if it happens in other cases it would be
| very surprising to not happen there.
| dougmwne wrote:
| This is famously theorized by postmodernism. See:
| https://en.m.wikipedia.org/wiki/Simulacra_and_Simulation
| boplicity wrote:
| Case in point: recently, I've noticed that I'm getting more
| and more emails with the sign off "Warm regards." This is not
| a coincidence. It is an autosuggestion from Google. If you
| start signing off an email, it will automatically suggest
| "Warm regards." It just appears there -- probably an idea
| generated from an AI network. There are more and more of
| these algorithmic "suggestions" appearing every day, in more
| and more contexts. This is true for many text messaging
| programs: There are "common" replies suggested. How often do
| people just click on one of the suggested replies, as opposed
| to writing their own? These suggestions push us into
| conforming to the expectations of the algorithm, which then
| reinforces those expectations, creating a cycle of further
| pushing us into the language use patterns generated by
| software -- as opposed to idiosyncratic language created by a
| human mind.
|
| In other words, people are already behaving like bots; and
| we're building more and more software to encourage such
| behavior.
| alephxyz wrote:
| Those suggestions appear in Google chat too and even if you
| don't click on them, the simple fact of reading the
| suggestion makes you much more likely to type it yourself.
| There's clearly a priming effect to it.
| yoyopa wrote:
| depends on your personality
| jcelerier wrote:
| nope, just being exposed to text influences you whether
| you want it or not
| Izkata wrote:
| They're saying it might influence you to type something
| different. Some of us are just contrary.
| Dylan16807 wrote:
| Sometimes. Good luck keeping that up the majority of the
| time something tries to influence you.
| AlecSchueler wrote:
| On average, it doesn't. This is why advertising and magic
| work.
| bgroat wrote:
| I'm a magician and a developer by training.
|
| Now primarily employed in a marketing capacity.
|
| Over my career I've worked with: - Doctors - Lawyers -
| Engineers - Fund managers - Academics (hard and soft
| sciences) - Mentalists/Hypnotists
|
| All of them believed that their specific training and
| temperament made them immune from simple persuasion
| techniques and that they were purely rational actors.
|
| None of them struck me as any more rational/more
| independent thinkers than anyone else off the street
| xcambar wrote:
| It is typical to rate yourself above your actual self.
|
| Even when someone rates oneself down like when saying of
| themself that they're dumb, ugly or whatever, they
| generally mean it in a lesser fashion than for any other
| peer they'd attribute as such.
| ClumsyPilot wrote:
| But it's not above, it's ascribing a mythical ability that
| does not exist - we don't talk about people who think
| they are psychic as optimistic, we call them crazy.
|
| these guys are similar, except it's common belief.
| remram wrote:
| Those suggestions are very few so I suspect they were hand-
| picked.
| mattnewton wrote:
| I don't know if this is how it still works, but early
| attempts were modeled as classification problems with
| hundreds of hand picked completions. Can't predict
| something really bad if it isn't in your prediction list.
| This limits the surface of bad things to cases of tone
| mismatch like "sounds great" when talking about someone
| grieving a loss or something.
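| A minimal sketch of the "classification over hand-picked
| completions" idea described above. The reply list, the keyword
| sets and the overlap scoring are all hypothetical stand-ins for
| a trained classifier; the point is only that the output can
| never be anything outside the fixed list.

```python
import re

# Hypothetical hand-picked replies, each with illustrative trigger keywords.
# A real system would use a trained classifier instead of keyword overlap.
CANDIDATE_REPLIES = {
    "Sounds great!": {"plan", "meet", "lunch", "tomorrow", "works"},
    "Thanks for letting me know.": {"update", "fyi", "delayed", "heads"},
    "I'm sorry to hear that.": {"sorry", "loss", "passed", "sick", "funeral"},
}


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def suggest_reply(message):
    """Return the best-scoring reply from the fixed list, or None if nothing matches."""
    tokens = tokenize(message)
    scores = {
        reply: len(tokens & keywords)
        for reply, keywords in CANDIDATE_REPLIES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

| Because the label set is closed, the worst failure is picking
| the wrong canned reply (the tone mismatch mentioned above),
| not generating something offensive.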
| orbital-decay wrote:
| Doesn't GMail collect the data in some form of federated
| learning nowadays, like GBoard does? Federated learning
| does seem to be able to create the unintended positive
| feedback loop, converging on a single phrase and causing
| the users to lock themselves in a bubble.
| codeviking wrote:
| Which is why it's important for folks to start applying AI
| to more interesting (but harder, more nuanced) problems.
| Instead of making it easier for people to write emails, or
| targeting ads, it should be used to help doctors, surgeons
| and scientists.
|
| The problem is that these problems are less profitable. And
| that the companies with enough compute to train these types
| of models are concerned about getting more eyeballs, not
| making the world a better place.
| wskinner wrote:
| The problem is not that those problems are less
| profitable. The problem is a combination of 1. Those
| problems are much harder 2. The potential harm from
| getting them wrong is much larger
| codeviking wrote:
| Yup, I definitely agree that they're harder (and noted
| this). But I'm not sure I agree with your second point.
| Or rather, I think there's some nuance to it.
|
| Sure, using AI to treat people without a human in the
| loop would clearly do harm. But using AI as an assistant,
| to help a doctor make the right diagnosis, seems like
| it'd do the opposite. It'd help doctors serve a larger
| patient population, make fewer mistakes, and probably
| equate to less harm in the long run.
|
| Anyway, I think we can all agree that using AI for
| anything other than ad targeting is a net win.
| jazzyjackson wrote:
| thanks for the awesome analogy, I always had the sinking
| feeling that the bots are finding it increasingly easy to fit
| in among the humans because the humans on social media act
| increasingly like bots.
|
| "monkey see, monkey do"
| [deleted]
| gigglesupstairs wrote:
| Wow this is such a mind bending perspective. Thanks for
| sharing it.
| alcover wrote:
| Nice post! But to me your analogy does not really stand :
| bots _are_ the ones catching up with human conversation in an
| "accelerated way", feeding on a corpus that predates them.
| Bots are not an invariant nature that netizens imitate.
| iforgotpassword wrote:
| It's the commonly believed reason; the child starting to take
| on habits from Gua, like noises when she wanted something,
| and the way monkeys scratch themselves. No authoritative
| source for it though, it's what I've been told during a
| lecture back in college, and I think PlainlyDifficult
| mentions it too in their video about it.
|
| https://youtu.be/VP8DD9TGNlU
| deathemperor wrote:
| My mind is blown. Thanks for sharing. Especially with the
| movie analogy. I'm a very movie person and I imitate my
| personality traits a lot based on characters on movies...
| freewizard wrote:
| So maybe the Turing Test is not about whether AIs are smart enough,
| but about how stupid humans become?
| rexpop wrote:
| Not stupid; imaginative and agreeable.
| bgroat wrote:
| These are also the elements that make a good hypnosis
| subject.
|
| I can't put a dumb person under.
|
| I need someone with an active imagination who wants to
| work with me (for best results)
| time_to_smile wrote:
| Comments like this make me feel like I'm losing my mind.
|
| I think it's far more likely that in 10 years we'll all become
| more used to rolling blackouts, and fondly remember we all used
| to be able to afford to eat out, and laugh over a glass of
| cheap gin about how wild things were back in the old days
| before things got really bad.
|
| 10 years ago was a much more exciting and hopeful time than
| today. I remember watching Hinton show off what deep learning
| was just starting to do. It was frankly more interesting than
| high parameter language models. Startups were all working on
| some cool problems rather than just trying to screw over
| customers.
|
| That's just technology. Economically, socially and ecologically
| things looked far brighter in 2012 than they do now, and in 2032
| I suspect we'll feel the same about today, but far more
| dramatically.
|
| We've already passed the peak of "things are getting better all
| the time!" but people are just in denial about this.
| zackmorris wrote:
| Unpopular opinion: something will stop egalitarian power for
| the masses. I had high hopes for multicore computing in the
| late 90s and early 2000s but it got blocked every step of the
| way by everyone doubling down on DSP (glorified vertex buffer)
| approaches on video cards, leaving us with the contrived
| dichotomy we see today between CPU and GPU.
|
| Whatever we think will happen will not happen. A less-inspired
| known-good state will take its place, creating another status
| quo. Which will funnel us into dystopian futures. I'm just
| going off my own observations and life experience of the last
| 20 years, and the way that people in leadership positions keep
| letting the rest of us down after they make it.
| ur-whale wrote:
| You're an optimist.
|
| Before any of the things you describe happen, most states
| will mandate the equivalent of a carry permit to be able to
| freely use compute for undeclared and/or unapproved purposes.
| nradov wrote:
| In what sense is the dichotomy between CPU and GPU contrived?
| Those are designed around fundamentally different use cases.
| For low power devices you can get CPU and GPU integrated into
| a single SOC.
| ggktk wrote:
| I'm predicting that the upcoming Mac Pro will be very popular
| among ML developers, thanks to unified memory. It should be
| able to fit the entire model in memory.
|
| Combine that with the fact that PyTorch recently added support
| for Apple silicon GPUs.
| tehsauce wrote:
| upcoming mac pro will have pretty poor ML performance when
| compared to even an old nvidia gpu sadly.
| uniqueuid wrote:
| Although memory capacity may matter more than speed for
| _inference_. As long as you're not training or fine
| tuning, the mac pro / studio may be just fine*.
|
| * apart from the fact that you can't use any of the many
| nvidia-specific things; if you're dependent on cuda,
| nvcuvid, AMP or other things that's a hard no.
| TuringNYC wrote:
| >> I have to wonder if 10 years down the line, everyone will be
| able to run models like this on their own computers.
|
| Do you mean _train_ or _run_? My assumption was all these
| models could be run on most computers, probably with a simple
| docker container, as long as there is sufficient RAM to hold
| the network, which should be most laptops > 16gb ram.
|
| Speaking of which, anyone have recommendations on pre-trained
| docker containers with weights included?
| tiborsaas wrote:
| I think there will be a trend where model's size will shrink
| due to better optimization / compression while hardware specs
| keep increasing.
|
| You can already see this with Chinchilla:
|
| https://towardsdatascience.com/a-new-ai-trend-chinchilla-70b...
| natly wrote:
| I know it's a sort of exaggerated paranoid thought. But like
| these things do all come down to scale and some areas of the
| world definitely could have the amount of compute available to
| make dall-e level quality full scale videos which we might be
| consuming right now. It really does make you start to wonder at
| what point we will rationally be able to have zero trust that
| not everything we watch online is fabricated.
| thelamest wrote:
| Historically, hard-to-falsify documents are an anomaly, the
| norm was mostly socially conditional and enforced trust.
| Civilizations leaned and still lean on limited-trust
| technologies like personal connections, word of mouth, word
| on paper, signatures, seals, careful custody etc. I agree
| losing cheap trust can be a setback, just want to point out
| we're adaptable.
| lostmsu wrote:
| Running the models like this on own computer is already
| possible with DeepSpeed. I think it even supports training
| albeit it would be extremely slow.
|
| https://www.deepspeed.ai/
| nonrandomstring wrote:
| > one has to wonder what's real and what isn't.
|
| And whether it really matters. That's the bigger question.
|
| I think, for most of us, it does matter. But we're not sure why
| and what a loss of human reality would really mean.
|
| For a few who wholeheartedly embrace it there's some resonance
| with the psychedelic/60s creed that sees this as some kind of
| "liberation".
| simonh wrote:
| It's more likely, if not inevitable that these things will
| become ubiquitously available remotely, like Siri and Alexa.
| It's access that's important, not hosting.
| lukestateson wrote:
| 1. Yandex supports the Russian Terrorist regime.
|
| 2. Yandex News service ignores the genocide currently happening
| in Ukraine.
|
| 3. Yandex Search engine hides the pictures of Bucha and Irpin
| massacre as well as Kharkiv and Mariupol destruction.
|
| Yandex using whitewashing tactics via open source.
| htrp wrote:
| > It was tested on 4 (A100 80g) and 8 (V100 32g) GPUs, but is
| able to work with different configurations with ≈200GB of GPU
| memory in total which divide weight dimensions correctly (e.g.
| 16, 64, 128).
|
| so we looking at crazy prices just for inference. RIP to the
| first guy's cloud billing account who makes this public
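| The quoted figure can be sanity-checked with back-of-envelope
| arithmetic. This sketch assumes 2 bytes per parameter (16-bit
| weights) and counts weights only, ignoring activations and
| framework overhead; the function names are illustrative.

```python
def weights_gb(n_params, bytes_per_param=2):
    """Size of the raw weights in GB (10^9 bytes), assuming 16-bit weights by default."""
    return n_params * bytes_per_param / 1e9


def fits(n_gpus, gb_per_gpu, n_params=100e9, bytes_per_param=2):
    """True if the pooled GPU memory can hold the weights alone (no activations)."""
    return n_gpus * gb_per_gpu >= weights_gb(n_params, bytes_per_param)


if __name__ == "__main__":
    print(weights_gb(100e9))   # weights of a 100B-parameter model
    print(fits(4, 80))         # 4 x A100 80 GB
    print(fits(8, 32))         # 8 x V100 32 GB
    print(fits(1, 24))         # a single 24 GB consumer card
```

| Under these assumptions both tested configurations clear the
| ~200 GB bar (320 GB and 256 GB pooled), while a single 24 GB
| card falls far short.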
| jhoelzel wrote:
| so err the cheapest A100 i could find was EUR 10.579,79 .
|
| Suddenly that 3090 i wanted to get, does not seem so
| expensive....
| [deleted]
| [deleted]
| jamix wrote:
| Doing a Yandex image search for "Bucha" tells me all I need to
| know about Yandex.
| orbital-decay wrote:
| What does it tell you? I'm seeing mostly pictures of
| destruction and mass graves for both Bucha and Bucha.
| OneLessThing wrote:
| If you read the resulting articles you'll find a few of them
| suggest that all the deaths were staged or committed by
| Ukrainians. Headlines like "The truth is out there..." or
| "Global lies..." are examples. There still are many results
| from mainstream western media on the other hand.
|
| Google, in contrast, has zero results implying the deaths
| were staged or committed by Ukrainians.
| ketzu wrote:
| Seeing those gigantic models it makes me sad that even the 4090
| is supposed to stay at 24GB of RAM max. I really would like to be
| able to run/experiment on larger models at home.
| EugeneOZ wrote:
| Can Apple Silicon's unified memory be an answer?
| josu wrote:
| For the people that didn't click on the link:
|
| >but is able to work with different configurations with
| ≈200GB of GPU memory in total which divide weight dimensions
| correctly (e.g. 16, 64, 128).
| MrBuddyCasino wrote:
| Wondering if Apple Silicon will bring large amounts of unified
| main memory with high bandwidth to the masses?
|
| The Mac Studio maxes out at 128GB currently for around $5K, so
| 256GB isn't that far out and might work with the ~200GB Yandex
| says is required.
| Havoc wrote:
| Perhaps on quantity. Substantially slower though around ~3x
| from what I can tell...substantial roadblock if you're
| training models that take weeks.
| MrBuddyCasino wrote:
| I meant for inference, not training. People just want to
| run the magic genies locally and post funny AI content.
| Havoc wrote:
| ah right - gotcha
| out_of_protocol wrote:
| Take a look at Apple's M1 Max, a lot of fast unified memory. No
| idea how useful though
| postalrat wrote:
| Apple is selling M1's with > 200gb ram? Have a link so I can
| buy one?
| jeroenhd wrote:
| What's the difference between Apple's unified memory and the
| shared memory pool Intel and AMD integrated GPUs have had for
| years?
|
| In theory you could probably assign a powerful enough iGPU a
| few hundred gigabytes of memory already, but just like Apple
| Silicon the integrated GPU isn't exactly very powerful. The
| difference between the M1 iGPU and the AMD 5700G is less than
| 10% and a loaded out system should theoretically be tweakable
| to dedicate hundreds of gigabytes of VRAM to it.
|
| It's just a waste of space. An RTX3090 is 6 to 7 times faster
| than even the M1, and the promised performance increase of
| about 35% for the M2 will mean nothing when the 4090 will be
| released this year.
|
| I think there are better solutions for this. Leveraging the
| high throughput of PCIe 5 and resizable BAR support might be
| used to quickly swap out banks of GPU memory, for example, at
| a performance decrease.
|
| One big problem with this is that GPU manufacturers have
| incentive to not implement ways for consumer GPUs to compete
| with their datacenter products. If a 3080 with some memory
| tricks can approach an A800 well enough, Nvidia might let a
| lot of profit slip through their hands and they can't have
| that.
|
| Maybe Apple's tensor chip will be able to provide a
| performance boost here, but it's stuck on working with macOS
| and the implementations all seem proprietary so I don't think
| cross platform researchers will really care about using it.
| You're restricted by Apple's memory limitations anyway, it's
| not like you can upgrade their hardware.
| zaptrem wrote:
| Apple gets significant latency and frequency benefits from
| placing their LPDDR4 on the SoC itself.
| thereddaikon wrote:
| Unified memory is and always has been a cost cutting tactic.
| It's not a feature no matter how much manufacturers who use
| it try to claim it is.
| perryizgr8 wrote:
| Nvidia deliberately keeps their consumer/gamer cards limited in
| memory. If you have a use for more RAM, they want you to buy
| their workstation offerings like RTX A6000 which has 48G GDDR6
| RAM or A100 which has 80G.
| justinlloyd wrote:
| What NVIDIA predominantly does on their consumer cards is
| limit the RAM sharing, not the RAM itself. The inability for
| each GPU to share RAM is the limiting factor. It is why I
| have RTX A5000 GPUs and not RTX 3090 GPUs.
| Voloskaya wrote:
| If you don't care about inference speed being in the 1-5sec
| range, then that should be doable with CPU offloading, with
| e.g. DeepSpeed.
| qayxc wrote:
| 200+ GiB of RAM still sounds like a pretty steep hardware
| requirement.
| justinlloyd wrote:
| Oh yeah, that $750 for 256GB of DDR-4 is going to totally
| break the bank.
| kfrzcode wrote:
| Damn I didn't know ram was so cheap
| Voloskaya wrote:
| If you have an nvme deepspeed can offload there as a second
| tier once the RAM is full.
|
| 175 GB aggregate on both RAM and nvme is in the realm of
| home deep learning workstation.
|
| As long as you aren't too fussy about inference speed of
| course.
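| The tiering idea above (GPU first, then RAM, then NVMe) can be
| illustrated with a greedy capacity plan. This is not DeepSpeed's
| API or its actual placement logic; `plan_offload` and the tier
| sizes are hypothetical and only show how the weights would
| spill from one tier to the next.

```python
def plan_offload(model_gb, tiers):
    """Greedily place model_gb across tiers, fastest first.

    tiers is an ordered list of (name, capacity_gb) pairs.
    Returns a dict mapping each tier name to the GB placed there.
    """
    remaining = model_gb
    plan = {}
    for name, capacity_gb in tiers:
        used = min(remaining, capacity_gb)
        plan[name] = used
        remaining -= used
    if remaining > 0:
        raise ValueError(f"{remaining} GB of weights do not fit in any tier")
    return plan


if __name__ == "__main__":
    # Hypothetical home workstation: one 24 GB GPU, 128 GB RAM, 1 TB NVMe.
    tiers = [("gpu", 24), ("ram", 128), ("nvme", 1000)]
    print(plan_offload(200, tiers))
```

| Everything placed in the slower tiers has to be streamed back
| for each forward pass, which is where the loss of inference
| speed comes from.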
| thejosh wrote:
| It's also a power issue. The 4090 sounds like you're going to
| need a much, MUCH higher PSU than you currently use.. or it'll
| suddenly turn off as it uses 2-3x the power.
|
| You'll need your own wiring to run your PC soon :-)
| melenaboija wrote:
| I think it is a stupid question, but does the power
| consumption needed by processors to infer compared to human
| brains demonstrate that there is something fundamentally
| wrong for the AI approach or is it more physics related?
|
| I am not a physicist or biologist or anything like that so my
| intuition is probably completely wrong but it seems to me
| that for more basic inference operations (lets say add two
| numbers) power consumption from a processor and a brain is
| not that different. It's like seeing how expensive it is for
| computers to infer for any NLP model, humans should be
| continuously eating carbs just to talk.
| agalunar wrote:
| Around room temperature, an ideal silicon transistor has a
| 60 mV/decade subthreshold swing, which (roughly speaking)
| means that a 10-fold increase in current requires at least
| a 60 mV increase in gate potential. There are some
| techniques (e.g. tunneling) that can allow you to get a bit
| below this, but it's a fairly fundamental limitation of
| transistors' efficiency.
|
| [It's been quite a while since I studied this stuff, so I
| can't recall whether 60 mV/decade is a constant for silicon
| specifically or all semiconductors.]
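| The 60 mV/decade figure matches the standard thermionic limit
| ln(10)*kT/q, which depends on temperature and not on the choice
| of semiconductor. A short numerical check, treating the device
| as ideal (no extra capacitance factor); the function name is
| illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23        # exact SI value
ELEMENTARY_CHARGE_C = 1.602176634e-19   # exact SI value


def subthreshold_swing_limit_mv(temperature_k):
    """Ideal thermionic subthreshold swing, ln(10) * kT/q, in mV per decade."""
    thermal_voltage_v = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return math.log(10) * thermal_voltage_v * 1000.0


if __name__ == "__main__":
    # Around room temperature (300 K) this comes out just under 60 mV/decade.
    print(round(subthreshold_swing_limit_mv(300.0), 1))
```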
| visarga wrote:
| The AI is much faster than the brain, if you batch requests
| the cost goes down.
| googlryas wrote:
| > but it seems to me that for more basic inference
| operations (lets say add two numbers) power consumption
| from a processor and a brain is not that different
|
| Sure it is - it is too hard to figure it out based on 2
| numbers, but let's multiply that by a billion - how
| much energy does it take a computer to add two billion
| numbers? Far less than the energy it would take a human
| brain to add them.
| PartiallyTyped wrote:
| I bought a 1500w psu soon after the previous crypto collapse
| for around $150, one of the best purchases I made.
| Dylan16807 wrote:
| The RAM is not using all that much of the power, and I think
| that scales more on bus width than capacity.
| justinzollars wrote:
| What is the TLDR on this model? What exactly does it do? It's not
| clear from the source examples.
| [deleted]
| braingenious wrote:
| This is one of the funniest threads I've ever seen on this
| website. People are yelling at each other about the CIA and the
| legitimacy of Israel and Assange and the definition of fascism
| and... anything that pisses anybody off about international
| politics in general. In a thread about a piece of software that's
| (to me and likely many others) prohibitively expensive to play
| around with.
|
| Anyway I hope somebody creates a playground with this so I can
| make a computer write a fan fiction about Kirby and Solid Snake
| trying to raise a human baby on a yacht in the Caspian Sea or
| whatever other thing people will _actually_ use this for.
| option wrote:
| Did they bias it toward ru propaganda talking points?
|
| Edit: I would like to see more details in addition to size and
| languages (en, ru) about training data. For example, did they use
| their own Yandex.news (a cesspool of propaganda)?
| dang wrote:
| You've made a version of this comment 3 times in this thread
| now. It's shallow and flamebaity, and the repetition just adds
| noise and does no good, so please don't keep doing that. I
| understand the strong feelings, but the rules still apply--in
| fact that's when they apply most.
|
| https://news.ycombinator.com/newsguidelines.html
| option wrote:
| Thanks for reminder, I deleted other two comments which were
| more flamebaity. My overall point still stands - they did not
| give any details other than size on the training data. This
| is crucial (I train LLMs for a living)
| lukestateson wrote:
| dang wrote:
| You've posted 7 highly repetitive comments taking this thread
| straight into flamewar hell. That's not what this site is for,
| and destroys what it is for. If you'd please review
| https://news.ycombinator.com/newsguidelines.html and stick to
| the rules, we'd appreciate it.
|
| Hijacking top comments when flamebait hasn't succeeded in
| setting an entire thread on fire yet is particularly abusive.
|
| We detached this subthread from
| https://news.ycombinator.com/item?id=31853016.
| [deleted]
| cockhole_desu wrote:
| [deleted]
| narrator wrote:
| Back on topic, are you in favor of releasing language models if
| it means we won't be able to prevent the Russians from using
| them for propaganda for example?
|
| As long as we're going on tangents, according to the Zach
| Vorhies leak, Google censors lots and lots of topics for
| blatantly political reasons[1].
|
| [1]https://www.breitbart.com/tech/2021/08/19/google-
| whistleblow...
| xpl wrote:
| _> Yandex Search engine hides the pictures of Bucha and Irpin
| massacre as well as Kharkiv and Mariupol destruction_
|
| That's just not true, try it yourself. It just does not display
| the _latest_ images by default (though it's easily turned on
| in the filter settings), and that's why on the very day the
| news appeared on the Internet, people went crazy about that
| Yandex somehow "hides the truth"...
|
| _> Yandex News service ignores the genocide currently
| happening in Ukraine_
|
| That is actually required by the Russian regulations on news
| aggregator services. Yeah, those regulations are unfair and
| oppressive, but it's the local law to which Yandex must comply.
| And by the way, they're going to get rid of that toxic asset:
| https://techcrunch.com/2022/04/28/yandex-sells-news-zen-vk
|
| (I suppose they can't just shut it down because the government
| threatens to nationalize Yandex in response)
|
| _> Yandex supports the Russian Terrorist regime_
|
| Can you please show any public statement from Yandex from which
| one could derive that?
| lukestateson wrote:
| > That is actually required by the Russian regulations
|
| Russian Terrorist Regime *
| SXX wrote:
| > That is actually required by the Russian regulations on
| news aggregator services.
|
| I Was Just Following Orders (c)
|
| Yandex could just shut down Yandex.News service completely
| years ago without repercussions. They choose not to.
| xpl wrote:
| _> without repercussions_
|
| That comes from where? The repercussions could have been
| very severe. The Russian government easily takes over and
| seizes control over "rogue companies". Russia is not a free
| country, my friend.
| SXX wrote:
| I am from Russia (though I moved to Turkiye once the war
| began) and I do have several friends working at Yandex in
| different positions, including some quite high in
| management. So I am well aware of their reasoning behind
| keeping working at Yandex.
|
| Basically even today, after the war has begun and tens of
| thousands were killed on both sides, some of the people
| working there still hold the illusion that they could
| continue to live in their bubble and continue to innovate
| in Russia like nothing happened. So no, they are not some
| poor IT company oppressed by the government. Every
| employee who wanted to emigrate was able to move abroad.
|
| 6-10 years ago Yandex could certainly have shut down their
| news service without being seized. Back in 2008-2012 one of
| Yandex's co-founders and ex-CTO Ilya Segalovich was a frequent
| visitor of street protests almost until his death in 2013
| and this did not cause the company to be seized.
| The_Colonel wrote:
| > I suppose they can't just shut it down because the
| government threatens to nationalize Yandex in response
|
| They can destroy equipment, safely delete all the code
| repositories etc. beforehand, thus rendering the company
| useless before the nationalization. But $$$ is more
| important.
|
| > Can you please show any public statement from Yandex from
| which one could derive that?
|
| Yandex pays tens/hundreds of millions in taxes and thus
| finances the war.
| xpl wrote:
| _> Yandex pays tens/hundreds of millions in taxes and thus
| finances the war._
|
| So what? You shut down the business with 20k employees, on
| the grounds that you do not agree with local regulations or
| because the government did bad? That is as far from reality
| as it gets.
|
| _> But $$$ is more important_
|
| Yeah, I think preserving the company is more important than
| that proposed suicide move (that wouldn't have worked
| anyway because the company is just too huge).
|
| It's not just money, it's people, it's culture, it's all
| the great projects the company does.
| The_Colonel wrote:
| > It's not just money, it's people, it's culture, it's
| all the great projects the company does.
|
| What about people killed by the Russian army, sponsored
| by Yandex?
|
| I guess those matter less than the company culture,
| right?
| xpl wrote:
| The Russian army is not sponsored by Yandex. The money
| comes from selling natural resources... and mostly to
| Europe, surprise. It's about $1 billion per day. Tax
| money from private companies is nothing compared to that.
| So Europe is sponsoring the war way more than Yandex.
|
| Let's then shut down the Europe, right? You can say --
| look, they're trying hard to get rid of Russian
| resources. But Yandex is also trying hard to become less
| dependent on Russian economy -- they try to
| internationalize their business. And all that "canceling"
| of Yandex really doesn't help (it does the opposite in
| fact).
| The_Colonel wrote:
| > The Russian army is not sponsored by Yandex.
|
| It is.
|
| > So Europe is sponsoring the war way more than Yandex.
|
| That's unfortunately true. The dependency is real, and it
| will take a long time to get rid of it.
|
| > And all that "canceling" of Yandex really doesn't help
| (it does the opposite in fact).
|
| Cancelling Yandex completely, as in forcing it to
| collapse, would help a lot. Yandex services (together
| with VK) are extremely important in the Russian society
| and economy, and their collapse would weaken Russia and
| its ability to wage (military/economic) war a lot. As
| such, this would be the best course of action (as
| mentioned before, burn the equipment, delete the code).
| xpl wrote:
| _> Cancelling Yandex completely, as in forcing it to
| collapse, would help a lot._
|
| It's just wishful thinking. It won't "collapse", it
| would just become controlled by government, and then it
| _truly_ becomes the instrument of the evil, so that not
| only News, but every service Yandex provides will serve
| the government needs. They will recruit soldiers through
| Yandex services, they will make Yandex develop AI-controlled
| tanks and whatnot. Every thing that Yandex doesn't do
| now (because they do not actually support the war) --
| they will make it do.
|
| _> their collapse would weaken Russia and its ability to
| wage (military/economic) war a lot_
|
| Of course not, because the Russian army and the military
| industrial complex is in no way dependent on the search
| engine and the food delivery service Yandex provides. You
| can destroy those, sure. People's lives get slightly worse,
| and then competitors catch up (there is a lot of
| competition to Yandex in Russia and they are not going to
| fade away).
| The_Colonel wrote:
| > It's just a wishful thinking. It wont "collapse", it
| would just become controlled by government
|
| That's why part of my suggestion is to burn the
| equipment/infrastructure and delete the code.
|
| > They will recruit soldiers through Yandex services,
| they make Yandex develop AI-controlled tanks and whatnot
|
| And the only thing stopping them now from doing that is
| that Yandex is not nationalized. Yeah, sure.
|
| > Of course not, because the Russian army and the
| military industrial complex is in no way dependent on the
| search engine and the food delivery service Yandex
| provides.
|
| Yandex provides many services, it's much like google -
| maps, translation, drive, mail etc. etc. Bringing it down
| would cripple many private and economic activities.
| Russia can't sustain waging wars if they don't have an
| economy and disgruntled population.
|
| With the exception of VK, there isn't really any step-in
| competition to Yandex. Even if there was, losing all your
| data in e.g. mail/drive will have significant
| consequences.
| xpl wrote:
| _> Russia can't sustain waging wars if they don't have
| an economy and disgruntled population._
|
| Russia can wage wars on natural resource selling alone,
| it only needs to keep the gas and oil flowing through the
| infrastructure. All those private companies' activities
| the government sees mostly as a distraction, it doesn't
| give a damn about them (until they get in the way). They
| don't matter much.
|
| It's very much unlike the Western economies where the
| private companies drive the economy. Russia is more like
| a giant oil and gas pipe with military industrial complex
| around that.
|
| _> exception of VK, there isn't really any step-in
| competition to Yandex_
|
| If we talk about city services (taxi, delivery, online
| shopping) there are lots of other players.
| Search/mail/social -- then yeah, apart from VK not many.
| And VK is in fact state-owned. Yandex is not. So if
| Yandex leaves the scene, the only game in town would be
| state-owned. This only reinforces the evil regime.
|
| _> That's why part of my suggestion is to burn the
| equipment/infrastructure and delete the code._
|
| It's pretty unrealistic. You can do it in small company,
| easy. In a huge decentralized company I don't know how
| one could even pull that off. There simply isn't a way to
| "delete all the code", nor a single place you could burn
| all the servers. It just doesn't have a kill switch. And
| the moment you try that, the government swoops in and
| goodbye the company.
|
| _> And the only thing stopping them now from doing that
| is that Yandex is not nationalized. Yeah, sure._
|
| If Yandex gets nationalized, the government will replace
| the management and the uncooperative employees. Most of
| them would just leave the day it happens. It won't be
| Yandex anymore of course. That is essentially the same as
| killing the company, but worse, as the remnants could
| still be used for evil.
| m00dy wrote:
| well, I can call this "the real open ai".
| JeopardyJJJ wrote:
| drno123 wrote:
| Who controls Google? What did Google do to stop the invasion of
| Iraq? Will Google take responsibility for silent support of the
| war in Iraq?
| csee wrote:
| Your comparison fails a test of facts. Yandex actively
| censors any perspective not approved by the Kremlin. Google
| does not do anything comparable to this.
| tremarley wrote:
| Google absolutely does the same thing.
| baisq wrote:
| Not to mention that Yandex does it in Russia because the
| law forces them to, while Google does it happily just to
| maintain the political status quo, of which they are a
| part of.
| SXX wrote:
| Google does not show government propaganda on search
| engine front page. If Yandex wanted to shut down their
| news aggregator they could have done it.
| ptnxlo wrote:
| Care to elaborate?
| londons_explore wrote:
| There is lots of content Google bans/hides. Copyrighted
| content, Adult content, child pornography, official
| secrets, etc.
|
| I don't think that's so different from other countries
| which also have a (partially overlapping) list of what's
| not allowed.
|
| Normally, when people think about that they say "well
| pictures of naked children are morally wrong, whereas
| talking about LGBTQ stuff is fine". But people in other
| parts of the world might have different morals and might
| think the other way around.
| waffleiron wrote:
| Also, Google specifically bans content that:
|
| - Disparage or belittle victims of violence or tragedy.
|
| - Deny an atrocity.
|
| - We don't allow content that promotes terrorist or
| extremist acts, which includes recruitment, inciting
| violence, or the celebration of terrorist attacks.
|
| Now I don't think these are bad rules, but they are rules
| that very much depend on the official narrative. A
| terrorist to one is a freedom fighter to another. These
| are rules that can be applied as wanted.
|
| https://support.google.com/websearch/answer/10622781
| [deleted]
| hgazx wrote:
| f311a wrote:
| Please ask the model to answer these questions.
| obituary_latte wrote:
| What are some use cases for something like this? I understand it
| says "generating and processing text", but is it a replacement
| for OCR? Or something else?
| vbezhenar wrote:
| Chat bots I guess. Or with voice engines - phone bots.
| jorgemf wrote:
| No, it is more like generating conversations, translating
| text, summarizing texts, writing code, etc.
| DennisP wrote:
| If I wanted to use it for summarization, what would I have to
| do?
| gwern wrote:
| Postfix "tldr:" to the text being summarized. (Even GPT-2
| could do that.)
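|
| A minimal sketch of that prompt trick, assuming a generic
| text-completion model. `make_summary_prompt` is a hypothetical
| helper name, not part of any YaLM or GPT tooling:

```python
def make_summary_prompt(text: str) -> str:
    """Append a "tldr:" cue so a completion model continues with a summary."""
    # Strip trailing whitespace so the cue sits on its own line.
    return text.rstrip() + "\n\ntldr:"
```

| The returned string is what you would feed to the model; whatever
| it generates after "tldr:" is the candidate summary.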
| londons_explore wrote:
| The download fails because the vocab file link returns HTTP
| 403... :-(
|
| https://yalm-100b.s3.yandex.net/vocab/voc_100b.sp
|
| EDIT: It seems fine if you download with a browser user agent
| rather than curl's... I guess I just got hit by some anti-bot
| thing they have accidentally turned on.
| brobinson wrote:
| curl -A Chrome -O
| https://yalm-100b.s3.yandex.net/vocab/voc_100b.sp
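|
| The same workaround as a rough Python sketch, assuming the server
| only checks the User-Agent header as described above. The helper
| names `build_request` and `download` are made up for illustration:

```python
import urllib.request

VOCAB_URL = "https://yalm-100b.s3.yandex.net/vocab/voc_100b.sp"


def build_request(url: str, user_agent: str = "Chrome") -> urllib.request.Request:
    """Build a GET request that sends a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url: str, dest: str, user_agent: str = "Chrome") -> None:
    """Fetch url with the given User-Agent and write the body to dest."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        out.write(response.read())
```

| Calling download(VOCAB_URL, "voc_100b.sp") would then mirror the
| curl command, provided the anti-bot check behaves as reported here.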
| uniqueuid wrote:
| Try opening the inspector in Firefox, selecting the download
| request and using "Copy as cURL". That gives you a working curl
| command.
| sandGorgon wrote:
| Is this the first GPT-like model which is fully open source?
| None of the others are, right?
| littlestymaar wrote:
| Aren't EleutherAI's models so?
| sandGorgon wrote:
| doesn't seem like the code is there - pretrained models are there.
| https://github.com/kingoflolz/mesh-transformer-jax/#gpt-j-6b
|
| https://huggingface.co/EleutherAI/gpt-j-6B
|
| isn't that so?
| p1esk wrote:
| The code is there: https://github.com/EleutherAI/gpt-neox
| manishsharan wrote:
| Is there a way for developers, who do not have AI/ML background,
| to get started using this ? I have been curious about GPT-3 but I
| do not have any AI/ML experience or knowledge. Is there an
| "approachable" course on Coursera or Udemy that could help me get
| started with technologies like GPT ?
| rripken wrote:
| I would not start with this model. It's impractically large.
|
| Start here: https://www.vennify.ai/gpt-neo-made-easy/
| lostmsu wrote:
| https://www.deepspeed.ai/
| amai wrote:
| Is that the model used by the Russian government to generate fake
| news?
| Destiner wrote:
| You don't need AI for that, tons of ppl here in Russia will do
| it for pennies.
| keewee7 wrote:
| Weird that Yandex developers get so much hate for being Russian
| when some of the Yandex founders and management have lived in
| Israeli settlements.
|
| The former Yandex CEO literally moved to Israel to escape Western
| sanctions.
|
| How many companies have to annually dispel rumors that they're
| moving their HQ to Israel? Yandex is shady as fuck but not
| because they're Russian.
| dTal wrote:
| What have Israel connections got to do with shadiness? What are
| you insinuating?
| dicknuckle wrote:
| Settling land that was recently taken from Palestinian
| families by force, often (literally) knocking the existing
| houses over with a bulldozer. In my opinion, that's a moral
| red flag to participate in such an atrocity.
| [deleted]
| dTal wrote:
| What has Yandex got to do with any of that? Is everyone who
| lives in a country that commits atrocities "shady as fuck"?
|
| <edit>: reading closely I see that the initial allegation
| did use the word "settlement", and indeed that would
| constitute ethically questionable behavior. However, a
| sibling comment refutes this.
| DrewADesign wrote:
| I think you're missing that the commenter was talking
| about Israeli _settlers_ rather than Israelis in general?
| The settlers are controversial because they live in areas
| Israel occupied during the 6 day war in 1967. Much of the
| world considers their presence illegal, though Israel
| disputes that. Many, if not most, would consider living
| in these settlements a deliberate political provocation.
|
| *edit: you hadn't posted your edit yet. I have no idea if
| the allegation is truthful.
|
| *edit again: Why do downvoters think I'm wrong?
| marcinzm wrote:
| Have you seen what America has done in the Middle East the
| last 20 years? If you want to make a moral point then you
| should start there instead of trying to grind whatever axe
| you have against Israel.
| maleldil wrote:
| Why does it have to be mutually exclusive? The thread is
| about Israel, so they're pointing out Israel's crimes.
| You can be critical of multiple governments
| simultaneously.
| theplumber wrote:
| >> when some of the Yandex founders and management have lived
| in Israeli settlements.
|
| So now you have one more reason to say Yandex is bad for the
| world.
| [deleted]
| huma wrote:
| Not for being Russians, but for active participation in
| censorship by tweaking their news aggregation to show only hand
| picked government approved sources
| postalrat wrote:
| Exactly what google has been doing the past year or two.
| memorable wrote:
| Is there a source for this? I'm curious.
| SXX wrote:
| Just read about Yandex.News, which displays censored news
| sources on the Yandex front page that millions of people visit.
| There is really no hidden censorship here - they just
| follow Russian law that literally whitelists exclusively
| press controlled by the state.
|
| They show Kremlin propaganda on their front page which
| makes Yandex part of the Kremlin propaganda machine. They could
| have shut down the news aggregator, but they chose not to.
| proxysna wrote:
| https://www.kommersant.ru/doc/4975254
|
| https://www.moscowtimes.ru/2022/04/18/yandeks-ubral-
| izpoisko...
|
| https://zona.media/news/2021/12/25/oiya
|
| https://www.rbc.ru/politics/07/09/2021/613709739a79476fd52e
| 1...
|
| They are complying with Russian censorship laws and it's
| gotten so bad that they are planning to sell the news
| service altogether to VK which is far worse than Yandex
| when it comes to how eager they are to enforce these laws and
| to work with cops.
|
| https://www.kommersant.ru/doc/5258943
| Reubachi wrote:
| I've no horse in this race, so please don't take this the wrong
| way,
|
| Escaping western sanctions by moving to Israel + Being a
| Russian shill are usually mutually inclusive, not exclusive
| things.
| yonixw wrote:
| Bad comment. First, can you please name the founder? Because
| according to wiki Ilya Segalovich never lived in Israel and
| Arkady Volozh lives in Tel Aviv (not a settlement). Both
| Jewish, so why present it as some "must-be-hidden-cause"
| connection with Israel?
|
| Also, nothing shady from Israel's side in terms of sanctions.
| They have a large Jewish community in both Russia and Ukraine and
| need to be on good terms with both to have their gov helping in
| supporting (or evacuating) them. Not to mention Russia has a
| heavy presence in Syria which borders Israel. A conflict with
| Russia without anything like NATO backing is out of the
| question.
| OmicronCeti wrote:
| >Elena Bunina, who is Jewish, is stepping down from her role
| as CEO of Yandex LLC, 'Russia's Google,' amid the war in
| Ukraine. Sources confirm she is in Israel and has no
| intention of returning to Russia
|
| https://www.haaretz.com/israel-news/tech-
| news/2022-04-06/ty-...
| [deleted]
| yonixw wrote:
| It doesn't say she lives in a settlement, or has any
| special treatment from Israel or influence on Israel in
| terms of sanctions.
| keewee7 wrote:
| Many people consider all of Israel to be an illegitimate
| Western colonial settler state in the Middle East.
| starik36 wrote:
| Those "many people" are likely ignorant and/or antisemitic.
| yonixw wrote:
| Then they have some explaining to do. As they are very
| wrong, from a historical/archaeological point of view [1].
|
| [1] https://en.wikipedia.org/wiki/History_of_the_Jews_and_J
| udais...
| brailsafe wrote:
| > Then they have some explaining to do. As they are very
| wrong, from a historical point of view [1]
|
| Kind of explains the general vibe of relations between
| everyone in that area historically
| VictorPath wrote:
| The relevant Wikipedia page is
| https://en.wikipedia.org/wiki/Khazars
| yonixw wrote:
| Did you read it? It literally has a "Use in antisemitic
| polemic" section. Can you have any bigger red flag?
|
| > Use in antisemitic polemic
|
| > conspiracy theorist, David Icke, who states that the
| Israelians falsely claim to be descendants of the
| Biblical Jews
|
| I don't really care about conspiracy theorists. Mainly
| because they ignore 2000 years of accepted archeology.
| nashashmi wrote:
| https://en.wikipedia.org/wiki/The_Thirteenth_Tribe#Geneti
| c_r...
|
| This led me to look up similar information. Another
| article [1] looks into this a little more deeply.
|
| I feel there is a resurgence of despising European
| dominance over the last 200 years and Israel is just
| another point here. Thus, we have material hypothesizing
| the illegitimacy of European Jews when the Jews of other
| ethnicities may have better acceptance in the region.
| (But all of this is just a vague hypothesis.)
|
| [1] https://www.science.org/content/article/tracing-
| roots-jewish...
| generalizations wrote:
| > Bad comment.
|
| >> Be kind. Don't be snarky. Have curious conversation; don't
| cross-examine. Please don't fulminate. Please don't sneer,
| including at the rest of the community.
| the_duke wrote:
| > The former Yandex CEO literally moved to Israel to escape
| Western sanctions.
|
| That somehow doesn't support the point you are trying to make
| at all...
| lotusmars wrote:
| edf13 wrote:
| Wonder what the split is between Russian and English in the
| model?
| londons_explore wrote:
| Open the vocab file (from the script in the download directory)
| and you can get a pretty good idea.
|
| Looks to be approximately 50/50 from my random scrolling
| through the list.
| f311a wrote:
| That's because English and Russian have pretty similar
| vocabulary size. Vocabulary does not reflect the size of the
| data.
| londons_explore wrote:
| In this case, it does, because the vocab is not a list of
| words, but a list of tokens. Each token _may_ be a word,
| but it might also be a phrase or part of a word. The tokens
| are generated to be optimal on the input data - ie. for a
| given vocab size to minimize the number of tokens to
| represent it.
|
| Therefore, the size of the vocab gives a good guide to the
| size of the data, since if there was 10x more English
| language data then the optimal distribution would be to
| dedicate more token space to English than Russian.
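|
| A rough sketch of how one might measure that split instead of
| eyeballing it. It assumes the tokens have already been pulled out
| of the vocab file into a list of strings (how to do that is not
| shown here), and `script_of` / `script_shares` are hypothetical
| helper names:

```python
def script_of(token: str) -> str:
    """Classify a token as 'cyrillic', 'latin', or 'other' by its letters."""
    # U+0400..U+04FF is the basic Cyrillic Unicode block.
    has_cyrillic = any("\u0400" <= ch <= "\u04ff" for ch in token)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_cyrillic and not has_latin:
        return "cyrillic"
    if has_latin and not has_cyrillic:
        return "latin"
    return "other"  # digits, punctuation, mixed-script tokens, etc.


def script_shares(tokens):
    """Return the fraction of tokens that fall into each script class."""
    counts = {"cyrillic": 0, "latin": 0, "other": 0}
    for token in tokens:
        counts[script_of(token)] += 1
    total = len(tokens) or 1  # avoid dividing by zero on an empty list
    return {name: count / total for name, count in counts.items()}
```

| Latin-script tokens are only a proxy for English, so the shares
| are an approximation of the language split at best.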
| ma2rten wrote:
| I am one of the people who worked on Google's PaLM model.
|
| Having skimmed the GitHub readme and medium article, this
| announcement seems to be very focused on the number of parameters
| and engineering challenges scaling the model, but it does not
| contain any details about the model, training (learning rate
| schedules, etc.), or data composition.
|
| It is great that more models are getting released publicly, but I
| would not get excited about it before some evaluations have been
| published. Having a lot of parameters should not be a goal in and
| of itself. For all we know this model is not well trained and
| worse than Eleuther AI's 20B parameter model, while also being
| inconveniently large.
| rllearneratwork wrote:
| Given that Yandex is a crucial part of the Russian propaganda
| arm, we should consider the whole range of possibilities from:
|
| * Good. This is great researchers helping community by sharing
| great work. (which is what I'd like to assume before I have any
| proof of the contrary)
|
| * Bad. This very expensive training has been approved by Ya
| leadership (which is under Western personal sanctions) because
| they've secretly built in RU's propaganda talking points into
| the model. Such as "war in Ukraine is not a war but special
| operation" etc.
| whimsicalism wrote:
| Should we assume language models released by Twitter have
| injected content praising Hunter Biden?
| rllearneratwork wrote:
| No. Read my message again. As I said, we should assume good
| intention first until proven otherwise.
|
| But we should have better tools to test for
| biases/toxicity. Perspective API is a great tool for toxicity
| detection. But I'm not aware of any "propaganda" detection
| tool.
| [deleted]
| javajosh wrote:
| _> this announcement seems to be very focused on the number of
| parameters_
|
| And yet your own project headline is "Pathways Language Model
| (PaLM): Scaling to 540 Billion Parameters for Breakthrough
| Performance"[0].
|
| 0-https://ai.googleblog.com/2022/04/pathways-language-model-
| pa...
| panda-giddiness wrote:
| 1. The OP did not criticize the headline; they criticized the
| content. If you read the article that you linked, you would
| find that they do, in fact, evaluate the performance of the
| model.
|
| 2. 540 billion parameters is notable for its size, which is
| likely why they lead with that particular headline.
| gwern wrote:
| The difference is PaLM was extensively benchmarked and it
| performed as well as it should, which is to say, amazingly
| well. The irony here is that you should instead be invoking
| that _other_ ~500b model, Nvidia's Megatron-530b, which was
| undertrained, only cursorily evaluated (no interest in any
| new capabilities or even examining old ones like inner
| monologues) and promptly forgotten by everyone after the
| headlines about being the largest dense model:
| https://arxiv.org/abs/2201.11990#microsoftnvidia
| MichaelRazum wrote:
| It's just crazy how much it costs to train such models. As I
| understand 800 A100 cards would cost about 25.000.000 without
| considering the energy costs for 61 days of training.
| semitones wrote:
| https://coreweave.com/ offers some of the cheapest GPU compute
| out there
| refulgentis wrote:
| 16,000,000 at MSRP
| StevenWaterman wrote:
| Lambda labs will rent you an 8xA100 instance for 3 months for
| $21,900. That would put it at around $2m
| MichaelRazum wrote:
| Still a bit too expensive for my side project ;) To be honest
| it seems only big corps can do that kind of stuff. By the way,
| if you try to do hyperparameter tuning or some exploration of
| the architecture, it becomes, I guess, 10x or 100x more expensive.
| bmcahren wrote:
| AWS has them in us-east-1 for $9.83/hr spot with 96 CPU
| cores, 1152GB of RAM, 8 A100s with 320 GB of GPU RAM, 8TB of
| NVMe, and 19 Gbps of EBS bandwidth to load your data
| quickly.
|
| https://aws.amazon.com/ec2/instance-types/p4/
|
| p4d.24xlarge
|
| An alternative is the p3.16xlarge for 8 V100s with 256GB of
| GPU RAM but you might as well get the A100s since it's only
| $0.50/hr cheaper
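|
| For scale, a back-of-the-envelope sketch using only figures quoted
| in this thread (800 A100s, 61 days, the Lambda three-month price
| and the AWS spot price). It assumes prices stay flat and that 100
| such instances are actually available, which is far from given:

```python
GPUS = 800             # A100 count mentioned in the thread
TRAINING_DAYS = 61     # training duration mentioned in the thread
GPUS_PER_INSTANCE = 8  # both quoted offers are 8xA100 machines

instances = GPUS // GPUS_PER_INSTANCE

# Lambda figure quoted in the thread: $21,900 per instance for 3 months.
lambda_total = instances * 21_900

# AWS spot figure quoted above: $9.83 per p4d.24xlarge instance-hour.
aws_spot_total = instances * TRAINING_DAYS * 24 * 9.83

print(f"Lambda, 3 months:  ${lambda_total:,.0f}")
print(f"AWS spot, 61 days: ${aws_spot_total:,.0f}")
```

| That works out to roughly $2.19M and $1.44M respectively, before
| storage, networking, failed runs, or any hyperparameter search.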
| narrator wrote:
| I love Yandex. They are the best search engine by far for
| politically controversial topics. They also release a language
| model to benefit everyone even if it says politically incorrect
| stuff. They also name their projects "cocaine", perhaps to
| prevent western competitors from using them.
|
| You look at OpenAI and how they don't release their models mainly
| because they fear "bad people" will use them for "bad stuff."
| This is the trend in the west. Technology is too powerful, we
| must control it! Russia is like... Hey, we are the bad guys
| you're talking about so who are we keeping this technology from?
| The west has bigger language models than we do, so who cares.
| Also their attitude to copyright and patents, etc. They don't
| care because that's not how their economy makes money. Cory
| Doctorow's end of general purpose computing[1] and locked down
| everything is very fast approaching. I'm glad the Russians are
| around and aren't very interested in that project.
|
| [1]https://csclub.uwaterloo.ca/resources/tech-talks/cory-
| doctor...
| risyachka wrote:
| >> They are the best search engine by far for politically
| controversial topics
|
| FYI, they are a Russian subject that follows ALL their censorship
| laws (and oh boy do they have a lot of it).
|
| >> perhaps to prevent western competitors from using them
|
| The irony here. All Yandex products are exact copies
| of western ones, adjusted to the local market.
| cpursley wrote:
| Actually they're not, some of the Yandex products are
| actually better and pretty innovative (ignoring the political
| stuff). Maps and Go are especially good. Ditto with Russian
| banking apps, they put American bank apps to shame.
| jhgb wrote:
| Wait, so you're saying it's a Russian company breaking
| Russian laws and getting away with it?
| abra0 wrote:
| >They are the best search engine by far for politically
| controversial topics.
|
| This is an interesting take given the political censorship in
| Russia (for some ineffable reason much harsher now than it used
| to be 4 months ago) and cases like
| https://twitter.com/kevinrothrock/status/1510944781492531208.
| narrator wrote:
| Search Google and Yandex for "2020 election fraud." The
| results are VERY different. The Zach Vorhies leak shows that
| Google regularly does blatant censorship for political
| purposes.[1]
|
| [1]https://www.breitbart.com/tech/2021/08/19/google-
| whistleblow...
| skrebbel wrote:
| I don't know man, "thegatewaypundit.com" as a top reputable
| source? seems to me like it's not "honest two-sided
| results" but just, well, a rather random mix of results of
| widely varying quality. Mad Altavista vibes!
|
| What I'm trying to say is that even if you believe that
| "was the 2020 US election stolen?" is worth debating, which
| it isn't, the yandex results are shit.
| narrator wrote:
| If you get all your information through mainstream
| channels, and you don't want to see anything
| contradicting those channels then you should continue to
| use Google because they explicitly implement the
| algorithms on controversial topics to prefer mainstream
| news sources[1]. What I mean by "better" in terms of
| controversial searches is that on controversial matters,
| it will rank the searches the same way it does for all
| other searches. I mean yeah, I don't have access to the
| internal code base of Yandex, but it certainly feels more
| organic.
|
| [1]https://www.breitbart.com/tech/2019/05/12/study-the-
| cnn-sear...
| zaptrem wrote:
| Why link to Breitbart of all places instead of the
| original source?
|
| https://www.cjr.org/tow_center/google-news-algorithm.php
|
| Btw Wikipedia's first few sentences on Breitbart are not
| inspiring
|
| > Its journalists are widely considered to be
| ideologically driven, and much of its content has been
| called misogynistic, xenophobic, and racist by liberals
| and traditional conservatives alike.[10] The site has
| published a number of conspiracy theories[11][12] and
| intentionally misleading stories.[13][14]
| narrator wrote:
| This is the association fallacy, which is, unfortunately,
| how most people determine what to believe these days.
|
| An absurd example of this fallacy would be, Wikipedia,
| which you cite, has articles that indicate tobacco
| smoking may cause disease. The nazis were also anti-
| smoking[1]. Therefore Wikipedia is Nazi propaganda and
| you should not trust anything on there.
|
| [1]https://www.amazon.com/Nazi-War-Cancer-Robert-
| Proctor/dp/069...
| alphabetting wrote:
| Google: 118M results. Top link is the best resource on
| verified election fraud cases.
|
| Yandex: 9M results. The top two links are pretty suspect.
| Top link promotes Dinesh D'Souza's 2000 Mules documentary
| in the banner which at best is a one-sided take on election
| fraud. At worst, very misleading.
|
| https://i.imgur.com/n5a9LOd.png
| chinathrow wrote:
| Is this sarcasm?
| jhgb wrote:
| > This is the trend in the west. Technology is too powerful, we
| must control it!
|
| I take it that you're either too young or too untraveled to be
| aware of the level of state control of technology in "the
| east". Xerographic machines, mimeographs, and other similar
| reprographic devices used to be highly controlled machinery
| behind the Iron Curtain. This is absolutely not something
| exclusive or even peculiar to "the west".
| lukestateson wrote:
| denysvitali wrote:
| I don't want to defend them, but I'm genuinely curious: aren't
| they maybe doing it because the opposite will cause them huge
| legal issues?
| lukestateson wrote:
| They can protest, they can boycott, they can disagree, they
| can tell the truth.
|
| But... They chose to obey.
|
| It's the choice that matters.
| denysvitali wrote:
| From what I've seen, telling the truth in authoritarian
| countries doesn't end up well.
|
| To the best of my knowledge, they are a Russian company -
| it's not like they can just tell the truth and move away
| from Russia that easily, so I think (and hope?) they're
| just playing a political game.
|
| What would Google do in their position? Idk
| fabrika wrote:
| They will simply have their company taken away from them.
|
| Nevertheless, they had many years before the war to start
| marking their news as 'Official'. Or sell the news service.
| They certainly could have done so. This would have solved
| their image problems.
| lukestateson wrote:
| It's already taken away.
| jeroenhd wrote:
| This is often overlooked and it's a fair point in defence of
| the people working for Yandex. You can't judge someone just
| for working for Yandex or even most Russian companies. The
| people who have voiced concern are already out of the company
| and it's perfectly reasonable that the rest would like to
| keep their jobs, especially in uncertain economic times with
| all these sanctions against Russia.
|
| However, this also implies that Yandex, as a company, cannot
| be trusted. It's not the researchers' fault, but they simply
| aren't allowed to work in a way that doesn't reinforce the
| Russian government's bias. As usual, the Russian government
| is the real villain here, but its authoritarian rule
| "infects" any company and country it has control over.
|
| It can be assumed that the people working for Yandex are also
| victims of their abusive government, but that doesn't change
| the fact that their work is unlikely to be trusted outside
| the Russian sphere of influence.
| 2a0c40 wrote:
| Any similarity to our news and search services is pure
| coincidence
| timeon wrote:
| There is hardly any.
| joshsyn wrote:
| Yandex > Google.
| lumost wrote:
| To add a voice of skepticism. The recent rush to open source
| these models may be indicative that the tens of millions
| spent training these things have relatively poor ROI. There may be
| a hope that someone else figures out how to make these
| commercially useful.
| MivLives wrote:
| We're using these at where I work (large retail site) to help
| make filler text on generated articles. Think the summary blurb
| no one reads at the top. As for why we're writing these
| articles (we have a paid team that writes them too), the answer
| is SEO. This is probably the only thing I've seen done with a
| text model in production usage. I'm not 100% sure what model
| they're using.
| BonoboIO wrote:
| Content made for machines. Probably a billion dollar
| industry.
| fab1an wrote:
| Content made for machines serving humans made by machines
| pretending to be human
| jquery wrote:
| Made by machines, for machines. It's poetic.
| tobr wrote:
| Sorry but every part of that sounds so terrible.
| jandrese wrote:
| You just know that some Amazon listings are written by GANs.
| varispeed wrote:
| I hate this so much. These tools are getting better, so often
| you realise only half way through that you are reading AI
| text. Then you have to flush your brain and take a mental
| note, to never visit that site again.
| lostmsu wrote:
| They did not publish benchmarks about quality of the models,
| which is very suspicious.
|
| I personally squinted hard when they said removing dropout
| improves training speed (which is in iterations per second),
| but said nothing about how it affects the performance (rate of
| mistakes in inference) of the trained model.
| jasonphang wrote:
| I agree that the lack of benchmarks makes it hard to
| determine how valuable this model is. But on the topic of
| dropout, dropout has been dropped for the pretraining stage
| of several other large models. Off the top of my head:
| GPT-J-6B, GPT-NeoX-20B, and T5-1.1/LM.
| HeavyStorm wrote:
| Maybe training it is not that expensive?
|
| I know from practice that it takes a really really long time to
| train even a small nn (thousands of params) , so you'll need a
| lot more hardware to train one with billions... But, it's
| expensive to buy the hardware, not necessarily to use it. If
| you, for some reason, have a few hundred GPUs lying around, it
| might be "cheap" to do the necessary training.
|
| Now, that's not your point - cost != price. But, still...
| vgel wrote:
| My guess is they're mostly vanity projects for large tech
| companies. While the models have some value, they also serve as
| interesting research projects and help them attract ML talent
| to work on more profitable models like ad-targeting.
| gfodor wrote:
| An equally plausible frame is that once a technology becomes
| replicated across several companies, it makes sense to open
| source it since the marginal competitive advantage is the
| possible resultant external network effects.
|
| I don't _know_ if that's the right way to think about the open
| sourcing of large language models. I just think we really can't
| read too much into such releases regarding their motivation.
| dandiep wrote:
| There are tons of commercial uses for these models. I've been
| experimenting with an app targeted toward language learners
| [1]. We use large language models to:
|
| - Generate vocabulary - e.g. for biking: handlebars, pedals,
| shifters, etc
|
| - Generate translation exercises for a given topic a learner
| wants to learn about - e.g. I raised the seat on my bike
|
| - Generate questions for the user - e.g. What are the different
| types of biking?
|
| - Provide more fluent ways to say things - I went on my bike to
| the store -> I rode my bike to the store
|
| - Provide explanations of the difference in meaning between two
| words
|
| And we have fine tuned smaller models to do other things like
| grammar correction, exercise grading, and embedded search.
|
| These models are going to completely change the field of
| education in my opinion.
|
| 1) https://squidgies.app - be kind it's still a bit alpha
| mumblemumble wrote:
| From what I've seen, using these huge models for inference at
| any kind of scale is expensive enough that it's difficult to
| find a business case that justifies the compute cost.
| f311a wrote:
| Yandex uses it for search and voice assistant
| Voloskaya wrote:
| Those models aren't trained with the objective of being
| deployed in production. They are trained to be used as
| teachers during distillation into smaller models that fit the
| cost/latency requirements for whatever scenario those big
| companies have. That's where the real value is.
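|
| To make "distillation" concrete, here is a minimal sketch of the
| usual soft-target loss: the KL divergence between temperature-
| softened teacher and student output distributions. This is the
| generic textbook form, not anything Yandex has described, and the
| function names and default temperature are illustrative:

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the softened output distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

| The student is trained to drive this loss toward zero, usually
| alongside the ordinary next-token loss on real data.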
| MasterScrat wrote:
| HuggingFace will soon release their BigScience model:
| https://twitter.com/BigScienceLLM/status/1539941348656168961
|
| "a 176 billion parameter transformer model that will be trained
| on roughly 300 billion words in 46 languages"
|
| So anything smaller than that will become worthless. That may
| be a factor: companies have a last chance to make a PR splash
| before it happens.
|
| Read more about it:
| https://bigscience.huggingface.co/blog/model-training-launch...
| rahidz wrote:
| Not necessarily, only ~30% of the database is in English, so
| it likely won't be as good as a smaller model trained solely
| or mostly on English words.
|
| https://bigscience.huggingface.co/blog/building-a-tb-
| scale-m...
| pembrook wrote:
| Side note: Yandex search is awesome, and I really hope they stay
| alive forever. It's the only functional image search nowadays,
| after our Google overlords neutered their own product out of fear
| over lawyers/regulation and a disdain for power users.
|
| You can't even search for images "before:date" in Google anymore.
| [deleted]
| whywhywhywhy wrote:
| Yandex Image Search today is what Google Image Search should
| have been.
|
| End of the day I'll use what actually gets the job done.
|
| Same goes for OpenAI and Google AI. If you don't actually ever
| release and let others use your stuff and end up paralyzed in fear
| at what your models may do then someone else is gonna release
| the same tech, and at this rate it seems like that'll be
| Chinese or Russian companies who don't share your sensibilities
| at all, and their models will be the ones that end up
| productized.
| jowday wrote:
| The "ethical concerns" thing is just a progressive-sounding
| excuse for why they're not going to give their models away
| for free. I guarantee you those models are going to be
| integrated into various Google products in some form or
| another.
| SXX wrote:
| OpenAI should just rebrand since nothing they do is actually
| open.
| daniel-cussen wrote:
| You know 100 years ago you could just buy uranium openly?
| Leo Szilard hustled up 200 kilograms, pleted, in the 30's.
| SXX wrote:
| What does it have to do with OpenAI branding?
|
| Their "moral" reasoning behind not publishing models is
| simply laughable because they do sell API access to them
| to anyone who can pay. And "bad guys" generally have
| money.
| sanxiyn wrote:
| They can (and do) revoke API access from bad guys. They
| can't do that to downloaded models. Look, I don't like
| what OpenAI does, but "API access, but no model download"
| _makes sense_ if you are worried about misuses.
| 2Gkashmiri wrote:
| >if you are worried about misuses
|
| why is morality brought into this? is this the same discussion of
| car manufacturers not selling cars to certain people
| because they are worried about misuse?
| sanxiyn wrote:
| Automotive companies, in fact, have product liability.
| It's about liability, not morality.
| 2Gkashmiri wrote:
| when you release a project into the wild under a
| permissive license, aren't you essentially washing
| yourself from any "liability" ?
|
| > MIT " IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
| HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
| LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
| OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
| SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE."
|
| don't commercial licenses have same/similar wording so
| what liability are you talking about?
| daniel-cussen wrote:
| So they do that because they're doing it for free.
| Otherwise they couldn't be generous with their work--
| those licenses are about permitting the generous intent.
| ggktk wrote:
| Bad actors still can get access to such models. It even
| makes them more dangerous than it would if everyone had
| access to them.
|
| Here's an alternative: progressively release better and
| better models (like 3B params, 10B, 50B, 100B) and let
| people figure out the best way to fight against bad
| actors using them.
| sanxiyn wrote:
| > It even makes them more dangerous than it would if
| everyone had access to them.
|
| This is the sort of argument that proves guns would be
| less dangerous if everyone had access to them.
| prometheus76 wrote:
| "An armed society is a polite society" - Robert Heinlein
| adamc wrote:
| "It even makes them more dangerous..." needs to be
| demonstrated, not asserted.
| SXX wrote:
| Every company out there says it will "revoke API access
| for misuse", but do they have transparency reports? Who
| do they even consider bad guys and what do they consider
| as misuse?
|
| I would be totally on their side if their reasoning was
| that they don't publish models to compete with FAANG more
| efficiently and get more income for their research, but
| this moral reasoning just sounds completely fake because
| bad actors do have funding to train their own models.
| sanxiyn wrote:
| OpenAI published "Lessons Learned on Language Model
| Safety and Misuse" in March.
| https://openai.com/blog/language-model-safety-and-misuse/
| It also promised "forthcoming publication".
|
| Examples of "real cases of misuse encountered in the
| wild" include "spam promotions for dubious medical
| products and roleplaying of racist fantasies".
|
| Yes, some bad actors can train their own models, but
| OpenAI can't do much about that either way. It is
| doubtful whether spam promoters of dubious medical
| products can, at least for a while.
| true_religion wrote:
| It would be better for misuse to be criminalized and
| taken care of by national governments, rather than leave
| it to for-profit companies to decide what is or isn't
| "misuse".
|
| Personally, I think using AI to manufacture
| advertisements on demand is misuse... but will Google
| agree with me?
| remram wrote:
| Maybe they should rename to SafeAI, if their concern is
| controlling access.
| JacobThreeThree wrote:
| Good point. The issue is not the policy per se, it's the
| fact that their name is not accurate.
| gaudat wrote:
| This reminded me of a shitpost comparing Google and Yandex.
|
| https://desuarchive.org/g/thread/78144754/#78145600
| sereja wrote:
| IMO the main reason these companies don't release their
| models is not ethical concerns but money:
|
| - NVIDIA sells GPUs and interconnect needed for training
| large models. Releasing a pretrained LM would hurt sales,
| while only publishing a teaser paper boosts them.
|
| - Google, Microsoft, and Amazon offer ML-as-a-service and
| TPU/GPU hardware as a part of their cloud computing
| platforms. Russian and Chinese companies also have their
| clouds, but they have low global market share and aren't
| cost-efficient, so nobody would use them to train large LMs
| anyway.
|
| - OpenAI are selling their models as an API with a huge
| markup over inference costs; they are also largely sponsored
| by the aforementioned companies, further aligning their
| interests with them.
|
| Companies that release large models are simply those who have
| nothing to lose by doing so. Unfortunately, you need a lot of
| idle hardware to train them, and companies that have it tend
| to also launch a public cloud with it, so there is a
| perpetual conflict of interests here.
| hdjjhhvvhga wrote:
| > Google overlords neutered their own product out of fear over
| lawyers/regulation
|
| What kind of lawyers/regulation do you have in mind? If
| anything, I'd find the opposite: lawyers and copyright holders
| should be grateful for such a tool that - when it was still
| working - allowed you to trace websites using your images
| illegally.
|
| Now they all use Yandex for this purpose, with relatively good
| results.
| rascul wrote:
| Maybe the view image link removal in 2018.
|
| https://www.theverge.com/2018/2/15/17017864/google-removes-v...
| thereddaikon wrote:
| IIRC it was mostly from groups like Getty images. They and
| other image licensing companies didn't want google showing
| their images in search results. They claimed it was copyright
| infringement and given the absolute state of IP law in the US
| they could have made Google's life very difficult.
| hdjjhhvvhga wrote:
| We're talking about reverse search, right? (Because
| "normal" image search still kind of works, it's reverse
| search that is completely broken.) In this case, you
| already have the copyrighted image, and if you find out
| that the same image is on Getty Images, then all the better
| as you can check its license. Also, it's better for GI as it
| gives them more exposure, and the kind of companies who use
| GI are very unlikely to pirate images.
| omniglottal wrote:
| Couldn't compliance with a robots.txt file have prevented
| all of this?
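|
| For reference, a minimal sketch of what such an opt-out looks
| like from the crawler's side, using Python's standard
| urllib.robotparser. The robots.txt content and the example.com
| URL are made up for illustration; Googlebot-Image is the
| user-agent token Google documents for its image crawler. This
| only shows how a compliant crawler would read the rule, not
| how Google actually handled these sites.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site could serve to opt out of image crawling.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

image_url = "https://example.com/photos/cat.jpg"  # placeholder URL

# A compliant image crawler is refused; crawlers with no matching rule are not.
blocked_for_image_bot = not parser.can_fetch("Googlebot-Image", image_url)
allowed_for_others = parser.can_fetch("SomeOtherBot", image_url)

print("image crawler blocked:", blocked_for_image_bot)
print("other crawlers allowed:", allowed_for_others)
```

| Note that robots.txt is only a request: it works when the
| crawler chooses to honor it.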
| 323 wrote:
| You misunderstood parent post. It's about Google not being
| sued for discrimination.
|
| https://www.washingtonpost.com/news/the-intersect/wp/2016/08...
|
| https://www.theguardian.com/technology/2016/apr/08/does-goog...
|
| https://www.bloomberg.com/news/articles/2021-10-19/google-qu...
|
| https://theconversation.com/googles-algorithms-discriminate-...
| hdjjhhvvhga wrote:
| Oh I see. What I'm looking for is the reason why they broke
| the reverse image search. It was working well many years
| ago but some time after that they switched it to some
| strange image classifier (I upload an image of an apple to
| find exactly the same image to track its license of origin,
| and it says "possibly an image of an apple" - oh thank you
| Google I didn't know that.)
| hooby wrote:
| Tineye works reasonably well, for finding exactly the
| same image (including different resolutions, crops, etc.)
|
| https://tineye.com/
| tablespoon wrote:
| > Tineye works reasonably well, for finding exactly the
| same image (including different resolutions, crops, etc.)
|
| Tineye is definitely better than Google with crops, etc.
| Google reverse image search seems to have more data, but
| it seems much less able to recognize even basic
| modifications to the input.
| busymom0 wrote:
| Do they at least tell you the type of Apple it is?
| tablespoon wrote:
| > You misunderstood parent post. It's about Google not
| being sued for discrimination.
|
| Who's suing them and on what grounds? If they made changes,
| it's probably for PR reasons, not legal ones.
|
| Also not all of these seem "fixed" e.g.:
|
| > https://www.theguardian.com/technology/2016/apr/08/does-goog...
|
| Article from 2016, but results look very similar today:
| https://www.google.com/search?q=unprofessional+hair&source=l...
| visarga wrote:
| They used to have in their AI ethics department some of
| the most anti-AI progressives. They picked on everything
| - biased training data, discriminatory usage, consuming
| too much energy to train, models are just stochastic
| parrots, etc. while forgetting to mention any effort to
| mitigate the problems (of course these are real concerns
| and are under intense research). Now these critics are
| fired, but Google must have learned to fear them.
|
| If they let everyone use the latest models, critics could
| uncover ugly biases in 10 minutes. Then Google would have
| to do damage control. These models are very suggestible.
| You can induce them to make fools of themselves.
| psyc wrote:
| I regularly use it for a sample of what Google and Bing are
| intentionally omitting.
| whoami_nr wrote:
| FWIW, https://same.energy/ seems to work fine for me
| jeanlucas wrote:
| A 500 days old product in beta? I hope they do well.
| Kye wrote:
| Extended betas used to be Google's thing.
| memorable wrote:
| I agree with this. When I was still addicted to porn, Yandex
| Image was the only one that seemed to find relevant and useful
| links.
| upupandup wrote:
| Does anybody want to crowd fund the training?
| qwertywert_ wrote:
| Already trained, but still need some ~200GB GPU mem to run the
| model.
| schizo89 wrote:
| I hope one day it will be possible to run this kind of model at
| home.
| lannisterstark wrote:
| I was about to comment exactly the same thing. Stuff like this
| makes me feel so much behind because there's no way I can run
| this lol.
| f6v wrote:
| The hardware they mention can be rented from cloud
| providers. It's just that it's not very cheap.
| irthomasthomas wrote:
| I think that unlikely. Barring some breakthrough that takes us
| beyond the limits of silicon.
| haswell wrote:
| Couldn't the same thing be said about most things we do on
| our phones these days?
|
| Won't incremental advancement cover this eventually? (i.e. no
| major breakthrough required, just patience).
| Akronymus wrote:
| Well, it used to be impossible to render on anything not a
| mainframe in a reasonable time.
|
| The day will come when we will be able to.
| rocgf wrote:
| When it will be possible to run this at home, the big companies
| will have models way bigger than this...
| rapnie wrote:
| Or maybe the AI will own big companies that build bigger
| models for it. /s
| albertzeyer wrote:
| If your disk has enough space to store the model, I think in
| theory you could run them, using the disk to store states. But
| it will be slow. I'm not sure how slow though, and also if
| anyone has implemented this. It actually should not be too
| difficult.
| redox99 wrote:
| Disk makes no sense considering RAM is pretty cheap. But even
| then RAM is way too slow (and the communication overhead way
| too high). You probably get like a 100x slowdown or more.
| lostmsu wrote:
| I think you are overestimating compute and I/O for this
| model. If you assume it is RAM bandwidth bound, with a
| single channel top DDR4 you will get inference time as a
| low multiple of 8 seconds (200GB / 25GB/s). In a workstation
| you can have 8 channels.
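|
| A rough sketch of that arithmetic, assuming inference is purely
| RAM-bandwidth bound, that every weight is read once per pass,
| and that bandwidth scales linearly with channel count (all
| simplifications; the function and constant names are just for
| illustration):

```python
GB = 10**9  # decimal gigabytes, matching the figures in the comment


def seconds_per_pass(model_bytes, bytes_per_second_per_channel, channels=1):
    """Time to stream all weights from RAM once, ignoring compute and caches."""
    return model_bytes / (bytes_per_second_per_channel * channels)


MODEL_BYTES = 200 * GB   # model size quoted in the thread
PER_CHANNEL = 25 * GB    # assumed bandwidth of one fast DDR4 channel, per second

single_channel = seconds_per_pass(MODEL_BYTES, PER_CHANNEL)
eight_channel = seconds_per_pass(MODEL_BYTES, PER_CHANNEL, channels=8)

print(f"1 channel:  {single_channel:.1f} s per pass")
print(f"8 channels: {eight_channel:.1f} s per pass")
```

| Real numbers will be worse than this lower bound, since it
| leaves out compute time and NUMA effects.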
| justinlloyd wrote:
| 12-channels in mine. 24-channels on some configurations,
| though I think that is the upper limit at this time, with
| a maximum density of 512GB per channel.
| lostmsu wrote:
| Is it multisocket?
| cal85 wrote:
| Speaking of which... I built a gaming PC a few years ago but I
| never use it these days. I want to install Linux on it and
| start playing around with machine learning.
|
| Can anyone recommend any open source machine learning project
| that would be a good starting point? I want one that does
| something interesting (whether using text, images, whatever),
| but simple/efficient enough to run on a gaming PC and see some
| kind of results in hours, not months. I'm not sure what I want
| to do with ML yet, I just know I'm interested, and getting
| something up and running is likely to enthuse me to start
| playing and researching further.
|
| My spec is: GeForce RTX 2080 Ti (11GB), a 24-core AMD Ryzen
| Threadripper, and 128GB RAM. I'd be willing to spend on a new
| graphics card if it would make all the difference. I am a
| competent coder and familiar with Python but my experience with
| ML is limited to fawning over things on HN. Any recommendations
| gratefully received!
| schizo89 wrote:
| I would recommend auditing Stanford courses in following
| order:
|
| 1. CS231n Machine Vision
| https://www.youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1T...
|
| 2. CS234 Reinforcement Learning
| https://www.youtube.com/watch?v=FgzM3zpZ55o&list=PLoROMvodv4...
|
| 3. CS330 Meta Learning
| https://www.youtube.com/watch?v=0rZtSwNOTQo&list=PLoROMvodv4...
|
| Those will get you on track with general concepts about
| reasoning, AI engineering and concepts of learning itself
|
| Language models are a bit of a headache for me because they're in
| a different domain, at the intersection with linguistics and
| humanities, but here's a good course:
|
| https://www.youtube.com/watch?v=rha64cQRLs8&list=PLoROMvodv4...
|
| Those are all free and high-quality but require a lot of
| brain power
| Hendrikto wrote:
| If you live in a datacenter, it already is!
| alexpotato wrote:
| You already have access to thousands of machines now from your
| home computer.
|
| Naval Ravikant put it best here:
| https://twitter.com/naval/status/1002106977273565184
| kome wrote:
| I agree, yandex is a great search engine
| lotusmars wrote:
| Well you can have it in the West. We'd prefer something
| separate from Kremlin.
| zeofig wrote:
| We can have what, a great search engine? Maybe if you have a
| time machine to 2003
| lotusmars wrote:
| Just take it to America from us, thank you. Along with VK.
| Great search engine and a social network. Full of backdoors
| for thugs and corrupt police, censorship and other lovely
| stuff... but you'll probably say that Google is full of it
| too, because you had no experience of living in Russia.
| honkler wrote:
| and you haven't had experience of living in the states.
| dash2 wrote:
| This comment needs expansion. Tell us your experience of
| police brutality and corruption in the US.
| honkler wrote:
| Lmao. Just read the news my man.
| speed_spread wrote:
| 2003 certainly didn't have a better search engine. It only
| had a much smaller, open and un-SEO-biased Internet, making
| the indexing job correspondingly easier.
| dang wrote:
| We detached this subthread from
| https://news.ycombinator.com/item?id=31847666 since it turned
| into a tedious generic flamewar.
| bgandrew wrote:
| no it's not. they straight up serve kremlin, promoting kremlin
| fake news and silencing russian opposition (not much to silence
| but still). they can have whatever functionality they like, I
| still won't use it in billion years.
| joshsyn wrote:
| What a hyperbolic emotional liar incapable of reasoning...
| aeyes wrote:
| They sold the news platform, it looks like a step towards
| having their company less associated with Kremlin content
| moderation:
|
| https://www.reuters.com/technology/yandex-sells-news-content...
|
| They do business in other countries and for that it is best
| for the business to appear as neutral as possible. We don't
| know how much they fiddle with the search results and ranking
| but this still looks quite neutral to me:
| https://yandex.com/search/?text=russo+ukrainian+war
| carlhjerpe wrote:
| You know that some people use search engines for other things
| than news right?
| ok123456 wrote:
| yes russia bad. thank you for your contribution to this
| discussion.
| squarefoot wrote:
| Every country is or has been bad in some context in
| different times, because doing the interests of your
| country often translates into doing harm to some others.
| Yandex is a really nice search engine and I agree it's
| excellent for image searches compared to Google results
| polluted with Pinterest links and other cancerous SEO
| rubbish. But does Yandex echo propaganda for the Kremlin?
| Yes of course, as do Google and most of the others for
| their advertisers and governments, albeit to some different
| degrees. The usual approach when someone or some company
| with a controversial public image does something good with
| apparently no strings attached should be "Timeo Danaos et
| dona ferentes", that is, take the gift but don't trust
| them, no matter if they're called Google, Microsoft, Yandex
| or whatever. Their purpose is of course to associate the
| Yandex brand, and therefore Russia, to something perceived
| as good, have more people use it, so that more users will
| be exposed to their filtered news. Just be aware of that,
| take the good and ignore the rest.
| timeon wrote:
| > yes russia bad
|
| that is the point
| cmsj wrote:
| Yeah we definitely shouldn't worry about the political
| sympathies/vulnerabilities of the web services we use as
| the foundations of our shared knowledge...
| ok123456 wrote:
| Do you have the same level of concern about the leverage
| Five-Eyes intelligence agencies have over Facebook and
| Google?
| lotusmars wrote:
| There's a world of difference between Five-Eyes and being
| harassed, mobbed, jailed, having a "Z" and "traitor"
| spray painted on your apartment door or being murdered.
|
| Conflating those two clearly means you don't
| understand what's going on in Russia and its
| Putin-controlled satellites like Belarus.
| carlhjerpe wrote:
| There's a world of difference between living in Russia
| and using Yandex to search for how to kill Putin and
| living in the west and using Yandex to search for how to
| spin up a FastAPI server.
|
| Conflating those two clearly means you don't
| understand that everyone isn't in the same situation as
| yourself.
| colordrops wrote:
| Julian Assange anyone? Gary Webb? Michael Hastings?
|
| And you've got Abby Martin and Chris Hedges who've had
| much of their content removed by YouTube. Chris Hedges is
| even a Pulitzer prize winner.
| marshray wrote:
| You've named three people, and some YouTube videos.
|
| On the other side, the FSB has deported 1.3 million
| innocent Ukrainian civilians to concentration camps.
| (number is from official Russian sources)
| lizardactivist wrote:
| How do you feel about western-owned web services?
| lotusmars wrote:
| People in Russia felt much safer using iCloud, Gmail or
| Google Drive. Of course they comply with some requests by
| Kremlin or police. But Yandex or VK just give information
| straight away often times without much procedure.
| honkler wrote:
| the same way I feel much more comfortable using Yandex in
| the United States. Google and Facebook feed their data to
| NSA.
| riku_iki wrote:
| Any proof of that?
| [deleted]
| ChuckNorris89 wrote:
| Snowden's leaks are not enough for you?
| riku_iki wrote:
| What proof Snowden provided about Google and FB feeding
| data to NSA exactly?
| waffleiron wrote:
| https://en.wikipedia.org/wiki/PRISM#The_slides
| riku_iki wrote:
| There is no clarity on these slides if collection
| happened proactively or it was a way to transfer
| information for FISA warrants.
| waffleiron wrote:
| You asked for proof of the following:
|
| > Google and Facebook feed their data to NSA.
|
| We know that at least some companies were ordered to
| handover all data, continuously [1].
|
| edit: I think we have enough evidence that I would assume
| that it's valid for the other companies on the slides,
| and if it's not true you'll have to provide some proof of
| that.
|
| edit 2: [2]
|
| > It searches that database and lets them listen to the
| calls or read the emails of everything that the NSA has
| stored, or look at the browsing histories or Google
| search terms that you've entered, and it also alerts them
| to any further activity that people connected to that
| email address or that IP address do in the future."
|
| > Greenwald explained that while there are "legal
| constraints" on surveillance that require approval by the
| FISA court, these programs still allow analysts to search
| through data with little court approval or supervision.
|
| > "There are legal constraints for how you can spy on
| Americans," Greenwald said. "You can't target them
| without going to the FISA court. But these systems allow
| analysts to listen to whatever emails they want, whatever
| telephone calls, browsing histories, Microsoft Word
| documents."
|
| > "And it's all done with no need to go to a court, with
| no need to even get supervisor approval on the part of
| the analyst," he added.
|
| edit 3:
|
| > Equally unusual is the way the NSA extracts what it
| wants, according to the document: "Collection directly
| from the servers of these U.S. Service Providers:
| Microsoft, Yahoo, Google, Facebook, PalTalk, AOL, Skype,
| YouTube, Apple." [3]
|
| [1] https://www.theguardian.com/world/2013/jun/06/nsa-phone-reco...
|
| [2] https://abcnews.go.com/blogs/politics/2013/07/glenn-greenwal...
|
| [3] https://www.washingtonpost.com/investigations/us-intelligenc...
| riku_iki wrote:
| You brought two links on:
|
| - phone calls surveillance in Venezuela: no Google no FB
| mentioned
|
| - plain words of some reporter without any evidence
| provided, no Google no FB mentioned
| waffleiron wrote:
| Weird how there is limited hard evidence of a secret,
| illegal government program... It's a lot more evidence than
| I've seen for the claims of Yandex proactively
| sharing data with the Russian government.
|
| Also where do you see Venezuela?
| riku_iki wrote:
| So, no proof, no evidence. Ok.
|
| > It's a lot more evidence than I've seen for the
| claims of Yandex proactively sharing data with the
| Russian government.
|
| The difference is that checks and balances are much
| stronger in US, and such activities can be successfully
| investigated and government sued.
|
| As an example, your verizon case was successfully
| challenged:
| https://en.wikipedia.org/wiki/Klayman_v._Obama
|
| In Russia, court system works in manual mode from
| Kremlin.
|
| > Also where do you see Venezuela?
|
| I misread, you are right.
| waffleiron wrote:
| > The difference is that checks and balances are much
| stronger in US,
|
| You say that after we were talking about the NSA
| literally spying on US citizens, and without any proof?
| C'mon, are you really going to badger me about not
| having the exact "hard evidence", and not even read my
| sources or provide ANY evidence yourself.
|
| edit: Yes, it got challenged AFTER needing to be leaked
| by a whistleblower that still can't return to his home.
| riku_iki wrote:
| > Yes, it got challenged AFTER needing to be leaked by a
| whistleblower that still can't return to his home.
|
| Good chance is that whistleblowing would be protected in
| this specific case.
| waffleiron wrote:
| Yes no,
|
| > Snowden was charged with theft, "unauthorized
| communication of national defense information" and
| "willful communication of classified communications
| intelligence information to an unauthorized person,"
| according to the complaint. The last two charges were
| brought under the 1917 Espionage Act.
| ok123456 wrote:
| The Espionage Act has no whistleblower protection. If the
| courts were allowed to rule honestly and without political
| entanglements, there's no way the Espionage Act is
| constitutional prima facie.
| riku_iki wrote:
| "Yes no" what?
|
| Government can charge him with whatever they want, it is
| up to court to decide if charges are valid.
| ChuckNorris89 wrote:
| _> So, no proof, no evidence._
|
| Do you really expect the US government to literally
| publish their illegal surveillance operations on
| Wikipedia as proof?
|
| Snowden's leaks and his statements should be enough to
| understand the big-tech surveillance apparatus aids the
| government under the table.
| dekhn wrote:
| You misunderstand. The NSA went out of their way to tap
| Google's lines outside of the US, which made the
| leadership at Google _furious_. It accelerated the work
| to encrypt international fiber (I think many people were
| really bothered by the tcpdump of a bigtable RPC
| containing a user ID). I was at a conference shortly
| after and saw an SVP rip an NSA rep to pieces.
|
| If Google is doing anything that is required of them
| legally as a US corp, I don't have a problem with that.
| waffleiron wrote:
| That's what Google claims, however the leaked slides
| claimed "direct access".
|
| edit: Does it really matter if they setup an FTP server
| instead of direct access, when we know a request can
| literally ask for "all" data (see Verizon).
|
| > When required to comply with these requests, we deliver
| that information to the US government -- generally
| through secure FTP transfers and in person," Google
| spokesman Chris Gaither told Wired, among other news
| outlets. [1]
|
| [1] https://www.theatlantic.com/technology/archive/2013/06/googl...
| dekhn wrote:
| right, you're discussing the mechanism by which Google
| shares information with the US government- when required
| by law.
|
| These systems don't give access to "all" data. Telephone
| companies are different- AT&T had a long standing, off
| the books agreement with US intelligence agencies (see
| Idea Factory for a fact-based discussion of what AT&T
| did) to share large amounts of information illegally.
| simonh wrote:
| It's not that Russia Bad, it's that if you know a search
| engine will serve you censored, biased results that makes
| it an unreliable search engine.
| Thiez wrote:
| And someone upthread claimed their image search was so
| great in comparison to google... because google also
| censors their results. They just censor different things.
| lotusmars wrote:
| carlhjerpe wrote:
| People using duckduckgo bangs often use different search
| engines for different topics.
|
| I usually try ddg first, if it's tech I use Bing, if it's
| local I use Google.
| ushakov wrote:
| Yandex users already assume it is censored and biased
|
| they continue to use it however, because it gives them
| the expected results most of the time
| daniel-cussen wrote:
| Compared to whom!? Who will serve you uncensored,
| unbiased results!? Like run your own crawler, dude grep,
| go to the library, come on!
| eloff wrote:
| Only if you're searching for censored things.
| afroboy wrote:
| Literally what Google is doing in favor of the USA.
| relaunched wrote:
| Huge difference. Google does it for money. Yandex does it
| to enable an autocracy and to maintain their ability to
| operate.
| honkler wrote:
| and how do you know google does not do it to maintain
| their ability to operate? That's the whole point of deep
| state, no?
| turdit wrote:
| colordrops wrote:
| You're going to get downvoted, but Eric Schmidt worked
| regularly with the state department, and google employees
| were involved in spurring the color revolutions.
|
| Julian Assange detailed this in a newsweek article before
| his name and body were smeared into the ground:
|
| https://www.newsweek.com/assange-google-not-what-it-seems-27...
|
| Oh, but they say he's not trustworthy, or that it's a
| conspiracy theory that he was intentionally smeared. Well,
| the CIA and their contractors have been doing it for over a
| decade, even before he was unfairly accused of helping
| trump:
|
| https://arstechnica.com/tech-policy/2011/02/the-ridiculous-p...
|
| Google is an arm of the state department, no doubt.
| azinman2 wrote:
| I think you ran out of tinfoil this one is so large.
| spaniard89277 wrote:
| Yeah first thing we hear of US gov using tech companies
| for espionage and data mining. So much tinfoil yadda yadda
| nosianu wrote:
| I doubt that anything like this happened to Google execs in
| the US:
|
| "Putin's agents reportedly threatened a top Google
| executive in Moscow with a 24-hour ultimatum - Take down
| Russia protest vote app or go to prison" --
| https://www.businessinsider.com/russia-agents-threatened-goo...
|
| Not yet at least, the political climate may deteriorate to
| that point, especially when it's about elections, given
| recent revelations.
|
| Still, at least right now it looks to me - and I have
| visited Russia and Ukraine several times in the past and
| still have indirect connections (to people heavily involved
| in business there) - that there still is considerably more
| freedom from the government and its wishes for people and
| companies in the West.
|
| If you publicly criticize a US politician you may get some
| hate messages, but at least they are from private citizens
| and you don't have FBI agents knocking on your door
| threatening you with prison. In Germany some rogue police
| were found to send threatening messages, but as soon as it
| was discovered the government acted against it. Also in
| Germany there even were public rallies from pro-Russian
| folks, now try that in Moscow with pro-Ukraine banners...
| Russia even bans the colors yellow and blue, even when they
| have nothing whatsoever to do with Ukraine and are just
| decorative: "Russians Strip Yellow and Blue From the
| Nation's Streets Over Ukraine War" --
| https://www.themoscowtimes.com/2022/04/27/in-photos-russians...
| waffleiron wrote:
| > you don't have FBI agents knocking on your door
| threatening you with prison.
|
| Correct, it's DHS.
|
| https://twitter.com/_secondthought/status/133274617257067725...
| bpodgursky wrote:
| You know people online can just say things, right?
| ginjas wrote:
| >You know people online can just say things, right?
|
| You know that if this was FSB instead of DHS, and it was
| me saying instead of you, you would be calling me a
| 'Russia Shill' or a 'KGB agent', right?
| bpodgursky wrote:
| I reject the false equivalence of the DHS and FSB. Not
| gonna both-sides this, sorry.
|
| Russia is an authoritarian militaristic state, and the US
| is a flawed liberal democracy. So yes, my priors are
| different. That's life.
| gre wrote:
| The United States has over 750 military bases in 80
| countries. Which state is the militaristic state?
| ginjas wrote:
| >I reject the false equivalence of the DHS and FSB. Not
| gonna both-sides this, sorry.
|
| lmao mkay. Not identical, but very similar. It's not even
| 'Alex Jones'-tier to say this. I think you forget where you
| are if you are under the US (or even NATO). YOU WILL hear
| propaganda from your side, as the Russians do. It's
| NORMAL. We live under control of a hegemon with self-
| interests.
|
| Do I have to remind you of these? And tell me the
| difference between these and Russian spookery:
|
| >Assange was being hunted down by the US worldwide
| (https://diem25.org/exactly-10-years-ago-wikileaks-released-a...)
|
| >https://en.wikipedia.org/wiki/Operation_Mockingbird
|
| >https://en.wikipedia.org/wiki/Operation_Northwoods
|
| >https://en.wikipedia.org/wiki/PRISM
|
| I could go on, but the point was made already.
|
| Edit: So yes, it's actually 'both sides'
| dang wrote:
| Would you please stop posting flamewar comments and using
| HN for ideological battle? We ban accounts that do those
| things, and you've already been doing it repeatedly.
|
| If you wouldn't mind reviewing
| https://news.ycombinator.com/newsguidelines.html and
| taking the intended spirit of the site more to heart,
| we'd be grateful.
| bpodgursky wrote:
| Yeah, you could go on, but you get paid per post, not per
| word.
| ginjas wrote:
| Unmasked as a shill for saying that great powers engage
| in propaganda, false flagging and dissent crushing.
|
| I have no skin in the game. War, no war, it doesn't
| matter to me the outcome of this war to be honest.
|
| EDIT: But if you are all moral highground, answer me
| this:
|
| Why did the US goad Ukraine into taking a hostile stance
| against a neighbouring (and somewhat rival) great power?
| Was this in the interest of Ukrainians? Or in the
| geopolitical interests of the US?
| https://www.youtube.com/watch?v=93eyhO8VTdg
| bpodgursky wrote:
| I don't accept that you can make this case:
|
| "Why did you, W, goad X into expressing their sovereignty
| against Y? Didn't you know that Y would react with
| violence? That makes W the bad guy"
|
| No, Y is always on the wrong side; you can't use the
| threat of violence and then claim via realpolitik that
| the other side was in the wrong. "Moral high ground"
| means you act out of principle, not political
| convenience. In this case, Ukraine didn't want to be in
| the Russian sphere, so we supported them.
|
| And now yeah, the US is paying a lot of money and
| inconvenience to support Ukraine. Gas will be more
| expensive, we're spending tens of billions on weapons.
| But that's because it's the right thing to do; not every
| decision is a realpolitik game about maximizing revenue
| from vassal states (which I hope Russia will learn
| someday).
| ginjas wrote:
| what I meant for the goading part was this:
|
| Ukraine has self interests. Everyone has. But not
| everyone can actualize those, due to reality. The reality
| is that Ukraine neighbours a powerful hegemon.
|
| Since international relations are anarchistic (due to there
| not being a supra-entity that has authority over states
| [authority!=international courts bullsh*]), Ukraine
| hasn't any right (to its sovereign, that does not exist)
| to be sovereign. It has to go out and look for itself.
|
| Ukraine thought it had the US/NATO at its back, and that made it
| act in a more reckless way (kind of when you rely on your
| big brother type stuff). It escalated 'till it decided it
| wanted to join NATO. It was goaded.
|
| >you can't use the threat of violence and then claim via
| realpolitik that the other side was in the wrong.
|
| who says? That's your problem. You lack the 'anarchistic'
| framework of geopolitics.
|
| Now, realpolitik-wise, Ukraine's self-interests (of being
| more independent of Russia thru NATO) did clash with
| Russia's self-interests of being safe (and probably gave
| Russia an expansionary Casus Belli).
|
| I feel that the US triggered and amplified the war, thru
| regime change in Ukraine (yep, maidan was a coup),
| recognizing aspirations of UA to NATO, and making
| Zelenskyy too comfy to be more harsh in negotiations
| (where he had no leverage, cuz Ukraine's power is small
| vs Rus.), which ultimately resulted in unnecessary
| deaths, just for the purpose of sphere of influence
| expansion.
|
| >so we supported them.
|
| Even if it's reckless and could trigger something like
| this?
|
| Also, I will play the 'reversed roles card' again. This
| time with a REAL example. Cuba. Was. The. Same. Thing.
|
| That's why this
| https://en.wikipedia.org/wiki/Operation_Northwoods and
| this https://en.wikipedia.org/wiki/Monroe_Doctrine.
|
| US has the same pattern as Russia. It's actually
| incredible how close these are.
| dang wrote:
| You broke the site guidelines badly here. The rules apply
| regardless of how wrong another comment is or you feel it
| is. We've had to warn you about this kind of thing a lot.
| If you keep doing it, we're going to end up having to ban
| you, so please stop.
|
| https://news.ycombinator.com/newsguidelines.html
| mjhay wrote:
| Not everyone who questions US foreign policy is a paid
| Russian agent.
| bpodgursky wrote:
| No. But most 2-week-old accounts are.
| waffleiron wrote:
| I don't think it's that far-fetched that DHS shows up to
| someone who is opposed to the US government and is a
| self-identifying communist.
|
| The same DHS who bans immigrants that are or have been
| members of a communist party [1].
|
| [1] https://www.uscis.gov/sites/default/files/document/po
| licy-ma...
| roenxi wrote:
| > I doubt that anything like this happened to Google execs
| in the US:
|
| It seems plausible; we don't know what gets done under
| the FISA court but it would presumably involve companies
| like Google. Some suited agent of the US government
| turning up at Google HQ and threatening jail time under
| some FISA warrant if some pro-Trump something doesn't
| disappear off Google.
|
| That'd be a scandal but not the worst abuse of the secret
| court system. It hasn't exactly covered itself with glory
| since inception. They already spy on basically everyone
| and that is a lot worse than some light censorship.
| capdeck wrote:
| > Take down Russia protest vote app or go to prison
|
| What about Canadian truckers? Didn't Trudeau call them
| terrorists and take their trucks, donations, bank
| accounts and driver licenses... There is no right to
| protest anywhere, don't kid yourself.
| marshray wrote:
| The Canadian truck protesters were allowed to shut down
| the center of the city, blast their horns 24 hours a day,
| and shut down a major international trade route. They
| were permitted to do this for _weeks_ before the citizens
| got sick of it and demanded action from their government.
|
| They gave protest a bad name.
|
| Your conclusion that "There is no right to protest
| anywhere" is simply ridiculous.
| ginjas wrote:
| >Your conclusion that "There is no right to protest
| anywhere" is simply ridiculous.
|
| BLM rioters did this, and more. Violence + Property
| damage + Corporate Backing + gov backing. They didn't
| have their donation money seized, and met almost no
| resistance from efforts to establish order.
| d23 wrote:
| I would assume this could go unsaid, but apparently it
| needs to be said somewhere in this thread: there is zero
| comparison between the US and an autocratic dictator who
| attempts to kill and then jails his opposition, runs
| fraudulent elections, kills journalists, and invades
| sovereign countries. Zero. None. Zero.
|
| Zero.
|
| Get it?
|
| None.
|
| Zero.
| ginjas wrote:
| I mean, if your only criterion is the intensity of the
| quality, then kinda.
|
| But... :
|
| (persecuted journalist)
|
| >https://www.bbc.com/news/uk-61839256
|
| >https://www.amnesty.org/en/latest/news/2013/07/usa-must-
| not-...
|
| (jails its opposition)
|
| >https://eu.usatoday.com/storytelling/capitol-riot-mob-
| arrest...
|
| >"hatespeech" (newspeak)
|
| >fraudulent elections: funny how the concerns that 'half'
| of the US had with elections were dismissed. Especially
| when conditions were different, by using a method usually
| agreed (until now, cuz narrative) to be prone to
| tampering. So much for free and fair elections.
|
| So _Zero_ huh?
| ushakov wrote:
| can you name any Russian company that doesn't?
|
| obeying the Kremlin is just an aspect of running a
| business in Russia
|
| the only option would be not to operate in Russia at all.
| Yandex can't do this, because their audience is primarily in
| Russia
| lotusmars wrote:
| Well they've made their choice and silenced our protest and
| opposition, and later spewed pro-war anti-Ukrainian
| propaganda using the country's largest media outlet
| (Yandex News).
|
| If you're profiteering from our suffering and choose
| Kremlin's needs over ours, then don't be surprised when we
| tell you to shove your AI models and your search.
| joshsyn wrote:
| Who's we? I am not in this together. So change it to "I".
| I don't care about you lot... lmao
| ushakov wrote:
| they're selling Yandex News to VK (Mail.ru)
| lotusmars wrote:
| It's still working as usual and they announced the
| transition after 8 years of warmongering and blacklisting
| all opposition resources. And only when sanctions hit.
|
| Now they scramble to present a whitewashed image to the
| Western public. They will probably put themselves forward
| as great contributors to open source.
| [deleted]
| orbital-decay wrote:
| Not really, it's very different for Yandex in particular.
| Along with several other companies like Vimpelcom, they
| started the "Safe Internet League", an organization which
| exploited the _think of the children_ argument to build the
| censorship regime from scratch. They practically created
| the original censorship laws, or participated in the
| creation, when they were in the best position to resist the
| government (and had the incentive to do so). As an example,
| Telegram successfully resisted the censorship while having
| _much_ less leverage, much later.
|
| Of course Yandex likes to pose as the victim of censorship,
| but the truth is that they are the censors themselves.
| They've been steamrolled by a runaway process they helped
| to create.
| blackhaz wrote:
| Which doesn't excuse them at all. They are full-on
| supporting the war machine and bloodshed and bear
| responsibility.
___________________________________________________________________
(page generated 2022-06-23 23:00 UTC)