[HN Gopher] Anthropic's 100k context is now available in the web UI
       ___________________________________________________________________
        
       Anthropic's 100k context is now available in the web UI
        
       Author : jlowin
       Score  : 207 points
       Date   : 2023-05-15 14:29 UTC (8 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | emptysongglass wrote:
       | Any magic tricks to gaining access apart from waiting for months?
       | I've been using GPT-4 and love it but would really love to test
       | that 100k context window with long running chatbots.
        
         | og_kalu wrote:
         | Claude-Instant-100k is available on Poe.com (but only usable as
         | a paying subscriber). Claude-plus-100k isn't up yet but I'm
         | guessing that's a matter of time.
        
           | dmix wrote:
           | Nice to see Poe is an actual iOS app for AI chat. Using
           | ChatGPT via the Home Screen "app" is extremely frustrating
           | because it logs you out constantly (maybe due to using Google
           | to auth).
        
             | systemsignal wrote:
             | If you're using google login, use a chrome shortcut.
             | 
             | Should keep you logged in for longer and easier to log back
             | in.
        
               | hackernewds wrote:
               | what is a chrome shortcut?
        
             | costco wrote:
             | I don't have any evidence but I think it's probably done on
             | purpose to make amateur automated free ChatGPT use more
             | annoying.
        
               | dmix wrote:
               | But I have plus :(
        
               | visarga wrote:
                | Every other time I switch back to the ChatGPT tab it
                | requires a re-login. That's bad UX.
               | 
               | Also, there is no way to search the history. The sidebar
               | only shows titles, not contents. I have to click each one
               | to see what's inside. I can't scroll much because it
               | loads more only when I click. I ended up exporting the
               | conversations and converting JSON to txt.
               | 
               | Another issue: editing a long past message makes it
               | scroll up and hide the cursor if the message is longer
               | than one screen. I have to type in another editor and
               | then copy&paste the whole text. The typing experience is
               | poor.
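The export-and-convert workaround in the last paragraph can be sketched in a few lines. This is a minimal sketch assuming the 2023-era ChatGPT export layout (a JSON array of conversations, each carrying a `title` and a `mapping` of message nodes); the field names are assumptions from that format, not a documented API, and node ordering inside `mapping` is not guaranteed to be chronological.

```python
import json

def export_to_txt(export_json: str) -> str:
    """Flatten a ChatGPT-style export into searchable plain text.

    Assumed schema: a list of conversations, each with a 'title' and a
    'mapping' of message nodes; nodes without a message are skipped.
    Ordering follows mapping insertion order, not the conversation tree.
    """
    conversations = json.loads(export_json)
    lines = []
    for conv in conversations:
        lines.append(f"=== {conv.get('title', 'Untitled')} ===")
        for node in conv.get("mapping", {}).values():
            msg = node.get("message")
            if not msg:
                continue
            role = msg["author"]["role"]
            parts = msg.get("content", {}).get("parts", [])
            text = "\n".join(p for p in parts if isinstance(p, str)).strip()
            if text:
                lines.append(f"{role}: {text}")
    return "\n".join(lines)
```

The resulting text file can then be grepped, which is exactly the search the sidebar doesn't offer.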
        
             | pmarreck wrote:
             | I use Google to auth on mobile Firefox and I don't get
             | logged out constantly.
        
             | arcastroe wrote:
             | This is the reason I primarily use
             | https://labs.kagi.com/fastgpt . I have it bookmarked as a
             | home screen icon on my phone
        
               | hackernewds wrote:
               | I typed >Hello and it is still blinking 2 minutes later
        
               | freediver wrote:
               | Note it is a search engine, not a chat bot.
        
               | jumpCastle wrote:
               | It does not seem conversational though
        
             | heliophobicdude wrote:
             | Perhaps. I don't have those issues from the direct account
             | I have with them.
        
       | bulbosaur123 wrote:
       | Where can I actually physically use it? Or is it again only
       | limited to chosen ones?
        
       | celestialcheese wrote:
       | Claude 100k 1.3 blew me away.
       | 
       | Giving it a task of extracting a specific column of information,
       | using just the table header column text, from a table inside a
       | PDF, with text extracted using tesseract, no extra layers on top.
       | (for those that haven't tried extracting tables with OCR, it's a
       | non-trivial problem, and the output is a mess)
       | 
        | With >40k tokens in context, it extracted the data at 100%
        | accuracy.
       | 
        | Changing the prompt to target a different column from the same
        | table worked perfectly as well. When I changed a character in
        | the OCR context to test whether it was somehow hallucinating,
        | it accurately extracted the new data too.
       | 
       | One of those "Jaw to the floor" moments for me.
       | 
        | Did the same task in GPT-4 (limiting the context window to 8k
        | tokens), and it worked, but at ~4x the cost and without being
        | able to feed it the whole document.
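A run like the one described comes down to a single large prompt. The wording below is illustrative only (the commenter's actual prompt isn't given); the assembled string would be sent as one completion request to whichever Claude-100k endpoint one has access to, with no chunking pass needed since the whole document fits.

```python
def build_extraction_prompt(ocr_text: str, column_header: str) -> str:
    """Assemble one long-context prompt: raw tesseract output pasted
    verbatim, plus an instruction to return only the values under the
    column identified by its header text. Illustrative wording."""
    return (
        "Below is raw OCR output of a document containing a table.\n"
        "Extract every value in the column whose header is "
        f"'{column_header}'. Return one value per line and nothing else.\n\n"
        f"OCR OUTPUT:\n{ocr_text}"
    )
```

Retargeting a different column, as in the comment, is then just a one-argument change.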
        
         | arnaudsm wrote:
         | Using LLMs with 100GB VRAM to convert PDFs to CSVs is truly
         | depressing, but I am sure many companies will love it.
         | 
          | 2023 office software already uses 1000x more resources than
          | 1990s software did. I bet we are ready to do that again.
        
           | martythemaniak wrote:
           | You're missing the developer time. You no longer have to
           | spend hours (or days, perhaps weeks depending on the sources)
           | stringing together random libs, munging and cleaning data,
           | testing, etc etc.
        
             | arnaudsm wrote:
              | I agree, computers are cheaper than engineers.
             | 
             | But I wonder how much more productive our economies could
             | be if everyone was taught programming the same way we teach
             | reading & writing, and open standards were ubiquitous.
        
               | JumpCrisscross wrote:
               | Prompt engineering is basically turning coding problems
               | into language problems. It's conceivable that humans
               | writing code becomes artisanal in a century.
        
               | vermilingua wrote:
               | Coding problems have always been language problems
        
           | visarga wrote:
           | Not just PDFs with tables. It works on any semi-structured
           | document with key-value pairs like invoices, purchase orders,
           | receipts, tickets, forms, error messages, logs, etc.
           | 
            | The "information extraction from semi-structured and
            | unstructured documents" task is seeing a huge leap; just 3
            | years ago it was very tedious to train a model to solve a
            | single use case. Now they all work.
           | 
           | But if you do make the effort to train a specialised model
           | for a single document type, the narrow model surpasses GPT3.5
           | and 4.
        
         | anonymouse008 wrote:
         | > text extracted using tesseract
         | 
         | You're saying 'the text' without normalizing the rows and
         | columns (basically the tab, space or newline delimited text
         | with sporadic lines per row) was all you needed to send? I
         | still have to normalize my tables even for GPT-4, I guess
         | because I have weird merged rows and columns that attempt to do
         | grouping info on top of the table data itself.
        
           | swyx wrote:
           | better - you can do it copy pasting from pdf to gpt on your
           | phone! https://twitter.com/swyx/status/1610247438958481408
        
             | anonymouse008 wrote:
             | Definitely tried that way too, it didn't work - my tables
             | are pretty dang dumb. Merged cells, confidence intervals,
             | weird characters in the cell field that change based on the
             | row values - messing up a simple regex test, it's really a
             | billion dollar company solution but I'm about to punt it to
             | the moon because it's never fully done.
        
           | celestialcheese wrote:
           | exactly. Just sent raw tesseract output, no formatting or
           | "fix the OCR text" step. So the data looked like:
           | 
            | ```
            | col1col2col3\nrow label\tdatapoint1\tdatapoint2...
            | ```
            | 
            | Very messy.
           | 
           | I don't think this is generalizable with the same 100%
           | accuracy across any OCR output (they can be _really_ bad).
            | I'm still planning on doing a first pass with a better table
            | OCR system like Textract, Document AI, PaddlePaddle Table,
            | etc., which should improve accuracy.
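For reference, the brittle deterministic baseline looks like this. The sample string is taken from the comment above; a fused header such as `col1col2col3` shows why pure splitting loses column alignment, which is the mismatch the LLM tolerates.

```python
def naive_rows(ocr_text: str) -> list[list[str]]:
    """Split raw tesseract output into rows/cells on newlines and tabs.
    A header that OCR fused into one token comes back as a single cell,
    so column alignment against the data rows is lost."""
    return [line.split("\t") for line in ocr_text.splitlines() if line.strip()]

sample = "col1col2col3\nrow label\tdatapoint1\tdatapoint2"
rows = naive_rows(sample)
# rows[0] is ['col1col2col3'] -- the header collapsed into one cell,
# while rows[1] has three cells: ['row label', 'datapoint1', 'datapoint2'].
```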
        
             | anonymouse008 wrote:
             | That's still super cool!
             | 
              | Yeah, my use cases are in the really bad category - I've
              | been building parsers for a while, and I've basically given
              | up and fallen back to manual "rows of interest, if present"
              | logic. Camelot got so close, but I ended up building my own
              | control layer on top of pdfminer.six to accommodate (I'd
              | recommend Camelot if you're still exploring). It absolutely
              | sucks needing to be so specific out of the gate, but at
              | least the context rarely changes.
        
               | pplante wrote:
               | What is the source of these nasty docs? I am also working
               | on a layer above pdfminer.six to parse tables. It seems
               | like this task is never done. LLMs have had mixed results
               | for me too. I am focused on documents containing
               | invoices, income statements, etc from the real estate
               | industry.
               | 
               | My email is in my profile if you want to reach out and
               | compare notes!
        
         | modernpink wrote:
          | What was the dollar cost to do this work? Iterating over a
          | 40k context must be expensive.
        
       | pr337h4m wrote:
       | Also available on poe.com
        
         | rgbrgb wrote:
         | great domain. what is pricing?
        
           | s3p wrote:
           | $20/month for 1000 queries if I remember correctly
        
       | nico wrote:
        | It's also available here on Google Colab:
       | https://twitter.com/gpt_index/status/1657757847965380610?s=4...
        
         | anotheryou wrote:
         | no. you still need to bring your own api key for that.
        
       | marcopicentini wrote:
       | Any timeframe when it will be released to the public?
       | 
        | We are in the middle of developing an app and we are not able to
        | do it with the limited context window of OpenAI. We already
        | submitted the request for access.
        
         | pmarreck wrote:
         | There are tricks you can do to better utilize the smaller
         | context window, such as sub-summaries and attention tricks.
         | That's how there are already products on the market that
          | consume entire big PDFs and let you query them. Granted, a
         | larger context window would still work better, but it's
         | possible to do.
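The sub-summary trick mentioned above can be sketched as a recursive map-reduce over chunks. `summarize` here stands in for any LLM call that maps text to shorter text (stubbed in the test), so this is a shape sketch of the technique, not any particular product's implementation.

```python
def hierarchical_summary(document: str, summarize, chunk_size: int = 2000) -> str:
    """Fit a long document into a small context window: summarize
    fixed-size chunks, then summarize the concatenated summaries,
    recursing until the rollup itself fits in one window."""
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    partials = [summarize(chunk) for chunk in chunks]
    combined = "\n".join(partials)
    if len(combined) > chunk_size:
        # Still too big for one window: summarize the summaries.
        return hierarchical_summary(combined, summarize, chunk_size)
    return summarize(combined)
```

Products that "consume entire big PDFs" on an 8k window typically layer retrieval on top of something like this; a genuine 100k window simply skips the whole dance.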
        
         | modernpink wrote:
         | What are the commercial applications of mega context window
         | LLMs at current prices? I would guess mainly legal. And what
         | strategies would you rely on to reduce the accumulating costs
         | over the course of a session?
        
           | [deleted]
        
       | tikkun wrote:
       | I requested access when it was released.
       | 
       | Other HN readers, how many days did it take you from requesting
       | access to Claude to having API access? I didn't use it prior to
       | 100K so I don't have an existing API account.
        
         | lachlan_gray wrote:
         | Randomly gained access long after I had forgotten I signed up,
         | maybe 3 or 4 months
        
         | og_kalu wrote:
         | Requested access way before 100k and still haven't gotten in.
        
           | malux85 wrote:
           | Yeah me too, waiting patiently as context windows are our
           | biggest blocker on more complex chemistry simulations
        
             | peytoncasper wrote:
             | Interesting use case, would you be open to sharing more
             | information on how you're using LLMs for chemistry
             | simulations?
        
               | og_kalu wrote:
               | Not the person you responded to but these two interesting
               | papers kind of tackle that.
               | 
               | https://arxiv.org/abs/2304.05376
               | 
               | https://arxiv.org/abs/2304.05332
        
           | npsomaratna wrote:
           | Same here. Been waiting for a couple of months now.
        
           | Mockapapella wrote:
           | been a couple months for me as well. Actually forgot about
           | `claude` and have just been using OpenAI's API instead.
        
           | tikkun wrote:
           | Could you send me an email? I've liked a few of your
           | comments, want to say hi over email. Email in profile.
        
             | weird-eye-issue wrote:
             | Creepy
        
               | tikkun wrote:
               | Can someone else chime in and let me know whether they
               | agree? Seems like the equivalent of a twitter DM to me,
               | but maybe I'm out of touch.
        
               | barry-cotter wrote:
                | Some people, the kind of people who use the word cringe
                | unironically, live in a world where other people look at
                | them and judge them all the time, and they care about
                | what these strangers think and will mold their
                | personality and behaviour to avoid this. They stand as a
                | warning to others not to be like that.
        
               | qumpis wrote:
                | I've tried to google the person you replied to, and they
                | seem to have many social/online media profiles that
               | allow direct contacting. In that case I think publicly
               | reaching out isn't the best way to go and seems out of
               | place, imo.
        
               | tikkun wrote:
               | Good call, I didn't think to do that - thanks
        
               | og_kalu wrote:
               | Don't think it's particularly creepy and I did send one
               | like you asked, but my email is in my GitHub anyway and
               | not particularly hard to find.
               | 
               | Generally, some might not feel comfortable letting
               | strangers know their email, especially considering this
               | is a site that encourages anonymity. Some might not
               | appreciate doing so publicly either.
        
               | rpastuszak wrote:
               | Not creepy at all, although I'd spend 5 minutes checking
               | if I can find the person on Google and then message them
               | via different channels.
               | 
               | If not, I'd leave a way for contacting me first to make
               | it easier for them.
               | 
               | The way I handle these situations:
               | https://sonnet.io/posts/hi
        
               | stormfather wrote:
               | I disagree that it's creepy. It's more just unusual. But
               | people on HN are quick to judge the slightest thing. I
               | think being a programmer does that to one's brain,
               | unfortunately.
        
               | ryanklee wrote:
               | I think it's pretty inappropriate. If you have a legit
               | reason to reach out, then you can find a way to do it
               | privately. Letting your private intentions leak into
               | public forums is a bad look and a red flag. If I were the
               | person you are replying to, I'd do my best to not
               | interact with you on the basis of your comment.
        
               | barry-cotter wrote:
               | If I were a human being reading your comment I would
               | infer that you were highly judgmental and thought other
               | people were mostly like you, looking for an excuse to be
               | hostile and dismissive. Thankfully I know that most
                | people are at worst indifferent, there's a large, very
                | friendly, helpful minority, and even more who will do
               | small favours out of kindness. The more we make it clear
               | that most people are not like you the more we make the
               | world a better place.
        
               | ryanklee wrote:
               | The Internet is full of individuals with weird
               | intentions. I don't at all see how being conservative in
               | the kind of interactions one allows for is a bad idea.
        
               | alanfranz wrote:
               | How? There's no PM feature on HN. This is the only way if
               | the username is unique enough.
        
               | ryanklee wrote:
               | Tough luck then I guess? I suppose I don't see the need
               | to have access to every individual on a private basis
               | merely because they comment somewhere on the internet. If
               | they welcomed private interactions, then they would
               | indicate a means of contact in their profile.
        
               | mrtranscendence wrote:
               | I mean, if the person being contacted doesn't want to be
               | contacted privately, they're free to ignore the request.
               | No one's saying they "need" access or that someone else
               | is fully obligated to talk to them privately.
        
               | ryanklee wrote:
               | Just reminding you that the commenter asked for others to
               | offer their take on whether or not the request was
               | perceived to be creepy. I didn't go out of my way to
               | offer unsolicited commentary on this.
               | 
               | If you don't want to hear that you are wearing an ugly
               | shirt, don't ask an entire room full of people if your
               | shirt is ugly.
        
               | detaro wrote:
               | it's fine. I second trying to find a clearly publicized
               | contact channel first, but it's fine and leaves it to the
               | person to reach out or not. (If they don't leave it at
               | that though)
        
               | s3p wrote:
               | This is not creepy at all. Sometimes people can reach out
               | because they genuinely want to have a nice conversation.
        
         | anotheryou wrote:
         | did any of you get a confirmation mail or something?
        
         | ntonozzi wrote:
         | I requested access on March 14th or 15th and got it on March
         | 20th.
        
           | tomatbebo wrote:
           | Did you fill in the form with super compelling use case or
           | something?
        
       | arpowers wrote:
       | Is it useful?
        
         | arpowers wrote:
         | The vast majority of AI tools are vaporware mock-ups ...
         | 
          | Adobe Firefly is the best example of "just ship a mock-up of
          | the feature" AI marketing.
        
           | viggity wrote:
           | Firefly has some genuinely cool shit in it (their text
           | treatments are pretty neat), but overall quality is
           | dramatically lacking because they only train on images they
           | have explicit rights to.
        
           | adamsmith143 wrote:
           | Of course Adobe put out crap but Claude is a real product,
           | not vaporware...
        
             | s3p wrote:
             | Neither of them put out "crap"
        
               | adamsmith143 wrote:
                | Adobe isn't an AI company, so it stands to reason that
                | the AI product they put out is crap. Photoshop and their
                | other products, while not "crap", are certainly
                | overpriced relative to open-source competitors.
        
           | weird-eye-issue wrote:
           | Bad take
        
         | greyman wrote:
            | You mean the Claude bot in general? For me, yes, I use it
            | daily, and compared to GPT, it answers more quickly, is
            | friendlier, and in general is less woke. I use GPT-4 as a
            | fallback when I need more reasoning capability; there, GPT-4
            | is better. To sum it up: if you find GPT-3.5 & 4 useful,
            | then yes, Claude is useful as well.
        
           | s3p wrote:
           | Another person addicted to using the word "woke".... sigh
        
             | [deleted]
        
           | 13415 wrote:
           | Out of curiosity, what do you mean by "less woke"? Does it
           | frequently insult minorities or make racist remarks?
           | 
           |  _Edit: To clarify, I was mostly interested in examples and
           | side by side comparisons to better understand what OP meant,
           | not political discussions._
        
             | nomel wrote:
             | To respond to your edit, here are some examples, showing
             | bias:
             | 
             | https://www.brookings.edu/blog/techtank/2023/05/08/the-
             | polit....
             | 
             | https://the-decoder.com/chatgpt-is-politically-left-wing-
             | stu...
             | 
             | Found here: https://news.ycombinator.com/item?id=35946060
        
             | nomel wrote:
             | I'm not them, and I don't think "woke" is the right term,
             | but I've noticed certain "themes" inappropriately appearing
              | in answers. Right after the release of ChatGPT 3, the
              | marginalization of certain groups would show up in answers
              | to questions that weren't related. I saw many examples on
             | twitter, but my personal one was in the answer to "Why are
             | pencils bad?". This one has been "corrected" since release,
             | as far as I can tell, but I also don't ask it questions
             | where this theme _could_ show up.
             | 
             | Now, I only notice green energy/environmental issues that
             | show up in odd places (mostly in GPT 3), and the "moral of
             | the story" always being the same "everyone works together".
             | I see this happen when "creativity" is attempted, where
             | it's free to make up the context (story, wishes, etc).
             | 
             | Outside of possible definitions of the elusive "woke", the
             | "As a language model, I" type responses are the most
             | limiting, and usually absolute nonsense, with an ever
             | increasing number of disclaimers found in answers. For
             | example, "Write some hypothetical python 4 code that sends
             | a message over the network". Some pretty heavy
             | "jailbreaking" is needed to make it work.
             | 
             | ChatGPT4 used to handle this much better, but I think the
              | "corrections" are stacking deeply enough that it no longer
              | has the "resolution" left to see where answers can be given
             | without them.
             | 
              | It would be nice if there were a "standard" set of
              | questions where we could measure progression and compare,
              | to know. Most times these observations or questions come
              | up, someone is very quick to say "racism" or the like.
        
               | com2kid wrote:
               | > I see this happen when "creativity" is attempted, where
               | it's free to make up the context (story, wishes, etc).
               | 
               | Meanwhile GPT just gave me a story involving a royal
               | family where the oldest Prince killed his father (the
               | king), married his younger sister, got her pregnant, she
               | had a baby, then he killed his younger sister, then he
               | was killed by another member of the royal court, who
               | decided to act as regent until the baby came of age.
               | 
               | GPT is perfectly capable of writing dark scary horrible
               | things if you ask it to.
        
             | ryan93 wrote:
             | https://imgur.com/a/3YWEIAJ I mean this is clearly a lie.
             | The gap is about 1 standard deviation. There is a strong
              | debate over whether it is possible to close the gap (and
              | it has shrunk over the last few decades). But there is no
             | debate that there is a gap. They clearly trained it to lie.
        
               | sanxiyn wrote:
               | I agree. It is clearly a lie and it is unfortunate Bard
               | is spreading misinformation.
        
               | krastanov wrote:
               | Be careful with conflating the different meanings of
               | "IQ". There is (1) IQ test taken after adolescence, which
                | plenty of folks consider new-age nonsense (it has useful
               | correlations with some mental tasks, but it is not clear
               | whether it deserves a name as fundamental as "IQ") and
               | there is (2) various tests given at young pre-adolescent
               | ages which is quite a bit more interesting when trying to
               | distinguish nature from nurture.
               | 
               | The gap you are referring to, is it about (1) or about
               | (2)? The OpenAI model might be talking about 2.
        
               | whimsicalism wrote:
               | Just not an accurate recounting of the science around
               | this at all.
        
               | [deleted]
        
               | jkukul wrote:
               | > The gap is about 1 standard deviation
               | 
               | Do you have any studies to link?
        
               | ryan93 wrote:
               | https://cremieux.medium.com/resolute-ignorance-on-race-
               | and-i...
        
               | dS0rrow wrote:
               | Cremieux is the pen name of reddit user u/TrannyPornO
               | just read some of his comments.
        
               | og_kalu wrote:
               | This is not a study. It's a poorly backed/argued opinion
               | piece.
        
               | ryan93 wrote:
               | How did you read it one minute after I posted it?
        
               | og_kalu wrote:
                | Because I've seen the post before lol. It's been on the
               | internet for a couple years.
        
               | ryan93 wrote:
               | Lol. If you read it that makes it worse. Clearly not a
               | poorly backed opinion piece. You are trying to dissuade
               | others from reading it. Makes me believe it more.
        
               | og_kalu wrote:
               | How does that make sense ? You read something and then
               | you see it's poorly argued. I'm not a magician.
               | 
               | I don't care if people read that lol. I don't even really
               | care if they believe the nonsense he's spouting. I reckon
               | people like that will always exist.
               | 
               | I'm just telling you that that's not a study. You say you
               | had a study and then you link an opinion piece.
        
               | ryan93 wrote:
                | https://osf.io/4an93/
                | https://www.sciencedirect.com/science/article/abs/pii/088303...
                | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2907168/
                | https://www.cambridge.org/core/journals/behavioral-and-brain...
                | https://www.sciencedirect.com/science/article/abs/pii/S01918...
                | https://www.researchgate.net/publication/301303123_Genetic_a...
                | https://www.mdpi.com/2624-8611/1/1/5
                | https://www.mdpi.com/26...
                | 
                | Should I keep going? I have dozens more.
        
               | og_kalu wrote:
                | If you're asserting that intelligence has a genetic
                | component tied to race, the burden is on you to
                | demonstrate that connection.
                | 
                | You would need to demonstrate that:
                | 
                | "Race" can be defined in a way that has consistent
                | significance (our current social indicators of race make
                | no sense biologically);
               | 
               | that intelligence is consistently heritable within those
               | racial categories
               | 
               | that genetics are the source of that heritability to the
               | exclusion of other factors
               | 
               | It's not enough to simply wave your hand to say "they do
               | roughly classify people with similar ancestry together."
               | 
                | What we do know is that IQ differences correlate strongly
                | with factors totally unrelated to genetics. Look just at
                | the results of IQ studies within Europe -
                | https://i.imgur.com/IcHt0tu.jpg That data is actually
                | pulled from a book that argues in favor of a genetic
                | element to intelligence affecting national wealth, but at
                | a national level instead of a racial one -
                | https://www.researchgate.net/profile/Richard_Lynn3/publicati...
               | 
               | The differences the authors find between nations are
               | wildly large. Do you really think that East Germans were
               | nearly 10 IQ points dumber by genetics than the West
               | Germans in 1968-70, or that the Israelis got dumber
               | between 1975 and 1989?
               | 
                | Europeans cluster with Middle Easterners and Central
                | Asians -
                | https://science.sciencemag.org/content/sci/324/5930/1035/F4....
                | but the latter groups have universally low IQ, mostly
                | under 90. Palestinians only average 85 -
                | https://www.sciencedirect.com/science/article/abs/pii/S01602...
                | even though they're genetically the same as
                | Mediterraneans, who average as much as 102 (Italy). Why
                | define "white" as "European only" when Arabs, Central
                | Asians, South Asians and North Africans have the same
                | shared mutual ancestry? How is IQ primarily inherited and
                | not environmental when non-European Caucasians have
                | uniformly low IQ relative to Euros?
               | 
               | I'd also love for you to explain how IQ is consistently
               | going up over the last 100 years across the west? That's
               | like 4 generations, not anywhere enough time for natural
               | selection to kick in.
               | 
                | Those types of results show up time and time again in IQ
                | studies. Whatever genetic component there is to IQ is
                | less important than the environmental component, and the
                | genetic element varies so wildly within even homogenous
                | populations that talking about larger constructed
                | population categories like "race" doesn't actually say
                | anything useful.
        
               | ryan93 wrote:
                | The current social indicators of race make sense. As
                | you'd expect, since African Americans have about 20%
                | European admixture, their IQs are in between those of
                | whites and Africans. Also, the Flynn effect is most
                | likely not a real gain in intelligence:
                | http://iapsych.com/articles/pietschnig2015.pdf
        
               | sanxiyn wrote:
               | Yes, a ton. I recommend
               | https://www.amazon.com/Intelligence-That-Matters-Stuart-
               | Ritc...
        
               | dataangel wrote:
               | I think it's failing to articulate a correct position,
               | you shouldn't assume wokeness is the only reason people
               | argue against racial IQ studies. There are studies
               | reporting a standard deviation, but there are a lot of
               | problems with existing studies even if you agree with the
               | idea of IQ generally (which is also highly contested).
               | One of the biggest IQ studies for African countries
               | relied on IQ measurements from people who didn't even
               | live there. There's also a big reliance on twin studies
               | to prove IQ heritability, but it turns out a lot of these
               | "raised apart" twins lived extremely close together, in
               | some cases literally next door. And a lot of the
               | researchers refuse to disclose their actual data so
               | people can verify the statistics, while at the same time
               | getting their funding from known supremacist sources.
               | It's very very very dubious, and the people proclaiming
               | that it's "uncontested" or "very well accepted in
               | psychology" use half truths to prop up their position,
               | e.g. it's well accepted for its _original_ purpose of
                | distinguishing people with brain damage from those
                | without; in other words, it's accurate for making
                | distinctions at
               | the very bottom of the distribution, but at the upper end
               | all the correlations people use to argue IQ is a
               | legitimate measure break down, e.g. higher IQ starts to
               | correlate with _less_ income. If you genuinely want to
               | learn more about this you can find lots of sources and
               | analysis here: https://twitter.com/DialecticBio
        
               | whimsicalism wrote:
                | The critique of the 'raised apart' twin studies as 'they
                | were not as far apart as you think' is not actually that
                | strong, given that the results replicate: they still
                | hold when you eliminate these populations, and the
                | effect size is way too large to be explained by some
                | 'raised apart' twins living close together.
               | 
               | The better critique is that a lot of what you are
               | actually measuring is maternal womb conditions, ie.
               | placental sharing, which can have a massive impact. The
               | jump from within-family twin study to interracial genetic
               | IQ difference is also not a well-justified one.
        
               | sanxiyn wrote:
                | I mean, evolution is also "highly contested". The
                | controversy surrounding "the idea of IQ" is as
                | interesting as the one around evolution, in other
                | words, not at all. Scientifically, it is a closed case.
                | Being highly contested is no excuse for Bard to spread
                | misinformation.
        
               | whimsicalism wrote:
                | Yeah, I can see how this bot's inability to speculate
                | about how black people are less intelligent than white
                | people could really impact GP's daily work
        
               | typon wrote:
               | Is that in the US or worldwide? What is the definition of
               | black and white?
        
             | boredumb wrote:
             | Coy, but obviously he means not permeated with american
             | pop-culture progressive politics, censor happy
             | authoritarianism with an aura of smug do-goodery.
        
               | whimsicalism wrote:
               | I'm not sure if I'm supposed to be gleaning information
               | from your comment, but personally I didn't gain any new
               | knowledge about 'woke AI.'
        
               | boredumb wrote:
               | He asked what he meant by less woke in regards to AI and
               | GPT has an insane bias towards progressive american
               | politics and actively censors/denies answering things
               | that would cause it to divorce from that political
               | persona. My previous commend was calling him coy because
               | in 2023 pretending like 'woke' just means 'anyone that
               | doesn't hate minorities' is an absolute joke.
        
           | [deleted]
        
       | wangg wrote:
       | Sharing that this is available on Poe.com from Quora.
        
       | thomasahle wrote:
       | This is the world we are entering of "commercial AI" rather than
       | public, peer reviewed AI. No benchmarks. No discussion of pros
       | and cons. No careful comparison with state of the art. Just big
       | numbers and big announcements.
        
         | seydor wrote:
          | It has been moving to hyper-scale engineering for a few years
          | now. The science behind the engineering is still progressing
          | (e.g. LoRA is open science), and it seems like whatever these
          | companies are adding is not something fundamentally new
          | (considering the success of LLaMA and the recent Google memo
          | that admits they have no moat).
          | 
          | And the various "Model cards" are not really in depth
          | research but rather cursory looks at model outputs. Even the
          | benchmarks are mostly based on standard tests designed for
          | humans, which is not a valid way to evaluate an AI. In any
          | case, these companies care more about the public perception
          | of their models, so they tend to release evaluations of
          | political sensitivity. But that's not necessarily the most
          | interesting thing about those models, nor particularly
          | valuable science.
        
           | whimsicalism wrote:
           | Your comment reads to me (someone in the field) like it is
           | informed just by reading popular articles on the topic since
           | 2022. The "Google memo" should basically have no impact on
           | how you are thinking about these things, imo.
           | 
           | The field is taking massive steps backward in just the last
           | year when it comes to open science.
           | 
           | > And the various "Model cards" are not really in depth
           | research but rather cursory looks at model output
           | 
           | Because they are no longer releasing any details! Not because
           | there hasn't been any progress in the last year.
        
         | sebzim4500 wrote:
         | I'm sure they'd love to have good benchmarks, but there aren't
         | any and realistically if Anthropic invented their own no one
         | would trust it.
        
           | whimsicalism wrote:
           | https://lmsys.org/blog/2023-05-10-leaderboard/
        
         | dmix wrote:
          | They released the product to the public... we might not have
          | formal academic studies, but millions of people trying it and
          | determining its utility vs the competition is as good of a
          | test as any.
          | 
          | If pushing the context window turns out not to be the right
          | approach, it's not like there won't be 10 other companies
          | champing at the bit to prove them wrong with their own
          | hypotheses. And it's entirely possible there are multiple
          | correct answers for different use cases.
        
           | aatd86 wrote:
           | What public? I've been waiting for weeks to try...
        
           | dandellion wrote:
            | It could also end up like the transition to digital cameras
            | and megapixels, with companies adding more and more context
            | just because consumers' minds are already imprinted with
            | the idea that more is better. So in a few years we might
            | have models with a window of 30 megatokens and it'll mean
            | absolutely nothing.
        
           | idopmstuff wrote:
           | Yeah, it's a weird comment to call it not "public, peer
           | reviewed" when this article is about how it went public,
           | giving people the opportunity to review it.
        
             | [deleted]
        
             | whimsicalism wrote:
             | If I started selling a previously unknown cancer treatment
             | over-the-counter in CVS, people would be justified in
             | calling it not peer-reviewed, untested, etc. even if it is
             | available to the public (giving people the opportunity to
             | try it).
        
           | whimsicalism wrote:
            | > millions of people trying it and determining its utility
            | vs the competition is as good of a test as any.
           | 
           | Disagree. We aren't polling these people. How do I even get a
           | distilled view of what their thoughts are?
           | 
           | It's a far cry from the level of evaluation that existed
           | before. The lack of benchmarks (until the last week or so -
           | thank you huggingface and lm-sys!) has been _very
           | noticeable_.
           | 
           | You will get people claiming that LLaMa outperforms ChatGPT,
           | etc. We have no sense of how performance degrades over longer
           | sequence lengths... or even what sort of sparse attention
           | technique they are using for longer sequences (most of which
           | have known problems). It's absurd.
        
         | vasco wrote:
          | The existence of commercial products doesn't eliminate
          | researchers' ability to publish work. Also, users are smart.
          | ML-powered search has existed for many years, with users
          | voting with their feet based on black boxes and "big numbers
          | and big announcements".
        
           | whimsicalism wrote:
           | Did you work in this field before?
           | 
           | I keep seeing comments like this, but the impact in the last
           | year on open research has been absolutely massive and
           | negative.
           | 
           | The fact that these big industrial research labs have all
           | collectively decided to take a step back from publishing
           | anything with technical details or evaluation is _bad_.
        
             | sanxiyn wrote:
             | I agree it is bad for researchers, but I think you should
             | consider "comments like this" are coming from users.
             | 
             | AI was a highly unusual field in terms of sharing latest
             | research. Car companies don't share their latest engine
             | research with each other. Car users are happy with Consumer
             | Reports and researchers shouting how degradation of Journal
              | of Engine Research is massive and negative will fall on
              | deaf ears.
        
               | whimsicalism wrote:
               | It's hard to engage in motte & bailey style conversations
               | with different commentators.
               | 
               | The original GP was saying there was little impact on
               | research. Your comment is a retreat to a more defensible
               | position that I don't have an opinion on.
        
         | behnamoh wrote:
         | Nice try, OpenAI.
        
           | jondwillis wrote:
           | He works for Meta.
        
       | syntaxing wrote:
       | Is there a trick to getting access? I've been on the waitlist for
       | GPT-4 and Claude for a while. Been building some proof of
       | concepts with GPT-3.5 but having better models would be a huge
       | help.
        
         | pmoriarty wrote:
         | Try going through poe.com. I got access right away.
        
         | gee_m_cee wrote:
         | If you're referring to a paid account, I never received a
         | notification about my GPT-4 waitlist spot. I waited awhile for
         | one, and then, at the prompting of a colleague, I just found a
         | spot in the web UI to sign up. After one false start, it just
         | worked.
        
       | atemerev wrote:
       | I don't understand this "slow rollout" thing about OpenAI
       | competition. The chat / instruction models are continuously fine-
       | tuned on real dialogues. To get these dialogues en masse, you
        | need to deploy models to the wide public. Otherwise, you will
        | forever be on the losing side if you can't quickly grab the
        | streams of real-time human-generated content.
       | 
       | People at OpenAI are smart, they understood that quickly, GPT-4
       | is available nearly everywhere, and lesser models are even free
       | for anyone to use. This required hiring huge teams of moderators,
       | but we are at land grab stage, everyone in the business needs to
       | move fast and break a lot of things. However, GPT-4 and open
       | source models are the only thing I can use. Bard "is not
        | available in my country" (Switzerland), and the first thing the
        | Claude access form asks is whether I am based in the US.
       | 
       | Well, their loss.
        
         | dataangel wrote:
         | It's probably the GPUs, they don't have enough capacity to
         | handle more users. My guess is that GPT4 set off a buying
         | spree. Even for CPUs, I've recently heard lead times for
         | Sapphire Rapids servers are 2-3 months, high end switches 6
         | months, and those probably have way less demand.
        
         | williamcotton wrote:
         | If they are resource constrained and then opened up the flood
         | gates resulting in poor performance and timeouts for every user
         | it seems like it would sour more milk than otherwise.
        
         | nl wrote:
         | Is Bard still unavailable?
         | 
         | It was unavailable to Australia until last week but was made
         | more widely available at Google I/O.
         | 
         | It's pretty good, too!
        
         | s3p wrote:
          | I think it's cloud limitations. Anthropic probably doesn't
          | have the ability to scale up extremely fast, and
          | accommodating hundreds of millions of users probably isn't as
          | easy for them as it is for OpenAI.
        
       | okdood64 wrote:
       | New to ML here, what's the difference between parameters and
       | context?
        
         | capableweb wrote:
         | Other answers are already good, just offering yet another
         | difference.
         | 
          | Parameters are set indirectly via training; they are kept
          | within the weights of the model itself.
          | 
          | Context is what you as a user pass to the model when you're
          | using it; it determines how much text you can actually pass
          | in.
          | 
          | Being able to pass more context means you can (hopefully)
          | make the model understand more things that weren't part of
          | the initial training.
        
         | sghiassy wrote:
          | Parameters are like the number of neurons in your brain.
         | 
         | Context is how much short term memory you can retain at any one
         | time (think how many cards you can remember the order of in a
         | deck of cards)
        
         | Closi wrote:
          | Parameters - number of internal variables/weights in the model
         | 
         | Context - Length of input/output buffer (number of input/output
         | tokens possible).
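
       To make the answers above concrete, here is a toy sketch in Python.
       It is purely illustrative - `ToyModel`, its `weights`, and the
       whitespace "tokenizer" are invented for this example and are nothing
       like a real LLM - but it shows the distinction: parameters are fixed
       after training, while context is the per-request input capped by the
       window.

```python
# Toy illustration (not a real LLM): parameters are learned weights that
# are fixed after training; context is the runtime input, capped at a
# window size measured in tokens.

class ToyModel:
    def __init__(self, weights, context_window):
        self.weights = weights                # "parameters": set by training
        self.context_window = context_window  # max tokens accepted per request

    def run(self, prompt):
        tokens = prompt.split()               # crude whitespace "tokenizer"
        if len(tokens) > self.context_window:
            raise ValueError(
                f"prompt is {len(tokens)} tokens, window is {self.context_window}"
            )
        return len(tokens)                    # stand-in for actual inference

model = ToyModel(weights=[0.1, -0.3, 0.7], context_window=8)
model.run("short prompt fits fine")           # 4 tokens, within the window
```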
        
       | nightski wrote:
        | The discourse has made it seem that, when it comes to context
        | length, larger is always better. I'm wondering if there is any
        | degradation in
       | quality of results when the context is scaled this large. Does it
       | scale without loss of performance? Or is there a point where even
       | though you can fit in a lot more information it causes the
       | performance to degrade?
        
         | rpcope1 wrote:
         | Well, a larger context makes it easier to integrate other
         | tools, like a vector database for information retrieval to jam
         | into the context, and the more context, the more potentially
         | relevant information can be added. For models like llama, where
         | context is (usually) max 2K tokens, you're sort of limited as
         | to how much potentially relevant information you can add when
         | doing complex tasks.
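
         A minimal sketch of that retrieval pattern (hypothetical code: a
         real system would use learned embeddings and a vector database;
         here a crude bag-of-words cosine similarity and whitespace token
         counts stand in) might look like this - rank stored chunks
         against the query, then pack the best ones into the prompt until
         the token budget is spent:

```python
# Hypothetical sketch of retrieval-augmented prompting: score stored chunks
# against the query with a crude bag-of-words cosine similarity, then pack
# the best-scoring ones into the prompt until the token budget is spent.
from collections import Counter
import math

def similarity(a, b):
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def build_context(query, chunks, token_budget):
    ranked = sorted(chunks, key=lambda c: similarity(query, c), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        n = len(chunk.split())                # crude token count
        if used + n <= token_budget:
            picked.append(chunk)
            used += n
    return "\n".join(picked)

chunks = [
    "invoices are stored in the billing database",
    "the cafeteria menu changes weekly",
    "billing database backups run nightly",
]
context = build_context("where is the billing database", chunks, token_budget=16)
```

         With a bigger window you can simply raise `token_budget` and
         admit more potentially relevant chunks.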
        
         | phillipcarter wrote:
         | In a brief test, I found that the bigger context window only
         | meant that I could stuff a whole schema into the input. It
         | still hallucinated a value. When I plugged in a call to a
         | vector embedding to only use the top k most "relevant" fields
         | it did exactly what I wanted:
         | https://twitter.com/_cartermp/status/1657037648400117760
         | 
         | YMMV.
        
           | koboll wrote:
           | The fundamental problem seems to be that it's still slightly
           | sub-GPT-3.5-quality, and even a long context window can't fix
           | that. It will remember things from many many tokens ago, but
           | it still doesn't reliably produce passable work.
           | 
           | The combination of a GPT-4-quality model and a long context
           | window will unlock a lot of applications that now rely on
           | somewhat lossy window-prying hacks (i.e. summarizing chunks).
           | But any model quality below that won't move the needle much
           | in terms of what useful work is possible, with the exception
           | of fairly simple summarization and text analysis tasks.
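
             The "summarizing chunks" workaround mentioned above can be
             sketched roughly like this (hypothetical code - `summarize`
             just truncates, standing in for an actual model call; only
             the map/reduce structure, and why it is lossy, is the point):

```python
# Sketch of the "summarizing chunks" workaround: when a document exceeds
# the window, split it into window-sized chunks, summarize each one, then
# summarize the concatenated summaries, recursing until the result fits.
# `summarize` here is a placeholder that merely truncates; in practice it
# would be a call to the model, and each pass loses detail.

def summarize(text, max_tokens=20):
    return " ".join(text.split()[:max_tokens])  # placeholder for an LLM call

def summarize_long(document, window=50):
    tokens = document.split()
    if len(tokens) <= window:
        return summarize(document)
    chunks = [" ".join(tokens[i:i + window])
              for i in range(0, len(tokens), window)]
    partials = [summarize(c) for c in chunks]           # "map" step
    return summarize_long(" ".join(partials), window)   # "reduce", recursively
```

             A 100k-token window lets many documents skip this pipeline
             entirely, which is exactly why the hack becomes unnecessary.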
        
             | pmoriarty wrote:
             | _> The fundamental problem seems to be that it 's still
             | slightly sub-GPT-3.5-quality_
             | 
             | It really depends on what you use it for.
             | 
             | I've found Claude better than GPT4 and even Claude+ at
             | creative writing.
             | 
             | It also tends to give more comprehensive explanations
             | without additional prompting. So I prefer to have it,
             | rather than GPT3.5 or 4, explain things to me.
             | 
             | It's also free, which is another big win over GPT4.
        
             | phillipcarter wrote:
             | Maybe! I certainly look forward to that. Although in my
             | testing GPT-4 also hallucinates a bit (less than gpt-3.5),
             | and the latency is so poor that it's unworkable for our
             | product.
        
               | koboll wrote:
               | Agreed. My heuristic is that GPT-4 is good for compile
               | time tasks but bad for runtime tasks for both cost and
               | speed reasons.
        
             | dr_dshiv wrote:
             | I find Claude significantly better than 3.5. I'd love to be
             | able to make the case for that with data...
        
               | og_kalu wrote:
                | There are two main Claude models. I'm guessing it's
                | claude-v1.3, aka Claude+, that you find much better
                | than 3.5? That tracks if so.
        
               | phillipcarter wrote:
               | I've found for my use case that both claude-instant-* and
               | claude-* are roughly on par with each other and gpt-3.5.
               | claude-* seems to be the least inaccurate, but we also
               | haven't put it into production like gpt-3.5, so it's hard
               | to say for sure.
               | 
               | In either case, the claude models are very good. I think
               | they'd do fine in a real product. But there's definitely
               | issues that they all have (or that my prompt engineering
               | has).
        
               | sanxiyn wrote:
               | Since Chatbot Arena Leaderboard
               | https://lmsys.org/blog/2023-05-10-leaderboard/ agrees
               | with you, it's not just you.
        
       | jlowin wrote:
       | The 100k context was originally released only via API, but I just
       | noticed that it's now available in the Claude web UI.
        
         | greyman wrote:
         | What is the URL of Claude web UI? I somehow cannot find it.
        
           | Veen wrote:
           | console.anthropic.com
        
           | pmoriarty wrote:
           | Also https://poe.com/Claude-instant-100k
        
       | ChikkaChiChi wrote:
       | Is there a place I can track all releases, announcements, and
       | invite links?
        
       ___________________________________________________________________
       (page generated 2023-05-15 23:01 UTC)