[HN Gopher] New models and developer products
___________________________________________________________________
New models and developer products
Author : kevin_hu
Score : 312 points
Date : 2023-11-06 18:17 UTC (2 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| alach11 wrote:
| There are a lot of huge announcements here. But in particular,
| I'm excited by the Assistants API. It abstracts away so many of
| the routine boilerplate parts of developing applications on the
| platform.
| gregorym wrote:
| how so?
| simonw wrote:
| The new assistants API looks both super-cool and (unfortunately)
| a recipe for all kinds of new applications that are vulnerable to
| prompt injection.
| burcs wrote:
| Do you see a way around prompt injection? It feels like any
| feature they release is going to be susceptible to it.
| minimaxir wrote:
| I suspect OpenAI's black box workflow has some safeguards for
| it.
| sillysaurusx wrote:
| Still, safeguards are quite a lot less safe than if
| statements. We live in interesting times.
|
| I don't think there's any way to guarantee safety from
| prompt injection. The most you can do is make a
| probabilistic argument. Which is fine; there are plenty of
| those, and we rely on them in the sciences. But it'll be
| difficult to quantify.
|
| CS majors will find it pretty alien. The blockchain was one
| of the few probabilistic arguments we use, and it's
| precisely quantifiable. This one will probably be empirical
| rather than theoretical.
| bluecrab wrote:
| Use an llm to evaluate the input and categorise it.
| alexander2002 wrote:
| With great power comes great responsibility!
| minimaxir wrote:
| Most of the products announced (and the price cuts) appear to be
| more about increasing lock-in to the OpenAI API platform, which
| is not surprising given increased competition in the space. The
| GPTs/GPT Agents and Assistants demos in particular showed that
| they are a black box within a black box within a black box that
| you can't port anywhere else.
|
| I'm mixed on the presentation and will need to read the fine
| print on the API docs on all of these things, which have been
| updated just now: https://platform.openai.com/docs/api-reference
|
| The pricing page has now updated as well:
| https://openai.com/pricing
|
| Notably, the DALL-E 3 API is $0.04 _per image_ which is an order
| of magnitude above everyone else in the space.
|
| EDIT: One interesting observation with the new OpenAI pricing
| structure not mentioned during the keynote: finetuned ChatGPT 3.5
| is now 3x of the cost of the base ChatGPT 3.5, down from 8x the
| cost. That makes finetuning a more compelling option.
| visarga wrote:
| Mistral + 2 weeks of work from the community. Not as good, but
| private and free. It will trail OpenAI by 6-12 months in
| capabilities.
| coder543 wrote:
| OpenAI offering 128k context is very appealing, however.
|
| I tried some Mistral variants with larger context windows,
| and had very poor results... the model would often offer
| either an empty completion or a nonsensical completion, even
| though the content fit comfortably within the context window,
| and I was placing a direct question either at the beginning
| or end, and either with or without an explanation of the task
| and the content. Large contexts just felt broken. There are
| so many ways that we are more than "two weeks" from the open
| source solutions matching what OpenAI offers.
|
| And that's to say nothing of how far behind these smaller
| models are in terms of accuracy or instruction following.
|
| For now, 6-12 months behind also isn't good enough. In the
| uncertain case that this stays true, then a year from now the
| open models could be perfectly adequate for many use cases...
| but it's very hard to predict the progression of these
| technologies.
| pclmulqdq wrote:
| Comparing a 7B parameter model to a 1.8T parameter model is
| kind of silly. Of course it's behind on accuracy, but it
| also takes 1% of the resources.
| coder543 wrote:
| The person I replied to had decided to compare Mistral to
| what was launched, so I went along with their comparison
| and showed how I have been unsatisfied with it. But,
| these open models can certainly be fun to play with.
|
| Regardless, where did you find 1.8T for GPT-4 Turbo? The
| Turbo model is the one with the 128K context size, and
| the Turbo models tend to have a much lower parameter
| count from what people can tell. Nobody outside of OpenAI
| even knows how many parameters regular GPT-4 has. 1.8T is
| one of several guesses I have seen people make, but the
| guesses vary significantly.
|
| I'm also not convinced that parameter counts are
| everything, as your comment clearly implies, or that
| chinchilla scaling is fully understood. More research
| seems required to find the right balance:
| https://espadrine.github.io/blog/posts/chinchilla-s-
| death.ht...
| danielmarkbruce wrote:
| It's an order of magnitude comparison.
|
| Let's just agree it's 100x-300x more parameters, and
| let's assume the open ai folks are pretty smart and have
| a sense for the optimal number of tokens to train on.
| razodactyl wrote:
| This definitely. Andrej Karpathy himself mentions tuned
| weight initialisation in one of his lectures. The TinyGPT
| code he wrote goes through it.
|
| Additionally explanations for the raw mathematics of log
| likelihoods and their loss ballparks.
|
| Interesting low-level stuff. These researchers are the
| best of the best working for the company that can afford
| them working on the best models available.
| razodactyl wrote:
| Nah, it's training quality and context saturation.
|
| Grab an 8K context model, tweak some internals and try to
| pass 32K context into it - it's still an 8K model and
| will go glitchy beyond 8K unless it's trained at higher
| context lengths.
|
| Anthropic for example talk about the model's ability to
| spot words in the entire Great Gatsby novel loaded into
| context. It's a hint to how the model is trained.
|
| Parameter counts are a unified metric, what seems to be
| important is embedding dimensionality to transfer
| information through the layers - and the layers
| themselves to both store and process the nuance of
| information.
| spankalee wrote:
| A friend of mine is building Zep (https://www.getzep.com/),
| which seems to offer a lot of the Assistant + Retrieval
| functionality in a self-hostable and model-agnostic way. That
| type of project may the way around lock-in.
| davidbarker wrote:
| Also, DALL*E 3 "HD" is double the price at $0.08. I'm curious
| to play around with it once the API changes go live later
| today.
|
| The docs say:
|
| > By default, images are generated at standard quality, but
| when using DALL*E 3 you can set quality: "hd" for enhanced
| detail. Square, standard quality images are the fastest to
| generate.
|
| https://platform.openai.com/docs/guides/images/usage
| faeriechangling wrote:
| It's a good strategy. For me, avoiding the moat means either a
| big drop in quality and just ending up in somebody elses moat,
| or a big drop in quality and a lot more money spent. I've
| looked into it and maybe the most practical end-to-end system
| for owning my own LLM is to run a couple of 3090s on a consumer
| motherboard at substantial running cost to keep them up 24/7
| and that's not powerful enough to cut it and rather expensive
| simultaniously. For a bit more expense, you can get more
| quality and lower running costs and much slower processing from
| buying a 128gb/192gb apple silicon setup and that's much much
| much slower than the "Turbo" services that OpenAI offers.
|
| I think the biggest thing pushing me away from OpenAI was they
| were subsidizing the chat experience much more than the API and
| this seems to reconcile that quite a bit. Quite simply OpenAI
| is sweetening the pot here too much for me to really ignore,
| this is a massively subsdizised service. I honestly don't feel
| the switching costs in the future will outweigh the benefits
| I'm getting now.
| ebiester wrote:
| I don't understand the lock-in argument here. Yes, if a
| competitor comes in there will be switching cost as everything
| is re-learned. However, from a code perspective, it is a
| function of the key and a relatively small API. New regulations
| outstanding, what is stoping someone from moving from OpenAI to
| Anthropic (for example) other than the cost of learning how to
| effectively utilize Anthropic for your use case?
|
| OpenAI doesn't have some sort of egress feed for your database.
| pclmulqdq wrote:
| I sometimes wonder how much OpenAI pays for people to post
| arguments about how great they are on HN, because it looks
| like you are pretty much right. There isn't a ton about
| OpenAI that is actually sticky.
| minimaxir wrote:
| I most definitely am not paid by OpenAI and am very
| confused how my original (critical) comment could be seen
| as astroturfing.
| airstrike wrote:
| _> Please don 't post insinuations about astroturfing,
| shilling, brigading, foreign agents, and the like. It
| degrades discussion and is usually mistaken. If you're
| worried about abuse, email hn@ycombinator.com and we'll
| look at the data._
|
| https://news.ycombinator.com/newsguidelines.html
| minimaxir wrote:
| > OpenAI doesn't have some sort of egress feed for your
| database.
|
| That's what they're trying to incentivize, especically with
| being able to upload files for their own implementation of
| RAG. You're not getting the vector representation of those
| files back, and switching to another provider will require
| rebuilding and testing that infrastructure.
| vsareto wrote:
| >The GPTs/GPT Agents and Assistants demos in particular showed
| that they are a black box within a black box within a black box
| that you can't port anywhere else.
|
| This just rings hollow to me. We lost the fights for database
| portability, cloud portability, payments/billing portability,
| and other individual SaaS lock-in. I don't see why it'll be
| different this time around.
| activescott wrote:
| I think it's more about finding places to add value than "lock
| in" per se. It seems they're adding value with improved
| developer experience and cost/performance rather than on the
| models themselves. Not necessarily nefarious attempts to lock
| in customers, but it may have the same outcome :)
| crakenzak wrote:
| The 128k context window GPT-4 Turbo model looks unreal. Seems
| like Anthropic's day of reckoning is here?
| infecto wrote:
| Anthropic never even had a day. I said this before in another
| Anthropic thread but I signed up 6 months ago for API access
| and they never responded. An employee in that thread apologized
| and said to try again, did it, week later still nothing. As far
| as commercial viability, they never had it.
| QkPrsMizkYvt wrote:
| same here. I wonder why they are not opening it up to more
| devs. Seems strange.
| freedomben wrote:
| Purely a guess, but having tried to scale services to new
| customers, it can be a lot harder than it seems, especially
| if you have to customize anything. Early on, doing a
| generic one-size-fits-all can be really, really hard, and
| acquiring those early big customers is important to
| survival and often requires customizations.
| og_kalu wrote:
| Yeah i know this wasn't the case for everyone but i got gpt-4
| access back in march the next day. Tried Claude and still
| waiting. Oh well lol.
| taf2 wrote:
| I got access to Claude 2 - it's really good and have been
| chatting with their sales team. Seems they were reasonably
| responsive- but overall with OpenAI 128k context and price
| anthropic has no edge
| bluecrab wrote:
| They can't even compete with open source since multiple
| platforms have apis available.
| a_wild_dandan wrote:
| Anthropic's $20 billion valuation is buck wild, especially to
| those who've used their "flagship" model. The thing is
| insufferable. David Shapiro sums it up nicely.[1] Fighting
| tools is horrendous enough. Those tools also deceiving and
| lecturing you regarding benign topics is inexcusable. I suspect
| that this behavior is a side-effect of Anthropic's fetishistic
| AI safety obsession. I further suspect that the more one brain
| washes their agent into behaving "acceptably", the more it'll
| backfire with erratic and useless behavior. Just like with
| humans, the antidote to harmful action is _more_ free thought
| and education, not less. Punishment methods rooted in fear and
| insecurity will result in fearful and insecure AI (i.e
| ironically creating the _worst_ outcome we 're all trying to
| avoid).
|
| [1] https://www.youtube.com/watch?v=PgwpqjiKkoY
| machdiamonds wrote:
| Anthropic doesn't care about consumer products. Their CEO
| believes that the company with the best LLM by 2026 will be too
| far ahead for anyone else to catch up.
| topicseed wrote:
| 128,000 token context, Assistants API, JSON mode, April 2023
| knowledge cutoff, GPT 4 Turbo, lower pricing, custom GPTs, a good
| bunch of announcements all-round!
|
| https://openai.com/pricing
| TIPSIO wrote:
| That map/travel demo was insane. Trying to find the demo again.
| topicseed wrote:
| It was but most of that functionality was within the "function
| calling", not really within the assistant as a top 10 of Paris
| sights isn't really that crazy. Plotting these on a map is the
| key part which is still your own code, not GPT-based.
| rictic wrote:
| Turning an airline receipt pdf into a well structured
| function call is very nice.
| dnadler wrote:
| This might also be a bit easier than it seems. I've done
| similar (though not nearly as nice of a UI) with
| `unstructured`.
| davidbarker wrote:
| https://www.youtube.com/live/U9mJuUkhUzk?t=2006
|
| (Timestamp 33:26)
|
| Edit: updated the timestamp
| brunoqc wrote:
| ~~wat? the video is 45:35 long.~~
| davidbarker wrote:
| Oh! When I replied it was a lot longer -- it still had the
| countdown from before the stream went live. I guess they
| replaced it with the trimmed version.
| brunoqc wrote:
| Thanks!
| WanderPanda wrote:
| Yep I feel like they solved the problem that Apple never
| managed to solve with Siri: How to interface it with apps.
| Seems like this was an LLM-hard problem
| freedomben wrote:
| My guess is an LLM-based Siri is right around the corner.
| Apple commonly waits for tech to be proved by others before
| adopting it, so this would be in-line with standard operating
| procedures.
| singularity2001 wrote:
| My guess is that LLM-Siri will be crippled by internal
| processes and lawyers
| glass-z13 wrote:
| One step closer to augmenting day to day internet browsing with
| the announcement of the GPT's
| vineet wrote:
| The Assistants API is really cool. Together with the retrieval
| feature, it makes me wonder how many companies OpenAI killed by
| creating it.
| modeless wrote:
| Whisper V3 is released!
| https://github.com/openai/whisper/commit/c5d42560760a05584c1...
|
| Looks like it's just a new checkpoint for the large model. It
| would be nice to have updates for the smaller models too. But
| it'll be easy to integrate with anything using Whisper V2. I'm
| excited to add it to my local voice AI
| (https://www.microsoft.com/store/apps/9NC624PBFGB7)
|
| I assume ChatGPT voice has been using Whisper V3 and I've noticed
| that it still has the classic Whisper hallucinations ("Thank you
| for watching!"), so I guess it's an incremental improvement but
| not revolutionary.
| ianbicking wrote:
| Do you also get those hallucinations just on silence?
|
| I kind of wonder if they had a bunch of training data of video
| with transcripts, but some of the video/audio was truncated and
| the transcript still said the last speech, and so now it thinks
| silence is just another way of signing off from a TV program.
|
| IMHO the bottleneck on voice now is all the infrastructure
| around it. How do you detect speech starting and stopping? How
| do you play sound/speech while also being ready for the user to
| speak? This stuff is necessary, but everything kind of works
| poorly, and you really need hardware/software integration.
| modeless wrote:
| You're right, I think that's exactly what happened.
|
| Silence is when you get the most hallucinations. But there is
| a trick supported by some implementations that helps a lot.
| Whisper does have a special <|nospeech|> token that it
| predicts for silence. You can look at the probability of that
| token even when it's not picked during sampling.
| Hallucinations often have a relatively high probability for
| the nospeech token compared to actual speech, so that can
| help filter them out.
|
| As for all the surrounding stuff like detecting speech
| starting and stopping and listening for interruptions while
| talking, give my voice AI a try. It has a rough first pass at
| all that stuff, and it needs a lot of work but it's a start
| and it's fun to play with. Ultimately the answer is end-to-
| end speech-to-speech models, but you can get pretty far with
| what we have now in open source!
| Void_ wrote:
| Too bad they didn't upgrade Whisper API yet. Can't wait to make
| it available in https://whispermemos.com
| dang wrote:
| Related:
|
| _OpenAI releases Whisper v3, new generation open source ASR
| model_ - https://news.ycombinator.com/item?id=38166965
| zavertnik wrote:
| And here I was in bliss with the 32k context increase 3 days ago.
| 128k context? Absolutely insane. It feels like now the bottle
| neck in GPT workflows is no longer GPT, but instead its the
| wallet!
|
| Such an amazing time to be alive.
| naiv wrote:
| now with the prices reduced so much even the wallet might not
| be the bottle neck anymore
| in3d wrote:
| For GPT-4 Turbo, not GPT-4.
| dragonwriter wrote:
| GPT-4-Turbo seems to be replacing GPT-4 (non-turbo); the
| GPT-4 (non-turbo) model is marked as "Legacy" in the model
| list.
|
| EDIT: the above is corrected, it previously erroneously said
| the non-turbo model was marked as "deprecated", which is a
| different thing.
| kridsdale3 wrote:
| Yes, nowhere in the text today was there any assertion that
| Turbo produces (eg) source code at the same level of
| coherence and consistently high quality as GPT4.
| marban wrote:
| Comment will not age well.
| Swizec wrote:
| > 128k context? Absolutely insane
|
| 128k context is great and all, but how effective are the middle
| 100,000 tokens? LLMs are known to struggle with remembering
| stuff that isn't at the start or end of the input. Known as the
| Lost Middle
|
| https://arxiv.org/abs/2307.03172
| saliagato wrote:
| sama said they improved it
| robertkoss wrote:
| Does anyone know when this will be coming to Azure OpenAI?
| kasetty wrote:
| I would be also interested in knowing when these show up in
| Azure OpenAI offerings.
| Onawa wrote:
| If Azure's history when rolling out GPT-4 is any indication,
| probably a couple months and/or a staged rollout.
| robertkoss wrote:
| Is Azure adoption really that slow? Ugh.
| Zaheer wrote:
| The playbook OpenAI is following is similar to AWS. Start with
| the primitives (Text generation, Image generation, etc / EC2, S3,
| RDS, etc) and build value add services on top of it (Assistants
| API / all other AWS services). They're miles ahead of AWS and
| other competitors in this regard.
| gumballindie wrote:
| And just like amazon they will compete with their own
| customers. They are miles ahead in this regard as well since
| they basically take everyone's digital property and resell it.
| sharemywin wrote:
| don't hate the player hate the game.
| chipgap98 wrote:
| The Assistants API and OpenAI Store are really interesting. Those
| are the types of things that could build a moat for OpenAI
| visarga wrote:
| You think it is hard to export an agent? It's a master prompt,
| a collection of documents and a few generic plugins like
| function calling and code execution. This will be implemented
| in open source soon. You can even fine-tune on your bot logs.
| WanderPanda wrote:
| Agreed, the moat are the models (as an extension of the
| instruction tuning data)
| chipgap98 wrote:
| The Assistants playground doesn't seem to be available yet
| singularity2001 wrote:
| https://chat.openai.com/gpts/editor
|
| you currently do not have access to this feature :(
| cryptoz wrote:
| For DALL-E 3, I'm getting "openai.error.InvalidRequestError: The
| model `dall-e-3` does not exist." is this for everyone right now?
| Maybe it's gonna be out any minute.
|
| I see the python library has an upgrade available with breaking
| changes, is there any guide for the changes I'll need to make?
| And will the DALL-E 3 endpoint require the upgrade? So many
| questions.
|
| Edit: Oh I see,
|
| > We'll begin rolling out new features to OpenAI customers
| starting at 1pm PT today.
| minimaxir wrote:
| The documentation/READMEs in the GitHub repo was updated to
| play nice with the new v1.0.0 of the package:
| https://github.com/openai/openai-python/
| cryptoz wrote:
| Aha, makes sense, thanks :)
| davio wrote:
| Stream of keynote: https://youtu.be/U9mJuUkhUzk?t=1806
| WanderPanda wrote:
| Does anyone have an idea why they are so open about Whisper? Is
| it the poster child project for OAI people scratching their open
| source itch? Is there just no commercial value in speech to text?
| htrp wrote:
| speech to text is a relatively crowded area with a lot of other
| companies in the space. Also really hard to get "wow"
| performance as it's either correct (like most other people's
| models) or it's wrong
| teaearlgraycold wrote:
| Everyone's got a loss leader
| freedomben wrote:
| I've been wondering this as well. I'm super glad, but it seems
| so different than every _other_ thing they do. There 's
| _definitely_ commercial value, so I find it surprising.
| StanAngeloff wrote:
| I personally use Whisper to transcribe painfully long meetings
| (2+ hours). The transcripts are then segmented and, you guessed
| it, entered right into GPT-4 for clean up, summarisation,
| minutes, etc. So in a sense it's a great way to get more people
| to use their other products?
| htrp wrote:
| We need some independent benchmarks (LLM elo via chatbot arena
| etc) about how gpt4 Turbo compares to gpt4.
| freedomben wrote:
| Text to Speech is exciting to me, though it's of course not
| particularly novel. I've been creating "audiobooks" for personal
| use for books that don't have a professional version, and despite
| high costs and meh quality have been using AWS.
|
| Has anybody tried this new TTS speech for longer works and/or
| things like books? Would love to hear what people think about
| quality
| dang wrote:
| Related ongoing threads:
|
| _GPTs: Custom versions of ChatGPT_ -
| https://news.ycombinator.com/item?id=38166431
|
| _OpenAI releases Whisper v3, new generation open source ASR
| model_ - https://news.ycombinator.com/item?id=38166965
|
| _OpenAI DevDay, Opening Keynote Livestream [video]_ -
| https://news.ycombinator.com/item?id=38165090
| QkPrsMizkYvt wrote:
| Most of the API docs were updated, but none of the new APIs work
| for me. Are other people experiencing the same?
| davidbarker wrote:
| They will start rolling out at 1pm PST today.
| QkPrsMizkYvt wrote:
| got it - thanks
| QkPrsMizkYvt wrote:
| nice it is live now!
| willsmith72 wrote:
| If they could roll back the extreme rate-limiting on dalle 3 in
| gpt4, that would be great.
| kelseyfrog wrote:
| JSON mode is a great step in the right direction, but the holy
| grail is either JSON-schema support or (E)BNF grammar
| specification.
| minimaxir wrote:
| The function calling is JSON Schema support but extremely
| poorly marketed. I am planning on writing a blog post about it.
| danenania wrote:
| Yeah I'm not sure I see the point of "JSON mode", in its
| current iteration at least, considering function calling
| already does this more effectively.
|
| I suppose it could help to make simpler API calls and save
| some prompt tokens, but it would definitely need schema
| support to really be useful.
| minimaxir wrote:
| It makes it a bit easier to parse returned tabular data,
| anyways.
|
| I'll be curious to see if it can handle outputting nested
| data without prompting.
| Wherecombinator wrote:
| Is this just for the API for now?
|
| I just got premium the other day for ChatGPT 4 and have been
| blown away. I'm wondering if I'll automatically get turbo when
| it's released?
| tornato7 wrote:
| GPT-4 Turbo is already available by default in ChatGPT
| kvn8888 wrote:
| I can't find anything that says it's available in ChatGPT
| dragonwriter wrote:
| ChatGPT (at least in Plus) when using the GPT-4 model
| selected (instead of GPT-3.5) currently consistently
| reports the April 2023 knowledge cutoff of GPT-4-Turbo
| (gpt-4-1106-preview/gpt-4-vision-preview) as its knowledge
| cutoff, not the Sep 2021 cutoff for gpt-4-0613, the most
| recent pre-turbo GPT-4 model release.
|
| The most sensible explanation is that ChatGPT is using
| GPT-4-Turbo as its GPT-4 model.
| Topfi wrote:
| I am very much looking forward to, but also dreading, testing
| gpt-4-turbo as part of my workflow and projects. The lowered cost
| and much larger context window are very attractive; however, I
| cannot be the only one who remembers the difference in output
| quality and overall perceived capability between gpt-3.5 and
| gpt-3.5-turbo, combined with the intransparent switching from one
| model to the other (calling the older, often more capable model
| "Legacy", making it GPT+ exclusive, trying to pass of
| gpt-3.5-turbo as a straight upgrade, etc.). If the former had
| remained available after the latter became dominant, that may not
| have been a problem in itself, but seeing as gpt-3.5-turbo has
| fully replaced its precursor (both on the Chat website and via
| API) and gpt-4 as offered up to this point wasn't a fully perfect
| replacement for plain gpt-3.5 either, relying on these models as
| offered by OpenAI has become challenging.
|
| A lot of ink has been spilled about gpt-4 (via the Chat website,
| but also more recently via API) seeming less capable over the
| last few months compared to earlier experiences and whilst I
| still believe that the underlying gpt-4 model can perform at a
| similar degree to before, I will admit that purely the amount of
| output one can reliably request from these models has become
| severely restricted, even when using the API.
|
| In other words, in my limited experience, gpt-4 (via API or
| especially the Chat website) can perform equally well in tasks
| and output complexity, but the amount of output one receives
| seems far more restricted than before, often harming existing use
| cases and workflows. There appears a greater tendency to include
| comments ("place this here") even when requesting a specific
| section of output in full.
|
| Another aspect that results from their lack of transparency is
| communicating the differences between the Chat Website and API. I
| understand why they cannot be fully identical in terms of output
| length and context window (otherwise GPT+ would be an even bigger
| loss leader), but communicating the Status Quo should not be an
| unreasonable request in my eyes. Call the model gpt-4-web or
| something similar to clearly differentiate the Chat Website
| implementation from gpt-4 and gpt-4-1106 via API (the actual name
| for gpt-4-turbo at this point in time). As it stands, people like
| myself have to always add whether the Chat website or API is what
| our experiences arise from, while people who may only casually
| experiment with the free Website implementation of gpt-3.5-turbo
| may have a hard time grasping why these models create such
| intense interest in those more experienced.
| doctoboggan wrote:
| In the keynote @sama claimed GPT-4-turbo was superior to the
| older GPT-4. Have any benchmarks or other examples been shown? I
| am curious to see how much better it is, if it all. I remember
| when 3.5 got its turbo version there was some controversy on
| whether it was really better or not.
| tornato7 wrote:
| A few notes on pricing:
|
| - GPT-4 Turbo vision is much cheaper than I expected. A 768*768
| px image costs $0.00765 to input. That's practical to replace
| more specialized computer vision models for many use-cases.
|
| - ElevenLabs is $0.24 per 1K characters while OpenAI TTS HD is
| $0.03 per 1K characters. Elevenlabs still has voice copying but
| for many use-cases it's no longer competitive.
|
| - It appears that there's no additional fee for the 128K context
| model, as opposed to previous models that charged extra for the
| longer context window. This is huge.
| taf2 wrote:
| Does this mean OpenAI tts is available via api? I saw whisper
| but not tts - maybe I'm missing it?
| davidbarker wrote:
| It is, indeed!
|
| https://platform.openai.com/docs/guides/text-to-speech
| alach11 wrote:
| There are a lot of huge announcements here. But in particular,
| I'm excited by the Assistants API. It abstracts away so many of
| the routine boilerplate parts of developing applications on the
| platform.
| og_kalu wrote:
| The new TTS is much cheaper than eleven labs and better too.
|
| I don't know how the model works so maybe what i'm asking isn't
| even feasible but i wish they gave the option of voice cloning or
| something similar or at least had a lot more voices for other
| languages. The default voices tend to make other language output
| have an accent.
|
| Uh if turbo's the much faster model a few have had access to in
| the past week, then pressing x on the "more intelligent than
| legacy 4" statement.
| obiefernandez wrote:
| My profit margins at https://olympia.chat just got 3x better <3
| saliagato wrote:
| I think your startup just died
| leobg wrote:
| Elaine Jusk...lol
| whytai wrote:
| Every day this video ages more and more poorly [1].
|
| categories of startups that will be affected by these launches:
|
| - vectorDB startups -> don't need embeddings anymore
|
| - file processing startups -> don't need to process files anymore
|
| - fine tuning startups -> can fine tune directly from the
| platform now, with GPT4 fine tuning coming
|
| - cost reduction startups -> they literally lowered prices and
| increased rate limits
|
| - structuring startups -> json mode and GPT4 turbo with better
| output matching
|
| - vertical ai agent startups -> GPT marketplace
|
| - anthropic/claude -> now GPT-turbo has 128k context window!
|
| That being said, Sam Altman is an incredible founder for being
| able to have this close a watch on the market. Pretty much any
| "ai tooling" startup that was created in the past year was
| affected by this announcement.
|
| For those asking: vectorDB, chunking, retrieval, and RAG are all
| implemented in a new stateful AI for you! No need to do it
| yourself anymore. [2] Exciting times to be a developer!
|
| [1] https://youtu.be/smHw9kEwcgM
|
| [2] https://openai.com/blog/new-models-and-developer-products-
| an...
| Der_Einzige wrote:
| Startups built around actual AI tools, like if one formed
| around automatic1111 or oogabooga, would be unaffected, but
| because so much VC money went to the wrong places in this
| space, a whole lot of people are about to be burned hard.
| throwaway-jim wrote:
| damn hahaha it's oobabooga not oogabooga
| atleastoptimal wrote:
| There will be a lot of startups who rely on marketing
| aggressively to boomer-led companies who don't know what email
| is and hoping their assistant never types OpenAI into Google
| for them.
| yawnxyz wrote:
| i'm excited for the open source, local inferencing tech to
| catchup. The bar's been raised.
| morkalork wrote:
| If you want to be a start-up using AI, you have to be in
| another industry with access to data and a market that
| OpenAI/MS/Google can't or won't touch. Otherwise you end up
| eaten like above.
| ushakov wrote:
| We just launched our AI-based API-Testing tool
| (https://ai.stepci.com), despite having competitors like
| GitHub Co-Pilot.
|
| Why? Because they lack specificity. We're domain experts, we
| know how to prompt it correctly to get the best results for a
| given domain. The moat is having model do one task extremely
| well rather than do 100 things "alright"
| darkwater wrote:
| Sorry to be blunt but they can be totally right, if you do
| not succeed and have to shut down your startup.
| ushakov wrote:
| It certainly will be a fun experience. But our current
| belief is that LLMs are a commodity and the real value is
| in (application-specific) products built on top of them.
| esafak wrote:
| If you just launched it is too soon to speak.
| ushakov wrote:
| Of course! Today our assumption is that LLMs are
| commodities and our job is to get the most out of them
| for the type of problem we're solving (API Testing for
| us!)
| sharemywin wrote:
| Time will tell
| parkerhiggins wrote:
| Domain specialization could be the moat, not only in the
| business domain but the sheer cost of
| deployment/refinement.
|
| Check out Will Bennett's "Small language models and
| building defensibility" - https://will-
| bennett.beehiiv.com/p/small-language-models-and... (free
| email newsletter subscription required)
| renewiltord wrote:
| Writer.ai is quite successful, and is totally in another
| industry that Google+MS participate in.
| colordrops wrote:
| I haven't been paying attention, why are embeddings not needed
| anymore?
| lazzlazzlazz wrote:
| OP is incorrect. Embeddings are still needed since (1)
| context windows can't contain all data and (2) data
| memorization and continuous retraining is not yet viable.
| nextworddev wrote:
| "yet"
| coding123 wrote:
| It's also much slower. LLMs are generating text token at
| a time. That's not very good for search.
|
| Pre-search tokenization however, probably a good fit for
| LLMs.
| zwily wrote:
| But the common use case of using a vector DB to pull in
| augmentation appears to now be handled by the Assistants
| API. I haven't dug into the details yet but it appears you
| can upload files and the contents will be used (likely with
| some sort of vector searching happening behind the scenes).
| emadabdulrahim wrote:
| I believe their API can be stateful now:
| https://openai.com/blog/new-models-and-developer-products-
| an...
| sharemywin wrote:
| Retrieval: augments the assistant with knowledge from outside
| our models, such as proprietary domain data, product
| information or documents provided by your users. This means
| you don't need to compute and store embeddings for your
| documents, or implement chunking and search algorithms. The
| Assistants API optimizes what retrieval technique to use
| based on our experience building knowledge retrieval in
| ChatGPT.
|
| The model then decides when to retrieve content based on the
| user Messages. The Assistants API automatically chooses
| between two retrieval techniques:
|
| it either passes the file content in the prompt for short
| documents, or performs a vector search for longer documents
| Retrieval currently optimizes for quality by adding all
| relevant content to the context of model calls. We plan to
| introduce other retrieval strategies to enable developers to
| choose a different tradeoff between retrieval quality and
| model usage cost.
| sjnair96 wrote:
| Really cool to see the Assistants API's nuanced document
| retrieval methods. Do you index over the text besides
| chunking it up and generating embeddings? I'm curious about
| the indexing and the depth of analysis for longer docs,
| like assessing an author's tone chapter by chapter--vector
| search might have its limits there. Plus, the process to
| shape user queries into retrievable embeddings seems
| complex. Eager to hear more about these strategies, at
| least what you can spill!
| lazzlazzlazz wrote:
| Embeddings are still important (context windows can't contain
| all data + memorization and continuous retraining is not yet
| viable), and vertical AI agent startups can still lead on UX.
| Finbarr wrote:
| Context windows can't contain all data... yet.
| ren_engineer wrote:
| depends on how much developers are willing to embrace the risk
| of building everything on OpenAI and getting locked onto their
| platform.
|
| What's stopping OpenAI from cranking up the inference pricing
| once they choke out the competition? That combined with the
| expanded context length makes it seem like they are trying to
| lead developers towards just throwing everything into context
| without much thought, which could be painful down the road
| keithwhor wrote:
| I suspect it is in OpenAI's interest to have their API as a
| loss leader for the foreseeable future, and keep margins slim
| once they've cornered the market. The playbook here isn't to
| lock in developers and jack up the API price, it's the
| marketplace play: attract developers, identify the highest-
| margin highest-volume vertical segments built atop the
| platform, then gobble them up with new software.
|
| They can then either act as a distributor and take a
| marketplace fee or go full Amazon and start competing in
| their own marketplace.
| baq wrote:
| Checking hn and product hunt a few times a week gives you most
| of that awareness and I don't need to remind you about the
| person behind hn 'sama' handle.
| bluecrab wrote:
| Vector DBs should never have existed in the first place. I feel
| sorry for the agent startups though.
| m3kw9 wrote:
| How does this absolve vectordbs
| danielbln wrote:
| It doesn't, but semantic search is a lot less relevant if
| you can squeeze 350 pages of text into the context.
| dragonwriter wrote:
| If you are using OpenAI, the new Assistants API looks like
| itnwill handle internally what you used to handle
| externally with a vector DB for RAG (and for some things,
| GPT-4-Turbo's 128k context window will make it unnecessary
| entirely.) There are some other uses for Vector DBs than
| RAG for LLMs, and there are reasons people might use non-
| OpenAI LLMs with RAG, so there is still a role for
| VectorDBs, but it shrunk a lot with this.
| echelon wrote:
| We don't want Open AI to win everything.
| blibble wrote:
| HN is quite notorious for _that_ Dropbox comment
|
| I suspect that video is going to end up more notorious, it's
| even funnier given it's the VCs themselves
| arcanemachiner wrote:
| More context, please.
|
| EDIT: I guess it's this:
|
| https://news.ycombinator.com/item?id=8863#9224
| blibble wrote:
| that's the one
| bilsbie wrote:
| Why don't you need embedding?
| riku_iki wrote:
| > - vectorDB startups -> don't need embeddings anymore
|
| they don't provide embedings, but storage and query engines for
| embeddings, so still very relevant
|
| > - file processing startups -> don't need to process files
| anymore
|
| curious what is that exactly?..
|
| > - vertical ai agent startups -> GPT marketplace
|
| sure, those startups will be selling their agents on
| marketplace
| make3 wrote:
| they definitely do provide embeddings,
| https://openai.com/blog/new-models-and-developer-products-
| an... ctrl+f retrieval, "... won't need to ... compute or
| store embeddings"
| riku_iki wrote:
| I mean embeddingsDB startups don't provide embeddings. They
| provide databases which allows to store and query computed
| embeddings (e.g. computed by ChatGPT), so they are
| complimentary services.
| larodi wrote:
| Well, if said startups were visionaries, the could've known
| better the business they're entering. On the other hand - there
| are plenty of VC-inflated balloons, making lots of noise, that
| everyone would be happy to see go. If you mean these startups -
| well, farewell.
|
| There's plenty more to innovate, really, saying OpenAI killed
| startups it's like saying that PHP/Wordpress/NameIt killed
| small shops doing static HTML. or IBM killing the... typewriter
| companies. Well, as I said - they could've known better.
| Competition is not always to blame.
| karmasimida wrote:
| TBH those are low-hanging fruits for OpenAI. Much of the value
| still being captured by OpenAI's own model.
|
| The sad thing is, GPT-4 is its own league in the whole LLM
| game, whatever those other startups are selling, it isn't
| competing with OpenAI.
| schrodingerscow wrote:
| I'm confused by the pricing. Gpt-4 turbo appears to be better in
| every way, but is 3x cheaper?!
| dragonwriter wrote:
| The same as true of GPT-3.5-turbo compared to the GPT-3 models
| which preceded it.
|
| They want everyone on GPT-4-turbo. It may also be a smaller (or
| otherwise more efficient) but more heavily trained model that
| is cheaper to do inference on.
| tornato7 wrote:
| According to [1], the new gpt-4-1106-preview model should be
| available to all, but the API is telling me "The model
| `gpt-4-1106-preview` does not exist or you do not have access to
| it."
|
| Anyone able to call it from the API?
|
| 1. https://help.openai.com/en/articles/8555510-gpt-4-turbo
| anotherpaulg wrote:
| Same. I am eager to run my code editing benchmark [1] against
| it, to compare it with gpt-4-0314 and gpt-4-0613.
|
| Edit: Ha, I just re-read the announcement [2] and it says 1pm
| in the 5th sentence: We'll begin rolling out
| new features to OpenAI customers starting at 1pm PT today.
|
| [1] https://aider.chat/docs/benchmarks.html
|
| [2] https://openai.com/blog/new-models-and-developer-products-
| an...
| naiv wrote:
| rumours on x are that it will be available 1pm san francisco
| time
| tekacs wrote:
| > We'll begin rolling out new features to OpenAI customers
| starting at 1pm PT today.
|
| ^ It says exactly this in the linked article.
| naiv wrote:
| oh, totally overread this :D
| reqo wrote:
| Didn't the tickets to Dev Day cost around 600$? They basically
| took that money and gave it back to developers as credits so they
| can start using their API today! Pretty smart move!
| longnguyen wrote:
| Awesome. Adding GPT-4 Turbo and DALL*E 3 to my ChatGPT macOS
| client[0]
|
| [0]: https://boltai.com
| gwern wrote:
| > We're also launching a feature to return the log probabilities
| for the most likely output tokens generated by GPT-4 Turbo and
| GPT-3.5 Turbo in the next few weeks, which will be useful for
| building features such as autocomplete in a search experience.
|
| This is very surprising to me. Are they not worried about people
| not just training on GPT-4 outputs to steal the model
| capabilities, but doing full blown logit knowledge-distillation?
| (Which is the reason everyone assumed that they disabled logit
| access in the first place.)
| leobg wrote:
| How many GBs worth of logits would you need to reverse engineer
| their model? Also, if it's a conglomerate of models that
| they're using, you'd end up in a blind alley.
| danielmarkbruce wrote:
| I thought the same thing.... My guess is they did a lot of
| analysis and decided it would be safe enough to do? "most
| likely" might be literally a handful and cover little of the
| entire distribution % wise?
| saliagato wrote:
| You can now [1] pay from $2 to $3 million to pretrain custom
| gpt-n model. This has gone unnoticed but seems really neat.
| Provided that a start-up has enough money spend on that, it would
| certainly give competitive advantage.
|
| [1] https://openai.com/form/custom-models
|
| Edit: forgot to put the link
| llmllmllm wrote:
| While this makes some of what my startup https://flowch.ai does a
| commodity (file uploads and embeddings based queries are an
| example, but we'll see how well they do it - chunking and
| querying with RAG isn't easy to do well), the lower prices of
| models make my overall platform way better value, so I'd say
| overall it's a big positive.
|
| Speaking more generally, there's always room for multiple
| players, especially in specific niches.
| mediaman wrote:
| Their system also does not seem to support techniques like
| hybrid search, automated cleaning/modifying of chunks prior to
| embedding, or the ability to access citations used, all of
| which are pretty important for enterprise search.
|
| Could just mean it's coming, though.
| aantix wrote:
| Can I pay someone to have my ChatGPT transcripts searchable?
| raylad wrote:
| So with 128K context window, if you actually input 100K it would
| cost you:
|
| Input: $0.01 per 1K tokens * 100 = $1.00
|
| $1.00 per query?
|
| Given that each query uses the entire context window, the session
| would start at $1 for the first query and go up from there? Or do
| I have it wrong?
| minimaxir wrote:
| It would be $1 for each individual API call, if you were
| continuing the conversation based on the same 100K input.
| ChatGPT is stateless.
| raylad wrote:
| Right, so that adds up very fast.
| 0xDEF wrote:
| If it truly is GPT-4+ with a 128K context window it's still
| absolutely worth the high price. However if they are cheating
| like everyone else who has promised gigantic context windows
| then we are better off with RAG and a vector database.
| shanusmagnus wrote:
| This is kind of the wrong place for this, but given the burst of
| attention from LLM-loving people: is there any open source chat
| scaffolding that actually provides a good UI for organizing chat
| streams and doing stuff with them?
|
| A trivial example is how the LHS of the ChatGPT UI only allows
| you a handful of characters to name your chat, and you can't even
| drag the pane to the right to make it bigger; so I have all these
| chats with cryptic names from the last eleven months that I can't
| figure out wtf they are; and folders are subject to the same
| problem.
|
| Seriously, just being able to organize all my chats would be a
| massive help; but there are so many cool things you could do
| beyond this! But I've found nothing other than literal clones of
| the ChatGPT UI. Is there really nothing? Nobody has made anything
| better?
| bluecrab wrote:
| Also natural language search of the chat history would be
| great.
| nextworddev wrote:
| Organize how?
| sharemywin wrote:
| tree structure. like email.
| shanusmagnus wrote:
| That would be one very obvious way and a big improvement
| over the current state of affairs.
| sharemywin wrote:
| I agree why not vector search for history.
| davidbarker wrote:
| This may not be useful to you, but there are browser extensions
| that add a bunch of functionality to ChatGPT.
|
| The first that comes to mind:
| https://chrome.google.com/webstore/detail/superpower-chatgpt...
| shanusmagnus wrote:
| No joy with the one you linked (can't see what problem that
| one is actually solving), but I'll look through browser
| extensions -- I hadn't considered that.
| ryanklee wrote:
| ChatGPT Keeper Chrome extension at least allows for search.
| singularity2001 wrote:
| did they break the api?
|
| from openai import OpenAI
|
| Traceback (most recent call last): File "<stdin>", line 1, in
| <module> ImportError: cannot import name 'OpenAI' from 'openai'
|
| If so where is the current documentation?
| ofermend wrote:
| Excited to see GPT4-Turbo and longer sequence lengths from
| OpenAI. We just released Vectara's "Hallucination Evaluation
| Model" (aka HEM) today
| https://huggingface.co/vectara/hallucination_evaluation_mode...
| (along with this leaderboard:
| https://github.com/vectara/hallucination-leaderboard). GPT-4 was
| already in the lead. Looking forward to seeing GPT4-Turbo there
| soon.
| m3kw9 wrote:
| How many startups got shafted today?
| dangrigsby wrote:
| Is there a special "developer" designation? I am a paying API
| customer, but can't see gpt-4-1106-preview in the playground and
| can't use it via the API.
| danenania wrote:
| Apparently they'll be granting access at 1pm PST. We'll see
| what happens. Rate limits also don't seem to be updated yet to
| reflect their new "Usage Tiers" -
| https://platform.openai.com/docs/guides/rate-limits/usage-ti...
| karmajunkie wrote:
| As other comments have noted it seems to be rolling out at 1pm
| PST today
| wilg wrote:
| What context length will ChatGPT have on GPT-4-Turbo? It wasn't
| using the full 32K before was it?
| bluck wrote:
| Copyright Shield
|
| > OpenAI is committed to protecting our customers with built-in
| copyright safeguards in our systems. Today, we're going one step
| further and introducing Copyright Shield--we will now step in and
| defend our customers, and pay the costs incurred, if you face
| legal claims around copyright infringement. This applies to
| generally available features of ChatGPT Enterprise and our
| developer platform.
|
| So essentially they are giving devs a free pass to treat any
| output as free of copyright infringement? Pretty bold when
| training data sources are kinda unknown.
| fnordpiglet wrote:
| It's not unknown to OpenAI, presumably? And I assume the shield
| evaporates if their court cases goes against them.
| layer8 wrote:
| It probably also means having to remain a paying customer as
| long as you want that protection to persist for any previous
| output.
| tyree731 wrote:
| I am not a lawyer, but this doesn't seem quite "free". Note
| that they aren't indemnifying customers for any consequences of
| said legal claims, meaning that customers would seem to bare
| the full brunt of those consequences should there be a credible
| copyright infringement claim.
| ShakataGaNai wrote:
| For large-scale usage, it doesn't matter what the devs want. If
| the lawyers show up and say "We can't use this technology
| because we're probably going to get sued for copyright
| infringement", it's dead in the water.
|
| It's a logical "feature" for them to offer this "shield" as it
| significantly mitigates one of the large legal concerns to
| date. It doesn't make the risks fully go away, but if someone
| else is going to step up and cover the costs, then it could be
| worthwhile.
|
| For large enterprises, IP is a big deal, probably the single
| biggest concern. They'll spend years and billions of dollars
| attempting to protect it, _cough_ sco /oracle _cough_ , right
| or wrong.
| conorh wrote:
| We just changed a project we've been working on to try out the
| new gpt-4-turbo model and it is MUCH faster. I don't know if this
| is a factor of the number of people using it or not, but
| streaming a response for the prompts we are interested in went
| from 40-50 seconds to 6 seconds.
| activescott wrote:
| It is interesting that the updates are largely developer
| experience updates. It doesn't appear that significant
| innovations are happening on the core models outside of
| performance/cost improvements. Both devex and perf/cost are
| important to be sure, but incremental.
| Davidzheng wrote:
| presumably next model is coming next year?
| danielmarkbruce wrote:
| 128k context?
| layer8 wrote:
| The TTS seems really nice, though still relatively expensive, and
| probably limited to English (?). I can't wait until that level of
| TTS will become available basically for free, and/or self-hosted,
| with multi-language support, and ubiquitous on mobile and
| desktop.
___________________________________________________________________
(page generated 2023-11-06 21:00 UTC)