[HN Gopher] OpenAI: Increased errors across API and ChatGPT
___________________________________________________________________
OpenAI: Increased errors across API and ChatGPT
Author : zeptonix
Score : 66 points
Date : 2023-11-28 19:51 UTC (3 hours ago)
(HTM) web link (status.openai.com)
(TXT) w3m dump (status.openai.com)
| bun_at_work wrote:
| > OK I have a table in postgresql and I am adding a trigger such
| that when an insert happens on that table, an insert happens on
| another table. The second table has a constraint. What happens to
| the first insert if the second insert violates the constraint?
|
| How can I get help with this now?
|
| Google result 1:
| https://stackoverflow.com/questions/77148711/create-a-trigge...
|
| Google result 2:
| https://dba.stackexchange.com/questions/307448/postgresql-tr...
|
| Like 90% of my questions like this are going to ChatGPT these
| days.
|
| I can figure it out via the docs, but ChatGPT is SO convenient
| for things like this.
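A self-contained way to settle the question above is a throwaway script. The sketch below uses Python's stdlib sqlite3 instead of PostgreSQL so it runs with no server; in Postgres, too, an error raised inside a trigger aborts the triggering statement. Table and trigger names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A trigger on main_t copies every inserted id into audit_t,
# which carries a UNIQUE constraint.
conn.executescript("""
CREATE TABLE main_t (id INTEGER);
CREATE TABLE audit_t (id INTEGER UNIQUE);
CREATE TRIGGER copy_insert AFTER INSERT ON main_t
BEGIN
    INSERT INTO audit_t (id) VALUES (NEW.id);
END;
""")

conn.execute("INSERT INTO main_t VALUES (1)")  # fine: audit_t gets 1

try:
    conn.execute("INSERT INTO main_t VALUES (1)")  # trigger violates UNIQUE
except sqlite3.IntegrityError as e:
    print("second insert failed:", e)

# The triggering insert is backed out along with the trigger's insert:
print(conn.execute("SELECT COUNT(*) FROM main_t").fetchone()[0])  # 1, not 2
```

So the first insert does not survive a constraint failure inside the trigger: the whole statement, trigger included, is rolled back.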
| netcraft wrote:
| Agreed that chatgpt is great for this kind of thing - a
| coworker is working on this GPT specifically for postgres
| https://chat.openai.com/g/g-uXYoYQEFi-sql-sage
|
| But with it being down, my biggest advice would be to try it
| and see. Something like dbfiddle.uk is perfect for these kinds
| of tests.
| SoftTalker wrote:
| > What happens to the first insert if the second insert
| violates the constraint?
|
| Try it and see? Why do you need an AI to help with this?
| speedgoose wrote:
| Why do you use an internet search engine when you can walk to
| the library?
| rurp wrote:
| The question at hand is pretty easy to test manually and
| the information you get is much more useful. You will get
| to see the exact behavior for yourself, can easily build on
| the test case as related questions come up, and you _know_
| the information you are getting is correct rather than a
| hallucination.
|
| Copying information from ChatGPT is the newer version of
| blindly copying answers from StackOverflow. It often works
| out ok and at times makes sense to do, but it can easily
| lead to software flaws and doesn't do much to build a
| better understanding of the domain, which is necessary to
| solve more difficult challenges that don't fit well into a
| Q&A format.
| speedgoose wrote:
| In my experience, I encounter more issues and waste more
| time when I fiddle on my own and try stuff compared to
| doing the same, but using chatGPT.
|
| There is a lot of knowledge that I don't want to have
| expertise with. Sure, I could carefully read the
| PostgreSQL documentation about triggers and implement it
| myself, or I could get the job done in a few minutes and
| procrastinate on HN instead.
| JoshuaDavid wrote:
| > The question at hand is pretty easy to test manually
| and the information you get is much more useful.
|
| This approach can be hazardous to the health of the
| product you're building. For example, if you take this
| approach to answer the question of "what happens if I
| have two connections to a MySQL database, start a
| transaction in one of them and insert a row (but don't
| commit) and then issue a SELECT which would show the
| inserted row", then you will see consistent results
| across all of the experiments you run with that
| particular database, but you could easily end up with
| bugs that only show up when the transaction isolation
| level changes from how you tested it.
|
| Whereas if you search for or ask that question, the
| answers you get will likely mention that transaction
| isolation levels are a thing.
|
| You might _also_ be able to get this level of knowledge
| by reading the manual, though there will still be things
| that are not included in the manual but do come up
| regularly in discussions on the wider internet.
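The experiment itself is cheap to stand up; the sketch below uses stdlib sqlite3 as a local stand-in (file paths invented). The caveat above still applies: in MySQL, what the second connection sees depends on the session's isolation level (READ UNCOMMITTED would show the row, the default REPEATABLE READ would not), and a single local run won't surface that.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "iso.db")

# autocommit mode; manage the transaction by hand
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE t (id INTEGER)")

def count_rows():
    """Count rows from a fresh second connection, releasing its locks after."""
    reader = sqlite3.connect(path)
    try:
        return reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        reader.close()

writer.execute("BEGIN")
writer.execute("INSERT INTO t VALUES (1)")   # not yet committed
print(count_rows())                          # 0: uncommitted row is invisible

writer.execute("COMMIT")
print(count_rows())                          # 1: visible after commit
```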
| Tommstein wrote:
| You can also, uh, just try it with a trivial test and see what
| happens.
| swatcoder wrote:
| Well, were it possible, I'd say go back in time and study your
| tools so that you're not spending the journeyman period of your
| career ricocheting between tutorials and faqs.
|
| Failing that, read the documentation. Failing that, stand up a
| quick experiment.
|
| Somehow, we survived before ChatGPT and even before saturated
| question boards. Those strategies are still available to you
| and well worth learning.
| dvfjsdhgfv wrote:
| I see your point but the world changes so fast. Back in my
| day you just needed to learn C, understand algorithms and so
| on and then you could get deeper in an area or two. Today,
| you need to understand and be able to proficiently use so
| many technologies that you can feel lost.
|
| And this is what happens when, say, you lose a job you've
| been doing for 10-15 years. You need to re-learn the world.
| And a lifetime is not enough to do it the way we used to do
| it.
| mritchie712 wrote:
| I'd go straight to the experiment, create the tables on a
| local postgresdb and try to get it to work.
| linsomniac wrote:
| The "good old days" weren't always good. I'm tired of
| limiting myself to the information I have on the top of my
| head; the LLMs are really helping me be creative and
| stretch out to do things that are just beyond my bread and
| butter, or that I do infrequently.
| bigfudge wrote:
| Exactly this. I -could- become an expert in the intricacies
| of every tool I touch, or I could use chat gpt and move on
| to solving the next problem.
| toomuchtodo wrote:
| LLMs are the great equalizer of our time.
| EnergyAmy wrote:
| We also survived before the internet and indoor plumbing and
| fire, and yet life is so much better now.
| cpursley wrote:
| Yeah, not all of us have memories that work like that. I've
| studied my tools but often forget the little details. My
| productivity has increased since GPT has come out.
| bigfudge wrote:
| Stuff changes too. There are things that are worth learning
| and being fluent in: regex, SQL. But even then there are
| always edge cases or weirdness that someone has solved
| before. LLMs are just much better for this than wading
| through forum posts.
| morsch wrote:
| So ChatGPT says -- to me, a minute ago, ymmv -- it will roll
| back the first insert. Now what? Do you believe it? Cool. I
| wouldn't. I would confirm its claim, either by Googling or by
| trying it myself.
|
| Also, when I asked it "what if I use PostgreSQL's non-
| transactional triggers", which I _thought_ I just made up, it
| told me it _wouldn't_ roll back the first insert: _Non-
| transactional triggers are executed as part of the statement
| that triggered them, but they don't participate in the
| transaction control._ So now I don't know what to think.
| sidibe wrote:
| Not working for me. Am I going to have to read docs or search
| Google like some boomer?
| rossdavidh wrote:
| (speaking as a just-slightly-pre-Boomer) Yes. But probably not
| for long.
|
| However, you might want to get used to it, as it looks like
| this might happen fairly often.
| callalex wrote:
| If you are not ok with casually saying racist or sexist things,
| you probably also shouldn't say ageist things either.
| Eumenes wrote:
| I think it's okay to make little generational jokes. It's
| not like they said... "Do I have to google this like some
| old f*ck?" Certain generations are slower to pick up
| technology.
| timeon wrote:
| With Google'n'co you at least know when a search result is
| wrong.
| bun_at_work wrote:
| Don't people often fall into the "vaccines cause autism" trap
| from Google?
| TechRemarker wrote:
| Same here for the past 30+ minutes. I was surprised not to
| see it on HN, but I guess it just took a little time for
| someone to post. Trying Bard reminded me how far behind it
| is for programming questions.
| verdverm wrote:
| You might find the results from Google Cloud's Vertex AI better
| than the general purpose Bard. They have a number of pre-
| trained models for coding tasks. You can chat in the console UI
| or use the API directly. They also offer a number of open
| source models (codey & llama), so you can easily try different
| models
|
| https://cloud.google.com/vertex-ai
|
| https://console.cloud.google.com/vertex-ai/model-garden
| pnathan wrote:
| I am finding Bard almost comparable. GPT quality is
| declining, I think.
|
| Wouldn't surprise me if Bard permanently surpasses GPT in
| the next quarter, particularly if OpenAI is dialing down
| quality...
| mg wrote:
| I'm running this comparison of free and open AI engines:
|
| https://www.gnod.com/search/ai
|
| Looks like they all currently work.
|
| If there are more, let me know.
| rvz wrote:
| We should expect OpenAI's API to go down regularly, at least
| every week [0], just like GitHub does all the time. But have
| you tried contacting the CEO of OpenAI this time?
|
| [0] https://news.ycombinator.com/item?id=38371339
| drexlspivey wrote:
| Has anyone managed to replicate the "search the web"
| functionality through the API? I've set up two "functions" one to
| get search results and one to extract the text from a search
| results and feed it back to the AI but I am a bit stuck.
|
| What do you use to extract the text from a webpage and how do you
| handle websites with anti-bot measures?
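For the extraction half, a dependency-free starting point is the stdlib html.parser; this sketch only strips tags and skips script/style blocks (anti-bot measures are a separate problem, typically handled with a headless browser or a fetch service, which no parser solves). The class name is invented.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of script/style tags."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

page = "<html><head><script>var x=1;</script></head><body><p>Hello <b>world</b></p></body></html>"
print(extract_text(page))  # Hello world
```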
| altdataseller wrote:
| Shouldn't the API response be the exact response you would
| get if you sent the same input to ChatGPT, assuming it's the
| same model?
| doppio19 wrote:
| No, ChatGPT is more than just a UI for the OpenAI API. Web
| requests are a feature built into ChatGPT using the API's
| support for function calls, but the API doesn't make any
| external web requests by itself.
| CSMastermind wrote:
| The API doesn't have access to the web search functionality
| unless something changed.
| BrunoJo wrote:
| You may want to try https://lemonfox.ai/ as a OpenAI API
| alternative. I think relying on open-source models is a great
| alternative.
| Filligree wrote:
| The solution to 'GPT-4 sometimes breaks' isn't to use something
| that never works...
| nuz wrote:
| Finetuned local models can work just as well or better than
| gpt-4 in many use cases
| jjcon wrote:
| Have you never used the open source models? They are getting
| really good - better than 3.5 for sure, not as good as 4
| except when domain-trained, in my opinion.
| timthelion wrote:
| I think we need a new type of status page, or at least a
| public version number, on LLMs. Yesterday GPT-4 started
| giving me nonsensical, super-generic answers, like it was
| hardly reading what I wrote, and today it is back to top-
| notch performance. I think they were trying to make the
| model more efficient or something, but I just saw a massive
| decrease in the quality of output. From my side, though,
| there is no version number except for "4"...
| buildbot wrote:
| This conspiracy always comes up - don't you think that they
| test the output of model revisions on probably thousands of
| downstream tasks at this point? Bad responses are hard to
| reason about: it could be prompting, a model revision, or
| just bad luck.
| lazysheepherd wrote:
| Or maybe they are just AB testing and aggressively optimizing
| the response generation?
|
| LLMs are known to be compute/energy hungry to execute. It is
| a developing technology, if not downright experimental.
|
| Therefore, this explanation is very likely. I cannot see the
| reason to call this a conspiracy.
| willsmith72 wrote:
| AB testing on what? AB tests need to produce some results
| which are then compared. How would releasing different
| versions in production help with that?
|
| It would make more sense if that was internal and the
| responses were then graded.
|
| A failed canary release would be more likely, where they
| released this version to a small number of people without
| realising it was bad.
| timthelion wrote:
| There are the up down thumbs and automatic sentiment
| analysis as a test.
| lazysheepherd wrote:
| Off the top of my head: responses have feedback buttons
| below them.
|
| You can simply deploy different versions and compare each
| one's positive-to-negative feedback ratio.
|
| It would be sinful if they did not add other metrics, like
| how many times the user had to correct and update their
| prompt before ending the chat, etc.
|
| Data, data, data...
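The comparison described above is cheap to compute once the feedback events are logged; here is a toy version, with variant labels and the event shape invented for illustration:

```python
from collections import defaultdict

def feedback_ratio(events):
    """events: iterable of (variant, feedback) pairs, feedback in {'up', 'down', None}.

    Returns the fraction of rated responses per variant that got a thumbs-up."""
    counts = defaultdict(lambda: {"up": 0, "down": 0})
    for variant, feedback in events:
        if feedback in ("up", "down"):
            counts[variant][feedback] += 1
    return {
        variant: c["up"] / (c["up"] + c["down"])
        for variant, c in counts.items()
        if c["up"] + c["down"] > 0
    }

events = [("A", "up"), ("A", "down"), ("B", "up"), ("B", None)]
print(feedback_ratio(events))  # {'A': 0.5, 'B': 1.0}
```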
| timthelion wrote:
| Calling that a conspiracy is like saying it's a conspiracy
| theory that Meta shows different people different ads. I'd be
| more concerned if OpenAI WASN'T constantly trying to tune
| their models. It's literally their job to tune the models.
| captainkrtek wrote:
| On a recent interview of Sam Altman (Hard Fork podcast) he
| mentioned that due to the load they have been trying to make
| optimizations, disable certain features, etc. so it's not
| outside the realm of possibility that some tweak caused this.
| __loam wrote:
| I think one of the harder things about developing these
| models is that regressions are hard to figure out or even
| detect.
| airstrike wrote:
| This happens so often it's made it really easy to test an app I'm
| developing for API outages and put it into "maintenance mode"
| accordingly. I don't even need to mock the outage... just wait
| for the weekly occurrence
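One way to automate that maintenance-mode switch is a small circuit breaker wrapped around the upstream call. This is a hypothetical sketch (class name, threshold, and cooldown all invented), not a description of how any particular app does it:

```python
import time

class CircuitBreaker:
    """Trips into 'maintenance mode' after consecutive upstream failures."""

    def __init__(self, threshold=3, cooldown=30.0):
        self.threshold = threshold    # consecutive failures before tripping
        self.cooldown = cooldown      # seconds to stay tripped
        self.failures = 0
        self.tripped_at = None

    def in_maintenance(self):
        if self.tripped_at is None:
            return False
        if time.monotonic() - self.tripped_at >= self.cooldown:
            self.tripped_at = None    # cooldown elapsed: allow a retry
            self.failures = 0
            return False
        return True

    def call(self, fn):
        if self.in_maintenance():
            raise RuntimeError("maintenance mode")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.tripped_at = time.monotonic()
            raise
        self.failures = 0             # any success resets the count
        return result
```

The app's request handler would check `in_maintenance()` and render a static maintenance page instead of calling the API.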
| extheat wrote:
| As an FYI, Bing Chat at http://bing.com/chat continues to work
| even during OpenAI outages. It's also running GPT-4 -- it can be
| annoying when it reaches out to search, but you can usually
| explicitly prompt it to not do that.
| jug wrote:
| Yes it's indeed an option. Note that Creative mode uses GPT-4,
| not the default Balanced.
| Al-Khwarizmi wrote:
| What does Balanced use then? 3.5?
| bilsbie wrote:
| I'm seeing everyone reporting "laziness" today on X. Like it's
| telling people to do their own coding. What's up with that?
| padjo wrote:
| The AI has joined the teamsters
| alexdoesstuff wrote:
| Does anyone know if Azure's OpenAI Studio is down as well? For
| everyone using ChatGPT APIs in production, this needs to be the
| most straightforward failover mechanism.
|
| Using this to plug our open-source tool
| https://github.com/Marvin-Labs/lbgpt which allows ChatGPT
| consumers to quickly load balance and failover between OpenAi and
| Azure models.
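A minimal version of that failover pattern, independent of any particular SDK (the backend callables below are invented stand-ins for real OpenAI/Azure clients):

```python
def complete_with_failover(prompt, backends):
    """Try each (name, callable) backend in order; return the first success."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # real code would catch the SDK's error types
            errors.append((name, exc))
    raise RuntimeError(f"all backends failed: {errors}")

def flaky_openai(prompt):
    raise ConnectionError("increased errors across API")

def azure(prompt):
    return f"echo: {prompt}"

print(complete_with_failover("hi", [("openai", flaky_openai), ("azure", azure)]))
# ('azure', 'echo: hi')
```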
| nonfamous wrote:
| Azure OpenAI is running normally
| https://azure.status.microsoft/en-us/status
| gardenhedge wrote:
| Why don't they disable the free version when they're hitting
| this type of load?
___________________________________________________________________
(page generated 2023-11-28 23:01 UTC)