[HN Gopher] OpenAI: Increased errors across API and ChatGPT
       ___________________________________________________________________
        
       OpenAI: Increased errors across API and ChatGPT
        
       Author : zeptonix
       Score  : 66 points
       Date   : 2023-11-28 19:51 UTC (3 hours ago)
        
 (HTM) web link (status.openai.com)
 (TXT) w3m dump (status.openai.com)
        
       | bun_at_work wrote:
       | > OK I have a table in postgresql and I am adding a trigger such
       | that when an insert happens on that table, an insert happens on
       | another table. The second table has a constraint. What happens to
       | the first insert if the second insert violates the constraint?
       | 
       | How can I get help with this now?
       | 
       | Google result 1:
       | https://stackoverflow.com/questions/77148711/create-a-trigge...
       | 
       | Google result 2:
       | https://dba.stackexchange.com/questions/307448/postgresql-tr...
       | 
       | Like 90% of my questions like this are going to ChatGPT these
       | days.
       | 
       | I can figure it out via the docs, but ChatGPT is SO convenient
       | for things like this.
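The question in this comment is easy to settle with a small experiment. A minimal sketch, using SQLite through Python's stdlib rather than PostgreSQL (table and trigger names are made up): with SQLite's default ABORT conflict resolution, as in PostgreSQL, a constraint violation raised by the trigger's insert aborts the whole triggering statement, so the first insert does not land either.

```python
import sqlite3

# Hypothetical tables: main_t gets the trigger, audit_t has a NOT NULL
# constraint that the trigger's insert will violate.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_t (id INTEGER PRIMARY KEY, val TEXT);
CREATE TABLE audit_t (id INTEGER PRIMARY KEY, val TEXT NOT NULL);
CREATE TRIGGER log_insert AFTER INSERT ON main_t
BEGIN
    INSERT INTO audit_t (val) VALUES (NULL);  -- violates NOT NULL
END;
""")

try:
    conn.execute("INSERT INTO main_t (val) VALUES ('hello')")
except sqlite3.IntegrityError as e:
    print("trigger insert failed:", e)

# The constraint violation aborts the whole statement, so the
# original insert is gone too.
rows = conn.execute("SELECT COUNT(*) FROM main_t").fetchone()[0]
print("rows in main_t:", rows)  # rows in main_t: 0
```

In PostgreSQL the same experiment additionally leaves the enclosing transaction in an aborted state until it is rolled back.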
        
         | netcraft wrote:
         | Agreed that chatgpt is great for this kind of thing - a
         | coworker is working on this GPT specifically for postgres
         | https://chat.openai.com/g/g-uXYoYQEFi-sql-sage
         | 
          | But with it being down, my biggest advice would be to try it
          | and see. Something like dbfiddle.uk is perfect for these
          | kinds of tests.
        
         | SoftTalker wrote:
         | > What happens to the first insert if the second insert
         | violates the constraint?
         | 
         | Try it and see? Why do you need an AI to help with this?
        
           | speedgoose wrote:
           | Why do you use an internet search engine when you can walk to
           | the library?
        
             | rurp wrote:
             | The question at hand is pretty easy to test manually and
             | the information you get is much more useful. You will get
             | to see the exact behavior for yourself, can easily build on
             | the test case as related questions come up, and you _know_
             | the information you are getting is correct rather than a
             | hallucination.
             | 
             | Copying information from ChatGPT is the newer version of
             | blindly copying answers from StackOverflow. It often works
             | out ok and at times makes sense to do, but it can easily
              | lead to software flaws and doesn't do much to build a
              | better understanding of the domain, which is necessary
              | to solve more difficult challenges that don't fit into a
              | Q&A format well.
        
               | speedgoose wrote:
                | In my experience, I encounter more issues and waste
                | more time when I fiddle on my own and try stuff than
                | when I do the same thing with ChatGPT.
               | 
               | There is a lot of knowledge that I don't want to have
               | expertise with. Sure, I could carefully read the
               | PostgreSQL documentation about triggers and implement it
               | myself, or I could get the job done in a few minutes and
               | procrastinate on HN instead.
        
               | JoshuaDavid wrote:
               | > The question at hand is pretty easy to test manually
               | and the information you get is much more useful.
               | 
               | This approach can be hazardous to the health of the
               | product you're building. For example, if you take this
               | approach to answer the question of "what happens if I
               | have two connections to a MySQL database, start a
               | transaction in one of them and insert a row (but don't
               | commit) and then issue a SELECT which would show the
               | inserted row", then you will see consistent results
               | across all of the experiments you run with that
               | particular database, but you could easily end up with
               | bugs that only show up when the transaction isolation
               | level changes from how you tested it.
               | 
               | Whereas if you search for or ask that question, the
               | answers you get will likely mention that transaction
               | isolation levels are a thing.
               | 
               | You might _also_ be able to get this level of knowledge
               | by reading the manual, though there will still be things
               | that are not included in the manual but do come up
               | regularly in discussions on the wider internet.
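This pitfall can be made concrete with a small two-connection experiment, sketched here in SQLite through Python's stdlib rather than MySQL: the reader never sees the uncommitted row, but that result reflects the isolation behavior of the engine and settings you happened to test, not a universal guarantee.

```python
import os
import sqlite3
import tempfile

# Two connections to the same on-disk database.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE t (x INTEGER)")
writer.commit()

# The writer opens a transaction and inserts, but does not commit yet.
writer.execute("INSERT INTO t VALUES (1)")
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

writer.commit()
after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(before, after)  # 0 1
```

Under an engine or isolation level that permits dirty reads, `before` would come out as 1, which is exactly the kind of difference a single ad-hoc test won't reveal.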
        
         | Tommstein wrote:
         | You can also, uh, just try it with a trivial test and see what
         | happens.
        
         | swatcoder wrote:
          | Well, were it possible, I'd say go back in time and study
          | your tools so that you're not spending the journeyman period
          | of your career ricocheting between tutorials and FAQs.
         | 
         | Failing that, read the documentation. Failing that, stand up a
         | quick experiment.
         | 
         | Somehow, we survived before ChatGPT and even before saturated
         | question boards. Those strategies are still available to you
          | and well worth learning.
        
           | dvfjsdhgfv wrote:
           | I see your point but the world changes so fast. Back in my
           | day you just needed to learn C, understand algorithms and so
           | on and then you could get deeper in an area or two. Today,
           | you need to understand and be able to proficiently use so
           | many technologies that you can feel lost.
           | 
            | And this is what happens when, say, you lose a job you've
           | been doing for 10-15 years. You need to re-learn the world.
           | And a lifetime is not enough to do it the way we used to do
           | it.
        
           | mritchie712 wrote:
           | I'd go straight to the experiment, create the tables on a
           | local postgresdb and try to get it to work.
        
           | linsomniac wrote:
            | The "good old days weren't always good". I'm tired of
            | limiting myself to the information I have at the top of my
            | head; the LLMs are really helping me be creative and
            | stretch out to do things that are just beyond my bread and
            | butter, or that I do infrequently.
        
             | bigfudge wrote:
             | Exactly this. I -could- become an expert in the intricacies
             | of every tool I touch, or I could use chat gpt and move on
             | to solving the next problem.
        
               | toomuchtodo wrote:
               | LLMs are the great equalizer of our time.
        
           | EnergyAmy wrote:
           | We also survived before the internet and indoor plumbing and
           | fire, and yet life is so much better now.
        
           | cpursley wrote:
           | Yeah, not all of us have memories that work like that. I've
           | studied my tools but often forget the little details. My
            | productivity has increased since GPT came out.
        
             | bigfudge wrote:
             | Stuff changes too. There are things that are worth learning
              | and being fluent in: regex, SQL. But even then there are
             | always edge cases or weirdness that someone has solved
             | before. LLMs are just much better for this than wading
             | through forum posts.
        
         | morsch wrote:
         | So ChatGPT says -- to me, a minute ago, ymmv -- it will
          | roll back the first insert. Now what? Do you believe it?
          | Cool. I wouldn't. I would confirm its claim, either by
          | Googling or by trying it myself.
         | 
          | Also, when I asked it "what if I use PostgreSQL's non-
          | transactional triggers", which I _thought_ I just made up,
          | it told me it _wouldn't_ roll back the first insert: _Non-
          | transactional triggers are executed as part of the statement
          | that triggered them, but they don't participate in the
          | transaction control._ So now I don't know what to think.
        
       | sidibe wrote:
       | Not working for me. Am I going to have to read docs or search
       | Google like some boomer?
        
         | rossdavidh wrote:
         | (speaking as a just-slightly-pre-Boomer) Yes. But probably not
         | for long.
         | 
          | However, you might want to get used to it, as it looks like
          | it might happen fairly often.
        
         | callalex wrote:
         | If you are not ok with casually saying racist or sexist things,
         | you probably also shouldn't say ageist things either.
        
           | Eumenes wrote:
            | I think it's okay to make little generational jokes. It's
            | not like they said, "Do I have to google this like some
            | old f*ck?" Certain generations are slower to pick up
            | technology.
        
         | timeon wrote:
          | With Google and co., you at least know when a search result
          | is wrong.
        
           | bun_at_work wrote:
           | Don't people often fall into the "vaccines cause autism" trap
           | from Google?
        
       | TechRemarker wrote:
        | Same here for the past 30+ minutes; I was surprised not to
        | see it on HN, but I guess it just took a little time for
        | someone to post. I tried Bard, and it reminded me how far
        | behind it is on programming questions.
        
         | verdverm wrote:
         | You might find the results from Google Cloud's Vertex AI better
         | than the general purpose Bard. They have a number of pre-
         | trained models for coding tasks. You can chat in the console UI
         | or use the API directly. They also offer a number of open
         | source models (codey & llama), so you can easily try different
         | models
         | 
         | https://cloud.google.com/vertex-ai
         | 
         | https://console.cloud.google.com/vertex-ai/model-garden
        
         | pnathan wrote:
          | I am finding Bard almost comparable. GPT quality is
          | declining, I think.
          | 
          | Wouldn't surprise me if Bard permanently surpasses GPT in
          | the next quarter. Particularly if OpenAI is dialing down
          | quality...
        
       | mg wrote:
       | I'm running this comparison of free and open AI engines:
       | 
       | https://www.gnod.com/search/ai
       | 
       | Looks like they all currently work.
       | 
       | If there are more, let me know.
        
       | rvz wrote:
        | We should expect OpenAI's API to go down regularly, at least
        | every week [0], just like GitHub does all the time. But have
        | you tried contacting the CEO of OpenAI this time?
       | 
       | [0] https://news.ycombinator.com/item?id=38371339
        
       | drexlspivey wrote:
       | Has anyone managed to replicate the "search the web"
        | functionality through the API? I've set up two "functions":
        | one to get search results and one to extract the text from a
        | search result and feed it back to the AI, but I am a bit
        | stuck.
       | 
       | What do you use to extract the text from a webpage and how do you
       | handle websites with anti-bot measures?
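For the extraction step asked about here, a minimal sketch using only Python's stdlib `html.parser` (real pages usually warrant a dedicated extraction library, and anti-bot handling is out of scope for a sketch like this):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script>/<style> contents."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we're not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

page = "<body><p>Hello</p><script>var x = 1;</script><p>world</p></body>"
print(extract_text(page))  # Hello world
```

The resulting text can then be truncated to fit the model's context window before being returned from the "extract" function.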
        
         | altdataseller wrote:
         | Shouldn't the API response be the exact response you would get
          | if you sent the same input to ChatGPT, assuming it's the
          | same model?
        
           | doppio19 wrote:
           | No, ChatGPT is more than just a UI for the OpenAI API. Web
           | requests are a feature built into ChatGPT using the API's
           | support for function calls, but the API doesn't make any
           | external web requests by itself.
        
           | CSMastermind wrote:
           | The API doesn't have access to the web search functionality
           | unless something changed.
        
       | BrunoJo wrote:
        | You may want to try https://lemonfox.ai/ as an OpenAI API
        | alternative. I think relying on open-source models is a great
        | option.
        
         | Filligree wrote:
         | The solution to 'GPT-4 sometimes breaks' isn't to use something
         | that never works...
        
           | nuz wrote:
            | Fine-tuned local models can work just as well as or
            | better than GPT-4 in many use cases.
        
           | jjcon wrote:
          | Have you never used the open-source models? They are
          | getting really good: better than 3.5 for sure, though not
          | as good as 4 except when domain-trained, in my opinion.
        
       | timthelion wrote:
        | I think we need a new type of status page, or at least a
        | public version number, on LLMs. Yesterday GPT-4 started
        | giving me nonsense, super-generic answers, like it was hardly
        | reading what I wrote, and today it is back to top-notch
        | performance. I think they were trying to make the model more
        | efficient or something, but I just saw a massive decrease in
        | the quality of output. From my side, though, there is no
        | version number except for "4"...
        
         | buildbot wrote:
         | This conspiracy always comes up - don't you think that they
         | test the output of the model revisions on probably 1000s of
          | downstream tasks at this point? Bad responses are hard to
          | reason about: it could be prompting, a model revision, or
          | just bad luck.
        
           | lazysheepherd wrote:
           | Or maybe they are just AB testing and aggressively optimizing
           | the response generation?
           | 
           | LLMs are known to be compute/energy hungry to execute. It is
           | a developing technology, if not downright experimental.
           | 
           | Therefore, this explanation is very likely. I cannot see the
           | reason to call this a conspiracy.
        
             | willsmith72 wrote:
             | AB testing on what? AB tests need to produce some results
             | which are then compared. How would releasing different
             | versions in production help with that?
             | 
             | It would make more sense if that was internal and the
             | responses were then graded.
             | 
             | A failed canary release would be more likely, where they
             | released this version to a small amount of people not
             | realising it was bad
        
               | timthelion wrote:
                | There are the up/down thumbs, and automatic sentiment
                | analysis, as a test.
        
               | lazysheepherd wrote:
                | Off the top of my head: responses have feedback
                | buttons below them.
                | 
                | You can simply deploy different versions and compare
                | the (neutral + positive) / negative feedback ratio.
               | 
               | It would be sinful if they did not add other metrics like
               | how many times the user had to correct and update their
               | prompt before ending the chat, etc.
               | 
               | Data, data, data...
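The comparison described above can be sketched as a per-variant feedback ratio; the variant names and feedback events here are entirely made up for illustration:

```python
from collections import Counter

# Hypothetical feedback events: (model_variant, rating).
events = [
    ("v1", "up"), ("v1", "up"), ("v1", "down"),
    ("v2", "up"), ("v2", "down"), ("v2", "down"),
]

def positive_rate(events, variant):
    """Share of thumbs-up among all feedback for one variant."""
    counts = Counter(rating for v, rating in events if v == variant)
    total = sum(counts.values())
    return counts["up"] / total if total else 0.0

for variant in ("v1", "v2"):
    print(variant, round(positive_rate(events, variant), 2))
# v1 0.67
# v2 0.33
```

A real deployment would add the other metrics mentioned (prompt corrections per chat, abandonment, and so on) and test whether the difference between variants is statistically significant before acting on it.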
        
           | timthelion wrote:
            | Calling that a conspiracy is like saying it's a
            | conspiracy theory that Meta shows different people
            | different ads. I'd be more concerned if OpenAI WASN'T
            | constantly trying to tune their models. It's literally
            | their job to tune the models.
        
         | captainkrtek wrote:
          | In a recent interview of Sam Altman (the Hard Fork
          | podcast), he mentioned that due to the load they have been
          | trying to make optimizations, disable certain features,
          | etc., so it's not outside the realm of possibility that
          | some tweak caused this.
        
           | __loam wrote:
           | I think one of the harder things about developing these
           | models is that regressions are hard to figure out or even
           | detect.
        
       | airstrike wrote:
       | This happens so often it's made it really easy to test an app I'm
       | developing for API outages and put it into "maintenance mode"
       | accordingly. I don't even need to mock the outage... just wait
       | for the weekly occurrence
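The pattern described here, detect an upstream outage and drop into maintenance mode, can be sketched generically; the function names below are hypothetical, not from any particular client library:

```python
import time

def call_with_failover(primary, fallback, attempts=3, delay=0.0):
    """Call `primary` up to `attempts` times; if every try fails
    (e.g. the upstream API is down), hand the last error to
    `fallback`, which might flip the app into maintenance mode."""
    last_err = None
    for _ in range(attempts):
        try:
            return primary()
        except Exception as err:  # narrow to the client's error type in real code
            last_err = err
            time.sleep(delay)
    return fallback(last_err)

def flaky_api_call():
    raise RuntimeError("upstream API down")  # simulate the outage

result = call_with_failover(flaky_api_call, lambda err: "maintenance mode")
print(result)  # maintenance mode
```

The same wrapper doubles as a test hook: pass in a `primary` that always raises, and you can exercise the maintenance-mode path without waiting for a real outage.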
        
       | extheat wrote:
       | As an FYI, Bing Chat at http://bing.com/chat continues to work
       | even during OpenAI outages. It's also running GPT-4 -- it can be
       | annoying when it reaches out to search, but you can usually
       | explicitly prompt it to not do that.
        
         | jug wrote:
         | Yes it's indeed an option. Note that Creative mode uses GPT-4,
         | not the default Balanced.
        
           | Al-Khwarizmi wrote:
           | What does Balanced use then? 3.5?
        
       | bilsbie wrote:
       | I'm seeing everyone reporting "laziness" today on X. Like it's
       | telling people to do their own coding. What's up with that?
        
         | padjo wrote:
          | The AI has joined the Teamsters
        
       | alexdoesstuff wrote:
        | Does anyone know if Azure's OpenAI Studio is down as well?
        | For everyone using ChatGPT APIs in production, it should be
        | the most straightforward failover mechanism.
       | 
       | Using this to plug our open-source tool
       | https://github.com/Marvin-Labs/lbgpt which allows ChatGPT
        | consumers to quickly load balance and fail over between
        | OpenAI and Azure models.
        
         | nonfamous wrote:
         | Azure OpenAI is running normally
         | https://azure.status.microsoft/en-us/status
        
       | gardenhedge wrote:
        | Why don't they disable the free version when they're hitting
        | this type of load?
        
       ___________________________________________________________________
       (page generated 2023-11-28 23:01 UTC)