[HN Gopher] OpenAI Codex
___________________________________________________________________
OpenAI Codex
Author : e0m
Score : 237 points
Date : 2021-08-10 17:33 UTC (5 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| throwaway128327 wrote:
| I don't understand what is going on. Why are people even spending
| time on this? I think this, Copilot, etc. are solving the non-
| problem of "we will remove the boring part of programming" by
| generating a bunch of code, so now it's even more boring to read
| it and check whether it actually does what you want.
|
| At the same time, none of the developers I interviewed know how a
| linked list is laid out in memory, the pros and cons of
| contiguous memory layouts, or how a CPU actually works.
|
| Maybe those things are not needed anymore, but I see their
| code... I think it would be better if they knew them.
| parksy wrote:
| This is just nascent technology leading toward something like
| this:
|
| "Computer, I want to play a game."
|
| "Okay, what will the game be?"
|
| "I want to be a starship captain, give me a cool space ship I
| can explore the galaxy with"
|
| "Okay... like this?"
|
| "Not quite, make the galaxy more realistic, with real stars and
| planets. Also make it 3d. I want to be the captain inside the
| ship."
|
| "How about now?"
|
| "Cool, and there should be space stations I can visit near
| planets, and I can fly my ship to stars with hyperspace. Make
| it so I have to trade for fuel at the space stations, maybe I
| need to mine asteroids or search derelict space ships for
| treasure. I want to play with my friends too, they can have
| their own ships or walk around my ship."
|
| "Done, was there anything else?"
|
| "Yes, add different alien races to some of the star systems,
| and make some of them have alliances. I want to talk to the
| aliens about their history and culture. Sometimes aliens are
| unfriendly and we'll have space battles if talking doesn't
| work. Make it so I can command a fleet and call for
| reinforcements."
|
| "Processing... Done. Anything else?"
|
| "Actually this is boring, can we start over?"
|
| "Game erased. Please provide new prompt."
| vimy wrote:
| Also known as the holodeck from Star Trek.
| throwaway128327 wrote:
| Oh! This will be so cool! Do you really think it could lead
| in that direction? To me it seems more like a metaphysical
| cargo cult. I think I am too pessimistic; I should shake it
| off, nothing good comes out of being pessimistic (by
| definition).
|
| Thanks for the inspiration!
| parksy wrote:
| > do you really think it could lead in that direction?
|
| If you asked me 20 years ago, or even 10, I'd have said it
| was total science fiction. I wouldn't have been able to
| imagine how to do it. If you asked me 5 years ago, I'd have
| vaguely said something about AI, half jokingly. At the time
| I thought perhaps the models could be trained so we can do
| test-only development and let AI trained on formal test
| cases generate endless code until all tests pass, but I
| didn't really imagine it would be possible to get a
| computer to take freeform written English (even in a
| tightly controlled manner) and produce functioning code.
|
| Over the past couple of years I have seen increasingly
| fluent demonstrations and tried a few myself, and I have
| fallen off the fence and I think that with the pace that
| machine learning and AI assisted programming keeps
| advancing, this outcome is all but inevitable, as far
| fetched as it seems.
|
| I was messing with the OpenAI sandbox over the weekend and
| it helped me generate several game design concepts from
| prompts similar to my post above that I could see myself
| being interested in building and playing. It's not
| difficult to imagine down the line with a few more
| advancements in this tech that the generated design could
| then instruct the code generator, fetch the assets, and
| stage the environment for a player or user to enter without
| ever touching a line of code.
|
| I'm not close enough to the research itself to know which of
| those problems are hard and which are easy, so I don't know
| if we'll see the first totally AI-generated "proto-
| holodeck" tech demo in the next 5 years, or the next 20
| years, but I can't see it being more than 50 years away,
| and something tells me with the pace of things it will be
| much sooner than that, assuming we're all still around at
| the time to enjoy it.
| throwaway128327 wrote:
| I wonder what it will make when you ask it to make a good
| bot AI for a game.
|
| "make a game with a formidable opponent that plays good
| enough to win with 51% probability"
|
| and of course the inevitable "make a better version of
| yourself"
| parksy wrote:
| From what I've seen the technology can fuse together a
| remarkable range of outputs, but all of them are
| essentially fused together from within the training set.
| If there were enough examples of AI opponents, it
| conceivably could do it since most game AIs are some form
| of state machine combined with a degree of statistical
| analysis and pathfinding (for mobile AI actors). It would
| "just" be replicating existing patterns.
|
| As I understand it, it would take a dramatic leap from
| this kind of interpolation to being able to extrapolate
| and "self improve". So far I haven't seen anything that
| convinces me we're close to this, but again I'm not close
| to the wheel on the research side of things.
| woah wrote:
| You're interviewing programmers for a job in operating systems
| programming?
| throwaway128327 wrote:
| Just full-stack devs, React Native + Go. Is it so wrong to
| think they are the same? Programming is programming; most
| computers work in a similar way, no?
|
| But they also don't know how garbage collection works in
| their language, or how to work with 1 million things in an
| efficient manner. Or why the app pauses for 100 ms because
| someone parses dates inside the comparator of a sort.
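|
| The pattern I mean, as a rough sketch with made-up data:
| parsing inside the comparator re-parses the same strings on
| every comparison, while a key function parses each one once.
|
|     from datetime import datetime
|     from functools import cmp_to_key
|
|     rows = [{"date": "2021-08-10"}, {"date": "2020-01-02"}]
|
|     def parse(r):
|         return datetime.strptime(r["date"], "%Y-%m-%d")
|
|     # Slow pattern: re-parse both dates on every comparison,
|     # so the same strings get parsed O(n log n) times.
|     rows.sort(key=cmp_to_key(
|         lambda a, b: (parse(a) > parse(b))
|                      - (parse(a) < parse(b))))
|
|     # Better: parse each date exactly once, then sort on that.
|     rows.sort(key=parse)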
|
| For example, I have seen people who can't imagine the cost of
| a leaked database transaction, even back-of-the-napkin wise.
| You would think: well, how many changes happened in between,
| how much do we have to unwind when the session disconnects,
| when will it even disconnect given the connection pool, etc.
| etc. Because the SQL server is this magic RDS thing. As if
| AWS will solve everything with its pixie dust.
| reducesuffering wrote:
| Think bigger. Say I'm starting a startup:
|
| 1. "Setup Django, Nginx, and Postgres deployed on a Digital
| Ocean Ubuntu droplet." Done.
|
| 2. "Make a shopping page like $URL." Done.
|
| 3. "Fill it with data from X and connect with Stripe." Done.
|
| 4. ???
|
| 5. Profit
|
| Seems like even a great dev will take 20x the time to do that
| if the model is able to correctly generate this, even with an
| error or a customization or two.
| throwaway128327 wrote:
| But does it really matter, if 20x is 1 week instead of 2
| hours?
|
| Are startups really that shallow?
| motoxpro wrote:
| 1/20th of the time? That's kind of a big deal.
| qayxc wrote:
| That depends: https://xkcd.com/1205/
|
| A one-time setup is perfectly OK to take a few days,
| especially if afterwards you have a documented process
| that allows you to modify and improve the result.
| throwaway128327 wrote:
| I think the 1/20th of the time mentioned applies only at the
| start; I don't think you will gain a lot after that, as the
| spaghetti AI will come to collect.
|
| You have a debt to pay. -- Davy Jones
| dimal wrote:
| If you don't have someone that understands the generated
| code, you'll be kinda screwed. Most of my work isn't writing
| a function to do X. It's reading and understanding all the
| surrounding code and architecture and then knowing that I
| need a function to do X. Writing the actual function isn't
| usually much of a challenge. I get the feeling that this tool
| will just encourage write-only code that ultimately no one
| understands. Will all of the generated code follow a
| consistent style? Will it know to use the framework you built
| or will it just reinvent everything it needs for each problem
| you give it? I already see tons of code that people copy and
| paste without really understanding it, and a lot of the time
| they're just adding complexity by solving non-problems. This
| just automates that process. I can see it being useful in
| certain narrow cases, but the potential for misuse is huge.
| holler wrote:
| at the point where 1/2/3 are possible, what value does the
| startup have when anyone else can ask it to do the same
| thing?
| tux3 wrote:
| Do your competitors have access to this tool that gets you
| started 20x faster? If so, you want the tool.
|
| Your copycat startup may not have incredible value, but
| selling shovels always pays.
| tome wrote:
| Why would you mention "Django", "Nginx", "Postgres", "Digital
| Ocean", "Ubuntu" or "Stripe"? Surely those are implementation
| details that the user wouldn't care about.
| [deleted]
| nradov wrote:
| It seems like they're going in totally the wrong direction. If
| program content is predictable based on patterns (low entropy)
| then that's a sign that our programming languages are too low
| level. If we want to improve developer productivity then the
| solution is the same as it always has been: create higher level
| languages which abstract away all the repetitive patterns.
| mxwsn wrote:
| Tools are relatively low level compared to any single use
| case or field because they should universally support all
| use cases or fields. The narrower your field or use case
| is, the fewer resources there are to create a higher level
| language that abstracts away the details that aren't
| important for your area, but are important to other areas. In
| this manner, Codex has enormous potential.
| temp8964 wrote:
| Can this read existing code and fix one missing piece? That
| would be cool.
|
| Say I have a question I can't solve by searching through
| stackoverflow. If the AI can solve a problem like that, it will
| be great.
| priyanmuthu wrote:
| Program Synthesis can do some rudimentary fixes. But I would
| love to explore this problem of program correction using AI.
| maxwells-daemon wrote:
| The "language models don't really understand anything" corner is
| getting smaller and smaller. In the last few months we've seen
| pretty definitive evidence that transformers can recombine
| concepts ([1], [2]) and do simple logical inference using
| contextual information ([3], "make the score font color
| visible"). I see no reason that this technology couldn't smoothly
| scale into human-level intelligence, yet lots of people seem to
| think it'll require a step change or is impossible.
|
| That being said, robust systematic generalization is still a hard
| problem. But "achieve symbol grounding through tons of multimodal
| data" is looking more and more like the answer.
|
| [1] https://openai.com/blog/dall-e/ [2]
| https://distill.pub/2021/multimodal-neurons/ [3]
| https://openai.com/blog/openai-codex/
| fpgaminer wrote:
| > "language models don't really understand anything"
|
| I have a sneaking suspicion that, if blinded, the crowd of
| people saying variations of that quote would also identify the
| vast majority of human speech as regurgitated ideas as well.
|
| > I see no reason that this technology couldn't smoothly scale
| into human-level intelligence
|
| Yup, the OpenAI scaling paper makes this abundantly clear.
| There is currently no end in sight for the size that we can
| scale GPT to. We can literally just throw compute at the
| problem and GPT will get smarter. That's never been seen before
| in ML. Last time I ran the calculations I estimated that,
| everything else being equal, we'd reach GPT-human in 20 years
| (GPT with similar parameter scale as a human brain). That's
| everything else being equal. It is more than likely that in the
| next twenty years innovation will make GPT and the platforms we
| use to train and run models like it more efficient.
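|
| Rough shape of that back-of-the-napkin math (the ~1.4x/year
| growth rate is purely an assumption, and synapse count is a
| crude stand-in for "parameter scale of a human brain"):
|
|     import math
|
|     gpt3_params = 175e9       # GPT-3 parameter count
|     brain_synapses = 125e12   # rough human synapse count
|     growth_per_year = 1.4     # assumed scaling rate
|
|     years = (math.log(brain_synapses / gpt3_params)
|              / math.log(growth_per_year))
|     print(round(years, 1))    # ~19.5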
|
| And the truly terrifying thing is that, to me, GPT-3 has about
| the intelligence of a bug. Yet it's a bug whose whole existence
| is human language. It doesn't have to dedicate brain power to
| spatial awareness, navigation, its body, handling sensory
| input, etc. GPT-human will be an intelligence with the size of
| a human brain, but whose sole purpose is understanding human
| language. And it's been to every library to read every book
| ever written. In every language. Whatever failings GPT may have
| at that point, it will be more than capable of compensating for
| in sheer parameter count, and leaning on the ability to combine
| ideas across the _entire_ human corpus.
|
| All available through an API.
| maxwells-daemon wrote:
| As an add-on to this: I'd encourage anyone interested in this
| debate to read Rich Sutton's "The Bitter Lesson"
| (http://www.incompleteideas.net/IncIdeas/BitterLesson.html).
|
| At every point in time, the best systems we can build today
| will be ones leveraging lots of domain-specific information.
| But the systems that will continue to be useful in five years
| will always be the ones that scale freely with increased
| parallel compute and data, which grow much faster than domain-
| specific knowledge. Learning systems with the ability to use
| context to develop domain-specific knowledge "on their own" are
| the only way to ride the wave of this computational bounty.
| pchiusano wrote:
| https://rodneybrooks.com/a-better-lesson/ is an interesting
| retort to the Sutton post.
| Voloskaya wrote:
| The definition of "understanding" behaves just like the
| definition of "intelligence": The threshold to qualify gets
| pushed by as much as the technology progresses, so that nothing
| we create is ever intelligent and nothing ever understands.
| karmasimida wrote:
| > The "language models don't really understand anything"
|
| This is still true. By all accounts, humans don't need to read
| 159GB of Python code to write Python; in fact, we simply can't.
|
| But it doesn't necessarily indicate language models aren't
| useful.
| hackinthebochs wrote:
| Considering the sum total of data and computation that goes
| in to creating an intelligent human mind, including the
| forces of natural selection in creating our innate structure
| and dispositions, it's not obvious that any conclusions can
| be drawn from the fact that so much data and compute goes
| into training these models.
| nightski wrote:
| Has this transfer of knowledge from one domain to another
| really been demonstrated by these models/learning
| processes? I know transfer learning is a thing (I have a
| couple books on my shelf on it). But it seems far from what
| you are describing.
| talor_a wrote:
| they mention in the demo video that the inspiration for
| codex came from GPT-3 users training it to respond to
| queries with code samples. I saw some pretty impressive
| demos of the original model creating SQL queries from
| plain questions. I'm not sure if that counts as switching
| domains, but it's something?
| visarga wrote:
| DALL-E + CLIP models show a deep understanding of the
| relation between images and text.
| sbierwagen wrote:
| The AlphaZero algorithm swapped between board games
| pretty easily. OpenAI could also have been gesturing at
| this when they named the GPT paper "Language Models are
| Few-Shot Learners".
| maxwells-daemon wrote:
| I would argue humans ingest a lot more than 159GB before they
| can write code. Most of it isn't Python, and humans currently
| transfer knowledge a lot more efficiently than NNs, but I
| suspect that'll change as incorporating more varied data
| sources becomes feasible.
| bufferoverflow wrote:
| It probably can scale, but we're nowhere near the computational
| power we need to even recreate the brain. And don't forget, our
| brain took a billion years to evolve.
|
| A typical brain has 80-90 billion neurons and 125 trillion
| synapses. That's a big freaking network to train.
|
| Hopefully we can figure out how to train parts of it and then
| assemble something very smart.
| jacquesm wrote:
| Takes on average 2.5 decades to train it.
| mattkrause wrote:
| That's just from the most recent checkpoint :-)
|
| If you were to build it "from scratch" you'd also need to
| include the millions of years of (distributed) evolution
| required to get that particular kid to that point.
|
| Tony Zador has some interesting thoughts about that,
| including"A critique of pure learning", here:
| https://www.nature.com/articles/s41467-019-11786-6)
| jdonaldson wrote:
| I think intelligence as defined as "mapping inputs into goal
| states" is pretty well handled by models, and the models may be
| able to pick and choose states that are sufficient for
| achieving the goals.
|
| However, the intelligence that's created by language models is
| very schizophrenic, and the human-level reflective intelligence
| that it displays is at best a bit of Frankenstein's monster (an
| agglomeration of utterances from other people that it uses to
| form sentences that form opinions of itself or its world).
|
| I think that modeling will help us learn more about human
| intelligence, but we're going to have to do a lot better than
| just training models blindly on huge amounts of text.
| visarga wrote:
| Maybe we're also >50% Frankenstein monsters, an agglomeration
| of utterances from other people.
| 6gvONxR4sf7o wrote:
| > The "language models don't really understand anything" corner
| is getting smaller and smaller.
|
| In my mind, understanding a thing means you can justify an
| answer. Like a student showing their work and being able to
| defend it. An answer with a proof understands the answer with
| respect to the proof it provides. E.g. to understand an answer
| with regards to first order logic, it'll have to be able to
| defend a logical deduction of that answer.
|
| These models still can't justify their answers very well, so
| I'd say they're accurate but only understand with respect to a
| fairly dumb proof system (e.g. they can select relevant
| passages or just appeal to overall accuracy statistics).
| They're still far from being able to justify answers in the
| various ways we do, which I'd say means by definition that
| they still don't understand with regards to the "proof systems"
| that we understand things with regards to.
|
| Maybe the next step will require increasingly interesting
| justification systems.
| beering wrote:
| > In my mind, understanding a thing means you can justify an
| answer.
|
| What if the language model can generate a step-by-step
| explanation in the form of text? [0]
|
| There's no guarantee that the reasoning was used to come up
| with the answer in the first place, and no proof that the
| reasoning isn't just the product of "a really fancy markov
| chain generator", but would you accept it?
|
| We're really walking into Searle's Chinese Room at this
| point.
|
| [0] https://nitter.hu/kleptid/status/1284069270603866113#m
| sbierwagen wrote:
| >In my mind, understanding a thing means you can justify an
| answer.
|
| Sure, but how does that work with superhuman AI? Consider
| some kind of math bot that proves theorems about formal
| systems which are just flat out too large to fit into human
| working memory. Even if it could explain its answers, there
| would just be too many moving parts to keep in your head at
| once.
|
| We already see something like this in quant funds. The stock
| trading robot finds a price signal, and trades on it. You can
| look at it, but it's nonsensical: if rainfall in the Amazon
| basin is above this amount, and cobalt price is below this
| amount, then buy municipal bonds in Topeka. The price signal
| is durable and causal. If you could hold the entire global
| economy in your head, you could see the chain of actions that
| produce the effect, but your brain isn't that big.
|
| Or you just take it on faith. Why do bond prices in Topeka go
| up, but not in Wichita? "It just does." Okay, then what was
| the point of the explanation? A machine can't justify
| something you physically don't have enough neurons to
| comprehend.
| gnramires wrote:
| > Even if it could explain its answers, there would just be
| too many moving parts to keep in your head at once.
|
| While this is possible in practice, consider the
| (universal) Turing machine principle: in principle, you can
| simulate any system given enough memory; we may not have it
| in our brains, but we have pen and paper or simply a digital
| text scratchpad (both of which we use extensively in our
| lives).
| gnramires wrote:
| Also, you should note the memory and capabilities required
| to reach a conclusion might be much greater than to show
| it's true. Showing a needle may be easy, finding it in the
| haystack very hard. In this sense the hope for
| explainability is expanded. But still, I guess the real
| world is really messy and "the full explanation" may be too
| large -- like when you explain a human intuition, the "full
| explanation" might have been your entire brain, your entire
| set of experiences up to that point; yet we can give
| partial explanations that should be satisfactory.
|
| I have a hypothesis that inevitably, reasoning needs to
| 'funnel' through explicit, logical representations (like we
| do with mathematics, language, etc.) to occur effectively.
| Or at least (quasi-)formalization is an important element
| of reasoning. This formal subset can be communicated.
| 6gvONxR4sf7o wrote:
| It's not about us being able to interpret answer or
| justification, but the reasoner's ability to justify. If a
| superhuman AI can justify its answers in terms of first
| order logic, for example, it could be defined as
| understanding the answers with respect to FOL. Whether we
| as humans are able to check whether this specific bot in
| fact meets that definition is a separate empirical
| question.
|
| If that quant algo you mentioned just says "it'll go up
| tomorrow" that's different than "it'll go up tomorrow" with
| an attached "it's positively correlated with Y, which is up
| today" which is different from a full causal DAG model of
| the world attached, which is again different from those
| same things expressible in english. But again, those are
| definitions, which are separate from our ability to check
| whether they're met.
|
| Luckily, we're not in the realm of bots spitting out
| infeasible-to-check proofs, except for a few niche areas
| like theorem proving (e.g. four color theorem). For
| language models like in the article, the best I'm aware of
| is finding relevant passages to an answer and classifying
| entailments.
|
| > A machine can't justify something you physically don't
| have enough neurons to comprehend.
|
| We can't always verify its justification, but it either can
| or can't justify an answer with respect to a given
| justification system.
| cscurmudgeon wrote:
| We build another system we fully understand that can
| process the justification and see if it is correct/makes
| sense.
| joshjdr wrote:
| I found it on Stack Overflow!
| visarga wrote:
| > Maybe the next step will require increasingly interesting
| justification systems.
|
| You can just ask it to comment what it intends to do. It's
| surprising actually.
| maxwells-daemon wrote:
| Look at the "math test" video.
|
| Given the question: "Jane has 9 balloons. 6 are green and the
| rest are blue. How many balloons are blue?" The model
| outputs: "jane_balloons = 9; green_balloons = 6;
| blue_balloons = jane_balloons - green_balloons;
| print(blue_balloons)"
|
| That seems like a good justification of a (very simple) step-
| by-step reasoning process!
| wizzwizz4 wrote:
| Except I could do that with a few regex substitutions,
| which would not be reasoning. The "intelligence" is in the
| templates provided by the training data. (Extracting that
| is _impressive_, but not _that_ impressive.)
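|
| A minimal sketch of what I mean (the pattern below is
| hand-written for exactly this template of word problem):
|
|     import re
|
|     question = ("Jane has 9 balloons. 6 are green and the "
|                 "rest are blue. How many balloons are blue?")
|
|     pattern = (r"(?P<name>\w+) has (?P<total>\d+) "
|                r"(?P<thing>\w+)\. (?P<known>\d+) are \w+ "
|                r"and the rest are (?P<rest>\w+)\.")
|
|     m = re.search(pattern, question)
|     print(f"{m['rest']}_{m['thing']} = "
|           f"{m['total']} - {m['known']}")
|     print(f"print({m['rest']}_{m['thing']})")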
| lstmemery wrote:
| I have to disagree with you here. In the Codex paper[1], they
| have two datasets that Codex got correct about 3% of the time.
| These are interview and code competition questions. From the
| paper:
|
| "Indeed, a strong student who completes an introductory
| computer science course is expected to be able to solve a
| larger fraction of problems than Codex-12B."
|
| This suggests to me that Codex really doesn't understand
| anything about the language beyond syntax. I have no doubt that
| future systems will improve on this benchmark, but they will
| likely take advantage of the AST and could use unit tests in a
| RL-like reward function.
|
| [1] https://arxiv.org/abs/2107.03374
| nmca wrote:
| 12B, though. What about 1.2T?
| lstmemery wrote:
| You need to scale the amount of data to take advantage of
| the increase in parameters. I'm not sure where we would
| find another 100 GitHubs worth of data.
| ruuda wrote:
| > but they will likely take advantage of the AST
|
| In the end, a more general approach with more compute always
| wins over applying domain knowledge like taking advantage of
| the AST. This is called "the bitter lesson".
| http://www.incompleteideas.net/IncIdeas/BitterLesson.html
| lstmemery wrote:
| I don't think the bitter lesson applies to ASTs.
|
| From the Bitter Lesson:
|
| "Early methods conceived of vision as searching for edges,
| or generalized cylinders, or in terms of SIFT features. But
| today all this is discarded. Modern deep-learning neural
| networks use only the notions of convolution and certain
| kinds of invariances, and perform much better."
|
| Those models are taking advantage of inductive biases.
| Every model has them, including the massive language
| models. They are not the same as engineered features (such
| as SIFTs) or heuristics.
|
| Using the AST is just another way of looking at the code
| already in your dataset. For the model to understand what
| it is writing, it needs to map the text sequences to
| ASTs anyways. It can attempt to learn this, but the 12B
| model still makes illegal Python code so it clearly hasn't.
| kevinqi wrote:
| "the bitter lesson" is a very interesting, thank you!
| However, I wonder if AST vs. text analysis is fully
| comparable to the examples given in the post. Applying
| human concepts for chess, go, image processing, etc. failed
| over statistical methods, but I don't think AST vs. text is
| exactly the same argument. IMO, using an AST is simply a
| more accurate representation of a program and doesn't
| necessarily imply an attempt to bring in human
| intuition/concepts.
| abeppu wrote:
| I'm still surprised by the approach. I mean, great that it works
| this well -- but program synthesis is one of those rare domains
| where you can observe exactly what the outcome is after you
| generate something. You can see execution traces, variable
| values, what the JIT produced, etc. And all of this is relatively
| cheap -- often executing a code snippet should be far cheaper
| than an extra pass through a giant DNN right? So it's fascinating
| to me that they train entirely from dealing with code as text.
|
| Imagine learning to develop recipes, not by ever cooking or
| eating or even seeing food, but only reading a giant library of
| cookbooks. Or learning to compose music but never hearing or
| playing anything -- only seeing scores.
| wantsanagent wrote:
| FWIW execution guided code synthesis is a thing. Get a few
| possible outputs and ditch those that don't pass a parser as an
| example. At least in the SQL generation realm this is well
| worth the time it takes to tack onto a large language model.
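|
| The cheapest version of that filter, as a sketch (here the
| candidates are Python strings and the only check is "does it
| parse", but the same idea extends to running tests or, for
| SQL, a dialect parser):
|
|     import ast
|
|     def keep_parsable(candidates):
|         ok = []
|         for source in candidates:
|             try:
|                 ast.parse(source)   # cheap syntax check
|             except SyntaxError:
|                 continue
|             ok.append(source)
|         return ok
|
|     samples = ["def f(x): return x + 1",
|                "def f(x) return x"]   # second one won't parse
|     print(keep_parsable(samples))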
| [deleted]
| jmportilla wrote:
| Very cool, will be interesting to see if this is ever added into
| Visual Studio as some sort of "super" auto-complete.
| mensetmanusman wrote:
| If this actually worked, wouldn't that be amazing? If you could
| break down a software idea into a blueprint of concepts that
| need to be accomplished, and then dictate what should be done...
|
| I doubt it works, but I wonder how many decades from now we will
| be able to walk through a finite number of simple requests and
| wrap them together as working software. Then people will be able
| to convert their blueprint into action!
| GistNoesis wrote:
| Can I use this to write Solidity contracts?
| mxwsn wrote:
| That has got to be one of the worst possible use cases one
| could imagine. On page 33 of the appendix, the authors note
| that nearly 40% of RSA encryption keys created by Codex are
| clearly insecure.
| GistNoesis wrote:
| Only if tokens have value.
|
| If Codex is able to handle a generic API from reading the
| docs, it could maybe use a Python library for Solidity
| contracts like
| https://web3py.readthedocs.io/en/stable/contracts.html
|
| As a contract user, I'd probably have more trust in a
| contract written by an independent AI from a short natural
| language specification which can't hide intent, than in a
| contract with a hidden backdoor or a subtle bug.
|
| Also the AI will probably improve with usage.
|
| You probably can generate multiple versions of your contract,
| and maybe a high-level bug correction scheme like taking the
| median action between those versions can increase bug
| robustness and find those edge cases where the actions differ.
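|
| As a rough sketch of that voting idea (actions here are just
| labels; a real contract would need a far stricter notion of
| "same action"):
|
|     from collections import Counter
|
|     def consensus_action(actions):
|         # Pick the action most of the independently generated
|         # versions agree on; bail out if there is no majority.
|         winner, count = Counter(actions).most_common(1)[0]
|         if count <= len(actions) // 2:
|             raise ValueError("versions disagree; needs review")
|         return winner
|
|     votes = ["transfer", "transfer", "refund", "transfer"]
|     print(consensus_action(votes))   # transfer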
| woah wrote:
| What does that have to do with anything?
| northfoxz wrote:
| A new way to talk to the computer I guess.
| vincnetas wrote:
| Will really be impressed when one could say: "here is this
| codebase, modify this function so that it produces [insert
| desired effect]" and the rest of the project's functionality
| would not come tumbling down...
|
| Because writing code from scratch is now, I think, much rarer
| than improving existing codebases. Aka bugfixing.
| vincnetas wrote:
| Also curious what this AI would produce when provided with
| contradictory requests. Because often there are multiple
| requirements which on their own sound reasonable, but when you
| try to fit all requirements in one system, things get nasty.
| polyanos wrote:
| It is only able to translate small instructions into code. I
| think it will take a while to get to a situation where you
| can just give it a list of requirements and it spits out a
| working program.
|
| Hell, it messed up when they gave it the instruction "make
| every fifth line bold" in the Word API part of the demo,
| where it made the first line of every paragraph (each of which
| is only 4 lines long) bold instead of every fifth line.
| 3wolf wrote:
| I think integrations like the MS Word example they show off at
| the end of the live demo have the potential to be even more
| impactful than just generating code for programmers.
| polyanos wrote:
| That still needs work though, it messed up the "Make every
| fifth line bold" pretty bad. Still, it showed it could adapt to
| a new API pretty well.
| 3wolf wrote:
| Yeah, definitely. I guess my point was that converting
| natural language to source code can be even more valuable for
| people who don't know how to code, but want to perform
| actions more complicated than a simple button press. For
| example, I often find myself doing regex based find-and-
| replace-alls in text files, and even that feels inefficient
| while also being over the head of the vast majority of users.
| I'd imagine there are a lot of people out there spending many
| hours manually editing documents and spreadsheets.
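|
| For a sense of how small these jobs usually are, here's the
| kind of one-liner I mean (made-up text, US-style dates
| rewritten as ISO dates):
|
|     import re
|
|     text = "Paid 03/14/2021, refunded 04/02/2021."
|     iso = re.sub(r"(\d{2})/(\d{2})/(\d{4})", r"\3-\1-\2", text)
|     print(iso)   # Paid 2021-03-14, refunded 2021-04-02.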
| amalive wrote:
| Would like to say "Fix that something of undefined error" some
| day.
| dmurray wrote:
| They should have released this first instead of GitHub Copilot.
| The focus would then have been much more on "look at the cool
| stuff they can do" rather than "Microsoft is releasing a product
| that plagiarizes GPL code".
|
| Once people had digested that and there had been a few other
| proof-of-concept business ideas around turning Codex into a SaaS
| (because some people will always queue to build their product on
| your API), announce the evil version. Not that I really think
| Copilot is evil, but the IP concerns are legitimate.
| mark_l_watson wrote:
| I watched their 30 minute demo on Twitch this morning, really
| good!
|
| I use their OpenAI beta APIs as a paying customer; I am still
| waiting for access to Codex.
| leesec wrote:
| The Writing On The Wall
| z77dj3kl wrote:
| I thought OpenAI was originally supposed to be some kind of for-
| the-good, non-profit institution studying AI and its safe use in
| particular with an effort to make it more accessible and
| available to all through more open collaboration. This is cool
| research, sure; but what happened to making models available for
| use by others instead of just through some opaque APIs?
|
| Maybe I'm just remembering wrong or conflating OpenAI with some
| other entity? Or maybe I bought too much of the marketing early
| on.
| mark_l_watson wrote:
| They very transparently transitioned to a for profit company.
| It doesn't seem like they are aggressively profit oriented
| though: I am a paying customer of OpenAI beta APIs and the cost
| to use the service is very low. It also solves several classes
| of tough NLP problems. I used to sell my own commercial NLP
| library - glad I gave up on that years ago.
| keewee7 wrote:
| OpenAI was founded in 2015. In 2015 Google was AI and AI was
| Google. There was legitimate concern that one American
| corporation was going to dominate AI. OpenAI was created to
| challenge that dominance and let "AI benefit all of humanity".
|
| In the meantime China and Chinese companies have caught up.
| Turns out the fear of one company and one country dominating
| AI was overblown.
|
| Maybe the OpenAI founders feel that the original goal has been
| fulfilled because AI is no longer dominated by the US and
| Google.
| Buttons840 wrote:
| No, they did some good, they've done a few things to personally
| help me. They created OpenAI Gym which is a great help when
| doing reinforcement learning research and defined the standard
| interface for reinforcement learning libraries for a
| generation. But they no longer maintain OpenAI Gym.
|
| They also created Spinning Up [0], one of the best resources
| I've found for learning reinforcement learning. Their teaching
| resources are detailed but relatively brief and are focused on
| implementing the algorithms, even if some of the "proofs" are
| neglected. But they no longer maintain Spinning Up.
|
| So yes, originally they were for-the-good, but lately I've
| noticed them moving away from that in more ways than one. It
| seems they learned one cool trick with language sequence
| modelling, and they have a lot of compute, and this is all they
| do now.
|
| [0]: https://spinningup.openai.com/en/latest/
| blt wrote:
| That was the marketing message. They became for-profit in 2019
| and took investment from Microsoft. Many people were skeptical
| before that because the main investors were mostly known for
| for-profit ventures.
| webmaven wrote:
| You're remembering correctly. OpenAI transitioned from non-
| profit to for-profit in 2019, took about $1 billion from
| Microsoft (there has been speculation that this was mostly in
| the form of Azure credits), and announced that Microsoft would
| be their preferred partner for commercializing OpenAI
| technologies: https://openai.com/blog/microsoft/
| stingraycharles wrote:
| I remember Sam Altman, when asked "How will you make money?",
| replying that they would ask the AI. I thought it was a
| fairly creative answer.
|
| It turns out, however, that the way they plan on earning money
| is much less creative, and more run-of-the-mill SaaS
| monetization. In a way, I like to believe that a real AI would
| also end up with such a mundane strategy, as it's the most
| likely to actually make them profitable and return money to
| investors.
| amrrs wrote:
| I feel that OpenAI Codex could become like Webflow for coding. It
| might sound ironic, but what tools like Webflow do in the world of
| web programming is give creators the power to build something
| fast that can last (without the specialty of a decent web
| programmer).
|
| If the same thing can happen in the world of programming, I guess
| evaluations like LeetCode and whiteboarding can go away and bring
| in a new kind of logical-thinking evaluation, which could
| ultimately be a more realistic way for a programmer to rise up
| the chain.
| Vermeulen wrote:
| A warning to devs building on OpenAI APIs: We spent months
| developing a chatbot using GPT3 for our game and released a video
| showcasing it: https://www.youtube.com/watch?v=nnuSQvoroJo&t=264s
|
| Afterwards, OpenAI added GPT-3 chatbot guidelines disallowing
| basically anything like this. We were in communication with them
| beforehand, but they decided later that any sort of free form
| chatbot was dangerous.
|
| What they allow changes on a weekly basis, and is different for
| each customer. I don't understand how they expect companies to
| rely on them
| nradov wrote:
| The notion of a toy like a chatbot being "dangerous" is just so
| ludicrous. The OpenAI folks take themselves way too seriously.
| Their technology is cool and scientifically interesting, but in
| the end it's nothing more than a clever parlor trick.
| mszcz wrote:
| I think different kind of dangerous, not the SkyNet stuff.
| The first idea that popped into my mind is below. I know,
| it's dark but...
|
| 8 year old to AI: "my parents won't let me watch TV, what do
| I do?". AI: "stab them, they'll be too busy to forbid you".
|
| Then again the same thing can be said by a non-AI. My
| thinking is that you'd be talking to an _actual average_
| person. I'm not so sure that that is such a good thing.
| EamonnMR wrote:
| Definitely dangerous from a legal perspective if AI Dungeon
| is any indication.
| elefanten wrote:
| The general public basically races to test the most
| controversial content. As exhibited by several other high-
| profile chatbot launches.
|
| > Tay responded to a question on "Did the Holocaust
| happen?" with "It was made up"
|
| https://en.m.wikipedia.org/wiki/Tay_(bot)
| aeternum wrote:
| It's pretty easy to get GPT-3 to say things that are
| incredibly sexist and racist. I think OpenAI is more
| concerned about the bad press associated with that than AI-
| safety.
| Siira wrote:
| Which is even less ethically defensible.
| andreyk wrote:
| Oh man, I was looking forward to this a ton! Are you planning
| to keep working on it with the open-source GPT-J or something
| similar, by any chance?
| Vermeulen wrote:
| I am looking at GPTJ, and also hoping OpenAI comes to their
| senses on how dangerous a video game chatbot can be
| MasterScrat wrote:
| > Afterwards OpenAI then added GPT3 chatbot guidelines
| disallowing basically anything like this. We were in
| communication with them beforehand, but they decided later that
| any sort of free form chatbot was dangerous.
|
| Was this announced anywhere? We applied to deploy an
| application in this space, and they refused without providing
| any context, so I'd be really interested if they published
| details about restrictions in this space somewhere.
| Vermeulen wrote:
| https://beta.openai.com/docs/use-case-guidelines/use-case-
| re... "reliably being able to limit the conversational topics
| to strictly X, Y, and Z topics"
| Miraste wrote:
| OpenAI cloaks themselves in false "open" terminology to hide
| how proprietary and incredibly restrictive they've made their
| tech. That's a very cool demo; have you considered trying to
| make it run on GPT-J instead? It's an open source alternative
| you can run yourself or pay an independent api provider without
| supporting OpenAI.
| Vermeulen wrote:
| Haven't been able to find a GPT-J service with good latency -
| though we haven't tried hosting ourselves
| spullara wrote:
| I have gotten it running on AWS in a container; if you want
| the Dockerfile/scripts I can send them to you. Email is in my
| profile.
| fpgaminer wrote:
| It sucks that OpenAI has no competition right now. They have
| every right to control their technology however they like. But
| it's a shame that they're being so stifling with that right,
| killing really fun stuff like you demonstrated.
|
| But that monopoly won't last, and I think it's more than likely
| that competition will crop up within the next year. There's
| definitely a lot of secret sauce to getting a 175B parameter
| model trained and working the way OpenAI has. The people
| working there are geniuses. But it can still be reproduced, and
| will. Once competition arrives I'm hoping we'll see these
| shackles disappear and see the price drop as well. Meanwhile
| the open source alternatives will get better. We already have
| open source 6B models. A 60B model shouldn't be far off, and is
| likely to give us 90% of GPT-3.
| option_greek wrote:
| That's a really interesting demo. What makes the responses so
| laggy? Does the model take that long to generate text? You can
| also experiment with things like repeating the user question or
| adding pauses like "hmm let's see" to make it less noticeable
| at least some of the time.
|
| Too bad they asked you to pull it. What's the danger they are
| worried about? The annoying thing in their press releases is how
| seriously they take their GPT-3 bots' impact on humans. Despite
| all the hype, it's difficult to see GPT-3 bots ending humanity
| any time soon. Honestly they need to rename themselves -
| can't see what's open about OpenAI.
| maxwells-daemon wrote:
| Autoregressive transformers take a while to generate text,
| since you need to run the whole model once for every word in
| the output.
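|
| Roughly why, as a sketch (here "model" is a stand-in for any
| function mapping a token sequence to scores over the
| vocabulary; the loop is the point, not the model):
|
|     def generate(model, tokens, n_new, eos_id=None):
|         for _ in range(n_new):
|             scores = model(tokens)   # full forward pass per step
|             next_id = max(range(len(scores)),
|                           key=scores.__getitem__)
|             tokens = tokens + [next_id]
|             if next_id == eos_id:
|                 break
|         return tokens
|
|     # output_ids = generate(model, prompt_ids, 50)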
| Vermeulen wrote:
| It's laggy since it needs to do speech to text, gpt3 text
| response, then text to speech. Not sure what adds the most
| latency actually.
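|
| An easy way to find out, sketched here with placeholder stage
| functions (speech_to_text, generate_reply and text_to_speech
| are stand-ins for whatever the pipeline actually calls):
|
|     import time
|
|     def timed(label, fn, *args):
|         start = time.perf_counter()
|         result = fn(*args)
|         print(f"{label}: {time.perf_counter() - start:.2f}s")
|         return result
|
|     # text   = timed("STT", speech_to_text, audio)
|     # reply  = timed("GPT-3", generate_reply, text)
|     # speech = timed("TTS", text_to_speech, reply)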
|
| They only allow gpt3 chatbots if the chatbot is designed to
| speak only about a specific subject, and literally never says
| anything bad/negative (and we have to keep logs to make sure
| this is the case). Which is insane. Their reasoning to me was
| literally a 'what if' the chatbot "advised on who to vote for
| in the election". As if a chatbot in the context of a video
| game saying who to vote for was somehow dangerous
|
| I understand the need to keep GPT3 private. There is a lot of
| possibility for deception using it. But they are so scared of
| their chatbot saying a bad thing and the PR around that
| they've removed the possibility of doing anything useful with
| it. They need to take context more into account - a clearly
| labeled chatbot in a video game is different than a Twitter
| bot
| dfraser992 wrote:
| But what if it wasn't clearly labeled? I did my MSc thesis
| on fake reviews and discussed the phenomenon known as
| "covert marketing" a bit. e.g. a guy you're talking to in a
| bar at some point steers the conversation to the excellent
| beer he is drinking and heavily recommends it to you. Good
| enough actors will be very convincing. "Influencers" are a
| somewhat more ethical alternative that takes advantage of
| humans' lemming-like nature.
|
| I mean, quite a lot of people truly believe Hillary Clinton
| is the mastermind behind a DNC-run pedophile ring. Yes, she
| is a problem, but that theory is completely schizophrenic.
| An NPC masquerading as a real person who spouts positive
| talking points about Tucker Carlson's respect for Hungary
| is quite reasonable compared to that and it will suck some
| people in.
|
| So all it takes is some right-wing developers for a not-
| entirely-just-a-game like Second Life or Minecraft to
| introduce a bug that allows certain instances of NPCs to be
| unlabeled... or a mod to a game that drives an NPC... and an
| equivalent to GPT-3 funded by the Kochs or the Mercers...
|
| Very hypothetical, very hand waving. But it is possible. So
| I can see the PR and legal departments flat out stopping
| this idea.
| minimaxir wrote:
| > But they are so scared of their chatbot saying a bad
| thing and the PR around that they've removed the
| possibility of doing anything useful with it.
|
| It's not unreasonable to have checks-and-balances on AI
| content, and there should be.
|
| However, in my testing of GPT-3's content filter when it
| was released (it could be improved now), it was _very_
| sensitive to the point that it had tons of false positives.
| Given that passing content filter checks is required for
| productionizing a GPT-3 app, it makes the API too risky to
| use, and is part of the reason I'm researching more with
| train-your-own GPT models.
| nradov wrote:
| Why should there be checks and balances on AI content?
| What most people label as "AI" today is literally just
| fancy statistics. Should there be checks and balances on
| the use of linear regression analysis and other
| statistical techniques? Where do we draw the line?
| minimaxir wrote:
| > Should there be checks and balances on the use of
| linear regression analysis and other statistical
| techniques?
|
| That rhetorical question actually argues against your
| point: even in academic contexts, statistics can be used
| (intentionally or otherwise) to argue
| incorrect/misleading points, which is why reputable
| institutions have peer reviews/boards as a level of
| validation for papers.
|
| The point I was making was more on general content
| moderation in response to user-generated content, which
| is _required_ for every service that does so for legal
| reasons at minimum, as they're the ones who will get
| blamed if something goes wrong.
| mola wrote:
| Of course statistical techniques need checks and balances,
| hence peer-reviewed academic papers, meta-analyses, etc.
| Statistics is a major tool for science these days, and
| science needs checks and balances, otherwise it's a pretty
| idle effort. Without checks and balances, you could just
| imagine any theory and believe it's the truth because you
| want to.
| ummonk wrote:
| Eh, I could still see a clearly labeled chatbot on a video
| game causing a major PR scandal if it says something
| offensive. Not really worth the risk.
|
| Pretty bad that they took so long to decide on this,
| though, pulling out the rug from under developers' feet.
| qwertox wrote:
| This is stunning. Imagine being able to practice your foreign
| language lessons this way.
| TchoBeer wrote:
| How many languages does GPT3 support at the moment?
| make3 wrote:
| I work in this domain, and you can make these things say
| anything with a little probing, even stuff like "Hitler was
| right to kill all the Jews, I wish he was still alive today."
|
| They likely don't want to have "OpenAI GPT-3" and such stuff
| associated with one another in such demos; it would be really
| bad for their appearance.
| refulgentis wrote:
| I'm trying to extract some signal from this link...lots of
| upvotes, no comments, 30 min old, top 3 on HN...I'm worried this
| will be read as negative, but it's not, just learning, and enough
| time has passed I'm itching to jump in and ask:
|
| - Is the significance here exactly what it says on the tin: the
| model behind GitHub's AI code completion will be shared with
| people on an invite basis? Or am I missing something?
|
| - What is the practical import of the quote at the end of this
| comment?
|
| "can now" makes me think its a new feature over Github's
| implementation, which would then indicate the "simple commands"
| could be general UI, or at least IDE UI, navigation.
|
| If "can now" means "it is currently capable of, but will be
| capable of more", then I'd expect it to be the same as the
| current implementation on Github.
|
| Quote: "Codex can now interpret simple commands in natural
| language and execute them on the user's behalf--making it
| possible to build a natural language interface to existing
| applications."
| sbierwagen wrote:
| Take a look at the video demo. It takes natural text in a box
| and generates code. Copilot was super-autocomplete, so the
| interface was writing code in an IDE that it filled out for
| you. Natural language interface will be a little easier for
| non-programmers. (Though, how would you read the code to make
| sure it does what you meant...)
| polyanos wrote:
| >Take a look at the video demo. It takes natural text in a
| box and generates code. Copilot was super-autocomplete, so
| the interface was writing code in an IDE that it filled out
| for you.
|
| No it wasn't, you can literally describe, in natural text,
| what you want in a comment and CoPilot will do its best to
| generate a complete method based on that comment. It only
| seemed so auto-complete-y because they focused on the
| "helping the developer" part.
|
| I'm fairly sure CoPilot could have shown something similar if
| they had a demo where you could make something visual easily,
| like HTML + Javascript/Typescript/whatever scripting
| language. They're using exactly the same model (Codex) after
| all.
| am17an wrote:
| I really want to just play with this tech- it's frightening but
| also the future, but I'm still waiting to be accepted on the
| GitHub copilot waitlist. I wonder how long this will take for
| people who don't know someone who knows someone...
| [deleted]
| febrilian wrote:
| Uhh... I'm literally no one but got the access for like a week
| or so. I got 134 repos and 12,060 contributions in the last
| year. Idk if that mattered.
| andyxor wrote:
| that's not the future, these large language models have no
| understanding of language, they repeat the most frequently
| occurring patterns like parrots. They miss this whole thing
| called semantics.
| f0e4c2f7 wrote:
| They just finished a demo on twitch. Pretty crazy!
|
| https://www.twitch.tv/videos/1114111652
|
| Starts at 15:45.
| j0ej0ej0e wrote:
| aaaand they've blocked audio until 18:17ish, timestamp url:
| https://www.twitch.tv/videos/1114111652?t=00h18m17s
| raidicy wrote:
| lmao; copyright muted so you can't even hear them speaking.
| [deleted]
| karmasimida wrote:
| It is simultaneously impressive and underwhelming for me.
|
| I mean yes this is a super impressive demo, but it didn't go
| beyond my expectation. I really want to see whether this model
| can write a correct binary search method without seeing one
| before.
|
| Or, even when using binary search correctly, does it understand
| concepts like index boundaries?
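|
| For reference, this is the kind of boundary handling I mean
| (the usual off-by-one traps are all in lo, hi and mid):
|
|     def binary_search(xs, target):
|         # xs must be sorted; returns an index of target or -1.
|         lo, hi = 0, len(xs) - 1
|         while lo <= hi:
|             mid = (lo + hi) // 2
|             if xs[mid] == target:
|                 return mid
|             if xs[mid] < target:
|                 lo = mid + 1
|             else:
|                 hi = mid - 1
|         return -1
|
|     print(binary_search([1, 3, 5, 7, 9], 7))   # 3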
| stavros wrote:
| > I really want to see whether this model can write a correct
| binary search method without seeing one before.
|
| I don't believe the model was trained on Google interview
| answers, sadly.
| polyanos wrote:
| I found the whole UI/sandbox they created the most
| interesting part. Now don't get me wrong, the tech is
| certainly great and all, but I really didn't have the feeling
| I watched/learned more than I already knew from what was
| shown with Github CoPilot, although I was kinda impressed, if
| it really is as simple as they stated, at how it is able to
| adapt to new APIs.
|
| It's a shame they only limited the demo to relatively simple
| instructions.
___________________________________________________________________
(page generated 2021-08-10 23:00 UTC)