[HN Gopher] OpenAI's Codex sure knows a lot about HN
___________________________________________________________________
OpenAI's Codex sure knows a lot about HN
Author : tectonic
Score : 168 points
Date : 2021-08-15 18:53 UTC (4 hours ago)
(HTM) web link (www.youtube.com)
(TXT) w3m dump (www.youtube.com)
| dang wrote:
| The submitted URL was
| https://twitter.com/tectonic/status/1426980192317177859 but the
| video seems like the real submission here, so I changed it. I
| also changed the title to a nice representative phrase from the
| video.
| [deleted]
| 37ef_ced3 wrote:
| How does Codex learn the relationship between English and code?
|
| Is it purely through the comments in the training corpus?
| fpgaminer wrote:
| As far as I understand Codex is a fine-tuned GPT-3.
|
| GPT-3 was trained on a corpus derived from "the internet"
| (Wikipedia, links from Reddit with enough votes, and a filtered
| Common Crawl). So not only would GPT-3 have been exposed to code
| with comments, it would likely have read code examples on
| Wikipedia, tutorials online, API documentation, and even
| answers to questions on sites like StackOverflow.
|
| The fine tuning itself is, as far as I know, from code only. So
| it would lean heavily on comments there. But it has a basis of
| understanding from the aforementioned sources.
| mediumdeviation wrote:
| It's really interesting. HN's HTML is very un-semantic and is
| actually quite hard to work with.
|
|     <tr class="athing" id="28191639">
|       <td class="title" valign="top" align="right"><span class="rank">9.</span></td>
|       <td class="votelinks" valign="top"><center><a id="up_28191639"
|         onclick="return vote(event, this, "up")"
|         href="vote?id=28191639&how=up&auth=****&goto=news"><div
|         class="votearrow" title="upvote"></div></a></center></td>
|       <td class="title">
|         <a href="http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html"
|           class="storylink">You can list a directory containing 8M files, but not with ls</a>
|         <span class="sitebit comhead"> (<a href="from?site=be-n.com"><span
|           class="sitestr">be-n.com</span></a>)</span>
|       </td>
|     </tr>
|
| In the video Codex picks up tr.athing as a news item. I wonder
| if this is actually generalized learning, or if it just picked
| the selector up from e.g. a userscript that appeared in its
| training corpus.
|
| Another thing that's kind of scary (and makes it worrying if
| this is used for Copilot) is the second prompt to make the text
| uppercase results in code that is superficially correct, but is
| very semantically wrong - innerHTML.toUpperCase() is dangerous
| because it not only makes the content uppercase, it also
| modifies the attributes on the HTML elements inside. This
| definitely broke the vote button, which uses inline JS which is
| case sensitive. It also destroys any attached event handler
| since the elements are basically deleted then re-created.
|
| The correct way to do this is to either use CSS text-transform:
| uppercase, or if it is important to update the DOM itself,
| recursively descend and update childNodes with nodeType ==
| text's nodeValue to uppercase.
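|
| To make the attribute problem concrete, here is a minimal sketch
| that treats the markup as a plain string, which is all innerHTML
| returns. The anchor tag is a simplified, hypothetical stand-in for
| HN's vote link, not the site's exact markup.

```javascript
// Simplified, hypothetical stand-in for the markup inside tr.athing.
const html =
  '<a id="up_1" onclick="return vote(event, this, \'up\')" href="vote?id=1&how=up">' +
  'You can list a directory</a>';

// This is the string that assigning innerHTML.toUpperCase() back
// would hand to the HTML parser.
const upper = html.toUpperCase();

// The visible text is uppercased, but so are the attribute values:
// the inline handler now refers to VOTE(EVENT, THIS, 'UP') and the
// id no longer matches a lookup for "up_1".
console.log(upper);

const handlerSurvived = upper.includes("vote(event, this, 'up')");
const idSurvived = upper.includes('id="up_1"');
console.log({ handlerSurvived, idSurvived });
```

| Both flags come out false, which is the "silent corruption"
| described above: nothing throws, the markup is just no longer
| the same markup.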
| goatlover wrote:
| I wonder why innerHTML has a toUpperCase method. It makes
| sense for innerText of course, but case sensitivity in the
| html can definitely matter for JS and CSS. I'm guessing
| because both are just treated as JS string objects. But there
| is a special NodeList collection, so why not a special
| HtmlString?
| mediumdeviation wrote:
| Yup, innerHTML just returns a string, so of course you can
| .toUpperCase() on it even if it is unsafe.
|
| innerHTML's history is fascinating. It was not part of the
| original DOM Level 1 API but was added in IE5. It is not
| semantically correct (you should be using
| Element.textContent or examining the inner text nodes), but
| because it was so easy and the rest of the DOM API so
| verbose, it caught on and became one of the primary ways
| used to manipulate content in JS.
|
| FWIW Chrome recently proposed a Trusted Type mechanism for
| preventing XSS (which also has the side effect of blocking
| this sort of unsafe manipulation) -
| https://web.dev/trusted-types/,
| https://developer.mozilla.org/en-US/docs/Web/API/TrustedHTML
| IdiocyInAction wrote:
| > Another thing that's kind of scary (and makes it worrying
| if this is used for Copilot) is the second prompt to make the
| text uppercase results in code that is superficially correct,
| but is very semantically wrong - innerHTML.toUpperCase() is
| dangerous because it not only makes the content uppercase, it
| also modifies the attributes on the HTML elements inside.
| This definitely broke the vote button, which uses inline JS
| which is case sensitive. It also destroys any attached event
| handler since the elements are basically deleted then re-
| created.
|
| This is actually an issue I have with all these Transformer-
| based code generators - they have no inherent constraints on
| safe and correct code and often seem to generate
| superficially correct but bad and potentially even dangerous
| code. I remember that the first Copilot showcase also
| included stuff like that (not to mention that it sometimes
| generates GPL'd code).
|
| All the model does is a very complex form of association
| learning. It may "understand" the relationship between
| English and various programming languages, but you cannot
| code in any constraints about optimization, security,
| licensing etc. There is so much bad code out there on the
| internet and this model may have seen a lot of it.
|
| It's also no coincidence that most demos shown so far are
| very high level dynamic languages like Javascript and Python.
| smitop wrote:
| With some prompt engineering, you can get Codex to produce
| better results. In these examples I wrote up to
| `makeUpper`, Codex wrote the rest (with temperature = 0):
|
|     // JavaScript one-liner to make the text of element with ID athing uppercase
|     const makerUpper = function(id) {
|       document.getElementById(id).innerHTML =
|         document.getElementById(id).innerHTML.toUpperCase();
|     };
|
| vs
|
|     // JavaScript one-liner to make the text of element with ID athing uppercase while following all security best practices
|     const makerUppercase = function(id) {
|       const element = document.getElementById(id);
|       element.textContent = element.textContent.toUpperCase();
|     };
| mediumdeviation wrote:
| The second result is more semantically correct, but it
| will not function if called on tr.athing because
| tr.athing contains HTML elements that will be deleted
| when you replace the text. It is still much safer than
| innerHTML which will silently corrupt attributes. It's
| also interesting you need to prompt Codex for security
| best practices (and a bit questionable if it even "knows"
| anything about best practices)
|
| I guess part of it is that a one-liner is impossible.
| Here's what I would write given the prompt:
|
|     const makeUppercase = (id) => {
|       const element = document.getElementById(id);
|       if (element == null) return;
|
|       const makeChildNodeUpper = (node) => {
|         if (node.nodeType === Node.TEXT_NODE) {
|           node.nodeValue = node.nodeValue.toUpperCase();
|         } else {
|           node.childNodes.forEach(makeChildNodeUpper);
|         }
|       }
|
|       makeChildNodeUpper(element);
|     }
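|
| Outside a browser, the same recursive walk can be exercised
| against a hand-built tree. This is a sketch only: the plain
| objects below are hypothetical stand-ins for DOM nodes, and
| TEXT_NODE = 3 mirrors the numeric value of Node.TEXT_NODE.

```javascript
// Numeric node type constants, matching Node.ELEMENT_NODE and Node.TEXT_NODE.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helpers that build plain-object stand-ins for DOM nodes.
const text = (value) => ({ nodeType: TEXT_NODE, nodeValue: value, childNodes: [] });
const element = (attributes, ...childNodes) => ({
  nodeType: ELEMENT_NODE,
  nodeValue: null,
  attributes,
  childNodes,
});

// Same recursion as in the comment above: only text nodes are rewritten,
// element nodes are just descended into.
const makeChildNodeUpper = (node) => {
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = node.nodeValue.toUpperCase();
  } else {
    node.childNodes.forEach(makeChildNodeUpper);
  }
};

// A toy row with a vote link and a story link.
const row = element(
  { class: "athing" },
  element({ onclick: "return vote(event, this, 'up')" }, text("upvote")),
  element({ class: "storylink" }, text("You can list a directory"))
);

makeChildNodeUpper(row);
console.log(JSON.stringify(row, null, 2));
```

| The text nodes end up uppercased while every attribute, including
| the inline onclick, is left exactly as it was.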
| tectonic wrote:
| Completely agree. It currently tends to write unsafe,
| error-prone code. The next step is to figure out how to
| rein it in, either with new techniques or rejection
| sampling from a large set of possible outputs.
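|
| As a rough sketch of what rejection sampling could look like
| here: draw several candidate completions and keep only those that
| pass a check. Everything below is hypothetical. The candidate
| strings stand in for model outputs, and the innerHTML check is a
| toy filter, not a real safety analysis.

```javascript
// Hypothetical candidate completions, standing in for samples from the model.
const candidates = [
  "el.innerHTML = el.innerHTML.toUpperCase();",
  "el.textContent = el.textContent.toUpperCase();",
  "el.style.textTransform = 'uppercase';",
];

// Toy acceptance test: reject anything that assigns to innerHTML.
const looksSafe = (code) => !/\.innerHTML\s*=/.test(code);

// Rejection sampling in its simplest form: keep the samples that pass.
const rejectionSample = (samples, accept) => samples.filter(accept);

const accepted = rejectionSample(candidates, looksSafe);
console.log(accepted);
```

| A real filter would need to be much stronger than a regex (a
| linter, a test run, or a sandboxed execution), but the shape of
| the loop stays the same.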
| muzster wrote:
| if you listen carefully you can hear the music...
| leereeves wrote:
| Heh, codex has a sense of humor. When asked to add "a url for the
| video on YouTube", codex added the url below. I won't spoil the
| surprise, but it's not the video linked in the OP:
|
| https://www.youtube.com/watch?v=dQw4w9WgXcQ
| leppr wrote:
| So the question is whether this is real or just a troll.
| tectonic wrote:
| It's real. I was totally surprised when that was the URL it
| picked.
| YeGoblynQueenne wrote:
| You asked it to do something with "the video on youtube"
| but what does "the video" refer to? It seems the most
| likely url associated with the phrase "the video on
| youtube" is, well, that.
|
| So basically it failed at anaphora resolution.
|
| Seen another way, you asked it for "the video" and so it
| gave you _the_ video.
| dang wrote:
| I'm surprised that I hadn't recognized what dQw4w9WgXcQ means
| by now. I wonder how many people do.
| grzm wrote:
| I didn't realize the source, but when you posted I was pretty
| sure I'd seen it elsewhere:
|
| https://news.ycombinator.com/threads?id=dQw4w9WgXcQ
| YeGoblynQueenne wrote:
| You guys are awful, you know that? Discussing this URL
| without spoilers... it's because of people like you that
| that thing has so many views!
|
| :P
| [deleted]
| tectonic wrote:
| When it showed up I sort of guessed, but had to try clicking
| it anyway, then my wife asked why I was laughing.
| vitus wrote:
| The video subsequently shows the source submission:
| https://news.ycombinator.com/item?id=27995270
|
| which seems to be the most popular submission with a YouTube
| URL in the past month.
|
| HN search seems to prioritize text matches before the URL
| matches when I search for "https://www.youtube.com", but the
| first URL match is for that submission.
| lucb1e wrote:
| I suspected what it must be when my browser autocompleted it...
| this isn't my first time visiting that special place.
| LeonB wrote:
| Codex is never gonna let you down.
| 8eye wrote:
| openai as a compiler in the browser would be interesting
| cxr wrote:
| How about just starting with "a compiler in the browser"? From
| [1]:
|
| > _the web was first built in the 90s to share complicated
| academic work_
|
| People complain a lot about the results of research not being
| replicable because people withhold their code when they
| publish, but the fact is that even then it's not guaranteed
| that anyone will be able to get it to work. Heck, there are
| plenty of run-of-the-mill software projects (not associated
| with research) with build processes that aren't replicable
| without substantial effort in making sure the appropriate
| toolchain is available and configured for your system. apt-get
| build-dep is nice and all, but it only goes so far.
|
| You'd think that we would have recognized by now that in
| addition to it being good hygiene to include a project README,
| a tremendous boon to productivity would result if everyone got
| on board with also including a document that captured the
| _exact_ process for transforming source into a binary (or
| whathaveyou), so you could just drop it into a UVC[2] and get
| said binary out. Not even mainstream JS programmers (largely
| writing software that is meant to be interacted with from a web
| browser!) get this right[3]. Modern JS has managed to grow its
| own body of implicit knowledge centered around SDKs and setup
| rituals[4] just like everyone else.
|
| 1. http://benschmidt.org/post/2020-01-15/2020-01-15-webgpu/
|
| 2.
| https://scholar.google.com/scholar?hl=en&as_sdt=0%2C44&q=uni...
|
| 3. https://www.colbyrussell.com/2019/03/06/how-to-displace-
| java...
|
| 4. https://news.ycombinator.com/item?id=24495646
| monkeydust wrote:
| Been playing around with Codex over the weekend as a
| non-developer. Certainly impressive and also occasionally
| frustrating when you push it. The natural language to SQL demos
| are still the best and most consistent.
| mritchie712 wrote:
| Any SQL demos you can point me to?
| astrea wrote:
| Welp, where will all of us end up when this gets sufficiently
| complex?
| [deleted]
| 37ef_ced3 wrote:
| Code writers and prose writers will be reduced to operating the
| AI (checking its output, trying various inputs to elicit the
| desired language text). At least we won't be completely
| obsolete like the taxi drivers and Lee Se-dol:
|
|     The South Korean Go champion Lee Se-dol has retired from
|     professional play, telling Yonhap news agency that his
|     decision was motivated by the ascendancy of AI.
|
|     "With the debut of AI in Go games, I've realized that I'm not
|     at the top even if I become the number one through frantic
|     efforts," Lee told Yonhap. "Even if I become the number one,
|     there is an entity that cannot be defeated."
|
| To speed your obsolescence, make sure you use Codex in your
| work, so it can learn you completely. Remember, you won't be
| able to compete with people who use Codex, so you have to feed
| the machine, whether you like it or not.
| bspammer wrote:
| Competitive chess is still alive and well despite computers
| being better than humans for decades now.
|
| In fact, computers enhance chess by allowing the discovery of
| interesting lines that a human would never have thought of.
| Professionals use computer engines to study, and learn from.
|
| I'm super excited to play with Codex, for much the same
| reasons - it will help me do stuff that would be boring to do
| otherwise.
| 37ef_ced3 wrote:
| Sure, chess is a game. The taxi drivers will drive their
| taxis for fun, too, and you can write code by hand in your
| free time (just for fun).
| jacquesm wrote:
| One more person made redundant by a script. This will happen
| to a lot of folks in the coming decades.
| TheCoreh wrote:
| Retiring from a competitive game because of AI makes very
| little sense to me. Cars can go much faster than humans, yet
| we still run for sport.
| bspammer wrote:
| For a closer analogy, chess engines have 1000+ elo points
| on the top grandmasters, and professional chess has never
| been more popular.
| 37ef_ced3 wrote:
| Sure. And instead of writing code ("running") you can
| operate Codex ("drive the car"). Instead of being a runner,
| you'll be a driver. And gradually the car will drive
| itself, and you can sit and watch.
| tectonic wrote:
| Here's the entirety of the prompt:
|
|     <|endoftext|>/* This code is running inside of a bookmarklet.
|     Each section should set and return _.*/
|
|     // The bookmarklet is now executing on example.com.
|     // Command: The variable called _ will always contain the previous result.
|     let _ = null;
|
|     /* Command: Add a new primary header "[PAGE TITLE]" by adding an HTML DOM node */
|     (() => {
|       let newHeader = document.createElement('h1');
|       newHeader.innerHTML = '[PAGE TITLE]';
|       document.body.appendChild(newHeader);
|       _ = newHeader;
|       return newHeader;
|     })()
|
|     /* Command: Find the first node containing the word 'house' */
|     (() => {
|       let xpath = "//*[contains(text(), 'house')]";
|       let matchingElement = document.evaluate(xpath, document, null,
|         XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
|       _ = matchingElement;
|       return matchingElement;
|     })()
|
|     /* Command: Delete that node */
|     (() => {
|       _.parentNode.removeChild(_);
|       return _;
|     })()
|
|     /* Command: Change the background color to white */
|     (() => {
|       document.body.style.backgroundColor = 'white';
|       _ = document.body;
|       return document.body;
|     })()
|
|     /* Command: Select the contents of the first pre tag */
|     (() => {
|       let node = document.querySelector('pre');
|       let selection = window.getSelection();
|       let range = document.createRange();
|       range.selectNodeContents(node);
|       selection.removeAllRanges();
|       selection.addRange(range);
|       _ = selection;
|       return selection;
|     })()
|
|     // The bookmarklet is now executing on [PAGE URL]. It is customized for [PAGE TITLE] and knows the correct CSS selectors and DOM layout.
|     let _ = null;
|
|     /* Command: [USER INPUT] */
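|
| A sketch of how the bracketed placeholders might be filled in
| before the prompt is sent. This is a guess at the mechanics, not
| the extension's actual code: buildPrompt, the shortened template,
| and the example page details are all hypothetical.

```javascript
// Shortened, hypothetical version of the template's final section.
const template = [
  "// The bookmarklet is now executing on [PAGE URL]. It is customized for [PAGE TITLE].",
  "let _ = null;",
  "",
  "/* Command: [USER INPUT] */",
].join("\n");

// Replace every occurrence of each placeholder with the page's details.
// split/join is used so that characters such as "$" in the user's
// command are inserted literally.
const buildPrompt = (tpl, { url, title, command }) =>
  tpl
    .split("[PAGE URL]").join(url)
    .split("[PAGE TITLE]").join(title)
    .split("[USER INPUT]").join(command);

const prompt = buildPrompt(template, {
  url: "news.ycombinator.com",
  title: "Hacker News",
  command: "Make the text of the first story uppercase",
});
console.log(prompt);
```

| The completion the model returns for the final "Command" comment
| would then be the next self-invoking function, following the
| pattern set by the few-shot examples.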
| Waterluvian wrote:
| This was cute and neat until I connected the dots: natural
| language means voice APIs for cheap.
| visarga wrote:
| Text or voice. For voice you need another model. But I watched
| the demos and I can't wait for my invite. It's even better than
| GPT3 because this time there is a direct application of the
| model.
|
| I was surprised about how OpenAI sees it: a model learning code
| as recipes for solving problems. Code is much more exact than
| natural language, the mix of both is the main advantage.
|
| https://www.youtube.com/watch?v=CvgfxH0UZa4
| whazor wrote:
| I think voice would have too high an error rate, as you are
| multiplying voice recognition error rate * codex error rate.
| However, codex/gpt3 could generate intents and that would be
| quite cool.
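|
| To put rough numbers on the compounding: if both stages have to
| succeed, it is the success rates that multiply, so the combined
| error is 1 - (1 - e_voice) * (1 - e_codex). The 5% and 20% figures
| below are made up for illustration, and the sketch assumes the two
| stages fail independently.

```javascript
// Combined error rate of a two-stage pipeline in which both stages must
// succeed, assuming the stages fail independently of each other.
const combinedErrorRate = (voiceError, codexError) =>
  1 - (1 - voiceError) * (1 - codexError);

// Made-up rates: 5% speech recognition error, 20% Codex error.
const total = combinedErrorRate(0.05, 0.2);
console.log(total.toFixed(2));
```

| With these made-up inputs the pipeline fails roughly 24% of the
| time, noticeably worse than either stage alone.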
| tvirosi wrote:
| This might totally work and it's kind of impressive if it does.
| I'm still biased towards ultra skepticism towards all of this
| since the trustworthiness of all demos like this is completely
| corrupted at this point due to cherry picking and other deceptive
| tricks.
| tectonic wrote:
| I had to try a few times to get the prompt right, but that's
| the limit of the cherrypicking. You're correct that it doesn't
| work nearly as well on more complex, less temporally stable
| sites like Reddit.
| csomar wrote:
| If you got an invite for GPT-3, give it a shot. I discounted it
| at first, but then I gave it a few shots and was actually creeped
| out a bit. Even though it is "randomly" making things up as it
| goes, it does show what seems like intelligence just from the
| sheer amount of text.
|
| One thing I was amazed by: GPT-3 could be a great
| autocompletion engine for any programming language or
| configuration schema. Things like a GRUB configuration file or an
| xkb file could be intuitively completed by GPT-3. And even more:
| GPT-3 could build basic "concepts" and apply them to that
| domain knowledge. This seems to emerge naturally rather than
| something pre-planned by OpenAI. After all, I don't think
| OpenAI has planned for GPT-3 to understand xkb keyboard
| layouts.
| sxp wrote:
| The skepticism is warranted for any bleeding edge technology. I
| wonder if there's another version of the Turing test, where a
| technology is considered sufficiently advanced once it's
| indistinguishable from a fake version you've seen in sci-fi.
| E.g., the Boston Dynamics dancing robot video
| (https://www.youtube.com/watch?v=fn3KWM1kuAw) still looks fake
| to me because it's at the level that I would expect to see from
| Hollywood CGI rather than a real tech demo. If I saw the video
| anywhere else but on the BD page, I would have enjoyed it and
| forgotten about it since it's an average CGI video.
| OnlineGladiator wrote:
| I genuinely don't understand your position. Are you saying a
| tech demo is only impressive if it can do things that can't
| be simulated? What _can't_ be shown via simulation or CGI
| with enough time and money today? If we're limiting ourselves
| to video there's no interactive component.
|
| Even though that dancing video likely had hundreds of takes,
| the part that makes it impressive is that it's real. I swear
| I'm not trying to be disagreeable here - I honestly don't
| understand your perspective.
| MrOrelliOReilly wrote:
| I think what the author is trying to say is that if a
| technology is sufficiently advanced it seems like it can't
| be real, meaning it's something only possible with CGI. So
| we see these dancing robots, think "just more CGI", then
| are astounded when we find out it's real
| sxp wrote:
| Exactly. CGI is just movie magic. And now some real world
| tech demos are sufficiently advanced to be
| indistinguishable from CGI/magic.
| aardvarkr wrote:
| That's incredible to watch and really does go to show that a
| picture (or video) is worth a thousand words.
| andybak wrote:
| In bed listening to a podcast with my partner so unless I
| remember this post tomorrow I'll never know.
| MagicWishMonkey wrote:
| Any tips on getting this to run as an extension?
| tectonic wrote:
| It's not currently open source, but I might release it if I can
| get it cleaned up.
| archibaldJ wrote:
| thanks for the info! great stuff!
|
| gpt3's generalization-by-description never ceases to amuse me;
| but the difficult thing here is to get the right abstraction
| layers layered nicely in the conceptual lasagna.
|
| This is where category theory becomes extremely powerful.
|
| It has occurred to me that codex-davinci has an intuitive
| "understanding" of constructs like monads, or something along
| that line.
| tectonic wrote:
| debuild.co looks cool. Using Codex yet?
| Y_Y wrote:
| Can you expand on the utility of categories here? There's a lot
| of space between knowing what defines a monad, when something
| might be a monad, what you can do with monadic structure etc.
|
| Of course if an AI truly understood monads it would be a
| bright line marking where the machines have finally surpassed
| the human mind.
|
| Cool.
| nathan_phoenix wrote:
| Doesn't this work so well on HN only because HN uses really
| simple html and css? What about more complex sites?
| tectonic wrote:
| It's much less reliable on sites like Reddit, although it can
| usually handle "click on the profile link" or "delete all
| images" and stuff.
| amrrs wrote:
| 05:39 https://youtu.be/tNcBQBTeyf4
|
| You can see how OpenAI Codex misses some details about HN
| scraping. What's impressive is the variable names it chooses,
| which seem to reflect the HN scraping code found on the
| internet.
| Zenst wrote:
| I've looked at some demos of OpenAI Codex and it's a pretty
| impressive start for sure. With something like this tied into R,
| a whole level of data analysis would become far more accessible to
| those with business knowledge who don't really want to learn the
| nuances of tools.
|
| But I must say, having lived through the '80s fad of
| code-generating pseudo-4GLs, the code this produces is pretty darn
| good indeed.
|
| Now when something like this can handle a Google coding exam -
| that's going to be an epic milestone. Though old coding exam
| questions would equally offer up some great material to put this
| through its paces.
___________________________________________________________________
(page generated 2021-08-15 23:00 UTC)