[HN Gopher] Infinite AI Array
___________________________________________________________________
Infinite AI Array
Author : adrianh
Score : 257 points
Date : 2023-01-02 22:36 UTC (1 day ago)
(HTM) web link (ianbicking.org)
(TXT) w3m dump (ianbicking.org)
| jamal-kumar wrote:
| I tried GPT-3 and it generated code examples that just ~made up~
| some library that didn't exist at all. The code would, in theory,
| have run if that library existed, but it didn't, and it
| consistently came up with code examples using this imaginary
| library. I thought that wasn't very impressive.
|
| What I'm finding it's extremely great for is writing drivel. I'm
| talking ecommerce product descriptions, product names, copy...
| it's really awesome for making one of my side gigs less of a
| chore.
| ianbicking wrote:
| I asked it to write some GPT-related code and it used a gpt3
| library (which does not exist) instead of the openai library. I
| was amused by the self-blindness.
|
| (Forgot to actually look it up... seems like it does exist but
| shouldn't, as it's an empty placeholder:
| https://pypi.org/project/gpt3/)
| mmastrac wrote:
| If the results were stable, reproducible, and somehow memoizable,
| this would actually be insanely useful. Perhaps it could modify a
| cache file somewhere with the generated Python code, and that
| could be committed.
| recuter wrote:
| In what manner do you reckon it would be insanely useful?
| etaioinshrdlu wrote:
| The backend OpenAI APIs are not deterministic even with
| temperature 0, and they might upgrade the model weights/tuning
| behind your back too? (Not sure about the upgrade, they might
| just put out a new model id param...)
| nasir wrote:
| Normally you can choose the model version, so that's not a
| concern.
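|
| (For illustration, a minimal sketch with the 2023-era openai
| Python client, assuming `openai.api_key` is already set; pinning
| an explicit model id plus temperature 0 is about as repeatable
| as the API allows:)
|
|     import openai
|
|     # Pin a specific model id rather than an alias, and use
|     # temperature 0; results are then as repeatable as the API
|     # allows, though still not strictly deterministic.
|     response = openai.Completion.create(
|         model="text-davinci-003",
|         prompt="List three airline names:",
|         temperature=0,
|     )
|     print(response["choices"][0]["text"])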
| ilaksh wrote:
| They seem pretty consistent with temperature 0. Maybe not
| 100% but very close.
| justsid wrote:
| "Pretty consistent" and "deterministic" are not at all the
| same
| AgentME wrote:
| It's close but not fully deterministic. I remember seeing a
| theory that if the system is considering multiple possible
| completions that have the same rounded score percentage,
| then its choice between them is nondeterministic.
| teebs wrote:
| It's likely because GPU calculations are non-deterministic, and
| small differences in floating-point numbers could lead to
| different outcomes (either in the way you described or
| somewhere deeper in the model).
| keuriGPT wrote:
| I'm no Python expert, but it looks like this does happen in
| memory: https://github.com/ianb/infinite-ai-array/blob/main/iaia/mag...
|
| I can't imagine it's too hard to serialize the `existing`
| Python dict so that subsequent runs are deterministic.
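|
| (A minimal sketch of that idea, assuming `existing` is the dict
| named in the linked source:)
|
|     import json
|
|     # Persist the in-memory cache so previously generated values
|     # come back verbatim on the next run.
|     with open("iaia-cache.json", "w") as f:
|         json.dump(existing, f)
|
|     # ...and on the next run:
|     with open("iaia-cache.json") as f:
|         existing = json.load(f)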
| wging wrote:
| It is also caching requests on-disk. See for instance
| https://github.com/ianb/infinite-ai-array/blob/main/iaia/gpt...
|
| The problem* is that if a prompt changes slightly then it
| won't be a cache hit.
|
| * okay, _one_ problem.
| samwillis wrote:
| Looks like Dill [0] would be a good option for serialising the
| generated code. The built-in Python Pickle doesn't support
| pickling functions properly, but Dill extends it to enable this
| sort of functionality.
|
| 0: https://pypi.org/project/dill/
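|
| (A minimal sketch of what Dill adds over plain pickle: for a
| function defined in __main__, the function itself round-trips,
| not just a reference to its name:)
|
|     import dill
|
|     def greet(name):
|         return f"hello, {name}"
|
|     blob = dill.dumps(greet)      # serializes the code object too
|     restored = dill.loads(blob)
|     print(restored("world"))      # -> hello, world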
| wging wrote:
| Per the README (and the source code), requests to GPT are
| cached. I know that doesn't solve the stability problem, since
| the cache is keyed on the exact prompt among other parameters,
| but it's something at least.
|
| The source code shows that it's using pickling to store OpenAI
| responses on the local filesystem. See here:
|
| https://github.com/ianb/infinite-ai-array/blob/main/iaia/gpt...
|
| and here:
|
| https://github.com/ianb/infinite-ai-array/blob/main/iaia/gpt...
| ianbicking wrote:
| It really should be saving them to JSON instead of Pickle,
| but I gave up trying to figure out how to properly rehydrate
| the openai Completion objects.
|
| If it were JSON then it wouldn't be crazy to add it to your
| repository. (I guess Pickle is stable enough in practice, but
| it would offend my personal sensibilities to check them in.)
| karmasimida wrote:
| Neat idea. But GPT itself has a context length limit, so it is
| not infinite? At some point the list will go rogue and become
| irrelevant.
| ianbicking wrote:
| There's a maximum context of 10 items so it won't ever
| technically stop (though your GPT bill will keep going up). For
| something clearly ordered this might be enough, e.g.,
| primes[100:110] will give a context of primes[90:100] which
| might give you the correct next primes. (Everything in this
| library is a "might".)
|
| For something like the book example I expect once you get to
| item 12 it might repeat item 0 since that won't be included in
| the examples.
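|
| (A rough sketch of that sliding-window idea; the prompt shown
| is illustrative, not the library's actual prompt:)
|
|     def next_item_prompt(name, items, window=10):
|         # Only the last `window` known items are sent as
|         # context, so the list can grow forever at a constant
|         # prompt size.
|         context = items[-window:]
|         return (
|             f"Here are consecutive items from a list named {name!r}:\n"
|             + "\n".join(repr(x) for x in context)
|             + "\nGive the next item, and nothing else."
|         )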
| visarga wrote:
| If you have a prompt with demonstrations (in-context learning)
| you can randomly sample the demonstrations from a larger
| dataset. This will make the model generate more variations as
| you are introducing some randomness in the prompt.
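|
| (A minimal sketch of sampling demonstrations into the prompt;
| the names here are illustrative:)
|
|     import random
|
|     def build_prompt(question, demonstrations, k=3):
|         # A different random subset of demonstrations on each
|         # call introduces some variation into the prompt.
|         shots = random.sample(demonstrations, k)
|         lines = [f"Q: {q}\nA: {a}" for q, a in shots]
|         return "\n".join(lines) + f"\nQ: {question}\nA:"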
| MrZander wrote:
| Clever. I could see this being legitimately useful for generating
| test data.
| ccozan wrote:
| I find it simply ... human to use a tool once you understand it.
| The ingenuity is so characteristic. Python is fusing with an NL
| dialect. Amazing.
|
| It feels like the current AI developments ( OpenAI, SD, etc) are
| just like wheels. We are now putting them on a cart and
| inventing transportation.
|
| And look, we are planning to go to Mars.
| shadowgovt wrote:
| Python, in particular, is a great language to play this game
| with, since an imported module is just another object whose
| attribute lookups can be hooked via the __getattr__ method.
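|
| (A minimal sketch of the hook, using PEP 562's module-level
| __getattr__; the behavior is purely illustrative:)
|
|     # mymagic.py
|     def __getattr__(name):
|         # Called for any attribute this module doesn't define,
|         # so `from mymagic import anything` fabricates a stub.
|         def stub(*args, **kwargs):
|             return f"pretending to be {name} called with {args!r}"
|         return stub
|
|     # elsewhere:
|     # from mymagic import fly_to_the_moon
|     # fly_to_the_moon()  -> "pretending to be fly_to_the_moon ..."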
| paxys wrote:
| Does anyone remember Google Sets? It was a short-lived project
| where you'd input some entries from a list and Google would
| automatically expand it with related items. Seemed magical at the
| time (mid-late 00s I think?).
| dan-robertson wrote:
| I think it partly made it into Google Sheets. The 'auto fill'
| function when you drag the bottom-right corner of a range will
| work on lots of things you wouldn't expect. But maybe it
| doesn't work anymore.
| Gigachad wrote:
| Yeah I 100% remember this around when Sheets first came out.
| hnuser123456 wrote:
| Google Squared, and yes, killed around 2007-8, and yes, I've
| never seen anything that compares since. It was Sheets except
| you could make any row and column headers/labels you want and
| it would try to fill in the result for every cell.
|
| Edit: just RTFA'd, yeah this is better. What happens if you
| call magic.ai_take_over_world_debug_until_no_exceptions()?
| iamflimflam1 wrote:
| I was just messing around with ChatGPT for a similar use case.
| Amazing what comes out if you ask for:
|
|     Give me a list of imaginary products that might be found in
|     a magic shop, formatted as a JSON array including an id,
|     name, description, sku, price. Use syntax highlighting for
|     the output and pretty print it.
| xianshou wrote:
| This is a great illustration of how ChatGPT fundamentally changes
| "true/false" into a continuum of "truthiness" as measured by
| plausibility. The infinite AI array is clearly marked as such,
| but how long will it be before generative extensibility is the
| (undeclared) norm?
|
| We're all about to get some real-world GAN training.
| kleene_op wrote:
| I wasn't aware there was a mechanism to get the actual name
| attached to an object when it is instantiated in Python! That
| alone made my day.
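|
| (A minimal sketch of one way to pull that off: peek at the
| caller's source line and parse the assignment target. Fragile,
| but roughly the trick such libraries use:)
|
|     import inspect
|     import re
|
|     class Named:
|         def __init__(self):
|             # Look at the line that invoked the constructor,
|             # e.g. "primes = Named()", and grab the left-hand
|             # side of the assignment.
|             frame = inspect.stack()[1]
|             line = (frame.code_context or [""])[0]
|             match = re.match(r"\s*(\w+)\s*=", line)
|             self.name = match.group(1) if match else None
|
|     primes = Named()
|     print(primes.name)  # -> "primes"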
| w-m wrote:
| The magic method resolution seems much more intriguing to me than
| the infinite array. Are there any programming languages (or
| perhaps a better word would be paradigms) that take this
| concept further?
|
| I would imagine that I just state comments in my code file. Then,
| at runtime, all code is produced by the language model, and then
| executed.
|
| There's the issue of the model producing different generative
| results with each execution. But maybe that could be taken care
| of by adding doctests within my comments. Which could, of
| course, be mostly generated by another language model...
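|
| (A hypothetical sketch of that idea; `ask_llm` stands in for
| whatever completion API you'd use, and nothing here is an
| existing library:)
|
|     def implemented_by_llm(func):
|         # Ask a model for a function body matching the
|         # docstring, exec it, and call the result. Assumes the
|         # model returns a function with the same name.
|         def wrapper(*args, **kwargs):
|             source = ask_llm(
|                 f"Write a Python function named {func.__name__} "
|                 f"that does the following:\n{func.__doc__}"
|             )
|             namespace = {}
|             exec(source, namespace)
|             return namespace[func.__name__](*args, **kwargs)
|         return wrapper
|
|     @implemented_by_llm
|     def slugify(title):
|         """Turn a title into a lowercase, hyphenated URL slug."""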
| shadowgovt wrote:
| Hardware manufacturers actually have to deal with this issue
| these days.
|
| An awful lot of chip fabrication is done via stochastic /
| heuristic / Monte Carlo methods for any large chip; rather than
| exhaustively laying out the whole chip by hand, hardware devs
| describe the chip in a constraints language and feed it to a
| fabrication program (often with some additional parameters like
| "optimize speed" or "minimize power consumption / heat
| generation"). Then the program outputs the chip schematic to be
| fabricated.
|
| Unless you save the random seed at every step, it's entirely
| possible to end up with the problem that you have _a_
| schematic, but if you lose it you'll never get the program to
| produce exactly that schematic again (because hundreds or
| thousands of distinct schematics solve the problem well enough
| to terminate the Monte Carlo search).
| matsemann wrote:
| Used to be quite common in Java Spring JPA, but I haven't seen
| it that much in the wild lately. Basically you just declare a
| method name on an interface, and at runtime it will implement
| it. https://docs.spring.io/spring-data/jpa/docs/current/referenc...
|
| To make fun of this behavior, I made a similar thing for
| javascript when proxies came into the language
| https://github.com/Matsemann/Declaraoids
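|
| (A rough Python analogue of that trick, resolving a
| find_by_<field> name at attribute-access time; purely
| illustrative:)
|
|     class Repository:
|         def __init__(self, rows):
|             self.rows = rows
|
|         def __getattr__(self, name):
|             # Derive the query from the method name itself.
|             if name.startswith("find_by_"):
|                 field = name[len("find_by_"):]
|                 return lambda value: [
|                     row for row in self.rows
|                     if row.get(field) == value
|                 ]
|             raise AttributeError(name)
|
|     repo = Repository([{"name": "Ada"}, {"name": "Alan"}])
|     print(repo.find_by_name("Ada"))  # -> [{'name': 'Ada'}]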
| shadowgovt wrote:
| This will be absolute gold for generating test data.
| alar44 wrote:
| Can anyone clarify what this does? I can't understand what the
| purpose of this is.
| Workaccount2 wrote:
| I believe it uses ChatGPT to generate "infinite" lists of
| things in a chosen topic.
|
| So rather than programming in every airline (an array of
| airline names) for a program that returns airline names, it
| queries ChatGPT for airline names.
|
| If I'm wrong I am sure someone here will be quick to correct me
| and provide the right answer.
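|
| (That's the gist. A toy sketch of the mechanism, not the
| library's actual API; `ask_gpt_for_more` stands in for a real
| completion call and is assumed to return at least one new item:)
|
|     class InfiniteList:
|         def __init__(self, description):
|             self.description = description
|             self.items = []
|
|         def __getitem__(self, index):
|             # Lazily extend the list whenever indexing past the
|             # end of what's been generated so far.
|             while index >= len(self.items):
|                 self.items.extend(
|                     ask_gpt_for_more(self.description, self.items)
|                 )
|             return self.items[index]
|
|     airlines = InfiniteList("airline names")
|     # airlines[3] -> whatever the model guesses comes fourth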
| molenzwiebel wrote:
| This is similar to copilot-import [0] which in turn was based on
| stack-overflow-import [1]. I'd be interested to see whether
| ChatGPT/GPT-3 or Codex/Copilot is better at generating function
| bodies.
|
| [0]: https://github.com/MythicManiac/copilot-import
| [1]: https://github.com/drathier/stack-overflow-import
| ballenf wrote:
| Super useful for mock data generation and testing.
| pedrovhb wrote:
| This is an example of something I've seen referred to as "code
| hallucination". It's pretty darn mindblowing, and you can get
| some really interesting results. Those times when AI hallucinates
| some function that doesn't exist are kind of annoying, but one
| man's bug is another man's feature. You can turn the tables on
| it and make it useful by __going ahead and using those
| functions that don't exist__.
|
| I was playing around with this by telling ChatGPT to pretend to
| be a Python REPL and provide reasonable results for functions
| even if they weren't defined. A few of my favorite results:
|     >>> sentence_transform("I went to the bank yesterday.", tense="future")
|     "I will go to the bank tomorrow."
|     >>> wittiest_comeback(to_quip="Hey George, the ocean called. They're running out of shrimp.", funny=True)
|     "Well, I hope it's not too crabby about it."
|     >>> sort_by_temperature(["sun", "ice cube", "flamin-hot cheetos", "tea", "coffee", "winter day", "summer day"], reverse=True)
|     ["flamin-hot cheetos", "sun", "tea", "coffee", "summer day", "winter day", "ice cube"]
|
| It took some experimenting to get it to consistently respond as
| expected. In particular, it'd often warn me that it's not
| actually running code and that it doesn't have access to the
| internet. Explicitly telling it to respond despite those things
| helped. Here's the latest version of the prompt I've had success
| with:
|
| ---
|
| Your task is to simulate an interpreter for the Python
| programming language. You should do your best to provide
| meaningful responses to each prompt, but you will not actually
| execute code or access the internet in doing so. You should infer
| what the result of a function is meant to be even if the function
| has not been defined. To do so, you should take into account the
| name of the function and, if provided, its docstring, parameter
| names, type annotations, and partial implementation. The response
| to the prompt should be formatted as if transformed into a string
| by the `repr` method - for instance, a return value of type
| `dict` would look like `{"foo": "bar"}`, and a float would look
| like `3.14`. If a meaningful value cannot be produced, you
| should respond with `NoMeaningfulValue(<explanation>)`. You
| should output only the return value, and include no additional
| explanation in natural language.
|
| ---
|
| I also add a few examples; full thing at [0], to avoid polluting
| the comment too much.
|
| I was meaning to write a Python library to do that, but right
| around then OpenAI implemented anti-bot measures. I'm sure it's
| possible to circumvent them one way or another, but if there are
| measures in place there's a reason for them, and it's not very
| nice to degrade everyone's experience. I've had less impressive
| results with codex-2 so far. Still, harnessing hallucination is a
| pretty cool idea.
|
| [0]
| https://gist.github.com/pedrovhb/2ac9b93f446f91a2be234622309...
| ianbicking wrote:
| I like the idea of a prompt based sort! E.g.,
| books.sort(key="publish date"). I'm not sure if that's best
| done with a dict-like approach (i.e., actually calculate the
| key) or let it get really fuzzy and ask it to sort directly
| based on an attribute. Then you might be able to do
| books.sort(key="overall coolness factor") which is an attribute
| that doesn't necessarily map to any concrete value but might be
| guessed on a pairwise basis. (This might be stretching GPT a
| bit far.)
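|
| (A sketch of the pairwise version of that idea; `ask_gpt` is a
| stand-in for a real completion call:)
|
|     from functools import cmp_to_key
|
|     def gpt_sort(items, key):
|         def compare(a, b):
|             # Ask the model which item ranks first on the fuzzy
|             # key; no concrete attribute value is ever computed.
|             answer = ask_gpt(
|                 f"By {key!r}, which comes first: A={a!r} or "
|                 f"B={b!r}? Answer with just A or B."
|             )
|             return -1 if answer.strip().upper().startswith("A") else 1
|         return sorted(items, key=cmp_to_key(compare))
|
|     # books_sorted = gpt_sort(books, key="overall coolness factor")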
| AgentME wrote:
| Using just your examples and no explanatory prompt with text-
| davinci-003 or code-davinci-002 worked pretty well for me in
| some quick tests.
| stared wrote:
| It kind of reminds me of this one https://xkcd.com/221/:
|
|     getRandomNumber()
|     {
|         return 4; // chosen by fair dice roll.
|                   // guaranteed to be random.
|     }
| visarga wrote:
| Have you ever imagined a day would come when an AI could
| explain this joke?
|
| > This code snippet appears to be a joke because it is claiming
| to be a function that returns a random number, but the function
| always returns the number 4. The line "chosen by fair dice roll
| / guaranteed to be random" is a reference to the common phrase
| "random as a dice roll," which means something is truly random
| and unpredictable. However, the fact that the function always
| returns 4 suggests that it is not actually random at all.
| jamesdwilson wrote:
| ChatGPT said:
|
| > The joke in this code is that the function is called
| "getRandomNumber", but it always returns the number 4, which
| is not random at all. The comment "chosen by fair dice roll"
| and "guaranteed to be random" are added for humorous effect,
| because they suggest that the number 4 was chosen through a
| random process, but in reality it is hardcoded into the
| function. The joke is meant to be a play on the idea of
| "randomness", implying that the function is not actually
| generating a random number as it claims to do.
| wzdd wrote:
| It's interesting to contrast this explanation and the
| parent's one with the explanation from
| https://www.explainxkcd.com/wiki/index.php/221:_Random_Numbe...,
| which is essentially that the function may well return a random
| number but, contrary to expectation for functions with names
| like this, won't return different results if called more than
| once.
___________________________________________________________________
(page generated 2023-01-03 23:00 UTC)