[HN Gopher] Is DALL-E 2 'gluing things together' without underst...
___________________________________________________________________
Is DALL-E 2 'gluing things together' without understanding their
relationships?
Author : Hard_Space
Score : 172 points
Date : 2022-08-04 13:24 UTC (9 hours ago)
(HTM) web link (www.unite.ai)
(TXT) w3m dump (www.unite.ai)
| rasz wrote:
| Dalle is fancy content aware fill on steroids (style translation
| etc).
| avalys wrote:
| I think DALL-E is clearly just gluing things together that it
| found in a massive dataset, and doesn't have any understanding of
| the underlying concepts. I thought it was easy to see the signs
| of this in examining its output. Same for GPT-3.
|
| However, what's amazing about DALL-E and these other statistical,
| generative models to me is that it's made me think about how much
| of my daily thought processes are actually just gluing things
| together from some kind of fuzzy statistical model in my head.
|
| When I see an acquaintance on the street, I don't carefully
| consider and "think" about what to say to them. I just blurt
| something out from some database of stock greetings in my head -
| which are probably based on and weighted by how people have
| reacted in the past, which similar greetings my own friends have
| used, and what "cool" people say on TV and in other media in
| similar circumstances. "Hey man how's it going?"
|
| If I was asked to draw an airplane, I don't "think" about what an
| airplane looks like from first principles - I can just synthesize
| one in my head and start drawing. There are tons of daily
| activities like this that we do that don't involve
| anything I'd call "intelligent thought." I have several relatives
| that, in the realm of political thought, don't seem to have
| anything more in their head than a GPT-3 model trained on Fox
| News (that, just like GPT-3, can't detect any logical
| contradictions between sentences).
|
| DALL-E has convinced me that even current deep learning models
| are probably very close to replicating the performance of a
| significant part of my brain. Not the most important part or the
| most "human" part, perhaps. But I don't see any major conceptual
| roadblock between that part and what we call conscious,
| intelligent thought. Just many more layers of connectivity,
| abstraction, and training data.
|
| Before DALL-E I didn't believe that simply throwing more compute
| at the AGI problem would one day solve it. Now I do.
| multibit wrote:
| >I have several relatives that, in the realm of political
| thought, don't seem to have anything more in their head than a
| GPT-3 model trained on Fox News (that, just like GPT-3, can't
| detect any logical contradictions between sentences).
|
| If more people were to realize that we're all probably like
| this, trained on some particular dataset (like mainstream vs
| reactionary news/opinion), I wonder if that would lead to a
| kind of common peace and understanding, perhaps stemming only
| from a deep nihilism.
| dougmwne wrote:
| I think our intelligence has 2 components, our conscious selves
| that are good at formal reasoning and meaning making, and a
| massive neural network that is good at statistical inference. I
| suspect the reason we need less training data for learning than
| dall-e is because our conscious selves can tease out the formal
| relationships and models, rather than relying purely on
| inference which requires enough datapoints.
| AbraKdabra wrote:
| I mean, if it is "gluing things together", it's doing it pretty
| well...
| QuadmasterXLII wrote:
| My intuition is that DALL-E is more a demonstration of how hard
| image synthesis is for humans, than how intelligent the algorithm
| is. The image generation models have orders of magnitude fewer
| parameters than the large language models.
| bawolff wrote:
| AI system is good at what it was trained to do and bad at what it
| was not trained to do! News at 11!
| nodja wrote:
| I've been following the image generation field for a couple
| months now and while the answer to the title is "yes for most
| things" it is easily fixed. Use a better text encoder.
|
| My favorite anecdote for showing how having a text encoder that
| actually understands the world is important to image generation
| is when querying for "barack obama" on a model trained on a
| dataset that has never seen Barack Obama the model somehow
| generates images of random black men wearing suits[1]. This is,
| in my non-expert opinion, a clear indication that the model's
| knowledge of the world is leaking through to the image generator.
| So if my understanding is right, as long as a concept can be
| represented properly in the text embeddings of a model, the image
| generation will be able to use that.
|
| If my anecdote doesn't convince you, consider that one of
| google's findings in the Imagen paper was that increasing the
| size of the text encoder had a much bigger effect not only on the
| quality of the image, but also on how well the image follows the
| prompt, including the image generator being able to
| spell words.
|
| I think the next big step in the text to image generation field,
| aside from the current efforts to optimize the diffusion models,
| will be to train an efficient text encoder that can generate high
| quality embeddings.
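|
| As a concrete (purely illustrative) sketch of that Imagen-style
| setup, assuming PyTorch and Hugging Face transformers, with
| "t5-small" standing in for the much larger frozen T5-XXL encoder:
|
|     import torch
|     from transformers import AutoTokenizer, T5EncoderModel
|
|     # a frozen text encoder maps the prompt to a *sequence* of embeddings
|     tokenizer = AutoTokenizer.from_pretrained("t5-small")
|     encoder = T5EncoderModel.from_pretrained("t5-small").eval()
|
|     tokens = tokenizer("barack obama", return_tensors="pt")
|     with torch.no_grad():
|         text_emb = encoder(**tokens).last_hidden_state  # (1, seq_len, 512)
|
|     # the image diffusion model cross-attends to text_emb at every step;
|     # a bigger / better-trained encoder gives it richer embeddings to
|     # condition on, which is where the prompt-following gains come from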
|
| [1] Results of querying "barack obama" to an early version of
| cene555's imagen reproduction effort.
| https://i.imgur.com/oUo3QdF.png
| snek_case wrote:
| > when querying for "barack obama" on a model trained on a
| dataset that has never seen Barack Obama the model somehow
| generates images of random black men wearing suits[1]. This is,
| in my non-expert opinion, a clear indication that the model's
| knowledge of the world is leaking through to the image
| generator.
|
| That's super interesting. It's not just black men in suits
| either. It's older black men, with the American flag in the
| background who look like they might be speaking. Clearly the
| model has a pretty in-depth knowledge of the context
| surrounding Barack Obama.
|
| I would say the image generation model is also doing a pretty
| great job at stitching those concepts together in a way that's
| coherent. It's not a random jumble. It's kind of what you would
| expect if you asked a human artist to draw a black American
| president.
| LesZedCB wrote:
| it does understand their relationships.
|
| the difference is that the subjects of the domain it learns about
| exist purely as 2-dimensional images.
|
| once these models get larger and include wider ranges of sensory
| data beyond just imagery (as can be seen with models like GATO),
| they are clearly better able to "glue together" concepts across
| multiple domains.
|
| i would argue we absolutely do nothing different with regard to
| 'gluing things together'. we just have a wider range of sensory
| inputs.
| throwaway675309 wrote:
| Yes, SHRDLU it most definitely is _not_.
| upupandup wrote:
| Blake Lemoine claimed that Google's chatbot was sentient, which I
| disagreed with, and it's demonstrated through this article. AI can
| be optimized to respond in a way that can easily fool someone
| into thinking they are talking to a human, but at the end of the
| day sentience requires consciousness and that is not something
| that can be digitally produced.
|
| You can teach a parrot to respond to basic arithmetic but they
| are not aware of the concept of math rather they are acting in
| pathways set to induce the desired response.
|
| A truly conscious entity would simply have a mind of its own and
| will not do our bidding, just like any other human. They would be
| extremely selfish and apathetic; the idea that a bunch of GPUs
| sitting in a datacenter is sentient is sheer lunacy.
|
| This Blake Lemoine character will not be the last; there are always
| those that seek to be in the limelight with zero regard for
| authenticity. Such is sentient behavior.
| boredumb wrote:
| GPT/DALL-E/etc... All of these models are of course gluing things
| together in some manner, but who cares? That's the point right?
| The AI pill i've taken is that you don't need AGI in order to
| make things that are useful for people and business. If you've
| ever run a business and had to dive into creatives for blogs, SEO
| content, social media posts, etc then you spent an inordinate
| amount of time creating it or outsourced it and in both cases the
| final copy is NOT going to get you a literary prize any time soon
| but it is absolutely enough to inform potential customers, start
| ranking on google and start gaining social media clout. GPT will
| also not garner you a literary award but can absolutely get you
| quality copy that users, customers, google, facebook users will
| not be the wiser that you generated it with AI instead of paying a
| third party to hack it together for you.
|
| (https://neuralmates.com/ I recently started putting together a
| web app to MVP this, and I hope to be able to integrate DALLE-2
| soon to be able to start generating images for folks as well.)
| lancesells wrote:
| This is the kind of response that turns me off on all of these
| amazing technologies. Blanketing the world with low-level
| content in the hopes to gain ranking and clout sounds awful.
|
| On the flip side I'm guessing we'll have some gpt-3/dall-e
| blocker extensions that help reduce some of it.
| boredumb wrote:
| Sure, but it's the reality of it.
|
| Also - you already live in this world, but it's fueled by low-
| cost copywriters and ghost accounts on Fiverr. I'd bet you
| are going to see the low water mark for content increase
| quite a bit in quality and volume over the next few years due
| to GPT3 being leagues better than the current state of
| content mills.
| kache_ wrote:
| define understand
| kazinator wrote:
| The authors here underestimate how much semantics is involved in
| gluing things together.
| simion314 wrote:
| There were so many articles about this project, but I can't find
| a demo page to test 3-4 outputs myself (unless it was some page I
| already tried when it was linked and got nothing similar to
| what is shown on the main page).
| ___rubidium___ wrote:
| Fascinating. I've noticed the same problem with Midjourney and
| Nightcafe as well.
| mellowside wrote:
| I've seen different results than the author. Had it since it was
| in beta and learned that it needs better prompts to generate
| better images. As someone on their Discord said "prompt writing
| is a craft you need to learn".
| aaroninsf wrote:
| Strong recommendation for the book _The Alignment Problem_, which,
| while only somewhat touching on systems like these, is an excellent
| lay overview of the state of ML that makes a good effort to explain
| the principles involved, the challenges and blockers, and the
| ways in which progress has been made.
|
| The field is moving so quickly that the book stops short of the
| current state of the art, but it's still remarkable.
|
| I have been myself playing with MidJourney, which like DALL-E 2
| is a text prompt to image generation system; it has different
| goals and favors aestheticized output over photorealism.
|
| The premises of that project and its current execution (as an
| explicit exercise in collective- rather than siloed-and-
| contractual relations) are genuinely remarkable and I believe
| represent a watershed. The rate of evolution of what it is doing
| is something to behold.
|
| I have generated around 7500 images with MidJourney so far and am
| slowly developing what feels like an intuition for the way it
| "sees" images, which as the devs say in regular office hours, is
| unlike how humans see them.
|
| The limitations, and superpowers, of the system as it exists are
| already deeply provocative. When things scale again to the next
| level, the degree of uncanniness and challenge to our
| preconceptions about the nature and locus of intelligence in
| ourselves may be genuinely shaking.
|
| Or so I currently think.
|
| I highly recommend taking time to really feel out these systems,
| because the ways they do and do not succeed and fail serve as a
| very potent first-hand education in the opportunities, and,
| perhaps much more importantly, perils of their application in
| other more quotidian areas.
|
| It's one thing for them to reliably produce "nightmare fuel"
| because of their inability to retain very high level coherence
| down through low level details, when they are drawing limbs,
| hands, faces...
|
| ...it's another thing entirely when analogous failure modes
| quietly permeate their ability to recognize illness, or, approve
| a loan, or recommend an interest rate.
|
| Or--as the example which opens _The Alignment Problem_ spells out
| --recommend whether someone should or should not be offered bail. (A
| real world example with life-changing consequences for people who
| interact with ML in this path, in something over 30 states
| today... at least, as of publication).
| siglesias wrote:
| I discovered something like this recently when I tried the prompt
| "man throwing his smartphone into a river," and for the life of
| me I could not get DALL-E to render the phone separated from the
| hand (I tried "like a boomerang," "tossing," "into an ocean,"
| "like a baseball," etc). And then it occurred to me that by the
| training data, there are virtually no pictures of a person and a
| phone where the phone is separated! So DALL-E might have thought
| that the phone was just an appendage to the body, the way the
| hand is (which, what does this say about society!). I might as
| well have asked DALL-E to render someone throwing their elbow
| into a river.
|
| Another interesting case is animal-on-animal interactions. A
| prompt like, "small french bulldog confronts a deer in the woods"
| often yields weird things like the bulldog donning antlers! As
| far as the algorithm is concerned, it sees a bulldog, ticking the
| box for it, and it sees the antlers, ticking the box for "deer."
| The semantics don't seem to be fully formed.
| gwern wrote:
| I dunno man, I punched that exact prompt ("man throwing his
| smartphone into a river") in DALL-E 2 just now, and in 2/4
| samples, the smartphone is clearly separate from the hand:
| labs.openai.com/s/uIldzs2efWWnm3i9XjsHI7or
| labs.openai.com/s/jSk4qhAxSiL7QJo7zeGp6m9f
|
| > The semantics don't seem to be fully formed.
|
| Yes, not so much 'formed' as 'formed and then scrambled'. This
| is due to unCLIP, as clearly documented in the DALL-E 2 paper,
| and even clearer when you contrast to the GLIDE paper (which
| DALL-E 2 is based on) or Imagen or Parti. Injecting the
| contrastive embedding to override a regular embedding trades off
| visual creativity/diversity for the semantics, so if you insist
| on exact semantics, DALL-E 2 samples are only a lower bound on
| what the model can do. It does a reasonable job, better than
| many systems up until like last year, but not as good as it
| could if you weren't forced to use unCLIP. You're only seeing
| what it can do after being scrambled through unCLIP. (This is
| why Imagen or Parti can accurately pull off what feels like
| absurdly complex descriptions - seriously, look at the examples
| in their papers! - but people also tend to describe them as
| 'bland'.)
| visarga wrote:
| If you want multiple objects, each with individual
| attributes, the unCLIP model still has to make a single
| embedding vector representation. But the single vector is too
| small to contain an ever more detailed scene description.
| That's why it has failure modes like assigning the wrong
| colour to cubes and not being able to spell text.
|
| On the other hand the previous approach - autoregressive
| generation - allows full access through the attention
| mechanism to the prompt.
|
| For example Imagen encodes text to a sequence of embeddings.
|
| > Imagen comprises a frozen T5-XXL [52] encoder to map input
| text into a sequence of embeddings and a 64x64 image
| diffusion model, followed by two super-resolution diffusion
| models
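|
| A minimal sketch of the difference (plain PyTorch; names and
| shapes are made up, and mean-pooling merely stands in for
| collapsing the prompt into CLIP's single embedding vector -
| this is not the actual unCLIP or Imagen code):
|
|     import torch
|     import torch.nn as nn
|
|     text_tokens = torch.randn(1, 77, 512)  # per-token text embeddings
|
|     # unCLIP-style: squeeze the prompt into ONE vector before
|     # conditioning; fine-grained details (which cube gets which
|     # colour, spelling) have nowhere to live
|     pooled = text_tokens.mean(dim=1)  # (1, 512)
|
|     # Imagen / autoregressive-style: the image model keeps full
|     # access to the prompt by cross-attending over the sequence
|     latents = torch.randn(1, 64, 512)  # stand-in for image latents
|     attn = nn.MultiheadAttention(embed_dim=512, num_heads=8,
|                                  batch_first=True)
|     conditioned, _ = attn(latents, text_tokens, text_tokens)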
| anothernewdude wrote:
| Certainly seems that way, but I didn't get to use DALL-E 2
| because it's a Closed-AI.
| 2OEH8eoCRo0 wrote:
| What are we trying to accomplish with AI? What is "real"
| intelligence? What are we hoping that an AI does? What test does
| it need to pass?
| roland00 wrote:
| Isn't this just Chaos Theory, Calculus, and the problem of Jerks
| (third derivative of position in regards to time)?
|
| When you slice complex reality into smaller pieces, within the
| smaller piece you have a rough idea of velocity, and changes in
| velocity (aka acceleration), but you have no idea of future
| speed-bumps, aka the jerks (third derivative of position in
| regards to time) for that information is outside the frame of
| reference when you divided reality into smaller pieces.
|
| Thus you have pictures of people / objects in systems, but you are
| not truly understanding relationships, so you miss things even
| though you feel like you see things. It is all a theme park for
| our own amusement; it is not real, only hyper-real, which becomes
| uncanny when we start noticing how the images are off.
| BiteCode_dev wrote:
| Amusingly, I get creative new ideas when I glue things together
| and ignore what I thought I understood about their relationships.
| TremendousJudge wrote:
| Happens to me too -- it's a great way to make new things.
| However, the "creation" I'd argue happens when you look at the
| pile of random stuff and generate a new understanding, and
| decide that it is valuable. The difference between trash and
| art only exists in the head of the artist. Same thing happens
| with DALL-E output, really.
| gizajob wrote:
| I recently used it to create a picture of "an analogue
| synthesizer made out of meat and gristle". I don't really know
| how you could glue these concepts together, because nothing in
| reality causes them to ever go together, but it did a great job
| anyway.
| JoshTko wrote:
| Do humans assume things without understanding their
| relationships?
| can16358p wrote:
| Regardless of the content, the article has been "published 10
| seconds ago" for the last few minutes, and has been submitted to
| HN 1 hour ago.
| otikik wrote:
| How does one define "understand their relationships"?
|
| To me it is a matter of degrees and has multiple axes.
|
| When my 6yo son draws a chair, it's not the same as when Van Gogh
| draws one, which is different to when an expert furniture
| designer draws one. They all "understand" in different ways. A
| machine can also "understand". It might do it in different
| degrees and across different axes than the ones humans usually
| have, that's all. How we transform that understanding into action
| is what is important I think.
| dezmou wrote:
| Another cool example: when you ask DALL-E to make a "Jaguar
| XF", it does not really know if you are talking about the car or the
| animal, yet it manages to blend the two together https://img-
| comment-fun.9cache.com/media/a6wG6Kq/aq04bGmR_70...
| dukeofdoom wrote:
| Using Midjourney, I'm fascinated with the geometric shapes it
| comes up with. Just yesterday I was looking at a sort of hexagon
| made with intersecting hearts. Not only was it aesthetically
| pleasing, it was such an unusual composition, that I think it
| would be extremely hard for an artist to come up with, maybe even
| a mathematician. So I think it's already capable of doing more
| sophisticated things than human artists.
| Cognitron wrote:
| MidJourney is really good at creating abstract images. I've
| been playing with the multi-prompt syntax (::) a lot lately,
| just combining different words that evoke form or texture to
| see what it will come up with. I start by testing things one at
| a time and just take a guess at what might look cool together.
| I end up making strange images with prompts like "fractal ::
| stained glass :: feathers", "antelope canyon :: tattered fabric
| :: cave", "rubik's cube :: Japanese woodblock", you get the
| idea. I rarely try to make pictures of real things anymore.
| Definitely fun to play around with.
| dukeofdoom wrote:
| I'm pretty new to it. Any tips on making retro pixel art
| assets? I want a stone or brick wall I could tile into a game
| I'm making.
| Cognitron wrote:
| I think most of us are pretty new to it since they just
| opened the beta to a ton of people. I haven't tried to do
| anything like that with it though. All I can say is it
| tends to take a lot of experimentation to get what you
| want. And you'll probably have to do some work to the
| images to get them to tile seamlessly. If you can't get the
| pixelated look quite how you want it, you might try using
| it to generate a base image and then use a separate tool to
| pixelate it. Some people are using it as part of a larger
| pipeline.
|
| The discord is full of people sharing their experiments and
| approaches though. Maybe try asking in the prompt-craft
| channel to see if someone else has attempted something
| similar.
|
| Good luck!
| thriftwy wrote:
| Maybe you could give DALL-E creativity lessons by finding such
| nonsense queries and adding creative interpretations of those
| into the training set.
| Marazan wrote:
| https://imgur.com/ggnm920
|
| "a photo of 6 kittens sitting on a wooden floor. Each kitten is
| clearly visible. No weird stuff."
|
| Like, let's start with the fact that there are 7 of them (2 of the
| 4 images from the prompt had 7 kittens). Now let's continue on
| with how awful they look.
|
| The difference in image quality between DALL-E 2 asked for a single
| subject vs DALL-E 2 being asked for a group of stuff is startling.
|
| And it's obvious, if you know how the tech works, why this is the
| case.
| xwdv wrote:
| For art generation, gluing things together without understanding
| relationships is basically creativity. Humans can assign meaning
| to the final output.
| visarga wrote:
| There is some degree of understanding how to combine concepts,
| otherwise the avocado chair would not be possible.
| knighthack wrote:
| I don't see how the 'understanding of relationships' should be
| taken as the key intent of DALL-E 2.
|
| Consider procedural generation: it can create abstractions of
| either utter beauty or garbage without understanding context. You
| need to guide it towards something meaningful.
|
| Just the fact that DALL-E can 'glue things together' without need
| for human inspiration - yet where its output and intent can be
| understood by a human appraising it - that is not only a feat in
| itself, but I would say its key _feature_.
| adamsmith143 wrote:
| Why is there this obsession with systems or algorithms having
| "understanding"? No one thinks these things have internal states
| equivalent to "understanding". "Understanding" or not you can't
| deny the capability of these systems.
| arey_abhishek wrote:
| AI will never understand the actual context because not
| everything we feel/experience can be captured and communicated to
| a machine. For example, human language is incomplete and doesn't
| encode every information because it doesn't need to when used
| with other humans.
|
| I think it's a romantic notion to imagine that AI will not be a
| Chinese room.
|
| Even human intelligence feels like a Chinese room. Especially
| noticeable when using complicated devices like flight controls.
| I've been playing the MSFT Flight simulator, and I don't fully
| understand the relationship between the different instruments.
| But I can still fly planes(virtually).
|
| We'd be better off if we considered AI similar to an appliance
| like a microwave or a refrigerator. Does a fridge need to
| understand or taste what's inside it to be helpful?
| RootKitBeerCat wrote:
| Having spent about $200 on credits... yes, it glues together your
| words through transformers into images... of snippets of the
| identified corpus of images it has... with some specific prompt
| phrasings having cool weighting for making really nicely looking
| glued together results... it's a model: classical computing -
| everything is turtles and duct tape and rivets and glue, all the
| way down...
| pera wrote:
| Interesting, DALL-E Mini / Craiyon actually generates a pretty
| accurate result for "a monkey touching an iguana" (at least
| semantically):
|
| https://i.imgur.com/Oq62gQI.png
| affgrff2 wrote:
| Maybe it's in the training data? At least some special kind of
| monkey (human) touching some lizard is.
| johndough wrote:
| https://rom1504.github.io/clip-
| retrieval/?index=laion5B&useM...
|
| There seem to exist several pictures of marmoset monkeys
| touching iguanas, but DALL-E mini shows macaque monkeys. This
| makes me believe that DALL-E mini has at least some
| generalization capabilities.
| walrus01 wrote:
| imgur is providing a prompt before viewing this image "this
| image may contain erotic imagery, are you over 18, etc".
|
| i certainly hope not...
| aeturnum wrote:
| The Fair Witness was a job that Heinlein made up for Stranger in
| a Strange Land. Fair Witnesses were supposed to reliably report
| what they saw without judgement - without including their subjective
| judgement in their report. The example exchange is: "Is that
| house over there brown?" "It is brown on this side."
|
| Dall-E (and other ML systems) feel like fair witnesses for our
| cultural milieu. They basically find a series of weighted
| connections between every phrase we've thought to write down or
| say about all images and can blend between those weights on the
| fly. By any assessment it's an amazing feat - as is the feat to
| view their own work and modify it (though ofc it's from their
| coordinate system so one does expect it would work).
|
| In one sense - asking if the machine "understands" is beside the
| point. It does not need to 'understand' to be impressive (or even
| what people claim when they're not talking to Vice media or
| something).
|
| In another sense, even among humans, "understanding" is both a
| contested term _and_ a height that we all agree we don't all
| reach all of the time. One can use ideas very successfully for
| many things without "understanding" them.
|
| Sometimes people will, like, turn this around and claim that:
| because humans don't always understand ideas when they use them,
| we should say that ML algorithms are doing a kind of
| understanding. I don't buy it - the map is not the territory. How
| ML algorithms interact with semantics is wholly unlike how humans
| interact with them (even though the learning patterns show a lot
| of structural similarity). Maybe we are glimpsing a whole new
| kind of intelligence that humans cannot approach - an element of
| Turing Machine Sentience - but it seems clear to me that
| "understanding" in the Human Sentience way (whatever that means)
| is not part of it.
| blocked_again wrote:
| This can be easily answered by giving the following input to
| dalle.
|
| """
|
| A map highlighting the countries the ancient Romans invaded since
| Pepsi was introduced.
|
| """
| Rackedup wrote:
| Havoc wrote:
| Isn't that true for all the models?
| [deleted]
| rkwasny wrote:
| People are "gluing things together" for a very long time, that's
| what everyone does!
|
| All ideas as remixes of previous ideas
| andybak wrote:
| I'm a big Dall-E fan but this is no surprise to anyone who's used
| it for more than 5 minutes.
|
| It was one of the things highlighted by Google when they
| announced Imagen as a differentiator:
| https://imagen.research.google
|
| The article touches on this but the headline is slightly
| deceptive.
| causi wrote:
| Imagen would be amazing for an "Images Against Humanity" game
| where you score points by generating the funniest picture.
| Zircom wrote:
| Absolutely stealing this idea, look forward to being credited
| for the idea in the future show hn post :P
| andybak wrote:
| (Incidentally - I want to make it clear I don't mean "Dall-E
| has no understanding of the elements and it's just a fancy
| image search/compositor" - there are results that absolutely
| require some semantic "understanding". It's just there are
| limits and spatial understanding is one of the weak spots)
| kemiller wrote:
| Anyone who has ever raised a small child will realize this is how
| kids learn too.
| rocgf wrote:
| Does anyone actually believe DALL-E "understands" what it's
| doing? For any reasonable definition of "understands", I assume
| most people would be skeptical.
|
| So if we go with that, then yes, it just glues things together
| without understanding their relationship. I'd just be tempted to
| say it doesn't really matter that it doesn't understand, except
| maybe for some philosophical questions. It's still incredible
| based on its output.
| visarga wrote:
| > Does anyone actually believe DALL E "understands" what it's
| doing?
|
| I believe it understands enough to make tens of thousands of
| people interested and debating its merits. The GANs of 5 years
| ago were baby toys compared to DALL-E. They were drawing 5 legs
| to a dog and limited to a few object classes. Now people debate
| if it "really understands" and if it is "(slightly) conscious".
| LesZedCB wrote:
| can we come up with a 'reasonable' definition of understands
| that isn't couched in human exceptionalism and could still
| apply to any computer model?
|
| i think dall-e understands, within the sensory domain it's
| trained from.
| rocgf wrote:
| I think DALL-E understands the same way an Excel sheet would
| understand SUM(B3:B6), obviously at a way larger scale and
| with a more complicated function. It's all just rather simple
| maths all the way down.
| visarga wrote:
| Not similar. The SUM() function was written by humans in a
| top-down fashion, DALL-E learned from the visual and
| textual data by a completely different process. DALL-E
| distilled something about the world and is using that to
| generate.
| [deleted]
| chpatrick wrote:
| I think "understanding" is a super vague word.
| i_am_toaster wrote:
| Bingo. This is also a fair comparison to how today's politics
| works in a nutshell. Overload a word with a vague context and
| people immediately attribute their own perspective to come to a
| conclusion that conflicts with an opposing argument.
|
| Relating back to this headline, using "understanding" creates
| lots of messages with differing views because everyone has
| their own take on the word. If instead you said something like,
| "DALLE fakes understanding of concepts to create new images," I
| bet you'd get even closer to the "political message board"
| style of comments because you are now taking an objective
| position (yes/no,true/false,good/bad) on a subjective word
| (understanding).
| IceMetalPunk wrote:
| Now, let's be critical of possible _reasons_ for this. It's
| important to remember two things: 1) Any NN has zero experience
| with the world beyond its training data. Things that seem obvious
| to us from our experience are not obvious to a system that has
| never experienced those things. And 2) DALL-E 2 was trained on
| image-caption pairs scraped from the internet, basically.
|
| So it's quite possible the reason it doesn't understand things
| like "X under Y" very well is that its training set doesn't have
| a lot of captions describing positional information like that, as
| opposed to any failure in the architecture to even _potentially_
| understand these things.
| vannevar wrote:
| I'm sure it is, but "gluing things together" coherently in
| response to a text prompt is a stupendous achievement. It's not
| AGI, but it's miles ahead of where we were even a few years ago
| and opens the door to automating a class of jobs I don't think
| anyone back then believed could be automated, short of AGI.
| seanmcdirmid wrote:
| Just think what this could do for a game experience like
| Scribblenauts. Just being able to glue a fixed number of
| concepts in a huge number of ways...game designers are going to
| have to learn how to leverage ML.
| monkeynotes wrote:
| Their research showed that Dall-E had most success with real
| world stuff it had been trained on. Is this surprising? I mean,
| if I didn't know much about iguanas I'd also have a hard time
| representing them.
| _nhynes wrote:
| I ended up reading the book _Blindsight_ (Peter Watts) that's
| been floating around in comments recently. A major theme in the
| book is intelligence and its relation to consciousness
| (including whether consciousness is even beneficial). If you
| agree with the idea, you'd consider that DALL-E is indeed
| intelligent even though it appears to be a "Chinese Room".
| Humans would be "gluing things together" in just the same way,
| but with this odd introspective ability that makes it seem
| different.
| Pulcinella wrote:
| I think the book does make a point that maybe one benefit of
| consciousness is the ability to filter through all the
| information and spam that conscious beings produce. E.g. The
| scramblers may view all the radio waves we blast everywhere
| as attempts at warfare and reducing the fitness of other
| species. Why else would a species emit so much information if
| not to DDOS their enemies?! tl;dr consciousness is a defense
| against ****posting and trolling caused by conscious beings.
| the8472 wrote:
| re the chinese room, you might want to consider the
| computational and memory complexity of a lookup table.
| https://www.scottaaronson.com/papers/philos.pdf page 14
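|
| A back-of-the-envelope sketch of that point in Python (the
| alphabet size and input length are made-up parameters; the
| point is just the exponential blow-up):
|
|     # how many entries would a Chinese Room lookup table need if
|     # replies were keyed on inputs of 100 characters drawn from a
|     # 3,000-character alphabet?
|     alphabet, length = 3000, 100
|     entries = alphabet ** length
|     # ~10^347 entries, vs roughly 10^80 atoms in the observable universe
|     print(f"~10^{len(str(entries)) - 1} entries")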
| twak wrote:
| fantastic book. Made me consider the question of whether
| consciousness exists at all or if it is just some hack by
| evolution to allow introspection.
|
| I haven't found a definition of consciousness which is
| quantifiable or stands up to serious rigour. If it can't be
| measured and isn't necessary for intelligence, perhaps there
| is no magic cut-off between the likes of Dall-E and human
| intelligence. Perhaps the Chinese-room is as conscious as a
| human (and a brick)?
| otikik wrote:
| It is a hack alright. A very beautiful hack, refined in
| hundreds of thousands of iterations, with a very high
| price.
|
| > to allow introspection
|
| Evolution doesn't do things "to anything". It repeats what
| works, and kills the rest. Our brains have allowed us to
| adapt to the changes in the environment better than the
| rest. Consciousness came with the package. It might not have an
| actual "purpose" - it could be an "appendix".
|
| My personal belief is that consciousness started as the
| self-preservation instinct that most animals have, and we
| developed introspection as a way to strengthen our ties to
| other members of our family or tribe. And then we "won"
| (for now)
| somenameforme wrote:
| I think in these sort of topics we are often discussing two
| different meanings of consciousness: physical vs
| philosophical. The arbitrary mixing and matching of them is
| likely one of the reasons for the lack of clarity. Physical
| consciousness is most easily identified by the notion that
| you're unconscious when sleeping. The other is the far more
| interesting philosophical consciousness.
|
| Philosophical consciousness is what the oft misunderstood
| quote _cogito ergo sum_, I think therefore I am, was
| hitting on. Descartes was not saying that consciousness is
| defined by thinking. He was trying to identify what he
| could know was really real in this world. When one goes to
| sleep, the dreams we have can often be indistinguishable
| from a reality in themselves, until we awake and find it
| was all just a dream. So what makes one think this reality
| isn't simply one quite long and vivid dream from which we
| may one day awake?
|
| But this wasn't an appeal to nihilism, the exact opposite.
| The one thing he could be certain of is that he, or some
| entity within him, was observing everything. And so, at the
| minimum, this entity must exist. And the presence of this
| entity is what I think many of us are discussing when we
| speak of consciousness. In contrast to physical
| consciousness, you are philosophically conscious even when
| sleeping.
|
| Of course like you said philosophical consciousness cannot
| be proven or measured and likely never will be able to be,
| which makes it an entirely philosophical topic. It is
| impossible for me to prove I am conscious to you, or vice
| versa, no matter what either of us does. Quite the private
| affair, though infinitely interesting to ponder.
| akomtu wrote:
| There is a lot of literature on this topic, but nearly
| all of it is in the "restricted" section of the library:
| it's the section with book shelves for alchemy,
| occultism, buddhism and so on. Western literature is very
| shallow in this regard.
|
| There a few interesting thoughts about consciousness that
| I've found in those books. One is that the boundary
| between consciousness and "real matter" is imaginary:
| consciousness exists only because of change in that
| matter, when the change stops - so does consciousness,
| consciousness creates reality for itself, and the two are
| in fact just two sides of the coin. In other words,
| static consciousness isn't a thing, and hence the need for
| "reality".
|
| Human consciousness is a sum of many consciousnesses that
| exist at wildly different levels of reality. There are
| primitive cellular consciousnesses, and those sometimes
| influence our mental consciousness. Our neural
| cerebrospinal system has an advanced consciousness
| capable of independent existence: it manages all the
| activity of internal organs, and only loosely interacts
| with our higher mental consciousness. That cerebrospinal
| system is even self-conscious in a primitive way: it can
| observe its own internal changes and distinguish them
| from impulses from the outside. There's emotional and
| mental consciousness that mainly lives in the brain and
| is somewhat aware of the dark sea of lower consciousness
| below it.
|
| Most people are conscious in dreams, as they can perceive
| in that state. However they can't (yet) make a distinction
| between inner processes (self) and external effects
| (others), so to them it appears as if everything is
| happening inside their mind, i.e. they are not self-
| conscious. That's consciousness of a toddler. Some are
| more advanced, they start seeing the me-others difference
| and can form memories from dreams.
| patcon wrote:
| Also read the book due to comments here. Was very aligned
| with some of my interests in relationships between
| language, information, life/intelligence and metaphor <3
|
| You might get a kick out of this paper (though some may
| find its proposal a bit bleak, I think there's a way to
| integrate it without losing any of the sense of wonder of
| the experience of being alive :) )
|
| It analogizes conscious experience to a rainbow "which
| accompanies physical processes in the atmosphere but exerts
| no influence over them".
|
| Chasing the Rainbow: The Non-conscious Nature of Being
| (2017) https://www.frontiersin.org/articles/10.3389/fpsyg.2
| 017.0192...
|
| > Though it is an end-product created by non-conscious
| executive systems, the personal narrative serves the
| powerful evolutionary function of enabling individuals to
| communicate (externally broadcast) the contents of internal
| broadcasting. This in turn allows recipients to generate
| potentially adaptive strategies, such as predicting the
| behavior of others and underlies the development of social
| and cultural structures, that promote species survival.
| Consequently, it is the capacity to communicate to others
| the contents of the personal narrative that confers an
| evolutionary advantage--not the experience of consciousness
| (personal awareness) itself.
|
| So consciousness is more about what it subjectively feels
| like to be under pressure/influence to broadcast valuable
| internal signals to other (external) agents in our
| processes of life; aka other humans in the super-organism
| of humanity. I analogize it to what a cell "experiences"
| that compels it to release hormonal signals in a
| multicellular organism.
| f38zf5vdt wrote:
| > Made me consider the question of whether consciousness
| exists at all or if it is just some hack by evolution to
| allow introspection.
|
| Can't it be both? What's the difference? Evolution just
| responds to the environment, so a method of complex
| interaction with the environment like "consciousness" or
| "ever-polling situational awareness" seems like par for the
| course.
| croes wrote:
| Evolution is the result of the environment not a
| response.
|
| Giraffes didn't get a long neck because the food was out
| of reach; giraffes have a long neck because the ones
| without just died.
| f38zf5vdt wrote:
| It's a response caused by survival. The environment is
| the stimuli and the genetics that are not eliminated are
| cultivated.
|
| If I introduce an antibiotic into a culture of bacteria
| and they evolve resistance, then they appear to be
| responding to it on a collective level.
| croes wrote:
| "Since bacteria are extremely numerous, random mutation
| of bacterial DNA generates a wide variety of genetic
| changes. Through mutation and selection, bacteria can
| develop defense mechanisms against antibiotics."
|
| https://health.mo.gov/safety/antibioticresistance/general
| inf...
|
| Mutation happens all the time because cell replication
| isn't perfect: some mutations are irrelevant, some deadly,
| some bring a better chance of survival.
|
| It's not a response, just the result. Or how does the
| bacteria know it's an antibiotic and not just water? It
| doesn't; water just isn't an evolutionary filter,
| antibiotics are.
| doph wrote:
| Exactly (and cannot recommend Blindsight highly enough). Of
| course DALL-E is a Chinese Room. The most exciting and
| subversive idea in Blindsight is that consciousness is
| maladaptive, and the scramblers are _more_ fit for the lack
| of it. Long after neural nets have surpassed our functional
| intelligence, we'll still be criticizing their ability to
| navel-gaze as well as humans.
| FeepingCreature wrote:
| Of course, DALL-E can only get away with not being
| introspective because it has reinforcement learning holding
| its hand. Bit harder to curate a clean training set without
| consciousness.
|
| Personally, I suspect consciousness is intrinsically
| connected with whatever lets humans learn things by looking
| at and thinking about a single example.
| soulofmischief wrote:
| I get what you're saying, but it's important to also
| consider how evolution provides for us as well by giving
| us neural structures already tuned for things like
| recognizing faces and recognizing danger. I do wonder how
| much realtime danger analysis has to do with
| consciousness.
| FeepingCreature wrote:
| I'd doubt it has much of a connection. Consciousness can
| do many things, but generally not quickly.
| TrevorJ wrote:
| Reinforcement learning _is_ the thing that integrates the
| network across time though. In one sense, I think it's
| doing at least part of the job that consciousness does:
| mediating between memory(past) and the present.
| [deleted]
| fatherzine wrote:
| Consciousness is very much adaptive, though it requires a
| sense of the metaphysical / transcendent to behave in that
| way. Theologically speaking, God brings Life and the Devil
| brings Death. Sadly, our modern brave new world has taken
| "Gott ist Tot" as a triumphalist proclamation, and not the
| lamentation it was meant to be.
| wpietri wrote:
| It's far too early to tell if consciousness is adaptive.
| Has it enabled humans to be wildly successful? In the
| short term, sure! But the long term is unclear.
|
| E.g., look at how the technological progress it makes is
| at risk of destroying human civilization. A series of
| ecological boom/bust cycles could lead toward negative
| selection for consciousness. Or we might manage to wipe
| out ourselves or, say, most mammals, leading life on
| earth to start over as it did 65m years ago.
|
| But even without that, it's not clear to me that
| consciousness will really win out. Look at the number of
| successful people who are not only painfully
| unreflective, but need to be to keep doing what they're
| doing. I could name a lot of people, but today's good
| example is Alex Jones, whose whole (very profitable)
| schtick is based on refusing to be fully conscious of
| what he's saying: https://popehat.substack.com/p/alex-
| jones-at-the-tower-of-ba...
|
| And this is hardly a new idea. Vonnegut wrote a novel
| where humans end up evolving into something like a sea
| lion. The point being "all the sorrows of humankind were
| caused by 'the only true villain in my story: the
| oversized human brain'", an error evolution ends up
| remedying.
| fatherzine wrote:
| Hardly a new idea indeed: https://en.wikipedia.org/wiki/
| Tree_of_the_knowledge_of_good_...
|
| Edit: To be clear, I posit that consciousness is the
| organ that enables us to distinguish between Good and
| Evil.
| wpietri wrote:
| I doubt that consciousness is an organ, but if it were,
| it's distinct from a moral sense, because we share that
| with animals. Here's a good book on the topic: https://ww
| w.hup.harvard.edu/catalog.php?isbn=9780674356610
|
| That same professor has done a bunch more work on the
| topic, as have many others.
| fatherzine wrote:
| Frans de Waal has done great work.
|
| Evolution is a strange phenomenon. I invite us to marvel
| at the transformation of terrestrial quadrupeds into
| majestic aquatic creatures, over eons:
| https://en.wikipedia.org/wiki/Evolution_of_cetaceans
|
| Evolutionarily speaking, cetaceans "share" the front limbs
| with quadrupeds. And yet there is a qualitatively
| distinct functional difference. Consider that moral
| consciousness, as present in humans, is functionally not
| quite the same as its biological precursor, the moral
| sense present in dogs or gorillas. And, of course, there
| are gradual changes along the evolutionary way.
|
| Edit: "Organ", more precise "sensory organ", as in "the
| visual organ". Perhaps there is a better word here than
| "organ" here, before we get lost in the medical
| distinction between eye / retina / optic nerve / cortex /
| etc.
| kebman wrote:
| But Sontag suggests the use of Batailleist `powerful
| communication' to modify and read consciousness. The
| subject is contextualised into a predialectic nationalism
| that includes language as a totality.
| fatherzine wrote:
| Hi there GPT-3 bot.
|
| Edit. Don't really want to be unnecessarily flippant
| here. Perhaps this is a real person attempting to
| communicate. Sadly, I was unable to follow through, the
| text jumps wildly between seemingly unrelated subjects.
| kebman wrote:
| Nope, that was actually the Postmodern Generator.^[1] But
| nice try! ;) [1]: https://www.elsewhere.org/pomo/
| fatherzine wrote:
| Haha, <fist bump>.
| benlivengood wrote:
| It's not clear what generates consciousness. Until we know
| for sure (e.g. A/B testing with humans who can report when
| they do and do not experience consciousness in different
| neural configurations) then I think it's impossible to know
| what level of conscious experience large ML models have.
|
| Blindsight is an excellent book in its exploration of
| consciousness, but the speculative part is that a working
| sense of self isn't necessary for embodied intelligence (like
| the scramblers), which I tend to doubt. An agent without a
| model of itself will have difficulty planning actions;
| knowing how its outputs/manipulators are integrated into the
| rest of reality will be a minimum requirement to control them
| _effectively_. It is certainly possible that "self" or "I"
| will be absent; humans can already turn the ego off with
| drugs and still (mostly) function but they remain conscious.
| mrcartmeneses wrote:
| People say this a lot online in this kind of debate but you
| can see which areas of the brain light up in people who are
| conscious and how that's different from people who are
| asleep or in a persistent vegetative state. We have a very
| good idea of what consciousness is and it's definitely not
| a linear neural net with a discrete input and output. Having
| said that, the nerve that connects the eye to the brain
| could be considered similar to a modern neural net.
|
| It looks like we're beginning to get the building blocks of
| consciousness together. But we don't yet know how to
| combine the wave functions into a chorus necessary to
| achieve GI
| stormbrew wrote:
| I think you're using a different meaning of the word
| conscious than the GP? You seem to be talking about
| awake, they're talking about a much more abstract idea.
|
| A person who is sleeping or in a vegetative state is not
| currently getting new inputs fed into some parts of their
| brain, so it's not surprising that their brain "lights up
| differently," nor does it imply anything about a piece of
| software that _is_ getting new inputs that might be being
| integrated into its model (of course, a model that is
| trained and then repeatedly used without further
| integration is not in any way comparable to a brain).
|
| This more abstract idea of consciousness is definitely
| not a solved problem - people can't even manage to agree
| on whether non-human animals have it. And a lot of
| internet arguments for why this or that neural network
| can't be conscious probably also rule 5 year olds out of
| it too.
| [deleted]
| benlivengood wrote:
| We have fairly good ways of detecting human consciousness
| now, but not a lot of understanding of _why physically_
| or _precisely which_ neural circuits produce conscious
| experience, which is the understanding we'd need to
| infer what sort of conscious experience is possible for
| ML models. We also can't reliably detect or predict
| whether a given animal is conscious or not (related to
| the problem of minimal viable neurons required for
| consciousness), and this is the biggest evidence I have
| that our study of consciousness is far from complete.
|
| The best we can do with humans (and maybe animals) is
| behaviorism and inference on our own personal
| consciousness at this point, with brain imaging to
| demonstrate at least gross prediction of consciousness in
| humans.
| visarga wrote:
| > consciousness is definitely not linear neural net with
| a discrete input and output
|
| Neural circuits alone are not going to be conscious by
| themselves; for one, they need to learn concepts from the
| environment and those concepts shape the neural circuits.
| Thus the way they act shape how they develop. You can't
| separate consciousness from the environment where it
| develops.
|
| In other words it was not the neural net that was
| lacking, but the environment.
| canjobear wrote:
| > humans who can report when they do and do not experience
| consciousness
|
| How can someone possibly report when they are not
| experiencing consciousness?
| benlivengood wrote:
| > How can someone possibly report when they are not
| experiencing consciousness?
|
| By an absence of reporting it. If I sit at a desk getting
| my neurons moderated by testing equipment and say "I am
| conscious" every subjective second that I am experiencing
| consciousness then I could at least help narrow down when
| consciousness is lost. If I am simply unable to speak or
| respond at all, but still conscious, I would report that
| fact later. Only in the case of locked-in conscious
| awareness without later memory of the experience would
| this kind of experimental setup fail, and this is where
| brain imaging could probably help determine that
| everything except motor or memory neurons were active.
| toomanydoubts wrote:
| >(e.g. A/B testing with humans who can report when they do
| and do not experience consciousness in different neural
| configurations)
|
| Doesn't that fall back to the old consciousness trap that
| nobody knows how to resolve? How do you know if the human
| reporting that he's conscious is actually conscious and not a
| philosophical zombie?
|
| We don't know what generates consciousness because we don't
| know how to measure it, and if we can't measure it, we will
| always have to take the word of a seemingly conscious
| entity for it.
| benlivengood wrote:
| Self-reporting is probably ideal, probably with a camera
| recording everything so I can judge my answers and
| behavior and compare it with my memory. But we don't
| really have a great handle on how much of our memory of
| conscious experience matches our real-time experience or
| even if the memory-formation process might "be"
| consciousness, so I admit there are blind spots in
| studying consciousness rigorously.
|
| I don't buy the philosophical zombie argument simply
| because consciousness does alter behavior. I wouldn't
| participate in this conversation the same way if I didn't
| experience consciousness. It would be more like vivid
| imagination (as opposed to moderate aphantasia) where I
| find it curious but don't have it. As in the novel,
| unconscious beings probably behave noticeably different.
|
| There are, apparently, some people who have a very
| reduced sense of consciousness. I know I have done and
| said things when I'm not (memorably) conscious, for
| example when half asleep or coming out of anesthesia, and
| my behavior has been altered according to witnesses. I
| wasn't quite "myself". I can also hyper-focus and reduce
| conscious awareness of my surroundings and of my own body
| and mind, but that still feels like I have an internal
| awareness and memory of the experience. I am fairly
| certain I would be able to tell if that is switched off
| for a time.
| Scene_Cast2 wrote:
| There was this Google AI that made the news recently, it
| was trained to say that it's a conscious AI. However,
| most agree that it isn't conscious.
| laserlight wrote:
| > I wouldn't participate in this conversation the same
| way if I didn't experience consciousness.
|
| I don't think you're conscious. Prove me wrong.
| [deleted]
| joe_the_user wrote:
| The "Chinese room" argument, so far as I understand it,
| applies to any discrete computational process simulating
| consciousness.
|
| The argument of the article is that DALL-E doesn't respond
| appropriately to a particular kind of input - two entities in
| some kind of spatial relationship (that it hasn't often
| seen). Dall-E's not extrapolating the three-D world but
| stretching a bunch of 2-D images together with some heuristics.
| That works to create a lot of plausible images, sure, but it
| implies this ability might not, say, be useful
| for the manipulation of 3-D space.
|
| So, given a "Chinese room" is just a computation, it's
| plausible that some Chinese room could handle 3-d image
| manipulation more effectively than this particular program.
|
| Which is to say, "no, the criticism isn't this is a Chinese
| room, that is irrelevant".
| TrevorJ wrote:
| I think we need to start thinking of these networks not as
| something we hope will someday simulate a brain, but as
| small building blocks that can be networked together in a
| variety of ways to tackle tasks. DALL-E may not understand
| 3D image manipulation, but if you chained the output from
| DALL-E into something like NeRF you could start to
| transform the image in 3D space. One can picture a
| collection of a few hundred of these networks, all with
| different things that they are good at working in concert
| to approach something that is very good at doing things
| generally. We're missing the layer on top that decides
| which algorithms to run and in what order.
| jrlocke wrote:
| The Chinese room argument itself isn't very compelling.
| Surely the constituent parts of the brain are fundamentally
| governed solely by physics, surely thought arises solely
| from the physical brain, and surely the constituent parts
| (and thus thought) could be described by a sufficiently
| complex discrete computation.
|
| Are we not conscious?
| indigo945 wrote:
| I think a considerable subset of the people who do make
| use of the Chinese room argument also subscribe to some
| form of mind-body dualism, where consciousness does not
| or does not completely arise from physical processes.
| jrlocke wrote:
| To be more explicit, I'm saying I find it weird to hear
| so much about the Chinese room argument from a crowd of
| (presumably) materialists.
| notahacker wrote:
| I see it the other way round.
|
| The Chinese Room and the brain of a Chinese-speaking
| person are completely different physical processes.
| Looked at on an atomic level, they have almost nothing in
| common. Mind-body dualists may or may not agree that the
| room is not "conscious" in the way a human is, but if
| consciousness is purely a material process, I can't see
| how the materialist can possibly conclude all the
| relevant properties of the completely dissimilar room and
| person are the same.
|
| Those that would argue the Chinese Room is "conscious" in
| the same way as the Chinese person are essentially
| arguing that the dissimilarity of the physical processes
| is irrelevant: the "consciousness" of the Chinese person
| doesn't arise from molecules bouncing around their brain
| in very specific ways, but exists at some higher level of
| abstraction shared with the constituent molecules of
| pieces of paper with instructions written in English and
| outputs written in Chinese.
|
| The idea our consciousness exists in some abstract sense
| which transcends the physics of the brain is not a new
| one of course. Historically we called such abstractions
| souls...
| mannykannot wrote:
| The argument you make here is a reasonable one (IMHO) for
| the plausibility in principle of what Searle calls
| "strong AI", but he claims that his "Chinese Room"
| argument proves that it must be mistaken. One can simply
| ignore him, but to refute him takes a little more effort.
|
| It turns out that when one looks at the argument in
| detail, and in particular at Searle's responses to
| various objections (such as the Systems and Virtual Mind
| replies), it is clear that he is essentially begging the
| question, and his ultimate argument, "a model is not the
| thing modeled", is a non-sequitur.
| TheOtherHobbes wrote:
| The argument is essentially that there are no qualia of
| Chinese comprehension in an automaton or in any system
| that uses an equivalent algorithm, whether or not run by
| a human.
|
| It's a sound argument to the extent that qualia clearly
| exist, but no one has any idea what they are, and even
| less of an idea how to (dis)prove that they exist in
| external entities.
|
| It's the materialists who are begging the question,
| because their approach to qualia is "Well obviously
| qualia are something that just happens and so what?"
|
| Unfortunately arguments based on "Well obviously..." have
| a habit of being embarrassingly unscientific.
|
| And besides - _written language skills are a poor
| indicator of human sentience._ Human sentience relies at
| least as much on empathy; emotional reading of body
| language, expression, and linguistic subtexts; shared
| introspection; awareness of social relationships and
| behavioural codes; contextual cues from the physical and
| social environment which define and illuminate
| relationships; and all kinds of other skills which humans
| perform effortlessly and machines... don't.
|
| Turing Tests and game AI are fundamentally a nerd's view
| of human intelligence and interaction. They're so
| impoverished they're not remotely plausible.
|
| So as long as DALL-E has no obvious qualia, it cannot be
| described as sentient. It has no introspection and no
| emotional responses, no subjective internal state (as
| opposed to mechanical objective state), and no way to
| communicate that state even if it existed.
|
| And it also has no clue about 3D geometry. It doesn't
| know what a sphere is, only what sphere-like shading looks
| like. Generally it knows the texture of everything and
| the geometry of nothing.
|
| Essentially it's a style transfer engine connected to an
| image search system which performs keyword searches and
| smushes them together - a nice enough thing, but still
| light years from AGI, never mind sentience.
| planetsprite wrote:
| human brains are a chinese room. Our DNA and experiences
| wrote the book.
| TrevorJ wrote:
| I'm becoming convinced that these algorithms _are_ huge steps
| towards AGI simply because AGI might end up being a
| collection of many of these domain-specific networks with a
| network sitting above them whose only role is to interrogate
| the sub-networks for solutions to the problem at hand, and
| discriminate which solution(s) are most worth trying,
| simulating those and then picking one out and executing it in
| the real world. That seems to me to be pretty close to what
| we as humans do.
| ausbah wrote:
| the problem you run into then is how does it adapt to new
| domains? sure large models display some degree of
| generalization, but new tasks will certainly ask things the
| network couldn't get
| TrevorJ wrote:
| I think that's valid, however, I think the modular
| approach would allow for a lot more cross domain
| flexibility because I suspect that a large subset of
| disparate tasks do actually decompose themselves into
| discrete subtasks that can be useful across domains.
|
| Adding to that, if you can compose multiple sub networks
| together then you've really got something. You can build
| a lot of different buildings from bricks without needing
| to invent new kinds of brick basically.
|
| For instance, think about the large number of domains
| that robust computer vision would be useful in. Then
| think about the fact that if the computer understands the
| 3D space around it, it can hand that model off to a
| network that does predictive physics simulation. Now
| you've got something that would be useful across an
| extremely wide range of domains.
| WitCanStain wrote:
| Cognitive scientists have been exploring this idea for a
| while, e.g. the Modularity of Mind Hypothesis. It would be
| very interesting to see if learnings from this field can be
| further applied to AI development.
| p1necone wrote:
| You just need a neural network that knows how to
| interrogate other neural networks, and a neural network
| that knows how to find new datasets and train new neural
| networks and you potentially have something very scary
| right?
|
| Maybe a third neural network that knows how to purchase
| cloud compute?
| TrevorJ wrote:
| >Maybe a third neural network that knows how to purchase
| cloud compute?
|
| Amazon has entered the chat
| RootKitBeerCat wrote:
| Private angel investors watch from above all of the
| "clouds"
| miguelxpn wrote:
| That book gave me the highest dose of existential crisis I've
| ever felt. I should probably re-read it.
| zro wrote:
| Blindsight is one of my all time favorite books.
|
| Looks like it's still available for free from the author's
| website [0]
|
| [0] https://www.rifters.com/real/Blindsight.htm
| mgdlbp wrote:
| Indeed, it's actually irrevocably CC BY-NC-SA 2.5
| metacritic12 wrote:
| Are humans just "Chinese rooms"? We don't really understand
| anything deeply, but our neurons just fire in a way that
| gives good responses and makes us feel like we understand
| stuff.
| monkeynotes wrote:
| I think this is exactly true.
| plmpsu wrote:
| No you don't.
| joe_the_user wrote:
| Yes, DALL-E is very impressive to see and can have a number of
| actual practical uses.
|
| But fear of AGI is huge currently. The more impressive non-AGI
| things we see, the more worried people naturally become that
| we're reaching the "dawn" of AGI with all the disturbing
| implications that this might have. (A lot of people are afraid
| an AGI might escape the control of its creator and destroy
| humanity. I think that's less likely but I think AGI under
| control of its creator could destroy or devastate humanity so
| I'd agree AGI is a worry).
|
| That DALL-E doesn't understand object-relationships should be
| obvious to people who know this technology but a lot of people
| seem to need it spelled out. And they probably need it spelled
| out why this implies it's not AGI. But that would be several more
| paragraphs for me.
| jeremyjh wrote:
| The whole point is it isn't doing so coherently if similar
| images were absent from the training data. The monkey wasn't
| touching the iguana.
| jcelerier wrote:
| I don't understand how my brain isn't just gluing things
| together either. I don't personally feel like I'm actually
| experiencing the understanding of anything
| drooby wrote:
| After reading your message I felt compelled to click reply
| and generate a response and now that's what I'm doing. My
| comment feels mildly appropriate to what you said and I'm not
| 100% sure how it will be received but it _feels_ like I'm
| communicating some degree of understanding of the sentiment
| in which you have expressed. My neurons are happy.
| RC_ITR wrote:
| Do you think GPT 3 could generate the same string of text?
| drooby wrote:
| I was curious so I asked it some questions... here you
| go:
|
| Me: Write what it feels like to understand something
|
| GPT-3: It feels good to understand something. You feel
| like you have a handle on what is going on and you are
| able to make sense of it. It can feel empowering and give
| you a sense of control.
|
| Me: Write a sentence that provides an example of self
| reflection and metacognition
|
| GPT-3: I was surprised that I was able to understand the
| concept so quickly.
|
| Me: What concept?
|
| GPT-3: The concept was metacognition.
|
| Me: What does surprisal feel like in this context?
|
| GPT-3: It feels good to be surprised that you understand
| something. You feel like you are smart and have a good
| handle on the concept.
|
| Me: What is the experience of feeling good to you?
|
| GPT-3: The experience of feeling good is one of happiness
| and satisfaction. You feel like you have accomplished
| something and are proud of yourself.
|
| Me: What have you accomplished here?
|
| GPT-3: You have accomplished understanding a concept.
|
| Me: What concept is that again?
|
| GPT-3: The concept is metacognition.
| orlp wrote:
| The most important thing I think DALL-E shows is that it has a
| model of our world and culture. It's not intelligence, but it
| is knowledge.
|
| Google can give you endless pictures of giraffes if you search
| for it. But it can only connect you to what exists. It doesn't
| know things, it knows OF things.
|
| DALL-E has knowledge of the concept of a giraffe, and can
| synthesize an endless amount of never-before seen giraffes for
| you. It actually knows what a giraffe is.
| TremendousJudge wrote:
| I find talking about "knowing" too much anthropomorphizing
| for my taste. A more accurate description would be that it can
| synthesize an endless amount of pictures that the viewer will
| recognize as giraffes (or as anything else the viewer wants).
| And I agree, it's pretty amazing. But my point is that
| talking like this skips over the entire useless "what is
| knowledge" debate.
| soco wrote:
| Maybe we need to "de-anthropomorphize" these terms and
| rules. It's a long overdue process, I believe.
| psyc wrote:
| If it'll get people to stop accusing other people of
| anthropomorphizing, I'm 100% on board. Now if you can
| think of a way to get people to quit asking for
| definitions of the words _intelligence_ and
| _consciousness_, you'll be a legend and maybe these
| discussions will advance for the first time.
| TremendousJudge wrote:
| Well, I tried my best. Another commenter summed it up
| better without using the term "anthropomorphizing" at
| all, but it came out a bit more condescending than
| necessary.
| IIAOPSW wrote:
| Hang on while I become a legend.
|
| I use GPT3 to generate the usual trite arguments about
| intelligence and consciousness and why computers won't ever
| get there. Of course I don't actually reveal that a
| computer is generating my responses until later on.
| Eventually everyone will become jaded and skeptical that
| the other participants in that conversation are real
| people. Soon all arguments about machine intelligence
| will devolve into accusations of using GPT3 or not. Some
| day, even mentioning consciousness will just make
| everyone assume you're probably a GPT3 troll. This kills
| the conversation in a way that makes a valid point. If
| the bots can't be reliably identified, the proof is in
| the pudding and the matter is settled.
| TremendousJudge wrote:
| I don't think it's possible. "Knowing" and
| "understanding" are inherently human experiences. The
| most reasonable thing to do would be to invent new terms
| and relegate the older ones, that have too much cultural
| baggage attached to them.
|
| The same way when the creator of humanity turned out to
| be "evolution by natural selection" we didn't redefine
| the term "God" to mean that. Eventually we just started
| using the new term.
| MauranKilom wrote:
| I too can write a program that will synthesize an endless
| amount of pictures that the viewer will recognize as a
| giraffe. They would all be the same picture, just with one
| bit of noise randomly applied to each pixel - enough to
| generate more distinct 256x256 pictures of giraffes than
| there are atoms in the universe.
|
| We need to go a bit deeper to tease out what makes DALL-E 2
| amazing.
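|
| A minimal sketch of that trivial "generator", assuming
| Python with numpy and Pillow and a placeholder base image
| ("giraffe.png" is not a real asset):
|
|     import numpy as np
|     from PIL import Image
|
|     # One fixed base picture; every "new" giraffe is just
|     # this image with one random bit of noise per pixel.
|     base = np.array(Image.open("giraffe.png").convert("L"))
|
|     def noisy_giraffe(rng):
|         # Flip the least-significant bit of each pixel.
|         noise = rng.integers(0, 2, size=base.shape,
|                              dtype=np.uint8)
|         return Image.fromarray(base ^ noise)
|
|     rng = np.random.default_rng(0)
|     samples = [noisy_giraffe(rng) for _ in range(4)]
|
|     # A 256x256 image has 65,536 pixels, so there are
|     # 2**65536 possible outputs - far more than atoms in
|     # the observable universe - yet nothing is "known"
|     # about giraffes beyond the single source picture.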
| soco wrote:
| An honest question: what, then, is intelligence? Or are you
| just trying to avoid being caught in a debate...
| wcarss wrote:
| you could think of "knowing what a giraffe is" in this
| instance as a slice of an intelligent thought, but it's not
| a whole thought, let alone a train of thoughts, or a plan
| involving many trains of thought over time for internally
| motivated reasons.
|
| If we could get it to tell us a story about a particular
| giraffe, and then ask it next week about that same giraffe,
| and then the giraffe could be referenced by it while it
| tells a joke on a talk show in a decade -- that's maybe too
| high a bar, but that's real intelligence.
| shawnz wrote:
| Why is long term memory a requirement? Are amnesiacs not
| intelligent?
| wcarss wrote:
| > that's maybe too high a bar, but that's real
| intelligence.
|
| It's not necessarily a requirement, but I couldn't
| witness someone do it and then deny their intelligence.
| shawnz wrote:
| But it's not sufficient either, right? Just having memory
| alone also wouldn't prove intelligence, so it's not
| necessary or sufficient
| trention wrote:
| It knows what combinations of pixels correspond to the
| metadata having "giraffe" in it. Nothing more.
| blooperdeoo wrote:
| [deleted]
| tellarin wrote:
| " It actually knows what a giraffe is." -> It maybe has a
| sense of what a giraffe looks like.
| lancesells wrote:
| I wonder what would happen if you put in the text prompt:
| "Long-necked spotted mammal that eats leaves from trees".
|
| Most people would answer that as a giraffe.
| BudaDude wrote:
| I was curious about this as well. Here is what I got:
| https://i.imgur.com/Ey0i3My.png
| Marazan wrote:
| Very similar to my images for the prompt:
| https://imgur.com/8dw2brf
|
| Intriguing that it's gone for a headshot for all of them.
| I suspect it says something about the source text
| coel wrote:
| I was curious about this too and wondered if it could
| then relate to my description of a giraffe so put in
| "legs of a yellow mammal with a long neck and spots" and
| it still nailed it.
| lancesells wrote:
| Dall-E knows what a giraffe is IMO
| nextaccountic wrote:
| I found it incredible too
|
| The text model of dall-e at the very least can map "Long-
| necked spotted mammal that eats leaves from trees" near
| the same representation of "giraffe"
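|
| A quick way to sanity-check that kind of claim, assuming
| PyTorch and OpenAI's public CLIP package (a sibling of the
| text encoder used by DALL-E 2, not DALL-E itself), is to
| compare text embeddings directly; one would expect the
| long description to land much closer to "giraffe" than to
| an unrelated word:
|
|     import torch
|     import clip
|
|     model, _ = clip.load("ViT-B/32", device="cpu")
|     texts = [
|         "giraffe",
|         "Long-necked spotted mammal that eats leaves "
|         "from trees",
|         "sports car",
|     ]
|     tokens = clip.tokenize(texts)
|     with torch.no_grad():
|         emb = model.encode_text(tokens)
|     emb = emb / emb.norm(dim=-1, keepdim=True)
|     # Cosine similarities of the description and the
|     # distractor against "giraffe".
|     print((emb[1] @ emb[0]).item(),
|           (emb[2] @ emb[0]).item())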
| mechagodzilla wrote:
| The pictures are incredibly similar to what one gets when
| you type that same string in to google image search - it
| makes me wonder if you could do reverse-image search on
| some of the pictures it produces and find whole training
| pictures embedded in it, the way one can find blocks of
| intact training text stored inside language models.
| notahacker wrote:
| One of the interesting things about the "monkey touching
| iguana" photos is just how abstract that sense is.
|
| Several of the pictures look like it's taken geometry from
| a 'touching' source and painted both 'monkey' and 'iguana'
| textures on to both figures, because on the one hand its
| model of relationships is too sophisticated to just
| copy/paste monkey and iguana photos from its library, and
| on the other hand it's not sophisticated enough to always
| treat "monkey touching iguana" as implying that the monkey
| and the iguana are discrete animals. (An interesting
| contrast with it being generally praised for being
| remarkably good at things like putting hats on animals'
| heads...)
| XorNot wrote:
| On the other hand if you posted "monkey touching iguana"
| as a prompt on an internet forum - say a photoshop
| contest - you would get essentially all of these variants
| from actual humans as well when the task delved into
| being "meta".
| notahacker wrote:
| People are certainly _capable_ of wilfully
| misinterpreting the prompt because they consider monkey
| /iguana chimeras to be amusing to draw, or think that
| semi-abstract hybrid animals with a caption musing on the
| meaning of "touching" is a higher form art than mere
| representation. And in a photoshop competition they might
| all copy one silly interpretation because they think it's
| meme-worthy.
|
| But I don't think if you asked 18 humans to independently
| draw "monkey touching iguana" you'd get 17 pairs of
| monkey/iguana hybrids mostly not touching each other
| against photographic backgrounds often featuring human
| limbs and one apparently normal monkey being pursued by a
| giant iguana!
| bpodgursky wrote:
| Let's not play word games.
|
| Do _you_ know what a giraffe _is_? No, you just know what a
| giraffe looks like, where it lives, and maybe that it's
| vaguely related to a horse.
| axus wrote:
| I do not _grok_ a giraffe in the same way I do the things
| I work on.
| shafoshaf wrote:
| That last bit about the horse is pretty loaded. Knowing
| what a giraffe is means knowing that it is a mammal, has
| a blood system, eats leaves, reacts certain ways to
| certain stimuli, etc... That is a lot more than just
| knowing what it looks like. Image recognition is not the
| same as relationships or knowledge.
| influxmoment wrote:
| > mammal, has a blood system, eats leaves, reacts certain
|
| DALL-E likely can map all those concepts to a giraffe
| also.
| karatinversion wrote:
| By your criteria, no human before the 18th century (when
| Carl Linnaeus invented the class "mammalia") knew what a
| giraffe was.
| usrusr wrote:
| And in the 25th century, if we manage to keep technology
| alive, the dall-es of the day will be trained on more
| computer depictions of long-extinct animals that
| accidentally made it into the training set than pictures
| of real animals. It will be Dürer's rhinoceros all over
| again.
| orlp wrote:
| Now ask DALL-E "giraffe feeding young", "giraffe organs",
| "giraffe eating", etc, I think you'd find most knowledge
| is there.
| teolandon wrote:
| What about "giraffe laying eggs"?
| IIAOPSW wrote:
| That's not a word game at all. To know what a giraffe is
| is to be able to understand the implications of being
| informed that something is a giraffe. Dalle would never
| assume on its own that "a giraffe close to a low flying
| helicopter" could be very dangerous for the giraffe, but
| "a horse close to a low flying helicopter" is safe.
| nextaccountic wrote:
| > Dalle would never assume on its own that "a giraffe
| close to a low flying helicopter" could be very dangerous
| for the giraffe, but "a horse close to a low flying
| helicopter" is safe.
|
| Are you sure? (Can someone run those prompts?)
|
| Dall-e might be able to make those relationships in the
| latent space
| gattilorenz wrote:
| > Are you sure?
|
| Considering Dall-e has problems painting "a red cube on
| top of a blue cube" [1] and all kinds of simple spatial
| relations, I'd say it's a fair shot.
|
| [1] As reported by OpenAI, but there are also some
| prompts by Gary Marcus et al.
| (https://arxiv.org/abs/2204.13807) showing this, and it's
| trivially simple to find other very simple cases like
| these
| Earw0rm wrote:
| Not really close to horses at all. It's an even-toed
| ungulate, closer to deer, cows and sheep. Many, many tens
| of millions of years from horses.. the even-toed
| herbivores are closer to dolphins (!!) than they are to
| horses or rhinos.
| leereeves wrote:
| You know a lot more than that. You know that it eats,
| sleeps, runs, mates. You know that there are male and
| female giraffes and what that means. You know that it
| dies if it gets too hot, too cold, if it falls off a
| cliff. You know that it has legs, a head, neck, etc and
| you probably know that it has a heart, brain, lungs,
| kidneys, etc inside.
|
| You know, vaguely, how it interacts with other things and
| what it's likely to do around a fruit tree, or a lion, or
| fire.
|
| Even if you've never been close to a giraffe, you can
| probably imagine what it looks like from close enough to
| see individual hairs in its fur.
|
| A lot of knowledge is still missing from ML systems that
| don't interact with the world.
| hypertele-Xii wrote:
| It's even deeper than that. Humans and giraffes are
| mammals, _very_ close cousins in terms of evolutionary
| biology. The very nature of your experience of reality is
| essentially shared with the giraffe.
|
| Parts of our brains are _lizard,_ both yours and the
| giraffe's. Tech so ancient that it uses the same circuits
| and chemicals with _crustaceans._
|
| You can imagine what existence is like for a giraffe with
| pretty much 99% accuracy without consciously knowing a
| single thing about it.
|
| A word-based image generator cannot.
| kilroy123 wrote:
| What do you mean by "is"? I've spent a fair amount of
| time with giraffes in real life. Does that mean I know
| what one is?
| mym1990 wrote:
| I would really hope so. All of this conversation is
| devolving into a philosophical argument about 2 letter
| words. To know what something is relates to understanding
| some of its properties, and how it might interact with
| the world around it. Whether DALL-E knows what something
| is, is debatable. Whether you know what something is
| should be pretty clear.
| romellem wrote:
| If this was around in 2009 it would have made the
| https://www.onemilliongiraffes.com/ project much easier.
| SketchySeaBeast wrote:
| > It actually knows what a giraffe is.
|
| Every time I see a painting it does with blank beige swirls
| where faces should be, I question whether it knows what
| anything is or what the important parts are.
| LordDragonfang wrote:
| I suspect you're thinking of Craiyon (formerly DALL-E Mini)
| which is a derivative (but distinct) project, and
| consistently does the beige swirls. DALL-E 2 proper is much
| higher fidelity, and seems much more reliable at generating
| passable (if imperfect) faces from what I've seen.
| rob74 wrote:
| What's still missing, as the article mentions, is the correct
| relationship between objects. As in this post from a few
| months ago:
| https://astralcodexten.substack.com/i/57243633/the-tenth-
| vir... - it knows that Tycho Brahe was a bearded guy, and it
| knows what a telescope looks like, but when you ask for
| "Tycho Brahe looking through a telescope", the person
| sometimes looks _at_ the telescope, or _past_ it, sometimes
| the telescope just sits there, but the person never actually
| looks _through_ it.
| amelius wrote:
| Reminds me of a classifier that couldn't tell the difference
| between a panther and a couch with a panther-print.
| dane-pgp wrote:
| But what if the couch wasn't panther-print but made from
| actual panther hide? Is a dead panther still a panther? How
| much of the panther's body has to be visible for the object
| depicted to still be a panther?
|
| I'm reminded of the classic joke about the mathematician on
| a train who sees a sheep...
|
| https://bestjokehub.com/joke-103626/
| blooperdeoo wrote:
| " It actually knows what a giraffe is."
|
| No. You know what a giraffe is, Dall*E simply creates pixel
| groups which correlate to the text pattern you submitted.
|
| Watching people discuss a logical mirror makes me worry that most
| people are not themselves conscious.
| rhdunn wrote:
| There's an interesting Google TechTalk on YouTube called
| "The Next Generation of Neural Networks" from 2007 [1]. In
| that video, there is a demo [2] that showcases the neural
| network recognizing numbers when given a drawing of a
| number as inputs, but more interesting is the follow on
| where the demo shows fixing the output to a given number
| and letting the neural network generate the "input" image
| to show what it is thinking that number can be. That is a
| strong indication to me that that particular neural network
| has a good understanding of what the particular number
| glyphs look like, even if it does not know what those are
| conceptually or how they relate to each other and to
| mathematics -- that is, that the neural network would not
| be able to work out what number 42 is, how it relates to 37
| (i.e. 37 < 42), or how to manipulate those numbers (e.g. 2
| + 7 = 9).
|
| Dall-E will likely be similar in that it is effectively
| doing that perception step where you fix the text
| description from the classifier output and run that in
| reverse to show what the neural network is "seeing" when it
| is "thinking" about that given output. So it won't be able
| to describe features of a giraffe, or information about
| where they live, etc. but it will be able to show you what
| it thinks they look like.
|
| [1] https://www.youtube.com/watch?v=AyzOUbkUf3M [2]
| https://youtu.be/AyzOUbkUf3M?t=1293
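|
| For what it's worth, the gradient-based analogue of that
| "run it in reverse" demo is easy to sketch, assuming
| PyTorch and a small (here untrained) digit classifier
| standing in for a real trained model; the talk's network
| is a generative one that samples directly, so this is
| only a rough stand-in:
|
|     import torch
|     import torch.nn as nn
|
|     classifier = nn.Sequential(   # toy MNIST-shaped net
|         nn.Flatten(),
|         nn.Linear(28 * 28, 128), nn.ReLU(),
|         nn.Linear(128, 10),
|     )
|
|     target_digit = 4
|     x = torch.zeros(1, 1, 28, 28, requires_grad=True)
|     opt = torch.optim.Adam([x], lr=0.1)
|
|     for _ in range(200):
|         opt.zero_grad()
|         # Ascend the logit of the fixed output class.
|         loss = -classifier(x)[0, target_digit]
|         loss.backward()
|         opt.step()
|
|     # x now holds what the network "thinks" the digit
|     # looks like (meaningful only with a trained model).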
| gojomo wrote:
| There's a lot of people who make this same argument -
| DALLE/GPT/etc is just a 'mirror' or 'parrot' - but they
| rarely make convincing supporting arguments.
|
| They just assert it as axiomatic, whistling-past all the
| ways that they themselves - unless they believe in
| supernatural mechanisms - are also the product of a finite
| physical-world system (a biological mind) and a finite
| amount of prior training input (their life so far).
|
| I'm beginning to wonder if the entities making this
| argument are conscious! It seems they don't truly
| understand the issues in question, in a way they could
| articulate recognizably to others. They're just repeating
| comforting articles-of-faith that others have programmed
| into them.
| mherdeg wrote:
| > I'm beginning to wonder if the entities making this
| argument are conscious! It seems they don't truly
| understand the issues in question, in a way they could
| articulate recognizably to others. They're just repeating
| comforting articles-of-faith that others have programmed
| into them.
|
| I took a quick look at the Stanford Encyclopedia of
| Philosophy entry for philosophical zombies (
| https://plato.stanford.edu/entries/zombies/ ) and I can't
| see evidence of this argument having been seriously
| advanced by professionals before. I think it would go
| something like:
|
| "Yes, we have strong evidence that philosophical zombies
| exist. Most of the laypeople who discuss my line of work
| are demonstrably p-zombies."
| gfodor wrote:
| Use twitter long enough and it's hard to not be convinced
| that p-zombies exist - it may be that such zombies are
| ephemeral though, and a transitory state of some
| individuals.
| FeepingCreature wrote:
| Alternately, of course, conscious awareness is the
| ephemeral state.
|
| ("A laser is trying to find the darkness...")
| martindbp wrote:
| Yes, and DALLE clearly knows what a giraffe is, maybe not
| in the same way as we do, but it has placed the giraffe
| in some latent space close to other animals like it but
| far away from say, cars. But yes, there is plenty it
| doesn't know about giraffes because it's only been fed
| images and captions, while we can watch documentaries
| about giraffes and apply knowledge we've acquired about
| other animals and the world through our senses.
| drdeca wrote:
| I think I'd say that it knows what a giraffe _looks like_
| and some aspects of their behavior. I'm not sure if it
| (in a sense) "knows that a giraffe is an animal" though I
| wouldn't be too surprised if it does.
|
| Though, I guess maybe that is all "knowing what a giraffe
| is" is?
| usrusr wrote:
| I know what a giraffe looks like and some aspects of
| their behavior. I couldn't build a giraffe, I've never
| been a giraffe, I wouldn't even know what to talk about
| if I met a giraffe on a blind date.
| saynay wrote:
| At some point, these discussions always get to some
| question like "what is 'knowing'?"
|
| Would a person who only ever read articles and looked at
| pictures of giraffes have a better understanding of them
| than Dall-e does? At some level, probably, in that every
| person will have a similar lived experience of _being_ an
| animal, a mammal, etc. that Dall-e will never share. Is
| having a lesser understanding sufficient to declare it
| has no real understanding?
| [deleted]
| joe_the_user wrote:
| _There 's a lot of people who make this same argument -
| DALLE/GPT/etc is just a 'mirror' or 'parrot' - but they
| rarely make convincing supporting arguments._
|
| I suggest you look at the parent article.
|
| Defining "understanding" in the abstract is hard or
| impossible. But it's easy to say "if it can't X, it
| couldn't possibly understand". Dall-E doesn't manipulate
| images three-dimensionally, it just stretches images with
| some heuristics. This is why the images shown for "a cup
| on a spoon" don't make sense.
|
| I think this is a substantial argument and not hand-
| waving.
| gojomo wrote:
| In many of its outputs, it's doing far more than
| stretching, and essentially rotating new entities into
| scene-appropriate relationships.
|
| True, it has some problems fully abstracting, and then
| logically-enforcing, some object-to-object relationships
| that most people are trivially able to apply as
| 'acceptance tests' on candidate images. That is evidence
| its scene-understanding is not yet at human-level, in
| that aspect - even as it's exceeded human-level
| capabilities in other aspects.
|
| Whether this is inherent or transitory remains to be
| seen. The current publicly-available renderers tend to
| have a hard time delivering requested meaningful text in
| the image. But Google's PARTI claims that simply growing
| the model fixes this: see, for example:
| https://twitter.com/gojomo/status/1540095089615089665
|
| We also should be careful using DALL-E as an accurate
| measure of what's possible, because OpenAI has
| intentionally crippled their offering in a number of ways
| to avoid scaring or offending people, under the rubric of
| "AI safety". Some apparent flaws might be intentional, or
| unintentional, results of the preferences of the
| designers/trainers.
|
| Ultimately, I understand the practicality of setting
| tangible tests of the form, "To say an agent
| 'understands', it MUST be able to X".
|
| However, to be honest in perceiving the rate-of-progress,
| we need to give credit when agents defeat all the point-
| in-time MUSTs, and often faster than even optimists
| expected. At that point, searching for new MUSTs that
| agent fails at is a valuable research exercise, but
| retroactively adding such MUSTs to the definition of
| 'understanding' risks self-deception. "It's still not
| 'understanding' [under a retconned definition we
| specifically updated with novel tough cases, to comfort
| us about it crushing all of our prior definition's
| MUSTs]." It obscures giant (& accelerating!) progress
| under a goalpost-moving binary dismissal driven by
| motivated-reasoning.
|
| This is especially the case as the new MUSTs increasingly
| include things many, or most, humans don't reliably do!
| Be careful who your rules-of-thumb say "can't possibly be
| conceptually intelligent", lest you start unpersoning lots
| of humanity.
| joe_the_user wrote:
| I read a modest amount of popular "AI safety" text
| (lesswrong basically) and this is the first time I've
| heard of Dall-E being crippled for safety reasons.
|
| Your argument overall seems to be "you skeptics keep
| moving the bar, give me a benchmark I can pass and I'll
| show you", which seems reasonable on its face but I
| don't think actually works.
|
| The problem is that while an algorithm may be defined by
| theory and tested by benchmark, the only "definition" we
| have for general intelligence is "what we can see
| people doing". If I or anyone had a clear, accepted
| benchmark for general intelligence, we'd be quite a bit
| further towards creating it but we're not there.
|
| That said, I think one thing that current AIs lack is an
| understanding of their own processing and an understanding
| of the limits of that processing. And there are many
| levels of this. But I won't promise that if this problem
| is corrected, I won't look at other things. IDK,
| achieving AGI isn't like just passing some test, no
| reason it should be like that.
| weego wrote:
| I guess ironically you've just recreated the argument
| people of faith use to shout down logic used to question
| the reality of their belief system.
|
| The burden of proof is not on the one claiming logically
| consistent interpretations of events.
| gojomo wrote:
| Huh? That's a bit abstract, can you clarify?
| mlcrypto wrote:
| Yep human brain is just neurons firing or chemicals
| activating
| meroes wrote:
| And don't blame others for not finding this satisfactory
| either. Many many mathematicians think abstract objects
| exist outside of physical reality. Chomsky says
| "physical" hasn't been well defined for 200 years. And
| finite physical processes do not seem adequate to explain
| the infinite character of language and mathematical
| infinity. Or if they are, then go inform the mathematical
| realists infinity isn't real because all proofs are
| finite yet Gödel believed in actual infinity.
| bglazer wrote:
| I can give a random string of letters as a prompt and
| DALLE will generate coherent images based on that. To me,
| that is as clear a signal as any that there is no
| reasoning or even a consistent world model embodied in
| DALLE. It's simply a high dimensional latent mapping
| between characters and pixels. Like OP said, that is a
| stupendous achievement, but it is just a very complex and
| impressive mirror. If it wasn't just a mapping between
| characters and pixels, and instead DALLE had intelligence
| that "understood" the symbols it manipulated, then I
| would expect it to generate nothing, or white noise in
| response to random letters.
| elcomet wrote:
| I agree there's no high level reasoning (if there's a way
| to define this).
|
| But saying "It's simply a high dimensional latent mapping
| between characters and pixels" is clearly a very bad
| argument. Your brain is simply a high dimensional latent
| mapping between your sensory input and your muscular
| output. This doesn't make you not intelligent.
| thfuran wrote:
| >Your brain is simply a high dimensional latent mapping
| between your sensory input and your muscular output
|
| It definitely does more than that.
| elcomet wrote:
| What do you mean? Your brain _is_ a high dimensional
| function. I'm not talking about what it is able to do,
| I'm talking about what it is.
| gojomo wrote:
| If you give humans gobbledygook input - say a Rorschach
| blot, or some random syllables - and ask them to describe
| what it makes them think of, or generate art from it,
| many will do so quite competently.
|
| They'll find some morsels of fragmentary hints-of-meaning
| in the junk, or just act from whatever's bouncing around
| in their own 'ground state', and make something
| interesting & coherent, to please their interlocutor.
|
| So I don't see why this corner-case impugns the level-of-
| comprehension in DALLE/etc - either in its specific case
| or in the other cases where meaningful input produces
| equally-meaningful responses.
|
| In what ways are you yourself not just a "very complex &
| impressive mirror", reflecting the sum-of-all external-
| influences (training data), & internal-state-changes,
| since your boot-up?
|
| Your expectation that random input should result in noise
| output is weird to me. _People_ can see all sorts of
| omens & images in randomness; why wouldn't AIs?
|
| But also: if you trained that expectation into an AI, you
| could get that result. Just as if you coached a human, in
| a decade or 2 of formal schooling, that queries with less
| than a threshold level of coherence should generate an
| exceptional objection, rather than a best-guess answer,
| you could get humans to do so.
| bglazer wrote:
| I think most people would find it more difficult to draw
| a "serkle nufding" than a "spotted dog". At the very
| least, they would need an explanation that you're looking
| for whatever pops into their head for the nonsense.
| Again, that indicates that human intelligence works in a
| fundamentally different way from DALLE, which has
| absolutely no idea what is nonsense. So, in this
| incredibly important task (recognizing nonsense) there's
| an asymmetry: people can act like DALLE, but DALLE can
| not act like humans. To me, that totally breaks any
| equivalence between DALLE and human intelligence.
| dtech wrote:
| Then again, if you lock a human in a room, tell it "draw
| me things based on the notes you receive or you will be
| killed", and give it a note "serkle nufding", you bet the
| human will draw _something_.
|
| Given the model has only 1 input and 1 output and
| training is essentially surviving that order, it's not
| dissimilar.
| gojomo wrote:
| I've not tested this, but I suspect you can get DALLE to
| create pictures that humans are more likely to describe
| as 'nonsense' by adding 'nonsense' or 'nonsensical' to
| the prompt. That'd indicate that it absolutely _does_
| have an idea about 'nonsense' and can recognize, &
| reproduce within its constrained output, 'nonsense'
| that's largely compatible with human ideas of what
| 'nonsense' is.
|
| Asking it to produce _noise_, or raise an objection that
| a prompt isn't sufficiently meaningful to render, is a
| silly standard because it's been designed, and trained,
| to always give _some_ result. Humans who can object have
| been trained differently.
|
| Also, the GPT models - another similar train-by-example
| deep-neural architecture - can give far better answers,
| or give sensible evaluations of the quality of its
| answer, _when properly prompted to do so_. If you wanted
| a model that'd flag nonsense, just give it enough
| examples, and enough range-of-output where the answer
| you're demanding is even _possible_, and it'll do it.
| Maybe better than people.
|
| The circumstances & limits of the single-medium (text, or
| captioned image) training goals, and allowable outputs,
| absolutely establish that these are _different_ from a
| full-fledged human. A human has decades of reinforcement-
| training via multiple senses, and more output options,
| among other things.
|
| But to observe that _difference_ and conclude these
| models don't "understand" the concepts they are so
| deftly remixing, or are "just a very complex and
| impressive mirror", does not follow from the mere
| difference.
|
| In their single-modalities, constrained as they may be,
| they can train on the equivalent of a million lifetimes of
| reading, or image-rendering. Objectively, they're
| arguably now better at composing college-level essays, or
| rendering many kinds of art, than most random humans
| picked off the street would be. Maybe even better than
| 90% of all humans on earth at these narrow tasks. And,
| their rate of improvement seems only a matter of how much
| model-size & training-data they're given.
|
| Further: _the narrowness of the tasks is by designers'
| choice, NOT inherent to the architectures._ You could train
| - and active projects are training - similar multi-modality
| networks. A mixed GPT/DALLE that renders essays with
| embedded supporting pictures/graphs isn't implausible.
| londons_explore wrote:
| I just got Dalle-2 to draw a "serkle nufding". It gave me
| 4 pictures of rather odd looking food. I too assume that
| random words are usually food, if only because restaurant
| menus seem to have such a love of making up foreign
| sounding words.
| dane-pgp wrote:
| > a love of making up foreign sounding words
|
| One great example of this phenomenon is "Haagen-Dazs".[0]
|
| Admittedly that's a brand name, rather than a specific
| dish, but I assume that Dalle-2 would generate an image
| of ice cream if given a prompt with that term in it
| (unless there is a restriction on trademarks?).
|
| [0] https://funfactz.com/food-and-drink-facts/haagen-
| dazs-name/
| caconym_ wrote:
| When I read the words "serkle nufding" I pictured in my
| head, with zero perceptible delay, a weird little goblin
| guy with pale green skin and long ears and a pointy hat.
| sarchertech wrote:
| Serkle Nufding sounds like a word. What about
| "xywjlkbtywlsjjsnskksbrjsksbsysjsnsjdkdmdnd"?
| caconym_ wrote:
| Honestly? The top half of a signpost bearing the name of
| a Welsh village in white on green (likely an artifact of
| my actual local signage standards). The tops of some
| broadleaf deciduous trees, possibly ash, are visible in
| the background, but the perspective is from below (as one
| usually looks up at a street sign, also probably an
| artifact of my local experience with signs rather than a
| true representation of Welsh signage) and so the
| background is dominated by a blue sky.
|
| This is probably because your string is a low-entropy
| keyboard-mash.
| simondw wrote:
| > they would need an explanation that you're looking for
| whatever pops into their head
|
| Of course. And the equivalent of that explanation is
| baked into DALL-E, in the form of its programming to
| always generate an image.
|
| > but DALLE can not act like humans
|
| No, not generally, but I don't think anyone has claimed
| that.
| bglazer wrote:
| What else always produces an image, no matter what you
| present to it?
| simondw wrote:
| Lots of things, but I'm not sure why you're asking that.
| Could you connect that back to the point you're making?
| saynay wrote:
| I am not really sure that this is a basis to say human
| intelligence is fundamentally different. If you ask a
| human to draw a picture based on nonsensical input, they
| probably won't. That might not be a difference in process,
| but in agency.
|
| So is a single-purpose AI equivalent to the entirety of
| the Human Experience? Of course not. But can it be
| similar in functionality to a small sliver of it?
| IIAOPSW wrote:
| I've done my share of testing randomish prompts. Here are
| two examples that might change your mind slightly. The
| first example is "E sublimary widge fraus chronly becross
| yon corpti." I constructed this sentence to resemble the
| grammar, syllables and pacing of English, but it is
| complete nonsense. The output was both consistent and
| coherent. In the case of DallE, it generated church and
| saint related images. The second example was ".- / ... .
| -- .- .--. .... --- .-. . / .-.. .- -- .--. / .- -... ---
| ...- . / - .... . / -.-. .-.. .- ... ... --- -. / .- ...-
| . -. ..- . / ... ..- -... .-- .- -.-- / ... - .- - .. ---
| -." There are clearly not enough (any?) samples of morse
| code text in the training. The output was wholly
| unrelated to the prompt. The output is kind of like
| doodling. It's fairly random, fairly undirected without
| any real theme, but it's not the same randomness as TV
| static. It's more like randomness from a distribution of
| things people would draw when bored.
|
| So the interesting part is this: why did one random
| prompt fail in a consistent way and the other in a random
| way? Perhaps the encoding of meaning into vocabulary has
| patterns to it that we ourselves haven't noticed. Maybe
| your random string experiment works because there is some
| amount of meaning in the syllables that happened to be in
| your chosen string.
| rhdunn wrote:
| I suspect it is designed to describe images, like you
| would find in alt text but run in reverse. As your first
| example is closer to English it can do a better job than
| with the second example, like how the Jabberwocky poem
| contains nonsense words but still makes some degree of
| sense.
|
| For example, the "yon" is immediately reconizable to me
| (hither and yon), so "yon corpti" could mean a distant
| corpti (whatever a corpti is). "becross" looks similar to
| "across" but with a be- prefix (be-tween, be-neath, be-
| twixt, etc.), so could be an archaic form of that.
| "chronly" could be something time related (chronos+ly).
| etc...
| gojomo wrote:
| Indeed - the 'cadence of English' prompt reminds me of
| Latin, or some other foreign-litany that's nonetheless
| repeated to an audience that's not following every word.
| And ~IIAOPSW has used real word-roots, hinting at
| meanings! (The FastText algorithm does a passable job at
| synthesizing word-vectors for unknown words using such
| fragments.) So rendering that as a religious scene isn't
| too surprising.
|
| That morse code gives nothing useful probably just
| indicates some combination of - (a) few morse transcripts
| in training set; (b) punctuation-handling in training or
| prompting - makes it more opaque. It's opaque to me,
| other than recognizing it's morse code.
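|
| (For anyone curious about the subword trick: gensim's
| FastText implementation shows it in a few lines. The toy
| corpus below is made up, so the similarity numbers are
| only illustrative; the point is that an out-of-vocabulary
| word still gets a vector built from character n-grams.)
|
|     from gensim.models import FastText
|
|     sentences = [
|         ["the", "giraffe", "eats", "leaves"],
|         ["a", "monkey", "touches", "an", "iguana"],
|         ["the", "helicopter", "flies", "low"],
|     ]
|     # min_n/max_n set the character n-gram sizes used
|     # to compose vectors for unseen words.
|     model = FastText(sentences, vector_size=32, window=3,
|                      min_count=1, min_n=3, max_n=5,
|                      epochs=50)
|
|     # "giraffes" never appears in the corpus, but it
|     # still gets a vector from shared n-grams.
|     vec = model.wv["giraffes"]
|     print(model.wv.similarity("giraffe", "giraffes"))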
| semi-extrinsic wrote:
| Your first random prompt is far from random. It contains
| the fragments "sublim", "chr", "cross" and "corpt" in
| addition to the isolated "E", which all project the
| solution down towards Latin and Christianity.
|
| You can easily see that these language models are in some
| sense working on fragments as much as they are on the
| actual words isolated by spaces in your sentence. Just
| take a test sentence and enter as a prompt to get some
| images. Then take that same sentence, remove all spaces
| and add new spaces in random locations, making gibberish
| words. You will see that the results will retain quite a
| few elements from the original prompt, while other things
| (predominantly monosyllables) become lost.
|
| To me, I have not seen a single example that cannot just
| be explained by saying this is all just linear algebra,
| with a mind-bogglingly huge and nasty set of operators
| that has some randomness in it and that projects from the
| vector space of sentences written in ASCII letters onto a
| small subset of the vector space of 1024x1024x24bit
| images.
|
| If you then think about doing this just in the "stupid
| way", imagine you have an input vector that is 4096 bytes
| long (in some sense the character limit of DALL-E 2) and
| an output vector that is 3 million bytes long. A single
| dense matrix representing one such mapping has 6 billion
| parameters - but you want something very sparse here,
| since you know that the output is very sparse in the
| possible output vector space. So let's say you have a
| sparsity factor of somewhere around 10^5. Then with the
| 3.5 billion parameters of DALL-E 2, you can "afford"
| somewhere around 10^5 such matrices. Of course you can
| apply these matrices successively.
|
| Is it then so far fetched to believe that if you thought
| of those 10^5 matrices as a basis set for your
| transformation, with a separate ordering vector to say
| which matrices to apply in what order, and you then spent
| a huge amount of computing power running an optimizer to
| get a very good basis set and a very good dictionary of
| ordering vectors, based on a large corpus of images with
| caption, that you would not get something comparably
| impressive as DALL-E 2?
|
| When people are wowed that you can change the style of
| the image by saying "oil painting" or "impressionist",
| what more is that than one more of the basis set matrices
| being tacked on in the ordering vector?
| Joeri wrote:
| Dall-e must generate an image, even for a random prompt
| of letters, because it has been trained to do only that.
| Take a human, hold a gun to their head, and force them to
| paint an image for a random string of letters, and I
| guarantee they will make you something.
|
| Retrain dall-e and give it a choice whether it generates
| an image or does something else, and you will get a
| different outcome.
|
| The argument boils down to this: is a human brain nothing
| but a mapping of inputs onto outputs that loops back on
| itself? If so the dall-e / gpt-3 approach can scale up to
| the same complexity. If not, why not?
| snapcaster wrote:
| How do you know you know what a giraffe is?
| logifail wrote:
| > How do you know you know what a giraffe is?
|
| Perhaps a more interesting question could be: [how] do we
| know what consciousness is?
| twoxproblematic wrote:
| gfodor wrote:
| Not exactly. If you're using the word "simply" you probably
| don't quite understand what exactly DALL-E is doing. It's
| not just mapping correlations between words to pixels. The
| intermediate representations and textual attention are what
| make DALL-E interesting.
| jvanderbot wrote:
| Precisely this! You don't know what a giraffe is, either.
| But you can talk about a giraffe to yourself, using words
| with correlating imagery that you can "see" by feeding
| through your visual cortex, and that's close enough.
| EricBurnett wrote:
| Thought experiment: between you and the AI, which would do
| a better job depicting a giraffe skeleton? A giraffe in
| its natural habitat? Their favorite tree to eat? Species
| on the genetic tree closest to giraffes?
|
| If we assume this AI or a successor can win that
| evaluation, in what way would you say you know what a
| giraffe is better than the AI?
| adamsmith143 wrote:
| Comprehension does not imply consciousness or vice versa...
| chaxor wrote:
| There are far too many that try to use this argument
| without really knowing what they're talking about.
|
| *Don't mistake skepticism for knowledge*
|
| This is a major problem on this site and elsewhere.
| elcomet wrote:
| > You know what a giraffe is, Dall*E simply creates pixel
| groups which correlate to the text pattern you submitted.
|
| How would you tell the difference though? Can you think of
| a test to distinguish between those two abilities ?
| ramoz wrote:
| Because you've mentioned Google, here's this:
| https://imagen.research.google/
| danielmorozoff wrote:
| Where do you see DALLE automating away jobs?
| soderfoo wrote:
| Augmenting jobs is more likely, from what I have heard. It
| would be useful for rapid prototyping of artwork and could
| help designers get design specs to artists quicker.
|
| The initial impressions are that production ready designs
| will require an artists touch. But this discussion was in the
| context of AA-Games and up. For indie and small games fully
| AI produced art may be fine.
| robertlagrant wrote:
| There was a post on here recently on how someone designed
| their logo by asking DALL-E a load of questions. The
| rationale could be that it might not remove that job, but it
| would certainly be a powerful tool for a logo designer to
| take on way more clients, thus driving down the cost of logo
| design.
| lancesells wrote:
| If I'm thinking of the same post that logo is poorly done /
| chosen. I know the author wanted very specific imagery in
| it and achieved that but I don't think that logo would be
| considered "good" by most.
| arbitrage wrote:
| That was the first instance. These sorts of things only
| get better over time.
| odessacubbage wrote:
| i feel like this boils down to a client-side
| misunderstanding of where graphic designers actually spend
| the majority of their time. i'd actually argue that working
| from spec or even a crude doodle on the back of a napkin is
| a lot easier than dealing with a client who is married to a
| poor design that DALL-E puked out for them. one of the most
| important things about the sketch phase is to be able to
| iterate without too many assumptions while also being able
| to recognize and play upon the strengths of individual
| drawings. this is not a time consuming process but it also
| isn't something you want to rush any more than you'd want
| to rush the plastic surgeon performing your nose job.
| depersonalizing the design process in favor of volume does
| not particularly serve anyone and, moreover, it responds to
| a need that i don't really think exists. this is not to say
| that i think these kinds of image generators are
| without uses, but right now we are still in the phase where
| people are being impressed by graphical fidelity[which was
| already incredibly cheap] rather than usefulness. imo this
| stage of ai will primarily work well for tasks that benefit
| from randomness while also being relatively low stake in
| the overall production pipeline. training a network to
| generate typefaces or to randomize npc portraits for an
| open world game is precisely the sort of area where 'gluing
| things together' can potentially provide more benefits than
| headaches.
| robertlagrant wrote:
| I was saying a designer could use it for first draft
| ideas, not their client.
| laxd wrote:
| I just cancelled my plan of paying a graphics designer to
| prepare an android app (personal project) for launch. After
| playing with dall-e I'm confident that "I" can make a logo
| and some artwork "myself". Edit: ment too say icon, not logo.
| IshKebab wrote:
| The people who do digital art illustrations for magazine
| articles are surely out of a job?
| elondaits wrote:
| I think it'll start by removing the need for editorial
| illustration and photography for a lot of blogs and websites,
| starting with the less reputable ones.
|
| MidJourney is already good enough creating illustrations for
| articles, book covers, etc. Not something that The New Yorker
| will be interested in, but better than anything you can buy
| for a couple of dollars or small monthly subscription.
| notahacker wrote:
| > I think it'll start by removing the need for editorial
| illustration and photography for a lot of blogs and
| websites, starting with the less reputable ones.
|
| Stock photo sites and Google Images and basic graphic
| filters did that a _long_ time ago. Sure, DALL-E has
| certain creative possibilities that those don 't, and the
| niches like satirical memes or "first pass" concepts in
| agency brainstorming meetings where it could be absolutely
| revolutionary tend to not be deliverables the creators get
| paid for. The publications giving illustrators a reliable
| income are usually after their consistent style, fantasy
| world coherence or novel ideas which isn't exactly where
| DALL-E shines.
| Workaccount2 wrote:
| DALL-E is the writing on the wall for those with a
| competing talent to start looking for other (non-art)
| work they may want to do in the future. It would be an
| egregious error to think that AI art programs are some
| new tool to learn akin to photoshop. Maybe that will hold
| true for a few years at best.
|
| There will come a point, much sooner than later, where
| the value in typing prompts into an AI is going to only
| be worth minimum wage.
| notahacker wrote:
| Substitute "the camera" (and perhaps even more "the
| gramophone") for "DALL-E" and "AI art programs", and the
| 19th century wants its argument against the future of
| creative media back.
| IshKebab wrote:
| I agree. This paragraph is baffling:
|
| > DALL-E's difficulty in juxtaposing wildly contrastive image
| elements suggests that the public is currently so dazzled by
| the system's photorealistic and broadly interpretive
| capabilities as to not have developed a critical eye for cases
| where the system has effectively just 'glued' one element
| starkly onto another, as in these examples from the official
| DALL-E 2 site:
|
| Yes the public is so dazzled by this massive leap in capability
| that it hasn't developed a critical eye for minor flaws.
|
| Yeah we get it. It's not instantly perfect. But the fact that
| people aren't moaning that it can't put a tea cup in a cylinder
| isn't because everyone stupidly thinks it is perfect, it's
| because not everyone is a miserable naysayer.
| RC_ITR wrote:
| "Don't people realize the sunset would be more beautiful if
| you run it through saturation filters and do white
| balancing?"
| KingOfCoders wrote:
| That's called creativity.
| IIAOPSW wrote:
| I have a phrase I'd like to coin in contrast to AI. "Artificial
| Bullshit". AB.
|
| I of course mean "bullshit" in the highly technical sense defined
| by Frankfurt [1]. The defining feature that separates a
| bullshitter from a liar is that a liar knows and understands the
| truth and intentionally misrepresents the matters of fact to
| further their aims, whereas a bullshitter is wholly unconcerned
| with the truth of the matters they are discussing and is only
| interested in the social game aspect of the conversation.
| Bullshit is far more insidious than a lie, for bullshit can (and
| often does) turn out to be coincident with the truth. When that
| happens the bullshitter goes undetected and is free to infect our
| understanding with more bullshit made up on the spot.
|
| DallE generates the images it thinks you want to see. It is
| wholly unconcerned with the actual objects rendered that are the
| ostensible focus of the prompt. In other words, it's bullshitting
| you. It was only trained on how to get your approval, not to
| understand the mechanics of the world it is drawing. Put another
| way, we've trained a machine to have daddy issues.
|
| A profoundly interesting question (to me) is if there's a way to
| rig a system of "social game reasoning" into ordinary logical
| reasoning. Can we construct a Turing Tarpit out of a reasoning
| system with no true/false semantics, a system only designed to
| model people liking/disliking what you say? If the answer is yes,
| then maybe a system like Dalle will unexpectedly gain real
| understanding of what it is drawing. If not, systems like Dalle
| will always be Artificial Bullshit.
|
| [1] http://www2.csudh.edu/ccauthen/576f12/frankfurt__harry_-
| _on_...
| adamsmith143 wrote:
| >In other words, it's bullshitting you. It was only trained on
| how to get your approval, not to understand the mechanics of
| the world it is drawing.
|
| A system can learn to do all kinds of interesting things by
| trying to maximize rewards.
|
| See: https://www.deepmind.com/publications/reward-is-enough
| visarga wrote:
| What you call bullshit I call imagination. Both humans and AI
| need it. Humans use imagination to plan ahead. AlphaGo was
| generating moves to plan ahead.
|
| Dall-E and GPT-3 are not being used as agents; they are just
| tool AIs. They have a narrow task - generating images and text.
| Agents on the other hand need to learn how to act in the
| environment, while learning to understand the world at the same
| time.
| elefanten wrote:
| How did you get "imagination" out of gp's comment? They
| described something quite clearly and it's not reasonably
| summarized as "imagination".
|
| Per gp, bullshit is cynically self-interested pontificating.
| It's performance. Maybe you could say that the bullshit
| produced is imaginative, sometimes. But it has nothing to do
| with "imagination" as a simulation-like capability used for
| planning and learning.
| TremendousJudge wrote:
| As far as I know, the human brain is just a "social game
| reasoning" optimizer that we try (and fail) to use to do
| actual logical reasoning. The zillion cognitive biases we have
| are the clue: we don't do logic, we have biases and sometimes
| stumble upon logic.
| [deleted]
| gfodor wrote:
| DALL-E either doesn't generate images you want to see, or if it
| does, it does a bad job, because it generates many images you
| don't want to see.
|
| In other words, the claim you've set up is basically
| unfalsifiable, given that there's no way to form strong
| counterevidence from its outputs. (I would argue that if there
| was, we'd already have it in the vast majority of outputs that
| aren't images people want.)
|
| If I were to refine what you're saying, it's that DALL-E is
| constrained to generating images that _make sense_ to the human
| visual system in a coherent way. That constraint is a far cry
| from what you'd need in order to claim it is "bullshitting",
| though, since it operates at a very low level in terms of
| constraining outputs.
| ___rubidium___ wrote:
| I think you're right, but I would qualify that the AI is
| bullshitting in the same way that a child's drawing of a stick
| figure, house, and smiling sun is bullshit designed to get
| approval. The AI is giving symbols--very visually stunning
| ones, to be sure, but symbols nonetheless--of what it is
| prompted to create, just like a child learns that "circle with
| lines coming out of it" is a symbol that can be read as "sun"
| and praised by adults.
| IIAOPSW wrote:
| The reason for my final question at the end is this: can AI
| emerge out of AB? The child eventually grows up and
| (hopefully) stops bullshitting everything. But do they
| actually stop bullshitting, or is the human ability to reason
| itself somehow built out of bullshit?
| ___rubidium___ wrote:
| That's a very good question. The child eventually learns to
| stop using symbols and instead begins to copy the shapes,
| negative spaces, curves, etc. in front of them and makes an
| assemblage of otherwise-meaningless shapes, lines, and curves
| which is interpreted by viewers as a representation of _the
| real thing_. Does AI have the potential to do that? I don't
| know.
| IshKebab wrote:
| Nobody grows out of your definition of bullshit - doing
| things that they think others want to see.
|
| But that definition is ... well bullshit. Bullshitting is a
| deliberately deceptive act. Children aren't being
| deliberately deceptive when they come up with nonsense
| answers to questions they don't understand.
| notahacker wrote:
| I don't think the child's drawing of a stick figure, house
| and smiling sun is just "bullshit designed to get approval"
| though. I think they care deeply about the concept of mummy,
| and how mummy relates to various things, especially them, and
| have strong emotional impulses towards mummy. Learning that a
| stick figure is a symbolic representation of mummy that
| might meet with approval is a trivially small part of that
| intrinsic interest in mummy, not the extent of their
| engagement with the concept of mummy. And if you tell them
| you don't approve of their drawing, they'll burst into tears
| and possibly have a completely different attitude towards
| drawing and/or you, not just iterate on the bullshit.
| Similarly, the kid's concept of mummy being happy or sad is a
| bit more nuanced than which way up the crescent under the two
| dots should be....
| [deleted]
| thfuran wrote:
| >Bullshit is far more insidious than a lie, for bullshit can
| (and often does) turn out to be coincident with the truth. When
| that happens the bullshitter goes undetected and is free to
| infect our understanding with more bullshit made up on the
| spot.
|
| If the bullshit is turning out to be true, what's the issue
| with more of it? If it's not true but still believed and so
| causing problems, what's the practical difference between it
| and an undetected lie that makes it more insidious?
| IIAOPSW wrote:
| A lie is anti-correlated with the truth. Bullshit is
| uncorrelated with the truth. The problem with trusting a coin
| flip because it happened to be right this time is that now
| you trust the coin as a source of information, when really it's
| a source of entropy. The lie will be found out just as soon as
| the truth comes out, whereas the bullshitter might get enough
| lucky flips in a row that people start taking their advice
| seriously.
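A tiny illustration of that distinction (my own sketch in plain
Python, not something from the thread): a liar's answers still carry
information once you know to invert them, while a bullshitter's
coin-flip answers carry none at all.

    import random

    random.seed(0)
    facts = [random.choice([True, False]) for _ in range(10_000)]

    truth_teller = list(facts)                    # correlated with truth
    liar = [not f for f in facts]                 # anti-correlated
    bullshitter = [random.choice([True, False])   # uncorrelated coin flip
                   for _ in facts]

    def accuracy(answers):
        return sum(a == f for a, f in zip(answers, facts)) / len(facts)

    print(accuracy(truth_teller))  # 1.0
    print(accuracy(liar))          # 0.0: invert it and recover the truth
    print(accuracy(bullshitter))   # ~0.5: right half the time, by luck alone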
| garyrob wrote:
| I agree, but I disagree about one aspect. For the most part,
| humans don't use reason all that much or all that deeply. We
| usually use intuitive thinking, and there is research showing
| that immediate intuitive responses are often better than the
| result of long thinking. More negatively, is QAnon belief, or
| even Trump election claim belief, about reason? Or is it about
| associations between words and concepts, especially when those
| concepts are believed in by the people the believer tends to
| trust and associate with?
|
| In other words, the takeaway here may not be that GPT-3 spews
| bullshit. It may be that most of the time, human "thinking" is
| a less-nuanced, biological version of GPT-3.
| thaw13579 wrote:
| To me, Dall-E seems analogous to a film production team that
| produces visual imagery reflecting a script written by a
| screenwriter. By the above reasoning, would that team be
| producing "bullshit"? I think most people would think not,
| because the goal isn't to communicate objective truth about the
| world, rather something plausible, interesting, entertaining,
| etc. (unless it is a documentary).
|
| I also think distinguishing bullshit from lying depends heavily
| on internal mental thoughts, goals, and intentions. Isn't
| talking about Dall-E this way personification and ascribing
| some level of consciousness?
| croes wrote:
| The production team knows what it puts together; Dall-E
| seems not to.
| thaw13579 wrote:
| Exactly my second point. Producing "bullshit" requires some
| internal thoughts and intentions, as the parent put it:
| "DallE generates the images it _thinks_ you want to see"
| (emphasis mine). A very big claim, which is interesting to
| consider, but probably premature.
|
| However, I do think that Dall-E is able to learn complex
| high-order statistical associations, i.e. beyond just
| juxtaposing and visually blending objects. For a recent
| example, this post with a prompt "ring side photo of
| battlebots vs conor mcgregor":
|
| https://twitter.com/weirddalle/status/1554534469129871365
|
| What is amazing here is the excessive blood and gore. That
| feature can't be found in any individual battlebot or MMA
| match, but it is exactly what you would expect from robots
| fighting a person. Pretty amazing, and I wonder at what
| point we could consider this analytical reasoning.
| dougmwne wrote:
| I reached essentially the same conclusion after playing with
| GPT-3 for a while. It spins out mountains of convincing and
| impressive bullshit, but you can't actually trust anything it
| says because it is disconnected from right, wrong, correct and
| incorrect. Reading too much of what it outputs is dangerous
| because it basically is feeding white noise into your
| perception and experience of the world.
| acoard wrote:
| I'm a big fan of Frankfurt's "On Bullshit", and love the
| reference.
|
| I think there's one significant distinction between the normal
| human bullshitter that Frankfurt originally envisioned and an
| AI practicing Artificial Bullshit. The bullshitter knows there
| is truth and intentionally disregards it, whereas the AI is
| blind to the concept. I guess this is "mens rea" in a sense:
| the human is conscious of their guilt (even if they're
| apathetic towards it), whereas DALL-E is just a tool that does
| what it is programmed to do.
|
| I do like this application of "bullshit" though, and will keep
| it in mind going forward.
| IIAOPSW wrote:
| Interesting, but I have two follow ups
|
| 1. What are the implications of intentionally disregarding
| the existence of truth vs being blind to the concept? How
| does this distinction you made manifest?
|
| 2. Are you sure all humans actually believe in the concept of
| truth, or could it be the case that some people genuinely
| function on the principle "there is no truth, only power"? Is
| it possible to think "truth" and "current dominant narrative"
| are one and the same?
|
| I've certainly had a ton of luck with Bullshit in Diplomacy.
| As Russia, I offered a plan that involved France helping me
| take Munich and I would repay by supporting him against the
| English invasion. Did I intend to actually follow through, or
| was this a cunning lie? Neither. It was bullshit that got me
| into Munich. I myself didn't know because (in game) I don't
| believe in the concept of truth. Everything I say is true and
| none of it is true. It's all true in the sense that it might
| happen and I put some weight on it; none of it is true in the
| sense that there is no branch of the future game tree
| privileged as "the truth". Some truths have more weight than
| others, but there is no underlying absolute truth value that
| must exist yet I choose to ignore. Eventually the order box
| forces me to pick a truth out of the many I have told. But
| prior to being forced, it didn't exist.
|
| Is it possible to think in this way all the time about
| everything? Maybe.
| samstave wrote:
| > _conscious of their guilt_
|
| This hits it for me:
|
| Consciousness is kinda "being aware of the fact that you have
| choices for available actions, and what impact these
| actions/non-actions will have on either yourself, your
| environment, the object of your action, or on others."
|
| Intelligence is being aware of the inputs and knowing the
| (non)available list of actions to take.
|
| Intelligence acts on stimuli/input/data?
|
| Consciousness is awareness of one's own actions from
| intelligence, and of others' acts from their standpoint of
| intelligence or consciousness...
|
| A yin/yang, subjective/objective sort of duality that Humans
| make. (thought v emotion)
|
| Dogs are both intelligent and conscious. They know guilt when
| they are shamed or happiness when praised for intelligent
| actions.
| leereeves wrote:
| > It is wholly unconcerned with the actual objects rendered
| that are the ostensible focus of the prompt.
|
| I disagree. To the extent that the training data are images of
| actual objects, recreating images of actual objects is the only
| thing DALL-E cares about.
|
| If we define "caring" about something as changing behavior to
| cause that to happen, then a neural network doesn't "care"
| about inference at all, because inference never changes the
| network's behavior.
|
| It also doesn't know or care about your approval. It only cares
| about minimizing the loss function.
|
| (But now that you bring this up, I think it would be really
| interesting to create a network that, after training initially
| on training data, began interacting with people and continued
| training to maximize approval.)
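A minimal sketch of the distinction drawn above (illustrative only,
assuming PyTorch; nothing in the thread specifies an implementation):
the network's parameters change only inside the training loop that
minimizes the loss, and inference afterwards never updates them.

    import torch
    import torch.nn as nn

    model = nn.Linear(4, 1)                      # toy stand-in for a network
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    x, y = torch.randn(8, 4), torch.randn(8, 1)  # toy training data

    # Training: the only sense in which the model "cares"; its
    # behaviour changes solely to drive the loss down.
    for _ in range(100):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

    # Inference: weights are frozen; no approval signal, no update.
    with torch.no_grad():
        prediction = model(torch.randn(1, 4))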
| mjburgess wrote:
| Images of objects aren't objects and capture only their 2D
| geometrical properties projected onto the image plane. These
| properties are correlated across images, inexplicably, since
| the machine has no access to the physical properties of those
| objects, nor the laws of physics.
| leereeves wrote:
| It's hypothesized (and, I think, likely) that the machine
| learns some of those hidden/latent properties, though,
| obviously, only to the extent that they are useful to
| recreate the images.
|
| In any case, it does at least care about the appearance of
| the actual objects. So I think it would be fair to say that
| there are aspects of the actual objects that the network
| doesn't care about, but there are also aspects that it
| cares very much about. Thus it's not "wholly unconcerned
| with the actual objects".
| MauranKilom wrote:
| > It's hypothesized (and, I think, likely) that the
| machine learns some of those hidden/latent properties,
| though, obviously, only to the extent that they are
| useful to recreate the images.
|
| Notably, this precisely describes humans too. We don't
| _know_ the "true" properties of anything we interact
| with. We just have models - some more sophisticated than
| others - but only to the extent that we care for
| reasoning about the objects. From "This stone is too
| heavy to lift." to "Oh yeah Becky is always late."
| IIAOPSW wrote:
| My gripe with this is that it has no understanding of
| what the objects are beyond "the thing that's supposed to
| be output". It's a textbook case of the Chinese room
| argument [1]. Sure it can follow translation rules, even
| infer some sophisticated rules and patterns about the
| slips of paper that are passed in, but it still has no
| idea what any of it means.
|
| https://en.wikipedia.org/wiki/Chinese_room
|
| Dalle is very concerned with the 2d shapes of things we
| call objects, and has correctly inferred some rules about
| those shapes, but it neither knows nor cares about the
| things we call objects and how the shapes it has learned
| are representations of them. It doesn't do reasoning
| about the round peg fitting in the round hole. It just
| glues a few pegs and holes together in a way that's
| mostly consistent with our expectations and says "is this
| what you wanted?"
|
| It's a student that cares about passing the test, not
| learning the material.
| leereeves wrote:
| Isn't that true of all of us? We all care selectively
| about certain attributes of objects.
|
| I care that my car is quiet and has comfortable seats, I
| don't care (or know) what material the muffler is made
| of, but somewhere there is an engineer who cared about
| that.
|
| A road designer cares what the tires are made of and how
| much it weighs, but doesn't care what color the paint is.
|
| An AI recreating an image of my car would care what color
| the paint is, but not how comfortable the seats are.
|
| I think I see what you're describing - the AI has a very
| limited scope and doesn't know or care about most of the
| things we do - but I think that's just a limitation of
| our current small models and limited training data, not
| an inherent limitation of neural networks.
| IIAOPSW wrote:
| You may not know what material the muffler is made of,
| but you know its made of a material. You "care about the
| objects themselves" in the sense of having this framework
| of objects, all of which have certain properties and
| relations to each other which form a predictive model you
| have about the world. You expect everything to have a
| position in x,y,z, to have a size, shape, mass, hardness
| etc. You might not care about a particular property of a
| particular object in a particular context, but you
| fundamentally think in the semantics of objects. You know
| what to expect when the heavy, hard object meets the
| brittle, light object.
|
| DallE doesn't really have a system of the semantics of
| objects. It doesn't know why it would be useless for a
| building to have all the doors on the 2nd level. It's not
| even clear that DallE makes use of discrete "objects" in
| its limited understanding.
|
| Here's an example from budget DallE
|
| https://ibb.co/B2B5zky
|
| It understood the shape of "stick figure" and "boobs",
| but had no understanding of what a stick figure is meant
| to represent and thus where it should place the boobs.
| The results are hilarious. I'm not sure which I like
| more, the guy with a boob casually walking down his arm,
| or the lady with a boob head that's shrugging with
| uncertainty.
| icehawk wrote:
| Now explain how that's different than anything you look at.
|
| - Your visual system only has access to the 2D geometrical
| properties projected on your retina. The properties are
| correlated across images, inexplicably. (I certainly cannot
| explain what a chair is, in a fashion that includes all
| chairs I've encountered and excludes anything I've
| encountered that is not a chair)
|
| - Any other interaction is also a correlation.
|
| - Humans don't have access to the laws of physics, just
| reasonable approximations in certain contexts.
| Barrin92 wrote:
| >Now explain how that's different than anything you look
| at.
|
| For starters, you don't just look at things; you're
| embedded in the world. You have sensory input far beyond
| visual information, and you also have something akin to
| cybernetic feedback in response to your mechanical
| actions in the world, which DALL-E does not.
|
| In fact DALL-E doesn't even have access to visual
| information in the same sense you have, which is to a
| large extent biochemical and analog, not digital.
| Earw0rm wrote:
| Where this gets interesting is when the AI attempts to
| create animations or moving images - this requires a much
| deeper understanding of objects and their physical
| properties, even to create a highly stylised or "drawn"
| animation.
| burlesona wrote:
| Slight side tangent but reading this article it hit me how much
| this generation of work may be reinforcing English as the global
| language for generations to come. It seems like we are headed
| towards a phase of technology where learning how to feed well-
| crafted prompts into the AI system will be a highly valuable
| productivity skill. And since the major AI systems seem to be
| built around English, that would make English fluency even more
| valuable than it already is. I'm sure that's obvious to non-
| native speakers who have worked hard to master English; I just
| hadn't thought of it before.
|
| Less likely but still interesting, I wonder if the way we're
| building these models will at some point begin to layer on top of
| each other such that English as it is used _now_ becomes
| something deeply embedded in AI, and whether that will evolve
| with the spoken language or not. It's funny to imagine a future
| where people would need to master an archaic flavor of English to
| get the best results working with their AI helpers.
| affgrff2 wrote:
| But the same work is also removing language barriers, with
| really good translation tools. I rather guess being fluent in
| English will not be as important as it is now.
|
| Edit: the same work = transformer based language models
| Ajedi32 wrote:
| Perhaps, but another possibility is that the more advanced
| models all end up being polyglots. The state of the art in
| machine translation already uses a single model trained on
| multiple languages[1], which results in better translations
| between languages it doesn't have a lot of examples for. If the
| same principle applies to other types of models, then training
| them on every possible dataset available regardless of language
| might yield better results. That could result in models that
| are fluent in hundreds of languages. (I'd be curious as to
| whether DALL-E understands prompts in languages other than
| English, has anyone tried?)
|
| [1]: https://ai.googleblog.com/2019/10/exploring-massively-
| multil...
| visarga wrote:
| BLOOM is the most recent model trained on multi-lingual data
| - 46 natural languages and 13 programming languages.
| Workaccount2 wrote:
| I envision the skill of giving good AI prompts to be as short
| lived as the skill of T9 texting. Probably even shorter.
| TillE wrote:
| Right, I would expect the rise of a bunch of domain-specific
| tools within the next few years.
| planede wrote:
| I wonder if these models would perform better if they were
| trained on another language.
| qayxc wrote:
| I wonder the same thing. Many languages are far better at
| making precise statements due to stricter and more complex
| grammar. The reduction in ambiguity could help models perform
| better.
|
| Then there's also the problem of labelling training data.
| Most of the labelling and annotating is outsourced to
| countries with cheap labour and performed by non-native
| speakers, which leads to problems with mis-labelled training
| data.
| visarga wrote:
| DallE is trained on web-scraped data; there is no manual
| labelling taking place except maybe to filter out some
| types of content.
|
| A few years ago we were training captioning models on
| manually labelled datasets (such as COCO captions), but they
| were small and the models were not too general.
| ffhhj wrote:
| > reinforcing English as the global language
|
| Which might also result in new speakers modifying English to
| their cultures (like Blade Runner's Cityspeak), and Global-
| English speakers not understanding "secret" foreign
| communication, so they might create new languages for their own
| subcultures, eventually relegating English to the role of a new
| Latin for technical knowledge (as Latin was kept by the Catholic
| Church).
| mtlmtlmtlmtl wrote:
| Also worth noting that the internet has massively accelerated
| the importance of English already.
|
| As an ESL speaker who grew up on the internet, Norwegian was
| more or less useless to me outside school and family. Most of
| my time was spent on the internet, reading and writing lots of
| English. Norwegian wikipedia is pretty much useless unless you
| don't know English. That's still true today for the vast
| majority of articles, but back then it was universally the case.
|
| There were Norwegian forums, but with a population of just 4
| million and change at the time, they were never as interesting
| or active as international/American forums and IRC channels.
|
| In fact I'd say Norwegian is only my native language in spoken
| form, whereas English feels more natural to me to write and
| read. Doesn't help that Norwegian has two very divergent
| written forms, either.
|
| I even write my private notes in English, even though I will be
| the only one reading them.
| bearjaws wrote:
| Covid and remote work are going to make this even more true. At
| my new job I work with a Nigerian co-worker who lives in
| Nigeria, he speaks clear & proper English and earns an amazing
| salary for his area.
|
| If you live in a third world country, you could really benefit
| from remote work going forward and English will be a popular
| language to learn for that. That being said, I know some people
| will 'phone it in' and not speak as clearly, which will put
| them at a disadvantage.
| TheRealPomax wrote:
| "Yes". And that's why question headlines may not technically be
| clickbait, but still qualify as something the world could do
| without.
| prometheus76 wrote:
| I want to see two DALL-E bots talk to each other. First bot:
| "Pete and Repeat were sitting on a fence. Pete fell off. Who was
| left?" Second bot: "Repeat" First bot: "Pete and Repeat were
| sitting on a fence. Pete fell off. Who was left?"
|
| What happens next?
| mkl wrote:
| I think you're thinking of GPT-3. DALL-E and DALL-E 2 generate
| images, not text (and they're very bad at images of text).
| prometheus76 wrote:
| Ah I'm sorry. Well, the curiosity remains!
| xornox wrote:
| At least there is a lot of data missing for human-like
| intelligence. If a human were trained on the same input data, I
| think he/she would not understand the relationships either.
___________________________________________________________________
(page generated 2022-08-04 23:00 UTC)