[HN Gopher] What are embeddings?
       ___________________________________________________________________
        
       What are embeddings?
        
       Author : Anon84
       Score  : 132 points
       Date   : 2023-06-25 16:27 UTC (6 hours ago)
        
 (HTM) web link (vickiboykis.com)
 (TXT) w3m dump (vickiboykis.com)
        
       | cubefox wrote:
       | An array of floats (an n-dimensional vector) which represents
       | some piece of data like a text or an image. Different embeddings
       | can be more or less close to each other, and this closeness
       | indicates similarity.
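        | 
        | For instance, a minimal sketch of that closeness measure
        | (cosine similarity; the two "embeddings" here are made-up
        | NumPy vectors, not the output of a real model):
        | 
        |     import numpy as np
        | 
        |     # two made-up embeddings for the words "cat" and "dog"
        |     cat = np.array([0.9, 0.1, 0.4])
        |     dog = np.array([0.8, 0.2, 0.5])
        | 
        |     def cosine(a, b):
        |         # 1.0 = same direction, 0.0 = unrelated, -1.0 = opposite
        |         return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        | 
        |     print(cosine(cat, dog))  # close to 1.0, i.e. "similar"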
        
       | KRAKRISMOTT wrote:
        | OP, you need to make it clear that it's a book. The website is
       | confusing.
        
         | gtirloni wrote:
         | There is a big green button that says "Get PDF" when I visit
         | it.
        
       | charcircuit wrote:
        | A mapping whose codomain is an N-dimensional space.
        
         | moralestapia wrote:
         | Don't know why you're downvoted since you're absolutely
         | correct.
         | 
         | They're just locality-sensitive hash functions.
        
           | corobo wrote:
           | I'd imagine it's because it's the title of the thing being
           | linked to, not an actual question to be answered.
           | 
            | Farting out a quick one-line answer is a boring comment.
        
           | layer8 wrote:
           | Those seem like two very different definitions.
        
              | pizza wrote:
              | For an LSH:
              | 
              |     similarity(x, y) < thresh
              |         =>  E[P(d(lsh(x), lsh(y)) < 1)] > 1 - eps
              | 
              | for some eps.
              | 
              | For a model that learns a good representation, and some
              | suitable distance function d to measure distances between
              | embeddings (e.g. Euclidean):
              | 
              |     similarity(x, y) < thresh
              |         =>  E[P(d(emb(x), emb(y)) < r)] > 1 - eps
              | 
              | for some cutoff distance r in the vector space.
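              | 
              | A minimal sketch of the LSH side of this (random-
              | hyperplane hashing; the dimensions and the number of
              | planes are arbitrary):
              | 
              |     import numpy as np
              | 
              |     rng = np.random.default_rng(0)
              |     planes = rng.normal(size=(16, 3))  # 16 random hyperplanes
              | 
              |     def lsh(v):
              |         # sign of the projection onto each plane -> 16-bit code
              |         return (planes @ v) > 0
              | 
              |     x = rng.normal(size=3)
              |     y = x + 0.01 * rng.normal(size=3)  # a near-duplicate of x
              | 
              |     # similar inputs land in (almost) the same bucket:
              |     print((lsh(x) != lsh(y)).sum())    # small Hamming distance
              |     print((lsh(x) != lsh(-x)).sum())   # large Hamming distance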
        
       | neonate wrote:
       | Paper:
       | https://github.com/veekaybee/what_are_embeddings/blob/main/e...
        
       | [deleted]
        
       | sp332 wrote:
       | I'd like to find a way to start with an embedding and have the
       | computer generate some text that corresponds, at least
       | approximately. There are tools that do that for images, right?
       | Like Stable Diffusion, you can put an image in, get an embedding,
       | then do gradient descent in latent space to find a new embedding,
       | then generate a new image from that.
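        | 
        | The image loop described above might be sketched roughly like
        | this (encode/decode are stand-in linear layers, not a real
        | model, and the loss is just "match the target embedding"):
        | 
        |     import torch
        | 
        |     encode = torch.nn.Linear(64, 16)   # stand-in: image -> embedding
        |     decode = torch.nn.Linear(16, 64)   # stand-in: embedding -> image
        | 
        |     target = encode(torch.randn(64)).detach()  # embedding of an input
        |     z = torch.zeros(16, requires_grad=True)    # latent to optimize
        |     opt = torch.optim.Adam([z], lr=0.1)
        | 
        |     for _ in range(200):
        |         opt.zero_grad()
        |         loss = torch.nn.functional.mse_loss(encode(decode(z)), target)
        |         loss.backward()
        |         opt.step()
        | 
        |     new_image = decode(z)  # decodes to something whose embedding ~ target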
        
         | kreeben wrote:
          | The most basic embedding (that I can think of) is one where
          | the number of dimensions corresponds to the number of unique
          | characters in your lexicon. If one of the components of the
          | embedding is greater than 0, you know that character occurs
          | in the text. This embedding does not encode the order of the
          | characters, though; it is just a "bag of characters". If you
          | were to also encode the order of the characters in, say, a
          | second embedding, you could use those two embeddings to
          | recreate the original word.
         | 
         | Combine the two embeddings into a new vector space and BAM,
         | you've invented "embedding2word".
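          | 
          | A minimal sketch of those two embeddings (toy lexicon; the
          | order encoding below is lossy for repeated characters, but
          | it shows the idea):
          | 
          |     lexicon = "adehlorw"
          |     index = {c: i for i, c in enumerate(lexicon)}
          | 
          |     def bag_of_chars(word):
          |         # one dimension per lexicon character; values are counts
          |         v = [0] * len(lexicon)
          |         for c in word:
          |             v[index[c]] += 1
          |         return v
          | 
          |     def char_order(word):
          |         # a second embedding: last position of each character
          |         v = [-1] * len(lexicon)
          |         for pos, c in enumerate(word):
          |             v[index[c]] = pos
          |         return v
          | 
          |     print(bag_of_chars("hello"))  # which characters, not their order
          |     print(char_order("hello"))    # puts (most of) the order back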
        
           | jorlow wrote:
            | GPT (and many others) just adds these embeddings together
            | in the model, so you could do that and have one vector that
            | encodes both things at once.
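            | 
            | A minimal sketch of that addition (made-up sizes, with
            | random matrices standing in for learned token and position
            | embeddings):
            | 
            |     import numpy as np
            | 
            |     rng = np.random.default_rng(0)
            |     vocab, length, dim = 100, 8, 32
            |     tok_emb = rng.normal(size=(vocab, dim))   # one row per token id
            |     pos_emb = rng.normal(size=(length, dim))  # one row per position
            | 
            |     tokens = [5, 17, 42]
            |     # each input vector now encodes "which token" and "where"
            |     x = np.stack([tok_emb[t] + pos_emb[i]
            |                   for i, t in enumerate(tokens)])
            |     print(x.shape)  # (3, 32)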
        
         | freeone3000 wrote:
         | You can get this fairly trivially with word and sentence
         | embeddings just by running the inverse (huggingface models have
         | this as a builtin). For llama, the same is possible, but the
         | matrix transpose is your responsibility :)
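          | 
          | One way to sketch the "nearest token" route with an off-the-
          | shelf model (GPT-2 here only because it is small; "inverse"
          | below just means the closest row of the input embedding
          | matrix, not a specific built-in):
          | 
          |     import torch.nn.functional as F
          |     from transformers import AutoModel, AutoTokenizer
          | 
          |     tok = AutoTokenizer.from_pretrained("gpt2")
          |     model = AutoModel.from_pretrained("gpt2")
          | 
          |     E = model.get_input_embeddings().weight  # (vocab, dim)
          |     emb = E[tok.encode(" embedding")[0]]     # a vector to invert
          | 
          |     # "inverse" = most similar row of the embedding matrix
          |     scores = F.cosine_similarity(E, emb.unsqueeze(0))
          |     print(tok.decode([scores.argmax().item()]))  # the token we started from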
        
           | sp332 wrote:
           | Ok, I got annoyed at oobabooga (around when it first came
           | out) and have been messing with llama.cpp since then. I can't
           | tell if this feature request is the same thing we're talking
           | about here?
           | https://github.com/ggerganov/llama.cpp/issues/1552 I guess if
           | I want more features, I should move on to something that has
           | more features lol.
        
             | TeMPOraL wrote:
              | Looking at that issue, I wonder how it is that everyone
              | seemed not to understand the poster's question.
             | 
             | The feature itself is something I wanted to play with too,
             | as it's kind of an obvious thing to want. I mean, these
             | models execute a pipeline:
             | 
             | [text] -> [tokens] -> <[embeddings] -> [inference] ->
             | [embeddings]> -> [tokens] -> [text]
             | 
             | Where the part in < ... > may or may not be implemented as
             | a single step (i.e. all three parts interleaved).
             | 
              | Now, apparently all the magic (not the "how a transformer
              | works", but the "how the hell are they this good" / "GPT-4
              | is uncanny valley" kind of magic) of transformer models
              | sits in the latent space and is invoked by the < ... > bit.
             | We also know for sure that you can make the pipeline look
             | like this:
             | 
             | [text] -> [tokens] -> <[embeddings] -> [inference]> ->
             | [embeddings]
             | 
             | So with the two things in mind, it's kind of obvious you'd
             | also want a pipe that looks like:
             | 
             | [embeddings] -> [inference] -> [embeddings] (and optionally
             | -> [tokens] -> [text])
             | 
             | for the sole purpose of messing around and exploring the
             | latent space itself.
             | 
             | I'm very much not up to date with the whole space, so I
             | might be missing something, but I'd thought that poking
             | around the latent space would be getting _a lot_ more
             | attention than it seems to be getting.
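              | 
              | For what it's worth, one way to get that
              | [embeddings] -> [inference] pipe with an off-the-shelf
              | stack is the inputs_embeds argument in Hugging Face
              | transformers (rough sketch, GPT-2 as a small stand-in;
              | the perturbation is arbitrary):
              | 
              |     import torch
              |     from transformers import (AutoModelForCausalLM,
              |                               AutoTokenizer)
              | 
              |     tok = AutoTokenizer.from_pretrained("gpt2")
              |     model = AutoModelForCausalLM.from_pretrained("gpt2")
              | 
              |     ids = tok("Embeddings are", return_tensors="pt").input_ids
              | 
              |     # [tokens] -> [embeddings]
              |     emb = model.get_input_embeddings()(ids)
              | 
              |     # poke around in latent space before inference
              |     emb = emb + 0.01 * torch.randn_like(emb)
              | 
              |     # [embeddings] -> [inference] -> [tokens] -> [text]
              |     out = model(inputs_embeds=emb)
              |     next_id = out.logits[0, -1].argmax().item()
              |     print(tok.decode([next_id]))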
        
         | danieldk wrote:
         | This is basically how RNN encoder/decoder architectures worked.
         | The encoder encoded the input as a single vector and the
          | decoder would decode this into text (e.g. for machine
         | translation) [1]. However, fixed-length vectors generally
         | required too much 'compression' to represent variable-length
         | text, so people started adding attention mechanisms so that the
         | decoder could also attend to the input text. And the seminal
          | Transformer paper by Vaswani and others showed that you only
          | need an attention mechanism and can ditch the RNN (hence the
          | title 'Attention Is All You Need'), and here we are.
         | 
         | So, this has been possible already for quite a long time.
         | 
         | [1] https://arxiv.org/pdf/1409.3215.pdf
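          | 
          | A minimal sketch of that encoder/decoder shape (sizes made
          | up, no attention, untrained; the encoder's final hidden
          | state is the single-vector "embedding" of the input):
          | 
          |     import torch
          |     import torch.nn as nn
          | 
          |     vocab, dim = 1000, 64
          | 
          |     class Seq2Seq(nn.Module):
          |         def __init__(self):
          |             super().__init__()
          |             self.emb = nn.Embedding(vocab, dim)
          |             self.encoder = nn.GRU(dim, dim, batch_first=True)
          |             self.decoder = nn.GRU(dim, dim, batch_first=True)
          |             self.out = nn.Linear(dim, vocab)
          | 
          |         def forward(self, src, tgt):
          |             _, h = self.encoder(self.emb(src))     # input -> one vector
          |             y, _ = self.decoder(self.emb(tgt), h)  # h seeds the decoder
          |             return self.out(y)                     # per-step logits
          | 
          |     model = Seq2Seq()
          |     src = torch.randint(0, vocab, (1, 7))  # fake source sentence
          |     tgt = torch.randint(0, vocab, (1, 5))  # fake target so far
          |     print(model(src, tgt).shape)           # torch.Size([1, 5, 1000])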
        
           | sdenton4 wrote:
            | To be sure, the seq2seq style already allowed variable-
            | length embeddings before transformers were a thing. RNN vs
           | Transformer is entirely an implementation choice; you can
           | build a seq2seq model with any combination of transformer,
           | RNN, and conventional layers.
           | 
            | Each of these layer types has different computational costs
            | for training and inference, and encodes different inductive
           | biases, which may be more or less appropriate to a given
           | problem.
        
       ___________________________________________________________________
       (page generated 2023-06-25 23:00 UTC)