[HN Gopher] Show HN: Postgres as a VectorDB GUI
       ___________________________________________________________________
        
       Show HN: Postgres as a VectorDB GUI
        
       Author : z-gort
       Score  : 148 points
       Date   : 2024-12-19 02:28 UTC (20 hours ago)
        
 (HTM) web link (github.com)
        
       | z-gort wrote:
       | lmk if anyone has any thoughts... if I could go back, I might
       | not have gone with Electron.
       | 
       | Doing dimensionality reduction locally posed a few challenges in
       | terms of application size--the idea was that by analyzing just a
       | few thousand randomly sampled points you can get an idea of your
       | data through a local GUI where you interact with your data and
       | see some correlated metadata.
       | 
       | Not sure if there's too much need for an individual GUI to go
       | along with Postgres as a VectorDB, maybe people just do analysis
       | separate from a normal "GUI"? But maybe not.
       | 
       | What do you think?
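
A minimal sketch of the sampling idea described above (reduce only a few thousand randomly chosen points instead of the whole table). It uses reservoir sampling, which keeps a uniform random sample of fixed size from a stream of rows of unknown length. This is an assumption about one way to do it, not the project's confirmed method; the function name and the sample size of 2000 are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items drawn uniformly at random from a stream."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Row i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


# Hypothetical usage: the range stands in for rows streamed from a cursor.
picked = reservoir_sample(range(100_000), k=2000, seed=0)
print(len(picked))
```

On the database side, `ORDER BY random() LIMIT 2000` or `TABLESAMPLE` are alternatives; they trade scan cost against how uniform the sample is.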
        
         | maxchehab wrote:
         | Just some quick feedback: I can't copy and paste into the
         | connection URL input form on a Mac.
         | 
         | Once loaded, I get the error "Table must contain a UUID column
         | for vector visualization."
         | 
         | I'm assuming it's trying to find an ID column for grouping? Can
         | we manually specify this? My ID columns are varchars.
        
           | garybake wrote:
           | Same here. I'm using LangChain, which creates a varchar id
           | column. It also stores different collections in the same table.
        
       | thangngoc89 wrote:
       | As a non-native English speaker who is not very familiar with
       | vector databases, I find the title very ambiguous. I read it as
       | "Postgres as a GUI for some VectorDB". Upon closer inspection, I
       | realized that "Postgres as a VectorDB" is the full name. Maybe
       | shorten that to something else. Just my 2 cents.
        
         | colechristensen wrote:
         | It's just plain bad grammar; the title should be
         | 
         | "Show HN: Reservoirs Lab, a Postgres VectorDB GUI"
        
           | monsieurbanana wrote:
           | I think the confusing term is "VectorDB" which sounds like a
           | name of an existing product. "A vector db GUI powered by
           | Postgres"?
        
       | wenc wrote:
       | This is good, but it would also help to mention that you're
       | using UMAP for dimensionality reduction with the cosine metric.
       | 
       | https://github.com/Z-Gort/Reservoirs-Lab/blob/main/src/elect...
       | 
       | Dimensionality reduction from n >> 2 dimensions to 2 dimensions
       | can be very fickle, so the hyperparameters matter. Your
       | visualization can change significantly depending on the choice
       | of metric.
       | 
       | https://umap-learn.readthedocs.io/en/latest/parameters.html
       | 
       | You may want to consider projecting to more than 2 dimensions
       | too. You may ask, how does one visualize more than two
       | dimensions? Through a scatterplot matrix of 2 axes at a time.
       | 
       | https://seaborn.pydata.org/examples/scatterplot_matrix.html
       | 
       | These are used for PCA-type multivariate analyses to visualize
       | latent variables in higher dimensions than 2, but 2 dimensions at
       | a time. Some clustering behavior that cannot be seen in 2 axes
       | might be seen in higher dimensions. We used to do this in our
       | lab to find anomalies in high dimensions.
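
A rough sketch of the "two axes at a time" idea from the comment above: project to more than two components, then enumerate every pair of axes as one panel of a scatterplot matrix. To keep it dependency-light, plain PCA via NumPy's SVD is swapped in for UMAP here, and the data is synthetic; the function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def pca_project(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project rows of X onto their top principal components (stand-in for UMAP)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def axis_pairs(n_components: int):
    """All unordered pairs of component indices, one per scatterplot panel."""
    return list(combinations(range(n_components), 2))


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))  # synthetic stand-in for real embeddings
Z = pca_project(X, 4)

# Each panel holds the two columns to plot against each other.
panels = {(i, j): Z[:, [i, j]] for i, j in axis_pairs(4)}
print(sorted(panels))
```

With four components this yields six panels. A plotting helper such as seaborn's `pairplot` can draw the same grid directly from a DataFrame of the projected columns.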
        
         | isoprophlex wrote:
         | About fickleness... indeed, I've found this kind of problematic
         | when running large-d text embeddings through UMAP: it always
         | comes out as a spherical blob, without any obvious segregation
         | in the low-d projected space.
         | 
         | IMO it's very difficult to make a "fire and forget" embedding
         | interpreter. Maybe I never found the right parameters to umap
         | but the results of running it (or any dimension reduction algo)
         | always left me a bit underwhelmed.
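
One way to hunt for "the right parameters" is a small sweep over the UMAP settings the linked parameter docs discuss (`n_neighbors`, `min_dist`, `metric`). The sketch below only assumes that umap-learn is installed if `run_sweep` is actually called; the grid values are arbitrary examples, and the helper names are hypothetical.

```python
from itertools import product


def umap_param_grid(
    n_neighbors=(5, 15, 50),
    min_dist=(0.0, 0.1, 0.5),
    metrics=("cosine", "euclidean"),
):
    """Build every combination of the given UMAP settings."""
    return [
        {"n_neighbors": n, "min_dist": d, "metric": m}
        for n, d, m in product(n_neighbors, min_dist, metrics)
    ]


def run_sweep(X, grid, random_state=42):
    """Fit one 2-D UMAP projection per parameter set (requires umap-learn)."""
    import umap  # imported lazily so the grid helper works without it

    results = []
    for params in grid:
        reducer = umap.UMAP(n_components=2, random_state=random_state, **params)
        results.append((params, reducer.fit_transform(X)))
    return results


grid = umap_param_grid()
print(len(grid), grid[0])
```

Plotting the resulting projections side by side shows how much the layout depends on each setting. It does not guarantee that any of them will separate the data.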
        
           | antman wrote:
           | Have you tried PaCMAP? It should be better and faster
        
             | wenc wrote:
             | Thanks for the pointer to PaCMAP.
             | 
             | I just tried it. My verdict?
             | 
             | PaCMAP >= UMAP >> t-SNE.
             | 
             | UMAP captures the basic pattern but PaCMAP makes it
             | crisper.
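
"Crisper" is a visual judgment. One possible way to put a number next to it is to check how many of each point's nearest neighbors in the original space remain neighbors in the 2-D projection. The sketch below is an assumed metric, not the one used in the comment above; it uses Euclidean distance for simplicity (the thread's setup uses cosine), and the function names are hypothetical.

```python
import numpy as np


def knn_indices(X: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of each row (Euclidean, excluding self)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def neighbor_preservation(X_high: np.ndarray, X_low: np.ndarray, k: int = 10) -> float:
    """Mean fraction of high-dimensional k-NN that survive in the projection."""
    hi = knn_indices(X_high, k)
    lo = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(hi, lo)]
    return float(np.mean(overlap)) / k
```

Scores range from 0 to 1. Computing the score for each method's output on the same sample gives a rough basis for comparing PaCMAP, UMAP, and t-SNE. It only measures local structure, so it complements a visual check and does not replace it.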
        
       | ddtaylor wrote:
       | Does this use pgVector?
        
         | z-gort wrote:
         | It lets you visualize any column with type "EMBEDDING", and I
         | think the only way to get that is through
         | pgvector/pgvectorscale.
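
For context, pgvector declares such columns as `vector(n)` and renders a value as text in the form `[1,2,3]`. A minimal sketch of turning that text form into floats on the client side follows; the function name is hypothetical, and the query in the comment uses made-up table and column names.

```python
from typing import List


def parse_pgvector_text(value: str) -> List[float]:
    """Parse pgvector's text form, e.g. "[0.1,0.2,0.3]", into a list of floats."""
    body = value.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError(f"not a pgvector text literal: {value!r}")
    # float() tolerates surrounding spaces, so "[1, 2]" also parses.
    return [float(part) for part in body[1:-1].split(",")]


# Hypothetical query producing such strings:
#   SELECT id, embedding::text FROM items ORDER BY random() LIMIT 2000;
print(parse_pgvector_text("[1,2,3]"))
```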
        
       | gregncheese wrote:
       | I have yet to find a better tool than the old Tensorflow
       | projector: https://projector.tensorflow.org/
       | 
       | Granted, it requires preparing your data as TSV files first.
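
A small sketch of the TSV preparation step mentioned above. It assumes the layout the projector's loader is generally described as expecting: one vector per line with tab-separated values and no header, plus a metadata file with one label per line (no header when there is a single column). The function name is hypothetical.

```python
import io
from typing import Iterable, Sequence, TextIO


def write_projector_tsv(
    vectors: Iterable[Sequence[float]],
    labels: Iterable[str],
    vec_out: TextIO,
    meta_out: TextIO,
) -> int:
    """Write vectors and single-column metadata as TSV; return the row count."""
    count = 0
    for vec, label in zip(vectors, labels):
        vec_out.write("\t".join(repr(float(x)) for x in vec) + "\n")
        # Tabs or newlines inside a label would break the row layout.
        meta_out.write(str(label).replace("\t", " ").replace("\n", " ") + "\n")
        count += 1
    return count


# In-memory demo; real use would pass open("vectors.tsv", "w") and so on.
vec_buf, meta_buf = io.StringIO(), io.StringIO()
rows = write_projector_tsv([[0.1, 0.2], [0.3, 0.4]], ["cat", "dog"], vec_buf, meta_buf)
print(rows)
```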
        
         | wenc wrote:
         | That is indeed an excellent tool. It lets you dynamically
         | adjust and recompute UMAP and t-SNE.
        
       | redwood wrote:
       | Have folks seen https://atlas.nomic.ai/ <-- absolutely beautiful
       | vector visualization
        
         | dcreater wrote:
         | A proprietary hosted solution that stands to gain as I uncover
         | insights in my data? Hard pass.
        
       | samanthasu wrote:
       | That is an excellent visualization!
        
       | dmezzetti wrote:
       | Very interesting, thanks for sharing!
        
       | paddy_m wrote:
       | README suggestions:
       | 
       | Put the animated gif at the top
       | 
       | Add subtitles to the gif explaining what you're doing.
        
         | dcreater wrote:
         | If I had a nickel for every GUI/viz tool that buries the
         | image/video or straight up doesn't have one in the readme...
         | It lends credence to the popular opinion that engineers don't
         | know how to communicate.
        
       ___________________________________________________________________
       (page generated 2024-12-19 23:01 UTC)