[HN Gopher] Show HN: Map of YC Startups
       ___________________________________________________________________
        
       Show HN: Map of YC Startups
        
       Hey everybody! Hope you had a merry Christmas. Today I had a bit of
       fun with Claude.  Started by scraping YC's startups list, then ran
       them through OpenAI's embedding service, then UMAP'd the embedding
       to reduce the dimension to just two coordinates and then just
       forced Claude to write React that would compile to visualize that.
       I had fun and I think it's interesting, so take a look!  Also note
       that you won't be able to zoom on mobile (found out about this
       Plotly limitation way too late). If there's interest I can fix this
       issue by changing plotting libs tomorrow :)  Merry Christmas
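
The flow described above (company text, then an embedding vector, then two coordinates, then plot points) can be sketched roughly as follows. This is only a sketch of the data flow, not OP's code: the thread says the real processing was done in TypeScript, and `Company`, `build_map`, `toy_embed` and `toy_reduce` are hypothetical names. The two toy functions are loud stand-ins so the sketch runs offline; in a real run they would be swapped for an embedding service call and a UMAP implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Company:
    name: str
    description: str
    industry: str


def build_map(
    companies: list[Company],
    embed: Callable[[list[str]], list[list[float]]],
    reduce: Callable[[list[list[float]]], list[tuple[float, float]]],
) -> list[dict]:
    """Turn company records into 2-D plot points via the injected embed/reduce steps."""
    texts = [f"{c.name}: {c.description}" for c in companies]
    vectors = embed(texts)    # one high-dimensional vector per company
    coords = reduce(vectors)  # one (x, y) pair per company
    return [
        {"name": c.name, "industry": c.industry, "x": x, "y": y}
        for c, (x, y) in zip(companies, coords)
    ]


def toy_embed(texts: list[str]) -> list[list[float]]:
    # ASSUMPTION: stand-in for a real embedding model; it only counts keywords.
    keywords = ("health", "bank", "robot")
    return [[float(t.lower().count(k)) for k in keywords] for t in texts]


def toy_reduce(vectors: list[list[float]]) -> list[tuple[float, float]]:
    # ASSUMPTION: stand-in for UMAP; it just keeps the first two components.
    return [(v[0], v[1]) for v in vectors]


companies = [
    Company("Acme Health", "health records for clinics", "Healthcare"),
    Company("Ledgerly", "bank accounts for startups", "Fintech"),
]
points = build_map(companies, toy_embed, toy_reduce)
```

Each entry in `points` carries the name, industry and two coordinates, which is all a scatter plot needs.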
        
       Author : yoouareperfect
       Score  : 86 points
       Date   : 2024-12-25 22:37 UTC (1 day ago)
        
 (HTM) web link (yc-map.vercel.app)
 (TXT) w3m dump (yc-map.vercel.app)
        
       | rrr_oh_man wrote:
       | Cool concept! What are the X and Y axes?
       | 
       | Oh, and your website has an unchanged WordPress favicon...
        
         | tptacek wrote:
         | They're semi-arbitrary, dimensionally reduced from OpenAI
         | embedding vectors.
        
       | uncomplexity_ wrote:
       | hella nice mate very interesting
       | 
       | what's the x and y axes?
        
         | jerrygenser wrote:
         | they don't have meaning by themselves. they are two dimensions
         | that umap projected the original embeddings down to in order to
         | show a combination of local neighborhood similarity or closeness.
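
One way to see why the axes carry no meaning on their own: if a layout is judged only by which points end up near each other, then rotating the whole picture changes every x and y value but leaves the neighborhoods untouched. A minimal sketch with made-up points (the coordinates and the helper names `rotate` and `nearest_neighbor` are assumptions for illustration, not taken from the map):

```python
import math


def rotate(points, degrees):
    """Rotate every (x, y) point around the origin by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def nearest_neighbor(points):
    """For each point, return the index of its closest other point."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        result.append(min(others, key=lambda j: math.dist(p, points[j])))
    return result


# Made-up layout: two small clusters of two points each.
layout = [(0.0, 0.0), (1.0, 0.2), (5.0, 5.0), (5.5, 4.0)]
turned = rotate(layout, 37)

same_neighbors = nearest_neighbor(layout) == nearest_neighbor(turned)
```

Here `same_neighbors` is true even though none of the rotated coordinates match the originals, so reading anything into "left vs. right" or "up vs. down" is not justified by the layout alone.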
        
           | gavmor wrote:
           | Well, they _do_ have meaning by themselves, but it's more
           | work to figure that out. All regular, predictable
           | relationships "have" meaning because all meaning is
           | _prescribed_. And since we've captured many such
           | prescriptions in LLMs, they can do a decent job approximating
           | those.
        
       | tmshapland wrote:
       | Really neat! We were Tule, in the industrials part of the map in
       | grey.
       | 
       | There's something wonky when I zoom in on Chrome on my laptop. It
       | abruptly shifts to another part of the map.
        
       | Liftyee wrote:
       | Cool project, but missed opportunity to name the arbitrary
       | dimensions Y and C...
        
         | Bilal_io wrote:
         | OP made the change
        
         | yoouareperfect wrote:
         | haha awesome, shipped!
        
           | ProofHouse wrote:
           | I figure why not plot them with an X and Y (Y,C) of some sort
        
         | lovestory wrote:
         | My dumb ass was trying to figure out what each dimension meant
        
           | tptacek wrote:
           | That doesn't make you dumb; there is no intuitive meaning for
           | the axes chosen; you can think of them, roughly, as
           | statistically chosen to maximize clustering.
        
             | bravura wrote:
             | Statistically chosen to maximize *some particular loss
             | measure, which in this case might be the t-SNE or UMAP
             | criterion, and is computed only globally and not for
             | different filters.
        
               | tptacek wrote:
               | Right (I mean, I'm saying "right" but really I should
               | just say "I'm taking your word for it"), but even more
               | fundamentally this is dimensionality reduction _from an
               | OpenAI embedding vector_, which seems almost like the
               | asymptotic limit of inscrutability.
        
           | alex-knyaz wrote:
           | same
        
       | jb1991 wrote:
       | Filters are unreadable on mobile.
        
         | yoouareperfect wrote:
         | should be fixed now, thanks!
        
       | rl_for_energy wrote:
       | It'd be nice to just see the name of the company on click instead
       | of going to the website (I'm on mobile). Trying to find our
       | company
        
       | welder wrote:
       | Company status isn't up to date... I know there's more than 1
       | public company that went through YC.
        
         | yoouareperfect wrote:
         | Check the filters, not all batches are selected as default.
         | Only the latest ones. If you select all of them, then there are
         | many public companies
        
       | welder wrote:
       | Animated Gif of each category:
       | 
       | https://imgur.com/a/ycombinator-startups-map-iNX8k6M
        
       | kure256 wrote:
       | Love that, what are Axes Y and C?
        
         | DrawTR wrote:
         | Apparently inspired by a comment on this very post! (Above
         | yours, right now.)
         | 
         | > Cool project, but missed opportunity to name the arbitrary
         | dimensions Y and C...
        
       | gniting wrote:
       | Nice! What's the tech stack?
        
         | yoouareperfect wrote:
         | For scraping and all the processing, typescript. Embeddings:
         | openai
         | 
         | For visualizing: React (Next.js) + Plotly (though the lack of
         | mobile zoom makes me question if I should change it)
        
       | zild3d wrote:
       | fun, though I also got stuck on what the Y and C axes represent
       | initially. IMO just hide the axes altogether, since the goal is
       | just some visual clustering/similarity
        
         | skeeter2020 wrote:
         | Maybe I'm slow, but clustering on what dimension? The lack of
         | axes and labeling makes it pretty confusing to me, but I'm a
         | dinosaur.
         | 
         | Visuals that are not self-explanatory make me feel dumb.
        
           | gavmor wrote:
           | We don't know what to label those features/dimensions,
           | because they're a reduction from higher dimensions that we
           | also didn't bother to interrogate.
           | 
           | It's possible to figure them out. I wish OP would.
        
             | yoouareperfect wrote:
             | OP here, is there a way to figure that out?
        
               | gavmor wrote:
               | (Not OP) I can think of a convoluted and expensive pair-
               | wise comparison method, but I hope there's also a way to
               | figure this out during the application of principal
               | component analysis in a way I don't understand.
               | 
               | Edit: I'm thinking it can't be done without
               | experimentation on the embedding model.
               | 
               | Edit2: Ah, even that might not yield results, because as
               | the basis is derived interstitially through computation,
               | there's no guarantee the features of the final coordinate
               | system will have any accessible relationship to those of
               | the initial basis.
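
A cheaper, weaker probe than the ones discussed above: rather than trying to recover what an axis "means", check whether metadata you already have (for example the industry tag) happens to line up with it. The sketch below correlates each axis with a 0/1 indicator per label. The coordinates, labels and the function name `axis_label_correlations` are made up for illustration, and a near-zero correlation is a perfectly plausible outcome on real data, since nothing forces a nonlinear projection to align with any label.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def axis_label_correlations(coords, labels):
    """Map each label to (correlation with axis 0, correlation with axis 1)."""
    out = {}
    for label in sorted(set(labels)):
        indicator = [1.0 if l == label else 0.0 for l in labels]
        out[label] = tuple(
            pearson([c[axis] for c in coords], indicator) for axis in (0, 1)
        )
    return out


# Made-up 2-D coordinates and industry tags.
coords = [(-2.0, 0.1), (-1.8, -0.1), (1.9, 0.1), (2.1, -0.1)]
labels = ["Fintech", "Fintech", "Healthcare", "Healthcare"]
result = axis_label_correlations(coords, labels)
```

In this toy data the first axis separates the two tags almost perfectly and the second axis is unrelated to them; on a real map the honest answer may be that no single tag explains either axis.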
        
       | woodylondon wrote:
       | Really nice to see - also, it would be great when filtering if
       | there was a tabular view at the bottom as well.
        
       | ksec wrote:
       | I didn't know YC does Government, Healthcare, Industrials, Real
       | Estate and Construction. All these are great sectors and never
       | made the headline.
        
       | paxys wrote:
       | There's no need to include an X & Y axis, labels and gridlines if
       | they all have no meaning. A simple cluster diagram is enough.
        
         | ascorbic wrote:
         | I agree it would be less confusing if they weren't there. I'm
         | sure I'm not alone in spending some time trying to work out
         | what the axes were.
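
For what the suggestion above might look like in practice: Plotly axes have a `visible` attribute, and setting it to false hides the axis line, ticks, labels and gridlines together. A minimal sketch that builds such a layout as a plain dictionary and serializes it to JSON (the function name `axis_free_layout` is hypothetical, and how OP's React code actually passes its layout to Plotly is not shown in the thread):

```python
import json


def axis_free_layout(title):
    """Build a Plotly-style layout dict with both axes hidden."""
    hidden_axis = {"visible": False}  # hides line, ticks, labels and grid
    return {
        "title": {"text": title},
        "xaxis": dict(hidden_axis),
        "yaxis": dict(hidden_axis),
        "hovermode": "closest",  # hovering still identifies the nearest point
    }


layout_json = json.dumps(axis_free_layout("Map of YC Startups"))
```

The resulting JSON object could be handed to a Plotly chart as its layout, leaving only the colored clusters on screen.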
        
       | k-i-r-t-h-i wrote:
       | This is awesome! Are you able to also add F24?
        
       | mring33621 wrote:
       | i'd like a filter by target market (US, EU, APAC...)
        
         | yoouareperfect wrote:
         | Coming for v2
        
       | crush_robo_1536 wrote:
       | Love this! It'd be interesting if someone builds this but adds more
       | dimensions (similar to Company status) to it that you can query
       | or group by. For example, if I look at S21 and W21 batches, then
       | it'd be nice to know things like -
       | 
       | 1. How many of these companies made it to series A, series B, etc
       | 
       | 2. How many of these companies have > x employees (where x can be
       | 5, 10, 20, etc)
       | 
       | 3. How many of these companies had a founder that moved on to
       | something else
       | 
       | This does require a lot more intelligent data scraping or manual
       | data collection though.
        
       ___________________________________________________________________
       (page generated 2024-12-26 23:01 UTC)