[HN Gopher] Launch HN: CodeViz (YC S24) - Visual maps of your co...
       ___________________________________________________________________
        
       Launch HN: CodeViz (YC S24) - Visual maps of your codebase in VS
       Code
        
       Hey HN -- we're Liam and Will from CodeViz (https://codeviz.ai).
       We're building a VS Code extension that generates interactive
       diagrams of codebases, from system architecture down to function
       call graphs. Here's a demo where we analyze OpenHands, uv, and
       webviz: https://www.youtube.com/watch?v=fgfDXUtWzRk.  The extension
       is public if you want to try it on your own repos:
       https://marketplace.visualstudio.com/items?itemName=CodeViz....
       Will and I started CodeViz because we wanted more intuitive
       representations of software. During our time at Tesla, we
       encountered a common problem: software engineers spend very little
       time actually typing code. Most development time was spent
       navigating convoluted files and building a mental map for each
       task. At the same time, whiteboard sessions were proof that code
       could be expressed intuitively.  We started with autogenerated
       technical documentation. Of course, long markdown docs are not a
       good solution for long files of code. We realized we needed
       diagrams that (a) help grasp large quantities of code and (b) can
       be filtered according to the developer's task. So, we built a
       graph-based VS Code extension. It generates diagrams directly
       within VS Code, illustrating connections between functions and
       providing overviews of system architecture. These visualizations
       update as code changes.  CodeViz appears as a side panel in VS Code
       with two views:  (1) Call graph: as you click on functions, we show
       a chain of upstream and downstream references. You can navigate
       your codebase using the call stack and see, in one view, everywhere
       your functions are called. We generate this call graph using the
       language servers developers have already installed in VS Code  (2)
       Architecture diagram: we create a C4 diagram of your system, so you
       can see a top-level view of your codebase and click into the
       component layer. We were surprised to find that a small fraction of
       code can generate a very accurate representation of the system. We
       detect these important files, then use LLMs to build nested
       architecture diagrams at the container and component level
       Developers are mainly using our extension to navigate spaghetti
       code, onboard new devs, and interpret open source repos. We're
       still figuring out our pricing. Currently, we offer basic features
       for free, with a paid tier for more resource-intensive tools like
       detailed architecture diagrams. Open to suggestions on this
       approach.  CodeViz is in active development and our main focus over
       the next couple of days is to make the call graph much easier to
       view and navigate. We're continuously working to make it better, so
       your honest feedback, suggestions, and wishes would be very
       helpful. Looking forward to hearing any and all thoughts, whether
       about the current extension, general problem space, or something
       else!
        
       Author : LiamPrevelige
       Score  : 92 points
       Date   : 2024-08-29 17:50 UTC (5 hours ago)
        
       | pthangeda wrote:
       | Congratulations on the launch - this looks great and I've been
       | waiting for years for something like this! As a researcher who
       | mostly uses Python, and explores/navigates a large number of
       | repos for a short time, often written by other researchers not
       | necessarily trained in software best practices, I was always
       | frustrated (and surprised) that there was no VS Code extension or
       | tool that gave me a quick overview/visualization to get a high
       | level gist of different modules and code/data flow!
       | 
       | I tried this with a bunch of small open-source repos and it works
       | great! I imagine using LLM might be a hard no for some
       | people/enterprises - any plans to use stand-alone licenses with
       | small local models? It seems like for what LLM is doing here (if
       | I understand it right, help label the modules in natural language
       | and perhaps help organize them into this hierarchy/modules) you
       | don't necessarily need a SoTA model, right?
       | 
       | Also, this could be coming from LLMs, but I see that the
       | visualizations are more biased towards terminology used in web-
       | dev? (for example, one of my robot related repo was organized
       | into front-end, back-end, etc. with I guess is kinda right but
       | not exactly lol). It would be nice to see an interactive
       | visualization where I can iterate on the initial viz with
       | information I know, e.g., I drag and drop a module or rename it
       | and then you probably do another pass with this feedback and LLMs
       | and update my overall visualization with more domain specific
       | labels and partitions?
       | 
       | Edit: Exploring CodeViz on a few more repos, and it seems like
       | you have a set of hardcoded labels for the highest hierarchy in
       | the architecture diagram? (so far, I've only seen Users,
       | Databases, Backend, Frontend, and Shared Components). I am
       | guessing this is something passed on in the prompts? It'll be
       | nice to allow user to define their own set of labels/partitions
       | at one or more levels and then try to create an architecture
       | visualization that fits into these labels/constraints (although I
       | am guessing at some point you have to be wary of hallucinations?)
        
         | LiamPrevelige wrote:
         | We'd love to use local models and have played around with them
         | a bit. Exactly right about the labeling - we didn't stick with
         | local models because 3.5 sonnet is exceptionally good at
         | finding niche architecture labels and merging similar modules
         | (since code analysis is chunked). Copilot tools are becoming
         | very popular, so companies are getting less strict around LLMs
         | and code, but ultimately we do think everyone is better off if
         | CodeViz is self-contained.
         | 
         | There is some hard coded bias for web dev. Diagram modification
         | is definitely high on our todo, and we've been finding ways to
         | reduce pre-defined structure in our prompts to LLMs so they
         | work with broader tech stacks. When we sell licenses to teams,
         | we do some manual checks for accuracy and detail, which helps
         | us improve the public extension.
         | 
         | What's the name of the robotics repo? And any preference for
         | modifying the diagram directly vs instructing changes by text?
        
         | WillMcCall wrote:
         | Re: Edit
         | 
         | The top level categorizations are indeed fixed, however the
         | nodes themselves can be arbitrary. We've found this helps with
         | grouping and organization while still allowing for the
         | flexibility required to accommodate different systems. I'm
         | curious, are there any categories missing here that could be
         | added?
         | 
         | Currently, we categorize by: Frontend (UI/UX elements), Backend
         | (API/Business/Data Access), DB(persisting storage), External
         | Services (Backends maintained outside codebase), Shared
         | Components (internally maintained libraries and helpers)
        
       | Veuxdo wrote:
       | Going by the video, I see the diagrams have lines, but
       | unfortunately no labels. Do lines always mean "dependency"? E.g.
       | file dependency, dependency on a database or service, etc.
        
         | WillMcCall wrote:
         | The lines represent a connection or interaction between
         | different parts of the codebase. Most often these are
         | dependencies like you mentioned, but a "dependency" could be a
         | parent-child relationship, API call, imported function, and
         | there are certain exceptions such as user interactions. Right
         | now immediate edge labels are displayed on node hover, but I
         | agree that the criteria for inclusion/exclusion of lines should
         | be precisely defined. Any thoughts on what you'd like to see?
        
       | alixanderwang wrote:
       | Hi and congrats on the launch! I run a company that also does
       | software diagrams and we've often been posed the question of
       | generating automated diagrams. We've never done so, primarily
       | because I've never found auto-generated diagrams helpful yet.
       | Code dependency graphs have existed for a generation now and I've
       | just never seen one referenced by anyone. I wonder if things have
       | changed now with LLMs.
       | 
       | The examples in your youtube video look good. I'm curious how
       | they're generated. "We were surprised to find that a small
       | fraction of code can generate a very accurate representation of
       | the system." is a surprising statement to me. It's not been my
       | experience that the code can reveal an accurate representation of
       | human-understandable architecture beyond the call graph. The
       | backend system generated from OpenHands (in your video) also
       | looks pretty different from their own architecture diagram in
       | their README: https://github.com/All-Hands-
       | AI/OpenHands/tree/main/openhand... . How do you reconcile what an
       | LLM says an architecture looks like with what maintainers
       | prescribe? Is there a way to give feedback to it? (similar to
       | pthangeda's comment on customization)
       | 
       | I wish there was a way to point this at a repo to test its
       | efficacy. Though I understand that that'd be prohibitively
       | expensive to do for free on the landing page.
       | 
       | I'm also curious how you guys distinguish yourself from
       | https://docs.codesee.io/docs/review-maps-for-visual-studio-c... .
       | They tried this for a few years but shut down recently
       | (https://www.linkedin.com/posts/shaneak_update-codesee-has-be...)
        
         | WillMcCall wrote:
         | Great points/questions. I suspect that information relevant to
         | codebase architectures follows the 80/20 principle. For
         | example, a router and index file in a React App will usually
         | give you around 80% of what's needed to infer high level
         | container info.
         | 
         | In terms of generating architecture diagrams, we follow the c4
         | model, with top level nodes defined as separately deployable
         | units of software, and lower level component nodes being a set
         | of functions wrapped behind a common interface. As the product
         | develops, we'd like to include a way for feedback/fine tuning,
         | but ideally the definition of an architecture diagram would be
         | rigorous enough that there is no ambiguity, this is what we're
         | aiming for. If you'd like to try it out on a specific repo, you
         | can always use our extension for further analysis.
         | 
         | You're right to notice the similarity with CodeSee. Ultimately
         | we're looking to focus on improving the developer experience
         | without needing to leave the IDE. The idea is that CodeViz can
         | replace or augment search and directory tree by providing a
         | more intuitive interface for navigation!
        
           | alixanderwang wrote:
           | _> If you 'd like to try it out on a specific repo, you can
           | always use our extension for further analysis._
           | 
           | VSCode extension marketplace doesn't have the best security
           | rails or reputation for security, and with this being closed
           | source, just personally, installing and running it on my
           | machine isn't something I'm comfortable doing.
           | 
           |  _> The idea is that CodeViz can replace or augment search
           | and directory tree by providing a more intuitive interface
           | for navigation!_
           | 
           | That to me is a different goal than the one in your post
           | (maybe it's just phrasing or I didn't understand the OP
           | correctly), and is something I'd be excited to have!
           | 
           |  _> ideally the definition of an architecture diagram would
           | be rigorous enough that there is no ambiguity_
           | 
           | Rigor is a big "if" in software ;). See: UML's attempt. C4 is
           | some very loose guidelines. IIRC, a big part of its
           | attraction is the lack of rigor/formal standards.
           | 
           | Anyway, best of luck! Feel free to reach out if you'd like to
           | chat diagrams
        
             | WillMcCall wrote:
             | Sounds good, thanks for the feedback! Would open-sourcing
             | CodeViz change your willingness to give it a try?
             | 
             | Definitely agree that rigor is a tricky term here, do you
             | think the open-ended nature of C4 diagrams is a feature,
             | not a bug? We've found in practice that top level diagram
             | generation is both an art and a science, maybe it ought to
             | stay that way.
             | 
             | In any case, shooting you a PM to set up some time to chat,
             | and thanks again for the input!
        
               | alixanderwang wrote:
               | _> Would open-sourcing CodeViz change your willingness to
               | give it a try?_
               | 
               | Yup. Or a web app that I can point to an open-source
               | repo.
               | 
               | Regarding C4, I'm no expert. I know the guys at IcePanel,
               | an earlier YC-cohort company
               | (https://news.ycombinator.com/item?id=34338995) that
               | specializes in C4 diagrams, and they're very friendly, so
               | if you haven't yet, they'll be much more equipped to
               | chime in on C4 stuff.
        
       | lysace wrote:
       | Does my codebase leave my machine when using your extension with
       | it?
        
         | LiamPrevelige wrote:
         | Yes, some code is sent to Anthropic. We're hoping to: find
         | initial users that are comfortable with software copilots
         | (github copilot, cursor, etc) -> iterate on their feedback to
         | make diagram generation require less sophisticated LLMs -> move
         | everything locally
        
           | lysace wrote:
           | Sent via your servers, I assume, since you are providing the
           | Anthropic api key.
           | 
           | So we must trust both Anthropic's and your infrastructure
           | with our code.
           | 
           | I agree that a local LLM is the way to go.
        
             | LiamPrevelige wrote:
             | That's correct, our server just routes calls directly to
             | Anthropic. Some users requested an option to input their
             | own API key and talk to Anthropic directly. I'll add this
             | by the end of the week, maybe today if time.
             | 
             | Local LLM is still the end goal
        
               | lysace wrote:
               | Ok. I think you should be more transparent on this topic.
               | Needing to drag this out of you piece by piece does not
               | inspire trust.
        
       | fudged71 wrote:
       | "On mobile? Add a calendar reminder to install CodeViz"
       | 
       | This is genius
        
         | LiamPrevelige wrote:
         | Thanks! Inspired by my bad memory and half of our website
         | visitors being on mobile
        
         | weakwire wrote:
         | Perfect. Now I have a calendar event to "Install Covid".
         | Thanks!
        
           | LiamPrevelige wrote:
           | The name association is a genuine concern of ours :)
        
             | wazdra wrote:
             | Let's hope it spreads as well :p
        
         | Crowberry wrote:
         | I was struck by that as well! Unfortunately it seems hardwired
         | to Google calendar which i do not use. I'll remember though, I
         | think..
        
           | LiamPrevelige wrote:
           | Just added an option to choose between Google and Apple
           | calendar
        
         | Hadriel wrote:
         | where are you quoting that from?
        
       | tcanabrava wrote:
       | there is a codevis application in the open for almost two years.
       | it generates large scale software visualization for c, c++ and
       | fortran (but it accepts other languages as plugins). i am the
       | main developer of this tool, and im a bit shocked that another
       | tool with similar intent and same name exists. woyod you be ok
       | for me to demo what codevis (the kde one) do to your team? you
       | can reach me at tcanabrava at kde.org the source of codevis is at
       | invent.kde.org/sdk/codevis
        
         | WillMcCall wrote:
         | Yes of course, Let's set up some time to chat! Looks like a
         | great tool!
        
           | tcanabrava wrote:
           | awesome
        
       | samstave wrote:
       | I like your thing.
       | 
       | I have been forcing my bot to give me Mermaid diagrams, swim
       | diagrams, markup tables of schemas, code, logic etc...
       | 
       | I like where you guys are going, but what I think would be really
       | fun - would be a Node Based diagram logic, where the boxes that
       | you show in the diagram are Code-geometry-Nodes - and could be
       | connected with code blocks as such.
       | 
       | Watch @HarryBlends videos on Geometry Nodes in Blender for
       | Inspiration:
       | 
       | https://www.youtube.com/@harryblends
       | 
       | https://www.youtube.com/watch?v=a-4oCHe-hDE
       | 
       | These are the best graphic/node based visuals for describing
       | structured relationships in maths I've ever seen.
       | 
       | To give you some CyberPunk FutureVision of what your outpus could
       | be like -- if it turned all the code nodes into atomic code legos
       | and rather than drawing the diagram from the code - I can use the
       | diagram to create the code.
       | 
       | --
       | 
       | WRT _"...some code goes to anthropic... "_ while answering
       | another, seems like you guys would do well to know these guys:
       | 
       | https://news.ycombinator.com/item?id=41381498
       | 
       | https://www.usevelvet.com/
       | 
       | As well as these guys:
       | 
       | https://news.ycombinator.com/item?id=41322281
       | 
       | https://github.com/instantdb/instant
        
         | WillMcCall wrote:
         | Thanks for the feedback and resources!
         | 
         | In designing CodeViz we were inspired by the Maya hypershade,
         | which closely resembles the diagram-based blender tool that you
         | shared.
         | 
         | https://help.autodesk.com/view/MAYAUL/2024/ENU/?guid=GUID-22...
         | 
         | These examples show how taking a diagram-based approach to
         | software development can abstract away complexity with minimal
         | loss of control over the end result. I love your image of
         | "atomic code legos," and these legos can still always be edited
         | the level of code when needed.
         | 
         | And yes, if CodeViz can generate architecture diagrams from
         | code, the inverse can and will be possible: generating code
         | from architecture diagrams.
        
           | samstave wrote:
           | Exactly!
           | 
           | I've been wanting to have a GPT directly inside Blender to
           | Talk Geometry Nodes - because I want to tie geometry nodes to
           | external data to external data which runs as python inside
           | blender that draws the object geometry that suitabley
           | shows/diagrams out the nodes of my game I am slowly piecing
           | together 'The Oligarchs' which is an updated Illuminati style
           | game - but with updates using AI to creat nodes directly from
           | Oligarch IRL files, such as their SEC Filings, Panama Papers,
           | and all the tools on HN are suited to creating. I went to
           | school for Softimage & Alias|WAVEFRONT (which became MAYA)
           | Animation in 1995 :-)
           | 
           | So I like your DNA.
           | 
           | I want to unpack the relationships of the Oligarch,
           | programmatically, with hexagonal nodes, similar to this[0]-
           | but driven by Node-based-python-blocks-GraphQL-hierachy. And
           | I am slowly learning how to get GPTBots to spit out the
           | appropriate Elements for me to get there.
           | 
           | [0] - https://www.youtube.com/watch?v=vSr6yUBs8tY
           | 
           | (ive posted a bunch of disjointed information on this on HN -
           | morespecifically about how to compartmentalize GPT responses
           | and code and how to drive them to write code using
           | StyleGuide, and gather data using structures rules for how
           | the outputs need to be presented..)
        
       | ThinkBeat wrote:
       | Diagrams of code can be done with UML 1. A fairly decent 2
       | standard with a set of different diagrams that can be created to
       | visualize code.
       | 
       | In those terms CodeViz will provide a form of simplified class
       | diagrams and a function call mapping?
       | 
       | Will there be sequence diagrams in the future?
       | 
       | 1 https://www.uml.org/ 2 It has fallen out of fashion, since many
       | people found it too heavy, and a lot of people have hever heard
       | of it.
        
       ___________________________________________________________________
       (page generated 2024-08-29 23:00 UTC)