[HN Gopher] NeuralSVG: An Implicit Representation for Text-to-Ve...
___________________________________________________________________
NeuralSVG: An Implicit Representation for Text-to-Vector Generation
Author : lnyan
Score : 262 points
Date : 2025-01-08 18:16 UTC (4 hours ago)
(HTM) web link (sagipolaczek.github.io)
(TXT) w3m dump (sagipolaczek.github.io)
| pizza wrote:
| Prompting Claude to make SVGs then dropping them into Inkscape
| and getting the last ~20% of it to match the picture in my head
| has been a phenomenal user experience for me. This, too, piques
| my curiosity..!
| xvfLJfx9 wrote:
| Claude doesn't work at all for me when generating SVGs
| iambateman wrote:
| For me it depends on whether an SVG of the subject already
| exists in its training set.
|
| "Make an SVG of a clock icon" is likely to work. "Make an SVG
| of a playground swingset with the sun setting" is not.
| varunneal wrote:
| Your prompt verbatim turned out quite well, single-shot.
|
| https://claude.site/artifacts/0f696bf8-399d-42c3-93c0-296493...
| jgalt212 wrote:
| I poked around with NeoSVG a few months back. I was not happy
| with the results, the computation time, or the cost. That being
| said, I do hope they've made big progress lately because SVGs
| work real nice when you have an LLM and human working in tandem
| (as per the comment above).
|
| https://neosvg.com/generations
| kelseyfrog wrote:
| Why does the fourth example show a hamburger but is labeled as a
| dragon?
| jsheard wrote:
| American cultural bias in the training data led it to infer
| that dragons would be turned into burgers if they were real.
| airstrike wrote:
| Most likely just a clerical error, since the dragon is two
| examples to the left with the same caption.
| dekhn wrote:
| because hamburgers aren't made from chopped ham.
| murtio wrote:
| This is really cool! I have been using Claude to animate SVG, and
| it has been great.
| NiloCK wrote:
| I'd be interested to see examples and hear about process here,
| if you're willing to share.
| zellyn wrote:
| The sketch generation is wild... and apparently comes for free.
| airstrike wrote:
| This opens up lots of opportunities for document authoring tools.
| Really cool stuff, can't wait to try out the code once it's
| available.
| lbj wrote:
| "Code coming soon" - I hope someone reposts this when there's
| more to dig into
| janalsncm wrote:
| I am a huge fan of this type of incremental generative approach.
| Language isn't precise enough to describe a final product, so
| generating intermediate steps is very powerful.
|
| I'd also like to see this in music generation. Tools like Suno
| are cool but I would much rather have something that generates
| MIDIs and instrument configurations instead.
|
| Maybe this is a good lesson for generative tools. It's possible
| to generate _something_ that's a good starting point. But what
| people actually want is long tail, so including the capability of
| precision modification is the difference between a canned demo
| and a powerful tool.
|
| > Code coming soon
|
| The examples are quite nice but I have no idea how reproducible
| they are.
| kadushka wrote:
| _I'd also like to see this in music generation. Tools like Suno
| are cool but I would much rather have something that generates
| MIDIs and instrument configurations instead._
|
| Sounds like you're looking for something like
| https://www.aiva.ai
| janalsncm wrote:
| Honestly that site feels like they have a database of midis
| tagged by genre and pick them out randomly. It's totally
| different from their demo song.
|
| I guess I'm hoping for something better. It's also closed
| source, the web ui doesn't have editing functionality, and
| the output is pretty disjointed. Maybe if I messed around
| with it enough the result would be decent.
| bufferoverflow wrote:
| MIDI isn't enough. I want MIDI + filters, plus separate voice
| and custom sound tracks.
| gexaha wrote:
| microtonal midi would be super awesome
| jonathaneunice wrote:
| Nice! Looking forward to similar textual generation of diagrams.
| (The Pic/Pikchr for the LLM age.)
| da_rob wrote:
| It's not PIC and not really suitable for complex diagrams, yet,
| but you can use Vizzlo's Chart Vizzard to create a subset of
| the supported chart types (let's say a Gantt) and then continue
| editing it using the chart editor: https://vizzlo.com/ai
| fosterbuster wrote:
| It's a wasted opportunity not using SVG to show the examples.
| TeMPOraL wrote:
| Available in ComfyUI when? :).
|
| Seriously though, this is amazing, I'm glad to see this tackled
| directly.
|
| Also, I just learned from this thread that Claude is apparently
| usable for generating SVGs (unlike e.g. GPT-4 when I tested for
| it some months ago), so I'll play with that while waiting for
| NeuralSVG to become available.
| vipshek wrote:
| This is excellent!
|
| I think the utility of generating vectors is far, far greater
| than all the raster generation that's been a big focus thus far
| (DALL-E, Midjourney, etc). Those efforts have been incredibly
| impressive, of course, but raster outputs are _so_ much more
| difficult to work with. You're forced to "upscale" or "inpaint"
| the rasters using subsequent generative AI calls to actually
| iterate towards something useful.
|
| By contrast, generated vectors are inherently scalable and easy
| to edit. These outputs in particular seem to be low-complexity,
| with each shape composed of as few points as possible. This is a
| boon for "human-in-the-loop" editing experiences.
|
| When it comes to generative visuals, creating simplified
| representations is much harder (and, IMO, more valuable) than
| creating highly intricate, messy representations.
| tasuki wrote:
| Ah, we should be friends!
|
| I'm not sure what else to add, except that these are exactly
| the thoughts I think, and it used to feel lonely ;)
| Lerc wrote:
| There is also the possibility for using these images as
| guidance for rasterization models. Generate easily
| manipulatable and composable images as a first stage then add
| detail once the image composition is satisfactory.
| SillyUsername wrote:
| My little project for the highly intricate, messy
| representation ;) https://github.com/KodeMunkie/shapesnap (it
| stands on the backs of giants, original was not mine). It's
| also available on npm.
| gwern wrote:
| Have you looked at https://www.recraft.ai/ recently? The image
| quality of their vector outputs seems to have gotten quite
| good, although you obviously still wouldn't want to try to
| generate densely textured or photographic-like images like
| Midjourney excels at. (For https://gwern.net/dropcap last year
| or before, we had to settle for Midjourney and create a
| somewhat convoluted workflow through Recraft; but if I were
| making dropcaps now, I think the latest Recraft model would
| probably suffice.)
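
To illustrate vipshek's point above that low-complexity vector output is inherently scalable and easy to edit, here is a minimal sketch. It is not NeuralSVG output; the `shape_svg` helper and the `roof` point list are hypothetical names made up for this example. The same three points render at any pixel size, and an edit is a change to one coordinate rather than another generative call.

```python
def shape_svg(points, size_px):
    """Render a polygon from a short point list at the requested pixel size.

    The coordinates live in a fixed 100x100 viewBox, so changing size_px
    rescales the drawing without touching the geometry.
    """
    pts = " ".join(f"{x},{y}" for x, y in points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{size_px}" height="{size_px}" viewBox="0 0 100 100">'
        f'<polygon points="{pts}" fill="tomato"/></svg>'
    )


# A shape described by only three points (hypothetical example data).
roof = [(10, 90), (50, 10), (90, 90)]

small = shape_svg(roof, 32)    # icon size
large = shape_svg(roof, 2048)  # poster size, identical geometry

# "Human-in-the-loop" edit: lower the apex by moving a single point.
edited_roof = [(10, 90), (50, 30), (90, 90)]
edited = shape_svg(edited_roof, 32)
```

The fewer points a shape has, the more practical this kind of hand edit becomes, which is presumably why sparse output matters for editing in a tool like Inkscape.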
| 1970-01-01 wrote:
| Aside: I've been having a very hard time prompting ChatGPT to
| spit out ASCII art. It really seems to not be able to do it.
| Here is an ASCII art representation of a hopping rabbit:
| ```
| (\(\
| ( -.-)
| o_(")(")
| ```
| This is a simple representation of a rabbit with its ears up and
| in a hopping stance. Let me know if you'd like me to adjust it!
| Kiro wrote:
| Pretty good if you ask me. What would a proper hopping rabbit
| ASCII art look like?
| jansan wrote:
| Not sure, but that is a sitting rabbit.
| jsheard wrote:
| It seems to have just pulled an ASCII rabbit from the training
| data verbatim
|
| https://old.reddit.com/r/identifythisfont/comments/ytd25m/wh...
| goeiedaggoeie wrote:
| This is very nice.
|
| I had to convert a bitmask to SVG and wanted to skip the
| intermediate step, so I looked around for papers about
| segmentation models outputting SVG and found this one:
| https://arxiv.org/abs/2311.05276
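
For context on the intermediate step mentioned above, a naive bitmask-to-SVG conversion can be sketched as follows. This is not the method from the linked paper. It is a simple run-length approach chosen for illustration, and the function names `mask_to_svg_path` and `mask_to_svg` are hypothetical. Each horizontal run of set pixels becomes one unit-high rectangle in a single path.

```python
from typing import List


def mask_to_svg_path(mask: List[List[int]]) -> str:
    """Encode every horizontal run of truthy cells as a rectangular subpath."""
    parts = []
    for y, row in enumerate(mask):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                # Advance to the end of this run of set pixels.
                while x < len(row) and row[x]:
                    x += 1
                run = x - start
                # Move to the run's top-left corner, then trace a run x 1 box.
                parts.append(f"M{start} {y}h{run}v1h-{run}z")
            else:
                x += 1
    return "".join(parts)


def mask_to_svg(mask: List[List[int]]) -> str:
    """Wrap the path in an SVG whose viewBox matches the mask dimensions."""
    height = len(mask)
    width = len(mask[0]) if mask else 0
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">'
        f'<path d="{mask_to_svg_path(mask)}" fill="black"/></svg>'
    )


example_mask = [
    [0, 1, 1],
    [1, 0, 0],
]
example_svg = mask_to_svg(example_mask)
```

The result is exact but blocky and verbose, which is the drawback a segmentation model that emits SVG directly would avoid.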
| scosman wrote:
| I've been impressed with even applying Sonnet to SVGs for
| animations. This looks like it could be a lot more powerful.
|
| Fun example:
| https://gist.github.com/scosman/701275e737331aaab6a2acf74a52...
| toisanji wrote:
| This is a group applying vector generation to animations:
| https://www.youtube.com/@studyturtlehq The graphic fidelity has
| been slowly improving over time.
| gcr wrote:
| can you say more? all of these videos have less than 5 views
| and i can't find any explanation of their process
___________________________________________________________________
(page generated 2025-01-08 23:00 UTC)