https://arxiv.org/abs/2505.23740

Computer Science > Computer Vision and Pattern Recognition

arXiv:2505.23740 (cs)
[Submitted on 29 May 2025]

Title: LayerPeeler: Autoregressive Peeling for Layer-wise Image
Vectorization

Authors: Ronghuan Wu, Wanchao Su, Jing Liao

    Abstract: Image vectorization is a powerful technique that
    converts raster images into vector graphics, enabling enhanced
    flexibility and interactivity. However, popular image
    vectorization tools struggle with occluded regions, producing
    incomplete or fragmented shapes that hinder editability. While
    recent advancements have explored rule-based and data-driven
    layer-wise image vectorization, these methods face limitations in
    vectorization quality and flexibility. In this paper, we
    introduce LayerPeeler, a novel layer-wise image vectorization
    approach that addresses these challenges through a progressive
    simplification paradigm. The key to LayerPeeler's success lies in
    its autoregressive peeling strategy: by identifying and removing
    the topmost non-occluded layers while recovering underlying
    content, we generate vector graphics with complete paths and
    coherent layer structures. Our method leverages vision-language
    models to construct a layer graph that captures occlusion
    relationships among elements, enabling precise detection and
    description of non-occluded layers. These descriptive captions
    are used as editing instructions for a finetuned image diffusion
    model to remove the identified layers. To ensure accurate
    removal, we employ localized attention control that precisely
    guides the model to target regions while faithfully preserving
    the surrounding content. To support this, we contribute a
    large-scale dataset specifically designed for layer peeling
    tasks. Extensive quantitative and qualitative experiments
    demonstrate that LayerPeeler significantly outperforms existing
    techniques, producing vectorization results with superior path
    semantics, geometric regularity, and visual fidelity.
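
The peeling order described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: in the paper the layer graph is built by a vision-language model and each removal is performed by a finetuned diffusion model, whereas here a hand-written dictionary stands in for the graph and "removal" is only bookkeeping. The names `occludes`, `topmost_layers`, and `peel_order` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Set


def topmost_layers(occludes: Dict[str, Set[str]], remaining: Set[str]) -> List[str]:
    """Return the remaining layers that no other remaining layer occludes."""
    covered: Set[str] = set()
    for upper in remaining:
        # Layers directly beneath `upper` that have not been peeled yet.
        covered |= occludes.get(upper, set()) & remaining
    return sorted(remaining - covered)


def peel_order(occludes: Dict[str, Set[str]]) -> List[List[str]]:
    """Group layers into peeling rounds, topmost first.

    `occludes[a]` is the set of layers that `a` sits on top of
    (an assumed encoding of the occlusion relationships).
    """
    remaining: Set[str] = set(occludes)
    for lowers in occludes.values():
        remaining |= lowers

    rounds: List[List[str]] = []
    while remaining:
        top = topmost_layers(occludes, remaining)
        if not top:
            # Every remaining layer is covered by another one: not a valid layering.
            raise ValueError("occlusion graph contains a cycle")
        rounds.append(top)
        remaining -= set(top)  # stand-in for removing the layers from the image
    return rounds
```

With a toy graph such as `{"eye": {"face"}, "hat": {"face"}, "face": {"background"}}`, `peel_order` returns `[["eye", "hat"], ["face"], ["background"]]`: the eye and hat are peeled in the first round, which exposes the face, and the background is left for last.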

Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics
          (cs.GR)
Cite as:  arXiv:2505.23740 [cs.CV]
          (or arXiv:2505.23740v1 [cs.CV] for this version)
          https://doi.org/10.48550/arXiv.2505.23740
          arXiv-issued DOI via DataCite

Submission history

From: Ronghuan Wu
[v1] Thu, 29 May 2025 17:58:03 UTC (2,376 KB)