[HN Gopher] Unprojecting text with ellipses (2016)
       ___________________________________________________________________
        
       Unprojecting text with ellipses (2016)
        
       Author : nmstoker
       Score  : 138 points
       Date   : 2024-05-19 21:08 UTC (1 days ago)
        
 (HTM) web link (mzucker.github.io)
 (TXT) w3m dump (mzucker.github.io)
        
       | lupire wrote:
       | Why is this better than simply finding the bounding quadrilateral
       | of the text, and rectangularizing that?
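For reference, the "rectangularizing" half of this suggestion can be sketched as below, assuming the four corners of the quad are already known (finding them is the hard part and is not shown). All function names are illustrative, and the solver is a plain Gauss-Jordan elimination rather than anything taken from the article.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_quad(src, dst):
    """Return a 3x3 matrix H (with H[2][2] fixed to 1) mapping each of the
    four src corners to the matching dst corner."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(h, point):
    """Map a 2D point through H, dividing by the homogeneous coordinate."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Calling `homography_from_quad(quad, rect)` with the detected corners and the corners of the target rectangle gives the matrix to resample the image through.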
        
         | dahart wrote:
         | Good question. How does one simply find the bounding quad of
         | rotated perspective text? Will that handle perspective
         | distortion?
         | 
         | I guess the author partly answers your question early on with
         | discussion of the Merino-Gracia paper, which fits a quad to
         | individual lines of text, and a comment about how that relies
         | on first being able to detect lines of text.
         | 
         | Matt also doesn't claim this method is better. He says "I'm
         | sure its neither as accurate or as useful as the Merino-Gracia
         | approach." I assume the example text "Needlessly Complex" is a
         | bit of self-deprecating humor, acknowledging he may not be
         | taking the easiest path there is. But the method here seems
         | interesting and useful to me for its approach; it doesn't have
         | to identify word or page boundaries, or lines of text, as a
         | prerequisite. The assumptions are simple and the optimization
         | is simple, it's a nice study in different ways to think about
         | the problem.
        
           | yorwba wrote:
           | The method does have to identify lines of text to find the
           | rotation angle, but doing so after perspective correction
           | using the "all letters should be about the same size"
           | assumption means that a Hough transform is enough for that
           | step, since the lines should already be roughly parallel.
           | 
           | (Having to identify page boundaries is handwaved away with
           | "I'm going to make a huge simplifying assumption that that
           | the image we're processing basically contains only the text
           | that we want to rectify")
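A rough sketch of why roughly parallel lines make this step easy, assuming the input is a list of letter centroids as (x, y) points. It votes in (theta, rho) space as a Hough transform does, but scores each angle by the sum of squared bin counts instead of searching for individual peaks. That scoring rule, the function name, and the default bin sizes are assumptions made for illustration, not the article's code.

```python
import math
from collections import Counter


def dominant_line_angle(points, angle_steps=180, rho_bin=2.0):
    """Return the angle in degrees, in [0, 180), of the normal to the
    dominant family of parallel lines passing through `points`."""
    best_angle, best_score = 0.0, -1
    for i in range(angle_steps):
        theta = math.pi * i / angle_steps
        c, s = math.cos(theta), math.sin(theta)
        # Each point votes for the line x*cos(theta) + y*sin(theta) = rho.
        votes = Counter(round((x * c + y * s) / rho_bin) for x, y in points)
        # Many points sharing few rho bins means many collinear points.
        score = sum(n * n for n in votes.values())
        if score > best_score:
            best_angle, best_score = 180.0 * i / angle_steps, score
    return best_angle
```

The returned angle is that of the line normal, so text running along the x axis gives 90, and the rotation needed to level the text is the result minus 90 degrees.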
        
           | lupire wrote:
           | Finding linear boundaries of a whole block of text is much
           | easier than finding letter boundaries. It's a 1980s textbook
           | matter of finding lines where the brightness gradient is
           | extremely large.
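A minimal sketch of the gradient idea described here, assuming a grayscale image stored as a list of rows and a page that contrasts sharply with its background. The function names and the threshold are illustrative, and fitting straight boundary lines to the resulting pixels is not shown.

```python
import math


def gradient_magnitude(image):
    """Central-difference gradient magnitude of a 2D list of brightness
    values. Border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            out[y][x] = math.hypot(gx, gy)
    return out


def strong_edges(image, threshold):
    """Return (x, y) for every pixel whose gradient magnitude is at least
    `threshold`."""
    mag = gradient_magnitude(image)
    return [(x, y)
            for y, row in enumerate(mag)
            for x, m in enumerate(row)
            if m >= threshold]
```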
        
             | dahart wrote:
             | Which algorithm are you referring to? I have a copy of Jain
             | et al and I can't find what you're describing. Do you have
             | a link to something? The Hough transform is used in this
             | article, if that's what you're thinking of, but that will
             | not work to find the bounding box of text, the lines have
             | to be solid, contiguous, and linear for that to work. Note
             | the method in the article doesn't depend on the text having
             | a solid surround color, or even have the text arranged in a
             | roughly rectangular shape. And it also doesn't depend on
             | the text being linear. These differences are valuable, not
             | having to make the same assumptions you're making, and it
             | means this method (whether or not it's "better") may work
             | in a wider variety of situations, or may make a very good
             | complement to existing methods.
        
         | kookamamie wrote:
         | As the blog title has it, it's needlessly complex.
        
       | ch33zer wrote:
       | Never have unprojected text. I learned the hard way it's just not
       | worth it.
        
         | alex_duf wrote:
         | If you never have done it, how can you have learned the hard
         | way that it's not worth it?
        
           | thsksbd wrote:
           | Since we're nitpicking, OP said:
           | 
           | "Never have unprojected text."
           | 
           | Not:
           | 
           | "I never have [...]"
           | 
           | The absence of an explicit subject means that another correct
           | interpretation of the sentence is that the OP is giving you
           | some good advice.
        
             | lupire wrote:
             | Good advice about what? Unprojected text is worse than the
             | alternatives, projected text, or not text at all?
             | 
             | That's an outrageous claim that needs some sort of
             | justification.
        
               | thsksbd wrote:
               | A valid interpretation of the OP's sentence includes the
               | advice never to have unprojected text.
               | 
               | I'm not evaluating the worth of said advice, just the
               | grammar, to play nitpick tennis with alex_duf who
               | graciously conceded the point.
        
             | alex_duf wrote:
             | I see! I had failed to parse the sentence!
        
       | ClassyJacket wrote:
       | What would I have to learn to understand all the maths in this
       | post?
        
         | BobbyTables2 wrote:
         | It's only basic pre-algebra and matrix multiplication. Plus,
         | the typical Mathematicians' love of variable naming and use of
         | the tilde.
         | 
         | Matrix equations are really just shorthand for several related
         | equations. The notation can be a bit unsettling if you aren't
         | used to it.
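As a small illustration of the shorthand, using generic symbols (not the article's notation), one 2x2 matrix equation stands for two ordinary equations:

```latex
\begin{bmatrix} x' \\ y' \end{bmatrix}
=
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
\quad\Longleftrightarrow\quad
\begin{aligned}
x' &= a x + b y \\
y' &= c x + d y
\end{aligned}
```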
        
         | DeathArrow wrote:
         | Linear Algebra. A point in space can be thought of as a vector.
         | Rotation and scaling are done by multiplying a vector with a
         | matrix.
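A short sketch of that statement in 2D, using plain lists so that nothing beyond the standard library is assumed. The helper names are made up for illustration.

```python
import math


def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in m)


def rotation(degrees):
    """2x2 matrix that rotates a vector counterclockwise by `degrees`."""
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]


def scaling(sx, sy):
    """2x2 matrix that scales x by sx and y by sy."""
    return [[sx, 0.0],
            [0.0, sy]]
```

For example, `mat_vec(rotation(90), (1.0, 0.0))` lands on (0, 1) up to floating-point rounding, and `mat_vec(scaling(2.0, 3.0), (1.0, 1.0))` gives (2.0, 3.0).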
        
         | tlarkworthy wrote:
         | All the classic computer algorithms are here:
         | https://homepages.inf.ed.ac.uk/rbf/HIPR2/index.htm
         | 
         | E.g. https://homepages.inf.ed.ac.uk/rbf/HIPR2/hough.htm
        
       | DeathArrow wrote:
       | I wonder how well it works for images. There is going to be
       | some data loss, but how much?
        
         | Someone wrote:
         | Not at all for most photos, I think. What would you replace the
         | assumption _"on average, all letters should be about the same
         | size"_ with?
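To make the role of that assumption concrete, here is a simplified stand-in for the kind of objective it enables, not the article's actual formulation. It assumes each letter is summarized as (x, y, area) and that the warp is x' = x/w, y' = y/w with w = a*x + b*y + 1. The Jacobian determinant of that map is 1/w**3, so each area is approximated as scaling by that factor. The function names and the spread measure are illustrative choices.

```python
def warped_areas(blobs, a, b):
    """Approximate each blob's area after the perspective warp with
    parameters (a, b), using the local area scale 1 / w**3."""
    return [area / (a * x + b * y + 1.0) ** 3 for x, y, area in blobs]


def size_spread(blobs, a, b):
    """Variance of the warped areas divided by their squared mean.
    Zero means every blob ends up the same size."""
    areas = warped_areas(blobs, a, b)
    mean = sum(areas) / len(areas)
    return sum((s - mean) ** 2 for s in areas) / (len(areas) * mean ** 2)
```

An optimizer would search for the (a, b) that minimizes `size_spread`. A general photo offers no comparable quantity that is known to be uniform, which is the difficulty raised here.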
        
       | JadeNB wrote:
       | I thought at first that it was about
       | https://en.wikipedia.org/wiki/Ellipsis , which makes sense in a
       | textual context, not about https://en.wikipedia.org/wiki/Ellipse
       | , so it took me a minute to understand the relevance of the
       | article.
        
       ___________________________________________________________________
       (page generated 2024-05-20 23:01 UTC)