[HN Gopher] RegExr: Learn, Build and Test Regex
       ___________________________________________________________________
        
       RegExr: Learn, Build and Test Regex
        
       Author : evo_9
       Score  : 111 points
       Date   : 2022-01-27 17:09 UTC (5 hours ago)
        
 (HTM) web link (regexr.com)
 (TXT) w3m dump (regexr.com)
        
       | a99c43f2d565504 wrote:
       | I'd like a tool similar to this but for sed or awk usage. Like to
       | see interactively what would be the output for given input and
       | command. Particularly for the toolchain distributed with Ubuntu.
       | I believe demand for such tool would be greater than just me. Let
       | me know if you know one!
        
         | nerdponx wrote:
         | I was _just_ talking about this yesterday. I too would really
         | appreciate a tool like this!
        
       | jklinger410 wrote:
       | Validating regex with some visualization is cool and all...but I
       | actually value my time. I want a WYSIWYG regex builder, not a
       | tool that helps me learn it.
        
         | bloblaw wrote:
         | Oh, you should check out RegexMagic. That's exactly what it
         | does: https://www.regexmagic.com/
         | 
         | Written by the author of RegexBuddy and O'Reilly's Regex Cookbook:
         | https://learning.oreilly.com/library/view/regular-expression...
        
           | jklinger410 wrote:
           | Ah this is exactly what I am looking for! I'll have to see if
           | the free version will run in Wine.
        
       | brendanfalk wrote:
       | RegExr is my go-to tool for testing regex. It's always been able
       | to solve my needs and I've never had any need to change. In
       | saying that, I've always wondered if regexr is missing out on
       | something that other regex build/test tools have?
       | 
       | What other regex tools do people use and why?
        
         | Glant wrote:
         | I typically use Regex101. It's been good enough for my needs
         | and I'm just used to it at this point. Looking at RegExr, it
         | seems like the only big difference is that Regex101 supports
         | substitution, though I think I've only used it once.
         | 
         | https://regex101.com/
        
           | walls wrote:
           | It also lets you switch between regex implementations in
           | different languages.
        
           | ask_b123 wrote:
           | What do you mean by substitution? Replacing a match with a
           | string/pattern?
           | 
           | If so, RegExr does have that tool (at the middle/bottom,
           | tools bar), and it probably is the functionality I most often
           | use.
        
             | Glant wrote:
             | Ah, I see it now. Usually when I do replace/substitution
             | it's in a larger project so it's not a feature I typically
             | use on the site. Most of my Regex101 use is just figuring
             | out why a regex I wrote isn't executing like I expected.
        
         | Zababa wrote:
         | I use regex101.com, that looks a lot like RegExr. I remember
         | using RegExr at some point, and I think I default to regex101
         | just because the name is a bit easier to remember. I'll try to
         | remember to switch to RegExr since it's open source.
        
       | nerdponx wrote:
       | I (and most people I talk to about this stuff) tend to use
       | Regex101 (https://regex101.com) for this purpose. It will be
       | interesting to spend some time with RegExr and compare the two
       | tools.
       | 
         | I am also aware of RegexBuddy
         | (https://www.regular-expressions.info/regexbuddy.html), whose
         | author also publishes very good regex learning content on their
         | site. It looks great, but it's a closed-source Windows-only
         | application, which means it's something I'll never be able to
         | benefit from.
        
         | bloblaw wrote:
         | RegexBuddy works perfectly on wine:
         | https://www.regexbuddy.com/wine.html
         | 
         | Why does it matter if it's closed source?
         | 
         | I am a paying customer of RegexBuddy. Best $39 I've ever spent
         | on software.
         | 
         | RegexBuddy's `debug` feature has no equal in open-source or
         | commercial software: https://www.regexbuddy.com/debug.html
         | 
         | It has many more features than regex101 _AND_ it keeps all data
         | local. Maybe regex101 keeps things locally too, but I have to
         | run it in a browser and I'm not about to put sensitive test
         | data there.
         | 
         | That being said, regex101 is very well done, but I paid $39 for
         | RegexBuddy 14 years ago (and price is still the same) and last
         | year paid $19 for the optional upgrade from v3 to v4...but I
         | really only did that to support the developer. v3 still met all
         | my needs.
        
         | fiatjaf wrote:
         | https://www.regular-expressions.info/ is the best place to
         | learn though, not because of the tools, but because the text is
         | so good, so clear, you can learn without helper tools, just by
         | reading, and become a regex master in one day.
        
       | viggity wrote:
       | <this is a repost, but I like to spread the good word when the
       | opportunity avails itself>
       | 
       | I think one reason why most people have a hard time reading regex
       | is because they don't use any indentation or linebreaks.
       | Honestly, if a buddy came to you and asked you to help him debug
       | a javascript method and all 15 statements were on the same line,
       | would you offer to help him, or tell him to fix his shit first so
       | you can read it? What if it was all on one line and his variable
       | names were all "v1", "v2", etc. Would you help him then? fuck no.
       | And yet, this is standard operating procedure with regex, except
       | you don't even get "v1", "v2" because nothing is labeled at all.
       | v1/v2/... would be an improvement!
       | 
       | This is how most people write a simple date regex:
       | 
       | \d{1,2}/\d{1,2}/(\d{4}|\d{2})
       | 
       | And mind you, this is a very simple scenario. Here is how you
       | would write it if you treated it like actual code:
       | 
       | (?<month>\d{1,2})
       | 
       | /
       | 
       | (?<day>\d{1,2})
       | 
       | /
       | 
       | (?<year>\d{4}|\d{2})
       | 
       | First off, you can know what my intent is when I'm capturing each
       | group. Maybe this code gets used by a European where the month
       | and day switch places. They can figure out how to fix it in like
       | two seconds. Secondly, the forward slashes are not lost in a sea
       | of characters anymore because we use whitespace like a civilized
       | developer, not a regex savage.
       | 
       | If you want to keep things simple with regular expressions:
       | 
       | * Be liberal with what your pattern matches and use a normal
       | programming language for your complicated conditional logic to
       | filter out crap you don't want
       | 
       | * Don't be afraid to break up the search with multiple regular
       | expressions
       | 
       | * Ignore pattern whitespace and use it to visually break up your
       | pattern. Nobody would agree to debug javascript that has been
       | minimized, yet people do this all the time with regex
       | 
       | * For the love of all that is holy, USE NAMED GROUPS. It is a
       | fantastic way to document your intent.
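
A minimal Python sketch of the named-group, whitespace-friendly style described in the comment above. Two assumptions to flag: Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the `re.VERBOSE` flag is what allows the pattern to span lines and carry comments. The sample string is made up for illustration.

```python
import re

# Verbose-mode version of the date pattern from the comment above.
# With re.VERBOSE, unescaped whitespace in the pattern is ignored and
# "#" starts a comment, so each piece can sit on its own line.
DATE_PATTERN = re.compile(
    r"""
    (?P<month>\d{1,2})      # one- or two-digit month
    /
    (?P<day>\d{1,2})        # one- or two-digit day
    /
    (?P<year>\d{4}|\d{2})   # four-digit year, otherwise two-digit
    """,
    re.VERBOSE,
)

# Hypothetical input, for illustration only.
match = DATE_PATTERN.search("Shipped on 1/27/2022.")
if match:
    # Groups are read by name, so swapping month and day later
    # only means renaming two groups in the pattern.
    print(match.group("month"), match.group("day"), match.group("year"))
```

Range checks (month 1 to 12, valid day for the month) are deliberately left out of the pattern, in line with the advice to match liberally and filter in ordinary code afterwards.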
        
       | libraryatnight wrote:
       | Last time a regex conversation came up on HN someone turned me on
       | to https://regexcrossword.com - which is good fun if you're
       | someone who enjoys regex :)
        
       | avgcorrection wrote:
       | (I'm too much of an idealist for my own good.)
       | 
       | I'm sure that these resources are great. And it's not their fault
       | that the regex family of languages evolved in the way that they
       | did. But the simpler regex languages (without backreferences and
       | other stuff... that I might not even know about) seem simple at
       | first glance. In a perfect world I want to just spend an hour
       | internalizing them forever. But in practice it seems that doubt
       | always grips me, mostly because of the meta-syntax problem: did I
       | unintentionally use some metacharacter in this part of the string
       | which I meant to be "fixed"? So then I feel I have to "validate"
       | it with some external tool. And suddenly it feels like this
       | seemingly terse and agile language is just making me second-guess
       | myself.
        
         | Cyberdog wrote:
         | > did I unintentionally use some metacharacter in this part of
         | the string which I meant to be "fixed"?
         | 
         | When in doubt, just throw a backslash in front of it, which
         | always means "the next character is to be interpreted
         | literally," even in cases where it's not necessary.
         | 
         | (Well, not _always;_ the backslash will invoke a special
         | character when thrown before some letters; e.g., "\t" means the
         | tab character. But normal letters never need to be escaped;
         | just punctuation.)
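
One way to sidestep the "did I forget to escape something?" doubt, sketched in Python: let the standard library's `re.escape` add the backslashes for the part of the pattern that is meant to be fixed text. The literal string and the sample input below are hypothetical.

```python
import re

# Hypothetical fixed text that happens to contain metacharacters:
# "(", ")", "$", "." and "+" all mean something to the regex engine.
literal = "price (USD): $5.00+"
text = "price (USD): $5.00+ 3"

# re.escape backslash-escapes the special characters, so the literal
# part can be combined safely with a real pattern fragment.
pattern = re.compile(re.escape(literal) + r"\s*\d+")
escaped_hit = pattern.search(text)

# Using the same string unescaped silently changes its meaning:
# "(USD)" becomes a group and "$" becomes an end-of-string anchor.
raw_hit = re.search(literal, text)

print(escaped_hit is not None)
print(raw_hit is not None)
```

This only covers the fixed portion of a pattern; the hand-written fragment (`\s*\d+` here) still has to be checked the usual way.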
        
           | burntsushi wrote:
           | Fun fact: Rust's regex crate won't let you do this. If you
           | try to escape a character that isn't a meta character, you
           | get an error. So in cases like this, it will erase your
           | doubt.
           | 
           | (There is ongoing discussion about relaxing this rule for
           | some characters, since it is so common in some cases. For
           | example, escaping / is so common that folks try to do it with
           | the regex crate and are surprised when it returns an error. /
           | is rarely a regex meta character, rather, it tends to
           | _denote_ the start and stop of regexes, e.g., in Javascript
           | or sed.)
        
       | blahyawnblah wrote:
       | This one is my go-to. But it would be a lot better if it didn't
       | display an alert when you try to leave the site. I've started to
       | try and find alternatives.
        
         | dmitriid wrote:
         | > I've started to try and find alternatives.
         | 
         | https://regex101.com/
        
         | cdolan wrote:
         | I don't get a warning to leave the site unless I've input data
         | into the form. I find that to be a valuable feature and not a
         | nag.
        
         | kroltan wrote:
         | In Firefox you can set dom.disable_beforeunload to true in your
         | about:config to disable this behaviour (globally)
        
       | s1mon wrote:
       | Rubular: a Ruby regular expression editor
       | (https://rubular.com/r/mP6IRzteSm) is another option which is
       | pretty minimal but useful.
        
       | yashg wrote:
       | I've used regexr for years. It has helped me build some really
       | complex expressions.
        
       | Mockapapella wrote:
       | Regex seems like a good use case for GPT3. Most people that I've
       | seen use regex use it so rarely that they end up having to
       | relearn the syntax each time they use it.
        
       | janpot wrote:
       | I sometimes use https://regexper.com/ to debug hard to understand
       | regexes.
        
         | bloblaw wrote:
         | If you paste a regex into RegexBuddy, it will explain each
         | portion of the regex. Clicking on parts of the regex will
         | highlight its meaning.
         | 
         | https://www.regexbuddy.com/analyze.html
         | 
         | Or it has the best regex debugger I've ever seen in my 20 year
         | career: https://www.regexbuddy.com/debug.html
        
       | blable2 wrote:
       | Just thinking out loud... can't I just ask the google "Hey
       | Google, in PERL, show me a regex to find the first occurrence of
       | a semicolon to the next period."?
        
       | angryGhost wrote:
       | This site is a rite of passage for any programmer, ever.
        
       | somehnguy wrote:
       | I switch between this site and debuggex.com depending on what I'm
       | doing at the time. I find they both have their strengths for
       | specific tasks.
        
         | scottc wrote:
         | What's the diff? I love regexr and credit that site to my
         | finally understanding regex.
         | 
         | In fact, I was just on it this morning!
        
           | somehnguy wrote:
           | I think regexr does a better job of helping you to understand
           | exactly what is happening in the regex. I typically reach for
           | debuggex when I already have a decent understanding of how to
           | accomplish what I want and just want a simple way to edit &
           | test, I think the interface is less busy for that case.
        
       | l30n4da5 wrote:
       | I've used RegExr for a few years now. No real reason other than
       | it was the first tool for writing regex that I found.
        
       | newusertoday wrote:
       | Is there a list of regex patterns for common use cases like
       | IMEI/geo coordinates etc. somewhere? My Google searches are
       | leading me either to regex tutorial sites or regex libraries.
       | There are a handful of results for emails/URLs etc., but I'm not
       | finding an exhaustive list.
        
         | nerdponx wrote:
         | Do geographical coordinates have a standard layout? I'd expect
         | that you have to look at your particular data for cases like
         | this.
         | 
         | Something like IMEI should be pretty easy, if Wikipedia [0] is
         | to be trusted (e.g. in Python):
         | 
         |     import re
         | 
         |     # Matches IMEI and IMEISV
         |     imei_pattern = re.compile(r"\d{2}-\d{6}-\d{6}-\d\d?")
         | 
         | You _could_ write a big monster pattern that sets up capture
         | groups for all the different TAC and Check Digit variants, but
         | why bother? Just slice off what you need from the result after
         | matching.
         | 
         | 0:
         | https://en.wikipedia.org/wiki/International_Mobile_Equipment...
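
A short sketch of the "match loosely, then slice" approach suggested above, reusing the pattern from the comment. The helper name `split_imei` and the sample input are hypothetical. The pattern also assumes the hyphenated display layout; identifiers stored as a bare run of digits would need a different pattern.

```python
import re

# Pattern from the comment above: hyphenated IMEI / IMEISV layout.
imei_pattern = re.compile(r"\d{2}-\d{6}-\d{6}-\d\d?")


def split_imei(text):
    """Return the hyphen-separated fields of the first IMEI-like
    string found in text, or None if there is no match."""
    m = imei_pattern.search(text)
    if m is None:
        return None
    # No capture groups needed: split the matched text afterwards.
    return m.group(0).split("-")


# Sample value for illustration only.
fields = split_imei("device id: 35-209900-176148-1")
print(fields)
```

The pattern only checks the shape of the string. Validating the check digit would be a separate step in ordinary code.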
        
       | lvl100 wrote:
       | Are there any regex builders that work with natural language
       | inputs?
        
         | maciejgryka wrote:
         | If I understood what you mean, then yes, I built one
         | https://regex.help/ (powered by
         | https://github.com/pemistahl/grex doing the heavy lifting).
        
       ___________________________________________________________________
       (page generated 2022-01-27 23:01 UTC)