Follow RFC 3986 regarding scheme more closely in extract_urls(). - plumb - Open certain URL patterns with an ad-hoc opener (plumber)
 (HTM) hg clone https://bitbucket.org/iamleot/plumb
       ---
 (DIR) changeset 6014611a02b449c4d2f5f0e285d26fd2ba525b66
 (DIR) parent 5a0b90c4bd7d2f0f60197a9c7bf06781122d1137
 (HTM) Author: Leonardo Taccari <iamleot@gmail.com>
       Date:   Wed, 28 Mar 2018 17:03:24 +0200
       
       Follow RFC 3986 regarding scheme more closely in extract_urls().
       
       According to RFC 3986, the scheme part of a URI is defined as:
       
           scheme      = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
       
       Adjust the URI-related regular expressions accordingly.
       
       Diffstat:
        dplumb |  12 ++++++------
        1 files changed, 6 insertions(+), 6 deletions(-)
       ---
       diff -r 5a0b90c4bd7d -r 6014611a02b4 dplumb
       --- a/dplumb    Tue Mar 27 22:14:23 2018 +0200
       +++ b/dplumb    Wed Mar 28 17:03:24 2018 +0200
       @@ -76,13 +76,13 @@
        '
        /:\/\// {
               # Extract URLs inside possible delimiters
       -       if (match($0, /\<[[:alnum:]]+:\/\/[^>]+\>/) ||
       -           match($0, /\([[:alnum:]]+:\/\/[^)]+\)/) ||
       -           match($0, /\[[[:alnum:]]+:\/\/[^]]+\]/) ||
       -           match($0, /"[[:alnum:]]+:\/\/[^]]+"/) ||
       -           match($0, /'"'"'[[:alnum:]]+:\/\/[^]]+'"'"'/)) {
       +       if (match($0, /\<[[:alpha:]][[:alnum:]+.-]*:\/\/[^>]+\>/) ||
       +           match($0, /\([[:alpha:]][[:alnum:]+.-]*:\/\/[^)]+\)/) ||
       +           match($0, /\[[[:alpha:]][[:alnum:]+.-]*:\/\/[^]]+\]/) ||
       +           match($0, /"[[:alpha:]][[:alnum:]+.-]*:\/\/[^]]+"/) ||
       +           match($0, /'"'"'[[:alpha:]][[:alnum:]+.-]*:\/\/[^]]+'"'"'/)) {
                       print substr($0, RSTART + 1, RLENGTH - 2)
       -       } else if (match($0, /[[:alnum:]]+:\/\/.+/)) {
       +       } else if (match($0, /[[:alpha:]][[:alnum:]+.-]*:\/\/.+/)) {
                       print substr($0, RSTART, RLENGTH)
               }
        }
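
The effect of the new scheme pattern can be tried in isolation. The
following is a minimal sketch, not code from dplumb: url_scheme is a
hypothetical helper, and the example.org URLs are made-up inputs. It
reuses the scheme part of the regular expressions introduced above and
prints the scheme of the first "scheme://" occurrence on each line.

```shell
#!/bin/sh
# Hypothetical helper (not part of dplumb): print the scheme of the
# first "scheme://" occurrence found on each input line.
url_scheme() {
	awk '
	# Same scheme pattern as in the changeset:
	#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
	match($0, /[[:alpha:]][[:alnum:]+.-]*:\/\//) {
		# RLENGTH includes the trailing "://", so drop 3 characters.
		print substr($0, RSTART, RLENGTH - 3)
	}'
}

# "+" is now accepted inside a scheme.
printf '%s\n' 'see git+ssh://example.org/repo.git' | url_scheme  # prints: git+ssh

# A leading digit is not part of a scheme, so the match starts at "p".
printf '%s\n' 'see 9p://example.org/' | url_scheme               # prints: p
```

Because the pattern is not anchored, a string such as "9p://" still
matches from its first letter onward; the sketch only shows which
characters the pattern treats as belonging to the scheme.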