[HN Gopher] Wcurl Is Here
       ___________________________________________________________________
        
       Wcurl Is Here
        
       Author : thunderbong
       Score  : 55 points
       Date   : 2024-08-23 19:38 UTC (3 hours ago)
        
 (HTM) web link (daniel.haxx.se)
 (TXT) w3m dump (daniel.haxx.se)
        
       | Vecr wrote:
        | I tried it out, it works. Personally I'd write something like
        | this in Rust, not shell. It's not hard at all; you can
        | basically just ignore the borrow checker, because all the
        | program is doing is spawning processes and a bit of text
        | processing.
       | 
        | The main issue with writing it in shell instead is incorrect
        | error handling. I see a case in there that's a bit sketchy.
        | For now, maybe don't expose wcurl to user- or
        | attacker-controlled input.
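        | 
        | For context, here's a minimal sketch of the kind of
        | strict-mode error handling a shell download wrapper needs
        | (illustrative only, not wcurl's actual code):
        | 
        |     #!/bin/sh
        |     # Exit on any failed command and on use of unset variables.
        |     set -eu
        | 
        |     url="$1"
        |     # Quote the URL so untrusted input stays a single argument.
        |     curl --fail --location --remote-name "$url"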
        
         | dgan wrote:
         | Why would you bother with such a heavy language, just to wait
         | on a single network event?
        
           | Vecr wrote:
            | Because of the error-checking problems shell has. I'll put
            | reviewing the script on my list, so that I'm not only
            | complaining.
        
       | shmerl wrote:
        | Switched to wcurl from my own alias. It's neat. curl doesn't
        | pollute $HOME, unlike wget.
        
         | tangus wrote:
         | How does wget pollute $HOME?
        
           | Izkata wrote:
           | ~/.wget-hsts
           | 
           | You can disable creating this file by adding a config option
           | to ~/.wgetrc
           | 
            | Looks like it's a security mechanism (HSTS) for
            | remembering which sites should only be reached over https
            | rather than http, though.
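            | 
            | If I remember right, the ~/.wgetrc line is something like
            | this (check the wget manual for the exact option name):
            | 
            |     # ~/.wgetrc: don't keep an HSTS database in $HOME
            |     hsts = off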
        
       | firesteelrain wrote:
       | This is a great start!
       | 
       | Some suggestions:
       | 
        | - Allow insecure URLs by automatically providing the -k
        | option for https URLs
        | 
        | - Auto-follow HTTP redirects by providing the -L option by
        | default
        | 
        | - Rather than using the curl-options setting, just accept
        | --user and --pass as options, then pass these through to curl
       | 
        | I know people will feel strongly about some of these;
        | however, it would simplify curl usage for the majority of
        | download-only use cases.
       | 
        | Other curl options can still be used for the more "advanced"
        | use cases, like setting headers.
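        | 
        | Roughly what I mean, expressed as plain curl (real curl
        | options, placeholder URL and credentials):
        | 
        |     # the download-only case with basic auth, following redirects
        |     curl --fail --location --remote-name \
        |          --user "$USERNAME:$PASSWORD" \
        |          https://example.com/file.tar.gz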
        
         | viraptor wrote:
         | > Allow insecure URLs by providing -k option automatically for
         | https URLs
         | 
          | Why would you ever do that? It should be a very conscious
          | choice to ignore the main mechanism keeping internet
          | traffic trustworthy.
        
           | firesteelrain wrote:
            | I use it all the time in-house, because we have in-house,
            | air-gapped systems.
            | 
            | I know some people feel strongly about this one.
            | 
            | But the only time it could lead to a problem is if you
            | pass user/pass and you have a MITM situation.
            | 
            | So maybe only allow it when user and pass aren't being
            | passed.
            | 
            | If it's just a download and we know we aren't in a Tor
            | node situation, then privacy isn't that great of a
            | concern.
            | 
            | My two cents! Open to changing my mind.
        
             | saurik wrote:
             | A MITM situation is relevant even without a credential and
             | isn't at all about privacy: an attacker can swap out a
             | different file for the one you wanted to download.
        
               | firesteelrain wrote:
                | You are right. A hash (if provided) would still need
                | to be verified after the download.
        
             | viraptor wrote:
              | If you have a closed system, then you have two options:
              | use plain http if you really trust the environment, or
              | use your own CA and have trusted https. Having untrusted
              | https and then disabling verification is a double waste
              | of time.
        
               | firesteelrain wrote:
                | We have our own CA, but it doesn't chain back to any
                | known root. The certs are self-signed.
        
               | viraptor wrote:
                | That's ok, that's how you normally do it. But then the
                | second step is adding that CA to the trusted store on
                | all relevant clients, so that it can actually be
                | verified. (Otherwise why bother with the CA; just
                | self-sign everything individually.)
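                | 
                | On Debian-ish systems that's roughly (paths and file
                | names vary by distro):
                | 
                |     # install the internal CA so clients verify against it
                |     sudo cp internal-ca.crt /usr/local/share/ca-certificates/
                |     sudo update-ca-certificates
                | 
                | or, for a single invocation, point curl at the CA
                | directly:
                | 
                |     curl --cacert internal-ca.crt https://internal.example/file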
        
               | firesteelrain wrote:
                | It's our lack of a DevOps / Platform Dept. Our
                | traditional IT groups sadly won't do it.
                | 
                | I mean, invest in Smallstep SSH? Nope.
        
             | ziml77 wrote:
              | Add the signing authorities to your system's certificate
              | store if it's that big of an annoyance. Or make your own
              | custom alias that includes -k. But this absolutely cannot
              | be the default. HTTPS ensures that you are connected to
              | the server you think you are and that no one is tampering
              | with your data in transit.
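              | 
              | For example (the alias name is made up; this is an
              | explicit local opt-in, not something the tool should
              | ship):
              | 
              |     # ~/.bashrc: only for known internal hosts
              |     alias icurl='curl --insecure --location --remote-name'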
        
               | firesteelrain wrote:
                | I totally understand this isn't popular. But even if
                | the certificate doesn't chain back to a trusted root,
                | the traffic is still encrypted between you and the
                | website. Having the certificate chain lets you know the
                | certificate is part of a chain of trust, which is what
                | prevents MITM.
        
             | perchlorate wrote:
              | Just don't do that. Some of us (hello) live in countries
              | that perform, or have tried to perform, HTTPS MITM on a
              | massive scale, and only had to roll it back because so
              | much well-behaving shit broke.
              | 
              | If software suddenly started accepting invalid
              | certificates, they would have no incentive to roll it
              | back. HTTPS would make zero sense then.
        
         | zamadatix wrote:
         | > - Auto follow HTTP redirects by providing -L option by
         | default
         | 
         | The script already passes "--location", the non-shortened
         | version of the argument.
         | 
          | For the other things, maybe the safer and more scalable
          | approach is for the script to detect whether it's being run
          | interactively. If it is, it could prompt the user for what
          | action to take on detectable problems (like missing basic
          | auth or invalid certs for secure domains); otherwise it
          | could log an example version of the command to run as part
          | of the stderr output. Apart from avoiding debate by taking
          | a stance on what the default should be, this would still be
          | allowed by e.g. the Debian repo maintainers.
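          | 
          | Something like this is what I have in mind (a sketch, not
          | wcurl's actual code; the hint text is made up):
          | 
          |     if [ -t 0 ] && [ -t 2 ]; then
          |         # interactive: ask the user what to do
          |         printf 'Certificate could not be verified. Continue? [y/N] ' >&2
          |         read -r answer
          |         [ "$answer" = y ] || exit 1
          |     else
          |         # non-interactive: log a suggested command and give up
          |         echo 'hint: re-run with --curl-options="--insecure" if intended' >&2
          |         exit 1
          |     fi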
        
           | firesteelrain wrote:
            | Good point on --location. I normally use -L, so I missed
            | that.
            | 
            | It does default to https as the proto-default if no scheme
            | is provided. In your example it could default to
            | interactively asking the user, but that may fail and/or
            | hang in automated scenarios.
            | 
            | Anyway, it's a good thought exercise. It's hard to satisfy
            | all use cases equally.
        
       | Alifatisk wrote:
       | > remembering what curl options to use when they just want to
       | download the contents of a URL is hard
       | 
       | Which curl options do they mean? "curl -O URL"?
        
         | brainzap wrote:
         | curl -LO url
        
           | ilyagr wrote:
           | `curl -LOf url` is the simplest one I'm aware of that doesn't
           | act too weirdly :)
           | 
           | Without `-f`, it'll happily save the 404 error page if you do
           | `curl -LO https://example.com/index.htmll`
           | 
           | After playing with `wcurl`, I may or may not remember that
           | one.
           | 
           | Also, wcurl puts a lot of effort into making `wcurl url1
           | url2` work equally predictably.
           | 
           | Finally, wcurl doesn't yet support `wcurl
           | https://example.com` (without a filename at the end of the
           | URL), but it might eventually.
        
         | resoluteteeth wrote:
         | The readme at https://github.com/curl/wcurl explains some other
         | stuff it does by default which makes sense if the goal is to
         | download files:
         | 
         | > By default, wcurl will:
         | 
         | > Encode whitespaces in URLs;
         | 
         | > Download multiple URLs in parallel if the installed curl's
         | version is >= 7.66.0;
         | 
         | > Follow redirects;
         | 
         | > Automatically choose a filename as output;
         | 
         | > Avoid overwriting files if the installed curl's version is >=
         | 7.83.0 (--no-clobber);
         | 
         | > Perform retries;
         | 
         | > Set the downloaded file timestamp to the value provided by
         | the server, if available;
         | 
         | > Disable curl's URL globbing parser so {} and [] characters in
         | URLs are not treated specially.
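          | 
          | So if I'm reading the readme right, everyday usage is just
          | this (URLs are placeholders; I think the pass-through flag
          | is called --curl-options):
          | 
          |     # a single file, with the defaults above
          |     wcurl https://example.com/file.tar.gz
          | 
          |     # several files, downloaded in parallel on new-enough curl
          |     wcurl https://example.com/a.tar.gz https://example.com/b.tar.gz
          | 
          |     # extra flags are forwarded to curl when needed
          |     wcurl --curl-options="--progress-bar" https://example.com/c.iso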
        
           | ziml77 wrote:
           | Is curl able to retry from the middle of a file? Beyond the
           | sane defaults that's where wget has been nice for me; if the
           | connection drops it can pick right back up where it left off.
        
             | mcpherrinm wrote:
             | Yes, curl can resume partial downloads. You need to pass
             | `-C -` to do so, which is a pretty good example of "the
             | command line flags aren't easy to remember"
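              | 
              | i.e. something like this (placeholder URL):
              | 
              |     # resume a partial download from where it stopped
              |     curl -C - --fail --location --remote-name https://example.com/big.iso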
        
               | groby_b wrote:
               | It's also yet another example of "could you please just
               | do the right thing by default".
               | 
               | curl wants to be a swiss army knife, and so you'll need
               | to configure everything just right so you don't
               | accidentally get the bottle opener instead of the
               | downloader. wget just downloads, and does it well.
        
       | ChrisArchitect wrote:
       | [dupe] misleading
       | 
       | Wcurl is here.....2 months ago.
       | 
       | Discussion: https://news.ycombinator.com/item?id=40869458
        
       | moreati wrote:
        | Was fully expecting wcurl to be an alternative name for curl
        | on Windows, because Microsoft made "curl" an alias for their
        | own HTTP downloader command in PowerShell.
        
         | ndegruchy wrote:
          | `curl.exe` is actually the real-deal curl now, though it has
          | fewer options built in than a standard Linux distro version.
          | I don't know if they still alias it in PowerShell, but you
          | can still use curl.exe to get the real one.
        
         | Izkata wrote:
         | I was thinking "web curl", like, an alternative to fetch() or
         | XMLHttpRequest that exposed even more stuff.
        
       | aaplok wrote:
       | > This is one often repeated reason why some users reach for wget
       | instead of curl on the command line.
       | 
        | Is there a reason _not_ to do that? I've always used wget and
        | curl interchangeably, because both meet my moderate needs. Is
        | there a reason I should avoid wget?
        
         | m463 wrote:
          | Sometimes when using containers, curl is already installed
          | and I use that. Other times it's wget. I might want to skip
          | adding "apt-get install wget", etc.
        
         | lolpanda wrote:
          | The only reason I use wget over curl is that when I'm
          | downloading a file, I don't have to specify the output file
          | name with wget lol
        
           | threePointFive wrote:
            | -O (name taken from the end of the URL) or -OJ (name
            | taken from the returned Content-Disposition header; -J
            | needs to be combined with -O)
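            | 
            | In other words (placeholder URLs):
            | 
            |     # name taken from the last path segment of the URL
            |     curl -LO https://example.com/file.tar.gz
            | 
            |     # name taken from the Content-Disposition header
            |     curl -LOJ 'https://example.com/download?id=123'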
        
       ___________________________________________________________________
       (page generated 2024-08-23 23:00 UTC)