[HN Gopher] SeleniumBase: Python APIs for web automation and byp...
       ___________________________________________________________________
        
       SeleniumBase: Python APIs for web automation and bypassing bot-
       detection
        
       Author : seleniumbase
       Score  : 45 points
       Date   : 2024-12-16 17:34 UTC (1 day ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | michael_j_x wrote:
       | I've been working with scrapers quite a lot. I started with
       | Python requests, then moved to Scrapy, then Selenium, then
       | Selenium via undetected_chromedriver. Once that started being
       | detected during a Chrome update about a year ago, I switched
       | over to SeleniumBase. It got by undetected, but to get it
       | working with pre-downloaded drivers, I had to look into the
       | code. I have never, and I mean never, in all my Python years,
       | seen such a horrible mess of code. We are talking 1000-line
       | methods with 20-30 different flags and branches. Just horrible.
       | I have since switched to Playwright, which also seems to be
       | undetected and offers a much saner interface.
        
         | seleniumbase wrote:
         | SeleniumBase modifies the webdriver so that it doesn't get
         | detected when used alongside the CDP stealth mode and methods.
         | It'll download chromedriver for you. Not sure what you mean by
         | the multiple branches, as there's just the primary one. What
         | 1000-line methods are you referring to? By "flags", do you mean
         | the different command-line options available? As for
         | Playwright, it isn't undetected. See
         | https://github.com/microsoft/playwright/issues/23884#issueco...
         | - "Playwright is an end-to-end testing framework, where we
         | expect you test on your own environments. Bypassing any form of
         | bot protection is not something we can act on. Thanks for your
         | understanding." By contrast, SeleniumBase is OK with
         | bypassing bot detection:
         | https://github.com/seleniumbase/SeleniumBase/blob/master/exa...
        
         | mdaniel wrote:
         | Rather than a point-by-point rebuttal, as the sibling requests,
         | I think this sums up the coding style pretty well:
         | https://github.com/seleniumbase/SeleniumBase/blob/v4.33.11/s...
        
           | seleniumbase wrote:
           | That method came from code that I accepted in a PR from
           | December 31, 2019:
           | https://github.com/seleniumbase/SeleniumBase/pull/459 It's not
           | a true representation of most of the code today.
        
         | edm0nd wrote:
         | Not sure if you have explored rolling captcha-solving services
         | into your code. It's easy as fuck and you can do it in a few
         | lines of code. Check out DeathByCaptcha or AntiCaptcha. It's
         | like $2.99 per 1,000 successfully solved captchas.
         | 
         | I guess my point is, you don't always have to be undetected or
         | write 1000 lines of code to scrape or do whatever you need to
         | do. It saved me a ton of headaches and time when captchas are
         | involved.
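
To put the quoted price in perspective, here is a rough cost sketch.
It assumes the commenter's approximate figure of $2.99 per 1,000
solved captchas, which is not a quoted rate from any service. The
function name and constant are made up for illustration.

```python
from decimal import Decimal

# Rough figure taken from the comment above; an assumption, not a
# published price from DeathByCaptcha, AntiCaptcha, or anyone else.
PRICE_PER_1000_SOLVES = Decimal("2.99")


def estimated_solve_cost(
    solved_captchas: int,
    price_per_1000: Decimal = PRICE_PER_1000_SOLVES,
) -> Decimal:
    """Estimate the bill in dollars for a number of solved captchas."""
    if solved_captchas < 0:
        raise ValueError("solved_captchas must be non-negative")
    cost = price_per_1000 * solved_captchas / 1000
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))
```

At that assumed rate, estimated_solve_cost(50_000) comes out to
149.50 dollars, so the trade-off is mostly about volume.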
        
           | mintzworld wrote:
           | SeleniumBase is free, open-source, can bypass CAPTCHAs with a
           | few lines of code, and it works from the free tier of GitHub
           | Actions.
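
For readers wondering what "a few lines of code" might look like,
here is a minimal sketch using SeleniumBase's UC Mode. It assumes
seleniumbase is installed, a Chrome browser is available, and a
display (or virtual display) exists for the GUI click. The function
name and the 4-second reconnect delay are illustrative choices, and
nothing here guarantees that a given site's checks are passed.

```python
def open_with_uc_mode(url: str, reconnect_time: float = 4.0) -> str:
    """Open a page in SeleniumBase UC Mode and return its title."""
    # Imported lazily so this module loads even without the package.
    # Third-party dependency: pip install seleniumbase
    from seleniumbase import SB

    with SB(uc=True) as sb:
        # Load the page, disconnecting chromedriver briefly while it loads.
        sb.uc_open_with_reconnect(url, reconnect_time)
        # Try to click a CAPTCHA checkbox if one is shown (uses PyAutoGUI).
        sb.uc_gui_click_captcha()
        return sb.get_title()
```

Calling open_with_uc_mode with a page URL launches a real browser, so
it is best tried on a site you are allowed to automate.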
        
       ___________________________________________________________________
       (page generated 2024-12-17 23:00 UTC)