[HN Gopher] SeleniumBase: Python APIs for web automation and byp...
___________________________________________________________________
SeleniumBase: Python APIs for web automation and bypassing bot-
detection
Author : seleniumbase
Score : 45 points
Date : 2024-12-16 17:34 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| michael_j_x wrote:
| I've been working with scrapers quite a lot. I started with
| python requests, then moved to scrapy, then selenium, then selenium via
| undetected_chromedriver, and once that started being detected
| during a chrome update about a year ago, I've switched over to
| seleniumbase. It got by undetected, but to get it working with
| pre-downloaded drivers, I had to look into the code. I have
| never, and I mean never, in all my python years, seen such a
| horrible mess of code. We are talking 1000-line-long methods,
| with 20-30 different flags and branches. Just horrible. I have
| since switched to Playwright, which seems to be also undetected,
| and offers a much saner interface.
| seleniumbase wrote:
| SeleniumBase modifies the webdriver so that it doesn't get
| detected when used alongside the CDP stealth mode and methods.
| It'll download chromedriver for you. Not sure what you mean by
| the multiple branches, as there's just the primary one. What
| 1000-line methods are you referring to? By "flags", do you mean
| the different command-line options available? As for
| Playwright, it isn't undetected. See
| https://github.com/microsoft/playwright/issues/23884#issueco...
| - "Playwright is an end-to-end testing framework, where we
| expect you test on your own environments. Bypassing any form of
| bot protection is not something we can act on. Thanks for your
| understanding." In contrast, SeleniumBase is OK with
| bypassing bot detection:
| https://github.com/seleniumbase/SeleniumBase/blob/master/exa...
| mdaniel wrote:
| Rather than a point-by-point rebuttal, as the sibling requests, I
| think this sums up the coding style pretty well:
| https://github.com/seleniumbase/SeleniumBase/blob/v4.33.11/s...
| seleniumbase wrote:
| That method came from code that I accepted in a PR from
| December 31, 2019:
| https://github.com/seleniumbase/SeleniumBase/pull/459 Not a
| true representation of most of the code today.
| edm0nd wrote:
| Not sure if you have explored rolling captcha-solving services
| into your code. It's easy as fuck and you can do it in a few
| lines of code. Check out DeathByCaptcha or AntiCaptcha. It's
| like $2.99 per 1,000 successfully solved captchas.
|
| I guess my point is, you don't always have to be undetected or
| write 1000 lines of code to scrape or do whatever you need to
| do. It saved me a ton of headaches and time when captchas
| are involved.
| mintzworld wrote:
| SeleniumBase is free, open-source, can bypass CAPTCHAs with a
| few lines of code, and it works from the free tier of GitHub
| Actions.
___________________________________________________________________
(page generated 2024-12-17 23:00 UTC)