[HN Gopher] Curl-Impersonate
___________________________________________________________________
Curl-Impersonate
Author : jakeogh
Score : 340 points
Date : 2024-12-30 09:18 UTC (13 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| londons_explore wrote:
| > The resulting curl looks, from a network perspective, identical
| to a real browser.
|
| How close is it? If I ran wireshark, would the bytes be exactly
| the same in the exact same packets?
| dchest wrote:
| What else could "identical" mean?
| londons_explore wrote:
| It could be that the TCP streams are the same, but the
| packetization is different.
|
| It could mean that the packets are the same, but timing is
| off by a few milliseconds.
|
| It could mean a single HTTP request exactly matches, but when
| doing two requests the real browser uses a connection pool
| but curl doesn't. Or uses HTTP/3's fast-open abilities, etc.
|
| etc.
| zlagen wrote:
| It replicates the browser at the HTTP/SSL level, not TCP.
| From what I know this is good enough to bypass cloudflare's
| bot detection.
| Retr0id wrote:
| Two TLS streams are never byte-identical, due to randomness
| inherent to the protocol.
|
| Identical here means having the same fingerprint - i.e. you
| could not write a function to reliably distinguish traffic
| from one or the other implementation (and if you can then
| that's a bug).
| jsnell wrote:
| The packets from Chrome wouldn't be exactly the same as packets
| sent by Chrome at a different time either. "The exact same
| packets" is not a viable benchmark, since both the client and
| the server randomize the payloads in various ways. (E.g. key
| exchange, GREASE).
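|
| To make that concrete: fingerprinting schemes like JA3
| normalize the randomness away before hashing. A toy Python
| sketch (field values here are made up; real JA3 parses the
| actual Client Hello bytes) showing that two hellos differing
| only in their random GREASE values still hash identically:
|
|     import hashlib
|
|     # GREASE values have the form 0xXaXa: 0x0a0a ... 0xfafa
|     GREASE = {(0x10 * i + 0x0A) * 0x0101 for i in range(16)}
|
|     def ja3(version, ciphers, extensions, curves, formats):
|         # strip GREASE, then hash the stable parameters
|         def clean(vals):
|             return "-".join(str(v) for v in vals
|                             if v not in GREASE)
|         s = ",".join([str(version), clean(ciphers),
|                       clean(extensions), clean(curves),
|                       clean(formats)])
|         return hashlib.md5(s.encode()).hexdigest()
|
|     # differ only in GREASE -> identical fingerprint
|     print(ja3(771, [0x2A2A, 4865], [0x1A1A, 0, 23],
|               [29, 23], [0]))
|     print(ja3(771, [0xFAFA, 4865], [0xBABA, 0, 23],
|               [29, 23], [0]))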
| peetistaken wrote:
| You can check your fingerprint on https://tls.peet.ws
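|
| For example, with the curl_cffi bindings mentioned elsewhere
| in this thread (a sketch: the /api/all path and JSON field
| names are assumptions based on tls.peet.ws's output and may
| change; impersonate="chrome" picks the newest bundled Chrome
| profile per the curl_cffi docs):
|
|     import requests
|     from curl_cffi import requests as curl_requests
|
|     URL = "https://tls.peet.ws/api/all"
|     stock = requests.get(URL).json()  # default fingerprint
|     faked = curl_requests.get(URL,
|                               impersonate="chrome").json()
|
|     print("stock  JA3:", stock["tls"]["ja3_hash"])
|     print("chrome JA3:", faked["tls"]["ja3_hash"])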
| peetistaken wrote:
| https://github.com/bogdanfinn/tls-client is the go-to package for
| the Go world; it does the same thing.
| zlagen wrote:
| In case anyone is interested, I created something similar but for
| Python (using Chromium's network stack):
| https://github.com/lagenar/python-cronet I'm looking for help to
| create the build for Windows.
| hk__2 wrote:
| Any reason you didn't use
| https://github.com/lexiforest/curl_cffi?
| zlagen wrote:
| I wanted to try a different approach, which is to use Chromium's
| network stack directly instead of patching curl to
| impersonate it. In this case you're using the real thing, so
| it's a bit easier to maintain when there are changes in the
| fingerprint.
| Klonoar wrote:
| Similar projects exist for C#
| (https://github.com/sleeyax/CronetSharp), Go
| (https://github.com/sleeyax/cronet-go) and Rust
| (https://github.com/sleeyax/cronet-rs).
|
| These _can_ work well in some cases, but it's always a
| tradeoff.
| thrdbndndn wrote:
| Any plan to offer a sync API?
| Retr0id wrote:
| I recently used ja3proxy, which uses utls for the impersonation.
| It exposes an HTTP proxy that you can use with any regular HTTP
| client (unmodified curl, python, etc.) and wraps it in a TLS
| client fingerprint of your choice. Although I don't think it does
| anything special for http/2, which curl-impersonate does
| advertise support for.
|
| https://github.com/LyleMi/ja3proxy
|
| https://github.com/refraction-networking/utls
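|
| Usage is simple: point any client at the proxy and let it
| re-originate the TLS connection. A sketch (the port is
| whatever you started ja3proxy with; since the proxy
| terminates TLS itself, the origin's certificate won't verify
| unless you trust the proxy's CA, hence verify=False here):
|
|     import requests
|
|     proxies = {"http": "http://127.0.0.1:8080",
|                "https": "http://127.0.0.1:8080"}
|     r = requests.get("https://example.com", proxies=proxies,
|                      verify=False)
|     print(r.status_code)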
| TekMol wrote:
| What is the use case? If you have to read data from one specific
| website which uses handshake info to avoid being read by
| software?
|
| When I have to do HTTP requests these days, I default to a
| headless browser right away, because that seems to be the best
| bet. Even then, some websites are not readable because they use
| captchas and whatnot.
| mschuster91 wrote:
| > What is the use case? If you have to read data from one
| specific website which uses handshake info to avoid being read
| by software?
|
| Evade captchas. curl user agent / heuristics are blocked by
| many sites these days - I'd guess many popular CDNs have pre-
| defined "block bots" stuff that blocks everything automated
| that is not a well-known search engine indexer.
| adastral wrote:
| > I default to a headless browser
|
| Headless browsers consume orders of magnitude more resources,
| and execute far more requests (e.g. fetching images) than a
| common webscraping job would require. Having run webscraping at
| scale myself, the cost of operating headless browsers made us
| only use them as a last resort.
| TekMol wrote:
| So you maintain a table of domains and how to access them?
|
| How do you build that table and keep it up to date? Manually?
| at0mic22 wrote:
| Blocking all image/video/CSS requests is the rule of thumb
| when working with headless browsers via CDP
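|
| A minimal sketch of that, using Playwright's Python API
| rather than raw CDP (the URL is a placeholder):
|
|     from playwright.sync_api import sync_playwright
|
|     BLOCKED = {"image", "media", "stylesheet", "font"}
|
|     with sync_playwright() as p:
|         browser = p.chromium.launch(headless=True)
|         page = browser.new_page()
|         # abort asset requests by resource type, pass the rest
|         page.route("**/*", lambda route: route.abort()
|                    if route.request.resource_type in BLOCKED
|                    else route.continue_())
|         page.goto("https://example.com")
|         print(page.title())
|         browser.close()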
| sangnoir wrote:
| Speaking as a person who has played on both offense and
| defense: this is a heuristic that's not used frequently
| enough by defenders. Clients that load a single HTML/JSON
| endpoint without loading css or image resources associated
| with the endpoints are likely bots (or user agents with a
| fully loaded cache, but defenders control what gets cached
| by legit clients and how). Bot data thriftiness is a huge
| signal.
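|
| A minimal sketch of that heuristic over access logs (the
| decoy path and log shape are hypothetical):
|
|     from collections import defaultdict
|
|     DECOY = "/static/site.css"  # asset every legit load pulls
|
|     def thrifty_sessions(records):
|         """records: iterable of (session_id, path) tuples."""
|         seen = defaultdict(set)
|         for session, path in records:
|             seen[session].add(path)
|         # fetched a page but never the decoy -> suspicious
|         return [s for s, paths in seen.items()
|                 if any(p.endswith(".html") for p in paths)
|                 and DECOY not in paths]
|
|     logs = [("a", "/index.html"), ("a", DECOY),  # browser-like
|             ("b", "/index.html")]                # HTML only
|     print(thrifty_sessions(logs))  # ['b']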
| at0mic22 wrote:
| As a high-load system engineer, you'd want to offload
| asset serving to a CDN, which makes detection slightly more
| complicated. The easy way is to attach an image onload
| handler with client js, but that would give a high yield
| of false positives. I personally have never seen such an
| approach and doubt it's useful in many cases.
| sangnoir wrote:
| Unless organization policy forces you to, you do not have
| to put _all_ resources behind a CDN. As a matter of fact,
| getting this heuristic to work requires a non-optimal
| caching strategy of one or more real or decoy resources -
| CDN or not. "Easy" is not an option in the bot/anti-bot
| arms race; all the low-hanging fruit is now gone when
| fighting a determined adversary on either end.
|
| > I personally have never seen such an approach and doubt
| it's useful in many cases.
|
| It's an arms race and defenders are not keen on sharing
| their secret sauce, though I can't be the only one who
| thought of this rather basic bot characteristic, multiple
| abuse teams probably realized this decades ago. It works
| pretty well against the low-resource scrapers with fake
| UA strings and all the right TLS handshakes. It won't
| work against headless browsers, which cost scrapers
| more in resources and bandwidth, and there are specific
| countermeasures for headless browsers [1], and counter-
| countermeasures. It's a cat and mouse game.
|
| 1. e.g. Mouse movement, made famous as one signal
| evaluated by Google's reCAPTCHA v2, monitor resolution &
| window size and position, and Canvas rendering, all of
| which have been gradually degraded by browser anti-
| fingerprinting efforts. The bot war is fought on the long
| tail.
| zzo38computer wrote:
| Even legitimate users might want to disable CSS and
| pictures and whatever, and I often do when I just want to
| read the document.
|
| Blind users also might have no use for the pictures, and
| another possibility is that, if the document is longer than
| the screen and a picture is out of view, the user might
| program the client software to lazy-load it.
| jollyllama wrote:
| > The Client Hello message that most HTTP clients and libraries
| produce differs drastically from that of a real browser.
|
| Why is this?
| throwaway99210 wrote:
| Based on what I've seen, most command-line clients and basic
| HTTP libraries typically ship with leaner, more static
| configurations (e.g., no GREASE extensions in the Client Hello,
| limited protocols in the ALPN extension, a smaller number
| of Signature Algorithms). Mirroring real browser TLS
| fingerprints is also more difficult due to the randomization of
| Client Hello parameters (e.g., in current versions of Chrome).
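|
| You can see how static the defaults are, e.g. in Python's ssl
| module (the exact list depends on the linked OpenSSL build,
| but unlike Chrome it never varies per connection and has no
| GREASE):
|
|     import ssl
|
|     ctx = ssl.create_default_context()
|     for c in ctx.get_ciphers():
|         print(c["name"], c["protocol"])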
| Retr0id wrote:
| The protocols are flexible, and most browsers bring their own
| HTTP+TLS clients.
| zlagen wrote:
| They use different SSL libraries/configuration. Chrome uses
| BoringSSL and other libraries may use OpenSSL or some other
| library. Besides that, the SSL library may be configured with
| different cipher suites and extensions. The solution these
| impersonators provide is to use the same SSL library and
| configuration as a real browser.
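|
| With curl_cffi (mentioned below) that switch is one call; a
| sketch, with "chrome" naming the newest bundled Chrome
| profile per its docs:
|
|     from curl_cffi import requests
|
|     # the underlying patched curl is linked against BoringSSL
|     # and configured like the real browser, so the TLS/HTTP2
|     # fingerprint matches Chrome's
|     r = requests.get("https://example.com",
|                      impersonate="chrome")
|     print(r.status_code)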
| cle wrote:
| The same author also makes Python bindings for this, which expose
| a requests-like API; very helpful for making HTTP requests
| without the overhead of running an entire browser stack:
| https://github.com/lexiforest/curl_cffi
|
| I can't help but feel like these are the dying breaths of the
| open Internet though. All the megacorps (Google, Microsoft,
| Apple, CloudFlare, et al) are doing their damnedest to make sure
| everyone is only using software approved by them, and to ensure
| that they can identify you. From multiple angles too (security,
| bots, DDoS, etc.), and it's not just limited to browsers either.
|
| End goal seems to be: prove your identity to the megacorps so
| they can track everything you do and also ensure you are only
| doing things they approve of. I think the security arguments are
| just convenient rationalizations in service of this goal.
| throwaway99210 wrote:
| > I can't help but feel like these are the dying breaths of the
| open Internet though
|
| I agree about the overzealous tracking by the megacorps, but
| this is also due to bad actors. I work for a financial company,
| and the amount of API abuse, ATO, DDoS, nefarious bot traffic,
| etc. we see on a daily basis is absolutely insane.
| berkes wrote:
| But how much of this "bad actor" interaction is countered
| with tracking? And how many of these attempts are even close
| to successful with even the simplest out-of-the-box security
| practices set up?
|
| And when it does get more dangerous, is overzealous tracking
| the best counter for this?
|
| I've dealt with a lot of these threats as well, and a lot are
| countered with rather common tools, from simple fail2ban
| rules to application firewalls and private subnets and
| whatnot. E.g. a large fail2ban rule to just ban anything that
| attempts to HTTP GET /admin.php or /phpmyadmin etc, even just
| once, gets rid of almost all nefarious bot traffic.
|
| So, I think the amount of attacks can indeed be insane. But
| the number that need overzealous tracking to be countered
| is, AFAICS, rather small.
| throwaway99210 wrote:
| > E.g. a large fail2ban rule to just ban anything that
| attempts to HTTP GET /admin.php or /phpmyadmin etc, even
| just once, gets rid of almost all nefarious bot traffic.
|
| Unfortunately, fail2ban wouldn't even make a dent in the
| attack traffic hitting the endpoints in my day-to-day work;
| these are attackers utilizing residential proxy
| infrastructure who are increasingly capable of solving
| JS/client-puzzle challenges... the arms race is always
| escalating.
| JohnMakin wrote:
| We see the same thing, also at a financial company. The most
| successful strategy we've seen is making stuff like this
| extremely expensive for whoever it is when we see it; they
| stop or slow down to the point it becomes not worth it and
| move on. Sometimes that's really all you can do without
| harming legit traffic.
| josephcsible wrote:
| Such a rule is a great way to let malicious users lock
| out a bunch of your legitimate customers. Imagine if
| someone makes a forum post and includes this in it:
| [img]https://example.com/phpmyadmin/whatever.png[/img]
| tialaramex wrote:
| A big problem is that even where we have a good solution,
| you'll lose if you insist on it while other people get
| away with doing something that's crap but that customers
| like better. We often have to _mandate_ a poor solution that
| will be tolerated, because if we mandate the better solution
| it will be rejected, and if we don't mandate anything the
| outcomes are far worse.
|
| Today for example I changed energy company+. I made a
| telephone call, from a number the company has never seen
| before. I told them my name (truthfully but I could have
| lied) and address (likewise). I agreed to about five
| minutes of parameters, conditions, etc. and I made one
| actual meaningful choice (a specific tariff, they offer
| two). I then provided 12 digits identifying a bank account
| (they will eventually check this account exists and ask it
| to pay them money, which by default will just work) and I'm
| done.
|
| Notice that _anybody_ could call from a burner and that
| would work too. They could move Aunt Sarah's energy to
| some random outfit, assign payments to Jim's bank account,
| and cause maybe an hour of stress and confusion for both
| Sarah and Jim when months or years later they realise the
| problem.
|
| We know how to do this properly, but it would be high
| friction and that's not in the interests of either the
| "energy companies" or the politicians who created this
| needlessly complicated "Free Market" for energy. We could
| abolish that Free Market, but again that's not in their
| interests. So, we're stuck with this waste of our time and
| money, indefinitely.
|
| There have been _simpler_ versions of this system, which
| had even worse outcomes. They're clumsier to use, they
| cause more people to get scammed AND they result in higher
| cost to consumers, so that's not great. And there are
| _better_ systems we can't deploy because in practice too
| few consumers will use them, so you'd have 0% failure but
| lower total engagement and that's what matters.
|
| + They don't actually supply either gas or electricity,
| that's a last mile problem solved by a regulated monopoly,
| nor do they make electricity or drill for gas - but they do
| bill me for the gas and electricity I use - they're an
| artefact of Capitalism.
| Szpadel wrote:
| I can tell you about my experience with blocking traffic
| from scalper bots that were very active during the pandemic.
|
| All requests produced by those bots were valid ones,
| nothing that could be flagged by tools like fail2ban etc
| (my assumption is that it would be the same for financial
| systems).
|
| Any blocking or rate limiting by IP is useless: we saw
| about 2-3 requests per minute per IP, and those actors had
| access to a ridiculous number of large CIDRs; blocking any
| IP caused them to instantly replace it with another.
|
| Blocking by AS number was also a mixed bag, as the list
| grew really quickly; most of those were registered to
| suspicious-looking Gmail addresses. (I feel that such
| activity might own a significant percentage of the total
| IPv4 space.)
|
| This was basically a cat-and-mouse game of finding some
| specific characteristic in requests that matched all that
| traffic and filtering on it, but the other side would adapt
| the next day or on Sunday.
|
| The aggregate amount of traffic was in the range of 2-20k r/s
| to basically the heaviest endpoint in the shop, which was the
| main reason we needed to block that traffic (it generated
| 20-40x the load of organic traffic).
|
| Cloudflare was also not really successful with the default
| configuration; we had to basically challenge everyone by
| default, with a whitelist of the most common regions where we
| expected customers.
|
| So the best solution is to track everyone and calculate
| long-term reputation.
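|
| A sketch of that idea (the keying, scoring, and thresholds
| here are all made up):
|
|     import time
|
|     class Reputation:
|         """Long-term, decaying reputation per client key
|         (e.g. TLS fingerprint + ASN)."""
|         def __init__(self, half_life=86400.0):
|             self.half_life = half_life
|             self.scores = {}  # key -> (score, last_update)
|
|         def update(self, key, delta):
|             score, last = self.scores.get(key,
|                                           (0.0, time.time()))
|             # exponential decay so old behavior fades out
|             score *= 0.5 ** ((time.time() - last)
|                              / self.half_life)
|             self.scores[key] = (score + delta, time.time())
|
|         def should_challenge(self, key, threshold=-5.0):
|             return self.scores.get(key, (0.0, 0))[0] < threshold
|
|     rep = Reputation()
|     rep.update("ja3:abc/AS64496", -1.0)  # hit heavy endpoint
|     rep.update("ja3:abc/AS64496", -6.0)  # failed JS challenge
|     print(rep.should_challenge("ja3:abc/AS64496"))  # True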
| stareatgoats wrote:
| Blocking scalper bot traffic by any means, be it by
| source or certified identification, seems a lost cause,
| i.e. not possible because it can always be circumvented.
| Why did you not have that filter at point of sale
| instead? I'm sure there are reasons, but to have a
| battery of captchas and a limit on purchases per credit
| card seems on the surface much more sturdy. And it
| doesn't require that everyone browsing the internet
| announce their full name and residential address in order
| to satisfy the requirements of a social score ...
| Szpadel wrote:
| The product they tried to buy was not in stock anyway,
| but their strategy was to constantly try regardless, so
| that in case it came back in stock they would be the first
| to get it. It was all guest checkout, so there was no
| address to validate yet, nor a credit card. Because they
| used the same API endpoints as the frontend, we could not
| use any captcha at that point for technical reasons.
|
| As stated before, the main reason we needed to block it
| was the volume of the traffic; you might imagine an
| identical scenario when dealing with a DDoS attack.
| bornfreddy wrote:
| > Because they used the same API endpoints as the frontend,
| we could not use any captcha at that point for technical
| reasons.
|
| That doesn't compute... Captcha is almost always used in
| such setups.
|
| It also looks like you could just offer an API endpoint
| which would return whether the article is in stock or not, or
| even provide a webhook. Why fight them? Just make the
| resource usage lighter.
|
| I'm curious now though what the articles were, if you are
| at liberty to share?
| Szpadel wrote:
| We had a captcha, but it was at a later stage of the
| checkout process. This API endpoint needed to work from
| cached pages, so it could not contain any dynamic state in
| the request.
|
| Some bots checked the product page, where we had info on
| whether the product was in stock (although they tried
| heavily to bypass any caches by putting garbage in the
| URL). These bots also scaled instantly to thousands of
| checkout requests when a product became available, which
| gave no time for auto-scaling to react (this was another
| challenge here).
|
| This was easy to mitigate, so it generated almost no load
| on the system.
|
| I believe we had email notifications available, but that
| may have been too high-latency a channel for them.
|
| I'm not sure how much I can share about the articles here,
| but I can say that they were fairly expensive (and
| limited-series) wardrobe products.
| shaky-carrousel wrote:
| Hm, it's probably too late, but you could have implemented
| some kind of proof of work in your API calls. Something
| that's not too onerous for a casual user but is expensive
| for someone making many requests.
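|
| A hashcash-style sketch of that (difficulty tunes the
| client's cost; verification stays cheap for the server):
|
|     import hashlib, itertools, os
|
|     def solve(challenge: bytes, difficulty: int = 4) -> int:
|         # grind nonces until the hash has `difficulty`
|         # leading zero hex digits (~16**difficulty tries)
|         target = "0" * difficulty
|         for nonce in itertools.count():
|             digest = hashlib.sha256(
|                 challenge + str(nonce).encode()).hexdigest()
|             if digest.startswith(target):
|                 return nonce
|
|     def verify(challenge: bytes, nonce: int,
|                difficulty: int = 4) -> bool:
|         digest = hashlib.sha256(
|             challenge + str(nonce).encode()).hexdigest()
|         return digest.startswith("0" * difficulty)
|
|     chal = os.urandom(8)        # issued per request
|     nonce = solve(chal)         # the client pays this cost
|     print(verify(chal, nonce))  # True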
| miki123211 wrote:
| > Why fight them? Just make the resource usage lighter.
|
| Because you presumably want real, returning customers,
| and that means those customers need to get a chance at
| buying those products, instead of them being scooped up
| by a scalper the millisecond they appear on the website.
| geysersam wrote:
| Sounds like a dream having customers scooping up your
| products the millisecond they appear on the website. They
| should increase their prices.
| sesm wrote:
| I remember people doing this with PS5 when they were in
| short supply after release.
| cute_boi wrote:
| why not charge people? This is the only solution I can
| think of.
| shwouchk wrote:
| Require a verified account to buy high demand items.
| codingminds wrote:
| I've learned that Akamai has a service that deals with
| this specific problem; maybe it might interest you as
| well: https://www.akamai.com/products/content-protector
| mattpallissard wrote:
| That's not the same type of botnet. Fail2ban simply is
| not going to work when you have a popular unauthenticated
| endpoint. You have hundreds of thousands of rps spread
| across thousands of legitimate networks. The requests
| are always modified to look legitimate in a never-ending
| game of whack-a-mole.
|
| You wind up having to use things like TLS fingerprinting
| with other heuristics to identify what traffic to
| reject. These all take engineering hours and require
| infrastructure. It is SO MUCH SIMPLER to require auth and
| reject everything else outright.
|
| I know that the BigCo's want to track us and you originally
| mentioned tracking not auth. But my point is yeah, they
| have malicious reasons for locking things down, but there
| are legitimate reasons too.
| sangnoir wrote:
| > You wind up having to use things like TLS
| fingerprinting
|
| ...and we've circled back to the post's subject - a
| version of curl that impersonates browsers' TLS handshake
| behavior to bypass such fingerprinting.
| fijiaarone wrote:
| An easy way to rate limit: require an initial request to
| get a one-time token, issued with a 1-second delay, and
| then require valid requests to include the token. The token
| returned has a salt with something like the timestamp and
| IP. That way they can only bombard the token generator.
|
| get /token
|
| Returns a token with the timestamp in a salted hash.
|
| get /resource?token=abc123xyz
|
| Checks for a valid token; drops or denies otherwise.
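|
| A sketch of that scheme, with an HMAC standing in for the
| salted hash (the key and TTL are made up):
|
|     import hashlib, hmac, time
|
|     SECRET = b"server-side-secret"  # hypothetical key
|     TTL = 30                        # token lifetime, seconds
|
|     def issue_token(ip: str) -> str:
|         ts = str(int(time.time()))
|         sig = hmac.new(SECRET, f"{ts}|{ip}".encode(),
|                        hashlib.sha256).hexdigest()
|         return f"{ts}.{sig}"
|
|     def check_token(token: str, ip: str) -> bool:
|         try:
|             ts, sig = token.split(".")
|             age = time.time() - int(ts)
|         except ValueError:
|             return False
|         if age > TTL:
|             return False  # expired: drop or deny
|         good = hmac.new(SECRET, f"{ts}|{ip}".encode(),
|                         hashlib.sha256).hexdigest()
|         return hmac.compare_digest(sig, good)
|
|     t = issue_token("203.0.113.7")
|     print(check_token(t, "203.0.113.7"))   # True
|     print(check_token(t, "198.51.100.9"))  # False: wrong IP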
| jsnell wrote:
| The question is a bit of a non sequitur, since this is not
| tracking. The TLS fingerprint is not a useful tracking
| vector, neither by itself nor as part of some composite
| fingerprint.
| fijiaarone wrote:
| The point is that you have to use an approved client (eg
| browser, os) with an approved cert authority that goes
| through approved gatekeepers (eg Cloudflare, Akamai).
| miki123211 wrote:
| This depends on what you're fighting.
|
| If you're fighting adversaries that go for scale, AKA
| trying to hack as many targets as possible, mostly low-
| sophistication, using techniques requiring 0 human work and
| seeing what sticks, yes, blocking those simple techniques
| works.
|
| Those attackers don't ever expect to hack Facebook or your
| bank, that's just not the business they're in. They're fine
| with posting unsavory ads on your local church's website,
| blackmailing a school principal with the explicit pictures
| he stores on the school server, or encrypting all the data
| on that server and demanding a ransom.
|
| If your company does something that is specifically
| valuable to someone, and there are people _whose literal
| job it is to attack your company 's specific systems_, no,
| those simple techniques won't be enough.
|
| If you're protecting a Church with 150 members, the simple
| techniques are probably fine, if you're working for a major
| bank or a retailer that sells gaming consoles or concert
| tickets, they're laughably inadequate.
| cle wrote:
| Yep totally agree these are problems. I don't have a good
| alternative proposal either; I'm just disappointed with what
| we're converging on.
| code51 wrote:
| Much of this "bad actor" activity is actually customer needs
| left hanging - for either the customer to automate herself or
| other companies to fill the gap to create value that's not
| envisioned by the original company.
|
| I'm guessing investors actually like a healthy dose of open
| access and a healthy dose of defence. We see them (YC, as an
| example) betting on multiple teams addressing the same
| problem. The difference is their execution, the angle they
| attack from.
|
| If, say, the financial company you work for is capable in
| both product and technical aspects, I assume it leaves no gap.
| It's the main place to access the service and all the side
| benefits.
| miki123211 wrote:
| > Much of this "bad actor" activity is actually customer
| needs left hanging: either for the customer to automate
| herself, or for other companies to fill the gap and create value
|
| Sometimes the customer you have isn't the customer you
| want.
|
| As a bank, you don't want the customers that will try to
| log in to 1000 accounts, and then immediately transfer any
| money they find to the Seychelles. As a ticketing platform,
| you don't want the customers that buy tickets and then
| immediately sell them on for 4x the price. As a messaging
| app, you don't want the customers who have 2000 bot
| accounts and use AI to send hundreds of thousands of spam
| messages a day. As a social network, you don't want the
| customers who want to use your platform to spread pro-
| Russian misinformation.
|
| In a sense, those are "customer needs left hanging", but
| neither you nor other customers want those needs to be
| automatable.
| schnable wrote:
| A lot of the motivation comes from government regulations too.
| Right now this is mostly in banking, but social media and porn
| regs are coming too.
| lelandfe wrote:
| PornHub and all of its affiliate sites now block all
| residents of Alabama, Arkansas, Idaho, Indiana, Kansas,
| Kentucky, Mississippi, Montana, Nebraska, North Carolina,
| Texas, Utah, and Virginia (and Florida on Jan 1):
| https://www.pcmag.com/news/pornhub-blocked-florida-
| alabama-t...
|
| Child safety, as always, was the sugar that made the medicine
| go down in freedom-loving USA. I imagine these states'
| approaches will try to move to the federal level after
| Section 230 dies an ignominious death.
|
| Keep an eye out for _Free Speech Coalition v. Paxton_ to hit
| SCOTUS in January: https://www.oyez.org/cases/2024/23-1122
| octocop wrote:
| "I have nothing to hide" will eventually spread to everyone.
| Very unfortunate.
| cle wrote:
| I'm in a similar boat but it's more like "I have nothing I
| can hide".
|
| These days I just tell friends & family to assume that
| nothing they do is private.
| Habgdnv wrote:
| The answer is simple: I have something to hide. I have many
| things to hide, actually. None of these things is illegal
| currently, but I still have many things to hide. And if I have
| something to hide, I can be worried about many things.
| deadbabe wrote:
| Even if the internet was wide open it's of little use these
| days.
|
| AI will replace any search you would want to do to find
| information; the only reason to scour the internet now is for
| social purposes: finding comments and forums or content from
| other users, and you don't really need to be untracked to do
| all that.
|
| A megacorp's main motivation for tracking your identity is to
| sell you shit or sell your data to other people who want to
| sell you things. But if you're using AI the amount of ads and
| SEO spam that you have to sift through will dramatically
| reduce, rendering most of those efforts pointless.
|
| And most people aren't using the internet like in the old days:
| stumbling across quaint cozy boutique websites made by
| hobbyists about some favorite topic. People just jump on social
| platforms and consume content until satisfied.
|
| There is no money to be made anymore in mass web scraping at
| scale with impersonated clients; it's all been consumed.
| oefrha wrote:
| What are some example sites where this is both necessary and
| sufficient? In my experience sites with serious anti-bot
| protection basically always have JavaScript-based browser
| detection, and some are capable of defeating puppeteer-extra-
| plugin-stealth even in headful mode. I doubt sites without
| serious anti-bot detection will do TLS fingerprinting. I guess it
| is useful for the narrower use case of getting a short-lived
| token/cookie with a headless browser on a heavily defended site,
| then performing requests using said tokens with this lightweight
| client for a while?
| jonatron wrote:
| There are sites that will block curl and python-requests
| completely, but will allow curl-impersonate. IIRC, Amazon is an
| example that has some bot protection but it isn't "serious".
| ekimekim wrote:
| In most cases this is just based on the user agent. It's
| widespread enough that I just habitually tell requests not to
| set a User-Agent at all (requests with no UA aren't blocked,
| but ones whose UA contains "python" are).
| Retr0id wrote:
| A lot of WAFs make it a simple thing to set up. Since it
| doesn't require any application-level changes, it's an easy
| "first move" in the anti-bot arms race.
|
| At the time I wrote this up, r1-api.rabbit.tech required TLS
| client fingerprints to match an expected value, and not much
| else:
| https://gist.github.com/DavidBuchanan314/aafce6ba7fc49b19206...
|
| (I haven't paid attention to what they've done since so it
| might no longer be the case)
| oefrha wrote:
| Makes sense, thanks.
| Avamander wrote:
| CloudFlare offers it. Even if it's not used for blocking it
| might be used for analytics or threat calculations, so you
| might get hit later.
| thrdbndndn wrote:
| Lots of sites, actually.
|
| > I doubt sites without serious anti-bot detection will do TLS
| fingerprinting
|
| They don't set it up themselves; Cloudflare offers such a thing
| by default (?).
| oefrha wrote:
| Pretty sure it's not default, and Cloudflare browser check
| and/or captcha is a way bigger problem than TLS
| fingerprinting; at least that was the case the last time I
| scraped a site behind Cloudflare.
| remram wrote:
| Those JavaScript scripts often get data from some API, and it's
| that API that will usually be behind some fingerprinting wall.
| ape4 wrote:
| I like this project!
|
| Is there a way to request impersonation of the current version
| of Chrome (or whatever)?
| aninteger wrote:
| I think we should list the sites where this fingerprinting is
| done. I have a suspicion that Microsoft does it for conditional
| access policies but I am not sure of other services.
| Galanwe wrote:
| We cannot really list them, as 90% of the time, it's not the
| websites themselves, it's their WAF. And there is a trend
| toward most company websites being behind a WAF nowadays to
| avoid 1) annoying regulations (US companies putting geoblocks on
| their websites to avoid EU cookie regulations) and 2) DDoS.
|
| It's now pretty common to have Cloudflare, AWS, etc. WAFs as
| the main endpoints, and these do anti-bot checks (TLS
| fingerprinting, header fingerprinting, JavaScript checks,
| captchas, etc.).
| pixelesque wrote:
| Cloudflare (which seems to be fronting half the web these days,
| based off the number of cf-ray headers that I see being sent
| back) does this with bot protection on, and Akamai has
| something similar I think.
| Sytten wrote:
| Thankfully only a small fraction of websites do JA3/JA4
| fingerprinting. Some do more advanced stuff like correlating
| headers to the fingerprint. We have been able to get away without
| doing much in Caido for a long time, but I am working on an OSS
| Rust-based equivalent. Neat trick: you can use the fingerprint of
| our competitor (Burp Suite), since it is whitelisted so that the
| security folks can do their job. It's the only time you will not
| hear me complain about checkbox security.
| jandrese wrote:
| The build scripts in this repo seem a bit cursed. It uses
| autotools but has you build in a subdirectory. The default
| build target is a help text instead of just building the project.
| When you do use the listed build target, it doesn't have the
| dependencies set up correctly, so you have to run it like 6 times
| to get to the point where it is building the application.
|
| Ultimately I was not able to get it to build because the
| BoringSSL distro it downloaded failed to build, even though I made
| sure all of the dependencies INSTALL.md listed were installed.
| This might be because the machine I was trying to build it on is
| an older Ubuntu 20 release.
|
| Edit: Tried it on Ubuntu 22, but BoringSSL again failed to build.
| The make script did work better, however, only requiring a single
| invocation of make chrome-build before blowing up.
|
| Looks like a classic case of "don't ship -Werror because compiler
| warnings are unpredictable".
|
| Died on:
|
| /extensions.cc:3416:16: error: 'ext_index' may be used
| uninitialized in this function [-Werror=maybe-uninitialized]
|
| The good news is that removing -Werror from the CMakeLists.txt in
| BoringSSL got around that issue. Bad news is that the dependency
| list is incomplete. You will also need libc++-XX-dev and
| libc++abi-XX-dev where the XX is the major version number of GCC
| on your machine. Once you fix that it will successfully build,
| but the install process is slightly incomplete. It doesn't run
| ldconfig for you, you have to do it yourself.
|
| On a final note, despite the name, BoringSSL is a huge library
| that takes a surprisingly long time to build. I thought it would
| be like LibreSSL, where they trim it down to the core to keep the
| attack surface small, but apparently Google went in the opposite
| direction.
| at0mic22 wrote:
| I played this game and switched to prebuilt libraries. I think
| the builder Docker images have also been broken for a while.
| 38 wrote:
| That's exactly why I stopped using C/C++. Building is often a
| nightmare, and the language teams seem to have no interest
| in improving the situation.
| kerblang wrote:
| Interesting in light of another much-discussed story about AI
| scraper farms swamping/DDOSing sites
| https://news.ycombinator.com/item?id=42549624
___________________________________________________________________
(page generated 2024-12-30 23:00 UTC)