[HN Gopher] Reverse Engineering TikTok's VM Obfuscation
___________________________________________________________________
Reverse Engineering TikTok's VM Obfuscation
Author : hazebooth
Score : 569 points
Date : 2022-12-23 19:36 UTC (1 day ago)
(HTM) web link (nullpt.rs)
(TXT) w3m dump (nullpt.rs)
| mhasbini wrote:
| Deobfuscated script without the vm part:
| https://gist.github.com/mhasbini/f9269d230ed8eb6dfdbb1bd1be9...
| Aperocky wrote:
| Isn't the same concept also used in YouTube? I believe a Python
| mock of the equivalent VM exists in youtube-dl.
| mdaniel wrote:
| I recall that discussion recently, and thus just happen to have
| it handy:
|
| a very, very specialized "regex" based JS evaluator that
| presumably did just enough to make the YT one run:
| https://github.com/ytdl-org/youtube-dl/blob/2021.12.17/youtu...
|
| and its callsite: https://github.com/ytdl-org/youtube-
| dl/blob/2021.12.17/youtu...
|
| So the short version is that I would not classify that as a VM,
| and I don't even believe it's obfuscated. Perhaps there are
| other extractors that do what you're describing; I didn't go
| looking.
| linux2647 wrote:
| IIRC not exactly. YouTube provides some arbitrary JavaScript
| that must be evaluated as a form of a challenge. It changes
| with every page request, but it's just a set of math
| operations. It's easier to evaluate the JS than to statically
| analyze it
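A toy version of what that looks like in practice. The challenge body below is invented for illustration; the point is that the challenge is a small pure function over a string, so the simplest "solver" just evaluates it rather than statically analyzing it:

```javascript
// A stand-in for the kind of throwaway JS challenge described above.
// Real challenges change with every page request; this one is made up.
const challenge = `s => s.split('').reverse().join('').slice(2)`;

// Evaluating the JS is easier than statically analyzing it:
const solve = eval(challenge);
console.log(solve('abcdefg')); // 'edcba'
```

youtube-dl's approach is a restricted interpreter rather than a raw `eval`, which avoids executing arbitrary hostile code, but the principle is the same.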
| apienx wrote:
| Solid case! Thanks for taking the time to write it up.
|
| Those who care and have to use TikTok can probably add their own
| virtualization layer (and tolerate the hit in cost/performance).
| chinathrow wrote:
| No one _has_ to use social media.
| QuantumGood wrote:
| Wouldn't an example be a job that requires it? Are you
| attempting a meta comment, and really mean something like
| "anyone can quit a job that requires social media usage"?
| jesuspiece wrote:
| They're trying to be unique and cool by denouncing the use
| of social media
| antiviral wrote:
| This is excellent work.
|
| It also shows how TikTok _may_ be in violation of several US/EU
| privacy laws. I really wonder now who this data is shared with.
| Perhaps someone should bring this article to the FTC's attention
| for further review.
| codedokode wrote:
| It is interesting that, while technologies like canvas, WebGL or
| WebRTC were intended for other purposes, their main usage became
| fingerprinting. For example, WebGL provides valuable information
| about GPU model and its drivers.
|
| This shows how browser developers race to provide new features
| ignoring privacy impact.
|
| I don't understand why features that allow fingerprinting
| (reading back canvas pixels or GPU buffers) are not hidden behind
| a permission.
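The fingerprinting pattern the comment describes boils down to: read back device-dependent values (canvas pixels, the WebGL renderer string), then hash them into a stable identifier. The browser API calls are shown only in comments since they need a real browser; the hash below (FNV-1a) is a runnable stand-in for what trackers do with those bytes:

```javascript
// FNV-1a: a tiny, deterministic hash commonly used for cheap fingerprints.
function fnv1a(str) {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // FNV prime, kept in uint32 range
  }
  return h.toString(16);
}

// In a browser, the input would be real device-dependent data, e.g.:
//   const gl = document.createElement('canvas').getContext('webgl');
//   const ext = gl.getExtension('WEBGL_debug_renderer_info');
//   fnv1a(gl.getParameter(ext.UNMASKED_RENDERER_WEBGL));
// The GPU string below is an invented example value.
console.log(fnv1a('ANGLE (NVIDIA GeForce RTX 3080 Direct3D11)'));
```

The same input always produces the same hash, which is exactly what makes readable GPU/canvas data useful for tracking.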
| PetahNZ wrote:
| Come on, it's not their main usage... An intentional side
| effect maybe, but their main usage is clear.
| 0xy wrote:
| If something is used 99% of the time for tracking and 1% of
| the time for genuine useful reasons, it's safe to say it's a
| tracking mechanism.
|
| Intent is irrelevant, the APIs are fundamentally insecure.
| Google directly benefits from this financially.
| ivoras wrote:
| Of course it's not that simple.
|
| In most parts of the world, if a person is in a public space,
| anyone can take a photo of that person, including shop owners.
| This photo could be considered as a type of "fingerprint" for
| that person. The only important difference is that in some
| countries, you are not allowed to make money off of such photos.
|
| The Internet is a lot like a big public space, and possibly
| worse - while you are using certain services (web pages or
| apps), it might be argued that you are actually "on premises"
| for that service provider.
|
| The best we can do now is more and more education about what
| can go wrong with such data collection.
| ajsnigrutin wrote:
| Yes, but taking photos is expensive, fingerprinting online is
| cheap. Also, there's a difference between taking a photo of
| the Eiffel Tower and taking a photo of a bunch of other
| tourists there (legal), or intentionally targeting and
| photographing an individual and creating a database of those
| photos (illegal in most countries).
| fxtentacle wrote:
| It's because the developer of the browser needs fingerprinting
| for their ads.
|
| I don't think Chrome accidentally exposed data that Google
| wanted.
| IshKebab wrote:
| Please don't spread obviously untrue conspiracy theories.
|
| The main reason is that it's _really hard_ to avoid
| fingerprinting (while providing rich features like WebGL and
| WebRTC anyway).
|
| A secondary reason is that web browsers started off from a
| position of leaking fingerprint data all over the place so
| there's not much incentive to care about it for new features.
|
| You might be interested in this effort to reduce
| fingerprinting: https://developer.chrome.com/en/docs/privacy-
| sandbox/privacy...
|
| (The real conspiracy is that Google added logins to Chrome
| specifically so that they _don't_ have to rely on
| fingerprinting. They have a huge incentive to stop
| fingerprinting because it leaves them as the only entity that
| can track users.)
| danielheath wrote:
| I thought the developer of the browser is the only ad
| provider that _doesn't_ need it (since they have other,
| better ways to get that intel which their competitors do
| not).
| asdfghjkjhg wrote:
| they (google) did try.
|
| that's the profile icon you see on your google-chrome UI.
|
| but only fools use that feature.
| somekyle2 wrote:
| what makes someone who uses that feature a "fool"? Some
| users don't particularly mind being tracked.
| Cockbrand wrote:
| Also, it's very convenient in a work context if your
| employer uses G Suite/Workspace. I don't have anything to
| hide work-wise, and I do everything else in incognito
| windows.
| supriyo-biswas wrote:
| The fly in the ointment with this theory is why Apple (or
| even Mozilla) would expose the same kind of information.
| Apple has only recently started experimenting with ads, and
| their ads are limited to the apps that they control.
|
| The more benign explanation would be to allow developers to
| work around device-specific or browser-specific bugs.
|
| (I'm aware Apple changes the GPU Model to "Apple GPU",
| however they do expose a ton of other properties that make it
| possible to fingerprint a device.)
| jakear wrote:
| Apple devices are in fact fairly difficult to fingerprint.
| In my experiments [1] all instances of the same hardware
| model (on iOS, iPadOS, and macOS) give the same
| fingerprint, so the best a tracker can get is "uses iPhone
| 14". Better than nothing, but not terribly unique.
|
| [1] fingrprintr.pages.dev
| [deleted]
| [deleted]
| RobotToaster wrote:
| Isn't Mozilla's main source of income from Google?
| threatofrain wrote:
| Continuing to push the browser to be a general app
| platform is the only way it can survive against native
| experience, which is already eating into the enthusiasm for
| the web. It seems like the trend for consumer companies is
| to _maybe_ launch first on the web for velocity but
| eventually migrate to native experiences.
|
| I wonder to what degree we can enable hardware performance
| without leaking user data.
| camyule wrote:
| Firefox do have a mechanism to limit the amount of data
| being leaked for fingerprinting, but it's disabled by
| default: https://support.mozilla.org/en-US/kb/firefox-
| protection-agai...
| philliphaydon wrote:
| Wow, I just realised I've had this enabled since I first
| remember the feature being announced, and the internet
| hasn't broken.
| cmeacham98 wrote:
| They're not that big of a deal, but my two biggest
| annoyances with RFP:
|
| 1. prefers-color-scheme is completely broken, _even in
| the dev tools_. Mozilla refuses to fix this in any way,
| it is allegedly "by design" that you have to disable all
| RFP protection if you're a web dev and need to test the
| dark color scheme of your website.
|
| 2. Similarly, RFP always vends your timezone as UTC with
| no way to change it.
| arein3 wrote:
| They could add switches for individual features to mask
| on a hidden/advanced menu
| cmeacham98 wrote:
| Mozilla refuses to add _any_ toggle to disable RFP's
| control over features it touches, including even an
| about:config entry.
|
| See example bugzilla:
| https://bugzilla.mozilla.org/show_bug.cgi?id=1535189
|
| My "fix" for this involves using a janky old version of
| an addon that attempts to muck with the CSS/JS to
| reproduce the effect.
| nightpool wrote:
| that's a great way to get even more fingerprinting
| potential, each additional switch is another bit of
| identification on top of the actual fingerprint itself.
| madeofpalk wrote:
| > This shows how browser developers race to provide new
| features ignoring privacy impact.
|
| I think it showed how many years ago browser vendors were naive
| with understanding how this tech could be misused.
|
| These days I think browser vendors are very much aware of it
| and will frequently block features or proposals that they feel
| compromise on privacy and/or could be used as a tracking
| vector, especially Firefox and Safari. Sort this list
| https://mozilla.github.io/standards-positions/ by _Mozilla
| Position_ to see the reasons they reject/refuse to implement
| standards and proposals.
| jsnell wrote:
| It is absurd to claim that the main use of WebRTC is
| fingerprinting. Especially during the pandemic the world pretty
| much ran on WebRTC. Real-time media is clearly a pretty core
| functionality for the web to be a serious application platform,
| it wasn't just some kind of a trojan horse for tracking.
|
| Now, it is true that a lot of older web APIs do expose too much
| fingerprinting surface. But design sensibilities have
| changed a lot over time, and it's just not the case that you can
| make statements about what browser developers do now based on
| what designs from a decade or two ago look like. These days
| privacy is a top issue when it comes to any new browser APIs.
|
| But let's take your question at face value: why aren't these
| specific things behind a permission dialog? Because the
| permissions would be totally unactionable to a normal user.
| "This page wants to send you notifications" or "this page wants
| to use the microphone" is understandable. "This page wants to
| read pixels from a canvas" isn't. If you go the permission
| route, the options are to either a) teach users that they need
| to click through nonsensical permission dialogs, with all the
| obvious downsides; b) make the notifications so scary or the
| permissions so inaccessible that the features might as well not
| exist. And the latter would be bad! Because the legit use cases
| for e.g. reading from a canvas _do_ exist; they're just pretty
| rare.
|
| The Privacy Sandbox approach to this is to track and limit how
| much entropy a site is extracting via these kinds of side
| channels. So if you legit need to read canvas pixels, you'll
| have to give up on other features that could leak
| fingerprinting data. (I personally don't really believe that
| approach will work, but it is at least principled. What I'd
| like to see instead is limiting the use of these APIs to
| situations where the site has a stable identifier for the user
| anyway. But that requires getting away from implementing auth
| with cookies as opaque blobs of data with unknown semantics,
| and moving to some kind of proper session support where the
| browser understands the semantics of a signed-in session, and
| it's made clear to users when they're signing in somewhere and
| where they're signed in right now. And then you can make a lot
| better tradeoffs with limiting the fingerprinting surface in
| the non-signed in cases.)
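The entropy-budget idea described above can be sketched in a few lines. The per-API bit costs and the limit here are illustrative, not Chrome's real numbers; the point is only the mechanism: track how much identifying entropy a site has extracted and cut it off past a threshold.

```javascript
// Hypothetical entropy costs, in bits, for fingerprintable APIs.
const COST_BITS = { canvasReadback: 12, webglRenderer: 8, audioLatency: 5 };

// Returns a gate function that allows API access until the budget runs out.
function makeBudget(limitBits) {
  let spent = 0;
  return function request(api) {
    const cost = COST_BITS[api] ?? 0;
    if (spent + cost > limitBits) return false; // deny: budget exhausted
    spent += cost;
    return true; // allow, and record the entropy spent
  };
}

const request = makeBudget(20);
console.log(request('canvasReadback')); // true  (12 of 20 bits spent)
console.log(request('webglRenderer')); // true  (20 of 20 bits spent)
console.log(request('audioLatency'));  // false (would exceed the budget)
```

This also shows the tradeoff mentioned above: a site that legitimately needs canvas readback spends most of its budget there and loses access to other leaky surfaces.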
| psychphysic wrote:
| Do you mean more websites use WebRTC for legitimate purposes
| than for fingerprinting? Or that more instances of it being
| activated are legitimate, or that more traffic is legitimate
| (probably true given the bandwidth needed for audio/video)?
|
| But I suspect by the other two metrics it's correct to say
| most uses are to fingerprint.
| trifurcate wrote:
| > "This page wants to send you notifications" or "this page
| wants to use the microphone" is understandable. "This page
| wants to read pixels from a canvas" isn't.
|
| Yes, it is. Tor Browser already does this: https://www.bleeps
| tatic.com/content/posts/2017/10/30/CanvasF...
|
| That specific wording may be a touch too verbose for the
| average end user, but it's not impossible nor is it strange.
| Just include a note about how this is 99% likely a
| fingerprinting measure; option b) isn't so bad in this case.
| Of course, due to the nature of how fingerprinting works, the
| absolute breadth of features that would be gated behind
| something like this would be offputting.
|
| I am also wary of what you suggested with gating this kind of
| fingerprinting to when the website has positively identified
| the user anyway; in a way, this seems to me even more
| valuable than fingerprint data without an associated "strong"
| identity.
| ballenf wrote:
| Giving users the permissions would simply be a training
| exercise in "I have to say 'yes' or TikTok breaks". Like
| how Android worked a few years ago with the other
| permissions.
| [deleted]
| trifurcate wrote:
| Android largely works now with these permission prompts,
| though. TikTok asks you for a million permissions too,
| and many average end users decline. Many people also opt
| out of tracking on Facebook et al. when iOS prompts them
| about it.
| monkpit wrote:
| > and many average end users decline
|
| [citation needed]
| scarface74 wrote:
| Really? How much more of a citation do you need than
| Facebook admitted during their quarterly financials the
| effect that iOS users opting out had?
|
| https://www.cnbc.com/2022/02/02/facebook-says-apple-ios-
| priv...
| saagarjha wrote:
| If you don't present the tracking prompt exactly how
| Apple wants you to they boot you from the store. The same
| is not true for a website.
| 0xy wrote:
| Of course its main use is fingerprinting. Do you think
| WebRTC is instantiated for genuine reasons the majority of
| the time? That's real absurdity.
|
| WebRTC is instantiated most often by ad networks and anti-
| fraud services.
|
| Same thing with Chrome's fundamentally insecure AudioContext
| tracking scheme (yes, it's a tracking scheme), which is used
| by trackers 99% of the time. It provides audio latency
| information which is highly unique (why?).
|
| Given Chrome's stated mission of secure APIs and their
| actions of implementing leaky APIs with zeal, I have reason
| enough to question their motives.
|
| After all, AudioContext is abused heavily on Google's ad
| networks. Google knows this.
| Datagenerator wrote:
| As an alternative, Librewolf needs some more promotion; it
| has safer defaults.
| arein3 wrote:
| Wow, that's really shitty on Google's part.
| ghayes wrote:
| Take a look at Firefox's Fingerprinting Prevention feature.
| This includes a permission for canvas, as well as:
|
| - Your timezone is reported to be UTC
|
| - Not all fonts installed on your computer are available to
| webpages
|
| - The browser window prefers to be set to a specific size
|
| - Your browser reports a specific, common version number and
| operating system
|
| - Your keyboard layout and language are disguised
|
| - Your webcam and microphone capabilities are disguised
|
| - The Media Statistics Web API reports misleading information
|
| - Any Site-Specific Zoom settings are not applied
|
| - The WebSpeech, Gamepad, Sensors, and Performance Web APIs are
| disabled
|
| https://support.mozilla.org/en-US/kb/firefox-protection-agai...
| TobyTheDog123 wrote:
| TikTok changes this algorithm about once every three months. I've
| reverse-engineered it about two times, and have since given up
| and decided to run a headless browser to do it for me. I'd love
| to see some tool developed to automate solving this so I can sign
| requests in a more limited context (ala Cloudflare Workers / C@E)
| nullpt_rs wrote:
| Author of the post here, if you have an older version of the
| script you're able to post or send over I'd love to take a look
| at it and see what changes they make and potentially automate
| the extraction.
| TobyTheDog123 wrote:
| Hey I'd love to:
|
| 1.0.0.200: https://hastebin.com/tudivadufa.apache Unknown
| version: https://hastebin.com/jasuxineti.js
|
| Some of these might have some console.logs (or curse words),
| but as a whole should be representative
| moneywoes wrote:
| Are you able to scrape with a headless browser?
| TobyTheDog123 wrote:
| Yeah, I can get basic user information pretty reliably just
| from the initial page load.
|
| I had a secondary use case of allowing users to sign-in in
| order to import the (verified/creator) users they follow, but
| quickly realized Apple wouldn't allow that data to be used
| (after the whole OG app ordeal), so I never had a real reason
| to follow up and crack it again.
| draw_down wrote:
| > void 0 (a fancy obfuscated way of saying undefined)
|
| Kind of. But it was possible at one point, maybe still is, to
| rebind `undefined` to some other value, causing trouble. `void`
| is an operator, a language keyword; it's guaranteed to give you
| the true undefined value. (In other words, the value whose type
| is `undefined`.)
|
| If you're coding against an environment as adversarial as these
| people clearly believe they are, you'd go with `void` as well.
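The rebinding hazard is easy to demonstrate: while the global `undefined` is read-only in modern engines, the identifier can still be shadowed inside any function scope, whereas `void 0` always produces the real undefined value:

```javascript
// `undefined` is not a reserved word, so it can be shadowed as a
// parameter name. `void` is an operator and cannot be subverted.
function adversarial(undefined) {
  // Inside this scope, `undefined` refers to the parameter (42 below).
  return [undefined, void 0];
}

const [shadowed, real] = adversarial(42);
console.log(shadowed);        // 42 -- the shadowed binding
console.log(real);            // undefined -- `void` yields the true value
console.log(real === void 0); // true
```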
| kerneloops wrote:
| Another reason to use `void 0` is that "void 0" takes only 6
| characters while "undefined" takes 9, saving some bandwidth. It
| is common practice for JavaScript minifiers to use this
| substitution.
| marginalia_nu wrote:
| Given it will be gzip-compressed in transport, does this
| really save a meaningful amount of bandwidth?
| draw_down wrote:
| It's really more that there is no reason not to do it. Void
| is marginally safer as well as shorter, so any
| minifier/transpile step etc will make this substitution.
| born-jre wrote:
| Something hit me when reading this. You know how zkSNARKs are
| touted as tech that will one day let apps work on users'
| private data while preserving their privacy? Could it be used
| in the opposite direction, as an obfuscation technique: you
| encrypt the user's data inside a zk oracle on the user's side
| and send it to the server. You could reverse engineer what the
| inputs to the oracle are, but not what exactly it sends to the
| server?
| renonce wrote:
| zkSNARK allows you to make a proof for a statement that some
| boolean expression is satisfiable, without leaking any
| information about how the expression can be satisfied. That
| helps _prove_ something but not work on any data. The technique
| you described sounds more like homomorphic encryption, which
| is currently many orders of magnitude slower than native
| computation and lacks practical use.
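The homomorphic property renonce contrasts with zkSNARKs can be shown with a toy scheme: with E(m) = g^m mod p, multiplying ciphertexts adds the plaintexts, so a server can combine values it cannot read. The numbers here are toy-sized and insecure; real schemes such as Paillier or BFV rest on the same principle.

```javascript
// Toy parameters -- far too small for any real security.
const p = 1000003n, g = 2n;

// Square-and-multiply modular exponentiation over BigInt.
function modpow(base, exp, mod) {
  let r = 1n;
  for (base %= mod; exp > 0n; exp >>= 1n, base = (base * base) % mod)
    if (exp & 1n) r = (r * base) % mod;
  return r;
}

const E = (m) => modpow(g, m, p);

// The server multiplies ciphertexts without learning 3 or 4,
// yet the result equals the encryption of 3 + 4:
console.log((E(3n) * E(4n)) % p === E(7n)); // true
```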
| born-jre wrote:
| What about sth like this https://github.com/zkonduit/ezkl ?
| thih9 wrote:
| I've seen some of these techniques elsewhere; e.g. javascript-
| obfuscator supports replacing variable names with hex values [1]
| or transforming call structure into something more complex [2].
| Bytecode generation is new to me; is there an existing JS
| obfuscation tool, preferably open source, that supports it?
|
| [1]: https://github.com/javascript-obfuscator/javascript-
| obfuscat...
|
| [2]: https://github.com/javascript-obfuscator/javascript-
| obfuscat...
| hoosieree wrote:
| It's only for C, but Tigress[1] supports a _ton_ of obfuscation
| types. Virtualization and JIT are very effective, especially
| when used together with control flow transforms like Split and
| Flatten.
|
| Renaming variables or encoding them is fairly trivial to
| reverse.
|
| [1] https://tigress.wtf/transformations.html
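A minimal sketch of the "Flatten" transform mentioned above (this is the general control-flow-flattening pattern, not Tigress's actual output): straight-line code becomes a state machine driven by a dispatcher loop, so the original block order is no longer visible in the source:

```javascript
// Original, unflattened:
//   let x = 1;
//   x += 2;
//   return x * 3;
function flattened() {
  let state = 0, x;
  while (true) {
    switch (state) {
      // Blocks are deliberately out of order; the `state` variable
      // encodes the real control flow.
      case 0: x = 1;  state = 2; break;
      case 2: x += 2; state = 1; break;
      case 1: return x * 3;
    }
  }
}

console.log(flattened()); // 9
```

Real obfuscators additionally compute the next state with opaque expressions so the flow can't be recovered by constant propagation alone.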
| xchkr1337 wrote:
| Compiling JS to bytecode is not that uncommon; there are a few
| anti-bot services that rely on it for obfuscation (like
| reCAPTCHA or F5 Shape Security), but so far I haven't seen any
| open source projects for obfuscating this way.
| czx4f4bd wrote:
| Based on my previous research into this, the magic keywords to
| find this kind of thing on Google are "virtualization
| obfuscation" or "VM obfuscation".
|
| rusty-jsyc is the main open source implementation I've found,
| though it hasn't been touched in a few years:
| https://jwillbold.com/posts/obfuscation/2019-06-16-the-secre...
| (GitHub: https://github.com/jwillbold/rusty-jsyc)
|
| I think there are other implementations, but they're
| proprietary so I didn't look into them very much. There are
| lots of posts out there about reversing virtualization
| obfuscation, but not many about implementing it. Seems like
| most people who put the effort into implementing it tend to
| prefer selling it commercially (which I suppose makes sense).
| 0x008 wrote:
| If I recall correctly: electron can compile JavaScript to
| "ByteNode" which is some form of byte code intended to be run
| in the V8 engine.
| frozencell wrote:
| The hunt begins.
| noduerme wrote:
| This is really awesome work.
|
| I spent a lot of time in the early 2000s coming up with nasty
| obfuscation techniques to protect certain IP that inherently
| needed to be run client-side in casino games. Up to and including
| inserting bytecode that was custom crafted to intentionally crash
| off-the-shelf decompilers that had to run the code to disassemble
| it (and forcing them to phone home in the process where
| possible!)
|
| My view on obfuscation is that since it's never a valid security
| practice, it's only admissible for hiding machinery from the
| general public. For instance, if you have IP you want to protect
| from average script kiddies. Any serious IP can be replicated by
| someone with deep pockets anyway. Most other uses of code
| obfuscation are nefarious, and obfuscated code should always be
| assumed to be malicious until proven otherwise. I'm not a
| reputable large company, but no reputable large company should be
| going to these lengths to hide their process from the user,
| because doing so serves no valid security purpose.
| bobleeswagger wrote:
| > since it's never a valid security practice
|
| Why not? It's just another tool in the security game.
|
| I _want_ to be with you on thinking that all obfuscation is
| malicious, I know that individuals have every right to
| obfuscation and privacy as a matter of the 1st and 4th
| amendments in the US, but I'm not sure I can always say that
| obfuscation by a corporation is evil, without a more compelling
| argument. I'm as anti-establishment as they come, too.
| ViViDboarder wrote:
| I think the reason is that it means they don't trust their
| users, or don't want them to know what the app is doing on
| their machines. To me, that is already a malicious premise.
| Even if they aren't trying to exfiltrate my data or anything.
| bobleeswagger wrote:
| I guess the acceptable form of obfuscation would mean only
| IP is protected by it, not everything. I wonder what it
| would take to enforce this as the norm, certainly doesn't
| sound easy.
| mtnygard wrote:
| I read the GP a bit differently... I didn't read it as saying
| obfuscation is evil, just that it is ineffective. More like
| "obfuscation can't prevent reversing, therefore it's not a
| valid security practice since all it does is slow down the
| casual observer but does not stop the determined adversary."
| The statement that most use of obfuscation is nefarious is a
| corollary... since obfuscation doesn't protect IP it is
| mostly used to hide malicious activity.
| dbrueck wrote:
| Agreed - obfuscation is useful for keeping honest people
| honest. If someone is sufficiently motivated, they will
| circumvent it, but for the vast majority of people it's just
| not worth the effort so they'll move to something else.
|
| For example, in our application we have some optionally
| downloadable content that includes some code for an interpreted
| language. That code lives on disk in an obfuscated form because
| we are not yet ready to make the API public (it's on our
| "someday" roadmap), we don't want to clean up the code for
| public viewing, and above all because there are different
| licensing requirements around each content pack.
|
| We looked at various "real" security options and they all have
| holes, and they all add a ton of complexity. We then also
| looked at the likely intersection between "people who would pay
| for this" and "people who could crack this", and there's not
| much there. In the end, obfuscation is cheap (especially in
| terms of implementation and maintenance) and steers our real
| customers away from violating the license, and we don't waste
| resources on dishonest people.
|
| If I'm being charitable, the obfuscation in the article has an
| out of whack cost/benefit ratio. If I'm being cynical, the
| obfuscation they are doing strays well into the realm of
| nefarious. :)
| thrashh wrote:
| People knock obfuscation, but everything in life is based
| on trust: locks are breakable, the fruit stand in front of
| a shop is unprotected, fences are scalable. Everything is a
| cost/benefit tradeoff.
| jstanley wrote:
| Wait, why is a casino protecting its so-called "intellectual
| property" legitimate and above-board, but TikTok doing the same
| is not?
| margalabargala wrote:
| I don't think OP was defending their own earlier work or
| otherwise exempting it from their assertion that all
| obfuscated code should be considered malicious.
| rnd0 wrote:
| That's how I read it too. I had the feeling that the
| experience convinced the OP that it's not valid except in
| some circumstances.
| jstanley wrote:
| Having reread it, I think you might be right.
|
| > it's only admissible for hiding machinery from the
| general public.
|
| I had originally read this to imply that somehow it's OK
| for a casino to hide its machinery from the general public,
| but it's not OK for TikTok to hide its machinery from the
| general public, but maybe "machinery" here is intended much
| more narrowly, and OP thinks it applies neither to casinos
| nor TikTok.
| compsciphd wrote:
| I read it as: the only "legitimate" point is to hide it
| from the general public, since people with more resources
| will be able to figure it out anyway. Whether that is
| legitimate is up to each person to decide: does trying to
| hide it from the general public have real value or not? In
| general the answer might be no.
| neodymiumphish wrote:
| I think the distinction in what's obfuscated is important.
| Casino apps are trying to hide their code that detects
| cheating, number generation, etc, while TikTok is trying to
| hide its data collection. Obfuscation itself isn't
| necessarily bad.
| im3w1l wrote:
| > Number generation
|
| Number generation is extremely important and it's also
| regulated. You don't put such a thing in the client,
| obfuscated or not.
| kevin_thibedeau wrote:
| Because they're doing it on hardware that they control.
| maria2 wrote:
| White box crypto is kind of like obfuscation, but tries to make
| it impossible to extract the information.
| krackers wrote:
| There's also indistinguishability obfuscation which I recall
| recently had a breakthrough in terms of practical
| construction
| awestroke wrote:
| No, encryption is very different from obfuscation, even if
| the former is often used in the latter
| xurukefi wrote:
| You missed the point. maria2 is talking about whitebox
| crypto. The "whitebox" part means that the decryption
| process happens on your machine incuding the secrets, which
| are present in some obfuscated scrambled form in memory.
| Getting the secret key is a matter of debugging and
| understanding the obfuscation scheme. A prime example of
| this is DRM like Widevine (L3) in the chrome browser.
| bitexploder wrote:
| I am really failing to understand the distinction here.
| Encryption with say, AES has very different properties
| and use cases compared to an obfuscation scheme. You can
| use encryption as a part of an obfuscation scheme, but
| obfuscation is a shell game, all the way down. Crypto is
| not, mathematically. They are categorically different
| things, right?
| grog454 wrote:
| The math is irrelevant when the key is known to all
| parties.
| bitexploder wrote:
| Sure. That is DRM and obfuscation in a nutshell. How
| annoying can I make this for you to reverse engineer?
| toast0 wrote:
| Obfuscation with encryption can be done with good
| ciphers, like AES, but the key is still shipped with the
| code, so it's still just cat and mouse.
|
| It's a little different if the key is hardware specific,
| so each binary only runs on one system and it's hard to
| extract the keys, but that's not a typical setup. Usually
| it's this code needs to run on the general public's
| computers or phones, and that's too general a target to
| rely on hardware crypto.
| matmann2001 wrote:
| The key specifier word was "whitebox". They aren't
| speaking generally about cryptography.
| [deleted]
| Alifatisk wrote:
| I never knew that Tiktok was shipped with its own virtual
| machine!
|
| But that explains the obvious subdomain vm.tiktok.com
| llacb47 wrote:
| Don't think that's what vm means there. The m is likely
| "maliva", which is tiktok's overseas (US/europe) CDN.
| wiml wrote:
| Given that the beginning of the "weird string" has a magic number
| and a version field, I wonder if the point of this is not so much
| obfuscation as transpilation? The magic number corresponds to
| ASCII "HNOJ" "@?RC", or perhaps "JONH" "CR?@", which doesn't turn
| anything up on Google but it seems odd to include that redundant
| header if your main goal is minification or obfuscation.
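The "HNOJ"/"JONH" reading comes from interpreting the blob's first four bytes as ASCII in both byte orders, the usual first step when staring at an unknown binary header:

```javascript
// Read a 4-byte magic number from the start of a byte array, in both
// byte orders, since endianness of the format is unknown.
function readMagic(bytes) {
  const head = bytes.slice(0, 4);
  return {
    forward:  String.fromCharCode(...head),
    reversed: String.fromCharCode(...[...head].reverse()),
  };
}

// 0x48 0x4e 0x4f 0x4a are the ASCII codes for H, N, O, J.
console.log(readMagic([0x48, 0x4e, 0x4f, 0x4a]));
// { forward: 'HNOJ', reversed: 'JONH' }
```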
| derefr wrote:
| That HTTP request is kind of hideous. All those extra parameters
| that have nothing to do with what the response will end up being,
| and which change often. Seems like a great way to toss out all
| your API-response edge-cache-ability.
| kevincox wrote:
| With HTTPS you need to own the edge cache yourself and most
| will have options to ignore the headers and URL parameters that
| you want. That way they can log the tracking data and serve the
| cached data as if they were never there.
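The normalization kevincox describes amounts to stripping the volatile tracking parameters before computing the cache key, so the tracking data can be logged while the cached response is served as if it were never there. The parameter names below are illustrative:

```javascript
// Hypothetical list of volatile tracking/signature parameters to ignore
// when computing the edge-cache key.
const TRACKING_PARAMS = ['msToken', 'X-Bogus', '_signature', 'device_id'];

function cacheKey(urlString) {
  const url = new URL(urlString);
  for (const p of TRACKING_PARAMS) url.searchParams.delete(p);
  url.searchParams.sort(); // make the key independent of parameter order
  return url.toString();
}

console.log(cacheKey(
  'https://api.example.com/item?item_id=42&msToken=abc&_signature=xyz'
));
// https://api.example.com/item?item_id=42
```

Two requests that differ only in their tracking parameters now map to the same cache entry.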
| Exuma wrote:
| This article is 2 hours old and his Twitter is already changed?
| deepzn wrote:
| Looks like they are more active on Mastodon.
| mdaniel wrote:
| Someone reported that he just had a typo in the twitter handle,
| IIRC an extra "r" at the end; FWIW, navigating up one level
| also has a link to the twitter handle and works just fine:
| https://twitter.com/nullpt_rs
| KirillPanov wrote:
| Awesome, really awesome work. However:
|
| > If that is something you are interested in, keep an eye out for
| the second part of this series :)
|
| Your site is missing an RSS/Atom feed, so I can't do that. ::sad
| face::
| CallMeMarc wrote:
| We're sharing the same fate apparently! Just added a PR to
| their repository to add some feeds, hope it gets merged soon.
|
| https://github.com/nullpt-rs/blog/pull/1
| Kukumber wrote:
| Nice use of low altitude satellites to track individuals and
| sniff telecoms all over the world
|
| This decompiled object class also spies on the grid network;
| that's quite interesting and very clever.
|
| I never knew we could also lobby governments to push for some
| office and cloud software full of spyware, even France had to ban
| them! [1]
|
| This TikTok app is very dangerous!
|
| Of course /s
|
| [1] - https://news.ycombinator.com/item?id=33686599
| lazyeye wrote:
| Yes it is.
|
| What is the reason why China blocks all foreign social media
| apps within its own borders?
| derefr wrote:
| FYI, most CAPTCHA and anti-DDoS services (e.g. Cloudflare) do
| something very similar, sending the user an obfuscated program
| implemented on top of an obfuscated JS VM, that they effectively
| have to execute as-is, in a real browser, to get back the correct
| results the gateway is looking for. This is done to prevent
| simple scraping scripts (the ScraPy type) from being able to be
| used to scrape the site. If you want to do scraping, you have to
| spend the extra overhead of doing it by driving a real browser to
| do it. (And not even a headless one; they have tricks to detect
| that, too.)
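A few of the signals such services check when deciding whether a "real browser" is actually automated; the list is illustrative (`navigator.webdriver` is the one standardized signal, and real services check far more than this):

```javascript
// Takes a navigator-like object so the logic can be exercised anywhere;
// in a browser you would pass the real `navigator`.
function looksHeadless(nav) {
  return Boolean(
    nav.webdriver ||                              // set by WebDriver automation
    /HeadlessChrome/.test(nav.userAgent || '') || // old headless Chrome UA
    (nav.plugins && nav.plugins.length === 0)     // headless often has none
  );
}

console.log(looksHeadless({ webdriver: true, userAgent: '' }));            // true
console.log(looksHeadless({ webdriver: false,
                            userAgent: 'Mozilla/5.0 Chrome/108' }));       // false
```

This is why simple headless scraping gets caught while driving a full, stealth-patched browser does not.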
| amelius wrote:
| Can someone explain what VM they are talking about, and where
| that VM is running on, and what is running in it?
| dbrueck wrote:
| It's a custom VM running inside their app, though calling it a
| VM might be a bit of a stretch because it doesn't appear to be
| a general-purpose computing mechanism but more of a higher-level
| command processor.
|
| It sounds like the forthcoming part 2 article will go into more
| depth.
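The shape of such a "VM" is a bytecode array plus a dispatch loop; the sketch below uses an invented instruction set (not TikTok's) to show the pattern being reverse engineered in the article:

```javascript
// Hypothetical opcodes for a tiny stack machine.
const OP = { PUSH: 0, ADD: 1, MUL: 2, RET: 3 };

// The interpreter: fetch an opcode, dispatch, repeat.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter
  while (pc < bytecode.length) {
    switch (bytecode[pc++]) {
      case OP.PUSH: stack.push(bytecode[pc++]); break;          // immediate
      case OP.ADD:  stack.push(stack.pop() + stack.pop()); break;
      case OP.MUL:  stack.push(stack.pop() * stack.pop()); break;
      case OP.RET:  return stack.pop();
    }
  }
}

// (2 + 3) * 4 expressed as bytecode:
const program = [OP.PUSH, 2, OP.PUSH, 3, OP.ADD, OP.PUSH, 4, OP.MUL, OP.RET];
console.log(run(program)); // 20
```

Reversing obfuscation like this means recovering the opcode table and semantics (which real obfuscators randomize per build), then decompiling the bytecode back to readable logic.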
___________________________________________________________________
(page generated 2022-12-24 23:01 UTC)