[HN Gopher] CDC File Transfer
___________________________________________________________________
CDC File Transfer
Author : GalaxySnail
Score : 363 points
Date : 2025-10-01 02:38 UTC (20 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| rekttrader wrote:
| Nice to see Stadia had some long-term benefit. It's a shame
| they don't make a self-hosted version, but if you did that it
| would just be piracy in today's DRM world.
| jMyles wrote:
| > it's just piracy in today's drm world
|
| ...which is more important / needed than ever. I encourage
| everyone who asks to get my music from BitTorrent instead of
| Spotify.
| MyOutfitIsVague wrote:
| Why not something like Bandcamp, or other DRM-free purchase
| options?
|
| I'm not above piracy if there's no DRM free option (or if the
| music is very old or the artist is long dead), but I still
| believe in supporting artists who actively support freedom.
| jMyles wrote:
| Yep, I put everything on bandcamp.
| https://justinholmes.bandcamp.com/
|
| Even better though, is a P2P service that is censorship
| resistant.
|
| But yeah I like Bandcamp plenty.
|
| > artists who actively support freedom.
|
| The bluegrass world is quickly becoming this.
|
| https://pickipedia.xyz/wiki/DRM-free
| MaxikCZ wrote:
| So you create and seed your torrents with your music, and
| present them prominently on your site?
| jMyles wrote:
| I was doing that for a while, and running a seedbox.
| However, on occasions when the seedbox was the only seeder,
| clients were unable to begin the download, for reasons I've
| never figured out. If I also seeded from my desktop, then
| fan downloads were being fed by both the desktop and the
| seedbox. But without the desktop, the seedbox did nothing.
|
| I need to revisit this in the next few weeks as I release
| my second record (which, if I may boast, has an incredible
| ensemble of most of my favorite bluegrass musicians on it;
| it was a really fun few days at the studio).
|
| Currently I do pin all new content to IPFS and put the
| hashes in the content description, as with this video of
| Drowsy Maggie with David Grier:
| https://www.youtube.com/watch?v=yTI1HoFYbE0
|
| Another note: our study of Drowsy Maggie was largely made
| possible by finding old-and-nearly-forgotten versions in
| the Great78 project, which of course the industry attempted
| to sue out of existence on an IP basis. This is another
| example of how IP is a conceptual threat to traditional
| music - we need to be able to hear the tradition in order
| to honor it.
| oofbey wrote:
| What do you mean, piracy in a DRM world? Like being able to
| share your own PC games through the cloud?
| killingtime74 wrote:
| You can share the games you authored all you like. If you
| bought a license to play them that's another story.
| kanemcgrath wrote:
| For self-hosted game streaming you can use Moonlight +
| Sunshine; they work really well in my experience.
| BrokenCogs wrote:
| Exactly my experience too. I easily get 60 fps at 1080p over
| wireless LAN with Moonlight + Sunshine. Parsec is another
| option.
| sheepscreek wrote:
| Probably wouldn't have been feasible - I heard developers had
| to compile their games with Stadia support. Maybe it was an
| entirely different platform, with its own alternative to
| DirectX, or maybe it had some kind of lightweight emulation
| (such as Proton), but I vaguely remember that the few games I
| played had custom Stadia key bindings (with Stadia symbols).
| They would display like that within the game. So some
| customization definitely did happen.
|
| This is unlike the model that PlayStation, Xbox and even Nvidia
| are following - I don't know about Amazon Luna.
| jakebasile wrote:
| As I understand it, GeForce Now actually does require changes
| to the game to run in the standard (and until recently only)
| option, "Ready To Play". This is the supposed reason that
| new updates to games sometimes take time to get released on
| the service, since either the developers themselves or Nvidia
| needs to modify it to work correctly on the service. I have
| no idea if this is true, but it makes sense to me.
|
| They recently added "Install to Play" where you can install
| games from Steam that aren't modified for the service. They
| charge for storage for this though.
|
| Sadly, there are still tons of games unavailable because
| publishers need to opt in and many don't.
| TiredOfLife wrote:
| GeForce Now doesn't require any changes.
| MindSpunk wrote:
| Stadia games were just run on Linux with Vulkan + some extra
| Stadia APIs for their custom swapchain and other bits and
| pieces. Stadia games were basically just Linux builds.
| numpad0 wrote:
| They did have a dev console based on a Lenovo workstation, as
| well as off-menu AMD V340L 2x8GB GPUs, both later leaked into
| Internet auctions. So some hardware and software
| customizations had definitely happened.
| laidoffamazon wrote:
| Stadia was sadly engineered in such a way that this is
| impossible.
|
| Speaking of which, who thought up the idea to use custom
| hardware for this that would _already be obsolete_ a year
| later? Who considered using Linux native instead of a compat
| layer? Why did the original Stadia website not even have a
| search bar??
| nolok wrote:
| For self-hosted remote streaming of games, look at Moonlight /
| Sunshine (Apollo).
|
| Stadia required special versions of games, so it wouldn't be
| that useful.
| asmor wrote:
| It's a shame that virtual / headless displays are such a mess
| on both Linux and Windows. I use a 32:9 ultrawide and stream
| to 16:9/16:10 devices, and even with hours of messing around
| with an HDMI dummy and kscreen-doctor[1] it was still an
| unreliable mess. Sometimes it wouldn't work when the machine
| was locked, and sometimes Sunshine wouldn't restore the
| resolution on the physical monitor (and there's no session
| timeout either).
|
| Artemis is a bit better, but it still requires per-device
| setup of displays since it somehow doesn't disable the
| physical output next to the virtual one. Those drivers also
| add latency to the capture (the author of Looking Glass
| really dislikes them because they undo all the hard work of
| near-zero latency).
|
| [1]: https://github.com/acuteaura/universe/blob/main/systems/
| _mod...
| nolok wrote:
| Use Apollo (a fork of Sunshine) :
| https://github.com/ClassicOldSong/Apollo
|
| > Built-in Virtual Display with HDR support that matches
| the resolution/framerate config of your client
| automatically
|
| It includes a virtual screen driver, and it handles all the
| crap (it can disable your physical screen when streaming and
| re-enable it after, it can generate the virtual screen per
| client to match the client's needs, or do it per game, or
| ...).
|
| I stream from my main PC to both my laptop and my Steam Deck,
| and each gets the screen that matches it without having to do
| anything more than connect to it with Moonlight.
| asmor wrote:
| Artemis/Apollo are mentioned in the post above - yeah
| they work better than the out of box experience, but you
| still have to configure your physical screen to be off
| for every virtual display. It unfortunately only runs on
| Windows and my machine usually doesn't. I also only have
| one dGPU and a Raphael iGPU (which are sensitive to
| memory overclocks) and I like the Linux gaming experience
| for the most part, so while I did have a working gaming
| VM, it wasn't for me (or I'd want another GPU).
| heavyset_go wrote:
| On Linux with an AMD i/dGPU, you can set the
| `virtual_display` module parameter for `amdgpu`[1] and do
| what you want without the need for an HDMI dummy or weird
| software. It's also hardware accelerated.
|
| > _virtual_display (charp)_
|
| > _Set to enable virtual display feature. This feature
| provides a virtual display hardware on headless boards or
| in virtualized environments. It will be set like
| xxxx:xx:xx.x,x;xxxx:xx:xx.x,x. It's the pci address of the
| device, plus the number of crtcs to expose. E.g.,
| 0000:26:00.0,4 would enable 4 virtual crtcs on the pci
| device at 26:00.0. The default is NULL._
|
| [1]https://www.kernel.org/doc/html/latest/gpu/amdgpu/module
| -par...
| asmor wrote:
| Unfortunately this seems to disable physical outputs.
|
| https://bugzilla.kernel.org/show_bug.cgi?id=203339
| heavyset_go wrote:
| I figure if you're using an HDMI dummy you're running
| headless anyway
|
| edit: didn't realize you're the OP lol
| mrguyorama wrote:
| I don't understand, "self hosted stadia" is just one of the
| myriad of services and tools that do literally that.
|
| Steam has game streaming built in and it works very well. Both
| Nvidia and AMD built this into their GPU drivers at one point
| or another (I think the AMD one was shut down?)
|
| Those are just the solutions I accidentally have installed
| despite not using that functionality. You can even stream games
| _from_ the steam deck!
|
| Sony even has a system to let you stream your PS4 to your
| computer anywhere and play it. I think Microsoft built
| something similar for Xbox.
| theamk wrote:
| This CDC is "Content Defined Chunking" - fast incremental file
| transfer.
|
| The use case is copying a file over a slow network when the
| previous version is already there, so one can save time by
| only sending the changed parts of the file.
|
| Not to be confused with USB CDC ("communications device class"),
| a USB device protocol used to present serial ports and network
| cards. It can also be used to transfer files; the old PC-to-PC
| cables used it by implementing two network cards connected to
| each other.
| oofbey wrote:
| The clever trick is how it recognizes insertions. The standard
| trick of computing hashes on fixed-size blocks works
| efficiently for substitutions but is totally defeated by an
| insertion or deletion.
|
| Instead, with CDC the block boundaries are defined by the
| content, so an insertion doesn't change the block boundaries
| around it, and it can tell the subsequent blocks are unchanged.
| I haven't read the CDC paper, but I'm guessing they just use
| some probabilistic hash function to define certain strings as
| block boundaries.
| teraflop wrote:
| Probably worth noting that ordinary rsync can also handle
| insertions/deletions because it uses a rolling hash. Rsync's
| method is bandwidth-efficient, but not especially CPU-
| efficient.
| adzm wrote:
| > I haven't read the CDC paper but I'm guessing they just use
| some probabilistic hash function to define certain strings as
| block boundaries.
|
| You choose a number of bits (say, 12) and then evenly
| distribute these in a 48-bit mask; if the hash at any point
| has all these bits on, that defines a boundary.
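|
| A small sketch of that boundary test (the exact bit positions
| are an implementation detail; this just spreads them evenly as
| described):
|
|     def spread_mask(bits=12, width=48):
|         # place `bits` one-bits roughly evenly across the mask
|         step = (width - 1) / (bits - 1)
|         mask = 0
|         for i in range(bits):
|             mask |= 1 << round(i * step)
|         return mask
|
|     MASK = spread_mask()
|
|     def is_boundary(h):
|         # fires when all chosen bits of a well-mixed hash are
|         # set, i.e. with probability 2**-12, so chunks average
|         # about 4 KiB
|         return (h & MASK) == MASK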
| NooneAtAll3 wrote:
| not to be confused with the Centers for Disease Control
| 1ncorrect wrote:
| ...or cDc[0]
|
| [0] https://en.wikipedia.org/wiki/Cult_of_the_Dead_Cow
| bbkane wrote:
| Or https://en.wikipedia.org/wiki/Change_data_capture
| monocasa wrote:
| Or https://en.wikipedia.org/wiki/Control_Data_Corporation
| petsfed wrote:
| Especially in the context of recent (that is, last 10 years)
| removal of data from Centers for Disease Control sources due
| to changing political winds.
| claytongulick wrote:
| I ran into some of those issues with the chunk size and hash
| misses when writing bitsync [1], but at the time I didn't want to
| get too clever with it because I was focused on rsync algorithm
| compatibility.
|
| This is a cool idea!
|
| [1] https://github.com/claytongulick/bit-sync
| modeless wrote:
| Does Steam do something like this for game updates?
| Scaevolus wrote:
| Steam unfortunately doesn't use a rolling hash like this
| (fastcdc, buzhash, etc.), but rather slices files into 1MB
| chunks, hashes them, and updates at that granularity.
|
| https://partner.steamgames.com/doc/sdk/uploading#AppStructur...
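|
| In sketch form (the hash choice is purely illustrative; this is
| not Steam's actual manifest format):
|
|     import hashlib
|
|     CHUNK = 1024 * 1024  # 1 MiB, per the Steamworks docs above
|
|     def manifest(path):
|         # chunk index -> hash of each fixed-size slice of a file
|         sums = {}
|         with open(path, "rb") as f:
|             i = 0
|             while block := f.read(CHUNK):
|                 sums[i] = hashlib.sha1(block).hexdigest()
|                 i += 1
|         return sums
|
|     def chunks_to_download(old_manifest, new_manifest):
|         # only chunks whose hash changed at the same index are
|         # fetched, which is why a few bytes inserted near the
|         # start of a big file can invalidate every chunk after it
|         return [i for i, h in new_manifest.items()
|                 if old_manifest.get(i) != h]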
| supportengineer wrote:
| Cygwin? Does anyone still use that?
| cheema33 wrote:
| Cygwin has its benefits over WSL. For example, it does not run
| in a VM and therefore does not suffer from the resulting
| performance penalty.
| mikae1 wrote:
| _> cdc_rsync is a tool to sync files from a Windows machine to a
| Linux device, similar to the standard Linux rsync._
|
| Does this work Linux to Linux too?
| kxrm wrote:
| No: https://github.com/google/cdc-file-transfer?tab=readme-ov-
| fi...
| maxlin wrote:
| Having dabbled in trying to make a quick delta patch system
| like Steam's, which required me to understand delta patching
| methods and which made small patches to big files in a 10 GB+
| installation in a few seconds, this sure is quite interesting!
|
| I wonder if Steam ever decides to supercharge their content
| handling with some user-space filesystem stuff. With fast
| connections, there isn't really a reason they couldn't launch
| games in seconds, streaming data on-demand with smart pre-caching
| steering based on automatically trained access pattern data. And
| especially with finely tuned delta patching like this, online
| game pauses for patching could be almost entirely eliminated.
| Stop & go instead of a pit stop.
| fsfod wrote:
| Someone already created that[1] using a custom kernel driver
| and their own CDN, but they seem to have abandoned it[2], maybe
| because they would have attracted Valve's wrath trying to
| monetize it.
|
| [1]
| https://web.archive.org/web/20250517130138/https://venusoft....
|
| [2] https://venusoft.net/#home
| maxlin wrote:
| That's actually quite interesting. Not entirely what I had in
| mind but close! My version would have only the first boot be
| a bit slow, but the aspect of dynamically replacing local
| content there is cool.
|
| This would be extra cool for LAN parties with good network
| hardware
| Zekio wrote:
| Steam game installs are bottlenecked by CPU speed these days
| due to the heavy compression, so I doubt it would be much
| faster.
| maxlin wrote:
| Well, the amount of compression isn't set in stone; obviously
| a system like this would run with a less compressed dataset to
| balance game boot time and the time taken away from running the
| game by compression, and to scale with available bandwidth.
|
| With low bandwidth, just downloading the whole thing while
| having enough compression to saturate the local system to about
| 80% would be optimal instead, sure.
| ur-whale wrote:
| Great initiative, especially the new sync algorithm, but giant
| hurdles to adoption:
|
| - only works on a weird combo of (src platform / dst platform).
| Why???? How hard is it to write platform-independent code to
| read/write bytes and send them over the wire in 2025?
|
| - uses bazel, an enormous, Java-based abomination, to build.
|
| Fingers crossed that these can be fixed, or this project is dead
| in the water.
| hobs wrote:
| The first thing might be considered a bug by Googlers, but
| everyone I have talked to LOVED their Bazel, or at least
| thought of it as superior to any other tool that does the same
| stuff.
|
| Literally tonight my buddy was talking about his months-long
| plan to introduce Bazel into his company's infra.
| jve wrote:
| Hey, the repo is archived, and as I read it, the tool was meant
| to solve one specific scenario. Not everything has to please
| the public.
|
| The great thing is that Googlers could make such a tool and
| publish it in the first place. So you can improve it to use in
| your scenario, or become the maintainer of such a tool.
| maccard wrote:
| > only works on a weird combo of (src platform / dst platform).
| Why????
|
| Stadia ran on linux, and 99.9999999% of game development is
| done on windows (and cross compiled for linux).
|
| > Fingers crossed that these can be fixed, or this project is
| dead in the water.
|
| The project was archived 9 months ago, and hasn't had a commit
| in 2 years. It's already dead.
| EdSchouten wrote:
| I've also been doing lots of experimenting with Content Defined
| Chunking since last year (for https://bonanza.build/). One of the
| things I discovered is that the most commonly used algorithm
| FastCDC (also used by this project) can be improved significantly
| by looking ahead. An implementation of that can be found here:
|
| https://github.com/buildbarn/go-cdc
| Scaevolus wrote:
| This lookahead is very similar to the "lazy matching" used in
| Lempel-Ziv compressors!
| https://fastcompression.blogspot.com/2010/12/parsing-level-1...
|
| Did you compare it to Buzhash? I assume gearhash is faster
| given the simpler per iteration structure. (also, rand/v2's
| seeded generators might be better for gear init than mt19937)
| EdSchouten wrote:
| Yeah, GEAR hashing is simple enough that I haven't considered
| using anything else.
|
| Regarding the RNG used to seed the GEAR table: I don't think
| it actually makes that much of a difference. You only use it
| once to generate 2 KB of data (256 64-bit constants). My
| suspicion is that using some nothing-up-my-sleeve numbers
| (e.g., the first 2048 binary digits of π) would work as well.
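|
| For reference, a sketch of what that table and the per-byte
| update look like (details vary between implementations):
|
|     import random
|
|     def gear_table(seed=0):
|         # 256 constants x 64 bits = 2 KB of "random" data; which
|         # RNG or nothing-up-my-sleeve constants produced it
|         # matters little, as long as every producer uses the
|         # exact same table
|         rng = random.Random(seed)  # CPython's RNG is MT19937
|         return [rng.getrandbits(64) for _ in range(256)]
|
|     GEAR = gear_table()
|
|     def step(h, byte):
|         # the entire per-byte work of a GEAR hash:
|         # one shift, one table lookup, one add
|         return ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF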
| pbhjpbhj wrote:
| The random number generation could match the first 2048
| digits of pi, so if it works with _any_ random number...
|
| If it doesn't work with any random number, then some work
| better than others, and intuitively you can find a (or a
| set of) best seed(s).
| Scaevolus wrote:
| Right, just one fewer module dependency using the stdlib
| RNG.
| rokkamokka wrote:
| What would you estimate the performance implications of using
| go-cdc instead of fastcdc in their cdc_rsync are?
| EdSchouten wrote:
| In my case I observed a ~2% reduction in data storage when
| attempting to store and deduplicate various versions of the
| Linux kernel source tree (see link above). But that also
| includes the space needed to store the original version.
|
| If we take that out of the equation and only measure the size
| of the additional chunks being transferred, it's a reduction
| of about 3.4%. So it's not an order of magnitude difference,
| but not bad for a relatively small change.
| quotemstr wrote:
| I wonder whether there's a role for AI here.
|
| (Please don't hurt me.)
|
| AI turns out to be useful for data compression
| (https://statusneo.com/creating-lossless-compression-
| algorith...) and RF modulation optimization
| (https://www.arxiv.org/abs/2509.04805).
|
| Maybe it'd be useful to train a small model (probably of the
| SSM variety) to find optimal chunking boundaries.
| EdSchouten wrote:
| Yeah, that's true. Having some kind of chunking algorithm
| that's content/file format aware could make it work even
| better. For example, it makes a lot of sense to chunk source
| files at function/scope boundaries.
|
| In my case I need to ensure that all producers of data use
| exactly the same algorithm, as I need to look up build cache
| results based on Merkle tree hashes. That's why I'm
| intentionally focusing on having algorithms that are not only
| easy to implement, but also easy to implement _consistently_.
| I think that MaxCDC implementation that I shared strikes a
| good balance in that regard.
| xyzzy_plugh wrote:
| > https://bonanza.build
|
| I just wanted to let you know, this is really cool. Makes me
| wish I still used Bazel.
| laidoffamazon wrote:
| As I've gotten further in my career I've started to wonder - how
| many engineering quarters did it take to build this for their
| customers? How did they manage to get this on their own roadmap?
| This seems like a lot of code surface area for a fairly minimal
| optimization that would be redundant with a different
| development substrate (like running Windows on Stadia, the way
| Amazon Luna worked...)
| jayd16 wrote:
| It's easy to get work on this problem. Any effort that shortens
| game deploy time will be highly visible. It's something every
| game needs, and every member of the team deals with.
| laidoffamazon wrote:
| I'm sympathetic to this idea, but it seems like this is a
| situation most game developers don't have, because they just
| develop locally. Sometimes they do need to push to a console,
| which this could help with if Microsoft or Sony built it into
| their dev kit tooling.
| grodes wrote:
| You are thinking like a manager, but this (as with most of the
| good things in life) has been built by doers, artisans, and
| engineers (developers).
|
| This is an interesting enough problem, with huge potential
| benefits for humanity if it manages to improve anything, which
| it did.
| AnonC wrote:
| Does anyone know if there's work being done to integrate this
| into the standard rsync tool (even as an optional feature)? It
| seems like a very useful improvement that ought to be available
| widely. From this website it seems a bit disappointing that it's
| not even available for Linux to Linux transfers.
| rincebrain wrote:
| You can find some thoughts on it not working for Linux to
| Linux, and more broad compatibility, here[1] and here[2].
|
| [1] - https://github.com/google/cdc-file-
| transfer/issues/56#issuec...
|
| [2] - https://github.com/librsync/librsync/issues/242
| est wrote:
| I wonder if this could be applied to git.
|
| A git blob is hashed with a header containing its decimal
| length, so if you change even a small bit of content, you have
| to calculate the hash from the start again.
|
| Something like CDC would improve this a lot.
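|
| For reference, the blob hashing in question looks roughly like
| this:
|
|     import hashlib
|
|     def git_blob_id(content: bytes) -> str:
|         # a blob's ID is the SHA-1 of "blob <decimal length>\0"
|         # followed by the content, so a one-byte edit anywhere
|         # means rehashing the entire blob from the start
|         header = b"blob %d\x00" % len(content)
|         return hashlib.sha1(header + content).hexdigest()
|
|     print(git_blob_id(b"hello world\n"))
|     # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad, the same ID
|     # that `git hash-object` prints for that content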
| oac wrote:
| It's done in xet as a replacement for git lfs:
| https://huggingface.co/blog/from-files-to-chunks
| pabs3 wrote:
| Backup tools like restic/borg do this, I wonder if anyone has
| used them to replace git yet.
| janpmz wrote:
| Tailscale plus python3 -m http.server 1337, and then navigating
| the browser to ip:1337, is a nice way to transfer files too
| (without chunking). I've made an alias for it: alias
| serveit="python3 -m http.server 1337"
| wheybags wrote:
| If anyone else was left wondering about the details of how CDC
| actually generates chunks, I found these two blog posts explained
| the idea pretty clearly:
|
| https://joshleeb.com/posts/content-defined-chunking.html
|
| https://joshleeb.com/posts/gear-hashing.html
| jcul wrote:
| Thanks, I was puzzled by that. They kind of gloss over it in
| the original link.
|
| Looking forward to reading those.
| tgsovlerkhgsel wrote:
| Key sentence: "The remote diffing algorithm is based on CDC
| [Content Defined Chunking]. In our tests, it is up to 30x faster
| than the one used in rsync (1500 MB/s vs 50 MB/s)."
| MayeulC wrote:
| I am quite confused; doesn't rsync already use content-defined
| chunk boundaries, with a condition on the rolling hash to define
| boundaries?
|
| https://en.wikipedia.org/wiki/Rolling_hash#Content-based_sli...
|
| The speed improvements over rsync seem related to a more
| efficient rolling hash algorithm, and possibly to using native
| Windows executables instead of Cygwin (Windows file systems are
| notoriously slow, maybe that plays a role here).
|
| Or am I missing something?
|
| In any case, the performance boost is interesting. Glad the
| source was opened, and I hope it finds its way into rsync.
| sneak wrote:
| rsync seems frozen in time; it's been around for ages and there
| are so many basic and small quality of life improvements that
| could have been made that haven't been. I have always assumed
| it's like vim now: only really maintained in theory, not in
| practice.
| Zardoz84 wrote:
| So you have not used vim or neovim in the last 10 years?
| lftl wrote:
| To be fair, there was a roughly 6 year period when vim saw
| one very minor release. That slow development period was
| the impetus for the fork of Neovim.
| Zardoz84 wrote:
| I know. I use Neovim. But since then, and thanks to
| Neovim, Vim has sped up and got some improvements.
| dotancohen wrote:
| Time for neorsync.
|
| That said, VIM 8 was terrific.
| chasil wrote:
| Please bear in mind that there are [now] two distinct rsync
| codebases.
|
| The original is the GPL variant [today displaying "Upgrade
| required"]:
|
| https://rsync.samba.org/
|
| The second is the BSD clone:
|
| https://www.openrsync.org/
|
| The BSD version would be used on platforms that are
| intolerant of later versions of the GPL (Apple, Android,
| etc.).
| re wrote:
| > doesn't rsync already use content-defined chunk boundaries,
| with a condition on the rolling hash to define boundaries?
|
| No, it operates on fixed size blocks over the destination file.
| However, by using a rolling hash, it can detect those blocks at
| any offset within the source file to avoid re-transferring
| them.
|
| https://rsync.samba.org/tech_report/node2.html
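|
| A rough sketch of that scheme (illustrative only; real rsync
| uses different checksum and block-size details):
|
|     import hashlib
|     from collections import defaultdict
|
|     BLOCK = 4  # tiny block so the demo is readable
|
|     def weak(block):
|         # rsync-style weak checksum: cheap to compute and cheap
|         # to "roll" forward one byte at a time
|         n = len(block)
|         a = sum(block) % 65536
|         b = sum((n - i) * x for i, x in enumerate(block)) % 65536
|         return a, b
|
|     def delta(old, new):
|         # 1. index the destination file's fixed-size blocks
|         table = defaultdict(list)
|         for off in range(0, len(old) - BLOCK + 1, BLOCK):
|             blk = old[off:off + BLOCK]
|             table[weak(blk)].append((off, hashlib.md5(blk).digest()))
|
|         # 2. slide over the source; the rolling checksum lets us
|         #    test every offset without rehashing the whole window
|         ops, lit, k = [], bytearray(), 0
|         a, b = weak(new[0:BLOCK]) if len(new) >= BLOCK else (0, 0)
|         while k + BLOCK <= len(new):
|             win = new[k:k + BLOCK]
|             hit = next((off for off, strong in table.get((a, b), [])
|                         if hashlib.md5(win).digest() == strong), None)
|             if hit is not None:
|                 if lit:
|                     ops.append(("literal", bytes(lit)))
|                     lit = bytearray()
|                 ops.append(("copy", hit))
|                 k += BLOCK
|                 if k + BLOCK <= len(new):
|                     a, b = weak(new[k:k + BLOCK])
|             else:
|                 lit.append(new[k])
|                 if k + BLOCK < len(new):  # roll forward one byte
|                     out, inc = new[k], new[k + BLOCK]
|                     a = (a - out + inc) % 65536
|                     b = (b - BLOCK * out + a) % 65536
|                 k += 1
|         lit.extend(new[k:])
|         if lit:
|             ops.append(("literal", bytes(lit)))
|         return ops
|
|     print(delta(b"the quick brown fox!", b"the quick XX brown fox!"))
|     # copies two blocks, sends a short literal, then copies again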
| ohitsdom wrote:
| The readme very nicely contrasts the approach with rsync.
| exikyut wrote:
| I'm curious: what does MUC stand for? :)
| bilekas wrote:
| This is actually kind of cool. I've implemented my own version
| of this for my job, and it seems to be something that's
| important when the numbers get tight. But if I remember
| correctly, for their case I guess, wouldn't it have been easier
| to work from rsync?
|
| > scp always copies full files, there is no "delta mode" to copy
| only the things that changed, it is slow for many small files,
| and there is no fast compression.
|
| I haven't tried it myself, but doesn't this already suit that
| requirement? https://docs.rc.fas.harvard.edu/kb/rsync/
|
| > Compression If the SOURCE and DESTINATION are on different
| machines with fast CPUs, especially if they're on different
| networks (e.g. your home computer and the FASRC cluster), it's
| recommended to add the -z option to compress the data that's
| transferred. This will cause more CPU to be used on both ends,
| but it is usually faster.
|
| Maybe it's not fast enough, but seems a better place to start
| than scp imo.
| regularfry wrote:
| > The remote diffing algorithm is based on CDC. In our tests,
| it is up to 30x faster than the one used in rsync (1500 MB/s vs
| 50 MB/s).
| rincebrain wrote:
| rsync in my experience is not optimized for a number of use
| cases.
|
| Game development, in particular, often involves truly enormous
| sizes and numbers of assets, particularly for dev build
| iteration, where you're sometimes working with placeholder or
| unoptimized assets, and debug symbol bloated things, and in my
| experience, rsync scales poorly for speed of copying large
| numbers of things. (In the past, I've used naive wrapper
| scripts with pregenerated lists of the files on one side and
| GNU parallel to partition the list into subsets and hand those
| to N different rsync jobs, and then run a sync pass at the end
| to cleanup any deletions.)
|
| Just last week, I was trying to figure out a more effective way
| to scale copying a directory tree that was ~250k files varying
| in size between 128b and 100M, spread out across a
| complicatedly nested directory structure of 500k directories,
| because rsync would serialize badly around the cost of creating
| files and directories. After a few rounds of trying to do many-
| way rsync partitions, I finally just gave the directory to
| syncthing and let its pregenerated index and watching handle
| it.
| jmuhlich wrote:
| Try this: https://alexsaveau.dev/blog/projects/performance/fi
| les/fuc/f...
|
| > The key insight is that file operations in separate
| directories don't (for the most part) interfere with each
| other, enabling parallel execution.
|
| It really is magically fast.
|
| EDIT: Sorry, that tool is only for local copies. I just
| remembered you're doing remote copies. Still worth keeping in
| mind.
| Sammi wrote:
| It's dead and archived atm, but it looks like a good candidate
| for revival as an actual active open source project. If you ever
| wanted to work on something that looks good on your resume, then
| this looks like your chance. Basically just get it running and
| released on all major platforms.
| phyzome wrote:
| You can see something similar in use in the borg backup tool --
| content-defined chunking, before deduplication and encryption.
| syngrog66 wrote:
| CDC is an unfortunately chosen name
| 0xfeba wrote:
| the name reminds me of Microsoft's RDC, Remote Differential
| Compression.
|
| https://en.wikipedia.org/wiki/Remote_Differential_Compressio...
| velcrovan wrote:
| > Download the precompiled binaries from the latest release to a
| Windows device and unzip them. The Linux binaries are
| automatically deployed to ~/.cache/cdc-file-transfer by the
| Windows tools. There is no need to manually deploy them.
|
| Interesting, so unlike rsync there is no need to set up a service
| on the destination Linux machine. That always annoyed me a bit
| about rsync.
| justinsaccount wrote:
| The most common use for rsync is to run it over ssh where it
| starts the receiving side automatically. cdc is doing the exact
| same thing.
|
| You were misinformed if you thought using rsync required
| setting up an rsync service.
| charleshwang wrote:
| Is this how IBM Aspera works too? I was working QA at a game
| publisher a while ago, and they used it to upload some screen
| recordings. I didn't understand how it worked, but it was
| exceeding the upload speeds of the regular office internet.
|
| https://www.ibm.com/products/aspera
| ksherlock wrote:
| They should have duck ducked the initialism. CDC is Control Data
| Corporation.
| shae wrote:
| I've read lots about content defined chunking and recently
| heard about monoidal hashing. I haven't tried it yet, but
| monoidal hashing reads like it would be all around better. Does
| anyone know why or why not?
___________________________________________________________________
(page generated 2025-10-01 23:02 UTC)