[HN Gopher] How to build a faster file transfer protocol
___________________________________________________________________
How to build a faster file transfer protocol
Author : mcharawi
Score : 62 points
Date : 2022-03-26 16:44 UTC (6 hours ago)
(HTM) web link (www.trytachyon.com)
(TXT) w3m dump (www.trytachyon.com)
| kevinherron wrote:
| > If your internet connection is 1Gps and you are transferring a
| 10Gb file, it should theoretically take 10 seconds to transfer
|
| Err what? I don't know what this "Gps" unit is, but if it's 1Gbps
| (gigabit per second), and a 10GB (gigabyte) file, that's not how
| it works... it would be 80 seconds.
| madsbuch wrote:
| It should be OK. It says "10Gb" not "10GB", ie. it is a 10 giga
| _bit_ file. (while it is untraditional to measure file size in
| bits, it should be perfectly fine)
| [deleted]
| mcharawi wrote:
| Sorry about the typo; you are right, it should be Gbps. As for
| the transfer time, we are just using bits for the file size to
| make the mental math easier.
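|
| The arithmetic in this sub-thread can be checked with a short
| sketch. It assumes decimal (SI) prefixes and an ideal link with no
| protocol overhead, latency, or loss; the function and variable
| names are illustrative only.

```python
GIGA = 10**9  # decimal (SI) prefix, as normally used for link speeds


def transfer_seconds(size_bits: float, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring overhead, latency, and packet loss."""
    return size_bits / link_bits_per_second


one_gbps = 1 * GIGA            # 1 Gbps link
ten_gigabits = 10 * GIGA       # "10Gb" read as gigabits
ten_gigabytes = 10 * GIGA * 8  # "10GB" read as gigabytes, 8 bits per byte

print(transfer_seconds(ten_gigabits, one_gbps))   # 10.0
print(transfer_seconds(ten_gigabytes, one_gbps))  # 80.0
```

| Both readings are therefore self-consistent; the disagreement is
| only about whether the "b" means bits or bytes.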
| mypalmike wrote:
| I worked at a tier 2 ISP about 15 years ago that developed
| multiple products trying to sell accelerated transfers as a
| service. They worked similarly to what this article describes.
| The problem was that there were very few buyers. It's easier to
| sell transparent acceleration boxes as an appliance, and even
| then it's very niche.
| metadat wrote:
| > It took us a little while to build UDT..
|
| > Building this infrastructure took a substantial amount of
| time..
|
| > If anyone is interested in trying the Tachyon Transfer
| Algorithm we offer a storage transfer acceleration API like AWS
| does. Our SDK includes node, c++ and objc and could be used in a
| wide variety of applications
|
| So it was a lot of effort, and now they're inviting Big-G and
| Cloudflare to contact them to possibly achieve a paltry 30%-ish
| speed increase for certain scenarios? Or are they inviting app
| devs who want faster video uploads to reach out? What is the
| actual use case where the sometimes 30% improvement matters and
| actually moves the needle?!
|
| Why hasn't Tachyon been working with their prospective customers
| and warming them up the whole time, or at least working the
| social and investor nets and reaching out proactively already?
|
| This strategy is kind of like being a dweeb at a poorly lit
| school dance and hoping the most popular girl at the dance
| somehow notices you're wearing shoes that let you float a
| centimeter in the air. Cool trick, bud.
|
| Presumably it's not a $10/mo service contract. Is this really an
| effective strategy when building and selling to enterprise these
| days? To me it sounds like a risky and hard way to make less
| money than what is possible using tried and true product
| development strategies. To be fair, I have also made this mistake
| before. It was embarrassing enough as a solo-founder, and seems
| less forgivable with larger founding group sizes, because it
| means more folks agreed to support and follow such a sub-optimal
| harebrained scheme :)
|
| You all sound like very capable software engineers, and I know
| it's both fun and satisfying to build and make The Thing.
|
| Good luck, sincerely.
|
| P.s. You may also consider pursuing some of the medium sized
| targets like Backblaze, Rackspace, Larry Ellison's Oracle OCI, or
| Microshaft Azure.
|
| (sorry, I couldn't resist having some fun at the end, though the
| suggestion is real!)
| kkfx wrote:
| IMVHO the main issue in file transfer today is that in 2022 most
| people still do not have a public IP (such as a global IPv6
| address), so most people still have NAT traversal issues and need
| to rely on third parties or not-so-performant, more or less
| distributed networks...
|
| The second main issue is that most do not own a personal domain
| name with a subdomain per personal host (like
| {desktop,craphone,laptop}.mydomain.tld etc).
|
| Those two issues are so big IMVHO that they push all others aside...
| dochtman wrote:
| Does UDT come with encryption? If so, how does it compare to
| QUIC?
| mcharawi wrote:
| The canonical UDT implementation does not come with encryption,
| however there are some older open source GitHub repos that have
| attempted to add TLS to UDT. The original author of UDT,
| Yunhong Gu, has a project called Sector/Sphere that adds some
| application-level encryption to file transfer if you want to
| check it out: http://sector.sourceforge.net/. We've added
| encryption for our algorithm though!
|
| With respect to QUIC, I believe it was designed specifically to
| reduce the latency of HTTP connections by using multiple UDP
| flows and building the reliability/ordering guarantees at the
| application layer.
|
| The problem with getting performance increases out of multiple,
| distinct traffic flows is that you become more and more unfair
| to other packet traffic as you increase the number of flows you
| are using. For example, if you use 9 TCP (or any other AIMD)
| flows to send a file over some link, and a tenth connection is
| started, you now are taking up to 90% of the available
| bandwidth (because AIMD flows are designed to be fair amongst
| themselves).
| moreati wrote:
| > AIMD
|
| Additive Increase Multiplicative Decrease (for others
| wondering)
|
| > a feedback control algorithm best known for its use in TCP
| congestion control. AIMD combines linear growth of the
| congestion window when there is no congestion with an
| exponential reduction when congestion is detected.
|
| -- https://en.wikipedia.org/wiki/Additive_increase/multiplica
| ti...
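|
| A minimal sketch of the AIMD rule described in that quote. This is
| a toy model, not a real TCP stack: the window is counted in
| packets, "congestion" is simply the window exceeding a fixed
| capacity, and the constants (add 1, halve) are the textbook
| defaults, assumed here for illustration.

```python
from typing import List


def aimd_step(cwnd: float, congested: bool,
              add: float = 1.0, mult: float = 0.5) -> float:
    """Grow the window linearly, or cut it multiplicatively on congestion."""
    if congested:
        return max(1.0, cwnd * mult)
    return cwnd + add


def simulate(capacity: float, rounds: int, start: float = 1.0) -> List[float]:
    """Record the window at the start of each round trip."""
    cwnd = start
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        # Toy congestion signal: the window overshot the path capacity.
        cwnd = aimd_step(cwnd, congested=cwnd > capacity)
    return history


print(simulate(capacity=8, rounds=12))
```

| The output traces the familiar sawtooth: a slow linear climb, a
| sharp cut once the capacity is exceeded, then another climb.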
| Scaevolus wrote:
| Always good to see more in this space! Long fat networks (LFNs or
| "elephants") are everywhere, especially once you start moving
| data between continents.
|
| I've had success personally with UFTP, but you explicitly set the
| transmit rate. Don't forget to enable encryption/authentication
| if you want the downloads to be verified! You'll get silent UDP
| corruption otherwise: http://uftp-multicast.sourceforge.net/
| ac130kz wrote:
| Some basic transfer based on UDP with forward error correction is
| a really good solution to tackle packet loss and avoid TCP
| congestion entirely.
| mcharawi wrote:
| So congestion and packet loss are different problems; it is
| true that forward error correction could be a good way to avoid
| retransmitting lost packets, but the only way to avoid
| congestion is to adjust the congestion window (for window based
| congestion control) or packet sending rate (for rate based
| congestion control) based on some indicator of congestion.
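|
| To make the distinction concrete, here is a sketch of the simplest
| form of forward error correction: one XOR parity packet per group
| of equal-length packets. It is an illustrative toy (helper names
| are hypothetical, and real schemes use stronger codes). Note that
| nothing in it decides how fast to send, which is the separate job
| of congestion control.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Parity packet: XOR of every packet in the group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing packet (marked None) from the parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet")
    present = [p for p in received if p is not None]
    # XOR of the parity with all surviving packets leaves the lost one.
    return reduce(xor_bytes, present, parity)


group = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(group)
print(recover([b"abcd", None, b"ijkl"], parity))  # b'efgh'
```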
| mcharawi wrote:
| Hey HN! I'm Mahamad, co-founder of Tachyon Transfer, where we're
| building faster file transfer tools for developers. We've spent
| the last year building an ultra-fast FTP replacement, and we
| thought we'd show you guys what our technical process was like.
| Let me know if you have any questions!
| KennyBlanken wrote:
| Please show performance tests versus hpn-ssh, GridFTP (a.k.a. the
| de facto tool of the particle physics and genetics research
| communities) and simpler systems like wget2's multi-threaded
| mode.
| eps wrote:
| Would also be nice to compare against different standard TCP
| congestion avoidance algs, of which there are plenty.
|
| It is, after all, a _very_ well researched area.
| Bancakes wrote:
| Can I tunnel this over SSH and use it the same way, as a faster
| drop-in replacement for SFTP? (Why not?)
| mcharawi wrote:
| Standard SSH uses TCP over port 22 by default, so it wouldn't
| be possible without modifying SSH to use a different
| protocol. That being said, however, our protocol uses TLS
| over UDP via the OpenSSL libraries so it is secure by
| default. We also offer a BSD-style socket interface that you
| can use if you want a drop in replacement for TCP sockets.
| Shoot me a note at mahamad _at_ trytachyon _dot_ com if you
| want to chat!
| [deleted]
| rsync wrote:
| Is this software that one licenses and uses on any arbitrary
| network or do you run a network of some kind that users pay to
| access?
|
| Or both ?
|
| I think this is a software package but the tl;dr doesn't make
| that clear to me...
| mcharawi wrote:
| At the moment we offer both options. We offer our own network
| with a pricing plan similar to massive.io (though 10c per GB
| vs. 25c). Our licensing is cheaper but requires large volumes.
| tener wrote:
| Can you share some actual performance numbers across whatever
| are the key metrics that you observe?
| AitchEmArsey wrote:
| Interesting, but somewhat misses the point; the reason people
| want an alternative to Aspera is that no-one wants to pay for
| file transfer tools.
| mcharawi wrote:
| Thanks for the feedback; we're actually planning to open-source
| a version of our work that significantly improves on the
| original UDT project: https://udt.sourceforge.io/.
| AitchEmArsey wrote:
| Look forward to it. I'd be interested to hear how your tool
| compares with Facebook WDT[1], as that would be my go-to
| right now if someone asked me for a fast point-to-point data
| transfer solution.
|
| [1] https://github.com/facebook/wdt
| amaccuish wrote:
| Never understood, once SMB gets going it's pretty fast, but it
| takes agessss to list a directory. Like why can't it just pipe
| the output of dir() or ls (when samba) out over the network.
| rsync wrote:
| I actually read the entire article and was specifically looking
| for a reference to hpn-ssh which I think is the most standard way
| to approach this ... can op comment here on that tooling and how
| that compares and contrasts ?
| mcharawi wrote:
| Thanks for reading!
|
| I haven't seen hpn-ssh before, but from a cursory look at the
| project page it looks like the main improvements are targeted
| at improving the speed of the encryption using multi-threading,
| and increasing ssh/scp buffer sizes. These are certainly good
| improvements over standard ssh/scp (and setting TCP buffers to
| the value of the bandwidth delay product for a particular
| network path is a well known way to squeeze some perf out of
| TCP) but do not address the root cause of slowdown in window-
| based, loss-based congestion control.
|
| In order to be fair to other flows, exponential back-off is
| required on detection of congestion, and packet loss as an
| indicator of congestion is both a lagging indicator of
| congestion and has a very low signal to noise ratio on high
| throughput, lossy networks.
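|
| The bandwidth-delay product mentioned above can be sketched as
| follows. The 1 Gbps link, 200 ms round-trip time, and 64 KiB
| window are assumed example values, not measurements from the
| article.

```python
def bdp_bytes(bandwidth_bits_per_second: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the path full."""
    return bandwidth_bits_per_second * rtt_seconds / 8


def window_limited_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Best-case throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


link_bps = 1e9  # assumed 1 Gbps path
rtt = 0.2       # assumed 200 ms round-trip time

# Roughly 25 MB of buffer is needed to fill this path.
print(f"BDP: {bdp_bytes(link_bps, rtt) / 1e6:.1f} MB")

# A 64 KiB window caps the same path at only a few Mbit/s.
print(f"64 KiB window: {window_limited_bps(64 * 1024, rtt) / 1e6:.2f} Mbit/s")
```

| This is why buffer tuning helps on long fat networks, while
| leaving the loss-based back-off behaviour itself unchanged.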
| KennyBlanken wrote:
| hpn-ssh is specifically designed for high latency, high
| bandwidth file transfer and is more than just "big buffers
| and multi-threaded." And the question remains: how does your
| solution compare in simulated and real-world testing?
|
| It's a little strange that you "conducted an extensive
| literature review" of congestion algorithms but you aren't
| aware of basic common tools like hpn-ssh, wget2's
| multithreading mode, or GridFTP which is used extensively in
| particle physics and genetics research communities.
| mcharawi wrote:
| Thanks for the feedback. The file transfer ecosystem is
| very large and conducting a thorough review of the
| application level tools was not the goal of this project,
| as the overwhelming majority of them focus on differences
| at the application layer, not the transport layer.
|
| We are specifically focusing on rebuilding a congestion
| control algorithm from the ground up that can better
| tolerate modern network conditions, including things like
| high bandwidth, high packet loss, and high latency.
|
| With respect to GridFTP, wget2 multi-threading, and other
| multi-flow approaches: the problem with getting performance
| increases out of multiple, distinct traffic flows is that
| you become more and more unfair to other packet traffic as
| you increase the number of flows you are using. For
| example, if you use 9 TCP (or any other AIMD) flows to send
| a file over some link, and a tenth connection is started,
| you now are taking up to 90% of the available bandwidth
| (because AIMD flows are designed to be fair amongst
| themselves).
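|
| The 90% figure follows from an idealized model in which every
| AIMD flow sharing a bottleneck converges to an equal share. A
| sketch under that assumption (real shares also depend on RTT and
| loss patterns; the function name is illustrative):

```python
def bandwidth_share(my_flows: int, other_flows: int) -> float:
    """Fraction of the bottleneck held by my_flows, assuming equal per-flow shares."""
    return my_flows / (my_flows + other_flows)


print(bandwidth_share(1, 1))  # 0.5: one flow against one competitor
print(bandwidth_share(9, 1))  # 0.9: nine flows against the same competitor
```

| Per-flow fairness thus rewards whoever opens the most flows,
| which is the unfairness described above.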
| fn-mote wrote:
| This article was interesting and also frustrating to read.
|
| 1. There are very few numbers. In particular, improvement in
| performance under various circumstances is _not_ given! If you
| dig around you can find their transfer time application [1], but
| there is no discussion on that page.
|
| 2. The basis for the improvement is not spelled out. (References
| are given, but you have to know the field - "acronyms only".) If
| I understand correctly, their contribution is the improved
| measures of congestion used. Their landing page just touts "don't
| use TCP"... which sounds like Step 0 of a very long process.
|
| I admit, the title is basically accurate: "how to build" not "the
| performance of".
|
| tl;dr: Start with existing work, simulate and improve
| incrementally.
|
| I don't know anything about the field, but this article didn't
| lead me to understand any better. I'd love to know the real
| numbers they observed, which approaches didn't pan out, are they
| effectively using an error correcting code?
|
| Anyway, it's certainly not an academic paper - just an
| advertisement.
|
| [1] https://www.trytachyon.com/file-transfer-calculator
| mcharawi wrote:
| Thanks for taking the time to read it! To address your
| concerns:
|
| 1. To give you an idea of the speed improvements, we
| transferred a 2GB file between Ohio and Singapore on AWS and
| were able to transfer it in 0:26 (seconds) using our protocol,
| vs 2:15 for SCP.
|
| 2. The basis for improvement is taking into account the changes
| in round-trip-time for a particular network path; these
| temporary increases are used as the primary congestion signal.
|
| We are not using error correcting codes, which are good for
| preventing the retransmission of packets but do not address the
| underlying problem of avoiding congestion in a network.
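|
| As a rough illustration of using round-trip-time increases as a
| congestion signal, here is a generic delay-based rate controller.
| It is NOT Tachyon's algorithm, whose details are not given in
| this thread; the class name, threshold, and constants are all
| hypothetical.

```python
class DelayBasedRate:
    """Toy rate controller that treats RTT inflation as congestion."""

    def __init__(self, rate_pps: float = 100.0, threshold: float = 1.25,
                 increase: float = 10.0, decrease: float = 0.5) -> None:
        self.rate_pps = rate_pps    # current sending rate, packets per second
        self.threshold = threshold  # RTT inflation factor treated as congestion
        self.increase = increase    # additive step when the path looks idle
        self.decrease = decrease    # multiplicative cut when queues build up
        self.base_rtt = None        # lowest RTT seen so far (propagation estimate)

    def on_rtt_sample(self, rtt_seconds: float) -> float:
        """Update the rate from one RTT measurement and return the new rate."""
        if self.base_rtt is None or rtt_seconds < self.base_rtt:
            self.base_rtt = rtt_seconds
        if rtt_seconds > self.base_rtt * self.threshold:
            # RTT rose well above the baseline: queues are filling, back off.
            self.rate_pps *= self.decrease
        else:
            self.rate_pps += self.increase
        return self.rate_pps


ctrl = DelayBasedRate()
for sample in (0.100, 0.101, 0.102, 0.150):
    print(sample, ctrl.on_rtt_sample(sample))
```

| Unlike a loss-based scheme, this reacts before any packet is
| dropped, and a random loss with no RTT increase does not trigger
| a back-off.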
| koprulusector wrote:
| Can I ask a dumb question? Why SCP and not rsync?
| Straw wrote:
| How much was SCP affected by TCP buffer size tuning?
| amelius wrote:
| I just tried the calculator. It seems that if you're in "US
| Metro" or "Europe", then the transfer protocol is just as fast
| as TCP, is this correct? I wonder why this is the case. Is it
| because the routers play more fairly?
| jandrese wrote:
| I would expect it means your service provider isn't dropping
| packets. Their protocol seems to just be more aggressive
| about not backing off in the face of packet loss, which is
| helpful if one of your links is a marginal radio connection.
|
| The cynic in me thinks they achieve better throughput because
| they don't play nice with TCP and monopolize the link while
| everybody else gets backed off.
___________________________________________________________________
(page generated 2022-03-26 23:01 UTC)