[HN Gopher] NNCP: Lossless Data Compression with Neural Networks...
___________________________________________________________________
NNCP: Lossless Data Compression with Neural Networks (2019)
Author : ToJans
Score : 372 points
Date : 2021-05-22 06:15 UTC (16 hours ago)
(HTM) web link (bellard.org)
(TXT) w3m dump (bellard.org)
| GistNoesis wrote:
| I'd like to highlight the Introduction of nncp.pdf, which explains
| in a very compressed form how it works and encompasses all the
| magic.
|
| "The lossless data compressor employs the traditional predictive
| approach: at each time t, the encoder uses the neural network
| model to compute the probability vector p of the next symbol
| values t knowing all the preceding symbols s0 up to st-1. The
| actual symbol value st is encoded using an arithmetic encoder
| with approximately -log2 ( pst ) bits. Then the model is updated
| knowing the symbol st. The decoder works symmetrically so there
| is no need to transmit the model parameters. It implies both
| encoder and decoder update their model identically.
|
| When no preprocessing is done, st represents the byte at position
| t. Hence there are Ns= 256 different symbol values from 0 to
| Ns-1."
|
| For those familiar with Huffman coding: this uses arithmetic
| coding instead https://en.wikipedia.org/wiki/Arithmetic_coding
|
| It allows adaptive coding (i.e. changing the probabilities of
| symbols dynamically). Here these probabilities are modelled
| dynamically using a neural network.
|
| The magic is that the neural network parameters are defined
| implicitly: you don't need to transmit the neural network
| parameters, so you can use as big a neural network as you want.
|
| During decoding, the from-scratch network is continuously
| trained on the freshly decoded data, in the same way it was
| during encoding. The more data you compress, the more you train
| the internal neural network parameters, and the better the
| prediction of the next character gets.
|
| You decode 20 bytes, you update the model with this new data,
| you decode 20 new bytes with the new model, you update the model
| again, and so on. Every 1,000,000 bytes you can even update your
| model multiple times using all currently available data to make
| the network converge faster.
|
| Of course this only works if everything is exactly
| deterministic, which Fabrice Bellard took great effort to
| guarantee, and which is no small engineering feat.
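|
| A minimal, runnable sketch of that loop (my own illustration, not
| Bellard's code): a tiny adaptive byte-frequency model stands in
| for the neural network, and instead of a real arithmetic coder we
| just total the ideal code length -log2(p[s]) the coder would
| spend.
      import math

      class AdaptiveModel:
          # Stand-in for the NN: any deterministic online model works.
          def __init__(self):
              self.counts = [1] * 256   # smoothed byte counts

          def predict(self):            # probability vector p
              total = sum(self.counts)
              return [c / total for c in self.counts]

          def update(self, s):          # same update on both sides
              self.counts[s] += 1

      def ideal_size_bits(data: bytes) -> float:
          model = AdaptiveModel()
          bits = 0.0
          for s in data:                # at each step t:
              p = model.predict()       # p depends on s_0..s_{t-1} only
              bits += -math.log2(p[s])  # coder spends ~this many bits
              model.update(s)           # then learn from s_t
          return bits

      # The decoder runs the same model from the same initial state:
      # it uses p to decode s_t from the bitstream, then calls
      # update(s_t), so both models stay in sync and no weights are
      # ever transmitted.
      data = b"abracadabra" * 100
      print(len(data), "->", ideal_size_bits(data) / 8, "ideal bytes")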
| meiji163 wrote:
| I was just working on a similar project and didn't think you
| could decode without sending the weights, amazing!
| debbiedowner wrote:
| What is the probability that there are no collisions by the
| encoder?
| labawi wrote:
| What do you mean by collisions?
|
| As described, arithmetic coding does a 1:1 mapping of input to
| output, while the NN deterministically generates a probability
| distribution for the arithmetic coder. Conceptually, it would
| work with any deterministic / reproducible function, even a PRF
| or a running hash, though the point of the NN is to give good
| distribution estimates for interesting inputs, and thus short
| output from the arithmetic encoder.
|
| Presumably, there are some limited choices during coding,
| which then need to be encoded as well.
| pkhuong wrote:
| 100%.
| threatripper wrote:
| Is the model actually trained (=updating the weights instead of
| just the state) during en/decoding?
| rakmos wrote:
| The concept is rather intriguing in terms of having a
| compression algorithm that evolves towards an ideal optimization
| asymptote. I, for one, would be annoyed at the thought that
| compressing an identical artifact might result in a different
| compressed size, as it would sow doubt as to whether they
| actually had the same input.
| TD-Linux wrote:
| As long as the initial state of the network is fixed, the
| compressed output will be identical every time.
| esjeon wrote:
| FYI, the compression speed numbers from the paper:
      gzip = 17 MB/s
      xz   = 1 MB/s
      CMIX = 1.6 kB/s
      NNCP = 2~3 kB/s
|
| (Note that NNCP is using an RTX 3090 here.)
|
| Not only is this unacceptable for hot/warm data, it's also
| impractical for cold data: it would take 10~16 years to compress
| or decompress 1 TB of data.
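| (Back-of-the-envelope check: 10^12 bytes at 2~3 kB/s is roughly
| 3-5 x 10^8 seconds, i.e. about 10-16 years.)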
| vanderZwan wrote:
| It does make me wonder: if we wrote an LZ77-based algorithm
| (like gzip and xz) with that time-budget in mind, could we
| match NNCP?
| user-the-name wrote:
| No. People are pushing the envelope for these algorithms
| constantly, as part of the various competitions in data
| compression, and they don't get anywhere close.
| pornel wrote:
| I don't know enough about xz, but for gzip I can tell you that
| it definitely could not. Even with an infinite amount of
| computation there's a hard limit on how well gzip can compress,
| and it's not very good.
|
| For example, let's say you've decoded a stream that so far
| contained:
|
| 123456
|
| a clever algorithm can guess that "7890" will come next
| (compression is all about making good guesses about the future
| data). LZ77 just can't -- it has no way to express such a
| prediction. This is a general problem for LZ77 with data that
| has a predictable structure but isn't a literal repetition of
| previous data.
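|
| A quick way to see this (my own toy illustration, not from the
| article): DEFLATE handles literal repeats extremely well, but
| data that is predictable without repeating byte-for-byte (e.g.
| the decimal integers 1, 2, 3, ... written one after another)
| gains much less.
      import zlib

      repeats = b"1234567890" * 10000          # literal repetition
      counting = "".join(str(i) for i in range(1, 20000)).encode()

      for name, data in [("repeats", repeats), ("counting", counting)]:
          c = zlib.compress(data, 9)
          print(name, len(data), "->", len(c), "bytes",
                round(8 * len(c) / len(data), 2), "bpb")
      # "repeats" shrinks to a tiny fraction of its size; "counting"
      # still needs a significant fraction of the original, even
      # though a near-perfect predictor exists for it.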
| jopsen wrote:
| It's still an amazing research result.
|
| Demonstrating a novel compression scheme with a high
| compression ratio is worthwhile. It might inspire something
| else.
| fabioyy wrote:
| Fabrice Bellard is a legend!
| telesilla wrote:
| Could this enable smaller file sizes for lossless video and audio?
| riffraff wrote:
| It seems too slow for real time decompression
| telesilla wrote:
| Real-time decompression isn't the only requirement; for example,
| media storage has massive needs now that studio production is
| moving to the cloud. If a new lossless codec is possible, even
| one that needs a day to compress or decompress, there are still
| huge benefits in some applications.
| layoutIfNeeded wrote:
| No.
| legulere wrote:
| I often wonder where Fabrice Bellard finds the time for all of
| his projects
| alecco wrote:
| If you remove all the time wasters and do a strict 40 hours of
| work, you'd be surprised how much time you have left.
|
| Remove the time wasted on HN/reddit/watercooling and the habit
| of compensating by working more hours, and cut the "cord" for
| all/most entertainment. Instead, read technical stuff. Even if
| you do it for just a couple of hours a day you'll leave most
| people behind. Same with programming. Ideally do it in the
| morning _before_ work. So go to sleep very early.
| ZephyrBlu wrote:
| For me, I think social media on the whole has been quite a
| valuable use of time. It feels less like a waste and more like
| research or learning.
|
| The serendipity of HN, Reddit and Twitter has exposed me to a
| lot of ideas and information I would otherwise never have known
| about.
| opsy2 wrote:
| Maybe it's an explore/exploit thing, social media is good
| for exploring but to build depth we should switch off the
| socials and hit the books every other week.
| alecco wrote:
| Sure. I *try* to limit my HN/Wikipedia/etc usage because
| it's interesting but it gets me very distracted from my
| goals. For example, my monkey brain hears a song and I
| suddenly _need_ to get the lyrics, the story behind it,
| etc. The monkey brain is very creative at finding ways to
| avoid doing difficult tasks.
|
| Reddit and Twitter are mostly a waste of time nowadays due
| to the signal to noise ratio. I dropped my reddit addiction
| and I'm working on eliminating Twitter.
|
| I wish there were a social network system where people pool
| and share what they are expected to be doing and help each
| other. Or at least encourage each other to focus. But this
| goes against the advertisement paradigm where they want us
| to be impulsive and stupid. Sad, sad, sad.
| ZephyrBlu wrote:
| I think that a lot of serendipitous moments created by
| social media have actually helped me move towards my
| goals.
|
| Not directly, but by giving me new perspectives and ideas
| that bounce off my current perspective and ideas to
| create new ways of thinking about things.
|
| My perspective now is that too much focus is a bad thing
| because it comes at the expense of broadening your
| perspective.
|
| Reddit I agree is low signal, but I think Twitter is high
| signal depending on who you're following.
| alecco wrote:
| > Twitter is high signal depending on who you're
| following.
|
| The signal might be good but is it the signal you should
| be focusing on? There is an amazing amount of interesting
| stuff being discovered all the time. But is it helping
| you achieve concrete goals?
|
| With Twitter I see a trap. Say you work on implementing/using
| <Knowledge Domain X> and follow exclusively people doing similar
| work. They will also share things outside this domain. Since
| 2016 almost all science Twitter users have become highly
| political, and there is no way to filter this out.
|
| This is a feature for Twitter and the like. They profit by
| making you emotional and sending you down a rabbit hole of
| impulsiveness.
| ballenf wrote:
| Could this work on an M1, if optimized for its ML cores?
| Bulat_Ziganshin wrote:
| There is a long NNCP thread on the forum dedicated to
| compression algos:
|
| https://encode.su/threads/3094-NNCP-Lossless-Data-Compressio...
| londons_explore wrote:
| Fabrice Bellard picks up one more side project, and beats state
| of the art in text compression by a substantial margin...
| rhaps0dy wrote:
| It's a very cool achievement (a NN in 200KB, to compress? Wow)
| but the CMIX compression program performs better in the 1st
| benchmark, and is competitive in the 2nd.
|
| So he does not beat the SoTA "by a substantial margin". Your
| comment is overstating the case.
| rndgermandude wrote:
| True.
|
| Aside from compression efficiency, I'd like to see other metrics
| too, like the work it takes to compute the compression (in units
| of energy, time, or whatever), to get more of a sense of which
| kinds of workloads something like this would make sense for.
| These super-efficient compressors often only make sense in long-
| term archival and compress-once-transmit-often scenarios.
|
| It appears http://www.mattmahoney.net/dc/text.html (that
| Bellard links to on the NNCP page) has some information:
|
| - NNCP (w/GPU) used 161812 nanoseconds/byte to compress,
| 158982 nanoseconds/byte to decompress, on an Intel Xeon
| E3-1230 v6, 3.5 GHz, RTX 3090 GPU, using about 6GB of RAM, to
| compress enwik9 down to 110,034,293 bytes
|
| - CMIX used 602867 nanoseconds/byte to compress, 601569
| nanoseconds/byte to decompress, on an Intel Core i7-7700K, 32
| GB DDR4 using about 25GB of RAM, to compress enwik9 down to
| 115,714,367 bytes
|
| - gzip -9 used 101 nanoseconds/byte to compress, 17
| nanoseconds/byte to decompress, on an... Athlon-64 3500+ in
| 32-bit mode running XP, to compress enwik9 down to
| 322,591,995 bytes
|
| - 7zip 9.20 used 1031 nanoseconds/byte to compress, 42
| nanoseconds/byte to decompress, on a Gateway M-7301U laptop
| with 2.0 GHz dual core Pentium T3200 (1MB L2 cache), 3 GB
| RAM, Vista SP1, 32 bit, to compress enwik9 down to
| 227,905,645 bytes
|
| These benchmarks are not directly comparable, as vastly
| different machines were used, but they still show a trend at the
| very least.
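|
| To put the per-byte figures in perspective (my own back-of-the-
| envelope arithmetic on the numbers above): at 161,812 ns/byte,
| compressing enwik9's 10^9 bytes takes about 1.6 x 10^5 seconds,
| i.e. roughly 1.9 days; CMIX needs about a week, while gzip -9
| finishes in under two minutes.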
|
| As expected, these super-efficient algorithms are slow, really
| slow. Honestly, I think that at least today they are too slow to
| make sense even for long-term archival, and far too slow for
| compress-once-transmit-often scenarios, since the decompression
| rate is almost as slow as the compression rate.
|
| Still, it's impressive work, don't get me wrong.
| rogerb wrote:
| Every time I see a Fabrice Bellard submission here - it's like a
| little cake that you found in your fridge unexpectedly. Great
| stuff!
| mercer wrote:
| It's like checking out the freezer and realizing you bought ice
| cream a while back.
| etaioinshrdlu wrote:
| I suspect his neural net library used in this could become quite
| useful and propagate far and wide if it were open-sourced:
| https://bellard.org/libnc/
| justinclift wrote:
| For reference, this is what its License section says:
| The LibNC library is free to use as a binary shared library.
| Contact the author if access to its source code is required.
|
| https://bellard.org/libnc/libnc.html#License
| xxs wrote:
      Compression ratio/speed
      Program or model    size: bytes    ratio: bpb    speed: KB/s
      xz -9                24 865 244       1.99         1020
      LSTM (small)         20 500 039       1.64           41.7
      Transformer          18 126 936       1.45            1.79 (!)
      LSTM (large2)        16 791 077       1.34            2.38
|
| Note the decompression is not faster than compression (unlike
| xz)
| bckr wrote:
| Okay. Stupid question. What in the world do the ratio numbers
| mean? I get that "bpb" means "byte-per-byte". Is it input
| bytes per output bytes? Or the other way around? And why do
| some of the ratios go below 1.0? e.g. NNCP v2 (Transformer)
| _0.914_
| xxs wrote:
| bpb is bits per byte, so 8/bpb is the compression ratio.
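|
| As a worked example with the enwik9 numbers quoted elsewhere in
| the thread: 110,034,293 compressed bytes * 8 bits / 10^9 input
| bytes ~= 0.88 bpb, i.e. roughly a 9x size reduction. A bpb below
| 1.0 simply means the output needs less than one bit per input
| byte.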
| IcePic wrote:
| bits per byte?
| londons_explore wrote:
| It's a lightweight ML library... But I'm not sure if it makes
| sense for anything with CUDA as a dependency to be lightweight.
| tromp wrote:
| Compression achieved on the first billion bytes of English
| Wikipedia is only 0.88 bits per byte (bpb). Even though the
| decompressing program is large at almost 200KB, that is... wow!
| londons_explore wrote:
| Note that it is an XML-formatted version of Wikipedia, so some
| of that data consists of very predictable XML tags.
| IcePic wrote:
| True, but that would hold true for all common compressors, so
| beating them is still a huge win.
| smel wrote:
| Guys, are you sure that Bellard is human?
| agustinbena wrote:
| Is there a church where one can go and pray to God Bellard?
| fit2rule wrote:
| You're in it.
| kzrdude wrote:
| I hope he's eligible for the Hutter prize for this improvement,
| should be a pretty nice sum.
| yalok wrote:
| True story: a long time ago, I was interviewing a guy and wanted
| to check his coding skills. I asked him to write some simple C
| function implementing some simple algorithm, and let him use my
| PC as I watched (did I say it was a very long time ago?). To my
| surprise, instead of writing the C code right away, the guy
| opened Far on my PC and, in the blink of an eye, using hotkeys,
| found some existing files on my disk, glanced through them,
| returned to the editor and wrote the function.
|
| When I asked what that was, and why he had to dive into my C
| files, he admitted he doesn't care to remember C syntax and just
| had to look it up quickly to write his function.
|
| It turned out he was mostly writing all of his code in x86
| assembly, and pretty complex code at that. His main project in
| assembly was a compressor that was faster and more efficient
| than RAR, which at that time was at the top. I tried it and it
| was true.
|
| Many years later, the guy went on to win the Hutter prize, and
| kept improving his own record year after year. I admire his grit
| and persistence, and how he keeps discovering new ways to
| improve his algorithms. And I wouldn't be surprised if he still
| maintains them in pure assembly (he did originally).
|
| I did hire the guy, and enjoyed working with him.
| ZephyrBlu wrote:
| So this guy was Alexander Rhatushnyak?
| londons_explore wrote:
| Sadly not, because it only works well on enwik9 but not on the
| smaller enwik8.
|
| I would guess that's down to neural networks' poor performance
| with small training sets.
| userbinator wrote:
| According to http://prize.hutter1.net/#prev he might actually
| be eligible.
| kzrdude wrote:
| GPUs are excluded, unfortunately.
| amelius wrote:
| Anything a GPU can do can be done on a CPU, albeit
| slower.
| kzrdude wrote:
| Uh.. not so great, this algorithm is already _very_ slow
| :)
| alecco wrote:
| This is very interesting, but for people who are not from the
| field, note:
|
| 1. The compressors he is comparing to *are* neural networks
| themselves (PAQ)
|
| 2. He is using a rather small corpus, enwik8 (100,000,000
| bytes), instead of the common enwik9
|
| 3. He is not publishing the amount of RAM or hardware he used
|
| http://mattmahoney.net/dc/dce.html (seems down)
|
| https://web.archive.org/web/20210216231934/http://mattmahone...
|
| It seems Bellard is starting to play with these. I bet he will
| find interesting things and I would love to see him interact in
| the Russian compression webforum (please don't link). Big names
| in compression like Yann Collet, Matt Mahoney, and Jarek Duda
| used to hang out there.
| SahAssar wrote:
| The linked page has results for both enwik8 and enwik9, right?
| alecco wrote:
| Don't know how I missed enwik9. Maybe too early for me.
| -\\_(tsu)_/-
| mpfundstein wrote:
| Bellard is so awesome.
|
| But can someone please explain to me how this guy finds the time
| to build all those amazing things? There's so much non-trivial
| stuff on his list. I wonder... Does he have kids? A family? Is
| he rich? Does he get millions to build quantum computers? I
| really wanna know :-)
|
| Thanks
| TynanSigg wrote:
| Oh hey, I did something like this a while ago, but much less
| polished. I basically made an rANS encoder and hooked it up to
| GPT-2 (via the excellent transformers library from HuggingFace)
| to see what compression ratio I could get. Bellard's approach
| seems to use a custom model, which is probably advisable as big
| language models like the one I used are _very_ slow. I didn't
| try to optimize much and only ever tested on my little personal
| laptop with just a CPU, but I only got encoding speeds of ~0.1
| Kb/s (limited mostly by available memory). However, I was able
| to get very good compression factors of 6.5x using standard
| GPT-2 and 7.3x using the medium version.
|
| Here's the code if anyone's interested:
| https://github.com/trsigg/lm-compression . I haven't looked at it
| in a while except for updating it today to work with more recent
| versions of the transformers library, but I think it should still
| function and is pretty easy to try out with different models. I
| thought about using something like a hidden Markov model instead
| which would be much faster (even more so because it would enable
| using tabled rather than range-based ANS), but I haven't gotten
| around to it.
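|
| A hypothetical sketch of the model side (not the actual code in
| the repo above): pull the next-token distribution out of GPT-2
| via transformers; that probability vector is what an
| rANS/arithmetic coder consumes.
      import torch
      from transformers import GPT2LMHeadModel, GPT2TokenizerFast

      tok = GPT2TokenizerFast.from_pretrained("gpt2")
      model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

      ids = tok("Lossless data compression with",
                return_tensors="pt").input_ids
      with torch.no_grad():
          logits = model(ids).logits           # (1, seq_len, vocab)
      probs = torch.softmax(logits[0, -1], -1) # next-token distribution

      # An entropy coder spends ~ -log2(probs[t]) bits on the token t
      # that actually occurs, appends t to the context, and repeats;
      # the decoder mirrors this with the same pretrained model.
      print(probs.topk(3))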
| meiji163 wrote:
| Using a pretrained model is a little bit of cheating; you would
| have to include the size of the weights in the ratio.
| TynanSigg wrote:
| Those two metrics will converge as the size of the text being
| compressed goes to infinity. It's necessary to include model
| size for things like the Hutter prize that involve
| compressing a fixed text (to avoid hard-coding) but isn't
| usually a useful metric for compression, especially because
| it will cause the compression ratio to depend on the size of
| the data being compressed.
|
| Edit: I thought the model in the above paper was pre-trained,
| but apparently it's only trained on the data as it arrives!
| That's indeed a very interesting approach, I wouldn't have
| expected that neural network models would converge quickly
| enough for it to be useful.
| TynanSigg wrote:
| After a bit more digging, it seems Bellard also has a version
| using GPT-2: https://bellard.org/libnc/gpt2tc.html Looks much
| cleaner and faster than my little winter break project, cool to
| see this implemented properly!
| amcoastal wrote:
| This seems like an old-school and specific way of describing a
| much larger goal of NNs and autoencoders: to take data and
| reduce its dimensionality, while getting a (near) perfect
| reconstruction of the original input from the latent-space data.
| Reducing (haha, pun) this broad effort to just "data
| compression" leaves a lot of nuance and use out of sight.
| kdmdmdmmdmd wrote:
| This is some of the dumbest shit I've ever seen. There are
| literally no qualified people commenting on this board any
| longer. If you have any idea what the fuck a neural network is
| or how compression works you'll know why this is a bad idea. I
| would elaborate but I'd probably get banned for being negative.
| kdmdmdmmdmd wrote:
| Lol dumb
| dariosalvi78 wrote:
| When I was a student at university, I thought it might be
| possible to combine the quantisation used in lossy compression
| like JPEG with NNs in some way...
| ZephyrBlu wrote:
| So what's the Weissman score of this baby?
| xxs wrote:
| Extremely low - it's a very slow algorithm. Think of xz as being
| slow and taking forever... this is 3 orders of magnitude worse.
| linkdd wrote:
| In the TV show "Silicon Valley", the Weissman score measures the
| compression, not the speed.
|
| In reality, the Weissman score does not exist.
|
| I highly recommend this series if you haven't seen it :)
| xxs wrote:
| >the Weissman score measure the compression, not the speed
|
| This is false, it uses both - hence 'middle out' was suitable
| for video chat and all kinds of enterprise software (incl. a
| custom appliance box). Overall, a be-all-end-all compression.
|
| The score is so popular, it has its own wikipedia page[0].
| _It compares both required time and compression ratio of
| measured applications, with those of a de facto standard
| according to the data type._
|
| [0]: https://en.wikipedia.org/wiki/Weissman_score
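|
| If I recall that page correctly, the formula is
| W = alpha * (r / r_bar) * (log t_bar / log t), where r and t are
| the compression ratio and time of the compressor under test and
| the barred values are those of a standard compressor on the same
| data.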
| merricksb wrote:
| Original 2019 discussion for those curious:
|
| https://news.ycombinator.com/item?id=19589848
| eps wrote:
| Compression speed is around 2 KB/sec.
|
| The figure is given in nncp_v2.1.pdf. This is roughly 1000x
| slower than xz, which in turn is an order of magnitude or so
| slower than gzip.
|
| I have always wondered if a compression _service_ could be a
| viable commercial offering, and with NNCP's rates and
| compression speed, it looks like it could! That's for cases when
| you need to _really_ compress the heck out of something, e.g.
| prior to distributing a large static piece of data to a very
| large number of recipients. I think there was a service like
| this, used by AAA game developers for distributing patches, but
| it didn't take off because it had very slow decompression...
| and, now reading further through the paper, it looks like an
| issue with NNCP too:
|
      The NNCP v2 decompression speed is similar to its
      compression speed.
|
| So it's a no-go for my brilliant and lazy startup idea, but
| fascinating stuff nonetheless.
| m463 wrote:
| I think maybe matching the compression speed with the network
| speed is what you want to do.
|
| In a place with ubiquitous fast broadband, a compromise with
| regard to decompression speed might be ok.
|
| If you're sending something to mars... 2 kb/sec might make
| sense.
| xxs wrote:
| >If you're sending something to mars... 2 kb/sec might make
| sense.
|
| Keep in mind that 2KB/s is on something like a 3090... massively
| power-hungry even for decompression.
| loeg wrote:
| > I think maybe matching the compression speed with the
| network speed is what you want to do.
|
| You might be interested: Zstandard has an adaptive mode that
| attempts to do this.
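| (If I'm not mistaken, that's the --adapt flag in recent zstd
| versions: it varies the compression level on the fly depending
| on how fast the input and output can keep up.)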
| agumonkey wrote:
| What about decompression speed ?
| xxs wrote:
| Yup, it isn't a dictionary-based compression. Nonetheless it's a
| fascinating take.
| thesz wrote:
| With preprocessing it _is_ dictionary-based compression: it uses
| a fixed dictionary to improve compression.
| wolf550e wrote:
| This is what the games industry uses in practice. It is
| optimized for fast decompression.
| http://www.radgametools.com/oodle.htm
| AtlasBarfed wrote:
| Hm, there are likely many large files that are very common
| (DLLs, versions of shared libs, etc.) and that might be
| frequently compressed (SCP'd, stored on SSDs, etc.), since disk
| space isn't free yet, last I checked.
|
| Each file is really its own theoretical multivariate search
| space for the best compression ratio, and finding the optimal or
| near-optimal compression is usually the computationally hard
| part.
|
| And it's basically a perfectly cacheable computation if you have
| some sort of database mapping algorithm + parameters to apply to
| a given file.
|
| Decompression usually isn't as big a deal for performance.
|
| Anyway: an online database for a service that has actively tried
| to take a known common file (matched on file size, hash, etc.)
| down to its best compressed representation, so we don't have to
| do it every time.
|
| Cloud providers would presumably love this, since S3 probably
| has millions of copies of many files and would 1) get to
| compress each file and 2) get to share its representation across
| all instances.
| raverbashing wrote:
| For real
|
| I know shipping speeds are important, but it seems game
| developers could try harder with delta compressing their
| updates, because it's hard to believe a tens-of-GBs download is
| the best they could do when you already have a BluRay disk or
| something (unless it's new content, of course)
|
| I wonder how much their bandwidth costs are and how quickly a
| reduction would pay for itself.
| tomalpha wrote:
| Does DRM contribute to this perhaps?
|
| Given the layers of encryption and obfuscation, and leaving
| aside the merits or otherwise of DRM in the first place, I
| can imagine it could be hard to do delta patching. At least,
| without making it easier to crack the DRM - e.g. you'd
| presumably have to decrypt the asset that needed patching,
| apply the delta, and re-encrypt/re-sign it all client side.
| xxs wrote:
| What would pay for itself is p2p, e.g. bittorrent... alas.
| raverbashing wrote:
| p2p wouldn't solve the main issues, which are excessive
| download times and poor user experiences for upgrades.
| lxgr wrote:
| Blizzard used to use Bittorrent in some of their
| installers/updaters a while ago.
|
| I'm not sure how viable this is today, or ever was: Many
| residential connections are heavily asymmetric. Players
| presumably aren't too excited either by the idea of
| continuing to seed while they already play online, with
| shitty ISP-provided routers suffering hundreds of
| milliseconds of bufferbloat whenever the uplink is
| saturated.
| xxs wrote:
| >Blizzard used to use Bittorrent in some of their
| installers/updaters a while ago.
|
| Yes, WoW used to have a BitTorrent-based updater.
|
| >Many residential connections are heavily asymmetric.
|
| This is a US-centric view, though. For instance, I have 300Mb
| up/down. At least for most of Europe (and I suppose Asia), it's
| pretty common not to be restricted in any way.
|
| I could leave a BitTorrent client running non-stop and wouldn't
| feel it. Like I've said: alas. There is some stigma associated
| with p2p/torrents.
| dividuum wrote:
| Not sure where you are in Europe, but in Germany you have
| to search a bit to find a residential ISP with symmetric
| bandwidth. Mine is 50/10.
| lxgr wrote:
| > this is a US centric-view, though
|
| I live in Europe and I have a 1000/0.5 Mbit/s connection.
| labawi wrote:
| That seems very atypical. As the link capacity ratio goes past
| 10:1, even for typical downloads, speed starts to be limited by
| the ability to send requests, or even TCP ACKs.
|
| I've seen 20:1 offers on cable providers, but it's mostly
| marketing BS, as you're unlikely to get full download
| speed in most circumstances due to upload limitations.
| cinntaile wrote:
| This is not the standard at all in Europe, maybe that's
| the standard in your specific country?
| xxs wrote:
| I have lived in quite a few countries and travelled as well.
| Someone mentioned Germany (and Bavaria) - indeed, it had one of
| the worst internet a couple of years back. OTOH, Austria was
| totally fine.
| cinntaile wrote:
| Your claim was that the lines were symmetric and "unrestricted".
| Neither is the standard, unfortunately.
| lxgr wrote:
| "It's fine" is a different statement from "up- and
| downlink bandwidths are heavily asymmetric", no?
|
| While asymmetric connections make sense for many
| residential use cases, P2P is not one of them.
| Bayart wrote:
| Asymmetry is the rule in my country for both copper and
| fibre lines, although it's less pronounced with fibre.
| Symmetric lines are a specific business option.
| paavohtl wrote:
| I have very rarely seen symmetric broadband plans
| available in Finland. In the past I've had 10/1, 24/1,
| 100/10 and now I have 1000/100. With ADSL connections
| being asymmetric was basically necessary because the
| bandwidth is so limited - I think most consumers would
| prefer 24/1 over 12/12. With fiber I think it's more
| about market segmentation: ISPs want businesses to buy
| significantly more expensive symmetric plans.
| patrakov wrote:
| > At least for most of the Europe (and I suppose Asia),
| it's pretty common not to be restricted in any way.
|
| In the Philippines, 15 Mbps down / 3 Mbps up ADSL with a
| static public IP is not something that a foreigner could
| easily buy. Globe Telecom in Cebu asked for evidence that
| I have sufficient income to pay for this (2600 PHP per
| month), and their rules only allow documents from the
| Philippines, so my bank statement was rejected. I had to
| talk to their higher-level manager to negotiate an
| exception. And their default router has 2 seconds of
| bufferbloat, so I bought a TP-Link Archer C7 and
| installed OpenWRT to compensate.
| cinntaile wrote:
| Would it even pay for itself? It seems to me like bandwidth is
| pretty cheap compared to computation costs, so why bother trying
| to squeeze everything out? Just do some basic compression and
| you're good to go. Buy a bigger SSD if necessary; with
| technologies like reBAR, compression just slows down a game
| anyway.
| literallycancer wrote:
| As far as cloud providers go, bandwidth is where the lock
| in is. It's priced to make it too expensive to switch to a
| competitor.
| kitd wrote:
| _Compression speed is around 2 KB /sec._
|
| Apologies if this is a noob question, but could you improve it
| with training?
| esquire_900 wrote:
| > Apologies if this is a noob question, but could you improve
| it with training?
|
| Considering all other things like GPU, code quality etc.
| equal, the speed here mainly depends on the size of the
| network. Larger networks result in better performance
| (logarithmic), but also slower speeds (linear). There are
| some tricks like pruning to create significantly smaller
| networks with similar-ish performance, but good models are
| often large.
|
| Unrelated to your question, what nobody mentions is that the
| decompression side also needs the complete network. With 57M and
| 187M parameters, those files are going to be quite large, >
| 100MB for sure. That completely annihilates the performance wins
| for one-time transfers.
| 317070 wrote:
| No, as I understand it, the network does not need to be
| transmitted. The decoder learns on the same stream as the
| encoder, so you do not need to transmit network parameters.
| ironSkillet wrote:
| If I compress a large file at location A using this
| algorithm, and want to send it to location B, how does
| location B know how to decompress it?
| clavigne wrote:
| see the (current) top comment
|
| https://news.ycombinator.com/item?id=27244810
|
| it's a very clever use of symmetry.
| vidarh wrote:
| It's clever, but pretty much "standard" in compression. The
| earliest I'm aware of that used symmetry in the encoder and
| decoder to avoid explicitly transferring the parameters this way
| was LZ78 (Lempel, Ziv; 1978), but there could well be
| predecessors I'm not aware of.
|
| LZ78 used it "just" to build a dictionary, but the general idea
| of using symmetry is there.
|
| (A fascinating variant on this is Semantic Dictionary
| Encoding, by Dr Michael Franz in his doctoral
| dissertation (1994), that used this symmetry to compress
| partial ASTs to cache partially generated code for
| runtime code generation)
| e12e wrote:
| See https://news.ycombinator.com/item?id=27244810
|
| As I understand it, when starting up (at the beginning of the
| stream) the network doesn't compress much; it then guesses the
| next byte at every turn _deterministically_ from the input seen
| so far.
|
| The decoder can "read" the first byte, then proceed to learn and
| guess the second byte, and so on.
| Yoric wrote:
| Basically the same way your favorite codec knows how to
| decompress. It builds the predictor (in this case, a NN)
| from the known input. The predictor then assigns a symbol
| (or a series of symbols) to the next few bits - higher
| probabilities need fewer bits (I haven't checked for this
| specific technique, but there are techniques that don't
| even require an integer number of bits, e.g. range
| encoding).
| rurban wrote:
| paq also has a similar NN encoder, but a much better and much
| faster one. It's in asm, though, but open source. LibNC is
| closed.
| ivoras wrote:
| Did anyone manage to use the provided Windows binary to
| successfully compress anything? For me, it exits after producing
| a 0-byte "compressed" file.
| cylon13 wrote:
| That's impressively small!
| drran wrote:
| Yep, but file metadata still takes some space.
| thesz wrote:
| It needs AVX-512.
| logicchains wrote:
| Fabrice Bellard is proof that 10x programmers exist. Or even
| more than 10x; he single-handedly wrote a software 4G LTE base
| station (https://bellard.org/lte/), something that would
| normally take a team of hundreds to develop due to the
| complexity of the specification.
| [deleted]
| willvarfar wrote:
| Fabrice is fantastic!
|
| There are other programmers at his level, or with the potential
| to be at his level, splattered all over the industry.
|
| Unfortunately, they 'waste' their time on their day job stuff,
| often having to do things badly because technical approaches
| are prescribed by others etc.
|
| Many of us meet truly great programmers, but so few of them
| actually demonstrate and express it. They just grin when asked
| to implement micro services and add some banal features to the
| website etc.
| vanderZwan wrote:
| He's pretty much the programmer's John von Neumann of our time,
| isn't he? Or at least one of the few contenders for that title
| amelius wrote:
| Von Neumann was more a mathematician/scientist than an
| engineer.
| vanderZwan wrote:
| Well yes, but a 10x mathematician/scientist. Plus important
| enough to the field of computing to be comparable I think
| [deleted]
| yalok wrote:
| Totally. Plus FFmpeg...
| logicchains wrote:
| And QEMU...
| sugarkjube wrote:
| And quickjs
| mobilio wrote:
| And LZEXE!
| zserge wrote:
| And tcc/otcc
| mercer wrote:
| I love how FFmpeg is way down on his homepage. Meanwhile,
| even non-programmers in my world know and use it.
| Y_Y wrote:
| I think hero-worship in any field is weird and unhealthy. That
| said, Bellard is astonishingly talented and prolific and it's
| hard to look at someone like that and say he's just a mortal
| who tries his best to write code, just like everyone else.
|
| So what is it that causes this big gulf in productivity? I know
| that as a programmer I have all sorts of internal and external
| obstacles to just sitting down and churning out entire
| projects, let alone something useful with clever and high-
| quality code. Is a 10x rockstar just somebody who's been able
| to eliminate most of their obstacles, and has some amount of
| talent left over to solve problems with?
| slim wrote:
| Why is hero-worship unhealthy? I thought it was the contrary.
| user-the-name wrote:
| Because it decreases self-confidence and initiative. It
| makes you think of the things that the "hero" does as
| exceptional and out of your reach, instead of something you
| could do as well if you put your mind to it properly.
| zaphirplane wrote:
| Interestingly enough, seeing powerful role models inspired
| Schwarzenegger, Bolt, martial artists, poets, architects... even
| Tom Cruise inspired Air Force pilots.
| opsy2 wrote:
| It seems small, but there's a big distinction between hero
| worship and a role model!
|
| 'Hero Worship' is about elevating someone to a
| categorically higher level. This can lead to thinking you
| could never be as good, taking everything they say as
| gospel, defending even their poor opinions, etc.
|
| If we -worship- 10x developers, you can maybe better see
| how with the emphasis on that word things can get pretty
| toxic pretty quick. For example, I think many would agree
| HR disputes should not take talent level as an input, but
| if one of those people isn't even a mortal...
|
| ---
|
| None of this is necessary to have a role model, or
| precludes celebrating someone for doing excellent work as
| is this case.
| foepys wrote:
| Some people are just very intelligent and can grasp concepts
| quickly and apply this new knowledge efficiently. To them a
| person of average intelligence is often like a toddler that
| is struggling to put the cube into the round hole.
| logicchains wrote:
| There are billions of people in the world, over seven
| thousand people with one-in-a-million intelligence, seven
| million people with one-in-a-thousand intelligence, but
| nobody else close to Fabrice Bellard in software output.
| Cybiote wrote:
| The trick is that even a tiny difference in ability can lead to
| a large difference in results as the number of decision steps
| involved in a task increases. The expert wastes less time in the
| search space, thanks to a combination of good mental models
| honed from experience and attention to detail.
|
| Something unusual about Bellard is the breadth of his knowledge.
| There are probably a fair few people who can write a fast,
| deterministic, low-level, optimized neural network library, and
| a decent number who can write an autodiff library, but few who
| can do both. That, I daresay, is the hard part.
|
| Given such a library, implementing an LSTM or transformer
| model and combining it with Adaptive Modeling/incremental
| learning of data statistics using a neural network to feed an
| Arithmetic Coder is something that is within the reach of
| many more.
|
| In summary, a 10x developer is not 10x smarter or 10x more
| skilled, they just waste less time investigating unproductive
| paths and are maybe 1% less likely to take wrong turns. This
| ends up making a huge difference for large complex tasks.
| pansa2 wrote:
| > _Something unusual about Bellard is the breadth of his
| knowledge._
|
| One thing I've noticed is that he chooses to apply himself
| to many different subfields of development, but always
| using the same technologies e.g. all his work seems to be
| in either C or JavaScript.
|
| I've only worked in a couple of subfields but have
| seriously used at least 7 different languages, plus
| countless libraries. Compared to Bellard I've probably
| spent (wasted?) a large amount of time trying to choose,
| and then learn, the "best" tool for each job.
| ZephyrBlu wrote:
| To be fair, Bellard has the luxury of doing that because
| he's generally writing things from near scratch and he's
| not beholden to other people or outside requirements.
| chii wrote:
| > it's hard to look at someone like that and say he's just a
| mortal who tries his best to write code, just like everyone
| else.
|
| would anyone look at Usain Bolt and say that he's mortal and
| just like everyone else? Why is it so difficult to imagine
| that there are people who are just so much better than you,
| that you (an average person) can never catch up?
|
| Most people have no trouble understanding that they cannot
| reach the level of Usain Bolt. Why should programming be any
| different from sprinting in this case?
| jbjbjbjb wrote:
| I would say an innate talent for 'coding' is a relatively minor
| factor. What is very rare to see is people showing the level of
| responsibility and grit to get on by themselves and build
| something. Also, in a business setting it is unheard of for
| someone to have the opportunity, or even be trusted enough, to
| do it, not to mention all the other business factors.
| ZephyrBlu wrote:
| > _I would say an innate talent for 'coding' is a
| relatively minor factor_
|
| I would argue that innate talent is strongly linked to
| genetics. In the case of coding, I think intelligence
| plays a large role.
| jbjbjbjb wrote:
| It plays a role, sure. There will be significant
| diminishing returns though. What I'm suggesting is there
| are other factors at play here that are way more
| important than a few extra IQ points. Completing a
| significant project isn't one dimensional.
| Datenstrom wrote:
| > innate talent is strongly linked to genetics
|
| I do not believe there is much of a spread in
| intelligence based on genetics. Short of disability, and
| given the same consistency, training, and opportunities I
| seriously doubt that anything even remotely close to a
| 10x gap would be seen.
|
| I am certainly not up on the science of the topic, though;
| biology is certainly my worst subject. But I have heard of no
| research showing such a divide.
| taneq wrote:
| Just like Usain Bolt's physical attributes are a minor
| factor and it's his grit and responsibility that makes
| him so quick?
| jbjbjbjb wrote:
| No, it is completely different; the analogy doesn't work. Coding
| isn't running.
| abiro wrote:
| Usain Bolt doesn't run 10x faster than your average fit
| person. It's more like 2x. So it's fair to ask if there are
| factors other than innate ability at play when observing
| orders of magnitudes of differences in programmers'
| outputs.
| nl wrote:
| The combination of better performance, delivered more
| consistently over a long period of time, is what makes a 10x.
|
| If you put Manchester City up against a team consisting of
| average people who are fit and know how to play soccer, it's
| pretty easy to see Manchester City scoring 10x the goals - at
| least.
| SirSourdough wrote:
| Isn't the question here more like putting Man City (elite
| professionals) up against average professionals? I don't
| see the 10x goal differential there necessarily. It would
| be unremarkable for a "10x programmer" to be 10x better
| than a hobbyist, which is what I see as the equivalent of
| someone "who is fit and knows how to play soccer".
| ZephyrBlu wrote:
| The difficulty of tasks generally increases
| exponentially, not linearly though. 2x in absolute terms
| definitely translates to 10x or more in terms of skill.
| roenxi wrote:
| 1) Yes, Bolt is also mortal.
|
| 2) If quantified, the performance gap between me (unfit,
| untrained) and Bolt would be smaller than the gap between me (a
| practised programmer) and Bellard.
| itsoktocry wrote:
| > _If quantified the performance gap between me (unfit,
| untrained) and Bolt will be smaller than the gap between
| me (practised programmer) and Bellard._
|
| This is quite the assumption. Bolt is the fastest man
| we've ever seen. You actually think you're more closely
| competitive to him than Bellard???
| vlovich123 wrote:
| It may be stated poorly, but what the OP is trying to say is
| that, in terms of raw output, the relative time difference in
| the 100m dash is smaller than the relative difference in
| software output between them.
|
| Obviously that's silly, because the physical result is
| approached asymptotically: world-class athletes train and
| compete their whole lives for a fractional difference in result.
| The output of the human brain is clearly not constrained by
| anything as asymptotic as what the human body can physically be
| trained to accomplish.
| agumonkey wrote:
| What surprises me about Bellard, besides the excellence, is the
| breadth. Who could predict his next topic...
|
| Surely his intellectual life is diverse.
| sesm wrote:
| Did he ever have a day job as a programmer? My bet is 'no'.
| av500 wrote:
| He did; at least, a few years ago a former coworker worked with
| him in Paris at a French streaming provider or so.
___________________________________________________________________
(page generated 2021-05-22 23:01 UTC)