[HN Gopher] PyTorch 1.8, with AMD ROCm support
       ___________________________________________________________________
        
       PyTorch 1.8, with AMD ROCm support
        
       Author : lnyan
       Score  : 292 points
       Date   : 2021-03-05 05:38 UTC (17 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | blackcat201 wrote:
        | The supported ROCm version is 4.0, which only the latest AMD
        | Instinct cards support. There's still a long way to go before
        | it's supported on the latest consumer RDNA 2 GPUs (RX 6000
        | series).
        
         | porphyra wrote:
         | Lack of ROCm support in consumer RDNA 2 GPUs really makes it
         | impossible for regular people to use ROCm. As an owner of an
         | AMD Radeon RX 6800 I'm pretty salty about it.
        
           | Dylan16807 wrote:
           | At least they promised support.
           | 
           | They're actively refusing to comment on 5000-series.
        
             | jiehong wrote:
             | Indeed. Yet, the amdgpu-pro driver supports the 5000
             | series.
             | 
             | I managed to get it to work for my 5500 XT, but only in
             | Ubuntu and for the 5.4 kernel and it wasn't
             | straightforward.
             | 
             | It would be nice if a project like Mesa would exist for
             | OpenCL.
             | 
             | In the future, I suppose Vulkan will be used instead for
             | compute though.
        
           | cranium wrote:
            | It's the last thing that keeps me on Nvidia with proprietary
            | Linux drivers. I wouldn't mind ML training on an AMD card
            | being slower, but I need my workload to be at least
            | GPU-accelerated.
        
             | carlmr wrote:
              | I mean, I wouldn't worry too much about it. I think if
              | something big like PyTorch supports it, AMD might rethink
              | their strategy here. They have a lot to gain by entering
              | the compute market.
        
               | meragrin_ wrote:
                | AMD only cares about the data center. Anything below
                | that, they don't care about. They don't care about
                | supporting anything other than Linux. God forbid someone
                | try to experiment with their hardware/software as a hobby
                | before putting it to use in a work setting.
        
               | carlmr wrote:
               | Eh, not sure if that's correct. Their Ryzen consumer CPUs
               | work amazingly well on Windows. The gaming graphics cards
               | also still target Windows.
               | 
               | And if they do care about the data center that much and
               | ROCm becomes a thing for compute, they will want as many
               | people as possible to be able to experiment with ROCm at
               | home. So that they demand data centers with ROCm.
        
               | meragrin_ wrote:
               | > And if they do care about the data center that much and
               | ROCm becomes a thing for compute, they will want as many
               | people as possible to be able to experiment with ROCm at
               | home. So that they demand data centers with ROCm.
               | 
               | The engineers know this and have tried to explain this to
               | upper management, but upper management only listens to
               | "customer interviews" which consists of management from
               | data centers. Upper management does not care about what
               | engineers hear from customers because those are not the
               | customers they care about.
        
               | sorenjan wrote:
                | ROCm has been a thing for four years now, and CUDA had
                | been around for almost a decade before ROCm was
                | conceived, yet AMD still doesn't show any interest in an
                | accessible compute platform for all developers and users.
                | I think they should focus on OpenCL and SYCL.
        
               | carlmr wrote:
               | Well it's a chicken and egg problem.
               | 
                | Without major library support nobody cares about ROCm,
                | and if nobody cares about ROCm, AMD management will focus
                | on other things.
               | 
               | But PyTorch just laid that egg.
        
               | meragrin_ wrote:
                | As far as ROCm support on consumer products goes, it is
                | strictly a management problem which will take years for
                | them to figure out, since they do not listen to their
                | engineers and do not view users of consumer graphics
                | cards as compute customers.
               | 
               | I would love an alternative to Nvidia cards. After
               | waiting for so long for ROCm support for RDNA cards and
               | reading an engineer's comments about why there is no
               | support yet, I've given up on AMD for compute support on
               | their graphics cards. I'm hoping Intel's graphic cards
               | aren't garbage and get quick support. I probably will buy
               | an Nvidia card before then if I have the opportunity
               | since I'm tired of waiting for an alternative.
        
           | krastanov wrote:
           | I am getting some conflicting messages about support. There
           | is a small group of people working on ROCm for Julia (in
           | AMDGPU.jl) and while that work is still alpha quality, they
           | seem to expect even devices as old as RX 580 to work.
           | 
           | Are all of these support issues something that is on the
           | ROCm/AMD/driver side, or are they on the side of libraries
           | like pytorch?
        
             | jpsamaroo wrote:
             | This is an issue with AMD not wanting to long-term support
             | the code paths in ROCm components necessary to enable ROCm
             | on these devices. My hope is that Polaris GPU owners will
             | step up to the plate and contribute patches to ROCm
             | components to ensure that their cards keep working, when
             | AMD is unwilling to do the leg work themselves (which is
             | fair, they aren't nearly as big or rich as NVidia).
        
             | snovv_crash wrote:
             | RX580 had a pro version which got ROCm support. It's the
             | newer cards and older cards which aren't supported.
        
               | my123 wrote:
                | The RX 580 is out of support for ROCm nowadays; ROCm only
                | supports Vega and Instinct now.
               | 
               | https://github.com/RadeonOpenCompute/ROCm/issues/1353#iss
               | uec...
        
               | dnautics wrote:
               | Can confirm, I made an 8x rx580 rig for an r&d project
               | specifically for that reason.
        
               | freeqaz wrote:
               | Time to re-sell into the hot GPU market now? :P
        
               | dnautics wrote:
               | Don't work there anymore; even so that server is probably
               | buried under loads of failed ideas.
        
       | [deleted]
        
       | arthurmorgan wrote:
        | This still won't work with an RX 580, right? It seems like ROCm 4
        | doesn't support that card.
        
       | etaioinshrdlu wrote:
        | This is pretty neat, since it is the first time in years that a
        | top-tier deep learning framework has official support for any
        | training accelerator with open source kernel drivers.
        | 
        | I guess the TPU also doesn't require kernel drivers because you
        | talk to it over the network instead of PCIe. But you cannot buy a
        | TPU; only the int8 edge TPU is for sale. (And I've heard that the
        | edge TPUs are absolutely top-notch for performance per $ and
        | watt right now, as an aside.)
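        | 
        | For anyone wondering what that looks like in practice: the ROCm
        | build of PyTorch reuses the existing CUDA-flavoured device API
        | (via HIP), so existing GPU code should mostly carry over. A
        | minimal sketch, assuming a ROCm build of torch 1.8 is installed:
        | 
        |     import torch
        | 
        |     # On ROCm builds, torch.version.hip is set and the usual
        |     # "cuda" device maps to the AMD GPU through HIP.
        |     print(torch.version.hip)          # None on CUDA builds
        |     print(torch.cuda.is_available())  # True if an AMD GPU is visible
        | 
        |     device = "cuda" if torch.cuda.is_available() else "cpu"
        |     x = torch.randn(1024, 1024, device=device)
        |     y = x @ x    # dispatched to rocBLAS on ROCm builds
        |     print(y.device)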
        
         | [deleted]
        
         | rfoo wrote:
          | I believe TensorFlow is a top-tier deep learning framework, and
          | it has had ROCm support since 2018.
          | 
          | > edge TPUs are absolutely top-notch for performance per $ and
          | watt right now
          | 
          | Do you mean "aren't"? The performance per $ and watt was not
          | awesome even when it was released. I was hoping for great
          | toolchain support, but that didn't happen either.
        
           | dheera wrote:
           | Jetson Nano: 1.4 TOPS/W, Coral TPU: 2 TOPS/W ?
           | 
           | Of course it doesn't really help that Google refuses to
           | release a more powerful TPU that can compete with e.g. Xavier
           | NX or a V100 or RTX3080 so for lots of applications there
           | isn't much of a choice but to use NVIDIA.
        
             | rfoo wrote:
             | Sorry, should have mentioned "if you have access to
             | Shenzhen" in my post :)
             | 
              | What I have in mind is something like the RK3399Pro; it has
              | a proprietary NPU at roughly 3 TOPS / 1.5 W (on paper), but
              | its toolchain is rather hard to use. HiSilicon had similar
              | offerings. There's also the Kendryte K210, which claims
              | 1 TOPS @ 0.3 W, but I haven't gotten a chance to try it.
             | 
              | I was already playing with the RK3399Pro when the Edge TPU
              | was announced; life is tough when you have to feed your
              | model into a black-box "model converter" from the vendor.
              | That's the part I hoped the Edge TPU would excel at. But
              | months later I was greeted by... "to use the Edge TPU, you
              | have to upload your TFLite model to our _online_ model
              | optimizer", which is worse!
        
               | simcop2387 wrote:
               | There's now a blackbox compiler that doesn't have to run
               | on their service, but it's basically the same as all the
               | others now because of that.
        
               | my123 wrote:
               | On Xavier, the dedicated AI inference block is open
               | source hardware.
               | 
               | Available at http://nvdla.org/
        
           | NavinF wrote:
           | Are there any <10W boards that have better performance/watt
           | for object detection?
           | 
           | If it exists, I wanna buy it
        
             | geek_at wrote:
             | I used the Intel neural compute sticks [1] for my porn
             | detection service [2] and they worked great. Could import
             | and run models on a Pi with ease
             | 
             | [1] https://blog.haschek.at/2018/fight-child-pornography-
             | with-ra... [2] https://nsfw-categorize.it/
        
               | belval wrote:
               | Very interesting project and article. You should consider
               | submitting it to HN for its own post (if you have not
               | done so already).
        
             | my123 wrote:
             | Jetson Xavier NX, but that comes with a high price tag.
             | It's much more powerful however.
        
               | dheera wrote:
               | I'll also add a caveat that toolage for Jetson boards is
               | extremely incomplete.
               | 
               | They supply you with a bunch of sorely outdated models
               | for TensorRT like Inceptionv3 and SSD-MobileNetv2 and
               | VGG-16. WTF, it's 2021. If you want to use anything
               | remotely state-of-the-art like EfficientDet or HRNet or
               | Deeplab or whatever you're left in the dark.
               | 
               | Yes you can run TensorFlow or PyTorch (thankfully they
               | give you wheels for those now; before you had to google
               | "How to install TensorFlow on Jetson" and wade through
               | hundreds of forum pages) but they're not as fast at
               | inference.
        
               | homarp wrote:
                | https://github.com/wang-xinyu/tensorrtx has a lot of
                | models implemented for TensorRT. They test on a GTX 1080,
                | not a Jetson Nano, though, so some work is also needed.
                | 
                | TVM is another alternative for getting models to run
                | inference fast on the Nano.
        
               | codethief wrote:
               | How does TVM compare to TensorRT performance-wise?
        
               | my123 wrote:
                | You have https://github.com/NVIDIA-AI-IOT/torch2trt as an
                | option, for example, to run your own models on TensorRT
                | just fine.
               | 
               | And https://github.com/tensorflow/tensorrt for TF-TRT
               | integration.
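                | 
                | For context, the torch2trt workflow is roughly: trace a
                | PyTorch module with an example input and get back a
                | TensorRT-backed module with the same call signature. A
                | rough sketch along the lines of the project's README
                | (untested here; the exact flags may differ by version):
                | 
                |     import torch
                |     from torch2trt import torch2trt
                |     from torchvision.models import resnet18
                | 
                |     # TensorRT builds its engine for concrete input shapes,
                |     # so conversion needs an example input on the GPU.
                |     model = resnet18(pretrained=True).eval().cuda()
                |     x = torch.randn(1, 3, 224, 224).cuda()
                | 
                |     # fp16_mode trades a little accuracy for speed on
                |     # Jetson-class GPUs.
                |     model_trt = torch2trt(model, [x], fp16_mode=True)
                | 
                |     # The converted module is called like the original one.
                |     y = model(x)
                |     y_trt = model_trt(x)
                |     print(torch.max(torch.abs(y - y_trt)))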
        
               | codethief wrote:
               | TF-TRT doesn't work nearly as well as pure TRT. On my
               | Jetson Nano a 300x300 SSD-MobileNetV2 with 2 object
               | classes runs at 5 FPS using TF, <10 FPS using TF-TRT and
               | 30 FPS using TensorRT.
        
               | codethief wrote:
               | > I'll also add a caveat that toolage for Jetson boards
               | is extremely incomplete.
               | 
               | A hundred times this. I was about to write another rant
               | here but I already did that[0] a while ago, so I'll save
               | my breath this time. :)
               | 
               | Another fun fact regarding toolage: Today I discovered
               | that many USB cameras work poorly on Jetsons (at least
               | when using OpenCV), probably due to different drivers
               | and/or the fact that OpenCV doesn't support ARM64 as well
               | as it does x86_64. :(
               | 
               | > They supply you with a bunch of sorely outdated models
               | for TensorRT like Inceptionv3 and SSD-MobileNetv2 and
               | VGG-16.
               | 
               | They supply you with such models? That's news to me.
               | AFAIK converting something like SSD-MobileNetv2 from
               | TensorFlow to TensorRT still requires _substantial_
               | manual work and magic, as this code[1] attests to. There
               | are countless (countless!) posts on the Nvidia forums by
                | people complaining that they're not able to convert
               | their models.
               | 
               | [0]: https://news.ycombinator.com/item?id=26004235
               | 
               | [1]: https://github.com/jkjung-
               | avt/tensorrt_demos/blob/master/ssd... (In fact, this is
                | the _only_ piece of code I've found on the entire
               | internet that managed to successfully convert my SSD-
               | MobileNetV2.)
        
           | etaioinshrdlu wrote:
            | TensorFlow doesn't seem to officially support ROCm; only
            | unofficial community projects do. This is official support
            | from PyTorch.
        
             | rfoo wrote:
             | Tensorflow does officially support ROCm. The project was
             | started by AMD and later upstreamed.
             | 
             | https://github.com/tensorflow/tensorflow/tree/master/tensor
             | f...
             | 
             | https://github.com/tensorflow/tensorflow/blob/master/tensor
             | f...
             | 
              | It is true that it is not Google who is distributing
              | binaries compiled with ROCm support through PyPI
              | (tensorflow and tensorflow-gpu are uploaded by Google, but
              | tensorflow-rocm is uploaded by AMD). Is this what you meant
              | by "not officially supporting"?
        
               | etaioinshrdlu wrote:
               | Oh, interesting. I do wonder if Google puts the same
                | quality control and testing into the ROCm version,
               | though. Otherwise it would really be a lower tier of
               | official support.
               | 
               | Granted, I don't know anything about the quality of
               | PyTorch's support either.
        
       | std_badalloc wrote:
        | PyTorch is the most impressive piece of software engineering that
        | I know of. So yeah, it's a nice interface for writing fast
        | numerical code. And for zero effort you can switch between
        | running on CPUs, GPUs and TPUs. There's some compiler
        | functionality in there for kernel fusion and more. Oh, and you
        | can autodiff everything. There's just an incredible amount of
        | complexity being hidden behind a very simple interface there,
        | and it just continues to impress me how they've been able to get
        | this so right.
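        | 
        | To make that concrete, here's a tiny sketch of what the
        | device-agnostic-plus-autodiff interface looks like in practice
        | (toy data, purely for illustration):
        | 
        |     import torch
        | 
        |     device = "cuda" if torch.cuda.is_available() else "cpu"
        | 
        |     # Toy linear regression; only `device` changes between CPU and GPU.
        |     w_true = torch.randn(3, device=device)
        |     x = torch.randn(100, 3, device=device)
        |     y = x @ w_true + 0.1 * torch.randn(100, device=device)
        | 
        |     w = torch.zeros(3, requires_grad=True, device=device)
        |     loss = ((x @ w - y) ** 2).mean()
        |     loss.backward()     # gradients w.r.t. w, computed automatically
        |     print(w.grad)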
        
         | UncleOxidant wrote:
         | > Oh, and you can autodiff everything.
         | 
         | Well, not everything. Julia's Zygote AD system can autodiff
         | most Julia code (currently with the exception of code that
         | mutates arrays/matrices).
        
         | jampekka wrote:
         | OTOH PyTorch seems to be highly explosive if you try to use it
         | outside the mainstream use (i.e. neural networks). There's
         | sadly no performant autodiff system for general purpose Python.
         | Numba is fine for performance, but does not support autodiff.
         | JAX aims to be sort of general purpose, but in practice it is
         | quite explosive when doing something other than neural
         | networks.
         | 
         | A lot of this is probably due to supporting CPUs and GPUs with
         | the same interface. There are quite profound differences in how
         | CPUs and GPUs are programmed, so the interface tends to
         | restrict especially more "CPU-oriented" approaches.
         | 
         | I have nothing against supporting GPUs (although I think their
         | use is overrated and most people would do fine with CPUs), but
         | Python really needs a general purpose, high performance
         | autodiff.
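          | 
          | (For readers who haven't tried it: JAX's pitch for "general
          | purpose" autodiff is roughly the sketch below - you write a
          | plain numerical function and ask for its gradient. Whether that
          | stays fast and robust outside NN-shaped workloads is exactly
          | the question raised above.)
          | 
          |     import jax
          |     import jax.numpy as jnp
          | 
          |     def potential(x):
          |         # Any pure numerical function, not necessarily a neural net.
          |         return jnp.sum(jnp.sin(x) ** 2) + 0.5 * jnp.dot(x, x)
          | 
          |     grad_fn = jax.jit(jax.grad(potential))   # autodiff + compilation
          |     print(grad_fn(jnp.arange(5.0)))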
        
           | _coveredInBees wrote:
           | I really don't understand the GPUs are overrated comment. As
           | someone who uses Pytorch a lot and GPU compute almost every
           | day, there is an order of magnitude difference in the speeds
            | involved for most common CUDA/OpenCL-accelerated
            | computations.
           | 
           | Pytorch makes it pretty easy to get large GPU accelerated
           | speed-ups with a lot of code we used to traditionally limit
           | to Numpy. And this is for things that have nothing to do with
           | neural-networks.
        
           | lgessler wrote:
           | I get what you mean by the GPUs are overrated comment, which
           | is that they're thought of as essential in many cases when
           | they're probably not, but in many domains like NLP, GPUs are
           | a hard requirement for getting anything done
        
           | wxnx wrote:
           | > I have nothing against supporting GPUs (although I think
           | their use is overrated and most people would do fine with
           | CPUs), but Python really needs a general purpose, high
           | performance autodiff.
           | 
           | As someone who works with machine learning models day-to-day
           | (yes, some deep NNs, but also other stuff) - GPUs really seem
           | unbeatable to me for anything gradient-optimization-of-
           | matrices (i.e. like 80% of what I do) related. Even inference
           | in a relatively simple image classification net takes an
           | order of magnitude longer on CPU than GPU on the smallest
           | dataset I'm working with.
           | 
           | Was this a comment about specific models that have a
           | reputation as being more difficult to optimize on the GPU
           | (like tree-based models - although Microsoft is working in
           | this space)? Or am I genuinely missing some optimization
           | techniques that might let me make more use of our CPU
           | compute?
        
           | komuher wrote:
           | Wait wat, jax and also pytorch is used in a lot more areas
           | then NN's. Jax is even consider to do better in that
           | department in terms on performance then all of julia so wat
           | are u talking about
        
             | jpsamaroo wrote:
             | > Jax is even consider to do better in that department in
             | terms on performance then all of julia so wat are u talking
             | about
             | 
             | Please provide sources for this claim
        
             | BadInformatics wrote:
             | GP makes a fair point about JAX still requiring a limited
             | subset of Python though (mostly control flow stuff). Also,
             | there's really no in-library way to add new kernels. This
             | doesn't matter for most ML people but is absolutely
             | important in other domains. So Numba/Julia/Fortran are
             | "better in that department in terms on performance" than
             | JAX because the latter doesn't even support said
             | functionality.
        
           | UncleOxidant wrote:
           | > There's sadly no performant autodiff system for general
           | purpose Python.
           | 
           | Like there is for general purpose Julia code?
           | (https://github.com/FluxML/Zygote.jl)
           | 
           | > I have nothing against supporting GPUs (although I think
           | their use is overrated and most people would do fine with
           | CPUs),
           | 
           | Do you run much machine learning code? All those matrix
           | multiplications run a good bit faster on the GPU.
        
           | jl2718 wrote:
           | Have you tried using Enzyme* on Numba IR?
           | 
           | * https://enzyme.mit.edu
        
           | ahendriksen wrote:
           | What do you mean by "seems to be highly explosive"? I have
           | used Pytorch to model many non-dnn things and have not
           | experienced highly explosive behavior. (Could be that I have
           | become too familiar with common footguns though)
        
         | thecleaner wrote:
          | It's Python wrappers on top of the existing THTensor library,
          | which was already provided by Torch. But yes, great engineering
          | nonetheless.
        
           | rrss wrote:
            | I don't think this is a particularly accurate description of
            | PyTorch in 2021. Yeah, the original C++ backend came from
            | Torch, but I think most of that has been replaced. AFAIK, all
            | the development of the C++ backend for PyTorch over the last
            | several years has been done as part of the PyTorch project;
            | it's not just Python wrappers at this point.
        
             | danieldk wrote:
             | What I like about PyTorch is that most of the functionality
             | is actually available through the C++ API as well, which
             | has 'beta API stability' as they call it. So, there are
             | good bindings for some other languages as well. E.g., I
             | have been using the Rust bindings in a larger project [1],
             | and they have been awesome. A precursor to the project was
             | implemented using Tensorflow, which was a world of pain.
             | 
             | Even things like mixed-precision training are fairly easy
             | to do through the API.
             | 
             | [1] https://github.com/tensordot/syntaxdot
        
         | sillysaurusx wrote:
         | _and TPUs_
         | 
         | BS. There's so much effort getting Pytorch working on TPUs, and
         | at the end of it it's incredibly slow compared to what you have
         | in Tensorflow. I hate this myth and wish it would die.
         | 
         | Old thread on this, detailing exactly why this is true:
         | https://news.ycombinator.com/item?id=24721229
        
       | wing-_-nuts wrote:
        | So, for someone not familiar, how far is AMD behind Nvidia's
        | CUDA? I ask because AMD clearly has better Linux driver support
        | than Nvidia, and it would be _awesome_ if their AI/ML libs were
        | catching up.
        
         | danieldk wrote:
         | At my previous employer, we bought two Radeon VIIs (in addition
         | to NVIDIA GPUs). The last time I tried it (just over ~6 months
         | ago), there were still _many_ bugs. Things would just crash and
         | burn very frequently (odd shape errors, random crashes, etc.).
         | Two colleagues reported some of those bugs in ROCm, but the bug
         | reports were largely ignored.
         | 
         | Maybe out-of-the-box support for PyTorch will result in more
         | polish. Who knows? Another issue is that AMD has not yet
          | implemented support for consumer GPUs after Vega. You'd think
          | that targeting researchers on a budget (most of academia) would
          | help them improve the ecosystem and weed out bugs. But they
          | only seem interested in targeting large data centers.
         | 
         | It's all a shame, because as you say, AMD graphics on Linux is
         | awesome. And I think many Linux enthusiasts and researchers
          | would be happy to use and improve ROCm to break NVIDIA's
          | monopoly. If only AMD actually cared.
        
           | slavik81 wrote:
           | So, just up front: these are my personal opinions. I do not
           | speak on behalf of AMD as a company. I'm just a software
           | developer who works on ROCm. I joined AMD specifically
           | because I wanted to help ROCm succeed.
           | 
           | If the problems you encountered are related to a particular
           | ROCm software library, I would encourage you to open an issue
           | on the library's GitHub page. You will get the best results
           | if you can get your problem directly in front of the people
           | responsible for fixing it. Radeon VII is one of the core
           | supported platforms and bugs encountered with it should be
           | taken seriously. In fact, if you want to point me at the bug
           | reports, I will personally follow up on them.
           | 
           | ROCm has improved tremendously over the past 6 months, and I
           | expect it to continue to improve. There have been growing
           | pains, but I believe in the long-term success of this
           | project.
        
           | wing-_-nuts wrote:
           | Yeah I think having support for consumer level cards is
           | really important because it makes it easier to have a
           | pipeline of students who are familiar with your tech. These
           | same folks advocate for AMD in the workplace and contribute
           | to OSS on behalf of AMD. Forcing this to be a 'datacenter
           | only' solution is _really_ short sighted.
        
         | [deleted]
        
       | buildbot wrote:
        | It also looks like they are in the process of adding Apple Metal
        | support, possibly for the M1. Part of 1.8 is this issue:
        | https://github.com/pytorch/pytorch/pull/47635
        
         | ipsum2 wrote:
         | What's the point? If you have enough money to buy a brand new
         | Apple M1 laptop, you can afford a training rig or cloud
         | credits. Any modern discrete GPU will blow away any M1 laptop
         | for training.
         | 
         | Is anyone training ML models on their ultra-thin laptop?
        
           | volta83 wrote:
           | > Any modern discrete GPU will blow away any M1 laptop for
           | training.
           | 
           | Counter example: none of the AMD consumer GPUs can be used
           | for training. So no, not any modern discrete GPU blows away
           | the M1.
           | 
           | The M1 might not be a great GPGPU system, but it is much
           | better than many systems with discrete GPUs.
        
           | currymj wrote:
           | I use PyTorch for small-ish models, not really the typical
           | enormous deep learning workload where you let it train for
           | days.
           | 
           | I still prefer to work on a GPU workstation because it's the
           | difference between running an experiment in minutes vs hours,
           | makes it easier to iterate.
           | 
           | Some of that speed on a low-power laptop would be great. Much
           | less friction.
        
           | upbeat_general wrote:
            | I don't understand this logic. If someone has $1000 for an
            | entry-level M1 machine, does that mean they also have enough
            | money for a separate rig with a GPU? That's probably another
            | $600-1000 for something decent. Cloud GPUs are also pretty
            | expensive.
            | 
            | I don't think anyone is seriously training their ML models on
            | their ultra-thin laptop, but I think the ability to do so
            | would make it easier for lots of people to get started with
            | what they have. It might be that they don't have lots of
            | extra cash to throw around, or it's hard to justify for
            | something they are just starting.
            | 
            | Cloud training is also just less convenient and an extra
            | hassle compared to doing things locally, even if it's
            | something like Google Colab, which is probably the best
            | option right now for training on thin laptops.
        
             | ipsum2 wrote:
             | If someone is shelling out for a brand new, early adopter
             | product, then they probably have a decent amount of money.
             | 
             | Even when TensorFlow and PyTorch implement training support
             | on the M1, it will be useless for practically anything
             | except training 2-3 layer models on MNIST.
             | 
             | So why should valuable engineering time be spent on this?
        
               | sumnuyungi wrote:
               | This is just patently false. Most of the folks I know
               | that have an M1 are students that were saving up to
               | upgrade from a much older computer and got the M1 Air. I
               | can assure you that they don't have a decent amount of
               | money.
               | 
               | Tensorflow has a branch that's optimized for metal with
               | impressive performance. [1] It's fast enough to do
               | transfer learning quickly on a large resnet, which is a
               | common use-case for photo/video editing apps that have
               | ML-powered workflows. It's best for everyone to do this
               | locally: maintains privacy for the user and eliminates
               | cloud costs for the developer.
               | 
               | Also, not everyone has an imagenet sized dataset. A lot
               | of applied ML uses small networks where prototyping is
               | doable on a local machine.
               | 
               | [1] https://blog.tensorflow.org/2020/11/accelerating-
               | tensorflow-...
        
               | tpetry wrote:
                | Because with support for the M1 you can prototype your
                | network on your local machine with "good" performance.
               | There are many cloud solutions etc. but for convenience
               | nothing beats your local machine. You can use an IDE you
               | like etc.
        
               | rfoo wrote:
               | Because contrary to what you believe, M1 simply is not
               | performant enough to be used to "prototype" your network.
               | NNs can't be simply scaled up and down. It is *NOT* like
                | those web apps which you can run on potatoes just fine as
                | long as nobody is hitting them heavily.
        
               | sumnuyungi wrote:
               | This is false. You can prototype a network on an M1 [1]
               | and teacher-student models are a de facto standard for
               | scaling down.
               | 
               | You can trivially run transfer-learning on an M1 to
               | prototype and see if a particular backbone fits well to a
                | small dataset, then kick off training on some cloud
               | instance with the larger dataset for a few days.
               | 
               | [1] https://blog.tensorflow.org/2020/11/accelerating-
               | tensorflow-...
        
               | tpetry wrote:
                | You can just run your training with a lot less data.
        
               | jefft255 wrote:
                | Not so sure about that. Here are two things you can do
                | (assuming you're not training huge transformers or
               | something).
               | 
               | 1. Test your code with super low batch size. Bad for
               | convergence, good for sanity check before submitting your
               | job to a super computer.
               | 
               | 2. Post-training evaluation. I'm pretty sure the M1 has
               | enough power to do inference for not-so-big models.
               | 
               | These two reasons are why I'm sometimes running stuff on
               | my own GTX 1060, even though it's pretty anemic and I
               | wouldn't actually do a training run there.
               | 
                | There's quite a bit of friction to training in the cloud,
                | especially if it's on a shared cluster (which is what I
                | have access to). You have a quota, and wait time when the
               | supercomputer is under load. Sometimes you just need to
               | quickly fire up something!
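                | 
                | A minimal sketch of what point 1 looks like as a local
                | smoke test (stand-in model and random data, purely for
                | illustration):
                | 
                |     import torch
                |     import torch.nn.functional as F
                | 
                |     # Stand-in model and data; the point is the tiny batch
                |     # and the handful of steps, not the model itself.
                |     model = torch.nn.Sequential(
                |         torch.nn.Linear(32, 64), torch.nn.ReLU(),
                |         torch.nn.Linear(64, 10))
                |     opt = torch.optim.Adam(model.parameters(), lr=1e-3)
                |     x = torch.randn(8, 32)   # 8 samples exercise shapes/dtypes
                |     y = torch.randint(0, 10, (8,))
                | 
                |     model.train()
                |     for step in range(20):
                |         opt.zero_grad()
                |         loss = F.cross_entropy(model(x), y)
                |         loss.backward()
                |         opt.step()
                |     print(loss.item())   # should drop if the loop is wired up right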
        
               | danieldk wrote:
               | _1. Test your code with super low batch size. Bad for
               | convergence, good for sanity check before submitting your
               | job to a super computer._
               | 
               | Or you can buy a desktop machine for the same price as an
               | M1 MacBook with 32GB or 64GB RAM and an RTX2060 or
               | RTX3060 (which support mixed-precision training) and you
               | can actually finetune a reasonable transformer model with
               | a reasonable batch size. E.g., I can finetune a multi-
               | task XLM-RoBERTa base model just fine on an RTX2060,
               | model distillation also works great.
               | 
                | Also, there are only so many sanity checks you can do on
                | something that weak (when it comes to neural net
                | training). Sure, you can check whether your shapes are
                | correct, the loss is actually decreasing, etc. But once
                | you get to the point where your model is working, you
                | will have to do dozens of tweaks that you can't
                | reasonably do on an M1 and still want to do locally.
               | 
               | tl;dr: why make your life hard with an M1 for deep
               | learning, if you can buy a beefy machine with a
               | reasonable NVIDIA GPU at the same price? Especially if it
               | is for work, your employer should just buy such a machine
               | (and an M1 MacBook for on the go ;)).
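                | 
                | For reference, mixed-precision training in PyTorch is
                | roughly the following (a minimal sketch using
                | torch.cuda.amp with a stand-in model; a real XLM-R
                | finetuning setup obviously has much more around it):
                | 
                |     import torch
                |     import torch.nn.functional as F
                | 
                |     device = "cuda"
                |     model = torch.nn.Linear(512, 2).to(device)  # stand-in model
                |     opt = torch.optim.AdamW(model.parameters(), lr=3e-5)
                |     scaler = torch.cuda.amp.GradScaler()
                | 
                |     for _ in range(10):
                |         inputs = torch.randn(16, 512, device=device)
                |         labels = torch.randint(0, 2, (16,), device=device)
                |         opt.zero_grad()
                |         with torch.cuda.amp.autocast():   # fp16 where it is safe
                |             loss = F.cross_entropy(model(inputs), labels)
                |         scaler.scale(loss).backward()     # scale to avoid underflow
                |         scaler.step(opt)
                |         scaler.update()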
        
               | jefft255 wrote:
               | Absolutely agree! My points were more about the benefits
               | of running code on your own machine rather than in the
               | cloud or on a cluster. I don't own an M1, but if I did I
               | wouldn't want to use it to train models locally... When
               | on my laptop I still deploy to my lab desktop; this adds
               | little friction compared to a compute cluster, and as you
               | mention we're able to do interesting stuff with a regular
               | gaming GPU. When everything works great and I now want to
               | experiment at scale, I then deploy my working code to a
               | supercomputer.
        
               | rsfern wrote:
                | Sure, you probably don't want to do full training runs
                | locally, but there's a lot you can do locally that has a
                | lot of added friction on a GPU cluster or other remote
                | compute resource.
               | 
                | I like to start a new project by prototyping and
                | debugging my training and run config code, setting up
                | the data loading and evaluation pipeline, hacking around
                | with some baseline models and making sure they can
                | overfit some small subset of my data.
               | 
                | After all that's done, it's finally time to scale out to
                | the GPU cluster. But I still do a lot of debugging
                | locally.
               | 
                | Maybe this kind of workflow isn't as necessary if you
                | have a task that's pretty plug-and-play like image
                | classification, but for nonstandard tasks I think there's
                | lots of prototyping work that doesn't require hardware
                | acceleration.
        
               | jefft255 wrote:
               | Coding somewhat locally is a must for me too because the
               | cluster I have access to has pretty serious wait times
               | (up to a couple hours on busy days). Imagine only being
               | able to run the code you're writing a few times a day at
                | most! Iterative development and making a lot of mistakes
                | is how I code; I don't want to go back to punch-card days
               | where you waited and waited before you ended up with a
               | silly error.
        
               | sjwright wrote:
               | I think you underestimate just how many MacBooks Apple
               | sells. There's only so many affluent early adopters out
               | there. And the M1 MacBooks are especially ideal for
               | mainstream customers.
        
       | sandGorgon wrote:
        | Ryzen APUs (the embedded GPUs inside laptops) are not supported
        | by AMD for running ROCm.
        | 
        | There is an unofficial Bruhnspace project... but it is really sad
        | that AMD has made a management decision that prevents a perfectly
        | viable Ryzen laptop from making use of these libraries.
        | 
        | Unlike... say, an M1.
        
         | qayxc wrote:
         | Might have to do with the fact that AMD just doesn't seem to
         | have the resources (see the common complaints about their
         | drivers' quality) to fully support every chip.
         | 
          | Another reason is certainly that they simply don't need to -
          | just like with Intel's iGPUs, people working with deep learning
          | opt for discrete GPUs (either built-in or external), which just
          | isn't an option (yet?) for M1-based systems.
         | 
          | The audience would be a niche within a niche, and the
          | cost-benefit ratio doesn't seem to justify the effort for them.
        
           | sandGorgon wrote:
            | I don't believe so... and it looks like neither does Apple
            | nor Google/TensorFlow:
           | 
           | - https://developer.apple.com/documentation/mlcompute
           | 
           | - https://blog.tensorflow.org/2020/11/accelerating-
           | tensorflow-...
           | 
            | For a more personal take on your answer - do consider the
            | rest of the world. For example, Ryzen is very popular in
            | India. Discrete GPUs are unaffordable for that college
            | student who wants to train a non-English NLP model on a GPU.
        
             | qayxc wrote:
              | How does that contradict my point? Apple has to support
              | the M1, since their M1-based SoCs don't support dGPUs
              | (yet), so the M1 is all there is.
             | 
             | Besides, Apple is the most valuable company in the world
             | and has exactly the kind of resources AMD doesn't have.
             | 
              | > For example, Ryzen is very popular in India. Discrete
              | GPUs are unaffordable for that college student who wants to
              | train a non-English NLP model on a GPU.
             | 
              | Well, Google Colab is free and there are many affordable
              | cloud-based offers as well. Training big DL models is not
              | something you'd want to do on a laptop anyway. Small models
              | can be trained on CPUs, so that shouldn't be an issue.
             | 
              | Inference is fast enough on CPUs anyway, and if you really
              | need to train models on your Ryzen APU, there are always
              | other libraries, such as TensorFlow.js, which is
              | hardware-agnostic as it runs on top of WebGL.
             | 
             | That's why I don't think this is a big deal at all -
             | especially given that Intel still holds >90% of the
             | integrated GPU market and doesn't even have an equivalent
             | to ROCm. Again, niche within a niche, no matter where you
             | look.
        
           | daniel-thompson wrote:
           | > AMD just doesn't seem to have the resources
           | 
           | AMD's net income for 2020 was about $2.5B. If it was a
           | management priority, they would fund more people to focus on
           | this.
           | 
           | I would love to support open-source drivers, but AMD's
           | efforts with ROCm on consumer hardware are a joke. It's been
           | said in other comments that AMD only cares about the
           | datacenter. That certainly seems to be the case. So until AMD
           | takes this seriously and gets a legitimate developer story
           | together, I'm spending my money elsewhere.
        
             | qayxc wrote:
             | > So until AMD takes this seriously and gets a legitimate
             | developer story together, I'm spending my money elsewhere.
             | 
             | Fair enough. Thing is, AMD's market share in the mobile
             | market has been below 15% over the past years [1] and only
             | last year increased to about 20%.
             | 
             | Of these 20%, how many notebooks are (think realistically
             | for a second) intended to be used for DL while also not
             | featuring an NVIDIA dGPU?
             | 
             | ROCm on consumer cards isn't a priority for AMD, since
             | profits are small compared to the datacentre market and
             | there's not that many people actually using consumer
             | hardware for this kind of work.
             | 
             | I always feel there's a ton of bias going on and one should
             | refer to sales data and market analysis to find out what
             | the actual importance of one's particular niche really is.
             | 
              | AMD's focus w.r.t. consumer hardware is on gaming and CPU
              | performance. That's just how it is and it's not going to
              | change anytime soon. On the notebook side of things, an
              | AMD APU + NVIDIA dGPU is the best you can get right now.
             | 
             | [1] https://www.tomshardware.com/news/amd-vs-
             | intel-q3-2020-cpu-m...
        
               | daniel-thompson wrote:
               | > ROCm on consumer cards isn't a priority for AMD, since
               | profits are small compared to the datacentre market and
               | there's not that many people actually using consumer
               | hardware for this kind of work.
               | 
               | I think causality runs the other way: Profits are small
               | and there aren't many people using AMD cards for this
               | _because_ the developer experience for GPGPU on AMD is
               | terrible (and that because it's not a priority for AMD).
        
               | qayxc wrote:
               | That would imply that there even was a laptop market for
               | AMD in the first place. As the market numbers show, up
               | until last year, AMD simply wasn't relevant at all in the
               | notebook segment, so what developer experience are you
               | even talking about if there were no developers on AMD's
               | platform?
        
               | daniel-thompson wrote:
               | I agree that AMD on mobile is a wasteland. But AMD has
               | shipped over 500m desktop GPUs in the last 10 years.
               | Surely some of those could/would have been better used
               | for GPGPU dev if there was a decent developer experience.
        
               | sandGorgon wrote:
                | I disagree with you. AMD is chasing developers to get
                | them to use AMD for GPU-based training. Just a month or
                | so back, it announced a partnership with AWS to get its
                | GPUs into the cloud.
               | 
               | https://aws.amazon.com/blogs/aws/new-amazon-ec2-g4ad-
               | instanc...
               | 
                | So I would disagree with your claim about market share
                | being the reason for its helplessness to create a
                | superior developer-laptop experience.
               | 
                | Cluelessness? Sure. But not helplessness. If it wants the
                | developer market (versus the gamer market), then it had
                | better start acting like a developer tools company...
                | which includes cosying up to
                | Google/Facebook/AWS/Microsoft and throwing money at ROCm.
                | Education is one of them -
               | https://developer.nvidia.com/educators/existing-courses
               | ... and giving developers a generally superior
               | development experience on the maximum number of machines
               | is another.
        
               | qayxc wrote:
               | Well, NVIDIA view themselves as a software company that
               | also builds hardware.
               | 
               | > "NVIDIA is a software-defined company today," Huang
               | said, "with rich software content like GeForce NOW,
               | NVIDIA virtual workstation in the cloud, NVIDIA AI, and
               | NVIDIA Drive that will add recurring software revenue to
               | our business model." [1]
               | 
                | Not sure that's the case with AMD. AMD lags behind
                | massively when it comes to software and developer
                | support, and I doubt that the few guys who insist on ROCm
                | running on their APUs are a focus for them.
               | 
               | It's just not a relevant target demographic.
               | 
               | [1] https://www.fool.com/investing/2020/08/21/nvidia-is-
               | more-tha...
        
       | mastazi wrote:
        | Now that major frameworks have finally started supporting ROCm,
        | AMD
       | has half-abandoned it (IIRC the last consumer cards supported
       | were the Vega ones, cards from 2 generations ago). I hope this
       | will change.
        
         | slavik81 wrote:
         | I work for AMD, but this comment contains exclusively my
         | personal opinions and information that is publicly available.
         | 
         | ROCm has not been abandoned. PyTorch is built on top of
         | rocBLAS, rocFFT, and Tensile (among other libraries) which are
         | all under active development. You can watch the commits roll in
         | day-by-day on their public GitHub repositories.
         | 
         | I can't speak about hardware support beyond what's written in
         | the docs, but there are more senior folks at AMD who do comment
         | on future plans (like John Bridgman).
        
           | mcdevilkiller wrote:
           | Yes, you can ask bridgmanAMD on Reddit.
        
           | ebegnwgnen wrote:
           | "Future plan"... it's been years and it's still "future
           | plan".
           | 
            | I want to buy AMD because they are more open than Nvidia. But
            | Nvidia supports CUDA from day one for all their graphics
            | cards, and AMD still doesn't have ROCm support for most of
            | their products even years after their release [0].
            | 
            | Given AMD's size and budget, why they don't hire a few more
            | employees to work full time on making ROCm work with their
            | own graphics cards is beyond me.
            | 
            | The worst is how they keep people waiting. It's always vague
            | phrases like "not currently", "may be supported in the
            | future", "future plan", "we cannot comment on specific model
            | support", etc.
            | 
            | AMD doesn't want ROCm on consumer cards? Then say it. Stop
            | making me check the ROCm repos every week only to get ever
            | more disappointed.
            | 
            | AMD plans to support it on consumer cards? Then say it and
            | give a release date: "In May 2021, the RX 6800 will get ROCm
            | support; thanks for your patience and your trust in our
            | product".
            | 
            | I like AMD for their openness and support of standards, but
            | they are so unprofessional when it comes to compute.
           | 
           | [0] https://github.com/RadeonOpenCompute/ROCm/issues/887
        
             | 01100011 wrote:
             | > why they don't hire a few more employee
             | 
             | The reason is that it would take more than a few more
             | employees to provide optimized support across all GPUs. If
             | nothing else, it's a significant testing burden.
        
           | vetinari wrote:
           | I have no affiliation with AMD. Just own few pieces of their
           | hardware.
           | 
            | The following is my speculation. From what I've noticed,
            | Polaris and Vega had a brute-force approach; you could throw
            | a workload at them and they would brute-force through it. The
            | Navi generation is more gamer-oriented; it does not have the
            | hardware for the brute-force approach, and it relies more on
            | optimizations from the software running on it and on using it
            | in specific ways. That provides better performance for games,
            | but worse compute - and that's why it is not and is not going
            | to be supported by ROCm.
            | 
            | The downside, of course, is that you can no longer buy Vega.
            | If only I had known that the last time it was on sale...
        
           | mastazi wrote:
           | > PyTorch is built on top of rocBLAS, rocFFT, and Tensile
           | (among other libraries) which are all under active
           | development.
           | 
           | That's not very helpful if we can't use them on our own
           | computers... Not many devs are able to put their hands on a
           | datacentre-class card...
        
           | beagle3 wrote:
           | Can you comment perhaps on what you guys have compared to
           | nVidia's DGX? I'd rather buy a workhorse with open drivers.
        
             | slavik81 wrote:
             | The MI100 is our server compute product. The marketing page
             | is probably a better source of information than I am:
             | https://www.amd.com/en/products/server-
             | accelerators/instinct...
        
           | ckastner wrote:
           | Hardware support is key, though.
           | 
           | CUDA works with basically any card made in the last 5 years,
           | consumer or compute. ROCm seems to work with a limited set of
           | compute cards only.
           | 
           | There's a ROCm team in Debian [1] trying to push ROCm forward
           | (ROCm being open), but just getting supported hardware alone
           | is already a major roadblock, which stalls the effort, and
           | hence any contributions Debian could give back.
           | 
           | [1] https://salsa.debian.org/rocm-team
        
             | slavik81 wrote:
             | I didn't know about the Debian team. Thank you very much
             | for informing me! I'm not sure how much I can do, but I
             | would be happy to discuss with them the obstacles they are
             | facing and see what I can help with. I'm better positioned
             | to help with software issues than hardware matters, but I
             | would love to hear about ROCm from their perspective.
             | What's the best way to contact the team?
        
           | [deleted]
        
         | NavinF wrote:
         | Oh oof. Thanks for saving me time not having to look up ROCm
          | benchmarks. I find it really surprising that they don't wanna
          | compete on performance/$ at all by not supporting consumer
          | cards.
        
           | mastazi wrote:
           | No worries, for future reference you can check here
           | (hopefully that page will report improved support in the
           | future)
           | 
           | https://github.com/RadeonOpenCompute/ROCm#supported-gpus
        
           | AlphaSite wrote:
           | I think this is more of an issue that they have Compute
           | optimised and Graphics optimised cards and Vega is their last
           | compute optimised card.
           | 
           | It would be very nice for them to refresh their compute cards
           | as well.
        
             | my123 wrote:
             | They made new compute cards, but they aren't available to
             | customers. (Only businesses, under the Radeon Instinct
             | brand)
             | 
             | With the price to match for those...
        
               | dogma1138 wrote:
                | They aren't really available for businesses either. You
                | can buy a Tesla card at a Micro Center or on Newegg, and
                | through a million partners ranging from small system
                | builders to Dell and HP.
                | 
                | Good luck getting an Instinct card.
        
             | detaro wrote:
             | Going for market segmentation like that sounds like a
             | pretty bad idea if you are already the underdog in the
             | game.
        
               | sorenjan wrote:
               | Especially since more and more consumer workloads use
                | compute as well. DLSS makes real-time ray tracing at
                | acceptable resolutions possible, your phone camera uses
                | neural nets for every photo you take, Nvidia's OptiX
                | noise reduction is very impressive, and so on.
                | 
                | AMD doesn't seem to want to be part of future compute use
                | cases; it's a shame. "Armies prepare to fight their last
                | war, rather than their next war".
        
               | my123 wrote:
               | "Making science accessible to everyone is important to
               | us. That's one of the reasons why GeForce cards can run
               | CUDA." -- Bryce Lelbach at Nvidia
               | 
               | AMD meanwhile considers GPU compute as a premium
               | feature... which is a different approach.
               | 
               | Not having a shippable IR like PTX but explicitly
               | targeting a given GPU ISA, making this unshippable
               | outside of HPC and supporting Linux only also points in
               | that direction.
               | 
               | Intel will end up being a much better option than AMD for
               | GPU compute once they ship dGPUs... in their first gen.
        
         | dogma1138 wrote:
         | It's worse....
         | 
          | It's Linux-only, so no Mac, Windows, or WSL.
          | 
          | There is no support whatsoever for APUs, which means that if
          | you have a laptop without a dedicated GPU you're out of luck
          | (though discrete mobile GPUs aren't officially supported
          | either and often do not work).
          | 
          | Not only have they never supported any of their consumer
          | R"we promise it's not GCN this time"DNA GPUs, but since
          | December last year (2020) they've dropped support for all
          | pre-Vega GCN cards. That means Polaris (the 400/500 series),
          | which is not only the most prolific AMD GPU line of the past
          | 5 years or so but also the most affordable, is no longer
          | supported.
         | 
         | That's on top of all the technical and architectural software
         | issues that plague the framework.
        
           | vetinari wrote:
           | > It's worse....
           | 
           | > It's Linux only so no Mac, Windows or WSL.
           | 
            | I don't really see a problem here. You want Office or
            | Photoshop? They run on Mac and Windows only, so you'd better
            | get one. You want ROCm? Get Linux, for exactly the same
            | reason.
        
             | dogma1138 wrote:
              | No, I don't want ROCm, I want GPGPU, which is why the
              | entire planet is running CUDA.
              | 
              | Every decision AMD made with ROCm seems to boil down to "I
              | never want to be actually competitive with CUDA".
             | 
              | Everything from being Linux-only (and yes, that is a huge
              | limitation, because most people don't want ROCm itself,
              | but many would like Photoshop or Premiere or Blender to be
              | able to use it, especially given the OpenCL issues AMD
              | has...), through them not supporting their own hardware,
              | to other mind-boggling decisions like the fact that ROCm
              | binaries are hardware-specific, so you have no guarantee
              | of interoperability and, worse, no guarantee of future
              | compatibility - and in fact it does break.
             | 
              | The point is that I can still take a 6-year-old CUDA
              | binary and run it today on everything from an embedded
              | system to a high-end server; it maintains its
              | compatibility across multiple generations of GPU
              | hardware, multiple operating systems, and multiple CPU
              | architectures.
              | 
              | And if you can't understand why that is not only valuable
              | but more or less mandatory for every GPGPU application
              | you would ever ship to customers, perhaps you should
              | stick to Photoshop.
             | 
             | AMD is a joke in this space and they are very much actively
             | continuing in making themselves one with every decision
             | they make.
             | 
              | It's not that they've abandoned ROCm, it's that its entire
              | existence is telegraphing "Don't take us seriously,
              | because we don't".
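
              On the binary-compatibility point, a minimal sketch of
              inspecting what a given PyTorch wheel was actually compiled
              for (assuming the installed build exposes
              torch.cuda.get_arch_list(); CUDA wheels list sm_* targets,
              while ROCm wheels are built for specific gfx* ISAs):

                  import torch

                  # List of GPU targets baked into this build, if the
                  # helper is available in the installed version.
                  if hasattr(torch.cuda, "get_arch_list"):
                      print("Compiled for:", torch.cuda.get_arch_list())

                  if torch.cuda.is_available():
                      # Capability/ISA of the device actually present.
                      print("Device capability:",
                            torch.cuda.get_device_capability(0))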
        
             | sorenjan wrote:
             | It's a big problem if you want to use GPGPU in a product
             | shipped to customers, or if you want to do some GPU
             | programming on your own computer in your free time.
             | 
             | Keep in mind that not only is it Linux only, it doesn't
             | work on consumer hardware either.
        
               | vetinari wrote:
               | > It's a big problem if you want to use GPGPU in a
               | product shipped to customers,
               | 
               | So ship linux version and let the customers decide
               | whether they want it or not.
               | 
               | If you are doing server product, it is even handy; just
               | ship docker image or docker-compose/set of k8s pods, if
               | it needs them.
               | 
               | > or if you want to do some GPU programming on your own
               | computer in your free time.
               | 
               | You can install Linux very easily; it's not that it is
               | some huge expense.
               | 
               | > it doesn't work on consumer hardware either.
               | 
                | It does, just not on current-gen hardware or even on
                | anything currently purchasable. I agree, this is a
                | problem.
        
               | sorenjan wrote:
               | > So ship linux version and let the customers decide
               | whether they want it or not.
               | 
               | What if I want to use GPGPU in Photoshop, or a game with
               | more than two users? Or really anything aimed at
               | consumers?
               | 
               | > If you are doing server product
               | 
               | That's irrelevant, server products can choose their own
               | hardware and OS.
               | 
               | > You can install Linux very easily
               | 
               | "For a Linux user, you can already build such a system
               | yourself quite trivially by getting an FTP account,
               | mounting it locally with curlftpfs, and then using SVN or
               | CVS on the mounted filesystem"
               | 
               | Also, their main competitor with a huge market lead have
               | this: https://docs.nvidia.com/cuda/cuda-installation-
               | guide-microso...
               | 
               | > It does, just not on the current gen one or even
               | currently purchasable one. I agree, this is a problem.
               | 
                | Only on old hardware, and you can't expect users'
                | computers to be compatible. In fact, you should expect
                | users' computers to be incompatible. That's like saying
                | consumer hardware supports Glide.
        
               | vetinari wrote:
               | > What if I want to use GPGPU in Photoshop, or a game
               | with more than two users? Or really anything aimed at
               | consumers?
               | 
                | Then use an API supported on your target platform. It's
                | not ROCm then. Maybe Vulkan Compute/DirectCompute/Metal
                | Compute?
               | 
               | > That's irrelevant, server products can choose their own
               | hardware and OS.
               | 
               | It is relevant for ROCm.
               | 
               | > > You can install Linux very easily
               | 
               | > "For a Linux user, you can already build such a system
               | yourself quite trivially by getting an FTP account,
               | mounting it locally with curlftpfs, and then using SVN or
               | CVS on the mounted filesystem"
               | 
                | I'm quite surprised to see comparisons ad absurdum after
                | suggesting to use the proper tool for the job. Is
                | installing a suitable operating system - for a supposed
                | hacker - such a convoluted, nonsensical action nowadays?
               | 
               | > Only on old hardware, and you can't expect users'
               | computers to be compatible. In fact, you should expect
               | users' copmuters to be incompatible. That's like saying
               | consumer hardware support GLIDE.
               | 
                | Vega is not that old; the problem is that it is not
                | procurable anymore, which, if you read my comment again,
                | I agreed is a problem.
        
               | sorenjan wrote:
               | > Then use API supported on your target platform. It's
               | not ROCm then.
               | 
                | That's the point: ROCm isn't suitable for things outside
                | datacenters or maybe some workstations. CUDA is,
                | however, and that's what AMD should be aiming for. Their
                | best bet is SYCL, but that uses ROCm as a backend...
               | 
               | > Installing a suitable operating system - for a supposed
               | hacker - is a convoluted, nonsensical action nowadays?
               | 
               | Again, if all you need is to run ROCm on your own
               | computer Linux isn't a hurdle. If you want to ship
               | software to customers you can't just say "switch OS",
               | they're probably already using their computers for other
               | things.
               | 
                | > Vega is not that old; the problem is that it is not
                | procurable anymore, which, if you read my comment again,
                | I agreed is a problem.
               | 
               | The fact that they used to support something isn't a
               | relevant argument and I just don't see the point in
               | bringing it up, other than to underline the fact that AMD
               | doesn't care about compute support for the mass market
               | anymore. At least we agree on one thing.
               | 
                | The split between graphics-specific and compute-specific
                | hardware is an even bigger issue than being Linux-only.
                | ROCm stands for Radeon Open Compute, and their Radeon
                | DNA (RDNA) hardware can't run it, so streamers can't use
                | AMD hardware to play games and improve their mic sound,
                | while it's trivial to do with Nvidia's OptiX. And what
                | good are all the ML models if you can't ship them to
                | customers?
        
               | kortex wrote:
                | We all started as noobs once. A sizeable part of the
                | GPGPU market is academics who don't yet have their *nix
                | chops. I had to walk an undergrad through basic git
                | branching the other day.
        
               | vetinari wrote:
                | I understand, but look at it from another POV: ROCm is a
                | USP of the Linux platform. Windows has its own USPs
                | (DirectX or Office, for example), and macOS the same
                | (iOS development, for example).
               | 
                | Why should the Linux platform give up its competitive
                | advantages against the others? It would only diminish
                | the reasons for running it. The other platforms won't do
                | the same -- and nobody is bothered by that. In fact, it
                | is generally considered to be an advantage of a platform
                | and a reason for getting it.
               | 
               | And it's not like getting Linux to run could be a
               | significant expense (like purchasing an Apple computer is
               | for many) or you need to get a license (like getting
               | Windows). You can do it for free, and you will only learn
               | something from it.
               | 
                | I got my nix chops in the early '90s exactly this way: I
                | wanted a 32-bit flat-memory C compiler, didn't have the
                | money for Watcom and the associated DOS extenders, and
                | djgpp wasn't a thing yet (or I didn't know about it
                | yet). So I got Slackware at home, as well as access to
                | DG-UX at uni. It was different, I had to learn
                | something, but ultimately I'm glad I did.
        
               | dogma1138 wrote:
                | ROCm isn't a USP of the Linux platform, it's vendor-
                | locked.
                | 
                | It provides no competitive advantage to Linux, not to
                | mention that the entire concept is a bit laughable as
                | far as FOSS/open source goes.
               | 
               | I don't understand why AMD seems adamant at not wanting
               | to be a player in the GPU compute market.
               | 
               | Close to Metal was aborted.
               | 
               | OpenCL was abandoned.
               | 
               | HSA never got off the ground.
               | 
               | ROCm made every technical decision possible to ensure it
               | won't be adopted.
               | 
                | You can't get GPUs that could run it; you can't ship a
                | product to customers, because good luck buying the
                | datacenter GPUs from AMD (the MI100 is unavailable for
                | purchase unless you make a special order, and even then
                | AMD apparently doesn't want anyone to actually buy it);
                | and you can't really find GPU cloud instances that run
                | it on any of the major providers.
               | 
                | So what exactly is it? It's been 5 years, and if you
                | want to develop anything on the platform today you have
                | to build a desktop computer, find overpriced hardware on
                | the second-hand market and pay a premium for it, and
                | hope that AMD won't deprecate it within months like they
                | did with GCN 2/3 without even a heads-up notice - all so
                | you can develop something that only you can run, with no
                | future compatibility or interoperability.
               | 
               | If this is the right tool for the job then the job is
               | wrong.
        
               | my123 wrote:
                | GPGPU with the graphics APIs isn't a comparable
                | developer experience at all, _and_ it comes with some
                | major limitations.
                | 
                | As for DirectCompute, C++ AMP is in practice dead on
                | Windows, stuck at a DX11 feature level with no
                | significant updates since 2012-13 and kept around just
                | for backwards compatibility.
        
               | vetinari wrote:
                | Then SYCL is probably what you are looking for. But that
                | one is still WIP.
        
           | the8472 wrote:
           | > or WSL
           | 
           | Does WSL not support PCIe passthrough?
        
             | my123 wrote:
              | It does not.
              | 
              | And even if it did, almost all AMD consumer GPUs are not
              | supported (everything but Vega).
              | 
              | APUs? Not supported either.
        
       | lumost wrote:
        | Curious to see how this performs in a real-world setting. My
        | understanding is that Nvidia's neural network libs and other
        | proprietary foo would still hold an edge over a standard AMD
        | card.
       | 
       | If this is not the case then this is a really big deal.
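
        One way to get a first-order answer is to time the same op on
        both stacks; a rough sketch (assuming a working GPU build -- raw
        matmul throughput is only part of the story, since cuDNN/MIOpen
        kernel quality is usually where the gap shows up):

            import time
            import torch

            def bench_matmul(n=4096, iters=50):
                # Same code path on CUDA and ROCm builds of PyTorch;
                # torch.cuda.* maps to HIP on ROCm.
                a = torch.randn(n, n, device="cuda")
                b = torch.randn(n, n, device="cuda")
                # Warm-up to exclude one-time kernel/library init cost.
                _ = a @ b
                torch.cuda.synchronize()
                start = time.time()
                for _ in range(iters):
                    _ = a @ b
                torch.cuda.synchronize()
                elapsed = time.time() - start
                # Roughly 2*n^3 floating-point ops per matmul.
                return 2 * n**3 * iters / elapsed / 1e12

            if torch.cuda.is_available():
                print(f"~{bench_matmul():.1f} TFLOP/s")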
        
       ___________________________________________________________________
       (page generated 2021-03-05 23:02 UTC)