[HN Gopher] PCI-Sig Releases 256GBps PCIe 6.0 X16 Spec
___________________________________________________________________
PCI-Sig Releases 256GBps PCIe 6.0 X16 Spec
Author : ksec
Score : 88 points
Date : 2022-01-11 20:22 UTC (2 hours ago)
(HTM) web link (www.servethehome.com)
(TXT) w3m dump (www.servethehome.com)
| ChuckMcM wrote:
| Okay, that is just stupidly wide bandwidth :-). As a systems guy
| I really don't see terabytes per second of main memory bandwidth
| coming in the next 5 years. GDDR6, the new "leading edge", is
| nominally 768 GB/s, which sounds great, but how many transactions
| per second can you push through a CPU/GPU memory controller? Are
| we going to see 1152-bit wide memory buses (1024 data bits plus
| ECC)? That is over 1/3 of all the pins on a high end Xeon, and
| you're going to need probably another 512 pins' worth of ground
| return. HBM soldered to the top of the die? Perhaps, that is
| where GPUs seem to be headed, but my oh my.
|
| I'm sure on the plus side there will be a huge resale market for
| "combination computer and room heater" :-)
| synergy20 wrote:
| What a coincidence, my boss just asked me to buy the
| membership.
|
| Keywords: PAM4 and 128 GB/s per x16 link (full width; per my
| understanding that is per direction, not the 256 GB/s in the
| title), which means it can be used to make a ~1 Tbps NIC, or
| handle any traffic at that scale.
|
| From my reading, we will see PCIe 6.0 products in mid-2023.
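|
| A quick sanity check on those numbers in Python (raw signaling
| rate; FLIT framing and FEC shave a little off in practice):
|
|   # PCIe 6.0: 64 GT/s per lane (32 GBaud PAM4, 2 bits/symbol)
|   lanes, gt_per_s = 16, 64
|   per_dir_gb_s = lanes * gt_per_s / 8
|   print(per_dir_gb_s)      # 128.0 GB/s in each direction
|   print(2 * per_dir_gb_s)  # 256.0 GB/s counting both directions
|   print(lanes * gt_per_s)  # 1024 Gbit/s, i.e. ~1 Tbps per direction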
| NKosmatos wrote:
| Good news for everyone and hopefully this will not interfere with
| the PCIe 5.0 rollout, even though the 4.0 one is not fully
| adopted by the market yet.
|
| Not that I'd be able to use it, but it's a pity they make the
| spec available to members only ($4,000/year membership) or
| sell it at ridiculous prices:
| https://pcisig.com/specifications/order-form
|
| I know that other specifications and even ISO standards are
| provided for a fee (https://www.iso.org/store.html), and
| perhaps something similar should be applied to open source
| software to avoid issues like those with faker.js and colors.js.
| wmf wrote:
| You might be better off with a book than the spec anyway.
| Unfortunately I only see books covering PCIe 3.0.
| AdrianB1 wrote:
| The problem with PCIe is not bandwidth, it is the limit on lanes
| in consumer PCs: 20 lanes from the CPU and a few more from the
| southbridge are not enough when the GPU usually takes a 16-lane
| connection. The easy way out is to reduce the GPU to 8 lanes,
| which leaves plenty of bandwidth for NVMe SSDs and maybe for 10
| or 25 Gbps NICs (it's about time).
|
| For servers it is a different story, but the recent fast move
| from PCIe 3.0 to 5.0 improved the situation 4x; doubling again
| is nice, but it does not seem like that big a deal. Maybe moving
| NICs from the usual 8 lanes to far fewer (8 lanes of 3.0 equals
| 2 lanes of 5.0 or a single lane of 6.0) will also make some
| difference.
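|
| A minimal sketch of that lane budget (the device mix below is
| hypothetical, just to illustrate the x16-vs-x8 tradeoff):
|
|   # 20 CPU lanes: does a given device mix fit?
|   def fits(gpu_lanes, cpu_lanes=20):
|       devices = {"GPU": gpu_lanes, "NVMe 1": 4, "NVMe 2": 4,
|                  "25GbE NIC": 4}
|       return sum(devices.values()) <= cpu_lanes
|
|   print(fits(16))  # False: x16 GPU leaves only 4 lanes spare
|   print(fits(8))   # True: x8 GPU leaves room for 2 SSDs + a NIC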
| addaon wrote:
| But doubling the bandwidth per lane allows one to use half as
| many lanes to a GPU and maintain bandwidth. As you mention, it
| makes an eight-lane GPU a viable option. Better yet, because
| PCIe negotiates a variable number of lanes between host and
| device, different users with the same CPU, GPU, and even
| motherboard can choose to run the GPU at eight lanes with a
| couple of four-lane SSDs, or at sixteen lanes for even more
| bandwidth if they don't need that bandwidth elsewhere.
| AdrianB1 wrote:
| An 8-lane GPU has been viable for a long time (benchmarks of
| PCIe x8 versus x16 show about a 5% performance difference), but
| that has not changed the physical layout motherboard
| manufacturers use; you cannot use the existing lanes any way
| you want, on some motherboards you cannot even split them the
| way you want between physical connectors, and video card
| manufacturers continue to push x16 everywhere.
| johncolanduoni wrote:
| PCIe bandwidth increases have outstripped increases in GPU
| bandwidth use by games for a while now. Anything more than 8x
| is overkill unless you're doing GPGPU work: https://www.gamer
| snexus.net/guides/2488-pci-e-3-x8-vs-x16-pe...
| tjoff wrote:
| With the bandwidth you do get more flexibility, though: with
| PCI Express bifurcation you can get the bandwidth equivalent of
| four PCIe 4.0 x16 slots from a single PCIe 6.0 x16.
|
| And that's great, since the vast majority don't need that much
| bandwidth anyway.
|
| Today you typically have a whole slew of devices sharing an x4
| link to the CPU. More bandwidth would open things up for more
| USB and perhaps cheaper onboard 10 Gb Ethernet, etc.
| csdvrx wrote:
| Totally, and sTRX4 has a limited set of boards available.
|
| I was hoping AM4 would provide that many lanes on easy-to-buy
| motherboards, but it's a meager 28, so not even enough for two
| x16 slots.
| yxhuvud wrote:
| At least AM5, which is coming soon, seems to improve the
| situation.
| csdvrx wrote:
| With a meagre extra 4 lanes
| alberth wrote:
| Isn't this just an artificial problem Intel created to segment
| the market, one that AMD doesn't have?
| jeffbee wrote:
| It's not just segmentation. Laptop buyers are not going to
| pay for 64 lanes. A regular Intel SKU of the 12th generation
| has 28 PCIe 4.0/5.0 lanes. A Xeon has 64, does _not_ have
| 5.0, and costs way more, partly because it has 4189 pins on
| the bottom, which is insane.
| csdvrx wrote:
| Yes, they do have that problem; even the upcoming AM5 will
| only have 28 lanes, given the announcements:
| https://www.hwcooling.net/en/more-on-amd-am5-tdp-to-
| reach-12...
| wmf wrote:
| AMD has the same or slightly more lanes than Intel.
| the8472 wrote:
| EPYCs have 128 PCIe Gen4 lanes, recent Xeons have 64. And
| Intel introduced Gen4 later than AMD did.
| tjoff wrote:
| It's an AMD problem as well. It's an absolute nightmare trying
| to research a computer today: which ports you can use in which
| circumstances, which slots go to the CPU directly and which go
| to a chipset.
|
| Which lanes are disabled if you use NVMe slot 2, which slot
| runs at which generation, etc. A proper nightmare.
|
| And while we are at it, dedicating PCIe lanes to NVMe slots
| must be one of the most boneheaded decisions in modern
| computers. Just use a PCIe card with up to four NVMe slots on
| it instead.
| FridgeSeal wrote:
| Maybe it's because I bought a "gaming" motherboard, but the
| manual was pretty clear (to my understanding at least) as to
| what configuration of M.2 drives and PCIe lanes would run at
| what version, what went to the CPU and what went to the
| chipset.
| alberth wrote:
| Netflix, I imagine, would love to have this kind of I/O
| bandwidth.
| mikepurvis wrote:
| I'd be surprised if any of this mattered for them, since their
| workload (at least the "copy movie files from disk to network"
| part of it) is embarrassingly parallel.
|
| Unless they're really squeezed on power or rack space budget, I
| would imagine they'd do just fine being a generation back from
| the bleeding edge.
| loeg wrote:
| They are very squeezed on rack space budget in at least some
| locations.
| extropy wrote:
| Yeah, using significantly cheaper but just somewhat slower
| hardware works great if you can parallelize.
|
| Also, the cutting edge is usually very power hungry, and
| power/cooling costs are the majority of your expenses at data
| center scale.
| nijave wrote:
| >Unless they're really squeezed on power or rack space budget
|
| I think this is the case for their Open Connect appliances
| (or whatever they call them). They want to maximize
| throughput on a single device so they don't have to colocate
| so much equipment.
| SahAssar wrote:
| Isn't Netflix pretty much capped at network I/O, not disk I/O?
| All the posts I've read about them have been focused on
| network.
| willcipriano wrote:
| Isn't this just pure I/O? You could have a PCIe 6.0 raid
| controller or network card.
| vmception wrote:
| What use cases does this level of bandwidth open up? I am not
| able to understand why the article thinks SSDs are one of them.
| PCIe already provides more than enough bandwidth for the
| fastest SSDs; am I missing some forward-looking advancement?
| jjoonathan wrote:
| CXL puts PCIe in competition with the DDR bus. The bandwidth
| was already there (now doubly so), but CXL brings the latency.
| That's exciting because the DDR bus is tightly linked to a
| particular memory technology and its assumptions -- assumptions
| which have been showing a lot of stress for a long time. The
| latency profile of DRAM is really quite egregious; it drives a
| lot of CPU architecture decisions, and the DDR bus all but
| ensures this tight coupling. CXL opens it up for attack.
|
| Expect a wave of wacky contenders: SRAM memory banks with ultra
| low worst-case latency compared to DRAM, low-reliability DRAM
| (not a good marketing name, I know) where you live with 10
| nines of reliability instead of 20 or 30 and in exchange can
| run it a lot faster or cooler, instant-persistent memory that
| blurs the line between memory and storage, and so on.
| user_7832 wrote:
| > CXL
|
| Thanks, that is quite an interesting technology I wasn't aware
| of. Apparently Samsung already made a CXL RAM module for
| servers in 2021 (1). I wonder how Intel Optane would have
| been if it had used CXL (assuming it didn't).
|
| Side note, but the lack of Thunderbolt or PCIe access on AMD
| devices (laptops/NUCs) is why I'm quite hesitant to buy a
| portable AMD device, which is quite unfortunate. I really hope
| AMD/their partners can offer a solution soon now that
| Thunderbolt is an open standard.
|
| 1. https://hothardware.com/news/samsung-cxl-module-dram-
| memory-...
| zionic wrote:
| 120hz star citizen
| [deleted]
| dragontamer wrote:
| A PCIe 4.0 x4 link provides enough bandwidth. Ish... we've
| actually already capped out with ~8 GB/s SSDs; the PCIe 4.0 x4
| link is now the limiting factor.
|
| A PCIe 6.0 x1 link would provide the same bandwidth, meaning
| you run a quarter as many wires and still get the same speed.
|
| Alternatively, a PCIe 6.0 x4 link will be 4x faster than a 4.0
| x4 link, meaning our SSDs can speed up once more.
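|
| Roughly, in raw per-lane terms (ignoring the small 128b/130b
| and FLIT encoding overheads):
|
|   # Raw per-lane rate in GT/s by PCIe generation
|   gt_per_lane = {3: 8, 4: 16, 5: 32, 6: 64}
|
|   def link_gb_s(gen, lanes):
|       return gt_per_lane[gen] * lanes / 8  # GB/s per direction
|
|   print(link_gb_s(4, 4))  # 8.0 GB/s: where today's SSDs cap out
|   print(link_gb_s(6, 1))  # 8.0 GB/s: the same from one 6.0 lane
|   print(link_gb_s(6, 4))  # 32.0 GB/s: headroom for future SSDs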
| AnotherGoodName wrote:
| Internally passing around multiple 4K video outputs would use
| this. Maybe you don't want the output via the port on the back
| of the card but want to pass it through internally to some
| other peripheral? I think this is how Thunderbolt ports work,
| right (happy to be corrected)?
| adgjlsfhk1 wrote:
| Current top-of-the-line SSDs are close to maxing out 4 lanes of
| PCIe Gen 4. Gen 6 will make it a lot easier to find room for a
| few 100 Gb/s Ethernet connections, which are always nice for
| faster server-to-server networking, as well as making it easier
| to use PCIe-only storage.
| jeffbee wrote:
| You can make SSDs arbitrarily fast by just making them
| wider/more parallel. The reason it seems like PCIe 4 or 5 is
| "fast enough" is because the SSDs are co-designed to suit the
| host bus. If you have a faster host bus, someone will market a
| faster SSD.
| Zenst wrote:
| > PCIe already provides more than enough bandwidth for the
| fastest SSDs
|
| Today, yes, and for a few tomorrows as well. But even when a
| standard is announced as finalized, it can be a long time
| (years, even) until it makes its way onto motherboards in the
| consumer space. By which time the current goalposts may start
| looking closer than expected.
|
| I'm just glad they keep to one number, with no endless
| revisions and renaming of past releases - thank you, PCI-SIG.
| smiley1437 wrote:
| From what I understand, internally an SSD's bandwidth can be
| easily scaled by spreading reads and writes across arbitrarily
| large numbers of NVRAM chips within the SSD.
|
| So, you can just create SSDs that saturate whatever bus you
| connect them to.
|
| In a sense then, it is the bus specification itself that limits
| SSD throughput.
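|
| A toy model of that scaling (channel count and per-channel rate
| are made-up illustrative numbers, not any real controller):
|
|   # Aggregate SSD bandwidth ~ flash channels * per-channel rate
|   def ssd_gb_s(channels, gb_s_per_channel=0.8):
|       return channels * gb_s_per_channel
|
|   bus_limit = 8  # roughly a PCIe 4.0 x4 link, in GB/s
|   for ch in (4, 8, 16, 32):
|       capped = ssd_gb_s(ch) > bus_limit
|       print(ch, "channels:", ssd_gb_s(ch), "GB/s",
|             "(bus-limited)" if capped else "")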
| StillBored wrote:
| Current SSDs, but there is literally nothing stopping people
| from putting PCIe switches in place and maxing out literally
| any PCIe link you can create.
|
| The limit then becomes the amount of RAM (or LLC cache if you
| can keep it there) bandwidth in the machine unless one is doing
| PCIe PtP. There are plenty of applications where a large part
| of the work is simply moving data between a storage device and
| a network card.
|
| But, returning to PtP, PCIe has been used for accelerator
| fabric for a few years now, so a pile of GPGPU's all talking to
| each other can also swamp any bandwidth limits put in place
| between them for certain applications.
|
| Put the three together and you can see what is driving ever
| higher PCIe bandwidth requirements after PCIe was stuck at 3.0
| for ~10 years.
| nwmcsween wrote:
| So my understanding is that DDR5 has on-chip ECC because of the
| ever increasing need for -Ofast; will/does PCIe have the same
| requirement?
| loeg wrote:
| PCIe packets have always had error detection at the DLLP layer.
| ksec wrote:
| Yes, Forward Error Correction (FEC) [1]. The AnandTech article
| wasn't in my feed when I submitted this; it offers much more
| technical detail.
|
| [1]
| https://www.anandtech.com/show/17203/pcie-60-specification-f...
| Taniwha wrote:
| Reading the articles, I think the FEC is being used to protect
| link integrity - it's different from ECC on DRAM, which also
| protects the contents (against things like row hammer and
| cosmic rays).
| monocasa wrote:
| The on-chip ECC for DDR5 isn't because of faster memory; it's
| because of denser memory. It can rowhammer itself and corrupt
| data silently. And the cheaper brands can start shipping chips
| with defects like they do with flash, relying on the ECC to
| paper over it.
| rjzzleep wrote:
| How long does it usually take to get consumer products for new
| PCIe specs? Fast PCIe Gen 4 is only just getting affordable,
| like $350 for 2 TB NVMe SSDs.
|
| Also, I remember playing around with PCI implementation on FPGAs
| over a decade ago and timing was already not easy. What goes into
| creating a PCIe Gen4/5 device these days? How can you actually
| achieve that when you're designing it? Are people just buying the
| chipsets from a handful of producers because it's unachievable
| for normal humans?
|
| EDIT: What is in the spec differences between, say, gen 3 and
| gen 6 that allows for so many more lanes to be available?
| willis936 wrote:
| I've not done PHY development personally, but these interfaces
| are called SerDes, short for Serializer/Deserializer. Outside
| of dedicated EQ hardware, everything on the chip is done in
| parallel, so nothing needs to run at a multi-GHz clock.
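|
| As a rough illustration of why (the internal clock and width
| here are assumptions, not taken from any particular PHY):
|
|   # A 64 Gbit/s serial lane, deserialized onto a parallel bus
|   line_rate_gbps = 64       # PCIe 6.0 per-lane rate
|   internal_clock_ghz = 1.0  # assumed core-side clock
|   bus_width_bits = line_rate_gbps / internal_clock_ghz
|   print(bus_width_bits)     # 64-bit-wide internal path at 1 GHz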
| [deleted]
| Taniwha wrote:
| I think that these days there's a lot of convergence going on
| - everything is essentially serdes in some form - some chips
| just have N serdes lanes and let you config them for
| PCIe/ether/data/USB/etc as you need them, much as more
| traditional SoCs config GPIOs between a bunch of other
| functions like uarts/spi/i2c/i2s/pcm/...
| ksec wrote:
| >How long does it .......
|
| It is not just about getting a product out (i.e. a PCIe 6.0
| SSD), but also about platform support (i.e. Intel/AMD
| motherboard support for PCIe 6.0).
|
| Product launches are highly dependent on platform support. So
| far Intel and AMD don't have any concrete plans for PCIe 6.0,
| but I believe Amazon could be ahead of the pack with their
| Graviton platform. I am also eager to see Netflix's edge
| appliances serving up to 800 Gbps, if not 1.6 Tbps, per box.
| iancarroll wrote:
| I recently bought a few Zen 2 and Zen 3 HPE servers, and
| found out only via trial and error that HPE sells Zen 2
| servers without Gen4 motherboard support!
|
| It seems they took the original Zen motherboards with Gen3
| and just swapped out the CPU; only the Zen 3 servers have a
| refreshed motherboard. That makes me check things more
| carefully now.
| dragontamer wrote:
| > How long does it usually take to get consumer products of new
| PCIe specs?
|
| Like 2 years.
|
| When PCIe 3.0 was getting popular, 4.0 was finalized. When 4.0
| was getting popular, 5.0 was finalized. Now that PCIe 5.0 is
| coming out (2022, this year), PCIe 6.0 is finalized.
| formerly_proven wrote:
| There was a much bigger gap between 3.0 and 4.0. PCIe 3.0 was
| available with Sandy or Ivy Bridge, so 2011/2012. PCIe 4.0
| was introduced with Zen 2 in 2019.
|
| We seem to be back to a faster cadence now, however.
| zamadatix wrote:
| The large delay between 3.0 and 4.0 was a gap between
| specifications (2010 to 2017), not a gap from specification
| to implementation (2017 to 2019).
| dragontamer wrote:
| With the rise of GPU-compute, a lot of the supercomputers
| are playing around with faster I/O systems. IBM pushed
| OpenCAPI / NVLink with Nvidia, and I think that inspired
| the PCIe ecosystem to innovate.
|
| PCIe standards are including more and more coherent-memory
| options. It seems like PCIe is trying to become more like
| Infinity Fabric (AMD) / UltraPath Interconnect (Intel).
| jeffbee wrote:
| PCIe 5 was standardized in May 2019 and you could buy it at
| retail in late 2021. Two years is a good rule of thumb.
| jiggawatts wrote:
| I love how exponential growth can be utterly terrifying or
| unfathomably amazing.
|
| Just a few years ago I was trying to explain to an IT manager
| that 200 IOPS just doesn't cut it for their biggest, most
| important OLAP database.
|
| He asked me what would be a more realistic number.
|
| "20,000 IOPS is a good start"
|
| "You can't be serious!"
|
| "My laptop can do 200,000."
| KennyBlanken wrote:
| > Just a few years ago
|
| > "My laptop can do 200,000."
|
| Only now (PCIe 4 and very recent controllers, etc.) are the
| very latest top-end NVMe drives hitting around 150k IOPS (which
| isn't stopping manufacturers from claiming ten times that;
| WD's NVMe drive tests at around 150-200k IOPS, yet they claim
| 1M), and only in ideal circumstances: reads and writes coming
| out of the SLC cache, which is typically under 30GB, and often
| a lot smaller except in the highest-end drives.
|
| Many drives that claim to reach that sort of performance are
| actually using a host-backed cache, i.e. stealing RAM.
|
| IOPS on SSDs drops precipitously once you exhaust any host-
| backed cache, controller RAM, SLC cache, or mid-level MLC
| cache and start having to hit the actual QLC/TLC. In the case
| of a very large database, a lot of IO would be outside cache
| (though certainly any index, transaction, logging, etc. IO
| would likely be in cache.)
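|
| For scale, translating IOPS figures into throughput (assuming
| 4 KiB random I/O, the usual benchmark size):
|
|   # Throughput implied by an IOPS figure at 4 KiB per operation
|   def gb_s(iops, io_bytes=4096):
|       return iops * io_bytes / 1e9
|
|   print(gb_s(150_000))    # ~0.6 GB/s: the measured range above
|   print(gb_s(1_000_000))  # ~4.1 GB/s: what a 1M IOPS claim implies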
| Cullinet wrote:
| I would love to pick up from the 200k IOPS laptop quote, demo
| a RAM drive, and then saturate the RAM drive into swapping. I
| don't know how you could do this on stock distros or Windows,
| but it would make a great executive-suite demo of the issues.
| jeffbee wrote:
| There aren't more lanes available. The generations got faster
| just by increasing the transfer clock rate up to PCIe 5, and in
| PCIe 6 by increasing the number of bits per symbol. The way
| they doubled the speed every generation was pretty basic: the
| timing tolerances were chopped in half every time. The
| allowable clock phase noise in PCIe 4 is 200x less than in
| PCIe 1. The miracle of progress, etc.
|
| That miracle is somewhat over. They're not going to be able to
| drive phase noise down below 1 femtosecond, so 6.0 changes
| tactics. It now uses a fancier encoding (PAM4) on the wire to
| double the number of bits per symbol. Eventually, it will look
| more like WiFi-over-copper than like PCI. Ethernet faster than
| 1 Gbps has the same trend, for whatever it's worth.
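|
| A rough summary of that progression (raw per-lane line rates
| and encodings as published for each generation):
|
|   # (GT/s per lane, encoding) by generation; x16 GB/s is raw,
|   # per direction
|   gens = {1: (2.5, "8b/10b"), 2: (5, "8b/10b"),
|           3: (8, "128b/130b"), 4: (16, "128b/130b"),
|           5: (32, "128b/130b"), 6: (64, "PAM4 + FLIT")}
|   for g, (gt, enc) in gens.items():
|       print(f"PCIe {g}: {gt} GT/s/lane, {enc}, "
|             f"x16 ~ {gt * 16 / 8} GB/s")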
| bserge wrote:
| Speaking of which, when is 10Gbit Ethernet coming to laptops?
| Most have even lost the port ffs.
| jeffbee wrote:
| Many laptops have a Thunderbolt port, which serves a similar
| purpose. On TB4 I get 15 Gbps in practice, and I can bridge
| it to Ethernet using either a dock or a PC (I use a Mac mini
| with a 10G port to bridge TB to 10GbE).
| rektide wrote:
| > _How long does it usually take to get consumer products of
| new PCIe specs?_
|
| Personally I'm expecting this spec to drive PCIe 5.0 adoption
| in the consumer space.
|
| Tbh, consumers don't need this throughput. But given that the
| consumer space has remained stuck around 20 lanes off the CPU
| (plus some for the chipset), the 5.0 and 6.0 specs will be
| great for those wanting to build systems with more peripherals.
| An x1 link at 16 GB/s is useful for a lot.
| robbedpeter wrote:
| I'd be leery of dismissing the potential consumer demand.
| That much throughput could be put to good use for a myriad of
| personal and business functions, and software tends to fill
| whatever hardware can provide. It's like every prediction
| about users not needing X amount of RAM or CPU or DPI or
| network or storage space.
|
| Having that much throughput suggests paging and caching
| across multiple disks, or using giant models (ML or others)
| with precomputed lookups in lieu of real-time generation. At
| any rate, all it takes is a minor inconvenience to overcome
| and the niche will be exploited to capacity.
| cjensen wrote:
| PCIe Gen5 is now available in the latest Intel desktop
| processors. There are very few lanes, so it can really only run
| a single GPU, but that covers a lot of the potential market.
| eigen wrote:
| Looks like the desktop 600-series chipset just supports gen 3
| and 4 [1], and PCIe Gen5 ports are only available on Alder Lake
| desktop [2], not mobile [3], processors.
|
| [1] https://ark.intel.com/content/www/us/en/ark/products/seri
| es/...
|
| [2] https://ark.intel.com/content/www/us/en/ark/products/1345
| 98/...
|
| [3] https://ark.intel.com/content/www/us/en/ark/products/1322
| 14/...
___________________________________________________________________
(page generated 2022-01-11 23:00 UTC)