Up to 56 Cores and 112 PCIe 5.0 Lanes
For all of the singular focus that Intel has placed on its consumer Core desktop CPU parts in the last few years, you could be forgiven for thinking that Intel has forgotten about their Xeon premium processor lineups for workstations. Between the de facto retirement of Intel’s desktop-grade Xeon W-1×00-series lineup, and the repeated delays of Intel’s current-generation big silicon parts for servers, the Sapphire Rapids-based 4th Generation Xeon Scalable series, there hasn’t been much noise from Intel in the workstation space in the last few years. But now that Sapphire Rapids for servers has finally launched, the logjam in Intel’s product roadmap has at last cleared out, and Intel is finally in a position to resume cascading their latest silicon into new workstation parts.
This morning Intel is announcing their first top-to-bottom refresh of workstation parts, the Xeon W-3400 and Xeon W-2400 series. Aimed at what Intel is broadly classifying as the Expert Workstation and Mainstream Workstation markets respectively, these chip lineups are intended for use in high-performance desktop workstation setups, particularly those that require more CPU cores, more PCIe lanes, more memory bandwidth, or a combination of all three elements. Based on the same Sapphire Rapids silicon as Intel’s recently-launched server parts, the new Xeon W SKUs will bring down many (but not all) of the features that have come to define Intel’s leading-edge server silicon, along with a new chipset (W790) and motherboards that are more suitable for use in high-performance workstations.
As with the new Xeon Scalable parts, the big three additions here are the shift to Intel’s Golden Cove CPU architecture – with all the IPC and clockspeed benefits that brings – along with the addition of support for DDR5 memory and PCIe 5 for I/O connectivity, all of which is a significant upgrade over the mix of Cascade Lake and Ice Lake parts that make up Intel’s previous product stack. Meanwhile, compared to Intel’s existing desktop processor lineup, while these are all features that were pioneered on Alder Lake (12th Gen Core) back in late 2021, the workstation-focused Xeon W parts are going to be building things out to a much larger degree.
Starting at the top, the Xeon W-3400 series (Sapphire Rapids-112L) will vary from 12 to 56 cores, and all will include 112 PCIe 5.0 lanes, support for up to 4 TB of DDR5-4800 memory across eight memory channels, ECC memory (RDIMM-only), Intel vPro, and Intel Standard Manageability (ISM). Four of the seven W-3400 SKUs (X-series) benefit from unlocked multipliers and, as such, officially support overclocking. Meanwhile, a step down from that, the Xeon W-2400 series (Sapphire Rapids-64L) will offer between 6 and 24 CPU cores paired with a pared-down 64 lanes of PCIe 5.0 connectivity, support for up to 2 TB of DDR5-4800 memory across four memory channels, and all of the rest of the Xeon W trimmings such as ECC memory.
|Intel Xeon Workstation Desktop Platforms|
|Expert Workstation||Xeon W-3300 (Ice Lake-64L) & Xeon W-3200 (Cascade Lake-64L)|
|Mainstream Workstation||Xeon W-2200 (Cascade Lake)||Xeon W-2400|
|Entry Workstation||Xeon W-1200||12th Gen Core + W680 Chipset||13th Gen Core + W680 Chipset|
The new Xeon W parts will be replacing a mish-mash of different Xeon generations from Intel. While Intel did launch some Ice Lake-based Xeon parts in 2021 – the Xeon W-3300 family – these parts were a supplemental update of sorts for Intel’s Xeon lineup, for specific customers that needed the extra CPU cores or PCIe bandwidth. For everyone else, the outgoing Xeon W product stack, the circa 2019 W-3200 and W-2200 families, has been based on Intel’s Cascade Lake silicon – which itself was a modest update to Intel’s Skylake parts. So the importance of the launch of the Xeon W-3400/2400 series to Intel’s workstation lineup is hard to overstate: this is a major overhaul and upgrade of Intel’s Xeon workstation stack.
The new Xeon W parts, in turn, will be going up against AMD’s Threadripper Pro 5000 WX parts, which are based on AMD’s Zen 3 architecture. The most recent Threadripper Pro parts launched last spring, and AMD has essentially had the run of the market in terms of CPU performance since then, thanks to a significant advantage in core counts and IPC. Even with their new parts, Intel technically still isn’t completely closing that core count gap, but the boost in IPC, core counts, and clockspeeds should help to level the playing field in terms of overall CPU performance – though by how much remains to be seen.
Intel Xeon W-3400 Series: ‘Expert’ Platform with Up to 56 Cores, 112 PCIe 5.0 lanes, and 8-Channel Memory
Intel’s Xeon W-3400 and W-2400 series workstation processors are based on Intel’s Golden Cove CPU architecture, the same architecture as Intel’s Alder Lake (12th Gen) desktop processors. Representing the premier line-up from Intel’s 4th Gen Xeon Scalable Sapphire Rapids premium workstation offerings, the W-3400 family has seven SKUs in total. The Xeon W-3400 ranges from a modest 12-core/24-thread part (w5-3425) to a highly anticipated 56-core/112-thread part, the flagship w9-3495X.
|Intel Xeon W-3400 Series (Sapphire Rapids-112L)|
For the Xeon W-3400 series in particular, these parts are based on Intel’s Sapphire Rapids Extreme Core Count (XCC) silicon, which is currently used in Intel’s higher-end Xeon server parts. The XCC silicon relies on 4 compute tiles, bound together using Intel’s latest EMIB interconnect – a first for a Xeon workstation processor.
The individual tiles for a Sapphire Rapids XCC chip are all identical/symmetrical, so each tile provides a quarter of the CPU cores, I/O, and memory channels of the entire chip. As such, each tile can provide up to 32 PCIe 5.0 lanes (though only 112 lanes in total are enabled on the W-3400 parts), while each tile also includes up to two memory controllers, providing eight-channel memory across the W-3400 series.
Focusing on the top-end SKU of the Xeon W-3400 family, the Intel Xeon w9-3495X, it has similar vibes to Intel’s previous behemoth Xeon W-3175X, which was released in 2019 and came with official support for overclocking. Like the Skylake-based Xeon W-3175X, the latest Xeon w9-3495X also has an unlocked multiplier for overclocking.
The Intel Xeon w9-3495X has 56 cores (for 112 threads), and unlike Intel’s desktop parts, every last one of these is a Performance (P) core. Also present is a total of 105 MB of Intel’s Smart L3 Cache, with official support for eight channels of DDR5-4800 ECC RDIMM memory, with a maximum capacity of up to 4 TB.
Like the server part it’s based on, the w9-3495X has a rather toasty TDP rating, coming in at 350 Watts. And in practice, peak power consumption is likely to be much higher under full load with Intel’s Turbo Boost and Turbo Boost Max 3.0 technologies enabled, especially on 56 unlocked cores. Although it has a base frequency on the 56 Golden Cove cores of 1.9 GHz, it has a turbo frequency of up to 4.6 GHz, and thanks to Turbo Boost Max 3.0 (Intel’s favored core technology), a handful of cores can boost further to 4.8 GHz.
The other SKUs from the Xeon W-3400 family range from 36-core down to 12-core offerings, such as the w9-3475X (36C/72T) and the w5-3425 (12C/24T). Ultimately, all of the Xeon W-3400 parts offer the same number of DDR5 memory channels and PCIe lanes, so what separates the different SKUs is CPU core counts, max memory clockspeeds, L3 cache, and of course, price.
Meanwhile, as previously noted, four of the Xeon W-3400 SKUs – the w9-3495X, w9-3475X, w7-3465X, and the w5-3435X – are all “unlocked” processors. This is something Intel hasn’t offered on a Xeon W part in a few years and comes with some interesting ramifications. Besides the most basic ability to alter the clockspeed multipliers for the CPU, unlocked processors can also have their AVX and AMX offsets adjusted to keep the processors from dropping quite as much under heavy SIMD loads. Finally, all of these parts also offer some tuning options for their mesh interconnects, though Intel hasn’t said what precisely can be tweaked here.
Prices on the Intel Xeon W-3400 family start at $1189, with Intel quoting prices per unit in 1,000-unit (tray) quantities rather than for individually purchased retail SKUs. The Xeon w9-3495X has a 1KPU price of $5889, which makes the top SKU and each subsequent W-3400 SKU more expensive than the previous generation of Xeon W-3300 chips, but they do come with higher core counts, faster turbo frequencies, more L3 cache, and support for DDR5-4800.
It is worth pointing out that all of Intel’s W-3400 SKUs feature support for up to 4TB of eight-channel DDR5-4800 ECC memory, even the bottom SKU, the w5-3425 (12C/24T). So there are options in the Xeon product stack for systems that need a whole lot of DRAM, but not necessarily a ton of CPU cores. Do note, however, that actually hitting 4TB requires using 2 DIMMs per channel (DPC), which requires backing off to DDR5-4400 memory speeds.
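As a rough sanity check on those capacity figures, the sketch below works backward from the 4 TB ceiling. The 256 GB per-DIMM size it arrives at is an inference from the numbers above (4 TB across eight channels at 2 DPC), not a figure Intel has provided here, and the function names are purely illustrative.

```python
# Back-of-the-envelope DIMM math for the Xeon W-3400 platform.
# Assumption: the 4 TB ceiling is reached with equal-sized DIMMs in every slot.

def dimm_slots(channels: int, dimms_per_channel: int) -> int:
    """Total DIMM slots populated for a given channel count and DPC."""
    return channels * dimms_per_channel


def required_dimm_gb(total_tb: int, channels: int, dimms_per_channel: int) -> float:
    """Per-DIMM capacity (GB) needed to reach total_tb, using 1 TB = 1024 GB."""
    return total_tb * 1024 / dimm_slots(channels, dimms_per_channel)


# W-3400: eight channels, 2 DPC, 4 TB ceiling
print(dimm_slots(8, 2))           # 16
print(required_dimm_gb(4, 8, 2))  # 256.0

# The same (assumed) 256 GB DIMMs at 1 DPC would top out at half the capacity
print(dimm_slots(8, 1) * 256 / 1024)  # 2.0
```

In other words, under these assumptions a 1 DPC build that keeps full DDR5-4800 speeds would be limited to 2 TB.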
With 112 PCIe 5 lanes available from the CPU (and yet more from the chipset), the Xeon W-3400 chips can support a rather massive number of I/O devices. This works out to seven discrete x16 graphics cards, or up to 28 x4 high-speed storage devices. This, along with core counts and memory channels, is one of the primary differentiators from the lower-tier Xeon W-2400 series – and should be a welcome development for Intel platform users who were stuck with a fraction of the I/O bandwidth on Intel’s earlier Xeon W parts.
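The lane arithmetic is simple enough to sketch out. The lane counts below come from Intel's figures above; the per-lane throughput is a theoretical number based on PCIe 5.0's 32 GT/s signalling rate and 128b/130b encoding, so treat it as an upper bound rather than something you would measure. The helper names are made up for illustration.

```python
# PCIe lane budgeting for the CPU-attached lanes on each Xeon W family.
CPU_LANES_W3400 = 112
CPU_LANES_W2400 = 64


def max_devices(total_lanes: int, lanes_per_device: int) -> int:
    """How many devices of a given link width fit in the lane budget."""
    return total_lanes // lanes_per_device


def pcie5_gbytes_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth (GB/s) of a PCIe 5.0 link.

    32 GT/s per lane, 128b/130b line encoding, 8 bits per byte.
    Packet/protocol overhead is ignored, so real throughput is lower.
    """
    return lanes * 32.0 * (128 / 130) / 8


print(max_devices(CPU_LANES_W3400, 16))  # 7
print(max_devices(CPU_LANES_W3400, 4))   # 28
print(max_devices(CPU_LANES_W2400, 16))  # 4
print(round(pcie5_gbytes_per_s(16), 1))  # 63.0
```

The same helper gives four x16 slots' worth of lanes on W-2400, which is where the gap between the two families shows up most clearly for multi-GPU builds.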
Interestingly, 112 PCIe 5 lanes is actually more than Intel offers in its Sapphire Rapids server parts. The Xeon Scalable lineup tops out at just 80 lanes. This discrepancy comes from the fact that Intel only enabled 5 of the 7 root ports for their server parts, leaving a further 2 ports (32 lanes) unused. However, as the workstation Sapphire Rapids parts do not need to allocate any pins to supporting Intel’s multi-socket UPI links, it would seem that Intel has instead allocated those pins to carrying the additional PCIe lanes for the workstation parts. It’s worth noting that Intel is using the same socket for both server and workstation chips here – LGA 4677 – but with the pin changes I wouldn’t expect them to be compatible.
Meanwhile, in another first for Intel, the company has said that they are going to support DDR5 XMP 3.0 memory overclocking profiles for RDIMMs. The details on this announcement are very scant, but at a high level it’ll give unlocked processor owners running on W790 the option of trying to squeeze more out of their memory if they can. Generally speaking, memory overclocking and the rock-solid stability of RDIMMs are diametrically opposing goals, so it will be interesting to see how this plays out in the market. The DRAM may be able to clock higher than just DDR5-4800, but can the registered clock drivers (RCDs)?
As an aside, all of this talk explicitly around RDIMMs is intentional: in a big change from previous Xeon W platforms, the Sapphire Rapids Xeon workstation platforms will not support UDIMMs. This is a limitation of the DDR5 specification, which calls for different voltages for UDIMMs and RDIMMs respectively. Whereas UDIMMs take 5 volts, RDIMMs take 12 volts, rendering them incompatible. If you’ve ever had the chance to see a DDR5 RDIMM in person, you may have noticed that they are even keyed differently from UDIMMs, so they are both physically and electrically incompatible.
Ultimately, this means that users will have to pair these processors and W790 motherboards with more expensive, albeit higher-quality, ECC-enabled DDR5 RDIMMs. For dyed-in-the-wool workstation users this is unlikely to be an issue (or even a difference that gets noticed), but anyone hoping to build an HEDT-style system or low-end workstation on the cheap is going to find that the final price tag for a Xeon W system is going to be higher than what you could pull off with the W-3200/2200 series.
Accelerated Computing: AMX and CXL Make the Cut, Most Domain-Specific Accelerators Do Not
For their Sapphire Rapids Xeon silicon and resulting server parts, Intel introduced a slew of different acceleration blocks and other accelerator-related features. Between matrix extensions (AMX), various domain-specific hardware acceleration blocks, and support for Compute eXpress Link (CXL) for external accelerators, Intel ended up devoting a fair bit of silicon to non-CPU tasks. This has meant that for their Xeon Scalable server parts in particular, Intel has opted (if not needed) to lean on these accelerator features, in lieu of just raw x86 CPU performance, for differentiating the hardware from its predecessors and its competition.
But for their workstation parts, things are a little more straightforward, for better and for worse. In short, not all of Intel’s accelerated computing features are being made available in the Xeon W-3400/2400 families: one DSA engine is enabled in all of the chips, but QAT, DLB, and IAA are not supported. So let’s do a quick rundown of which of Sapphire Rapids’ more esoteric features made the cut for Xeon W.
Perhaps most critically of all, Intel’s Advanced Matrix Extensions (AMX) did make the cut, and support for them is fully present and enabled on the Xeon W-3400/2400 family. AMX is Intel’s matrix math execution block, and similar to tensor cores and other types of matrix accelerators, these are ultra-high-density blocks for efficiently executing matrix math. AMX isn’t a dedicated accelerator, rather it’s a part of the CPU cores, with each core getting a block, which allows AMX code to be mixed with x86 (and AVX) code, and is also why Sapphire Rapids has negative clockspeed offsets for using the ultra-dense code.
AMX is Intel’s play for the deep learning market, going above and beyond the throughput they can achieve today with AVX-512 by using even denser data structures. While Intel has GPUs with their own matrix engines (Intel Data Center GPU Max Series) that go beyond even this, for Sapphire Rapids Intel is looking to address the customer segment that needs AI inference taking place very close to CPU cores, rather than in a less flexible, more dedicated accelerator. The new AMX units also support Bfloat16, ensuring that every tier of Intel’s accelerated computing blocks (AVX and AMX) supports this common mid-precision floating point format for deep learning.
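For readers unfamiliar with Bfloat16, the format is essentially the top 16 bits of a standard 32-bit float: the same 8-bit exponent, but only 7 mantissa bits. The sketch below illustrates just the number format and a reduced-precision dot product; it is not how AMX is programmed (that goes through tile instructions and compiler intrinsics), and the function names are hypothetical. For simplicity it truncates rather than rounding to nearest, and uses an ordinary Python float to stand in for the wider accumulator.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round-trip x through bfloat16 by keeping the top 16 bits of its float32 form.

    Truncation (round toward zero) is used for simplicity; hardware
    typically rounds to nearest even instead.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFF0000  # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bf16_dot(a, b) -> float:
    """Dot product with bfloat16 inputs and a wider accumulator."""
    acc = 0.0  # Python float stands in for the higher-precision accumulator
    for x, y in zip(a, b):
        acc += to_bfloat16(x) * to_bfloat16(y)
    return acc


print(to_bfloat16(3.14159))               # 3.140625
print(bf16_dot([1.0, 2.0], [3.0, 4.0]))   # 11.0
```

The first result shows the trade-off: the value keeps float32's dynamic range but only two to three significant decimal digits, which deep learning workloads tend to tolerate well.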
One of Sapphire Rapids’ new domain-specific hardware accelerator blocks, the Data Streaming Accelerator (DSA), also made the cut. This block is for offloading/accelerating certain operations, such as data copies and simple computations such as calculating CRC32s. The DSA block is available across all of the Xeon W SKUs.
However, you won’t find mention of the rest of Intel’s accelerator blocks, such as Intel Dynamic Load Balancer (DLB), Intel In-Memory Analytics Accelerator (IAA), and Intel QuickAssist Technology (QAT). This is despite the fact that these accelerators are all part of the same functional block on the Sapphire Rapids silicon. These other accelerator blocks are all primarily aimed at servers, so their absence isn’t surprising, but it does mean anyone prototyping code for servers will need to test on an actual Xeon Scalable if they’re using their features.
Finally, while CXL support is absent from Intel’s Xeon W spec sheets, Intel has confirmed to us that it is in fact supported on both families. The built-on-top-of-PCIe standard for host-to-device connectivity has been in the wings for a few years now, and Sapphire Rapids is the first Intel CPU platform to support the technology. Like some of these other features, it is primarily intended for servers, so there’s less of an impetus to bring it to workstations. Still, Intel has enabled it for users looking to leverage its functionality.
Intel Xeon W-2400 Series: Up to 24 Cores, 64 PCIe 5.0 Lanes, For Mainstream Workstations
Dropping down a tier, we have the Xeon W-2400 series (Sapphire Rapids-64L), which is designed as a ‘Mainstream’ workstation platform. Xeon W-2400 offers a bit more than half as many PCIe lanes as the W-3400 SKUs, with 64 PCIe 5.0 lanes available, and the number of memory channels is cut in half as well to four channels. In turn, prices are lower on the W-2400 series than on its beefier W-3400 counterparts, going as low as $359 for the entry-level Xeon w3-2423.
|Intel Xeon W-2400 Series (Sapphire Rapids-64L)|
Overall, the Xeon W-2400 series will range from 6 cores up to 24 cores. Intel is using their Sapphire Rapids Medium Core Count (MCC) silicon here, which unlike the XCC silicon, is a traditional monolithic die. This means no fancy EMIB packaging is required to build the chip – instead, Intel only has to fab one rather large die.
At the top end of the Xeon W-2400 lineup is the w7-2495X, which features 24 cores/48 threads, 45 MB of Intel Smart L3 cache, and a TDP of 225 Watts. Intel also has three w5 series SKUs, and finally the trio of w3 SKUs.
Like its expert-tier counterpart, the Xeon W-2400 series offers a consistent memory and I/O configuration across the entire lineup. This means 64 lanes of PCIe 5 coming from the CPU, and four channels of DDR5 memory, allowing for a maximum of 2 TB of memory overall. It is also worth pointing out that only the w5 and w7 SKUs offer full DDR5-4800 memory speeds; the w3 parts are all capped at DDR5-4400. The silver lining? All SKUs drop to this speed in a 2 DPC configuration, so if you were looking to build a 2 TB system for whatever reason, you won’t get penalized.
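To put the channel counts and memory speeds into perspective, here is a quick sketch of theoretical peak memory bandwidth for both families. It assumes the usual 64-bit (8-byte) data path per memory channel, excluding ECC bits, and the results are ceilings rather than measured figures. The function name is illustrative only.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal units).

    Assumes an 8-byte data path per channel; ECC bits are not counted.
    """
    return channels * mt_per_s * bus_bytes / 1000


# Xeon W-2400: four channels
print(peak_bandwidth_gbs(4, 4800))  # 153.6
print(peak_bandwidth_gbs(4, 4400))  # 140.8

# Xeon W-3400: eight channels
print(peak_bandwidth_gbs(8, 4800))  # 307.2
print(peak_bandwidth_gbs(8, 4400))  # 281.6
```

By this math, dropping from DDR5-4800 to DDR5-4400 costs a little over 8% of peak bandwidth, while stepping down from eight channels to four halves it.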
Like the Xeon W-3400 series, the W-2400 family also has a few unlocked X SKUs in its arsenal, including the top-tier w7-2495X. Other SKUs with unlocked multipliers include the w7-2475X with 20 cores and 37.5 MB of L3 cache, and two w5 SKUs (w5-2465X 16C/32T and w5-2455X 12C/24T). You won’t find any unlocked w3 parts, however, as all three entry-level w3 SKUs are fully locked down.
Intel W790 Chipset: Supports both Xeon W-3400 and W-2400 Platforms
All of Intel’s Xeon W-3400 and W-2400 series SKUs benefit from Intel vPro and Intel’s Standard Manageability (ISM) technologies. Both the Xeon W-2400 and W-3400 families are supported by the associated W790 chipset, although CPU-specific features such as the number of memory channels and PCIe lanes available depend on the processor itself.
Some of the main features of the W790 chipset include a Direct Media Interface (DMI) 4.0 x8 link between the processor and the chipset itself, as well as up to 16 PCIe 4.0 lanes and support for up to eight SATA 3.0 ports. W790 also supports up to five USB 3.2 Gen2x2 (20Gbps) ports, includes an Intel Wi-Fi 6E PHY, and can support 2.5 GbE controllers natively.
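One practical consequence of that layout is that everything hanging off the chipset shares the DMI uplink to the CPU. The sketch below estimates how oversubscribed that link could be if all 16 chipset PCIe 4.0 lanes were saturated at once. It assumes DMI 4.0 signals at the same 16 GT/s per lane as PCIe 4.0 with 128b/130b encoding, which is our assumption rather than something stated in Intel's materials here; SATA and USB traffic is ignored, and the names are illustrative.

```python
def link_gbytes_per_s(lanes: int, gt_per_s: float) -> float:
    """Theoretical one-direction link bandwidth in GB/s with 128b/130b encoding."""
    return lanes * gt_per_s * (128 / 130) / 8


DMI_LANES = 8             # DMI 4.0 x8 uplink (per the W790 feature list)
CHIPSET_PCIE4_LANES = 16  # downstream PCIe 4.0 lanes from the chipset
ASSUMED_GT_PER_S = 16.0   # assumption: DMI 4.0 matches PCIe 4.0 signalling

uplink = link_gbytes_per_s(DMI_LANES, ASSUMED_GT_PER_S)
downstream = link_gbytes_per_s(CHIPSET_PCIE4_LANES, ASSUMED_GT_PER_S)

print(round(uplink, 2))               # 15.75
print(round(downstream / uplink, 2))  # 2.0
```

A 2:1 ratio under these assumptions is unlikely to matter for typical workstation peripherals, but bandwidth-hungry devices are better placed on the CPU's own PCIe 5.0 lanes.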
Although there isn’t any mention of new motherboards, Intel W790 boards are expected from vendors such as ASUS, GIGABYTE, Supermicro, and ASRock. System integrators such as Dell, Lenovo, and Supermicro are expected to take precedence in delivering solutions and systems before DIY builders can get their hands on them.
The ASRock W790 WS motherboard
ASRock emailed us just before the launch to outline its W790 WS model, with a 20+2-phase power delivery, dual 10 GbE controllers, and support for up to 2 TB of DDR5-4800 ECC RDIMMs across eight slots. Although this board supports both Xeon W-3400 and W-2400 processors, this board is only enabled for quad-channel memory.
Something worth mentioning concerning the latest generation of motherboards is that W790 boards are likely to cost more than the C621A-based boards that were used to support the Xeon W-3300 series (Ice Lake). This is because W790 boards have four more lanes of DDR5 memory and 48 more PCIe 5 lanes to account for. While we expect to see different levels of board designs with different slots and I/O configurations available at some point, Intel hasn’t specified if some of these motherboards will support both families, or if vendors will design specific boards around the individual Xeon W-3400 and W-2400 series.
Intel’s Xeon W-3400 and W-2400 processors are available to pre-order from industry partners, while system deployments are expected sometime in early March. Intel’s expected and recommended pricing starts at $359 for the Xeon w3-2423 and goes up to $5889 for the Xeon w9-3495X.