
AMD Threadripper 9980X 64-Core CPU Review & Benchmarks
Date: 2025-08-02
Comments and reviews: 20
ryankuba-cj6bp
I used to work at Qualcomm in our Android build farm. When you are doing compilation tests, it is important to note that many parts of a large compilation process spin down to very few threads, or even one.
This happens due to the prerequisite-driven, step-by-step nature of clang/gcc/g++ compilation. They kept throwing hardware at it, but adding more cores, or even going to extremes like running from RAM instead of disk, made no difference in total compilation time because of these generally unknown bottleneck steps in the build process.
To see a large difference generationally, or going from 64 cores to 128, you need to stagger multiple builds: if the build time is 1 hour, you start one at minute 0, one at minute 15, etc. The net result is you actually use all of your cores most of the time. That is really the only way to represent the real gains you get from something like this, because at many long steps in the build process more than like 8 cores are simply not needed. This can be further optimized if you perf track a long compile to identify the dead zones where everything is waiting on a single object file, or just a couple, to finish.
I do not expect you to do something like this (perf tracking and multi-build alignment); it's more of an FYI that in real build farms people sit around and come up with all kinds of ways to do more with the same hardware. It might make sense to run like 4 compiles, staggered at intervals of the single-build time divided by the number of builds, to ensure the CPU is loaded the whole time; running one build at minute X and another at minute Y can be scripted pretty easily. You would see much larger generational leaps or shifts from high core count systems doing something like this.
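The stagger described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the build farm's actual tooling: `stagger_offsets` and `launch_staggered` are hypothetical helper names, the build command is whatever you pass in, and each build is assumed to use its own output directory so the runs do not collide.

```python
import subprocess
import time


def stagger_offsets(build_minutes: float, concurrent_builds: int) -> list[float]:
    """Start times (in minutes) that spread builds evenly over one build's duration."""
    if concurrent_builds < 1:
        raise ValueError("need at least one build")
    interval = build_minutes / concurrent_builds
    return [i * interval for i in range(concurrent_builds)]


def launch_staggered(command: list[str], offsets_minutes: list[float]) -> list[subprocess.Popen]:
    """Start one copy of `command` at each offset (hypothetical helper).

    Assumes `command` is a complete build invocation that writes to its own
    output directory; nothing here manages that for you.
    """
    procs = []
    start = time.monotonic()
    for offset in offsets_minutes:
        # Sleep until this build's scheduled start time, measured from `start`.
        delay = offset * 60 - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
        procs.append(subprocess.Popen(command))
    return procs


# A 60-minute build run as 4 overlapping copies starts one every 15 minutes.
print(stagger_offsets(60, 4))  # [0.0, 15.0, 30.0, 45.0]
```

Total throughput would then be measured as builds completed per hour across all copies, rather than the wall time of a single build.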
Lodinn
You have a well-balanced set of benchmarks here.
I have worked with physical simulations and ML/AI stuff in the past, as well as bioinformatics more recently. Your suite of benchmarks seems well-rounded and covers most bases (or, at least, I could extrapolate a lot about how it'd perform in my applications). One HUGE potential bottleneck with these systems, though, is memory and parallelization. You have MPI in there, which is good, but the total memory capacity is low for these kinds of systems. The sweet spot for us seems to be 256 GB for 32 threads, 1 TB for 128 threads, etc.; I have also seen the tendency towards this scaling in multiple institutions. It is rare to need a lot of compute for not a lot of data. Bioinformatics tasks are especially notorious for being RAM-hungry: for a few gigabytes of input (essentially, text), a computational graph easily takes hundreds of gigabytes. That's where the architecture really starts to matter: you may run into all kinds of bottlenecks, cache optimization becomes noticeable, and so on. All that is to say, for a 128t part I'd definitely fit it with at least 512 GB of RAM and load it with something utilizing at least 250-300 GB of it. It would also be interesting to look at the regular HEDT configuration vs. the one using ECC memory; folks running longer tasks (on the scale of days) usually start being more concerned about that.
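Both sweet-spot figures in this comment work out to the same ratio, which a short calculation makes explicit. This is only arithmetic on the numbers quoted above; the helper names are hypothetical, and "1 TB" is assumed to mean 1024 GB.

```python
def gb_per_thread(total_gb: float, threads: int) -> float:
    """Memory available per hardware thread, in GB."""
    return total_gb / threads


def suggested_ram_gb(threads: int, ratio_gb_per_thread: float = 8.0) -> float:
    """RAM implied by a fixed GB-per-thread ratio (8.0 is derived from the comment's figures)."""
    return threads * ratio_gb_per_thread


print(gb_per_thread(256, 32))    # 8.0
print(gb_per_thread(1024, 128))  # 8.0 (assuming 1 TB = 1024 GB)
print(suggested_ram_gb(128))     # 1024.0
print(gb_per_thread(512, 128))   # 4.0, the "at least 512 GB" floor for a 128-thread part
```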
saull287
The last thing on my mind is the games. I'm here for the specs and performance; games are just a metric. I want to see all the charts, 'coz I want to see how much it sucks normally and what improvements you get with OC, with all the testing. I admire high core count CPUs, 'coz they're Wonderful Computing Machines.
Convolutions Benchmark: Well, all the optimizations pay off; it's as simple as that. Maybe new instruction sets or improvements to several of them, maybe some core comms or grouping changes from the rearranged I/O chiplet, or better RAM handling. I don't see the problem; they're supposed to be better, with better boost handling or OCing. LTT made it go to 5.0 sustained, it just works haha. But seriously, you should check the frequency handling and ISA, cache associativity, and bus bandwidth for the FPU. So yeah, 2x performance if it's FPU intensive; it also depends on the nature of the algorithms and the data handling.
Results are too generic: games sometimes don't use all cores, so it'd be useful to correlate benchmarks with core use count. Also, non-parallel threads are very bad for performance in general, and in particular for high core count CPUs, 'coz the massive amount of potential multi-level cache available from each core is wasted.
Rob...0...1
If AMD truly wants to ignite the enthusiast/non-pro HEDT community with their TRX50, then they NEED to grow a pair and own up to all the TRX40 owners that they royally screwed over using those long term support promises and that now infamous bait and switch...killing off that enthusiast HEDT userbase in the process.
As it is, TRX50 motherboards - now, with TR 9000, in its second series of CPUs (one more than TRX40 !!) - still sit in deathly silence on shelves in numbers so low as to be beyond laughable...every motherboard manufacturer letting AMD know how they feel about being equally royally shafted with TRX40.
--> TRX50 is a total disaster. AMD clearly don't give a f about all those TRX40 'fools'.....or about their 'We're back' TRX50 !!!
edragyz8596
6:42
Steve, respectfully. I wanna know if it's slower than the 9950X, because LINUS failed to compare it to a non-X3D, like that was supposed to tell me ANYTHING. Duh. The Zen 5 CPU with no 3D cache is slower than the Zen 5 CPU with 3D cache.
Edit: It seems wildly inconsistent, but in the realm of other Zen 5 CPUs when working properly. I'd wager a guess that manually setting the affinity to the fastest CCD (hopefully not CCD0) would clear up these performance issues and push it up next to the 9950X. This same trick can be used to boost any Ryzen multi-CCD CPU's performance too. Also, these higher core count Zen CPUs sometimes benefit from SMT off in some games; I believe Cyberpunk 2077 is one such game.
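For anyone wanting to try the affinity idea on Linux, a minimal sketch follows. It rests on assumptions that should be checked on the actual machine (for example with `lscpu -e`): 8 cores per CCD, 64 cores total, and SMT siblings numbered as core index plus total core count. `ccd_cpus` and `pin_to_ccd` are hypothetical helper names; `os.sched_setaffinity` is a real call but is only available on Linux.

```python
import os


def ccd_cpus(ccd_index: int, cores_per_ccd: int = 8,
             total_cores: int = 64, smt: bool = True) -> set[int]:
    """Logical CPU IDs belonging to one CCD, under the numbering assumed above."""
    ccd_count = total_cores // cores_per_ccd
    if not 0 <= ccd_index < ccd_count:
        raise ValueError(f"ccd_index must be in 0..{ccd_count - 1}")
    first = ccd_index * cores_per_ccd
    cores = range(first, first + cores_per_ccd)
    cpus = set(cores)
    if smt:
        # Assumption: the second thread of core N is logical CPU N + total_cores.
        cpus |= {core + total_cores for core in cores}
    return cpus


def pin_to_ccd(pid: int, ccd_index: int) -> None:
    """Restrict a process to one CCD (Linux only; pid 0 means the calling process)."""
    os.sched_setaffinity(pid, ccd_cpus(ccd_index))


print(sorted(ccd_cpus(1)))
# [8, 9, 10, 11, 12, 13, 14, 15, 72, 73, 74, 75, 76, 77, 78, 79]
```

Which CCD is actually the fastest is not something this sketch can tell you; that has to come from measurement or from the platform's preferred-core ranking.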
someguyperson
That SpecWS Convolution benchmark is mostly a workload you would run on a DSP or a GPU. You might want to run it on a CPU if you're designing something new and you don't want to optimize it for the end device. A potential scenario would be that you're designing a new photo application and you want to design different filters or effects to apply to the image and the application could run on a phone, a GPU, or a CPU depending on the end user's hardware.
I believe this is also used by a good number of oil & gas people who send sonar into the ground and need to process the data, but that has been running on CUDA for years and years now.
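For readers unfamiliar with the workload, a 2D convolution of the kind used for image filters can be sketched in plain Python. This is a deliberately naive illustration of the technique, not the benchmark's actual code; `convolve2d` is a hypothetical helper that computes only the "valid" region (no padding).

```python
def convolve2d(image: list[list[float]], kernel: list[list[float]]) -> list[list[float]]:
    """Naive 2D convolution over the region where the kernel fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    # True convolution flips the kernel in both axes (unlike cross-correlation).
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * flipped[j][i]
            row.append(acc)
        out.append(row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
box_sum = [[1, 1, 1],
           [1, 1, 1],
           [1, 1, 1]]

print(convolve2d(image, identity))  # [[5.0]]
print(convolve2d(image, box_sum))   # [[45.0]]
```

Every output pixel is an independent multiply-accumulate over a small window, which is why this kind of work maps so naturally onto DSPs, GPUs, and wide SIMD units.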
GraysonZimmer
Bruh, I'm specifically looking for a CPU I can do my work on and play games. I want to know my FPS. I know what the performance will be for my actual workload, but I really don't know what it will be in gaming. I want to play VR for 6 hours and then turn around and do Unity commissions (lots of compression and decompression), run 3x GPUs for 3D rendering, etc... so yeah, I'll jump to the gaming benchmark. I know there are more of us out there. Someone who wants all the PCIe lanes and good workhorse performance, but also plays games. If all our money goes into one PC, why not want to do both?
Ramune5654
About to attend college soon for Computer Software/Hardware Engineering for my master's. I can afford, and am actually considering, upgrading my current desktop platform for Black Friday. I will be looking at snagging one of these bad boys, the Gigabyte TRX50 AI TOP board, an 8-stick 256 GB ECC RDIMM memory kit, and slapping an AMD Instinct MI210 above my 7900XTX (the prices look a lot more reasonable). ROCm and Mesa here on openSUSE have been a lot of fun, and I'm oddly curious to see if some of my Steam strategy games could possibly scale close to what I will have available lol.
JoeStuffzAlt
CPUs with a lot of cores also tend to have a lower clock, though for a 64-core CPU, 3.2 GHz is amazing. For example, the 24-core Threadripper has a higher clock. The 7960X was sometimes faster than the 7980X, most likely because of that per-core performance. The 7960X got destroyed in multi-core, though. Because of the high clocks for Threadripper, it's not a huge leap. Intel seems to take a significant hit when the core count gets high, down to around 2 GHz (haven't checked Xeons in a while).
Berserker464
As someone who has been daily driving a Threadripper for about 5 years now; re: jumping to gaming charts.
I'm not concerned with its performance in multi core applications. I know it will demolish that workload and will be an upgrade to my older model easily (I do have multiple workloads that count into this). The thing that interests me in this review is if I should upgrade for my gaming workload. That's the unknown variable to me prior to watching, since I use the CPU for work and gaming.
nickfigert5894
I know it’s probably tough to get everything together, but seeing Zen 3/2 comparisons would be (personally) useful. I’m probably going to pull the trigger on this generation, but boy would I enjoy seeing what my personal boosts would hypothetically be.
I know this is a gamers page and y’all have bills to pay, so it’s probably quite a reach for me to ask that you keep relevant, up-to-date benchmarks on such expensive semi-enterprise silicon.
john39er
I'm at a loss as to how the 5950 is scoring almost the same as the 5900. Results may vary, maybe I won the lottery, but mine turbos more than expected of the 5900X. I typically get about 4.5 all-core. But with 4 more cores and a few MB more cache, you'd expect it to be at minimum 25% better, closer to 33%. Instead it's tied or a point above. I don't understand the science testing; maybe it ties up a few cores with the same task. Still, I find it interesting.
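One possible explanation for the near tie is Amdahl's law: extra cores only help the portion of a workload that actually runs in parallel. The sketch below compares 12 and 16 cores at equal clocks; the parallel fractions are illustrative values, not measurements from the review, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_gain(parallel_fraction: float, cores_a: int, cores_b: int) -> float:
    """How much faster `cores_b` cores are than `cores_a` cores at equal clocks."""
    return amdahl_speedup(parallel_fraction, cores_b) / amdahl_speedup(parallel_fraction, cores_a)


# Perfectly parallel work: 16 cores beat 12 cores by the full core ratio.
print(f"{relative_gain(1.0, 12, 16):.2f}")  # 1.33

# If only half the work is parallel, the 4 extra cores add about 2%.
print(f"{relative_gain(0.5, 12, 16):.2f}")  # 1.02
```

The 33% figure in the comment is therefore a ceiling that is only reached by fully parallel workloads, before clock speed or memory bandwidth differences are considered.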
gamersnexus
So, here’s an interesting test for real-time convolution processing, such as in DAWs, testing single-core performance:
Have a project with one audio track, and load the track up with convolution reverbs (or anything else convolution-based, such as distortion or filters, as long as it’s standardized) to the point of getting buffer dropouts, stuttering, etc.
It would be interesting to see if there is a 2x generational improvement in such scenarios.
james2042
Honestly, all 3 CPUs have their own respective niche, and there's no clear "just buy this over that". The 64-core is so massively bigger than the other two that it boils down to whether more cores = more better; if so, then yes. As for the 32- and 24-core, given how similar they are, it's a question of whether you can take advantage of the extra 8 cores or just need the PCIe lanes and memory capacity.
nowherebrain
Don't you do it!
Maybe you should know your audience and focus on hardware and reviews that revolve around that....instead of wasting resources.
I value the gaming benchmarks as a developer. I rarely, if ever, have time to play games....but hardware is important to me. Especially in terms of what I can budget for graphically...although I tend to try to make games that would run on a potato.
syncmonism
The 285K is still losing in multiple game benchmarks to the 5800X3D. I thought there were some Windows updates or game patches which fixed this, but maybe the performance improvements were smaller than I thought. That is really embarrassing.
The 5800X3D is built on a manufacturing process that's two FULL process node generations behind the Intel 285K! O_O (TSMC 7nm vs. TSMC 3nm)
jimtekkit
I discussed the idea of Threadripper workstations for FEA at my workplace but they require a massive desktop tower. Ideally for engineering you want a high-end CAD laptop that can be carried around to the workshop and meetings, which is still an Intel-only market (despite falling on their face in recent years). Of course 64 cores for FEA would still be just ridiculously excessive.
aBoogivogi
If anyone from AMD is reading chat: give me a regular Ryzen CPU with twice as many PCIe lanes (for actual expansion cards and not more M.2 slots) and I will gladly pay a $100-200 premium for it. This monster being the only way to get more PCIe lanes is a major letdown for modern consumers, who have lacked an option like what I'm describing since the golden era of LGA 2011.
empireempire3545
Biochemist/Bioinformatician here - the LAMMPS result is to be expected, since those codes are rarely handcrafted with AVX, and the compilers are usually too dumb to do that automatically. HOWEVER, most MD simulations are nowadays run on NVIDIA GPUs, since CUDA makes them ORDERS of magnitude better at this job. And for 5K USD I can get what, 2 high-end NVIDIA GPUs?
LordHypnosis
I am absolutely not in the market for Threadripper, but it's always an exciting time when a new batch of CPUs and/or a new platform comes out. I remember the silly things, like people running Crysis entirely in software mode on the first generation of Threadripper, but of course it's interesting to see how it stacks up against more gaming or general-purpose CPU/GPU combos.