Seeing that, as the article notes, there are no 4GB modules, they'll need to use twice as many chips, which could mean doubling the bus width (one can dream) to 512-bit (à la the 5090), which would make it very tasty. It would be a bold move and would earn them some of the market share they so badly need.
Clamshell design. RX 7900XTX 24GB | PRO W7900 48GB
Same GPU, same 384-bit bus.
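To make the tradeoff concrete, here's a back-of-envelope sketch (assuming 32-bit-wide GDDR6 chips and 2GB as the largest shipping density; the numbers are illustrative, not from the article):

```python
# Capacity / bus-width arithmetic for the two routes to 32GB.
# Assumes 32-bit I/O per GDDR6 chip and 2GB (16Gb) max density;
# these are illustrative assumptions, not leaked specs.

CHIP_IO_BITS = 32  # interface width per GDDR6 chip
CHIP_GB = 2        # largest GDDR6 density in production

def memory_config(bus_bits: int, clamshell: bool) -> tuple[int, int]:
    """Return (chip_count, capacity_GB) for a bus width and mounting mode."""
    channels = bus_bits // CHIP_IO_BITS
    chips = channels * (2 if clamshell else 1)  # clamshell pairs two chips per channel
    return chips, chips * CHIP_GB

print(memory_config(256, clamshell=True))   # (16, 32): same bus, capacity doubles
print(memory_config(512, clamshell=False))  # (16, 32): same chip count, bandwidth doubles
print(memory_config(384, clamshell=False))  # (12, 24): the 7900XTX
print(memory_config(384, clamshell=True))   # (24, 48): the W7900
```

Either way you end up soldering 16 chips; clamshell just does it without a wider bus, and without the extra bandwidth.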
Yeah, the article says as much. If it's not vaporware, this'll most likely have a 256-bit bus, which would be a damn shame for inference speed. I'm just saying that if they doubled the bus and sold it for ≤ $1000, they'd eat the 5090 alive, generate a lot of goodwill in the influential local LLM community, and probably get a lot of free ROCm development. It'd be a damn smart move, but how often can you accuse AMD of that?
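To put rough numbers on the inference hit: batch-1 decoding is memory-bandwidth-bound, so tokens/s tops out around bandwidth divided by the bytes read per token (roughly the model's footprint in VRAM). A quick sketch; the 20 Gbps GDDR6 speed and 20GB model size are my own placeholder assumptions:

```python
# Bandwidth-bound ceiling on single-stream decode speed:
#   tokens/s <= memory_bandwidth / bytes_touched_per_token (~ model size).
# 20 Gbps per pin and a 20GB model are placeholder assumptions, not specs.

GBPS_PER_PIN = 20  # assumed GDDR6 data rate; actual clocks unknown

def bandwidth_gb_s(bus_bits: int) -> float:
    return bus_bits * GBPS_PER_PIN / 8  # Gbit/s across the bus -> GB/s

def decode_ceiling_tok_s(bus_bits: int, model_gb: float) -> float:
    return bandwidth_gb_s(bus_bits) / model_gb

MODEL_GB = 20  # e.g. a ~32B model at ~5 bits per weight (hypothetical)

print(decode_ceiling_tok_s(256, MODEL_GB))  # 640 GB/s  -> ~32 tok/s ceiling
print(decode_ceiling_tok_s(512, MODEL_GB))  # 1280 GB/s -> ~64 tok/s ceiling
```

Real throughput lands below the ceiling (compute, KV cache, scheduling overhead), but the ratio between the two bus widths holds.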
I misread that part and thought you were implying a bus width increase was necessary.
For a 512-bit memory bus, AMD would either have to use 1+8 dies (one GCD plus eight MCDs) if they follow the 7900XTX scheme, or build a monolithic behemoth like GB202. The former would have higher power draw but lower manufacturing costs, while the latter is more power efficient but more prone to defects, as it's getting close to the reticle size limit.
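For intuition on that defect tradeoff, a toy Poisson yield model (yield ≈ e^(−D·A), with D the defect density and A the die area) shows how quickly big dies fall off; the defect density and die areas below are illustrative guesses, not foundry data:

```python
# Toy Poisson yield model: P(die has zero defects) = exp(-D * A).
# D and the die areas are illustrative guesses, not foundry numbers.
from math import exp

D = 0.1  # assumed defects per cm^2

def die_yield(area_cm2: float) -> float:
    return exp(-D * area_cm2)

# Monolithic, near-reticle die (~750 mm^2, GB202-class):
print(die_yield(7.5))  # ~0.47

# Chiplet split: one ~300 mm^2 GCD plus eight ~40 mm^2 MCDs.
# Each die is tested before packaging, so a defective 40 mm^2 MCD
# wastes a sliver of wafer instead of a whole 750 mm^2 die.
print(die_yield(3.0))  # ~0.74 per GCD
print(die_yield(0.4))  # ~0.96 per MCD
```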
I'd guess Nvidia will soon have to switch to chiplet-based GPUs. Maybe AMD stopped (for now?) because not their whole product stack used chiplet-based designs, so they had far less flexibility with allocation and binning than they do with Ryzen chiplets.
Has monolithic vs. chiplet been confirmed for the 9070? A narrow bus width on a much smaller node (compared to the previous I/O dies) would mean a lot in terms of the die area left over for the stream processors.