this post was submitted on 10 Jan 2025
12 points (92.9% liked)

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.


Do I need industry-grade GPUs, or can I scrape by getting decent tokens per second (tps) with a consumer-level GPU?

[–] muntedcrocodile@lemm.ee 1 points 1 month ago (2 children)

Huh, so basically sidestepping the GPU issue entirely and essentially just using some other special piece of silicon with fast (but conventional) RAM. I still don't understand why you can't distribute a large LLM over many different processors, each holding a section of the parameters in memory.

[–] tpWinthropeIII@lemmy.world 3 points 1 month ago

Not exactly. DIGITS still uses a Blackwell GPU; it just uses unified RAM as virtual VRAM instead of dedicated VRAM. The GPU is probably a downclocked Blackwell. Speculation I've seen is that these are defective, repurposed Blackwells, which is good for us. By defective I mean they can't run at full speed, are projected to have the die-cracking problem, etc.
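
As a rough sketch of what that unified-memory setup means for the original tps question: single-device decoding tends to be memory-bandwidth-bound, so tokens/s is roughly memory bandwidth divided by the bytes of weights read per token. The bandwidth figures below are illustrative assumptions (the DIGITS number is speculation, not a confirmed spec):

```python
# Back-of-envelope: decoding is usually memory-bandwidth-bound, i.e.
# each generated token requires roughly one full read of the weights.

def est_tps(params_billions: float, bytes_per_param: float, bandwidth_gb_s: float) -> float:
    """Estimated tokens/s = memory bandwidth / bytes read per token."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Illustrative bandwidth figures (assumptions, not official specs):
for name, bw in [("unified LPDDR5X (DIGITS-class, speculated)", 273),
                 ("GDDR6X (RTX 4090-class)", 1008)]:
    tps = est_tps(70, 0.5, bw)  # 70B parameters at 4-bit (~0.5 bytes/param)
    print(f"{name}: ~{tps:.0f} tok/s for a 70B 4-bit model")
```

By that estimate a unified-memory box lands in the high single digits of tok/s on a 70B model, versus roughly 3-4x that for high-end GDDR, assuming the whole model fits in VRAM at all.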

[–] breakingcups@lemmy.world 2 points 1 month ago

> I still don't understand why you can't distribute a large LLM over many different processors, each holding a section of the parameters in memory.

Because each layer's output feeds into the next layer's computation, the devices holding different shards of the model have to exchange activations (or partial results) for every layer of every generated token. That makes the bandwidth and latency requirements enormous, and regular networking solutions are insufficient for that.
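
A hedged back-of-envelope makes this concrete. The dimensions below are illustrative (loosely in the range of a 70B-class transformer, not taken from any specific spec), and the all-reduce cost is simplified:

```python
# Rough per-token communication when a model is sharded with tensor
# parallelism: each transformer layer needs ~2 activation all-reduces
# across all shards (after attention and after the MLP).

hidden_dim = 8192          # illustrative, 70B-class
n_layers = 80
bytes_per_act = 2          # fp16 activations
allreduces_per_layer = 2

per_token_bytes = n_layers * allreduces_per_layer * hidden_dim * bytes_per_act
print(f"activation traffic per token: ~{per_token_bytes / 1e6:.1f} MB")  # ~2.6 MB

# The killer is latency, not just volume: the transfers are sequential.
# ~160 serial round trips per token cap throughput before any compute.
round_trips = n_layers * allreduces_per_layer
for rtt_ms in (0.01, 0.2, 1.0):  # NVLink-class vs. LAN vs. slow LAN
    cap = 1000 / (round_trips * rtt_ms)
    print(f"RTT {rtt_ms} ms -> latency-capped at ~{cap:.0f} tok/s")
```

So even though ~2.6 MB per token is modest in volume, at 1 ms of network round-trip latency you're capped at roughly 6 tok/s before any computation happens, which is why multi-GPU rigs rely on NVLink/PCIe-class interconnects rather than Ethernet.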