LocalLLaMA

Community for discussing LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago

Beginner questions thread (sh.itjust.works)

submitted 1 year ago by noneabove1182@sh.itjust.works to c/localllama@sh.itjust.works

Trying something new, going to pin this thread as a place for beginners to ask what may or may not be stupid questions, to encourage both the asking and answering.

Depending on activity level, I'll either make a new one once in a while or just leave this one up forever as a place to learn and ask.

When asking a question, try to make it clear what your current knowledge level is and where you may have gaps. That should help people provide more useful, concise answers!

[–] hendrik@palaver.p3x.de 2 points 1 day ago* (last edited 1 day ago) (1 children)

From what I know, I assume yes, the relation between model size and speed should be roughly linear: a model twice as large should take about twice as long per token. Maybe there is some small additional overhead making it a bit faster or slower than expected. But I'm really not an expert on the maths, so don't trust me.
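
To make the linear relation concrete, here is a rough back-of-the-envelope sketch in Python. It assumes token generation is limited by memory bandwidth, so every generated token has to read all model weights once. That is a common rule of thumb, not something measured in this thread, and the function name, model sizes, and the 50 GB/s bandwidth figure are all made up for illustration.

```python
def estimate_tokens_per_second(model_size_gb: float, bandwidth_gb_s: float) -> float:
    """Rough upper bound on generation speed.

    Assumption (not from the thread): generation is memory-bandwidth bound,
    so each token requires reading the full set of weights once.
    """
    if model_size_gb <= 0 or bandwidth_gb_s <= 0:
        raise ValueError("model size and bandwidth must be positive")
    return bandwidth_gb_s / model_size_gb


# Hypothetical memory bandwidth, for illustration only.
BANDWIDTH_GB_S = 50.0

for size_gb in (4.0, 8.0, 16.0):
    speed = estimate_tokens_per_second(size_gb, BANDWIDTH_GB_S)
    print(f"{size_gb:5.1f} GB model: about {speed:.2f} tokens/s at best")
```

Under this assumption, doubling the model size halves the estimated tokens per second. Real numbers will be lower because of compute, prompt processing, and other overhead.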

And maybe have a look at this bug report: https://github.com/ggml-org/llama.cpp/issues/11332
I think it matches your situation. They resolved it by adjusting the batch size, and someone recommends not using Vulkan on an iGPU.

[–] corvus@lemmy.ml 1 point 1 day ago

Oh great, thanks