this post was submitted on 30 Jan 2025
529 points (99.4% liked)



[–] brucethemoose@lemmy.world 1 points 6 days ago* (last edited 6 days ago)

Yes! Try this model: https://huggingface.co/arcee-ai/Virtuoso-Small-v2

Or the 14B thinking model: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

But for speed and coherence, instead of ollama I'd recommend running it through Aphrodite or TabbyAPI as a backend, depending on whether you prioritize speed or long inputs. Both expose a generic OpenAI-compatible API endpoint.
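Since both backends speak the OpenAI API, any standard client can talk to them once the server is up. A minimal sketch of building a chat request, assuming a hypothetical local endpoint on port 8000 and Virtuoso-Small-v2 as the loaded model (adjust both to your setup):

```python
import json

# Hypothetical local base URL -- Aphrodite and TabbyAPI each let you pick
# the host/port, so point this at wherever your backend is listening.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt, model="arcee-ai/Virtuoso-Small-v2", max_tokens=256):
    """Build the JSON body for a POST to {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_chat_request("Summarize quantization in one sentence.")
print(json.dumps(body, indent=2))
# Send this body to f"{BASE_URL}/chat/completions" with urllib, requests,
# or the official openai SDK (set base_url=BASE_URL, any api_key string).
```

The point of the OpenAI-compatible endpoint is exactly this: frontends and scripts written for OpenAI work unchanged against the local backend.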

I'll even step you through it and upload a quantization for your card if you want, since it looks like there isn't a good-sized exl2 on huggingface.
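For picking a quant size, a rough back-of-the-envelope check helps: exl2 lets you choose a fractional bits-per-weight, and the weight footprint is roughly parameters times bpw divided by 8, leaving headroom for the KV cache. A sketch of that heuristic (the formula is a common approximation, not an exact exl2 figure):

```python
def exl2_size_gb(params_b: float, bpw: float) -> float:
    """Approximate weight footprint in GB for an exl2 quant.

    params_b: model size in billions of parameters
    bpw: chosen bits per weight (exl2 supports fractional values)
    Excludes KV cache and activation overhead, so leave VRAM headroom.
    """
    return params_b * bpw / 8  # 1B params at 8 bpw ~= 1 GB

# e.g. the 14B distill at 4.5 bpw:
print(f"{exl2_size_gb(14, 4.5):.1f} GB")  # ~7.9 GB of weights
```

So a 14B model at around 4.5 bpw fits comfortably on a 12 GB card with cache to spare, which is the kind of sizing you'd target when cutting a quant for a specific GPU.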