this post was submitted on 05 Feb 2025
447 points (97.7% liked)
196
Be sure to follow the rule before you head out.
Rule: You must post before you leave.
founded 2 years ago
It's really bad for the environment, and it's also trained on stuff it shouldn't be, such as copyrighted material.
the training process being shiddy i completely agree with. that is simply awful and takes a shidload of resources to get a good model.
but... running them... feels oki to me.
as long as you're not running some bigphucker model like GPT-4o to do something a smoler model could also do, i feel it kinda is okay.
32B parameter size models are getting really, really good, so the inference (running) costs and energy consumption are already going down dramatically when not using the big models provided by BigEvilCo™.
Models can clearly be used for cool stuff. Classifying texts is the obvious example. Having humans go through that is insane and cost-ineffective. Meanwhile a 14B parameter (8GB) model can classify multiple pages of text in half a second.
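for anyone wondering what "classifying texts with a small model" looks like in practice, here is a rough sketch. Everything in it is an assumption for illustration: `build_prompt`, `classify` and the labels are made-up names, and `fake_generate` is only a keyword-based stand-in so the snippet runs without any model. In real use you would swap it for a function that sends the prompt to whatever local model you run and returns its reply.

```python
from typing import Callable, Sequence


def build_prompt(text: str, labels: Sequence[str]) -> str:
    """Build a plain instruction prompt asking for exactly one label."""
    options = ", ".join(labels)
    return (
        f"Classify the text into exactly one of these labels: {options}.\n"
        "Answer with the label only.\n\n"
        f"Text:\n{text}\n\nLabel:"
    )


def classify(
    text: str,
    labels: Sequence[str],
    generate: Callable[[str], str],
    fallback: str = "unknown",
) -> str:
    """Ask the model for a label and map its reply back onto the label list."""
    reply = generate(build_prompt(text, labels)).strip().lower()
    for label in labels:
        if label.lower() in reply:
            return label
    # The model answered with something outside the label list.
    return fallback


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a local model call (keyword rule, not an LLM)."""
    text = prompt.split("Text:\n", 1)[1]
    return "Complaint" if "refund" in text.lower() else "Praise"


if __name__ == "__main__":
    labels = ["complaint", "praise"]
    print(classify("I want a refund for this order.", labels, fake_generate))
    print(classify("Great service, thanks!", labels, fake_generate))
```

the point of passing `generate` in as a parameter is that the same code works whether the model behind it is 3B or 32B, so you can try the small one first and only move up if the labels come out wrong.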
obviously using bigphucker models for everything is bad. optimizing tasks to work on small models, even at 3B sizes, is just more cost-effective, so i think the general vibe will go towards that direction.
people running their models locally to do some stuff will make companies realize they don't need to pay 15€ per 1.000.000 tokens to OpenAI for their o1 model for everything. they will realize that paying like 50 cents for smaller models works just fine.
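to put rough numbers on that, here is a small sketch of the arithmetic. The 15€ per 1.000.000 tokens figure is the one quoted above; reading "like 50 cents" as 0,50€ per 1.000.000 tokens is my assumption, and the monthly token volume is a made-up example, not real usage data.

```python
def cost_eur(tokens: int, eur_per_million: float) -> float:
    """Cost in euros for a given number of tokens at a per-million-token price."""
    return tokens / 1_000_000 * eur_per_million


BIG_MODEL_PRICE = 15.0    # EUR per 1,000,000 tokens, as quoted in the comment
SMALL_MODEL_PRICE = 0.50  # EUR per 1,000,000 tokens (assumed reading of "50 cents")

if __name__ == "__main__":
    monthly_tokens = 20_000_000  # hypothetical workload
    big = cost_eur(monthly_tokens, BIG_MODEL_PRICE)
    small = cost_eur(monthly_tokens, SMALL_MODEL_PRICE)
    print(f"big model:   {big:.2f} EUR")
    print(f"small model: {small:.2f} EUR")
    print(f"ratio:       {big / small:.0f}x")
```

under those assumptions the price gap is a fixed factor of 30 no matter the volume, so the only real question is whether the small model is good enough for the task.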
if i didn't understand ur point, please point it out. i'm not that good at picking up on stuff..