TechTakes

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago

MODERATORS

dgerard@awful.systems

Facebook Pushes Its Llama 4 AI Model to the Right, Wants to Present “Both Sides” [404 Media] (www.404media.co)

submitted 1 month ago by BlueMonday1984@awful.systems to c/techtakes@awful.systems

[–] will_a113@lemmy.ml -2 points 1 month ago (2 children)

Do you have any sources on this? I started looking around for pre-training, training and post-training impact of new input but didn't find what I was looking for. In just my own experience with retraining (e.g. fine-tuning) pre-trained models, it seems to be pretty easy to add or remove data to get significantly different results than the original model.

[–] sailor_sega_saturn@awful.systems 6 points 1 month ago

My go-to source for the fact that LLM chatbots suck at writing reasoned replies is https://chatgpt.com/

[–] corbin@awful.systems 5 points 1 month ago

It's well-known folklore that reinforcement learning with human feedback (RLHF), the standard post-training paradigm, reduces "alignment," the degree to which a pre-trained model has learned features of reality as it actually exists. Quoting from the abstract of the 2024 paper, Mitigating the Alignment Tax of RLHF:

LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting pretrained abilities, which is also known as the alignment tax.