this post was submitted on 04 Jun 2025

327 points (97.7% liked)

Wikimedia Foundation's plans to introduce AI-generated article summaries to Wikipedia (lemmy.dbzer0.com)

submitted 1 week ago* (last edited 1 day ago) by antonim@lemmy.dbzer0.com to c/technology@lemmy.world

I don't know if this is an acceptable format for a submission here, but here it goes anyway:

Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries

We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.

Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.

In our previous research (Content Simplification), we have identified two needs:

The need for readers to quickly get an overview of a given article or page

The need for this overview to be written in language the reader can understand

Etc., you should check the full text yourself. There's a brief video showing how it might look: https://www.youtube.com/watch?v=DC8JB7q7SZc

This hasn't been met with warm reactions. The comments on the respective talk page have questioned the purpose of the tool (shouldn't the introductory paragraphs do the same job already?), and other complaints have been raised as well:

Taking a quote from the page for the usability study:

"Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level."

Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they 'use AI for everything'. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don't think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)

The survey the user mentions is this one: https://wikimedia.qualtrics.com/jfe/form/SV_1XiNLmcNJxPeMqq and true enough it pretty much takes for granted that the summaries will be added, there's no judgment of their actual quality, and they're only asking for people's feedback on how they should be presented. I filled it out and couldn't even find the space to say that e.g. the summary they show is written almost insultingly, like it's meant for particularly dumb children, and I couldn't even tell whether it is accurate because they just scroll around in the video.

Very extensive discussion is going on at the Village Pump (en.wiki).

The comments are also overwhelmingly negative, some of them pointing out that the summary doesn't summarise the article properly ("Perhaps the AI is hallucinating, or perhaps it's drawing from other sources like any widespread llm. What it definitely doesn't seem to be doing is taking existing article text and simplifying it." - user CMD). A few comments acknowledge potential benefits of the summaries, though with a significantly different approach to using them:

I'm glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it "summarises". Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)

One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)

Finally, some comments are problematising the whole situation with WMF working behind the actual wikis' backs:

This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed "early and often" of new developments. We shouldn't be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others') statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.

Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that's an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)

Again, I recommend reading the whole discussion yourself.

EDIT: WMF has announced they're putting this on hold after the negative reaction from the editors' community. ("we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together")

[–] kittenzrulz123@lemmy.blahaj.zone 19 points 1 week ago* (last edited 1 week ago)

Hell nah, I am never donating to Wikipedia if they go AI.

[–] deathbird@mander.xyz 18 points 1 week ago

This is not the medicine for curing what ails Wikipedia, but when all anyone is selling is a hammer....

[–] bitwolf@sh.itjust.works 18 points 1 week ago

Guess they're going to double down on the donation campaign considering the cost involved with ai

[–] Vanilla_PuddinFudge@infosec.pub 18 points 1 week ago (1 children)

If you can't make people smarter, make text dumber.

[–] sbv@sh.itjust.works 13 points 1 week ago (7 children)

There's a core problem that many Wikipedia articles are hard for a layperson to read and understand. The statement about reading level is one way to express this.

The Simple version of articles shows humans can produce readable text. But there aren't enough Simple articles, and the Simple articles are often incomplete.

I don't think AI should be solely trusted with summarization/translation, but it might have a place in the editing cycle.

[–] Deflated0ne@lemmy.world 13 points 1 week ago

[–] pinball_wizard@lemmy.zip 12 points 1 week ago

The big issue I see here isn't the proposed solution, it's the public image of doing something the tech bro billionaires are pushing hard right now.

It looks a bit like choosing the other side of the class war from their contributors.

Wikipedia, in particular, may not be able to afford that negative image right now.

I could welcome this kind of tool later, but their timing sucks.

[–] BrianTheeBiscuiteer@lemmy.world 11 points 1 week ago

I do have concerns about this but it's really all about the usage, not the AI itself. Would the AI version be the only version allowed? Would the summaries get created on the fly for every visitor? Would edits to an AI summary be allowed? Would this get applied to and alter existing summaries?

I'm totally fine with LLMs and AI as a stop-gap for missing info or a way to coach or critique a human-written summary, but generally I haven't seen good results when AI is allowed to do its thing without a human reviewing or guiding the outputs.

[–] miguel@fedia.io 10 points 1 week ago

Well, this inspired me to swing my monthly wikipedia donation over to a world book sub instead. It's bad enough that wikipedia was a very dubious source of info, but now this is just too much.

[–] tfm@europe.pub 8 points 1 week ago

Thanks, I hate it.

[–] wpb@lemmy.world 7 points 1 week ago (6 children)

It's kind of indirectly related, but adding the query parameter udm=14 to the URL of your Google searches removes the AI summary at the top, and there are plugins for Firefox that do this for you. My hope for this WM project is that similar plugins will be possible for Wikipedia.
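For anyone who'd rather script it than install a plugin, here is a minimal sketch of the URL rewrite described above. It only uses Python's standard library; the function name `with_udm14` is made up for illustration, and the effect of udm=14 on the results page is taken from the comment above, not something the code can guarantee.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def with_udm14(search_url: str) -> str:
    """Return search_url with the udm=14 query parameter set (hypothetical helper)."""
    parts = urlparse(search_url)
    # Keep the existing query parameters, dropping any earlier udm value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "udm"
    ]
    params.append(("udm", "14"))
    # Rebuild the URL with the updated query string.
    return urlunparse(parts._replace(query=urlencode(params)))


print(with_udm14("https://www.google.com/search?q=wikipedia+simple+summaries"))
```

A browser plugin would presumably do the same rewrite on every search request; this is just the string manipulation part.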

The annoying thing about these summaries is that even for someone who cares about the truth, and gathering actual information, rather than the fancy autocomplete word salad that LLMs generate, it is easy to "fall for it" and end up reading the LLM summary. Usually I catch myself, but I often end up wasting some time reading the summary. Recently the non-information was so egregiously wrong (it called a certain city in Israel non-apartheid), that I ended up installing the udm 14 plugin.

In general, I think the only use cases for fancy autocomplete are where you have a way to verify the answer. For example, if you need to write an email and can't quite find the words, if an LLM generates something, you will be able to tell whether it conveys what you're trying to say by reading it. Or in case of writing code, if you've written a bunch of tests beforehand expressing what the code needs to do, you can run those on the code the LLM generates and see if it works (if there's a Dijkstra quote that comes to your mind reading this: high five, I'm thinking the same thing).

I think it can be argued that Wikipedia articles satisfy this criterion. All you need to do to verify the summary is read the article. Will people do this? I can only speak for myself, and I know that, despite my best intentions, sometimes I won't. If that's anything to go by, I think these summaries will make the world a worse place.

[–] UberKitten@lemmy.blahaj.zone 6 points 1 week ago (2 children)

sounds like a good use case for an LLM. hope the issues get figured out

[–] RandomVideos@programming.dev 5 points 1 week ago

It would be a good use case for an LLM if it didn't make up false information

[–] notabot@lemm.ee 4 points 1 week ago

It might, possibly, be a viable use case if the LLM produced the summary for an editor, who then confirmed its veracity and appropriateness to the article and posted it themselves.

[–] drmoose@lemmy.world 6 points 1 week ago* (last edited 1 week ago) (7 children)

AI threads on lemmy are always such a disappointment.

It's ironic that people put so little thought into understanding this and complain about "ai slop". The slop was in your heads all along.

To think that more accessibility for a project that is all about sharing information with people to whom information is least accessible is a bad thing is just an incredible lack of awareness.

It's literally the opposite of everything people might hate AI for:

RAG is very good and accurate these days that doesn't invent stuff. Especially for short content like wiki articles. I work with RAG almost every day and have never seen it hallucinate with big models.
it's open and not run by "big scary tech"
it's free for all and would save millions of editor hours and allow more accuracy and complexity in the articles themselves.

And to top it all off, you know this is a lost fight even if you're right, so instead of contributing to steering this societal ship, these people cover their ears and go "bla bla bla we don't want it". It's so disappointingly irresponsible.

[–] Don_alForno@feddit.org 15 points 1 week ago

I'll make a note to get back to you about this in a few years when they start blocking people from correcting AI authored articles.

[–] qevlarr@lemmy.world 11 points 1 week ago

The point is they should be fighting AI, not opening the door even an inch to AI on their site. Like so many other endeavors, it only works because the contributors are human. Not corpos, not AI, not marketing. AI kills Wikipedia if they let that slip. Look at StackOverflow, look at Reddit, look at Google search, look at much of corporate social media. Dead internet theory is all around us.

Wikipedia is trusted because it's all human. No other reason

[–] antonim@lemmy.dbzer0.com 8 points 1 week ago* (last edited 1 week ago)

RAG is very good and accurate these days that doesn’t invent stuff.

In the OP I linked a comment showing how the summary presented in the showcase video is not actually very accurate and it definitely does invent some elements that are not present in the article that is being summarised.

And in general the "accessibility" that primarily seems to work by expressing things in imprecise, unscientific or emotionally charged terms could well be more harmful than less immediately accessible but accurate and unambiguous content. You appeal to Wikipedia being "a project that is all about sharing information with people to whom information is least accessible", but I don't think this ever was that much of a goal - otherwise the editors would have always worked harder on keeping the articles easily accessible and comprehensible to laymen (in fact I'd say traditional encyclopedias are typically superior to Wikipedia in this regard).

and would save millions of editor hours and allow more accuracy and complexity in the articles themselves.

Sorry but you're making things up here, not even the developers of the summaries are promising such massive consequences. The summaries weren't meant to replace any of the usual editing work, they weren't meant to replace the normal introductory paragraphs or anything else. How would they save these supposed "millions of editor hours" then? In fact, they themselves would have to be managed by the editors as well, so all I see is a bit of additional work.

[–] phantomwise@lemmy.ml 6 points 1 week ago (1 children)

I don't think the idea itself is awful, but everyone is so fed up with AI bullshit that any attempt to integrate even an iota of it will be received very poorly, so I'm not sure it's worth it.

[–] qevlarr@lemmy.world 6 points 1 week ago

🪦🪦🪦🪦

RIP Wikipedia, we will miss you

[–] Redex68@lemmy.world 6 points 1 week ago* (last edited 1 week ago) (1 children)

Honestly, I think it's a good idea. As long as it's clearly highlighted that "this is an AI generated summary", it could be very useful. I feel like a lot of people here have never tried to e.g. read a maths article without having a PhD in mathematics. I would often find myself trying to remember what a term means or how it works in practice, only to be met by a giant article going into extreme technical detail that I for the life of me cannot understand, but if I were to ask ChatGPT to explain it I would immediately get it.

[–] JandroDelSol@lemmy.world 11 points 1 week ago

People will believe the AI summary without reading the article, and AI hallucinates constantly. Never trust an output from an LLM.

[–] vrighter@discuss.tchncs.de 5 points 1 week ago

the summary (not necessarily ai generated) I read elsewhere is what got me to wikipedia in the first place.

[–] jsomae@lemmy.ml 4 points 1 week ago

ok, just so long as the articles themselves aren't AI generated.
