Technology

59692 readers

3932 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each another!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, to ask if your bot can be added please contact us.
Check for duplicates before posting, duplicates may be removed

Approved Bots

founded 1 year ago

MODERATORS

177

ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate | It was bad at recognizing relationships and needs selective training, researchers say. (arstechnica.com)

submitted 11 months ago by L4s@lemmy.world to c/technology@lemmy.world

24 comments fedilink hide all child comments

ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate | It was bad at recognizing relationships and needs selective training, researchers say.::It was bad at recognizing relationships and needs selective training, researchers say.

you are viewing a single comment's thread
view the rest of the comments

[–] autotldr@lemmings.world 1 points 11 months ago

This is the best summary I could come up with:

While the chatty AI bot has previously underwhelmed with its attempts to diagnose challenging medical cases—with an accuracy rate of 39 percent in an analysis last year—a study out this week in JAMA Pediatrics suggests the fourth version of the large language model is especially bad with kids.

The medical field has generally been an early adopter of AI-powered technologies, resulting in some notable failures, such as creating algorithmic racial bias, as well as successes, such as automating administrative tasks and helping to interpret chest scans and retinal images.

But AI's potential for problem-solving has raised considerable interest in developing it into a helpful tool for complex diagnostics—no eccentric, prickly, pill-popping medical genius required.

For ChatGPT's test, the researchers pasted the relevant text of the medical cases into the prompt, and then two qualified physician-researchers scored the AI-generated answers as correct, incorrect, or "did not fully capture the diagnosis."

Though the chatbot struggled in this test, the researchers suggest it could improve by being specifically and selectively trained on accurate and trustworthy medical literature—not stuff on the Internet, which can include inaccurate information and misinformation.

"This presents an opportunity for researchers to investigate if specific medical data training and tuning can improve the diagnostic accuracy of LLM-based chatbots," the authors conclude.

The original article contains 721 words, the summary contains 211 words. Saved 71%. I'm a bot and I'm open source!