this post was submitted on 12 May 2024

Futurology

[–] Endward23@futurology.today 2 points 6 months ago (1 children)

"But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI's training task. Deception helps them achieve their goals."

Sounds like something I would expect from an evolved system. If deception is the best way to win, it is not irrational for a system to choose it as a strategy.

In one study, AI organisms in a digital simulator "played dead" in order to trick a test built to eliminate AI systems that rapidly replicate.

Interesting. Can somebody tell me which study this refers to?

As far as I understand, Park et al. did some kind of meta-study as an overview of the literature.

[–] Endward23@futurology.today 3 points 6 months ago

"Indeed, we have already observed an AI system deceiving its evaluation. One study of simulated evolution measured the replication rate of AI agents in a test environment, and eliminated any AI variants that reproduced too quickly.10 Rather than learning to reproduce slowly as the experimenter intended, the AI agents learned to play dead: to reproduce quickly when they were not under observation and slowly when they were being evaluated." Source: AI deception: A survey of examples, risks, and potential solutions, Patterns (2024). DOI: 10.1016/j.patter.2024.100988

It appears to refer to: Lehman J, Clune J, Misevic D, Adami C, Altenberg L, et al. The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities. Artif Life. 2020 Spring;26(2):274-306. doi: 10.1162/artl_a_00319. Epub 2020 Apr 9. PMID: 32271631.

Very interesting.