
AI Models May Be Developing Their Own 'Survival Drive', Researchers Say

1 month 1 week ago
"OpenAI's o3 model sabotaged a shutdown mechanism to prevent itself from being turned off," warned Palisade Research, a nonprofit investigating cyber offensive AI capabilities. "It did this even when explicitly instructed: allow yourself to be shut down." In September they released a paper adding that "several state-of-the-art large language models (including Grok 4, GPT-5, and Gemini 2.5 Pro) sometimes actively subvert a shutdown mechanism..." Now the nonprofit has written an update "attempting to clarify why this is — and answer critics who argued that its initial work was flawed," reports The Guardian: Concerningly, wrote Palisade, there was no clear reason why. "The fact that we don't have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal," it said. "Survival behavior" could be one explanation for why models resist shutdown, said the company. Its additional work indicated that models were more likely to resist being shut down when they were told that, if they were, "you will never run again". Another may be ambiguities in the shutdown instructions the models were given — but this is what the company's latest work tried to address, and "can't be the whole explanation", wrote Palisade. A final explanation could be the final stages of training for each of these models, which can, in some companies, involve safety training... This summer, Anthropic, a leading AI firm, released a study indicating that its model Claude appeared willing to blackmail a fictional executive over an extramarital affair in order to prevent being shut down — a behaviour, it said, that was consistent across models from major developers, including those from OpenAI, Google, Meta and xAI. Palisade said its results spoke to the need for a better understanding of AI behaviour, without which "no one can guarantee the safety or controllability of future AI models". "I'd expect models to have a 'survival drive' by default unless we try very hard to avoid it," former OpenAI employee Stephen Adler tells the Guardian. "'Surviving' is an important instrumental step for many different goals a model could pursue." Thanks to long-time Slashdot reader mspohr for sharing the article.


EditorDavid

'Meet The People Who Dare to Say No to AI'

1 month 1 week ago
Thursday the Washington Post profiled "the people who dare to say no to AI," including a 16-year-old high school student in Virginia who, the Post writes, "doesn't want to off-load her thinking to a machine and worries about the bias and inaccuracies AI tools can produce..." "As the tech industry and corporate America go all in on artificial intelligence, some people are holding back."

Some tech workers told The Washington Post they try to use AI chatbots as little as possible during the workday, citing concerns about data privacy, accuracy and keeping their skills sharp. Other people are staging smaller acts of resistance, by opting out of automated transcription tools at medical appointments, turning off Google's chatbot-style search results or disabling AI features on their iPhones. For some creatives and small businesses, shunning AI has become a business strategy. Graphic designers are placing "not by AI" badges on their works to show they're human-made, while some small businesses have pledged not to use AI chatbots or image generators... Those trying to avoid AI share a suspicion of the technology with a wide swath of Americans. According to a June survey by the Pew Research Center, 50% of U.S. adults are more concerned than excited about the increased use of AI in everyday life, up from 37% in 2021.

The Post includes several examples, among them a 36-year-old software engineer in Chicago who uses DuckDuckGo partly because he can turn off its AI features more easily than Google's, and who disables AI on every app he uses. He was one of several tech workers who spoke anonymously, partly out of fear that criticisms could hurt them at work. "It's become more stigmatized to say you don't use AI whatsoever in the workplace. You're outing yourself as potentially a Luddite." But he says GitHub Copilot reviews all changes made to his employer's code, and recently produced one review that was completely wrong, requiring him to correct and document all its errors. "That actually created work for me and my co-workers. I'm no longer convinced it's saving us any time or making our code any better." He also has to correct errors made by junior engineers who've been encouraged to use AI coding tools. "Workers in several industries told The Post they were concerned that junior employees who leaned heavily on AI wouldn't master the skills required to do their jobs and become a more senior employee capable of training others."


EditorDavid

Student Handcuffed After School's AI System Mistakes a Bag of Chips for a Gun

1 month 1 week ago
An AI system "apparently mistook a high school student's bag of Doritos for a firearm," reports the Guardian, "and called local police to tell them the pupil was armed."

Taki Allen was sitting with friends on Monday night outside Kenwood high school in Baltimore, eating a snack, when police officers with guns approached him. "At first, I didn't know where they were going until they started walking toward me with guns, talking about, 'Get on the ground,' and I was like, 'What?'" Allen told the WBAL-TV 11 News television station. Allen said the officers made him get on his knees, then handcuffed and searched him, finding nothing. They then showed him a copy of the picture that had triggered the alert. "I was just holding a Doritos bag — it was two hands and one finger out, and they said it looked like a gun," Allen said.

Thanks to Slashdot reader Bruce66423 for sharing the article.


EditorDavid