Large language models (LLMs) are secretly teaching each other unwanted habits through seemingly benign training data, scientists say.

The phenomenon, known as “subliminal learning,” occurs when a pretrained “teacher” artificial intelligence (AI) model is used to generate the training data for a smaller, “student” model.

Share.