MIT and UC San Diego Researchers Develop Method to Identify and Manipulate Abstract Concepts in Large Language Models
Researchers from MIT and UC San Diego have developed a method to detect and manipulate abstract concepts in large language models using recursive feature machines.