By isolating unwanted AI personas as linear directions in activation space, the toolkit inoculates models against harmful traits.
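To make the idea of a linear activation-space direction concrete, here is a minimal sketch in plain Python. It is not the toolkit's actual method or API, which the text does not describe. It assumes a simple difference-of-means estimate over two hypothetical sets of activation vectors, and every function name and the toy data are illustrative only.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def trait_direction(with_trait, without_trait):
    """Unit vector pointing from the mean 'without' activation to the mean 'with' activation.

    Assumption: a difference of means is used as the direction estimate.
    """
    diff = [a - b for a, b in zip(mean_vector(with_trait), mean_vector(without_trait))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("the two groups have identical mean activations")
    return [x / norm for x in diff]


def trait_score(activation, direction):
    """Scalar projection of one activation vector onto the trait direction."""
    return sum(a * d for a, d in zip(activation, direction))


def remove_trait(activation, direction):
    """Subtract the component of the activation that lies along the trait direction."""
    score = trait_score(activation, direction)
    return [a - score * d for a, d in zip(activation, direction)]


# Toy 3-dimensional activations (hypothetical data, not real model outputs).
with_trait = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
without_trait = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = trait_direction(with_trait, without_trait)
sample = [5.0, 2.0, 1.0]
score = trait_score(sample, direction)
cleaned = remove_trait(sample, direction)
```

In this toy setup the two groups differ only in their first coordinate, so the estimated direction lies along that axis, and removing it leaves the other coordinates of `sample` unchanged. Real activations are high-dimensional and noisy, so a direction estimated this way is only an approximation of the trait.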