The revision elevates misalignment and persuasion to formal risk thresholds, with mandatory safety reviews before release.