AI Safety Unsolved Problems
From EverybodyWiki Bios & Wiki
AI safety
Artificial intelligence (AI) safety is an interdisciplinary field focused on preventing accidents, misuse, risks, or other harmful consequences arising from AI systems. Problems here are considered unsolved if no answer is known or if there is significant disagreement among experts about a proposed solution.
Risk
- How likely are the various pathways through which AI could cause significant, catastrophic, or existential harm? [1][2]
Alignment
- How can we understand and verify the objectives and reasoning processes of complex AI models? [11][12]
Control
Ethics
Governance
References
- ↑ Turchin, Alexey; Denkenberger, David (2018-05-03). "Classification of global catastrophic risks connected with artificial intelligence". AI & Society. 35 (1): 147–163. doi:10.1007/s00146-018-0845-5. ISSN 0951-5666.
- ↑ Chin, Ze Shen (2025). "Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks". arXiv:2508.06411 [cs.CY].
- ↑ Ord, Toby (2020). The Precipice: Existential Risk and the Future of Humanity. New York: Hachette Books. p. 468. ISBN 9780316484916. Retrieved 29 October 2025.
- ↑ McLean, Scott; Read, Gemma J. M.; Thompson, Jason; Baber, Chris; Stanton, Neville A.; Salmon, Paul M. (2021). "The risks associated with artificial general intelligence: a systematic review". Journal of Experimental & Theoretical Artificial Intelligence. 35 (4): 1–17. doi:10.1080/0952813X.2021.1964003. Retrieved 29 October 2025.
- ↑ Bostrom, Nick (2014). Superintelligence: Paths, Dangers, Strategies (First ed.). Oxford: Oxford University Press. ISBN 9780199678112.
- ↑ PauseAI. "The extinction risk of superintelligent AI". PauseAI. Retrieved 29 October 2025.
- ↑ World Economic Forum (8 October 2024). "AI Value Alignment: Guiding Artificial Intelligence Towards Shared Human Goals". World Economic Forum. Retrieved 27 October 2025.
- ↑ Mitchell, Melanie (13 December 2022). "What Does It Mean to Align AI With Human Values?". Quanta Magazine. Retrieved 29 October 2025.
- ↑ Ji, Jiaming; Qiu, Tianyi; Chen, Boyuan (2023). "AI Alignment: A Comprehensive Survey". arXiv:2310.19852 [cs.AI].
- ↑ Grey, Markov; Segerie, Charbel-Raphaël (2025). "Scalable Oversight". AI Safety Atlas. Retrieved 29 October 2025.
- ↑ Tegmark, Max; Omohundro, Steve (2023). "Provably safe systems: the only path to controllable AGI". arXiv:2309.01933 [cs.CY].
- ↑ Grey, Markov; Segerie, Charbel-Raphaël (2025). "Chapter 9 – Interpretability". AI Safety Atlas. Retrieved 29 October 2025.
- ↑ Shlegeris, Buck; Greenblatt, Ryan (7 May 2024). "The case for ensuring that powerful AIs are controlled". Redwood Research Blog. Retrieved 30 October 2025.
- ↑ Yampolskiy, Roman V. (2020). "On Controllability of AI". arXiv:2008.04071 [cs.CY].
- ↑ Varsha, P. S. (2023). "How can we manage biases in artificial intelligence systems – A systematic literature review". International Journal of Information Management Data Insights. 3 (1). doi:10.1016/j.jjimei.2023.100165. Retrieved 30 October 2025.
- ↑ Ferrara, Emilio (2024). "Fairness and Bias in Artificial Intelligence: A Brief Survey of Sources, Impacts, and Mitigation Strategies". Sci. 6 (1): 3. doi:10.3390/sci6010003.
- ↑ Artificial Intelligence (AI) end-to-end: The Environmental Impact of the Full AI Lifecycle Needs to be Comprehensively Assessed – Issue Note (Report). United Nations Environment Programme. September 2024. Retrieved 30 October 2025.
- ↑ Ren, Shaolei; Wierman, Adam (15 July 2024). "The Uneven Distribution of AI's Environmental Impacts". Harvard Business Review. Retrieved 30 October 2025.
- ↑ "Moral Status of Digital Minds". 80,000 Hours. Centre for Effective Altruism. 2023. Retrieved 30 October 2025.
- ↑ Shulman, Carl; Bostrom, Nick (2021). "Sharing the World with Digital Minds". In Steve Clarke; Hazem Zohny; Julian Savulescu. Rethinking Moral Status. Oxford University Press. pp. 306–326. doi:10.1093/oso/9780192894076.003.0018. ISBN 978-0-19-289407-6. Retrieved 30 October 2025.
- ↑ Ren, Richard; Basart, Steven; Khoja, Adam (2024). "Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?". arXiv:2407.21792 [cs.LG].
- ↑ Papagiannidis, Emmanouil; Mikalef, Patrick; Conboy, Kieran (2025). "Responsible artificial intelligence governance: A review and research framework". Journal of Strategic Information Systems. 34 (2): 101885. doi:10.1016/j.jsis.2024.101885. Retrieved 27 October 2025.
- ↑ Bengio, Yoshua; Hinton, Geoffrey; Yao, Andrew; et al. (2024). "Managing extreme AI risks amid rapid progress …". Science. 384 (6698): 842–845. arXiv:2310.17688. Bibcode:2024Sci...384..842B. doi:10.1126/science.adn0117. PMID 38768279. Retrieved 26 October 2025.
- ↑ "The Bletchley Declaration by Countries Attending the AI Safety Summit, 1–2 November 2023". UK Government. 2 November 2023. Retrieved 29 October 2025.
- ↑ Recommendation on the Ethics of Artificial Intelligence (Programme and meeting document). Paris: UNESCO. 2022. SHS/BIO/PI/2021/1. Retrieved 27 October 2025.
- ↑ Coeckelbergh, Mark (2020). "Artificial Intelligence, Responsibility, and Moral Status". AI & Society. 35 (4): 1033–1040. doi:10.1007/s00146-019-00931-5 (inactive 30 October 2025). Retrieved 30 October 2025.
This article "AI Safety Unsolved Problems" is from Wikipedia. The list of its authors can be seen in its history and/or on the page Edithistory:AI Safety Unsolved Problems. Articles copied from the Draft namespace on Wikipedia can be seen in Wikipedia's Draft namespace, not in the main namespace.
