Stanford Researchers Discover Child Abuse Material in AI Image Generator Dataset: Can Regulations Prevent Explicit Deepfakes of Children?
AI image generators are being used by child abusers to create deepfakes of children for blackmail and sextortion.
The creation of such imagery is illegal in the UK, but there is no global agreement on how to police this technology.
The problem is that explicit imagery is baked into the web-scraped training data underpinning AI image generation, making misuse difficult to prevent.
Despite efforts to ban explicit AI-generated images of real people, new images can be created at the press of a button.
Researchers at Stanford University discovered hundreds, and possibly thousands, of child abuse images in Laion, a training set for AI image generators containing approximately five billion images.
Because the dataset is so vast, the researchers had to rely on automated methods, scanning for questionable images, matching them against law enforcement records, and training a system to recognize similar photos.
The identified images were then handed over to authorities for review.
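The article does not describe the researchers' exact tooling, but dataset audits of this kind typically combine exact hash matching against lists of known abuse imagery with perceptual matching for re-encoded copies. The sketch below shows only the exact-matching half; the file names `image_hashes.txt` and `images/` are hypothetical stand-ins for a clearinghouse hash list and a local image sample.

```python
# A minimal sketch of hash-list matching, assuming a hypothetical
# "image_hashes.txt" (one MD5 hex digest per line, as a clearinghouse
# might supply) and a local "images/" directory to scan.
import hashlib
from pathlib import Path


def md5_digest(path: Path) -> str:
    """Return the MD5 hex digest of a file's bytes, read in chunks."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def find_matches(image_dir: Path, hash_list: Path) -> list[Path]:
    """Flag files whose digest appears in the known-bad hash list."""
    known = {
        line.strip().lower()
        for line in hash_list.read_text().splitlines()
        if line.strip()
    }
    return [
        p for p in sorted(image_dir.iterdir())
        if p.is_file() and md5_digest(p) in known
    ]


if __name__ == "__main__":
    for match in find_matches(Path("images"), Path("image_hashes.txt")):
        print(f"flagged for review: {match}")
```

Exact hashing only catches byte-identical copies; the system "trained to recognize similar photos" mentioned above would instead rely on perceptual hashing or a learned classifier, which tolerate resizing and re-encoding.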
The dataset compiled by Laion contained URLs leading to CSAM (child sexual abuse material).
The creators of Laion removed the dataset after being notified of the issue.
They clarified that they had not distributed the images themselves, as the dataset only contained URLs to pictures hosted elsewhere on the internet.
However, many of the links were already dead, making it uncertain how many of them once pointed to CSAM.
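Whether a given link still resolves can be checked mechanically. The sketch below, assuming a hypothetical `urls.txt` with one URL per line, counts links that no longer respond; nothing about the original content can be recovered from a dead link, which is what makes the full extent of the problem unknowable.

```python
# A minimal sketch of dead-link counting, assuming a hypothetical
# "urls.txt" with one URL per line. A link that fails to resolve tells
# us nothing about what it once pointed to.
import urllib.error
import urllib.request


def is_live(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request succeeds with a 2xx status."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, TimeoutError, ValueError):
        return False


if __name__ == "__main__":
    with open("urls.txt") as f:
        urls = [line.strip() for line in f if line.strip()]
    dead = sum(1 for u in urls if not is_live(u))
    print(f"{dead} of {len(urls)} links no longer resolve")
```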