David Lillis: Improving the Accuracy of Automated Facial Age Estimation to Aid CSEM Investigation

Improving the Accuracy of Automated Facial Age Estimation to Aid CSEM Investigation

Felix Anda, David Lillis, Aikaterini Kanta, Brett Becker, Elias Bou-Harb, Nhien An Le Khac and Mark Scanlon

Digital Investigation, 28, Supple:S142, 2019.


The investigation of violent crimes against individuals, such as the investigation of child sexual exploitation material (CSEM), is one of the more commonly encountered criminal investigation types throughout the world. While hash lists of known CSEM content are commonly used to identify previously encountered material on suspects' devices, previously unencountered material requires expert, manual analysis and categorisation. The discovery, analysis, and categorisation of these digital images and videos has the potential to be significantly expedited with the use of automated artificial intelligence (AI) based techniques. Intelligent, automated evidence processing and prioritisation has the potential to aid investigators in alleviating some of the digital evidence backlogs that have become commonplace worldwide. In order for AI-aided CSEM investigations to be beneficial, the fundamental question when analysing multimedia content becomes ``how old is each subject encountered?''. Our work presents the evaluation of existing cloud-based and offline age estimation services, introduces our deep learning model, DS13K, which was created with a VGG-16 Deep Convolutional Neural Network (CNN) architecture, and develops an ensemble technique that improves the accuracy of underage facial age estimation. In addition to our model, a number of existing services including Amazon Rekognition, Microsoft Azure Cognitive Services, How-Old.net, and Deep Expectation (DEX) were used to create an ensemble learning technique. It was found that for the borderline adulthood age range (i.e., 16 to 17 years old), our DS13K model substantially outperformed existing services, achieving a performance accuracy of 68{{\%}}. A comparative examination of the obtained results allowed us to identify performance trends and issues inherent to each service/tool and develop ensemble techniques to improve the accuracy of automated adulthood determination.