Dating audiovisual archive documents using artificial intelligence

With my school, I took part in an artificial intelligence project for the Brittany film library. The goal of the project is to predict the date of an archive video as accurately as possible using artificial intelligence.

The project tested two approaches: dating a video directly, and detecting the vehicles in a video in order to identify their models and then deduce the probable date of the video. I chose the second approach, working with two colleagues.

Detecting cars

For this project, we worked only with images and did not try to date the videos as such. The first step is therefore to set up car detection on the images taken from the videos.

To do this, we used YOLOv8, an efficient and accurate object detection model that can be used from Python. The medium-sized model was already very good at detecting cars and trucks, so we didn't need to retrain it.
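As a rough illustration of this detection step, here is a minimal sketch that assumes the `ultralytics` package and its pretrained medium weights file `yolov8m.pt`. The `Detection` class, the `keep_vehicles` and `detect_vehicles` helpers and the 0.5 confidence threshold are hypothetical names and values chosen for this example, not the code we actually ran.

```python
from dataclasses import dataclass

# COCO class names kept in this sketch (assumption: only cars and trucks matter).
VEHICLE_CLASSES = {"car", "truck"}


@dataclass
class Detection:
    label: str         # class name predicted by the detector
    confidence: float  # detection score between 0 and 1
    box: tuple         # (x1, y1, x2, y2) in pixels


def keep_vehicles(detections, min_confidence=0.5):
    """Keep only confident car/truck detections."""
    return [
        d for d in detections
        if d.label in VEHICLE_CLASSES and d.confidence >= min_confidence
    ]


def detect_vehicles(image_path, weights="yolov8m.pt", min_confidence=0.5):
    """Run a pretrained YOLOv8 model on one image and return its vehicles."""
    # Imported here so the pure helpers above work without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model(image_path)[0]  # one Results object per input image
    detections = [
        Detection(result.names[int(cls)], float(conf), tuple(xyxy))
        for cls, conf, xyxy in zip(
            result.boxes.cls.tolist(),
            result.boxes.conf.tolist(),
            result.boxes.xyxy.tolist(),
        )
    ]
    return keep_vehicles(detections, min_confidence)
```

Each returned box can then be used to crop the car out of the image before passing it to the classifier.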

Classifying cars

The next step is to classify cars by model. If we know the model of a car, we know when its production started, and therefore the earliest possible date for the archive document.
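The reasoning from car model to minimum date can be sketched in a few lines. The lookup table below is only an illustrative sample with well-known launch years, and the function name `earliest_possible_year` is hypothetical. When several models appear in the same document, the most recent production start gives the tightest lower bound, since the footage cannot predate the newest car in it.

```python
# Illustrative sample only: production start years for a few French models.
PRODUCTION_START = {
    "Citroën 2CV": 1948,
    "Citroën DS": 1955,
    "Renault 4": 1961,
    "Renault 5": 1972,
    "Peugeot 205": 1983,
}


def earliest_possible_year(models_seen):
    """Return the earliest year the footage could have been shot, or None.

    Models missing from the lookup table are ignored.
    """
    years = [PRODUCTION_START[m] for m in models_seen if m in PRODUCTION_START]
    return max(years) if years else None
```

For example, a document showing both a Renault 4 and a Peugeot 205 cannot have been filmed before 1983.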

In order to train an image classification model, we need a large database of images of cars classified by model. Some sites provide a list of car models, with associated photos. We focused on French models, as the videos to be dated are French and mostly date from before 2000. So we gave priority to brands like Renault, Peugeot and Citroën, while making sure we had the oldest models.

To build this database, we used the Selenium Python library to scrape the sites we found. This enabled us to build up a database of around 6000 photos of French cars sorted by model.
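A scraping step of this kind might look like the sketch below. It assumes a Chrome driver is available, that every photo is a plain `img` element with an absolute URL, and that saving files with a `.jpg` extension is acceptable. The function names, the `dataset` folder and the one-folder-per-model layout are all hypothetical; real sites usually need site-specific selectors.

```python
import re
from pathlib import Path
from urllib.request import urlretrieve


def model_dir_name(model_name):
    """Turn a model name into a safe folder name, e.g. 'Peugeot 205' -> 'peugeot_205'."""
    return re.sub(r"\W+", "_", model_name.strip().lower()).strip("_")


def scrape_model_photos(page_url, model_name, root="dataset"):
    """Download every image found on one page into the folder for that model."""
    # Imported here so model_dir_name can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    target = Path(root) / model_dir_name(model_name)
    target.mkdir(parents=True, exist_ok=True)

    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        images = driver.find_elements(By.TAG_NAME, "img")
        sources = [img.get_attribute("src") for img in images]
    finally:
        driver.quit()  # always close the browser, even on errors

    saved = []
    usable = (s for s in sources if s and s.startswith("http"))
    for index, src in enumerate(usable):
        path = target / f"{index:04d}.jpg"
        urlretrieve(src, str(path))
        saved.append(path)
    return saved
```

Storing one folder per model keeps the photos sorted in a layout that most image classification tools can read directly.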

We then fine-tuned the ResNet34 model on the database we had collected, using data augmentation to improve its performance. We achieved an accuracy of 80%.
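To give an idea of what this fine-tuning setup can look like, here is a minimal sketch assuming PyTorch and torchvision (version 0.13 or later for the `weights` argument). The specific augmentations, the function names and the choice to replace only the final layer are assumptions made for this example, not a record of our exact configuration. The training loop itself is omitted.

```python
def build_train_transform():
    """Data augmentation pipeline (assumed choices, not the project's exact ones)."""
    from torchvision import transforms

    return transforms.Compose([
        transforms.RandomResizedCrop(224),       # random crop and zoom
        transforms.RandomHorizontalFlip(),       # cars can face either way
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        transforms.ToTensor(),
        # ImageNet statistics, matching the pretrained weights
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])


def build_classifier(num_classes):
    """Load an ImageNet-pretrained ResNet34 and swap in a new output layer."""
    from torch import nn
    from torchvision import models

    model = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model


def top1_accuracy(predicted, actual):
    """Fraction of images whose predicted model matches the true model."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

With this definition, an accuracy of 80% means that four out of five held-out photos are assigned the correct car model.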