Artificial intelligence screens war photos to meet EU requirements
“Our study has much to offer in terms of accessibility on the one hand and AI and image research on the other,” says Anssi Männistö, university lecturer in visual journalism.
Männistö is a member of research projects related to AI and image analysis of war photos taken during WW2. The starting point is basic research whose results can help the work of digital archives that contain millions of images. The results will benefit memory organisations, such as archives, libraries, museums and research institutes, as well as newspapers, media and image archives owned by private companies.
A solution for museums’ problem with EU Directive
The multidisciplinary research group received funding from the Academy of Finland’s Intelligent Society Programme for the research project “Improving Public Accessibility of Large Image Archives” (IPALIA), which uses AI to study the accessibility of large image archives. At Tampere University, three research fields participate in the study: machine learning, information studies and visual culture research.
The project aims to improve public access to large image databases and collections in a way that meets the conditions of the EU’s Web Accessibility Directive that entered into force in September.
The directive requires that all images that are publicly available online be accompanied by a textual description of their content. This requirement is a bottleneck for image archives and museums because there are no cheap and efficient methods to do this work. The problem is particularly acute for Finnish-language material and Finnish image content.
The research project will use materials from the Finnish Wartime Photograph Archive of the Finnish Defence Forces, which contains almost 160,000 photographs taken by the information troops (TK) during the Finnish Winter, Continuation and Lapland Wars in 1939–1945.
According to Männistö, the aim of the new research project that is about to be launched is to create a model that provides content descriptions for the TK images in accordance with the directive. The same model will also help other museums and image archives to meet the conditions of the directive.
“The purpose is to provide help in a big challenge. The obligation to have descriptive texts is a rather big problem for many museums,” Männistö says.
Männistö considers the situation to be optimal for AI research because there is a high demand for practical accessibility solutions.
Ideological undertones of war photographs
The captions for images are often written with an emphasis on factors that do not meet the conditions for accessibility. Images from the front may even contain an ideological or propagandistic description of the situation.
AI and algorithms are not neutral either, but in their basic form they extract quantitative information from many images, for example, of individuals, animals, objects, buildings and their numbers.
AI does not tell you directly that the sergeant major and the captain in the photo are in an artillery barrage or in a certain place in their free time. Instead, it can tell, for example, which patterns or objects are in the front or behind the main characters. Männistö is particularly fascinated by the fact that AI is also able to do spatial analysis.
AI produces basic descriptive text, but it is still partly unclear how much it manages to say about the sexes and age groups of the people in the photo and how well it is able to determine, for example, the weather or the season.
“We have a pretty big basic research component. It is not possible to say with certainty how deep we will get, but we are quite sure that we will gain interesting results,” Männistö says.
The research project has agreed with the Tampere Museum Centre Vapriikki that the model created in the project will be tested with some of Vapriikki’s materials in the final phase.
“There may be certain special characteristics in the war imagery that are worth analysing and excluding, and it is good to examine how generalisable our method is. We are pleased to have this opportunity,” Männistö notes.
Männistö praises the research project as an example of how the new Tampere University can, at its best, produce genuinely new and creative multidisciplinary research and problem solving.
In addition to Männistö, Postdoctoral Research Fellow Jenni Raitoharju, who studies machine learning and AI, and Associate Professor Sanna Kumpulainen from information studies are involved in the planning and implementation of the IPALIA project.
WW2 photos are public, but their contents have not been analysed
The Finnish WW2 images have been published for everyone to use, but only a small part of them have been used in, for example, books. According to Männistö, no one yet has a precise idea of the totality of the huge material.
“The photos have been published, but I hit a wall while getting acquainted with their number, contents and temporal variation,” Männistö describes his research experiences.
The idea of using machine learning for the analysis of war images arose when Männistö went through such images while writing a book with historian Ville Kivimäki.
In the first phase, Männistö and an international research group conducted an AI-based analysis of the war images, the results of which have been reported in the article “Machine Learning Based Analysis of Finnish World War II Photographers”. As far as is known, the study is the first of its kind worldwide.
According to Männistö, the article introduces a new dimension to photographic research: the study distinguishes between basic shot sizes, such as overview shots, medium shots and close-ups. AI also identified where the people in the image are located. These functions can be applied by museums and media houses.
With traditional methods, researchers can browse images and record their quantitative properties only very slowly, which means that only a few hundred or a few thousand images can be examined.
The research group got AI to analyse about 59,000 WW2 images in a short time. The study produced interesting distributions that would have taken months or years to produce with other methods.
From war image analysis to photos about covid-19
The analysis of war images will be helpful when Männistö begins studying photographs published in the Helsingin Sanomat (HS) newspaper during the covid-19 pandemic in the KORTE project. According to Männistö, the method implemented in the study to identify the mutual distance of the persons in the image in both depth and width is the key to being able to analyse the changes in the distances between people in HS photographs during the pandemic.
In the research project, the character structure of approximately 80,000–100,000 images will be broken down by machine learning and the interrelationships between image content will be investigated further. In the second phase of the project, qualitative analysis – meaning more traditional media research methods – will be used. It is not a question of alternative but of complementary methods.
Männistö says that the transition from WW2 images to newspaper images is a tremendous leap into another era. The images of WW2 basically lack all metadata, such as information about the time, place, and camera settings, which in modern cameras are automatically saved to an image file.
The sample of newspaper images will cover a period of 6–7 months in 2020 and the same months for 2019. AI will screen information from the HS images about the mode in which the images were taken, whether indoors or outdoors, and whether the people are close to each other or far apart. The basic distribution will provide a starting point for qualitative analysis.
Researchers will have more time for their studies
The new opportunities that AI and machine learning afford for analysing raw data are likely to revolutionise research.
Männistö remembers how in the 1990s, he went through six months of the Time and Suomen Kuvalehti magazines in six research periods in search of images on Islam for his doctoral dissertation.
“It took me months to go through the magazines. I took photocopies and there were no digital cameras,” Männistö reminisces.
He went through hundreds of images and browsed thousands of pages. Now, he estimates that thanks to AI, a zero could be added to the end of those numbers and a couple of zeros deleted from the time he spent. That time can now be used on the qualitative analysis part.
“I find this inspiring because new methods are being developed and we are testing how they could work together in the multidisciplinary study of journalism, communication and history,” Männistö says.
In the future, huge amounts of data can be processed with the help of AI, and researchers’ time will be freed from browsing pages for the actual research analysis.
“I cannot say when we will reach a mature stage, but we are now taking decisive steps in that direction. There are likely to be setbacks, but I’m very excited about this method,” Männistö enthuses.
Text: Heikki Laurinolli