The EU AI Act, in Article 5(1)(e), prohibits the use of AI systems that create or expand facial recognition databases by scraping facial images from the internet or CCTV footage without targeting specific individuals. This practice, often referred to as “untargeted scraping,” is seen as a serious intrusion on privacy and data protection rights, denying individuals the ability to remain anonymous in public or online spaces.
The Act’s prohibition applies where AI is used specifically to scrape images from open sources for the purpose of building facial recognition databases. It targets practices that contribute to a sense of mass surveillance and raise significant risks of violating fundamental rights. Importantly, the prohibition only applies when several conditions are met: (a) the AI system must be placed on the market, put into service, or used for this specific purpose; (b) it must create or expand a facial recognition database; and (c) it must do so through untargeted scraping from sources such as the internet or CCTV footage.
By banning this type of data collection, the AI Act draws a clear line between legitimate uses of AI and those that compromise individual freedoms on a systemic scale.
What is a facial recognition database?
A facial recognition database, in this context, refers to any organised collection of facial data that enables computers to match images or video frames of individuals against stored facial images. These databases can be temporary or permanent, centralised or decentralised, and need not be solely intended for facial recognition. It is enough that they are capable of being used for that purpose.
Untargeted scraping of facial images:
The AI Act distinguishes between targeted and untargeted scraping. Scraping generally refers to the use of automated tools, such as bots or web crawlers, to collect data from sources such as websites, social media, or CCTV footage. Untargeted scraping involves indiscriminately collecting large volumes of facial images without focusing on specific individuals. This kind of broad, vacuum-like data gathering is the primary concern of Article 5(1)(e), and it is strictly prohibited when used to build or expand facial recognition databases.
In contrast, targeted scraping, where images are collected for specific individuals or defined groups, such as identifying known victims of trafficking, is not banned under the same provision. However, attempts to circumvent the rule by building a database incrementally through multiple targeted scrapes that ultimately serve the same purpose as untargeted scraping will still fall under the prohibition. If a system combines both targeted and untargeted scraping, the untargeted aspect remains unlawful.
Internet and CCTV footage:
The prohibition under Article 5(1)(e) of the AI Act applies specifically when facial images are scraped from two key sources: the internet and CCTV footage. It is essential to note that the fact that someone has shared a photo of themselves online, for example on social media, does not mean they have consented to that image being used in a facial recognition database. Similarly, scraping facial images from CCTV footage in public spaces such as airports or streets also falls under the ban when done indiscriminately.
A common example of the prohibited practice would be a company using automated tools to collect facial images from platforms like Facebook or YouTube, along with associated data such as geolocation or usernames. These images are processed into biometric data, indexed, and stored for comparison against future uploads. This kind of mass data collection for facial recognition purposes, without individual consent or targeting, is exactly what the AI Act seeks to prohibit.
Conversely, when an AI system performs a reverse image search using a single uploaded photo to find potential matches online, this is considered a targeted activity. While such a process may still raise privacy concerns, it does not fall within the scope of the untargeted scraping ban, particularly where no standalone database is being created or expanded.
Out of scope:
Not all uses of facial images or scraping practices fall within the scope of the prohibition under Article 5(1)(e) of the EU AI Act. The prohibition applies specifically to the use of AI systems that create or expand facial recognition databases through untargeted scraping. It does not extend to the untargeted scraping of other types of biometric data, such as voice samples, or to scraping activities that do not involve AI at all.
Additionally, facial image databases that are not used for identifying individuals, such as those used for training or testing AI models where no recognition of real persons occurs, are not prohibited. AI systems that generate synthetic faces based on scraped images, but which do not link those faces back to identifiable people, are also out of scope, though they may still be subject to the EU AI Act’s transparency obligations.
Notably, the prohibition does not retroactively apply to facial recognition databases created before the rules in the EU AI Act came into effect, provided they are not further expanded using untargeted AI scraping. However, their use must still comply with existing EU data protection laws. The Act’s focus is squarely on the creation or expansion of facial recognition capabilities, not on the act of biometric identification itself, which is governed by separate legal provisions.
Conclusion:
The EU AI Act draws a firm line against the untargeted scraping of facial images for the purpose of building or expanding facial recognition databases, recognising the significant risks this practice poses to privacy, anonymity, and fundamental rights. By focusing on indiscriminate data collection carried out by AI systems, the prohibition under Article 5(1)(e) aims to prevent the mass surveillance of individuals without their knowledge or consent. At the same time, the EU AI Act provides clarity around what falls outside the scope of this prohibition, ensuring that legitimate research and AI development practices, where real individuals are not identified, can continue within a clearly defined legal framework. This balanced approach reflects the EU’s broader objective of fostering trustworthy AI while safeguarding the rights and freedoms of individuals across the Union.