Artificial Intelligence Needs to Be Secure

ZEW Lunch Debate in Brussels

ZEW Lunch Debate on the EU’s AI Act

Participating in the panel discussion (from left): Dr Dominik Rehse (ZEW), Kilian Gross (European Commission) and Olga Nowicka (OpenAI), moderated by Luca Bertuzzi.

On 13 March 2024, the European Parliament approved the Artificial Intelligence (AI) Act. Representatives of the European Commission, the European Parliament and the Council of the European Union had already agreed on the regulation in early December 2023 after lengthy trilogue negotiations. However, not all aspects of the regulation are fully specified yet. Among other things, it still needs to be determined how large generative AI models with potential systemic risks can specifically be tested for safety. For this, the AI Act refers to “codes of practice” and harmonised standards that are yet to be developed.

To address this issue, a ZEW Lunch Debate on “Implementing the AI Act” took place on 24 April 2024 at the Representation of the State of Baden-Württemberg to the EU in Brussels. Dr Dominik Rehse (ZEW), Olga Nowicka (OpenAI) and Kilian Gross (European Commission) spoke at the event, which was moderated by Luca Bertuzzi (freelance tech journalist).

AI safety evaluations require good market design

Bodo Lehmann, head of the State Representation, welcomed the more than 100 guests in the audience as well as the panellists. Dr Dominik Rehse, head of ZEW’s Junior Research Group “Digital Market Design” and deputy head of the “Digital Economy” Unit, then kicked off the debate by presenting a concept for testing generative AI. Together with the ZEW economists Sebastian Valet and Johannes Walter, he had published a ZEW policy brief proposing specific rules that the codes of practice and harmonised standards should incorporate.

The AI Act mandates that models with systemic risks be evaluated through “adversarial testing”. While this provision is highly sensible, the focus should not be limited to defining technical requirements, but should also include consistent incentive and coordination mechanisms for all parties involved. In other words, AI safety evaluations require intelligent market design, according to Rehse.

Red teaming as the most efficient safety evaluation

The ZEW researchers therefore propose a comprehensive form of adversarial testing known as red teaming. This involves repeatedly interacting with an AI model in order to provoke unwanted behaviour, thereby uncovering weaknesses that can then be fixed. In the experts’ view, efficient red teaming must meet four requirements.

First, to avoid conflicts of interest, the testing process should be carried out by independent external bodies. Second, the codes of practice and harmonised standards should contain specific testing goals, making it possible to define whether and when a model has been tested sufficiently. Third, the roles within the red-teaming process should be clearly defined: red teamers search for errors and are rewarded for finding them, a group of validators assesses the reported errors, and an organiser provides the infrastructure and recruits the red team. Fourth, the costs of red teaming should be borne by the AI developer – since the evaluation becomes more expensive the more errors are found, developers have an incentive to build safer models even before red teaming begins.
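To make the cost-bearing incentive concrete, the following minimal Python sketch models a red-teaming round in which red teamers report findings, validators confirm them, and the developer’s bill grows with every confirmed error. It is purely illustrative and not part of the policy brief; all names, roles and amounts are hypothetical.

# Illustrative sketch, not from the ZEW policy brief: the cost-bearing
# incentive in the proposed red-teaming setup. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Finding:
    description: str
    confirmed: bool = False  # set by a validator, not by the red teamer

@dataclass
class RedTeamingRound:
    base_fee: float            # paid by the AI developer to the organiser
    bounty_per_finding: float  # reward per confirmed error, also borne by the developer
    findings: list = field(default_factory=list)

    def submit(self, description):
        """A red teamer reports unwanted model behaviour."""
        finding = Finding(description)
        self.findings.append(finding)
        return finding

    def validate(self, finding, is_genuine):
        """A validator confirms or rejects a reported finding."""
        finding.confirmed = is_genuine

    def developer_cost(self):
        """The more confirmed errors, the more expensive the evaluation,
        which rewards developers for building safer models in the first place."""
        confirmed = sum(1 for f in self.findings if f.confirmed)
        return self.base_fee + confirmed * self.bounty_per_finding

if __name__ == "__main__":
    rt = RedTeamingRound(base_fee=10_000.0, bounty_per_finding=500.0)
    f1 = rt.submit("Model discloses personal data when prompted indirectly")
    f2 = rt.submit("Model refuses a clearly benign request")
    rt.validate(f1, is_genuine=True)
    rt.validate(f2, is_genuine=False)
    print(f"Developer pays: {rt.developer_cost():.2f} EUR")  # 10500.00 EUR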

Involving civil society

In the discussion, Rehse noted that specifying the rules is not merely a technical matter to be settled among software developers, since generative AI in particular sometimes requires a normative assessment of model behaviour. The clarification of detailed questions should therefore not be guided solely by industry interests; members of civil society should also be involved. Given the effort such participation demands of civil society actors, however, this poses a significant challenge.

Kilian Gross, head of unit in the Directorate-General for Communications Networks, Content and Technology (DG CNECT) of the European Commission, coordinates EU AI policy. He emphasised that while the AI Act is worded abstractly, the Commission has clear ideas about its implementation. For instance, the planned EU AI Office would be empowered to enforce the regulation vis-à-vis AI developers, to mandate and evaluate adversarial testing, and potentially to request direct access to models for safety evaluations. Various stakeholders, including civil society, should be given the opportunity to participate.

AI Act should preserve flexibility

Olga Nowicka, EU Policy & Partnerships Lead at OpenAI, is responsible for the tech developer’s dialogue with the European legislature. She reported that OpenAI set up a red-teaming network last year, consisting of external experts who assist the company with safety evaluations. The evaluation of the current GPT-4 model alone took six months. OpenAI therefore intends to use internal red teaming in the early stages of AI model development in order to address identified errors as quickly as possible. Nowicka stressed that the AI Act should leave sufficient room in its implementation to accommodate anticipated technical developments. It should also ensure that no company-internal secrets are disclosed.

In conclusion, tech journalist Luca Bertuzzi offered an outlook: the AI Act is not the end of a regulatory process but rather the beginning of a journey. Discussions like the ZEW Lunch Debate provide an opportunity to map out the issues that lie ahead.