CETIC
Full-Spectrum Privacy-Preserving Artificial Intelligence

Trusted AI

Aim

The aim is to provide approaches that let artificial intelligence users, solution providers and other stakeholders keep the privacy of their data, information and knowledge under control.

What's at stake

Many AI solutions require large amounts of data to guarantee accurate results. However, businesses and organisations in most domains have always been cautious about sharing their data and information, first to preserve secrecy and, more recently, to comply with data protection regulations. This often limits the datasets available for AI training to sizes far too small to quickly build reliable solutions, even when leveraging pre-trained AI models/algorithms.

Challenges

As businesses and organisations start exploring the AI way forward, they soon realise that it entails a number of challenges:

  • Available good-quality datasets are too small for the development of reliable AI, and data sharing and pooling are difficult to implement in practice, notably in healthcare and in manufacturing, where Wallonia has world-leading businesses;
  • Interoperability of data from different sources is often complex to establish, as formats and semantics must first be determined and aligned;
  • Data access has become more difficult as both organisations and citizens have become increasingly aware of the value that can be extracted from their data;
  • Besides the privacy of data used during training, it is just as important to protect the input/output data fed to and generated by AI models and algorithms deployed in operation (that is, to protect input/output data during inference, when trained AI models are used);
  • AI models must also be protected against intellectual property theft, independently of how the AI models/algorithms are integrated or embedded into a solution and of the type of hardware infrastructure used to execute it.

AI possible solutions

Among others, solutions for preserving privacy during training sessions, where reliable AI models are created, consist of:

  • Developing federated learning techniques whose accuracy is similar to that of conventional centralised learning (see the first sketch after this list);
  • Developing approaches where training can take place on encrypted data (e.g. using homomorphic encryption or similar) without inducing unacceptable time overhead (see the second sketch after this list);
  • Developing anonymisation/pseudonymisation techniques along with sound approaches to evaluate the level of anonymisation/pseudonymisation reached or, alternatively, showing that an anonymised/pseudonymised dataset resists de-anonymisation attacks (see the third sketch after this list);
  • Next to data privacy during AI training, solutions must also address the protection of input/output data and of AI models once solutions are in operation. A promising direction is to investigate how confidential computing techniques (e.g. on top of a Trusted Execution Environment) can be used to protect users’ input data, AI output results and the AI models themselves. An exciting challenge for our AI researchers.
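
To make the first of these concrete, below is a minimal federated averaging (FedAvg) sketch in pure NumPy. The synthetic clients, linear model and hyperparameters are illustrative assumptions, not CETIC code: each client trains on data that never leaves it, and a central server only averages the resulting model weights.

    # A minimal FedAvg sketch in pure NumPy. The synthetic clients, linear
    # model and hyperparameters below are illustrative assumptions only.
    import numpy as np

    rng = np.random.default_rng(0)

    def make_client_data(n=200, d=5):
        # Synthetic local dataset; in federated learning, this raw data
        # never leaves the client.
        X = rng.normal(size=(n, d))
        true_w = np.arange(1, d + 1, dtype=float)
        y = X @ true_w + rng.normal(scale=0.1, size=n)
        return X, y

    def local_update(w, X, y, lr=0.05, epochs=5):
        # A few epochs of gradient descent on the client's private data.
        w = w.copy()
        for _ in range(epochs):
            grad = 2 * X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        return w

    clients = [make_client_data() for _ in range(4)]
    w_global = np.zeros(5)

    for _ in range(20):
        # Each client trains locally; only model weights are shared.
        local_weights = [local_update(w_global, X, y) for X, y in clients]
        # The server aggregates by (unweighted) averaging -- FedAvg.
        w_global = np.mean(local_weights, axis=0)

    print("federated estimate:", np.round(w_global, 2))  # close to [1 2 3 4 5]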
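For computing on encrypted data, the second sketch is a toy implementation of the Paillier cryptosystem, an additively homomorphic scheme: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The key sizes are deliberately tiny and insecure; real ML workloads would rely on vetted libraries and on schemes such as CKKS or BFV.

    # A toy Paillier cryptosystem: additively homomorphic encryption.
    # The primes below are far too small for real use (INSECURE demo).
    import math
    import random

    p, q = 293, 433                 # tiny demo primes
    n, n2 = p * q, (p * q) ** 2
    g = n + 1
    lam = math.lcm(p - 1, q - 1)
    mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

    def encrypt(m):
        # Ciphertext c = g^m * r^n mod n^2, with r random and coprime to n.
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(1, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def decrypt(c):
        return ((pow(c, lam, n2) - 1) // n) * mu % n

    # Homomorphic property: multiplying ciphertexts adds plaintexts.
    a, b = 17, 25
    c_sum = encrypt(a) * encrypt(b) % n2
    print("decrypted sum:", decrypt(c_sum))  # 42, computed on ciphertexts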
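Finally, a minimal pseudonymisation sketch: direct identifiers are replaced by keyed HMAC digests, so records about the same person remain linkable for analysis while re-identification requires the secret key. The field names, records and key handling are hypothetical; a real deployment would add proper key management and a formal re-identification risk assessment.

    # A minimal pseudonymisation sketch using keyed HMAC digests. The
    # field names, records and key handling are hypothetical.
    import hmac
    import hashlib

    SECRET_KEY = b"replace-with-a-managed-secret"  # assumed held by a trusted party

    def pseudonymise(identifier: str) -> str:
        return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

    records = [
        {"patient_id": "BE-1024", "age": 54, "diagnosis": "T2D"},
        {"patient_id": "BE-1024", "age": 54, "diagnosis": "HTN"},
        {"patient_id": "BE-2048", "age": 61, "diagnosis": "T2D"},
    ]

    for r in records:
        r["patient_id"] = pseudonymise(r["patient_id"])  # same person -> same pseudonym
        print(r)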

Key AI tech topics

  • Federated learning
  • Homomorphic encryption for machine learning
  • Secure multiparty computation

Get in touch

Xavier Lessage (xavier.lessage@cetic.be)