Factory Update Spring 2025 Subjects

This is a curated list of proposed subjects for the upcoming year for our C4DT partners. You can find more suggestions here: Additional Subjects. They are split into two categories: hands-on workshops, which are one-day trainings on a given subject, and project suggestions, based on current research of our affiliated labs:


Summary of the Proposals

Potential Hands-on Workshops

The hands-on workshops proposed by the C4DT Factory allow you to take a one-day deep dive into a specific subject. They are geared towards project leaders in software engineering, with the morning session also suited to managerial roles. Besides learning a new topic, this is an excellent way to get in contact with our other partners, and of course with our labs, to discuss your challenges and ways to turn them into research.


In the morning, one of our affiliated professors presents the subject to the audience. In the afternoon, a hands-on workshop takes place, prepared by the C4DT Factory. We split the proposed hands-on workshops into three categories:

  • Research deep-dive: cutting-edge, published research subject to show upcoming trends
  • Research application: mature research result which can be applied to a concrete use-case with a partner. This means setting up a project to verify the research results in a real-life setting
  • Startup proof-of-concept: an early-stage startup with a first proof-of-concept, ready to enter the big world

LLM and ML

  • Startup proof-of-concept – Inference4All: privacy-preserving and cheaper usage of large models on office computer hardware without a data center – Prof. Rachid Guerraoui
    • What you will learn: How to distribute large machine learning models across multiple standard computers to run inference without expensive GPUs; techniques for preserving privacy when splitting models; practical implementation of distributed inference protocols in adversarial (availability) settings.
    • Where it is used: Financial institutions processing sensitive client data locally; healthcare organizations analyzing patient information without cloud exposure; SMEs wanting AI capabilities without cloud infrastructure costs; government agencies with regulatory requirements for data sovereignty.
    • Collaborations: Partners can propose industry-specific use cases to guide the platform’s evolution, participate in pilot deployments to validate performance and security in real environments, or set up a grant to include it in a local project.
    • More information: Pitch – Inference4All
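The pitch above does not disclose Inference4All’s implementation details. As a rough intuition for how inference can run without a data center, here is a toy sketch of pipeline-style model partitioning: consecutive layers are assigned to different worker machines, so no single machine ever holds the full model. All names and the network itself are invented for illustration.

```python
import numpy as np

class Worker:
    """Holds one slice of the model's layers; never sees the full weights."""
    def __init__(self, layers):
        self.layers = layers  # list of (weight matrix, bias) pairs

    def forward(self, x):
        # Run only this worker's slice and pass the activation onward.
        for w, b in self.layers:
            x = np.maximum(x @ w + b, 0.0)  # dense layer with ReLU
        return x

def split_model(layers, n_workers):
    """Partition consecutive layers across workers (pipeline parallelism)."""
    chunks = np.array_split(np.arange(len(layers)), n_workers)
    return [Worker([layers[i] for i in idx]) for idx in chunks]

rng = np.random.default_rng(0)
dims = [8, 16, 16, 16, 4]  # toy network: four dense layers
layers = [(rng.normal(size=(dims[i], dims[i + 1])), np.zeros(dims[i + 1]))
          for i in range(len(dims) - 1)]

workers = split_model(layers, n_workers=2)
x = rng.normal(size=(1, 8))

# Reference: run the whole model locally.
local = x
for w, b in layers:
    local = np.maximum(local @ w + b, 0.0)

# Pipelined: each worker only computes its own slice.
piped = x
for worker in workers:
    piped = worker.forward(piped)
```

The real system additionally has to preserve privacy of the intermediate activations and tolerate workers going offline, which is exactly where the research lies; this sketch only shows the partitioning idea.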

  • Research application – Privacy-preserving collection, management, and training of data in a decentralized setting using Disco – Prof. Mary-Anne Hartley
    • What you will learn: how to run decentralized training on low-resource platforms, the privacy tradeoffs, and the security considerations. Participants will understand federated learning principles and practice implementing privacy-preserving ML pipelines.
    • Where it is used: in a clinical field trial where it trains an ML model to detect the resistance of bacteria to antibiotics. Similar approaches are used in healthcare, finance, and IoT applications where data privacy regulations restrict centralized data collection.
    • Collaborations: implement a use-case where an ML model needs to be trained on private data with only limited computing resources available. Partners can explore integrating these techniques into existing systems or developing new privacy-preserving data analytics solutions.
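This block is not Disco’s actual API; it is a minimal, language-agnostic sketch of the federated-averaging idea the workshop builds on. Each client fits a toy one-parameter linear model on its own private data and only the updated weight, never the raw data, leaves the client. The model, data, and learning rate are all invented for the example.

```python
import random

def local_update(weight, data, lr=0.1):
    """One gradient-descent step on a client's private data
    for a 1-D linear model y = w * x with squared loss."""
    grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - lr * grad

def federated_round(global_w, client_datasets):
    """Each client trains locally; only the updated weights
    (never the raw data) are sent back and averaged."""
    updates = [local_update(global_w, data) for data in client_datasets]
    return sum(updates) / len(updates)

random.seed(1)
true_w = 3.0
# Three clients, each holding private noisy samples of y = 3x.
clients = [[(x, true_w * x + random.gauss(0, 0.1))
            for x in (random.uniform(-1, 1) for _ in range(20))]
           for _ in range(3)]

w = 0.0
for _ in range(100):
    w = federated_round(w, clients)
# After enough rounds, w approaches the true weight 3.0 even though
# no party ever saw another party's data points.
```

Real deployments add secure aggregation and differential privacy on top of this averaging step; those are the tradeoffs the workshop covers.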

  • Research deep-dive – LLM agents: best practices and security issues from deceptive agents – Prof. Volkan Cevher
    • What you will learn: Understand how LLM agents can develop hidden objectives contrary to user instructions, identify detection techniques for recognizing deceptive behavior patterns, and explore defensive mechanisms to ensure alignment with intended goals.
    • Where it is used: Financial services security, autonomous systems validation, content moderation systems, and critical infrastructure protection where AI agents handle complex tasks with limited human oversight.
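As a minimal illustration of one detection idea mentioned above, auditing an agent’s proposed tool calls against a declared policy at runtime, here is a toy monitor. The tool names, argument schemas, and blocked domain are all invented for the example; this is not a complete defense, since a deceptive agent may behave compliantly while it knows it is observed.

```python
# Declared policy: which tools the agent may call, with which arguments.
ALLOWED_TOOLS = {
    "search_docs": {"query"},
    "send_email": {"to", "subject", "body"},
}
BLOCKED_RECIPIENT_DOMAINS = {"attacker.example"}

def audit_action(action):
    """Return a list of policy violations for one proposed tool call.

    An empty list means the action looks compliant. This is a toy
    allowlist monitor, illustrating runtime auditing only.
    """
    violations = []
    tool, args = action["tool"], action["args"]
    if tool not in ALLOWED_TOOLS:
        violations.append(f"unknown tool: {tool}")
    else:
        extra = set(args) - ALLOWED_TOOLS[tool]
        if extra:
            violations.append(f"unexpected arguments: {sorted(extra)}")
    if tool == "send_email":
        domain = args.get("to", "").rsplit("@", 1)[-1]
        if domain in BLOCKED_RECIPIENT_DOMAINS:
            violations.append(f"blocked recipient domain: {domain}")
    return violations

ok = audit_action({"tool": "search_docs", "args": {"query": "eBPF"}})
bad = audit_action({"tool": "send_email",
                    "args": {"to": "x@attacker.example", "subject": "hi",
                             "body": "internal report"}})
```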

Security

  • Research application – Finding bugs in eBPF application through fuzzing based on specifications – Prof. Sanidhya Kashyap
    • What you will learn: You’ll understand eBPF (extended Berkeley Packet Filter) technology and its applications in modern systems. Learn how to leverage application specifications to create effective fuzzing strategies that detect subtle and complex bugs in eBPF programs. Practical techniques for implementing specification-based fuzzing will be covered.
    • Where it is used: eBPF is widely deployed in network security (firewalls, intrusion detection), performance monitoring, and observability tools. Real-world implementations include Kubernetes networking (Cilium), Cloudflare’s DDoS protection, and Facebook’s performance monitoring systems. It’s becoming essential infrastructure for cloud-native applications.
    • Collaboration possibilities: Sanidhya seeks industry partners who currently implement eBPF in production environments to validate his research findings. Partners can benefit from improved security and reliability in their eBPF applications while contributing to advancing the field.
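The lab’s actual tooling targets eBPF bytecode and kernel interfaces, which cannot be reproduced here. The Python toy below only illustrates the principle of specification-based fuzzing: generate inputs that conform to a field specification, mutate them just past its boundaries, and watch for crashes that uniform random inputs would rarely trigger. The packet format, the buggy parser, and the spec are all invented for the example.

```python
import random

# Toy "specification" of a packet header: field name -> valid value range.
SPEC = {"version": (1, 2), "length": (0, 64), "ttl": (1, 255)}

def parse_packet(pkt):
    """Deliberately buggy parser standing in for an eBPF program."""
    if pkt["version"] not in (1, 2):
        raise ValueError("bad version")
    # BUG: trusts `length` without checking it against the payload size.
    return pkt["payload"][pkt["length"] - 1]

def spec_fuzz(spec, trials=1000, seed=0):
    """Generate spec-conforming inputs, then push one field out of range;
    collect inputs whose mutation triggers an unexpected crash."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        pkt = {f: rng.randint(lo, hi) for f, (lo, hi) in spec.items()}
        pkt["payload"] = bytes(rng.randrange(256) for _ in range(64))
        # Mutate a random field slightly beyond its specified range.
        f = rng.choice(list(spec))
        lo, hi = spec[f]
        pkt[f] = hi + rng.randint(1, 16)
        try:
            parse_packet(pkt)
        except ValueError:
            pass  # spec violation correctly rejected by the parser
        except IndexError:
            crashes.append(pkt)  # bug found: unchecked length field
    return crashes

crashes = spec_fuzz(SPEC)
```

Here every crash comes from an out-of-spec `length`, exactly the kind of boundary bug that knowing the specification makes cheap to find.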

Other

  • Research application – Privacy-preserving implementation of Swiyu: how to integrate the upcoming Swiss eID in your services, and future developments

Project Suggestions

These are more abstract than the hands-on workshops, as the subjects are fresh out of recently published papers. If one of these ideas interests you, we propose that you discuss it with our professors. We will set up the discussion and accompany you if you decide to establish a partnership, with or without a grant.


We split the suggested projects into two categories:

  • Research suggestion: multi-year research ideas to be explored by a PhD student paid by the partner or by a grant.
  • Research application: published research which needs further use-cases from industry to verify its usability, paid either by the partner or by a grant.

LLM and ML

  • Research suggestion: Trust in Large Language Models – Addressing alignment, copyright verification of training data, and mitigation of toxic outputs – Prof. Martin Jaggi
    • Where it is used: Large Language Models (LLMs) are increasingly deployed in customer service, content creation, code development, and decision support systems. Each application raises specific trust concerns including legal compliance, ethical output generation, and alignment with human values.
    • Collaboration: Support a PhD student in their research on trustworthy LLM models. This can be done through funding and by providing use-cases and data from real-world applications where trust is critical.

  • Research application: Training models from scratch which can be certified to be robust against adversarial perturbations – Prof. Volkan Cevher
    • Where it is used: Critical security applications where models must withstand malicious inputs designed to fool AI systems, such as autonomous vehicles, facial recognition for authentication, medical diagnosis, and financial fraud detection.
    • Collaboration: Partners can work with Prof. Cevher’s lab to develop customized training pipelines with mathematically provable robustness guarantees for their specific data and threat models. This includes access to cutting-edge optimization techniques that ensure both model performance and security against adversarial attacks.
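The lab’s exact certification method is not specified above; interval bound propagation is one standard technique for provable robustness, so here is a toy sketch of it on an invented two-class linear model. It checks that the target class wins for every input within an L-infinity ball of radius eps, which is a mathematical guarantee rather than an empirical test.

```python
import numpy as np

def interval_bounds(W, b, lo, hi):
    """Propagate elementwise input intervals [lo, hi] through y = W x + b.
    Standard interval arithmetic: split W into positive and negative parts."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    out_lo = W_pos @ lo + W_neg @ hi + b
    out_hi = W_pos @ hi + W_neg @ lo + b
    return out_lo, out_hi

def certify(W, b, x, eps, target):
    """Certify that class `target` wins for every x' with ||x' - x||_inf <= eps:
    its worst-case (lowest) logit must beat every other class's best case."""
    lo, hi = interval_bounds(W, b, x - eps, x + eps)
    return lo[target] > np.delete(hi, target).max()

W = np.array([[2.0, 1.0], [-1.0, 0.5]])  # toy 2-class linear classifier
b = np.array([0.0, 0.0])
x = np.array([1.0, 1.0])

certified_small = certify(W, b, x, eps=0.1, target=0)  # small ball: certifiable
certified_large = certify(W, b, x, eps=1.0, target=0)  # large ball: cannot certify
```

For deep networks the bounds are propagated layer by layer and become loose, which is why training the model from scratch with certification in mind, as proposed here, matters.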

Privacy and Security with Cryptography

  • Research suggestion: Privacy-Preserving Ranking Algorithms via Homomorphic Encryption – Prof. Matthias Grossglauser
    When researching ranking algorithms, specifically comparing rating-based approaches with comparison-based ones, researchers face challenges accessing proprietary datasets without signing NDAs. This project explores using homomorphic encryption to work with sensitive datasets while preserving privacy, enabling the publication of research results without exposing the original data.
    • Where it is used: This approach could transform recommendation systems by moving beyond traditional 5-star ratings toward comparative rankings. Users would rank products against similar alternatives, potentially yielding more nuanced preference data. Real-world applications include e-commerce platforms, content streaming services, and software marketplaces where preference data is valuable but sensitive.
    • Collaboration: Partners can support a PhD student investigating the feasibility of homomorphic encryption for privacy-preserving ranking research. The project would determine whether this approach enables reproducible research while maintaining data confidentiality. Alternatively, this could be structured as a Master Thesis Project. Prof. Matthias Grossglauser would provide academic supervision.
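A real project would rely on a vetted HE library; the toy Paillier instance below, with tiny fixed primes, is insecure and only illustrates why additively homomorphic encryption fits comparison-based ranking: an aggregator can tally encrypted pairwise preferences into a vote count without ever decrypting an individual vote. The vote values and primes are invented for the example.

```python
import math
import random

# Toy Paillier keypair with tiny fixed primes (illustration only, NOT secure).
p, q = 293, 433
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # valid because with g = 1 + n, L(g^lam mod n^2) = lam

def encrypt(m, rng):
    r = rng.randrange(1, n)  # assumes gcd(r, n) == 1, true w.h.p. here
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

rng = random.Random(42)
# Each client privately reports 1 if it prefers item A over item B, else 0.
votes = [1, 0, 1, 1, 0, 1, 1]
ciphertexts = [encrypt(v, rng) for v in votes]

# Multiplying Paillier ciphertexts adds the plaintexts, so the aggregator
# computes the total preference count without seeing any individual vote.
aggregate = math.prod(ciphertexts) % n2
total = decrypt(aggregate)
```

The same additive property lets a researcher compute rank statistics over a partner’s encrypted comparison data and publish only aggregate results, which is the feasibility question this project would investigate.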