Factory Update Spring 2025 Subjects

This is a curated list of proposed subjects for the upcoming year for our C4DT partners. You can find more suggestions here: Additional Subjects. They are split into two categories: hands-on workshops, which are one-day trainings on a given subject, and project suggestions, based on current research of our affiliated labs:


Summary of the Proposals

Potential Hands-on Workshops

The hands-on workshops proposed by the C4DT Factory allow you to take a one-day deep dive into a specific subject. They are geared towards project leaders in software engineering, with the morning session also suited to managerial roles. Besides learning a new topic, this is an excellent way to get in contact with our other partners, and of course with our labs, to discuss your challenges and ways to turn them into research.


In the morning, one of our affiliated professors presents the subject to the audience. In the afternoon, a hands-on workshop takes place, prepared by the C4DT Factory. We split the proposed hands-on workshops into three categories:

  • Research deep-dive: cutting-edge, published research subject to show upcoming trends
  • Research application: mature research result which can be applied to a concrete use-case with a partner. This means setting up a project to verify the research results in a real-life setting
  • Startup proof-of-concept: an early-stage startup with a first proof-of-concept, ready to enter the big world

LLM and ML

  • Startup proof-of-concept – Inference4All: privacy-preserving and cheaper usage of large models on office computer hardware without a data center – Prof. Rachid Guerraoui
    • What you will learn: How to distribute large machine learning models across multiple standard computers to run inference without expensive GPUs; techniques for preserving privacy when splitting models; practical implementation of distributed inference protocols in adversarial (availability) settings.
    • Where it is used: Financial institutions processing sensitive client data locally; healthcare organizations analyzing patient information without cloud exposure; SMEs wanting AI capabilities without cloud infrastructure costs; government agencies with regulatory requirements for data sovereignty.
    • Collaborations: Partners can propose industry-specific use cases to guide the platform’s evolution, participate in pilot deployments to validate performance and security in real environments, or set up a grant to include it in a local project.
    • More information: Pitch – Inference4All
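The pitch above does not disclose Inference4All’s implementation details. As a rough intuition for how inference can run without a data center, here is a toy sketch of pipeline-style model partitioning: consecutive layers are assigned to different worker machines, so no single machine ever holds the full model. All names and the network itself are invented for illustration.

```python
import numpy as np

class Worker:
    """Holds one slice of the model's layers; never sees the full weights."""
    def __init__(self, layers):
        self.layers = layers  # list of (weight matrix, bias) pairs

    def forward(self, x):
        # Run only this worker's slice and pass the activation onward.
        for w, b in self.layers:
            x = np.maximum(x @ w + b, 0.0)  # dense layer with ReLU
        return x

def split_model(layers, n_workers):
    """Partition consecutive layers across workers (pipeline parallelism)."""
    chunks = np.array_split(np.arange(len(layers)), n_workers)
    return [Worker([layers[i] for i in idx]) for idx in chunks]

rng = np.random.default_rng(0)
dims = [8, 16, 16, 16, 4]  # toy network: four dense layers
layers = [(rng.normal(size=(dims[i], dims[i + 1])), np.zeros(dims[i + 1]))
          for i in range(len(dims) - 1)]

workers = split_model(layers, n_workers=2)
x = rng.normal(size=(1, 8))

# Reference: run the whole model locally.
local = x
for w, b in layers:
    local = np.maximum(local @ w + b, 0.0)

# Pipelined: each worker only computes its own slice.
piped = x
for worker in workers:
    piped = worker.forward(piped)
```

The real system additionally has to preserve privacy of the intermediate activations and tolerate workers going offline, which is exactly where the research lies; this sketch only shows the partitioning idea.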

  • Research application – Privacy-preserving collection, management, and training of data in a decentralized setting using Disco – Prof. Mary-Anne Hartley
    • What you will learn: how to run decentralized training on low-resource platforms, the privacy tradeoffs, and the security considerations. Participants will understand federated learning principles and practice implementing privacy-preserving ML pipelines.
    • Where it is used: in a clinical field trial where it trains an ML model to detect the resistance of bacteria to antibiotics. Similar approaches are used in healthcare, finance, and IoT applications where data privacy regulations restrict centralized data collection.
    • Collaborations: implement a use-case where an ML model needs to be trained on private data with only limited computing resources available. Partners can explore integrating these techniques into existing systems or developing new privacy-preserving data analytics solutions.
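This block is not Disco’s actual API; it is a minimal, language-agnostic sketch of the federated-averaging idea the workshop builds on. Each client fits a toy one-parameter linear model on its own private data and only the updated weight, never the raw data, leaves the client. The model, data, and learning rate are all invented for the example.

```python
import random

def local_update(weight, data, lr=0.1):
    """One gradient-descent step on a client's private data
    for a 1-D linear model y = w * x with squared loss."""
    grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - lr * grad

def federated_round(global_w, client_datasets):
    """Each client trains locally; only the updated weights
    (never the raw data) are sent back and averaged."""
    updates = [local_update(global_w, data) for data in client_datasets]
    return sum(updates) / len(updates)

random.seed(1)
true_w = 3.0
# Three clients, each holding private noisy samples of y = 3x.
clients = [[(x, true_w * x + random.gauss(0, 0.1))
            for x in (random.uniform(-1, 1) for _ in range(20))]
           for _ in range(3)]

w = 0.0
for _ in range(100):
    w = federated_round(w, clients)
# After enough rounds, w approaches the true weight 3.0 even though
# no party ever saw another party's data points.
```

Real deployments add secure aggregation and differential privacy on top of this averaging step; those are the tradeoffs the workshop covers.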

  • Research deep-dive – LLM agents: best practices and security issues from deceptive agents – Prof. Volkan Cevher
    • What you will learn: Understand how LLM agents can develop hidden objectives contrary to user instructions, identify detection techniques for recognizing deceptive behavior patterns, and explore defensive mechanisms to ensure alignment with intended goals.
    • Where it is used: Financial services security, autonomous systems validation, content moderation systems, and critical infrastructure protection where AI agents handle complex tasks with limited human oversight.
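As a minimal illustration of one detection idea mentioned above, auditing an agent’s proposed tool calls against a declared policy at runtime, here is a toy monitor. The tool names, argument schemas, and blocked domain are all invented for the example; this is not a complete defense, since a deceptive agent may behave compliantly while it knows it is observed.

```python
# Declared policy: which tools the agent may call, with which arguments.
ALLOWED_TOOLS = {
    "search_docs": {"query"},
    "send_email": {"to", "subject", "body"},
}
BLOCKED_RECIPIENT_DOMAINS = {"attacker.example"}

def audit_action(action):
    """Return a list of policy violations for one proposed tool call.

    An empty list means the action looks compliant. This is a toy
    allowlist monitor, illustrating runtime auditing only.
    """
    violations = []
    tool, args = action["tool"], action["args"]
    if tool not in ALLOWED_TOOLS:
        violations.append(f"unknown tool: {tool}")
    else:
        extra = set(args) - ALLOWED_TOOLS[tool]
        if extra:
            violations.append(f"unexpected arguments: {sorted(extra)}")
    if tool == "send_email":
        domain = args.get("to", "").rsplit("@", 1)[-1]
        if domain in BLOCKED_RECIPIENT_DOMAINS:
            violations.append(f"blocked recipient domain: {domain}")
    return violations

ok = audit_action({"tool": "search_docs", "args": {"query": "eBPF"}})
bad = audit_action({"tool": "send_email",
                    "args": {"to": "x@attacker.example", "subject": "hi",
                             "body": "internal report"}})
```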

Security

  • Research application – Finding bugs in eBPF application through fuzzing based on specifications – Prof. Sanidhya Kashyap
    • What you will learn: You’ll understand eBPF (extended Berkeley Packet Filter) technology and its applications in modern systems. Learn how to leverage application specifications to create effective fuzzing strategies that detect subtle and complex bugs in eBPF programs. Practical techniques for implementing specification-based fuzzing will be covered.
    • Where it is used: eBPF is widely deployed in network security (firewalls, intrusion detection), performance monitoring, and observability tools. Real-world implementations include Kubernetes networking (Cilium), Cloudflare’s DDoS protection, and Facebook’s performance monitoring systems. It’s becoming essential infrastructure for cloud-native applications.
    • Collaboration possibilities: Sanidhya seeks industry partners who currently implement eBPF in production environments to validate his research findings. Partners can benefit from improved security and reliability in their eBPF applications while contributing to advancing the field.
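The lab’s actual tooling targets eBPF bytecode and kernel interfaces, which cannot be reproduced here. The Python toy below only illustrates the principle of specification-based fuzzing: generate inputs that conform to a field specification, mutate them just past its boundaries, and watch for crashes that uniform random inputs would rarely trigger. The packet format, the buggy parser, and the spec are all invented for the example.

```python
import random

# Toy "specification" of a packet header: field name -> valid value range.
SPEC = {"version": (1, 2), "length": (0, 64), "ttl": (1, 255)}

def parse_packet(pkt):
    """Deliberately buggy parser standing in for an eBPF program."""
    if pkt["version"] not in (1, 2):
        raise ValueError("bad version")
    # BUG: trusts `length` without checking it against the payload size.
    return pkt["payload"][pkt["length"] - 1]

def spec_fuzz(spec, trials=1000, seed=0):
    """Generate spec-conforming inputs, then push one field out of range;
    collect inputs whose mutation triggers an unexpected crash."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        pkt = {f: rng.randint(lo, hi) for f, (lo, hi) in spec.items()}
        pkt["payload"] = bytes(rng.randrange(256) for _ in range(64))
        # Mutate a random field slightly beyond its specified range.
        f = rng.choice(list(spec))
        lo, hi = spec[f]
        pkt[f] = hi + rng.randint(1, 16)
        try:
            parse_packet(pkt)
        except ValueError:
            pass  # spec violation correctly rejected by the parser
        except IndexError:
            crashes.append(pkt)  # bug found: unchecked length field
    return crashes

crashes = spec_fuzz(SPEC)
```

Here every crash comes from an out-of-spec `length`, exactly the kind of boundary bug that knowing the specification makes cheap to find.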

Other

  • Research application – Privacy-preserving implementation of Swiyu: how to integrate the upcoming Swiss eID in your services, and future developments

Project Suggestions

These are more abstract than the hands-on workshops, as the subjects are fresh out of recently published papers. If one of these ideas interests you, we propose that you discuss it with our professors. We will set up the discussion and accompany you if you decide to establish a partnership, with or without a grant.


We split the suggested projects into two categories:

  • Research suggestion: multi-year research ideas to be explored by a PhD student paid by the partner or by a grant.
  • Research application: published research which needs further use-cases from industry to verify its usability, paid either by the partner or by a grant.

LLM and ML

  • Research suggestion: Trust in Large Language Models – Addressing alignment, copyright verification of training data, and mitigation of toxic outputs – Prof. Martin Jaggi
    • Where it is used: Large Language Models (LLMs) are increasingly deployed in customer service, content creation, code development, and decision support systems. Each application raises specific trust concerns including legal compliance, ethical output generation, and alignment with human values.
    • Collaboration: Support a PhD student in their research on trustworthy LLM models. This can be done through funding and by providing use-cases and data from real-world applications where trust is critical.

  • Research application: Training models from scratch which can be certified to be robust against adversarial perturbations – Prof. Volkan Cevher
    • Where it is used: Critical security applications where models must withstand malicious inputs designed to fool AI systems, such as autonomous vehicles, facial recognition for authentication, medical diagnosis, and financial fraud detection.
    • Collaboration: Partners can work with Prof. Cevher’s lab to develop customized training pipelines with mathematically provable robustness guarantees for their specific data and threat models. This includes access to cutting-edge optimization techniques that ensure both model performance and security against adversarial attacks.
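The lab’s exact certification method is not specified above; interval bound propagation is one standard technique for provable robustness, so here is a toy sketch of it on an invented two-class linear model. It checks that the target class wins for every input within an L-infinity ball of radius eps, which is a mathematical guarantee rather than an empirical test.

```python
import numpy as np

def interval_bounds(W, b, lo, hi):
    """Propagate elementwise input intervals [lo, hi] through y = W x + b.
    Standard interval arithmetic: split W into positive and negative parts."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    out_lo = W_pos @ lo + W_neg @ hi + b
    out_hi = W_pos @ hi + W_neg @ lo + b
    return out_lo, out_hi

def certify(W, b, x, eps, target):
    """Certify that class `target` wins for every x' with ||x' - x||_inf <= eps:
    its worst-case (lowest) logit must beat every other class's best case."""
    lo, hi = interval_bounds(W, b, x - eps, x + eps)
    return lo[target] > np.delete(hi, target).max()

W = np.array([[2.0, 1.0], [-1.0, 0.5]])  # toy 2-class linear classifier
b = np.array([0.0, 0.0])
x = np.array([1.0, 1.0])

certified_small = certify(W, b, x, eps=0.1, target=0)  # small ball: certifiable
certified_large = certify(W, b, x, eps=1.0, target=0)  # large ball: cannot certify
```

For deep networks the bounds are propagated layer by layer and become loose, which is why training the model from scratch with certification in mind, as proposed here, matters.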

Privacy and Security with Cryptography

  • Research suggestion: Privacy-Preserving Ranking Algorithms via Homomorphic Encryption – Prof. Matthias Grossglauser
    When researching ranking algorithms, specifically comparing rating-based approaches with comparison-based ones, researchers face challenges accessing proprietary datasets without signing NDAs. This project explores using homomorphic encryption to work with sensitive datasets while preserving privacy, enabling the publication of research results without exposing the original data.
    • Where it is used: This approach could transform recommendation systems by moving beyond traditional 5-star ratings toward comparative rankings. Users would rank products against similar alternatives, potentially yielding more nuanced preference data. Real-world applications include e-commerce platforms, content streaming services, and software marketplaces where preference data is valuable but sensitive.
    • Collaboration: Partners can support a PhD student investigating the feasibility of homomorphic encryption for privacy-preserving ranking research. The project would determine whether this approach enables reproducible research while maintaining data confidentiality. Alternatively, this could be structured as a Master Thesis Project. Prof. Matthias Grossglauser would provide academic supervision.
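A real project would rely on a vetted HE library; the toy Paillier instance below, with tiny fixed primes, is insecure and only illustrates why additively homomorphic encryption fits comparison-based ranking: an aggregator can tally encrypted pairwise preferences into a vote count without ever decrypting an individual vote. The vote values and primes are invented for the example.

```python
import math
import random

# Toy Paillier keypair with tiny fixed primes (illustration only, NOT secure).
p, q = 293, 433
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # valid because with g = 1 + n, L(g^lam mod n^2) = lam

def encrypt(m, rng):
    r = rng.randrange(1, n)  # assumes gcd(r, n) == 1, true w.h.p. here
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

rng = random.Random(42)
# Each client privately reports 1 if it prefers item A over item B, else 0.
votes = [1, 0, 1, 1, 0, 1, 1]
ciphertexts = [encrypt(v, rng) for v in votes]

# Multiplying Paillier ciphertexts adds the plaintexts, so the aggregator
# computes the total preference count without seeing any individual vote.
aggregate = math.prod(ciphertexts) % n2
total = decrypt(aggregate)
```

The same additive property lets a researcher compute rank statistics over a partner’s encrypted comparison data and publish only aggregate results, which is the feasibility question this project would investigate.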