Symposium on Machine Learning across Modalities

Join us in Davies Auditorium, New Haven, CT, USA, on April 10th, 2026, for a deep dive into multimodal representation learning and foundation models.

Event
Yale University • Davies Auditorium, New Haven, CT • April 10th, 2026
  • Format: Invited Talks • Posters • Panel
See Schedule

Machine learning's progress with isolated modalities—text, vision, audio—masks a fundamental gap: intelligence emerges from integrating diverse data sources. Developing unified models that reason coherently across modalities remains one of AI's defining challenges.

Multimodal learning bridges this gap by enabling systems to jointly learn from and align information across heterogeneous data sources. This paradigm promises richer representations, stronger generalization, and more robust reasoning, with implications for science, healthcare, robotics, and human-centered AI.

This symposium brings together researchers and practitioners exploring the foundations and frontiers of multimodal intelligence. We seek to spark cross-disciplinary discussion on architectures, alignment strategies, and applications that will shape the future of AI systems that understand the world as we do.

2. Call for Contributions (Topics & Scope)

Theoretical Foundations

  • Principles of multimodal alignment and representation learning
  • Information-theoretic perspectives on cross-modal fusion
  • Modeling inter-modal dependencies and disentanglement
  • Foundations of transfer, generalization, and scaling across modalities

Architectures & Algorithms

  • Multimodal transformers, diffusion models, and joint encoders
  • Cross-attention, alignment, and fusion mechanisms
  • Contrastive and generative approaches for multi-sensor data
  • Learning shared latent spaces across modalities

Applications

  • Vision–language and audio–language understanding
  • Multimodal reasoning in healthcare, robotics, and science
  • Cross-modal retrieval, grounding, and embodied AI
  • Learning with missing or weakly paired modalities

Single-Modality Research

  • Work focusing on a single modality (e.g., vision, language, or audio)
  • Advances that push the state of the art in individual modalities
  • Building blocks for multimodal integration
  • Inspiration for cross-modal reasoning, alignment, or transfer

Trustworthiness & Tools

  • Bias, fairness, and interpretability in multimodal models
  • Evaluation benchmarks and standardized datasets
  • Toolkits for multimodal foundation model training and visualization
  • Efficient and sustainable multimodal learning at scale

3. Submissions & Important Dates

One-page extended abstract (excluding references and appendix); non‑archival by default.

  • Submission deadline: March 25, 2026
  • Author notification: April 3, 2026
  • Early registration deadline: April 3, 2026
  • Symposium date: April 10, 2026

4. Tentative Schedule

Time        Session
8:30–8:50   Poster setup
8:50–9:00   Opening remarks
9:00–9:50   Keynote - Ruslan Salakhutdinov (Carnegie Mellon University)
9:50–10:40  Keynote - Alan Yuille (Johns Hopkins University)
10:40–11:00 Discussion & coffee break
11:00–11:50 Highlighted talks: Manling Li, Ruohan Gao
11:50–1:40  Poster session & lunch break
1:50–2:40   Highlighted talk: Kayhan Batmanghelich
2:40–3:30   Keynote - Atlas Wang (UT Austin)
3:30–3:45   Discussion & coffee break
3:45–4:35   Keynote - Jian Pei (Duke University)
4:35–5:25   Highlighted talks: Yunzhu Li, Yifeng Gao
5:25–6:00   Panel: all keynote speakers
6:00–6:10   Concluding remarks
6:10–8:00   Social (optional)

5. Organizers


6. Keynote Speakers

  • Atlas Wang (UT Austin)
  • Ruslan Salakhutdinov (Carnegie Mellon University)
  • Jian Pei (Duke University)
  • Alan Yuille (Johns Hopkins University)

7. Highlighted Talks

  • Manling Li (Northwestern University)
  • Kayhan Batmanghelich (Boston University)
  • Yunzhu Li (Columbia University)
  • Ruohan Gao (University of Maryland, College Park)
  • Yifeng Gao (The University of Texas Rio Grande Valley)

8. FAQ

Can I attend virtually?

All talks will be live‑streamed. Posters are in‑person only; we recommend attending in person if possible.

What does non‑archival mean?

Your submission will not appear in formal proceedings or be indexed; you retain the right to publish elsewhere.

Contact

Email us: hiren.madhu@yale.edu