Symposium on Machine Learning across Modalities

Machine learning's progress with isolated modalities—text, vision, audio—masks a fundamental gap: intelligence emerges from integrating diverse data sources. Developing unified models that reason coherently across modalities remains one of AI's defining challenges.

Multimodal learning bridges this gap by enabling systems to jointly learn from and align information across heterogeneous data sources. This paradigm promises richer representations, stronger generalization, and more robust reasoning, with implications for science, healthcare, robotics, and human-centered AI.

This workshop brings together researchers and practitioners exploring the foundations and frontiers of multimodal intelligence. We seek to spark cross-disciplinary discussion on the architectures, alignment strategies, and applications that will shape AI systems able to understand the world as we do.

News

Mar 2026
We are accepting papers for oral and poster presentations. Paper deadline: March 25, 2026. See the Call for Contributions.
Mar 2026
Submit your poster or register for early attendance via the Submission & Registration Form ↗. Early registration deadline: April 3, 2026.

2. Call for Contributions (Topics & Scope)

Theoretical Foundations

Principles of multimodal alignment and representation learning
Information-theoretic perspectives on cross-modal fusion
Modeling inter-modal dependencies and disentanglement
Foundations of transfer, generalization, and scaling across modalities

Architectures & Algorithms

Multimodal transformers, diffusion models, and joint encoders
Cross-attention, alignment, and fusion mechanisms
Contrastive and generative approaches for multi-sensor data
Learning shared latent spaces across modalities

Applications

Vision–language and audio–language understanding
Multimodal reasoning in healthcare, robotics, and science
Cross-modal retrieval, grounding, and embodied AI
Learning with missing or weakly paired modalities

Single-Modality Research

Work focusing on a single modality (e.g., vision, language, or audio)
Advances that push the state of the art in individual modalities
Building blocks for multimodal integration
Inspirations for cross-modal reasoning, alignment, or transfer

Trustworthiness & Tools

Bias, fairness, and interpretability in multimodal models
Evaluation benchmarks and standardized datasets
Toolkits for multimodal foundation model training and visualization
Efficient and sustainable multimodal learning at scale

Submit / Register ↗ LaTeX Template ↗

3. Submissions & Important Dates

Submissions are 1‑page extended abstracts (excluding references and appendix), non‑archival by default.

Paper deadline: March 25, 2026
Author notification: April 3, 2026
Workshop early registration: April 3, 2026
Workshop date: April 10, 2026

4. Tentative Schedule

Time	Session
8:15–8:50	Registration and poster setup
8:50–9:00	Opening remarks
9:00–9:50	Keynote: Ruslan Salakhutdinov (Carnegie Mellon University)
9:50–10:40	Keynote: James Duncan (Yale University)
10:40–11:05	Coffee break & discussions
11:05–11:55	Highlighted talks: Manling Li (Northwestern), Ruohan Gao (UMD)
11:55–12:10	Remarks: Jeffrey Brock, Dean, Yale School of Engineering & Applied Science
12:10–1:40	Lunch break & poster session
1:40–2:05	Highlighted talk: Kayhan Batmanghelich (BU)
2:05–2:55	Keynote: Atlas Wang (UT Austin)
2:55–3:20	Coffee break & discussions
3:20–4:10	Keynote: Jian Pei (Duke)
4:10–5:00	Highlighted talks: Yunzhu Li (Columbia), Yifeng Gao (UTRGV)
5:00–6:00	Panel: All keynote speakers
6:00–6:10	Concluding remarks
6:10–8:00	Social

5. Organizers

Extended organizers shown where noted.

Rex Ying

Yale

Smita Krishnaswamy

Yale

Arman Cohan

Yale

Alex Wong

Yale

Hiren Madhu

Yale

Hyoungseob Park

Yale

Jenny Lee

Yale

Shania Guo

Yale

Mary Xie

Yale

Daniel Wang

Yale

Suchisrit (Rit) Gangopadhyay

Yale

6. Keynote Speakers

Atlas Wang

UT Austin

Ruslan Salakhutdinov

Carnegie Mellon University

Jian Pei

Duke University

James Duncan

Yale University

7. Highlighted Talks

Manling Li

Northwestern University

Kayhan Batmanghelich

Boston University

Yunzhu Li

Columbia University

Ruohan Gao

University of Maryland, College Park

Yifeng Gao

The University of Texas Rio Grande Valley

8. Attending SMLM

Traveling to New Haven

For those arriving by train:
We recommend getting off at Union Station.

View Union Station on Google Maps

For those arriving by car:
Due to the large number of attendees, we recommend the Grove Street Parking Garage, one of the larger garages in the area.

Grove Street Garage (LAZ Parking): View garage on Google Maps

Alternatively, you can search for parking in New Haven on ParkMobile, but note that some options may be limited to 2‑hour meters.

Recommended Lodging

9. FAQ

Can I attend virtually?

Unfortunately, no. We do not offer a virtual attendance option.

What does non‑archival mean?

Your submission will not appear in formal proceedings or be indexed; you retain the right to publish elsewhere.

Contact

Email us: hiren.madhu@yale.edu

10. Sponsor

AIJ

Proudly sponsored by the Yale School of Engineering and Applied Science AI Seed Grant and Futurewei Technologies.