Paul G. Allen Center for Computer Science & Engineering, Microsoft Atrium
Sheng Wang
Paul G. Allen School of Computer Science & Engineering
Generative AI for Multimodal Biomedicine
Abstract
Biomedicine is inherently multimodal, spanning imaging modalities such as pathology, CT, MRI, X-ray and ultrasound, as well as omics modalities such as genomics, epigenomics and transcriptomics. General-domain multimodal approaches are not applicable to biomedicine because biomedical images are very different from general-domain images, which necessitates the development of modality-specific approaches. In this talk, I will introduce three recent works towards building multimodal biomedical foundation models. First, I will introduce GigaPath, the first whole-slide pathology foundation model that can handle gigapixel-level pathology images. GigaPath exploits a novel vision transformer architecture and achieves state-of-the-art results on 23 out of 26 cancer tasks, including subtyping and biomarker prediction. Next, I will introduce OCTCube, the first 3D OCT retinal imaging foundation model. OCTCube significantly outperforms 2D models on 27 out of 29 tasks, including retinal disease prediction, cross-modality analysis, cross-device generalization and systemic disease prediction. Finally, I will introduce BiomedParse, a multimodal foundation model that integrates 9 major biomedical imaging modalities by projecting all of them into the text space, resulting in superior performance on segmentation, detection, and recognition and paving the way for large-scale image-based biomedical discovery. I will conclude this talk with a discussion of how multimodal generative AI can advance future medical applications through multi-agent frameworks and integration with multi-omics datasets.
Bio
Sheng Wang is an assistant professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, Seattle. He obtained his B.S. degree in Computer Science from Peking University and his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign, and completed his postdoctoral training at Stanford School of Medicine. Sheng is currently interested in developing large-scale models for biomedical applications, with a focus on digital pathology, medical imaging foundation models, chromatin structure prediction, and genomics-based drug discovery. His research has been published in top venues such as Nature, Science, Nature Biotechnology, Nature Methods, Nature Machine Intelligence and The Lancet Oncology, and has been used by major biomedical institutes, including Mayo Clinic, Chan Zuckerberg Biohub, UW Medicine, and Providence Genomics.