Mohamed El Banani

I am a Ph.D. candidate in Computer Science at the University of Michigan working with Justin Johnson.

I am broadly interested in computer vision, machine learning, and cognitive science. My goal is to build systems that learn to represent their visual world without supervision and generalize to novel objects and scenes.

During my Ph.D., I was fortunate to work with David Fouhey and John Laird at UM, Benjamin Graham at FAIR, and Varun Jampani at Google Research. I did my undergraduate studies at Georgia Tech, where I got the chance to work with Maithilee Kunda and Jim Rehg on cognitive modeling, as well as Omer Inan and Todd Sulchek on biomedical devices.

news

Oct 2023 I gave a talk at the Stanford Vision and Learning Lab on cross-modal correspondence.
Sep 2023 I chatted in Arabic with Abdelrahman Mohamed about my research. (video on AI بالمصري)
May 2023 I am spending the summer at Google Research working with Varun Jampani.
Apr 2023 I am serving as a mentor for the Fatima Fellowship.
Feb 2023 Our work on language-guided self-supervised learning was accepted at CVPR 2023.

publications

  1. Learning Visual Representations via Language-Guided Sampling
    Mohamed El Banani, Karan Desai, and Justin Johnson
    In CVPR, 2023
    TL;DR: A picture is worth a thousand words, but a caption can describe a thousand images. We use language models to find image pairs with similar captions, and use them for stronger contrastive learning.
  2. Self-Supervised Correspondence Estimation via Multiview Registration
    In WACV, 2023
    TL;DR: Self-supervised correspondence estimation struggles with wide-baseline images. We use multiview registration and SE(3) transformation synchronization to leverage long-term consistency in RGB-D video.
  3. Bootstrap Your Own Correspondences
    Mohamed El Banani and Justin Johnson
    In ICCV, 2021 (Oral)
    TL;DR: Good features yield accurate correspondences, and accurate correspondences are good for feature learning. We leverage this cycle to learn visual and geometric features via self-supervised point cloud registration.
  4. UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering
    Mohamed El Banani, Luya Gao, and Justin Johnson
    In CVPR, 2021 (Oral)
    TL;DR: Can we learn point cloud registration from RGB-D video? We propose a register-and-render approach that learns by minimizing photometric and geometric losses between nearby frames.
  5. Novel Object Viewpoint Estimation through Reconstruction Alignment
    Mohamed El Banani, Jason J Corso, and David F Fouhey
    In CVPR, 2020
    TL;DR: Humans cannot help but see the 3D structure of novel objects, so aligning their viewpoints becomes easy. We propose a reconstruct-and-align approach for novel-object viewpoint estimation.
  6. A Computational Exploration of Problem-Solving Strategies and Gaze Behaviors on the Block Design Task
    Maithilee Kunda, Mohamed El Banani, and James M Rehg
    In CogSci, 2016
    TL;DR: We present a computational architecture to model problem-solving strategies on the block design task. We generate detailed behavioral predictions and analyze cross-strategy error patterns.
  7. A Pilot Study of a Modified Bathroom Scale to Monitor Cardiovascular Hemodynamics in Pregnancy
    Odayme Quesada, Mohamed El Banani, James Heller, Shire Beach, Mozziyar Etemadi, Shuvo Roy, Omer Inan, Juan Gonzalez, and Liviu Klein
    Journal of the American College of Cardiology, 2016
    TL;DR: We use ballistocardiogram measurements extracted from a modified bathroom scale to analyze maternal cardiovascular adaptation during pregnancy for low-cost detection of preeclampsia.
  8. Three-dimensional particle tracking in microfluidic channel flow using in and out of focus diffraction
    Bushra Tasadduq, Gonghao Wang, Mohamed El Banani, Wenbin Mao, Wilbur Lam, Alexander Alexeev, and Todd Sulchek
    Flow Measurement and Instrumentation, 2015
    TL;DR: We use defocusing patterns to extract 3D particle motion trajectories in 2D bright field videos of microfluidic devices.