Publications

*: Equal contribution. †: Corresponding author.

2024

  1. arXiv
    Beyond Uncertainty: Evidential Deep Learning for Robust Video Temporal Grounding
    Kaijing Ma*, Haojian Huang*, Jin Chen*, Haodong Chen, Pengliang Ji, Xianghao Zang, Han Fang, Chao Ban, Hao Sun, Mulin Chen, and others
    arXiv preprint arXiv:2408.16272, 2024
  2. ACM MM
    GOAL: Grounded text-to-image Synthesis with Joint Layout Alignment Tuning
    Yaqi Li, Han Fang, Zerun Feng, Kaijing Ma, Chao Ban, Xianghao Zang, LanXiang Zhou, Zhongjiang He, Jingyan Chen, Jiani Hu, and others
    In ACM Multimedia, 2024
  3. arXiv
    Trusted Unified Feature-Neighborhood Dynamics for Multi-View Classification
    Haojian Huang, Chuanyu Qin, Zhe Liu, Kaijing Ma, Jin Chen, Han Fang, Chao Ban, Hao Sun, and Zhongjiang He
    arXiv preprint arXiv:2409.00755, 2024
  4. arXiv
    BoViLA: Bootstrapping Video-Language Alignment via LLM-Based Self-Questioning and Answering
    Jin Chen, Kaijing Ma, Haojian Huang, Jiayu Shen, Han Fang, Xianghao Zang, Chao Ban, Zhongjiang He, Hao Sun, and Yanmei Kang
    arXiv preprint arXiv:2410.02768, 2024

2023

  1. ICCVW
    LLaViLo: Boosting Video Moment Retrieval via Adapter-Based Multimodal Modeling
    Kaijing Ma*, Xianghao Zang*, Zerun Feng, Han Fang, Chao Ban, Yuhan Wei, Zhongjiang He, Yongxiang Li, and Hao Sun
    In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023