About Me
I'm Junlong Tong, a Ph.D. candidate in Computer Science at Shanghai Jiao Tong University and Eastern Institute of Technology (supervised by Dr. Xiaoyu Shen).
My research focuses on multimodal foundation models for dynamic real-world environments, with the goal of enabling models to perceive, reason, and interact continuously and proactively in real time. In particular, I focus on two complementary directions:
- Streaming LLMs/MLLMs, which study real-time multimodal perception, concurrent reasoning, and proactive interaction over continuous, long-horizon multimodal streams.
- Efficient LLMs/MLLMs, which investigate data/token/KV cache compression, layer pruning, and efficient inference for scalable deployment.
More broadly, I aim to build multimodal foundation models that are both capable of operating in continuous real-world settings and efficient enough for practical deployment at scale, serving as the foundation for multimodal agents. I also have research experience in LLMs for time-series modeling, which complements my broader interest in temporal modeling over continuous data.
💡 Seeking Research Internship Opportunities & Collaborations: I am actively seeking research internship opportunities in LLMs/MLLMs and am always open to academic collaborations. Feel free to reach out to me via jl-tong@sjtu.edu.cn!
News ✨
[2026.04] The survey of streaming LLMs has been accepted by ACL 2026 Findings.
[2026.02] Three papers are accepted by CVPR 2026.
[2026.01] Two papers are accepted by ICLR 2026.
[2025.08] One paper is accepted by EMNLP 2025.
[2025.06] One paper is accepted by ICCV 2025.
[2025.05] One paper is accepted by ACL 2025 Findings.
[2025.05] One paper is accepted by ICML 2025.
Publications
A full list of publications can be found on Google Scholar.
(* Equal Contribution, † Corresponding Author)
- Streaming LLMs/MLLMs (Demo)
- From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models. [PDF] [Repository]
  Junlong Tong, Zilong Wang, YuJie Ren, Peiran Yin, Hao Wu, Wei Zhang, Xiaoyu Shen†.
  Findings of ACL 2026.
- StreamingThinker: Large Language Models Can Think While Reading. [PDF] [Code] [Project]
  Junlong Tong, Yingqi Fan, Anhao Zhao, Yunpu Ma, Xiaoyu Shen†.
  ICLR 2026.
- Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models. [PDF] [Code] [Project]
  Jialiang Zhang*, Junlong Tong*, Junyan Lin, Hao Wu, Yunpu Ma, Xiaoyu Shen†.
  CVPR 2026.
- LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding. [PDF] [Code]
  Junlong Tong, Jinlan Fu, Zixuan Lin, Yingqi Fan, Anhao Zhao, Hui Su, Xiaoyu Shen†.
  Findings of ACL 2025.
- ProactiveLLM: Learning Active Interaction for Streaming Large Language Models.
  Junlong Tong, Yao Zhang, Anhao Zhao, Yingqi Fan, Yunpu Ma, Xiaoyu Shen†.
  Under review.
- Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models. [PDF] [Code]
  Junyan Lin*, Junlong Tong*, Hao Wu, Jialiang Zhang, Jinming Liu, Xin Jin, Xiaoyu Shen†.
  Under review.
- Efficient LLMs/MLLMs
- Context Guided Transformer Entropy Modeling for Video Compression. [PDF]
  Junlong Tong, Wei Zhang, Yaohui Jin, Xiaoyu Shen†.
  ICCV 2025.
- What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models. [PDF] [Code]
  Yingqi Fan, Junlong Tong, Anhao Zhao, Xiaoyu Shen†.
  CVPR 2026 (Highlight).
- HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit. [PDF] [Code]
  Hao Wu, Yingqi Fan, Jinyang Dai, Junlong Tong, Yunpu Ma, Xiaoyu Shen†.
  ICLR 2026.
- VisiPruner: Decoding Discontinuous Cross-Modal Dynamics for Efficient Multimodal LLMs. [PDF]
  Yingqi Fan, Anhao Zhao, Jinlan Fu, Junlong Tong, Hui Su, Yijie Pan, Wei Zhang, Xiaoyu Shen†.
  EMNLP 2025.
- SkipGPT: Each Token is One of a Kind. [PDF] [Code]
  Anhao Zhao, Fanghua Ye, Yingqi Fan, Junlong Tong, Jing Xiong, Zhiwei Fei, Hui Su, Xiaoyu Shen†.
  ICML 2025.
- From Data to Model: A Survey of the Compression Lifecycle in MLLMs. [PDF] [Repository]
  Hao Wu*, Junlong Tong*, Xudong Wang, Yang Tan, Changyu Zeng, Anastasia Antsiferova, Xiaoyu Shen†.
  Under review.
- LLM for Sequence Modeling
- Rethinking the Role of LLMs in Time Series Forecasting. [PDF] [Code]
  Xin Qiu*, Junlong Tong*, Yirong Sun, Yunpu Ma, Wei Zhang, Xiaoyu Shen†.
  Under review.
- The Few Govern the Many: Unveiling Few-Layer Dominance for Time Series Models. [PDF] [Code]
  Xin Qiu*, Junlong Tong*, Yirong Sun, Yunpu Ma, Xiaoyu Shen†.
  Under review.
- Probabilistic Decomposition Transformer for Time Series Forecasting. [PDF] [Code]
  Junlong Tong, Liping Xie, Kanjian Zhang.
  SIAM International Conference on Data Mining (SDM 2023).
- Enhancing Time Series Forecasting: A Hierarchical Transformer with Probabilistic Decomposition Representation. [PDF]
  Junlong Tong, Liping Xie, Wankou Yang, Kanjian Zhang, Junsheng Zhao.
  Information Sciences 2023.
- Hourly Solar Irradiance Forecasting Based on Encoder–Decoder Model Using Series Decomposition and Dynamic Error Compensation. [PDF]
  Junlong Tong, Liping Xie, Shixiong Fang, Wankou Yang, Kanjian Zhang.
  Energy Conversion and Management 2022.
Education
- Ph.D. candidate in Computer Science at Shanghai Jiao Tong University (2023 - present), Shanghai, China
- M.Eng. in Artificial Intelligence at Southeast University (2020 - 2023), Nanjing, China
- B.Eng. in Electrical Engineering at Shanghai University of Electric Power (2016 - 2020), Shanghai, China
Honors and Awards
- Outstanding Master's Thesis Award of Jiangsu Province, 2023
- SDM 2023 Student Travel Award, 2023
Academic Services
- Conference Reviewer: ICML, NeurIPS, ACL, CVPR, ECCV, AAAI
- Journal Reviewer: TCSVT, TNNLS
Contact
Email: jl-tong@sjtu.edu.cn
