About Me
I'm Junlong Tong, a Ph.D. candidate in Computer Science at Shanghai Jiao Tong University and Eastern Institute of Technology (supervised by Dr. Xiaoyu Shen).
My research focuses on multimodal foundation models for dynamic real-world environments, with the goal of enabling models to perceive, reason, and interact continuously and proactively in real time. In particular, I focus on two complementary directions:
- Streaming LLMs/MLLMs, which study real-time multimodal perception, concurrent reasoning, and proactive interaction over continuous, long-horizon multimodal streams.
- Efficient LLMs/MLLMs, which investigate data/token/KV cache compression, layer pruning, and efficient inference for scalable deployment.
More broadly, I aim to build multimodal foundation models that are both capable of operating in continuous real-world settings and efficient enough for practical deployment at scale, serving as the foundation for multimodal agents. I also have research experience in LLMs for time-series modeling, which complements my broader interest in temporal modeling over continuous data.
💡 Seeking Research Internship Opportunities & Collaborations: I am actively seeking research internship opportunities in LLMs/MLLMs and am always open to academic collaborations. Feel free to reach out to me via jl-tong@sjtu.edu.cn!
News ✨
[2026.04] The survey of streaming LLMs has been accepted by ACL 2026 Findings.
[2026.02] Three papers are accepted by CVPR 2026.
[2026.01] Two papers are accepted by ICLR 2026.
[2025.08] One paper is accepted by EMNLP 2025.
[2025.06] One paper is accepted by ICCV 2025.
[2025.05] One paper is accepted by ACL 2025 Findings.
[2025.05] One paper is accepted by ICML 2025.
Publications
A full list of publications can be found on Google Scholar.
(* Equal Contribution, † Corresponding Author)
- Streaming LLMs/MLLMs (Demo)
- From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models. [PDF] [Repository]
  Junlong Tong, Zilong Wang, YuJie Ren, Peiran Yin, Hao Wu, Wei Zhang, Xiaoyu Shen†.
  Findings of ACL 2026.
- StreamingThinker: Large Language Models Can Think While Reading. [PDF] [Code] [Project]
  Junlong Tong, Yingqi Fan, Anhao Zhao, Yunpu Ma, Xiaoyu Shen†.
  ICLR 2026.
- Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models. [PDF] [Code] [Project]
  Jialiang Zhang*, Junlong Tong*, Junyan Lin, Hao Wu, Yunpu Ma, Xiaoyu Shen†.
  CVPR 2026.
- LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding. [PDF] [Code]
  Junlong Tong, Jinlan Fu, Zixuan Lin, Yingqi Fan, Anhao Zhao, Hui Su, Xiaoyu Shen†.
  Findings of ACL 2025.
- ProactiveLLM: Learning Active Interaction for Streaming Large Language Models.
  Junlong Tong, Yao Zhang, Anhao Zhao, Yingqi Fan, Yunpu Ma, Xiaoyu Shen†.
  Under review.
- Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models. [PDF] [Code]
  Junyan Lin*, Junlong Tong*, Hao Wu, Jialiang Zhang, Jinming Liu, Xin Jin, Xiaoyu Shen†.
  Under review.
- Efficient LLMs/MLLMs
- Context Guided Transformer Entropy Modeling for Video Compression. [PDF]
  Junlong Tong, Wei Zhang, Yaohui Jin, Xiaoyu Shen†.
  ICCV 2025.
- What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models. [PDF] [Code]
  Yingqi Fan, Junlong Tong, Anhao Zhao, Xiaoyu Shen†.
  CVPR 2026 (Highlight).
- HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit. [PDF] [Code]
  Hao Wu, Yingqi Fan, Jinyang Dai, Junlong Tong, Yunpu Ma, Xiaoyu Shen†.
  ICLR 2026.
- VisiPruner: Decoding Discontinuous Cross-Modal Dynamics for Efficient Multimodal LLMs. [PDF]
  Yingqi Fan, Anhao Zhao, Jinlan Fu, Junlong Tong, Hui Su, Yijie Pan, Wei Zhang, Xiaoyu Shen†.
  EMNLP 2025.
- SkipGPT: Each Token is One of a Kind. [PDF] [Code]
  Anhao Zhao, Fanghua Ye, Yingqi Fan, Junlong Tong, Jing Xiong, Zhiwei Fei, Hui Su, Xiaoyu Shen†.
  ICML 2025.
- From Data to Model: A Survey of the Compression Lifecycle in MLLMs. [PDF] [Repository]
  Hao Wu*, Junlong Tong*, Xudong Wang, Yang Tan, Changyu Zeng, Anastasia Antsiferova, Xiaoyu Shen†.
  Under review.
- LLM for Sequence Modeling
- Rethinking the Role of LLMs in Time Series Forecasting. [PDF] [Code]
  Xin Qiu*, Junlong Tong*, Yirong Sun, Yunpu Ma, Wei Zhang, Xiaoyu Shen†.
  Under review.
- The Few Govern the Many: Unveiling Few-Layer Dominance for Time Series Models. [PDF] [Code]
  Xin Qiu*, Junlong Tong*, Yirong Sun, Yunpu Ma, Xiaoyu Shen†.
  Under review.
- Probabilistic Decomposition Transformer for Time Series Forecasting. [PDF] [Code]
  Junlong Tong, Liping Xie, Kanjian Zhang.
  SIAM International Conference on Data Mining (SDM 2023).
- Enhancing Time Series Forecasting: A Hierarchical Transformer with Probabilistic Decomposition Representation. [PDF]
  Junlong Tong, Liping Xie, Wankou Yang, Kanjian Zhang, Junsheng Zhao.
  Information Sciences 2023.
- Hourly Solar Irradiance Forecasting Based on Encoder–Decoder Model Using Series Decomposition and Dynamic Error Compensation. [PDF]
  Junlong Tong, Liping Xie, Shixiong Fang, Wankou Yang, Kanjian Zhang.
  Energy Conversion and Management 2022.
Education
- Ph.D. candidate in Computer Science at Shanghai Jiao Tong University (2023 - present), Shanghai, China
- M.Eng. in Artificial Intelligence at Southeast University (2020 - 2023), Nanjing, China
- B.Eng. in Electrical Engineering at Shanghai University of Electric Power (2016 - 2020), Shanghai, China
Honors and Awards
- Outstanding Master's Thesis Award of Jiangsu Province, 2023
- SDM 2023 Student Travel Award, 2023
Academic Services
- Conference Reviewer: ICML, NeurIPS, ACL, CVPR, ECCV, AAAI
- Journal Reviewer: TCSVT, TNNLS
Contact
Email: jl-tong@sjtu.edu.cn
