📝 Publications
📊 AI-Native Data Systems
SIGMOD 2025

Doctopus: Budget-aware Structural Data Extraction from Documents
Yuanhao Zhong, Yuhao Deng, Chengliang Chai, et al.
Project | Video Demo
- Doctopus is a framework designed to accurately extract structured data from large-scale unstructured documents under cost constraints.
- Impact: Doctopus improves accuracy by 11% at the same cost, or achieves a 2.7× cost reduction while maintaining precision.
VLDB 2025

DocDB: A Database for Unstructured Document Analysis
Zequn Li*, Yuanhao Zhong*, Chengliang Chai, Zhaoze Sun, Ye Yuan, Lei Cao
Project | Video Demo
- DocDB is tailored for unstructured document analysis, enabling users to perform complex data filtering and joining via standard SQL queries.
- Performance: DocDB significantly outperforms existing systems in query accuracy, execution latency, and token consumption.
VLDB 2025

Budget-aware Structural Table Extraction from Unstructured Documents
Chengliang Chai, Jiajun Li, Yuhao Deng, Yuanhao Zhong, Ye Yuan, Lei Cao
💡 Others
CCL 2025

Application of Macroscopic Pattern Prompting and Efficient Finetuning in Factivity Inference
Zequn Li*, Yuanhao Zhong*, et al.