Papers · 6 days ago
DR-Venus-4B: Edge-scale deep research agent trained on 10K open data outperforms sub-9B models

DR-Venus-4B, a 4B-parameter deep research agent trained entirely on roughly 10K open-data samples, surpasses prior agentic models under 9B parameters on multiple benchmarks and narrows the gap to 30B-class systems. Its two-stage recipe starts with agentic SFT, using strict data cleaning and long-horizon trajectory resampling, and follows with agentic RL using turn-level rewards based on information gain and format-aware regularization. Models, code, and recipes are released for reproducible research.
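To make the idea of a turn-level reward concrete, here is a minimal sketch, not the released DR-Venus implementation. It assumes information gain can be approximated as the fraction of target evidence items a turn surfaces for the first time, and that format-aware regularization can be approximated as a fixed penalty on malformed turns. The names `Turn`, `turn_rewards`, `target_evidence`, and `format_penalty` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Turn:
    """One agent turn in a trajectory (hypothetical structure)."""
    retrieved: set[str]   # evidence ids surfaced by this turn's tool call
    well_formed: bool     # whether the turn's output parsed as a valid action


def turn_rewards(
    turns: list[Turn],
    target_evidence: set[str],
    format_penalty: float = 0.5,  # assumed value, not taken from the paper
) -> list[float]:
    """Assign one reward per turn instead of a single end-of-episode reward.

    Assumption: gain is the share of target evidence found for the first
    time in this turn; malformed turns additionally lose a fixed penalty.
    """
    seen: set[str] = set()
    rewards: list[float] = []
    for turn in turns:
        # Only evidence that is relevant and not seen in earlier turns counts.
        new_hits = (turn.retrieved & target_evidence) - seen
        gain = len(new_hits) / len(target_evidence) if target_evidence else 0.0
        seen |= new_hits
        penalty = 0.0 if turn.well_formed else format_penalty
        rewards.append(gain - penalty)
    return rewards


if __name__ == "__main__":
    trajectory = [
        Turn(retrieved={"a", "x"}, well_formed=True),
        Turn(retrieved={"a", "b"}, well_formed=True),
        Turn(retrieved=set(), well_formed=False),
    ]
    print(turn_rewards(trajectory, target_evidence={"a", "b"}))
```

In this sketch a turn that repeats already-found evidence earns nothing, which is one simple way to reward progress at each step of a long trajectory rather than only at the final answer.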
- #deep-research
- #small-language-models
- #reinforcement-learning
- #agentic-sft
- #edge-deployment
inclusionAI