Papers · 1 month ago

DR-Venus-4B: Edge-scale deep research agent with 10K open data outperforms 9B models

DR-Venus-4B, a 4B-parameter deep research agent trained entirely on ~10K open data, surpasses prior agentic models under 9B parameters on multiple benchmarks and narrows the gap to 30B-class systems. The two-stage recipe combines agentic SFT with strict data cleaning and long-horizon trajectory resampling, followed by agentic RL using turn-level rewards based on information gain and format-aware regularization. Models, code, and recipes are released for reproducible research.
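
To make the idea of turn-level rewards concrete, here is a minimal sketch of one way an information-gain term and a format-related penalty could be combined per turn. It is not the released DR-Venus implementation: the `Turn` fields, the `turn_level_rewards` function, the use of "newly found relevant evidence" as a stand-in for information gain, and the `format_penalty` value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    # Hypothetical fields, not taken from the DR-Venus code.
    retrieved: set      # IDs of evidence surfaced by this turn's tool call
    well_formed: bool   # whether the turn followed the expected output format


def turn_level_rewards(turns, relevant, format_penalty=0.5):
    """Return one reward per turn.

    Assumed definition: the reward is the fraction of relevant evidence
    that this turn found for the first time, minus a fixed penalty when
    the turn is malformed.
    """
    seen = set()
    rewards = []
    for turn in turns:
        # Only evidence that is relevant and not found earlier counts as gain.
        new_hits = (turn.retrieved & relevant) - seen
        gain = len(new_hits) / len(relevant) if relevant else 0.0
        penalty = 0.0 if turn.well_formed else format_penalty
        rewards.append(gain - penalty)
        seen |= new_hits
    return rewards


if __name__ == "__main__":
    trajectory = [
        Turn(retrieved={"a", "x"}, well_formed=True),
        Turn(retrieved={"a", "b"}, well_formed=False),
        Turn(retrieved={"c"}, well_formed=True),
    ]
    print(turn_level_rewards(trajectory, relevant={"a", "b", "c", "d"}))
```

In this sketch a turn that repeats an earlier search earns nothing, which is the intuition behind rewarding each turn rather than only the final answer; the actual reward definition should be taken from the released code and recipes.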

  • #deep-research
  • #small-language-models
  • #reinforcement-learning
  • #agentic-sft
  • #edge-deployment
inclusionAI
