Research
Model Merging Scaling Laws: A New Way to Predict and Plan LLM Composition
We study empirical scaling laws for language model merging measured by cross-entropy. Despite wide practical use, merging has lacked a quantitative rule predicting returns as experts are added or …
InfiAgent: A Self-Evolving Pyramid Agent Framework for Infinite Scenarios Released
With the rapid development of artificial intelligence, Large Language Model (LLM) agents have demonstrated remarkable capabilities in organizing and executing complex tasks. However, their development …
InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models
The immense computational cost of training Large Language Models (LLMs) presents a major barrier to innovation. While FP8 training offers a promising solution with significant theoretical efficiency …
InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning
We introduce InfiMed-Series models, InfiMed-SFT-3B and InfiMed-RL-3B, medical-focused Multimodal Large Language Models (MLLMs) developed by the InfiX-AI team. InfiMed-RL-3B achieves an average …
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization
We introduce InfiGUI-G1, a multimodal GUI agent that employs Adaptive Exploration Policy Optimization (AEPO) to improve semantic alignment in GUI grounding, achieving up to 8.3% relative improvement …
InfiGFusion: Graph-on-Logits Distillation for Scalable Model Fusion
Recent advances in large language models (LLMs) have intensified efforts to fuse heterogeneous open-source models into a unified system that inherits their complementary strengths. Existing …
InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models
Model fusion combines multiple Large Language Models (LLMs) with different strengths into a more powerful, integrated model through lightweight training methods. Existing works on model fusion focus …
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners
We present InfiGUI-R1, a novel GUI agent that combines spatial reasoning with reinforcement learning to achieve superior performance in GUI automation tasks across desktop, mobile, and web platforms.
InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion
InfiFusion is the first fusion framework for large language models that fuses up to 4 models with 14B to 24B parameters. We introduce a unified framework that can fuse many heterogeneous models in one …
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection
InfiGUIAgent is a multimodal large language model-based GUI agent that enhances task automation on computing devices through hierarchical reasoning and expectation-reflection reasoning.