LLM Efficiency and AI Infrastructure
Efficient training, inference, retrieval, data, and multimodal context infrastructure for open LLM builders.
This project gathers the infrastructure work that makes open models more usable: longer multimodal context, cheaper visual tokens, stronger retrieval data, editable memory, efficient fine-tuning, and reasoning systems that spend computation where it matters.
Background and Motivation
Better models are not enough if training data, retrieval pipelines, memory, long context, and inference cost do not scale with real tasks.
Images, videos, and long documents quickly overload context windows, making token selection and long-context design central to deployment.
Efficient reasoning systems should prune weak paths, tune with minimal supervision, and update knowledge without expensive retraining.
Core Ideas
LongLLaVA and MileBench study how models reason over many images and long multimodal inputs without losing global structure.
TRIM studies token reduction for multimodal LLMs, reducing cost while preserving reasoning-critical visual information.
RAG-Instruct, LLMZoo, and related resources package data, models, and tasks so downstream builders can reproduce and extend systems.
E2-RAG, prefix tuning, question-free fine-tuning, and early path pruning all aim to update or accelerate systems without rebuilding everything.
Typical Work
Long-context multimodal model and benchmark work for reasoning over many images, videos, and long visual contexts.
Project pageReduces visual tokens so multimodal inference is cheaper while retaining task-relevant evidence.
PaperStudies editable efficient retrieval and data infrastructure for retrieval-augmented instruction following.
Project pageImproves adaptive reasoning with less supervision and by pruning low-value reasoning branches early.
QFFTDisplay Figures
Resource Map
Long-context multimodal project page covering models, benchmarks, and visual context scaling.
Project pageData and retrieval infrastructure for instruction tuning and knowledge-grounded model building.
Project pageOpen model and training resource collection that supports the lab's broader multilingual and infrastructure stack.
RepositoryWhy It Matters
- Open AI progress depends on infrastructure that smaller teams can run, inspect, and extend.
- Long-context and multimodal tasks need cost-aware model design, not only larger context windows.
- Editable retrieval and efficient adaptation make models more practical in domains where knowledge changes quickly.