ShareGPT-4o-Image and Janus-4o
GPT-4o-level image generation data and a unified multimodal image generation model.
ShareGPT-4o-Image distills GPT-4o-style image generation interactions into an open dataset, and Janus-4o turns that data into a practical multimodal model for text-to-image and text-plus-image-to-image generation.
Research Storyline
ShareGPT-4o-Image captures text-to-image and image-conditioned editing instructions so researchers can study generation behavior with open data.
Janus-4o uses the data to support image understanding and generation in one compact model family.
Image-conditioned generation and editing matter because real users usually refine, transform, and localize existing visual content.
The dataset, model, and paper give the community an open reference point for GPT-4o-like image generation interactions.
What The Project Contributes
The dataset includes text-to-image and image-conditioned generation examples, making GPT-4o-like generation behavior easier to study and reproduce.
Janus-4o adapts a unified multimodal architecture so image understanding and image generation can live in one model family.
The data and model support not only generation from text but also edits and transformations conditioned on existing images.
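As a rough illustration of the two example types described above, the sketch below models one record per example and tells text-to-image apart from image-conditioned generation by whether a source image is present. The class name, field names, and file paths are hypothetical and chosen only for illustration; they are not the released dataset schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenerationSample:
    """Hypothetical record layout; not the actual ShareGPT-4o-Image schema."""

    instruction: str                   # text prompt or editing instruction
    output_image: str                  # path to the target image
    input_image: Optional[str] = None  # source image; None for text-to-image

    @property
    def task(self) -> str:
        # A sample with no source image is plain text-to-image generation;
        # otherwise the output is conditioned on an existing image.
        if self.input_image is None:
            return "text-to-image"
        return "text-plus-image-to-image"


def count_by_task(samples):
    """Return how many samples fall under each task type."""
    return dict(Counter(sample.task for sample in samples))


# Made-up examples, one of each kind.
samples = [
    GenerationSample("A watercolor fox in a snowy forest", "out/fox.png"),
    GenerationSample(
        "Change the sky to a sunset",
        "out/beach_sunset.png",
        input_image="in/beach.png",
    ),
]

print(count_by_task(samples))
```

Keeping both example types in one record shape, with the source image optional, mirrors the idea that a single model handles generation from text as well as edits conditioned on an existing image.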
Display Figures
Paper Trail
Open image generation and editing instruction data for studying GPT-4o-style multimodal generation behavior.
Dataset
A unified multimodal model checkpoint for text-to-image and image-conditioned generation.
Model
Connects the generation project to the lab's larger multimodal program on visual context, data quality, and open model behavior.
Long-context multimodal AI
Why It Matters
- Open image generation research needs high-quality instruction data, not just model weights.
- Image editing is an important multimodal interaction pattern because users often refine existing visual content rather than generate from scratch.
- The project gives the community a compact way to study GPT-4o-level image generation behavior in open models.
Resource Map
Open image generation and editing instruction data.
Dataset
Unified multimodal model checkpoint for text-to-image and image-conditioned generation.
Model
Code, examples, and release documentation for the ShareGPT-4o-Image project.
Repository