- 🎓 I'm currently pursuing a PhD in the field of multimodal learning.
- 🔭 I’m currently working on expressing visual content using language (e.g., image captioning, object description).
- 🌱 I’m looking to collaborate on using high-quality captions to train a diffusion / auto-regressive generation model to generate high-quality visual content (video/image/3D model).
- 📫 How to reach me: [email protected].
🔬 brain storm
Pinned
- PRIS-CV/CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation (Python · 23)
- PRIS-CV/ControllableObjectDescription: A training-free pipeline to control dimension details in object description.
- Caption2SceneGraph: A parser tool that uses a large language model and vision experts to parse an input caption into a scene graph. (Python)

