Image En Mots is a generative model designed for scenarios that require the generation of ultra-detailed text from images. It leverages AI recognition and description capabilities in more complex scenarios using gpt4o, supporting English only. Trained with 100,000 hours of English data, Image En Mots guarantees high quality and naturalness in text generation.