Learning Timeline
Key Insights
Model Performance Comparison
For text-heavy images, each model fails in a different way: Midjourney often produces unreadable text, Flux may omit portions of the text, and ChatGPT produces good layouts but may introduce typos. Gemini (Nano Banana Pro) is currently the most reliable for typo-free text.
Verification Strategy
When generating text on images, do not treat a clean layout as evidence that the text is accurate. Read the generated text word-for-word to confirm the model did not introduce typos or skip sentences.
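The word-for-word check can be partly automated once the text in the image has been typed out by hand. The following is a minimal sketch, assuming a hypothetical helper named compare_words built on Python's standard difflib module. It is not part of any image model's tooling, and a person still has to read the image and transcribe what it actually says.

```python
import difflib


def compare_words(expected: str, transcribed: str) -> list[str]:
    """List word-level differences between the intended text and a manual transcription.

    compare_words is a hypothetical helper for illustration only.
    """
    expected_words = expected.split()
    transcribed_words = transcribed.split()
    matcher = difflib.SequenceMatcher(
        a=expected_words, b=transcribed_words, autojunk=False
    )
    issues = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        want = " ".join(expected_words[i1:i2])
        got = " ".join(transcribed_words[j1:j2])
        if tag == "replace":
            issues.append(f"changed: '{want}' -> '{got}'")  # likely typo
        elif tag == "delete":
            issues.append(f"missing: '{want}'")  # text the model skipped
        else:  # tag == "insert"
            issues.append(f"extra: '{got}'")  # text the model added
    return issues


if __name__ == "__main__":
    for issue in compare_words(
        "Team meeting at noon on Friday", "Team meetng at noon"
    ):
        print(issue)
    # changed: 'meeting' -> 'meetng'
    # missing: 'on Friday'
```

An empty result means the transcription matches the intended text exactly, word for word. The comparison is case- and punctuation-sensitive, which is usually what you want when checking rendered text.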
Prompts
Text-on-Object Generation Template
Target: Google Gemini
Create an image of [Object, e.g., a whiteboard]. Write the following text on it: '[Insert Exact Text Block]'. Include drawings of [Insert Drawing Descriptions] around the text.
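If the same template is reused often, it can be filled in programmatically. This is a minimal sketch, assuming a hypothetical helper named build_prompt. The wording mirrors the template above, and nothing here calls Gemini itself; the resulting string is meant to be pasted into the interface.

```python
def build_prompt(target_object: str, exact_text: str, drawings: str = "") -> str:
    """Fill the text-on-object template (build_prompt is a hypothetical helper).

    target_object: the base object, e.g. "a whiteboard"
    exact_text:    the exact text block to render, copied verbatim
    drawings:      optional description of drawings to place around the text
    """
    prompt = (
        f"Create an image of {target_object}. "
        f"Write the following text on it: '{exact_text}'."
    )
    if drawings:
        prompt += f" Include drawings of {drawings} around the text."
    return prompt


if __name__ == "__main__":
    print(build_prompt("a whiteboard", "Ship by Friday", "a small rocket"))
```

Keeping the exact text as a separate argument also gives you a single source of truth to compare against when you proofread the generated image.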
Step by Step
Generating Accurate Text on Objects
- Open Gemini (Nano Banana Pro).
- Begin the prompt by defining the base object (e.g., 'A whiteboard').
- Type the specific command for the text content, such as 'Write the following text on the whiteboard:'.
- Input the exact block of text to be rendered.
- Append a description of any drawings or visual context that should accompany the text (e.g., 'Include drawings of...').
- Submit the prompt to generate the image.
- Inspect the result specifically for spelling accuracy (typos) and text completeness.