- Apple partnered with UC Santa Barbara to launch MGIE, which allows users to edit photos with text descriptions.
- MGIE makes editing more efficient and flexible by interpreting users' text prompts and generating the corresponding visual edits.
- But MGIE’s future applications are still up in the air.
Apple researchers have unveiled a new model that lets users describe how they want to modify photos in simple language, without having to touch photo-editing software.
A new model
Apple researchers, in collaboration with UC Santa Barbara, introduced MGIE, a novel model that allows users to describe photo edits via text without directly using editing software. This MLLM-guided (multimodal large language model-guided) image editing model can crop, resize, and flip images and apply filters, accommodating both basic and complex editing tasks. By interpreting user prompts and generating visual edits, MGIE lets users simply type in the changes they want.
A more intelligent response
When editing a photo with MGIE, users only need to supply the image they want to modify and type a prompt. The paper uses the example of editing a picture of pepperoni pizza: typing the prompt "Make it healthier" will add vegetable toppings. A photo of a tiger in the Sahara Desert looks dark, but after telling the model to "add more contrast to simulate more light," the photo looks brighter.
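The key idea behind this workflow is that the MLLM first expands a terse user prompt into an explicit editing instruction, which then drives the actual image edit. The sketch below is purely illustrative, not Apple's actual API: the function names and the prompt-to-instruction mappings are hypothetical, mirroring the examples from the paper.

```python
def expand_instruction(prompt: str) -> str:
    """Mock of the MLLM stage: turn a terse prompt into an explicit edit.
    These mappings are hypothetical, echoing the paper's demo examples."""
    examples = {
        "make it healthier": "add vegetable toppings to the pizza",
        "add more contrast to simulate more light": "brighten the scene and increase contrast",
    }
    return examples.get(prompt.lower().strip(), prompt)


def edit_image(image_path: str, prompt: str) -> dict:
    """Mock of the end-to-end flow: the user supplies only an image and a
    short text prompt; the system derives the concrete edit itself."""
    instruction = expand_instruction(prompt)
    # In the real system, a diffusion model would now apply `instruction`
    # to the image; here we just report what would be done.
    return {"input": image_path, "applied_edit": instruction}


result = edit_image("pepperoni_pizza.jpg", "Make it healthier")
print(result["applied_edit"])  # add vegetable toppings to the pizza
```

The point of the two-stage design is that the user never has to spell out the concrete edit; the model infers it from intent.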
Still in the air
Apple released MGIE on GitHub and posted a web demo on Hugging Face Spaces, but the model's future applications are still up in the air. This advancement is similar to other platforms such as OpenAI's DALL-E 3, which also employs text input for image editing.