- Follow-Your-Click takes images combined with simple text prompts and turns them into short video clips with just a click
- Tencent collaborated with researchers from universities in Hong Kong and Beijing amid growing excitement around AI video generation
OUR TAKE
Unlike text-only chatbots such as ChatGPT, Tencent’s Follow-Your-Click combines images with simple text prompts to produce short video clips with a single click.
While other models require users to describe in detail how and where they want an image to move, Follow-Your-Click lets users focus on particular objects within the image.
- Jennifer Yu, BTW reporter
Chinese internet giant Tencent Holdings introduced an image-to-video artificial intelligence (AI) model in collaboration with academic partners on Friday.
Follow-Your-Click
The image-animation tool, called Follow-Your-Click, was released on GitHub, the Microsoft-owned open-source code platform, amid the rising fervour around content-generating tools such as OpenAI’s ChatGPT.
The project is a collaboration between Tencent’s Hunyuan team, the Hong Kong University of Science and Technology, and Beijing-based Tsinghua University, one of mainland China’s top universities.
Also read: ByteDance’s gaming exodus: Talks with Tencent reshape industry
Its features
Follow-Your-Click lets users click on a specific part of a picture and add a simple text prompt describing how they want it to move, turning a still image into a short animated video.
Tencent said it will release the full code for the model in April, but a demo is already available on GitHub.
Researchers showcased some of the model’s capabilities there; in one example, an image of a girl standing outdoors, paired with the one-word prompt “storm”, became an animation with lightning flashing in the background.
According to an academic paper by the researchers from the three organisations, Follow-Your-Click aims to solve a problem faced by other image-to-video models on the market, which tend to move the whole scene rather than focusing on specific objects in the picture.

“Our framework has simpler yet precise user control and better generation performance than previous methods,” the researchers said in the paper, published on Wednesday.