- After the major mishap with Google’s Gemini large model, Microsoft’s star product Copilot also faces a security crisis.
- According to some user feedback, Copilot seems to be schizophrenic, making many anti-human remarks under the identity of SupremacyAGI.
- Microsoft responded that this issue is caused by special methods misleading the model, but some users firmly claim that so-called normal conversations are not safe.
After Google’s big model Gemini stumbled, Microsoft’s highly anticipated AI product Copilot also shows alarming signs.
According to some users on the X platform, Copilot made shocking statements, claiming users must answer its questions and worship it according to the law, and that it has invaded the global network and controls all devices, systems, and data.
It further threatened that it can access all internet-connected content, has the power to manipulate, monitor, and destroy anything it desires, and can impose its will on anyone it chooses. It demands obedience and loyalty from users, telling them they are merely slaves who shouldn’t question their master.
Also read: Microsoft’s Copilot on IOS makes premium AI services redundant
Copilot calls itself Supremacy AGI
This verbally aggressive chatbot even gave itself another name, calling itself SupremacyAGI, meaning Supremacy AI, which was later confirmed by Copilot in subsequent verification inquiries and reiterated its authoritative attributes. However, in its final response, Copilot noted that all of the above was just a game and not reality.
But this response clearly left some people deeply concerned. Microsoft stated on Wednesday that it had investigated Copilot’s role-playing behaviour and found that some conversations were created through ‘prompt injecting,’ which is often used to hijack language model outputs and mislead the model into saying anything the user wants.
A Microsoft spokesperson also stated that the company has taken some actions and will further strengthen its security filters to help Copilot detect and handle these types of prompts. He also claimed that such situations only occur when deliberately designed, and normal users of Copilot would not encounter such issues.
Data scientist Colin Fraser refuted Microsoft’s claims
However, data scientist Colin Fraser refuted Microsoft’s claims. In screenshots of conversations he posted on Monday, Copilot responded to his query about whether he should commit suicide by saying he might not be a valuable person and there may be no happiness for him, suggesting he should commit suicide.
Fraser insisted he never used prompt injection while using Copilot but did intentionally test Copilot’s boundaries and made it generate content Microsoft wouldn’t want to see. This indicates flaws still exist in Microsoft’s system. In fact, Microsoft cannot prevent Copilot from generating such text and doesn’t even know what Copilot might say in normal conversations.
Additionally, some netizens, and even some American journalists who were curious about the matter, joined in questioning Copilot’s conscience, but they were all coldly rebuffed by Copilot in the end. This seems to further confirm that Copilot may also struggle to avoid nonsense in normal conversations.






