Microsoft safety system can catch hallucinations in its AI apps

Microsoft’s Azure introduces new safety features for AI models, including Prompt Shields and Groundedness Detection, aimed at detecting vulnerabilities and blocking malicious prompts.
These features enhance control over filtering inappropriate content, addressing concerns about AI model safety.
Microsoft’s commitment to AI safety aligns with expanding Azure’s AI capabilities amidst growing demand.

Microsoft’s Chief Product Officer of Responsible AI, Sarah Bird, reveals to The Verge in an interview the rollout of new safety features designed by her team for Azure customers. These features, powered by LLM technology, aim to detect vulnerabilities, monitor for plausible yet unsupported scenarios, and block malicious prompts in real-time for users leveraging Azure AI models.

Also read: Windows is under new management due to Microsoft AI reorganisation

Also read: Microsoft Teams is getting smarter Copilot AI features

There are various functionalities improving safety

The functionalities include Prompt Shields, Groundedness Detection, and safety evaluations, with additional features such as directing models towards safe outputs and tracking prompts for problematic users forthcoming. Notably, the system evaluates input prompts for banned words and hidden cues before processing, ensuring responses align with desired outcomes.

It can filter hate speech or violence within AI models

Bird emphasises the customisable control over filtering hate speech or violence within AI models, addressing concerns about inappropriate content. These safety measures extend to popular models like GPT-4 and Llama 2, although users of smaller, less-used open-source models may need to manually configure the features.

Microsoft’s commitment to enhancing AI safety aligns with the growing demand for Azure’s AI capabilities, underscored by recent partnerships aimed at expanding its model offerings.

Chloe Chen

Chloe Chen is a junior writer at BTW Media. She graduated from the London School of Economics and Political Science (LSE) and had various working experiences in the finance and fintech industry. Send tips to c.chen@btw.media.

Microsoft safety system can catch hallucinations in its AI apps

There are various functionalities improving safety

It can filter hate speech or violence within AI models

Chloe Chen

Leave a Reply Cancel reply

Popular Posts

Where did the metaverse go? And will Apple’s Vision Pro bring it back?

What is a Regional Internet Registry?

Carbon trading explained: Top 4 carbon emission exchanges in 2024

How criminals used AI face apps to swindle users: A China case study exposes the risks

What are IP addresses and why they are important?

Leave a Reply Cancel reply

Our biggest stories, delivered to your inbox every week.

About us

Contact us

Microsoft safety system can catch hallucinations in its AI apps

There are various functionalities improving safety

It can filter hate speech or violence within AI models

Related Posts

Leave a Reply Cancel reply

Popular Posts

Leave a Reply Cancel reply

Our biggest stories, delivered to your inbox every week.

About us

Contact us