Anthropic plans to fund new AI benchmarks

  • On Monday, Anthropic announced a new program to fund the development of benchmarks that can evaluate the performance and impact of AI models.
  • The benchmarks Anthropic hopes to create focus on the security and social impact of artificial intelligence, but given the company’s business ambitions in the AI race, its actions may serve other purposes.

OUR TAKE
Anthropic has launched an AI benchmarks development program, funding third-party organisations to evaluate model performance and impact, with the aim of improving AI security. However, its business ambitions raise questions, and views on AI risk differ widely. Although the effort is commendable, the universal significance of these benchmarks remains to be seen, and their value will need to be continuously evaluated.
–Jasmine Zhang, BTW reporter

What happened

Anthropic announced the launch of a program aimed at funding the development of new benchmark tests to evaluate the performance and impact of AI models. The plan was announced on Monday and will provide funding to third-party organisations that can effectively measure the advanced capabilities of AI models.

Anthropic stated on its official blog that this investment aims to strengthen the entire AI security field and provide valuable tools that benefit the whole ecosystem. Developing high-quality, safety-related evaluations remains challenging, and demand far exceeds supply. Current AI benchmarks have shortcomings that make it difficult to reflect how ordinary people actually use these systems, and it is questionable whether some older benchmark tests truly measure what they claim to measure.

Anthropic’s proposed solution is to create challenging benchmarks focused on AI security and social impact, built with new tools, infrastructure, and methods. The company specifically calls for evaluations that test models’ capabilities in carrying out cyberattacks, enhancing weapons of mass destruction, and manipulating or deceiving humans. For AI risks related to national security and defence, Anthropic has promised to develop an “early warning system”, but has not disclosed specific details in the blog post.

Also read: Schneider, NVIDIA to build AI ‘benchmark’ data centre design

Also read: Anthropic claims its latest model is best-in-class

Why it’s important

At present, the research, development, and regulation of artificial intelligence are all advancing rapidly. Anthropic’s effort to support new AI benchmarks is groundbreaking and, to some extent, undertaken at its own cost without a guaranteed return, which is certainly commendable.

However, it should be noted that, given the company’s business ambitions in the AI race, we need to remain skeptical of the new benchmarks Anthropic provides.

Some in the AI community also object to the “catastrophic” and “deceptive” AI risks that Anthropic cites. Many experts argue there is little evidence that artificial intelligence will gain the ability to end the world or surpass humanity in the short term. So there is still room for discussion on whether Anthropic’s efforts to create new AI benchmarks have universal significance, whether its actions are driven by self-interest, and whether results from the new benchmarks will hold reference value.

Jasmine Zhang

Jasmine Zhang is an intern reporter at Blue Tech Wave specialising in AI and Fintech. She graduated from Kunming University of Science and Technology. Send tips to j.zhang@btw.media.
