Baichuan Intelligence, a startup founded by Wang Xiaochuan, the founder of Sogou, has introduced its next-generation large language model, Baichuan-13B. Wang, a computer science prodigy from Tsinghua University, aims to build China's answer to OpenAI, and Baichuan is considered one of China's most promising developers of large language models (LLMs). The model, based on the Transformer architecture like OpenAI's GPT, has 13 billion parameters and is trained on Chinese and English data. Baichuan-13B is open source and optimised for commercial applications.

Training Data Comparable to GPT-3.5

Baichuan-13B is trained on 1.4 trillion tokens, surpassing Meta's LLaMA, which uses 1 trillion tokens in its 13-billion-parameter model. Wang has said he intends to release a large-scale model comparable to OpenAI's GPT-3.5 by the end of this year. Baichuan has moved quickly: it expanded its team to 50 people by the end of April and launched its first LLM, Baichuan-7B, in June.

Baichuan-13B is now available for free to approved academics and to developers who wish to use it for commercial purposes. Notably, the model comes in variants that can run on consumer-grade hardware, addressing the constraints posed by U.S. AI chip sanctions on China.
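For developers who want to try the model, a minimal loading sketch is shown below, using the Hugging Face transformers workflow with 8-bit quantisation so the weights fit on a single consumer GPU. The repository id `baichuan-inc/Baichuan-13B-Chat` and the quantisation settings are assumptions based on common practice, not an official recipe from Baichuan.

```python
# Minimal sketch: loading Baichuan-13B for inference with 8-bit weights.
# Assumes the Hugging Face repo id "baichuan-inc/Baichuan-13B-Chat" and that
# transformers, accelerate and bitsandbytes are installed; illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "baichuan-inc/Baichuan-13B-Chat"  # assumed repo id

# Baichuan ships custom modelling code, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    device_map="auto",   # spread layers across available GPUs/CPU
    load_in_8bit=True,   # 8-bit weights roughly halve memory vs. fp16
)

prompt = "什么是大语言模型？"  # "What is a large language model?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 8-bit precision, a 13-billion-parameter model needs roughly 14-15 GB of GPU memory, which is what puts it within reach of high-end consumer cards.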
Baichuan-7B, the company's earlier release, is an open-source, large-scale pre-trained language model built by Baichuan Intelligent Technology. Also based on the Transformer architecture, it has 7 billion parameters, was trained on 1.2 trillion tokens, and supports both Chinese and English.

High Performance Scores Across the Board

Baichuan-7B leads models of similar scale on well-known Chinese and English benchmarks, including C-Eval and MMLU. It consistently outperforms counterparts of comparable parameter count and is regarded as the strongest natively pre-trained model for Chinese language understanding. In the AGIEval assessment, Baichuan-7B scored 34.4 points, well ahead of other open-source contenders such as LLaMA-7B, Falcon-7B, Bloom-7B, and ChatGLM-6B.

On C-Eval, Baichuan-7B scored 42.8 points against ChatGLM-6B's 38.9, and on the Gaokao evaluation it scored 36.2 points, the best result among pre-trained models of comparable parameter scale.

AGIEval, a benchmark initiative from Microsoft Research, is designed to assess the cognitive and problem-solving capabilities of foundation models. C-Eval, a collaboration between Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh, is a comprehensive examination for Chinese language models covering 52 subjects across a range of disciplines. The Gaokao benchmark, built by a research team at Fudan University, uses questions from the Chinese college entrance examination to test large models' Chinese comprehension and logical reasoning.

Baichuan-7B's strength extends to English as well. On MMLU it scored 42.5 points, surpassing both the English open-source pre-trained model LLaMA-7B and the Chinese open-source model ChatGLM-6B by significant margins.

A key determinant of success in large-scale model training is the training corpus itself. Baichuan Intelligent Technology built a high-quality pre-training corpus from rich Chinese-language data combined with high-quality English data, spanning Chinese and English web data, open-source Chinese and English datasets, and a substantial body of curated knowledge.
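The benchmark figures quoted above all come from multiple-choice test suites. A common way to score a pre-trained (non-chat) model on such suites is to compare the likelihood the model assigns to each candidate answer and pick the highest. The sketch below illustrates that idea for any causal LM loaded as in the earlier example; it is an illustration of the general technique, not the official evaluation harness for C-Eval, MMLU, AGIEval, or Gaokao.

```python
# Illustrative log-likelihood scoring for a multiple-choice benchmark.
# Assumes `model` and `tokenizer` are a causal LM and its tokenizer, and that
# tokenising the question yields a prefix of tokenising question + choice
# (true for most tokenisers when the choice starts with a space).
import torch
import torch.nn.functional as F

def choice_logprob(model, tokenizer, question: str, choice: str) -> float:
    """Sum of log-probabilities the model assigns to `choice` given `question`."""
    q_ids = tokenizer(question, return_tensors="pt").input_ids.to(model.device)
    full = tokenizer(question + choice, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full).logits          # (1, seq_len, vocab)
    # Log-prob of predicting token t+1 from position t.
    logprobs = F.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full[:, 1:]
    token_lp = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the positions belonging to the answer continuation.
    return token_lp[0, q_ids.shape[1] - 1:].sum().item()

def answer(model, tokenizer, question: str, choices: list[str]) -> str:
    """Pick the choice with the highest summed log-probability."""
    scores = [choice_logprob(model, tokenizer, question, c) for c in choices]
    return choices[scores.index(max(scores))]

# Hypothetical usage:
# answer(model, tokenizer, "The capital of France is", [" Paris", " London", " Berlin"])
```

Real harnesses add details such as few-shot prompt templates and per-token length normalisation, but the log-likelihood comparison above is the core of how these multiple-choice scores are produced.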