Artificial Intelligence: Training and Deployment Methods for the Baichuan Large Language Model

Post author: Hai-Wei Chai (柴海伟)
Post link: https://hwchai.com/Baichuan/
Copyright Notice: All articles in this blog are licensed under BY-NC-SA (https://creativecommons.org/licenses/by-nc-sa/4.0/) unless stated otherwise.

Posted on 2024-10-23 | Edited on 2024-10-23 | In Artificial Intelligence

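Baichuan 2 is the new generation of open-source large language models released by Baichuan Inc. (百川智能), trained on a high-quality corpus of 2.6 trillion tokens. On multiple authoritative Chinese, English, and multilingual benchmarks, both general and domain-specific, Baichuan 2 achieves the best results among models of the same size. Technical report: https://arxiv.org/abs/2309.10305. This release includes Base and Chat versions at 7B and 13B, along with 4-bit quantized Chat versions. All versions are fully open for academic research. Developers can also use them commercially free of charge after applying by email and obtaining an official commercial license.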

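In addition to the Baichuan2-7B-Base model trained on 2.6 trillion tokens, 11 earlier intermediate checkpoints (corresponding to roughly 0.2 to 2.4 trillion training tokens) have been released for community research: https://huggingface.co/baichuan-inc/Baichuan2-7B-Intermediate-Checkpoints.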
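As a rough sketch of how the released variants might be selected and loaded, the helper below builds a Hugging Face repo ID from the model size and version, and a second function loads it with the `transformers` library. The names `repo_id` and `load` are hypothetical, and the repo naming pattern (`Baichuan2-<size>-<variant>` with a `-4bits` suffix) is an assumption extrapolated from the checkpoint URL above; check the `baichuan-inc` organization page before relying on it. Passing `revision` selects a branch or tag of a repo, which is one way an intermediate checkpoint could be chosen, but the actual branch names are not given here and must be looked up on the repo page.

```python
from typing import Optional

ORG = "baichuan-inc"  # organization name taken from the checkpoint URL above


def repo_id(size: str, variant: str, four_bit: bool = False) -> str:
    """Build a repo ID such as 'baichuan-inc/Baichuan2-7B-Chat' (assumed naming pattern)."""
    if size not in ("7B", "13B"):
        raise ValueError(f"unknown size: {size!r}")
    if variant not in ("Base", "Chat"):
        raise ValueError(f"unknown variant: {variant!r}")
    if four_bit and variant != "Chat":
        # The release described above only mentions 4-bit quantization for Chat versions.
        raise ValueError("4-bit quantization is only provided for Chat versions")
    suffix = "-4bits" if four_bit else ""
    return f"{ORG}/Baichuan2-{size}-{variant}{suffix}"


def load(repo: str, revision: Optional[str] = None):
    """Load tokenizer and model from the Hub.

    The import is deferred so that repo_id() works without transformers installed.
    trust_remote_code=True is assumed to be needed because the model ships custom code.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        repo, revision=revision, trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        repo, revision=revision, trust_remote_code=True
    )
    return tokenizer, model
```

For example, `load(repo_id("7B", "Chat"))` would download and load the 7B Chat version; this requires network access and enough memory for the weights, so it is left out of the block itself.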

Reference Link:

https://arxiv.org/abs/2309.10305
https://www.tizi365.com/topic/9008.html