Artificial Intelligence: Deploying and Using the Gemini/Gemma Models

Gemini is a multimodal large model developed by Google that can process text, images, audio, video, and code. The Gemini models released so far come in Nano, Pro, Ultra, and 1.5 Pro variants, all accessible at https://gemini.google.com . Google also provides a Gemini API that lets you call the model from code, sending text and images and receiving text replies. In addition, Google has released Gemma, an open large language model built on the research and technology behind Gemini; it handles text only. Gemma comes in 2B and 7B parameter sizes, in both instruction-tuned (2b-it & 7b-it) and untuned base versions, and can be run with a variety of frameworks: Keras, PyTorch, Transformers, Gemma C++, TensorRT-LLM, TensorFlow Lite, MaxText, Pax, and Flax. Thanks to my former students Weizheng Wang and Hui Wu for their contributions to this article.
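To illustrate the API, here is a minimal sketch using the google-generativeai Python SDK; the API key is a placeholder, and the model name may need updating as Google's model lineup evolves:

import google.generativeai as genai

# Configure the SDK with an API key from Google AI Studio (placeholder here)
genai.configure(api_key="YOUR_API_KEY")
# Model names change over time; "gemini-pro" is used only as an example
model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Explain in one sentence what a multimodal model is.")
print(response.text)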

Environment Setup

Installing CentOS 7.9 and Miniconda 3

This article uses the final release of CentOS 7, with kernel version 3.10.0, and first configures the network:

(base) [root@server01 ai]# uname -r
3.10.0-1160.el7.x86_64

(base) [root@server01 ai]# ping www.baidu.com -c4
PING www.a.shifen.com (39.156.66.14) 56(84) bytes of data.
64 bytes from 39.156.66.14 (39.156.66.14): icmp_seq=1 ttl=50 time=38.5 ms
64 bytes from 39.156.66.14 (39.156.66.14): icmp_seq=2 ttl=50 time=38.2 ms
64 bytes from 39.156.66.14 (39.156.66.14): icmp_seq=3 ttl=50 time=38.2 ms
64 bytes from 39.156.66.14 (39.156.66.14): icmp_seq=4 ttl=50 time=38.2 ms

--- www.a.shifen.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 38.220/38.334/38.594/0.151 ms

Configure the disk partition mounts and DSA/RSA keys:

(base) [root@server01 ai]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 14.6T 0 disk /dataset02
nvme0n1 259:0 0 1.8T 0 disk
├─nvme0n1p1 259:1 0 1G 0 part /boot
├─nvme0n1p2 259:2 0 64G 0 part [SWAP]
└─nvme0n1p3 259:3 0 1.8T 0 part /
(base) [root@server01 ai]# fdisk -l

Installing the Nvidia driver:

Install the lspci tool with yum install pciutils:

(base) [root@server01 ai]# yum install pciutils

Check the GPU connection status:

(base) [root@server01 ai]# lspci | grep -i nvidia
18:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)
3b:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)
86:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)
af:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)

Install the compiler toolchain and the kernel development packages, making sure they match the running kernel version:

(base) [root@server01 ai]# yum install -y gcc kernel-devel kernel-headers
(base) [root@server01 ai]# yum install "kernel-devel-uname-r == $(uname -r)"

(base) [root@server01 ai]# vi /etc/selinux/config
# Set SELINUX=disabled, then save and exit
(base) [root@server01 ai]# setenforce 0 # temporarily disable SELinux

Check whether nouveau is disabled:

(base) [root@server01 ai]# lsmod | grep nouveau

If the command above produces no output, nouveau is already disabled and the following steps can be skipped:

(base) [root@server01 ai]# vi /lib/modprobe.d/dist-blacklist.conf
# Edit the /lib/modprobe.d/dist-blacklist.conf file
# Comment out the nvidiafb line: #blacklist nvidiafb
# Append at the end:
# blacklist nouveau
# options nouveau modeset=0

# Rebuild the initramfs image
(base) [root@server01 ai]# mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
(base) [root@server01 ai]# dracut /boot/initramfs-$(uname -r).img $(uname -r)
# Set the default boot target to text mode
(base) [root@server01 ai]# systemctl set-default multi-user.target

(base) [root@server01 ai]# reboot

From the Nvidia driver download page, pick the driver matching your GPU model and operating system, download it locally, and install it; the default options during installation are fine:

(base) [root@server01 ai]# chmod +x NVIDIA-Linux-x86_64-<version>.run
(base) [root@server01 ai]# ./NVIDIA-Linux-x86_64-<version>.run

(base) [root@server01 ai]# reboot

Verify the Nvidia driver:

[ai@server01 ~]$ nvidia-smi
Fri Jul 12 04:18:22 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:18:00.0 Off | 0 |
| N/A 25C P0 59W / 300W | 0MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A100 80GB PCIe Off | 00000000:3B:00.0 Off | 0 |
| N/A 25C P0 61W / 300W | 0MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A100 80GB PCIe Off | 00000000:86:00.0 Off | 0 |
| N/A 23C P0 58W / 300W | 0MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A100 80GB PCIe Off | 00000000:AF:00.0 Off | 0 |
| N/A 26C P0 59W / 300W | 0MiB / 81920MiB | 2% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+

Installing Miniconda 3:

Download and install Miniconda 3 with curl:

(base) [root@server01 ai]# curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

(base) [root@server01 ai]# chmod +x Miniconda3-latest-Linux-x86_64.sh
(base) [root@server01 ai]# ./Miniconda3-latest-Linux-x86_64.sh

Keep pressing Enter to page through the license agreement and type yes when asked to accept it; press Enter again to accept the default install path; and when asked whether conda should initialize, choose yes if you intend to share Miniconda 3 across all users.

Initializing Conda:

If you chose no in the previous step, initialize manually; otherwise this is unnecessary:

source ~/miniconda3/bin/activate
conda init

# Reload the shell configuration
source ~/.bashrc

Check the version with conda --version to confirm the installation succeeded (run as root):

(base) [root@server01 miniconda3]# conda --version
conda 24.4.0

CUDA, PyTorch, and similar tools are best installed inside a virtual environment created with conda.

Installing PyTorch, Transformers, and Related Packages

Create the environment:

(base) [root@server01 ai]# conda create -n gemma python=3.9

Activate the environment:

(base) [root@server01 ai]# conda activate gemma

With the environment activated, install the dependencies:

(gemma) [root@server01 ai]# pip install torch transformers bitsandbytes
(gemma) [root@server01 ai]# pip install tensorboard trl datasets peft

Other packages may be missing from this list; install them as the runtime prompts indicate.
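To verify that PyTorch can see the GPUs, a quick check like the following can be run inside the gemma environment (a minimal sketch; on this machine the device count should report the four A100s shown by nvidia-smi above):

import torch

# Confirm the CUDA stack is usable from the freshly created environment
print(torch.__version__)
print(torch.cuda.is_available())   # True if the driver and CUDA runtime are working
print(torch.cuda.device_count())   # expect 4 on this server
for i in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_name(i))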

Deploying the Gemma Training Environment and Pretrained Models

Register a Kaggle account and download the required model from https://www.kaggle.com/models/google/gemma (requirement: Python ≥ 3.8).
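The weights can also be fetched programmatically with the kagglehub package; the sketch below assumes Kaggle credentials are configured (e.g., ~/.kaggle/kaggle.json), and the model handle is illustrative and may need adjusting to match Kaggle's current listing:

import kagglehub

# Handle format is owner/model/framework/variation; adjust to the variant you need
path = kagglehub.model_download("google/gemma/pyTorch/2b")
print("Model files downloaded to:", path)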

Official documentation: https://github.com/google/gemma_pytorch . It describes how to set up the environment and run the model with Docker on Linux, but it does not cover adapting the model or the requirements on the training set; the usual PyTorch workflow for training NLP models applies.

The downloaded pretrained models are stored in /home/ai/gemma-2b and /home/ai/gemma-7b respectively:

(base) [root@server01 gemma-2b]# ls -al /home/ai/gemma-2b
total 14712812
drwxrwxr-x. 2 ai ai 4096 Jun 6 22:49 .
drwx------. 9 ai ai 4096 Jul 4 07:51 ..
-rw-rw-r--. 1 ai ai 634 Jun 6 22:49 config.json
-rw-rw-r--. 1 ai ai 10031780672 Jun 6 22:49 gemma-2b.gguf
-rw-rw-r--. 1 ai ai 137 Jun 6 22:18 generation_config.json
-rw-rw-r--. 1 ai ai 1620 Jun 6 22:49 .gitattributes
-rw-rw-r--. 1 ai ai 4945242264 Jun 6 22:18 model-00001-of-00002.safetensors
-rw-rw-r--. 1 ai ai 67121608 Jun 6 22:02 model-00002-of-00002.safetensors
-rw-rw-r--. 1 ai ai 13489 Jun 6 22:02 model.safetensors.index.json
-rw-rw-r--. 1 ai ai 555 Jun 6 22:02 special_tokens_map.json
-rw-rw-r--. 1 ai ai 1108 Jun 6 22:02 tokenizer_config.json
-rw-rw-r--. 1 ai ai 17477553 Jun 6 22:02 tokenizer.json
-rw-rw-r--. 1 ai ai 4241003 Jun 6 22:02 tokenizer.model
(base) [root@server01 gemma-2b]# ls -al /home/ai/gemma-7b
total 50054220
drwxrwxr-x 3 ai ai 4096 Jun 29 06:13 .
drwx------. 9 ai ai 4096 Jul 4 07:51 ..
-rw-rw-r-- 1 ai ai 636 Jun 29 06:13 config.json
drwxrwxr-x 2 ai ai 88 Jun 29 06:13 examples
-rw-rw-r-- 1 ai ai 34158344288 Jun 29 06:13 gemma-7b.gguf
-rw-rw-r-- 1 ai ai 137 Jun 29 05:25 generation_config.json
-rw-rw-r-- 1 ai ai 1620 Jun 29 06:13 .gitattributes
-rw-rw-r-- 1 ai ai 4995496656 Jun 29 05:25 model-00001-of-00004.safetensors
-rw-rw-r-- 1 ai ai 4982953168 Jun 29 05:18 model-00002-of-00004.safetensors
-rw-rw-r-- 1 ai ai 4982953200 Jun 29 05:11 model-00003-of-00004.safetensors
-rw-rw-r-- 1 ai ai 2113988336 Jun 29 05:04 model-00004-of-00004.safetensors
-rw-rw-r-- 1 ai ai 20920 Jun 29 05:01 model.safetensors.index.json
-rw-rw-r-- 1 ai ai 555 Jun 29 05:01 special_tokens_map.json
-rw-rw-r-- 1 ai ai 1108 Jun 29 05:01 tokenizer_config.json
-rw-rw-r-- 1 ai ai 17477553 Jun 29 05:01 tokenizer.json
-rw-rw-r-- 1 ai ai 4241003 Jun 29 05:01 tokenizer.model

The test dataset is stored at a separate path; the data looks like the following:

(base) [root@server01 ~]# less /dataset02/dataset/01.ont.seq/train/Exp001_shanzhu.aozhouhuangjin.raw.fastq.txt
TGCTTCGTTCAGTTACGTATTGCTAAGGTTAAACAGACGACTACAAACTGAATCGACAGCACCTCTTTCTATTATGGTGGACTTTATGTATTATAGTTTTGATTTGTGTATTATGGATTATGGTTGGTTGCTTTGATTTAGCTAGATTATGGATTACTTAGCCTCGTAAAGTGGTATCGATCGAAATGAGTGTAATGGTCGTGATG
ACATTTTGGAGGGTAACATCGATGTTTTGTTTAGATTGTAAAGAAGGGTGCCTATGGTATGTATGAGATGGGGTAAGAAGTGATTTTCTTGAATTGTCCATATTCCAATGTTTGGTTACTTAGTGAAATCGTCGGTGTTGATGCTTACTTGTTTTGTAGAATCATAATGGTGGCTAGC
TACTTCAGTTTCGGTTACGTATTGCTAAGGTTAACAGACGACTACAAAACGGAATCGACAGCACCTTTATTTTGTGTTTGTCGTTGGAGAATTGATCTTTCTTCAATGAAATTTATCTCTAGAATTTATTTGTTGATTAATTTCTAGGTTGAAGAACATAAAGAAATTCATAGATTAAATCCTATCTGAATAACTGGGGCCGATCT
ATGCGGCAATAAAAGGTTAATGATTTGTCTTTAATAAAGTTTATTTAAATCATGTATGATTAACCATGATCAATATAAATTTGGATAGGATTAATGTAATTTGATCGTAAGTACATTAATCAATCAAGATCACTATTTGGCTAGTAAAGGCAACAATTCAATTAGCATATCTATAGAAAATTGTCATATCATTACTTGGTTAAATT
······

Importing and Configuring the Gemma Model

Write a script that loads the local model and tokenizer with transformers:

import torch
from datasets import load_dataset
from peft import LoraConfig, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    set_seed
)
from trl import SFTTrainer

model_path = "/data/models/gemma-2b-tf/"
set_seed(1234)  # For reproducibility

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path, add_eos_token=True, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = 'left'

# Load the TSV train/test splits
data_files = {"train": "/data/datasets/WNLI/train1.tsv", "test": "/data/datasets/WNLI/dev1.tsv"}
ds = load_dataset("csv", data_files=data_files, delimiter="\t")

# 4-bit NF4 quantization configuration (QLoRA)
compute_dtype = getattr(torch, "float16")
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_path, quantization_config=bnb_config, device_map={"": 0}
)
model = prepare_model_for_kbit_training(model)
# Configure the pad token in the model
model.config.pad_token_id = tokenizer.pad_token_id
model.config.use_cache = False
# LoRA fine-tuning configuration
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.05,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['k_proj', 'q_proj', 'v_proj', 'o_proj', "gate_proj", "down_proj", "up_proj"]
)
# Training arguments
training_arguments = TrainingArguments(
    output_dir="./results_qlora",
    evaluation_strategy="steps",
    do_eval=True,
    optim="paged_adamw_8bit",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    log_level="debug",
    save_steps=50,
    logging_steps=50,
    learning_rate=2e-5,
    eval_steps=50,
    max_steps=300,
    warmup_steps=30,
    lr_scheduler_type="linear",
)
# Build the supervised fine-tuning trainer
trainer = SFTTrainer(
    model=model,
    train_dataset=ds['train'],
    eval_dataset=ds['test'],
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=128,
    tokenizer=tokenizer,
    args=training_arguments,
)
# Start training
trainer.train()

Save the script as gemmatrain.py and run it from the command line with python gemmatrain.py.
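After training, the LoRA adapter checkpoints written under ./results_qlora can be loaded back onto the base model for a quick generation test. The sketch below is illustrative; the checkpoint directory name is hypothetical and should match whatever save_steps actually produced:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/data/models/gemma-2b-tf/"
adapter_path = "./results_qlora/checkpoint-300"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_path)
base = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map={"": 0})
model = PeftModel.from_pretrained(base, adapter_path)

inputs = tokenizer("sentence1:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))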

The training script above loads TSV files as the train and test sets. In practice, raw text cannot be fed to the model directly: each full line of text (i.e., a single input/output pair) must be recorded under the single column "text", for example:

text
sentence1:xxxxxxxx.sentence2:xxxxxxxx.label:n
sentence1:xxxxxxxx.sentence2:xxxxxxxx.label:n
sentence1:xxxxxxxx.sentence2:xxxxxxxx.label:n

The format above is for testing purposes only; it will be refined later according to the dataset's actual content and the training requirements.
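As an illustration, raw lines such as the sequence reads shown earlier can be wrapped into this single-column TSV with a short script; this is a minimal sketch, and the output path is hypothetical:

import csv

raw_path = "/dataset02/dataset/01.ont.seq/train/Exp001_shanzhu.aozhouhuangjin.raw.fastq.txt"
out_path = "/data/datasets/seq_train.tsv"  # hypothetical output location

with open(raw_path) as fin, open(out_path, "w", newline="") as fout:
    writer = csv.writer(fout, delimiter="\t")
    writer.writerow(["text"])  # header matching dataset_text_field="text"
    for line in fin:
        line = line.strip()
        if line:
            writer.writerow([line])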