Series contents
3FS Series (1): Opening a New Era of Storage: Engineering Practice in Building, Tuning, and Deploying 3FS
Introduction
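During DeepSeek's Open Source Week, which opened on February 24, the heavyweight storage system 3FS (Fire-Flyer File System) took the stage as the finale, filling in the piece of the puzzle that sits beside compute and networking: storage. Unlike the earlier open-source libraries built around clever algorithms, 3FS is a complete high-speed parallel file system that involves several kinds of nodes and works with several kinds of external nodes. Its code is clearly structured and its modules are highly decoupled, which shows how well DeepSeek's engineers handle complex engineering. As part of the DeepSeek open-source ecosystem, 3FS was officially open-sourced on GitHub on February 27, 2025, and it drew immediate attention across the industry. 3FS offers several key features that make it well suited to AI workloads.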
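Although the official design documentation for 3FS is thorough, the system's complexity still poses a real challenge for anyone who wants to learn it. The engineers at 九章云极, who work on AI infrastructure software, have also been discussing 3FS a great deal recently. Today we will not cover the product design of 3FS itself. Instead, we draw on our storage expertise to work out, step by step, what kind of storage system the AGI era needs and where storage is mainly used, and we share some tips and ideas for compiling and deploying a storage system, in the hope of prompting further discussion.
This is the first article in 九章云极's 3FS series. We use one worked example to walk through compiling and deploying 3FS. The article is long, so please work through it patiently.
The steps are as follows: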
Part 1: Build
Prerequisites
We build on Ubuntu 22.04. The default build path is the current user's home directory:
export BUILD_DIR=$HOME
Step 1: Install dependencies
1.1 Install system packages
$ apt update
$ apt install -y cmake libuv1-dev liblz4-dev liblzma-dev libdouble-conversion-dev libdwarf-dev libunwind-dev \
    libaio-dev libgflags-dev libgoogle-glog-dev libgtest-dev libgmock-dev clang-format-14 clang-14 clang-tidy-14 lld-14 \
    libgoogle-perftools-dev google-perftools libssl-dev gcc-12 g++-12 libboost-all-dev cargo git g++ wget meson
1.2 Install FoundationDB
$ cd ${BUILD_DIR}
$ wget https://github.com/apple/foundationdb/releases/download/7.1.67/foundationdb-server_7.1.67-1_amd64.deb \
    https://github.com/apple/foundationdb/releases/download/7.1.67/foundationdb-clients_7.1.67-1_amd64.deb
$ dpkg -i foundationdb-server_7.1.67-1_amd64.deb foundationdb-clients_7.1.67-1_amd64.deb
1.3 Install Fuse
$ cd ${BUILD_DIR}
$ wget https://github.com/libfuse/libfuse/releases/download/fuse-3.16.2/fuse-3.16.2.tar.gz
$ tar -zxvf fuse-3.16.2.tar.gz
$ cd fuse-3.16.2; mkdir build; cd build
$ meson setup ..
$ ninja
$ ninja install
Step 2: Build 3FS
$ cd ${BUILD_DIR}
$ git clone https://github.com/deepseek-ai/3fs
$ cd 3fs
$ git submodule update --init --recursive
$ ./patches/apply.sh
$ cmake -S . -B build -DCMAKE_CXX_COMPILER=clang++-14 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
$ cmake --build build -j 45
Once the build succeeds, the following binaries are produced:
$ ls -ls ${BUILD_DIR}/3fs/build/bin
total 2308428
355344 -rwxr-xr-x 1 root root 363871904 Mar 4 11:36 admin_cli
144976 -rwxr-xr-x 1 root root 148454880 Mar 4 11:30 hf3fs-admin
204336 -rwxr-xr-x 1 root root 209239320 Mar 4 11:32 hf3fs_fuse_main
277812 -rwxr-xr-x 1 root root 284476352 Mar 4 11:30 meta_main
174700 -rwxr-xr-x 1 root root 178892200 Mar 4 11:27 mgmtd_main
168300 -rwxr-xr-x 1 root root 172336688 Mar 4 11:26 migration_main
102740 -rwxr-xr-x 1 root root 105205000 Mar 4 11:19 monitor_collector_main
170628 -rwxr-xr-x 1 root root 174721688 Mar 4 11:26 simple_example_main
395964 -rwxr-xr-x 1 root root 405484072 Mar 4 11:34 storage_bench
313628 -rwxr-xr-x 1 root root 321173936 Mar 4 11:28 storage_main
Step 3: Package the binaries
Because the services are deployed on multiple machines, we pack the required binaries and configuration files into a tar archive for distribution. Everything the deployment needs is included in this archive:
$ cd ${BUILD_DIR}
$ mkdir -p /tmp/3fs/{conf,logs,misc/{deps,scripts}}
$ cp -r 3fs/build/bin /tmp/3fs
$ cp -r 3fs/configs/* /tmp/3fs/conf
$ cp -r 3fs/deploy/{data_placement,sql,systemd} /tmp/3fs/misc
$ cp 3fs/build/src/lib/api/libhf3fs_api_shared.so /tmp/3fs/misc/deps/
$ cp foundationdb-server_7.1.67-1_amd64.deb foundationdb-clients_7.1.67-1_amd64.deb fuse-3.16.2.tar.gz /tmp/3fs/misc/deps
$ vim /tmp/3fs/misc/scripts/setup.sh # see the script contents below
$ (cd /tmp; tar -zcvf 3fs-deploy.tar.gz 3fs); cp /tmp/3fs-deploy.tar.gz .
The contents of setup.sh are as follows:
#!/usr/bin/env bash
apt update
apt install -y cmake libuv1-dev liblz4-dev liblzma-dev libdouble-conversion-dev libdwarf-dev libunwind-dev \
    libaio-dev libgflags-dev libgoogle-glog-dev libgtest-dev libgmock-dev clang-format-14 clang-14 clang-tidy-14 lld-14 \
    libgoogle-perftools-dev google-perftools libssl-dev gcc-12 g++-12 libboost-all-dev cargo git g++ wget meson libjemalloc-dev
(
  cd misc/deps
  dpkg -i foundationdb-server_7.1.67-1_amd64.deb foundationdb-clients_7.1.67-1_amd64.deb
  systemctl stop foundationdb
  tar -zxvf fuse-3.16.2.tar.gz
  cd fuse-3.16.2; mkdir build; cd build
  meson setup ..
  ninja
  ninja install
)
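Part 2: Deployment
Machine roles
We prepared 12 physical machines in total:
1 machine: monitoring, management service, and metadata service
5 machines: data nodes (3 disks per machine)
6 machines: Fuse clients
Each machine also has one 400 Gb RDMA-capable NIC, configured with two network interfaces: ib7s400p0 and bond1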
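Preliminary steps
Distribute the 3fs-deploy.tar.gz archive built above to every machine that will run a service, extract it into the target directory, and install the dependencies:
$ tar -zxvf 3fs-deploy.tar.gz -C /usr/local
$ cd /usr/local/3fs; bash misc/scripts/setup.sh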
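The distribution step has to be repeated on every machine, so it can help to generate the commands first and review them. The sketch below only prints the `scp` and `ssh` commands and does not run them. The function name `print_deploy_cmds`, the host names, the `root` login, and the `/tmp` staging path are all assumptions made for illustration; they are not part of the original procedure.

```shell
# Print (do not run) the copy-and-install commands for each host.
# Usage: print_deploy_cmds <package path> <host>...
# Assumption: hosts are reachable as root over SSH, and /tmp is used for staging.
print_deploy_cmds() {
  local pkg="$1" host
  shift
  for host in "$@"; do
    # Copy the archive to the remote staging directory.
    echo "scp ${pkg} root@${host}:/tmp/"
    # Unpack under /usr/local and run the bundled setup script, as in the manual steps.
    echo "ssh root@${host} 'tar -zxvf /tmp/${pkg##*/} -C /usr/local && cd /usr/local/3fs && bash misc/scripts/setup.sh'"
  done
}

# node-a and node-b are placeholder host names.
print_deploy_cmds 3fs-deploy.tar.gz node-a node-b
```

Once the printed commands look right, they can be piped to `bash` or pasted into a terminal one host at a time.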
All service binaries, configuration files, and logs live under /usr/local/3fs. If you hit an error during deployment, check the logs under /usr/local/3fs/logs to troubleshoot.
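When several services write to the same log directory, a quick way to narrow things down is to list only the error and fatal logs that actually contain something. This is a minimal sketch: the helper name `nonempty_err_logs` is hypothetical, and it assumes the `-err.log` and `-fatal.log` suffixes used by the log paths configured later in this walkthrough.

```shell
# List error/fatal log files in a directory that contain at least one byte.
# Usage: nonempty_err_logs [log directory]
nonempty_err_logs() {
  local dir="${1:-/usr/local/3fs/logs}"
  # -size +0c matches files larger than zero bytes; sort gives stable output.
  find "$dir" -maxdepth 1 -type f \
    \( -name '*-err.log' -o -name '*-fatal.log' \) -size +0c | sort
}
```

Running `nonempty_err_logs` with no argument scans /usr/local/3fs/logs. Empty output means no service has written to its error or fatal log yet.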
1. Monitoring storage - ClickHouse
ClickHouse stores the monitoring data. Run this step on the metadata node.
1.1 Install ClickHouse
$ apt-get install -y apt-transport-https ca-certificates curl gnupg
$ curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
$ ARCH=$(dpkg --print-architecture)
$ echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg arch=${ARCH}] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
$ apt-get update
$ apt-get install -y clickhouse-server clickhouse-client # the installer prompts for a password; we entered zetyun
1.2 Start the ClickHouse server
$ vim /etc/clickhouse-server/config.xml # change the listening port tcp_port to 19000
$ clickhouse start
1.3 Try the ClickHouse client
$ clickhouse-client --port 19000 --password 'zetyun'
ClickHouse client version 25.2.1.3085 (official build).
Connecting to localhost:19000 as user default.
Connected to ClickHouse server version 25.2.1.
Warnings:
* Delay accounting is not enabled, OSIOWaitMicroseconds will not be gathered. You can enable it using `echo 1 > /proc/sys/kernel/task_delayacct` or by using sysctl.
zetyun-gpu-0001 :)
1.4 Create the metric table
After exiting the client, run the following command to create the metric table:
$ clickhouse-client --port 19000 --password 'zetyun' -n < /usr/local/3fs/misc/sql/3fs-monitor.sql
2. Monitoring service - Monitor
Run this step on the metadata node.
2.1 Edit the configuration
Edit the configuration, mainly the IB device, the log paths for each level, and the address ClickHouse listens on:
$ vim /usr/local/3fs/conf/monitor_collector_main.toml
[common]
cluster_id = 'zetyun'
[common.ib_devices]
device_filter = [ 'ib7s400p0' ]
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/monitor_collector_main.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/monitor_collector_main-err.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/monitor_collector_main-fatal.log'
[server.base.groups.listener]
filter_list = [ 'bond1' ]
[server.monitor_collector.reporter.clickhouse]
db = '3fs'
host = '127.0.0.1'
passwd = 'zetyun'
port = '19000'
user = 'default'
2.2 Start the service
$ cp /usr/local/3fs/misc/systemd/monitor_collector_main.service /usr/lib/systemd/system
$ vim /usr/lib/systemd/system/monitor_collector_main.service # update the file paths; see below
$ systemctl start monitor_collector_main
Change monitor_collector_main.service as follows:
ExecStart=/usr/local/3fs/bin/monitor_collector_main --cfg /usr/local/3fs/conf/monitor_collector_main.toml
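The same ExecStart edit recurs for every service in this walkthrough, so it can be scripted instead of done in vim. The sketch below is one possible approach rather than the authors' method: `set_exec_start` is a hypothetical helper that rewrites the `ExecStart=` line with `sed` and keeps a `.bak` copy next to the unit file. It assumes the new command contains no `|`, `&`, or backslash characters, which holds for the paths used here.

```shell
# Replace the ExecStart= line of a systemd unit file in place, keeping <file>.bak.
# Usage: set_exec_start <unit file> <new command line>
# Assumption: the command contains no '|', '&' or '\' (sed metacharacters here).
set_exec_start() {
  local unit="$1" cmd="$2"
  # '|' is the sed delimiter so that '/' in paths needs no escaping.
  sed -i.bak "s|^ExecStart=.*|ExecStart=${cmd}|" "$unit"
}
```

For example, `set_exec_start /usr/lib/systemd/system/monitor_collector_main.service "/usr/local/3fs/bin/monitor_collector_main --cfg /usr/local/3fs/conf/monitor_collector_main.toml"` applies the change shown above. If systemd had already loaded the unit before the edit, run `systemctl daemon-reload` before starting the service.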
2.3 Check the service status
Check that the service is running:
$ systemctl status monitor_collector_main
Check that the listening address is as expected:
$ netstat -antlp | grep LISTEN | grep monitor
tcp 0 0 172.30.12.61:10000 0.0.0.0:* LISTEN 399127/monitor_coll
Check the log for errors:
$ cat /usr/local/3fs/logs/monitor_collector_main-err.log
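The listening-address check is done by eye here and again for each later service. A small filter can turn it into a pass/fail test. This is a sketch under the assumption that the input has the column layout of `netstat -antlp` (local address in column 4, state in column 6); the name `has_listener` is hypothetical.

```shell
# Succeed if stdin (netstat -antlp style output) shows a LISTEN socket on the port.
# Usage: netstat -antlp | has_listener <port>
has_listener() {
  local port="$1"
  awk -v port="$port" '
    # Column 4 is the local address (ip:port); column 6 is the socket state.
    $6 == "LISTEN" { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

For the monitor service, `netstat -antlp | has_listener 10000 && echo listening` prints `listening` only when port 10000 is open. The same call with 8000 or 9000 applies to the services configured later.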
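3. Storage service - FoundationDB
FoundationDB stores the cluster configuration and the file system metadata (here we use one shared instance for both). Run this step on the metadata node.
3.1 Start the service
$ systemctl start foundationdb
$ systemctl status foundationdb
3.2 Check the service status
By default the cluster listens on local port 4500:
$ netstat -antlp | grep LISTEN | grep fdb
tcp 0 0 127.0.0.1:4500 0.0.0.0:* LISTEN 2336918/fdbserver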
4. Admin tool configuration - AdminClient
Run this step on the metadata node.
4.1 Copy fdb.cluster
$ cp /etc/foundationdb/fdb.cluster /usr/local/3fs/conf/
This file holds the FoundationDB cluster address, which clients use to connect.
4.2 Edit the configuration
Edit admin_cli.toml:
$ vim /usr/local/3fs/conf/admin_cli.toml
cluster_id = 'zetyun'
log = 'DBG:normal; normal=file:path=/usr/local/3fs/logs/cli.log,async=true,sync_level=ERR'
[fdb]
clusterFile = '/usr/local/3fs/conf/fdb.cluster'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
4.3 Try running it
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml help
If the help text is printed, the tool is working.
5. Cluster management service - Mgmtd
Run this step on the metadata node.
5.1 Edit the configuration
Edit mgmtd_main_app.toml:
$ vim /usr/local/3fs/conf/mgmtd_main_app.toml
node_id = 1
Edit mgmtd_main_launcher.toml:
$ vim /usr/local/3fs/conf/mgmtd_main_launcher.toml
cluster_id = 'zetyun'
[fdb]
clusterFile = '/usr/local/3fs/conf/fdb.cluster'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
Edit mgmtd_main.toml:
$ vim /usr/local/3fs/conf/mgmtd_main.toml
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/mgmtd_main.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/mgmtd_main-err.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/mgmtd_main-fatal.log'
[common.monitor.reporters.monitor_collector]
remote_ip = '172.30.12.61:10000'
[server.base.groups.listener]
filter_list = [ 'bond1' ]
listen_port = 8000
[server.base.groups.listener]
filter_list = [ 'bond1' ]
listen_port = 9000
5.2 Initialize the cluster
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml "init-cluster --mgmtd /usr/local/3fs/conf/mgmtd_main.toml 1 1048576 16"
Init filesystem, root directory layout: chain table ChainTableId(1), chunksize 1048576, stripesize 16
Init config for MGMTD version 1
Here 1 is the chain table ID, 1048576 is the chunk size, and 16 is the file stripe size.
This step writes data into the database by sending it to port 4500, where FoundationDB listens.
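To get a feel for the two numeric arguments, the arithmetic below works out how many bytes one full stripe covers and which chunk a given file offset falls into. It is only a mental model and rests on an assumption: that the stripe size counts chunks placed round-robin, one per chain. The actual placement inside 3FS may differ, for example by shuffling the chain order. The function names are hypothetical.

```shell
# Bytes covered by one full stripe: chunk size times chunks per stripe.
# Usage: stripe_span_bytes <chunk size> <stripe size>
stripe_span_bytes() {
  echo $(( $1 * $2 ))
}

# For a byte offset, print "<chunk index> <slot within the stripe>".
# Assumption: chunks are laid out round-robin across the stripe.
# Usage: locate_offset <offset> <chunk size> <stripe size>
locate_offset() {
  local offset="$1" chunk_size="$2" stripe_size="$3"
  local chunk=$(( offset / chunk_size ))
  echo "$chunk $(( chunk % stripe_size ))"
}

stripe_span_bytes 1048576 16        # prints 16777216, i.e. 16 MiB per stripe
locate_offset 17825792 1048576 16   # offset 17 MiB: prints "17 1"
```

With the values used in init-cluster, a file has to grow past 16 MiB before a second chunk lands in the same stripe slot.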
5.3 Start the service
$ cp /usr/local/3fs/misc/systemd/mgmtd_main.service /usr/lib/systemd/system/
$ vim /usr/lib/systemd/system/mgmtd_main.service # update the file paths; see below
$ systemctl start mgmtd_main
Change mgmtd_main.service as follows:
ExecStart=/usr/local/3fs/bin/mgmtd_main --launcher_cfg /usr/local/3fs/conf/mgmtd_main_launcher.toml --app-cfg /usr/local/3fs/conf/mgmtd_main_app.toml
5.4 Check the service status
Check that the service is running:
$ systemctl status mgmtd_main
Check that the listening addresses are as expected:
$ netstat -antlp | grep LISTEN | grep 'mgmtd_main'
tcp 0 0 172.30.12.61:8000 0.0.0.0:* LISTEN 420329/mgmtd_main
tcp 0 0 172.30.12.61:9000 0.0.0.0:* LISTEN 420329/mgmtd_main
Check the log for errors:
$ cat /usr/local/3fs/logs/mgmtd_main-err.log
List the nodes:
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "list-nodes"
Id Type Status Hostname Pid Tags LastHeartbeatTime ConfigVersion ReleaseVersion
1 MGMTD PRIMARY_MGMTD hd03-gpu2-0061 420329 [] N/A 1(UPTODATE) 250228-dev-1-999999-33da0642
Other: cleaning up the service
If mgmtd_main.toml needs to change, you can stop FoundationDB, delete its data, and then run step 5.2 again:
$ systemctl stop foundationdb
$ rm -rf /var/lib/foundationdb/data/* /var/log/foundationdb/* /etc/foundationdb/*
$ dpkg -P foundationdb-clients foundationdb-server
$ dpkg -i /usr/local/3fs/misc/deps/foundationdb-clients_7.1.67-1_amd64.deb /usr/local/3fs/misc/deps/foundationdb-server_7.1.67-1_amd64.deb
$ cp /etc/foundationdb/fdb.cluster /usr/local/3fs/conf/
$ systemctl status foundationdb
6. Metadata service - Meta
Run this step on the metadata node.
6.1 Edit the configuration
meta_main_app.toml
$ vim /usr/local/3fs/conf/meta_main_app.toml
node_id = 100
meta_main_launcher.toml
$ vim /usr/local/3fs/conf/meta_main_launcher.toml
cluster_id = 'zetyun'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
[mgmtd_client]
mgmtd_server_addresses = [ 'RDMA://172.30.12.61:8000' ]
meta_main.toml
$ vim /usr/local/3fs/conf/meta_main.toml
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_meta_main.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_meta_main-err.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_meta_main-fatal.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_meta_main-event.log'
[common.monitor.reporters.monitor_collector]
remote_ip = '172.30.12.61:10000'
[server.mgmtd_client]
mgmtd_server_addresses = [ 'RDMA://172.30.12.61:8000' ]
[server.base.groups.listener]
filter_list = [ 'bond1' ]
listen_port = 8001
[server.base.groups.listener]
filter_list = [ 'bond1' ]
listen_port = 9001
[server.fdb]
clusterFile = '/usr/local/3fs/conf/fdb.cluster'
6.2 Upload the configuration
Upload the configuration to the management service:
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "set-config --type META --file /usr/local/3fs/conf/meta_main.toml"
Succeed
ConfigVersion 1
6.3 Start the service
$ cp /usr/local/3fs/misc/systemd/meta_main.service /usr/lib/systemd/system
$ vim /usr/lib/systemd/system/meta_main.service # update the file paths; see below
$ systemctl start meta_main
Change meta_main.service as follows:
ExecStart=/usr/local/3fs/bin/meta_main --launcher_cfg /usr/local/3fs/conf/meta_main_launcher.toml --app-cfg /usr/local/3fs/conf/meta_main_app.toml
6.4 Check the service status
Check that the service is running:
$ systemctl status meta_main
Check that the listening addresses are as expected:
$ netstat -antlp | grep LISTEN | grep meta_main
tcp 0 0 172.30.12.61:8001 0.0.0.0:* LISTEN 431374/meta_main
tcp 0 0 172.30.12.61:9001 0.0.0.0:* LISTEN 431374/meta_main
Check the log for errors:
$ cat /usr/local/3fs/logs/hf3fs_meta_main-err.log
List the nodes:
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "list-nodes"
Id Type Status Hostname Pid Tags LastHeartbeatTime ConfigVersion ReleaseVersion
1 MGMTD PRIMARY_MGMTD hd03-gpu2-0061 420329 [] N/A 1(UPTODATE) 250228-dev-1-999999-33da0642
100 META HEARTBEAT_CONNECTED hd03-gpu2-0061 431374 [] 2025-03-11 11:51:12 1(UPTODATE) 250228-dev-1-999999-33da0642
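7. Data service - Storage
Run the following steps on the data nodes.
7.1 Adjust system parameters
Raise the maximum number of AIO requests:
$ sysctl -w fs.aio-max-nr=67108864
$ sysctl -n fs.aio-max-nr # show the current value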
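A value set with `sysctl -w` lasts only until the next reboot. To keep it, the setting can be written to a drop-in file under /etc/sysctl.d, which is read at boot. The sketch below shows one way to do that; the helper name `persist_aio_max_nr` and the file name `99-3fs.conf` are assumptions chosen for this example.

```shell
# Write a sysctl drop-in so that fs.aio-max-nr survives reboots.
# Usage: persist_aio_max_nr <value> [drop-in file]
# Assumption: 99-3fs.conf is a made-up file name; any *.conf in /etc/sysctl.d works.
persist_aio_max_nr() {
  local value="$1" file="${2:-/etc/sysctl.d/99-3fs.conf}"
  printf 'fs.aio-max-nr = %s\n' "$value" > "$file"
}
```

Running `persist_aio_max_nr 67108864` as root and then `sysctl --system` reloads all sysctl configuration files, including the new one.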
7.2 Edit the configuration
storage_main_launcher.toml
$ vim /usr/local/3fs/conf/storage_main_launcher.toml
cluster_id = 'zetyun'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
[mgmtd_client]
mgmtd_server_addresses = [ 'RDMA://172.30.12.61:8000' ]
storage_main.toml
$ vim /usr/local/3fs/conf/storage_main.toml
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/storage_main.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/storage_main-err.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/storage_main-fatal.log'
[common.monitor.reporters.monitor_collector]
remote_ip = '172.30.12.61:10000'
[server.base.groups.listener]
filter_list = [ 'bond1' ]
listen_port = 8000
[server.base.groups.listener]
filter_list = [ 'bond1' ]
listen_port = 9000
[server.mgmtd]
mgmtd_server_addresses = [ 'RDMA://172.30.12.61:8000' ]
[server.targets]
target_paths = [ '/3fs/storage/data0', '/3fs/storage/data1', '/3fs/storage/data2' ]
storage_main_app.toml
$ vim /usr/local/3fs/conf/storage_main_app.toml
node_id = 10001 # one ID per machine; the 6 machines use 10001~10006
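Because node_id has to be different on every storage machine, the one-line storage_main_app.toml can be generated from a machine index instead of edited by hand. This is a minimal sketch that assumes the numbering scheme shown above (10000 plus the machine index); `render_storage_app_toml` is a hypothetical helper name.

```shell
# Print the storage_main_app.toml content for the N-th storage machine.
# Usage: render_storage_app_toml <machine index> [base]
# Assumption: node IDs are base + index, matching the 10001, 10002, ... scheme.
render_storage_app_toml() {
  local index="$1" base="${2:-10000}"
  echo "node_id = $(( base + index ))"
}

render_storage_app_toml 1   # prints: node_id = 10001
```

On the third storage machine, for instance, `render_storage_app_toml 3 > /usr/local/3fs/conf/storage_main_app.toml` writes `node_id = 10003`.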
admin_cli.toml
$ vim /usr/local/3fs/conf/admin_cli.toml
cluster_id = 'zetyun'
log = 'DBG:normal; normal=file:path=/usr/local/3fs/logs/cli.log,async=true,sync_level=ERR'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
7.3 Upload the configuration
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "set-config --type STORAGE --file /usr/local/3fs/conf/storage_main.toml"
Succeed
ConfigVersion 1
7.4 Start the service
$ cp /usr/local/3fs/misc/systemd/storage_main.service /usr/lib/systemd/system
$ vim /usr/lib/systemd/system/storage_main.service # update the binary and configuration file paths
$ systemctl start storage_main
7.5 Check the service status
Check that the service is running:
$ systemctl status storage_main
Check that the listening addresses are as expected:
$ netstat -antlp | grep LISTEN | grep -E 'storage'
tcp 0 0 172.30.12.48:19000 0.0.0.0:* LISTEN 3379918/storage_mai
tcp 0 0 172.30.12.48:8000 0.0.0.0:* LISTEN 3379918/storage_mai
Check the log for errors:
$ cat /usr/local/3fs/logs/storage_main-err.log
List the nodes:
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "list-nodes"
1 MGMTD PRIMARY_MGMTD hd03-gpu2-0061 50900 [] N/A 1(UPTODATE) 250228-dev-1-999999-33da0642
100 META HEARTBEAT_CONNECTED hd03-gpu2-0061 51569 [] 2025-03-11 19:26:09 1(UPTODATE) 250228-dev-1-999999-33da0642
10001 STORAGE HEARTBEAT_CONNECTED hd03-gpu2-0046 3382653 [] 2025-03-11 19:26:16 6(UPTODATE) 250228-dev-1-999999-33da0642
10002 STORAGE HEARTBEAT_CONNECTED hd03-gpu2-0047 3630232 [] 2025-03-11 19:26:16 6(UPTODATE) 250228-dev-1-999999-33da0642
10003 STORAGE HEARTBEAT_CONNECTED hd03-gpu2-0048 3379918 [] 2025-03-11 19:26:16 6(UPTODATE) 250228-dev-1-999999-33da0642
10004 STORAGE HEARTBEAT_CONNECTED hd03-gpu2-0049 3385727 [] 2025-03-11 19:26:16 6(UPTODATE) 250228-dev-1-999999-33da0642
10005 STORAGE HEARTBEAT_CONNECTED hd03-gpu2-0050 3631938 [] 2025-03-11 19:26:16 6(UPTODATE) 250228-dev-1-999999-33da0642
10006 STORAGE HEARTBEAT_CONNECTED hd03-gpu2-0060 253473 [] 2025-03-11 19:26:14 6(UPTODATE) 250228-dev-1-999999-33da0642
8. Configure 3FS
Go back to the metadata node for this step.
8.1 Create the admin user
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "user-add --root --admin 0 root"
Uid 0
Name root
Token AABCB/x58QAyLhOJ2wCGqDu4(Expired at N/A)
IsRootUser true
IsAdmin true
Gid 0
SupplementaryGids
The terminal shows the newly created token. Save it to /usr/local/3fs/conf/token.txt:
$ echo AABCB/x58QAyLhOJ2wCGqDu4 > /usr/local/3fs/conf/token.txt
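Copying the token by hand is easy to get wrong, so the value can also be pulled out of the command output with a filter. The sketch below assumes the `user-add` output has a line that starts with `Token`, followed by the token and an `(Expired at ...)` suffix, as in the output shown above. `extract_token` is a hypothetical helper, not part of admin_cli.

```shell
# Read user-add output on stdin and print only the token value.
# Assumption: the token line looks like "Token <value>(Expired at ...)".
extract_token() {
  awk '$1 == "Token" {
    sub(/\(.*/, "", $2)   # drop the "(Expired at ..." suffix glued to the token
    print $2
    exit
  }'
}
```

The saved output of the user-add command can then be piped through it, for example `extract_token < user-add.out > /usr/local/3fs/conf/token.txt`, where `user-add.out` is a placeholder file name. Since the token grants admin access, restricting the file with `chmod 600` is a reasonable precaution.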
8.2 Create the chain table
Install the dependencies first:
$ apt install -y python3-pip
$ pip3 install -r /usr/local/3fs/misc/data_placement/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
Run the data_placement command, setting num_nodes to the actual number of storage nodes:
$ cd /usr/local/3fs
$ python3 /usr/local/3fs/misc/data_placement/src/model/data_placement.py -ql -relax -type CR --num_nodes 5 --replication_factor 3 --min_targets_per_disk 6
...
2025-03-11 19:38:20.416 | SUCCESS | __main__:run:148 - saved solution to: output/DataPlacementModel-v_5-b_10-r_6-k_3-λ_2-lb_1-ub_1
Generate the chain table. Adjust node_id_begin, node_id_end, num_disks_per_node, incidence_matrix_path, and the related parameters; incidence_matrix_path points to the file generated in the previous step:
$ python3 /usr/local/3fs/misc/data_placement/src/setup/gen_chain_table.py \
    --chain_table_type CR --node_id_begin 10001 --node_id_end 10005 \
    --num_disks_per_node 3 --num_targets_per_disk 6 \
    --target_id_prefix 1 --chain_id_prefix 9 \
    --incidence_matrix_path output/DataPlacementModel-v_5-b_10-r_6-k_3-λ_2-lb_1-ub_1/incidence_matrix.pickle
Check the output files:
$ ls -ls output
4 drwxr-xr-x 2 root root 4096 Mar 11 19:38 DataPlacementModel-v_5-b_10-r_6-k_3-λ_2-lb_1-ub_1
12 -rw-r--r-- 1 root root 8714 Mar 11 19:38 appsi_highs.log
12 -rw-r--r-- 1 root root 10350 Mar 11 19:39 create_target_cmd.txt
4 -rw-r--r-- 1 root root 308 Mar 11 19:39 generated_chain_table.csv
4 -rw-r--r-- 1 root root 1505 Mar 11 19:39 generated_chains.csv
12 -rw-r--r-- 1 root root 9990 Mar 11 19:39 remove_target_cmd.txt
Create the storage targets:
$ /usr/local/3fs/bin/admin_cli --cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' --config.user_info.token $(<"/usr/local/3fs/conf/token.txt") < output/create_target_cmd.txt
...
Create target 101000100306 on disk 2 of 10001 succeeded
Create target 101000300306 on disk 2 of 10003 succeeded
Create target 101000500306 on disk 2 of 10005 succeeded
Upload the chains and the chain table to mgmtd:
$ /usr/local/3fs/bin/admin_cli --cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' --config.user_info.token $(<"/usr/local/3fs/conf/token.txt") "upload-chains output/generated_chains.csv"
Upload 30 chains succeeded
$ /usr/local/3fs/bin/admin_cli --cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' --config.user_info.token $(<"/usr/local/3fs/conf/token.txt") "upload-chain-table --desc zetyun 1 output/generated_chain_table.csv"
Upload ChainTableId(1) of ChainTableVersion(1) succeeded
Verify the upload with two commands:
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "list-chains"
...
900300008 1 1 SERVING [] 101000100305(SERVING-UPTODATE) 101000300304(SERVING-UPTODATE) 101000400305(SERVING-UPTODATE)
900300009 1 1 SERVING [] 101000300305(SERVING-UPTODATE) 101000400306(SERVING-UPTODATE) 101000500305(SERVING-UPTODATE)
900300010 1 1 SERVING [] 101000100306(SERVING-UPTODATE) 101000300306(SERVING-UPTODATE) 101000500306(SERVING-UPTODATE)
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "list-chain-tables"
ChainTableId ChainTableVersion ChainCount ReplicaCount Desc
1 1 30 3 zetyun
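The chain count of 30 can be checked against the placement parameters with simple arithmetic. The sketch assumes that every storage target belongs to exactly one chain and that each chain holds one target per replica, which is consistent with the list-chains rows above. `chain_count` is a hypothetical helper.

```shell
# Expected number of chains for a placement.
# Usage: chain_count <nodes> <disks per node> <targets per disk> <replication factor>
# Assumption: each target sits in exactly one chain of <replication factor> targets.
chain_count() {
  local nodes="$1" disks="$2" targets_per_disk="$3" replicas="$4"
  # Total targets divided by targets per chain.
  echo $(( nodes * disks * targets_per_disk / replicas ))
}

chain_count 5 3 6 3   # 5 nodes x 3 disks x 6 targets = 90 targets; prints 30
```

If list-chain-tables reports a different ChainCount than this figure, the arguments passed to gen_chain_table.py are the first thing to recheck.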
9. Client - Fuse Client
Run the following commands on the client nodes.
9.1 Edit the configuration
Save the token (the one generated in step 8.1):
$ echo AABCB/x58QAyLhOJ2wCGqDu4 > /usr/local/3fs/conf/token.txt
hf3fs_fuse_main_launcher.toml
$ vim /usr/local/3fs/conf/hf3fs_fuse_main_launcher.toml
cluster_id = 'zetyun'
mountpoint = '/mnt/3fs'
token_file = '/usr/local/3fs/conf/token.txt'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
[mgmtd_client]
mgmtd_server_addresses = [ 'RDMA://172.30.12.61:8000' ]
hf3fs_fuse_main.toml
$ vim /usr/local/3fs/conf/hf3fs_fuse_main.toml
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_fuse_main.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_fuse_main-err.log'
[[common.log.handlers]]
file_path = '/usr/local/3fs/logs/hf3fs_fuse_main-fatal.log'
[mgmtd]
mgmtd_server_addresses = [ 'RDMA://172.30.12.61:8000' ]
[common.monitor.reporters.monitor_collector]
remote_ip = '172.30.12.61:10000'
admin_cli.toml
$ vim /usr/local/3fs/conf/admin_cli.toml
cluster_id = 'zetyun'
log = 'DBG:normal; normal=file:path=/usr/local/3fs/logs/cli.log,async=true,sync_level=ERR'
[ib_devices]
device_filter = [ 'ib7s400p0' ]
9.2 Upload the configuration
$ /usr/local/3fs/bin/admin_cli -cfg /usr/local/3fs/conf/admin_cli.toml --config.mgmtd_client.mgmtd_server_addresses '["RDMA://172.30.12.61:8000"]' "set-config --type FUSE --file /usr/local/3fs/conf/hf3fs_fuse_main.toml"
9.3 Mount Fuse
$ cp /usr/local/3fs/misc/systemd/hf3fs_fuse_main.service /usr/lib/systemd/system
$ vim /usr/lib/systemd/system/hf3fs_fuse_main.service # update the file paths; see below
$ systemctl start hf3fs_fuse_main
Change hf3fs_fuse_main.service as follows:
ExecStart=/usr/local/3fs/bin/hf3fs_fuse_main --launcher_cfg /usr/local/3fs/conf/hf3fs_fuse_main_launcher.toml
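9.4 Check the service status
Check that the service is running:
$ systemctl status hf3fs_fuse_main
Check the log for errors:
$ cat /usr/local/3fs/logs/hf3fs_fuse_main-err.log
Check that the mount point exists:
$ mount | grep zetyun
Try a few operations:
$ cd /mnt/3fs
$ touch f1 f2 f3
$ ls -ls
$ seq 1 1000000 > f1
$ cat f1
If everything works, the cluster has been deployed successfully :)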
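The manual write-and-read check can be wrapped into a repeatable smoke test that compares checksums instead of relying on reading the output of `cat`. This is a sketch only: `verify_roundtrip` is a hypothetical helper, and the scratch file name is made up for this example. A passing result shows that the written data reads back intact through the mount, not that the cluster performs well.

```shell
# Write a known sequence into a directory, read it back, and compare checksums.
# Usage: verify_roundtrip <directory> [count]
# Returns 0 when the data read back matches what was written.
verify_roundtrip() {
  local dir="$1" n="${2:-1000000}"
  local file="$dir/.roundtrip.$$"   # scratch file name is an arbitrary choice
  local want got
  seq 1 "$n" > "$file" || return 1
  want=$(seq 1 "$n" | cksum)        # checksum of the expected content
  got=$(cksum < "$file")            # checksum of what the file system returns
  rm -f "$file"
  [ "$want" = "$got" ]
}
```

On a client node, `verify_roundtrip /mnt/3fs && echo OK` repeats the `seq` test and cleans up after itself.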
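That concludes this tutorial.
| Who we are
This hands-on walkthrough was provided by engineers at 九章云极.
九章云极, in full 北京九章云极科技有限公司, was founded in 2013 and focuses on the large-scale application of AI infrastructure software. Building on current AI technology, it delivers integrated "compute + algorithm" AI services through its own "compute package" products and its intelligent-computing operating system.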
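Next we will start a dedicated column and keep sharing in the areas our company focuses on. Comments and discussion are welcome!
| Easter egg
Finally, we would like to present another storage system with broader applicability and lower cost: the DataCanvas DingoFS distributed storage system. It was developed by 北京九章云极科技有限公司, first published on November 20, 2024, and registered on January 14, 2025. DingoFS offers efficient data storage and management, distributed storage for large-scale data, high availability, and scalability, which makes it a better fit for workloads that process large volumes of data and demand high reliability. The upcoming release of DingoFS will bring better metadata performance.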
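| End
Fellow practitioners are welcome to try the DingoFS storage system and to offer suggestions for its future iterations, so that together we can advance the development of domestic large models.
| Coming next
3FS performance measured on a 400G network
A detailed test of 3FS metadata performance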