2026-06-08 22:08:49

centos7-nvidia driver installation

Category          Info
Server model      Rack Mount Chassis NF5280M6
CPU               Intel® Xeon® Silver 4310 CPU @ 2.10GHz * 2
OS version        CentOS 7
Kernel version    3.10.0-….el7.x86_64
GPU model         NVIDIA A100 40G * 4
NVIDIA driver     525.85.05
CUDA version      12.0
Docker version    20.10.x

Base system (skip if already installed)

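Install base software

    yum update
    yum -y install openssh-server openssh-client apt-utils freeipmi ipmitool sshpass ethtool zip unzip nano less git netplan.io iputils-ping mtr ipvsadm smartmontools python3-pip socat conntrack libvirt-clients libnuma-dev ctorrent nvme-cli gcc-12 g++-12 vim wget apt git unzip zip ntp ntpdate lrzsz lftp tree bash-completion elinks dos2unix tmux jq
    yum -y install nmap net-tools mtr traceroute tcptraceroute aptitude htop iftop hping3 fping nethogs sshuttle tcpdump figlet stress iperf iperf3 dnsutils curl linux-tools-generic linux-cloud-tools-generic
    yum groupinstall -y "Development Tools"
    curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | sudo bash
    yum install git-lfs
    git lfs install

Several names in these lists (for example apt-utils, netplan.io, aptitude, linux-tools-generic) are Debian/Ubuntu package names and are not available from the CentOS 7 repositories. The git-lfs repository script is the RPM variant (script.rpm.sh), since the .deb variant targets apt-based systems.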

Adjust file descriptor limits

    echo "ulimit -SHn 655350" >> /etc/profile
    echo "fs.file-max = 655350" >> /etc/sysctl.conf
    echo "root soft nofile 655350" >> /etc/security/limits.conf
    echo "root hard nofile 655350" >> /etc/security/limits.conf
    echo "* soft nofile 655350" >> /etc/security/limits.conf
    echo "* hard nofile 655350" >> /etc/security/limits.conf
    source /etc/profile

Improve shell history

    cat >> /etc/profile <<'EOF'
    export HISTTIMEFORMAT="%Y-%m-%d %H:%M:%S `whoami` "
    export HISTFILESIZE=50000
    export HISTSIZE=50000
    EOF
    source /etc/profile
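After appending to limits.conf it can be useful to confirm what the file now says. The sketch below is one possible check, not part of the original procedure: the function name nofile_limit is hypothetical, and it only reads limits.conf-style text and prints the last matching nofile entry for a given domain and limit type.

```shell
# Hypothetical helper: print the nofile value configured for DOMAIN and TYPE.
# Reads limits.conf-style text on stdin and reports the last matching entry.
# Usage: nofile_limit DOMAIN TYPE < /etc/security/limits.conf
nofile_limit() {
    awk -v dom="$1" -v typ="$2" '
        /^[ \t]*#/ { next }                        # skip comment lines
        $1 == dom && $2 == typ && $3 == "nofile" { val = $4 }
        END { if (val != "") print val; else exit 1 }
    '
}
```

For example, `nofile_limit root soft < /etc/security/limits.conf` should print 655350 once the lines above have been appended; a non-zero exit status means no matching entry was found.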

Tune kernel parameters

    cp /etc/sysctl.conf /etc/sysctl.conf.bak
    vi /etc/sysctl.conf

    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_abort_on_overflow = 1
    net.ipv4.tcp_max_tw_buckets = 6000
    net.ipv4.tcp_sack = 1
    net.ipv4.tcp_window_scaling = 1
    net.ipv4.tcp_rmem = 4096 87380 4194304
    net.ipv4.tcp_wmem = 4096 66384 4194304
    net.ipv4.tcp_mem = 94500000 915000000 927000000
    net.core.optmem_max = 81920
    net.core.wmem_default = 8388608
    net.core.wmem_max = 16777216
    net.core.rmem_default = 8388608
    net.core.rmem_max = 16777216
    net.ipv4.tcp_max_syn_backlog = 1020000
    net.core.netdev_max_backlog = 862144
    net.core.somaxconn = 262144
    net.ipv4.tcp_max_orphans = 327680
    net.ipv4.tcp_timestamps = 0
    net.ipv4.tcp_synack_retries = 1
    net.ipv4.tcp_syn_retries = 1
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_keepalive_time = 30
    net.ipv4.ip_local_port_range = 1024 65535
    net.netfilter.nf_conntrack_tcp_timeout_established = 180
    net.netfilter.nf_conntrack_max = 1048576
    net.nf_conntrack_max = 1048576
    fs.file-max = 655350

    modprobe nf_conntrack
    sysctl -p /etc/sysctl.conf
    sysctl -w net.ipv4.route.flush=1
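Editing /etc/sysctl.conf by hand or with repeated echo appends tends to leave duplicate keys behind when the procedure is run more than once. As a minimal sketch, assuming an idempotent update is wanted, the hypothetical function set_sysctl_key below rewrites a key if it is already present and appends it otherwise. The file path is a parameter so it can be tried on a scratch copy first.

```shell
# Hypothetical helper: set KEY to VALUE in a sysctl.conf-style FILE.
# An existing line for KEY is rewritten (only one copy is kept);
# if the key is absent, "KEY = VALUE" is appended.
set_sysctl_key() {
    key=$1 value=$2 file=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        {
            name = $1
            gsub(/[ \t]/, "", name)            # key without padding
            if (name == k) {
                if (!done) print k " = " v     # keep a single copy
                done = 1
                next
            }
            print
        }
        END { if (!done) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

A call such as `set_sysctl_key net.core.somaxconn 262144 /etc/sysctl.conf` followed by `sysctl -p /etc/sysctl.conf` would then apply the value without growing the file on every run.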

GPU driver, CUDA, and related deployment

Manually create the configuration that disables nouveau

    bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
    bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
    echo "options nouveau modeset=0" | tee -a /etc/modprobe.d/nouveau-kms.conf
    # back up /boot, then rebuild the initramfs
    cp -r /boot/ /root/
    dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
    # reboot and verify that nouveau is disabled
    reboot
    lsmod | grep nouveau

After the reboot, open a terminal and run the lsmod command above. If it prints nothing, nouveau was disabled correctly.

Install the NVIDIA driver

Look up the recommended driver version at https://download.nvidia.com/XFree86/Linux-x86_64 (you are not required to pick the recommended one).

    # import the ELRepo public key
    sudo rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
    # install the ELRepo repository
    sudo yum install -y https://www.elrepo.org/elrepo-release-7.0-….el7.elrepo.noarch.rpm
    sudo yum makecache
    lspci | grep -i nvidia

Download the kernel packages that match the running kernel, to avoid build errors during the driver installation.

    # install yum-config-manager, used to locate packages for older CentOS 7 kernels
    yum install -y yum-utils
    # enable the vault repository
    yum-config-manager --enable vault
    yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
    wget https://download.nvidia.com/XFree86/Linux-x86_64/525.85.05/NVIDIA-Linux-x86_64-525.85.05.run
    chmod +x NVIDIA-Linux-x86_64-525.85.05.run
    bash NVIDIA-Linux-x86_64-525.85.05.run --no-opengl-files --ui=none --no-questions --accept-license

When the installation finishes, run nvidia-smi to check the result:

    [root@gnode196 ~]# nvidia-smi
    Tue Jan 27 16:48:41 2026
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 525.85.05    Driver Version: 525.85.05    CUDA Version: 12.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA A100-PCI...  Off  | 00000000:4B:00.0 Off |                    0 |
    | N/A   32C    P0    36W / 250W |      0MiB / 40960MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+
    |   1  NVIDIA A100-PCI...  Off  | 00000000:65:00.0 Off |                    0 |
    | N/A   33C    P0    36W / 250W |      0MiB / 40960MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+
    |   2  NVIDIA A100-PCI...  Off  | 00000000:CA:00.0 Off |                    0 |
    | N/A   31C    P0    38W / 250W |      0MiB / 40960MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+
    |   3  NVIDIA A100-PCI...  Off  | 00000000:E3:00.0 Off |                    0 |
    | N/A   32C    P0    39W / 250W |      0MiB / 40960MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

Install CUDA

The nvidia-smi banner shows the CUDA version this driver supports, which here is 12.0.
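Reading the CUDA version and the GPU count off the nvidia-smi output can also be scripted, which helps when the same check has to be repeated across nodes. The following is a minimal sketch: the function names are hypothetical, and both functions read text on stdin so they can be tried on saved output without a GPU present.

```shell
# Hypothetical helper: extract the "CUDA Version" field from the
# nvidia-smi banner (read on stdin).
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: count GPUs from CSV query output, one GPU per line.
count_gpus() {
    grep -c .
}
```

On a live host these would typically be fed with `nvidia-smi | cuda_version_from_smi` and `nvidia-smi --query-gpu=index,name --format=csv,noheader | count_gpus`.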

Visit https://developer.nvidia.com/cuda-toolkit-archive and download CUDA 12.0.

    wget https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run
    bash cuda_12.0.0_525.60.13_linux.run --toolkit --silent --override

Add the environment variables and verify

Add the CUDA environment variables to /etc/profile:

    cat >> /etc/profile <<'EOF'
    export PATH=/usr/local/cuda-12.0/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64:$LD_LIBRARY_PATH
    EOF
    source /etc/profile
    nvcc -V

Install nvidia-docker

    curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
        sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
    yum install -y nvidia-container-toolkit

Verify the installation

    nvidia-container-cli --version
    nvidia-ctk --version

Configure Docker to use the NVIDIA runtime

    nvidia-ctk runtime configure --runtime=docker
    systemctl restart docker

Pin the kernel

    yum versionlock add kernel-3.10.0-….el7.x86_64
    yum versionlock add kernel-core-3.10.0-….el7.x86_64
    yum versionlock add kernel-modules-3.10.0-….el7.x86_64
    echo "exclude=kernel*" >> /etc/yum.conf

CPU/GPU performance settings

    # enable Persistence Mode
    nvidia-smi -pm 1
    # enable ECC memory mode (takes effect after the next reboot)
    nvidia-smi -e ENABLED
    # lock the CPU to the performance governor
    yum install -y kernel-tools
    cpupower idle-set -D 0
    cpupower frequency-set -g performance
    echo "cpupower frequency-set -g performance" >> /etc/rc.local
    chmod +x /etc/rc.d/rc.local
    # lock the GPU clocks to the maximum frequency
    nvidia-smi -lgc 1410,1410
    # disable PCIe ASPM power saving
    grubby --update-kernel=ALL --args="pcie_aspm=off"

Deploy HPC-X (choose the version to download at the bottom of https://developer.nvidia.com/networking/hpc-x)

    wget "http://www.mellanox.com/page/hpcx_eula?mrequest=downloads&mtype=hpc&mver=hpc-x&mname=v2.1….1/hpcx-v2.1….1-gcc-inbox-redhat7-cuda12-x86_64.tbz"

    tar -xf hpcx-v2.1….1-gcc-inbox-redhat7-cuda12-x86_64.tbz -C /opt/
    ln -s /opt/hpcx-v2.1….1-gcc-inbox-redhat7-cuda12-x86_64 /opt/hpcx
    export HPCX_HOME=/opt/hpcx
    . $HPCX_HOME/hpcx-init.sh
    hpcx_load

NCCL and gpu-burn tests

Install nccl (static build)

    mkdir -p /root/nccl/
    cd /root/nccl
    git clone https://github.com/NVIDIA/nccl.git
    cd nccl
    # -j sets the number of parallel build jobs
    make -j24 src.build CUDA_HOME=/usr/local/cuda PATH=$PATH:/usr/local/cuda/bin LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

Install nccl-tests (static build)

    mkdir -p /root/nccl/
    cd /root/nccl
    git clone https://github.com/NVIDIA/nccl-tests.git
    cd nccl-tests
    which mpirun
    # prints /opt/hpcx/ompi/bin/mpirun, so MPI_HOME=/opt/hpcx/ompi
    cd /root/nccl/nccl-tests
    PATH=$PATH:/usr/local/cuda/bin LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64 LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda/lib64 make -j30 CUDA_HOME=/usr/local/cuda NCCL_HOME=/root/nccl/nccl/build NCCL_LIBDIR=/root/nccl/nccl/build/lib NCCL_STATIC=1 NVCC_GENCODE="-gencode=arch=compute_80,code=sm_80"

NCCL test

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/nccl/nccl/build/lib
    ./build/all_reduce_perf -b 8 -e 35G -f 2 -g 4 -n 50

Test parameters

    -b <size>     start size, e.g. -b 1M
    -e <size>     end size, e.g. -e 10G
    -f <factor>   multiplier applied at each step, e.g. -f 2 doubles the size
    -g <count>    number of GPUs to use, e.g. -g 4
    -n <count>    number of test iterations, e.g. -n 100 (default 20)
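The -b, -e and -f parameters above together define a sweep over message sizes. As a rough illustration, assuming the sweep is a plain geometric progression that starts at the -b value and stops once the -e value is exceeded, the hypothetical function sweep_sizes below prints the sizes in bytes. It is a planning aid, not part of nccl-tests.

```shell
# Hypothetical helper: print the message sizes (in bytes) of a sweep that
# starts at MIN, multiplies by FACTOR each step, and stops once MAX is exceeded.
# Usage: sweep_sizes MIN MAX FACTOR
sweep_sizes() {
    size=$1 max=$2 factor=$3
    # A factor of 1 or less would never reach MAX, so refuse it.
    [ "$factor" -gt 1 ] || return 1
    while [ "$size" -le "$max" ]; do
        echo "$size"
        size=$((size * factor))
    done
}
```

For instance, `sweep_sizes 8 128 2` prints 8, 16, 32, 64 and 128 on separate lines, and piping a larger sweep through `wc -l` gives the number of rows to expect in the benchmark table.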

    # single-GPU test, from 8 bytes to 10GB, doubling each step
    ./build/all_reduce_perf -b 8 -e 10G -f 2 -g 1
    # 4-GPU test
    ./build/all_reduce_perf -b 8 -e 10G -f 2 -g 4
    # larger data volume (35GB, 4 GPUs)
    ./build/all_reduce_perf -b 8 -e 35G -f 2 -g 4
    # more iterations for more stable results
    ./build/all_reduce_perf -b 8 -e 10G -f 2 -g 4 -n 100
    # quick test over a small size range
    ./build/all_reduce_perf -b 1M -e 1G -f 2 -g 4

gpu-burn

    git clone https://github.com/wilicc/gpu-burn.git

Edit the build configuration

    cd gpu-burn
    vi Makefile

Change

    gpu_burn: gpu_burn-drv.o compare.ptx
        g++ -o $@ $< -O3 ${LDFLAGS}

to

    gpu_burn: gpu_burn-drv.o compare.ptx
        g++ -o $@ $< -O3 ${LDFLAGS} -static-libgcc -static-libstdc++

Build and test

Build after making the change. Because libgcc and libstdc++ are linked statically, the resulting binary can be copied to other machines and used directly.

    yum install -y libstdc++-static
    make clean
    make
    ./gpu_burn 3600    # test duration in seconds

Model deployment

Hugging Face download

    apt-get -y install git-lfs
    git lfs install
    apt-get install python3 python-is-python3
    python3 -m pip install --upgrade pip==20.….4 -i https://mirrors.aliyun.com/pypi/simple/
    pip3.12 config set global.index-url https://pypi.org/simple/
    pip3.12 install -U huggingface_hub --break-system-packages

Hugging Face login

    huggingface-cli login
    # or: hf auth login

Recent huggingface_hub releases (1.x) renamed the CLI command from huggingface-cli to hf, and the old huggingface-cli command is no longer supported in new versions.

    ⚠️ Warning: 'huggingface-cli login' is deprecated. Use 'hf auth login' instead.
    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
    Enter your token (input will not be visible):
    Add token as git credential? (Y/n) y
    Token is valid (permission: fineGrained).
    The token `deploy` has been saved to /root/.cache/huggingface/stored_tokens
    [root@gnode196 ~]# git config --global credential.helper store
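Because the CLI entry point depends on the installed huggingface_hub version, a script that has to run on several hosts may want to detect which command is present. The sketch below is one way to do that. The function names are hypothetical, and it relies only on the two login commands mentioned above.

```shell
# Hypothetical helper: print the name of the Hugging Face CLI found on PATH,
# preferring the newer `hf` entry point over the older `huggingface-cli`.
hf_cli() {
    if command -v hf >/dev/null 2>&1; then
        echo hf
    elif command -v huggingface-cli >/dev/null 2>&1; then
        echo huggingface-cli
    else
        return 1
    fi
}

# Hypothetical helper: print the login command for whichever CLI was found.
hf_login_cmd() {
    case "$(hf_cli)" in
        hf) echo "hf auth login" ;;
        huggingface-cli) echo "huggingface-cli login" ;;
        *) return 1 ;;
    esac
}
```

A provisioning script could then run `$(hf_login_cmd)` and stop with an error when neither command is installed.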
