| 知乎专栏 |
命令行演示
检查 Yolo 能否使用 GPU
import torch
# 检查torch是否有支持CUDA,即GPU能否使用
if torch.cuda.is_available():
print(torch.cuda.get_device_name(0))
print(torch.version.cuda)
else:
print("No gpu")
# 如果CUDA可用,它还会打印出当前默认的CUDA设备(通常是第一个GPU)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
查看 GPU 进程
PS D:\workspace\netkiller> nvidia-smi.exe Fri Nov 29 16:53:25 2024 +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 560.94 Driver Version: 560.94 CUDA Version: 12.6 | |-----------------------------------------+------------------------+----------------------+ | GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |=========================================+========================+======================| | 0 NVIDIA GeForce RTX 3050 ... WDDM | 00000000:02:00.0 Off | N/A | | N/A 55C P3 10W / 47W | 785MiB / 6144MiB | 40% Default | | | | N/A | +-----------------------------------------+------------------------+----------------------+ +-----------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=========================================================================================| | 0 N/A N/A 17360 C ...rograms\Python\Python312\python.exe N/A | +-----------------------------------------------------------------------------------------+
PS D:\workspace\netkiller> nvidia-smi.exe -q
==============NVSMI LOG==============
Timestamp : Fri Nov 29 16:46:47 2024
Driver Version : 560.94
CUDA Version : 12.6
Attached GPUs : 1
GPU 00000000:02:00.0
Product Name : NVIDIA GeForce RTX 3050 6GB Laptop GPU
Product Brand : GeForce
Product Architecture : Ampere
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : N/A
Addressing Mode : N/A
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : WDDM
Pending : WDDM
Serial Number : N/A
GPU UUID : GPU-c6f6cb0c-8715-e499-2cf6-ba28d1ca9674
Minor Number : N/A
VBIOS Version : 94.07.87.00.e9
MultiGPU Board : No
Board ID : 0x200
Board Part Number : N/A
GPU Part Number : 25AC-730-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G001.0000.94.01
OEM Object : 2.0
ECC Object : N/A
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x02
Device : 0x00
Domain : 0x0000
Device Id : 0x25AC10DE
Bus Id : 00000000:02:00.0
Sub System Id : 0x8C2B103C
GPU Link Info
PCIe Generation
Max : 4
Current : 4
Device Current : 4
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 4x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 14 KB/s
Rx Throughput : 38 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : N/A
Performance State : P3
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 6144 MiB
Reserved : 142 MiB
Used : 785 MiB
Free : 5218 MiB
BAR1 Memory Usage
Total : 8192 MiB
Used : 8164 MiB
Free : 28 MiB
Conf Compute Protected Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : 25 %
Memory : 11 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows : N/A
Temperature
GPU Current Temp : 57 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 100 C
GPU Slowdown Temp : 97 C
GPU Max Operating Temp : 105 C
GPU Target Temperature : 87 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 11.06 W
Current Power Limit : 46.28 W
Requested Power Limit : N/A
Default Power Limit : 35.00 W
Min Power Limit : 1.00 W
Max Power Limit : 60.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 615 MHz
SM : 615 MHz
Memory : 5000 MHz
Video : 697 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 5501 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 612.500 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 17360
Type : C
Name : C:\Users\neo\AppData\Local\Programs\Python\Python312\python.exe
Used GPU Memory : Not available in WDDM driver model
Capabilities
EGM : disabled
PS D:\workspace\netkiller>
Accelerated PyTorch training on Mac
xcode-select --install
检查是否支持苹果M芯片
import torch # 判断macOS的版本是否支持 print(torch.backends.mps.is_available()) # 判断mps是否可用 print(torch.backends.mps.is_built())
运行一段代码,测试一下
import torch
if torch.backends.mps.is_available():
mps_device = torch.device("mps")
x = torch.ones(1, device=mps_device)
print (x)
else:
print ("MPS device not found.")
输出下面信息,表示系统运行在MPS环境下,一切正常
tensor([1.], device='mps:0')
训练时会现实 Ultralytics 8.3.38 🚀 Python-3.12.7 torch-2.5.1 MPS (Apple M4)
(.venv) neo@Neo-Mac-mini-M4 netkiller % yolo task=detect mode=train model=tongue.yaml data=/Volumes/tmp/datasets/XinNaoXieGuan/data.yaml epochs=150 device=mps name=XinNaoXieGuan project=/Volumes/tmp/runs WARNING ⚠️ no model scale passed. Assuming scale='n'. New https://pypi.org/project/ultralytics/8.3.49 available 😃 Update with 'pip install -U ultralytics' Ultralytics 8.3.38 🚀 Python-3.12.7 torch-2.5.1 MPS (Apple M4) engine/trainer: task=detect, mode=train, model=tongue.yaml, data=/Volumes/tmp/datasets/XinNaoXieGuan/data.yaml, epochs=150, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=mps, workers=8, project=/Volumes/tmp/runs, name=XinNaoXieGuan2, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=None, workspace=None, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, copy_paste_mode=flip, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=/Volumes/tmp/runs/XinNaoXieGuan2
resume=True
(.venv) PS D:\workspace\medical> yolo task=detect mode=train model=model/yolo11n.pt data=E:/datasets/AllInOne/data.yaml epochs=300 device=0 name=AllInOne project=E:/runs resume=True