Command-line demo
Check whether YOLO can use the GPU
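import torch

# Check whether this PyTorch build can see a CUDA device, i.e. whether the GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of GPU 0
    print(torch.version.cuda)  # CUDA version PyTorch was built against
else:
    print("No gpu")

# Use the first CUDA device (cuda:0) when available, otherwise fall back to the CPU, and print the choice
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)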
View GPU processes
PS D:\workspace\netkiller> nvidia-smi.exe
Fri Nov 29 16:53:25 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...  WDDM  |   00000000:02:00.0 Off |                  N/A |
| N/A   55C    P3             10W /   47W |     785MiB /   6144MiB |     40%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     17360      C   ...rograms\Python\Python312\python.exe      N/A      |
+-----------------------------------------------------------------------------------------+
PS D:\workspace\netkiller> nvidia-smi.exe -q

==============NVSMI LOG==============

Timestamp : Fri Nov 29 16:46:47 2024
Driver Version : 560.94
CUDA Version : 12.6

Attached GPUs : 1
GPU 00000000:02:00.0
    Product Name : NVIDIA GeForce RTX 3050 6GB Laptop GPU
    Product Brand : GeForce
    Product Architecture : Ampere
    Display Mode : Disabled
    Display Active : Disabled
    Persistence Mode : N/A
    Addressing Mode : N/A
    MIG Mode
        Current : N/A
        Pending : N/A
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : WDDM
        Pending : WDDM
    Serial Number : N/A
    GPU UUID : GPU-c6f6cb0c-8715-e499-2cf6-ba28d1ca9674
    Minor Number : N/A
    VBIOS Version : 94.07.87.00.e9
    MultiGPU Board : No
    Board ID : 0x200
    Board Part Number : N/A
    GPU Part Number : 25AC-730-A1
    FRU Part Number : N/A
    Module ID : 1
    Inforom Version
        Image Version : G001.0000.94.01
        OEM Object : 2.0
        ECC Object : N/A
        Power Management Object : N/A
    Inforom BBX Object Flush
        Latest Timestamp : N/A
        Latest Duration : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    GPU C2C Mode : N/A
    GPU Virtualization Mode
        Virtualization Mode : None
        Host VGPU Mode : N/A
        vGPU Heterogeneous Mode : N/A
    GPU Reset Status
        Reset Required : No
        Drain and Reset Recommended : No
    GSP Firmware Version : N/A
    IBMNPU
        Relaxed Ordering Mode : N/A
    PCI
        Bus : 0x02
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x25AC10DE
        Bus Id : 00000000:02:00.0
        Sub System Id : 0x8C2B103C
        GPU Link Info
            PCIe Generation
                Max : 4
                Current : 4
                Device Current : 4
                Device Max : 4
                Host Max : 4
            Link Width
                Max : 16x
                Current : 4x
        Bridge Chip
            Type : N/A
            Firmware : N/A
        Replays Since Reset : 0
        Replay Number Rollovers : 0
        Tx Throughput : 14 KB/s
        Rx Throughput : 38 KB/s
        Atomic Caps Outbound : N/A
        Atomic Caps Inbound : N/A
    Fan Speed : N/A
    Performance State : P3
    Clocks Event Reasons
        Idle : Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Not Active
            HW Thermal Slowdown : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost : Not Active
        SW Thermal Slowdown : Not Active
        Display Clock Setting : Not Active
    Sparse Operation Mode : N/A
    FB Memory Usage
        Total : 6144 MiB
        Reserved : 142 MiB
        Used : 785 MiB
        Free : 5218 MiB
    BAR1 Memory Usage
        Total : 8192 MiB
        Used : 8164 MiB
        Free : 28 MiB
    Conf Compute Protected Memory Usage
        Total : N/A
        Used : N/A
        Free : N/A
    Compute Mode : Default
    Utilization
        Gpu : 25 %
        Memory : 11 %
        Encoder : 0 %
        Decoder : 0 %
        JPEG : 0 %
        OFA : 0 %
    Encoder Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    FBC Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    ECC Mode
        Current : N/A
        Pending : N/A
    ECC Errors
        Volatile
            SRAM Correctable : N/A
            SRAM Uncorrectable Parity : N/A
            SRAM Uncorrectable SEC-DED : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
        Aggregate
            SRAM Correctable : N/A
            SRAM Uncorrectable Parity : N/A
            SRAM Uncorrectable SEC-DED : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
            SRAM Threshold Exceeded : N/A
        Aggregate Uncorrectable SRAM Sources
            SRAM L2 : N/A
            SRAM SM : N/A
            SRAM Microcontroller : N/A
            SRAM PCIE : N/A
            SRAM Other : N/A
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending Page Blacklist : N/A
    Remapped Rows : N/A
    Temperature
        GPU Current Temp : 57 C
        GPU T.Limit Temp : N/A
        GPU Shutdown Temp : 100 C
        GPU Slowdown Temp : 97 C
        GPU Max Operating Temp : 105 C
        GPU Target Temperature : 87 C
        Memory Current Temp : N/A
        Memory Max Operating Temp : N/A
    GPU Power Readings
        Power Draw : 11.06 W
        Current Power Limit : 46.28 W
        Requested Power Limit : N/A
        Default Power Limit : 35.00 W
        Min Power Limit : 1.00 W
        Max Power Limit : 60.00 W
    GPU Memory Power Readings
        Power Draw : N/A
    Module Power Readings
        Power Draw : N/A
        Current Power Limit : N/A
        Requested Power Limit : N/A
        Default Power Limit : N/A
        Min Power Limit : N/A
        Max Power Limit : N/A
    Clocks
        Graphics : 615 MHz
        SM : 615 MHz
        Memory : 5000 MHz
        Video : 697 MHz
    Applications Clocks
        Graphics : N/A
        Memory : N/A
    Default Applications Clocks
        Graphics : N/A
        Memory : N/A
    Deferred Clocks
        Memory : N/A
    Max Clocks
        Graphics : 2100 MHz
        SM : 2100 MHz
        Memory : 5501 MHz
        Video : 1950 MHz
    Max Customer Boost Clocks
        Graphics : N/A
    Clock Policy
        Auto Boost : N/A
        Auto Boost Default : N/A
    Voltage
        Graphics : 612.500 mV
    Fabric
        State : N/A
        Status : N/A
        CliqueId : N/A
        ClusterUUID : N/A
        Health
            Bandwidth : N/A
    Processes
        GPU instance ID : N/A
        Compute instance ID : N/A
        Process ID : 17360
            Type : C
            Name : C:\Users\neo\AppData\Local\Programs\Python\Python312\python.exe
            Used GPU Memory : Not available in WDDM driver model
    Capabilities
        EGM : disabled

PS D:\workspace\netkiller>
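The full `-q` report is long, so when a script only needs a few numbers it may be easier to ask nvidia-smi for specific fields. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and supports `--query-gpu` with `--format=csv,noheader,nounits`. The names `FIELDS`, `parse_gpu_csv`, and `query_gpus` are made up for this example, and the sample line is hand-written from the values in the report above rather than captured from a real run.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; the list is this example's choice.
FIELDS = ["name", "memory.used", "memory.total", "utilization.gpu", "temperature.gpu"]


def parse_gpu_csv(text, fields=FIELDS):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if row:  # skip blank lines
            rows.append(dict(zip(fields, (cell.strip() for cell in row))))
    return rows


def query_gpus(fields=FIELDS, exe="nvidia-smi"):
    """Run nvidia-smi and parse its output; raises if the tool is missing or fails."""
    cmd = [exe, f"--query-gpu={','.join(fields)}", "--format=csv,noheader,nounits"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out, fields)


# Hand-written sample line, so the parser can be tried without a GPU.
sample = "NVIDIA GeForce RTX 3050 6GB Laptop GPU, 785, 6144, 25, 57\n"
for gpu in parse_gpu_csv(sample):
    print(gpu["name"], gpu["memory.used"], "/", gpu["memory.total"], "MiB")
```

All values are kept as strings, because nvidia-smi can report a placeholder instead of a number for fields the driver does not expose. On a machine with an NVIDIA driver installed, `query_gpus()` returns the same structure from live data.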
Accelerated PyTorch training on Mac
xcode-select --install
Check whether the Apple M-series chip is supported
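import torch

# True only if the macOS version and hardware support MPS and this PyTorch build includes it
print(torch.backends.mps.is_available())
# True if this PyTorch build was compiled with MPS support
print(torch.backends.mps.is_built())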
Run a short snippet to test it
import torch

if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found.")
If it prints the following, the system is running on MPS and everything works:
tensor([1.], device='mps:0')
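The CUDA and MPS checks can be combined so that one script runs unchanged on both the Windows laptop and the Mac. The following is a minimal sketch, not something from Ultralytics or PyTorch: `choose_device` and `detect_device` are hypothetical helper names, and the preference order (CUDA, then MPS, then CPU) is an assumption made for this example.

```python
def choose_device(cuda_ok, mps_ok):
    """Prefer CUDA, then Apple MPS, then fall back to the CPU."""
    if cuda_ok:
        return "cuda:0"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device():
    """Probe the local PyTorch install; returns "cpu" if torch is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in old PyTorch releases, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return choose_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

The returned string can be passed to `torch.device(...)` or used as the `device=` value of a `yolo` command.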
During training, Ultralytics prints a banner that names the device in use, for example: Ultralytics 8.3.38 🚀 Python-3.12.7 torch-2.5.1 MPS (Apple M4)
(.venv) neo@Neo-Mac-mini-M4 netkiller % yolo task=detect mode=train model=tongue.yaml data=/Volumes/tmp/datasets/XinNaoXieGuan/data.yaml epochs=150 device=mps name=XinNaoXieGuan project=/Volumes/tmp/runs
WARNING ⚠️ no model scale passed. Assuming scale='n'.
New https://pypi.org/project/ultralytics/8.3.49 available 😃 Update with 'pip install -U ultralytics'
Ultralytics 8.3.38 🚀 Python-3.12.7 torch-2.5.1 MPS (Apple M4)
engine/trainer: task=detect, mode=train, model=tongue.yaml, data=/Volumes/tmp/datasets/XinNaoXieGuan/data.yaml, epochs=150, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=mps, workers=8, project=/Volumes/tmp/runs, name=XinNaoXieGuan2, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=None, workspace=None, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, copy_paste_mode=flip, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=/Volumes/tmp/runs/XinNaoXieGuan2
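The same run can also be started from Python instead of the `yolo` command line. The following is a minimal sketch, assuming the `ultralytics` package is installed and that `tongue.yaml` and the dataset exist at the paths used in the command above. `TRAIN_ARGS`, `to_cli`, and `train` are illustrative names for this example, not part of Ultralytics. The import is kept inside `train()` so the settings can be inspected on a machine without the package.

```python
# Settings copied from the command line shown above.
TRAIN_ARGS = {
    "data": "/Volumes/tmp/datasets/XinNaoXieGuan/data.yaml",
    "epochs": 150,
    "device": "mps",
    "name": "XinNaoXieGuan",
    "project": "/Volumes/tmp/runs",
}


def to_cli(args, task="detect", mode="train", model="tongue.yaml"):
    """Render the settings as the equivalent `yolo` command line."""
    parts = ["yolo", f"task={task}", f"mode={mode}", f"model={model}"]
    parts += [f"{key}={value}" for key, value in args.items()]
    return " ".join(parts)


def train(args=None, model_cfg="tongue.yaml"):
    """Build the model from its YAML config and train it with the given settings."""
    from ultralytics import YOLO  # lazy import: only needed when training starts

    model = YOLO(model_cfg, task="detect")
    return model.train(**(args or TRAIN_ARGS))


print(to_cli(TRAIN_ARGS))
# Call train() on a machine that has the dataset and model YAML in place.
```

Keeping the settings in one dictionary means the CLI string and the Python call cannot drift apart, and `device` can be swapped for `"cuda:0"` or `"cpu"` on other machines.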