Found 3 Skills
Create Docker containers for Huawei Ascend NPU development with proper device mappings and volume mounts. Use when setting up Ascend development environments in Docker, running CANN applications in containers, or creating isolated NPU development workspaces. Supports privileged mode (default), basic mode, and full mode with profiling/logging. Auto-detects available NPU devices.
vLLM Ascend plugin for LLM inference serving on Huawei Ascend NPU. Use for offline batch inference, API server deployment, quantized inference (with msmodelslim-quantized models), tensor/pipeline parallelism for distributed serving, and OpenAI-compatible API endpoints. Supports Qwen, DeepSeek, GLM, and LLaMA models with Ascend-optimized kernels.
This skill should be used when the user asks about "Ascend NPU", "昇腾", "Huawei NPU", "triton-ascend", "Ascend kernel development", "NPU算子开发", "Atlas", or "CANN", or mentions Ascend hardware, AI Core, or the Cube/Vector/Scalar units. Provides expert guidance on Ascend NPU hardware architecture, triton-ascend kernel development, and GPU-to-NPU migration. Always use this skill for Ascend-related questions to avoid confusion with GPU documentation and concepts.