Course Outline
Introduction to Custom Operator Development
- Understanding the rationale for building custom operators: use cases and constraints.
- Structure of the CANN (Compute Architecture for Neural Networks) runtime and key integration points for operators.
- Overview of TBE (Tensor Boost Engine), TIK (Tensor Iterator Kernel), and TVM within the Huawei AI ecosystem.
Low-Level Operator Programming with TIK
- Understanding the TIK programming model and its supported APIs.
- Managing memory and implementing tiling strategies in TIK.
- Creating, compiling, and registering a custom operator with CANN.
Testing and Validating Custom Operators
- Performing unit testing and integration testing of operators within the graph.
- Debugging performance issues at the kernel level.
- Visualizing operator execution flow and buffer behavior.
Scheduling and Optimization Using TVM
- Overview of TVM as a compiler for tensor operations.
- Writing schedules for custom operators in TVM.
- Utilizing TVM for tuning, benchmarking, and code generation for Ascend devices.
Integration with Frameworks and Models
- Registering custom operators for compatibility with MindSpore and ONNX.
- Verifying model integrity and handling fallback behaviors.
- Supporting multi-operator graphs involving mixed precision.
Case Studies and Specialized Optimizations
- Case study: Achieving high-efficiency convolution for small input shapes.
- Case study: Optimizing memory-aware attention operators.
- Best practices for deploying custom operators across various devices.
Summary and Next Steps
Requirements
- In-depth understanding of AI model internals and operator-level computations.
- Practical experience with Python programming and Linux development environments.
- Familiarity with neural network compilers or graph-level optimization techniques.
Target Audience
- Compiler engineers working on AI toolchains.
- Systems developers specializing in low-level AI optimization.
- Developers creating custom operators or targeting emerging AI workloads.
Duration
- 14 hours