Render a neural network into CUDA/HIP code

AITemplate (AIT) is a Python framework that renders neural networks into CUDA or HIP C++ code for fast inference serving of deep learning models. It is open source, built for high performance, and flexible: the same fp16 deep neural network models run on both NVIDIA and AMD GPUs. AITemplate can be extended with new components and offers memory and fusion optimizations. FX2AIT builds on AITemplate by converting PyTorch models into an AIT engine for faster inference, which simplifies conversion and expands AITemplate’s support for PyTorch operators.

https://github.com/facebookincubator/AITemplate
