With the latest llama.cpp Portable Zip, you can now run DeepSeek-R1-671B-Q4_K_M with one or two Arc A770 GPUs on a Xeon host. This guide explains how to run llama.cpp on Intel GPUs with ipex-llm without any manual installation, and it supports a range of Intel processors and Intel Arc GPUs. Separate Windows and Linux quickstarts walk through downloading and extracting the portable zip, configuring the runtime, and running GGUF models. The FlashMoE tool, optimized for MoE models such as DeepSeek V3/R1, is available on Linux and comes with specific resource requirements and model adjustments. Tips and troubleshooting notes cover multi-GPU usage, and performance details and environment-configuration information are provided for experienced users; a minimal usage sketch follows the link below.
https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/llamacpp_portable_zip_gpu_quickstart.md
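As a rough illustration of the Linux workflow the linked guide describes (extract, configure the runtime, run a GGUF model), here is a minimal sketch. The archive name, model path, device-selector value, and flag values are assumptions for illustration only; the quickstart above is authoritative for the actual download links and steps.

```bash
# Illustrative sketch only -- archive name, paths, and values are assumptions;
# see the linked quickstart for the real download links and instructions.

# 1. Extract the portable zip (no installation required)
tar -xf llama-cpp-ipex-llm-portable-linux.tgz
cd llama-cpp-ipex-llm-portable-linux

# 2. Configure the runtime to target an Intel GPU via the oneAPI device selector
export ONEAPI_DEVICE_SELECTOR=level_zero:0   # first Arc GPU; adjust per the guide for multi-GPU setups
export SYCL_CACHE_PERSISTENT=1               # reuse JIT-compiled kernels across runs

# 3. Run a GGUF model, offloading all layers to the GPU
./llama-cli -m /models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf \
            -p "What is deep learning?" \
            -n 128 -c 4096 -ngl 99
```

The flags shown (`-m`, `-p`, `-n`, `-c`, `-ngl`) are standard llama.cpp options; multi-GPU runs and the FlashMoE tool for large MoE models like DeepSeek V3/R1 have their own steps covered in the guide.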