At Hot Chips 2024, Tesla presented the networking side of its Dojo supercomputer, which targets machine learning training for automotive applications such as self-driving cars. To address limited throughput when feeding data into Dojo, Tesla optimized the connection between hosts and the supercomputer with the Tesla Transport Protocol over Ethernet (TTPoE). The protocol simplifies parts of TCP to reduce latency and improve bandwidth. For congestion control, TTPoE relies on packet drops and retransmission rather than adjusting the window size based on network conditions. The TTPoE hardware block, which includes a shared cache and a 1 MB transmit SRAM buffer, supports data transmission at up to 97.65 Gbps. The “Dumb-NIC” design lets Tesla deploy cost-effective host nodes for feeding data into Dojo.
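
To make the fixed-window idea concrete, here is a minimal Python sketch. It is not Tesla's implementation: the `FixedWindowSender` class, the `run` helper, the "drop every Nth transmission" loss model, and the 80 microsecond round-trip time are all assumptions chosen for illustration. It shows two things: a sender whose window never changes and that simply resends whatever was dropped, and the throughput ceiling that a fixed window implies (one window of data per round trip).

```python
from dataclasses import dataclass, field


def window_limited_gbps(window_bytes: int, rtt_seconds: float) -> float:
    """Throughput ceiling for a fixed window: at most one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e9


@dataclass
class FixedWindowSender:
    """Hypothetical sender: the window size never changes; losses are resent."""
    window_packets: int
    total_packets: int
    next_new: int = 0                          # next never-sent sequence number
    unacked: set = field(default_factory=set)  # packets still held in the buffer
    transmissions: int = 0                     # counts sends and resends

    def round(self, link_drops) -> None:
        """One round trip: (re)send everything outstanding, then collect ACKs."""
        # Top up the window with new packets; the window is never resized.
        while (len(self.unacked) < self.window_packets
               and self.next_new < self.total_packets):
            self.unacked.add(self.next_new)
            self.next_new += 1
        delivered = set()
        for seq in sorted(self.unacked):
            self.transmissions += 1
            if not link_drops(seq, self.transmissions):
                delivered.add(seq)
        # ACKed packets free their buffer slots; dropped ones stay for resend.
        self.unacked -= delivered

    def done(self) -> bool:
        return self.next_new == self.total_packets and not self.unacked


def run(total_packets: int, window_packets: int, drop_every: int) -> tuple:
    """Send all packets over a toy link that drops every Nth transmission.

    drop_every=0 means a lossless link. Returns (round_trips, transmissions).
    """
    if drop_every == 1:
        raise ValueError("a link that drops every transmission never finishes")
    sender = FixedWindowSender(window_packets, total_packets)
    rounds = 0
    while not sender.done():
        sender.round(lambda seq, n: drop_every > 0 and n % drop_every == 0)
        rounds += 1
    return rounds, sender.transmissions


if __name__ == "__main__":
    print(run(8, 4, 0))   # lossless link
    print(run(8, 4, 3))   # every third transmission is dropped
    # Assumed 80 microsecond round trip with a 1 MB (10**6 byte) window.
    print(window_limited_gbps(1_000_000, 80e-6))
```

With these toy parameters, `run(8, 4, 0)` returns `(2, 8)` and `run(8, 4, 3)` returns `(4, 11)`: drops cost extra round trips and retransmissions, but the window itself is never shrunk or grown. Under the assumed 80 microsecond round trip, a 1 MB window caps throughput at about 100 Gbps in decimal units; dividing by 1.024 (treating a gigabyte as 1024 megabytes) gives roughly 97.66 Gbps, which is one way a figure like 97.65 Gbps could arise.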
https://chipsandcheese.com/2024/08/27/teslas-ttpoe-at-hot-chips-2024-replacing-tcp-for-low-latency-applications/