GPU Programming For The Brave
boyuan-xiao-at-work | 27 Jun 2025
beyond-the-code, bash-to-the-feature, GPU programming, CUDA, parallel programming, AI, neural network