IBM, May 2023
IBM Research describes how it developed Vela, an AI supercomputer in the IBM Cloud that aims to accelerate the training, fine-tuning, and deployment of advanced AI models.
The article highlights how IBM uses a cloud-native AI training stack running on the Red Hat OpenShift Container Platform to support high-performance AI workloads, Kubernetes-native resource utilization, and efficient data pre-processing, training, and validation. IBM's efforts focus on simplifying the user experience and optimizing foundation model tuning and serving, while collaborating with Red Hat and contributing to open-source communities for the broader AI ecosystem.