at xAI
$50 - $60 per hour (estimated)
Palo Alto, 94301, CA, US
Onsite | Full Time
The multimodal team at xAI creates magical AI experiences beyond text, enabling understanding and generation of content across modalities including image, video, and audio. To accomplish this, we are looking for experienced data and infrastructure engineers to develop and optimize pipelines for multimodal data (such as images, videos, and audio), covering acquisition, preprocessing, data loading, visualization, and management.

The role is based in the Bay Area (San Francisco and Palo Alto). Candidates are expected to be located near the Bay Area or open to relocation.

Focus includes building tools to assist the acquisition of multimedia data; building petabyte-scale, high-throughput data processing systems for multimodal data (including text, images, videos, and audio); building high-throughput, low-latency data decoding and loading pipelines to support efficient large-scale training of multimodal models; and building visualization and management tools for all categories of in-house datasets.

Ideal experience includes expertise in developing software for large-scale distributed machine learning systems; expertise in Spark, GPUs, Kubernetes, and JAX (or PyTorch); and experience with standard software engineering best practices (CI/CD), with attention to code quality, testing, and performance.

The tech stack includes Python, JAX, Rust, Spark, and CUDA.

The interview process involves an initial 15-minute phone interview followed by four technical interviews, including a research discussion, coding interviews, and a presentation of past exceptional work and vision with xAI. All interviews are conducted via Google Meet.

Benefits include base salary, equity, comprehensive medical, vision, and dental coverage, a 401(k) retirement plan, disability insurance, life insurance, and various other discounts and perks.