Mirrai Careers

HPC Engineer

ifm-us

Sunnyvale, CA · Full-time · $150k–$300k / year · Posted 3w ago
Market rate. This role pays around the $193k median for similar USD roles (373 comparable postings in our corpus).
About MBZUAI

The Institute for Foundation Models (IFM) operates some of the world's largest AI supercomputing environments.

Position Summary

This role provides operational coverage during Abu Dhabi overnight hours and serves as a primary point of contact for infrastructure monitoring, incident triage, researcher support, and production operations.

RESPONSIBILITIES

• Monitor health, performance, and availability of large-scale GPU clusters.
• Respond to incidents and perform first-level triage.
• Support researchers and troubleshoot job failures.
• Execute operational runbooks and recovery procedures.
• Validate cluster deployments, upgrades, and maintenance activities.
• Track infrastructure utilization and operational metrics.
• Develop automation and monitoring tools.
• Contribute to documentation and reporting.

EDUCATION

Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, Information Technology, Electrical Engineering, Mathematics, Physics, or related disciplines.

EXPERIENCE

• 2+ years in Linux systems administration, SRE, DevOps, cloud operations, HPC, or infrastructure operations.
• Strong Linux troubleshooting skills.
• Experience with scripting using Python or Bash.

PREFERRED QUALIFICATIONS

• Slurm.
• GPU infrastructure.
• AWS, Azure, or GCP.
• Grafana, Prometheus, Datadog, or similar tools.
• Containers and Kubernetes.
• AI/ML infrastructure exposure.
• Research computing environments.

Benefits Include

• Comprehensive medical, dental, and vision benefits
• Bonus
• 401K Plan
• Generous paid time off, sick leave and holidays
• Paid Parental Leave
• Employee Assistance Program
• Life insurance and disability

Similar jobs

  • HPC Operations Engineer

    lambda

    Remote · $240k–$356k
  • Staff HPC Engineer

    biohub

    San Francisco, CA (Hybrid)
  • Systems Engineer, HPC (US & Canada)

    Mistral AI

    Remote
  • AI Research Internship - WM

    ifm-us

    Sunnyvale, CA · $100k–$140k
  • Member of Technical Staff - Infrastructure

    gimlet

    San Francisco
  • Software Engineer, GPU Infrastructure - HPC

    OpenAI

    San Francisco · $230k–$490k

Company

MIRRAI CHAT LTD (Company No. 16403306)

71-75 Shelton Street, Covent Garden

London, WC2H 9JQ, UNITED KINGDOM

© 2026 Mirrai Careers. All rights reserved.