Research Computing Operations Lead

King Abdullah University of Science and Technology (KAUST)

Job Summary
Responsible for day-to-day oversight and management of research computing systems, including cluster and HPC hardware and scientific workstations. Leads the effort to provide research computing services to end users while maintaining the availability, supportability, and usability of clusters, tools, and data. Leads a team of Research Computing Operations Specialists and Analysts whose responsibilities span functional and service systems, such as large-scale clusters and storage, that support academic and research activities through shared IT research computing services and computational facilities. Works with faculty, researchers, and research groups to facilitate achievement of academic and research goals through the use of IT computational resources and services. Develops ideas and executes projects to support academic and research data growth.

Major Responsibilities

  • Provides planning, leadership, direction, and advanced technical expertise; serves as a high-level technical expert with regard to HPC operations, hardware requirements, and data center hosting.
  • Leads a team of Research Computing Operations Specialists and Analysts.
  • Makes final decisions on operational matters pertaining to hardware while consulting with other team members in the areas of storage, backup, systems, and applications.
  • Leads the effort to develop, implement, and maintain policies, procedures, and standards pertaining to scientific workstations and to cluster/HPC hardware maintenance and support.
  • Maintains a broad knowledge of current and emerging state-of-the-art HPC architecture concepts, technologies, and products.

Technical Skills

  • Provides technical expertise in the areas of HPC operations and hardware support.
  • Maintains in-depth knowledge of HPC specialisms and provides expert advice regarding their application.
  • Specifies and designs complex hardware components/systems.
  • Ensures that hardware designs balance functional, service quality, security, systems management and sustainability requirements.
  • Produces reports with KPIs and critical metrics for research computing services such as utilization, availability, incidents, and integrity.
  • Coordinates with the IT data center team during planned preventive maintenance activities for the facilities.

Required Education

  • Degree in Computer Science, IT, MIS, Computer Engineering, or a similar field, with hands-on experience in HPC operations.
  • PMP certification is a plus. 

Required Experience
Minimum 8 years of IT experience in HPC operations, supercomputers, and high-density computing environments, including IBM, Dell, HP, Supermicro, and Bull server and workstation hardware.

To apply, send application and CV to Angela Baranski at angela.baranski@kaust.edu.sa

For more information, visit http://apptrkr.com/906907