Arab Center for Research and Policy Studies
About the job
We are seeking a highly skilled and experienced Linux Administrator to join our team. The ideal candidate will have a strong background in Linux system administration, virtualization using KVM, high performance computing, and cloud computing platforms such as AWS and Azure. The Linux Administrator will be responsible for supporting research and development activities that use Linux platforms for Machine Learning and Artificial Intelligence (ML/AI). She/he should be resourceful with open-source solutions and production pipelines in ML/AI and familiar with the latest technologies for working with large language models. She/he will manage and maintain our Linux-based systems, virtualization infrastructure, and cloud resources to ensure high availability, security, and performance.
Responsibilities
· Linux System Administration:
o Install, configure, and maintain Linux servers (Red Hat, CentOS, Ubuntu, etc.).
o Perform system upgrades, patches, and security hardening.
o Monitor system performance and troubleshoot issues.
o Manage user accounts, permissions, and file systems.
o Implement and maintain backup and disaster recovery solutions (experience with Veeam backup tools is a plus).
· Virtualization (KVM):
o Set up and manage virtual machines using KVM (Kernel-based Virtual Machine).
o Allocate resources, configure networking, and optimize VM performance.
o Troubleshoot virtualization-related issues and ensure high availability.
o Good understanding of Oracle Linux Virtualization Manager.
o Good understanding of hard partitioning with Oracle Linux KVM.
o Hands-on experience with virtual CPU pinning to optimize the use of Oracle Database and WebLogic licenses.
· High Performance Computing and ML/AI:
o Experience with HPC scheduling software.
o Experience with the installation and administration of high-performance computing (HPC) systems, including Linux clusters and parallel data storage systems.
o Experience with a range of programming languages relevant to systems administration and HPC.
o Experience supporting open-source development stacks and pipelines for ML/AI technologies.
· Cloud Computing (AWS and Azure):
o Deploy and manage cloud instances and services on AWS and Azure.
o Monitor and optimize cloud resources for cost efficiency.
o Implement security best practices for cloud environments.
o Manage cloud-based databases, storage, and networking components.
· Automation and Scripting:
o Develop and maintain automation scripts (Bash, Python, etc.) for routine tasks.
o Implement Infrastructure as Code (IaC) using tools like Terraform or Ansible.
· Security and Compliance:
o Implement security policies, access controls, and firewall rules.
o Perform regular security audits and vulnerability assessments.
o Ensure compliance with industry standards and regulations.
· Collaboration and Documentation:
o Collaborate with cross-functional teams to support application deployments.
o Maintain comprehensive documentation of system configurations and procedures.
o Provide training and guidance to team members as needed.
o Write relevant components of proposal documents (e.g., prepare RFPs, answer specific RFP questions).
o Implement and document processes and procedures to ensure compliance with standard business practices.
· Troubleshooting and Support:
o Provide support for critical systems and respond to incidents promptly.
o Troubleshoot and resolve system and network issues.
o Participate in on-call rotation schedules.
o Good knowledge of ITIL concepts, the Incident, Change, and Problem Management processes, and ticketing tools (knowledge of ServiceNow is a plus).
Qualifications
· Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).
· At least 5 years of proven experience as a Linux Administrator.
· Strong expertise in virtualization technologies, specifically KVM.
· Strong expertise in HPC.
· Hands-on experience with AWS and Azure cloud platforms.
· Proficiency in scripting and automation.
· Knowledge of networking protocols and security best practices.
· Excellent problem-solving and communication skills.
· Relevant certifications (e.g., RHCE, AWS Certified SysOps Administrator, Azure Administrator) are a plus.