About the job
Location: Abu Dhabi, UAE
We are looking for a DevOps Engineer / Site Reliability Engineer with 4-6 years of hands-on experience.
Experience in a Senior DevOps Engineer role at a fast-paced product or engineering services company, setting up and maintaining high-availability, high-performance real-time systems, is mandatory.
- BE/BTech in CS or a related discipline.
- 5+ years of experience as a DevOps Engineer, Site Reliability Engineer (SRE), or Systems Engineer, with an advanced understanding of Linux administration and cloud technology (preferably Azure).
- Experience writing automation scripts and building application dashboards for proactive monitoring using Ruby, PowerShell, Python, or similar technologies; ability to debug and optimize code and automate routine tasks.
- Strong understanding of microservice technology.
- Design and automate the deployment of containerized applications using Kubernetes and Docker.
- Administer Kubernetes clusters, and maintain, monitor, and support applications in a Kubernetes environment. CKA or CKAD certification would be a plus.
- Ability to deploy application releases and patches via an automation tool (e.g., Ansible playbooks). Strong Ansible experience is preferred.
- Set up application and system monitoring; work with developers and QA to build and validate containerized applications.
- Document deployment processes, services, and environments.
- As a Site Reliability Engineer, you will solve exciting technical challenges by analysing, troubleshooting, and designing vital services, platforms, and infrastructure while always thinking about reliability, scalability, resilience, security, and performance.
- As an SRE, you will understand the end-to-end configuration, technical dependencies, and overall behavioural characteristics of the production services you collaborate with. In partnership with your development colleagues, you will have the responsibility to ensure that services are designed and delivered to be mission critical with a focus on security, resiliency, scale, and performance.
- Hands-on knowledge of:
- Kubernetes Administration, Docker, Harbor, Ingress controller, Helm
- ForgeRock, Cert Manager, HashiCorp Vault, Terraform
- EFK (Elasticsearch, Fluentd, Kibana), Apache ActiveMQ
- HAProxy, Confluent/Kafka, RabbitMQ, Oracle DB
- MySQL, MongoDB, JIRA, GitLab, Sonar
- Jenkins, Ansible, OpenStack, WebSphere MQ
- Python, Perl, Ruby, JSON, YAML, Bash scripting
- REST web services, Microsoft Azure DevOps
- Azure Pipelines, Azure Kubernetes, Azure Container Registry, Azure Docker registry, Azure Git, Azure Artifacts.
Personal Style Enablers:
- Strong verbal and written communication skills, with the ability to articulate problems and solutions over the phone and by email.
- Strong sense of urgency, with a passion for accuracy and timeliness.
- Ability to work calmly in high-pressure situations and manage multiple projects and tasks.
- Ability to work independently, with superior issue-resolution skills.
Primary Job Responsibilities:
- Apply automation and software to any tasks or parts of the system that would benefit from it or are performed manually.
- Troubleshoot complicated cross-platform issues spanning OS, networking, and databases in a cloud-based SaaS environment; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices.
- Experience with automation/configuration management tools such as Salt, Puppet, Chef, or Ansible.
- Experience troubleshooting production issues and coordinating with the development team to streamline code deployment.
- Monitor application performance, take steps to improve overall performance and stability, and follow through with implementation.
- Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it.
- Maintain and monitor the deployment and orchestration of servers, Docker containers, databases, and general backend infrastructure.
- Experience with CI/CD in cloud environments and with container technologies such as Docker, Kubernetes, and Docker Swarm.
- Experience as a Linux systems administrator (e.g., CentOS, Red Hat) and with command-line system administration tools such as Bash, Vim, and SSH.
- Ability to work in and support 24x7 operations.
Familiarity with a few of the stacks listed below is good to have.
IBM Cognos Analytics
IBM App Connect Enterprise
IBM Business Automation Workflow
IBM Operational Decision Manager
IBM InfoSphere Information Server (DataStage)
IBM WebSphere Application Server
IGT Solutions provides equal employment opportunities to all individuals based on job-related qualifications and ability to perform a job, without regard to age, gender, gender identity, sexual orientation, race, color, religion, creed, national origin, disability, genetic information, veteran status, citizenship or marital status, and maintains a non-discriminatory environment free from intimidation, harassment or bias based upon these characteristics.