Site Reliability Engineer Senior On-site

ID: 7248

1 day ago

Active

Presight

UAE, Abu Dhabi

7 000 $ - 8 000 $

Employment type

Full-time

Required experience

More than 6 years

Work format

Full day

📞 Contact methods

@ant1kdream (Telegram)

📄 Original vacancy text

Posted by: ant1k
Discussion: @devops_jobs
#SRE #senior #fulltime #onsite #relocation #UAE #linux #kubernetes #vacancy #вакансия

Senior SRE (Site Reliability Engineer)
Location: Abu Dhabi, UAE (relocation required)
Company: https://www.presight.ai/
Employment Type: Full-time, on-site only
Salary Range: $7,000-8,000 (25-30k AED) net, based on interview results

About Presight:
Presight is an ADX-listed public company with Abu Dhabi based G42 as its majority shareholder and is the region's leading big data analytics company powered by GenAI. It combines big data, analytics, and AI expertise to serve every sector, of every scale, to create business and positive societal impact. Presight excels at all-source data interpretation to support insight-driven decision-making that shapes policy and creates safer, healthier, happier, and more sustainable societies. Today, through its range of GenAI-driven products and solutions, Presight is bringing App

Position Overview:
Seeking a meticulous Engineer - Site Reliability who will support the Presight delivery model that empowers product & technology teams to develop & deliver high-quality products, improve platform infrastructure, and strengthen the reliability of products and solutions. You play a key role in defining & establishing the delivery model deployed in the development of cutting-edge, next-gen analytics solutions & services at Presight.

Key Responsibilities:
- Manage the infrastructure required to run our solutions deployed to public or private cloud (air-gapped).
- Analyze service performance, identify bottlenecks, and provide measurable improvement plans.
- Maintain the environment's health by continuously monitoring technical and business metrics, configuring alerts for potential issues, and proactively addressing risks to prevent disruptions.
- Deploy application updates with minimal disruption to services.
- Identify, evaluate, and conduct proof-of-concepts for new technologies.
- Contribute to the knowledge base.
- Review and modify CI/CD principles and service maturity iteratively, striving for continuous improvement.

Requirements:
- 5+ years of experience in managing Kubernetes clusters.
- 5+ years of experience in configuring and using monitoring/observability platforms.
- Familiarity with at least one type of database.
- 5+ years in an SRE/DevOps/Sysadmin/Platform Engineer role.

Mandatory skills:
- Strong background in Linux/Unix administration
- Solid hands-on experience deploying and operating Kubernetes or OpenShift clusters
- Experience configuring and maintaining monitoring and observability solutions
- Ability to troubleshoot and resolve complex production issues efficiently, including performing root cause analysis and restoring services quickly during high-pressure incidents or critical outages
- Experience in backing up and restoring various systems
- Working together with project managers and solution architects while serving as a subject matter expert
- Implementing basic network security (e.g. configuring VPCs, firewalls/security groups, etc.)
- Understanding the dependencies of various GPU cards, and upgrading container images as needed to ensure compatibility
- Deploying and operating products provided by third-party providers
- Creating releases together with the development team and deploying release packages to all required environments

Nice to Have:
- Good understanding of typical system architecture and interaction between its components
- Experience automating tasks using infrastructure-as-code tools, e.g. Ansible, Terraform
- Thorough understanding of a company's systems, including auxiliary components like caching systems (e.g., Redis, Memcached) and message queues (e.g., RabbitMQ, Kafka)
- Good understanding of databases, e.g. Postgres, Elasticsearch, ClickHouse
- Basic scripting
- Working knowledge of OAuth 2.0, OpenID/OpenID-Connect, SAML 2.0, Kerberos, LDAP

Contact for questions and CV: @ant1kdream

🛠 Skills

Ansible

ClickHouse

Elasticsearch

Kafka

Kerberos

Kubernetes

LDAP

Linux

Memcached

OAuth 2.0

OpenID Connect

Openshift

Postgres

RabbitMQ

Redis

SAML 2.0

Terraform

🎯 Domains

Analytics

Big Data

🤖 AI skills

Ansible

CI/CD

ClickHouse

Containerization

Docker

Elasticsearch

Firewalls

GPU

Infrastructure as Code

Kafka

Kerberos

Kubernetes

LDAP

Linux

Memcached

Monitoring

Networking

OAuth 2.0

Observability

OpenID Connect

Openshift

PostgreSQL

RabbitMQ

Redis

SAML 2.0

Scripting

Security Groups

Terraform

VPC

* Skills were identified automatically by a neural network

🤖 AI domains

Analytics

Big Data

Cloud Infrastructure

DevOps

FinTech

GenAI

Platform Engineering

SRE

System Administration

* Domains were identified automatically by a neural network

📢 Publication information

🔗 Original posts (2)

https://t.me/devops_jobs_feed/22227

https://t.me/devops_jobs_feed/21691

Channel: devops_jobs_feed