Join our Riyadh office and be part of the team driving network innovation across the region.
We are seeking a skilled Monitoring Engineer to design, implement, and maintain our monitoring and observability systems across infrastructure, applications, and services. The ideal candidate will have hands-on experience with monitoring tools, automation, and performance optimization to ensure system reliability, early issue detection, and rapid incident response.
Design, configure, and maintain monitoring solutions across servers, applications, cloud resources, and networks.
Develop and optimize dashboards, alerts, and metrics for proactive monitoring.
Implement observability practices, including logs, metrics, and traces, to improve system visibility.
Create effective alerting policies to identify and escalate performance degradation and system failures.
Collaborate with SRE/DevOps/IT teams to investigate incidents and provide root cause analysis (RCA).
Maintain incident and problem-tracking documentation.
Administer monitoring tools such as Prometheus, Grafana, Zabbix, Nagios, Datadog, ELK/EFK, Splunk, New Relic, or similar.
Build automation scripts for monitoring agent deployment, configuration, and alert updates.
Integrate monitoring tools with ticketing systems (e.g., Jira, ServiceNow) and notification channels (Slack, Teams, email, SMS).
Analyze system performance trends and provide recommendations for capacity planning.
Conduct health checks and performance tuning of applications and infrastructure.
Use insights from monitoring data to support the optimization of CI/CD pipelines.
Maintain monitoring standards, playbooks, and operational procedures.
Train engineering and operations teams on using monitoring dashboards and interpreting alerts.
Ensure compliance with internal reliability guidelines and SLA/SLI/SLO targets.
Bachelor’s degree in Computer Science, Information Systems, Engineering, or equivalent experience.
At least 2–5 years of experience in monitoring, observability, DevOps, or system administration.
Strong understanding of Linux/Windows environments, cloud platforms (AWS/Azure/GCP), and networking fundamentals.
Hands-on experience with monitoring tools like Prometheus, Grafana, Zabbix, ELK, Datadog, New Relic, Splunk, or equivalent.
Experience writing automation scripts (Python, Bash, PowerShell, etc.).
Understanding of logs, metrics, and distributed tracing concepts.
Fill out the form and upload your CV to submit your application.