
Staff DevOps Engineer
Posted 1 day ago

• Spearhead the design, execution, and continuous enhancement of reliable, scalable, high-performance, and secure production platforms and services.
• Collaborate closely with cross-functional teams to establish and maintain resilient infrastructure and deployment strategies.
• Offer technical guidance and mentorship to engineers throughout the organization, fostering robust engineering standards and operational best practices.
• Engage in a 24x7 on-call rotation to support critical services and ensure platform uptime.
• Promote standardization, automation, and documentation to enhance consistency, minimize operational burdens, and facilitate knowledge sharing.
• Contribute throughout the entire lifecycle of platform and service delivery, from design and construction to operation and optimization.
• Over 5 years of experience in DevOps, Site Reliability Engineering (SRE), platform engineering, or software engineering positions.
• Extensive experience with Kubernetes at scale, possessing a thorough understanding of containers and container orchestration.
• Practical experience with infrastructure as code tools such as Terraform, Ansible, or Puppet.
• Proficient programming skills in at least one object-oriented language, complemented by effective scripting and automation capabilities.
• Strong grasp of security principles and best practices across infrastructure, platforms, and services.
• Significant hands-on experience with at least one major cloud platform, such as AWS, GCP, or OCI.
• Solid experience with monitoring, alerting, and observability using tools such as Prometheus and Grafana.
• Comprehensive understanding of networking fundamentals and distributed systems.
• Strong experience in Linux and/or Windows systems administration.
• Familiarity with software delivery automation, CI/CD pipelines, and secure software development lifecycle (SDLC) practices, including exposure to static and dynamic security testing.
• Good understanding of SRE concepts such as Service Level Indicators (SLIs), Service Level Objectives (SLOs), Service Level Agreements (SLAs), toil reduction, availability, and observability.
• Experience in managing and scaling Elasticsearch in production environments is highly desirable.
• Health insurance
• Professional development opportunities
• Flexible work arrangements
Arctiq