
Site Reliability Engineer
Posted 18 hours ago

Responsibilities
• Design and manage dashboards and alerting using Prometheus, Grafana, or the ELK Stack.
• Assist in identifying Service Level Indicators (SLIs).
• Create and sustain Infrastructure as Code (IaC) scripts with Terraform and Ansible for consistent, error-free deployments.
• Oversee automated deployment pipelines, ensuring the inclusion of security scans and automated tests within the workflow.
• Engage in on-call rotations and support the resolution of system outages.
• Participate in blameless post-mortem analyses to foster ongoing improvement.
• Identify repetitive manual tasks and automate them to reduce "toil," freeing the team to focus on high-impact engineering work.
Requirements
• 3–5 years of experience in Site Reliability Engineering (SRE), DevOps, or Systems Engineering roles.
• Strong proficiency in scripting languages such as Python, Go, or Bash.
• Practical experience with containerization technologies (Docker, Kubernetes) and cloud services (AWS, Azure, or GCP).
• Understanding of NIST SP 800-53 security controls.
• Bachelor’s degree in Computer Science or a related technical discipline.
Benefits
• Health insurance
• Paid time off