
Monitoring and Observability Architect
Posted Jun 25

This is a fully remote position, open to applicants in California.
• Design, develop, and maintain scalable observability platforms that deliver metrics, logs, traces, and alerts for cloud-native, distributed systems.
• Build and maintain observability pipelines and dashboards (metrics, logs, traces).
• Establish and monitor SLOs, SLIs, and SLAs; generate actionable alerts.
• Instrument applications and infrastructure using OpenTelemetry and vendor SDKs.
• Integrate and manage tools such as Splunk, Prometheus/Grafana, ELK/EFK, Datadog, and New Relic.
• Automate deployment processes utilizing CI/CD and Infrastructure as Code (Terraform/CloudFormation).
• Collaborate with development, SRE, and platform teams; create runbooks and post-incident reports.
• Hands-on experience with Splunk and at least two monitoring tools (Splunk, Prometheus & Grafana, ELK/EFK, Datadog, New Relic).
• Proficient in scripting/programming languages such as Python, Go, or Bash.
• Extensive knowledge of Kubernetes, Docker, microservices, and distributed systems.
• Proven experience in defining SLOs, SLIs, and SLAs.
• Familiarity with CI/CD and Infrastructure as Code; experience working on AWS.
• Preferred: experience with OpenTelemetry and distributed tracing.
• Knowledge of service mesh telemetry (Istio/Linkerd).
• Experience in regulated environments (e.g., FDA) along with expertise in telemetry cost and retention optimization.
• Strong soft skills: clear communication, collaboration, proactivity, and effective mentoring.
• Diversity and Inclusion: At Exavalu, we are dedicated to fostering a diverse and inclusive workforce.
• We welcome applications from all qualified candidates, irrespective of race, color, gender, national or ethnic origin, age, disability, religion, sexual orientation, gender identity, or any other status protected by applicable law.
• We cultivate a culture that values all individuals and encourages diverse perspectives, allowing you to make an impact and advance your career.
• Exavalu also promotes flexibility to accommodate the needs of employees, customers, and the business.
• Additionally, we offer a welcome back program to assist individuals in reintegrating into the workforce after a prolonged absence due to health or family reasons.