Remotery

Senior/Staff/Principal Software Engineer – Observability Engineering

Posted 1 day ago

This is a fully remote position, open to applicants in New York.

📋 Description

• Take ownership of the complete design and execution of the AppGate observability fabric, which includes telemetry SDKs in our clients and gateways, the LogForwarder pipeline, and customer-side integrations.

• Make essential technical choices regarding transport protocols, sampling strategies, schema design, and correlation models that influence the scalability of our platform to handle hundreds of millions of events daily.

• Facilitate the development of advanced capabilities such as OpenTelemetry-Native Telemetry Fabric, High-Cardinality Data Pipeline, End-to-End Distributed Tracing, On-Demand Packet Capture, and more.

• Establish the telemetry schema, correlation model, transport, and sampling strategies that encompass client devices, controllers, and gateways.

• Validate at Customer Scale: Conduct tests in lab settings that replicate our largest deployments and proactively identify cardinality explosions and pipeline backpressure before they reach customers.

• Drive Integration Standards: Manage the OTLP, Prometheus, and JSON-log compatibility surface while ensuring successful ingestion into platforms like Datadog, Splunk, Nexthink, and Elastic.

• Collaborate Cross-Functionally: Engage directly with product teams, R&D, and key customers in defense and critical infrastructure to define requirements and achieve meaningful results.


⛳️ Requirements

• Over 8 years of engineering experience, with a minimum of 4 years focused on observability, telemetry, or large-scale data infrastructure (Datadog, Splunk, Elastic, Honeycomb, New Relic, Grafana Labs, or similar).

• Deep knowledge of OpenTelemetry: OTLP, the OTel Collector, semantic conventions, context propagation, and head/tail sampling. You can discuss the trade-offs fluently.

• Experience with distributed tracing in a production environment: You have designed or made significant contributions to a tracing system that manages real customer traffic, beyond just a side project.

• Proven experience with high-throughput pipelines: Hands-on involvement with systems processing over 100M events daily, including back-pressure management, batching, and storage considerations.

• Strong skills in systems programming: Experience in production environments using Go and/or Rust is preferred. Familiarity across the stack, from agent code to backend services, is essential.

• Fluency in networking and security: Comfortable with TLS, DNS, TCP, and identity protocols. Prior experience in ZTNA, SASE, or SD-WAN is a significant advantage.

• Mindset: Pragmatic, opinionated, and focused on impact. You understand the right moments to prototype and when to deliver.


🏝️ Benefits

• Competitive salary and performance-based bonuses.

• Comprehensive health, dental, and vision insurance.

• Generous paid time off and holiday schedule.

• Opportunities for professional development and continuous learning.

• Flexible work arrangements and a supportive team culture.
