This is a fully remote position, open to applicants in New York.

📋 Description

• Take ownership of the complete design and execution of the AppGate observability fabric, which includes telemetry SDKs in our clients and gateways, the LogForwarder pipeline, and customer-side integrations.

• Make essential technical choices regarding transport protocols, sampling strategies, schema design, and correlation models that determine how well our platform scales to hundreds of millions of events per day.

• Facilitate the development of advanced capabilities, including OpenTelemetry-Native Telemetry Fabric, High-Cardinality Data Pipeline, End-to-End Distributed Tracing, and On-Demand Packet Capture.

• Establish the telemetry schema, correlation model, transport, and sampling strategies that encompass client devices, controllers, and gateways.

• Validate at Customer Scale: Conduct tests in lab settings that replicate our largest deployments and proactively identify cardinality explosions and pipeline backpressure before they reach customers.

• Drive Integration Standards: Manage the OTLP, Prometheus, and JSON-log compatibility surface while ensuring successful ingestion into platforms like Datadog, Splunk, Nexthink, and Elastic.

• Collaborate Cross-Functionally: Engage directly with product teams, R&D, and key customers in defense and critical infrastructure to define requirements and achieve meaningful results.

⛳️ Requirements

• Over 8 years of engineering experience, with a minimum of 4 years focused on observability, telemetry, or large-scale data infrastructure (Datadog, Splunk, Elastic, Honeycomb, New Relic, Grafana Labs, or similar).

• Deep knowledge of OpenTelemetry: OTLP, the OTel Collector, semantic conventions, context propagation, and head/tail sampling. You are comfortable discussing the trade-offs.

• Experience with distributed tracing in a production environment: You have designed or made significant contributions to a tracing system that handles real customer traffic, not just a side project.

• Proven experience with high-throughput pipelines: Hands-on involvement with systems processing over 100M events daily, including backpressure management, batching, and storage considerations.

• Strong skills in systems programming: Experience in production environments using Go and/or Rust is preferred. Familiarity across the stack, from agent code to backend services, is essential.

• Fluency in networking and security: Comfortable with TLS, DNS, TCP, and identity protocols. Prior experience in ZTNA, SASE, or SD-WAN is a significant advantage.

• Mindset: Pragmatic, opinionated, and focused on impact. You know when to prototype and when to deliver.

🏝️ Benefits

• Competitive salary and performance-based bonuses.

• Comprehensive health, dental, and vision insurance.

• Generous paid time off and holiday schedule.

• Opportunities for professional development and continuous learning.

• Flexible work arrangements and a supportive team culture.

Senior/Staff/Principal Software Engineer – Observability Engineering
