
Staff Replication Development Engineer
Posted 1 day ago

This is a fully remote position, open to applicants in California.
• Oversee the design and development of the replication engine for the Infinia AI Data Platform.
• Concentrate on creating enterprise-level asynchronous replication features that facilitate dependable and secure disaster recovery for extensive data systems.
• Create high-performance replication pipelines, efficient data synchronization methods, and secure data transfer systems.
• Design and develop multi-threaded asynchronous replication systems with capabilities for parallel streaming.
• Construct object-level delta replication with functionality for checkpointing and resuming.
• Develop replication engines that support bucket/share-level replication controls.
• Implement secure data transfer protocols utilizing TLS 1.3 with mutual authentication.
• Ensure comprehensive data integrity through checksum validation and verification processes.
• Design and implement manual failover workflows for disaster recovery situations.
• Build and maintain REST APIs for replication configuration, control, and automation.
• Develop systems for metadata tracking and change detection to facilitate effective replication.
• Implement recovery point objective (RPO) visibility, alerting, and operational insights into replication status.
• Contribute to monitoring dashboards that focus on replication health and performance.
• Ensure that systems are engineered for high availability, fault tolerance, and scalability.
• Collaborate with QA teams to enhance performance, resiliency, and scale validation.
• Work together with backend, security, and platform teams to deliver comprehensive replication workflows.
• Engage in debugging, resolving production issues, and continuously improving replication reliability.
• Provide technical leadership, architectural direction, and mentorship to the engineering team.
• Over 8 years of experience in distributed systems, storage systems, or backend software engineering.
• Proficiency in one or more of the following programming languages: C++, Go, Java, or Rust.
• Experience in designing and developing data replication systems, data pipelines, or distributed data services.
• In-depth knowledge of distributed systems principles (consistency, availability, scalability, fault tolerance).
• Strong expertise in multi-threading, concurrency, and parallel processing.
• Familiarity with networking protocols and secure communication (TCP/IP, HTTP/HTTPS, TLS).
• Experience in implementing data integrity mechanisms (checksums, validation, consistency checks).
• Experience in designing and building REST APIs and service-oriented architectures.
• Basic understanding of checkpointing, failure recovery, and retry strategies in distributed systems.
• Basic awareness of observability concepts (metrics, logging, alerting).
• Strong debugging, problem-solving, and system design capabilities.
• Dynamic and motivated team environment.
• Opportunity for hands-on involvement.
• Engineering excellence is at the core of everything we do.