This is a fully remote position, open to applicants in California.

📋 Description

• Design, provision, and oversee Apache Kafka clusters (self-managed on GCP/AWS or utilizing Confluent Platform / MSK).

• Configure and optimize brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies to achieve high throughput and minimal latency.

• Conduct cluster upgrades, rolling restarts, and broker replacements without any downtime.

• Implement and manage Kafka Connect pipelines for data ingestion and egress across diverse systems.

• Administer Kafka Streams and ksqlDB deployments for real-time stream processing tasks.

• Maintain the Schema Registry and uphold schema governance standards across teams.

• Define and monitor SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.

• Design and provision cloud infrastructure using Infrastructure as Code (IaC) with Terraform.

• Develop automated deployment pipelines for Kafka configuration changes utilizing GitOps workflows (ArgoCD, Flux).

• Create self-service tools and runbooks to reduce manual work for development teams.

• Automate topic provisioning, ACL management, and schema registration through APIs and CLI tools.

• Integrate tools such as GitLab CI/CD or Cloud Build for automated testing and deployment.

• Ensure seamless integration of data pipelines with other GCP services like BigQuery and Cloud Storage.

• Monitor and enhance the performance, reliability, and cost-effectiveness of Kafka and streaming pipelines.

• Adopt security best practices for GCP resources, including IAM policies, encryption, and network security.

• Ensure observability is a fundamental aspect of the infrastructure platforms, providing adequate visibility into their health, utilization, and costs.

• Collaborate closely with cross-functional teams to understand their requirements, educate them through documentation and training, and drive adoption of platforms and tools.

⛳️ Requirements

• 10+ years of overall experience in DevOps, cloud engineering, or data engineering.

• 5+ years of experience working with Kafka at production scale.

• Deep knowledge of Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.

• Proficiency in container orchestration (Kubernetes / Helm) and deploying Kafka using Strimzi, Confluent Operator, or equivalent.

• Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) within cloud environments.

• Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).

• Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.

• Expertise in Infrastructure as Code (IaC) tools such as Terraform or Cloud Deployment Manager.

• Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.

• Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.

• Knowledge of containerization and orchestration tools such as Docker and Kubernetes.

• Strong scripting abilities for automation (e.g., Bash, Python).

• Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.

• Familiarity with logging tools such as Cloud Logging or the ELK Stack.

• Strong problem-solving and analytical capabilities.

• Excellent communication and collaboration skills.

• Ability to thrive in a fast-paced, agile environment.

🏝️ Benefits

• As part of the total compensation package, this role may be eligible for a bonus.

Staff Software Engineer – Cloud Platform, Kafka
