
Lead Data Engineer
Posted May 2

• Oversee the design and evolution of scalable, distributed data pipelines, ensuring high availability and performance at scale.
• Design and implement robust data models to support reporting and advanced data applications.
• Develop and maintain distributed web scraping systems using tools like Playwright, Selenium, and BeautifulSoup.
• Build systems capable of handling anti-scraping measures, proxy rotation, and high-volume data extraction.
• Incorporate AI and LLMs into engineering workflows for code generation, automation, and optimization.
• Use prompt engineering techniques to improve data processing, documentation, and troubleshooting.
• Identify and implement system and process improvements to boost performance and efficiency.
• Manage and scale cloud-based data infrastructure, including data warehouses, object storage, and search systems.
• Deploy and manage containerized workloads on Kubernetes.
• Establish and maintain data quality monitoring and governance processes to ensure accuracy and reliability.
• Mentor junior engineers through code reviews, documentation, and knowledge sharing.
• Clearly communicate technical concepts and provide business context for engineering decisions.
• At least 5 years of experience in data engineering with a proven track record of scaling systems.
• Expert-level proficiency in Python and advanced SQL, including performance tuning and optimization.
• Significant experience with workflow orchestration tools such as Airflow or Prefect and transformation tools like dbt.
• Demonstrated experience building resilient web scraping systems using Playwright, Selenium, and BeautifulSoup.
• Deep understanding of relational and NoSQL databases, including Postgres, MongoDB, and Elasticsearch.
• Experience with large-scale data systems such as BigQuery.
• Strong expertise in CI/CD pipelines, Git, and Docker.
• Proven experience designing and maintaining distributed systems with high availability and fault tolerance.
• Competitive salary and performance-based bonuses.
• Comprehensive health, dental, and vision insurance.
• Flexible work hours and opportunities for remote work.
• Professional development and continuous learning opportunities.
• A supportive and collaborative work environment.