
Senior Data Developer, AWS
Posted 1 day ago

• Development of data pipelines: create jobs utilizing AWS Glue/Spark or Python (Pandas/Lambdas) to convert raw data into refined datasets.
• Structuring data storage: arrange data within Amazon S3 (data lake) and integrate it with vector databases or Redshift to facilitate advanced analytics.
• Data processing: manage both structured and unstructured data, transforming it into optimized formats (e.g., Parquet) for improved querying and analysis.
• Configuration of orchestration: set up workflows using AWS Step Functions or Managed Workflows for Apache Airflow (MWAA) to automate data processing activities.
• Collaboration across functions: partner with data scientists, engineers, and business stakeholders to grasp data requirements and provide high-quality solutions.
• Monitoring performance: evaluate and analyze data pipeline performance to guarantee reliability and efficiency.
• Governance and scalability: enforce data governance practices and scalable architectures to bolster enterprise data initiatives.
• Integration of AI: assist in the incorporation of AI capabilities into data workflows to improve the efficiency and effectiveness of solutions.
• English proficiency, at minimum, for effective communication with global teams and for producing documentation.
• Significant experience with the AWS ecosystem, particularly with S3 as the main data source.
• Skilled in data processing frameworks like AWS Glue, Spark, or Python (Pandas/Lambdas).
• Familiarity with Databricks and its connector, including API integrations for data processing and structuring.
• Expertise in managing structured and unstructured data, along with knowledge of vector databases (e.g., Qdrant) for AI applications.
• Demonstrated ability to convert various file formats into processable datasets.
• Knowledge of AI frameworks and methodologies.
• Experience utilizing data orchestration tools such as Apache Airflow.
• Understanding of data governance and compliance standards.
• Insight into performance optimization techniques for data pipelines.
• Health and dental insurance
• Meal and food allowance
• Childcare assistance
• Extended paternity leave
• Partnerships with gyms and wellness professionals through Wellhub (Gympass)/TotalPass
• Profit and results sharing (PLR)
• Life insurance
• Continuous learning platform (CI&T University)
• Discount club
• Free online platform dedicated to physical, mental, and overall well-being
• Pregnancy and responsible parenting course
• Collaborations with online learning platforms
• Language learning platform