
Solutions Engineer, Healthcare
Posted May 30

This is a fully remote position, open to applicants in Brazil.
• Take charge of cross-cloud data movement and delivery.
  • Implement and oversee large-scale data transfers across platforms such as AWS S3, Google Cloud Storage, Azure Blob, Snowflake, and customer environments.
  • Utilize command-line interfaces like rclone, s5cmd, and cloud-native utilities to ensure safe and efficient data movement.
  • Oversee credentials, permissions, manifests, and delivery packaging artifacts essential for ingestion, subset delivery, and handoff workflows.
• Develop structured data assembly and lightweight transformation workflows.
  • Employ Python and SQL for joining datasets, adding derived columns, cleansing data, and validating CSV, Parquet, and database tables.
  • Assist in customer-specific assembly tasks that convert raw inputs into ready-to-deliver datasets.
  • Maintain a high standard for data integrity, structure, and reproducibility prior to handoff.
• Operate internal pipelines with a focus on production discipline.
  • Utilize Protege's Dagster-based platform to orchestrate data processing and delivery.
  • Ensure a clear distinction between pre-production and production workflows and validate configurations before execution.
  • Create lightweight scripts and command-line workflows for tasks such as filtering, manifest generation, validation, and recovery.
• Generate leverage for the team and platform.
  • Document processes, outputs, and recovery strategies for auditability and repeatability.
  • Recognize recurring delivery patterns, failure modes, and areas of manual effort.
  • Collaborate with Engineering to transform one-off operational tasks into repeatable platform capabilities and test new tools prior to launch.
• Extensive hands-on experience with data pipelines, both orchestrated and manual, in actual production environments.
• Proficiency with command-line tools on Linux or macOS and strong scripting skills in Python, SQL, and Bash/shell.
• Experience with cloud storage systems and large-scale cross-cloud data transfers.
• High standards for data integrity, validation, reproducibility, and auditability, particularly for regulated data.
• Calm, systematic debugging approach and strong operational judgment regarding when to rerun, recover, or escalate issues.
• Capability to manage multiple delivery workflows concurrently while effectively collaborating with a distributed team.
• Bonus points for experience with AWS S3, GCS, Azure Blob, Snowflake, IAM debugging, Dagster/Airflow, healthcare data, or AI training data.
• Health insurance
• Retirement plans
• Paid time off
• Flexible work arrangements
• Professional development