
HPC Engineer
Posted Jun 20

This is a fully remote position, open to applicants in the United States.
• Collaborate closely with customer stakeholders, scientists, and IT professionals to deliver Compute at Scale.
• Design, develop, and manage HPC platforms while providing support for Scientific applications, workflows, and related infrastructure, whether on-premises or cloud-hosted.
• Lead the architecture, roadmaps, and execution of projects to implement and maintain IT infrastructure best practices for customers.
• Provide full stack support, including platform design and evolution, application administration, customer workflow support, profiling, and performance tuning.
• Oversee monitoring and maintenance of designated systems, platform and systems administration, and troubleshooting of hardware, software, and networking issues.
• Engage in solution architecture and hands-on engineering for both on-premises and cloud environments.
• Maintain comprehensive documentation.
• Work collaboratively with cross-discipline team members and customers.
• Assist internal and customer Architecture and Design initiatives.
• Support customers with their workflow pipelines, offering both advisory and hands-on assistance.
• Thoroughly document new and existing computational assets.
• Exhibit flexibility to adapt as engagement scopes may change.
• Provide support for AWS and GCP Cloud applications, migrations, and modernization efforts.
• Implement CloudOps/IaC for ongoing platform management.
• Set up and configure AWS and GCP Cloud infrastructure for new platform builds.
• Ensure system compliance with company security policies and relevant regulatory standards.
• Facilitate transition support for modernized services to operational teams.
• Deliver engineering-level troubleshooting and service restoration for operational issues as they arise on supported platforms.
• Offer training and mentorship to junior team members.
• Serve as an escalation point across multiple engagements to guarantee resolution.
• A bachelor's degree or master's degree in Computer Science or a related field.
• Over 5 years of experience in administering HPC clusters and systems.
• Preferred experience with SLURM and Grid Engine scheduling software.
• More than 5 years of professional experience in Solution Architecture or Cloud Infrastructure Deployment and support.
• At least 5 years of professional experience in developing or managing compute solutions for Scientific/Research IT domains, with a preference for Life Sciences.
• Familiarity with Posit products (Package Manager, Connect, Workbench) in either an end-user or administrator role.
• Experience in developing scientific workflows on HPC systems utilizing Nextflow.
• Extensive command-line system administration expertise, including user and group management.
• Advanced knowledge of Active Directory, DNS, DHCP, LDAP, NFS, and SMB.
• Proficient in building applications from source code and in installing, maintaining, and troubleshooting application-level Linux and scientific software in line with industry best practices.
• Experience with the installation and fine-tuning of Linux operating systems.
• Understanding of Linux package management systems and their maintenance.
• Intermediate knowledge of OS-level networking.
• Proficient with scripting tools, automation tools, and configuration management tools.
• Preferred experience with Ansible, Terraform, and CloudFormation.
• Experience in administering and integrating Scientific/Research applications.
• Strong time-management skills: able to plan and prioritize tasks, complete projects promptly, and keep leadership and stakeholders regularly updated on status.
• Excellent communication skills, including the ability to prepare written documentation for IT colleagues and end users.
• Proactive thinking skills to identify potential issues and possible solutions before incidents occur.
• Exceptional attention to detail, which is essential when working with multiple clients simultaneously.
• Ability to comprehend and analyze complex technical problems and situations.
• Candidates should be passionate engineers with a strong vision and a commitment to staying informed about trends in the Scientific Computing sector.
• Ability to work independently or as part of a team.
• Capability to take projects from inception to completion with minimal supervision.
• Comprehensive health and wellness benefits, including Medical, Dental, and Vision Insurance.
• Company-provided Life and Long-Term Disability Insurance.
• Company-sponsored 401(k) Plan.
• Company-provided continuing education benefit.
• A team-focused culture with unlimited opportunities for advancement.