
Infrastructure Engineer – AI Platform
Posted May 22

This is a fully remote position, open to applicants in Brazil.
• Take ownership of the deployment and operational management of AI-assisted development tools (e.g., Cursor, Copilot, Claude Code) across the engineering department.
• Establish and enforce access controls, license management, and usage policies that comply with SOC2/ISO 27001 standards.
• Create cost tracking and reporting mechanisms to provide leadership with insights into AI tool expenditures and usage trends across the organization.
• Minimize barriers for engineers in adopting these tools while ensuring security and auditability are upheld.
• Collaborate with various teams within the organization to identify, develop, and support internal AI applications, such as RAG pipelines, agents, and automation workflows.
• Assess and recommend tools, frameworks, and patterns based on the actual needs of the teams.
• Clearly define the boundaries of responsibility between IaaS and the consuming teams.
• Provide guidance on data governance policies for LLM usage, including which data can be input into which models, where outputs are stored, and how audit trails are maintained.
• Ensure that AI infrastructure and tools adhere to existing SOC2 and ISO 27001 controls, with evidence available for audits.
• Deliver clear and regular reporting to leadership regarding AI adoption, costs, risks, and usage throughout the organization.
• Establish and manage AI/ML infrastructure, primarily on GCP (Vertex AI), within OpenVPN’s current environment.
• Design Terraform modules and IaC patterns for AI infrastructure that align with the team's established conventions (e.g., Atlantis-driven GitOps workflows).
• Provide visibility into AI/ML infrastructure costs and implement controls that are consistent with how compute costs are managed in other areas.
• Analyze build-versus-buy decisions for AI/ML infrastructure components and managed services, focusing on operational compatibility with existing patterns.
• Proven hands-on experience in establishing and managing AI/ML infrastructure, such as Vertex AI or similar platforms (SageMaker, Azure ML).
• Experience in setting up AI developer tools (Cursor, Copilot, etc.) at an organizational level, including management of rollout, access, and costs.
• Proficiency in infrastructure-as-code, primarily Terraform managed through Atlantis; you should be able to write modules that other teams can consume through self-service.
• Ability to collaborate across teams and define boundaries for new capability areas that lack established patterns.
• Strong communication skills to articulate the cost, risk, and value of AI investments to leadership effectively.
• Competitive pay rates.
• Fully remote work environment.
• Self-managed time off.
• Team trips and special events.