
Senior AI Systems Engineer
Posted 1 day ago

This is a fully remote position, open to applicants in New Mexico, +1 more state.
• Lead the implementation, integration, and operational maintenance of AI platforms, tools, and services, ensuring they are compatible with current systems and enterprise processes.
• Design, implement, monitor, and enhance AI infrastructure in collaboration with server, cloud, and platform engineering teams.
• Operationalize machine learning workflows and provide support for AI-driven applications from development to production deployment and ongoing maintenance.
• Develop and sustain CI/CD and MLOps pipelines for model packaging, testing, deployment, rollback, and lifecycle management.
• Utilize scripting, Infrastructure as Code, and configuration management practices to automate infrastructure.
• Deliver continuous technical support, troubleshooting, root cause analysis, and documentation for AI platforms and user-facing AI services.
• Maintain observability of AI systems through logging, metrics, performance monitoring, alerting, and incident response protocols.
• Ensure adherence to security, compliance, and governance standards, including engagement in audits, vulnerability management, and secure architecture evaluations.
• Evaluate and implement system improvements to enhance performance, scalability, reliability, and cost-effectiveness.
• Collaborate across departments to support AI projects and align technical implementations with organizational goals.
• Assess new AI tools, frameworks, and infrastructure strategies for operational viability, supportability, and long-term benefits.
• Create and update technical documentation, runbooks, architecture diagrams, and operational procedures.
• Bachelor’s degree in computer science, engineering, information technology, or a related STEM discipline with 8-10 years of engineering experience.
• More than 2 years of experience supporting AI/ML platforms, MLOps workflows, model deployment, or AI-enabled infrastructure.
• Strong coding and automation skills in Python, Bash, or similar scripting languages.
• Experience with AI/ML frameworks and tools like PyTorch, Hugging Face, or comparable ecosystems.
• Expertise in DevOps and MLOps methodologies, including CI/CD pipelines, Git-based workflows, containerization, and Kubernetes.
• Proven experience in deploying AI/ML models or AI services in operational settings, including containerized, cloud, or high-performance computing environments.
• Knowledge of security frameworks and compliance standards such as NIST and CMMC.
• Understanding of security mechanisms for AI services in enterprise environments, including OAuth.
• Excellent communication skills with the ability to collaborate effectively across both technical and non-technical teams.
• Health insurance
• Retirement plans
• Paid time off
• Flexible work arrangements
• Professional development
• Security clearance support