AI Platform Engineer
Open Systems Technologies
Job Description
What you’ll do in the role:
Design and build a firmwide AI development and evaluation platform with a strong focus on enterprise-scale GenAI benchmarking, assurance, and governance. Develop self-service tooling, SDKs, and APIs to enable teams to build, evaluate, and deploy GenAI applications efficiently and safely. Build reusable, scalable platform components for GenAI and agentic systems, including orchestration, evaluation pipelines, and model lifecycle workflows.
Lead the implementation of container-native GenAI workloads on Kubernetes / OpenShift using GitOps-driven deployment patterns. Integrate and operate GenAI ecosystem components including LLMs, vector databases, embeddings, and agent frameworks. Drive key architecture, product, and design decisions across security, authentication, observability, scalability, and reliability.
Establish platform best practices for GenAI evaluations, agentic systems, ModelOps / LLMOps, and production operations. Collaborate closely with engineers, data scientists, security, and product teams to accelerate safe enterprise adoption of GenAI.
What you’ll bring to the role:
6+ years of strong hands-on software engineering experience, preferably in Python (FastAPI, Flask), building large-scale, cloud-native platforms.
Deep experience designing and operating Kubernetes / OpenShift workloads using Helm, Kustomize, container registries, and GitOps practices. Hands-on experience building GenAI and LLM-based applications, including agentic orchestration, embeddings, evaluation workflows, and fine-tuning. Strong understanding of microservices, RESTful API design, asynchronous and concurrent programming, and performance-oriented systems.
Solid foundation in data engineering principles including SQL/NoSQL stores, Kafka, Redis, vector databases, and state management at scale. Proficiency in DevOps, CI/CD, observability (OpenTelemetry, Prometheus, Grafana), and SRE-inspired operational practices. Strong working knowledge of security-first design, OAuth2, secure coding practices, and enterprise-grade platform controls.
Experience with agent-based frameworks or orchestration systems.
Exposure to LLMOps / ModelOps / evaluation platforms.
Experience working in enterprise-scale platforms or internal developer platforms.