Where is this Nvidia Engineer job located?

This position is located in Pune City.

Nvidia Engineer, Pune City (Persistent Systems)

About Position: You will be the team's primary authority on NVIDIA's inference ecosystem: NIM (NVIDIA Inference Microservices), Triton Inference Server, TensorRT, and the BioNeMo platform. Your core mission is to take structural biology AI models, whether NIM-ready or research-grade Python scripts, and turn them into production-quality, API-accessible inference services.

Critical Requirement: Several target models (LigandMPNN, Boltz, custom AlphaFold2 variants) are not yet available as official NVIDIA NIM services.

This role requires hands-on ability to build NIM-compliant containers from scratch and configure Triton model repositories for models that currently only have CLI or notebook interfaces.

Role: Nvidia Engineer
Location: All Persistent Locations
Experience: 4 to 7 Years
Job Type: Full Time Employment

What You'll Do:

NIM Service Deployment
- Deploy and configure NVIDIA NIM containers for bio models (AlphaFold2-Multimer, ESMFold, ProteinMPNN) on the GPU cluster
- Manage NIM service lifecycle: versioning, health checks, rolling updates, rollback strategies
- Tune NIM deployment parameters: instance count, GPU assignment, concurrency settings, request queuing
- Integrate deployed NIM endpoints with upstream orchestration (SLURM, Nextflow, REST clients)

Custom NIM Packaging (Primary Focus)
- Analyse non-NIM models (LigandMPNN, Boltz, RFDiffusion, etc.) and design their Triton serving architecture
- Write Triton model configs (config.pbtxt): input/output tensors, batching policy, backend selection (PyTorch, Python, ONNX, TensorRT)
- Build NIM-spec Docker images: base layers, model weights, dependency pinning, health endpoint, OpenAPI schema
- Implement ensemble pipelines in Triton for multi-stage workflows (MSA search → folding → scoring)
- Export models to ONNX or TensorRT where inference optimization is feasible; document tradeoffs
- Test packaged services against reference outputs from original model codebases to validate correctness

NVIDIA Ecosystem & Optimization
- Work with NGC private registry: push/pull images, manage model cards, handle credential scoping
- Apply TensorRT optimization and FP16/INT8 quantization where applicable for throughput gains
- Profile GPU memory footprints and latency of each packaged model; document per-GPU requirements
- Stay current with NVIDIA BioNeMo updates, NIM API spec changes, and new bio model releases
- Evaluate new models from the research community (CASP, bioRxiv) for NIM packaging feasibility

Collaboration & Documentation
- Partner with the MLOps Engineer to ensure packaged services deploy cleanly on the cluster
- Partner with the Computational Biologist to understand model I/O contracts and validation criteria
- Write and maintain NIM packaging runbooks, Triton config templates, and container build guides
- Define API schemas (OpenAPI/gRPC proto) for each service so downstream teams can integrate reliably

Expertise You'll Bring:

NVIDIA NIM
- Direct hands-on experience deploying NVIDIA NIM containers (not just awareness; actual production use)
- Thorough understanding of NIM container specifications: health endpoints, model directory layout, environment variables
- Experience with: NVIDIA NGC catalog, private registry, API key management
- Familiarity with NVIDIA BioNeMo (advantage): ESMFold NIM, ProteinMPNN NIM

Triton Inference Server
- Writing model repository configurations (config.pbtxt) for multiple backends: PyTorch, Python, ONNX, TensorRT
- Building Triton ensemble pipelines for multi-step inference workflows
- Experience with: dynamic batching, sequence batching, model instance configuration
- Using Triton client libraries (tritonclient) in Python for testing and benchmarking

Model Optimization
- Hands-on with TensorRT: building engines from ONNX, precision modes (FP32 / FP16 / INT8), profiling
- ONNX export from PyTorch and JAX models (handling dynamic shapes)
- GPU memory profiling using: nvidia-smi, Nsight Systems, torch.cuda.memory_summary
- Understanding transformer inference patterns: attention caching, batching strategies

Bio Models (Preferred)
- Practical experience running AlphaFold2 and AlphaFold-Multimer (end-to-end, not just API usage)
- Understanding of LigandMPNN: architecture, input/output tensors (protein graph, ligand context)
- Awareness of: Boltz-1 (MIT, 2024), differences vs AlphaFold3 (serving requirements)
- Familiarity with: RoseTTAFold2, ESMFold, RFDiffusion

Programming
- Advanced Python: async programming, packaging, CLI development (click, argparse), FastAPI / gRPC wrappers
- Docker expertise: multi-stage builds, layer optimisation, CUDA base image selection
- Bash scripting: container build automation, CI pipelines

- Experience with protein language model embeddings (ESM-2, ESM-3) as model inputs
- Kubernetes / Helm experience for hybrid HPC + cloud NIM deployments
- Published benchmarks or blog posts on model serving optimization
- Experience with Run:ai (workloads, projects, quotas, fractional GPU)
- NVIDIA AI Enterprise licensed-stack experience
- NVIDIA Dynamo or disaggregated inference experience
- NeMo Guardrails / NIM safety filters for any LLM-adjacent endpoints
- Slurm + Pyxis/Enroot experience for HPC-style NIM execution alongside Kubernetes

Benefits:
- Competitive salary and benefits package
- Culture focused on talent development with quarterly growth opportunities and company-sponsored higher education and certifications
- Opportunity to work with cutting-edge technologies
- Employee engagement initiatives such as project parties, flexible work hours, and Long Service awards
- Annual health check-ups
- Insurance coverage: group term life, personal accident, and Mediclaim hospitalization for self, spouse, two children, and parents

Values-Driven, People-Centric & Inclusive Work Environment:

Persistent is dedicated to fostering diversity and inclusion in the workplace. We invite applications from all qualified individuals, including those with disabilities, and regardless of gender or gender preference. We welcome diverse candidates from all backgrounds.

We support hybrid work and flexible hours to fit diverse lifestyles. Our office is accessibility-friendly, with ergonomic setups and assistive technologies to support employees with physical disabilities. If you are a person with disabilities and have specific requirements, please inform us during the application process or at any time during your employment.

Let’s unleash your full potential at Persistent: persistent.com/careers

“Persistent is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind.”
