Sr AI Data Engineer
Honeywell - Aerospace
Job Description
As a Sr AI Data Engineer here at Honeywell, you will play a crucial role in designing and implementing advanced data solutions that drive business insights, enhance decision-making processes, and power AI. Your expertise will support critical data science development activities across all AI modalities (classic ML, generative, and agentic) and data types (structured and unstructured). You will report directly to our AI Director, and you will work out of our Phoenix, AZ or Charlotte, NC location on a hybrid work schedule.
In this role, you will impact the organization by leveraging your technical skills to develop innovative data solutions that support strategic initiatives and improve operational efficiency.

KEY RESPONSIBILITIES
- Support end-to-end data needs for all AI modalities, including classic ML, GenAI/LLMs, and agentic AI systems.
- Build robust, scalable data pipelines for structured, semi-structured, and unstructured data, including text, documents, images, audio, video, and logs.
- Develop feature engineering pipelines for classic ML, including feature extraction, transformation, and feature store management.
- Build and optimize GenAI and LLM data pipelines, including embedding generation, vectorization, chunking, metadata extraction, and document enrichment for RAG and context retrieval.
- Develop data ingestion and orchestration workflows that support agentic AI, including memory stores, event-driven pipelines, tool-use data flows, and real-time retrieval services.
- Design and implement advanced data solutions using AWS (S3, Glue, Lambda, EMR, Kinesis), Databricks (Spark, Delta Lake, Vector Search), and Dataiku to enable intelligent systems at scale.
- Implement data governance, quality, lineage, monitoring, and observability to support high-performance, trustworthy AI.
- Partner with data scientists, ML engineers, and AI product teams to deliver datasets for model development, fine-tuning, evaluation, and production inference.
- Optimize pipelines for latency, cost, reliability, and throughput, ensuring AI systems, from batch ML to real-time agents, have the data they need.
- Lead the design, automation, and operation of end-to-end MLOps pipelines supporting classic ML, GenAI/LLM systems, and agentic AI workloads across Databricks and Dataiku.
- Build, maintain, and optimize training, evaluation, and deployment pipelines, ensuring reliability, reproducibility, and alignment with business objectives.
- Collaborate with data scientists, AI software developers, data engineers, and platform engineers to operationalize models, LLMs, RAG workflows, and agentic AI capabilities.
- Architect and implement solutions for distributed training, hyperparameter optimization, accelerated inference, and performance-tuned model serving.
- Develop automated testing, validation, governance, and monitoring frameworks for ML/LLM/agentic workflows, including drift detection, model quality, and guardrail coverage.
- Own CI/CD pipelines for model assets, prompts, embeddings, vector search updates, and agent tool registries using GitHub Actions and modern ML deployment frameworks.
- Manage MLflow experiment tracking, model registry lifecycle, lineage, and promotion flows across multiple environments in Databricks and Dataiku.
- Optimize integration between ML frameworks (PyTorch, TensorFlow, scikit-learn) and cloud-based compute ecosystems including Spark, Kubernetes, and serverless runtimes.
- Ensure production-grade reliability, scalability, performance, and observability of all deployed AI workloads (classic → GenAI → agentic).
- Establish best practices, patterns, reusable templates, and standards for MLOps across the AI delivery lifecycle.

Qualifications

YOU MUST HAVE
- Bachelor's degree from an accredited institution in a technical discipline such as science, technology, engineering, or mathematics.
- 5 or more years of experience in data engineering, distributed data systems, or ML data pipelines.
- Strong experience working with Apache Spark, preferably in Databricks.
- Proficiency in Python and SQL; experience with distributed computing and big data frameworks.
- Hands-on experience with cloud-based ETL/ELT pipelines, preferably AWS (S3, Glue, Lambda, EMR, Step Functions, Redshift, Athena).
- Experience building data solutions that support multiple AI workloads, including:
  - ML training and inference data flows
  - Unstructured data ingestion and transformation
  - Embedding/vector pipelines for LLMs
- Experience working with data modeling, data integration, ETL/ELT frameworks, and reliable production-grade pipelines.

WE VALUE
- Bachelor's degree in a technical field (CS, Engineering, Math, or related).
- Experience supporting AI at scale across classic ML, GenAI/LLM, and agentic AI systems.
- Experience with vector databases and semantic search (Databricks Vector Search, Pinecone, FAISS, Milvus, OpenSearch).
- Familiarity with LLM and GenAI data preparation, including:
  - Text processing
  - Tokenization
  - Chunking strategies
  - Prompt/context formatting
- Experience with unstructured data technologies (OCR, NLP pipelines, computer vision data processing).
- Hands-on experience with Dataiku for automation, workflow orchestration, and AI project management.
- Knowledge of MLOps tooling: MLflow, Delta Lake, experiment tracking, CI/CD for ML.
- Understanding of agentic AI system patterns, such as memory architectures, tool APIs, event-driven workflows, and reasoning-chain data requirements.
- Strong analytical mindset, attention to detail, and commitment to high data quality.
- Ability to thrive in a fast-paced, evolving AI environment and collaborate across cross-functional teams.

BENEFITS OF WORKING FOR HONEYWELL
In addition to a competitive salary, leading-edge work, and developing solutions side-by-side with dedicated experts in their fields, Honeywell employees are eligible for a comprehensive benefits package. This package includes employer-subsidized Medical, Dental, Vision, and Life Insurance; Short-Term and Long-Term Disability; 401(k) match; Flexible Spending Accounts; Health Savings Accounts; EAP and Educational Assistance; Parental Leave; Paid Time Off (for vacation, personal business, sick time, and parental leave); and 12 Paid Holidays.
The application period for the job is estimated to be 40 days from the job posting date; however, this may be shortened or extended depending on business needs and the availability of qualified candidates.

Posting Date: February 18, 2026

US CITIZEN REQUIREMENT
Must be a US Citizen due to contractual requirements.

ABOUT HONEYWELL
Honeywell International Inc. (Nasdaq: HON) invents and commercializes technologies that address some of the world's most critical challenges around energy, safety, security, air travel, productivity, and global urbanization.
We are a leading software-industrial company committed to introducing state-of-the-art technology solutions that improve efficiency, productivity, sustainability, and safety in high-growth businesses across broad-based, attractive industrial end markets. Our products and solutions enable a safer, more comfortable, and more productive world, enhancing the quality of life of people around the globe.

THE BUSINESS UNIT
Honeywell Aerospace Technologies (AT) products and services are found on virtually every commercial, defense, and space aircraft in the world.
We build aircraft engines, cockpit and cabin electronics, wireless connectivity systems, mechanical components, and more, and connect many of them via our high-speed Wi-Fi offerings. Our solutions create healthier air travel, more fuel-efficient and better-maintained aircraft, more direct and on-time flight arrivals, safer skies and airports, and more comfortable flights, along with several innovations and services that reflect exciting and emerging new transportation methods such as autonomous and supersonic flight. In 2023, Honeywell Aerospace Technologies generated revenues of $14B and employed approximately 21,000 people globally.
#AERO26