Data Engineer

Infoplus Technologies UK Limited

Leicester | Full-time | Mid Level | On-site

Job Description

Key responsibilities on this engagement

• Run the Sprint 1 architecture review of the existing UAT codebase (S3 + Glue + S3 Tables + OpenSearch + Athena) and deliver written gap findings.
• Design the metadata schema, taxonomy, and field catalogue (Light, Brain, Power).
• Tune data orchestration: Glue jobs, Athena queries, S3 Tables config, scheduling.
• Lead the deep-dive technical sessions with analysts on visualization requirements.
• Build and validate the simulation data onboarding pipeline against real data, including the 30 GB-per-run acoustic spectra dataset.
• Configure and validate the OpenSearch k-NN vector store and the Bedrock embedding pipeline.
• Author the AI/ML data export format specification and the AI onboarding pattern document.
• Co-design the API middleware blueprint with the Cloud Infrastructure Architect.

Must-have

• Principal-level hands-on data engineering on AWS (7+ years)
• Deep production experience with S3, S3 Tables, Glue, Athena, and OpenSearch (including k-NN / vector search)
• Built and shipped vector embedding workloads
• Strong metadata modelling and data taxonomy design experience for scientific or engineering domains
• Comfort working with Parquet, JSON-LD, and large binary scientific data formats (mesh, time-series, spectra)
• Python proficiency; PySpark / Glue job tuning experience

Posted 3 weeks ago
