
LLM - AI Quality Analyst

ALOIS UK

Sheffield | Full-time | Mid Level | On-site

Job Description

Job Title: LLM - AI Quality Analyst (Personalization) – English
Location: United Kingdom
Employment Type: Contractor (Short-Term Contract)
Working Hours: Minimum 20 hours per week (Part-Time Availability); at least 4 hours per day
Required overlap with PST time zone: minimum 2 hours for part-time engagements; up to 4 hours for full project participation

Role Overview

Client is seeking AI Quality Analysts to evaluate a new personalization feature for Gemini. In this role, you will assess how effectively the AI model utilizes information from previous Gemini conversations, Gmail, Google Search history, and YouTube activity to generate responses that are relevant, personalized, and genuinely helpful. The ideal candidate combines creativity with analytical rigor, designing prompts based on personal experiences and evaluating model outputs against multiple quality dimensions, including grounding, integration, helpfulness, and naturalness.

Key Responsibilities

Design and execute multi-turn conversational prompts (typically 1–5 turns) using your own experiences and context.
Evaluate AI-generated responses to determine whether personalization has been appropriately applied.
Assess response quality based on relevance, grounding, integration, and overall helpfulness.
Identify hallucinations, unsupported assumptions, flawed inferences, and inappropriate personalization.
Review and compare Side-by-Side (SxS) model outputs to determine which response performs better overall.
Rank model responses based on usability, helpfulness, naturalness, and user experience.
Write clear, concise, and evidence-based rationales supporting model rankings.
Reference specific conversation turns when documenting observations and evaluation outcomes.
Extract and verify debugging information to confirm the appropriate use of summaries and personal data sources.
Maintain strict data hygiene practices by deleting evaluation conversations after completion.
Provide detailed annotations and constructive feedback to support model improvement initiatives.
Work independently while collaborating effectively within a globally distributed team.

Key Qualifications

Excellent English proficiency with strong reading and writing skills.
Ability to understand nuanced instructions and communicate clearly in written form.
Willingness to use a primary personal Google account (not a testing account) and enable personal data sources for authentic evaluations.
Strong analytical thinking with the ability to assess ambiguous and nuanced AI outputs.
Experience designing creative prompts that effectively test personalization capabilities.
Understanding of personalization concepts, including identifying incorrect personalization, unsupported assumptions, forced connections, and poor inferences.
Strong attention to detail with the ability to detect subtle differences between responses.
Excellent written communication skills for producing structured evaluation rationales.
Ability to provide constructive feedback and detailed annotations.
Strong communication and collaboration abilities.
Self-motivated with the ability to work independently in a remote environment.
Reliable desktop or laptop setup with a stable internet connection.

Required Skills & Experience

Experience in data annotation, AI quality evaluation, content moderation, or similar analytical roles is strongly preferred.
Familiarity with evaluating AI-generated content and identifying quality issues.
Experience reviewing comparative outputs and making evidence-based judgments is advantageous.
Strong critical thinking and decision-making skills.

Education

Bachelor's degree (BS/BA) or equivalent experience in a relevant field such as policy, law, ethics, linguistics, journalism, computer science, or other analytical disciplines. Equivalent professional experience will also be considered.

Special Requirements

English Proficiency: High level of comprehension and written communication in English.
Personal Account Usage: Must be willing to use a personal Google account for evaluation purposes.
Schedule Flexibility: Availability to support global 24-hour operations.
Technical Requirements: Access to a desktop or laptop with a dependable internet connection.

Availability Requirements

United Kingdom: Full-time availability required, 8 hours per day, with a minimum 4-hour overlap with the PST time zone.

Posted Today
