Job Description
Role Number: 200641714-3956
Summary
We live in a mobile and device-driven world where Deep Learning technology enables a new class of applications. We are looking for a software development engineer to design and build agentic systems for Large Language Model (LLM) evaluation and synthetic data generation. Imagine the countless possibilities powered by Artificial Intelligence! Are you passionate about enabling unique user experiences on Apple products such as Apple Vision Pro, iPhone, iPad, Apple Watch and the Mac? In the Video Engineering team, we are dedicated to providing hardware and software solutions for the execution of Deep Learning workloads. Our success is the result of very dynamic people working in an environment that cultivates creativity, partnership and cross-functional collaboration. These elements come together to make Apple an amazing environment for motivated people to do the greatest work of their lives!
Description
As a Software Engineer in a test role, you will collaborate with world-class machine learning engineers and data scientists to understand the features you will support. In this role, you will create end-to-end automated evaluation pipelines that orchestrate multiple LLMs to generate test data, stress models, identify failure modes, and enable safe, scalable model deployment. This is a highly technical, hands-on role at the intersection of AI systems engineering, evaluation science, and automation.
Minimum Qualifications
BS and a minimum of 3 years of relevant industry experience
Strong Python skills with experience building production-grade automation
Strong knowledge of software development lifecycle, testing methodologies, QA terminology and processes
Experience designing or implementing agentic or multi-step LLM workflows
Experience generating and validating synthetic data
Preferred Qualifications
2+ years of experience in test automation or related areas, with a background in QA or test engineering
Experience with agent frameworks such as LangGraph, AutoGen, CrewAI or similar
Experience building human-in-the-loop evaluation systems
Knowledge of CI/CD pipelines for ML or evaluation workflows
Ability to multi-task and lead tasks with varying priorities
Experience with popular database management software, e.g. SQL databases
Excellent written and verbal communication skills, with the ability to describe and document work clearly
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).