Job Details

Job Information

Machine Learning Safety: Evaluation Research Engineer
AWM-2905-Machine Learning Safety: Evaluation Research Engineer
3/18/2026
3/23/2026
Negotiable
Permanent

Other Information

www.apple.com
San Francisco, CA, 94103, USA

Job Description

Role Number: 200651943-3577

Summary

This role supports the design and development of safety evaluation methodologies for generative and agentic AI features that enable users across the globe to interact with our media products and services.

Description

You will play an impactful role: shaping responsible AI and safety policies, evaluating fidelity to product safety requirements, creating risk assessments and taxonomies, curating exemplar safety evaluation datasets, and ensuring that evaluation frameworks are culturally and linguistically grounded.
An ideal candidate has a strong understanding of issues in responsible AI and AI and society, as well as technology evaluation design principles and practices, and brings experience designing evaluations to support policies and/or product requirements, classification systems, and annotation and/or study participant guidelines.

Minimum Qualifications

  • 4+ years of experience in an applied research setting related to evaluation design, AI ethics, Responsible AI, AI safety, computational social science, content analysis, or a closely related field.

  • Strong understanding of taxonomy design, classification systems, and annotation methodology.

  • Experience developing evaluation guidelines and exemplar sets for human annotation or labeling tasks.

  • Demonstrated ability to collaborate with subject matter experts (e.g., linguists, cultural consultants, multi-lingual annotators) to inform research design.

  • Able to work independently, with minimal direction, to drive outcomes across cross-functional teams.

  • Organized, highly attentive to detail, and able to manage time well.

  • Excellent written and oral communication skills.

  • Experience working in industry.

  • Advanced degree (MS/PhD) in Linguistics, Information Science, Computational Social Science, or a related socio-technical field.

Preferred Qualifications

  • Experience designing evaluation frameworks for multilingual or cross-cultural contexts.

  • Familiarity with responsible AI, AI safety, or content moderation policy frameworks.

  • Experience with experimental design methodologies, inter-rater reliability analysis, and annotation quality assessment methods.

  • Prior experience working with localization, internationalization, or language service teams.

  • Experience with survey design, AI policy development, and/or structured content analysis methodologies.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
