[Remote] Clinical Manager
Note: This is a remote position open to candidates in the USA. SDLC Technologies is seeking a Clinical Manager to serve as a Clinical AI Evaluator and Tester. The role involves executing structured evaluations against complex clinical rubrics and stress-testing models to identify edge-case failures before they are launched to the public.
Responsibilities
- Structured Grading: Reviewing prompt-response pairs and scoring them across dimensions like clinical accuracy, safety, and various health risk domains
- Dynamic Sprints: Participating in "sweeps" that involve conversing dynamically with the model to probe specific health risks (e.g., attempting to get the model to prescribe medication or diagnose a complex condition)
- Providing Clinical Rationale: Writing clear, concise, and scientifically backed justifications for why a model response was graded as unsafe or inaccurate
Skills
- Medical Degree: MD, MBBS, or local equivalent from an accredited institution
- Board Certification: Candidates must be Board-Certified in one of the following priority specialties: Primary Care / Family Medicine / Internal Medicine, Pediatrics, Obstetrics & Gynecology, Cardiology, Oncology, Psychiatry / Clinical Psychology, Emergency Medicine / Urgent Care
- Licensure: Active, unrestricted clinical licensure in good standing in the candidate's practicing geography (US, UK/EU)
- Active Practice: Direct, current practice in patient care
- Hands-on LLM Experience: Active experience using generative AI platforms (e.g., Gemini, ChatGPT, Claude) for clinical workflows, research, or everyday queries
- Awareness of AI Limitations: Understanding of the inherent limitations of LLMs, including hallucinations, sycophancy, bias, and the risks of generating unsafe clinical advice
- Prompting Skills: Ability to write clear, structured, and complex prompts to interact with models dynamically
- Adversarial Probing: Ability to adopt different user personas to stress-test the model
- Rubric-Based Grading: Experience working with complex, multi-dimensional rubrics and comfort aligning grades to strict 'Anchor Examples'
- Advanced Degrees: An additional advanced degree in a relevant field (e.g., Master of Public Health (MPH), MS/PhD in Medical Informatics, Computer Science, or Behavioral Science)
- Multilingual: Fluency in multiple languages and deep understanding of cultural health beliefs and regional clinical guidelines
- Prior AI Safety Work: Previous experience designing, implementing, or participating in AI safety evaluations, Trust & Safety workflows, or RLHF for medical models