[Remote] Principal Data Scientist, Health Informatics

Remote · Full-time · Hiring now

Note: This is a remote position open to candidates in the USA. Waymark is a team of healthcare providers, technologists, and builders whose mission is to bring the best healthcare to people with Medicaid benefits. They are seeking a Principal Data Scientist to own clinical data quality and bring senior ML/AI and health economics judgment to their core data science products.

Responsibilities

  • Own clinical data quality across claims, EHR, and ADT: Define standards for how clinical data is structured, normalized, and validated as modeling inputs across payer claims (medical, pharmacy, eligibility), EHR data (Epic, Cerner, Athena), and real-time ADT feeds. Bring deep familiarity with EHR data formats (FHIR, HL7, C-CDA) and with how data from these EHR systems maps to clinical reality. Hold the bar for clinical accuracy and completeness across all three sources
  • Build and ship production ML/AI models: Develop, validate, and deploy risk stratification, care gap prediction, treatment effect estimation, and LLM/foundation model applications — with rigor around leakage, calibration, fairness, and clinical face validity
  • Apply health economics and outcomes methods: Translate raw clinical and claims data into decision-grade evidence through risk adjustment, utilization measurement, cost attribution, quasi-experimental evaluation, and outcomes measurement aligned with CMS, NCQA, and MCO reporting standards
  • Advance machine learning and AI products: Bring senior modeling judgment to the product roadmap, owning the clinical and methodological soundness of what ships
  • Set standards and mentor: Make architectural trade-offs, drive alignment across data science, engineering, product, and clinical stakeholders, and mentor junior data scientists to raise the technical bar of the team

Skills

  • Healthcare Data Expertise: Deep, hands-on fluency with claims, EHR, and ADT data, and strong command of clinical terminologies (ICD-10, SNOMED CT, LOINC, RxNorm, CPT/HCPCS) and value set curation
  • Standards Fluency: Working experience with healthcare data standards and exchange formats — FHIR, HL7v2, and C-CDA
  • Education: Master's degree in Data Science, Biostatistics, Health Informatics, Computer Science, or a related field
  • Python Proficiency: 7-8+ years of hands-on experience in Python, including data science and ML libraries
  • Applied ML/AI Experience: Demonstrated ability to build, validate, and deploy production ML models on healthcare data, with end-to-end ownership from development through deployment and maintenance in a live environment. Experience with ML pipelines, model versioning, and reproducible workflows at scale
  • Project Ownership: Proven ability to manage complex technical projects independently, align multiple stakeholders, and deliver on timelines
  • PhD in health informatics, statistics, data science, or computer science
  • Experience integrating EHR/HIE data via TEFCA, CommonWell, or comparable networks
  • Health Economics & Outcomes Methods: Experience with risk adjustment, utilization and cost measurement, and quasi-experimental evaluation
  • Familiarity with MLOps best practices including experiment tracking and model registry (e.g. MLflow), CI/CD for ML pipelines, feature stores, and workflow orchestration tools such as SageMaker Pipelines
  • Prior experience building on Medicaid or dual-eligible populations
  • Peer-reviewed publications in healthcare ML, AI, biostatistics, or health economics

Benefits

  • Stock Options: Opportunity to invest in the company’s growth.
  • Work-from-Home Stipend: A dedicated stipend for your first year to help set up your home office.
  • Medical, Vision, and Dental Coverage: Comprehensive plans to keep you and your family healthy.
  • Life Insurance:
