[Remote] Data Research Engineer
Note: This is a remote position open to candidates in the USA. Microsoft is seeking Data Research Engineers to join its Multimodal team, which focuses on building next-generation foundation models across various domains. The role involves designing and curating high-quality datasets to enhance AI models, and collaborating with diverse teams to uphold data quality and ethical standards.
Responsibilities
- Create high-quality datasets for training and evaluation; run experiments on new datasets (data ablations) to assess their impact and determine the most effective data
- Develop and maintain scalable data pipelines for multimodal ingestion, preprocessing, filtering, and annotation
- Analyze real-world multimodal datasets to assess quality, diversity, and relevance, and to identify areas for improvement
- Build lightweight tools and workflows for dataset auditing, visualization, and versioning
- Collaborate with Safety, Ethics, and Governance teams to ensure datasets meet standards for quality, privacy, and responsible AI practices
- Embody our culture and values
Skills
- Bachelor's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR equivalent experience
- Master's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 8+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.)
- OR Bachelor's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 12+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.)
- OR equivalent experience
- 2+ years of experience in data analysis or data engineering, including work with large-scale datasets that are unstructured or semi-structured
- Proficiency in statistics and exploratory data analysis methods
- Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam
- Ability to communicate technical findings clearly to research and product teams
Benefits
- Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
- Microsoft is an equal opportunity employer.
- If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.