[Remote] Senior Software Engineer, Big Data
Note: This is a remote position open to candidates in the USA. Zillow, a leading real estate platform in the U.S., is seeking a Senior Software Engineer, Big Data to enhance its real-time data platform. The role involves designing and evolving the streaming infrastructure that supports internal services, with a focus on scalability, reliability, and long-term sustainability.
Responsibilities
- Design, build, and operate large‑scale Kafka and Flink infrastructure supporting tier‑0 and tier‑1 workloads
- Lead critical initiatives in our streaming platform modernization, including platform architecture evolution
- Develop and enhance streaming control planes, APIs, CLIs, and provisioning systems that standardize how teams create and operate streaming resources across Zillow
- Improve platform reliability through SLO definition, monitoring, alerting, incident response, and automation
- Enable simplified stream processing patterns for product and engineering teams, reducing the need for bespoke infrastructure or specialized expertise
- Evaluate and integrate modern streaming ecosystem capabilities, including managed Kafka offerings, serverless stream processing, and real‑time AI integration patterns
- Make high‑quality architectural decisions under ambiguity, balancing reliability, cost, performance, and developer experience across competing priorities
- Mentor engineers and contribute to raising the bar on distributed systems design, operational excellence, and long‑term platform strategy
Skills
- 5+ years of experience building and operating large‑scale distributed systems, including independently owning critical production systems end to end
- Significant production experience with Kafka and/or Flink, including performance tuning, state management, scaling strategies, and operational incident resolution
- Proficiency in at least one programming language such as Python, Java, or Scala
- Experience operating services in cloud environments (for example, AWS) and working with container orchestration platforms like Kubernetes
- Experience designing scalable, multi‑tenant systems with reliability, cost efficiency, and observability in mind
- Experience defining and operating against SLOs, participating in on‑call rotations, and leading incident response efforts
- Familiarity with infrastructure‑as‑code tooling such as Terraform and CI/CD systems
- Strong systems design skills, including the ability to reason about consistency, state management, fault tolerance, and throughput
- Experience collaborating across platform and product teams to define boundaries, contracts, and integration patterns
- Experience working with streaming vendors (for example, Confluent, MSK, Redpanda) or modernizing legacy Kafka/Flink infrastructure
- Demonstrated experience leading system design efforts for complex, multi‑team platform initiatives
- Experience integrating streaming systems with analytics platforms such as Databricks or building real‑time context engineering capabilities for AI systems
- Background in reliability engineering or platform engineering
Benefits
- Equity awards based on factors such as experience, performance, and location