[Remote] Senior Data Engineer

Remote | Full-time | Hiring now

Note: This is a remote position open to candidates in the USA. Effectual is seeking a Senior Data Engineer with specialized expertise in data streaming technologies to join its data team. The role focuses on building and maintaining high-performance data streaming architectures that enable real-time data processing and analytics.

Responsibilities

  • Design, build, and maintain scalable streaming data architectures using Kafka, MSK, and Kinesis
  • Develop real-time data pipelines that handle high-volume, high-velocity data streams
  • Implement event-driven architectures and microservices patterns for streaming data processing
  • Create and optimize data streaming topologies for complex event processing scenarios
  • Design fault-tolerant streaming systems with proper error handling and data recovery mechanisms
  • Configure, deploy, and manage Apache Kafka clusters and AWS MSK environments
  • Implement Kafka Connect pipelines for streaming data integration
  • Design optimal Kafka topic partitioning strategies and replication configurations
  • Monitor and optimize Kafka cluster performance, throughput, and latency
  • Implement Kafka security configurations including SSL/TLS, SASL, and ACLs
  • Manage Kafka Schema Registry for data serialization and evolution
  • Design and implement Amazon Kinesis Data Streams and Kinesis Data Firehose solutions
  • Configure Kinesis Analytics applications for real-time stream processing
  • Optimize Kinesis shard management and auto-scaling configurations
  • Implement Kinesis data retention and archival strategies
  • Integrate Kinesis with other AWS services for comprehensive streaming solutions
  • Develop real-time stream processing applications using Apache Spark Streaming, Kafka Streams, or AWS Lambda
  • Implement complex event processing (CEP) patterns for real-time analytics
  • Build streaming ETL pipelines that transform data in motion
  • Create real-time aggregations, windowing operations, and stateful stream processing
  • Optimize streaming query performance and resource utilization
  • Ensure seamless integration between streaming systems and data lakes, data warehouses, and operational databases
  • Implement data lineage and monitoring for streaming data pipelines
  • Create automated data quality checks and validation for streaming data
  • Manage data serialization formats (Avro, JSON, Protobuf) and schema evolution
  • Coordinate with data scientists and analysts to ensure streaming data meets analytical requirements
  • Implement Infrastructure as Code (IaC) for streaming data platforms using Terraform or CloudFormation
  • Automate deployment and management of streaming infrastructure through CI/CD pipelines
  • Monitor streaming system health, performance metrics, and alerting
  • Implement disaster recovery and high availability strategies for streaming systems
  • Stay current with emerging trends in streaming technologies and cloud-native solutions
  • Collaborate with data architects, data scientists, and application teams on streaming data requirements
  • Support rigorous project governance through daily progress reviews and time tracking
  • Provide technical leadership and mentorship to junior data engineers
  • Communicate complex streaming concepts to technical and non-technical stakeholders
  • Operate with transparency and responsiveness to support high-performing teams

Skills

  • 7+ years of experience in the data engineering field with significant streaming data specialization
  • Bachelor's degree in Computer Science, Engineering, or related STEM field
  • Extensive hands-on experience with Apache Kafka including cluster management, performance tuning, and ecosystem tools
  • Proven experience with AWS MSK and Amazon Kinesis services in production environments
  • Strong background in real-time data processing and stream analytics
  • Streaming Technologies: Apache Kafka, Kafka Connect, Kafka Streams, Amazon MSK, Amazon Kinesis (Data Streams, Data Firehose, Analytics)
  • Programming Languages: Proficient in Python, Java, and Scala for streaming applications
  • Stream Processing Frameworks: Apache Spark Streaming, Apache Flink, AWS Lambda for stream processing
  • Data Serialization: Experience with Avro, Protocol Buffers, JSON, and schema registry management
  • Big Data Technologies: Hadoop ecosystem, Apache Spark, distributed computing concepts
  • Database Technologies: SQL and NoSQL databases, data warehousing solutions, time-series databases
  • AWS Services: Deep knowledge of AWS streaming and analytics services (MSK, Kinesis, Lambda, EMR, Glue)
  • Containerization: Docker and Kubernetes for streaming application deployment
  • Infrastructure as Code: Terraform, CloudFormation for streaming infrastructure automation
  • Monitoring: CloudWatch, Prometheus, Grafana for streaming system observability
  • Security: Implementation of streaming data security, encryption, and access controls
  • Expert use of code versioning tools such as GitHub
  • Expert knowledge of Agile methodologies and delivery practices
  • Experience with CI/CD pipelines for streaming data applications
  • Understanding of data APIs, REST services, and microservices architectures
  • Leadership & Team Management
  • Risk Management and mitigation strategies for streaming systems
  • Conflict Resolution
  • Strategic Planning & Leadership for data streaming initiatives
  • Resource Management and capacity planning
  • Change Management for streaming technology adoption
  • Core AWS Certifications: AWS Data Engineer Associate (required)
  • AWS Solutions Architect Professional (preferred)
  • AWS Developer Professional (recommended)
  • Confluent Certified Administrator for Apache Kafka (highly recommended)
  • Confluent Certified Developer for Apache Kafka (preferred)
  • AWS Big Data Specialty (if available in current form)
  • AWS Certified Security - Specialty
  • Certified Associate Data Analyst with Python
  • Certified Professional Python Programmer Level 1
  • Databricks Data Engineer Professional
  • Certified Associate Python Programmer
  • Java or Scala certification (Oracle Certified Professional)
  • Experience with Apache Flink for advanced stream processing
  • Knowledge of Apache Pulsar as an alternative messaging system
  • Experience with event sourcing and CQRS patterns
  • Understanding of Apache Airflow for batch and streaming workflow orchestration
  • Experience with ksqlDB for stream processing using SQL
  • Background in financial services, IoT, or other real-time data intensive industries
  • Experience with multi-cloud streaming architectures
  • Knowledge of Apache NiFi for data flow automation

Benefits

  • Medical, dental, and vision insurance
  • Short-term disability, long-term disability, and life insurance
  • 401(k) with company match
  • Paid time off (PTO): 120 hours, accrued over one year
  • Paid time off for major holidays (14 days per year)
  • These and any other employee benefit offerings are subject to management’s discretion and may change at any time.

Company Overview

  • Cloud service provider and AWS Premier Tier Services Partner; focus areas: generative and agentic AI, migration, and modernization
  • Founded in 2019 and headquartered in Jersey City, New Jersey, USA, with a workforce of 201-500 employees
  • Website: https://www.effectual.ai

Company H1B Sponsorship

  • Effectual has a track record of offering H1B sponsorships: 3 in 2023, 3 in 2022, and 2 in 2021. Please note that this does not guarantee sponsorship for this specific role.