[Remote] Senior Site Reliability Engineer

Remote Full-time Hiring now

Note: This is a remote position open to candidates in the USA. OfficeSpace Software is a leading provider of an AI operating system for the built world, focusing on performance and reliability in workplace environments. The company is seeking a Senior Site Reliability Engineer to improve the performance, reliability, and cost efficiency of its production platform while transitioning to AI-assisted reliability engineering.

Responsibilities

  • Drive measurable improvements in latency, throughput, and availability across a large-scale production environment
  • Own system performance—from Linux internals to Kubernetes scheduling—and eliminate bottlenecks before customers feel them
  • Define and enforce SLIs, SLOs, and error budgets that balance speed, reliability, and growth
  • Partner with application engineers to profile code paths, improve execution efficiency, and harden services under real load
  • Lead database performance optimization across queries, indexing, replication, and workload isolation
  • Design and oversee AI-assisted load testing, stress testing, and capacity planning workflows
  • Guide the migration from monolithic deployments to multi-tenant Kubernetes platforms
  • Reduce infrastructure spend through architectural decisions, right-sizing, and intelligent scaling strategies
  • Build and supervise automation for infrastructure provisioning, configuration management, and observability
  • Set clear operational standards for reliability, performance, and incident response—and raise the bar for how we run production

Skills

  • 7+ years operating and evolving large-scale production systems
  • Deep Linux systems expertise with hands-on performance tuning across CPU, memory, disk, and networking
  • Strong Python skills for automation, tooling, and AI-assisted systems workflows
  • Production experience with Ruby/Rails ecosystems, including Puma and Sidekiq
  • Proven ability to diagnose and resolve complex database performance issues (MySQL/MariaDB or PostgreSQL)
  • Advanced Kubernetes experience—workload sizing, scheduling, and multi-tenant operations
  • Infrastructure-as-code mastery using Terraform and Terragrunt
  • Experience with configuration management tools such as Puppet or Ansible
  • Strong observability instincts across metrics, logs, and traces using tools like Prometheus, Grafana, Datadog, or ELK
  • AI fluency—comfortable supervising AI agents for analysis, testing, and reporting, and validating their outputs
  • A builder mindset. You move fast, take ownership, and raise standards
  • Scaling and refactoring monolithic applications under real production load
  • Extracting databases or stateful components from monoliths
  • Apache and Nginx tuning at scale
  • Redis performance optimization and operational management
  • CI/CD systems and GitOps workflows, including ArgoCD
  • Cloud cost optimization and FinOps-aligned operational practices

Benefits

  • Competitive benefits and rewards: comprehensive benefits packages globally, designed to support our team’s health, well-being, and financial security

Company Overview

  • OfficeSpace Software is the leading AI-powered workplace management platform that helps organizations plan, connect, and perform at scale. Founded in 2006, it is headquartered in Alpharetta, Georgia, USA, and has 201-500 employees. Its website is https://www.officespacesoftware.com.

Company H1B Sponsorship

  • OfficeSpace Software has a track record of offering H1B sponsorships, with one sponsorship in 2022. Please note that this does not guarantee sponsorship for this specific role.