Sr. Site Reliability Engineer
Job Title: Sr. Site Reliability Engineer
Location Type: Remote (U.S.-based)
Start Date: ASAP
Duration: Contract (12 months)
Compensation Range: $70/hr - $80/hr W2
Job Description:
Own and operate mission-critical cloud services supporting highly secure, sovereign cloud environments. This role focuses on ensuring uptime, reliability, and performance of enterprise-scale platform services by driving automation, monitoring, incident response, and continuous improvement across Cloud Foundry-based infrastructure.
This position is ideal for a hands-on reliability engineer who enjoys deep technical ownership, solving complex production problems, and working in a structured, security-first environment where stability and precision matter.
Day-to-Day Responsibilities:
- Own reliability and operational health of production platform services in a globally distributed environment.
- Deploy upgrades, patches, and enhancements following detailed release documentation and strict change management processes.
- Monitor system performance, investigate anomalies, and proactively prevent downtime and performance degradation.
- Lead incident response for production issues, including triage, customer communication, resolution, and root cause analysis.
- Build and maintain CI/CD pipelines and automation to support repeatable, secure deployments.
- Troubleshoot distributed systems running on Cloud Foundry, Kafka, and Kubernetes.
- Maintain and improve observability frameworks using tools such as Prometheus and Grafana.
- Partner with product and engineering teams to improve service reliability, scalability, and resilience.
- Develop and maintain automation for testing, deployment, and infrastructure management.
- Ensure compliance with security, audit, and sovereign cloud requirements.
- Participate in on-call rotation supporting high-severity production incidents.
- Document system architecture, operational processes, and incident learnings.
- Mentor junior engineers and help elevate operational best practices across the team.
- Operate as a high-ownership engineer in a low-structure environment where services are owned end-to-end.
Requirements:
Must-Haves:
- 8 years of experience in Site Reliability Engineering, Platform Engineering, or Production Infrastructure roles.
- Strong hands-on experience with Cloud Foundry (or Pivotal Cloud Foundry).
- Strong experience with Kafka and distributed messaging systems.
- Experience supporting or troubleshooting ZooKeeper-managed Kafka clusters.
- Deep experience building and managing CI/CD pipelines, ideally using Concourse.
- Strong Kubernetes experience including troubleshooting, operations, and optimization.
- Strong Linux systems administration experience (SUSE, Ubuntu, or similar).
- Experience building automation using scripting or infrastructure-as-code tools.
- Strong Git experience and source control best practices.
- Experience supporting production systems with on-call or incident response responsibilities.
- Experience working in highly secure, regulated, or enterprise environments.
- Strong troubleshooting mindset across application, infrastructure, and networking layers.
- Strong communication skills and ability to operate cross-functionally.
- Bachelor's degree in Computer Science, Information Systems, or equivalent experience.
Nice-to-Haves:
- Experience working with SAP products or SAP Business Technology Platform.
- Experience supporting sovereign cloud or government-adjacent environments.
- Experience with AWS services such as EC2, S3, IAM, VPC, CloudWatch, Route 53, or RDS.
- Experience with Terraform, Jenkins, Chef, or other automation tooling.
- Experience supporting multi-tenant SaaS environments.
- Experience working with enterprise monitoring and observability frameworks.
- Experience supporting large-scale enterprise or global production environments.