Sr. Site Reliability Engineer

Mondo
Herndon, VA
Apply now: Sr. Site Reliability Engineer. The location is Remote (U.S.-based), and the start date is ASAP for this Contract (12 months) position.

Job Title: Sr. Site Reliability Engineer
Location-Type: Remote (U.S.-based)
Start Date Is: ASAP
Duration: Contract (12 months)
Compensation Range: $70/hr - $80/hr W2

Job Description:

Own and operate mission-critical cloud services supporting highly secure, sovereign cloud environments. This role focuses on ensuring uptime, reliability, and performance of enterprise-scale platform services by driving automation, monitoring, incident response, and continuous improvement across CloudFoundry-based infrastructure.

This position is ideal for a hands-on reliability engineer who enjoys deep technical ownership, solving complex production problems, and working in a structured, security-first environment where stability and precision matter.

Day-to-Day Responsibilities:

  • Own reliability and operational health of production platform services in a globally distributed environment.
  • Deploy upgrades, patches, and enhancements following detailed release documentation and strict change management processes.
  • Monitor system performance, investigate anomalies, and proactively prevent downtime and performance degradation.
  • Lead incident response for production issues, including triage, customer communication, resolution, and root cause analysis.
  • Build and maintain CI/CD pipelines and automation to support repeatable, secure deployments.
  • Troubleshoot distributed systems running on CloudFoundry, Kafka, and Kubernetes.
  • Maintain and improve observability frameworks using tools such as Prometheus and Grafana.
  • Partner with product and engineering teams to improve service reliability, scalability, and resilience.
  • Develop and maintain automation for testing, deployment, and infrastructure management.
  • Ensure compliance with security, audit, and sovereign cloud requirements.
  • Participate in on-call rotation supporting high-severity production incidents.
  • Document system architecture, operational processes, and incident learnings.
  • Mentor junior engineers and help elevate operational best practices across the team.
  • Operate as a high-ownership engineer in a low-structure environment where services are owned end-to-end.

Requirements:

Must-Haves:

  • 8 years of experience in Site Reliability Engineering, Platform Engineering, or Production Infrastructure roles.
  • Strong hands-on experience with CloudFoundry (or Pivotal Cloud Foundry).
  • Strong experience with Kafka and distributed messaging systems.
  • Experience supporting or troubleshooting ZooKeeper-managed Kafka clusters.
  • Deep experience building and managing CI/CD pipelines, ideally using Concourse.
  • Strong Kubernetes experience including troubleshooting, operations, and optimization.
  • Strong Linux systems administration experience (SUSE, Ubuntu, or similar).
  • Experience building automation using scripting or infrastructure-as-code tools.
  • Strong Git experience and source control best practices.
  • Experience supporting production systems with on-call or incident response responsibilities.
  • Experience working in highly secure, regulated, or enterprise environments.
  • Strong troubleshooting mindset across application, infrastructure, and networking layers.
  • Strong communication skills and ability to operate cross-functionally.
  • Bachelor's degree in Computer Science, Information Systems, or equivalent experience.

Nice-to-Haves:

  • Experience working with SAP products or SAP Business Technology Platform.
  • Experience supporting sovereign cloud or government-adjacent environments.
  • Experience with AWS services such as EC2, S3, IAM, VPC, CloudWatch, Route53, or RDS.
  • Experience with Terraform, Jenkins, Chef, or other automation tooling.
  • Experience supporting multi-tenant SaaS environments.
  • Experience working with enterprise monitoring and observability frameworks.
  • Experience supporting large-scale enterprise or global production environments.
Posted 2026-02-09
