Principal Site Reliability Engineer

Clarity Innovations
Hampton, VA

Clarity Innovations is a trusted national security partner, dedicated to safeguarding our nation’s interests and delivering innovative solutions that empower the Intelligence Community (IC) and Department of Defense (DoD) to transform data into actionable intelligence, ensuring mission success in an evolving world.

Our mission-first software and data engineering platform modernizes data operations, utilizing advanced workflows, CI/CD, and secure DevSecOps practices. We focus on challenges in Information Warfare, Cyber Operations, Operational Security, and Data Structuring, enabling end-to-end solutions that drive operational impact.

We are committed to delivering cutting-edge tools and capabilities that address the most complex national security challenges, empowering our partners to stay ahead of emerging threats and ensuring the success of their critical missions. At Clarity, we are people-focused and set on being a destination employer for top talent, offering an environment where innovation thrives, careers grow, and individuals are valued. Join us as we continue to lead innovation and tackle the most pressing challenges in national security.

Position Overview:
In this role, you will ensure the availability, reliability, and performance of our multi-tenant microservices application suite. You will collaborate closely with cross-functional teams to troubleshoot issues, automate processes, and build scalable, resilient systems. You will learn the nuances of the entire KRADOS suite of applications and its infrastructure, which will support your two missions: 24/7/365 tier 2/3 outage response and improving the efficiency of KRADOS.

Key Responsibilities:


  • Monitor system health, define Service Level Indicators (SLIs), and ensure adherence to Service Level Objectives (SLOs).

  • Respond promptly to outages, conduct root cause analyses, and implement durable solutions to prevent recurrence.

  • Collaborate with development and DevOps teams to optimize and maintain Kubernetes environments and CI/CD pipelines.

  • Develop and refine automation scripts to enhance system reliability, including automated recovery and self-healing capabilities.

  • Build and maintain observability frameworks, integrating metrics, logging, and tracing tools for proactive issue identification.

  • Contribute to performance tuning and scalability improvements across the application stack.

  • Document incident responses and contribute to a knowledge base to foster a culture of continuous improvement.

  • Participate in an on-call rotation to provide 24/7/365 support for mission-critical systems.

Qualifications:


  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

  • 6 years of experience in site reliability, systems engineering, or DevOps roles.

  • Proficiency in one or more programming/scripting languages (e.g., Python, Go, Java, Bash).

  • Strong understanding of distributed systems, microservices architecture, and RESTful API design.

  • Hands-on experience with Kubernetes and container orchestration.

  • Familiarity with monitoring, alerting, and logging tools (e.g., Prometheus, Grafana, ELK stack, or Datadog). Experience with Elastic is especially helpful for this position.

  • Hands-on experience with incident response, including designing and improving incident management processes.

  • Expertise in Observability practices, including metrics, logs, traces, and understanding of distributed tracing tools (e.g., OpenTelemetry).

  • Strong problem-solving skills with a focus on building resilient, fault-tolerant systems.

  • Excellent communication skills and a collaborative mindset.

  • Must hold a SEC+ or higher certification, or be able to obtain one within six months of hire.

  • Must be willing to do shift work to provide 24/7/365 coverage.

  • Must be within a 45-minute drive of an IL6 workstation location (e.g., SIPR café, SCIF).

Preferred Qualifications:


  • Experience with cloud platforms (e.g., AWS) and their associated managed services.

  • Knowledge of database management and optimization for database systems (e.g., PostgreSQL).

  • Familiarity with Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation).

  • Master’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

We are an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Posted 2025-09-22
