Site Reliability Engineer

Tax Analysts
Falls Church, VA

Tax Analysts is seeking a Site Reliability Engineer (SRE) to help establish and shape our reliability engineering practice from the ground up. This is a unique opportunity to join a mission-driven organization and play a key role in ensuring the reliability, scalability, and performance of our AWS-hosted business applications.

As part of a cross-functional engineering team, you will work to improve observability, automate operational processes, and lead incident response and continuous improvement efforts. This role is ideal for a mid-level engineer with cloud and software engineering experience who is eager to deepen their expertise in site reliability engineering, learn from senior staff, and help build a culture of reliability.

ESSENTIAL DUTIES AND RESPONSIBILITIES:

  • Help define and implement service-level indicators (SLIs) and objectives (SLOs) for cloud-based applications.
  • Build, configure, and maintain monitoring, alerting, and dashboarding solutions using AWS CloudWatch, X-Ray, and third-party tools such as DataDome.
  • Leverage advanced AWS observability tools (e.g., CloudWatch Synthetics, Contributor Insights) to proactively monitor system health.
  • Contribute to the development and implementation of a structured on-call support process as our reliability practice evolves.
  • Implement, monitor, and maintain site protection and bot mitigation solutions, including DataDome, to defend against automated attacks and ensure application availability, and analyze their performance during incident postmortems.
  • Investigate and resolve incidents, security events, and operational anomalies; perform root cause analysis and run the postmortem process.
  • Identify repetitive or manual operational tasks (‘toil’) and design scripts or automations using AWS Lambda and CloudFormation to improve efficiency and reliability.
  • Assist in the maintenance and enhancement of CI/CD pipelines and automated deployment processes.
  • Work closely with development, QA, cloud, and DevOps teams to ensure reliability, scalability, and security are integrated into system and application designs.
  • Contribute to the documentation of systems, processes, incident learnings, compliance, and reliability best practices.
  • Stay current with emerging AWS, SRE, and observability technologies, and make recommendations to adopt new tools or approaches that improve system resilience and operational excellence.
  • Participate in the evaluation and rollout of new AWS services and features that can benefit system reliability or team efficiency.
  • Perform other related duties as assigned to support the team and organizational objectives.

KNOWLEDGE & SKILLS:

  • Strong analytical, troubleshooting, and problem-solving abilities.
  • Hands-on experience with AWS CloudWatch (metrics, logs, dashboards, alarms) for proactive monitoring and alerting.
  • Familiarity with AWS X-Ray for distributed tracing and in-depth troubleshooting of microservices architectures.
  • Experience leveraging tools like CloudWatch Synthetics and Contributor Insights for canary testing and log analytics.
  • Knowledge of AWS CloudTrail for auditing and investigating API calls and security events.
  • Experience using AWS Athena for ad-hoc querying and analysis of logs during incident investigations and postmortems.
  • Proficiency with AWS CloudFormation for reliable and repeatable infrastructure provisioning.
  • Experience automating operational tasks and workflows using AWS Lambda or similar event-driven services.
  • Understanding of AWS services such as API Gateway, CloudFront, and Elastic Load Balancing (ELB) to ensure availability, scalability, and optimal performance of distributed systems.
  • Experience working with site protection and bot mitigation solutions (such as DataDome or Cloudflare).
  • Working knowledge of scripting or programming languages such as Python, Bash, or Node.js for automation and tooling.
  • Excellent communication and documentation skills; ability to collaborate effectively with cross-functional teams.
  • Eagerness to learn and adopt new tools, technologies, and best practices in cloud reliability and operations.

EDUCATION & EXPERIENCE:

  • Bachelor’s degree in computer science, engineering, or a related field; equivalent professional experience considered.
  • 3+ years of professional experience in cloud engineering, DevOps, infrastructure, or observability roles (AWS required).
  • Experience implementing SRE principles (prior work in an SRE role is a plus).
  • Experience with monitoring, incident response, or reliability work in a production environment.
  • Experience working in an Agile development environment, collaborating within cross-functional teams.
  • Eagerness to help establish and improve site reliability practices while learning and applying best practices.

BENEFITS:

  • Medical, Dental, and Vision Insurance
  • 401(k): immediately vested
  • Tuition Assistance
  • Qualified employer under the Public Service Loan Forgiveness (PSLF) program
  • Generous Paid Time Off
  • Dog-friendly office
  • Private onsite gym
  • Health Savings Account (HSA)
  • Flexible Spending Account (FSA)
  • Employee Assistance Program (EAP)
  • Life and AD&D Insurance
  • Disability Insurance
  • Pet Insurance
  • Trade Publication/News Subscription Reimbursement
  • Paid Holidays
  • Vacation and Sick Leave
  • Parental Leave

Tax Analysts is an Equal Employment Opportunity Employer.

Posted 2025-11-25
