Site Reliability Engineer

Tax Analysts
Falls Church, VA

Tax Analysts is seeking a Site Reliability Engineer (SRE) to help establish and shape our reliability engineering practice from the ground up. This is a unique opportunity to join a mission-driven organization and play a key role in ensuring the reliability, scalability, and performance of our AWS-hosted business applications.

As part of a cross-functional engineering team, you will work to improve observability, automate operational processes, and lead incident response and continuous improvement efforts. This role is ideal for a mid-level engineer with cloud and software engineering experience who is eager to deepen their expertise in site reliability engineering, learn from senior staff, and help build a culture of reliability.

ESSENTIAL DUTIES AND RESPONSIBILITIES:

  • Help define and implement service-level indicators (SLIs) and objectives (SLOs) for cloud-based applications.
  • Build, configure, and maintain monitoring, alerting, and dashboarding solutions using AWS CloudWatch, X-Ray, and third-party tools such as DataDome.
  • Leverage advanced AWS observability tools (e.g., CloudWatch Synthetics, Contributor Insights) to proactively monitor system health.
  • Contribute to the development and implementation of a structured on-call support process as our reliability practice evolves.
  • Implement, monitor, and maintain site protection and bot mitigation solutions, including DataDome, to defend against automated attacks and ensure application availability, and analyze their performance during incident postmortems.
  • Investigate and resolve incidents, security events, and operational anomalies; perform root cause analysis; and run the postmortem process.
  • Identify repetitive or manual operational tasks (‘toil’) and design scripts or automations using AWS Lambda and CloudFormation to improve efficiency and reliability.
  • Assist in the maintenance and enhancement of CI/CD pipelines and automated deployment processes.
  • Work closely with development, QA, cloud, and DevOps teams to ensure reliability, scalability, and security are integrated into system and application designs.
  • Contribute to the documentation of systems, processes, incident learnings, compliance, and reliability best practices.
  • Stay current with emerging AWS, SRE, and observability technologies, and make recommendations to adopt new tools or approaches that improve system resilience and operational excellence.
  • Participate in the evaluation and rollout of new AWS services and features that can benefit system reliability or team efficiency.
  • Perform other related duties as assigned to support the team and organizational objectives.

KNOWLEDGE & SKILLS:

  • Strong analytical, troubleshooting, and problem-solving abilities.
  • Hands-on experience with AWS CloudWatch (metrics, logs, dashboards, alarms) for proactive monitoring and alerting.
  • Familiarity with AWS X-Ray for distributed tracing and in-depth troubleshooting of microservices architectures.
  • Experience leveraging tools like CloudWatch Synthetics and Contributor Insights for canary testing and log analytics.
  • Knowledge of AWS CloudTrail for auditing and investigating API calls and security events.
  • Experience using AWS Athena for ad-hoc querying and analysis of logs during incident investigations and postmortems.
  • Proficiency with AWS CloudFormation for reliable and repeatable infrastructure provisioning.
  • Experience automating operational tasks and workflows using AWS Lambda or similar event-driven services.
  • Understanding of AWS services such as API Gateway, CloudFront, and Elastic Load Balancer (ELB) to ensure availability, scalability, and optimal performance of distributed systems.
  • Experience working with site protection and bot mitigation solutions (such as DataDome or Cloudflare).
  • Working knowledge of scripting or programming languages such as Python, Bash, or Node.js for automation and tooling.
  • Excellent communication and documentation skills; ability to collaborate effectively with cross-functional teams.
  • Eagerness to learn and adopt new tools, technologies, and best practices in cloud reliability and operations.

EDUCATION & EXPERIENCE:

  • Bachelor’s degree in computer science, engineering, or a related field; equivalent professional experience considered.
  • 3+ years of professional experience in cloud engineering, DevOps, infrastructure, or observability roles (AWS required).
  • Experience implementing SRE principles (prior work in an SRE role is a plus).
  • Experience with monitoring, incident response, or reliability work in a production environment.
  • Experience working in an Agile development environment, collaborating within cross-functional teams.
  • Eagerness to help establish and improve site reliability practices while learning and applying best practices.

BENEFITS:

  • Health/Dental/Vision
  • 401(k): immediately vested
  • Tuition assistance
  • Qualified employer under the Public Service Loan Forgiveness (PSLF) program
  • Generous Paid Time Off
  • Dog-friendly office
  • Private gym onsite
  • Medical, Dental, Vision Insurance
  • Health Savings Account (HSA)
  • Flexible Spending Account (FSA)
  • Employee Assistance Program (EAP)
  • Life and AD&D Insurance
  • Disability Insurance
  • Pet Insurance
  • Trade Publication/News Subscription Reimbursement
  • Exercise Room
  • Paid Holidays
  • Vacation and Sick Leave
  • Parental Leave

Tax Analysts is an Equal Employment Opportunity Employer.

Posted 2025-09-22
