Senior Manager, Solutions Architecture

Deloitte
Richmond, VA
We are seeking an accomplished HPC/AI Platform Engineering Manager to lead the design, implementation, and optimization of advanced computing environments that power AI, ML, and LLM workloads. This role is ideal for a hands-on technologist with deep expertise in HPC systems, GPU-accelerated infrastructure, and large-scale AI deployments, combined with the leadership ability to drive fast-paced, innovative initiatives. You will collaborate with engineering, research, and business teams to define infrastructure strategy, assess emerging technologies, and deliver scalable, secure, and high-performance solutions. This role is pivotal in advancing generative AI, analytics, and model training capabilities through robust architecture, automation, and software integration.

Recruiting for this role ends on January 31, 2026.

Key Responsibilities

Architecture & Strategy
+ Design and implement HPC and AI infrastructure leveraging HPE Apollo, ProLiant, Cray, and similar enterprise-class systems.
+ Architect ultra-low-latency, high-throughput interconnect fabrics (InfiniBand NDR/800G, RoCEv2, 100-400 GbE) for large-scale GPU and HPC clusters.
+ Deploy and optimize cutting-edge NVIDIA GPU architectures (e.g., H100, H200, RTX PRO / Blackwell series, NVL-based systems).
+ Develop scalable hybrid HPC and cloud architectures across Azure, AWS, GCP, and on-prem environments.
+ Establish infrastructure blueprints supporting secure, high-throughput AI workloads.

AI/ML & LLM Platform Enablement
+ Build and manage AI/ML infrastructure to maximize performance and productivity of ML research teams.
+ Architect and optimize distributed training, storage, and scheduling systems for large GPU clusters.
+ Implement automation, observability, and operational frameworks to minimize manual intervention.
+ Deploy and manage GPU-accelerated Kubernetes clusters for AI and HPC workloads.
+ Integrate open-source GenAI components, including vector databases and AI/ML frameworks, for model serving and experimentation.
+ Identify and resolve performance and scalability bottlenecks across infrastructure layers.

Software Engineering & Integration
+ Develop and maintain automation tools and utilities in Python, Golang, and Bash.
+ Integrate HPC infrastructure with ML frameworks, container runtimes, and orchestration platforms.
+ Contribute to job scheduling, resource management, and telemetry components.
+ Build APIs and interfaces for workload submission, monitoring, and reporting across heterogeneous environments.

Containerization & Orchestration
+ Design Kubernetes and OpenShift architectures optimized for GPU and AI workloads.
+ Implement GPU scheduling, persistent storage, and high-speed networking configurations.
+ Collaborate with DevOps/MLOps teams to build CI/CD pipelines for containerized research and production environments.

Systems & Automation
+ Oversee Linux system architectures (RHEL, Ubuntu, OpenShift) with automation via Ansible and Terraform.
+ Implement monitoring and observability (e.g., Prometheus, Grafana, DCGM, and NVML).
+ Ensure system scalability, reliability, and security through proactive optimization.

Governance & Leadership
+ Ensure architecture and deployments comply with organizational and regulatory standards.
+ Conduct technical workshops, architecture reviews, and presentations for both technical and executive audiences.
+ Define and drive the infrastructure roadmap in partnership with business stakeholders.
+ Mentor and lead engineering teams, translating business requirements into actionable technical deliverables.
+ Foster innovation and cross-functional collaboration to accelerate AI/ML initiatives.

Required Qualifications
+ 10+ years of experience in HPC architecture, systems engineering, or platform design with a focus on architecting and operating on-premises Kubernetes for large-scale AI/ML workloads.
+ 3+ years of hands-on experience and proficiency with Linux, Python, Golang, and/or Bash.
+ 2+ years leading teams and/or processes.
+ 2+ years of recent experience working with GPU platforms (strong preference for NVIDIA), distributed systems, and performance optimization.
+ Ability to travel 0-10%, on average, based on the work you do and the customers you serve.
+ Must be a US Citizen.

Preferred Qualifications
+ Master's or Ph.D. in Computer Science, Electrical Engineering, or related discipline and work experience.
+ Demonstrated success supporting LLM training and inference workloads in both R&D and production environments.
+ Strong knowledge of high-performance networking, storage, and parallel computing frameworks.
+ Exceptional communication and leadership skills, capable of bridging technical depth with executive strategy.

The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Deloitte, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is $130,000 to $241,000.

You may also be eligible to participate in a discretionary annual incentive program, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.
Information for applicants with a need for accommodation:

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or protected veteran status, or any other legally protected basis, in accordance with applicable law.
Posted 2025-11-20
