Role: Data Engineer
Location: McLean, VA (hybrid)
Duration: Contract
Tech stack: Python, PySpark, Node.js, Vue.js, AWS (EMR, Glue, Lambda, S3, SNS, DynamoDB), Databricks
Key Responsibilities
- Design, build, test, deploy, and maintain end-to-end data pipelines (batch, micro batch, and streaming when needed) using PySpark, Python, and other tools.
- Use AWS services (EMR, Glue, Lambda, S3, SNS, DynamoDB, etc.) to orchestrate and manage data workflows.
- Integrate with Databricks environments as needed (e.g. for Spark workloads, notebooks, jobs).
- Ensure data ingestion from diverse sources (structured, semi-structured, unstructured) into data lake / warehouse layers.
- Optimize pipeline performance, reliability, scalability, and cost (e.g. tuning Spark jobs, partitioning strategies, caching, resource sizing).
- Implement data transformations, aggregations, and joins, ensuring correctness of logic and performance.
- Build monitoring, alerting, and logging for data pipelines and data platform components (e.g. CloudWatch, CloudTrail, custom dashboards).
- Ensure data quality, lineage, and accountability (validation, schema enforcement, error handling, retries).
- Collaborate with data scientists, BI/analytics teams, and application developers to understand data requirements and deliver usable datasets.
- Maintain and evolve data models, schemas, and metadata (e.g. catalog, data dictionary).
- Assist in architecture and design reviews; propose improvements and modernization (e.g. adopt new patterns, services, best practices).
- Participate in code reviews, documentation, and team knowledge sharing.
- Help with migrations, refactoring legacy ETL systems into cloud-native architecture when needed.
- Support deployment and CI/CD of data assets (infrastructure as code, versioning, automation).
- Troubleshoot production issues and performance bottlenecks, ensuring high availability and SLAs.
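The data-quality bullets above (validation, schema enforcement, error handling, retries) can be sketched in plain Python; the field names, types, and three-attempt retry policy below are illustrative assumptions, not requirements from this posting:

```python
# Minimal sketch of schema enforcement plus retry-on-transient-failure,
# with no Spark dependency. EXPECTED_SCHEMA and the retry count are
# hypothetical choices for illustration.

EXPECTED_SCHEMA = {"id": int, "event": str, "amount": float}

def validate_record(record: dict) -> bool:
    """Enforce a simple schema: required keys present with expected types."""
    return all(
        key in record and isinstance(record[key], expected_type)
        for key, expected_type in EXPECTED_SCHEMA.items()
    )

def ingest_with_retries(records, load_fn, max_retries=3):
    """Validate each record, then load it, retrying transient failures."""
    loaded, rejected = [], []
    for record in records:
        if not validate_record(record):
            rejected.append(record)          # route to a dead-letter store
            continue
        for attempt in range(max_retries):
            try:
                load_fn(record)
                loaded.append(record)
                break
            except RuntimeError:
                if attempt == max_retries - 1:
                    rejected.append(record)  # give up after the final retry
    return loaded, rejected
```

In a real pipeline the same split (valid records forward, invalid records to a dead-letter queue such as SNS/SQS) keeps bad input from silently corrupting downstream tables.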
Required Qualifications / Skills
- 3-7+ years of experience in a data engineering or software engineering role (or equivalent) with a strong focus on Big Data, ETL/ELT, and cloud infrastructure.
- Expert in Python and PySpark / Apache Spark for data processing.
- Hands-on experience with AWS data and compute services: EMR, Glue, Lambda, S3, SNS, DynamoDB (and optionally others like Athena, Kinesis, Step Functions).
- Experience integrating or working with Databricks (or similar managed Spark platform).
- Strong SQL skills and experience working with relational and NoSQL data systems.
- Experience designing efficient data partitioning, bucketing, join strategies, caching, and data-format optimization (e.g. Parquet, ORC, Delta).
- Ability to build and maintain data models / schemas, and understand star/snowflake, dimensional modeling, normalization/denormalization tradeoffs.
- Familiarity with orchestration and workflow tools (e.g. Airflow, AWS Step Functions, Glue workflows).
- Experience with CI/CD, version control (Git), and infrastructure-as-code (Terraform, CloudFormation).
- Good understanding of data governance, security, permissions, encryption, IAM, data lineage, auditing.
- Strong debugging, problem-solving, and performance tuning skills in distributed systems.
- Excellent communication skills and the ability to interact with stakeholders and cross-functional teams.
- Ability to work independently in a hybrid setup and manage deliverables under tight deadlines.
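The partitioning and data-format bullet above usually means Hive-style key layouts in S3, which engines like Spark, Glue, and Athena can use to prune partitions. A minimal sketch, assuming a hypothetical bucket and table name:

```python
from datetime import date

def partition_path(base: str, table: str, event_date: date) -> str:
    """Build a Hive-style partitioned key (year=/month=/day=) so query
    engines can skip irrelevant data on date-filtered scans."""
    return (
        f"{base}/{table}/"
        f"year={event_date.year}/"
        f"month={event_date.month:02d}/"
        f"day={event_date.day:02d}"
    )
```

Zero-padding month and day keeps lexicographic ordering of keys aligned with chronological ordering, which matters for range listings.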
Nice-to-have / Bonus Skills
- Experience with streaming platforms (Kafka, Kinesis, Pulsar, etc.)
- Familiarity with additional AWS services: Kinesis, EMR autoscaling, AWS Glue Catalog, Lake Formation, Athena
- Experience with Delta Lake, Iceberg, or similar lakehouse architectures
- Prior experience migrating from on-prem or legacy ETL to cloud-based architecture
- Experience with containerization (Docker) or orchestration (Kubernetes)
- Experience with data science / ML infrastructure (feature store, model inference pipelines)
- Experience with additional languages/frameworks (Node.js, Vue.js) in the context of data services or UI dashboards
- Familiarity with DevOps practices: observability, logging, metrics, alerting frameworks
- Prior government, defense, or federal contracting experience (security clearances, compliance)
Thanks & Regards,
YOGITA