Remotery

Senior Data Engineer – Design, Architecture

at K2United · United States · Full-time · Data Engineer · Senior · $120k – $140k/year

Posted Jun 19

This is a fully remote position, open to applicants in the United States.

📋 Description

• The Senior Data Engineer will lead the data engineering efforts on K2Share's Federal Team, collaborating with technical and product leadership to create data products that facilitate mission-critical decision-making for federal agency clients.

• Develop and construct relational data layers that manage OSCAL and other structured compliance data, including ingestion, validation, transformation, and export workflows that maintain fidelity to source schemas throughout the entire data lifecycle.

• Create and uphold data models that support governance, risk, compliance, scoring, and reporting processes for federal cybersecurity programs, utilizing OSCAL as the key connection across them, and including long-term retention and archival policies that comply with federal recordkeeping and audit mandates.

• Design and implement big-data processing pipelines on Databricks (PySpark, Delta Lake, Unity Catalog) that standardize cybersecurity data from various federal agency environments and generate analytical layers for trend analysis, executive reporting, and cross-program insights.

• Enhance data systems for performance and cost efficiency by identifying I/O and compute bottlenecks, responsibly scaling compute resources, and balancing throughput against the cost discipline required for federal engagements.

• Architect, build, and sustain AWS data infrastructure that complies with federal security and operational standards, working with services such as S3, Bedrock, Lambda, Fargate, and EC2 to support compliance and analytical workloads.

• Design and implement audit-ready data primitives—such as change capture, access controls, validation, and lineage—that meet agency reporting and continuous monitoring requirements.

• Spearhead AI-first development and responsible AI deployment within the data team, utilizing AI development tools as a standard component of the engineering process, prototyping AI-assisted compliance workflows, and designing the production AI systems that support them (RAG architectures, vector store management, conversational agents, prompt and output guardrails, and evaluation pipelines), in accordance with federal AI governance guidance (OMB, NIST AI RMF).

• Collaborate with federal agency stakeholders and internal teams throughout requirement discovery, delivery, and ongoing support—translating compliance needs into data products and incorporating customer feedback into improvements.


⛳️ Requirements

• Over 5 years of production data engineering experience, demonstrating a history of designing and managing data systems from start to finish.

• Extensive knowledge of relational databases, particularly PostgreSQL or similar, encompassing schema design at scale, indexing and partitioning strategies, access control, and methods for handling semi-structured data like JSON.

• Strong instincts in system design and architecture, capable of converting business and compliance requirements into data system designs, documenting trade-offs, and leading design reviews with both technical and non-technical stakeholders.

• More than 3 years of experience in big-data processing on Databricks, Spark, or similar technologies, including PySpark, Delta Lake, Unity Catalog, and medallion (bronze/silver/gold) architecture patterns.

• Solid AWS experience, including S3, Bedrock, Lambda, Fargate, EC2, relational database services, change-data-capture services, serverless computing, IAM, and KMS, preferably in GovCloud or other regulated cloud environments.

• Proficient in Python development, particularly with data manipulation libraries such as pandas, for ETL, transformation, and analytical workflows.

• Familiar with Git and modern version-control practices, including branching strategies, code review discipline, and collaborative workflows in a team environment.

• Experienced in working with structured external schemas, such as OSCAL or similar standards-based data, with a commitment to maintaining fidelity during transformation.

• Proven focus on optimizing data systems, identifying I/O bottlenecks, resolving performance issues, balancing cost and performance, and responsibly scaling compute resources.

• Knowledge of schema evolution strategies, including migration planning, backward compatibility, change-data-capture-friendly design, and the operational rigor of managing production schemas under change control.

• Experience with asynchronous data processing patterns, such as task queues, message-based pipelines, and idempotent task design.

• Regular use of AI development tools as an integral part of the engineering workflow, with well-informed perspectives on their benefits and areas that require oversight.

• Familiarity with responsible AI deployment patterns, including RAG architectures, vector databases, embedding management, prompt and output guardrails, and evaluation methodologies.

• Working knowledge of federal cybersecurity frameworks such as FISMA, NIST RMF, NIST SP 800-53, and NIST CSF.

• Demonstrated ability to interpret regulatory and policy guidance and translate it into product or data-product requirements.

• Comfortable navigating ambiguous, fast-paced federal programs with minimal supervision, along with strong collaborative instincts.


🏝️ Benefits

• 401(k) plan with employer matching contributions.

• Comprehensive medical benefits for employees and their families at a low cost.

• Flexible time off for jury duty, voting, military leave, and similar obligations.

• Paid time off.

• Wellness stipend program, which includes a fitness reimbursement initiative.

• Tuition stipend.

• Casual dress work environment.

• Access to technical training and certifications as needed.

• Free access to any of our CareerSafe Online training courses for employees and their immediate family.
