
Data Engineer, MS
Posted 1 day ago

• Design and develop ETL pipelines that efficiently ingest, validate, and standardize data from various sources, ensuring high accuracy and performance.
• Create matching logic and implement fuzzy matching algorithms, using tools such as rapidfuzz, to identify and resolve data discrepancies across datasets.
• Build automated report generation systems that produce Excel and PDF outputs with complex formatting, formulas, and dynamic content.
• Execute data transformation workflows, including date normalization, field mapping, and scenario classification in accordance with business rules.
• Develop thorough unit tests using pytest and uphold code quality through version control (Git) and code review processes.
• Convert business requirements into technical specifications by documenting matching rules, reconciliation scenarios, and data workflows based on stakeholder discussions.
• Optimize database queries and data operations on relational databases (PostgreSQL, MySQL), using the SQLAlchemy ORM for efficient data access.
• Work collaboratively with cross-functional teams to address data issues, validate outputs, and consistently improve pipeline reliability and performance.
• Over 5 years of professional experience in Python development.
• Strong practical knowledge of pandas, openpyxl, and xlsxwriter for data manipulation and Excel generation.
• Demonstrated experience building and maintaining ETL pipelines and data matching logic.
• Comprehensive understanding of relational databases (PostgreSQL, MySQL) and SQL query optimization techniques.
• Familiarity with fuzzy matching methods and date/time normalization processes.
• Proficient in using pytest for unit testing and Git for version control.
• Ability to manage requirements independently, prioritize tasks, and work autonomously in a remote setting.
• Excellent communication abilities, with a talent for clearly documenting technical solutions and business rules.
Nice to Have:
• Experience with SQLAlchemy ORM and advanced database design concepts.
• Knowledge of Databricks or other cloud data platforms.
• Understanding of Tableau or similar business intelligence tools.
• Experience with data quality frameworks and validation techniques.
• Background in business analysis or requirements gathering.
• Exposure to CI/CD pipelines and automated testing frameworks.
• Flexible remote work options within a collaborative team environment.
• Opportunity to make a significant impact on data-driven solutions that assist organizations in making informed decisions.
• Professional development opportunities to enhance your skills in cloud platforms, advanced analytics, and system architecture.
• Competitive compensation based on experience.
• Full-time role offering long-term project stability.
• A collaborative culture where your technical skills and insights are recognized and valued.
• Chance to mentor junior developers and influence engineering best practices.