This is a fully remote position, open to applicants in Brazil.

📋 Description

• Development of Predictive Models: Build propensity-to-default models using traditional ML techniques (binary classification and ensemble models such as Random Forest, Gradient Boosting, and equivalents), with a focus on precision, recall, and stability in production.

• Feature Selection and Engineering: Conduct exploratory data analysis (EDA), identify and select relevant variables from structured contractual data, and progressively incorporate external variables (weather, harvest, macroeconomic scenarios, bankruptcy recovery data, and news sources).

• Experimentation in Databricks: Develop and version modeling experiments using Databricks and MLflow, ensuring traceability of runs, parameters, metrics, and model artifacts throughout iterative cycles.

• Model Validation and Evaluation: Design and execute robust validation strategies (cross-validation, temporal backtesting, score stability analysis) to ensure that models perform reliably over increasing projection windows (3, 6, and 12 months).

• Iterative Improvement Cycles: Actively participate in iterative model refinement cycles: in each sprint, incorporate new variables, reassess performance, and document learnings in the program's Knowledge Base.

• Collaboration with Data Engineering: Work closely with Data Engineers to ensure that data pipelines correctly feed the models and that model outputs (scores, projections, alerts) are made available at the appropriate platform layers.

• Communication of Results: Translate technical model results into business language, supporting the Data Strategist in communicating with client stakeholders (superintendencies, credit, leadership).

• Technical Documentation: Document methodologies, modeling decisions, and outcomes in structured formats that contribute to the Knowledge Base and can be leveraged by AI agents in subsequent phases.

• Production Monitoring: Monitor model performance in production, identify distribution shifts (data drift, concept drift), and propose corrective actions or retraining.

⛳️ Requirements

• Solid experience in data science with a focus on predictive modeling for production business problems.

• Proven experience with classification and ensemble models (Random Forest, Gradient Boosting, XGBoost, or equivalents) in credit, risk, or anomaly detection contexts.

• Experience with Databricks for model development, experimentation, and versioning (Delta Lake, MLflow, Spark MLlib, or equivalent libraries in a distributed environment).

• Strong expertise in feature selection, handling imbalanced data, and temporal validation strategies for risk models.

• Experience in data analysis and modeling within the AWS ecosystem (S3, Athena, SageMaker, or equivalent managed ML cloud services).

• Ability to communicate model results and limitations to non-technical audiences clearly and with a focus on business decision-making.

• Experience in the financial services sector (credit, risk, delinquency, or similar).

🏝️ Benefits

• Health and dental insurance;

• Meal and food vouchers;

• Childcare assistance;

• Extended parental leave;

• Partnerships with gyms and health and wellness professionals via Wellhub (Gympass) and TotalPass;

• Profit-sharing (PLR);

• Life insurance;

• Continuous learning platform (CI&T University);

• Discount club;

• Free online platform dedicated to promoting physical and mental health and well-being;

• Responsible parenting and pregnancy course;

• Partnerships with online course platforms;

• Language learning platform;

• And many more.

Data Scientist
