
Senior Data Scientist
Posted 2 days ago

This is a fully remote position, open to applicants in Pennsylvania.
• Defining and architecting NLP/NLU applications that enhance the clinical interpretation of medical documents and optimize healthcare workflows.
• Converting business and clinical needs into mathematical and experimental benchmarks, ensuring that models are quantifiable, dependable, and compliant with auditing standards.
• Processing, aligning, and enriching a variety of structured and unstructured medical datasets for deep learning, LLMs, and agentic AI systems.
• Creating and assessing deep learning models, including transformer-based architectures and other neural components such as CNNs and attention mechanisms, for identifying linguistic patterns in clinical text.
• Recognizing and addressing model vulnerabilities, including biases, drift, hallucination behaviors, and issues related to robustness.
• Designing experiments and error analyses that elucidate model behavior and facilitate effective communication with both technical and non-technical stakeholders.
• Deploying models to Solventum’s cloud infrastructure, collaborating with ML engineering to ensure reliability, observability, and regulatory compliance in healthcare systems.
• Keeping abreast of the latest research in NLP/LLMs, assessing new methodologies, and pinpointing opportunities to enhance Solventum’s deep learning projects.
• Master's degree or PhD in computer science, mathematics, or related fields, or a Bachelor's degree with a minimum of 5 years of IT experience.
• Strong proficiency in Python, particularly in deep learning for text analysis, and familiarity with libraries such as PyTorch and Transformers.
• Solid understanding of statistics and exploratory data analysis.
• US citizenship or permanent residency is required.
• Experience in research-driven NLP/NLU projects involving representation learning, attention mechanisms, or hybrid neural architectures.
• Capability to self-manage across various technical and business environments, effectively communicating complex findings with clarity and confidence.
• Experience in extracting insights from intricate clinical datasets and presenting these findings to diverse audiences.
• Familiarity with AWS, GitHub, CI/CD, and scalable ML deployment practices.
• Practical experience with LLMs, prompting, fine-tuning, or agentic AI frameworks.
• Experience with ETL of large-scale text data using tools like PySpark, Spark NLP, or other distributed data frameworks.
• Exposure to clinical coding systems or medical terminologies (e.g., ICD, CPT, SNOMED) is advantageous.
• Medical, Dental & Vision
• Health Savings Accounts
• Health Care & Dependent Care Flexible Spending Accounts
• Disability Benefits
• Life Insurance
• Voluntary Benefits
• Paid Absences
• Retirement Benefits