Scientific Analyst II
Data Analysis and Machine Learning Pipeline Development: Under moderate guidance, collaborate in the design, development, and execution of machine learning and AI-driven analytical pipelines to analyze large-scale biomedical datasets from UK Biobank, All of Us, Insight, and electronic medical records (EMR). Apply supervised and unsupervised machine learning algorithms (e.g., logistic regression, random forests, deep learning) to identify risk factors, biomarkers, and patterns associated with neurodegenerative diseases and the effects of menopausal hormone therapy (MHT) on brain health. Collaborate on the development and validation of predictive models integrating genomic, clinical, lifestyle, and imaging data, using general knowledge of principles, theories, and concepts.

Drug Repurposing Research and Bioinformatics Analysis: Collaborate in computational drug repurposing analyses to identify existing FDA-approved compounds with potential efficacy for the prevention and treatment of Alzheimer's disease (AD), Parkinson's disease (PD), multiple sclerosis (MS), and amyotrophic lateral sclerosis (ALS). Integrate multi-omics data (genomics, transcriptomics, proteomics) with clinical outcomes data to prioritize drug candidates. Collaborate with wet lab and clinical teams to support translational interpretation of findings.

Epidemiological and Clinical Data Management and Harmonization: Access, curate, harmonize, and manage large population-based datasets, including UK Biobank, All of Us, and institutional EMR data. Ensure data quality, reproducibility, and compliance with data use agreements and IRB protocols. Collaborate in the development and maintenance of reproducible data pipelines using Python, R, and high-performance computing. Perform statistical analyses, including survival analysis, longitudinal modeling, and causal inference.

Scientific Communication, Dissemination, and Collaboration: Prepare and contribute to peer-reviewed manuscripts, conference presentations, and grant applications reporting research findings on MHT, menopause, and neurodegenerative disease. Present results to interdisciplinary research teams, departmental seminars, and external stakeholders. Collaborate closely with Dr. Francesca Vitali, co-investigators, and consortium partners. Maintain thorough documentation of analytical methods to ensure transparency and reproducibility. Participate in lab meetings, journal clubs, and professional development activities.

Research Infrastructure and Continuous Improvement: Maintain and improve lab computational infrastructure, including code repositories (GitHub), analytical workflows, and documentation standards. Evaluate and adopt emerging AI/ML tools and methodologies relevant to brain science research. Assist in training junior lab members or graduate students on data science methods and tools as needed. Stay current with literature in neurodegenerative disease, computational.

Knowledge, Skills and Abilities: Strong theoretical and applied knowledge of machine learning, deep learning, and statistical modeling. Strong data wrangling and preprocessing skills for large, heterogeneous datasets. Expert-level programming skills in Python and/or R; proficiency with ML libraries (scikit-learn, TensorFlow, PyTorch, XGBoost). Knowledge of drug repurposing methodologies or network pharmacology. Knowledge of and familiarity with electronic medical records data analysis. Knowledge of and proficiency with SQL and database management. Ability to collaborate effectively within interdisciplinary teams spanning data science, neuroscience, clinical research, and epidemiology. Ability to manage multiple concurrent projects and meet deadlines. Ability to critically evaluate scientific literature and translate findings into research hypotheses and analytical strategies. Ability to communicate complex analytical results clearly to both technical and non-technical audiences.