SEND Coding Bootcamp
Project Scope |
---|
The 4-day SEND Coding Bootcamp aims to help those working with SEND datasets become more productive by teaching basic coding and plotting skills. Over a series of hands-on coding sessions, participants will learn the basics of programming and plotting using the R programming language. The course will be oriented around SEND datasets and will include reading, writing, plotting, and manipulating SEND datasets stored in .xpt format. Basic knowledge of the SEND standard is expected. No previous knowledge of programming or the R programming language is needed. |
Problem Statement |
---|
The CDISC-SEND data standard has created new opportunities for scientists to perform single-study and cross-study analyses of toxicology study data. However, SEND datasets are currently stored in xpt format (a Dataset-JSON format is in development), and without tools or programming knowledge, scientists still face barriers to accessing SEND datasets in either format for analysis. Additionally, SEND datasets are often manipulated manually using Microsoft Office software, e.g. Excel; these manipulations could be performed more efficiently and at a larger scale by data managers trained to write scripts in open-source languages, e.g. R and Python. During PHUSE CSS 2024, participants expressed a desire to learn coding so that they can work with SEND datasets more efficiently. |
Problem Impact |
---|
Data managers are well suited to preparing and reviewing SEND datasets for submission, but they often lack the skills necessary for handling increasingly common yet complex data science questions. As AI and automation advance, individuals in these positions risk becoming obsolete without proper training in these commonly requested analyses and tasks. Data managers with this training may advance into data scientist roles, as opportunities in this field have been spurred on by the widespread dissemination and use of SEND datasets. Stakeholder organisations throughout the drug development industry stand to benefit from upskilling their own SEND data managers to fill open data scientist positions without having to train external hires on basic domain knowledge related to SEND. |
Project Leads | |
---|---|
Kevin Snyder, FDA | |
Daniel Russo, Merck | Daniel.Russo@merck.com |
Wenxian Wang, BMS | Wenxian.Wang@bms.com |
Alex Pearce, PHUSE Project Assistant | Alexandra@phuse.global |
CURRENT STATUS Q4 2024 |
---|
|
Objectives & Deliverables | Timelines |
The SEND Coding Bootcamp will include a series of sessions focused on teaching coding so that students can analyse study data using SEND datasets. The course will take place over four sessions, each building on knowledge from prior lessons. The general outline of the course is as follows (subject to change). | |
Session 1 | Introduction to R Programming |
Session 2 | Basic Plotting with R |
Session 3 | Advanced Plotting with ggplot |
Session 4 | Data Analysis with the tidyverse |