Guest blog contributed by Wenlan Pan, Statistical Programmer Analyst, Johnson & Johnson
Transforming the Landscape – The Rise of R in the China Pharmaceutical Industry
In recent years, China's pharmaceutical industry has undergone a significant transformation in data analysis practices, driven by growing interest in R, a powerful statistical programming language, as an alternative to SAS. This article looks at the strides made in R implementation within the industry in 2023, focusing on key meetings, notable examples of R implementation, and panel discussions on sharing R-generated reports and on balancing the development and use of open-source packages.
Key Meetings – Driving R Implementation in 2023
Several influential meetings and discussions in 2023 played a pivotal role in promoting the widespread adoption of R within the China Pharmaceutical Industry. Two meetings were specifically focused on R, while the other two had a broader scope with a growing emphasis on R-related topics. These meetings provided platforms for professionals to exchange knowledge, foster collaboration, and present innovative ideas:
1. The first China Pharma R User Group (RUG) Meeting
The first conference of its kind, this event brought together over 300 participants in Shanghai, in Beijing, and online on March 31st. With 13 presentations from nine leading companies, the meeting served as a platform for professionals to share knowledge and explore innovative R-based solutions for the pharmaceutical industry. It highlighted the growing importance of R in this field and allowed participants to delve into the latest developments, powerful R packages, and new methodologies that benefit the industry.
2. Open Source Clinical Reporting SummeR 2023 workshop
Hosted by Roche on August 29, this workshop emphasized the importance of open-source solutions and collaboration for clinical data reporting. Through interactive sessions, industry experts shared their experiences using open-source R packages such as Admiral, NEST, and tidytlg for tasks such as SDTM mapping, ADaM dataset creation, and TLG creation. The workshop featured eight presentations that demonstrated the effectiveness of R in generating clinical reports and provided valuable insights into using open-source R packages for efficient clinical reporting.
3. Pharma Software Users Group (PharmaSUG) 2023 and PHUSE Single Day Events
These two meetings had a broader focus, each centered around a specific theme. The first focused on “New Policy, New Technology & New Opportunity”, while the second had the theme “Standardisation-Driven End-to-End Automation”. Both events saw a surge in the number of papers and presentations dedicated to R implementation. Professionals presented their research and experiences, highlighting how R, when coupled with domain-specific knowledge and standards, contributes to advanced analytics and helps navigate complex challenges. It is worth noting that PharmaSUG 2023 also offered pre-conference training on R titled “Deep Dive into Tidyverse, ggplot2, and Shiny with Real Case Applications in Drug Development.” This training provided participants with practical skills for leveraging R’s powerful data visualization and analysis capabilities.
Exemplifying R’s Potential – Notable Examples of R Implementation
R has found wide-ranging applications in the pharmaceutical industry. Here are notable examples:
1. Regulatory Submissions and Reporting
R’s open-source nature enables the development and use of open-source projects that implement or develop CDISC standards. Beyond writing their own R code, teams leverage open-source packages. For instance, OAK automates the mapping of CDASH to SDTM and generates raw synthetic data. Admiral, a modularized toolbox, facilitates the development of ADaM datasets in R. R packages like tidytlg make it easier to create tables, listings, and graphs (TLG) for clinical study reports. Notably, several R-based tools have already been officially recognized as open-source projects by the CDISC Open-Source Alliance (COSA). Another example is Dataset-JSON – R Package Implementation, which allows users to read and write JSON files while also providing functions to update metadata on the dataset. This could help meet the requirements of regulatory submission and other data exchange scenarios. Valuable experience in developing and implementing such packages in practice was shared during the meetings.
2. Statistical Analysis and Modeling
R can be used extensively for statistical analysis and modeling in clinical trials, since it provides a wide range of statistical functions and packages for efficacy and safety analysis. It offers alternatives to SAS for methods such as mixed-effects models for repeated measures (MMRM) and negative binomial regression. These may require combining several packages, but that also reflects R’s flexibility: users are free to choose whichever packages they prefer. Furthermore, R allows for the development of custom packages tailored to specific analysis needs, providing specialized functionality and enhancing the overall data analysis process.
3. Quality Control and Validation
R offers comprehensive tools and functions for quality control and validation in data analysis and reporting. This is particularly useful during the transition period, when R is used to validate outputs produced by SAS. R’s built-in functions, combined with customized scripts, provide confidence in the accuracy and consistency of results. For example, R can compare data frames and reports, offering a fast, flexible, and user-friendly way to run validation checks and generate a summary report.
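As a rough illustration of the data frame comparison described above, here is a minimal sketch in base R. The data frames `prod` and `qc`, their column names, their values, and the tolerance are all hypothetical and are not taken from any of the presentations. Teams may well prefer a dedicated comparison package. This only shows the general idea of matching rows on a key and reporting differences.

```r
# Hypothetical data: "prod" stands in for results exported from SAS,
# "qc" for an independent re-derivation in R. All values are made up.
prod <- data.frame(
  USUBJID = c("001", "002", "003"),
  AVAL    = c(12.10, 15.40, 9.80),
  stringsAsFactors = FALSE
)
qc <- data.frame(
  USUBJID = c("001", "002", "003"),
  AVAL    = c(12.10, 15.40, 9.75),
  stringsAsFactors = FALSE
)

# Merge on the key so that values are compared subject by subject;
# all = TRUE keeps subjects that appear on only one side
cmp <- merge(prod, qc, by = "USUBJID",
             suffixes = c(".prod", ".qc"), all = TRUE)

# Flag rows that differ by more than a small numeric tolerance,
# or that are missing on one side (assumed tolerance, adjust as needed)
tol <- 1e-8
cmp$diff     <- abs(cmp$AVAL.prod - cmp$AVAL.qc)
cmp$mismatch <- is.na(cmp$diff) | cmp$diff > tol

# Short summary report of the comparison
mismatches <- cmp[cmp$mismatch, ]
n_mismatch <- nrow(mismatches)
cat("Rows compared:", nrow(cmp), "\n")
cat("Rows with differences:", n_mismatch, "\n")
print(mismatches)
```

In this made-up example only subject 003 is flagged, because its two values differ by 0.05. Comparing with a tolerance rather than exact equality is a design choice that avoids false alarms from floating-point representation differences between systems.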
4. Data Visualization and Interactivity
R’s Shiny package has revolutionized data visualization in the pharmaceutical industry. Going beyond traditional static outputs, Shiny enables the development of interactive dashboards. This empowers stakeholders to explore and interpret data dynamically, facilitating timely, data-driven decision-making. Several examples of R Shiny apps were shared during the meetings, including an app for Prostate-Specific Antigen (PSA) navigation, a baseline Shiny framework for standard safety tables and figures as well as efficacy modules, and an app that supports population pharmacokinetic (popPK) analysis even for users without any programming knowledge.
Panel Discussion – Report Sharing and Open-Source Package Development
During the meetings, participants also raised concerns, and panel discussions were held with experts from various companies.
One concern was the direct use of R for sharing reports internally and externally. Multinational pharmaceutical companies and regulators are exploring submissions built on programs written in R, a shift from past submissions based mainly on SAS. The industry is actively engaged in this process and awaits the results of pilot studies that evaluate the feasibility and effectiveness of the transition.
Another concern is how to balance developing open-source packages with using packages from other companies. Organizations are better off adopting an ecosystem-driven approach, evaluating the strengths and weaknesses of different solutions. Active participation in the open-source community empowers organizations to contribute to the development of packages, thereby advancing the industry’s collective knowledge and capabilities.
Embracing a Data-Driven Future in the China Pharmaceutical Industry
The pharmaceutical industry in China is rapidly adopting R for data analysis. Key meetings and discussions have fostered collaboration, knowledge-sharing, and innovation. Examples have showcased a wide range of applications, illustrating how R has been implemented in regulatory submissions and reporting, statistical analysis and modeling, quality control and validation, and data visualization and interactivity. Concerns about report sharing and about the balance between developing and using open-source packages have also been raised and discussed. Embracing R as an essential tool helps organizations stay competitive and positions them at the forefront of scientific progress in the evolving pharmaceutical landscape.
About the Author
Wenlan Pan is a Statistical Programmer Analyst at Johnson & Johnson with over two years of programming experience in the pharmaceutical industry, currently supporting neuroscience studies. She has been using R for seven years and received a Master of Science degree in Biostatistics from the University of California, Los Angeles.