Unlocking Chemical Volatility: How the volcalc R Package is Streamlining Scientific Research

By Blog

The R Consortium recently interviewed Kristina Riemer, director of the CCT Data Science Team at the University of Arizona, and Eric Scott, Scientific Programmer and Educator in the CCT Data Science Team, the developers behind the volcalc package, to discuss the motivation and development of this innovative tool designed to automate the calculation of chemical compound volatilities. volcalc streamlines the process by allowing users to input a compound and quickly receive its volatility information, eliminating the need for time-consuming manual calculations. Initially created to assist Dr. Laura Meredith in managing a large database of volatile compounds, volcalc has since grown into a more versatile tool under Eric’s leadership, now supporting a wider range of researchers. 

Kristina and Eric share insights into the challenges they faced, including managing dependencies, integrating with CRAN and Bioconductor, and refining complex molecular identification methods. They also discuss future enhancements, such as incorporating temperature-specific volatility calculations and expanding the package’s functionality to estimate other compound characteristics. This project was funded by the R Consortium. 

Could you share what motivated the development of the volcalc package and how it aligns with the broader goals of the R ecosystem, particularly in scientific computing?

Kristina: I was heavily involved in the initial development of volcalc, and later on, Eric took over the project. We developed volcalc because we began collaborating with Dr. Laura Meredith, who was compiling a database of volatile chemical compounds. At the time, she had around 300 compounds, and her students manually gathered details for each one by examining their representations and calculating various associated values. This process was tedious and prone to errors, so we thought there must be a more efficient and automated way to handle it.

That’s when we came up with the idea of creating a pipeline where someone could input a compound and quickly receive its volatility information, eliminating the need for all the manual labor. The purpose of volcalc was to transform the process from taking months to gather details for 300 compounds to obtaining information for thousands in a much shorter time.

Eric: volcalc was initially developed specifically for a project where the researchers were mainly interested in chemical compounds from the KEGG database (Kyoto Encyclopedia of Genes and Genomes). When I joined the team and learned about the project, I was thrilled because, as a chemical ecologist, I saw its potential. However, I also recognized a limitation: the tool only worked with the KEGG database. This was a drawback because many researchers, including food scientists and others who work with similar compounds, might not find their compounds in that specific database.

This realization inspired me to apply for the R Consortium grant. We saw a significant opportunity to expand volcalc, making it more flexible and applicable to a wider range of researchers. We also wanted to improve its integration within the R ecosystem by adding features like returning the file path of a molecule representation after downloading it, so it could be easily piped into subsequent steps. These enhancements aimed to make the tool more versatile and user-friendly for a broader audience.

What were the most significant challenges you faced during the development of the initial version of volcalc, and how did you overcome them?

Kristina: One of the most challenging aspects of developing volcalc, which continues to be an issue, is managing dependencies. Specifically, we rely heavily on a command-line program to handle much of the processing. Early on, we struggled with how to enable users to run volcalc without needing to install this program on their own computers, as many of our users aren’t familiar with that kind of setup. I spent a lot of time trying to create a reproducible environment using Binder, but I was never able to get it fully working. Even today, there are still issues related to managing these dependencies, which Eric can elaborate on further.

It was incredibly important to have Eric on this project because I don’t have a strong background in chemistry. His ability to come in and figure out some of the intricate details that would have taken me much longer to grasp was a huge advantage. The more we can collaborate with domain experts, the better our results will be.

Eric: One thing that has helped with the dependency challenges is that we’ve started building volcalc on R-Universe, which means binaries are available there. While it’s not on CRAN yet, having these binaries on R-Universe makes installation a bit easier. However, we’ve faced some challenges with dependencies, particularly because two of them are from Bioconductor. We didn’t originally aim to develop this package for Bioconductor, which uses S4 objects and has different standards than CRAN. Our goal was to get it on CRAN, but our first submission was rejected because the license field for the Bioconductor package wasn’t formatted to CRAN’s liking. These differences between Bioconductor and CRAN have created barriers, even though the authors of the Bioconductor package have been very responsive. Their package works fine on Bioconductor, but it doesn’t meet CRAN’s criteria, which has been a frustrating challenge.

Another major challenge in developing volcalc relates to the method we use for estimating volatility. This method involves counting the numbers of different functional groups on molecules—such as hydroxyl groups or sulfur atoms—and assigning coefficients to them. To do this programmatically, we use something called SMARTS, which is essentially like regular expressions but for molecular structures. Regular expressions for text are already challenging, but SMARTS is even more complex because it deals with three-dimensional molecules.

Before I joined the group, the first version of volcalc had most of these functional groups figured out, but not all. I spent a significant amount of time trying to develop SMARTS strings to match additional molecules. Moving forward, I hope that if we implement new versions, we can get help from the community to refine these SMARTS strings, as there are likely people out there who are more skilled at it than I am.

The original project proposal mentions expanding volcalc to work with any chemical compound with a known structure. What are the key technical challenges you anticipate in achieving this goal?

Eric: This task turned out to be less difficult than I initially expected, but let me explain. In the original version of volcalc, before we received the R Consortium funding, the main function started with a KEGG ID—an identifier specific to the KEGG database. The function would download a MOL file, which is a text representation of a molecule corresponding to that ID. It would then identify and count the functional groups in the molecule, and finally, calculate the volatility based on those counts.

The major change we needed to implement to make volcalc more versatile was to decouple these steps. In the current version of volcalc, the functionality to download a MOL file from KEGG is still available, but it’s now separate from the main function that calculates volatility. This means that the inputs for calculating volatility can now be any MOL file, not just ones from KEGG. The file can come from any database, be exported from other software, or even be downloaded manually. Additionally, the tool now supports SMILES, which is another, simpler text-based representation of molecules.
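The decoupling described above can be sketched roughly as follows. This is an illustration in Python rather than volcalc's actual R code: the function names are hypothetical, and the counting step is a stand-in that only reads the atom count from the counts line of a V2000 MOL file instead of matching functional groups.

```python
import tempfile
from pathlib import Path

# A tiny V2000 MOL file for a single oxygen atom, used only as sample input.
EXAMPLE_MOL = "\n".join([
    "example molecule",
    "  illustration",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])


def save_structure(mol_text, directory, name):
    """Retrieval step (hypothetical): store a structure and return its file path.

    Returning the path lets the result be passed straight to the next step.
    """
    path = Path(directory) / f"{name}.mol"
    path.write_text(mol_text)
    return path


def count_atoms(mol_path):
    """Calculation step (stand-in): works on any MOL file, whatever its origin.

    In a V2000 MOL file the fourth line is the counts line, and its first
    three characters hold the number of atoms.
    """
    lines = Path(mol_path).read_text().splitlines()
    return int(lines[3][:3])


with tempfile.TemporaryDirectory() as tmp:
    mol_path = save_structure(EXAMPLE_MOL, tmp, "example")
    print(mol_path.name, count_atoms(mol_path))
```

Because the calculation step only needs a file path, it does not matter whether the file came from KEGG, another database, or a manual download.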

There are various ways to represent chemicals in text, including another format called InChI. The Bioconductor packages we use, ChemmineR and ChemmineOB, have the ability to translate from InChI and other types of chemical representations. However, that feature isn’t available on Windows. So, I decided to keep volcalc focused on SMILES and MOL files. I believe that chemists and other researchers should be able to obtain data in one of these two formats, or use another tool to translate their data into these formats. I didn’t want to overload volcalc with the responsibility of being a chemical representation translator, as that didn’t seem like its primary purpose.

Can you walk us through the process of implementing the SIMPOL algorithm within the volcalc package?

Kristina: The algorithm itself is fairly simple; it’s just basic math. You need to input some constants, the mass of the compound, and the counts of the functional groups we discussed earlier. Writing the code for this was straightforward and not particularly challenging.

Eric: Each functional group has a coefficient associated with it, which is multiplied by the number of times that group appears in the molecule. These values are then summed up, and the mass of the molecule is factored in as well. The challenging part wasn’t the algorithm itself, which is straightforward—just multiplying by coefficients and adding them up. The real difficulty was interpreting what the authors of the algorithm meant by each of the functional groups. Some were oddly specific, like how the hydroxyl group that is part of a nitrophenol group isn’t supposed to count toward the total number of hydroxyl groups. I spent a lot of time poring over the paper, particularly one table, to fully understand how they defined each group. That interpretation was the hardest part.
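The multiply-and-sum step Eric describes can be sketched as follows. This is a minimal illustration in Python, not volcalc's code: the group names and coefficient values are invented placeholders (the real ones come from the SIMPOL.1 paper), and the way the molecular mass enters the final volatility value is left out.

```python
# Hypothetical coefficients: placeholders, NOT the published SIMPOL.1 values.
EXAMPLE_COEFFICIENTS = {
    "carbon": -0.5,
    "hydroxyl": -2.0,
    "ketone": -1.0,
}
EXAMPLE_CONSTANT = 2.0  # hypothetical constant term, added once per molecule


def group_contribution_estimate(group_counts,
                                coefficients=EXAMPLE_COEFFICIENTS,
                                constant=EXAMPLE_CONSTANT):
    """Return constant + sum(coefficient * count) over the functional groups."""
    total = constant
    for group, count in group_counts.items():
        if group not in coefficients:
            raise KeyError(f"no coefficient for functional group {group!r}")
        total += coefficients[group] * count
    return total


# A made-up molecule with three carbons and one hydroxyl group.
print(group_contribution_estimate({"carbon": 3, "hydroxyl": 1}))
```

The arithmetic is the easy part, as both developers note. The effort goes into producing correct counts for each group in the first place.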

What future functionalities or expansions do you see as crucial for volcalc, especially in the context of evolving research needs in chemoinformatics?

Eric: Right now, we’re working on allowing users to specify different temperatures. The paper that describes the SIMPOL.1 method includes equations for how the coefficients of each functional group change with temperature. These changes aren’t always linear, and the contributions of functional groups can shift in importance as the temperature varies. This is an important feature to include because the version of volcalc we currently have uses coefficients calculated at 20°C, based on a table from the original paper. To accommodate other temperatures, we need to integrate another table that provides equations for calculating these coefficients based on temperature, and that’s what we’re working on.
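As a rough illustration of what temperature-dependent coefficients involve, here is a short Python sketch. It assumes each coefficient follows a four-term form, b(T) = b1/T + b2 + b3*T + b4*ln(T), with T in kelvin. That form is an assumption to be checked against the paper, and the parameter values below are placeholders rather than published numbers.

```python
import math


def celsius_to_kelvin(temp_celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return temp_celsius + 273.15


def coefficient_at(temp_kelvin, b1, b2, b3, b4):
    """Evaluate one group's coefficient at a given temperature.

    Assumed form: b(T) = b1 / T + b2 + b3 * T + b4 * ln(T).
    """
    if temp_kelvin <= 0:
        raise ValueError("temperature must be a positive value in kelvin")
    return b1 / temp_kelvin + b2 + b3 * temp_kelvin + b4 * math.log(temp_kelvin)


# Placeholder parameters for one hypothetical functional group.
params = {"b1": -400.0, "b2": 1.0, "b3": 0.002, "b4": 0.0}

for temp_c in (0, 20, 40):
    value = coefficient_at(celsius_to_kelvin(temp_c), **params)
    print(f"{temp_c} degrees C: coefficient = {value:.4f}")
```

Under this assumption, a fixed-temperature table is the same function evaluated once at 20°C, so supporting other temperatures means storing the parameters instead of the precomputed values.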

Another key feature we want to leave room for in the future is the ability to add other methods for estimating volatility. SIMPOL.1 is just one type of group contribution method, but there are other approaches described in various papers that use different functional groups, equations, and coefficients. The basic idea remains the same: count the functional groups in a molecule, apply an equation, and estimate volatility. We’re trying to structure the code in a way that makes it easy to incorporate additional methods later, even if we don’t add them right away. I think these are the most important features we’re focusing on right now.
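One common way to leave room for additional methods is a small registry that maps a method name to a function. The Python sketch below shows the general idea only. It does not describe how volcalc is organized, and the method names and coefficients are hypothetical.

```python
# Registry mapping a method name to the function that implements it.
METHODS = {}


def register_method(name):
    """Decorator that adds an estimation function to the registry."""
    def decorator(func):
        METHODS[name] = func
        return func
    return decorator


@register_method("example_a")
def example_a(group_counts):
    # Hypothetical method: one set of made-up coefficients.
    coefficients = {"carbon": -0.5, "hydroxyl": -2.0}
    return 2.0 + sum(coefficients[g] * n for g, n in group_counts.items())


@register_method("example_b")
def example_b(group_counts):
    # A second hypothetical method with different made-up coefficients.
    coefficients = {"carbon": -0.25, "hydroxyl": -3.0}
    return 1.0 + sum(coefficients[g] * n for g, n in group_counts.items())


def estimate(group_counts, method="example_a"):
    """Dispatch to the chosen method; unknown names raise a clear error."""
    if method not in METHODS:
        known = ", ".join(sorted(METHODS))
        raise ValueError(f"unknown method {method!r}; available: {known}")
    return METHODS[method](group_counts)


counts = {"carbon": 2, "hydroxyl": 1}
print(estimate(counts, method="example_a"))
print(estimate(counts, method="example_b"))
```

With this layout, adding a method later means registering one more function. The code that counts functional groups and the calling interface stay the same.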

Kristina: We’re focused on the features I mentioned in the near future, but looking further ahead, I could see volcalc expanding to estimate other characteristics of compounds beyond just volatility. While I’m not a chemistry expert or a chemical ecologist, I imagine that those interested in volatility might also be interested in other compound characteristics that currently lack automated tools for estimation. So, it’s possible the package could evolve to include those features.

That said, one of the things I appreciate about the R package ecosystem is that it allows for specialized tools. Since anyone can build what they need, we don’t end up with massive, overly complex packages that try to do everything and become difficult to maintain. It might be better to keep volcalc focused and leave room for separate packages to handle additional functionality. This way, the tools remain manageable and easier to maintain in the long run.

How has it been working with the R Consortium? Would you recommend applying for an ISC grant to other R developers?

Kristina: The application process was straightforward, and I found the grant format to be very practical. It was focused on milestones and product development, which is refreshing compared to many academic research grants that tend to avoid specific deliverables. I highly recommend considering this grant. I believe people often overlook smaller funding sources, but even small amounts can make a big impact on the work you’re doing.

Eric: The first time I applied for an R Consortium grant was as a grad student, and I strongly encourage trainees to apply as well. It was a great experience for me because I could do it independently—my advisor wasn’t involved as one of the authors, and it wasn’t a complex process like applying for an NSF grant. It was straightforward and really rewarding. The only tricky part was figuring out the payment process, but that’s something people can work out.

I’ve noticed there seem to be fewer projects in recent years, and I don’t think it’s due to a lack of funding. It seems like fewer people are applying, which is why I especially encourage others to give it a shot. From what I’ve seen, there’s a very good chance of getting funded if you apply right now.

People should be creative and think broadly about how their project can benefit the broader R community. This doesn’t mean you need to develop the next big thing like R-Universe or CRAN. It can be something smaller, like a package that other R users will find helpful. For example, with our project, volcalc, our main goal was to encourage chemists—who usually use point-and-click software—to start using R. That was enough of a contribution to the R community to get funded. So, I really encourage people to think creatively about what “benefiting the R community” can mean.

About ISC Funded Projects

A major goal of the R Consortium is to strengthen and improve the infrastructure supporting the R Ecosystem. We seek to accomplish this by funding projects that will improve both technical infrastructure and social infrastructure.

Free Boba Tea and Technical R Topics Lure Young Learners to New Brunei R User Group

Haziq Jamil, the founder and organizer of the Brunei R User Group, recently spoke with the R Consortium. Haziq established the first R User Group in Brunei to promote R programming and create collaborative learning environments. Under his leadership, the group hosts monthly meetups and events to advance R skills across various sectors in Brunei. Through these efforts, Haziq aims to build a supportive and inclusive R community, encouraging both personal growth and data-driven innovation in the region.

Please share your background and involvement with the RUGS group.

My name is Haziq Jamil, and I am an Assistant Professor in Statistics at Universiti Brunei Darussalam, the leading higher education institution in Brunei. I have used R for almost ten years during my studies and on many personal and professional projects. 

The Brunei R User Group was founded in February 2024, and I serve as its chair and founder. My role is to lead the group’s administration, oversee its overall direction and strategy, and ensure that its initiatives align with its mission of promoting R programming and fostering a supportive learning environment, focusing on community engagement and collaboration.

Can you share what the R community is like in Brunei? 

The R community in Brunei may be small, but it is growing thanks to the efforts of the Brunei R User Group. The group organizes monthly meetups and events to promote learning and development in R programming and advance its use across various fields in Brunei. These gatherings provide opportunities to expand the community by enhancing participants’ skills, offering a platform for networking with like-minded individuals, and engaging in practical applications such as data analysis, visualization, and spatial data techniques. By creating an inclusive R community, the group aims to support individual growth in Brunei and foster collaboration on data-driven R projects. Whether for students, professionals, or hobbyists, the group strives to provide a supportive space for learning, sharing insights, and driving innovation within the local R community.

You hosted a Meetup, “R>aya with R,” in April. Can you share more about the topic? Why this topic? 

The R User Group’s “R>aya with R” Meetup in Brunei was a lively event that combined the festive spirit of Hari Raya Aidilfitri (Eid-ul-Fitr) with the exploration of R programming. To engage younger audiences, we offered free boba tea during the session. The event featured informative sessions led by expert community members, each focusing on advanced topics relevant to different fields.

One of the critical presentations was on “Survival Analysis” by Dr. Elvynna Leong. She explained statistical techniques to predict the time until an event of interest, such as guests’ arrival at a Hari Raya open house. This topic is directly related to fields that depend on time-to-event data, such as healthcare or actuarial science.

One of the highlights was Wafid Sophian’s session on “Simulation Methods for Economic Analysis.” In this presentation, Wafid demonstrated how R can be used to simulate and analyze complex economic scenarios. The topic focused on modeling outcomes and making data-driven predictions. It was chosen due to its significance in finance and business analytics.

Dr. Eden Ng presented on “Mathematical Modeling of Evolutionary Biology,” showcasing how R can be used to model biological evolutionary processes. This session illustrated the intersection of R programming and biology, emphasizing R’s utility in research areas such as genetics and evolutionary studies.

The event covered three topics in the areas of mathematics, economics, and biology. It was open to all individuals interested in learning about the capabilities and usage of the R language, regardless of whether they were beginners or experts.

Do you recommend any techniques for planning for or during the event? (GitHub, Zoom, other.) Can these techniques be used to make your group more inclusive to people who cannot attend physical events in the future?

For planning and executing events like the “Analysing Spatial Data with R” event, the R>aya Meetup, and the “Introduction to R” sessions, we used GitHub to host R scripts, datasets, and event materials. Our participants can access and review the code before and after the event. GitHub also provides a platform for issue tracking and version control to facilitate feedback and group collaboration, and it helps those who can’t attend in person to engage and contribute. We also use a Quarto blog, the official Brunei R User Group blog, to publish event summaries, key takeaways, and additional resources. We hope to provide a centralized location for information and help remote participants catch up.

Please share any additional details you would like to include in the blog. 

As the first R user group in Brunei, we are excited to promote the adoption and growth of R across various industries. Our mission goes beyond just hosting events—we are dedicated to creating and nurturing an inclusive R community and showcasing the power of R in numerous fields.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups organize, share information, and support each other worldwide. We have given grants over the past four years, encompassing over 68,000 members in 33 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.

Empowering Data Science: How R is Transforming Research in Cameroon

NyAvo RATOVO-ANDRIANARISOA, the co-founder of the R Community Cameroon, recently spoke with the R Consortium about the rapid growth of the R community in Cameroon and the impact of R on local research and data analysis. NyAvo provided insights into the community’s activities, such as developing an R community website using Shiny and implementing innovative projects like a custom search application. He also discussed the challenges and strategies in building a robust R ecosystem in Central Africa.

Please share your background and involvement with the RUGS group.

I am a statistical engineer from the Institute of Statistics of Central Africa in Cameroon, originally from Madagascar. Currently, I serve in a monitoring and evaluation role with the United Nations. My journey with R began about five years ago as a student. I started learning R during my studies and expanded my expertise by freelancing as a Shiny developer. Over time, I’ve also gained experience with OCR technology, working with Tesseract and utilizing Google Cloud Colab’s API Amazon Web services for R and artificial intelligence models.

With a solid statistical background, I specialize in data mining. In 2020, before learning about the R Consortium campaign, I had already envisioned creating an R community in Cameroon. After discussing the idea with some colleagues and discovering the potential support, it became the perfect opportunity to bring this vision to life. My colleagues and I officially launched the R Community Cameroon at the beginning of this year.

I would like to mention my colleagues who have co-founded the group with me: Romain TCHAKOUTE, Idrissa DABO, Saidou BOUREIMA, Mianala MANAMBIRAVAKA, and Ronald DJEUMEN.

Can you share what the R community is like in Cameroon? 

Cameroon is home to one of Central Africa’s largest science schools. I am part of a vibrant academic community of 30 members from several nationalities. This environment brings together some of the brightest minds, and R is an integral part of our curriculum. However, our use of R goes beyond basic statistics and plotting; we focus on more sophisticated applications, such as Monte Carlo estimation, model development, and advanced R programming.

Several professors in our school possess strong statistical backgrounds and rely on R for their research. Additionally, our alums who have transitioned into the industry continue to leverage R for data analysis. While Cameroon has a limited amount of data, we conduct numerous surveys. The initial step most of us take is data cleaning, predominantly using R. Once the data is clean, we employ Quarto to generate automatic reports, allowing us to summarize survey results quickly. Some of my colleagues also explore other functionalities of R, like creating applications with R Shiny.

Another significant group consists of economics students and those working on their theses. They often seek our assistance to learn R for tasks such as descriptive statistics and building logistic models.

Do you recommend any techniques for planning for or during the event? (GitHub, Zoom, other.) Can these techniques be used to make your group more inclusive to people unable to attend physical events in the future?

We rely on PowerPoint to create posters for our events, and we used Google Meet for online meetings before we received the Meetup Pro account provided by the R Consortium. Many of us, the co-founders of the community, are technicians, so we also use Word for various tasks. However, leadership skills have been crucial in convincing people to join our vision. To engage others, we often organize dinner or lunchtime meetings. We’ve invested significant effort into these initiatives, and through them, we’ve successfully negotiated several partnerships. Initially, I contacted colleagues at the National Institute of Statistics to rally support for our cause.

Our community now includes students, academics, and some PhD holders who are still in the learning phase. I discussed creating a community in Cameroon with them, asking what value we could offer to encourage their participation. I proposed that they become co-founders of the community, a role they could highlight on their resumes. Seven people have already stepped up as leaders within our group.

We’re active both online and in person. It’s important to note that we organize two types of meetings. The first is an internal meeting with our community leaders, typically attended by around seven people. We use a WhatsApp group for communication and usually meet monthly for lunch at a restaurant for these meetings. The second activity involves larger groups. For these, we first coordinate with the administration, such as the school mentioned earlier, who then communicate with the students. We also document and share these activities online to inform others, though most of our communication is direct and specific.

For example, after explaining our vision to a university contact, they were interested and agreed to offer a course at their institution. We then coordinated with the student body leader, planned a session, and shared the event online. We even hired a professional photographer to capture the event, sharing the photos with the school for further distribution. However, we haven’t yet posted about this activity on LinkedIn.

Looking ahead, we’re planning a session with another school—the statistical school where I studied. We’re currently in discussions with their management. Once we’ve had our conversation, possibly next week, we’ll talk with the student leaders. After the session, we plan to share our activities online, including photos, to highlight what we’re accomplishing in Cameroon.

Do you have any upcoming events planned for the group?

We have an upcoming event that I consider one of the most important we’ve planned. It is our group’s quarterly meeting. The main objective of this meeting is to develop our R community website using Shiny. It will be a workshop where we’ll gather in one place and form small groups comprising beginners and experienced members. During the workshop, we’ll collaborate to code and discuss ideas, and ideally, by the end of the session, we will have the code for our community website ready for deployment.

We’re currently facing some logistical challenges in organizing this event. In Cameroon, when we organize events, we strive for perfection, ensuring everything from photography to visibility is top-notch. We’re searching for a suitable hotel venue to host our event.

What trends do you currently see in R language and your industry?

In Cameroon, Quarto is one of the most popular packages we promote during our R community sessions. Another widely used package is R Markdown. While I primarily use R Markdown to produce outputs for my job, I am also working on becoming more proficient with Quarto, as it is the future of reporting and documentation.

I frequently use tidyverse packages such as tidyr and haven, together with labelled, for data cleaning and reporting. A significant part of my job involves data cleaning, and I rely on tidyverse in conjunction with Quarto for these tasks.

We also utilize R for machine learning, though there is still potential for improvement in this area. We are focused on leveraging Shiny, Quarto, and tidyverse for our work.

Please share about a project you are working on or have worked on using the R language. What is the goal/reason, result, or anything interesting, primarily related to your industry?

We are currently working on an exciting project that is still in progress. The aim is to develop an application inspired by the functionality seen in the movie Fast and Furious, where users can search for information on the Internet. We are utilizing the httr package to collect data from online sources. 

The application will enable users to input a search term, such as “lion in Cameroon,” and receive a dataset with all relevant information. Our goal is to provide high-quality data for researchers and other users, which involves considerable effort to ensure the accuracy and usefulness of the data. 

Thank You, Joseph Rickert: A Legacy of Leadership and Innovation in the R Community

As we announce the end of Joseph (Joe) Rickert’s tenure as the Executive Director of the R Consortium, we reflect on his remarkable contributions, which have significantly strengthened the R community. Joe’s leadership has been instrumental in fostering growth, innovation, and collaboration within the R ecosystem.

Founding the R Consortium

Joe has been with the R Consortium since its inception in 2014. He was initially appointed to be Microsoft’s representative to the Infrastructure Steering Committee (ISC) and was soon tasked with creating the R User Groups (RUGS) grants program. Joe also pioneered the formation of ISC working groups to foster industry-wide collaborative projects. In 2016, Joe was appointed to be RStudio’s representative to the Board of Directors. In 2018, he took on the role of Secretary, and by 2019, he was elected Chair of the Board. In 2023, Joe took on the role of Executive Director. Under his guidance, the R Consortium has grown into an inclusive organization supporting the R programming language and its community. Our new executive director, Terry Christiani, was affirmed by the board of directors in our August 2024 board meeting after a selection committee interviewed candidates and made recommendations.

Advancing User Groups

One of Joe’s notable achievements is his unwavering support for R user groups worldwide. He recognized the importance of grassroots movements in spreading the use of R and provided essential resources and funding to these groups. He was instrumental in funding the R-Ladies as a top-level ISC project that operates worldwide to provide safe places for women to come together and learn from each other in an otherwise male-dominated space. Joe was also directly involved with the Bay Area useR Group (BARUG), organizing events, speaking, or contributing to discussions around the R programming language, especially in the context of data science and statistical computing. This support has enabled countless R enthusiasts to connect, share knowledge, and collaborate on projects, thereby strengthening the global R community.

Industry Collaboration and Working Groups

Joe actively reached out to industry leaders to create unique working groups aimed at solving industry-wide problems. These collaborations have led to the development of working groups focused on R programming solutions that benefit not only the community but also industries that rely on data science and statistical computing.

A Legacy of Innovation

Throughout his tenure, Joe has been a driving force behind numerous initiatives that have propelled the R community forward. His efforts have ensured that the R Consortium remains a dynamic and inclusive organization, fostering a spirit of collaboration and innovation. His leadership has left an indelible mark on the R community, and his legacy will continue to inspire future generations of R users and developers.

As we welcome new leadership, we extend our heartfelt gratitude to Joe Rickert for his dedication, vision, and tireless efforts in advancing the R community. Thank you, Joe, for your invaluable contributions and for paving the way for a brighter future for the R ecosystem.

R-Ladies Bariloche in Argentina: Fostering a Different Approach to Leadership

Lina Moreno, founder and organizer of the R-Ladies Bariloche chapter in Argentina, recently shared her journey with the R Consortium. A biologist focusing on evolutionary ecology, she discussed her experience building a local R community, the challenges of maintaining engagement post-pandemic, and her efforts to foster discussions on leadership and gender equity within academia. Through her work, she aims to create an inclusive space for women in data science and strengthen the R community in Bariloche.

Please share your background and involvement with the RUGS group.

I am a biologist working on evolutionary ecology. I did a bachelor’s degree in science in another Argentinian province and then moved to Bariloche to start my PhD (that was 16 years ago!). I spent a couple of years working in the field (I study reptiles, mainly lizards, and their adaptations to cold environments), and then I had to analyze the data. When I started my main analysis, I could only do it in R. At that time, my boss told me to start with R immediately, so I started searching online. I am doing this kind of work in my home country, so finding resources was difficult. I had to work on comparative analyses and phylogenies, which was hard at first. However, I started studying and meeting people who taught me, which was awesome.

After several years, I encountered a problem I couldn’t resolve. I turned to Google and found a helpful community of ladies. They assisted me a lot and saved me from a tight spot. After some communication, they suggested starting an R-Ladies community in Bariloche. I met with them in person when I traveled to Buenos Aires. They convinced me to start a Bariloche chapter, and by the end of 2019, a few of my colleagues and I, mainly biologists working in the same area, created the Bariloche chapter.

What are some challenges you have faced in organizing this group?

We are currently facing some difficulties, as people seem unwilling to get involved. As a result, we are exploring new strategies, such as hybrid meetups (online and in person) and joint meetups with other R-Ladies groups. We also plan to organize three or four meetups this year. We are getting discouraged by the lack of response despite our efforts, but we will see how it goes.

During the pandemic, there were five organizers, three of us with young children, so it was pretty difficult. We started a study group for the R for Data Science book, which had recently been translated into Spanish. We also organized meetups, mainly led by us. Surprisingly, there was good attendance during the pandemic, with around 40 people each time, similar to before the pandemic. However, after returning to in-person meetups, the attendance dropped significantly. The most crowded meetup had only 15 people, whereas before, we used to have two meetups on the same subject, both with 40 attendees.

We are attempting to integrate virtual and in-person components, which has proven challenging. We aim to introduce a beginner’s course in R through a meetup to help attendees gain confidence for future meetups. Our meetups typically have a good turnout, but there is a lack of interest in specific topics, particularly those related to gender bias and women’s issues. Despite this, we are putting in a significant effort. Last year, we participated in a round table during a conference in Bariloche, focusing on gender bias and the difficulties of being a woman in Academia, and it received a lot of support, especially from women. The rest of the group is currently working on documenting our experience at that round table. I am not participating as I have been busy with other responsibilities.

Last year we surveyed people on how they feel R-Ladies is contributing to their careers, its influence on leadership abilities, their reasons for abandoning our events, or why they were too busy to attend. We have received some responses, but not as many as we expected.

However, we have enough data to write a paper about it. We’re also looking into the topic of leadership. We are opening our minds and exploring the possibilities of supporting the R community here in Bariloche, but it’s quite challenging at the moment.

Can you share what the R community is like in Argentina? 

The field of artificial intelligence and machine learning is rapidly evolving, which greatly helps incorporate knowledge in R. The leading tools used for this purpose are R and Python. R is more popular than Python, as it has been around longer and is more user-friendly. Many people from academia with expertise in machine learning are transitioning to private industries. Regarding statistical analysis in biology, tools such as the vegan package and those implementing GAMs (generalized additive models) are commonly used. Tools that can handle multiple effects simultaneously are in high demand in ecology.

Would you like to talk about any recent activity from the group?

It was exciting what happened after the conference we participated in. As women, we are trying to maintain an open community. Initially, the organizers needed to learn how to manage an open group without a leader or a head, where everyone is considered equal. The participants ranged from students to established researchers, which ignited discussions about leadership and how women and minority groups navigate the world. We are accustomed to a robust and assertive leadership style, often associated with being at the top. However, as women, we wanted a different kind of leadership. This led to discussions on creating a new type of leadership that doesn’t adhere to the traditional patriarchal model. The manuscript the group is working on currently revolves around creating a new style of leadership that focuses on nurturing individuals to become better persons, researchers, or workers. 

The topics discussed in the manuscript revolve around how we can lead differently. These discussions were not limited to our group; we engaged with several groups from Argentina and Spain who also expressed a similar desire for a different kind of leadership. It is interesting to note that as women and minority groups, we want to be in positions to make meaningful decisions, especially given that our work as biologists primarily tackles environmental issues. Despite historically shying away from more traditional forms of leadership, we are now advocating for different styles. This shared sentiment has brought us closer to other minority groups, and we believe it’s an important topic that needs further discussion. It’s important to recognize that women have a different approach, and it doesn’t make us weaker than men; it simply signifies that we have a unique way of contributing.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups organize, share information, and support each other worldwide. We have given grants over the past four years, encompassing over 68,000 members in 33 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.

The 2024 ISC Grant Program Will Begin Accepting Applications Soon!

The R Consortium is excited to announce the second cycle of the 2024 Infrastructure Steering Committee (ISC) Grants Program. The Call for Proposals will open soon. This initiative aims to support projects that strengthen the R ecosystem’s technical and social infrastructure. 

Here is a list of projects that received grants from the R Consortium in the First Cycle in 2024

From the Call for Proposals page:

The ISC is interested in projects that:

  • Are likely to have a broad impact on the R community.
  • Have a focused scope (a good example is the Simple Features for R project). If you have a larger project, consider breaking it up into smaller chunks (a good example is the DBI/DBItest project, where multiple proposals came in over time to address the various needs).
  • Have a low-to-medium risk with a low-to-medium reward. The ISC tends not to fund high-risk, high-reward projects.

Whether you’re working on groundbreaking tools or organizing community-driven events, this is your chance to secure funding and make a significant impact on the R community!

Key Dates:

  • September 1, 2024: Grant Application Period Opens
  • October 1, 2024: Grant Application Period Closes
  • November 1, 2024: Notification of Accepted Grantees
  • December 1: Deadline for acceptance of grant and contract. Public notification of grantees occurs shortly thereafter.

Submit your proposal by October 1, 2024, and contribute to the ongoing growth of the R ecosystem. Visit the R Consortium website for detailed guidelines and submission instructions. Don’t miss this opportunity to bring your innovative ideas to life!

R4SocialScience: Empowering Social Science Research with R in India

Dr. Mohit Garg, organizer of the R4SocialScience group in Delhi, India, recently talked to the R Consortium about his experience of starting an R user group. The R4SocialScience group aims to bridge the gap between social science research and data analysis, offering support and training to academics, researchers, and industry professionals. Dr. Garg shares his experiences, the growth of the R community in India, and his plans for expanding R’s reach.

Please share about your background and involvement with the RUGS group.

I’m currently working as an assistant librarian at the Indian Institute of Technology, Delhi, one of the premier institutions in India. My academic background includes a BTech in Information Technology from Guru Gobind Singh Indraprastha University, followed by an MS in Library and Information Science from the Indian Statistical Institute, an institution dedicated to statistics in India started by the late Professor P.C. Mahalanobis. After that, I completed my PhD in Library and Information Science at IGNOU, New Delhi.

My interest in R began in 2013 when I started my MS at the Indian Statistical Institute. Since then, I have taken various courses as part of my MS program and some online courses. I became interested in R due to its open-source nature and the free availability of packages for all kinds of analysis. Then, I started promoting R in the academic community. In 2013, there was little interest in R because the prevalent approach in India was more focused on using commercial software for data analysis. In the past few years, however, there has been an increasing interest in R, with many workshops and government-funded events dedicated to it.

I have been providing R training to professors, teachers, and research scholars, and I have also worked on web-based development using the Shiny package. Furthermore, we have developed a web dashboard to visualize real-time research productivity data obtained from sources like Scopus through their APIs. Recently, we completed a 12-week MOOC on the NPTEL SWAYAM platform with a focus solely on R. The course was quite popular, with 2584 learners from India joining, and 515 learners registering for the final examination. Although the course was free, participants had the option to pay for certification.

Can you share what the R community is like in India? 

I have been involved in the academic profession since 2016 and have been giving lectures and serving as a resource person at various institutions. I believe that there is a need to build a community focused on social sciences, especially for those who may have a limited understanding of mathematics and statistics. The idea is to create a specific community related to social science, not just in India, but also in collaboration with other institutions. The community will cater to three main groups: those who are proficient in coding and development of R packages, those who are familiar with basic R but need further guidance, and those who are completely new to R.

The community aims to provide support for those interested in social science and to make R more accessible by offering packages related to social science and basic R tutorials. One specific package gaining popularity in academia is bibliometrix, with its Shiny interface biblioshiny, which facilitates scientific productivity mapping using R.

We want to emphasize that R is not just a programming language but also software for data analysis, to encourage more people to explore its potential. While both R and Python are interpreted languages, we aim to dispel the fear of programming and demonstrate how these languages can be used effectively. Although Python appears to be more widely used in the industry, there is still a growing interest in R.

What are your plans for the group going forward?

I have been teaching R for more than 10 years, and I found that researchers are interested in using R. I have identified three potential co-organizers from different regions in India to make a team of four people. We have already received a grant, and we plan to conduct training sessions in different locations across India.

I am focusing on a “train the trainer” model, where I aim to train individuals who can then carry out training sessions in their respective regions. India has over 50,000 colleges and around 1,200 universities, all involved in significant research and analysis activities. We also aim to have dedicated R trainers in all districts in India by 2026.

Our approach involves dividing the country into five zones, followed by state-wise and district-wise planning. We are not heavily reliant on industry support, as our activities are primarily related to academia and research. 

We plan to charge a nominal registration fee, which would cover expenses such as food and refreshments. We are hoping to minimize travel expenses, as they can be quite costly, but we will explore ways to fund travel and accommodation. We have hosted a one-day workshop on “Doing Research using R” at Galgotias University.

I am currently focusing on building a community and providing training sessions. I have noticed that online sessions may not be as effective as I had hoped, as participants seem to encounter many problems. Therefore, I am considering conducting more in-person workshops, which I believe will help popularize the training sessions. Additionally, I aim to develop specialized packages for social science and build a dedicated team. I am optimistic about these plans. During a recent workshop, I noticed that many participants preferred simple tools for data analysis. I intend to introduce such tools to make the training more accessible and user-friendly for participants. This is my vision for the community.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in?

We have developed a platform using shiny and several text-mining packages. This platform is still in the testing phase. It allows real-time data fetching from the Scopus API.

For example, if I search for a faculty member, it will display the publication data such as the number of publications, H-index, citations, types of publications, sources of publication, and annual publication distribution. We can also download this data.

We have also developed a word cloud based on the titles of the publications for each faculty member, processed using the tm package. This helps to infer the expertise of the professors. Furthermore, we have included a feature for identifying the H-classic, which is related to the H-index. This platform is quite useful and efficient, especially for academic institutions. We now have the capability to download data from a specific date range as an Excel file. The data includes publication dates and the number of citations.
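For readers unfamiliar with the H-index reported by the dashboard, here is a minimal sketch in base R of how it can be computed from a vector of citation counts. The function name `h_index` and the sample citation counts are hypothetical, chosen only for illustration. This is not the platform's actual code, and fetching the counts from the Scopus API is not shown.

```r
# Illustrative only: h_index() and the sample data are assumptions,
# not part of the platform described in the interview.
h_index <- function(citations) {
  # Rank papers from most to least cited
  sorted <- sort(citations, decreasing = TRUE)
  # h is the number of papers whose citation count is at least their rank
  sum(sorted >= seq_along(sorted))
}

# Hypothetical citation counts for one faculty member's publications
citations <- c(10, 8, 5, 4, 3)
h <- h_index(citations)
print(h)
```

With these sample counts, four papers have at least four citations each, so the sketch returns 4.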

We’re in the process of creating a full dashboard for universities or institutions. We’ve also conducted a pilot study for other institutions. We are also considering publishing this work as a research paper to increase its visibility. 

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups organize, share information, and support each other worldwide. We have given grants over the past four years, encompassing over 68,000 members in 33 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.

News from R Submissions Working Group – Pilot 3 Successfully Reviewed by FDA

Blog contributed by Ning Leng, People and Product Leader, Roche-Genentech and Joel Laxamana, Principal Data Scientist, Roche-Genentech

The R Consortium is pleased to announce the successful completion of the Pilot 3 submission, which extended the work done in Pilots 1 and 2 by generating the ADaM datasets. The complete FDA response letter is available.

The objective of the R Consortium R Submission Pilot 3 Project is to test the concept that an R-language based submission package for ADaMs and TLFs can meet the needs and expectations of the FDA reviewers, including code review and the reproducibility of analyses. All submission materials and communications from this pilot are publicly available, with the aim of providing a working example for future R-language based FDA submissions. This is an FDA-industry collaboration through the non-profit organization R Consortium.

All submission materials can be found at: submissions-pilot3-adam-to-fda. This is the first publicly available R-based FDA submission package including R scripts to generate Analysis Data Models (ADaM) and Tables, Listings & Figures (TLFs).

Open-Source Collaboration in standardizing Clinical Trial analyses and Submissions. Broadening ways to bring treatments to Patients.

Pilot 3 Timeline

The initial package was submitted through the eCTD gateway on Aug 28, 2023. FDA verbal responses were received between January and July 2024 during R Submissions Working Group meetings. Documentation of this initial feedback and response can be found at response-FDA-IR-pilot3.pdf. The updated submission package addressed reported issues and was re-submitted on Apr 19, 2024. The final response letter from FDA was received on Aug 8, 2024.

Pilot 3 Scope

The Pilot 3 test submission exemplifies an all-R submission package for ADaMs and TLFs, adhering to electronic Common Technical Document (eCTD) specifications. This comprehensive package not only includes ADaMs and TLFs but also emulates a full study submission package, including the source Study Data Tabulation Models (SDTMs) used to generate the Pilot 3 ADaMs. It also encompasses the installation and loading of the proprietary {pilot3utils} R package, various open-source R packages, R scripts for the Analysis Data Model (ADaM) datasets derived from Pilot 3, and Tables, Listings, and Figures (TLFs) from Pilot 1. In addition to other requisite eCTD components, the Pilot 3 package includes the Analysis Data Reviewer’s Guide (ADRG), which provides detailed steps for executing the analysis R scripts to reproduce the ADaMs and TLFs from an FDA reviewer’s perspective. These Pilot 3 submission materials are linked above.

Pilot 3 serves as a complement to Pilots 1 and 2, which demonstrated the feasibility of submitting TLF R scripts and R Shiny code, respectively. Furthermore, Pilot 3 successfully validated the submission of proprietary R packages in compressed file formats, serving as another alternative to {pkglite} or installation directly from GitHub.

If you have any questions about this Pilot 3 submission, we would love to hear from you. Feel free to submit any questions as a new issue in our Pilot 3 GitHub repository, or find any of the Pilot 3 team members in the Pharmaverse Slack.

Learnings from Pilots 1 through 3

In the three pilots, for various TLFs, the working group members intentionally created different tables in different formats using various open-source R packages. The FDA staff successfully accepted and reproduced the results generated from these different open-source packages. However, though not in scope of these Pilots, we want to raise awareness that sponsors are responsible for selecting open-source packages that demonstrate sufficient reliability. Further information on this can be found in the R Validation Hub, formed in 2018 by the PSI AIMS Special Interest Group and supported by the R Consortium. It offers tools like {riskmetric} to quantify the “risk” of R packages and a user-friendly R Shiny app, {riskassessment}, to evaluate package reliability.

The majority of FDA staff feedback falls under the following themes:

  1. Clear ADRG documentation on computing environment, package dependencies, and expected warnings
  2. Clear documentation on data processing rules and statistical method implementation
  3. Good statistical practice in confirmatory trials, such as avoiding the possibility of “p-hacking” 

For future submissions using open-source languages, it is recommended to give special attention to recommendation theme 1. Recommendation themes 2 and 3 are language-agnostic and should always be followed, regardless of the programming language used. All of these themes fall in line with the FDA’s Statistical Software Clarifying Statement.

Upcoming Pilots

As a next step, the R Consortium R Submission Working Group initiated submission Pilot 4 to explore the use of novel technologies such as Linux containers and WebAssembly to bundle a Shiny application into a self-contained package, facilitating a smoother process of both transferring and executing the application.

The R Consortium R Submission Working Group

The R Consortium R Submissions Working Group is focused on improving practices for R-based clinical trial regulatory submissions.

To bring an experimental clinical product to market, electronic submission of data, computer programs, and relevant documentation is required by health authority agencies from different countries. In the past, submissions have been mainly based on the SAS language. 

In recent years, the use of open source languages, especially the R language, has become very popular in the pharmaceutical industry and research institutions. Although the health authorities accept submissions based on open source programming languages, sponsors may be hesitant to conduct submissions using open source languages due to a lack of working examples.

Therefore, the R Submissions Working Group aims to provide R-based submission examples and identify potential gaps during submission of these example packages. All materials, including submission examples and communications, are publicly available on the R Consortium GitHub page: https://github.com/RConsortium

The R Consortium R Submissions Working Group includes members from more than 10 pharmaceutical companies, as well as regulatory agencies. More details of the working group can be found at: https://rconsortium.github.io/submissions-wg/

The R Consortium R Submissions Working Group is open to anyone who is interested in joining. If interested, please contact Joseph Rickert at director@r-consortium.org.

Pilot 3 FDA Reviewers

FDA reviewers included Hye Soo Cho, Paul Schuette, and Youn Kyeong Chang.

Pilot 3 Developers

The Pilot 3 development team included Joel Laxamana (Project Lead, Roche), Robert Devine (J&J), Benjamin Straub (GSK), Kangjie Zhang (Bayer), Thomas Neitmann (Roche), Phanikumar Tata (Syneos), Steven Haesendonckx (J&J), Yutong Liu (Moderna), Lei Zhao (Roche), Nicole Jones (Merck), Benjamin Wang (Merck), Dadong Zhang (Illumina), and Declan Hodges (GSK).

Building Bridges in Haifa, Israel: How the New R User Group in Haifa is Establishing a Diverse R Community

The R Consortium recently interviewed Eli Eydlin, a dedicated member of the R community who has been instrumental in establishing an R User Group in Haifa, Israel. With a background in physics and a recent shift into the biotech industry, Eli was motivated to create the group after noticing the absence of a local R community in his new city. Despite Haifa’s relatively small size, it boasts a diverse R community, including professionals from high-tech companies, academia, and startups. Eli shared his experience organizing their first Meetup, which featured speakers from vastly different backgrounds, and his plans to make future events more inclusive. His story highlights the importance of community building and the impact of taking the initiative, offering inspiration for others looking to contribute to their local R communities.

Please share about your background and involvement with the RUGS group.

I’ve been working in a biotech startup for nearly two years. My background is more on the pharma side, but I decided to dive into this new field. When I moved, I noticed that there wasn’t an R User Group in the area, even though I knew of other groups in different cities and countries that were doing great things. I didn’t like the idea of not having one here, so I decided to start one myself. We just had our first event, and I’m really excited to be a part of this initiative.

Can you share what the R community is like in Haifa? 

One of the reasons I started this was to meet new people who are also R programmers or users. I already know that the community is really diverse. My city isn’t huge—around 300,000 people—but has much to offer. There are big high-tech companies like Microsoft and Amazon here that support their R&D departments, and I’m certain that some of our members are involved. Like in many places in Israel, there are hundreds of small startups. What I find interesting is that people come from all sorts of backgrounds—mostly from academia, as usual, but now also from government, traditional companies, and small businesses. I’m excited to see where this leads.

You had a Meetup on August 6th, 2024. Can you share more about the topic covered? Why this topic?

We had two completely different topics and two amazing speakers. Sofia Nazarova is a marine biologist at Israel Oceanographic & Limnological Research who also organizes private guided tours. She teaches people about plants and animals, so she’s not a programmer. However, she co-authored the first-ever R textbook published in Russian, which makes her experience unique. Typically, I work with people who are programmers or data scientists, but she’s out there in the field, literally working with marine life.

Our second speaker was Adi Sarid, the CEO of Sarid Institute LTD, a data science company focusing on production and consumption. He’s also writing a book on R, this time in Hebrew. It’s a completely different experience—he’s a business leader working with governments and large firms, and he showcased some fantastic examples of practical applications in his talk.

I deliberately chose speakers with very different backgrounds because that interests me. While organizing the group, I thought about what I wanted to learn and the connections I wanted to make.

Who was the target audience for attending this event? 

To be honest, there wasn’t a specific target audience for this event because I didn’t know anyone. I just tried to reach out to whoever might be interested in participating. After the first event, I noticed a big jump in interest, but only one woman attended. So, for the next event, I want to specifically target women and try to figure out what went wrong. It seems like many women are using R, but for some reason, they didn’t show up. We’ll address that and improve things moving forward.

Any techniques you recommend using for planning for or during the event? (Github, zoom, other) Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?   

We didn’t use any of these tools for the first event, though we may do so in the future. We wanted to stay connected, especially since we’re in Israel, and it’s important to support each other under pressure. The event went well, but after it was over, I started receiving messages from people saying they couldn’t attend because they needed to spend time with their kids or were afraid of potential security issues.

I realize how important it is to offer in-person meetings, but I now acknowledge that it’s not feasible for everyone. So, for the next event, we’ll make it easier for people to join remotely, perhaps through Zoom or a similar platform. We’ll also hold events in more secure locations. It’s clear that while in-person events are valuable, they aren’t always possible or suitable for everyone.

We would like to get to know you more personally. Can you please tell me about yourself? For example, hobbies/interests or anything you want to share about yourself.

First of all, I have a physics background and a master’s degree in it. I recently transitioned into the pharma and biotech industry, which has been a new and exciting challenge for me. My interests have always been diverse, and I’m particularly fascinated by this field, as well as by the natural beauty of Israel, especially its trees. On a different note, I’m also a harmonica player, which is another passion of mine. 

Please share any additional details you would like included in the blog. 

I want to take the initiative with the R Consortium and contribute to its efforts. I see how cool and relatively easy it is to organize such a group. Focusing on developing countries is important, but it’s also relevant here in Israel, even though we’re relatively wealthy. What motivated me was seeing others take action, and I realized that I needed to step up and organize a group as well. The tools and support provided by the R Consortium are incredibly helpful for bringing people together. So, I just wanted to say thank you for that.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups organize, share information, and support each other worldwide. We have given grants over the past four years, encompassing over 68,000 members in 33 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.

A New R Community in Ahmedabad, India, Focused on Clinical Research and Pharmaceutical Industries

The R Consortium recently interviewed Sanket Sinojia, organizer of the Ahmedabad R User Group (ARUG). With over 14 years of experience in statistical programming and data sciences in the clinical research industry, Sanket spearheaded the formation of ARUG to create a dedicated community focused on applying R in clinical research and pharmaceutical industries. 

Since its official setup in 2023, ARUG has rapidly grown into a vibrant group with over 100 active members. The community fosters collaboration, knowledge-sharing, and mentorship, contributing significantly to the broader R ecosystem. Sanket’s dedication to data science and community building is evident in his efforts to organize impactful events, mentor new members, and drive the group’s growth.

Please share about your background and involvement with the RUGS group.

I’ve been working in statistical programming and data sciences in the clinical research industry for more than 14 years, which naturally led me to explore tools that enhance data analysis and visualization. My journey with the Ahmedabad R Users Group (ARUG) began when I recognized the growing need for a dedicated community focused on applying R in our field. I was sitting with my industry friends Paresh Parekh, Paresh Paghdar, Parin Shah, and Rahul Pandya, discussing what we could do to bring the Ahmedabad R community together. After a few days, when we met again, I proposed setting up the ARUG. They welcomed and fully supported the vision. 

While the group was initially formed in 2021, its formal setup was completed in 2023 in collaboration with the R Consortium. Since then, I’ve been organizing events, facilitating discussions, and contributing to the group’s growth. My passion for data science and community building drives my commitment to ARUG. Beyond organizing, I also mentor new members, helping them navigate the complexities of R. It’s incredibly rewarding to see the community thrive and to witness the impact of our collective efforts on advancing the field. ARUG is the first city-based group in India associated with the R Consortium and the only R user group focused on the clinical research and pharmaceutical industries.

Can you share what the R community is like in Ahmedabad? 

Ahmedabad, a World Heritage city known for its rich history and vibrant culture, is also known for its pharmaceutical industry and is now emerging as a significant hub for clinical research, data sciences, and pharmaceutical innovation. The R community in Ahmedabad is vibrant and rapidly growing, especially within these sectors. With over 100 active members, our community is a diverse mix of seasoned professionals and enthusiastic newcomers. This diversity fosters a collaborative environment where knowledge-sharing and mentorship flourish. The enthusiasm and engagement at our events reflect our members’ dedication to advancing their skills and contributing to the broader R ecosystem. We’ve seen a significant increase in participation and interest, indicating the strong potential for R to drive innovation and efficiency in our industry. The city’s dynamic and innovative spirit resonates through our community, making it an exciting place for anyone passionate about data science and clinical research.

You had a meetup on July 21st, 2024. Can you share more about the topic covered? Why this topic?

The meetup was a highly anticipated event entitled ‘R Evolution: Shaping the Future of Clinical Trials.’ We chose this topic because it aligns perfectly with the emerging demand for R in our industry and aims to make our attendees aware of the transformative potential R brings to clinical research. Here’s a quick recap:

I started the event by taking attendees through the inspiring journey of the Ahmedabad R Users Group (ARUG). I shared key updates, our association with the R Consortium, and outlined ARUG’s long-term goals. The overwhelming response, with event registration filling up within 24 hours, highlighted the high expectations and enthusiasm of our attendees!

Sunil Gupta, a renowned industry expert, delivered an incredible presentation on ‘R Made Easier for SAS Programmers.’ His insights, tips, and tricks on transitioning to R were invaluable, offering a comprehensive learning experience.

Following his session, we enjoyed a vibrant networking session, making new connections and reconnecting with old friends.

Chintan Patel captivated us with a simplified demonstration of creating Kaplan-Meier plots using R. His comprehensive coverage of the topic inspired many to delve deeper into graphing with R.

Mitesh Patel then presented ‘R Shiny in Action,’ explaining the fundamentals and applications of R Shiny, complete with a live demonstration of three case studies. His session highlighted R Shiny’s potential as a game-changer in data visualization.

I then hosted a dynamic round table discussion on conferences, joined by Parin Shah, Purvi Kalra, and Saumil Tripathi. They shared invaluable tips on conference preparation, R trends, and enhancing presentation confidence.

Finally, Krupa Trivedi, Nidhi Shah, and Nikunj Kothari organized a thrilling ‘Test Your R Skills’ quiz. The live, time-bound quiz with an instant leaderboard brought a fun and competitive spirit to the hall.

The event wrapped up with Rahul Pandya’s vote of thanks, where we celebrated our presenters, the ARUG event team, and quiz winners.

The ARUG Event Team – Ankit Vadodariya, Dishant Parikh, Krupali Ladani, Krupa Trivedi, Nidhi Shah, Nikunj Kothari, Paresh Paghdar, Paresh Parekh, Parin Shah, Purvi Kalra, Rahul Pandya, and Sanket Sinojia – successfully organized the entire event. Their hard work and dedication ensured a seamless and impactful experience for all attendees.

Who was the target audience for attending this event? 

Our target audience primarily includes professionals and researchers from the clinical and pharmaceutical industries. This encompasses data scientists, statistical programmers, biostatisticians, clinical trial analysts, and anyone interested in leveraging R for data analysis and visualization in these fields. We also welcome newcomers who are eager to learn and contribute to this dynamic industry. By targeting this specific audience, we ensure our content is highly relevant and impactful, addressing real-world challenges and opportunities our members face. Additionally, our events offer networking opportunities that can lead to collaborations.

Are there any techniques you recommend for planning or running the event (GitHub, Zoom, other)? Can these techniques be used to make your group more inclusive of people who are unable to attend physical events in the future?

For planning and executing our events, we rely on a combination of tools such as GitHub for collaborative project management and repositories, Zoom for virtual sessions, and the ARUG Meetup and LinkedIn pages for continuous engagement and communication. These tools not only streamline our processes but also enhance accessibility. By utilizing these technologies, we can include members who cannot attend physical events, making our group more inclusive. We share our event summaries and slides on GitHub, ensuring that valuable information is accessible to all members. Additionally, we are considering recording sessions and sharing them on platforms like YouTube to further extend the reach of our events. Using these tools also allows us to gather feedback and improve future events, ensuring we meet the evolving needs of our community. This approach helps us maintain a vibrant and connected community, regardless of geographical limitations.

We would like to get to know you more personally. Can you please tell me about yourself? For example, hobbies/interests or anything you want to share about yourself.

Outside of my professional life, I have several passions that I pursue with great enthusiasm. I enjoy blogging and sharing my thoughts on data science, industry trends, and personal reflections. The mountains always call me, and I find peace and inspiration in hiking and exploring nature. Additionally, I write poetry in my local language, allowing me to express my creativity and connect with my cultural roots. These hobbies provide a balanced and enriching life, complementing my professional endeavors.

Apart from ARUG, I volunteer with organizations like CDISC, PHUSE, Pharmaverse, and RinPharma, contributing to various projects such as SDTM IG v3.4 development, CDISC Primer development, and Shiny for Submission. These experiences offer me a unique perspective and a sense of tranquility that I bring into my work, enhancing my creativity and problem-solving abilities. Engaging in these activities helps me maintain a holistic approach to my professional and personal life.

I cherish spending time with my family, especially with my adorable daughter Saanvi. My better half, Niza, always supports and encourages me in all aspects of life. She is dedicated to taking care of our family and provides unwavering support that allows me to pursue my passions and professional goals.
