Quick update on the 2024 R User Groups (RUGS) Program. The review of the first batch of grants is in progress, marking the beginning of the awarding phase. But there’s still plenty of time for you to apply!
Interested in building up your own R User Group and creating a strong R community where you live? R Consortium would like to help!
User Group Grants: Support for enhancing user engagement and user-centric initiatives.
Conference Grants: For organizing or attending events aligned with R Consortium goals.
Special Projects Grants: For groundbreaking ideas needing an initial push.
For details and to apply, visit here. Your participation is pivotal for the growth of the R language.
R User Groups: Strengthening Global Connections
With 74 active groups and over 67,000 members, R communities are a melting pot of knowledge and innovation. On our blog, discover what other RUGS organizers are doing and how they have solved tough problems such as restarting in-person meetings, finding good locations, and communicating effectively with their community. Many RUGS case studies are available.
Key Dates:
Application Period: Open through September 30th, 2024, but don’t wait!
Note: Grants are not for software development or technical projects. For those, consider the ISC Grant Program. Learn more here.
Join us in building the worldwide R community. Apply now and be part of the journey!
Last year, Antonio Hegar of the R Glasgow user group shared the challenges of organizing an R user group in Glasgow. The group now regularly hosts events, attracting local R users and experts. Antonio shared with the R Consortium the group’s journey and anecdotes that have helped it to build momentum. He also shared his hopes for maintaining this momentum, with speakers lined up for the next three events.
Antonio also discussed his work with R for his PhD research in data analysis for healthcare. He spoke about the ever-evolving nature of R and some of the new developments that have been useful for his research.
What’s new with the Glasgow R User Group since we last talked?
As we discussed the last time, one of the most pressing issues we faced as a local R user group was our lack of engagement with the community. This is particularly interesting given that both Glasgow and Edinburgh have their own R user groups. Both cities are only an hour apart, yet we weren’t seeing the same level of engagement as other groups in the UK.
To address this issue, we have been strategizing and holding several meetings. To summarize, we discussed improving our marketing and engagement with our audience. We also decided to hold one final meeting at the end of the year.
Besides our internal meetings, we also hosted two R events. One of the group’s founders, Andrew Baxter, a postgraduate researcher at the University of Glasgow, has been instrumental in organizing these events. Because he works at the University of Glasgow, he has access to many resources, including physical venues and fellow academics, and this has been a major plus in facilitating our engagement.
Previously, I had been trying to do what other groups have done: finding random venues and hosting events there. However, this was not as effective as we had hoped.
From the discussions that we had, as well as listening to our audience, we learned that people who are interested in working with R have very specific wants and needs. If these needs are not being met, then it is unlikely that people will be attracted to the group, and as such, we had to reframe our approach to attracting people.
We recognized that having a specific venue is key. We now hold the vast majority of our meetings at the University of Glasgow. This seems to be very appealing to people, as they enjoy the academic setting. Furthermore, the University of Glasgow is well known and respected, not just in Scotland but across the world, and that reputation helps to draw people in.
The second thing that proved essential was consistency. Holding a meeting one month and then leaving a gap breaks the flow and sends the wrong message to your audience. When people see that you are committed to what you want to do, they respond to that and are more likely to be engaged in the community.
We had a final meeting in December, and Andrew Baxter contacted Mike Smith, one of the local R Consortium representatives. Mike is based in Dublin, Ireland, but frequently travels back and forth to Scotland. He drew on his network to recommend speakers and topics for the conference. This was particularly helpful in attracting people from industry, who are often interested in the latest developments in R. Mike has been a tremendous asset to the group since our meeting in December.
A venue, people on the inside of the industry, and a consistent schedule have been the three key components. Three speakers have been lined up for early 2024: one each for January, February, and March.
Given our academic and industry contacts, we will not have much difficulty finding additional speakers. At most, we must determine who will speak on which topic and when they will be available, which is not difficult. Based on the current situation, it does not appear that we will have any trouble maintaining momentum and keeping the meetings going.
What industry are you currently in?
I am a PhD student at Glasgow Caledonian University. My PhD research focuses on data science applied to health, specifically using machine learning to predict disease outcomes.
I am interested in understanding why some people who experience an acute illness, such as COVID-19, develop long-term health problems. In some countries, up to 10% of people who contract COVID-19 never fully recover. These individuals may experience permanent shortness of breath, headaches, brain fog, joint pain, and other symptoms.
I am currently researching how data science can be used to answer questions such as these, using large data sets from, for example, the NHS. R is the primary tool used for this research.
When we last spoke, I was in the second year of my PhD. I am now in my third and final year. I should be submitting my dissertation before the end of this year. Balancing my commitments to R, my PhD work, and other activities is challenging, but I have managed to pull it off.
How do you use R for your work?
I use R extensively. One of R’s most beneficial aspects is that it’s constantly evolving and expanding. As a result, it is impossible to master everything. You do not master R; rather, you master the areas of R relevant to your research or area of expertise. In my research, I found several medical statistics and biostatistics packages extremely useful. I was aware of a few of them but unaware of how many there were.
For example, consider a task that I began working on yesterday. In medical data, particularly when analyzing health conditions, it is common for individuals to have multiple health conditions that are often linked. This often makes it more difficult for doctors to treat them and for individuals to recover fully.
If I were to apply classical statistics using base R, this would be very time-consuming. However, I recently discovered that there are also medical statistical packages specifically designed for analyzing data for individuals with comorbidities. For example, if I wanted to analyze individuals suffering from diabetes, hypertension, cancer, obesity, or a combination of different diseases, I could do so using these packages.
In addition, it is possible to create a score that estimates how likely a person who becomes ill and goes to the hospital is to stay for a long time or die. It is possible to perform this task using regular statistics and programming in R, but it would be very tedious. In my case, I am working on a tight deadline and need to submit my work by a specific date. I believe the package I am speaking of is the comorbidity package in R. It was developed recently by researchers at the London School of Hygiene & Tropical Medicine and is an invaluable tool.
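To make the idea of a weighted comorbidity score concrete, here is a minimal Python sketch. It is not the comorbidity R package and does not reproduce any published index: the condition names and weights below are hypothetical values chosen only for illustration.

```python
# Hypothetical weights for illustration only; NOT a validated clinical index.
ILLUSTRATIVE_WEIGHTS = {
    "diabetes": 1,
    "hypertension": 1,
    "obesity": 1,
    "cancer": 2,
}


def comorbidity_score(conditions, weights=ILLUSTRATIVE_WEIGHTS):
    """Sum the weights of a patient's distinct recorded conditions.

    Duplicate records of the same condition count once, and conditions
    without a weight contribute nothing.
    """
    return sum(weights.get(condition, 0) for condition in set(conditions))


# Made-up patients, each with a list of recorded conditions.
patients = {
    "p1": ["diabetes", "hypertension"],
    "p2": ["cancer", "diabetes", "diabetes"],
    "p3": [],
}

scores = {pid: comorbidity_score(conds) for pid, conds in patients.items()}
print(scores)
```

In practice, a score like this would then be used as a predictor in a model of outcomes such as length of stay, which is the step a dedicated package saves the analyst from building by hand.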
I work with NHS data through a third-party organization that controls it and allows me access to it. Last year in December, they provided me with brief training and taught me how to access their data on a DBS SQL server using SQL queries embedded in R code.
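The pattern of embedding SQL queries in analysis code can be sketched as follows. This is a minimal Python example that uses an in-memory SQLite database as a stand-in; the workflow described above uses R against a remote SQL server, and the table, column names, and rows here are invented for illustration.

```python
import sqlite3

# An in-memory database stands in for the remote server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admissions (patient_id TEXT, condition TEXT, stay_days INTEGER)"
)
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?)",
    [
        ("p1", "diabetes", 3),
        ("p2", "diabetes", 5),
        ("p3", "hypertension", 1),
        ("p4", "cancer", 10),
    ],
)

# The SQL lives inside the analysis script; the threshold is passed as a
# bound parameter rather than pasted into the query string.
query = """
    SELECT condition, COUNT(*) AS n, AVG(stay_days) AS mean_stay
    FROM admissions
    WHERE stay_days >= ?
    GROUP BY condition
    ORDER BY condition
"""
rows = conn.execute(query, (2,)).fetchall()
conn.close()

for condition, n, mean_stay in rows:
    print(condition, n, mean_stay)
```

Letting the database do the filtering and aggregation means only the summarized rows reach the analysis environment, which matters when the underlying data set is large or access-controlled.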
Learning about very niche packages, which are very content-specific or topic-specific, is very useful for researchers like myself. Integrating different programming languages is also useful because they are increasingly converging: there is a lot of cross-fertilization among Python, Julia, R, and Java, and among the packages built on them. If R continues to streamline its integration with other languages and packages, it will be a win-win situation for everyone.
What is the R Community like in Glasgow? What efforts are you putting in to keep your group inclusive for all participants?
We are not trying to cater to one specific level of expertise. The last meeting had a good mix of participants, including PhD students, undergraduates, people who have worked in finance and tech, software developers, and an individual from the R Consortium in Dublin, Ireland.
The group is open to everyone, and we are trying to mix participants with different needs, wants, and interests. It is understood that attendees will choose which events they would like to attend. Certain events will focus more on entry-level individuals beginning their R learning journey. For example, they may be interested in learning what they can do with ggplot and the tidyverse.
Mid-level individuals, including graduate students, will also be targeted. A portion of these students are novices, but many are more experienced. They have a strong foundation in R and RStudio or Posit. However, they are now seeking to learn more advanced techniques, such as how to perform specific calculations. For instance, they may be working with quantitative or qualitative data and are now at the analysis stage of their research and wonder what to do next.
Finally, there are a small number of highly experienced programmers who are interested in learning more about integrating specific features into a package. They may want to know how to create their own packages and launch them. They are also interested in learning about Shiny and Quarto and how they can use these tools for their businesses or companies.
Most individuals fall into the beginner or intermediate levels, but there are a few who are highly advanced and still interested in attending. As a result, most of the talks will be geared toward individuals with intermediate-level experience. This will ensure that the material is not too advanced for beginners but also not too basic for advanced learners.
Can you tell us about a recent event that received a good response from the audience?
Of the recent events that were particularly successful, I would like to highlight the one held in November last year. It was titled “Flex Dashboard: Displaying data with high impact using minimal code.” Erik Igelström, a researcher from the University of Glasgow, presented his use of R Shiny to display data from the Scottish government. The presentation was highly informative and demonstrated the potential of Shiny to present data in a user-friendly manner.
The meeting was attended by a representative from R Software in Ireland, who provided us with a wealth of information about industry developments, including the latest trends and upcoming projects. As a result of this meeting, 2023 was the most productive year for our R meetup.
The preceding meetups were not entirely unproductive, but the most recent one, held in November last year, laid the groundwork for the current initiatives.
Professor Lisa DeBruine will be presenting at this Meetup. She is a professor of psychology at the University of Glasgow in the School of Psychology and Neuroscience. She is a member of the UK Reproducibility Network and works in PsyTeachR. She has used the psych package extensively and many other good packages in R to conduct her psychological research. Her presentation will be on how to simulate data to prepare analyses for pre-registration.
As those who work with data know, it is sometimes counterproductive to work directly with the data itself. For example, if one is building a model, it is not advisable to use all of the data to build the model, especially if the data set is small. This is because there is a risk of over-fitting.
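One common way to guard against the over-fitting risk described above is to hold back part of the data and evaluate the model only on that held-out part. The following is a minimal Python sketch of such a split; the function name, the 25% test fraction, and the toy data are assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(20))  # stand-in for 20 observations
train, test = train_test_split(data)

# The model would be fitted on `train` only and scored on `test`.
print(len(train), len(test))
```

With very small data sets, a single split leaves little to train on, which is one reason simulated data can be a useful complement.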
Generating dummy data for quantitative data is a well-known technique. However, generating dummy data for qualitative data is rare. This is because qualitative data is often unstructured and difficult to quantify. Professor Lisa DeBruine is an expert in generating dummy data for qualitative data.
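For the well-known quantitative case, simulating data before collecting any lets you write and check the planned analysis in advance. Below is a minimal Python sketch under stated assumptions: two normally distributed groups, a hypothetical true effect of 0.5, and a difference in means as the planned analysis. None of this is taken from the talk itself.

```python
import random
import statistics


def simulate_two_groups(n_per_group, effect, seed=1):
    """Draw a control and a treatment group from normal distributions."""
    rng = random.Random(seed)  # fixed seed so the simulation is reproducible
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treatment = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
    return control, treatment


def planned_analysis(control, treatment):
    """The analysis fixed in advance: difference in group means."""
    return statistics.mean(treatment) - statistics.mean(control)


control, treatment = simulate_two_groups(n_per_group=200, effect=0.5)
estimate = planned_analysis(control, treatment)
print(round(estimate, 2))
```

Because the true effect is known in simulated data, you can check whether the planned analysis recovers it before committing to that analysis in a pre-registration.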
SPSS is a popular statistical software package used by sociologists, anthropologists, and psychologists. However, R is a more powerful and flexible tool that can perform a wider range of analyses. Learning to use R and the psych package can greatly simplify the process of conducting factor analysis. Additionally, R can be used to perform calculations and analyses that are impossible in SPSS.
Our team is highly capable, and we have another team member who is particularly skilled in generating graphics and designing flyers. He has been responsible for creating the promotional material and has done an excellent job.
The R Consortium recently connected with Lampros Sp. Mouselimis, the creator of the ICESat2R package, discussing the ICESat-2 mission, a significant initiative in understanding the Earth’s surface dynamics. This NASA mission, utilizing the Advanced Topographic Laser Altimeter System (ATLAS), provides in-depth altimetry data, capturing Earth’s topography with unparalleled precision.
Mouselimis’ contribution, the ICESat2R package, is an R-based tool designed to streamline the analysis of ICESat-2 data. It simplifies accessing, processing, and visualizing the vast datasets generated by ATLAS, which emits 10,000 laser pulses per second to measure aspects like ice sheet elevation, sea ice thickness, and global vegetation biomass. This package enables users to analyze complex environmental changes such as ice-sheet elevation change, sea-ice freeboard, and vegetation canopy height more efficiently and accurately. The R Consortium funded this project.
Lampros Sp. Mouselimis is an experienced Data Analyst and Programmer who holds a degree in Business Administration and has received post-graduate training in Data Processing, Analysis, and Programming. His preferred programming language is R, but he can also work with Python and C++. He is an open-source developer, and you can find his work on GitHub. With over a decade of experience in data processing using programming, he mainly works as a freelancer and runs his own business, Monopteryx, based in Greece. Outside of work, Lampros enjoys swimming, cycling, running, and tennis. He also takes care of two small agricultural fields that are partly filled with olive trees.
You built an R package called ICESat2R using the ICESat-2 satellite. Do you consider your ICESat2R project a success?
IceSat2R has 7,252 downloads, which, considering the smaller group of researchers who focus on using ICESat-2 data, qualifies it as a popular tool. It’s not as popular as some other remote sensing packages, but I believe it’s been a success based on two main points:
Contribution to the R users community: I hope that the R programmers who use the IceSat2R R package are now able to process altimetry data without any issues, and if any issues do arise, I’ll be able to resolve them by updating the code in the GitHub and CRAN repositories.
Personal and Professional achievement: I applied for a grant to the R Consortium, and my application was accepted. Moreover, I implemented the code by following the milestone timelines. Seeing a project through and providing it publicly is a success, I believe.
Who uses ICESat2R, and what are the main benefits? Any unique benefits compared to the Python and Julia interfaces?
The users of the ICESat2R package can be professionals, researchers, or R programming users in general. I assume that these users could be:
Ice scientists, ecologists, and hydrologists (to name a few) who would be interested in the altimeter data to perform their research
Public authorities or military personnel, who, for instance, would like to process data related to high-risk events such as floods
Policy and decision-makers (the ICESat-2 data can be used, for instance, in resource management)
R users that would like to “get their hands dirty” with altimeter data
I am aware of the Python and Julia interfaces, and to tell the truth, I looked at the authors’ code bases before implementing the code, mainly because I wanted to find out the exact source they used to download the ICESat-2 data.
Based on the current implementation, I would say that the benefits of the ICESat2R package are the following:
The R programming users can use NASA’s OpenAltimetry interface, which, as of December 2023, doesn’t require any credentials
There are many examples where the ICESat2R package can be used. For instance, a potential use case would be to display differences between a Digital Elevation Model (Copernicus DEM) and land-ice-height ‘ICESat-2’ measurements. The next image shows the ICESat-2 land-ice-height in winter (green) and summer (orange) compared to a DEM.
More detailed explanations related to this use case exist in the Vignette ICESat-2 Atlas Products of the package.
Were there any issues using OpenAltimetry API (the “cyberinfrastructure platform for discovery, access, and visualization of data from NASA’s ICESat-2 mission”)? (NOTE: Currently, the OpenAltimetry API website appears to be down?)
Currently, I have an open issue in my GitHub repo related to this migration. Once the OpenAltimetry API becomes functional again, I’ll submit the updated version of the ICESat2R package to CRAN.
In your blog post for the copernicusDEM package, you showed a code snippet showing how it loads files, iterates over the files, and uses a for-loop to grab all the data. Can you provide something similar for ICESat2R?
Whenever I submit an R package to CRAN, I include one (or more) vignettes that explain the package’s functionality. Once the package is accepted, I also upload one of the vignettes to my personal blog. This was the case for the CopernicusDEM R package.
The current version of IceSat2R on CRAN (https://CRAN.R-project.org/package=IceSat2R) is 1.04. Are you still actively supporting IceSat2R? Are you planning to add any major features?
Yes, I still actively support IceSat2R. I always respond to issues related to the package and fix potential bugs or errors. The NEWS page of the package includes the updates since the first upload of the code base to GitHub.
I don’t plan to add any new features in the near future, but I’m open to pull requests in the GitHub repository if a user would like to include new functionality that could benefit the R programming community.
About ISC Funded Projects
A major goal of the R Consortium is to strengthen and improve the infrastructure supporting the R Ecosystem. We seek to accomplish this by funding projects that will improve both technical infrastructure and social infrastructure.
Recently, Kamil Sijko of the Warsaw R User Group discussed with the R Consortium his transition from academia to leading data science in the business sector. He noted the current dormancy of Warsaw’s R community and the eagerness to revive its dynamic, pre-COVID meetups. The group’s latest meeting explored new, interactive formats to engage its diverse membership better.
Please share about your background and involvement with the RUGS group.
During my early academic years at the University of Social Sciences in Warsaw, we explored several interesting projects, one of which was ‘webiR’ in 2009. This project was an attempt to blend R’s capabilities with web application development, which was not very common at the time. We developed webiR a few years before the advent of Shiny in 2012 with the idea of making R more accessible to non-technical users.
While webiR might not be widely remembered today, unlike the widely successful Shiny, it represented our early efforts to simplify data analysis. The application allowed users to choose survey questions they were interested in, and then it would automatically select suitable analyses through a set of heuristics. This approach aimed to eliminate the need for users to understand the underlying R functions, making data analysis more approachable.
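The idea of choosing an analysis automatically from the selected questions can be sketched as a small rule table. The Python example below is an illustration only: the rules, the type labels, and the function name are assumptions and do not reproduce webiR's actual heuristics.

```python
def suggest_analysis(type_a, type_b):
    """Suggest an analysis from the measurement types of two survey questions.

    Each type is either "categorical" or "numeric" (hypothetical labels).
    """
    kinds = {type_a, type_b}
    if kinds == {"categorical"}:
        return "chi-squared test of independence"
    if kinds == {"numeric"}:
        return "correlation"
    if kinds == {"categorical", "numeric"}:
        return "comparison of group means"
    raise ValueError(f"unsupported question types: {type_a!r}, {type_b!r}")


# Example: the user picks a categorical and a numeric question.
print(suggest_analysis("categorical", "numeric"))
```

The user only picks questions; the mapping from question types to a statistical method stays hidden behind the rule table, which is the kind of simplification described above.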
Although webiR wasn’t a major success, it was a valuable learning experience and a stepping stone in exploring how R could be used innovatively, especially in web development. These kinds of exploratory projects contribute to the ongoing evolution and versatility of R, which we continue to see today.
Later, I transitioned to working at research institutes, including a government-funded Polish Educational Research Institute. Now, I’m in the business sector. I serve as the Head of Data Science at Transition Technologies Science, a company that operates in the medical industry. We collaborate with pharmaceutical companies, universities, and medical scientists. My role involves leveraging data science in various aspects of the medical field.
Can you share what the R community is like in Warsaw, Poland?
The situation is dormant, but it’s good timing for a reboot. No activities have been revived since the pandemic ended. Before COVID, though, this was a hot topic of discussion. There were frequent meetups, including Python and data science gatherings. These meetups were unique, and I found them slightly unconventional in a good way. For example, Python meetups often focused on deep learning and applications in risk management or insurance.
But with R meetups, there was a broader spectrum of topics, often venturing far beyond conventional subjects. I found this diversity particularly refreshing, especially as many academics were involved, exploring a wide range of innovative applications.
One of the things that stood out was the involvement of women from the Warsaw University of Technology, who ran the ‘R Ladies’ in Warsaw. They organized numerous workshops, which were quite popular. These workshops offered an accessible entry point into data science for those looking to change careers. One interesting observation made was that R is often seen as more approachable as a first language for newcomers from different backgrounds.
We also have a strong scientific group in Warsaw led by Professor Biecek, a fervent advocate of R and leader of MI2.AI. His work in Explainable AI is cutting-edge, making us feel connected to a vibrant local scene. Another point raised was the curiosity about local technological developments, not just the global cutting-edge advancements.
I recall an initiative named ‘PoweR’ – a three-week crash course in data science that attracted about 500 participants. I didn’t participate myself, but it was impressive. Also, the fields of science like medicine, statistics, econometrics, spatial sciences, and humanities were highlighted. R is extremely popular in these areas, allowing for exploration of unique and diverse topics.
It’s clear there’s a strong desire to revive these meetups and initiatives, as they foster a unique learning environment and community spirit.
You had a Meetup on December 11th, 2023. Can you share more on the topic covered? Why this topic?
In our recent meeting, we deviated from the usual format of workshops and lectures, opting for a different approach that we may not repeat. Instead, we engaged in a peer-to-peer discussion, which was feasible due to the small number of attendees. We focused on two main topics. The first was understanding what people miss most about our meetings, as I aim to incorporate these elements when I reboot them. The second topic was exploring future directions for our meetings.
We delved into the different types of participants attending our meetings. One group comprises those familiar with R and eager to learn about advanced techniques, for whom lectures are ideal. Another group includes individuals transitioning from other fields to data science. We also considered students, particularly those favoring Python over R, and I believe it’s important to dispel any misconceptions about career prospects in R.
Additionally, we discussed members of the open source community around Warsaw, recognizing their contributions during events like hackathons. Another interesting aspect was the companies’ involvement, not just in recruitment but also in sharing their work with the community.
An unaddressed yet intriguing aspect was attendees transitioning within the data science field, seeking insights into new companies and trends. I also want to focus more on social interactions beyond just having pizza and experiment with ideas like speed dating or extended interactions with lecture presenters.
Lastly, we considered the language of our meetings. Operating in Poland, we debated whether to conduct some sessions in English, stream them, or post them on YouTube to reach a broader audience. I’m excited to experiment with these ideas, which could significantly enhance our meetings.
Who is the target audience for attending this event?
Up to this point, our focus has primarily been on individuals who are already interested in R and seeking to deepen their knowledge with expert insights. That’s been our main audience. The other significant group consists of those completely new to the field who are looking to be introduced to data science through R. These are the two main types of participants we usually have.
We aim to be more inclusive; of course, there’s the ‘R Ladies’ initiative. The ‘R Ladies’ essentially engage in the same activities as the rest of our groups, but they cater to a different audience. The content and structure of their sessions are similar to what we offer to other participants. Still, they focus on creating an inclusive environment for women interested in data science and R.
Any techniques you recommend using for planning for or during the event? (Github, zoom, other) Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
There were various opinions, but one perspective really resonated with me. COVID took away our in-person meetups, and although there was an attempt to transition them to a virtual environment, it wasn’t the same. We miss face-to-face interactions and being in the same physical space together. That’s something special.
There were instances where, despite people already gathering in the room, we had to announce that the expert wouldn’t be able to come and would instead join via Zoom. This often led to disappointment, with some attendees leaving the room immediately, as they weren’t interested in a virtual presentation. After all, there’s plenty of similar material available online.
One comment struck me: even though we could have experts from RStudio (now Posit) or other places speak to us from across the ocean about their latest developments, this information is already accessible on platforms like YouTube. The experience is likely to be similar. In terms of using Zoom or similar virtual platforms, we’re leaning towards not pursuing that path for future meetups.
We would like to get to know you more on the personal side. Can you please tell me about yourself? For example, hobbies/interests or anything you want to share about yourself.
A fun fact about me is my deep involvement in an initiative focused on teaching children creative computer skills. I’ve found it incredibly rewarding to help kids learn how to use technology creatively. It’s a lot of fun, both for me and the children. For instance, I recently prepared workshops on creating Electronic Dance Music (EDM). These workshops cover aspects like sampling and looping. I find this work enjoyable and immensely fulfilling, as it combines my passion for technology with the joy of teaching and engaging with children.
Additionally, in my work with CoderDojo, I’ve had the opportunity to engage children in programming projects, including a special focus on encouraging a group of girls. We utilized ‘Kodu Game Lab’ for these sessions, a platform that offers a more immersive, video game-like environment for coding. This platform enabled the children to learn programming concepts in a playful manner, such as coding a robot to follow or avoid objects and even creating their own simple games.
A key moment came when the girls highlighted a significant limitation: the lack of relatable characters in the games, noting the predominance of robots and other figures but a conspicuous absence of princesses or characters they could identify with. This feedback was invaluable and led us to adapt our approach. We creatively worked around this limitation by incorporating an object—a ‘tag’—which we collectively imagined as a princess needing rescue. This improvisation turned into a unique game by the end of the day.
This experience was not just fun but also enlightening, underscoring the importance of CoderDojo’s approach in offering unique insights into how different groups perceive technology. It highlighted the need to understand and address diverse perspectives and requirements in technology, especially when introducing young minds to the world of programming.
How do I Join?
R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups organize, share information, and support each other worldwide. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.
In the rapidly evolving sphere of pharmaceutical data analysis, a significant transition is taking place – the shift from traditional SAS to the versatile R programming language. Pfizer, a trailblazer in the pharmaceutical industry, is leading this change. We are excited to invite you to an exclusive webinar that will cover details about how Pfizer has succeeded and what the benefits are: “From Vision to Action: The Pfizer R Center of Excellence-led Journey to R Adoption,” scheduled for February 8th, 2024, at 3:00 pm ET.
At the heart of Pfizer’s data analysis revolution is the adoption of R – a language known for its robust community-driven development and open-source nature. This move is not just about changing tools; it’s about embracing a culture of innovation and collaboration.
The journey began with an internal query at Pfizer: How many of our colleagues are proficient in R? The answer led to the unveiling of a latent community of R users, eager yet unconnected. In 2022, an internal survey highlighted the presence of over 1,500 R users, a clear sign of a burgeoning community within Pfizer.
In response, Pfizer established the R Center of Excellence (CoE) in 2022. This initiative marked a shift from scattered individual efforts to a cohesive, strategic approach to R adoption. The CoE, celebrating its first anniversary in 2023, has become a linchpin in nurturing Pfizer’s vibrant R community.
Webinar Highlights:
This upcoming webinar, hosted by the R Consortium, is more than just a case study. It’s a treasure trove of insights for fostering an engaged R community. The session will cover:
Pfizer’s journey in building a robust R community.
Practical strategies applicable across various industries.
Understanding the critical role of an engaged R community in data analysis.
Join Us for the Webinar:
This is an unmissable opportunity for anyone interested in data science, R programming, or community building within large organizations. By attending this webinar, you will gain firsthand insights into how Pfizer successfully integrated R into its data analysis practices and how you can apply these learnings to your organization.
Don’t miss this opportunity to learn from Pfizer’s experience and expertise. Register now for the webinar on February 8, 2024, at 3:00 pm ET and be a part of the conversation shaping the future of pharmaceutical data analysis.
The Karachi R User Group, Pakistan, hosted its second event, “Unveiling the Power of R Shiny Dashboards,” on December 30, 2023. The R Consortium spoke with Uzair Aslam, the group’s founder, about the challenges of starting an R User Group in a budding R community. He also discussed his data analysis project for studying the health deficiencies experienced by the Pakistani population.
Please share about your background and your involvement in the R Community.
My name is Uzair Aslam, and I earned my BSc in Economics and Mathematics from the Institute of Business Administration (IBA), Karachi. I have a keen interest in data science, statistics, and econometrics. After graduating, I co-founded a consulting firm called StatDevs. I work with two developers to develop R and Shiny applications for our clients.
At StatDevs, we solve complex problems using data science solutions and data analytics. R is a core language for us, and we’re experienced in Python, too. However, we are focused on R because of its strengths in data analysis, data visualization, and the development of Shiny applications.
My motivation for starting this group came from watching online events of R user groups in the USA and Europe. I attended the presentations and listened to what R is capable of and how they are bringing R to their communities. I noticed much R activity on that side of the world, but nothing was happening on the Asian side. That is when I wanted to make people realize that they could use R for their data analysis in academia and industry so they can solve more problems.
R User Group Distribution Around the World, from Ben Ubah’s R Community Explorer repo, which uses the meetupr package to query the Meetup API
Currently, Pakistan lacks a sense of community among R users. Tech communities are not nurtured properly, not built properly, and they are not contained properly.
I contacted the R Consortium and shared my story of wanting to establish an R user group as the organizer to promote the language.
Can you share what the R community is like in Pakistan?
I have observed that R is used in academia, but not to the extent it should be. I have seen a couple of professors at IBA and some in Islamabad who use R but also use Stata and Excel for their academic purposes and data analysis. In terms of industry, Power BI and Excel are used extensively. This is because not many people know R’s data analysis and analytics capabilities. R has not gained wider acceptance because of this lack of awareness. Some academic researchers use R but may need more training to get the most out of what R offers them. Karachi R User Group aims to narrow this gap.
Are there any particular challenges you have faced in organizing this RUG?
Indeed, getting people to participate in this R user group has been a challenge. I held our first meetup myself last month, in November, and only 4 or 5 people attended. I prepared for the meetup for about two weeks because I wanted an excellent introduction and everything, but few people showed up. Of those five people, one was my co-founder, and two were participating from the US and Brazil. There was only one person from Pakistan. This happens when you introduce something new in a place where people are unaware of it. My job is to continue this effort and tell people about the possibilities and opportunities of data analysis and consulting using R.
As we approach our second meetup, more people are showing interest, and the number is growing daily. I am not active on Instagram and not very active on Twitter. However, I use LinkedIn and Facebook as my platforms to reach people. On Facebook, I have joined multiple groups, so I share information about the meetups in these groups. Lately, I have been realizing that I should use Twitter as well because I have seen more people promoting their R events on Twitter.
Currently, we have 100 members in our user group, and the upcoming meetup is titled “Unveiling the Power of R Shiny Dashboards.” Jehangeer Aswani is the speaker for this event. Jehangeer is a professional freelancer on Upwork and is based in Islamabad. Due to his motivation and my idea, we started this R user group. He is one of the people I look to for motivation. He has a bachelor’s degree in Statistics and provides R Shiny consulting services.
This meetup is about the fundamental concepts of R Shiny. One may wonder why R Shiny is relevant when we have Power BI and Excel. Jehangeer will provide a hands-on experience with R Shiny applications. This will help participants understand why R Shiny is a better tool. In addition, this meetup will unlock the potential to transform data into captivating visualizations. Participants will also learn how to build R Shiny dashboards. They will get hands-on experience with a real-world application that can be used to solve a business case.
Please share about a project you are working on or have worked on using the R language. Goal/reason, result, anything interesting, especially related to your industry?
I used R for micro-analysis in the public health domain. I collaborated with a consultant in Karachi, Pakistan, named Jaweid Ishaque. We worked on a data analysis project for Indus Hospital and Health Networks, a large network of hospitals. The problem statement of the project was to create a broader understanding of the health deficiencies experienced by the Pakistani population, particularly in Punjab, Sindh, and Balochistan. This was a funded study that we conducted.
I worked as a data analyst on this project. The consultant guided me throughout the study. I summarized and presented the current status of health parameters in terms of mortality, disease incidence, and prevalence. We also compared these parameters to those of other countries, such as Bangladesh, India, Sri Lanka, and Nepal. With the help of R and its packages, I could extract, process, and clean the data sets from multiple sources using dplyr. I used ggplot2 to visualize the data. Finally, out of the 141 total districts, I identified the most disadvantaged districts in Pakistan in terms of Public Healthcare Delivery (PHC), Social Living Measurements (SLM), and Incidence Of Diseases (IOD). Our rigorous analysis narrowed the list of disadvantaged districts to around 35 districts in Pakistan. There were eighteen districts in lower Balochistan, ten in Sindh, and seven in Punjab. This study helped Indus Hospital and Health Networks deploy mobile health clinics to remote areas of Pakistan.
I wrote and executed all of the analytical scripts for the data cleaning and analysis of the provided surveys in R. This allowed me to gain an overview and insights into the data, which I then reported to the stakeholders. I presented Indus Hospital and Health Networks with a comprehensive overview of our seven to eight months of research. I generated Pakistan’s population parameters in these analyses, including birthplaces, provincial distributions, mortality rates, and stillbirth rates by provinces and districts.
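For readers unfamiliar with this kind of workflow, here is a minimal sketch of a clean, score, and rank pipeline with dplyr and ggplot2. It is not the study's actual code or method: the data frame, the column names (phc_gap, slm_gap, iod_rate), the equal-weight composite score, and every number in it are made up purely for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per district, one column per indicator.
# Higher values mean worse outcomes. None of these numbers come from the study.
districts <- tibble(
  district = c("District A", "District B", "District C", "District D", "District E"),
  province = c("Sindh", "Punjab", "Balochistan", "Balochistan", "Sindh"),
  phc_gap  = c(0.2, 0.1, 0.9, 0.7, NA),
  slm_gap  = c(0.3, 0.2, 0.8, 0.6, 0.5),
  iod_rate = c(0.4, 0.1, 0.7, 0.9, 0.6)
)

# Clean: drop districts with missing indicators, then build an
# equal-weight composite score (an assumption, not the study's weighting).
ranked <- districts |>
  filter(!is.na(phc_gap), !is.na(slm_gap), !is.na(iod_rate)) |>
  mutate(score = (phc_gap + slm_gap + iod_rate) / 3) |>
  arrange(desc(score))

# Keep the two highest-scoring (most disadvantaged) districts.
most_disadvantaged <- slice_head(ranked, n = 2)

# Visualize the ranking as a horizontal bar chart.
p <- ggplot(ranked, aes(x = reorder(district, score), y = score, fill = province)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Composite disadvantage score (hypothetical)")
```

A real analysis would need a defensible weighting of the indicators and a deliberate strategy for missing survey data rather than simply dropping rows.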
In addition to the above, I have also started offering R training. I delivered an online course on R one year ago titled “R for Economics and Finance.” I instructed over 15 students from IBA and all over Pakistan in this online training course, which was solely based on R.
Students were delighted to learn about the practical applications of their economic and financial models, as they had previously only been taught theoretical courses at university. I conducted this training last year and will now conduct several R trainings in industry and academia.
I will be conducting one of these trainings in February. This training will be titled “R for Data Science,” and students and industry professionals will attend it. I have begun working on this training to promote R as much as possible through our efforts.
As my commitment to advancing the use of R in data analysis and data science grows, I express gratitude to the R Consortium for their support on this transformative journey. Envisioning a significant impact on Pakistan, I am dedicated to constructing a vibrant open source community. The fruits of my efforts will manifest as I realize my vision: fostering open source data analytics and collaboration throughout Pakistan.
How do I Join?
R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups organize, share information, and support each other worldwide. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.
The R Consortium is excited to open the doors to the 2024 RUGS Program, starting January 8th, 2024! We are committed to supporting R User Groups (RUGS) worldwide in their efforts to organize, share information, and strengthen their local communities. We’re now inviting applications for our program.
This year, the RUGS Program is structured around three distinct categories of grants:
User Group Grants: Tailored for groups seeking support to enhance user experience or develop user-centric projects.
Conference Grants: Ideal for those looking to host or attend conferences that align with the program’s objectives.
Special Projects Grants: Designed for innovative and unconventional projects that require a boost to get off the ground.
For full details and to submit your proposals, visit here. Your contribution can significantly support the ongoing evolution of the R language.
R User Groups: Global Impact
With 74 active R User Groups (RUGS) worldwide and 67,458 members, R communities welcome individuals of all backgrounds and skill levels, from beginners to advanced users.
Community Spotlight
Explore our blog for interviews with R User Group organizers from various industries, offering insights into their experiences and impacts.
The application period for the 2024 RUGS Program opens on January 8th, 2024, and will close at midnight PST on September 30, 2024. Note that these grants do not cover software development or technical projects. For such initiatives, consider the ISC Grant Program, which opens for proposals twice a year. You can learn more about the ISC Grant Program here.
Join Us in Strengthening the R Community
Your participation can significantly contribute to the development and cohesion of the R community. Apply starting January 8th, 2024, and be part of this exciting journey of growth and collaboration!
Are you ready to take your data visualization skills to the next level in the fascinating world of survival analysis? We’re thrilled to invite you to our exclusive webinar: “Visualizing Survival Data with the {ggsurvfit} R Package.” Mark your calendars for January 25, 2024, at 7:00 PM Eastern Time (ET)!
The {ggsurvfit} package is designed for both beginners and seasoned data scientists. It streamlines the process of generating publication-quality, time-to-event, or survival analysis graphs. And the best part? It’s built on the backbone of the beloved {ggplot2} package, marrying simplicity with sophistication.
What You Will Learn
Our interactive session will dive deep into how the {ggsurvfit} functions, like add_confidence_interval() and add_risktable(), seamlessly integrate as {ggplot2} ‘geoms.’ This means you can spruce up your plots using the familiar {ggplot2} toolkit sans the headache of mastering new coding syntax.
It’s not just about learning a new tool – it’s about enriching your data storytelling capabilities.
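As a rough preview of what that looks like in practice, here is a minimal sketch using the two functions named above. It assumes the {ggsurvfit}, {ggplot2}, and {survival} packages are installed and uses df_lung, an example dataset that ships with {ggsurvfit}; the axis labels and theme are illustrative choices, not webinar material.

```r
library(survival)
library(ggplot2)
library(ggsurvfit)

# Fit Kaplan-Meier curves by sex on the example dataset bundled with ggsurvfit.
fit <- survfit2(Surv(time, status) ~ sex, data = df_lung)

# Build the plot, then layer on extras with the usual ggplot2 `+` syntax.
p <- ggsurvfit(fit) +
  add_confidence_interval() +   # shaded confidence bands
  add_risktable() +             # numbers-at-risk table under the plot
  labs(x = "Time", y = "Survival probability") +
  theme_minimal()
```

Printing `p` draws the figure; because the result is still a {ggplot2} object, scales, themes, and labels can be adjusted in the familiar way.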
The Perks of Joining
Interactive Learning: Engage with experts and peers in an interactive online setting.
Skill Enhancement: Elevate your data visualization prowess, specifically in survival analysis.
Network Building: Connect with fellow data enthusiasts and professionals from diverse fields.
The R Consortium recently reached out to Abbie Brookes, Senior Analyst and AI Consultant at Datacove, co-founder and organizer of the Manchester R User Group. During the conversation, Abbie discussed her active participation in the R community and the rapid growth of the community in different parts of the United Kingdom, particularly in Manchester, London, Brighton, and Bristol.
Zac Nash, the other co-founder of Manchester R and a Data Scientist, began his career as a PhD Researcher in Computer Science at Bangor University, publishing his research paper ‘Tracking the Fine Scale Movements of Fish using Autonomous Maritime Robotics: A Systematic State of the Art Review’ (Nash et al. 2021, available on ScienceDirect). Zac started a job as a Data Scientist at Datacove in 2022, developing machine learning and AI products for many different clients. Zac still actively attends and engages in the Manchester R community, and now works as a Senior Data Engineer at Fresh Egg, a digital marketing consultancy in Worthing, West Sussex. Here he continues to develop new skills within the data space and participate in the R and Python communities, at his home in Manchester and near his work in West Sussex.
Please share your background and involvement with the RUGS group
Abbie Brookes, co-founder and organizer of the Manchester R User Group
I started working for Datacove in the summer of 2022. Datacove is a Data and Analytics Consultancy Team. They work across multiple sectors, specializing in customer and marketing analytics, reporting and visualization techniques, web analytics, R and Python training, and much more! Our company director, Jeremy Horne, who is well-known in the Brighton R community on the south coast, has been deeply involved in the R scene since late 2005. His journey began in London, where he made many friends and connections, inspiring him to create Brighton R.
Zac Nash, far left, presenting at the June 2023 Meetup
When I joined Datacove, I took on the role of co-organizing Brighton R. We operated as a remote company with team members on the south coast and the north of the UK. My ex-colleague Zac, who is based in Manchester, pointed out the lack of significant tech communities and idea-sharing groups. Inspired by our success in Brighton, we established a similar initiative in Manchester. Despite the challenge of the long commute, Zac and I worked together to rejuvenate the R scene in Manchester, starting with our first major event in June 2023, followed by another one in October. Our expansion from Brighton to Manchester highlights the growth and impact of our community-driven R initiatives.
Can you share what the R community is like in Manchester, United Kingdom?
October 2023 Meetup
The Manchester R User Group was considered our largest R community until London R! We consistently attract a lot of attendees, more so than our other groups. It’s an incredibly lively and passionate group, and the level of engagement is just fantastic.
What I find particularly rewarding about the Manchester group is its inclusivity. We see diverse individuals from various backgrounds and minorities in the data field. This diversity is important to me, especially as a woman in a STEM field. Women in tech or data fields are often in the minority, and it’s refreshing to see a different dynamic in Manchester.
Our attendees range from complete beginners who have never engaged with R before to seasoned professionals in the field since the early 2000s. It’s this mix of experience levels that makes our meetups so enriching.
Knife dancers at Manchester R User Group October 2023
We’ve also been fortunate to receive strong support from local companies. A tech recruitment firm, Better Placed, sponsored our last event in October and is sponsoring our next one in March. They provide an amazing venue with a stunning bar adorned with palm trees, serving champagne, craft beers, and cocktails, plus the usual pizzas and chips because everyone loves that at a meetup.
After our events, we often take the group to a pub, adding to the experience as a community. Manchester’s nightlife is vibrant, with many young professionals and great bars. Our last meetup even had impromptu dancers at the bar, making it a unique and quirky experience. It’s a fun and unusual meetup, but that’s what makes it so memorable and enjoyable.
Our next event is scheduled for the 14th of March 2024 (Manchester R Meetup! March 2024). Unfortunately, I can’t share much information about the speakers yet; you’ll just have to join the group to find out! We typically announce the speakers about one to two months before the event, but I assure you, we have a great lineup planned. We’ve hosted many interesting talks in the past.
For instance, we had a fascinating geospatial talk at our last event in October. We’ve also had presentations from people within our company, including our director. Additionally, we’ve had speakers from Posit and various maintainers of R packages. One notable talk was about the ‘Arrow’ package, which was fantastic.
So, for now, we’ll just have to wait and see. We will inform the R Consortium when everything is confirmed and ready to be announced.
Who is the target audience for attending this event?
Our target audience is incredibly broad and inclusive. To join our events, all you need is an interest in data and a desire to enjoy time with others who share similar passions. We welcome absolutely anyone, regardless of their career stage or academic background. Whether you’re a complete novice in the field or possess extensive knowledge, our events are designed to cater to all levels.
We carefully curate our talks to ensure they’re enjoyable and accessible, regardless of the attendees’ expertise. We provide introductory talks for beginners, and for those with more experience, we offer content that delves into deeper expertise. Even in our more advanced talks, we include steps on getting involved and starting, ensuring nobody feels left out.
An essential part of our events is the interactive aspect. There’s always an opportunity to ask questions; we often share example code. This approach ensures that everyone, no matter their knowledge level, feels welcomed and can benefit from our events. We’re dedicated to creating an inclusive environment where everyone can learn, share, and grow.
Any techniques you recommend using for planning for or during the event? (Github, zoom, other) Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
Of course, I’d happily explain how we accommodate those who can’t attend our events in person. Firstly, we always ensure that our event venues are disability-friendly and accessible. However, we have alternatives for those who prefer not to attend in person.
For our Brighton events, which have been running for over three years, we livestream them on YouTube every time. This allows anyone, regardless of location or ability to attend physically, to participate in our events. We’re planning to implement the same for our Manchester events soon. The Manchester group is still in its early stages, so we haven’t set up live streaming yet, but it’s definitely on our agenda. The only challenge is ensuring we have the right technical equipment to provide a high-quality streaming experience.
Additionally, we accommodate speakers who can’t be present physically. Often, speakers join us via Zoom, and we broadcast their presentations to the attendees in the room. We understand that not all speakers are able or willing to come in person, but we still want to include their valuable insights. A good talk can be delivered over Zoom, and we believe in leveraging technology to make our events as inclusive and engaging as possible, regardless of physical presence.
Please share any additional details you would like included in the blog.
Datacove 2022 company photo with Laura (Brighton Py organizer and Data Scientist), Zac (Manchester R co-founder and Senior Data Engineer), Jeremy (BrightonR co-founder and Company Director), and Abbie (Manchester R co-founder and organizer and Senior Analyst and AI Consultant). Other members present: Sarah (director), Nathaniel (Jeremy’s wonderful son!), and Yagyansh.
The most interesting thing happening right now is the expansion of our meetup groups. As I’ve mentioned, it’s becoming complex because we’re growing rapidly. We started with Brighton R, which has been running for over three years. From there, we expanded to Manchester R, which has been active for about half a year.
Our director loves the R community; therefore, we take pride in organizing Brighton R, Brighton Py, and Manchester R, with our new ‘Shiny’ editions of London R and Bristol R to come soon, hopefully! Plus, keep your eyes peeled for a huge announcement coming soon. I can’t reveal it just yet. It’s scheduled for next year, and I can barely contain my excitement about it. Our expansion has a lot of momentum, and seeing our community grow is incredibly exciting.
We would like to get to know you more on the personal side. Can you please tell me about yourself? For example, hobbies/interests or anything you want to share about yourself.
My hobbies and interests initially leaned more towards academia, which is surprising to most people, considering I’m now a senior data analyst. I pursued a psychology degree, primarily fascinated by its research, statistics, and coding aspects. This interest led me to conduct a significant research project during my degree, coincidentally during the pandemic. This timing posed challenges, as all my lectures were online and difficult to navigate.
My research project focused on at-home interventions for managing anxiety related to the COVID-19 pandemic. I used the programming language R extensively for this project. Interestingly, during this project, I realized my preference for data and coding over psychology. As a result, I transitioned into the data field, leaving most of the psychology aspects behind.
My path to data science has been unconventional. Unlike many who may go directly from a mathematics or computer science degree into data, I took a unique and somewhat unexpected route from psychology to data science. It’s an odd path, but I’ve embraced it.
Last year, Julia Silge, co-organizer of the Salt Lake City R User Group, discussed with the R Consortium the group’s plans to meld in-person and online activities. This year, Andrew Redd, founder of the group, provided an update on the group’s recent and upcoming events. The group has successfully implemented its plan, with online presentations coupled with in-person networking events. Andrew also discussed his work with the Department of Veterans Affairs and trending topics being discussed at the group’s events.
Andrew is a Biostatistician and works as an Assistant Professor at the University of Utah School of Medicine. He also works as a Research WOC at the Department of Veterans Affairs. Andrew is also an R expert for VINCI.
Please share about your background and involvement with the RUGS group.
I was initially introduced to the R programming language during my time in graduate school at Texas A&M University. I was a member of the statistics department there and was working on my PhD. Even though Texas A&M has a close affiliation with Stata, R was the language of choice for that year. I quickly took to the language, as I already had a background in programming in C. I had also done extensive work in Mathematica during my undergraduate studies and had used other languages, such as Visual Basic. I found R to be a pleasant language to use and am a firm supporter of open source software. As a result, I quickly became proficient in the language. I have since made a career out of working with R and have published a few early packages. One thing that brought me early recognition was my NppToR package, which is still available online. I abandoned this package when RStudio became sufficiently developed to fully utilize the capabilities that I relied on with Notepad++ as my primary editor.
I arrived at the University of Utah in 2010 and founded the Utah R Users Group shortly thereafter. The group was originally called the University of Utah and Salt Lake City R Users Group as it was centered around the university and its users. We later expanded beyond the university and now have members from all over the world. Our meetings where we present material are now fully online. We supplement these meetings with social gatherings that are not centered on presentations. Instead, we meet at local venues such as bars or ice cream shops and simply talk about R. These gatherings allow us to meet new people, see what others are doing with R, and network in a more casual setting.
Would you like to tell us about some recent and upcoming events from your group?
The next meetup we have is in January. January is always a great meetup, as we have a tradition of doing lightning talks as our first meeting of the year. Our lightning talk series aims to highlight our local members. We prioritize our local members and give them five minutes to present an interesting project they have completed, such as a cool analysis or a new package. These presentations are low-stress and brief.
Our February event will focus on package development and the latest developments from RStudio and the R Consortium regarding package development and maintenance. As for recent events, we had an event titled “Slide Crafting with Quarto” in November and another meetup titled “Fairness and Machine Learning” in December. We strive to provide a wide range of topics for all skill levels.
Any techniques you recommend using for planning for or during the event?
Meetup has been extremely beneficial. It did not exist when we founded the group, or at least I was not aware of it when we first organized the R users’ group. Thanks to the R Consortium grant, which pays for the Meetup page, it has proven to be a very useful tool. We are currently in a hybrid format, with all our presentations being held online. We usually host them on Zoom, which simulcasts to YouTube, where the recordings are saved. This allows anyone who wishes to do so to view our previous meetings.
We have achieved significant success, and many of our presentations have garnered a considerable number of views. While these views are not viral by YouTube standards, they are a significant number for our community.
However, I must admit that we have always operated our organization in a manner that differs somewhat from other user groups. We have a unique culture here in Utah that makes it easier for us to meet during the day, which I know is not typical of other user groups, which typically meet in the evening. However, we have found a schedule that works for us and have stuck with it. I believe that the most important thing for anyone trying to organize a user group is to find a schedule that works and stick with it.
Please share about a project you are currently working on or have worked on in the past using the R language.
I would like to discuss my work with Veterans Affairs. The Veterans Health Administration has the VA Informatics Computing Infrastructure (VINCI), a secure remote desktop environment for conducting research. With this infrastructure, we have access to all the VA records. Once we have approval and access, we can use tools such as R, SAS, or Stata, along with various other tools, to perform all the data analysis that we need. This is a very useful resource that I have been working with for about 10 years. I am the R expert for VINCI, so I receive a lot of questions regarding R.
Most of the questions asked of us are related to connecting R with databases. In particular, I rely heavily on dbplyr, DBI, and odbc, since the VA is SQL Server based. My compliments to the team behind the DBI and odbc packages, as they have saved me from many difficult situations.
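To give a sense of what connecting R to a database with these packages looks like, here is a minimal sketch. It is not VINCI code: it uses an in-memory SQLite database (via the RSQLite package) as a stand-in so that it runs anywhere, and the table name, columns, and values are hypothetical. Against SQL Server, the connection would typically be opened through odbc instead, as noted in the comment.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Stand-in connection: in-memory SQLite (assumption, for a runnable example).
# With SQL Server one would typically use something like
#   con <- dbConnect(odbc::odbc(), dsn = "my_dsn")   # "my_dsn" is hypothetical
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical toy table; none of this is real data.
visits <- data.frame(
  patient_id = c(1, 1, 2, 3),
  cost       = c(10, 20, 5, 7)
)
dbWriteTable(con, "visits", visits)

# dbplyr translates this dplyr pipeline to SQL; nothing runs until collect().
summary_tbl <- tbl(con, "visits") |>
  group_by(patient_id) |>
  summarise(n_visits = n(), total_cost = sum(cost, na.rm = TRUE)) |>
  arrange(patient_id)

show_query(summary_tbl)          # inspect the generated SQL
result <- collect(summary_tbl)   # execute on the database, return a tibble

dbDisconnect(con)
```

The appeal of this design is that the heavy lifting stays in the database: only the summarized rows are pulled into R when `collect()` is called.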
What trends do you currently see in R language and your industry? Any trends you see developing in the near future?
Everything is trending towards tidyverse and tidy principles. This has been a trend for several years now, with everything trying to be more uniform in the way it is done. This is done by treating data as data, which I really appreciate. It makes it easier to program and extend.
Our group has also had a lot of topics that are not just about R, but also about statistical analysis. For example, I will point out the meeting on Fairness in machine learning, which is a very important topic. If you have biases in your data going into a machine learning model, those biases can easily be propagated through the model. Sensitivity to this is something that we should all be aware of. So, we are not only talking about the hows of programming and how to work with R, but also a lot of best practices for programming. Just because something works does not mean it is necessarily the best way to do it. At least in our group, we have been very mindful of these things.