Donald Szlosek, the MaineR User Group organizer, recently highlighted the significant growth and impact of the R community in Maine to the R Consortium. Donald emphasizes the crucial role of the R language in life sciences and showcases the group’s remarkable work in bridging the gap and empowering data professionals through collaboration and insights throughout the state of Maine.
Donald Szlosek has a great track record as a biostatistician, shaped by his collaboration with Beth Israel Deaconess Medical Center, Harvard Medical School, and his current work at IDEXX Laboratories. His job consists of diving into big data to find new actionable medical insights in collaboration with clinicians. He has worked on more than 40 clinical and real-world evidence studies ranging from pre-clinical toxicology studies to Phase III randomized clinical trials in oncology, nephrology, cardiology, dermatology, infectious diseases, parasitology, anesthesiology, and medical imaging.
Why did you personally get interested in learning R? How do you use it in your work? What do you do when you’re not programming?
If I remember correctly, my first foray into R started while I was working on interventional radiological clinical trials at Harvard Medical School. The goal was to recreate the results of a paper on bivariate analysis – a visual assessment of the safety-efficacy profile of antithrombotic drugs. In order to perform the task, I had to learn how to recreate the example code that they had using R to make the figures for our publication.
Currently, I work as a biostatistician for a medical device company called IDEXX Laboratories, and I use R in every facet of my work. My job is divided into three areas. The first is everything related to clinical studies, the classic biostatistical work. Here I use R for statistical analysis with the tidyverse suite of packages and some packages developed specifically for method comparison, like the mcr package. The second is big data real-world evidence studies using data collected from our reference laboratories and electronic medical records. This requires R packages that make it easier to manipulate large data sets, like sparklyr and arrow. The third is the design and analysis of external validation studies for our machine learning algorithm projects. For this work, I lean heavily on some excellent groundwork on the assessment of clinical prediction models by Ewout Steyerberg’s team and Frank Harrell’s wonderful rms package.
When I am not programming, I am usually found outdoors hiking, biking, and skiing around New England! Currently, I am trying to finish the last 14 mountains on the New Hampshire 48 4,000-footers. When I am not outdoors, I am usually reading (a little bit of everything) or playing the piano.
What is the R community like in Maine? What was most surprising to you about the community?
We were initially the Greater Portland R User Group when we started our activities in 2018, but when COVID happened, it completely changed the way we worked. During the pandemic, we tried to keep things going, but we only had three meetings in 2020, nine meetings in 2021, and unfortunately none in 2022. We knew we needed to think of ways to restart ourselves this year, and the first action we took was to rename the Greater Portland R User Group to the MaineR User Group to be more statewide and to be able to include all the different universities, hospitals, and research institutes across the state of Maine!
Definitely, every step has been a process, and 2023 is showing a lot of promise as we currently have speakers lined up for the entire year, which is really exciting. We are slowly becoming more and more active in trying to find the rest of our users in Maine.
The thing that surprised me the most about the Maine R community is how many research institutes and hospitals throughout the state use R, and how many people seem motivated and excited to be part of a statewide community.
Who comes to these meetups? What industries do you see more in Maine?
Specifically in our group, the medical and life science industries are the most dominant. I would say that around 90% of the people who join our meetups are in some part of the healthcare field. Maine also has a lot of research institutes and non-profit organizations focused on preservation and conservation of our marine and forest ecology.
How has COVID affected your ability to connect with members? What techniques (Github, zoom, other) have you used to connect and collaborate with members? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
Actually, this is something we have been discussing recently. Since we decided to make the group statewide, and since Maine is a large state, we decided to hold our first hybrid event on March 30th. Everything will be conducted simultaneously in person and online, using the Zoom platform. As this is our first hybrid event, there are bound to be a few hiccups, but we are hoping that over time we can get things running smoothly. The group hopes to get feedback from those attending online to make things run as efficiently as possible, so that both in-person and online attendees feel like they walked away with a great experience.
What trends do you see in R language over the next year?
This is truly an exciting time to be an R user in the clinical study space. Attending the R/Medicine and R/Pharma conferences over the last few years has shown me that there is a big push to develop open-source software for regulatory submissions to agencies such as the United States Department of Agriculture (USDA) and the Food and Drug Administration (FDA). Packages such as admiral, validatoR, and others that are part of the pharmaverse are changing the landscape for regulatory submissions. All in all, I am very excited to see what the future holds!
What is your favorite R event that you have attended? From a small meetup to a big conference!
Picking just one event is difficult, but I would definitely choose the first conference I attended, the Nor’EastR Conference, which was held in 2018.
At that time, I only had three years of experience in R. I sat with three people I did not know but who were very welcoming to me at my first R conference. Little did I know at the time that the three people, who worked at Revolution Analytics, were JD Long, who is going to be one of the keynotes for the Posit 2023 conference; David Smith, R developer advocate for Microsoft; and Kirk Metler, Chief Data Scientist at IBM. It was just amazing to be in a place where everyone was talking passionately about R.
When is your next event? What are your plans for the group for the coming year? Please give details!
As I previously mentioned, we are very excited about our first hybrid event to be held on March 30, titled “Data Mining Methods for Improving Health Outcomes”. The event will feature Dr. James Quinlan, Associate Professor of Mathematics and Data Science at the University of New England, who will provide a presentation on the use of R for frequent dataset mining, with an emphasis on the application of these techniques to improve health outcomes.
How do I join?
R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!