One of the difficult parts of running a group in an online space is maintaining social interactions that you would normally foster with in-person meetups. R Consortium talks to Adnan Fiaz about how he is attempting to create those interactions in the online meetups.
Adnan is Senior Data Scientist at National Grid in Birmingham, England. He is an analytics professional with a passion for mathematics and complex challenges. Outside of work and R, he has a keen interest in playing football, cinema, and general aviation.
RC: What is the R community like in Birmingham?
AF: I took over about 3 years ago. Before that, there were several meetings, but they slowed down quite a bit. I came in with quite a few new ideas. We started with 1 or 2 meetings every quarter. We had a good rhythm until 2020. It was getting harder to get speakers, but I was able to find them. We were able to have about 20 people attend the meetings and that was quite good for Birmingham. It was a mix of academics from the local university, NHS staff from the area, and scattered R Users that used it in their businesses. We also had new users and people who had been using it for years. It was very diverse.
RC: How has COVID affected your ability to connect with members?
AF: We were struggling in the beginning because we didn’t know what to do. It depends on how you deal with it as an organizer and a community builder. I was leaning on face-to-face contact and others to help me out. Online, it was harder to engage with people and to ask for speakers from the local community. We didn’t have a meetup for 4 or 5 months. In the autumn we had a meetup. Then we didn’t have anything until the winter. At the beginning of this year, we decided to do meetings online and jumped on the bandwagon of the Global R community. We advertised their meetings on the Birmingham R page to give the community something to watch. Then we organized a meetup of our own in between their events. There is a lot less interaction with the members this way. People tend to be less interactive in online meetings and spaces, so you must put a lot of effort into forcing socialization there. I am looking forward to being able to go back in person.
RC: In the past year, did you have to change your techniques to connect and collaborate with members? For example, did you use GitHub, video conferencing, online discussion groups more? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
AF: Slack, Twitter, and Zoom are the technologies that we use mainly. We also have a GitHub page that we use. These allow a lot of people to attend our online meetings.
RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting?
AF: We had several from the local meetup. We had a presentation/workshop from Birmingham University about mixed effect models by Bodo Winter. He explained mixed effect models from the basics of how they work as well as the more complex models that can be done. Once people had an understanding, they were able to ask more pointed questions. I was surprised because there was more engagement in the second part, mostly because people understood the concepts. There were a lot of questions, and people seemed to take it in a good way.
RC: What trends do you see in R language affecting your organization over the next year?
AF: We will probably see more people branching into how to build different statistical models. In the last year, we saw packages brought into the tidymodels framework, which builds upon caret by splitting it up and developing specific parts. In short, we are getting better support for model building.
RC: Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?
AF: I think the ones that got a lot of attention were the COVID visualizations by John Burn-Murdoch at the Financial Times. They were very informative. He also spent a lot of time discussing how he created them and engaging with everyone on Twitter.
RC: Of the Funded Projects by the R Consortium, do you have a favorite project? Why is it your favorite?
AF: The most useful one to me is the R Ladies Project. I have used the materials from that project to start the meetup again as well as tips to increase engagement.
RC: Of the Active Working Groups, which is your favorite? Why is it your favorite?
AF: I think the most interesting one is the R Certification. I remember when it was first proposed that it would be useful for meetups to give them a framework. We started with small segments before the meetup to start learning R in a beginner’s course. Just 10 minutes before the meetup to warm up. The R Certification would help guide that.
RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?
AF: I saw some work from Heather Turner on the future of R developers. That would be interesting to get more focus on because it would be good if we had more diversity in the core team of R Development.
How do I Join?
R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.
To understand what R Ladies in Santiago is like, R Consortium talked to Riva Quiroga about how they are dealing with organizing and meeting during the pandemic. We also discussed how the chapter is becoming more inclusive and helpful to others in Santiago, as well as all of Chile and Latin America as a whole.
RC: What is the R community like in Chile?
The R community in Chile is very active and diverse. We have members who come from different fields and who have very different interests. There are members who have a STEM background, but also a lot of people from social sciences and the humanities. And they come from very different places: academia, industry, public services, NGOs, students, etc. There are currently four active RLadies chapters and one R User group.
The R user group in Chile started in 2012 in Santiago. I think I found out about them around 2015, but never attended an event because I wasn’t sure I would feel welcomed (it was an all-male group and I was just a beginner), and then they stopped organizing meetings. In 2017 they resumed their activities and we, the team that was planning the launch of RLadies Santiago, started attending. They supported us when we were starting our chapter in 2017 by helping us find venues for our first events. Since then, we have seen each other as collaborators. We have organized joint activities, and organizers and members of both groups have collaborated on R-related projects (such as packages, courses, etc.).
2017 was the year the R community started growing at a very fast pace here. Chile is a very centralized country, so everything usually happens only in Santiago, the capital city. So it was great to see in the next few years new RLadies chapters in other parts of the country: Valparaíso (2018), Concepción (2019), and Talca (2020).
RC: How has COVID affected your ability to connect with members?
We were a very active group until October 2019, when we had to stop our activities due to social unrest in the country. Probably because it was a very difficult time for everyone, it didn’t occur to us to organize online events.
And then came COVID. Working remotely and online events became the “new normal,” so we decided to resume our activities. We saw this as an opportunity for collaboration between all the RLadies chapters. So since March 2020 all our activities have been branded as “RLadies Chile.” All our events are held via Zoom, thanks to the licenced account provided by DataUC. We post our videos on Vimeo and use GitHub to share code and materials.
Online events have been a great opportunity to make our community grow. We have been able to reach people in cities that currently don’t have an RLadies chapter, and also people who were unable to attend in-person events.
This means that our “local” community is now bigger than before. It is no longer limited to the four cities that have an RLadies chapter. People from different parts of Chile and Latin America are joining our events, and even Spanish-speaking folks who live around the world. As a consequence, when possible, we try to organize our events in time slots that are not too late for someone based in Europe, and not too early for someone in México.
This collaboration between the four RLadies chapters to organize online events has been a great experience. On the one hand, it allowed us to connect with a broader community in a new way, so we plan to keep organizing online events even when meetings in person are back. On the other hand, we as organizers became closer. At least for me, having the opportunity to share time with such an awesome group of people has been one of the things that motivates me to keep going during these difficult times.
RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting?
A couple of months ago we ran an event about how we have been using R during the pandemic, and what new things we have discovered and learned. In that context, Alejandra Silva Tapia, the organizer of RLadies Talca, gave a talk about sonification techniques with R. In her presentation she not only showed some explorations she did with meteorological data using packages like {tuneR} and {sonify}, but also the teaching potential of these techniques. She shared her experience sonifying plots in order to explain statistical distributions to blind students. With just a couple of lines of code, she gave attendees a tool to make their learning materials more accessible.
It was so interesting that in our internal Slack we added a new channel to share our sonification experiments and new ideas on the subject.
RC: What trends do you see in R language affecting your organization over the next year?
There are currently more than 18,000 packages on CRAN, and many of them are very field-specific, so it is very difficult to keep up to date with all the new possibilities that they offer. Therefore, it is very challenging to decide which new workshop to offer and which new package to share with our community. Should we run a workshop about something broad and general that might benefit anyone? Or do we target a specific audience that will benefit from learning about new packages or techniques for their field?
To face that challenge we have been trying to do a mix of both. We have organized workshops focused on general tasks, such as cleaning data, modeling, visualizing, etc., and also subject-specific events. Organizing both types of workshops (general and specific) has been our way to attend to the needs of a very diverse audience.
Another trend we are very happy to see is the discussion around diversity, inclusion, accessibility and algorithmic bias in Data Science. We are currently running a book club based on the book Data Feminism to discuss some of these topics. Discussing the social and ethical issues involved in coding and data science is something that interests all of us. And a safe space like RLadies is a great place for starting that conversation.
In regions like Latin America the decision about what workshops to offer is not trivial. Here, being able to understand English is, in most cases, a sign of privilege: you went to a private school, you had the opportunity to study abroad, you work on international projects, people in your family speak English, etc. And that is not very common. This means that RLadies chapters and R Users groups are sometimes the only place for many non-English speakers to learn about the new developments in R and Data Science. So when deciding which workshop to run, we have this in mind. We see this as part of our mission.
This was also the reason why the Latin American R community translated, as a joint effort, the book R for Data Science into Spanish. We saw the need (and impact) of having this kind of resource available for everyone who has interest in learning R.
RC: Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?
We have had journalists attending R Ladies events in the past, and we are very happy to see that some of them, who were also professors, started promoting the inclusion of R as part of their undergraduate curriculum.
Regarding the second question, there are three data journalism projects that have had a positive impact on society here in Chile, from my point of view. The first one is La bot, a Telegram and Facebook Messenger bot that sends you short and precise data-based analyses of current issues. This is a women-led project that has received support and funding from the International Women’s Media Foundation and the Open Society Foundation. The team has done an amazing job showing new ways in which journalists can seek and connect with different audiences. And by always discussing current issues supported by data, La bot has also been a great way to fight the spread of fake news.
The second one is the reporting Alejandra Matus did in the first months of the pandemic. She explored the death rates in Chile for the past ten years and exposed that the government was underreporting COVID deaths. She revealed that there was an excess death rate for March 2020 that the authorities were not explaining. They were only reporting PCR-positive patients who died at hospitals, not people who were dying at home or in nursing homes for the elderly. Her work had a great impact, not only because we started demanding more transparency from the government regarding COVID data, but also because many people began to realize the seriousness of the pandemic.
The third one is called Plataforma Telar. Chile is currently drafting a new constitution and, in this context, Plataforma Telar is using innovative methodologies to gather and analyze data related to this process (and they are using R!). What I find really interesting is that, although this is an interdisciplinary project based in academia, they have made alliances with networks like CNN Chile to showcase their findings to reach a broader audience.
RC: When is your next event? Please give details!
Our next event is the sixth session of the book club about “Data Feminism”, which will be held in late November. For December we are planning a workshop about building your first R package and one about using git/github in RStudio.
We have been able to be a very active chapter, even during the pandemic, mainly because of two reasons. First, because all our activities are the joint effort of the four RLadies chapters of our country. That makes all the planning easier and keeps us motivated. Second, because we plan the workshops not only taking into account what we already know, but also what we want to learn. For example, if I want to learn about a specific package, I will volunteer to run a workshop about it in a couple of months. That way I have an incentive and a deadline to achieve that objective. Because R Ladies is a collaborative and safe space, we feel comfortable running events that are not about something that we have already mastered, but about something we are currently learning.
RC: Of the Funded Projects by the R Consortium, do you have a favorite project? Why is it your favorite?
Obviously, the R Ladies project is very dear to my heart. The support of the R Consortium has been crucial to offer current and prospective chapters the human and technological support to operate.
I also want to mention the SatRdays project led by Steph Locke and Gergely Daróczi. They developed a starter kit, a knowledge base, and all the infrastructure you might need to run your own SatRday.
SatRdays are accessible R-focused conferences, organized by local R communities, that are held on Saturdays. In 2018 RLadies Santiago and the Santiago R Users group organized one of these events, and we wouldn’t have been able to do it without all the support this project provided. That event was very important for growing our local community, so we planned another one for April 2020, but we had to suspend it due to the pandemic. We expect to be able to run it again in 2022.
RC: Of the Active Working Groups, which is your favorite? Why is it your favorite?
I’m not sure if I have a favorite one, but I really like the idea of working groups that are focused on specific fields, like R/medicine and R/pharma. They are a great way to bring together people that are using R for similar purposes to collaborate on events and advocacy, and to make advances in different areas by promoting cooperation.
It would be great to see in the future similar working groups for other fields (e.g. R/social sciences, R/humanities, R/ecology, R/open government, etc.).
RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?
I would love to see a project about multilingualism. Currently there are many people working toward this aim, not only by translating learning resources, but also by developing packages that take into account that English is not the only language that exists. For example, Michael Chirico has made a package called potools, which allows you to internationalize your own package by translating user-facing communications (e.g., warnings, errors, etc.) into different languages. Also, the Datasketch team (Colombia) developed a package called shi18ny, which allows you to create multilingual Shiny apps. And a group of RLadies from Brazil and other countries from Latin America are currently developing a package with datasets in Portuguese for people to use when teaching/learning R, similar to the ones that already exist for Spanish and Turkish.
All these are great efforts that are helping to make R more accessible to non-English speakers. It would be great to see a Top Level Project that promotes these kinds of initiatives.
By Amanda Hart, Co-organizer of South Coast MA UseR Group
2020 was a year of firsts: first global pandemic in my lifetime, first year working remotely, first year not traveling home for the holidays, and on a more positive note the first year for the South Coast MA UseR Group. Just before lockdown I had the opportunity to join an Openscapes workshop led by Julia Stewart Lowndes. I spent the week thinking about open data science and learning about tools to aid in collaboration, but I never expected to need to exercise those tools so quickly or extensively. Overnight my community changed. Offices closed, schools closed, classes turned into remote learning experiments, and my community of students and coworkers scattered. This is when the South Coast MA UseR Group was born.
Our R user group started as an excuse to stay in touch and build on the small community formed by the Openscapes workshop. I found a wonderful co-organizer, Amanda Meli, and together we started our monthly meetings. We invited anyone remotely interested in R to join our fledgling group and take the opportunity to “see” people again. Over the course of this last year, our local R user community in southern Massachusetts, USA, expanded to include students and professionals from across Massachusetts, the U.S. and the world with several guest speakers calling in from abroad (one of the benefits of remote meetings).
In so many ways 2020 was a hot mess, but our monthly UseR meetings have been a welcome source of consistency, community, and learning throughout. We are continuing our remote meetings as we start year two, but with any luck (and a dash of science) I look forward to meeting some of my newfound community members in person in the coming months. But until that hope is realized I cannot thank enough my co-organizer, group members, guest speakers, and behind the scenes supporters for making this crazy year better.
It can be difficult running an R user group in a country where R does not have a large following. R Consortium talked with Francis Mensah, who runs the R Users group in Accra, the capital of Ghana. Francis discussed how they went virtual during the pandemic, and how they are working on ways to help grow R users in Ghana.
Francis is a Statistical Consultant, Data Quality Scientist, Chief Operations Manager, M&E Fellow, Programmer, Data Analyst, and Principal R Organizer and co-founder of the Accra R Users Group. He is also a Business Development Consultant for Kims International. Kims International provides M&E, Research and Capacity Building in education, public health, gender, water and sanitation and livelihood for governmental and non-governmental institutions.
RC: What is the R community like in Accra?
FM: The R community in Accra is relatively new. Most people here are not aware of it. Most people here use Stata or SPSS, as they are taught in schools here. Awareness in Ghana is relatively low for R. We are trying to create awareness of R through our group.
RC: How has COVID affected your ability to connect with members?
FM: Because of restrictions imposed, we have not been able to meet face to face. Because of this, we meet online. We were planning our first meeting when COVID came. We meet almost every weekend virtually. We get people from not just Accra, but a lot of international people as well.
RC: In the past year, did you have to change your techniques to connect and collaborate with members? For example, did you use GitHub, video conferencing, online discussion groups more? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
FM: For our meetings, we use Zoom, GoToMeeting, and WhatsApp to meet virtually. These work best for us. We will also try to hold a Ghana R conference with other groups, and we will use the same apps for that country-wide event.
RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting?
FM: What Every Data Scientist Must Know About Teaching and Learning by Greg Wilson. Everyone was happy and excited after the presentation. It opened our eyes. When going through the technology we use, it can be difficult to see. He used images to go through the talk. They were highly effective. It was so well received that members asked when the speaker was coming through again. It was very exciting and it opened our eyes to teaching and learning for all of our members, including me.
Another good one was given by Dr. Riinu about ggplot that we were excited about because it was about graphics. The part that we liked was that she gave out exercises to try during the presentation.
RC: What trends do you see in R language affecting your organization over the next year?
FM: Over the next year, we believe people will be using R to get insight into, and find solutions for, health, finance, agriculture, and the economy as a whole. With the census ongoing, we hope to get data for some of these.
RC: Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?
FM: We have one member who uses data to tell stories about events that happen in Ghana. She uses it for data presentations.
RC: When is your next event? Please give details!
FM: We are having a speaker from Argentina. She will talk about creating packages using learnr through R. This will be in September. We also have a speaker from Spain in September who will talk about creating a package in R. We will also do local programs on Zoom in the meantime.
RC: Of the Funded Projects by the R Consortium, do you have a favorite project? Why is it your favorite?
FM: Interactive visualisations in R via R-to-JavaScript-transpilation. In general I like data visualization and it’s wonderful to explore.
RC: Of the Active Working Groups, which is your favorite? Why is it your favorite?
FM: R Certification. Being proficient in R will also help in our effort to spread the use of R in Ghana as a whole, and it’s also a source of motivation to do more with R.
RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?
FM: I think something along the lines of R Clinics to bring awareness to all tertiary institutions and industry. For instance, they could develop R clinics and workshops in Ghana so that professors and industry players or members would use the software. Again, data journalism would be great here in Ghana.
R Consortium talks to Nino Macharashvili of DataFest Tbilisi (also on Facebook) about how they are dealing with life in the COVID age. As one of the early adopters of online conferences, having held one shortly after the start of COVID, they have an interesting take on attendance. Nino also has an interesting take on a top-level project to help train more future professionals in the R language.
What is the R community like in Tbilisi?
Our event is very regional. We started DataFest Tbilisi in 2017, and it was mostly an Eastern European Union (EU) and Central Asian event. Our speakers were always from all over (North America, EU, and Asia). In the last two years, we have been online due to COVID, and our event has become more global, with more than 50 countries attending. However, we still had the biggest chunk from Eastern EU and Central Asia. Because of our location, we tend to have a manageable time zone for a global conference, with only a few people waking up early or staying up late.
How has COVID affected your ability to connect with members?
With the virus, we had to have everything go online. We were able to start experimenting with different ways to run a conference online. In the first year, for the first 2 months, we noticed that there were hardly any online events. Come March we noticed that many different events were going online. We started going in right away and ended up pioneering online conferences. We got a nice response from the audience for the first conference because we were available when many people were in a strict lockdown. We were able to offer an opportunity to connect with others in the community and learn. For me, it was a very positive experience. We wanted to be a global conference, not just for speakers but also for the audience. It was much easier to do this with online events. However, after 1 year we saw some differences: Georgia and most of the EU were going back to normal. We also noticed people didn’t want to go back on Zoom due to Zoom fatigue. I’d personally like to keep the conference offline, online, and hybrid.
In the past year, did you have to change your techniques to connect and collaborate with members? For example, did you use GitHub, video conferencing, online discussion groups more? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
We ended up using Zoom and a Slack channel. We did look at using Hopin, but we decided that simpler was best, so we kept to the technology that most people were used to.
Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting?
My favorite presentation was on Artificial Intelligence (AI) from our recent meeting. One striking fact was how the EU is racing to catch up with the US in AI and is investing heavily in it. The US is ahead, and the EU is close behind, but Georgia needs to catch up. I liked it because it was a talk that brought up the issues and obstacles of AI and not just the overhyped part of AI.
What trends do you see in R language affecting your organization over the next year?
R is not the only language that is used by our members. As far as trends in coding in general, we are looking at tools that do coding themselves (self-coding code). We still need people who know how to use it and why to use it, however. We need to interest people from other sectors so we can show people how to apply the code to their field.
Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?
We usually have data journalism as a track event. One of the main goals is to use data to debunk misinformation. This is one of our current projects. We have so many different projects. One of my favorite talks was a talk on COVID-19 in Brazil.
Of the Funded Projects by the R Consortium, do you have a favorite project? Why is it your favorite?
R Ladies is an important program that should be kept. The role of R Ladies is very important in popularizing R among everyone and not just in bringing in women. By making R more approachable, it increases interest in many different groups, and R Ladies has shown us that. It is important in communities like ours, where the R community is not large. It has the power to unify the groups.
There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?
I think R in data visualization would be a great choice, but I’m a bit biased because that is my field. However, what would be better for my area would be to provide support to start teaching R in University programs. In Georgia, there are not many people who use R. Most people learn about it from Twitter and start using it there. Some people start learning at workshops, and it slowly starts to trickle into the professional world. However, some professors are set in their ways and the software that they use. So, having support to get professors into using R and competitions for students would be very helpful. Their students will be the new professionals of the future and will push R in their jobs.
The largest data journalism conference in Latin America reaches its sixth edition in a row bringing discussions on the climate crisis, access to information, and data protection, in addition to dozens of workshops with experts.
Editor’s note: There are four R-related workshops @ Coda.Br 2021 (Data and health: Sivep without secrets, Tools to mitigate AI biases, Creating a reproducible project in R, and Graphs everywhere: how to create and analyze graphs). Please see below for details.
Created by Escola de Dados (School of Data Brazil), Open Knowledge Brasil’s data literacy program, Coda.Br is the leading data journalism event in Latin America and will be entirely online for the second year in a row, with free and paid activities.
Three main debate panels, three keynote presentations with international guests, and the final of the Cláudio Weber Abramo Data Journalism Award will be broadcast openly and free of charge on the event’s website. Paid activities include more than 30 hours of hands-on workshops with experts in the field. The audience can join the workshops with a simple registration (from R$40) or via the Escola de Dados membership program.
Two hundred ninety-five free subscriptions will also be offered to increase the attendance of underrepresented groups. The public call is open until November 1st.
The sixth edition of the Digital Data and Methods Journalism Conference is developed with Google News Initiative and has the support of the US Embassy and Consulates in Brazil; the Hivos Foundation; the Brazilian Institute of Teaching, Development, and Research (IDP); the Brazilian Association of Investigative Journalism; the Brazilian Institute for Research and Data Analysis (IBPAD); Insper; R Consortium; and Datopian.
LAI and LGPD, book launch and climate crisis
Focusing on the complementary relationship between transparency and privacy, the first panel will discuss how public institutions deal with the Law on Access to Information (LAI) after the General Data Protection Law (LGPD) came into force in Brazil. Fernanda Campagnucci, CEO of Open Knowledge Brasil, will moderate the discussion among the following speakers: Maria Vitória Ramos (Fiquem Sabendo), Jamila Venturini (Derechos Digitales), and Paulo Rená (Instituto Beta).
The panel “Data Journalism in the World” marks the launch of the Portuguese version of “The Data Journalism Handbook: Towards a Critical Data Practice” with Natália Mazotte (Insper), one of the founders of the School of Data in Brazil, in addition to the participation of Cédric Lombion (Open Knowledge Foundation), Liliana Bounegru and Jonathan Gray (King’s College London).
And while the United Nations Conference on Climate Change (COP26) brings together global leaders, Coda.Br will debate the coverage of the climate emergency by journalism, pointing out problems and possible solutions in this area in the panel “Climate crisis in data journalism”. The activity will be moderated by Gustavo Faleiros (InfoAmazonia) and will feature Letícia Cotrim da Cunha (UERJ), Francy Baniwa (National Museum), and Clayton Aldern (Grist).
This year’s keynote presentations include Gurman Bhatia, an independent data visualization designer; Sondre Solstad, data journalism editor at The Economist; and Jim Albrecht, director of product management at Google. The Cláudio Weber Abramo Award for Data Journalism ceremony ends the Conference, with presentations by the finalists and the announcement of the winning projects of this edition of the award.
INFO
6th Coda.Br – Brazilian Conference on Data Journalism and Digital Methods
Date: November 8th to 13th
Value: R$40 (access to all event activities) or R$180 (one-year subscription to Escola de Dados, which allows access to the event and other benefits).
Registration and more information about the schedule:
School of Data is a global network aiming to empower citizens to contribute to the strengthening of democracies. Escola de Dados is the local chapter of this network and part of Open Knowledge Brasil (OKBR). The program trains researchers, NGOs and journalists, teaching them how to use open data to promote well-informed debates and create effective narratives for their agendas.
ABOUT OPEN KNOWLEDGE BRAZIL
Created in 2013, Open Knowledge Brasil (OKBr) is the local chapter of Open Knowledge Foundation. It is a non-profit Civil Society Organization (CSO) that uses and develops civic tools, projects, public policy analysis, and data journalism training to promote open knowledge in various fields of society.
R-related workshops @ Coda.Br 2021
Data and health: Sivep without secrets
By Carolina Moreno and Raphael Saldanha
Come learn how to analyze the most useful database to cover Covid-19 in Brazil: the Sivep-Gripe. Authorities, experts and journalists use it to follow the trends of hospitalizations and deaths. This anonymized database is public and is available to anyone who knows how to handle large datasets. However, knowing the code to manipulate the data is not enough. In this workshop, you will have access to specific knowledge about the correct filters to apply, in addition to the dynamics of information systems and epidemiological issues that must be taken into account in the coverage.
Carolina Moreno is a senior data journalist for TV Globo. She has been a journalist since 2006, specializing in journalism editing since 2009, and has produced data-driven reporting since 2017. She has covered Covid-19 pandemic data since its beginning for local and national news programs. Winner of the 2014 and 2015 Andifes Award, second place in the 2019 Impa Award. Participant in R-Ladies São Paulo since 2019.
Raphael Saldanha is a health data scientist with a PhD in Health Information and Communication from Fiocruz, one of the most prestigious health institutions in Brazil. He works on quantitative health research and the production of data visualization dashboards. He has been working with COVID-19 data since the beginning of the pandemic, building Fiocruz MonitoraCovid-19’s COVID-19 monitoring panel. He has been teaching R courses since 2010.
Tools to mitigate AI biases
By Gabriela de Queiroz and Paolla Magalhães
In this workshop, you will learn how to measure and mitigate bias in your data and models using the AI Fairness 360 open-source toolkit. You will learn which metric is most appropriate for a given case and when to use many of the different bias mitigation algorithms. The workshop will mention the R package.
Gabriela de Queiroz is a Chief Data Scientist at IBM California leading AI Strategy and Innovations. She drives AI adoption across existing and potential customers and leads outreach strategy across our open source ecosystem and data science community. Previously she was a Program Director working on Open Source, Data & AI Technologies at IBM.
Creating a reproducible project in R
By João Santos
In corporate and scientific works we are increasingly faced with scenarios where we try to reproduce the code written by someone else and we find inconsistencies and errors. The solution to these problems lies in a series of practices and conventions that ensure that your code runs consistently. In this workshop, you will learn how to develop a reproducible project in R. We will make use of libraries and directory organization best practices, making our results permanently consistent.
João is currently a Jr. Data Engineer at Account Split. He serves as a research assistant in the Department of Political Science at Emory University, where he researches political disinformation. He is a major in International Relations at PUC-Rio, and holds the AWS Certified Cloud Practitioner certification.
Graphs everywhere: how to create and analyze graphs
By Janderson Pereira
The purpose of this workshop is to present the concepts of graphs and relational data used to identify groups and their subjectivities. The idea is to show how to extract data from social networks, especially Twitter or Youtube, and then treat them to visualize interactions in order to be able to find groups that emerge when individual behaviours are aggregated. The R language and the Gephi program will be used to create the graphs.
Janderson is a data scientist and coordinator of innovation and forecasting at Natura & Co. He is a researcher at Citelab/UFF – Research Laboratory in Science, Innovation, Technology and Education and has a major in Media Studies at the Fluminense Federal University. He develops research in the area of social network analysis, focusing on methodologies for disseminating disinformation on social networking sites.
R Consortium talks to Harvey Lieberman about R/Pharma’s growth both before and after COVID. The group has adapted in a way that promotes R in pharma and allows it to be more inclusive.
R/Pharma is being held Nov 2-4, 2021. Register today! More information available here: rinpharma.com
RC: What is the R community like in R/Pharma?
We have an amazing community! We have been able to pull together a group of like-minded people who wish to contribute to R/Pharma. Each year we hold a conference that is entirely community driven from the organizing and program committees to those who work on presentations and workshops.
As a community-led effort anyone who wants to help can do so. Last year we tried to identify people who work on R in smaller biotechs so that we do not become too polarized towards bigger pharma companies.
We also have an active Slack group that helps build community.
RC: How has COVID affected your ability to connect with members?
A little history of R/Pharma so you can see how it evolved with COVID. We formed a few years ago with the main focus being holding a conference. It was clear that a lot of people were working with R in Pharmaceutical companies from early research through to production, but there wasn’t a conference focussed on this. There were many statistics-based conferences, several geared towards SAS, but nothing industry-based for R practitioners. The first two conferences we held were face-to-face at Harvard University in 2018 and 2019 with 150 attendees. It was clear that more people wanted to attend but we were limited in space. Late 2019 we started to think about how to expand, to accommodate more attendees, and then COVID hit. We quickly pivoted to a virtual event and ended up reaching far more people – with over 1000 registrations for 2020 and we are expecting more for 2021.
Our conference historically attracted attendees from the USA and Europe. The benefit of going virtual is that we can bring together people from all over the world. The challenge is in managing this post-COVID. R/Pharma has always strived to be a free conference without sponsors and we will be relying on our community to help put future events on in this spirit.
For 2022 we are hoping to host a hybrid event.
RC: In the past year, did you have to change your techniques to connect and collaborate with members? For example, did you use GitHub, video conferencing, online discussion groups more? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
We have an active Slack group which has been growing steadily since 2018. For the conference we use a GitHub repo to archive presentations and workshops, linked to our website. We also have a YouTube channel containing recorded talks and workshops from 2020. We can look at COVID as a double-edged sword with respect to connection – we were able to reach many more people last year but we lost the interpersonal interactions. It’s important to us to be inclusive and virtual experiences break down many barriers.
With regard to the conference in 2020, we held workshops via Zoom and the main conference through the Hopin platform. One way in which we promoted additional interaction was through virtual conference booths so that open source authors could showcase their packages and Shiny apps. We aim to host the 2021 workshops and conference the same way.
RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting?
We have been blessed with so many great speakers over the past three years. In our first year Joe Cheng gave a talk on Using Interactivity Responsibly in Pharma. Joe is an amazing presenter who can take a topic that is complex and explain it in a way that everyone can understand. The R/Pharma community is amazing and we always have incredible workshops in addition to talks. One that comes to mind is Leon Eyrich Jessen’s workshop on Artificial Neural Networks in R with Keras and TensorFlow. It’s a highly complex topic which Leon teaches in a 3- or 4-hour workshop, from which you leave thinking “how can I now apply this to my own problems?”
RC: What trends do you see in R language affecting your organization over the next year?
I think the big one in Pharma, in general, is R for Submissions. This is a space that traditionally has been very heavily SAS-oriented. There is certainly a move in the industry to start to use R. It’s slow because it requires a large amount of retraining, changing infrastructure and dealing with regulations. Leaving college now, you are more likely to be an R expert than a SAS expert.
Another area of growth within the industry is Shiny apps. These have democratized the ability to communicate complex statistical outputs. Couple that with Shiny modules and you have the ability to build complex interactive graphical apps rapidly.
RC: Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?
Externally I do not, but everyone in the industry uses these as a way to communicate internally on a daily basis. I’m working in a group that has started using data stories as a way to communicate complex information in a digestible way. As a Brit I tend to read the BBC a lot and like how they are embracing data journalism. FiveThirtyEight too is a great site.
RC: When is your next event? Please give details!
R/Pharma 2021 will be held from November 2-4. Workshops will be running the week before. The event is free and you can find registration details on our website at rinpharma.com.
R Ladies is my favorite. Our industry is trying to address a gender imbalance and R/Pharma, as an organization, is very conscious of that.
The R Validation Hub is heavily connected to R/Pharma. Having a way to validate packages is very important to our industry. Members of the R Validation Hub regularly present or host workshops at R/Pharma.
RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?
R for submissions. The R Consortium is spearheading an effort that is complex but important to our industry. Having a way to bring multiple companies together to work with regulatory bodies is essential.
How do I Join?
R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past 4 years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!
ELC: We are a national organization called R Hispano that hosts activities. We have a yearly conference called Jornadas which is like a workshop. The last conference in 2019 was a proper conference and we had a lot of companies committed to the conference. We had a lot of local R user groups there as well. We had a large number of national members in the organization from the local groups: Madrid, Canarias, Murcia, Málaga, Sevilla, Córdoba, Galicia, and Castilla-La Mancha. You can find an interview with R Hispano, president of an agency in the Ministry of Economy here.
RC: How has COVID affected your ability to connect with members?
ELC: Our annual conference is held in November. In 2019 we did it in Madrid, with the collaboration of the multinational company Repsol, and we invited famous speakers like Max Kuhn, Bernd Bischl, and Jo-Fai Chow (videos available here and slides and other material available here). We planned the 2020 meeting in Córdoba, but we had to postpone it due to COVID. This year it will not be possible to hold it, so we will do it next year. To keep the community alive, last year we organized, jointly with U-TAD, a two-day online event that was quite successful (encuentRo en la fase R, encounteR in the R phase). We used the online platform of U-TAD and Blackboard Collaborate. Javier Luraschi was our invited speaker, and the Ecuador R User Group organized the session. Also, the annual assembly of the association was held online, thanks to the University of Murcia Zoom platform. Definitely yes, these techniques help spread our activities and engage more people. Whenever it was possible, past annual conferences were also accessible via streaming.
The local groups have also adapted to this situation. The Canary Islands group organized a YouTube streaming event last April. The Madrid group resumed its meetings on May 26 and they share materials and videos online. The Murcia group has organized several events during the pandemic; the last one was online (workshop videos and materials are available here). Next month, the most recent group in Castilla-La Mancha, R Quixote, is hosting a workshop on R for Business, Teaching, and Research, both in-person (30 spots, filled in 24 hours) and online (unlimited).
RC: In the past year, did you have to change your techniques to connect and collaborate with members? For example, did you use GitHub, video conferencing, online discussion groups more? Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?
ELC: We used Blackboard Collaborate to run a conference. It was what we had and it worked fine. I prefer Microsoft Teams, which we use at the university. Sadly, a lot of people in the R community don’t tend to have access to it.
RC: Can you tell us about one recent presentation or speaker that was especially interesting and what was the topic and why was it so interesting?
ELC: The presentation by Borja Andrino during the Canarias meetup in April was very interesting to me. He is a data analyst in the prestigious “El País” newspaper. Within the team of Kiko Llaneras, they analyze election data, and all types of data using R and other tools, and we could see how R is used in something we see every day in the news.
RC: What trends do you see in R language affecting your organization over the next year?
ELC: The new pipe in R 4.1 and whether it will replace the tidyverse. In our group, we have a lot of fans of the data.table package, a lot of people who use base R, and a lot of people who use the tidyverse. Also, artificial intelligence with the new algorithms and how they are integrated into R. This is a trend that we will have to keep an eye on.
RC: Do you know of any data journalism efforts by your members? If not, are there particular data journalism projects that you’ve seen in the last year that you feel had a positive impact on society?
ELC: This is a topic that we include in many activities. At the 2019 conference, we held a round table with top actors in the Spanish scene. Not to mention the presentation by Borja mentioned earlier.
RC: When is your next event? Please give details!
ELC: As far as the local groups are concerned, the one by R Quixote is the next meeting. The next annual conference will be held in Córdoba. We probably will have more details after the summer, when vaccination in Spain advances and we can make plans for 2022. Maybe an online event (a new “encuentRo”) will also be planned before 2021 ends.
ELC: Everything related to spatial data and analytics is worth mentioning, as “maps” is something very important for outreach, so I would say my favorite now is Spatiotemporal Data and Analytics.
ELC: Even though I have been using R in Pharma and in Business (so those groups are amongst my favorites), I would say my very favorite is the Code Coverage one. I plan to improve my SixSigma package and my developments for companies by adding good software quality practices to them.
RC: There are four projects that are R Consortium Top Level Projects. If you could add another project to this list for guaranteed funding for 3 years and a voting seat on the ISC, which project would you add?
ELC: If I had to choose among the current projects, I would say “Database interoperability for spatial objects in R”, to facilitate “in production” applications of spatial analysis and visualization. If I could suggest a new project, I would support something related to R communities in production (business and/or public bodies). Similar to other community projects, people with common interests in R and their business could spread the word in sectors with high potential, such as the food industry, manufacturing, etc. I did submit a proposal some time ago on this as well.
The deadline for submitting proposals is October 31, 2021.
The September 2021 ISC Call for Proposals is now open. The R Consortium’s Infrastructure Steering Committee (ISC) solicits progressive, pioneering projects that will benefit and serve the R community and ecosystem at large. The ISC’s goal is to foster innovation and help bring your ideas into tangible realities.
Please consider applying!
Although there is no set theme for this round of proposals, grant proposals should be focused in scope. If you are currently working on a larger project, consider breaking it into smaller, more manageable subprojects for a given proposal. The ISC encourages you to “Think Big” but create reasonable milestones. The ISC favors grant proposals with meaningful detailed milestones and justifiable grant requests, so please include measurable objectives attached to project milestones, a team roster, and a detailed projection of how grant money would be allocated. Teams with detailed plans and that can point to previous successful projects are most likely to be selected.
The Enterprise Applications of the R Language Conference (EARL) is a cross-sector conference focusing on the commercial use of the R programming language. The conference is dedicated to the real-world usage of R with some of the world’s leading practitioners. This year, it was held September 6-10, 2021.
Thank you to everyone who joined us for EARL 2021 – especially to all of the fantastic presenters! We were pleased to receive lots of really positive feedback from the online event and there are plenty of highlights to share.
Branka Subotic, NATS
It was great to kick off EARL 2021 with our first keynote of the day from Branka. She has worked for NATS since 2018 and is currently their Director of Analytics. Branka shared with us interesting ways to help teams to work together and also some unusual ways to upskill! Her talk was peppered with some videos showing us flight data and the impacts of Covid.
Chris Beeley, NHS – Stronger together, making healthcare open – building the NHS-R Community
We are always delighted to hear from the NHS at the EARL Conference and this year was no exception. We were treated to a passionate talk from Chris on how the NHS-R community has been built up over the years and how their conference has gone from strength to strength. We all know how supportive the R community can be, so it is great to see this in action.
Amit Kohli – Introduction to network analysis
Amit gave us an introduction to the principles of network analysis and shared several use-cases demonstrating their unique powers. Amit also included a fun way to interact with his talk with the use of a QR code – we can always rely on Amit to entertain us! Our team thought it was a really interesting topic and it felt accessible to those who perhaps don’t know much on the subject.
Emily Riederer, Capital One – How to make R packages part of your team
We loved Emily’s fun concept of making R packages a real part of your team and her use of code, and the choices she made along the way. Her talk examined how internal R packages can drive the most value for their organisation when they embrace an organisation’s context, as opposed to open source packages which thrive with increasing abstraction. Read our interview with Emily here.
Dr. Jacqueline Nolis, Saturn Cloud
We closed the day with our final keynote talk from Jacqueline Nolis. She is a data science leader with over 15 years of experience in managing data science teams and projects, at companies ranging from DSW to Airbnb. She currently is the Head of Data Science at Saturn Cloud where she helps design products for data scientists. Jacqueline spoke to us about taking risks in your career and shared with us the various risks she has taken over her career and how they went! It was inspiring to hear from an experienced data scientist that it’s ok to take a risk every now and then – and refreshing to hear her honesty about what could have gone better – and how she has ultimately learned and grown from this.
These are just a few of the brilliant talks from a fantastic conference day. It was a delight to have speakers and attendees joining us from across the world – so thank you again to all that came along.
We are hoping to be back in London next year to host EARL in-person again. We are tentatively holding the 6th-8th of September 2022 as our conference dates. If you’d like to keep up-to-date on all things EARL please join our mailing list. We will open the call for abstracts in January 2022.