by Hadley Wickham and Joseph Rickert
The Infrastructure Steering Committee (ISC) was very pleased with both the quantity and quality of proposals received during the recent round of funding which closed on February 10th. Funding decisions were difficult. In the end, the ISC awarded grants to ten of the twenty-seven proposals it received for a total award of $234,000. Here is a brief summary of the projects that received awards.
Adding Linux Binary Builders to R-Hub – Award: $15,000. Primary Contact: Dirk Eddelbuettel (edd at debian.org)
This project proposes to take the creation of binary Linux packages to the next level by providing R-Hub with the ability to deliver directly installable binary packages with properly-resolved dependencies. This will allow large-scale automated use of CRAN packages anywhere: laptops, desktops, servers, cluster farms and cloud-based deployments.
The project would like to hear from anyone who could possibly host a dedicated server in a rack for long term use.
An Infrastructure for Building R Packages on MacOS with Hombrew – Award: $12,000. Primary Contact: Jeroen Ooms (jeroenooms at gmail.com)
When installing CRAN packages, Windows and MacOS users often rely on binary packages that contain precompiled source code and any required external C/C++ libraries. By eliminating the need to set up a full compiler environment or manage external libraries this tremendously improves the usability of R on these platforms. Our project will improve the system by adapting the popular Homebrew system to facilitate static linking of external libraries.
Conference Management System for R Consortium Sponsored Conferences – Award: $19,000. Primary Contact: Heather Turner (ht at heatherturner.net)
This project will evaluate a number of open source conference management systems to assess their suitability for use with useR! and satRdays. Test versions of these systems will be set up to test their functionality and ease of use for all roles (systems administrator, local organizer, program chair, reviewer, conference participant). A system will be selected and a production system set up, with a view to be ready for useR! 2018 and future satRdays events.
Continued Development of the R API for Distributed Computing – Award: $15,000. Primary Contact: Michael Lawrence (michafla at gene.com)
The ISC’s Distributed Computing Working Group explores ways of enabling distributed computing in R. One of its outputs, the CRAN package ddR, defines an idiomatic API that abstracts different distributed computing engines, such as DistributedR and potentially Spark and TensorFlow. The goal of the project is to enable R users to interact with familiar data structures and write code that is portable across distributed systems.
The working group will use this R Consortium grant to fund an internship to help improve ddR and implement support for one or more additional backends. Please contact Michael Lawrence to apply or request additional information.
Establishing DBI – Award: $26,500. Primary Contact Kirill Müller (krlmlr at mailbox.org)
Getting data in and out of R is an important part of a statistician’s or data scientist’s work. If the data reside in a database, this is best done with a backend to DBI, R’s native DataBase Interface. The ongoing “Improving DBI” project supports the DBI specification, both in prose and as an automated test. It also supports the adaptation of the `RSQLite` package to these specs. This follow-up project aims to implement a modern, fully spec-compliant DBI backends to two major open-source RDBMS, MySQL/MariaDB and PostgreSQL.
Forwards Workshops for Women and Girls – Award $25,000. Primary Contact: Dianne Cook (rowforwards at gmail.com)
The proportion of female package authors and maintainers has remained persistently low, at best at 15%, despite 20 years of the R project’s existence. This project will conduct a grassroots effort to increase the participation of women in the R community. One day package development workshops for women engaged in research will be held in Melbourne, Australia and Auckland, New Zealand in 2017, and at locations yet to be determined in the USA and Europe in 2018. Additionally, one day workshops for teenage girls focused on building Shiny apps will be developed to encourage an interest in programming. These will be rolled out in the same locations as the women’s workshops. All materials developed will be made available under a Creative Commons share-alike license on the Forwards website (http://forwards.github.io).
Joint Profiling of Native and R Code – Award: $11,000. Primary Contact: Kirill Müller (krlmlr at mailbox.org)
R has excellent facilities for profiling R code: the main entry point is the Rprof() function that starts an execution mode where the R call stack is sampled periodically, optionally at source line level, and written to a file. Profiling results can be analyzed with summaryRprof(), or visualized using the profvis, aprof, or GUIProfiler packages. However, the execution time of native code is only available in bulk, without detailed source information.
This project aims at bridging this gap with a drop-in replacement to Rprof() that records call stacks and memory usage information at both R and native levels, and later commingles them to present a unified view to the user.
R-hub #2 – Award: $89,500. Primary Contact: Gábor Csárdi (csardi.gabor at gmail.com)
R-hub is the first top level project of the R Consortium. The first stage of the project created a multi-platform, R package build server. This proposal includes the maintenance of the current R-hub infrastructure and a number of improvements and extensions including:
- R-hub as the first step of package submissions to CRAN
- R package reverse dependency checks, on R-hub and locally
- General R code execution, on all R-hub platforms
- Check and code quality badges
- Database of CRAN code
- The CRAN code browser
School of Data Material Development – Award: $11,200. Primary Contact: Heidi Seibold (heidi at schoolofdata.ch)
School of Data is a network of data literacy practitioners, both organizations and individuals, implementing training and other data literacy activities in their respective countries and regions. Members of School of Data work to empower civil society organizations (CSOs), journalists, civil servants and citizens with the skills they need to use data effectively in their efforts to create better, more equitable and more sustainable societies
Our R consortium will develop learning materials about R for journalists, with a focus on making them accessible and relevant to journalists from various countries. As a consequence, our content will use country-relevant examples and will be translated in several languages (English, French, Spanish, German).
Stars: Scalable, Spatiotemporal Tidy Arrays for R – Award: $10,000. Primary Contact Edzer Pebesma (edzer.pebesma at uni-muenster.de)
Spatiotemporal and raster data often come as dense, two-dimensional arrays while remote sensing and climate model data are often presented as higher dimensional arrays. Data sets of this kind often do not fit in main memory. This project will make it easier to handle such data with R by using dplyr-style, pipe-based workflows, and also consider the case where the data reside remotely, in a cloud environment. Questions and offers to support are welcome through issues at: https://github.com/edzer/stars .