Skip to main content

Google Summer of Code 2020 improved automatic maintenance of conda-forge

· 3 min read
Filipe Pires Alvarenga Fernandes

The conda-forge "autotick" bot is a crucial part of conda-forge's infrastructure. It enables automatic maintenance of conda-forge packages by pushing version updates to the underlying software and enabling large migrations of packages from one dependency to another (e.g., Python 3.7 to Python 3.8). As conda-forg grows in size, with over 9,000 packages to date, automatic maintenance of the conda-forge ecosystem will become even more important.

We here at conda-forge have a large number of potential Google Sumer of Code projects around maintenance and development of the autotick bot infrastructure. These projects are high-impact, affecting the entire conda-forge ecosystem. They also cover multiple systems including databases, conda's CDN provider, continuous integration providers, and user interactions on GitHub.

Want to be a part of the team? Great! Take a look at the projects below and get in touch with us on GitHub! You can check the GSoC label for a detailed listing of the issues that need work.

  1. Maintenance and Refactoring

    We have a large backlog of maintenance and refactoring issues that are great for people with a range of experience from beginners to true code ninjas.

    • Integration Testing for the Autotick Bot

      Run true integration testing on a copy of the graph to better test code changes and improve our CI process.

      Issue: regro/cf-scripts261

      Experience Level: advanced

    • Work on the "code hardening" Milestone

      Address any of the issues in the milestone above related to code refactoring and cleanup.

      Issues: regro/cf-scripts milestone #4

      Experience Level: beginner to advanced

  2. Automated PRs for Globally Pinned Packages

    conda-forge maintains a list of globally pinned packages. These are typically dependencies whose version needs to be the same across all of conda-forge (e.g., the compiler versions or packages like HDF5). While we have infrastructure to run a migration of the downstream packages from a given pinned package, we do not have automated infrastructure to propose the migration of the pin itself. The project here is to add this functionality to our infrastructure.

    Issue: regro/cf-scripts#665

    Experience Level: advanced

  3. Check the conda CDN for Updated Packages before Issuing PRs in a Migration

    conda relies on a CDN provider to serve the index of available packages. There is a moderate ~30 minute delay between when a package is uploaded to anaconda.org and when it will appear in the conda index. We currently do not take this delay into account when issuing PRs in a migration.

    Issue: regro/cf-scripts#595

    Experience Level: beginner

  4. Finish Migrations with PR into the conda-forge Pinnings File

    Right now, when the migration of say package ABC to version X from version Y is done, we do not automatically merge the change in the globally pinned value of ABC into our listing of global pinnings. We should be issuing a PR to the pinnings file once we have determined that a suitable fraction of the packages effected by a migration have been properly rebuilt.

    Issue: regro/cf-scripts#595

    Experience Level: moderate

  5. Fully Render conda Packages in order to Determine Migration Dependencies

    Determining the dependencies of a given package is actually a computation expensive task due to the way conda recipes are structured and parametrized through the use of Jinja2 and conda-build-config.yaml files. Currently, the autotick bot examines the static metadata in the meta.yaml file and not the fully rendered metadata. For this reason, we sometimes miss dependencies of a given package that need to be migrated first. Addressing this issue involves both calling the rendering process and also scaling that process to the entire set of conda-forge packages.

    Issue: regro/cf-scripts#664

    Experience Level: moderate