2016-06-24: General discussion

(please note this document previously incorrectly slated the meeting for the 17th)

Time: 14:00 UTC

Hangout link: https://hangouts.google.com/call/v5olhwzpfzgzpoq5i3wthjpqpie

Attendees

Bjˆrn Gr¸ning

Filipe

John Kirkham

Michael Sarahan

Jonathan Helmus

Matt Craig

Phil Elson

Standing items

  • How many repos?

  • How many contributors?

  • New core devs?

Agenda

  • Low level packaging

  • Split gcc or work with defaults? We need a better and more consistent way to build packages that depends on Fortran and libgomp or we will keep seeing broken packages when mixing conda-forge and defaults.

  • Basic community practices when PR-ing to staged-recipes.

  • Recently I present conda-forge in a NOAA/IOOS in DC. Most people are excited about conda-forge, but reluctant to switch from the IOOS channel to conda-forge. The main reason is, of course, control. I made my best ensured them that conda-forge will follow all the good community practices as any other open source project that they already rely on. However, there are still some concerns. I would like to present a summary of the discussion in our meeting.

  • NetCDF (also curl/ca-certificates and Perl packages) - Done?

    *   curl and ca-certificates are done and available. 
    
    • Perl is no longer relevant as part of this process

  • GitHub rate limiting. How can we further mitigate these? This is a duplicate, it appears again below.

  • PR reviews

    *   Treat every PR as a Work in Progress. At least let PRs sit for a few hours before merging them.
    
    • Wait for answers when we ask clarification questions and avoid acting before we have them.

    • Respect the first reviewer by not repeating her/his review comments with another words. That is also bad for the person submitting the PR as it is confusing.

    • Avoid the death by a thousand cuts: Many small “nit” comments that might scare new contributors ( ping Mike S ;-)

  • Notifications (how do we stay on top of them)

  • Standardizing installs

    *   Mention [`toolchain`](https://github.com/conda-forge/toolchain-feedstock) .
    
            *   Discuss rollout to feedstocks.
    
    *   Get feedback on [`python-toolchain`](https://github.com/conda-forge/staged-recipes/pull/642) 
    
  • MSYS2

    *   Discussing Ray Donnelly's work on MSYS2 packages and how we want to use and integrate these into conda-forge.
    
    • Some use cases to consider OpenBLAS, FFTW, build tools, others?

  • Binary data

    *   Do we include it in recipes?
    
    • What kinds do we allow if any (e.g. icons)?

    • How do we verify the licensing?

    • How do we verify that they are safe?

  • OpenBLAS (on Windows)

  • Dev releases: Where do they happen?

    *   Do we do them at conda-forge?
    
            *   Maybe add a label.
    
    *   Do we let others do them with a feedstock on their own repo?
    
    • How do we enforce whatever we decide?

  • Conda-forge installer

    *   We have Python 3.5 now
    
    • Still need conda.

    • New repo?

    • Where do we host the installers? Git tags?

  • GitHub rate limitations. How can we further mitigate these?

    *   GitHub letter ( [](https://docs.google.com/document/d/19HLtYPwg6IKAwmxPwL7Vd3AX0n47ANP-ZTpZROn-Cwc/edit?pref=2&pli=1)https://docs.google.com/document/d/19HLtYPwg6IKAwmxPwL7Vd3AX0n47ANP-ZTpZROn-Cwc/edit?pref=2&pli=1 ).
    
            *   +1, this reads very well
    *   +1 also -- is it appropriate to ask for advice on how to reduce our API calls or queue them up in the event they are unwilling to raise  limit?
    *   So, there have been updates since this was initially added. See this issue ( [conda forge/conda forge.github.io#88](https://github.com/conda-forge/conda-forge.github.io/issues/88) ). They wrote this letter in reply ( [](https://docs.google.com/document/d/1lzWNxvmEtrgjSBVrUWEO-imDryBOLRfObz3PkI9qT5Y/edit?pref=2&pli=1)https://docs.google.com/document/d/1lzWNxvmEtrgjSBVrUWEO-imDryBOLRfObz3PkI9qT5Y/edit?pref=2&pli=1 ). Basically, they said that it wouldn't make sense for them to bump our rate limit in this way as our current usage scales poorly. I think I agree with that sentiment. Wrote up this proposal for more optimizations ( [conda forge/conda forge.github.io#172](https://github.com/conda-forge/conda-forge.github.io/issues/172) ). Have done some of them. See this PR ( [conda forge/staged recipes#733](https://github.com/conda-forge/staged-recipes/pull/733) ) for part of the fix. This has greatly improved the situation. Though we still have some issues.
    
  • Channel mirroring

    *   Can this point be a little bit explained? I thought about this as well and would like to contribute to this point.
    
  • Feedstock history

    *   Is it sacred?
    
    • Do we rebase/force push?

          *   If so, under what conditions?
      
      • How do we avoid multiple people doing this simultaneously?

  • Docker hosting solution

    *   Docker Hub builds were broken for a week and a half.
    
    • Have switched to quay.io currently.

    • Mirroring quay.io image on Docker Hub.

    • Thoughts about quay.io? Thoughts about hosting in general?

  • Continuum metadata request: can we add these to linter?

    *   example metadata: [](https://github.com/ContinuumIO/anaconda-recipes/blob/master/anaconda-build/meta.yaml#L36-L44)[https://github.com/ContinuumIO/anaconda-recipes/blob/master/anaconda-build/meta.yaml#L36-L44](https://github.com/ContinuumIO/anaconda-recipes/blob/master/anaconda-build/meta.yaml#L36-L44)
    
    • Also, distinguish summary (limit of 77 or 80 chars) from description (unlimited)

  • Google hangouts has a max capacity of 10. Is it worth considering other methods of communication so everyone who wants to participate can?

  • Maybe this ( http://www.freeconferencecalling.com/ ) is an option.

  • Bluejeans

  • Continuum has webex. Past experience is that some Linux platforms had trouble connecting

  • Drop numpy 1.10 and reduce our build matrix. (Numba now works with numpy 1.11.)

  • This comment from the PR for graphviz is the best summary I’ve seen: conda forge/staged recipes#568#issuecomment-225315370

  • Thanks for pointing this out. The described solution looks reasonable and is preferable to prefixing package names. Great!

  • Signing packages

    *   Should be easy to do. ( [](http://conda.pydata.org/docs/signed-packages.html)http://conda.pydata.org/docs/signed-packages.html )
    
    • There has been some interest previously.

  • HTTPError: 503 Server Error: Service Unavailable: Back-end server is at capacity for url…

    *   Seems we are regularly running into this issue under normal usage conditions.
    
    • Had discussed previously caching packages on AppVeyor and trying to reuse those to start.

    • Maybe we need to consider caching on all CIs.

    • Building our own Miniconda-like self-extracting scripts with packages via constructor.

Notes

Most pressing issues: naming conventions

Naming conventions

  • Continuum’s opinion : will take some time for name spaces to take effect, does not want to break anyone’s setup, so keep current names, can we follow defaults where defaults have precedent? Where Continuum does not have packages can they follows conda-forge?

  • simplegeneric issue, clobbering

  • how to know what package gets installed when you do a “conda install gplot”? Leads to reproducible environments.

  • Start with no namespaces, get name-spaces after you install a “core” package (python, r, etc), then you will get packages which match the languages in your environment

  • Want conda to act like pip, cran, etc, “just works”

  • What to do about dependencies?

  • Proposed that when you install a package you will get the packages in all namespaces?

  • Another option is to specify language in package name (python-simplegeneric), and have lookup table for “common” packages

  • Should raise issues on conda GitHub repo

  • No easy solutions, but we need to choose some solution

  • meta-package which use “common” name

  • Correct solution is to prefix everything with “python-“ but people do not want to do this when installing and people are already used to the old method.

  • Filipe’s issue with namespaces is that it makes choices for users, would rather have that choice… raise on GitHub issue XXX

  • For many users conda is a drop in replacement for pip, should we keep this big advantage?

  • Are there less engineered solution than namespacing?

    *   Should be raised in GitHub, submit PRs 
    
  • Prefix everything and have conda install be smart about finding these packages?

  • Do not prefix packages which are in defaults but anything not in defaults should be prefixed with python-, r-

  • Post toy examples in a PR to conda, see if it works?

  • Continue discussion later…

Skeleton generator

  • skeleton generator should use prefix names?

  • skeleton needs some updates, does not

  • John has Jinja template which generates meta.yaml, could we use this? Needs to pull data from setuptools

  • Should conda-forge ship it’s own skeleton generator? Or something different

Governance

  • NOAA worried about losing control over repo

  • Worried about hastily merged PR and similar issues

  • Write proposal for guidance of what a good PR looks like, self-merging, and similar issues

NumPy issue

  • Would libgfortran fix this issue?

  • Would like Micheal Grant look at solver before creating conda-forge libgfortran

  • libquadmath, current plan to include with libgfortran, not used in defaults, should these be separate packages?

  • libstdc++

  • Need to standardize on common compiler stack between conda-forge and Continuum

Suggestions for Phil’s priorities

  • conda-build-all

    *   [SciTools/conda build all#41](https://github.com/SciTools/conda-build-all/issues/41)
    

Service to run builds on beta releases of conda-build

Copy of “stable” packages?

  • Consolidate multiple PR into a single version

  • conda-build-all PR