Skip to content

For comment: Roslyn team SLA for addressing community PRs and bug submissions #26266

Open

Description

It's been nearly four years since we moved Roslyn to open source, and during that time we’ve worked on refining our processes and infrastructure to better enable and support it. Much of that effort has been focused on the CI (continuous integration) train, particularly focused on automated testing.

One thing that we recognize that we haven’t done as well as we’d like is the processing of community-submitted PRs and bugs. In fact, we have over 200 unprocessed PRs, many of which are quite old, along with rather more bug reports which are still open. We’ve attempted several times to impose more rigor on getting these processed. However, for various reasons (but ultimately due to the availability of engineers to drive through the backlog vs. other high priority work in Visual Studio), we haven’t been able to sustain those efforts. We want to do better here.

In discussing this in one of our weekly Roslyn leads’ meetings, we agreed that previous attempts to make progress here were too ad hoc. We further agreed that this type of community support can only succeed if we bake in time to each sprint to address it. To that end, we want to start by creating an SLA to which we would hold ourselves accountable by setting goals and reporting progress against them in a transparent way each sprint, which then would allow us to better build in cost for support each sprint after a few sprints’ worth of data.

What follows is a draft of that plan, reviewed by the leads of the Roslyn team and cross-checked by managers in peer organizations. In the spirit of open source, we would like community & Roslyn team feedback on it before ratifying it.

PRs:

While the Roslyn team has the right and obligation to move Roslyn in directions which might conflict with community-submitted PRs, we should not leave PRs hanging without any feedback at all. We now have accumulated a lot of PRs, many of which are old and possibly not even relevant to the state of the codebase. To address this, we are considering the following:

Existing PRs:

  1. We would reject all current PRs older than six months, with apology. This does us no credit, but it is a realization that the codebase will have changed significantly with regard to many (but not all) of the PRs involved. Those PRs would be rejected “without prejudice,” meaning that the owner could re-submit them against the current state of the code, updating the PR as appropriate.
  2. The remaining backlog of existing PRs would be triaged against the process below as if they had been newly submitted.

New PRs:

The lead developer for each area of the repo will do first-pass triage community PRs within a week of submission for the PRs in their area. (In practice, this will probably mean that a lead will set aside one block of time each week for this.) The first pass will result in the following visible status update within that week:

  1. Approved by Lead: The PR clearly meets the bar (i.e. it fixes a meaningful problem, has associated unit test(s), and does not cause a deviation from planned product direction).
    • The PR will be absorbed into the next available branch which is not in a stabilization phase.
    • The PR will be assigned a Microsoft “buddy” on the team who will help mentor it through. Besides helping with code and process, the buddy will look for any additional testing needs (our existing test scenarios may not cover this fix and it should not be assumed that the new/changed scenario will be covered in our daily testing) and also look for additional regression prevention required that cannot be fulfilled through unit testing
    • If completion is blocked by unforeseen complications involving the team’s internal infrastructure or code collisions, the buddy will keep the submitter informed, loop them in to address issues, etc.
    • During that PR completion phase, the PR buddy will get back to the contributor within three business days to review any follow-ups (the buddy will delegate that to a fellow team member if away).
    • Conversely, if the contributor has been asked for updates or clarifications, and the buddy has not heard back from them in one month, the PR will be assumed to be abandoned (and thus rejected) unless other time limits have been agreed upon with the PR buddy.
  2. Pending discussion: There are questions about the PR that need to be resolved – code needs to be changed, unit tests are missing, is unclear on whether it is in the direction we want to take the codebase, or we need clarification on the purpose of the PR.
    • The lead will make it clear as to what needs to be done or answered.
    • A “buddy” will be assigned to assist the contributor if they have any questions, but the onus is on the contributor to follow up on any questions/concerns/code within two weeks from the last question unless otherwise agreed upon.
    • PRs need to keep up with the rest of the codebase during this time.
    • Discussions will always take place in the GitHub issue except in the rare case where some part of the discussion needs to happen internally due to legal issues that we cannot control (e.g., code change collides with undisclosed Microsoft IP).
  3. Rejected: The PR is not in the direction we want to take the codebase, has too much risk, or does not solve a problem of meaningful priority. (Or, in the case of the existing backlog, the PR needs to be brought up-to-date.)
    • The lead will make it clear why the PR is being rejected.
    • PRs can be resubmitted “without prejudice;” however, realistically such PRs would not be approved unless significantly changed (or, in the case of the existing backlog, updated).

PR report:

The Roslyn leads will each produce a brief report each sprint which shows how we are tracking against PRs for their area (PRs completed in bounds, PRs in discussion, PRs out of bounds, PRs rejected). These areas include Productivity (IDE/Debugger/Analyzers), Compiler & Language, Project System, and .NET Core CLI/SDK.

Bugs:

Reported bugs from customers (w/o PRs) in GitHub will be treated identically to any other bug, whether reported from customer via the Visual Studio Feedback mechanism, discovered internally by the team, etc. Specifically:

  • During the normal daily bug triage, each lead will prioritize the bug.
    • High priority bugs (crash, hang, data loss, regression, security, major malfunction) will get triaged to a Microsoft developer and put in their queue to address in the current milestone.
    • Small malfunctions will be marked as such and slotted/assigned when developers become available, and will be moved to a backlog milestone mapping to no particular release.
    • Minor issues (spelling errors, fit-n-finish, small issues with easy/obvious workaround) will be marked as “Up for Grabs” for community PR contributors, and will be moved to a backlog milestone mapping to no particular release.
    • Even if a bug is not marked “Up for Grabs,” we welcome community contributions on it (but if it’s assigned to someone else, check with them first) regardless of priority.
    • In triage, bugs may also be marked as “Not repro” (which can be cleared if a repeatable repro is found) or “Won’t Fix” if the lead determines that the bug has a valid reason to be punted (too risky, code is scheduled for obsoletion, etc.) (Although we are using the term “bugs” here in this SLA, the same process would be relevant to suggestions for improvements, etc.)
  • The amount of bugs that the team can fix is constrained by the availability of engineers as well as shipping schedules. (Community contributions help with that, but can’t totally offset that.)
  • Because of this, we gradually amass some non-trivial number of minor bugs that we are unlikely to ever get to, and which become increasingly irrelevant as the product changes. This, in turn, makes it difficult for the team to understand our actionable bug debt when scheduling.
  • Therefore, the team periodically clears out (“Won’t fix”/”Closed”) the collection of minor bugs which haven’t met the bar for fixing.
    • We don’t have a hard-and-fast schedule for doing such a pass through the bugs, but the rule of thumb would be about once a year.
    • These “won’t fixes” can be reactivated if subsequently discovered to be more important, on a per-instance basis, but in general, closing an issue will mean an end to any consideration of it.

Please discuss. The goal is to have a plan in place by May 15, 2018.

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions