Page MenuHomePhabricator

Implement a way to bring GitHub pull requests into gerrit
Closed, DeclinedPublic

Description

First: Find the various GitHub mirrors of MediaWiki. Turn one into an up-to-date clone of our codebase.

Next, harder step: Make an interface to accept GitHub pull requests as merge requests in Gerrit, or do two-way syncing automatically via a bot.

Erik: "although I'm guessing that falls into the "hard" category, it could give us huge wins in terms of casual contribution.

Chad: "Yeah, that's definitely not straightforward--would need some careful thought to make sure we're doing it in a way that makes sense on our end too."

Details

Reference
bz35497

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

(In reply to comment #5)

We can probably do something to indicate "sorry, please don't submit issues
here", but we do want people to submit pull requests to GitHub

(In reply to comment #6)

You can disable issue tracking on GitHub, iirc.

(In reply to comment #13)

  • admin settings: pull-request: true, issues: false, wiki: false

(In reply to comment #17)

The Issues and Wiki features in GitHub repositories can be disabled easily.

bug 38196 has been fixed. We have replication now, and the issue tracker and wiki are disabled:

https://github.com/wikimedia/mediawiki-core

In case it helps, I have listed this task at

https://www.mediawiki.org/wiki/Project:New_contributors#Merging_GitHub_pull_requests

What help do you need to get it done?

Also a question: is this about bringing pull requests to Gerrit as soon as they are submitted or about merging the ones that the maintainers accept in GitHub? The difference is where is the discussion supposed to happen.

sumanah wrote:

http://lists.wikimedia.org/pipermail/wikitech-l/2013-March/067665.html ("Moving a GitHub Pull Request to Gerrit Changeset manually") and the subsequent thread may be useful.

Yuvi's tools are now bridging GitHub and Gerrit for a number of repostories that have requested it. More details at https://www.mediawiki.org/wiki/User:Yuvipanda/G2G

Seems to work. FIXED?

It is currently enabled for the following repositories, after consulting with people responsible for the repos:

qa/browsertests
extensions/MobileFrontend
extensions/PostEdit
extensions/GuidedTour
extensions/EventLogging
extensions/GettingStarted

Any pull requests to the appropriate repo in GitHub will automatically (and instantly!) have a Gerrit patchset created, and a comment will be left on the GitHub Pull Request informing people of the appropriate Gerrit URL. This is currently running on Wikimedia Tools Labs, and should be considered experimental. If anyone else wants to have their repo added to the 'watched' list, do let me know :)

Resolving as FIXED. There is a functional proof of concept and some precedents. There is a process to request new repos to be added. And there might be bugs, but we have the usual Bugzilla process for this.

Thank you to everybody involved! This is just the beginning.

It now syncs Comments / Changeset Abandon / Merging / Restore on Gerrit on GitHub! See https://github.com/wikimedia/apps-android-commons/pull/6 for an example (though usually it closes the PR automatically).

It doesn't sync inline comments yet, so I'll need to figure that out. There is also a Redis based reliable queue running on Tools-labs, so future tools that depend on Gerrit can be easily built too. Ideas for such welcome :)

Yuvi, what's the process for adding new repos? Maybe some docs can be added to http://www.mediawiki.org/wiki/Gerrit (which links to this bug).

I guess it's time to reopen this. The bot has been down for a long while and Yuvi doesn't know when he's going to get time to fix it.

sumanah wrote:

If Yuvi does not have time to do this (and, presumably, to get GitHub->Phabricator syndication going), does anyone else?

If Phabricator is the focus of this report now, then we will have a duplicate after the migration:

GitHub->Phabricator bridge for new contributors
https://phabricator.wikimedia.org/T173

Consolidating discussion in T173 - please comment there.

Spage renamed this task from Implement a way to bring GitHub pull requests into Phabricator to Implement a way to bring GitHub pull requests into gerrit.Mar 26 2015, 9:15 PM
Spage reopened this task as Open.
Spage lowered the priority of this task from High to Medium.
Spage set Security to None.

This bug from 2012 was about GitHub -> Gerrit. Someone mistakenly changed it to GitHub -> Phabricator and made it a dupe of T173. But the Gerrit integration described in T37497#398909 is no longer working, and it's still nice to have regardless of what happens with Phabricator. YuviPanda commented on IRC

I have a bot that does that but it has been dead for years. Takes maybe a day of work to bring back up but I haven't had that day. Hackathon project maybe

greg lowered the priority of this task from Medium to Low.Sep 10 2015, 11:05 PM

@yuvipanda hi do you know if you can get the bot working that previously worked please.

Or we can use this plugin https://gerrit-review.googlesource.com/#/admin/projects/plugins/github I'm not sure if that will automate things but it seems you managed to get the bot to do more then what this plugin will do.

@demon we can use https://gerrit-review.googlesource.com/#/admin/projects/plugins/github to bring in pull requests from GitHub.

I also wrote this a very very long time ago (disclaimer: Probably needs rewriting)

This proposal is selected for the developer wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) summarizing the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header with a title "Description," to your content. Please do so before February 5th, 12:00 pm UTC.

We can install https://gerrit-review.googlesource.com/admin/repos/plugins/github (which is what gerrithub uses to pull in pull requests)

Have you got this working on your labs test instance?

Nope, haven't tried it yet. Though i can :)

Nope, haven't tried it yet. Though i can :)

Would be good for starters to see it working etc.

Ideally on the version we're running in prod if it's possible

Oh, i have upgraded everything to 2.16.

Oh, i have upgraded everything to 2.16.

Well, start with the plugin there... And see what happens

I asked luca weather the GitHub plugin supported ldap (not just oauth). It does not see https://groups.google.com/forum/#!topic/repo-discuss/cos1rCFld8Q

But possibly we could break out pull requests from it and create a custom plugin.

I asked luca weather the GitHub plugin supported ldap (not just oauth). It does not see https://groups.google.com/forum/#!topic/repo-discuss/cos1rCFld8Q

But possibly we could break out pull requests from it and create a custom plugin.

I'm not sure what ldap has to do with it

FWIW, ideal end goal would be two way comment syncing between PR, and auto closing of the github issue when merged (or abandoned)

I asked luca weather the GitHub plugin supported ldap (not just oauth). It does not see https://groups.google.com/forum/#!topic/repo-discuss/cos1rCFld8Q

But possibly we could break out pull requests from it and create a custom plugin.

I'm not sure what ldap has to do with it

The plugin uses oath. It imports the user when you import a pull.

(seems it dosen't allow you to use ldap as a login system but use oauth just for GitHub to authenticate.

We can install https://gerrit-review.googlesource.com/admin/repos/plugins/github (which is what gerrithub uses to pull in pull requests)

Readme. Seems to do a lot of things we might not want.

  • Users can login to Gerrit using the same username and credentials in GitHub. Gerrit login points to GitHub for generating the OAuth token to be used for the code-review authenticated session.
  • Existing GitHub repositories are automatically replicated to Gerrit for the purpose of performing code-review and pushing back changes once approved.

Really the "Pull-request to Change" part is the only one relevant for us. The patch for that says "Pull requests with new commits are displayed, as the new commits will be imported as new dependent changes." which does not seem like an ideal workflow (Github will turn all those commits into a single merge commit; they often just consist of small fixes in response to code review, turning them into a Gerrit commit chain seems like the wrong way to handle them. Also, how are force pushes handled then?). Granted that patch is five years old.

the "Pull-request to Change"

This is the part that I care about too. I don't need two way sync (even though that is indeed cool). I'm fine with merging and managing pull requests from github in gerrit, I just don't want to make non gerrit users have to figure out how to use gerrit just to make a patch.

For uploading patches one can use https://www.mediawiki.org/wiki/Gerrit_patch_uploader . Interface: https://tools.wmflabs.org/gerrit-patch-uploader/ Should be straightforward :-)

For me, I would want a way for contributors to submit pull requests to Github. I will handle code review in Github, but the patch when ready for merged would hit Gerrit. It's important to me that any contributor doesn't have to use any tooling that they are not familiar with. For them they stay in Github, the onus is on me to manage the Gerrit to Github relationship. As soon as you put new tools in the way e.g. Gerrit patch uploader, I suspect we make it harder for people to contributor as you're forcing them to learn a new work flow. I've been contributing to similar free-knowledge projects hosted on Github and the difference in contributions compared to wikimedia Gerrit hosting is disappointingly higher. I'm hoping the only reason for that is Gerrit, but of course it might be other factors, so I'd love to rule out the Gerrit/Github problem as the reason.

It's important to me that any contributor doesn't have to use any tooling that they are not familiar with

But you're drawing the line arbitrarily. What about those only familiar with Atlassin tools? MS's VSS/TFS? Gitlab? Bitbucket?

What about those that are only familiar with svn, cvs? Those that can't even make a diff? Don't want to use a mailing list or phabricator to submit a patch?

We can't support everyone. And hell, we can still accept patches from github as is. Git makes this easy.

It's not difficult to have two remotes (or more) in a git repository. Anyone can pull it from github and push it to gerrit (adding a change-id automatically along the way), keeping the authorship and then CR+2 it in gerrit.

For uploading patches one can use https://www.mediawiki.org/wiki/Gerrit_patch_uploader . Interface: https://tools.wmflabs.org/gerrit-patch-uploader/ Should be straightforward :-)

For me, I would want a way for contributors to submit pull requests to Github. I will handle code review in Github, but the patch when ready for merged would hit Gerrit. It's important to me that any contributor doesn't have to use any tooling that they are not familiar with. For them they stay in Github, the onus is on me to manage the Gerrit to Github relationship. As soon as you put new tools in the way e.g. Gerrit patch uploader, I suspect we make it harder for people to contributor as you're forcing them to learn a new work flow.

I don't see the problem there, they submit in Github, you handle the code review, you use the patch uploader to bring the patch to Gerrit. It's not very convenient in that you have to download/reupload the patch as a file but that can be easily improved in the tool.
(There will be a different kind of problem, in that we don't want to fragment code review discussion into many separate systems. But as long as it's a small component where all the reviewers are fine with Github, and not something like MW core, it should be OK.)
I do wonder though how good a trade-off it is in the long term to make the first contribution easy vs. keep the user segregated in an external system that our community does not really use.

If you want something fully automatic where you don't even need to click a "move this to Gerrit" button, you run into the same problem that made the author of the Gerrit pluin use OAuth: you need to convert GitHub users into Gerrit users somehow. Fully switching to Github authentication is obviously not an option for as; supporting both Github and LDAP as auth methods would in theory be nice, but gets you all kinds of namespace problems.

I've been contributing to similar free-knowledge projects hosted on Github and the difference in contributions compared to wikimedia Gerrit hosting is disappointingly higher. I'm hoping the only reason for that is Gerrit, but of course it might be other factors, so I'd love to rule out the Gerrit/Github problem as the reason.

IMO there are plenty of far lower hanging problems, like how do even contributors find out where to go (see T136863: Should Wikimedia have standard policies for managing github mirror repos? and all the related issues).

What about those only familiar with Atlassin tools? MS's VSS/TFS? Gitlab? Bitbucket?

Well there's no reason not to support them all, if we have the bandwidth to do so. If don't, it's pretty obvious why we would prioritize Github support.

Those that can't even make a diff?

You don't actually need to know that, in Github you can just edit the file in a webform. For tools that contain e.g. rules a non-technical contributor might be interested in (like blacklists or patterns or template names), that's a pretty cool feature to have.

It's not difficult to have two remotes (or more) in a git repository. Anyone can pull it from github and push it to gerrit (adding a change-id automatically along the way), keeping the authorship and then CR+2 it in gerrit.

We should ensure that anyone who makes a pull request gets a response explaining them how things work, so that part needs some automation in any case. The rest can be done by hand, sure, but why not automate it if we can.

But you're drawing the line arbitrarily.

Sure, but my line is based on where I think my repos contributors are likely to be (and of course that may be wrong). That's likely different for everyone, but any integration with Github that is not this seamless to me is not worth pursuing. Probably, as Gergo suggests, I could enable pull requests on Github for a repo, manage reviews there and use the Gerrit patch uploader myself when patches are ready to merge (provided that retains credit).

Personally, having a sign in with your Github via OAuth for MediaWiki services (Gerrit, MediaWiki) would be higher up my priority list than pursuing this further.

hashar added a subscriber: chrisportelli.

The canonical place is Gerrit which is where patches should be send. A long concern has been pull requests send to github being ignored but as @Aklapper mentioned they are now automatically rejected. An example of such close is https://github.com/wikimedia/mediawiki/pull/118#issuecomment-629750242

👋 Hi @chrisportelli,

Thank you for contributing to Wikimedia! This Github repository is a read-only mirror.
The project uses Gerrit for code review.
Please submit your pull request to Gerrit, so your code changes can be reviewed.

If you are interested in sticking around within the Wikimedia community, please take a look at https://www.mediawiki.org/wiki/New_Developers
Thanks in advance!