
[OPS] publish Zuul git repositories over git protocol
Closed, ResolvedPublic

Description

When cloning from http://integration.wikimedia.org/zuul/git/* (private, sorry) with --reference, git clone is unreliable and hangs during a git-upload-pack POST. There is no problem with the Gerrit URL though.

Gerrit as reference:

git clone --verbose \
    http://gerrit.wikimedia.org/r/p/operations/puppet.git \
    --reference=/srv/ssd/gerrit/operations/puppet.git

Cloning into 'puppet'...
POST git-upload-pack (gzip 1319 to 678 bytes)
POST git-upload-pack (gzip 1619 to 823 bytes)
POST git-upload-pack (gzip 3169 to 1597 bytes)
POST git-upload-pack (gzip 6269 to 3151 bytes)
POST git-upload-pack (gzip 12419 to 6232 bytes)
POST git-upload-pack (gzip 24819 to 12458 bytes)
POST git-upload-pack (gzip 49219 to 24805 bytes)
POST git-upload-pack (gzip 98269 to 49070 bytes)
POST git-upload-pack (gzip 144369 to 71881 bytes)
POST git-upload-pack (gzip 187919 to 93436 bytes)
POST git-upload-pack (gzip 235319 to 116863 bytes)
POST git-upload-pack (gzip 281819 to 139842 bytes)
POST git-upload-pack (gzip 329719 to 163517 bytes)
POST git-upload-pack (gzip 377369 to 187082 bytes)
POST git-upload-pack (gzip 396724 to 196635 bytes)
remote: Counting objects: 94078, done
remote: Total 0 (delta 0), reused 0 (delta 0)
Checking out files: 100% (2082/2082), done.

With the private Zuul repo:

git clone --verbose \
    http://integration.wikimedia.org/zuul/git/operations/puppet \
    --reference=/srv/ssd/gerrit/operations/puppet.git

Cloning into 'puppet'...
POST git-upload-pack (gzip 1369 to 699 bytes)
POST git-upload-pack (gzip 1569 to 793 bytes)
POST git-upload-pack (gzip 2819 to 1416 bytes)
POST git-upload-pack (gzip 5169 to 2598 bytes)
POST git-upload-pack (gzip 9969 to 4996 bytes)
POST git-upload-pack (gzip 19269 to 9670 bytes)
POST git-upload-pack (gzip 36969 to 18632 bytes)
POST git-upload-pack (gzip 74869 to 37493 bytes)
POST git-upload-pack (gzip 98919 to 49411 bytes)
POST git-upload-pack (gzip 123269 to 61460 bytes)
POST git-upload-pack (gzip 146719 to 73076 bytes)
POST git-upload-pack (gzip 169069 to 84142 bytes)
POST git-upload-pack (gzip 193569 to 96277 bytes)
POST git-upload-pack (gzip 218169 to 108427 bytes)
POST git-upload-pack (gzip 240819 to 119623 bytes)
POST git-upload-pack (gzip 253719 to 125957 bytes)
<HANG>

I had to revert a bunch of jobs I had deployed for bug 53594.


Version: wmf-deployment
Severity: normal

Details

Reference
bz53683

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 2:13 AM
bzimport set Reference to bz53683.

This might be due to some limit on the amount of data that can be sent with a POST to the git-http-backend CGI script.
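
One way to look into that hypothesis is to pull the request sizes out of the git clone --verbose output and check how large the last POST was before the hang. The following is only a minimal sketch: the function names post_sizes and last_post_size and the file name clone.log are made up for illustration, and it assumes the log lines have exactly the shape shown in this report.

```shell
# Hypothetical helper: print the uncompressed size (in bytes) of every
# "POST git-upload-pack" request found on stdin, one size per line.
# Handles both the gzip form and the plain "(N bytes)" form.
post_sizes() {
    sed -n \
        -e 's/^POST git-upload-pack (gzip \([0-9]*\) to [0-9]* bytes)$/\1/p' \
        -e 's/^POST git-upload-pack (\([0-9]*\) bytes)$/\1/p'
}

# Hypothetical helper: size of the last request logged, that is the
# last one sent before the clone finished or hung.
last_post_size() {
    post_sizes | tail -n 1
}

# Usage, assuming the verbose output was saved to a file named clone.log:
#   last_post_size < clone.log
```

For the logs in this report, the last request before a hang was 253719, 310069 and 270669 bytes uncompressed, while the Gerrit clone went up to 396724 bytes without trouble, so the sizes alone do not show a single fixed cut-off.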

Created attachment 13218
clone with GIT_CURL_VERBOSE=1

I have reproduced the bug on the labs instance integration-bug53683.pmtpa.wmflabs:

hashar@integration-bug53683:/mnt/workdir$ git clone --verbose --reference /mnt/workdir/replicated-puppet.git/ http://integration-bug53683.instance-proxy.wmflabs.org/zuul/git/puppet
Cloning into 'puppet'...
POST git-upload-pack (gzip 1369 to 704 bytes)
POST git-upload-pack (gzip 1669 to 868 bytes)
POST git-upload-pack (gzip 2869 to 1468 bytes)
POST git-upload-pack (gzip 5219 to 2644 bytes)
POST git-upload-pack (gzip 10119 to 5094 bytes)
POST git-upload-pack (gzip 19469 to 9786 bytes)
POST git-upload-pack (gzip 37169 to 18760 bytes)
POST git-upload-pack (gzip 74869 to 37506 bytes)
POST git-upload-pack (gzip 99019 to 49475 bytes)
POST git-upload-pack (gzip 123319 to 61491 bytes)
POST git-upload-pack (gzip 146919 to 73199 bytes)
POST git-upload-pack (gzip 169219 to 84231 bytes)
POST git-upload-pack (gzip 193869 to 96444 bytes)
POST git-upload-pack (gzip 218419 to 108551 bytes)
POST git-upload-pack (gzip 241019 to 119738 bytes)
POST git-upload-pack (gzip 254019 to 126122 bytes)
POST git-upload-pack (gzip 263769 to 130999 bytes)
POST git-upload-pack (gzip 272919 to 135488 bytes)
POST git-upload-pack (gzip 283019 to 140446 bytes)
POST git-upload-pack (gzip 292869 to 145311 bytes)
POST git-upload-pack (gzip 301719 to 149718 bytes)
POST git-upload-pack (gzip 310069 to 153890 bytes)
<HANG>
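
Since the failure mode is a hang rather than an error, repeating the reproduction is easier with a deadline around the clone. A minimal sketch, assuming timeout(1) from GNU coreutils is available; the function name run_with_deadline and the 300 second deadline in the usage comment are arbitrary choices for illustration, not something used in this report.

```shell
# Hypothetical wrapper: run a command, but give up after a deadline.
# timeout(1) exits with status 124 when it had to kill the command,
# which is used here as the signal for "it hung".
run_with_deadline() {
    deadline=$1
    shift
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: not finished after ${deadline}s: $*" >&2
    fi
    return "$status"
}

# Usage with the clone from this report (the deadline is a guess):
#   run_with_deadline 300 git clone --verbose \
#       --reference /mnt/workdir/replicated-puppet.git/ \
#       http://integration-bug53683.instance-proxy.wmflabs.org/zuul/git/puppet
```

A status of 124 then marks a hung clone, and any other status is whatever git itself returned.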

I copied the Debian packages of git 1.7.10.4 from Ubuntu Quantal and installed them on the instance:

dpkg -i git-man_1.7.10.4-1ubuntu1_all.deb git_1.7.10.4-1ubuntu1_amd64.deb
(Reading database ... 30792 files and directories currently installed.)
Preparing to replace git-man 1:1.7.9.5-1 (using git-man_1.7.10.4-1ubuntu1_all.deb) ...
Unpacking replacement git-man ...
Preparing to replace git 1:1.7.10.4-1ubuntu1 (using git_1.7.10.4-1ubuntu1_amd64.deb) ...
Unpacking replacement git ...
Setting up git-man (1:1.7.10.4-1ubuntu1) ...
Setting up git (1:1.7.10.4-1ubuntu1) ...
Installing new version of config file /etc/bash_completion.d/git ...
# git --version
git version 1.7.10.4

Then I restarted Apache and retried the clone with a reference:

$ git clone --verbose --reference /mnt/workdir/replicated-puppet.git/ http://integration-bug53683.instance-proxy.wmflabs.org/zuul/git/puppet
Cloning into 'puppet'...
POST git-upload-pack (919 bytes)
POST git-upload-pack (gzip 1369 to 733 bytes)
POST git-upload-pack (gzip 2619 to 1347 bytes)
POST git-upload-pack (gzip 4919 to 2508 bytes)
POST git-upload-pack (gzip 9819 to 4951 bytes)
POST git-upload-pack (gzip 19169 to 9645 bytes)
POST git-upload-pack (gzip 36869 to 18616 bytes)
POST git-upload-pack (gzip 74619 to 37390 bytes)
POST git-upload-pack (gzip 98719 to 49335 bytes)
POST git-upload-pack (gzip 123069 to 61375 bytes)
remote: Counting objects: 7, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 4 (delta 3), reused 1 (delta 0)
Unpacking objects: 100% (4/4), done.
$

Worked! So we are hitting a git issue that got resolved between 1.7.9.5-1 and 1.7.10.4-1ubuntu1.

Created attachment 13221
clone with git v1.7.10.4-1ubuntu1 GIT_CURL_VERBOSE=1

I compiled git v1.7.9.5 from source with default options and cannot reproduce the issue :/ It might be caused by one of the Ubuntu patches; a bunch of them are network related.

Will ask ops/engineering to get git upgraded using the Quantal version.

Rephrased summary.

I used the Ubuntu source package of git 1.7.9.5 to compile:

apt-get source git
cd git-1.7.9.5
./debian/rules build

Then I pointed Apache to it. It does not hang :( So I have no clue.

Switching back to the stock Precise package, it hangs again :/

I tried it out on gallium with the git packages 1.7.10.4-1ubuntu1 from raring, and the git-http-backend process hangs as well :-(

$ rm -fR foobar; git clone --verbose --progress --reference /srv/ssd/gerrit/mediawiki/core.git -o origin http://integration.wikimedia.org/zuul/git/mediawiki/core foobar
Cloning into 'foobar'...
POST git-upload-pack (gzip 1119 to 598 bytes)
POST git-upload-pack (gzip 1519 to 791 bytes)
POST git-upload-pack (gzip 2719 to 1389 bytes)
POST git-upload-pack (gzip 5519 to 2801 bytes)
POST git-upload-pack (gzip 11019 to 5549 bytes)
POST git-upload-pack (gzip 21019 to 10576 bytes)
POST git-upload-pack (gzip 41869 to 21130 bytes)
POST git-upload-pack (gzip 82619 to 41290 bytes)
POST git-upload-pack (gzip 115019 to 57364 bytes)
POST git-upload-pack (gzip 145019 to 72182 bytes)
POST git-upload-pack (gzip 177369 to 88219 bytes)
POST git-upload-pack (gzip 208819 to 103749 bytes)
POST git-upload-pack (gzip 239519 to 118861 bytes)
POST git-upload-pack (gzip 270669 to 134281 bytes)
^C

Reverted git back to git version 1.7.9.5-1 from Precise.

Mailed operations team to throw smarter brains at the issue.

A solution would be to use git daemon to publish the repositories.

On the server (gallium):

git daemon --base-path=/srv/ssd/zuul/git --export-all --verbose

On the client side, use the git protocol to clone:

git clone --verbose --progress \
--reference /srv/ssd/gerrit/mediawiki/core.git \
git://localhost:9418/mediawiki/core

That takes a few seconds to initialize, and the resulting .git directory is quite small.
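
The small .git directory comes from --reference: the clone borrows objects from the reference repository through an alternates file instead of copying them. Below is a self-contained sketch that shows this with throwaway repositories. Everything in it (the temp directory, the repository names upstream, reference.git and checkout, the demo identity) is made up for the example, and it uses file:// as a stand-in transport rather than git:// or http://.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the published repository.
git init --quiet upstream
echo "hello" > upstream/README
git -C upstream add README
git -C upstream -c user.name=demo -c user.email=demo@example.org \
    commit --quiet -m "initial commit"

# "reference.git" stands in for the local replica given to --reference.
git clone --quiet --bare upstream reference.git

# Clone over a real transport while borrowing objects from the replica.
git clone --quiet --reference "$work/reference.git" \
    "file://$work/upstream" checkout

# The clone records where it borrows objects from; the file holds the
# path of the reference repository's objects directory.
alternates=$(cat checkout/.git/objects/info/alternates)
echo "$alternates"

# Show how many objects the new clone stores on its own.
git -C checkout count-objects -v
```

One consequence of this design is that the clone depends on the reference repository staying in place, since the borrowed objects are not stored in the clone itself.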

Change 82625 had a related patch set uploaded by Hashar:
contint: publish Zuul git over git protocol

https://gerrit.wikimedia.org/r/82625

git-http-backend is buggy and hangs whenever we attempt to clone huge repositories. That is bug 53683.

Instead, we can use git daemon which does not have the issue. https://gerrit.wikimedia.org/r/82625

Tried out with git-daemon:

git clone --verbose --progress git://localhost:9418/mediawiki/core/ --reference=/srv/ssd/gerrit/mediawiki/core.git

Works :-]

I have closed RT 5698, which was requesting a git upgrade. Instead, this is now waiting for git daemon: https://gerrit.wikimedia.org/r/82625

Rephrasing summary.

Change 82625 merged by Akosiaris:
contint: publish Zuul git over git protocol

https://gerrit.wikimedia.org/r/82625

I ran puppet and did basic checks. It seems to work fine and the port is properly firewalled.