
GlusterFS deployment-prep-project has too many errors -> beta cluster is down
Closed, Resolved, Public

Description

ERROR

The requested URL could not be retrieved

While trying to retrieve the URL: http://en.wikipedia.beta.wmflabs.org/

The following error was encountered:

Unable to forward this request at this time.
This request could not be forwarded to the origin server or to any parent caches. The most likely cause for this error is that:

The cache administrator does not allow this cache to make direct connections to origin servers, and
All configured parent caches are currently unreachable.
Your cache administrator is benapetr@gmail.com.
Generated Sun, 21 Apr 2013 19:35:03 GMT by squid001.beta.wmflabs.org (squid/2.7.STABLE9)


Version: unspecified
Severity: normal

Details

Reference
bz47479

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 1:24 AM
bzimport added a project: Cloud-VPS.
bzimport set Reference to bz47479.

Can't ssh to either the apache32 or the apache33 instance, although both servers respond to ping. Port 80 does not answer.
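A host that answers ping but refuses SSH and HTTP can be checked per port. The following is only a minimal sketch, not what was run here: the helper names `port_open` and `report_ports` are made up for illustration, and it assumes `bash` (for its `/dev/tcp` redirection) and coreutils `timeout` are available on the machine doing the probing.

```shell
# Hypothetical helper: succeed if a TCP connection to host ($1) and port ($2)
# can be opened within 3 seconds. Uses bash's /dev/tcp redirection.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Report the state of SSH (22) and HTTP (80) for every host passed as argument.
report_ports() {
    for host in "$@"; do
        for port in 22 80; do
            if port_open "$host" "$port"; then
                echo "$host:$port open"
            else
                echo "$host:$port unreachable"
            fi
        done
    done
}
```

It could be called as `report_ports deployment-apache32 deployment-apache33`, assuming those names resolve from wherever the check is run.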

I have rebooted apache32. When starting Apache, I got:

/etc/init.d/apache2 start

/etc/init.d/apache2: 55: [: nice: unexpected operator

 * Starting web server apache2
Warning: DocumentRoot [/usr/local/apache/common/docroot/wikispecies.org] does not exist
Warning: DocumentRoot [/usr/local/apache/common/docroot/config] does not exist
Warning: DocumentRoot [/usr/local/apache/common/docroot/ee-prototype] does not exist
(5)Input/output error: apache2: could not open error log file /home/wikipedia/logs/apache-error.log.
Unable to open logs
Action 'start' failed.

So this appears to be caused by the Gluster volume having an issue of some sort: Apache gets an Input/output error when it tries to open its error log.
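To confirm that the log file itself is the problem, independently of Apache, one can try to open it for append by hand. This is a rough sketch under assumptions: `can_append` and `check_log` are hypothetical helper names, and the check only mimics opening the file for append, not everything Apache does at startup.

```shell
# Hypothetical helper: try to open a file for appending without writing to it.
# Note: ">>" creates the file if it does not exist yet.
# The subshell keeps a failed redirection from aborting the calling shell.
can_append() {
    ( : >> "$1" ) 2>/dev/null
}

# Print a one-line verdict for the given log file; return 1 on failure.
check_log() {
    if can_append "$1"; then
        echo "OK: $1 can be opened for append"
    else
        echo "FAIL: $1 cannot be opened (possible I/O error on the backing volume)"
        return 1
    fi
}
```

For the case above this would be `check_log /home/wikipedia/logs/apache-error.log`, run on the affected instance.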

GlusterFS deployment-prep-project has too many errors.

Created attachment 12155
Some error logs from /var/log/glusterfs/data-project.log

Some Gluster errors on the deployment-apache32 instance, showing that the deployment-prep-project volume has some issues. The files under /logs/ have conflicting entries.

Once the /logs files are fixed, the apache2 service will have to be restarted on deployment-apache32 and deployment-apache33. Puppet might take care of it though :)
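If Puppet does not do it, the restart on both instances could be scripted roughly as below. This is only a sketch under assumptions: `restart_apaches` and the `DRY_RUN` variable are hypothetical names, and it presumes ssh access with sudo rights on the instances, which was not working at the time of the report.

```shell
# Hypothetical sketch: restart apache2 on each given instance over SSH.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
restart_apaches() {
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $host sudo /etc/init.d/apache2 restart"
        else
            ssh "$host" sudo /etc/init.d/apache2 restart \
                || echo "restart failed on $host" >&2
        fi
    done
}
```

Running `restart_apaches deployment-apache32 deployment-apache33` prints the two commands; setting `DRY_RUN=0` would actually execute them.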

Created attachment 12156
error accessing one file

Suffering a lack of curiosity, I just deleted the affected logfiles on labstore1 and labstore2.

I have restarted apache2 on both instances. Solved :-] Will have to reboot the other instances tomorrow. At least beta is up again, thank you Andrew!!!