The robots.txt rules are unnecessarily restrictive. Since Bugzilla is being deprecated and only a portion of its content is being migrated to Phabricator, it's essential that we let third parties such as archivers preserve the rest. All crawlers, or at least ia_archiver (the Wayback Machine), should be allowed to crawl:
- (1) any content which
- (2) doesn't specifically cause load issues and
- (3) is not being semantically migrated to Phabricator.
Ideally we'd drop requirement (3), but let's start somewhere.
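For the minimal variant, a dedicated stanza for ia_archiver would be enough; crawlers use the most specific User-agent group that matches them, so the Wayback Machine would ignore the catch-all rules. A sketch (the current catch-all rules are not reproduced here):

    # Give the Internet Archive's crawler blanket access;
    # an empty Disallow value matches nothing, i.e. every path may be fetched.
    User-agent: ia_archiver
    Disallow:

    User-agent: *
    # (keep whatever Disallow lines are currently here)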
Example URLs which shouldn't be disallowed:
- /page.cgi?id=voting/bug.html*
- /duplicates.cgi*
- /report.cgi* (unless it causes load issues)
- /weekly-bug-summary.cgi*
- /describecomponents.cgi*
In fact, is there any reason not to allow everything except the following?
- /show_bug.cgi
- /showdependencytree.cgi
- /query.cgi
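Under that proposal, the whole file could shrink to something like this sketch. Since robots.txt rules are prefix matches, each Disallow line also covers query strings such as /show_bug.cgi?id=...:

    User-agent: *
    # The only pages to keep crawlers away from, per the list above:
    Disallow: /show_bug.cgi
    Disallow: /showdependencytree.cgi
    Disallow: /query.cgi

Everything not listed, including all of the example URLs above, would then be crawlable by default.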
Version: wmf-deployment
Severity: enhancement
URL: http://web.archive.org/save/https://bugzilla.wikimedia.org/duplicates.cgi