⚓ T54709 Add Index and Page namespaces to $wgContentNamespaces

• bzimport raised the priority of this task to Medium. Nov 22 2014, 2:04 AM

• bzimport added a project: ProofreadPage.

• bzimport set Reference to bz52709.

• bzimport added a subscriber: Unknown Object (MLST).

Nemo_bis created this task. Aug 10 2013, 6:20 PM

This needs more discussion on Wikisource wikis?

(In reply to Glaisher from comment #1)

This needs more discussion on Wikisource wikis?

No: this is not a site request. Most/all wikisources already do so, but it's easy to forget it for new wikis.

This definitely needs attention, as the creation of orWS seems to have been missed in the setting of $wgContentNamespaces for CirrusSearch. So if the defaults cannot be fixed, we will need a separate bug to get the content namespaces at orWS added to the search for content.

As a separate point, getting a default setup for this is harder when we don't have a default/static namespace set for the WSes. If so, then we may wish to add that bug as a blocker to this.

I could have sworn they finally made ns-250 & ns-252 the default namespaces for Index: and Page: just about a week or two ago.

(won't be the first time I was wrong either)

Nor the first time that I missed something. I look forward to seeing other comments about this.

In T54709#962541, @Billinghurst wrote:

nor the first time that I missed something. I look forward to seeing other comment about this.

Found it... T74525

...I think that it only means that it'll be 250-252 unless overridden for the wiki (all wikis currently having different IDs will continue to have that).

should we do this within the extension itself or just on Wikisources?

In T54709#965477, @Glaisher wrote:

should we do this within the extension itself or just on Wikisources?

The extension. The two namespaces definitely fit the general definition of "content namespace" from the docs; until proven otherwise, I think most wikis need to set them as content namespaces and will be grateful for the reduced configuration burden.
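As a rough illustration of what "doing it in the extension" could look like, the sketch below appends two namespace IDs to the content-namespace list unless they are already present. This is not the actual ProofreadPage code: the function name is hypothetical, and the IDs 250 and 252 are simply the defaults mentioned earlier in this thread (wikis with overridden IDs would pass their own).

```php
<?php
// Hypothetical sketch only; not taken from the ProofreadPage extension.

// MediaWiki's default: only the main namespace (ID 0) counts as content.
$wgContentNamespaces = [ 0 ];

/**
 * Return $contentNamespaces with the given namespace IDs appended,
 * skipping any ID that is already in the list.
 */
function addContentNamespaces( array $contentNamespaces, array $extraIds ): array {
	foreach ( $extraIds as $ns ) {
		if ( !in_array( $ns, $contentNamespaces, true ) ) {
			$contentNamespaces[] = $ns;
		}
	}
	return $contentNamespaces;
}

// ASSUMPTION: 250 and 252 are the default Page/Index IDs discussed above.
$wgContentNamespaces = addContentNamespaces( $wgContentNamespaces, [ 250, 252 ] );
```

Checking for existing entries keeps the list free of duplicates on wikis whose site configuration already lists these namespaces.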

Aklapper added a project: All-and-every-Wikisource. Mar 10 2015, 4:16 PM

This looks to be related to ProofreadPage only as the WSes would seem to have this set as a default? Or is it that this still needs to get done for WSes as any new WSes need to have configuration added separately until this is done?

Restricted Application added a subscriber: Aklapper. Sep 23 2015, 6:17 AM

This looks to be related to ProofreadPage

Yes, see my January 10 comment.

This can be done without doing T74525 first, but it looks like we currently have several Wikisource projects without Page/Index namespaces as content namespaces, meaning we'll have to rebuild the search index and run updateArticleCount.php for these wikis (the latter is done monthly, so there is probably no need to worry much about it).

Glaisher added a project: Wikimedia-Site-requests. Sep 23 2015, 5:43 PM

Restricted Application added a subscriber: Matanya. Sep 23 2015, 5:43 PM

Wikis that currently do not have the Page or Index namespace set as a content namespace:

angwikisource
azwikisource
bewikisource
bswikisource
cywikisource
eowikisource
fiwikisource
fowikisource
glwikisource
guwikisource
htwikisource
iswikisource
jawikisource
knwikisource
liwikisource
ltwikisource
mkwikisource
mrwikisource
orwikisource
sahwikisource
sawikisource
skwikisource
tawikisource
thwikisource
yiwikisource
zh_min_nanwikisource
test2wiki
bgwikisource
bnwikisource
cswikisource
hewikisource
kowikisource
nlwikisource
srwikisource
trwikisource

Change 240639 had a related patch set uploaded (by Glaisher):
Add Index and Page namespaces to $wgContentNamespaces by default

https://gerrit.wikimedia.org/r/240639

gerritbot added a project: Patch-For-Review. Sep 24 2015, 6:00 AM

Change 240640 had a related patch set uploaded (by Glaisher):
Remove Page and Index namespaces from $wgContentNamespaces

https://gerrit.wikimedia.org/r/240640

We need someone to rebuild the search index for the wikis listed above when the ProofreadPage patch is merged and reaches the Wikisources. I don't know who can do this. Maybe @demon or @EBernhardson can help?

Excellent, I had forgotten that I had pushed for enWS to have its ContentNamespaces added separately, so having it automated is a good step forward and a more mature approach.

Billinghurst mentioned this in T54708: Add Index namespace to $wgNamespacesToBeSearchedDefault by default. Sep 24 2015, 7:53 AM

Billinghurst moved this task from Backlog to Pagelist Widget on the ProofreadPage board. Sep 24 2015, 7:55 AM

Change 240639 merged by jenkins-bot:
Add Index and Page namespaces to $wgContentNamespaces by default

https://gerrit.wikimedia.org/r/240639

ReleaseTaggerBot added a project: MW-1.27-release (WMF-deploy-2015-10-27_(1.27.0-wmf.4)). Oct 13 2015, 10:00 PM

Rebuilding index is being tracked at T115402: Reindex namespaces after October 22, 2015

matej_suchanek updated the task description. Oct 14 2015, 2:29 PM

matej_suchanek set Security to None.

Glaisher mentioned this in T115402: Reindex namespaces after October 22, 2015. Oct 14 2015, 3:04 PM

Glaisher added a project: User-notice.

Once that is rolled out, I would suggest that we then wish to remove the manual configurations in
https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php

...
// Note that changing this for wikis with CirrusSearch will remove pages in the
// affected namespace from search results until a full reindex is completed.
'wgContentNamespaces' => array(
 ...
'+arwikisource' => array( 102, 104 ),
'+aswikisource' => array( 102, 104, 106 ), // T45129, T72464
'+bgwikisource' => array( 100 ),
'+bnwikisource' => array( 100 ),
'+brwikisource' => array( 100, 102, 104 ),
'+cawikisource' => array( 102, 104, 106 ),
'+cswikisource' => array( 100 ),
'+dawikisource' => array( 102, 104, 106 ),
'+dewikisource' => array( 102, 104 ),
'+elwikisource' => array( 100, 102 ),
'+enwikisource' => array( 102, 104, 106, 114 ), // T52007
'+eswikisource' => array( 102, 104 ),
'+etwikisource' => array( 102, 104, 106 ),
'+fawikisource' => array( 102, 104 ),
'+frwikisource' => array( 102, 104, 112 ),
'+hewikisource' => array( 100, 106, 108, 110 ), // T98709
'+hrwikisource' => array( 100, 102, 104 ),
'+huwikisource' => array( 100, 104, 106 ),
'+hywikisource' => array( 100, 104, 106 ),
'+idwikisource' => array( 100, 102, 104 ),
'+itwikisource' => array( 102, 108, 110 ),
'+kowikisource' => array( 100 ),
'+lawikisource' => array( 102, 104, 106 ),
'+mlwikisource' => array( 100, 104, 106 ),
'+nlwikisource' => array( 102 ),
'+nowikisource' => array( 102, 104, 106 ),
'+plwikisource' => array( 100, 102, 104 ),
'+ptwikisource' => array( 102, 104, 106 ),
'+rowikisource' => array( 102, 104, 106 ), // Follow-up for T31190
'+ruwikisource' => array( 104, 106 ),
'+slwikisource' => array( 100, 104 ),
'+sourceswiki' => array( 104, 106 ),
'+srwikisource' => array( 100 ),
'+svwikisource' => array( 104, 106, 108 ),
'+tewikisource' => array( 102, 104, 106 ),
'+trwikisource' => array( 100 ),
'+ukwikisource' => array( 102, 114, 250, 252 ), // T52561, T53684
'+vecwikisource' => array( 100, 102, 104 ),
'+viwikisource' => array( 102, 104, 106 ),
'+zhwikisource' => array( 102, 104, 106, 114 ), // T66127
...

They are predominantly 104/106, though not exclusively, and this will assist in the harmonisation. (T74525)
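To make the proposed cleanup concrete, here is a small sketch of stripping the Page/Index IDs from one of the per-wiki overrides quoted above, leaving only the namespaces that still need manual configuration. It assumes that 104 and 106 are the Page/Index IDs for that wiki (the thread only says they are "predominantly" so), and the variable names are illustrative rather than anything from wmf-config.

```php
<?php
// Illustrative sketch; variable names are hypothetical.

// The '+enwikisource' entry as quoted in the excerpt above.
$override = [ 102, 104, 106, 114 ];

// ASSUMPTION: 104/106 are this wiki's Page/Index namespace IDs.
$proofreadNamespaces = [ 104, 106 ];

// Drop the IDs the extension would now add by itself and reindex the array.
$remaining = array_values( array_diff( $override, $proofreadNamespaces ) );

// $remaining is now [ 102, 114 ]
```

An override that ends up empty after this step (for example a wiki listing only its Page/Index IDs) could then be removed from the configuration entirely.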

Johan moved this task from To Triage to Announce in next Tech/News on the User-notice board. Oct 15 2015, 7:40 AM

In T54709#1727223, @Billinghurst wrote:

Once that is rolled out, I would suggest that we then wish to remove the manual configurations in
https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php

There's this already. https://gerrit.wikimedia.org/r/#/c/240640/

In T54709#968402, @Nemo_bis wrote:

In T54709#965477, @Glaisher wrote:

should we do this within the extension itself or just on Wikisources?

The extension. The two namespaces definitely fit the general definition of "content namespace" from the docs; until proven otherwise, I think most wikis need to set them as content namespaces and will be grateful for the reduced configuration burden.

Whoa... back up a second. I can reluctantly accept the notion of the Index: namespace being classified as a "content namespace" but hardly pages found in the Page: namespace. Normally, everything associated with what amounts to "a single work" found in the Page: namespace is ultimately transcluded to the main (or more recently the Translation:) namespace(s) to meet that "single work" premise in the end.

Plus, contributions made to a Page: namespace page reflects one of four possible states of "flux" as it relates to transcription and formatting (e.g. Proofread page) against ever static content only unlike a Wikipedia article where the state of "flux" comes from contributions that are refinements to forever-changing content.

If the issue is purely the lack of one or both of these namespaces as part of default search results, the answer surely can't be to "double count" the same [always being transcluded] content by reclassifying what currently amounts to draft or work space only?

This doesn't change the default search namespaces. That's controlled by another configuration variable.
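The distinction can be sketched as two independent settings. The values below are illustrative only (250/252 being the default IDs mentioned earlier in this thread), not a copy of any wiki's real configuration.

```php
<?php
// Illustrative values only; not any wiki's actual configuration.

// Which namespaces count as "content" (statistics, article counts, etc.).
// 0 is the main namespace; 250/252 are the default IDs discussed above.
$wgContentNamespaces = [ 0, 250, 252 ];

// Which namespaces are searched by default. This is a separate variable,
// so adding a namespace above does not tick it in the default search.
$wgNamespacesToBeSearchedDefault = [
	0 => true,
];
```

The default-search question for the Index namespace is the subject of the separate task T54708 referenced in the timeline.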

Why Index and Page is $wgContentNamespaces? They not!

I disagree. If you look at $wgContentNamespaces at MediaWiki, you will see
that it fits the definition and, more importantly, the desired
functionality.

In T54709#1733754, @Shizhao wrote:

Why Index and Page is $wgContentNamespaces? They not!

For future reference, please elaborate in such comments and provide arguments.
Arguments help a discussion, short "I don't like it" comments help less. Thank you! :)

In T54709#1733800, @Billinghurst wrote:

I disagree. If you look at $wgContentNamespaces at MediaWiki, you will see
that it fits the definition and, more importantly, the desired
functionality.

Good argument - where exactly does the Manual:$wgContentNamespaces page state that a "page" used in the end for no other purpose than transclusion to the mainspace equate to being content worthy? All that page talks about is being tracking and/or statistic 'based'; again to what end?... to be able to "be searched"? No thanks - I don't want search results of partially transcribed works - or worse; partially un-proofread ones (does anyone?)

To me, the Page: namespace is no more than a temporary work space -- a custom space. Its premise follows that 'custom nuance' as inferred on this page in....

A custom namespace can be used to hold content that should not be shown on the search results page, for example pages that are used only for transclusion.

We're not really changing anything here for most Wikisources. Page/Index namespaces are already in $wgContentNamespaces for most of the Wikisources. We're just moving that config from Wikimedia-specific configuration to the extension to make it more manageable. I don't think there has been any issue with page and index namespaces being set as content namespaces at those wikis until now..?

In T54709#1735552, @Glaisher wrote:

... I don't think there has been any issue with page and index namespaces being set as content namespaces at those wikis until now..?

Dimes to dollars not more than 3 or 4 Wikisources even know what they have or why. I'm strictly concerned with en.WS regardless.

In T54709#577545, @Glaisher wrote:

This needs more discussion on Wikisource wikis?

Because? See the above.

Please explain why we need to reclassify a custom workspace where the end product is used for nothing else BUT transclusion to the main namespace and how that is "content".

Dimes to dollars not more than 3 or 4 Wikisources even know what they have or why.

Multiple Wikisources had specific discussions and requests about this, you can probably re-find the discussions about it from WS:COORD and related updates I made in 2011.

Liuxinyu970226 subscribed. Oct 19 2015, 11:38 PM

In T54709#1733800, @Billinghurst wrote:

I disagree. If you look at $wgContentNamespaces at MediaWiki, you will see
that it fits the definition and, more importantly, the desired
functionality.

Index is just a table of contents; Page is a workspace, similar to the Draft namespace.

Disagree. Both are permanent pages, draft is not.

The pages should be included in edit counts; they should be counted statistically. It maintains our validation data, and the pages contain content. Sure, this content should be a lesser priority than the main ns, but it still should be findable and counted.

My reasoning for findability was expressed in my discussion about the biographical dictionaries. Those pages are pertinent to search irrespective of proofread status, especially at their rate of proofreading. They require visibility for referencing.

• demon unsubscribed. Oct 22 2015, 1:11 PM

Quiddity moved this task from Announce in next Tech/News to In current Tech/News draft on the User-notice board. Oct 22 2015, 9:25 PM

In T54709#1744951, @Billinghurst wrote:

Disagree. Both are permanent pages, draft is not.

So? The point is neither space contains a full work -- they only help facilitate the eventual creation of "a single work" in the main namespace -- in our case via transclusion.

The pages should be included in edit counts; they should be counted statistically.

To what end? Some works are comprised of a handful of Page:s while others run into the hundreds; unless all the pages (i.e. the entire content) are organized, transcribed, proofread and then transcluded to the main namespace, there is no finished work -- just a work in progress.

It maintains our validation data, the pages contain content. Sure that this content should be lesser priority than main ns, but it still should be findable and counted.

Correction: pages in the Page: namespace contain portions of content. Only when these portions are combined to reflect the entire content (e.g. are transcluded into a single main namespace title) do we have a hostable- & search- worthy work.

My reasoning for findability was expressed in my discussion about the biographical dictionaries. Those pages are pertinent to search irrespective of proofread status, especially at their rate of proofreading. They require visibility for referencing.

Sorry, I know not of any discussion about 'the biographical dictionaries' [pointer?] but I don't see how a specific type of work changes anything since the process for all works, regardless of type, is pretty much the same.

It would seem that you have a different interpretation of "content", it doesn't specify complete works, I have never argued such.

Content as defined by MW is at
https://www.mediawiki.org/wiki/Manual:$wgContentNamespaces

If you have an issue with the configuration then it may be better to pull the conversation back from Phabricator and return it to the Wikisources. This ticket simply applied (standardised) to the remaining wikis what has long been applied to English Wikisource and numbers of others.

Re my commentary about biographical dictionaries, it was at Scriptorium on English Wikisource.

If you have an issue with the configuration then it may be better to pull the conversation back from Phabricator and return it to the Wikisources.

Yes, please. Individual wikis can always ask their own, different configuration.

In T54709#1751530, @Billinghurst wrote:

It would seem that you have a different interpretation of "content", it doesn't specify complete works, I have never argued such.
Content as defined by MW is at
https://www.mediawiki.org/wiki/Manual:$wgContentNamespaces

No but you keep pointing to that "definition" which ultimately has to do with the "counting" of articles (Wikipedia mainspace titles); for us, the equivalent is generally thought of as works (Wikisource mainspace titles).

What is so useful about inflating the count of 'real' content (titles in Author, Project & any other useful namespaces aside) with the addition of 'partial' content?... So the same exact content can be counted twice -- once in the Page: ns as a partial stand-alone and again when it's part of the entire range of Page:s transcluded to the main ns in a finished work? What am I missing here?

If you have an issue with the configuration then it may be better to pull the conversation back from Phabricator and return it to the Wikisources. This ticket simply applied (standardised) to the remaining wikis what has long been applied to English Wikisource and numbers of others.

Can't say -- I'm left with questions because I don't see "because it's been that way since..." or "because others are doing it that way..." as justification of the premise to begin with.

Re my commentary about biographical dictionaries, it was at Scriptorium on English Wikisource.

Thanks. I'll look for it there.

Change 240640 merged by jenkins-bot:
Remove Page and Index namespaces from $wgContentNamespaces

https://gerrit.wikimedia.org/r/240640

The change has been on all wikis since Thursday and the indices were rebuilt yesterday. wmf-config and ProofreadPage have also been updated just now. If specific wikis want to opt out, they can go through the regular Wikimedia-Site-requests process.

Perhaps my comment about the importance of nsIndex and nsPage could be surprising, but IMHO they are the true content pages, while transcluded ns0 texts are merely one of many possible derived texts. They are the true digitization of the specific edition, while the ns0 transcluded text is something like a new, "original" edition of the work. NsPage is an NPOV kind of digitization, while ns0 transclusion is not.

Liuxinyu970226 unsubscribed. Oct 28 2015, 4:57 AM

Johan moved this task from In current Tech/News draft to Recently announced in Tech/News on the User-notice board. Oct 28 2015, 8:29 AM

Johan moved this task from Recently announced in Tech/News to Already announced/Archive on the User-notice board. Nov 5 2015, 7:55 AM

GOIII moved this task from Pagelist Widget to Done: to deploy/check on the ProofreadPage board. Jun 12 2016, 3:51 AM

Restricted Application added a subscriber: JEumerus. Jun 12 2016, 3:51 AM

Liuxinyu970226 removed a parent task: T37925: [DO NOT USE. Please use the Wikisource project] Wikisource related bugs and enhancements (tracking). Dec 23 2016, 12:09 PM

Dereckson mentioned this in T46320: Clean Page and Index namespaces configuration for Wikisource. Feb 13 2017, 11:06 PM

Ladsgroup edited projects, added User-notice-archive; removed User-notice. Aug 13 2022, 1:54 PM

Meno25 removed a subscriber: • wikibugs-l-list. Jan 15 2023, 12:21 PM

	Subject	Repo	Branch	Lines +/-
	Remove Page and Index namespaces from $wgContentNamespaces	operations/mediawiki-config	master	+27 -33
	Add Index and Page namespaces to $wgContentNamespaces by default	mediawiki/extensions/ProofreadPage	master	+10 -3

		Status	Subtype	Assigned	Task
		Resolved		Glaisher	T54709 Add Index and Page namespaces to $wgContentNamespaces
		Resolved		Tpt	T39483 ProofreadPage extension should handle its namespaces in a more sensible manner

Add Index and Page namespaces to $wgContentNamespaces
Closed, ResolvedPublic