
Add Anexos to content namespaces on Spanish Wikipedia
Closed, ResolvedPublic

Description

They are lists usually placed in the main namespace on most Wikipedias, and they skew statistics otherwise. This bug is mainly to avoid forgetting it.
This doesn't affect anything else and just makes sense, so I'm not sure it requires consensus, but at least a notification surely; adding some sysops from those wikis to cc for input.


Version: unspecified
Severity: enhancement
URL: https://es.wikipedia.org/wiki/Wikipedia:Caf%C3%A9/Portal/Archivo/Propuestas/Actual#Anexos_como_espacio_de_nombres_de_contenido

Details

Reference
bz39866

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 12:54 AM
bzimport set Reference to bz39866.

We haven't had any feedback in a week. Shouldn't you advertise this bug on their village pumps?

erikzachte wrote:

According to comments on http://infodisiac.com/blog/2012/08/growth-in-article-count-at-largest-20-wikipedias/ more wikis are affected:

"The ‘Annex’ namespace is namespace 102 in pt: and hr:, namespace 104 in ar:, es:, fr: and lt:. Another custom namespace that I think should better be counted as content, is namespace 104 in als:, which is the former als: dictionary which has been merged with Wikipedia."

I think I need to adapt the wikistats scripts to mine countable namespaces via the API, but that will require broad consensus and I will have to discuss it within WMF, as some of the core metrics WMF tracks are involved.
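A minimal sketch of what mining countable namespaces via the API could look like, assuming the standard MediaWiki `action=query&meta=siteinfo&siprop=namespaces` response, in which content namespaces carry a `content` flag. The function names and the User-Agent string are illustrative assumptions and are not part of the wikistats scripts.

```python
import json
import urllib.parse
import urllib.request


def content_namespaces(siteinfo):
    """Return the sorted IDs of namespaces flagged as content in a siteinfo response."""
    ids = []
    for info in siteinfo["query"]["namespaces"].values():
        # formatversion=1 marks content namespaces with an empty-string "content" key;
        # formatversion=2 uses a boolean. A missing key or False means non-content.
        if info.get("content", False) is not False:
            ids.append(int(info["id"]))
    return sorted(ids)


def fetch_siteinfo(api_url):
    """Fetch namespace metadata from a wiki's api.php (needs network access)."""
    params = urllib.parse.urlencode(
        {"action": "query", "meta": "siteinfo", "siprop": "namespaces", "format": "json"}
    )
    request = urllib.request.Request(
        f"{api_url}?{params}",
        headers={"User-Agent": "content-namespace-sketch/0.1 (illustrative)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `content_namespaces(fetch_siteinfo("https://es.wikipedia.org/w/api.php"))` would list the namespaces that es.wikipedia currently counts as content.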

This is already a content namespace for pt., so it should be set on es.

Taking this bug. Git change 23068.

Erik: please open a new bug for your proposal. I would like to point out that the majority of wikis are already configured to include the references namespace as a content one.

(In reply to comment #2)

According to comments on
http://infodisiac.com/blog/2012/08/growth-in-article-count-at-largest-20-wikipedias/
more wikis are affected:

"The ‘Annex’ namespace is namespace 102 in pt: and hr:, namespace 104 in ar:,
es:, fr: and lt:. Another custom namespace that I think should better be
counted as content, is namespace 104 in als:, which is the former als:
dictionary which has been merged with Wikipedia."

Thank you. I wonder if this namespace is used in the same way everywhere; hosting a whole dictionary, for instance, is quite a different thing.

I think I need to adapt wikistats scripts so as to mine countable namespaces
via the API, but that will require broad consensus and I will have to discuss
it within WMF as some of the core metrics WMF tracks are involved.

I think that's bug 35198, although of course there should also be some criteria for content namespace settings.

On fr., it's used for bibliography notices.

(In reply to comment #3)

Erik: please open a new bug for your proposal. I would like to point the
majority of wikis are already configured to include references namespace as
content one.

Erik: could you confirm that you raised this issue on es.?

(In reply to comment #4)

Gerrit change #23068

I have deployed that change on the wiki since there seems to be a consensus on the Spanish café :)

erikzachte wrote:

Here is an up-to-date list of countable namespaces, collected via the API.
http://stats.wikimedia.org/wikimedia/misc/StatisticsContentNamespaces.csv

Mentioned above but not yet effective:

wp:hr:102
wp:fr:104
wp:lt:104
wp:als:104

Suggested in http://infodisiac.com/blog/2012/08/growth-in-article-count-at-largest-20-wikipedias/ but not yet effective:

wp:ru:102
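The entries above follow a `project:language:namespace` pattern, so checking which ones are still pending can be sketched as below. This is only an illustration: the `configured` mapping is a hypothetical structure (for example, filled from API results), not the layout of the CSV file.

```python
def parse_entry(entry):
    """Split an entry such as 'wp:hr:102' into (project, language, namespace_id)."""
    project, language, namespace = entry.split(":")
    return project, language, int(namespace)


def not_yet_effective(entries, configured):
    """Return the entries whose namespace is not configured as content.

    `configured` maps (project, language) to a set of content namespace IDs.
    """
    pending = []
    for entry in entries:
        project, language, namespace = parse_entry(entry)
        if namespace not in configured.get((project, language), set()):
            pending.append(entry)
    return pending
```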

A maintenance script still has to be run to update the es.wikipedia statistics count.

I guess it's maintenance/updateArticleCount.php

[ Moving tracking bug from depends to blocks ]

(In reply to comment #13)

A maintenance script has still to run to update es.wikipedia statistics count.

I guess it's maintenance/updateArticleCount.php

So that script is going to take an awfully long time to run. I have sent a mail to Asher (DBA / performance engineer) about it to find out whether it is safe to run.

The sequence should be:

  1. Get current statistics: mwscript showStats.php --wiki=eswiki
  2. Update article count: mwscript updateArticleCount.php --wiki=eswiki --update
  3. Get new statistics: mwscript showStats.php --wiki=eswiki
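The sequence above can be sketched as a small wrapper that runs the three commands in order and stops at the first failure. The commands and flags are taken verbatim from the list; the wrapper itself and its function names are assumptions, and it only works on a host where `mwscript` is available.

```python
import subprocess


def article_count_commands(wiki):
    """Build the three commands of the sequence for the given wiki."""
    show_stats = ["mwscript", "showStats.php", f"--wiki={wiki}"]
    update = ["mwscript", "updateArticleCount.php", f"--wiki={wiki}", "--update"]
    # Statistics are shown before and after the update so the counts can be compared.
    return [list(show_stats), update, list(show_stats)]


def run_sequence(wiki, runner=subprocess.run):
    """Run the commands in order; check=True aborts on the first non-zero exit."""
    for command in article_count_commands(wiki):
        runner(command, check=True)
```

Passing a different `runner` makes it possible to dry-run the sequence, for instance by printing each command instead of executing it.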

I have an alternative suggestion that avoids recounting pages whose number is already known.

(1) Count the pages in references namespace

(2) Add this number to the current count

Either manually or in the form of a script.
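In script form, the two steps might look like the sketch below, which counts non-redirect pages through the API's `list=allpages` module and adds the result to the known count. The `fetch` callable is a hypothetical helper that sends the parameters to the wiki's api.php and returns the decoded JSON. Note the assumption that every non-redirect page in the namespace is countable; if the wiki's counting method applies further criteria, the sum is only an approximation.

```python
def count_namespace_pages(fetch, namespace):
    """Step (1): count non-redirect pages in a namespace using list=allpages."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "apfilterredir": "nonredirects",
        "aplimit": "max",
        "format": "json",
    }
    total = 0
    while True:
        data = fetch(dict(params))
        total += len(data["query"]["allpages"])
        if "continue" not in data:
            return total
        # Carry the continuation tokens into the next request.
        params.update(data["continue"])


def adjusted_article_count(current_count, namespace_page_count):
    """Step (2): add the namespace's page count to the current article count."""
    return current_count + namespace_page_count
```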

What exactly is an Annex (or Anexo) namespace for? The word in English is very generic and just means anything that is an addition to something else.

(In reply to comment #17)

What exactly is an Annex (or Anexo) namespace for? The word in English is
very generic and just means anything that is an addition to something else.

Where? The usage varies slightly across wikis, see comments above and at http://infodisiac.com/blog/2012/08/growth-in-article-count-at-largest-20-wikipedias/ for further info on each.

I see. In the cases of the Spanish and Portuguese Wikipedias, it seems to be a namespace for list articles.

(In reply to comment #20)

What's outstanding here?

See comments 15 and 16.

(In reply to comment #15)

(In reply to comment #13)

A maintenance script has still to run to update es.wikipedia statistics count.

I guess it's maintenance/updateArticleCount.php

So that script is going to take an awful long time to run. I have sent a mail
to Asher (DBA / performance engineer) about it and find out if it is safe to
run it.

Slow, but only minutes slow, not hours.

reedy@tin:/a/common$ mwscript updateArticleCount.php --wiki=eswiki
Counting articles...found 1016669.
Updating site statistics table... done.

(In reply to comment #22)

Slow, but only minutes slow, not hours.

Wonderful! I guess the new counting method using links tables is very efficient?