
Categories created using Tamil words not recognised in stats
Closed, Declined (Public)

Description

Author: sundarbecse

Description:
http://en.wikipedia.org/wikistats/EN/CategoryOverview_TA_Complete.htm - this
page seems to recognise only the categories created using the
[[Category:some category]] tag and NOT the ones created using the
[[பகுப்பு:some category]] tag. "பகுப்பு" is the Tamil word for "category" and it
is a valid namespace in Tamil Wikipedia.

Thanks,
Sundar


Version: unspecified
Severity: normal
URL: http://stats.wikimedia.org/EN/CategoryOverview_TA_Complete.htm

Details

Reference
bz4537

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 9:03 PM
bzimport set Reference to bz4537.
bzimport added a subscriber: Unknown Object (MLST).

gangleri wrote:

If the configuration of the underlying tool is wrong, this can be a consequence of

https://bugzilla.mozilla.org/show_bug.cgi?id=321607

Bug [Bugzilla] 321607: "copy and paste cuts off text (Tamil/Hindi scripts)"

*hints*
a) copy "பகுப்பு" together with the surrounding apostrophes and remove the apostrophes after pasting;
b) use &#nnnn; character encoding in the configuration; you can convert the
characters at http://pastebin.com/ and will get

பகுப்பு

regards reinhardt [[user:gangleri]]
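
Hint b) above can also be applied without an online converter. The following is a minimal sketch, assuming Python 3 is available; the helper name `to_numeric_refs` is made up for illustration. It encodes each code point of the Tamil namespace name as a decimal &#nnnn; reference and checks that decoding restores the original string, so a cut-off last character would show up as a missing reference.

```python
import html


def to_numeric_refs(text: str) -> str:
    """Encode every character of text as a decimal &#nnnn; reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


# The Tamil namespace name, written with escapes so that copy and paste
# cannot silently drop a trailing combining character.
tag = "\u0baa\u0b95\u0bc1\u0baa\u0bcd\u0baa\u0bc1"

encoded = to_numeric_refs(tag)
print(encoded)
# prints: &#2986;&#2965;&#3009;&#2986;&#3021;&#2986;&#3009;

# Round trip: decoding the references must give back the same string.
assert html.unescape(encoded) == tag
```

The name consists of seven code points, so a correctly configured entry should contain seven references.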

sundarbecse wrote:

This doesn't seem to be related to the other problem, because we always have
"பகுப்பு" enclosed by the [[ and : characters. Also, we can see that the
categories are created successfully; only the statistics don't reflect that.
Thanks, Reinhardt.

gangleri wrote:

Hi Sundar!

Comment 1 refers to the configuration of the underlying tool for
http://en.wikipedia.org/wikistats/EN/CategoryOverview_TA_Complete.htm

I assume that in that configuration the last character is missing.

regards reinhardt [[user:gangleri]]

sundarbecse wrote:

Oh, I got your point now. Thanks.

Regards,
Sundar

sundarbecse wrote:

Modified LanguageCodes.csv to fix category counts

Thanks to Reinhardt and Natkeeran, I fixed the entry for Tamil in
LanguageCodes.csv as indicated at http://meta.wikimedia.org/wiki/Wikistats_csv.
Please update the file with the diff
(ta,(utf-8),பகுப்பு,படிமம்,பயனர்,(#Redirect)).
Also, will the default categories marked with the English tag "Category" be
counted in the stats?

Thanks,
Sundar
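
For illustration, here is a rough sketch of how a row like the one quoted above could drive category detection. The field order (language code, encoding, category tag, image tag, user tag, redirect tag) is an assumption inferred from the row itself, and `parse_row`, `category_link_pattern` and `find_categories` are hypothetical helpers, not part of Wikistats. The pattern accepts both the English "Category" prefix and the localised one.

```python
import csv
import re

# Row quoted from the comment above; the field meanings are assumed.
ROW = "ta,(utf-8),பகுப்பு,படிமம்,பயனர்,(#Redirect)"


def parse_row(row: str) -> dict:
    """Split one CSV row into named fields (assumed order)."""
    code, encoding, category, image, user, redirect = next(csv.reader([row]))
    return {"code": code, "encoding": encoding, "category": category,
            "image": image, "user": user, "redirect": redirect}


def category_link_pattern(local_tag: str) -> re.Pattern:
    """Match [[Category:...]] as well as [[<local_tag>:...]] links."""
    names = "|".join(re.escape(n) for n in ("Category", local_tag))
    return re.compile(r"\[\[\s*(?:" + names + r")\s*:\s*([^\]|]+)", re.IGNORECASE)


def find_categories(wikitext: str, local_tag: str) -> list:
    """Return the category names linked from wikitext."""
    pattern = category_link_pattern(local_tag)
    return [m.group(1).strip() for m in pattern.finditer(wikitext)]


fields = parse_row(ROW)
sample = "[[Category:Presidents]] text [[%s:Leaders]] [[Image:x.png]]" % fields["category"]
print(find_categories(sample, fields["category"]))
```

If the localised tag in the configuration were truncated, the second link in `sample` would no longer match, which is the kind of undercount described in this report.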

Attached:

Another old one that may be obsolete.

Comment on attachment 1351
Modified LanguageCodes.csv to fix category counts

The attached file is not a patch.

erikzachte wrote:

Wikistats harvests language-specific tags by parsing the language-specific config files (PHP), so just supplying an updated tag file won't do it: the file will be overwritten. I will hunt for the bug in the code that parses the PHP files.
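
As a purely illustrative sketch of that harvesting step (not the actual Wikistats parser, which is not shown in this thread), the snippet below pulls namespace names out of a PHP-style array with a regular expression. `PHP_SNIPPET` is a made-up excerpt and `harvest_namespaces` is a hypothetical helper; the real config files may be laid out differently.

```python
import re

# Hypothetical excerpt in the style of a PHP namespace-name array.
# This is NOT copied from a real language file.
PHP_SNIPPET = """
$namespaceNames = array(
    NS_USER     => 'பயனர்',
    NS_IMAGE    => 'படிமம்',
    NS_CATEGORY => 'பகுப்பு',
);
"""

# Capture the constant name and the quoted value; (.*?) is matched on
# code points, so multi-byte UTF-8 text is not cut in the middle.
ENTRY = re.compile(r"(NS_[A-Z_]+)\s*=>\s*(['\"])(.*?)\2")


def harvest_namespaces(php_source: str) -> dict:
    """Map NS_* constants to the localised names found in php_source."""
    return {m.group(1): m.group(3) for m in ENTRY.finditer(php_source)}


names = harvest_namespaces(PHP_SNIPPET)
print(names["NS_CATEGORY"])
```

A parser that works on bytes or on a wrongly assumed encoding could drop the trailing combining character of the name, which would fit the symptom suspected earlier in this thread.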

Lowering priority since this bug has sat around for 2.5 years.

Categories like
பகுப்பு:ஐக்கிய அமெரிக்கக் குடியரசுத் தலைவர்கள்
exist but are not listed on http://stats.wikimedia.org/EN/CategoryOverview_TA_Complete.htm while other categories like
பகுப்பு:சிங்கள தலைவர்கள்
are. Obviously they both use பகுப்பு: instead of Category:.

[mass-moving wikistats reports from Wikimedia→Statistics to Analytics→Wikistats to have stats issues under one Bugzilla product (see bug 42088) - sorry for the bugspam!]

Nuria moved this task from Blocked to Deprioritized on the Analytics board.