Author: hemanshu_desai
Description:
MediaWiki should have a "random page in this category" feature.
Version: unspecified
Severity: enhancement
robchur wrote:
No. That retrieves a random category. The request is for a way of generating a
random page from a specified category.
anthony.bentley wrote:
My thoughts are that this could be either by adding arguments to
SpecialRandompage or by creating a new special page for this purpose.
SpecialRandompage currently has the argument of the namespace in which it
searches.
If carrying out a search within a category, then the namespace argument
shouldn't be required, as the search would always be over the main namespace.
I'm not aware that users etc. can assign themselves to categories?
Would this only be required over the main namespace or is there a good
reason for allowing choice of namespace?
Is category (cl_from) the only index that it's worth filtering on or is it
worth having additional functionality to choose the index?
dan.bolser wrote:
The DPL extension can do this (random n pages in category X).
DPL should be a part of MediaWiki :D
dan.bolser wrote:
How about some docs as to how the feature works? Should I add a 'documentation' bug?
ayg wrote:
Note, the patch might not be acceptable for efficiency concerns. It should stay in the software but might not be enabled on Wikimedia sites.
Wiki.Melancholie wrote:
*** Bug 17068 has been marked as a duplicate of this bug. ***
Wiki.Melancholie wrote:
As this is not fixed for Wikimedia wiki, a nice toolserver link:
happy.melon.wiki wrote:
*** Bug 23181 has been marked as a duplicate of this bug. ***
Vasiliev's implementation was reverted in r27436.
However, I don't think we can make it faster without adding a category_random; the current query seems quite good:
EXPLAIN SELECT page_namespace, page_title FROM page USE INDEX(page_random) JOIN categorylinks ON page_id = cl_from WHERE page_is_redirect = 0 AND page_random >= 0.15564 AND cl_to = 'GFDL' ORDER BY page_random LIMIT 1;
Both select_types are SIMPLE:
+---------------+--------+-------------+---------+---------------+--------------------------+
| table         | type   | key         | key_len | ref           | Extra                    |
+---------------+--------+-------------+---------+---------------+--------------------------+
| page          | range  | page_random | 8       | NULL          | Using where              |
| categorylinks | eq_ref | cl_from     | 261     | page_id,const | Using where; Using index |
+---------------+--------+-------------+---------+---------------+--------------------------+
It would change to Using temporary; Using filesort if we weren't using a LIMIT, but that's not the case.
Accessing the pages in the category is O(1); the problem is that for each result it has to go to the page table to read page_random. For large categories, the worst case would mean checking thousands of entries.
My testing shows that in practice it runs in a fraction of a second, probably because of the index and the uniformly distributed random numbers.
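For readers who want to try the approach, here is a minimal sketch of the query above. It uses Python's sqlite3 rather than MySQL, so `USE INDEX` is dropped and the planner may behave differently. The cut-down schema, the helper names `make_db` and `random_page_in_category`, and the wrap-around retry from 0 are assumptions for illustration, not the reverted implementation.

```python
import random
import sqlite3


def make_db():
    """Create an in-memory database with a cut-down page/categorylinks schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id          INTEGER PRIMARY KEY,
            page_namespace   INTEGER NOT NULL,
            page_title       TEXT NOT NULL,
            page_is_redirect INTEGER NOT NULL DEFAULT 0,
            page_random      REAL NOT NULL
        );
        CREATE INDEX page_random ON page (page_random);
        CREATE TABLE categorylinks (
            cl_from INTEGER NOT NULL,
            cl_to   TEXT NOT NULL,
            PRIMARY KEY (cl_from, cl_to)
        );
    """)
    return conn


# Same shape as the query in the comment above, minus MySQL's USE INDEX hint.
QUERY = """
    SELECT page_namespace, page_title
    FROM page JOIN categorylinks ON page_id = cl_from
    WHERE page_is_redirect = 0 AND page_random >= ? AND cl_to = ?
    ORDER BY page_random LIMIT 1
"""


def random_page_in_category(conn, category, rng=random):
    """Return (namespace, title) of a random non-redirect page, or None."""
    row = conn.execute(QUERY, (rng.random(), category)).fetchone()
    if row is None:
        # Assumed wrap-around: nothing at or above the random point,
        # so retry from the start of the page_random range.
        row = conn.execute(QUERY, (0.0, category)).fetchone()
    return row
```

Calling `random_page_in_category(conn, "GFDL")` on a populated database returns one row, or `None` when the category has no non-redirect pages.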
ayg wrote:
AFAICT, that will have to scan the entire page_random index in the worst case, e.g., if there are no actual pages in the category. You left out part of the EXPLAIN -- this is the full thing for me on enwiki (on toolserver).
possible_keys: page_random
key: page_random key_len: 8 ref: NULL rows: 10001064 Extra: Using where
possible_keys: cl_from,cl_timestamp,cl_sortkey
key: cl_from key_len: 261 ref: enwiki.page.page_id,const rows: 1 Extra: Using where; Using index
Note rows: 10001064. Try running that on a large database with 'GFDL' replaced by 'Nonexistent category' and you'll see it takes forever. It's O(N) in number of pages in the worst case, only acceptable for very small sites.
Yes, I tried to fit it into bugzilla width.
Hmm. You are right. Is it checking the page table before categorylinks? If it checked categorylinks first, it would be immediate when there are no pages in the category, but O(N) in the number of pages in the category otherwise.
MySQL should have implemented a "choose a random row matching this" :(
ayg wrote:
If it checks category first, it can't use the page_random index, so it's O(N log N) in the size of the category to sort its contents. You may as well skip the page table join and ORDER BY RAND() in that case.
I don't think it would have been trivial for MySQL to implement efficient "pick a random row" without some kind of special index. In any event, they don't, so we need cl_random if we really want this enough.
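To make the two alternatives concrete, here is a rough sketch, again in Python's sqlite3 rather than MySQL. The `cl_random` column does not exist in the schema discussed here. The column, the index name `cl_to_random`, and all helper names below are hypothetical, and SQLite's `RANDOM()` stands in for MySQL's `RAND()`.

```python
import random
import sqlite3


def make_categorylinks(conn, links, rng=random):
    """Create categorylinks with a hypothetical cl_random column and fill it."""
    conn.executescript("""
        CREATE TABLE categorylinks (
            cl_from   INTEGER NOT NULL,
            cl_to     TEXT NOT NULL,
            cl_random REAL NOT NULL  -- hypothetical, not in the real schema
        );
        CREATE INDEX cl_to_random ON categorylinks (cl_to, cl_random);
    """)
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?, ?)",
        [(page_id, category, rng.random()) for page_id, category in links],
    )


def pick_by_cl_random(conn, category, rng=random):
    """Index-assisted pick: seek into (cl_to, cl_random) at a random point."""
    query = (
        "SELECT cl_from FROM categorylinks "
        "WHERE cl_to = ? AND cl_random >= ? ORDER BY cl_random LIMIT 1"
    )
    row = conn.execute(query, (category, rng.random())).fetchone()
    if row is None:
        # Assumed wrap-around to the lowest cl_random in the category.
        row = conn.execute(query, (category, 0.0)).fetchone()
    return None if row is None else row[0]


def pick_by_sorting(conn, category):
    """Category-first pick: the database must visit every row in the category."""
    row = conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ? "
        "ORDER BY RANDOM() LIMIT 1",
        (category,),
    ).fetchone()
    return None if row is None else row[0]
```

Both helpers return a page id from the category, or `None` for an empty category. Only the first can stop after a single index seek, which is the point of wanting cl_random.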
The reason for closing bug 15824 was the existence of http://www.mediawiki.org/wiki/Extension:RandomInCategory
Either there is no documentation for this fix or the fix was never reinstated after it was reverted. http://www.mediawiki.org/wiki/Help_talk:Random_page/Archive_1 It is being asked for by others, and now I've found need of something like this myself on en.wikipedia. There is a banner for Today's Article For Improvement that is currently populated using a bot and a lot of "template" style pages with less than optimal code that could greatly be simplified if I could use a [[:Special:Random/Category:This_weeks_TAFIs]] to pick a random article from a category for the week. The only "extra" that I would ask is that the page would be picked on page load instead of clicking on the link so that the link when using the "piping trick" would show the name of the article it was going to take you to. Can this be done?
(In reply to comment #22)
> Either there is no documentation for this fix or the fix was never reinstated
> after it was reverted.
> http://www.mediawiki.org/wiki/Help_talk:Random_page/Archive_1 It is being
> asked for by others, and now I've found need of something like this myself on
> en.wikipedia. There is a banner for Today's Article For Improvement that is
> currently populated using a bot and a lot of "template" style pages with less
> than optimal code that could greatly be simplified if I could use a
> [[:Special:Random/Category:This_weeks_TAFIs]] to pick a random article from a
> category for the week. The only "extra" that I would ask is that the page
> would be picked on page load instead of clicking on the link so that the link
> when using the "piping trick" would show the name of the article it was going
> to take you to. Can this be done?
http://en.wikipedia.org/wiki/Wikipedia_talk:Today%27s_articles_for_improvement#Teahouse_TAFI_banner is the link to the full discussion for using this feature.
(In reply to comment #22)
> Either there is no documentation for this fix or the fix was never reinstated
> after it was reverted.
No, the extension in comment 20 was simply never deployed to WMF sites.
(In reply to comment #24)
> (In reply to comment #22)
> > Either there is no documentation for this fix or the fix was never reinstated
> > after it was reverted.
> No, the extension in comment 20 was simply never deployed to WMF sites.
Can this bug be re-opened since there was no actual implemented fix?
(In reply to comment #25)
> Can this bug be re-opened since there was no actual implemented fix?
No, this bug (implementing such a feature in MediaWiki) is fixed. If you want it deployed on enwiki or another WMF site, file a new bug under the Wikimedia category.