
Prevent searchNs collisions by distinguishing different wikis' namespaces in user_properties on wiki farms with a shared user table
Open, Low, Public, Feature

Description

Author: tisane2718

Description:
Presently, the namespace checkboxes in the advanced search options on the Preferences tab cause settings to be put in the user_properties table as "searchNs100" if the user, setting preferences on Wiki2, opts to include the "Foo" namespace in his searches and Foo's namespace index is 100. This can cause a conflict when you have your wiki farm configured using $wgSharedDB and the wikis share a user table. Consider, for instance, what happens if there is a "Bar" namespace on Wiki3 with a namespace index of 100. If the user from the example above went to Wiki3 and searched, the "Bar" namespace would be included in his search because of how Wiki3 would interpret the "searchNs100" property.
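The collision described above can be reproduced in a few lines. This is only a minimal sketch: the dictionary stands in for a shared user_properties table, and the wiki names, the user id, and the helper functions are hypothetical. Only the "searchNs100" key and the Foo/Bar namespaces at index 100 come from the report.

```python
# Hypothetical namespace maps for two wikis that share one user table.
# Only index 100 -> "Foo" / "Bar" is taken from the bug report.
NAMESPACES = {
    "Wiki2": {0: "(Main)", 100: "Foo"},
    "Wiki3": {0: "(Main)", 100: "Bar"},
}

# Stand-in for the shared user_properties table,
# keyed by (up_user, up_property) with up_value as the value.
shared_user_properties = {}


def set_search_namespace(user_id, ns_index, enabled=True):
    """Store the checkbox state under the wiki-agnostic key searchNs<index>."""
    key = (user_id, f"searchNs{ns_index}")
    shared_user_properties[key] = "1" if enabled else "0"


def searched_namespaces(wiki, user_id):
    """Return the namespace names that `wiki` would include in a search."""
    names = []
    for index, name in sorted(NAMESPACES[wiki].items()):
        if shared_user_properties.get((user_id, f"searchNs{index}")) == "1":
            names.append(name)
    return names


# The user ticks "Foo" (index 100) in the preferences on Wiki2 ...
set_search_namespace(user_id=1, ns_index=100)

print(searched_namespaces("Wiki2", 1))  # ['Foo']
# ... and Wiki3 reads the same row as meaning "Bar".
print(searched_namespaces("Wiki3", 1))  # ['Bar']
```

Because the stored key carries no wiki identifier, both wikis resolve it against their own namespace table.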

Ultimately I am seeking to fix bug 1837 by implementing multi-wiki search, and I need to make sure this conflict doesn't get in the way. Some wikis will want to use CentralAuth, while others will simply want to share a user table, and multi-wiki search functionality should be available for them all.


Version: unspecified
Severity: enhancement

Details

Reference
bz23741

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 11:03 PM
bzimport set Reference to bz23741.
bzimport added a subscriber: Unknown Object (MLST).

tisane2718 wrote:

In the second paragraph above, I meant, "Some wiki FARMS will want to use CentralAuth, while others will simply want to share a user table"

happy.melon.wiki wrote:

Have you encountered this in practice? I would expect this only to occur if the user_properties table is shared; and it could be resolved by not sharing said table. Of course, that introduces its own problems, viz. having to set preferences on each wiki separately. Perhaps the user_properties table needs an up_wiki field which is NULL or '*' for settings which should apply to all wikis, and can contain a wiki name for properties which should apply only to that wiki. But this would only be necessary for wikis using shared tables; it should be implemented in a different fashion for wikis using CentralAuth, which (if done properly) should avoid having to do a schema change on the hundred-million-row enwiki user_properties table...
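A rough sketch of how such an up_wiki field might be looked up, using an in-memory SQLite table rather than a real MediaWiki schema. The up_wiki column, the primary key, the wiki names, and the get_option helper are all assumptions for illustration. '*' is used instead of NULL for global rows so that the column can take part in the key, and a wiki-specific row is assumed to take precedence over a global one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the existing up_user/up_property/up_value columns
# plus the proposed up_wiki column ('*' means "applies to every wiki").
conn.execute(
    "CREATE TABLE user_properties ("
    " up_user INTEGER NOT NULL,"
    " up_wiki TEXT NOT NULL DEFAULT '*',"
    " up_property TEXT NOT NULL,"
    " up_value TEXT,"
    " PRIMARY KEY (up_user, up_wiki, up_property))"
)
conn.executemany(
    "INSERT INTO user_properties (up_user, up_wiki, up_property, up_value)"
    " VALUES (?, ?, ?, ?)",
    [
        (1, "*", "language", "en"),        # global preference
        (1, "wiki2", "searchNs100", "1"),  # applies to wiki2 only
    ],
)


def get_option(user_id, wiki, prop):
    """Return the value for `prop` on `wiki`, preferring a local row."""
    row = conn.execute(
        "SELECT up_value FROM user_properties"
        " WHERE up_user = ? AND up_property = ? AND up_wiki IN (?, '*')"
        # (up_wiki = '*') is 0 for the local row, so it sorts first.
        " ORDER BY (up_wiki = '*') LIMIT 1",
        (user_id, prop, wiki),
    ).fetchone()
    return row[0] if row else None


print(get_option(1, "wiki2", "searchNs100"))  # 1
print(get_option(1, "wiki3", "searchNs100"))  # None
print(get_option(1, "wiki3", "language"))     # en
```

With the wiki name stored alongside the property, wiki3 no longer sees the searchNs100 row that was written on wiki2.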

tisane2718 wrote:

I don't have a wiki farm that I use for production purposes; I just have test wiki farms, one of which I used to verify that this collision can occur. As a side note, it would seem that user_properties is shared by default when you share databases, because I didn't explicitly set that table to be shared.

I would think that users would prefer to have certain preferences set globally, especially when large checklists are involved, as might be the case with interwiki search and watchlist options (see bug 3525). I thought about having a big two-dimensional checklist with preferences on one axis and wikis on the other, and "select/deselect all" options for each row and column, but I think on large wiki farms (e.g. Wikimedia) it would be too big to be manageable.

In order to help future-proof the database, perhaps we should just add a "wiki" field to all the tables, so that tables can be more readily shared without causing a bunch of collisions. In cases where that field should be applicable globally, it could be set to NULL or "*" as you suggest. There is a lot more that needs to be done as far as cross-wiki integration is concerned, and I think that this will create flexibility to implement such functionality without having to create new global tables and/or global databases. True, the field would be useless in isolated wikis, but one never knows when one will want to merge the wiki into a farm, e.g. to take a monolingual project multilingual.

There might be some benefit, for instance, in having a wiki field for the page table, so that templates could be shared across wikis by setting a "*" value for that field. Changing the page on one wiki would change it on them all. Although maybe an even better option would be to put a foreign key in that field leading to another table, and thereby provide arrays of wikis to which a given setting applies. E.g., a preference, page, or whatever might apply only to some wikis but not others, rather than being global.
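To make the foreign-key idea a little more concrete, here is a small sketch in which each row points at a set of wikis instead of naming a single wiki. Everything in it (the WIKI_SETS table, the set ids, the wiki names, the sample rows, and the options_for helper) is hypothetical and only illustrates the lookup, not an actual schema.

```python
# Hypothetical "wiki set" table: set id -> wikis the set covers.
WIKI_SETS = {
    1: {"wiki2"},
    2: {"wiki2", "wiki3"},
}

# Marker for rows that apply to every wiki (the "*" / NULL case).
ALL_WIKIS = None

# Hypothetical preference rows; "wiki_set" plays the role of the foreign key.
rows = [
    {"user": 1, "property": "searchNs100", "value": "1", "wiki_set": 1},
    {"user": 1, "property": "watchdefault", "value": "1", "wiki_set": 2},
    {"user": 1, "property": "language", "value": "en", "wiki_set": ALL_WIKIS},
]


def options_for(user_id, wiki):
    """Collect the properties of `user_id` that apply on `wiki`."""
    result = {}
    for row in rows:
        if row["user"] != user_id:
            continue
        set_id = row["wiki_set"]
        if set_id is ALL_WIKIS or wiki in WIKI_SETS[set_id]:
            result[row["property"]] = row["value"]
    return result


print(options_for(1, "wiki2"))
print(options_for(1, "wiki3"))
print(options_for(1, "wiki4"))
```

Here wiki2 sees all three properties, wiki3 sees only the shared and global ones, and wiki4 sees only the global one.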

happy.melon.wiki wrote:

(In reply to comment #3)

> I would think that users would prefer to have certain preferences set globally,
> especially when large checklists are involved, as might be the case with
> interwiki search and watchlist options (see bug 3525). I thought about having a
> big two-dimensional checklist with preferences on one axis and wikis on the
> other, and "select/deselect all" options for each row and column, but I think
> on large wiki farms (e.g. Wikimedia) it would be too big to be manageable.

Just make Special:Preferences set global prefs on the home wiki, and local prefs on other wikis. Only needs some good documentation and ghost indications of the default settings.

> In order to help future-proof the database, perhaps we should just add a "wiki"
> field to all the tables, so that tables can be more readily shared without
> causing a bunch of collisions. In cases where that field should be applicable
> globally, it could be set to NULL or "*" as you suggest. There is a lot more
> that needs to be done as far as cross-wiki integration is concerned, and I
> think that this will create flexibility to implement such functionality without
> having to create new global tables and/or global databases. True, the field
> would be useless in isolated wikis, but one never knows when one will want to
> merge the wiki into a farm, e.g. to take a monolingual project multilingual.

"having to create" new fields and databases is indeed a huge and difficult-to-organise operation. All the more reason *not* to do it except where it's genuinely needed. Adding fields willy-nilly purely for the sake of it or because it "might" be useful one day, is not intelligent coding, and will *not* make it into production code.

> There might be some benefit, for instance, in having a wiki field for the page
> table, so that templates could be shared across wikis by setting a "*" value
> for that field. Changing the page on one wiki would change it on them all.
> Although maybe an even better option would be to put a foreign key in that field
> leading to another table, and thereby provide arrays of wikis to which a given
> setting applies. E.g., a preference, page, or whatever might apply only to some
> wikis but not others, rather than being global.

Sharing the page table is totally unrealistic; it has relational connections to twelve other tables which in turn connect to just about every other table in the database. For instance, "changing the page" requires creating a revision, which must be accessible to all wikis, because the revision record contains the link to the text table; equally the link tables such as imagelinks, for cache integrity. But the revision requires access to the user table, and imagelinks requires access to the image table; and the image table requires oldimage, and so on and so forth.
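The knock-on effect can be illustrated by walking the dependencies transitively. The graph below is a deliberately incomplete sketch: it contains only the tables and links named in this comment (not the full set of twelve), and the DEPENDS_ON mapping and tables_to_share helper are made up for illustration.

```python
from collections import deque

# Partial dependency graph, limited to the links mentioned above:
# sharing a table means the tables it refers to must be reachable too.
DEPENDS_ON = {
    "page": ["revision", "imagelinks"],
    "revision": ["text", "user"],
    "imagelinks": ["image"],
    "image": ["oldimage"],
}


def tables_to_share(start):
    """Breadth-first walk returning every table reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for dep in DEPENDS_ON.get(table, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


print(sorted(tables_to_share("page")))
# ['image', 'imagelinks', 'oldimage', 'page', 'revision', 'text', 'user']
```

Even with this reduced graph, sharing page alone pulls in six further tables.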

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:01 AM
Aklapper removed a subscriber: wikibugs-l-list.