dispatchChanges has performance issues. One major bottleneck is the getPendingChanges() function. It works by loading a block of changes, then loading the sitelinks for the item of each change, and then checking whether the target wiki is mentioned in those sitelinks. This means one extra database query for each change (by default, 1000 changes per batch). This is far too slow.
One solution would be to join the wb_changes table against the wb_items_per_site table directly. However, this would no longer work once we have client-side usage tracking. Also, wb_changes uses a single field for the prefixed ID of the entity, while wb_items_per_site uses one field for the entity type and one for the numeric ID. This makes joining inefficient and inconvenient.
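To illustrate the mismatch, here is a minimal Python sketch of the conversion that would be needed for every row before the two tables could be compared. The function name and the prefix mapping ("q" for items, "p" for properties) are assumptions made for illustration, not taken from the actual code base.

```python
from typing import Tuple

# Assumed prefix-to-type mapping, for illustration only.
PREFIX_TO_TYPE = {"q": "item", "p": "property"}


def split_prefixed_id(prefixed_id: str) -> Tuple[str, int]:
    """Split a prefixed entity ID such as 'q42' into (entity type, numeric ID)."""
    prefix, number = prefixed_id[:1].lower(), prefixed_id[1:]
    if prefix not in PREFIX_TO_TYPE or not (number.isascii() and number.isdigit()):
        raise ValueError(f"unrecognized entity ID: {prefixed_id!r}")
    return PREFIX_TO_TYPE[prefix], int(number)
```

Doing the equivalent inside a SQL join condition means computing an expression on the prefixed ID column for each row, which generally prevents the database from using a plain index on that column.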
An alternative solution would be to provide a storage layer service for:
a) checking, for a given client wiki, which items from a given list are used there.
b) listing all pages on a given client wiki that use any of a given list of items.
Using the first method, we could filter a given block of changes using a single query.
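A rough sketch of how such a service could look, written in Python against an in-memory SQLite database purely for illustration. The class and method names are hypothetical, and the table layout (ips_item_id, ips_site_id, ips_site_page) is a simplified assumption rather than the real schema. Changes are represented as (change ID, numeric item ID) pairs.

```python
import sqlite3
from typing import Iterable, List, Set, Tuple


class ItemUsageLookup:
    """Hypothetical storage layer service backed by a sitelink table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def _query(self, column: str, site_id: str, item_ids: Iterable[int]) -> list:
        ids = sorted(set(item_ids))
        if not ids:
            return []
        # One IN (...) query for the whole list instead of one query per item.
        # Very large lists may need chunking, depending on the database's
        # limit on bound parameters.
        placeholders = ",".join("?" for _ in ids)
        sql = (
            f"SELECT DISTINCT {column} FROM wb_items_per_site "
            f"WHERE ips_site_id = ? AND ips_item_id IN ({placeholders})"
        )
        return [row[0] for row in self.conn.execute(sql, [site_id, *ids])]

    def get_used_items(self, site_id: str, item_ids: Iterable[int]) -> Set[int]:
        """Method (a): which of the given items are used on the client wiki."""
        return set(self._query("ips_item_id", site_id, item_ids))

    def get_pages_using(self, site_id: str, item_ids: Iterable[int]) -> List[str]:
        """Method (b): pages on the client wiki that use any of the items."""
        return sorted(self._query("ips_site_page", site_id, item_ids))


def filter_changes(
    lookup: ItemUsageLookup, site_id: str, changes: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Keep only the changes whose item is used on the given client wiki."""
    used = lookup.get_used_items(site_id, (item_id for _, item_id in changes))
    return [change for change in changes if change[1] in used]


def make_demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory table with made-up rows for experimentation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wb_items_per_site ("
        "ips_item_id INTEGER, ips_site_id TEXT, ips_site_page TEXT)"
    )
    conn.executemany(
        "INSERT INTO wb_items_per_site VALUES (?, ?, ?)",
        [(1, "enwiki", "Berlin"), (2, "dewiki", "Hamburg"), (3, "enwiki", "Paris")],
    )
    return conn
```

With this shape, getPendingChanges() would issue one lookup per block of changes rather than one per change, and the service could later be backed by client-side usage tracking without changing its callers.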
Version: unspecified
Severity: normal