The dumps are growing too large and cumbersome for people to work with. They also contain a large amount of data that is irrelevant to any given use case. One way to address this is to split the dumps in a meaningful way. We already split them into Lexemes and Items + Properties. We need to figure out which other meaningful splits exist. Ideally, we would divide the data into distinct subsets.
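As a rough illustration of what a split into distinct subsets could look like, here is a minimal sketch that groups entities by type. It assumes the JSON dump layout (a JSON array with one entity per line, each entity carrying `type` and `id` fields). The function names `iter_entities` and `split_by_type` are hypothetical and not part of any existing tooling, and the sample data is a tiny stand-in for a real dump.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield entities from a dump laid out as a JSON array, one entity per line."""
    for line in lines:
        # Drop surrounding whitespace and the trailing comma between array elements.
        line = line.strip().rstrip(",")
        # Skip blank lines and the array's opening and closing brackets.
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def split_by_type(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group entity IDs by the entity's "type" field (assumed: item, property, lexeme)."""
    subsets: Dict[str, List[str]] = defaultdict(list)
    for entity in iter_entities(lines):
        subsets[entity.get("type", "unknown")].append(entity["id"])
    return dict(subsets)


# Tiny in-memory stand-in for a dump file; real dumps are compressed and far larger.
sample_dump = [
    "[",
    '{"type": "item", "id": "Q42"},',
    '{"type": "property", "id": "P31"},',
    '{"type": "lexeme", "id": "L1"}',
    "]",
]
print(split_by_type(sample_dump))
```

Because each entity lands in exactly one group, the resulting subsets are disjoint. Other split criteria would replace the `type` lookup with a different key function.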
Current alternatives:
- https://tools.wmflabs.org/wdumps/ lets users create custom dumps.
Relevant notes and discussions: