
Make RegexFilterPageGenerator work on page bodies.
Closed, Resolved, Public

Description

Originally from: http://sourceforge.net/p/pywikipediabot/patches/552/
Reported by: loxley
Created on: 2012-05-12 16:09:39
Subject: Make RegexFilterPageGenerator work on page bodies.
Original description:
Make RegexFilterPageGenerator work on page bodies. As suggested by valhallasw.
The re.S flag is backwards compatible and allows matching newlines.
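A minimal sketch of the idea, not the patch itself: the `Page` class and the `regex_filter` function below are hypothetical stand-ins for pywikibot's page objects and RegexFilterPageGenerator, and the `on_body` parameter name is an assumption. It illustrates why re.S matters once the pattern is applied to a multi-line body, and why it leaves title matching unchanged.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Page:
    """Hypothetical stand-in for a wiki page (not the pywikibot class)."""
    title: str
    text: str


def regex_filter(pages: Iterable[Page], pattern: str,
                 on_body: bool = False) -> Iterator[Page]:
    """Yield pages whose title (or body, if on_body is set) matches pattern."""
    # re.DOTALL (re.S) lets "." match newlines, which page bodies contain.
    # Titles have no newlines, so the flag does not change title matching.
    regex = re.compile(pattern, re.IGNORECASE | re.DOTALL)
    for page in pages:
        target = page.text if on_body else page.title
        if regex.search(target):
            yield page


pages = [
    Page("Alpha", "{{stub}}\nSome text about foo.\nMore text about bar."),
    Page("Beta", "Only bar here."),
]

# "foo" and "bar" sit on different lines of Alpha's body; the match
# succeeds only because "." is allowed to cross the newline.
matched = [p.title for p in regex_filter(pages, r"foo.*bar", on_body=True)]
```

Whether the real generator uses `search` or `match`, and how it selects title versus body, is not stated in this task; the sketch picks `search` for simplicity.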


Version: unspecified
Severity: normal
See Also:
https://sourceforge.net/p/pywikipediabot/patches/552

Details

Reference
bz54563

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:11 AM
bzimport set Reference to bz54563.
bzimport added a subscriber: Unknown Object (????).

Commandline parameter to filter articles based on their bodies.

Looks good to me. If you have time, could you
a) add a command line parameter for this option in the GeneratorFactory,
and
b) change the re.I and re.S to the full names (I think it's IGNORECASE and MULTILINE?)

Thanks!
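For reference on the flag names: in Python's standard `re` module, the verbose name of re.I is IGNORECASE and the verbose name of re.S is DOTALL; MULTILINE is a different flag (re.M). The short check below illustrates the difference. The sample text is made up for the example.

```python
import re

# The one-letter flags are aliases of the verbose names.
assert re.I is re.IGNORECASE
assert re.S is re.DOTALL      # "." also matches newlines
assert re.M is re.MULTILINE   # "^" and "$" match at every line boundary

text = "first line\nsecond line"

# DOTALL lets ".*" span the newline between the two lines.
dotall_hit = re.search(r"first.*second", text, re.DOTALL) is not None

# MULTILINE does not change what "." matches, so the same pattern fails.
multiline_hit = re.search(r"first.*second", text, re.MULTILINE) is not None
```

So a pattern meant to match across lines of a page body needs DOTALL rather than MULTILINE.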

Hi again!

I added a new parameter -articlefilterregex which works on all subsequently given generators.

Regards,

Niki

Hi Valhallasw!

Thanks for the feedback. I changed the flags to their verbose representation and will add the commandline switch within the next two days.

Regards

Niki

jayvdb assigned this task to Mpaa.
jayvdb set Security to None.
jayvdb added a project: Pywikibot.
jayvdb removed subscribers: Unknown Object (????), Xqt.
jayvdb subscribed.

Mpaa pushed this through in late 2013 for core.

https://gerrit.wikimedia.org/r/#/c/86813/