
DRTRIGON-85 Integrate sum_cat_disc as bot
Closed, ResolvedPublic

Description

This issue was converted from https://jira.toolserver.org/browse/DRTRIGON-85.
Summary: Integrate sum_cat_disc as bot
Issue type: New Feature - A new feature of the product, which has yet to be developed.
Priority: Major
Status: Closed
Assignee: drtrigon <dr.trigon@surfeu.ch>


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sat, 19 Feb 2011 08:58:50

Flominator requested to get the functionality 'sum_cat_disc' provides built into a bot. Some suggestions on how to do this and what features should be available are given here:
http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:DrTrigon&oldid=85471743#Diskussionsbeitr.C3.A4ge_nach_Kategorie

e.g.:

  • A list like DrTrigonBot's would be great, but a template-based solution would be perfect
  • Options one could imagine:
    • Category depth
    • Time until entries are cleared
    • Ignoring changes made by bots or to transcluded templates
    • Output of the section heading or of the editor
    • Minimum number of changes (before a page is shown at all)

Even an initial version that "simply" posts the relevant links on some page would already help me, though.


Version: unspecified
Severity: major

Details

Reference
bz59431

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:30 AM
bzimport set Reference to bz59431.

From: drtrigon <dr.trigon@surfeu.ch>

Date: Sat, 19 Feb 2011 09:35:18

Even an initial version that "simply" posts the relevant links on some page would already help me, though.

What about 'subster'? ;) A test to try it out was set up in: http://de.wikipedia.org/w/index.php?title=Benutzer:DrTrigon/Spielwiese&oldid=85478950#Test%201
It's more or less OK, except that the link syntax is missing and all dates are lost... but it is close ;)

From: Florian Straub <flominator@gmx.net>

Date: Sat, 19 Feb 2011 09:57:16

Looks great, especially since it contains one talk page I edited :)

From: drtrigon <dr.trigon@surfeu.ch>

Date: Sun, 20 Feb 2011 16:14:18

Added 'wikilinkedlist' as a 'postproc' option in r118; it makes it possible to output lists of links (within the wiki).

This should enable beta testing of the first "sum_cat_disc bot". Flo, if you like, you can start using
it as on my sandbox; the timestamps are missing, but the list is ordered by timestamp.


From: Florian Straub <flominator@gmx.net>

Date: Mon, 21 Feb 2011 07:13:19

It currently throws an exception at http://de.wikipedia.org/w/index.php?title=Benutzer:Flominator/Freiburg&oldid=85558651


From: drtrigon <dr.trigon@surfeu.ch>

Date: Mon, 21 Feb 2011 15:17:25

That is because you chose a regex that does not match. Your choice was

regex=<br />\n<table>(.*?)</table>\n<br />\nTime to process

which produced wrong output.

Correct is:

regex=<br>\n<table>(.*?)</table>\n<br>\nTime to process

and produces correct output.
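
The difference between the two patterns can be checked with Python's 're' module. This is only a minimal sketch: the HTML snippet is made up for illustration (the real tool output is not shown in this thread), and the use of re.DOTALL is an assumption so that '.' can span line breaks inside the table.

```python
import re

# Hypothetical stand-in for the tool's HTML output; only the <br> tags matter here.
html = "<br>\n<table><tr><td>Diskussion:Freiburg</td></tr></table>\n<br>\nTime to process"

wrong = r"<br />\n<table>(.*?)</table>\n<br />\nTime to process"
right = r"<br>\n<table>(.*?)</table>\n<br>\nTime to process"

# re.DOTALL is assumed so that '.' also matches newlines inside the table.
wrong_match = re.search(wrong, html, re.DOTALL)
right_match = re.search(right, html, re.DOTALL)

print(wrong_match)           # None: the text contains '<br>', not '<br />'
print(right_match.group(1))  # the table body captured by (.*?)
```

A pattern that finds nothing gives the bot no data to substitute, which is consistent with the exception reported above.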

You may want to have a look at the following links:

  • DrTrigonBot subster simulation panel: where you can simulate and check the correctness of the template settings (since the bot runs just once a day)
  • regex
    • Regular expression operations: the python module 're' for regex
    • Regular expression: in general, wiki page

From: Florian Straub <flominator@gmx.net>

Date: Mon, 28 Feb 2011 07:05:57

Thanks. Works great. Two more things:

  1. What about wrapping the subster bot template for sum_cat_disc in a separate template that can be used like this: {{User:DrTrigon/Talk pages|Category 1;Category 2;...}}?
  2. The bot seems to append exceptions instead of replacing them: http://de.wikipedia.org/w/index.php?title=Benutzer%3AFlominator%2FBreisgau-Hochschwarzwald&action=historysubmit&diff=85860993&oldid=85821723

From: drtrigon <dr.trigon@surfeu.ch>

Date: Mon, 28 Feb 2011 10:46:22

Great!

  1. A first draft was created at http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Entwurf/Vorlage:Talk_pages; usage, e.g.:

    {{Benutzer:DrTrigon/Entwurf/Vorlage:Talk_pages |cat=Baden-W%C3%BCrttemberg |value=valSumCatDiscTest }}

This template works exactly like the other one but hides the 'regex' and 'postproc' parameters from you, since 'regex' in particular caused you some problems. So that all needed options are available, the new template supports a 'period' parameter as well. You may use only one category per template. If anything is unclear, please ask!

  2. I have to say, the bot works correctly and as it should; the issues you are experiencing are all caused by a wrong 'regex'. You use

    <br />

in several places instead of

<br>

which causes the bot not to work! Be careful!! Anyway, the new template mentioned above hides the 'regex' option from you and should help solve this issue.


From: Florian Straub <flominator@gmx.net>

Date: Sun, 06 Mar 2011 10:09:00

That is much easier, thank you.

I do know the difference between the br tags, but some script in my monobook used to automatically replace br with br/. I have deactivated that now.

The template is great, except for the naming of the parameters. It works for me, but it should be understandable for others as well, don't you think? Maybe we should continue the discussion on the talk page of the template?

What about the parameter names "Kategorie" and "Stunden" (maybe Tage)? Does the consumer really need to specify the "value"?


From: Florian Straub <flominator@gmx.net>

Date: Sat, 19 Mar 2011 14:08:14

To make the results much easier to scan, it would be helpful to show the name of the last user who edited the page.


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sat, 09 Jul 2011 19:52:03

For the template to work, it must be used as described at: http://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:DrTrigon&oldid=91043669#Subster-Bot

{{subst:Benutzer:DrTrigon/Entwurf/Vorlage:Talk_pages
|cat=Baden-W%C3%BCrttemberg
|value=valSumCatDiscTest
}}

DO NOT FORGET THE

subst:

See also http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Entwurf/Vorlage:Talk_pages.


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sat, 09 Jul 2011 21:54:23

Example - DrTrigonBot category discussion summary

added to http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Entwurf/Vorlage:Subster


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sun, 10 Jul 2011 21:43:40

Replace (and remove) the helper template in Vorlage:Talk_pages with the concept described at Benutzer Diskussion:DrTrigon#Subster-Bot:

Introduce a new parameter 'simple' with the following syntax (example):

{{Benutzer:DrTrigon/Entwurf/Vorlage:Subster
|simple={'<type>': 'sum_cat_disc', 'cat': 'Freiburg im Breisgau', 'period': '336'}
|value=valSumCatDiscTest
}}

Now use e.g. the page Benutzer:DrTrigonBot/Subster to store the full parameter set belonging to a 'simple type' (here 'sum_cat_disc'), like:

{|
|sum_cat_disc
|url=http://toolserver.org/~drtrigon/cgi-bin/sum_cat_disc.py?wiki=de&cat=%(cat)s&period=%(period)s
|regex=<br>\n<table>(.*?)</table>\n<br>\nTime to process
|postproc=('wikilinkedlist', '"_blank">(.*?)<')
|-
|...
|}
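
How the values from 'simple' could be merged into the stored URL can be sketched in Python. This is not the bot's actual code: the helper name build_url, the use of ast.literal_eval to read the dict-like value, and the percent-encoding step are all assumptions made for illustration.

```python
import ast
from urllib.parse import quote

# URL template as stored in the 'sum_cat_disc' row of the table above.
URL_TEMPLATE = ("http://toolserver.org/~drtrigon/cgi-bin/sum_cat_disc.py"
                "?wiki=de&cat=%(cat)s&period=%(period)s")

def build_url(simple_text, url_template=URL_TEMPLATE):
    """Hypothetical helper: parse the 'simple' value and fill in the URL template."""
    params = ast.literal_eval(simple_text)  # parse the dict literal without eval()
    params.pop('<type>', None)              # selects the stored row; not part of the URL
    # Percent-encode the values so that spaces survive in the query string.
    encoded = {key: quote(value) for key, value in params.items()}
    return url_template % encoded

simple = "{'<type>': 'sum_cat_disc', 'cat': 'Freiburg im Breisgau', 'period': '336'}"
url = build_url(simple)
print(url)
```

The '%(cat)s' and '%(period)s' placeholders are ordinary Python %-format keys, so a plain dictionary is enough to fill them.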

From: drtrigon <dr.trigon@surfeu.ch>

Date: Fri, 12 Aug 2011 14:54:10

Respect https://wiki.toolserver.org/view/Database_access#Query_time_limits and add limits to 'sum_cat_disc' where needed.


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sun, 14 Aug 2011 20:30:13

Query-Killer is back in action: http://lists.wikimedia.org/pipermail/toolserver-l/2011-August/004305.html


From: drtrigon <dr.trigon@surfeu.ch>

Date: Fri, 26 Aug 2011 17:35:39

Enabled parsing of timestamps through the new postproc option 'formatedlist' [1]; an example is given in [2]. The format used, '* [[%s]] - %s', can be chosen freely.

[1] http://de.wikipedia.org/w/index.php?title=Benutzer:DrTrigon/Benutzer:DrTrigonBot/config.css&diff=92915758&oldid=92859420
[2] http://de.wikipedia.org/w/index.php?title=Benutzer%3ADrTrigon%2FSpielwiese&action=historysubmit&diff=92915813&oldid=92914625
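
A rough sketch of how the format '* [[%s]] - %s' could be applied to parsed results. The HTML rows, the two-group pattern, and the function name formatted_list are hypothetical; the pattern actually used is stored on the bot's config page [1] and may look different.

```python
import re

# Format string quoted in the comment above: page title first, timestamp second.
LINE_FORMAT = '* [[%s]] - %s'

# Hypothetical HTML rows and pattern, made up for this example only.
html = ('<tr><td><a href="#" target="_blank">Diskussion:Freiburg</a></td>'
        '<td>2011-08-26 17:00</td></tr>'
        '<tr><td><a href="#" target="_blank">Diskussion:Breisach</a></td>'
        '<td>2011-08-25 09:30</td></tr>')
pattern = r'"_blank">(.*?)</a></td><td>(.*?)</td>'

def formatted_list(text, pattern, line_format=LINE_FORMAT):
    """Turn every (title, timestamp) match into one wiki list line."""
    return '\n'.join(line_format % groups for groups in re.findall(pattern, text))

wikitext = formatted_list(html, pattern)
print(wikitext)
```

Because re.findall returns one tuple per match when the pattern has two groups, each tuple can be passed directly to the %-format string.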


From: drtrigon <dr.trigon@surfeu.ch>

Date: Fri, 26 Aug 2011 22:13:48

New parameter 'simple' introduced in r158. An example of usage can be found at [1] together with [2]. The syntax is as follows:

Instead of using e.g.

{{Benutzer:DrTrigon/Entwurf/Vorlage:Subster
|url=http://toolserver.org/~drtrigon/cgi-bin/sum_cat_disc.py?wiki=de&cat=Freiburg%20im%20Breisgau&period=336
|regex=<br>\n<table>(.*?)</table>\n<br>\nTime to process
|postproc=('wikilinkedlist', '"_blank">(.*?)<')
|value=valSumCatDiscTest
}}

use now

{{Benutzer:DrTrigon/Entwurf/Vorlage:Subster
|simple={{Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc|cat=Freiburg im Breisgau|period=336}}
|value=valSumCatDiscTest
}}

In fact, all parameters (including 'value') can be hidden or simplified this way. The template used looks like this:

{{((}}Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc

|url=http://toolserver.org/~drtrigon/cgi-bin/sum_cat_disc.py?wiki=de&cat={{{cat}}}&period={{{period}}}
|regex=<br>\n<table>(.*?)</table>\n<br>\nTime to process
|postproc=('wikilinkedlist', '"_blank">(.*?)<')
{{))}}

and is called Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc [2]. The name can of course be much shorter, which makes it more convenient to use.

This is implemented by passing the text given in 'simple' to the MediaWiki API call 'expandtemplates'. This should be as fast as reading a page with stored options, but it could be improved by caching the expanded templates for possible re-use during the same bot run.

[1] http://de.wikipedia.org/w/index.php?title=Benutzer:DrTrigon/Spielwiese&diff=92924020&oldid=92915813
[2] http://de.wikipedia.org/w/index.php?title=Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc&oldid=92923712
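
The caching idea mentioned above could look roughly like the following sketch. It is not taken from the bot: the class ExpansionCache and the stand-in fake_fetch are hypothetical, and the real 'expandtemplates' request is deliberately left out so that the example runs without network access.

```python
class ExpansionCache:
    """Remember template expansions for one bot run (hypothetical helper)."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that performs the real expansion request
        self._cache = {}
        self.misses = 0      # number of times the fetch callable was needed

    def expand(self, simple_text):
        # Only ask the wiki when this exact 'simple' text has not been seen yet.
        if simple_text not in self._cache:
            self.misses += 1
            self._cache[simple_text] = self._fetch(simple_text)
        return self._cache[simple_text]

def fake_fetch(text):
    # Stand-in for the expandtemplates request; returns a dummy expansion.
    return "expanded:" + text

cache = ExpansionCache(fake_fetch)
simple = "{{Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc|cat=Freiburg im Breisgau|period=336}}"
first = cache.expand(simple)
second = cache.expand(simple)
print(cache.misses)  # the second call is served from the cache
```

Keying the cache on the full 'simple' text keeps it correct when the same template is called with different parameters, at the cost of one request per distinct parameter set.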


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sat, 27 Aug 2011 18:50:45

This 'simple' syntax can also be (ab)used to solve the problem of changing URLs (e.g. FIFA), look at [1].

[1] http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Spielwiese#Test_5.2


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sat, 27 Aug 2011 19:26:07

Integration of sum_cat_disc.py as a bot via subster was finally completed in r158.

Enhancing the options of sum_cat_disc itself was copied/moved to DRTRIGON-97 because it's a separate issue.


From: Florian Straub <flominator@gmx.net>

Date: Sun, 28 Aug 2011 10:30:25

Great changes. Can we move http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc and localize the names of the parameters used by this template, or do you need them for parsing?


From: drtrigon <dr.trigon@surfeu.ch>

Date: Sun, 28 Aug 2011 11:02:44

You can do whatever you want with the template [1] (rename/move it, use another one, ...), and you can also change all the parameters freely. The only thing you have to preserve is the data (and format) of the text returned by the template when processed through [2]. The bot simply expands the template as its very first action to get further info; that way it supports full wiki syntax in the 'simple' option. You can even use further functions, as done in e.g. [3], to have fully dynamic parameters... ;)

[1] http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Entwurf/Vorlage:Subster/Simple:sum_cat_disc
[2] http://de.wikipedia.org/wiki/Spezial:Vorlagen_expandieren
[3] http://de.wikipedia.org/wiki/Benutzer:DrTrigon/Entwurf/Vorlage:FIFA-Weltranglistendaten

This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: dr.trigon@surfeu.ch
CC list: flominator@gmx.net, dr.trigon@surfeu.ch