
Set up an easy to parse recentchanges feed
Closed, Resolved, Public

Description

RFC on mediawiki.org:

Author: mnh

Description:
The current browne IRC feed isn't exactly parser-friendly: colors and a rather peculiar field arrangement
make things a lot harder than necessary. Since many tools depend on the stream, I'd suggest
additionally offering a less painful format.

Proposed format: separated list using either U+001E (std record separator) or U+001F (std unit separator)
as delimiter with field order:

page, editor, minor, new, edit type, action flags, action duration, size delta, diff link, summary

s.t. (just BNFing the non-obvious)
<minor> ::= 'M' | ''

<new> ::= 'N' | ''

<edit type> ::= 'block' | 'protect' | <whatever> | ''
<block action> ::= 'anononly' | 'nocreate' | 'nomail' | <block action>',' <block action>
<protect action> ::= 'autoconfirmed' | 'sysop' | 'none'
<action flags> ::= <block action> | <protect action> | ''
<action duration> ::= <timestring> | ''

Examples: (; used instead of U+001[EF] here)
User:myVictim;EvilAdmin;;;block;anononly,nocreate;infinite;;http://XY.wikipedia.org/wiki/Special:Log/block;because I can\n
Testpage;Editor;M;N;;;;10;http://XY.wikipedia.org/w/index.php?title=...;just testing\n
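
A minimal sketch of how a consumer might parse one line of this format, assuming U+001F is chosen as the delimiter and every line carries all ten fields in the order listed above. The names `SEP`, `FIELDS` and `parse_line` are hypothetical and not taken from any existing tool.

```python
# Hypothetical consumer-side sketch; assumes U+001F as the delimiter.
SEP = "\x1f"

# Field order as given in the proposal.
FIELDS = (
    "page", "editor", "minor", "new", "edit_type",
    "action_flags", "action_duration", "size_delta", "diff_link", "summary",
)


def parse_line(line: str) -> dict:
    """Split one feed line into a dict keyed by field name."""
    # maxsplit keeps any stray separator inside the summary (the last field).
    parts = line.rstrip("\n").split(SEP, len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    entry = dict(zip(FIELDS, parts))
    # 'M' / 'N' markers become booleans; comma-separated flags become a list.
    entry["minor"] = entry["minor"] == "M"
    entry["new"] = entry["new"] == "N"
    entry["action_flags"] = [f for f in entry["action_flags"].split(",") if f]
    return entry
```

Rejecting lines with the wrong field count, rather than padding them, is a design choice of this sketch: it makes a truncated line visible to the tool instead of silently shifting fields.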

Advantages:

  • tool programmers could simply fetch and split() each line, not much chance for bugs to creep in
  • independent of any particular project's settings, whereas the current stream emits localized text such as "„[[Benutzer:Foo]]“ für den Zeitraum: Unbeschränkt (Erstellung von Benutzerkonten gesperrt)"; the format thus eases i18n of tools
  • easy to specify
  • probably pretty straightforward to implement
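
To illustrate the last point, the producer side could be sketched roughly as follows. This is a hypothetical illustration, not MediaWiki code: `format_line` and its parameters are invented names, U+001F is assumed as the delimiter, and stripping separator and line-break characters from field values is an assumption about how delimiter collisions might be avoided.

```python
SEP = "\x1f"  # assumed delimiter (U+001F, unit separator)


def _clean(value: str) -> str:
    """Drop separators and line breaks so a field cannot break the record."""
    return "".join(ch for ch in value if ch not in "\x1e\x1f\r\n")


def format_line(page, editor, minor=False, new=False, edit_type="",
                action_flags=(), action_duration="", size_delta="",
                diff_link="", summary=""):
    """Build one feed line in the proposed field order."""
    fields = [
        page,
        editor,
        "M" if minor else "",
        "N" if new else "",
        edit_type,
        ",".join(action_flags),
        action_duration,
        str(size_delta),
        diff_link,
        summary,
    ]
    return SEP.join(_clean(f) for f in fields) + "\n"
```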

Disadvantages:

  • not particularly well suited for human readability. If offered as a supplementary stream instead of entirely replacing the old one, however, this is negligible.

Regards, mnh


Version: unspecified
Severity: enhancement

Details

Reference
bz14045

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 10:11 PM
bzimport set Reference to bz14045.

leon wrote:

XML would be cool. We could even distribute it over XMPP! :-)

Although the current scheme is parseable, this proposal seems reasonable, but only if it would exist along with the "human readable" channels. I suggest something like #meta.wikimedia.bot for the channel names.

mike.lifeguard+bugs wrote:

(In reply to comment #2)

> Although the current scheme is parseable, this proposal seems reasonable, but
> only if it would exist along with the "human readable" channels. I suggest
> something like #meta.wikimedia.bot for the channel names.

It's actually not so parseable. In particular, localization is a problem, as is the rights log. We make do, but a standardized output built specifically to be machine-readable would be a boon.

Changed component to "RecentChanges"

Bug 17450 is specifically about RC-via-XMPP. Since an easy-to-parse feed over IRC is impossible due to random cut-off, I'll mark this bug as depending on the XMPP one.

Change 131040 had a related patch set uploaded by Krinkle:
Add 'rcstream' module for broadcasting recent changes over WebSockets

https://gerrit.wikimedia.org/r/131040

Change 131040 merged by Faidon Liambotis:
Add 'rcstream' module for broadcasting recent changes over WebSockets

https://gerrit.wikimedia.org/r/131040

Moving from MediaWiki to Wikimedia. This bug is mostly about the IRC feed that Wikimedia provides, and Ori and Faidon have referred to it for the RCStream service that's being created at the moment.

*** Bug 30555 has been marked as a duplicate of this bug. ***

See:

There are still a few minor issues to be resolved. An announcement to wikitech-l will be made when the service is ready for wide usage. However, this bug as-is is done.