Replicate Bugzilla database to Labs testing instance
Closed, Declined, Public

Description

Author: happy.melon.wiki

Description:
To allow our many toolserver users to do exciting things with the bugzilla database which are not possible (or are very messy) to do through the web interface.


Version: unspecified
Severity: enhancement
URL: https://jira.toolserver.org/browse/TS-901
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=40331

Details

Reference
bz28339

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 11:26 PM
bzimport set Reference to bz28339.
bzimport added a subscriber: Unknown Object (MLST).

This isn't so much a WMF action item, as a TS one

Max has already reported this on JIRA https://jira.toolserver.org/browse/TS-901

happy.melon.wiki wrote:

It probably requires action from both ends, however, because the BZ database is not (AFAIK) part of the main cluster, and so its binlogs are probably not accessible.

Don't we have "secret" bugs in the security section now? That might have to be taken into account when doing this (although I imagine that's not much different than the other secret info which is taken into account by the toolserver).

(In reply to comment #2)

It probably requires action from both ends, however, because the BZ database is
not (AFAIK) part of the main cluster, and so its binlogs are probably not
accessible.

As per Max originally, River is WMF root also, so can just do it if necessary.

Granted, there is some filtering needed...

(In reply to comment #3)

Don't we have "secret" bugs in the security section now? That might have to be
taken into account when doing this (Although I imagine that's not much
different than the other secret info which is taken into account by the
toolserver).

There's none yet, but there could be some someday :)

(In reply to comment #3)

Don't we have "secret" bugs in the security section now? That might have to be
taken into account when doing this (Although I imagine that's not much
different than the other secret info which is taken into account by the
toolserver).

I don't think that would be any different than the revdel entries in the wiki dbs, which iirc are available to the ts users.

(In reply to comment #5)

(In reply to comment #3)

Don't we have "secret" bugs in the security section now? That might have to be
taken into account when doing this (Although I imagine that's not much
different than the other secret info which is taken into account by the
toolserver).

There's none yet, but there could be some someday :)

I thought there was one atm? Which, obviously, will go public when the fix is released...

(In reply to comment #7)

(In reply to comment #5)

(In reply to comment #3)

Don't we have "secret" bugs in the security section now? That might have to be
taken into account when doing this (Although I imagine that's not much
different than the other secret info which is taken into account by the
toolserver).

There's none yet, but there could be some someday :)

I thought there was one atm? Which, obviously, will go public when the fix is
released...

I lied, there's one added since I last checked :p

Adding (likely necessary) "shell" keyword. Lowering priority since this information is available via the API, which takes care of any "secret" bug problems. I should have a demonstration of getting information out of Bz using the API shortly.

For the record, I'm not saying the API is just as good as being able to do raw SQL queries, but it is better than the web interface for programmatic extraction of data. Bz4 has improved on the API some, as well.

(In reply to comment #9)

Adding (likely necessary) "shell" keyword. Lowering priority since this
information is available via the API, which takes care of any "secret" bug
problems. I should have a demonstration of getting information out of Bz using
the API shortly.

For the record, I'm not saying the API is just as good as being able to do
raw SQL queries, but it is better than the web interface for programmatic
extraction of data. Bz4 has improved on the API some, as well.

You mean this api: http://bugzilla.wikimedia.org/bzapi ? (The one that gives a 500 error after about a 5 minute timeout) ;)

happy.melon.wiki wrote:

(In reply to comment #10)

You mean this api: http://bugzilla.wikimedia.org/bzapi ? (The one that gives a
500 error after about a 5 minute timeout) ;)

I think if you align the runestones correctly you can get it to at least fail noisily (I managed to get a "you must use POST requests" out of it, can't quite remember how); but it's certainly suboptimal.

I think it used to work and was subsequently broken. I've tried queries directly out of the bugzilla docs (that work on mozilla's bugzilla instance), and they still fail here. (for example, https://api-dev.bugzilla.mozilla.org/0.9/bug/1234?include_fields=summary vs https://bugzilla.wikimedia.org/bzapi/bug/1234?include_fields=summary ).

I suppose this is getting off topic though.

Has anyone actually tried the bz4 api?

happy.melon.wiki wrote:

Does BZ4 *have* a built-in API?

(In reply to comment #10)

You mean this api: http://bugzilla.wikimedia.org/bzapi ? (The one that gives a
500 error after about a 5 minute timeout) ;)

No, I don't mean that one. This one: https://bugzilla.wikimedia.org/jsonrpc.cgi I've been using it to build some queries and reports. See http://www.bugzilla.org/docs/3.6/en/html/api/Bugzilla/WebService/Server/JSONRPC.html for documentation.
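For readers who want to try the JSON-RPC endpoint mentioned above, here is a minimal sketch of a `Bug.get` call in Python. It assumes the endpoint still answers at the URL given in the comment and that the server accepts `include_fields` with the field names `id`, `summary` and `status`; those details vary between Bugzilla versions, so check them against the WebService documentation for the deployed version. The helper names `build_bug_get_request` and `fetch_bugs` are made up for illustration.

```python
import json
import urllib.request

# Endpoint taken from the comment above; it may no longer be reachable.
JSONRPC_URL = "https://bugzilla.wikimedia.org/jsonrpc.cgi"


def build_bug_get_request(bug_ids, fields=("id", "summary", "status"), request_id=1):
    """Build the JSON-RPC payload for a Bug.get call (hypothetical helper)."""
    return {
        "version": "1.1",
        "method": "Bug.get",
        "id": request_id,
        # Bugzilla's JSON-RPC interface takes a one-element list holding a dict.
        # The include_fields names are an assumption about the server version.
        "params": [{"ids": list(bug_ids), "include_fields": list(fields)}],
    }


def fetch_bugs(bug_ids, url=JSONRPC_URL):
    """POST the request and return the list of bug dicts from the reply."""
    payload = json.dumps(build_bug_get_request(bug_ids)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        reply = json.load(response)
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]["bugs"]
```

Calling `fetch_bugs([28339])` would attempt the network request; `build_bug_get_request` can be inspected offline to see exactly what gets sent.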

(In reply to comment #14)

Does BZ4 *have* a built-in API?

Yes. http://www.bugzilla.org/docs/4.0/en/html/api/Bugzilla/WebService.html

pdhanda wrote:

Bugmeister is the new Bugzilla maintainer and default assignee.

Removing "shell" keyword for things that aren't directly doable by shell users etc

Removing shell keyword if exists

Why is this assigned to Mark? And has there been any progress on this?

(In reply to comment #20)

Why is this assigned to Mark?

When he became the Bugmeister, all the open BZ bugs got reassigned to him. Assigning back to wikibugs-l

Is this leaning towards a WONTFIX, or do we think that the API is still not flexible enough and that private stuff is not a problem?

There are XMLRPC and JSONRPC interfaces, see http://www.bugzilla.org/docs/4.2/en/html/api/Bugzilla/WebService.html - however, not all web search options are available through them, see https://bugzilla.mozilla.org/show_bug.cgi?id=475754

I (bug wrangler) do not plan to work on this, also considering https://meta.wikimedia.org/wiki/Future_of_Toolserver , so to me this is a WONTFIX (will not fix this request).

(In reply to comment #23 by Betacommand)

The API is not sufficient

Please elaborate what is needed.

Changing summary to Labs, as Toolserver will die.

We have an instance on https://wikitech.wikimedia.org/wiki/Nova_Resource:Bugzilla but without data, and I think we cannot easily replicate due to privacy concerns.

Happy-melon: Please give some examples of what you have in mind to do with the data; otherwise I'd say it's not worth the hassle.

From JIRA (as that ticket isn't going anywhere...):


MZMcBride added a comment - 12 January 2011 10:52:50

The Bugzilla database schema is available here: http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/index.cgi?action=single&version=3.4.2&view=View+schema.

(In reply to comment #25)

We have an instance on
https://wikitech.wikimedia.org/wiki/Nova_Resource:Bugzilla but without data,
and I think we cannot easily replicate due to privacy concerns.

We replicate a lot of tables with much more sensitive data (the MediaWiki user table, for example). I don't think it's reasonable to argue that privacy concerns are relevant here (it's a public Bugzilla instance). There are a few database table fields that would need to be filtered out (only one, as I recall based on the database schema, but we'll assume there's more than one and say "a few").

Given that we already have tools and infrastructure in place to replicate database tables, I'm almost inclined to mark this bug with the "easy" keyword. But I'll resist.

There should perhaps also be some consideration for "Security" bugs, but that would be mostly paranoia, I think. Surely the important security bugs are resolved fairly quickly. :-)

MZMcBride: I cross my fingers and hope that you'll be right - I don't know or understand the infrastructure well enough to really judge.

(In reply to comment #1)

Max has already reported this on JIRA
https://jira.toolserver.org/browse/TS-901

Who's Max?

Copying from that ticket, while it's still accessible:


Wikimedia Bugzilla database should be replicated with views to Toolserver users

The Wikimedia bug tracker (https://bugzilla.wikimedia.org/) has a MySQL backend that's currently inaccessible to Toolserver users. It would be helpful (for statistics particularly) if the underlying databases could be exposed to Toolserver users in the same way that the MediaWiki wiki databases are exposed (using replication and views). This requires auditing the Bugzilla database schema to ensure that no private information is leaked, however doing so shouldn't be very difficult.

The Bugzilla database schema is available here: http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/index.cgi?action=single&version=3.4.2&view=View+schema.
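The "replication and views" approach described in the ticket can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for the MySQL replica, a heavily cut-down version of the Bugzilla tables `bugs`, `bug_group_map` and `profiles`, and a made-up policy (hide any bug that belongs to a group, expose only `userid` and `realname` from `profiles`). The view names `bugs_public` and `profiles_public` and the function `build_sanitized_demo` are hypothetical; a real audit of the schema would decide which columns and rows are actually safe.

```python
import sqlite3


def build_sanitized_demo():
    """Create toy Bugzilla-like tables plus views that hide private data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Cut-down stand-ins for the real tables (assumed column subset).
        CREATE TABLE bugs (
            bug_id INTEGER PRIMARY KEY,
            short_desc TEXT,
            bug_status TEXT
        );
        CREATE TABLE bug_group_map (bug_id INTEGER, group_id INTEGER);
        CREATE TABLE profiles (
            userid INTEGER PRIMARY KEY,
            login_name TEXT,
            realname TEXT,
            cryptpassword TEXT
        );

        INSERT INTO bugs VALUES (1, 'Public bug', 'NEW');
        INSERT INTO bugs VALUES (2, 'Restricted security bug', 'NEW');
        INSERT INTO bug_group_map VALUES (2, 10);
        INSERT INTO profiles VALUES (1, 'user@example.org', 'Example User', 'hash');

        -- Row filter: only bugs that are not restricted to any group.
        CREATE VIEW bugs_public AS
            SELECT b.bug_id AS bug_id,
                   b.short_desc AS short_desc,
                   b.bug_status AS bug_status
            FROM bugs b
            WHERE NOT EXISTS (
                SELECT 1 FROM bug_group_map g WHERE g.bug_id = b.bug_id
            );

        -- Column filter: password hash and login address are left out.
        CREATE VIEW profiles_public AS
            SELECT userid, realname FROM profiles;
        """
    )
    return conn


if __name__ == "__main__":
    demo = build_sanitized_demo()
    print(demo.execute("SELECT * FROM bugs_public").fetchall())
    print(demo.execute("SELECT * FROM profiles_public").fetchall())
```

Users of the replica would be granted access to the views only, never to the underlying tables, which is the same pattern the ticket describes for the MediaWiki wiki databases.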

(Wikidata also requested this.)

Somebody interested in pushing this forward should bring it up on the Ops mailing list, to discuss what is needed here.

(In reply to comment #30)

[...]
Somebody interested in pushing this forward should bring it up on the Ops
mailing list, to discuss what is needed here.

Eh, that's non-public AFAIK.

Meh. Still <ops@lists.wmorg> should have a moderator who allows external postings. I hope.

Wikimedia has migrated from Bugzilla to Phabricator. Learn more about it here: https://www.mediawiki.org/wiki/Phabricator/versus_Bugzilla - This task does not make sense anymore in the context of Phabricator, hence closing as declined.

This makes sense again if old-bugzilla is at risk of removal.

While replication is probably the wrong word here, the intended plan is to make the dump available on Labs on the dbstores. @Andrew can advise further on this once the dump is done, and I'll poke him about it when he gets to work as well.

Since bugzilla is a dead end and everything was migrated to phab, I don't understand what this is for. Can someone explain?

Presumably for easy searching of old BZ data that will be hard to find in Phabricator, and to get at data (e.g. non-migrated users' email domains) that's not currently shown in Phabricator.

Since bugzilla is a dead end and everything was migrated to phab, I don't understand what this is for. Can someone explain?

  • some things weren't migrated (T95266, T78747, ..)
  • there's no search
  • there is still old-bz, because see above
  • T85140 is an attempt to get rid of old-bz but can't have security related bugs
  • this is T1198 and specifically T85141; this might be considered a duplicate, because the data could be used in Labs only after sanitizing it
  • actual replication is not going to happen, but a one-time dump minus private data if we can resolve the above
  • the schema is outdated again meanwhile

I suggest closing this as a duplicate of T85141 or as rejected. It's something in between.

JohnLewis claimed this task.