New page created, db lag, result in double entries in categories
Closed, Resolved (Public)

Description

Author: anthony

Description:
After creating the page

http://en.wikipedia.org/wiki/Singles_93-03

I received a "page does not exist, please create one", so I tried saving it
again, which resulted in another "page does not exist". Probably because of db
lag, the page had been saved anyway as I was looking for what went wrong. So
that didn't seem to be buggy.

However, in the meantime the page -- which had been saved twice because of db
lag -- had put itself into its categories twice, see e.g.

http://en.wikipedia.org/wiki/Category:The_Chemical_Brothers_albums
http://en.wikipedia.org/wiki/Category:2003_albums
...

Removing the categories from the page didn't help, as only one of the two entries
was removed from each category. Bringing the categories back to the article
also brought the number of entries in the categories back to two.

Thanks in advance,

[[User:Aliekens]]


Version: unspecified
Severity: major

Details

Reference
bz1239
Revisions and Commits

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:08 PM
bzimport set Reference to bz1239.
bzimport added a subscriber: Unknown Object (MLST).

plugwash wrote:

So we have an orphaned category entry and no easy way to remove it (the only way
I can think of is probably for a developer to use SQL directly).

I don't know how Wikipedia stores pages, so I don't know if we also have two page
records in there for the same page name.

Maybe the database could be made to enforce no duplicate entries in a category.

I've marked this as major because it's database corruption that isn't easy to undo.
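
As a rough illustration of what "use SQL directly" could look like, here is a minimal sketch that lists category links stored more than once. The table name `categorylinks` and the columns `cl_from` (page id) and `cl_to` (category name) are assumptions made for this example, and an in-memory SQLite database stands in for the real one; none of this is taken from the thread.

```python
import sqlite3


def find_duplicate_links(conn):
    """Return (page_id, category, copies) for every link stored more than once."""
    query = """
        SELECT cl_from, cl_to, COUNT(*) AS copies
        FROM categorylinks
        GROUP BY cl_from, cl_to
        HAVING COUNT(*) > 1
        ORDER BY cl_from, cl_to
    """
    return conn.execute(query).fetchall()


# Hypothetical data: page 1 was saved twice, so its link appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?)",
    [(1, "2003_albums"), (1, "2003_albums"), (2, "2003_albums")],
)
duplicates = find_duplicate_links(conn)
```

Grouping on the (page, category) pair reports only the links that occur more than once, which is the set a developer would have to clean up by hand.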

jeluf wrote:

I removed the duplicate entry from the database.

zigger wrote:

*** Bug 1285 has been marked as a duplicate of this bug. ***

gal86 wrote:

Category "Логика" in ru.wikipedia.org
(http://ru.wikipedia.org/wiki/Category:%D0%9B%D0%BE%D0%B3%D0%B8%D0%BA%D0%B0)
contains 2 exact same entries "Парадокс"
(http://ru.wikipedia.org/wiki/%D0%9F%D0%B0%D1%80%D0%B0%D0%B4%D0%BE%D0%BA%D1%81).

Only one should be there.

gal86 wrote:

Category "X86" in ru.wikipedia.org
(http://ru.wikipedia.org/wiki/Category:X86)
contains 2 exact same entries "Am486 SX2"
(http://ru.wikipedia.org/wiki/Am486_SX2).
Only one should be there.

bugzilla_wikipedia_org.to.jamesd wrote:

Do not save again when you see the page not there after creating it. If you
didn't get a database error, the save worked. Wait 30 seconds, reload, and
you'll probably see the new page.

The database setup is being changed to use a unique index which will make it
impossible to create duplicates. You'll get a database error on the second and
later saves instead. Part of the process of creating that will remove all of the
existing duplicates.
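
The change described above can be sketched roughly as follows: remove the existing duplicates, then add a unique index so that a repeated save fails with a database error instead of creating a second row. This is only an illustration under stated assumptions: the table `categorylinks`, the columns `cl_from` and `cl_to`, and the index name `cl_from_to` are hypothetical names chosen for the example, and SQLite stands in for the production database.

```python
import sqlite3


def make_demo_db():
    """Create a table without a unique index and simulate a double save."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE categorylinks (cl_from INTEGER NOT NULL, cl_to TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [
            (1, "2003_albums"),
            (1, "2003_albums"),  # duplicate from the second save
            (1, "The_Chemical_Brothers_albums"),
            (1, "The_Chemical_Brothers_albums"),  # duplicate from the second save
        ],
    )
    return conn


def add_unique_index(conn):
    """Delete duplicate rows (keeping one copy of each), then enforce uniqueness."""
    conn.execute(
        """
        DELETE FROM categorylinks
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM categorylinks GROUP BY cl_from, cl_to
        )
        """
    )
    # Hypothetical index name; it must be created after the cleanup,
    # otherwise index creation itself fails on the existing duplicates.
    conn.execute("CREATE UNIQUE INDEX cl_from_to ON categorylinks (cl_from, cl_to)")
    conn.commit()


def try_insert(conn, page_id, category):
    """Return True if the link was stored, False if the unique index rejected it."""
    try:
        conn.execute("INSERT INTO categorylinks VALUES (?, ?)", (page_id, category))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_demo_db()
add_unique_index(conn)
row_count = conn.execute("SELECT COUNT(*) FROM categorylinks").fetchone()[0]
second_save = try_insert(conn, 1, "2003_albums")  # same page, same category
other_page = try_insert(conn, 2, "2003_albums")   # different page, same category
```

In this sketch the second save of the same page is rejected with an integrity error, while a different page can still be added to the same category.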

zigger wrote:

*** Bug 1320 has been marked as a duplicate of this bug. ***

*** Bug 2080 has been marked as a duplicate of this bug. ***

zigger wrote:

*** Bug 2179 has been marked as a duplicate of this bug. ***

gangleri wrote:

see http://bugzilla.wikimedia.org/show_bug.cgi?id=2382#c1
bug 2382: "existence of duplicate records as a result of bug 1202"
bug 2388: "handling: add a "purge" link to [[MediaWiki:Noarticletext]]"

1.5 quite firmly fixes this index during the conversion.
We've already fixed most of the Wikimedia sites manually as well.

Resolving FIXED again.

epriestley added a commit: Unknown Object (Diffusion Commit). Mar 4 2015, 8:21 AM