
Canonical URL on all content pages
Closed, Resolved (Public)

Description

Author: pioneeroverc

Description:
Right now, the canonical URL tag (rel=canonical) is output only on redirected pages. Since sometimes (e.g. when using the file cache) URLs are not redirected, it would be nice to have the canonical URL tag on all content pages, which wouldn't hurt anyway.


Version: 1.16.x
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=48402

Details

Reference
bz25882

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 11:17 PM)
bzimport set Reference to bz25882.
bzimport added a subscriber: Unknown Object (MLST).

A wiki I help to run had need for this, so I wrote a quick and dirty hook to add rel=canonical links to all normal page views:

$wgHooks['ArticleViewHeader'][] = 'addCanonicalLinks';

// Add a rel=canonical link to normal page views.
function addCanonicalLinks( &$article ) {
	global $wgRequest, $wgOut;

	// Skip old revisions and views requested with redirect=no.
	if ( $article->getOldID() || $wgRequest->getVal( 'redirect' ) == 'no' ) {
		return true;
	}

	// Do nothing if a canonical link is already present (e.g. on redirected pages).
	$tags = ( $wgOut->mLinktags ? $wgOut->mLinktags : array() );
	foreach ( $tags as $tag ) {
		if ( isset( $tag['rel'] ) && $tag['rel'] === 'canonical' ) {
			return true;
		}
	}

	$wgOut->addLink( array(
		'rel' => 'canonical',
		'href' => $article->getTitle()->getFullURL()
	) );
	return true;
}

Ideally, of course, we should have some way for extensions and core components to indicate which URL parameters actually affect the content of the page, so that correct canonical links could be automatically constructed for all pages and actions. But this simple workaround should handle 99% of all cases.
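As a rough illustration of that idea, here is a minimal standalone sketch of how a canonical URL could be built once the content-affecting parameters are known. It is not MediaWiki code: the function name buildCanonicalUrl, the parameter list, and the example.org URL are all assumptions made up for this example.

```php
<?php
// Hypothetical sketch: buildCanonicalUrl() and the parameter list below
// are illustrative assumptions, not part of MediaWiki.

/**
 * Build a canonical URL from a base URL, the request parameters, and the
 * names of the parameters declared to affect the page content.
 */
function buildCanonicalUrl( $baseUrl, array $params, array $contentParams ) {
	// Keep only the parameters that change what the page shows.
	$kept = array_intersect_key( $params, array_flip( $contentParams ) );

	// Sort by name so the same parameter set always yields the same URL.
	ksort( $kept );

	if ( !$kept ) {
		return $baseUrl;
	}
	return $baseUrl . '?' . http_build_query( $kept );
}

// Assumed for the example: 'oldid' affects content, 'useskin' and 'debug' do not.
echo buildCanonicalUrl(
	'http://example.org/My_page',
	array( 'useskin' => 'vector', 'oldid' => '123', 'debug' => '1' ),
	array( 'oldid', 'action' )
), "\n";
```

In a real implementation the list of content-affecting parameters would presumably have to be registered by core and by each extension, which is exactly the piece that is missing today.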

*** Bug 26116 has been marked as a duplicate of this bug. ***

pioneeroverc wrote:

Any progress on this? Is anyone willing to get the hack above (or another one) into MediaWiki's main code?

I don't understand the use case. What do you mean by

> Since sometimes (i.e. using file cache) URLs are not redirected

(Since we don't use HTTP redirects for wiki redirects, URLs are never redirected, if you're saying what I think you are.)

pioneeroverc wrote:

(In reply to comment #4)

> I don't understand the usecase. What do you mean by
>
> > Since sometimes (i.e. using file cache) URLs are not redirected
>
> (Since we don't use http redirects for redirects, url's are never redirected if
> you're saying what i think you are).

Say I go to http://mywiki.com/my%20page. If I'm logged in, I'm redirected to http://mywiki.com/My_page. However, if I'm not logged in and the file cache is enabled, visiting that URL does not redirect me, even though the page shows the content of what is technically My_page. I have a non-standard configuration (nginx with short URLs like mywiki.com/Page), and I don't know whether the same happens with Apache and/or short URLs like mywiki.com/wiki/Page. Either way, enabling canonical URLs on all pages would definitely get rid of the SEO and indexing problems, possibly in other cases too, such as index.php?title=Page being used instead of /Page, and surely others I can't think of right now.
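To make the example concrete, here is a small standalone sketch of how the requested form my%20page maps to the same canonical link as My_page. It is a simplified approximation, not MediaWiki's actual title normalization: the function names are made up, and the first-letter capitalization assumes an ASCII title on a wiki that capitalizes titles.

```php
<?php
// Hypothetical helpers for illustration only; MediaWiki's real title
// handling is considerably more involved.

/** Turn a raw URL path segment into a normalized page title. */
function normalizeTitleForUrl( $rawPath ) {
	$title = rawurldecode( $rawPath );          // 'my%20page' becomes 'my page'
	$title = str_replace( ' ', '_', $title );   // spaces become underscores
	$title = preg_replace( '/_+/', '_', trim( $title, '_' ) );
	return ucfirst( $title );                   // ASCII-only capitalization
}

/** Build the link tag that every variant of the URL would share. */
function canonicalLinkTag( $baseUrl, $rawPath ) {
	$href = $baseUrl . normalizeTitleForUrl( $rawPath );
	return '<link rel="canonical" href="' . htmlspecialchars( $href ) . '" />';
}

// Both requests produce the same tag, pointing at http://mywiki.com/My_page.
echo canonicalLinkTag( 'http://mywiki.com/', 'my%20page' ), "\n";
echo canonicalLinkTag( 'http://mywiki.com/', 'My_page' ), "\n";
```

With a tag like this on the cached page, a search engine that fetched /my%20page would still be told that /My_page is the URL to index.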

Bawolff, do you understand this now? Feel like taking a stab at it?

I understand the request, but I don't know enough about how search engines use canonical links to know whether a canonical link pointing to the current page would have an adverse effect. One would assume it wouldn't, but I don't really know.

Nemo_bis assigned this task to tstarling.
Nemo_bis set Security to None.