
Cannot use the API from PHP
Closed, Declined (Public)

Description

Some API links work from a browser, such as the following:

http://he.wikisource.org/w/api.php?action=parse&text=%7B%7B%F6%E9%E8%E5%E8%E9%ED+%F9%EC+%EE%F7%F8%E0%E5%FA+%E2%E3%E5%EC%E5%FA+%F2%EC+%F4%F1%E5%F7%7C%E0%E9%E5%E1%7C%EE%E1%7C-%7C%E9%7C-%7C%7D%7D&redirects=1&format=xml

However, when I try to fetch them from a PHP script, I get the error "HTTP/1.0 403 Forbidden", for example with this program:

<?php
print file_get_contents("http://he.wikisource.org/w/api.php?action=parse&text=%7B%7B%F6%E9%E8%E5%E8%E9%ED+%F9%EC+%EE%F7%F8%E0%E5%FA+%E2%E3%E5%EC%E5%FA+%F2%EC+%F4%F1%E5%F7%7C%E0%E9%E5%E1%7C%EE%E1%7C-%7C%E9%7C-%7C%7D%7D&redirects=1&format=xml");
?>

It worked until about a week or two ago.


Version: unspecified
Severity: major

Details

Reference
bz22561

Event Timeline

bzimport raised the priority of this task to Lowest. (Nov 21 2014, 11:01 PM)
bzimport set Reference to bz22561.

Try: ini_set( 'user_agent', 'Erel\'s bot' );
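For context, here is a minimal sketch of where that call could go. It assumes the 403 comes from the request carrying no User-Agent header (PHP's `user_agent` setting is empty by default). The helper name `set_bot_user_agent` is hypothetical and not part of the suggestion above; it checks the result with `ini_get` because some hosts disable `ini_set`.

```php
<?php
// Hypothetical helper: sets the User-Agent used by PHP's HTTP stream wrapper.
// Returns true only if the setting actually took effect.
function set_bot_user_agent(string $agent): bool {
    // On hosts that disable ini_set(), function_exists() reports false.
    if (function_exists('ini_set')) {
        @ini_set('user_agent', $agent);
    }
    // Verify by reading the value back rather than trusting the return value.
    return ini_get('user_agent') === $agent;
}

if (set_bot_user_agent("Erel's bot")) {
    // From here on, file_get_contents() on http:// URLs sends this User-Agent.
    // $xml = file_get_contents($url); // $url: the API link from the description
}
```

If the function returns false, the setting could not be changed and another way of sending the header is needed.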

Thank you, it works, but only on servers where ini_set is enabled.

I found a solution that works on all servers, including shared hosting, where ini_set is usually disabled:

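// Fetch $url via cURL with an explicit User-Agent header.
// Returns the response body as a string, or false on failure.
function get_url_with_agent($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPGET, true);          // plain GET request
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
    curl_setopt($ch, CURLOPT_USERAGENT, 'Erel Bot');  // sent as the User-Agent header
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}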

Hope it helps someone.
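For hosts where the cURL extension is also unavailable, a possible alternative is to pass a stream context to file_get_contents. This is only a sketch, assuming `allow_url_fopen` is enabled; the helper names `make_agent_context` and `get_url_with_context` are hypothetical and do not appear elsewhere in this thread.

```php
<?php
// Hypothetical helper: builds a stream context that carries a User-Agent.
function make_agent_context(string $agent) {
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $agent, // sent as the User-Agent request header
        ],
    ]);
}

// Hypothetical helper: fetches $url using the context above, without cURL.
// Returns the response body as a string, or false on failure.
function get_url_with_context(string $url, string $agent = 'Erel Bot') {
    return file_get_contents($url, false, make_agent_context($agent));
}
```

Usage would look like `print get_url_with_context($url);`, where `$url` is the API link from the description. Because the User-Agent is set per request, this does not touch any ini settings.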