Add Thailand in Thai to the monuments database
Closed, Resolved (Public)

Description

Thailand needs to be added to the monuments database.

It appears that the lists of monuments are available on the Thai Wikipedia, but not in a bot-friendly format (yet). There is some good-looking documentation gathered on the country's page on Commons though — so we'll just need to wait a bit until the lists are converted into the Proper Format[TM].

For a list of ISO 3166 codes, see [2] (yes, the country is indeed divided into 76 provinces and one special administrative unit for the capital, Bangkok.)

References


Version: unspecified
Severity: normal
URL: https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Monuments_2013_in_Thailand/Eligible_sites

Details

Reference
bz49049

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:41 AM
bzimport set Reference to bz49049.

KaewWiki wrote:

We have a first draft of a bot-friendly list and a few additional details:

  • project - wikipedia
  • lang - th
  • headerTemplate - หัวโบราณสถาน
  • rowTemplate - แถวโบราณสถาน

We already have a lot of sources in the monuments database (see
https://commons.wikimedia.org/wiki/Commons:Monuments_database/Statistics). This
is a tracker bug to keep track of what more to add. Bugs should contain the
source wikipedia language and the country (or region).

Useful information to include:

  • project - wikipedia
  • lang - language code of the Wikipedia
  • headerTemplate - the name of the header template
  • rowTemplate - the name of the row template
  • commonsTemplate - template to track images at Commons
  • commonsTrackerCategory - the tracker category at Commons (blala with known IDs)

  • commonsCategoryBase - the base of the category tree at Commons for all images
  • autoGeocode - To automagically geocode the images at Commons (be careful!)
  • unusedImagesPage - Page to put the list of unused images.
  • imagesWithoutIdPage - Page to put the list of images without an id.
  • registrantUrlBase - The url which can be combined with the id to get more info

  • namespaces - What namespaces are the lists in (0, please!)
  • table - The name of the table, convention monuments_<country>_(<lang>)
  • truncate - False if you don't have a real primkey
  • primkey - What is the primkey? Can be one or more fields. Should be unique, strong etc.
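As an illustration of the fields listed above, here is a minimal Python sketch of what a source entry for the Thai lists could look like. The field names and template names come from this thread; the helper functions, the dict layout, and the lowercase country code "th" are assumptions made for the example, not the harvester's actual code.

```python
# Hypothetical sketch: key names follow the list above, everything else
# (helpers, dict layout, the "th" country code) is assumed for illustration.
REQUIRED_KEYS = {"project", "lang", "headerTemplate", "rowTemplate", "table"}


def table_name(country, lang):
    """Build a table name following the monuments_<country>_(<lang>) convention."""
    return "monuments_{}_({})".format(country, lang)


def missing_keys(config):
    """Return the required keys that a source configuration lacks, sorted."""
    return sorted(REQUIRED_KEYS - set(config))


thailand = {
    "project": "wikipedia",
    "lang": "th",
    "headerTemplate": "หัวโบราณสถาน",
    "rowTemplate": "แถวโบราณสถาน",
    "namespaces": [0],  # lists live in the main namespace
    "table": table_name("th", "th"),
}

print(thailand["table"])       # monuments_th_(th)
print(missing_keys(thailand))  # []
```

A check like missing_keys makes it easy to see which details a new country request still has to supply before it can be configured.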

Working on it. If you look at https://th.wikipedia.org/w/index.php?title=%E0%B8%A3%E0%B8%B2%E0%B8%A2%E0%B8%8A%E0%B8%B7%E0%B9%88%E0%B8%AD%E0%B9%82%E0%B8%9A%E0%B8%A3%E0%B8%B2%E0%B8%93%E0%B8%AA%E0%B8%96%E0%B8%B2%E0%B8%99%E0%B9%83%E0%B8%99%E0%B8%88%E0%B8%B1%E0%B8%87%E0%B8%AB%E0%B8%A7%E0%B8%B1%E0%B8%94%E0%B8%A2%E0%B8%B0%E0%B8%A5%E0%B8%B2&action=edit you'll see an unnamed parameter "yala" in the header template. This should be a named parameter and a second one should be added for the iso code, see https://en.wikipedia.org/wiki/ISO_3166-2:TH

Then we get

  • Country (Thailand)
  • Province ----- need the local translation for that
  • District
  • Tambon

Other fixme's
'unusedImagesPage' : u'Wikipedia:Wiki Loves Monuments/Unused images of cultural heritage monuments in Thailand', #FIXME: Translate
'imagesWithoutIdPage' : u'Wikipedia:Wiki Loves Monuments/Images of cultural heritage monuments in Thailand without ID', #FIXME: Translate

Can't use the url in the current format. Can you change it so it's only the url and not [] around it with text?

KaewWiki wrote:

Sorry, I didn't see this before because of my email setting. I have changed my setting now and I should see all future notifications of bugzilla.

(In reply to comment #4)

you'll see an unnamed parameter "yala" in the header template. This should be a named parameter and a second one should be added for the iso code, see https://en.wikipedia.org/wiki/ISO_3166-2:TH

  1. My bot is changing the unnamed parameter to a proper one called จังหวัด (Changwat).
  2. The second parameter can be added manually; this needs some time.

  • Province ----- need the local translation for that

Changwat

Other fixme's
'unusedImagesPage' : u'Wikipedia:Wiki Loves Monuments/Unused images of cultural heritage monuments in Thailand', #FIXME: Translate

วิกิพีเดีย:Wiki Loves Monuments/ภาพโบราณสถานในประเทศไทยที่ไม่ได้ใช้

'imagesWithoutIdPage' : u'Wikipedia:Wiki Loves Monuments/Images of cultural heritage monuments in Thailand without ID', #FIXME: Translate

วิกิพีเดีย:Wiki Loves Monuments/ภาพโบราณสถานในประเทศไทยที่ไม่มีรหัส

Can't use the url in the current format. Can you change it so it's only the url and not [] around it with text?

I don't understand this. Please clarify.

Ok, great. I'll update the harvester to grab the province. Can you already make up a parameter name for the province iso code? That way I can already configure it and it will just work when you add it.

I'll update the reporting pages.

About the url, it's now:

ลิงก์ = [http://www.gis.finearts.go.th/fad50/fad/display_data.aspx?id=0005308 กรมศิลปากร]

It should be just the bare url, without the surrounding [] and link text, for the bot to be able to parse it. If the link always goes to the same url, you might want to make it something like:

ลิงก์ = 0005308

And have the template do [http://www.gis.finearts.go.th/fad50/fad/display_data.aspx?id={{{ลิงก์ |}}} กรมศิลปากร]
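To make the parsing problem concrete, here is a small Python sketch of how a bot could reduce the link field to a bare ID and rebuild the full address from a base url plus ID. The base url is taken from the example link above; the function names and the regular expression are assumptions for illustration, not the harvester's real implementation.

```python
import re

# Assumed base URL, copied from the example link in this thread.
REGISTRANT_URL_BASE = "http://www.gis.finearts.go.th/fad50/fad/display_data.aspx?id="

# Matches a bracketed external link such as [http://example.org/?id=123 label].
BRACKETED_LINK = re.compile(r"\[(?P<url>https?://\S+)(?:\s+[^\]]*)?\]")


def extract_id(value):
    """Return the registry ID from a bare ID, a bare URL, or a bracketed link.

    Returns None when the value does not resolve to a numeric ID under the
    assumed base URL.
    """
    value = value.strip()
    match = BRACKETED_LINK.fullmatch(value)
    url = match.group("url") if match else value
    if url.startswith(REGISTRANT_URL_BASE):
        url = url[len(REGISTRANT_URL_BASE):]
    return url if url.isdigit() else None


def registrant_url(monument_id):
    """Combine the base URL with an ID, as registrantUrlBase is meant to be used."""
    return REGISTRANT_URL_BASE + monument_id


current = "[http://www.gis.finearts.go.th/fad50/fad/display_data.aspx?id=0005308 กรมศิลปากร]"
print(extract_id(current))        # 0005308
print(registrant_url("0005308"))
```

Storing only the ID keeps the list rows short and lets the template and the database each build the link in their own way.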

KaewWiki wrote:

(In reply to comment #7)

Ok, great. I'll update the harvester to grab the province. Can you already make up a parameter name for the province iso code? That way I can already configure it and it will just work when you add it.

I am adding

code=TH-xx

manually to the 80+ list pages. It will be finished in an hour or so...
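Since the codes are being entered by hand, a quick sanity check over the header templates could catch pages where the placeholder was left in. The sketch below is only an illustration: the parameter names จังหวัด and code come from this thread, while the parser, the two-digit code pattern, and the sample header line are assumptions. Check the actual codes against ISO 3166-2:TH.

```python
import re

# Assumption: province codes look like "TH-" followed by two digits.
PROVINCE_CODE = re.compile(r"TH-\d{2}")


def header_params(wikitext):
    """Split a one-line {{template|key=value|...}} call into named parameters.

    Simplified on purpose: nested templates and links containing "|" are
    not handled.
    """
    inner = wikitext.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    parts = inner[2:-2].split("|")
    params = {}
    for part in parts[1:]:  # parts[0] is the template name
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def has_valid_code(wikitext):
    """True when the header carries a code parameter matching the assumed pattern."""
    code = header_params(wikitext).get("code", "")
    return PROVINCE_CODE.fullmatch(code) is not None


# Made-up header lines; the province name and code value are illustrative only.
done = "{{หัวโบราณสถาน|จังหวัด=ยะลา|code=TH-95}}"
todo = "{{หัวโบราณสถาน|จังหวัด=ยะลา|code=TH-xx}}"
print(has_valid_code(done))  # True
print(has_valid_code(todo))  # False
```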

Ok. Everything seems to work. That just leaves the url thing.

KaewWiki wrote:

The TH-xx codes have been added to all pages manually.

for the bot to be able to parse it. If the link always goes to the same url, you might want to make it something like:

ลิงก์ = 0005308

And have the template do [http://www.gis.finearts.go.th/fad50/fad/display_data.aspx?id={{{ลิงก์ |}}} กรมศิลปากร]

My bot is now fixing this. The field ลิงก์ (link) is intended to work for multiple links: one for the Fine Arts Department, one for the government gazette. So, my implementation is a bit different. We will find out soon whether it works or not...

Works. Remaining issues can be solved outside of this bug.