
Document-centric semantic wiki
Closed, Declined
Public

Description

I know this is likely to be a very tall order, but I am wondering whether the Semantic MediaWiki extension (or something like it), with all its querying abilities, could be adapted to allow nested semantic markup of text. This would simplify the creation of semantically rich XML documents such as TEI (see http://tei.oucs.ox.ac.uk/P5/Guidelines-web/en/html/REF-ELEMENTS.html for sample tags), e.g., by using MediaWiki-like syntax rather than XML itself, which requires closing tags. As lovers of wiki syntax well know, closing tags can sometimes aid readability, but they can also hamper it.
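To make the idea of nested, wiki-like markup a little more concrete, here is a minimal sketch in Python. The `{{name attr=value|content}}` syntax is purely hypothetical (invented for this illustration, not an existing MediaWiki or SMW feature), and `to_xml` is an assumed helper name. It only shows that such markup can nest and be expanded into TEI-style elements such as `date` and `hi`.

```python
from xml.sax.saxutils import escape, quoteattr


def to_xml(src):
    """Expand hypothetical {{name attr=value|content}} markup into XML."""
    xml, _ = _parse(src, 0, top=True)
    return xml


def _parse(src, pos, top):
    parts = []
    while pos < len(src):
        if src.startswith("{{", pos):
            # Header runs up to the first "|": element name, then attr=value pairs.
            bar = src.index("|", pos)
            head = src[pos + 2:bar].split()
            name, attrs = head[0], head[1:]
            # Content may itself contain nested {{...}} spans.
            inner, pos = _parse(src, bar + 1, top=False)
            attr_xml = "".join(
                f" {key}={quoteattr(value)}"
                for key, value in (a.split("=", 1) for a in attrs)
            )
            parts.append(f"<{name}{attr_xml}>{inner}</{name}>")
        elif src.startswith("}}", pos) and not top:
            # End of the span currently being parsed.
            return "".join(parts), pos + 2
        else:
            # Ordinary text: escape XML special characters.
            parts.append(escape(src[pos]))
            pos += 1
    if not top:
        raise ValueError("unclosed {{ span")
    return "".join(parts), pos


print(to_xml("On {{date when=1815-06-18|the {{hi rend=italic|18th}} of June}} we met."))
```

The closing `}}` is anonymous, so the editor never has to repeat the element name, which is the readability point made above. A real design would also need to handle attribute values containing spaces and literal braces.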

In this case, the "properties" would not necessarily be about the page per se, but about information within the article, relevant to its immediate context in the document or possibly to some external context.

For example, if plays, novels, Scriptures, etc. were added to such a wiki, any reference to, say, a date could be marked up as such, as could structural information (letter openings, closings, etc.) or annotations of meaning (irony, humor, etc.). All of this could then be made available to queries, possibly even allowing joins with other documents: for instance, searching for all passages mentioning a given date range, whether in the current document or in documents belonging to a given category or having a given property.
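As a rough illustration of the kind of query meant here, the following Python sketch finds all paragraphs that mention a date within a given range. The sample text, its dates, and the helper name `passages_in_range` are invented for illustration. The TEI namespace is omitted for brevity, and dates are assumed to be full ISO dates in the `when` attribute.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Invented TEI-like sample (no namespace, for brevity).
SAMPLE = """
<text>
  <p n="1"><opener>Dear friend,</opener> I arrived on <date when="1815-06-18">the 18th of June</date>.</p>
  <p n="2">We left again on <date when="1815-07-02">the 2nd of July</date>, <seg ana="irony">in splendid weather</seg>.</p>
  <p n="3">Nothing dated happens here.</p>
</text>
"""


def passages_in_range(xml_text, start, end):
    """Return (paragraph number, plain text) for paragraphs with a date in [start, end]."""
    root = ET.fromstring(xml_text)
    hits = []
    for p in root.iter("p"):
        for d in p.iter("date"):
            when = d.get("when")
            if when is None:
                continue  # date without a machine-readable value
            if start <= date.fromisoformat(when) <= end:
                # Flatten the markup and normalize whitespace.
                text = " ".join("".join(p.itertext()).split())
                hits.append((p.get("n"), text))
                break  # one hit per paragraph is enough
    return hits


results = passages_in_range(SAMPLE, date(1815, 6, 1), date(1815, 6, 30))
print(results)
```

Running the same function over several documents and concatenating the results would give a simple form of the cross-document search described above. Restricting by category or property would need metadata that this sketch does not model.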

(And to the extent that security or performance is a concern with such repeatable and open-ended user queries, the extension might allow the user to add documents to, say, locally stored HTML5 IndexedDB collections, which the user could then query in an offline-friendly manner. jQuery or XQuery (if made available to JavaScript) could be offered by any HTML5 application granted access to these database collections, ideally including a built-in one at the wiki itself, so that one could query the generated XML output within a document or collection.)

While "semantic" tends to be associated with data-centric applications in standards-based, web-centric discussions, I am really eager to see it expand to document-centric data as well. When both are combined, one might be able to query a document in a rich way relative to its content while also accessing additional data, perhaps via references to a data-centric wiki like Wikipedia, in an integrated way.

Thank you.


Version: unspecified
Severity: enhancement

Details

Reference
bz26119

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 11:21 PM
bzimport set Reference to bz26119.

I might add that this could really expand the importance of a site like Wikisource, given that semantic markup really can benefit from a large base of users whose role expands from being proofreaders or formatters to potentially also being markup editors who enrich important documents.

Amen to that!

Since you posted this (2010!), there have been a couple of developments that you might be interested in:

Following a feature request for the eagerly awaited VisualEditor extension, James Forrester of Wikimedia dropped a note at bug 37934. See the response by Gabriel Wicke and http://www.mediawiki.org/wiki/Parsoid/RDFa_vocabulary (to which he refers).

A more ad hoc solution was designed for the Transcribe Bentham Project at UCL. It uses a number of custom extensions:

http://www.mediawiki.org/wiki/Extension:TEITags
http://www.mediawiki.org/wiki/Extension:JBTEIToolbar
http://www.mediawiki.org/wiki/Extension:JBZV

Although this stuff has not been tested for later versions of MW, it shows at least that such things are possible.

I don't know what role SMW could play here. Perhaps the syntax required for a TEI document could be semi-automated using templates and Semantic Forms?

I'm not a developer nor am I even remotely familiar with the way that all this is supposed to work, but this information seemed to me worth passing on.

Aklapper subscribed.

The Semantic MediaWiki developers requested in https://phabricator.wikimedia.org/T64114 to move their task tracking to https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues and to close remaining tasks in Wikimedia Phabricator. If you still face the problem reported in this task in a supported version of SMW, please feel free to transfer your report to https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues . We are sorry for the inconvenience.