For individual pages, overrides can be determined from the __NOINDEX__ and __INDEX__ page properties, which appear in page_props.sql in lowercase and without the underscores (noindex and index).
However, the baseline robot policy for each namespace should also be dumped. For each namespace, this can be determined by starting with [[mw:Manual:$wgDefaultRobotPolicy]] and then overriding it with [[mw:Manual:$wgNamespaceRobotPolicies]]. For convenience, the dump should state the policy for every namespace, even the ones that simply inherit $wgDefaultRobotPolicy; this is not a significant storage cost, since there are generally not many namespaces.
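As a rough illustration of the resolution described above, here is a minimal Python sketch. The function names (parse_policy, baseline_policies) and the sample values are hypothetical, and it assumes that a per-namespace entry overrides only the directives it names (so "noindex" on top of "index,follow" gives "noindex,follow"); the actual merging behaviour should be checked against MediaWiki itself.

```python
def parse_policy(policy):
    """Split a policy string such as 'noindex,follow' into its directives."""
    directives = {}
    for part in policy.split(","):
        part = part.strip().lower()
        if part in ("index", "noindex"):
            directives["index"] = part
        elif part in ("follow", "nofollow"):
            directives["follow"] = part
    return directives


def baseline_policies(default_policy, namespace_policies, namespace_ids):
    """Return {namespace id: policy string} for every namespace id given.

    default_policy stands in for $wgDefaultRobotPolicy and
    namespace_policies for $wgNamespaceRobotPolicies (id -> policy string).
    Assumption: an override replaces only the directives it mentions.
    """
    # Fall back to 'index,follow' for any directive the default omits.
    base = {"index": "index", "follow": "follow"}
    base.update(parse_policy(default_policy))

    result = {}
    for ns_id in namespace_ids:
        merged = dict(base)
        merged.update(parse_policy(namespace_policies.get(ns_id, "")))
        result[ns_id] = merged["index"] + "," + merged["follow"]
    return result


# Hypothetical configuration: only namespace 1 has an override.
policies = baseline_policies("index,follow", {1: "noindex"}, [0, 1, 2])
```

Every namespace id passed in gets an explicit entry, including the ones that simply inherit the default.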
I think it would be simplest to just use robotpolicy="noindex,nofollow" (or whatever the actual policy is) on each <namespace> element, since that's the format used in the HTML output and the configuration variables.
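To make the proposed output concrete, here is a small Python sketch that sets a robotpolicy attribute on each <namespace> element. It is illustrative only: add_robot_policies is a hypothetical helper, the sample fragment is simplified (a real dump declares an XML namespace on the root element, which this ignores), and the policy values are made up.

```python
import xml.etree.ElementTree as ET


def add_robot_policies(namespaces_xml, policies):
    """Set robotpolicy="..." on each <namespace> element.

    namespaces_xml: a simplified <namespaces> fragment as a string.
    policies: {namespace id (int): policy string}; hypothetical input.
    """
    root = ET.fromstring(namespaces_xml)
    for element in root.iter("namespace"):
        ns_id = int(element.get("key"))
        if ns_id in policies:
            element.set("robotpolicy", policies[ns_id])
    return ET.tostring(root, encoding="unicode")


# Simplified sample; the attribute values are placeholders.
sample = (
    '<namespaces>'
    '<namespace key="0" case="first-letter" />'
    '<namespace key="1" case="first-letter">Talk</namespace>'
    '</namespaces>'
)
annotated = add_robot_policies(sample, {0: "index,follow", 1: "noindex,nofollow"})
```

With this shape, a consumer reads the baseline policy for a namespace from the same element that already carries its key and name.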
Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=58758