
Database upgrade MariaDB 10: Metadata access in INFORMATION_SCHEMA causes complete blocks
Closed, Resolved (Public)

Description

Author: metatron

Description:
When accessing INFORMATION_SCHEMA, the query hangs on random schemas and causes a widespread block for that server/listener.

  • This occurs especially when accessing INFORMATION_SCHEMA through listener s2/s4/s5.labsdb (192.168.99.12). No timeout.
  • This also occurs when accessing INFORMATION_SCHEMA through listener s1.labsdb (192.168.99.1), but here with a kill/timeout after 16 sec.

In both cases, kill clean-ups take up to 500 sec.

  • This does not occur when accessing INFORMATION_SCHEMA through listener s3.labsdb (192.168.99.3), which is still on the old DB.

Currently, access to INFORMATION_SCHEMA on the new servers is very slow. A simple query takes up to 3 sec (if it doesn't hang). As (console) clients do auto-rehashing on connect (unless it's turned off), they hang, too.

Maybe it's related to spinning rust, but that doesn't explain the difference between s1 and e.g. s5 - and it shouldn't hang at all, even if running slowly.
Is there something like innodb_stats_on_metadata enabled for TokuDB? Or are some schemas kind of broken?
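
For reference, checking whether such a setting is active should just be a matter of inspecting the server variables (a minimal sketch; the exact TokuDB variable names depend on the build):

-- Check the InnoDB metadata-stats setting and any TokuDB analogue
SHOW GLOBAL VARIABLES LIKE 'innodb_stats_on_metadata';
SHOW GLOBAL VARIABLES LIKE 'tokudb%';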

References:

http://tools.wmflabs.org/tools-info/misc/schema-block-hang.png
http://tools.wmflabs.org/tools-info/misc/schema-block-cleanup.png

Simple metadata query:
SELECT table_schema, data_length, index_length
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = '<some_schema>'


Version: unspecified
Severity: critical

Details

Reference
bz69182

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:39 AM
bzimport added a project: Cloud-VPS.
bzimport set Reference to bz69182.

labsdb1001 has now been switched onto SSD and labsdb1002 is in progress. Have also set tokudb_empty_scan=disabled [1], which addresses a large part of the problem.

Provided there is always a table_schema = '...' clause, the queries should be faster. If the clause is omitted, the query will take forever and trash the table cache. That would be nice to avoid.

If issuing many such queries, stagger the timing.

[1] http://www.mysqlperformanceblog.com/2014/07/09/tokudb-gotchas-slow-information_schema-tables/
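
To illustrate the difference (a minimal sketch, using 'enwiki_p' as a placeholder schema name):

-- Faster: the table_schema clause limits the scan to a single schema
SELECT table_schema, data_length, index_length
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'enwiki_p';

-- Avoid: without the clause every table on the server gets opened,
-- which trashes the global table cache
SELECT table_schema, data_length, index_length
FROM INFORMATION_SCHEMA.TABLES;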

metatron wrote:

While normal queries now perform great with SSD, there's still no cure for this problem. I removed all the fancy stuff like GROUP BY and SUM() and kept the query as plain as possible, fetching one schema after another.
This simple loop freezes the whole listener for about 200 sec. Just tried it again with s1 and s2.

$datalen  = 0;
$indexlen = 0;

// Sum data and index sizes per schema, one INFORMATION_SCHEMA query at a time.
foreach ( $schemata as $i => $schema ) {

    $queryString = "
      SELECT table_schema, data_length, index_length
      FROM INFORMATION_SCHEMA.TABLES
      WHERE table_schema = '$schema'
    ";

    if ( $result = $db->query($queryString) ) {
        while ( $row = $result->fetch_assoc() ) {
            $datalen  += $row["data_length"];
            $indexlen += $row["index_length"];
        }
    }
}

Setting tokudb_empty_scan=disabled looks to have improved the speed of TokuDB /opening/ tables.

Stack traces indicate the eventual lockup is due to TokuDB /closing/ tables slowly, which starts to occur en masse once the MariaDB table cache becomes full and one of these INFORMATION_SCHEMA queries is causing thousands of previously opened tables to be flushed to make room. This contention affects all threads because the table cache is global.

We could potentially increase the table cache further -- presently 10000 -- however there are issues with that approach, such as hundreds of thousands more file handles (multiple per table) and poor scalability for misses [1].

We could go the other way and decrease the table cache, which can sometimes, counterintuitively, improve performance. This would need to be attempted carefully.
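
For reference, the cache size and its current churn can be checked with standard MariaDB commands (a sketch; the resize value shown is illustrative only):

-- Current configured size of the table cache
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
-- Currently open tables vs. total ever opened (a fast-growing Opened_tables means churn)
SHOW GLOBAL STATUS LIKE 'Open%tables';
-- A resize, to be tested carefully before applying in production:
-- SET GLOBAL table_open_cache = 8000;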

In the meantime:

a) Have you tried the above loop with a specific delay between queries? There may be a sweet spot at which things can keep up.

b) Try only querying table_schema on the proper <schema>.labsdb host names. Those are more likely to already have a stable table cache for their schemas, and/or reduce the overall impact if in combination with (a).

c) A static version of INFORMATION_SCHEMA.TABLES would be better, say re-materialized daily as information_schema_p.tables (a rough sketch follows below). Will look at generating this upstream in the prod>labs replicas where table cache contention is much less.

[1] http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/ (though we should retest this with MariaDB 10)
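
A very rough sketch of what the re-materialization in (c) might look like on the prod>labs replicas (the database/table names follow the comment above; the column list and the swap mechanics are assumptions):

CREATE DATABASE IF NOT EXISTS information_schema_p;

-- Build a fresh static copy next to the live one
DROP TABLE IF EXISTS information_schema_p.tables_new;
CREATE TABLE information_schema_p.tables_new AS
SELECT table_schema, table_name, engine, table_rows, data_length, index_length
FROM INFORMATION_SCHEMA.TABLES;

-- Swap the fresh copy into place
DROP TABLE IF EXISTS information_schema_p.tables;
RENAME TABLE information_schema_p.tables_new TO information_schema_p.tables;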

This certainly sounds like it's confirmed...

As an interim measure, the MariaDB 10 labsdb instances now have "information_schema_p":

+--------------------------------+
| Tables_in_information_schema_p |
+--------------------------------+
| schemata                       |
| tables                         |
+--------------------------------+

Those are updated hourly via replication. Depending on how well s3 behaves, it might become daily. They contain only the wikis for now; CentralAuth stats will appear soon.
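
For reference, the earlier per-schema query can then be pointed at the static copy instead of the live INFORMATION_SCHEMA (assuming the column names mirror INFORMATION_SCHEMA.TABLES; 'enwiki_p' is a placeholder):

SELECT table_schema, data_length, index_length
FROM information_schema_p.tables
WHERE table_schema = 'enwiki_p';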

Do you need more than those two tables from information_schema?

metatron wrote:

Great, thanks for this interim solution. A daily update is OK (for me), as the tools-info tool already caches results for 68000 sec anyway.
Still missing are all the user databases uxxxxx, sxxxx, pxxxgyyy - one of the main use cases for this kind of information. It would be great to have them integrated, too.

chasemp subscribed.

^This is the reason for the ban on information_schema that you asked about.

Yeah, thanks @jcrespo. Moving this subscription to my other account.

Edit: which is actually already subscribed

Marostegui assigned this task to Springle.

I am going to close this as resolved with the creation of information_schema_p and its current content (tables and schemata).
The only part apparently still pending was:

metatron wrote:
Still missing are all the user databases uxxxxx, sxxxx, pxxxgyyy - one of the main use cases for this kind of information. It would be great to have them integrated, too.

And we don't have any other databases apart from wikis on the new labs hosts.