
hhvm Jenkins job fills up /tmp with perf-*.map files
Closed, Declined
Public

Description

The Jenkins job running MediaWiki core unit tests under hhvm produces files such as /tmp/perf-\d+.map, which are filling up /tmp on the Jenkins slaves.

Looking at hhvm source code, the path is hardcoded:

$ git grep /tmp.*map
hphp/runtime/vm/debug/debug.cpp: "/tmp/perf-%d.map", getpid());
hphp/runtime/vm/debug/debug.h: * Stuff to output symbol names to /tmp/perf-%d.map files. This stuff
hphp/util/stack-trace.cpp: "/tmp/perf-%d.map", getpid());
$

We would need a configuration setting to either disable the performance map or have them written to a configurable directory.


Version: wmf-deployment
Severity: normal

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:59 AM
bzimport set Reference to bz62788.
bzimport added a subscriber: Unknown Object (MLST).

Alternatively, or in addition, can we have /tmp be part of /mnt (part of instance storage, which we control) or /tmpfs (part of memory, which we control) instead of the fixed 9GB / disk, which we don't seem to have any control over?

Either way, it should be garbage collected.
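
A minimal sketch of what such garbage collection could look like, assuming it is acceptable to delete any perf map that has not been modified for a while. The function name `gc_perf_maps`, its arguments and the 60 minute default are made up for illustration and are not part of the Jenkins setup; `-maxdepth` and `-mmin` need GNU or BSD find.

```shell
# Hypothetical helper: delete perf-<pid>.map files directly under a
# directory that were last modified more than a given number of minutes ago.
# Usage: gc_perf_maps [dir] [max_age_minutes]
gc_perf_maps() {
    # -maxdepth 1 keeps find out of subdirectories.
    # -mmin +N matches files modified more than N minutes ago.
    find "${1:-/tmp}" -maxdepth 1 -type f -name 'perf-*.map' \
        -mmin "+${2:-60}" -exec rm -f {} +
}
```

Called as `gc_perf_maps /tmp 120` from a cron job or a post-build step, this would keep maps from the last two hours, so a job that is still running is unlikely to lose its file.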

Or maybe even disable the perf maps entirely; do we actually use them for anything? It seems to be a debug feature that is probably not enabled by default (it wouldn't make sense in production to produce 30M files for every PHP process...). We can't use them as long as they're not tied to a build anyway, so right now they're just useless afaics.

Apparently the file is supposed to be deleted by hhvm; it might not clean up when it crashes, though.
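
If the files are only left behind when hhvm exits without cleaning up, the leftovers should be recognisable by their name: the number in perf-<pid>.map would no longer belong to a running process. A sketch under that assumption; `list_orphan_perf_maps` is a hypothetical name, and because PIDs get reused a match is a strong hint rather than proof.

```shell
# Hypothetical helper: print perf-<pid>.map files whose <pid> does not
# belong to a running process (probably left over by a crash).
# Usage: list_orphan_perf_maps [dir]
list_orphan_perf_maps() {
    for f in "${1:-/tmp}"/perf-*.map; do
        [ -e "$f" ] || continue          # the glob matched nothing
        pid="${f##*/perf-}"              # strip directory and "perf-" prefix
        pid="${pid%.map}"                # strip ".map" suffix
        case "$pid" in
            ''|*[!0-9]*) continue ;;     # not a plain number, skip it
        esac
        # kill -0 sends no signal, it only checks that the PID exists.
        # It also fails for processes owned by another user, so run this
        # as root or as the user that runs hhvm.
        kill -0 "$pid" 2>/dev/null || printf '%s\n' "$f"
    done
}
```

The output is one path per line, so it can be reviewed first and only then piped to `xargs rm -f`.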

The hardcoded settings are:

hphp/runtime/base/runtime-option.h: F(bool, PerfPidMap, true)
hphp/runtime/base/runtime-option.h: F(bool, KeepPerfPidMap, false)

That should get rid of the file as per hphp/runtime/vm/debug/debug.cpp:

if (!RuntimeOption::EvalKeepPerfPidMap) {
  unlink(m_perfMapName);
}

Maybe we can just disable the debugger entirely.

We can't disable perf maps because they're required for stack trace translation, so we just need to GC then.

(In reply to Max Semenik from comment #3)
> We can't disable perf maps because they're required for stack trace
> translation, so we just need to GC then.

I don't think it is acceptable to have up to 5MB generated every time we throw an exception.

hphp/runtime/base/runtime-option.h: F(bool, PerfPidMap, true)

Suggests that it can be disabled entirely though that is hardcoded.

Sure, but it's either that or stacktraces.

Pinged Ori about this bug so it can be discussed during the weekly hiphop check-in.

We don't need the stacktraces for Jenkins, so just add the following to /etc/config.hdf:

Eval {
  PerfPidMap=false
}

Or invoke HHVM with hhvm -vEval.PerfPidMap=false
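
If the job calls hhvm through a shell wrapper, the command-line variant could be sketched as below. The function name `hhvm_no_perfmap` and the `HHVM_BIN` override are assumptions for illustration; only the `-vEval.PerfPidMap=false` option comes from the comment above.

```shell
# Hypothetical wrapper: run hhvm with perf map generation disabled and
# pass every other argument through unchanged.
# HHVM_BIN is an illustrative override for the path to the binary.
hhvm_no_perfmap() {
    "${HHVM_BIN:-hhvm}" -vEval.PerfPidMap=false "$@"
}
```

A job would then run `hhvm_no_perfmap tests/phpunit/phpunit.php` instead of calling hhvm directly, which keeps the option in one place.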

Change 135557 had a related patch set uploaded by Hashar:
HHVM disable stacktraces

https://gerrit.wikimedia.org/r/135557

Change 135557 merged by jenkins-bot:
HHVM disable stacktraces

https://gerrit.wikimedia.org/r/135557

Thanks Ori. I have adjusted the shell wrapper to pass -vEval.PerfPidMap=false as suggested. Cleaned up /tmp on both integration-slave1001.eqiad.wmflabs and integration-slave1002.eqiad.wmflabs.

We will see what happens :)

Looks like this is back. Over 2000 files in /tmp on integration-slave1008. Oldest one is from Feb 26 04:58.

Change 195035 had a related patch set uploaded (by Hashar):
contint: disable hhvm stacktraces / map

https://gerrit.wikimedia.org/r/195035

I have poked Giuseppe and Ori by private email to attract review of the puppet patch https://gerrit.wikimedia.org/r/195035

Change 195035 abandoned by Hashar:
contint: disable hhvm stacktraces / map

Reason:
Turns out it is already set by the puppet hhvm module:

$ grep hhvm.perf_pid_map /etc/hhvm/php.ini
hhvm.perf_pid_map = false

The /tmp/perf-*.map files are still created, though they remain empty.

Abandoning change, will follow up on task.

https://gerrit.wikimedia.org/r/195035

The HHVM puppet module does default the cli configuration file to disable the perf_pid_map:

integration-slave1004$ grep hhvm.perf_pid_map /etc/hhvm/php.ini
hhvm.perf_pid_map = false

The /tmp/perf-*.map files are still created though, albeit with zero size, which is an improvement.
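
To check how many of the files on a slave are empty and how many still contain data, something like the following could be used. This is only a sketch; `count_perf_maps` is a hypothetical helper name.

```shell
# Hypothetical helper: report how many perf-*.map files directly under a
# directory are empty and how many contain data.
# Usage: count_perf_maps [dir]
count_perf_maps() {
    dir="${1:-/tmp}"
    # -size 0 matches files that occupy zero 512-byte blocks, i.e. empty files.
    empty=$(find "$dir" -maxdepth 1 -type f -name 'perf-*.map' -size 0 | wc -l)
    nonempty=$(find "$dir" -maxdepth 1 -type f -name 'perf-*.map' ! -size 0 | wc -l)
    # $((...)) strips the padding that some wc implementations add.
    echo "empty=$((empty)) nonempty=$((nonempty))"
}
```

A non-zero `nonempty` count would suggest that some hhvm invocation is not picking up the perf_pid_map setting.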

$ hhvm --version
HipHop VM 3.3.1 (rel)
Compiler: heads/master-0-g3efe530cd55a886d88b29acc014dde14c4cff755
Repo schema: b538518f4338eecd9dc2ff13865fd306926fde1b
Extension API: 20140829
$ apt-cache policy hhvm
hhvm:
  Installed: 3.3.1+dfsg1-1+wm3.1
  Candidate: 3.3.1+dfsg1-1+wm3.1
  Version table:
 *** 3.3.1+dfsg1-1+wm3.1 0
       1001 http://apt.wikimedia.org/wikimedia/ trusty-wikimedia/main amd64 Packages
        100 /var/lib/dpkg/status

hashar claimed this task.

Won't-fixing this for now. There are not many disk space issues any more, and the issue will definitely be solved once we migrate the jobs to Nodepool instances.