From time to time a Jenkins job will fail even though all tests pass. The console log does not contain any information on why the job failed.
Example:
...
Build step 'Execute shell' marked build as failure
...
Version: unspecified
Severity: normal
zeljkofilipin | Jan 14 2014, 12:56 PM
F12898: TEST-features-font_selection_default_enabled.xml | Nov 22 2014, 2:53 AM
F12896: file_60037.txt | Nov 22 2014, 2:53 AM
Status | Subtype | Assigned | Task
---|---|---|---
Invalid | | None | T62338 Investigate how often browser tests fail because of Jenkins performance problems (tracking)
Resolved | | None | T62037 Jenkins job fails when all tests pass
Cloudbees support suggested adding --backtrace to "bundle exec cucumber":
https://gerrit.wikimedia.org/r/#/c/106260
It did not help: no additional information was displayed in the Jenkins console log.
The next advice was to add this:
|| echo "Failure in cucumber"
to the end of every "bundle exec cucumber":
https://gerrit.wikimedia.org/r/#/c/107164/
It did not help. It actually made things worse. When a test failed, Jenkins marked the job as unstable instead of failed. As a result, no e-mail notifications were sent.
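A minimal sketch of why this happens, assuming the echo was attached with `||`: the status of the whole line becomes the status of `echo`, so the "Execute shell" step no longer sees a failure. `failing_step` is a hypothetical stand-in for a failing "bundle exec cucumber" run.

```shell
# Hypothetical stand-in for a failing "bundle exec cucumber" run.
failing_step() { return 1; }

# The echo runs because failing_step failed, and echo itself succeeds.
failing_step || echo "Failure in cucumber"
masked_status=$?   # status of echo, not of failing_step

echo "status seen by the build step: $masked_status"
```

With the shell step reporting success, the failed tests are presumably only picked up later from the test report, which would explain the "unstable" result.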
Change 107363 had a related patch set uploaded by Zfilipin:
Send e-mail for every unstable Jenkins job
Change 108493 had a related patch set uploaded by Zfilipin:
removed debugging code from Jenkins jobs
Created attachment 14398
diff of a successful and a failing console
Diff of two consoles:
https://wmf.ci.cloudbees.com/job/browsertests-en.wikipedia.beta.wmflabs.org-linux-chrome/618/consoleText
https://wmf.ci.cloudbees.com/job/browsertests-en.wikipedia.beta.wmflabs.org-linux-chrome/619/consoleText
Cloudbees support said this will return shell error code:
do_something_that_may_fail || (echo failed with $?; false)
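A small sketch of how that one-liner behaves, under the assumption that the command exits with status 3. The function body is a hypothetical stand-in for the real command.

```shell
# Hypothetical stand-in for the real command; exits with status 3.
do_something_that_may_fail() { return 3; }

# Inside the subshell, $? still holds the status of the failed command,
# and the trailing `false` makes the subshell itself fail.
do_something_that_may_fail || (echo "failed with $?"; false)
step_status=$?   # status of `false`, not the original 3

echo "step status: $step_status"
```

The message reports the original status, but the line as a whole fails with the status of `false`. If the exact code matters to later steps, it has to be saved in a variable first.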
You might want to run cucumber with some increased verbosity. In its current mode it might be hiding a stacktrace and exit 1 without any message.
Looking at the failing job:
https://wmf.ci.cloudbees.com/job/browsertests-en.wikipedia.beta.wmflabs.org-linux-chrome/619/
The console does not show anything relevant besides the job failing:
https://wmf.ci.cloudbees.com/job/browsertests-en.wikipedia.beta.wmflabs.org-linux-chrome/619/console
Looking at the test report we get far more detail:
With a nice stack trace and more detail for each test:
config/cucumber.yml of qa/browsertests.git has:
ci: --format Cucumber::Formatter::Sauce --out reports/junit
That defines a profile named 'ci' whose options are passed to cucumber. With that configuration, cucumber writes its output in the Sauce format to a file (reports/junit) and prints nothing to the console.
It might be possible to add a second formatter and have it write to stdout, which would then show up on the console.
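A sketch of what such a profile line could look like. It assumes Cucumber's built-in `pretty` formatter can run alongside the Sauce formatter, which has not been checked against this repository. The `CI_PROFILE` variable is only there for illustration.

```shell
# Assumption: a --format with no --out after it writes to stdout, and
# --out applies to the formatter named immediately before it.
CI_PROFILE='--format pretty --format Cucumber::Formatter::Sauce --out reports/junit'

# Print the line as it would appear in config/cucumber.yml.
printf 'ci: %s\n' "$CI_PROFILE"
```

With this ordering, `pretty` would go to the console while the Sauce formatter keeps writing to reports/junit.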
Another possibility is to intercept the cucumber exit code, echo a clear message saying the job has failures, and craft a URL pointing to the test report. Something like:
bundle exec cucumber --backtrace --verbose --profile ci --tags @en.wikipedia.beta.wmflabs.org || (echo -e "\nJob has failed (exit code: $?).\nSee test report at $BUILD_URL/testReport/\n"; false)
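One way to print the message while still returning the original exit code is a small wrapper function. This is only a sketch: `run_and_report` is a hypothetical helper, not something from the job configuration, and it assumes `BUILD_URL` ends with a slash, as Jenkins normally sets it.

```shell
# run_and_report is a hypothetical helper: it runs the given command,
# prints a pointer to the test report on failure, and returns the
# command's own exit status.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        printf '\nJob has failed (exit code: %s).\n' "$status"
        # Assumption: BUILD_URL is set by Jenkins and ends with a slash.
        printf 'See test report at %stestReport/\n' "${BUILD_URL:-}"
    fi
    return "$status"
}

# In the job this would wrap the real command, for example:
#   run_and_report bundle exec cucumber --backtrace --verbose --profile ci --tags @en.wikipedia.beta.wmflabs.org
```

Because the function returns the saved status, the build step fails exactly when cucumber does.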
Jenkins env variables are listed at:
Change 110345 had a related patch set uploaded by Zfilipin:
Increases verbosity of Cucumber output
Chris has noticed this:
...
@en.wikipedia.beta.wmflabs.org @test2.wikipedia.org
Feature: File
  Scenario: Anonymous goes to file that does not exist # features/file.feature:15
    Given I am at file that does not exist # features/step_definitions/file_steps.rb:12
    Then page text should contain No file by this name exists. # features/step_definitions/page_steps.rb:123

  @login
  Scenario: Logged-in user goes to file that does not exist # features/file.feature:20
too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 70344678247960, last used 120.314153617 seconds ago (Net::HTTP::Persistent::Error)
...
Apparently this is also the source of our intermittent "Unable to pick a platform" test failures, see https://wmf.ci.cloudbees.com/job/browsertests-en.wikipedia.beta.wmflabs.org-linux-firefox/625/testReport/(root)/AFTv5/Check_if_AFTv5_is_on_the_page/ from 5 February
The next step is to download all XML files from the ws/reports/junit/ folder when a job fails but no tests fail. We should inspect the XML files and figure out whether the failure is recorded there and Jenkins is not seeing it, or whether the failure is not recorded there at all.
The XML files in question are located at e.g. https://wmf.ci.cloudbees.com/job/browsertests-en.wikipedia.beta.wmflabs.org-linux-firefox/ws/reports/junit/
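Once the files are downloaded, a quick first pass could count the outcome elements in each one. The sketch below assumes the usual JUnit XML element names (`<failure>`, `<error>`, `<skipped>`). `summarize_junit` is a hypothetical helper, and its default directory is the reports/junit path from the ci profile.

```shell
# summarize_junit is a hypothetical helper. It prints, for every XML
# file in the given directory, how many lines contain each outcome tag.
summarize_junit() {
    dir=${1:-reports/junit}
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue   # directory has no XML files
        # grep -c counts matching lines, not individual occurrences.
        failures=$(grep -c '<failure' "$f")
        errors=$(grep -c '<error' "$f")
        skipped=$(grep -c '<skipped' "$f")
        printf '%s: failures=%s errors=%s skipped=%s\n' "$f" "$failures" "$errors" "$skipped"
    done
}
```

A file that reports failures=0 and errors=0 for a run where the console shows an exception would suggest the failure never made it into the report.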
The latest example:
...
Scenario: Applying the live preview of interface font # features/font_selection_default_enabled.feature:44
bad URI(is not URI?): (URI::InvalidURIError)
...
The scenario is marked "skipped" instead of "failed":
From TEST-features-font_selection_default_enabled.xml (attached):
...
<testcase classname="Font selection" name="Applying the live preview of interface font" time="65.901432">
<skipped/> <system-out/> <system-err/>
</testcase>
...
(In reply to Nemo from comment #22)
Same, right?
https://integration.wikimedia.org/ci/job/mediawiki-core-phpunit-databaseless/23010/consoleFull
That one is a PHP segfault tracked by bug 62623 and is completely unrelated.
Chris, I do not remember seeing this since we moved to Wikimedia Jenkins. Can this be resolved as fixed?