
Logstash occasionally hangs as a result of a memory leak caused by non-UTF-8 input
Closed, Invalid, Public

Description

Logstash occasionally stops processing log events in both beta and production. I happened to catch the cause in the logs today:

{:timestamp=>"2014-04-03T12:03:43.449000+0000", :message=>"Failed to flush outgo
ing items", :outgoing_count=>1, :exception=>java.lang.OutOfMemoryError: GC overh
ead limit exceeded, :backtrace=>["java.nio.HeapByteBuffer.<init>(HeapByteBuffer.
java:57)", "java.nio.ByteBuffer.allocate(ByteBuffer.java:329)", "sun.nio.cs.Stre
amDecoder.<init>(StreamDecoder.java:249)", "sun.nio.cs.StreamDecoder.<init>(StreamDecoder.java:229)", "sun.nio.cs.StreamDecoder.forInputStreamReader(StreamDecoder.java:68)", "java.io.InputStreamReader.<init>(InputStreamReader.java:74)", "org.jruby.util.RubyDateFormatter.compilePattern(RubyDateFormatter.java:261)", "org.jruby.util.RubyDateFormatter.compileAndFormat(RubyDateFormatter.java:360)", "org.jruby.RubyTime.strftime(RubyTime.java:425)", "org.jruby.RubyTime$INVOKER$i$1$0$strftime.call(RubyTime$INVOKER$i$1$0$strftime.gen)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)", "org.jruby.ast.AttrAssignTwoArgNode.interpret(AttrAssignTwoArgNode.java:36)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:290)", "org.jruby.RubyProc.call19(RubyProc.java:271)", "org.jruby.RubyProc$INVOKER$i$0$0$call19.call(RubyProc$INVOKER$i$0$0$call19.gen)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "rubyjit.Cabin::Channel$$publish_0CC14B6056DE0968D0FA81C715E46013862F8AD3.block_0$RUBY$file(file:/opt/logstash/logstash.jar!/cabin/channel.rb:161)", "rubyjit$Cabin::Channel$$publish_0CC14B6056DE0968D0FA81C715E46013862F8AD3$block_0$RUBY$file__.call(rubyjit$Cabin::Channel$$publish_0CC14B6056DE0968D0FA81C715E46013862F8AD3$block_0$RUBY$file__)", 
"org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:135)", "org.jruby.runtime.Block.yield(Block.java:142)", "org.jruby.RubyArray.eachCommon(RubyArray.java:1606)", "org.jruby.RubyArray.each(RubyArray.java:1613)", "org.jruby.RubyArray$INVOKER$i$0$0$each.call(RubyArray$INVOKER$i$0$0$each.gen)"], :level=>:warn}

This looks like a known upstream bug related to "malformed utf-8" errors: https://logstash.jira.com/browse/LOGSTASH-1389
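To check how often these symptoms show up without reading the Logstash log by hand, something like the following Python sketch could be used. It is only an illustration: `count_symptoms` and the `PATTERNS` table are hypothetical names, and the two substrings are simply copied from the warnings quoted in this report, so they may not match other Logstash versions.

```python
import re
from collections import Counter
from typing import Iterable

# Substrings copied from the warnings quoted in this report (assumption:
# the real log lines are not hard-wrapped the way the pasted excerpts are).
PATTERNS = {
    "oom": re.compile(r"java\.lang\.OutOfMemoryError"),
    "bad_encoding": re.compile(r"different character encoding"),
}


def count_symptoms(lines: Iterable[str]) -> Counter:
    """Count log lines that mention an OOM error or an encoding warning."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feeding it an open log file (for example `count_symptoms(open(path, errors="replace"))`, with `path` being wherever Logstash writes its own log) would show whether encoding warnings cluster before an OutOfMemoryError.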

We occasionally get malformed UTF-8 errors, mostly from database inserts:

{:timestamp=>"2014-04-03T10:44:52.106000+0000", :message=>"Received an event tha
t has a different character encoding than you configured.", :text=>"1235761 web
wikidatawiki-bcb77336: 0.6721 20.2M DatabaseBase::query: Writes done: INSERT
INTO text (old_id,old_text,old_flags) VALUES (NULL,'=\\x8EA\\x8B\\xC20 \\x85\\
xFF\\x8A \\x82\\x97 L[5\\xF6\\xBC\\xB7]\\x84BAX\\xF10\\xAD\\x93:4\\x8D\\xBBMV]K\
\xFF\\xBB\\xB1\\x8A\\x97\\xE1\\\\r3ッ \\x83% \\xC8z ܾ\\x8Emq\\xDB\\xE40 8\\x90\\
xAB:\\xFE\\xF1|\\xB2\\xEF{\\x85\\xFE\\xD2l>\\xB6E\\xFE\\xD9\\\\\\\\\\xBB\\xF2\\x
9B\\xEAb\\xFCE\\xC3\\xE8\\xC8A\\xB6\\xDB \\xA8 r\\xFB\\xD0=\\xB4a\\xC2 \\xCD \\x
81\\x90\\x89\\x9A p\\xBEc[\\x83\\\\0\\xF637\\xC1\\xC9k \\xB6ߧ\\xBB 5y,\\xD5*\\x9
9\\xA6\\xA9\\x8E\\xD7z\\xB1\\x8AҹT\\x91>$2\\\\\\\"J\\xAA\\xA8T\\xCBx\\x89\\xA4\\
tS \\x92:\\xB4\\\\rd2 \\xD2#\\xC2 b \\xDB\\xE6\\xC5Cֳ\\xFF\\u007F\\x90\\xB0\\xA7
\\xC4 \\xBE \\xEE','utf-8,gzip')\\n", :expected_charset=>"UTF-8", :level=>:warn}
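One possible workaround on the sending side is to replace invalid byte sequences before the line is shipped to Logstash, so that only valid UTF-8 arrives. The Python sketch below shows the idea; it is not what was deployed here. `sanitize_utf8` is a hypothetical helper, and substituting U+FFFD for bad bytes is an assumed policy (dropping or hex-escaping them would work as well).

```python
from typing import Tuple


def sanitize_utf8(raw: bytes) -> Tuple[str, bool]:
    """Decode raw bytes as UTF-8, replacing invalid sequences with U+FFFD.

    Returns the decoded text and a flag telling whether any bytes
    had to be replaced.
    """
    try:
        # Fast path: the payload is already valid UTF-8.
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        # Slow path: substitute the replacement character for bad bytes.
        return raw.decode("utf-8", errors="replace"), True
```

For a payload such as the gzip-compressed `old_text` value above, the flag would be true and the returned string could be re-encoded as UTF-8 safely, at the cost of losing the original binary content in the log.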


Version: wmf-deployment
Severity: normal

Details

Reference
bz63490

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:11 AM
bzimport set Reference to bz63490.
bzimport added a subscriber: Unknown Object (MLST).

This problem has not been seen for quite some time following the last Logstash software upgrade.