
action=parse only fetches from the parser cache, it does not store to it
Closed, Invalid
Public

Description

Like Bug #29223: action=parse only fetches from the parser cache; it doesn't
store to it. This problem reduces our parser cache hit ratio
significantly, since we now have huge numbers of action=parse hits due
to the Android and iPhone apps.


Version: unspecified
Severity: normal

Details

Reference
bz29907

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 21 2014, 11:30 PM
bzimport set Reference to bz29907.

If the API calls $article->getParserOutput(), that reads from the parser cache, and when it gets down to "return $this->getOutputFromWikitext( $text, $useParserCache );", the result is saved back to the parser cache.

So that is proof that at least *some* of the code paths both get from and set to the parser cache, at least in some cases.
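
To make the distinction concrete, here is a minimal sketch of a fetch-only path versus a path that also saves on a miss. It is written in Python and is not MediaWiki code: `ParserCacheSketch`, `parse_wikitext`, and `get_parser_output` are hypothetical names that only stand in for the parser cache and the `getParserOutput()` path mentioned above.

```python
class ParserCacheSketch:
    """Hypothetical in-memory stand-in for a parser cache."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # Count hits and misses so the hit ratio can be compared.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        return None

    def save(self, key, output):
        self._store[key] = output


def parse_wikitext(text):
    # Placeholder for an expensive parse; not a real parser.
    return "<p>" + text + "</p>"


def get_parser_output(cache, key, text, save_on_miss=True):
    """Fetch from the cache; on a miss, parse and optionally save back."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    output = parse_wikitext(text)
    if save_on_miss:
        cache.save(key, output)
    return output


# Fetch-only: every request misses because nothing is ever stored.
fetch_only = ParserCacheSketch()
for _ in range(3):
    get_parser_output(fetch_only, "Main_Page", "Hello", save_on_miss=False)

# Fetch and save: the first request misses, later ones hit.
read_through = ParserCacheSketch()
for _ in range(3):
    get_parser_output(read_through, "Main_Page", "Hello")
```

In this toy setup the fetch-only cache never records a hit, which is the behavior the description worries about; the second cache hits on every request after the first.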

Do you have a specific use case for this bug where you've hit a limitation that's saying you can't do that?

(In reply to comment #1)

Do you have a specific use case for this bug where you've hit a limitation
that's saying you can't do that?

CCing Tim since he raised this issue. See http://hexm.de/54 and search for "action=parse".

It looks like the only place the cache isn't being used/saved is the code for parsing arbitrary text. The only way to handle that would be with some title/page/md5 keys or something. People should really just be calling the right API params in that case (e.g. not having to give the text).
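
As a rough illustration of the "title/page/md5 keys" idea, the sketch below derives a cache key from a title plus an MD5 digest of the submitted text. It is Python rather than MediaWiki code, and the function name `arbitrary_text_cache_key` and the `parse-text:` prefix are assumptions made up for this example.

```python
import hashlib


def arbitrary_text_cache_key(title, text):
    """Build a hypothetical cache key from a title and the MD5 of the text.

    The same title and text always yield the same key, so repeated
    requests for identical text could share one cache entry.
    """
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return "parse-text:" + title + ":" + digest


key_a = arbitrary_text_cache_key("Sandbox", "'''bold''' text")
key_b = arbitrary_text_cache_key("Sandbox", "'''bold''' text")
key_c = arbitrary_text_cache_key("Sandbox", "''italic'' text")
```

Here `key_a` and `key_b` are equal while `key_c` differs. A real key would probably also need to account for parser options, which this sketch ignores.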

Lowering priority since this seems relatively rare. If there are lots of clients handing us arbitrary text to parse, it makes sense to change it back, but we need some numbers for that.

I've added a few comments about stuff that is/isn't cached.

Per Aaron, arbitrary text passed by the user isn't cached (should it be?).

Neither are parses of old page text. Should we maybe be caching them in case they are getting multiple hits?

Section parses aren't cached either. Should we be saving those? Are they common?

Hopefully Tim can weigh in here and suggest the way forward: whether we should or shouldn't be caching other stuff...

I'd be surprised if it was worth caching sections (or even old revs). Though with disk space we don't have to worry *as much* about filling the cache with seldom-used stuff.

Maybe we could log (with md5 of text) the raw-text and section hits to see if there are usage patterns that justify caching?
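
A minimal sketch of that kind of measurement, assuming the logged requests are available as a plain list of text strings: hash each text with MD5 and compute what fraction of requests repeated a text already seen. The names `text_digest` and `repeat_ratio` are hypothetical, and this is Python, not MediaWiki logging code.

```python
import hashlib
from collections import Counter


def text_digest(text):
    """MD5 hex digest of the request text, used as the log key."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def repeat_ratio(request_texts):
    """Fraction of requests whose text had already been seen earlier.

    A high ratio would suggest that caching these parses could pay off.
    """
    counts = Counter(text_digest(t) for t in request_texts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every request beyond the first for a given digest is a repeat.
    repeats = total - len(counts)
    return repeats / total


sample_log = ["== A ==", "== A ==", "== B ==", "== A =="]
ratio = repeat_ratio(sample_log)  # 2 repeats out of 4 requests
```

For the sample log above the ratio is 0.5: of the four requests, two repeated a text already seen.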

(In reply to comment #5)

I've added a few comments about stuff that is/isn't cached.

Per Aaron, arbitrary text passed by the user isn't (should it?) cached.

Neither are parses of old page text. Should we maybe be caching them in case
they are getting multiple hits?

Also, neither are section parses, should we be saving those? Are they common?

Hopefully Tim can weigh in here, and suggest the way forward - if we should, or
shouldn't be caching other stuff...

It sounds like action=parse has the same caching behavior as 'normal' MW then. I should admit that I filed this bug report solely based on a statement by Tim without verifying it myself. Maybe Tim's statement was based on old behavior that has since been fixed?

(In reply to comment #7)

(In reply to comment #5)

I've added a few comments about stuff that is/isn't cached.

Per Aaron, arbitrary text passed by the user isn't (should it?) cached.

Neither are parses of old page text. Should we maybe be caching them in case
they are getting multiple hits?

Also, neither are section parses, should we be saving those? Are they common?

Hopefully Tim can weigh in here, and suggest the way forward - if we should, or
shouldn't be caching other stuff...

It sounds like action=parse has the same caching behavior as 'normal' MW then.
I should admit that I filed this bug report solely based on a statement by Tim
without verifying it myself. Maybe Tim's statement was based on old behavior
that has since been fixed?

I believe that during some bits of refactoring to fix other bugs, I added methods that use the parser cache in a place or two.

I am really glad that we got to look at this carefully. I think that
unless anyone objects, this can be closed.