Memory and page-sequences

robrez
I was delighted to learn that FOP 2.0 has been released (I live under a rock).

For some time now we've been manually breaking up our content into multiple page-sequences -- many more than we would like to use.

I saw that one long page-sequence still eats memory:
https://xmlgraphics.apache.org/fop/2.0/running.html#memory

We've been doing this for years, and the strategy is less than ideal. I'm wondering if there is any hope that we will be able to stop at some point and let the pages break naturally.

The difficulty with multiple page-sequences is that each one forces a physical page break.

Background use case: we use FOP to render massive reports.
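
To make the workaround concrete, here is a minimal sketch of the kind of chunking described above. It is an illustration only, not our actual code: the function names `chunked` and `build_fo`, the `chunk_size` parameter, and the master name "page" are all made up for this example. Every chunk goes into its own fo:page-sequence, so FOP can release memory between chunks, but each chunk also starts on a fresh page.

```python
from xml.sax.saxutils import escape

FO_NS = "http://www.w3.org/1999/XSL/Format"


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_fo(paragraphs, chunk_size=500):
    """Return an XSL-FO document with one fo:page-sequence per chunk.

    Hypothetical helper: chunk_size and the master name "page" are
    assumptions chosen for illustration.
    """
    parts = [
        f'<fo:root xmlns:fo="{FO_NS}">',
        '<fo:layout-master-set>',
        '<fo:simple-page-master master-name="page" '
        'page-height="29.7cm" page-width="21cm">',
        '<fo:region-body/>',
        '</fo:simple-page-master>',
        '</fo:layout-master-set>',
    ]
    for chunk in chunked(paragraphs, chunk_size):
        # A new page-sequence always begins on a new page.
        parts.append('<fo:page-sequence master-reference="page">')
        parts.append('<fo:flow flow-name="xsl-region-body">')
        parts.extend(f'<fo:block>{escape(p)}</fo:block>' for p in chunk)
        parts.append('</fo:flow>')
        parts.append('</fo:page-sequence>')
    parts.append('</fo:root>')
    return "\n".join(parts)
```

A smaller `chunk_size` keeps memory lower but produces more forced page breaks, which is exactly the trade-off we would like to get rid of.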

Thanks,
Rob

Re: Memory and page-sequences

Andreas Delmelle-2
> On 05 Jan 2016, at 22:09, robrez <[hidden email]> wrote:
>
> I was delighted to learn that fop 2.0 has been released (I live under a
> rock).
>
> For some time now we've been manually breaking up our content into multiple
> page-sequences -- many more than we would like to use.
>
> I saw that one long page-sequence still eats memory:
> https://xmlgraphics.apache.org/fop/2.0/running.html#memory
>
> For years now, I've been using a less-than-ideal strategy to break my
> content up into multiple page-sequences. I'm wondering if there is any hope
> that we will be able to stop doing this at some point and let the pages
> break naturally.

Not in the near future, I'm afraid... It would require a very thorough rework of the line- and page-breaking interaction.

At any rate, barring forced page-breaks, what happens is that first ALL line-breaks for the entire page-sequence are determined, and only then is control handed over to the page-breaking algorithm, which in turn computes the page-breaks based on the line-boxes.
There have been lots of theories/ideas on improving that, but one thing is certain: it should start with making the basic line-breaking process interruptible. As long as that is not the case, it is virtually impossible to improve this.
Roughly: once it is more or less certain that there are enough line-boxes to fill the current page, return control to the page-breaking algorithm, so that there is at least an opportunity to flush, and e.g. detect page-width changes much earlier. Right now, such changes are only detected after all the line-breaks have been computed once... If they occur, the line-breaking process is restarted from the point where the break to a new page with a different inline-progression-dimension occurred. In the end, that could very well mean that hundreds of line-breaks are computed for nothing, just wasting memory and CPU cycles.

Definitely not a trivial matter to resolve.
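
To illustrate the idea of an interruptible line-breaker, here is a toy sketch. It is in no way FOP code: a simple greedy breaker over plain words stands in for the real line-breaking, a fixed number of lines per page stands in for the page-breaking, and the names `take_line`, `paginate`, `width_for_page` and `lines_per_page` are invented for the example. The point is only the control flow: lines are broken for one page at a time, and the width of the next page is known before any of its lines are computed.

```python
from typing import Callable, Iterator, List, Tuple


def take_line(words: List[str], start: int, width: int) -> Tuple[str, int]:
    """Greedily build one line starting at words[start].

    Returns the line and the index of the first word not consumed.
    A word longer than `width` still gets a line of its own.
    """
    line = words[start]
    i = start + 1
    while i < len(words) and len(line) + 1 + len(words[i]) <= width:
        line += " " + words[i]
        i += 1
    return line, i


def paginate(
    words: List[str],
    width_for_page: Callable[[int], int],
    lines_per_page: int,
) -> Iterator[List[str]]:
    """Yield pages one by one instead of breaking all lines up front."""
    i = 0
    page_no = 0
    while i < len(words):
        # The width is looked up per page, so a change in page width is
        # seen before the affected lines are broken, not afterwards.
        width = width_for_page(page_no)
        page: List[str] = []
        while i < len(words) and len(page) < lines_per_page:
            line, i = take_line(words, i, width)
            page.append(line)
        # Control goes back to the caller here; the page can be
        # rendered and discarded before the next one is laid out.
        yield page
        page_no += 1
```

Because `paginate` is a generator, a caller can consume and drop each page as it arrives, so no line is ever broken twice and only one page of line-boxes is held at a time. The real problem is of course much harder, since the actual algorithms consider far more than a greedy fit.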


KR

Andreas