Markers: Determining the last generated area for a LM

Markers: Determining the last generated area for a LM

Jeremias Maerki
As you may have seen I've been working through the layoutengine
testcases to fix various failures/bugs last week. One of the last
problems that need to be fixed is markers. Markers already work fine
under the new page breaking mechanism when an FO is not broken over the
page/column boundaries.

The problem is getting the two last booleans on getCurrentPV().addMarkers()
right. Currently the calls are hardcoded to:
getCurrentPV().addMarkers(markers, true, true, false);
and
getCurrentPV().addMarkers(markers, false, false, true);

The isfirst and islast parameters must be set correctly. Currently, I
don't see a reliable way to determine these values. For example, there's
some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
the layout context, but I found this doesn't work reliably. I've
experimented with two other approaches, neither of which was good
enough. One (flags on Position instances) failed because the first n
elements at the beginning of the element list may be removed, which also
removes the flag marking the first element in the list. The other
(counting Position instances) failed because the element list may be
modified after the initial generation, thus throwing off the counters. I
discarded this mainly because I didn't want to make the code more
complicated just to get the indices right again.

The only thing that sounds worth pursuing right now is to do
look-behind and look-ahead in the Position iterator, which in a way
extends the approach that is currently visible in AreaAdditionUtils.
That approach checks whether the current LM changes or not.
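
A minimal sketch of the look-behind/look-ahead idea, for illustration only. The class LookAroundSketch, the Pos type and the flag() method are hypothetical names and are not FOP classes; the sketch assumes each Position can be reduced to "which LM produced it" and that an LM's Positions are contiguous in the list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are FOP classes. */
final class LookAroundSketch {

    /** Stand-in for a Position that only remembers which LM produced it. */
    static final class Pos {
        final String lm;
        Pos(String lm) { this.lm = lm; }
    }

    /**
     * For each position, returns {isFirst, isLast}, decided by comparing the
     * owning LM with that of the previous and of the next position.
     */
    static List<boolean[]> flag(List<Pos> positions) {
        List<boolean[]> flags = new ArrayList<>();
        for (int i = 0; i < positions.size(); i++) {
            String lm = positions.get(i).lm;
            // Look-behind: the run starts here if there is no predecessor
            // or the predecessor belongs to a different LM.
            boolean isFirst = i == 0 || !positions.get(i - 1).lm.equals(lm);
            // Look-ahead: the run ends here if there is no successor
            // or the successor belongs to a different LM.
            boolean isLast = i == positions.size() - 1
                    || !positions.get(i + 1).lm.equals(lm);
            flags.add(new boolean[] {isFirst, isLast});
        }
        return flags;
    }

    public static void main(String[] args) {
        List<Pos> positions = new ArrayList<>();
        positions.add(new Pos("blockA"));
        positions.add(new Pos("blockA"));
        positions.add(new Pos("blockB"));
        for (boolean[] f : flag(positions)) {
            System.out.println("isFirst=" + f[0] + " isLast=" + f[1]);
        }
    }
}
```

The sketch only sees the list handed to it in one call. If an FO is broken over several pages and each page gets its own list, the edges of each list would be flagged as first and last again, so this alone does not settle the break case.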

Maybe someone has another idea on how to approach this problem. I'll let
it rest for a moment until I've made keeps and breaks work on tables.

Jeremias Maerki


XHTML 2 PDF

Dirk Bromberg
Hi,

Is there a good / easy way to get a pdf document from an xhtml website?

xhtml -> xsl -> fo -> fop -> pdf ?


Thanks

Dirk






Re: XHTML 2 PDF

Pasi Nummisalo

Hi

If you'd like to use HTML+CSS and tables without fixed widths,
Webisor (www.davisor.com/webisor) might be the tool for you.

Regards,
Pasi

On Mon, 23 May 2005, Dirk Bromberg wrote:

> Hi,
>
> Is there a good / easy way to get a pdf document from an xhtml website?
>
> xhtml -> xsl -> fo -> fop -> pdf ?
>
>
> Thanks
>
> Dirk
>


Re: XHTML 2 PDF

Jeremias Maerki
In reply to this post by Dirk Bromberg
Look in the archives:
http://marc.theaimsgroup.com/?l=fop-user&w=2&r=1&s=xhtml+pdf&q=b

And please send questions to fop-users in the future, not fop-dev. Thank
you.

On 23.05.2005 10:00:13 Dirk Bromberg wrote:
> Is there a good / easy way to get a pdf document from an xhtml website?
>
> xhtml -> xsl -> fo -> fop -> pdf ?


Jeremias Maerki


Re: XHTML 2 PDF

Manuel Mall-2
In reply to this post by Dirk Bromberg
Seems to me that this is more of a fop-user question.

Anyway, there are a few stylesheets on the web which do xhtml to fo
transformations. Just "Google" for 'xhtml fo stylesheet' and you get a few
sensible hits.

For example there is one published by James Tauber
(http://blogs.pingpoet.com/overflow/archive/2004/09/03/768.aspx).

Manuel

On Mon, 23 May 2005 04:00 pm, Dirk Bromberg wrote:

> Hi,
>
> Is there a good / easy way to get a pdf document from an xhtml website?
>
> xhtml -> xsl -> fo -> fop -> pdf ?
>
>
> Thanks
>
> Dirk

Re: XHTML 2 PDF

Dirk Bromberg
Ok,

Thanks for the quick answers!

Dirk



Manuel Mall wrote:

>Seems to me that this is more of a fop-user question.
>
>Anyway, there are a few stylesheets on the web which do xhtml to fo
>transformations. Just "Google" for 'xhtml fo stylesheet' and you get a few
>sensible hits.
>
>For example there is one published by James Tauber
>(http://blogs.pingpoet.com/overflow/archive/2004/09/03/768.aspx).
>
>Manuel
>

FAQ'ish questions (was: Re: XHTML 2 PDF)

J.Pietschmann
In reply to this post by Jeremias Maerki
Jeremias Maerki wrote:
> Look in the archives:
> http://marc.theaimsgroup.com/?l=fop-user&w=2&r=1&s=xhtml+pdf&q=b

This is becoming FAQ material.
To my great surprise, the various (x)html2fo tools are neither in
the FAQ nor in the additional resources list.

Related questions:
- Where do I edit the FAQ's xdoc source: HEAD or maintenance branch?
- Why is the FAQ TOC gone? This makes it difficult to use direct links
   to individual FAQ entries in mails. Should I open a forrest
   requirement for a TOC per section, preferably in a customizable way?
- What's the current publishing process? Wasn't there a Wiki
   page about this?
- What about moving the FAQ to the Wiki, or establishing a supplementary
   FAQ in the Wiki? (same for "additional resources")

Bonus points: Is there anybody out there willing to work on canonical
non-rude FAQ answers? (See
  http://www.joelonsoftware.com/articles/FogBugzII.html
section "snippets")
I've already installed Thunderbird QuickText (great stuff...).

J.Pietschmann

Re: FAQ'ish questions (was: Re: XHTML 2 PDF)

Clay Leeds-2
On May 23, 2005, at 3:20 PM, J.Pietschmann wrote:
> Jeremias Maerki wrote:
>> Look in the archives:
>> http://marc.theaimsgroup.com/?l=fop-user&w=2&r=1&s=xhtml+pdf&q=b
>
> This is becoming FAQ material.
> To my great surprise, the various (x)html2fo tools are neither in
> the FAQ nor in the additional resources list.

I'll look into adding something.

> Related questions:
> - Where do I edit the FAQ's xdoc source: HEAD or maintenance branch?

xml-fop/src/documentation/content/xdocs/faq.xml

The HEAD branch is the one used to build the site.

> - Why is the FAQ TOC gone? This makes it difficult to use direct links
>   to individual FAQ entries in mails. Should I open a forrest
>   requirement for a TOC per section, preferably in a customizable way?

Forrest has the ability to generate one... but there's some problem on
the FOP site. All other pages generate one except the FAQ page. Our
page has exactly the same structure as the Forrest FAQ, as does our
skinconf.xml. Unfortunately, it works for them, but not us.

I asked on the forrest user list, and received an answer which
unfortunately didn't help much. I'd since forgotten about it. I'll see
if I can come up with a solution.

> - What's the current publishing process? Wasn't there a Wiki
>   page about this?
> - What about moving the FAQ to the Wiki, or establishing a supplementary
>   FAQ in the Wiki? (same for "additional resources")

That'd be a good idea. That would certainly make it easier for
fop-committers to edit those pages. It could still live in our sidebar
as well.

> Bonus points: Is there anybody out there willing to work on canonical
> non-rude FAQ answers? (See
>  http://www.joelonsoftware.com/articles/FogBugzII.html
> section "snippets")
> I've already installed Thunderbird QuickText (great stuff...).
>
> J.Pietschmann

That sounds like a great tool and suggestion...

Regards,

Web Maestro Clay
--
<[hidden email]> - <http://homepage.mac.com/webmaestro/>
My religion is simple. My religion is kindness.
- HH The 14th Dalai Lama of Tibet


RE: FAQ'ish questions (was: Re: XHTML 2 PDF)

Victor Mote
In reply to this post by J.Pietschmann
J.Pietschmann wrote:

> - What's the current publishing process? Wasn't there a Wiki
>    page about this?

It is on the web site, under the "Development" tab, Deploy/Doc Mgmt menu:
http://xml.apache.org/fop/dev/doc.html
It may have started as a Wiki -- I don't remember. I don't know whether it
is up-to-date or not.

Victor Mote


Re: Markers: Determining the last generated area for a LM

Luca Furini
In reply to this post by Jeremias Maerki
Jeremias Maerki wrote:

> The isfirst and islast parameters must be set correctly. Currently, I
> don't see a reliable way to determine these values. For example, there's
> some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
> the layout context but I found this doesn't work reliably.

Did you find out why this does not work? I mean, do you think it is an
incorrect approach, or there's something wrong somewhere in the code?

I remember that some time ago I had problems with these flags, and it was
because of some LM that did not set / propagate the correct values when
creating LayoutContexts for children LMs.

Regards
    Luca




Re: Markers: Determining the last generated area for a LM

Jeremias Maerki

On 24.05.2005 18:41:39 Luca Furini wrote:
> Jeremias Maerki wrote:
>
> > The isfirst and islast parameters must be set correctly. Currently, I
> > don't see a reliable way to determine these values. For example, there's
> > some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
> > the layout context but I found this doesn't work reliably.
>
> Did you find out why this does not work? I mean, do you think it is an
> incorrect approach, or there's something wrong somewhere in the code?

The problem is that in a break situation, AreaAdditionUtils is called
more than once, each time signalling a first and last area instead of
signalling a first area once and a last area once over multiple calls.

Hmm, I think I need to check if the LayoutContext instance remains the
same across multiple addArea calls on the same LM. If that's the case the
problem should be solvable.
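
To illustrate why state that survives across calls would help, here is a minimal sketch. AreaFlagState and nextPass() are hypothetical names, not FOP's LayoutContext or AreaAdditionUtils, and the sketch simply assumes the caller knows whether the LM still has content left for a later page.

```java
/** Hypothetical sketch; not FOP's LayoutContext or AreaAdditionUtils. */
final class AreaFlagState {

    /** Remembers whether a first area has already been signalled. */
    private boolean firstSignalled = false;

    /**
     * Called once per area-adding pass for the same LM.
     *
     * @param moreToCome true if the LM still has content for a later page
     * @return {isFirst, isLast} for the area produced by this pass
     */
    boolean[] nextPass(boolean moreToCome) {
        boolean isFirst = !firstSignalled; // true only on the very first pass
        firstSignalled = true;
        boolean isLast = !moreToCome;      // true only on the final pass
        return new boolean[] {isFirst, isLast};
    }

    public static void main(String[] args) {
        // One FO broken over two pages: two passes share one state object.
        AreaFlagState state = new AreaFlagState();
        boolean[] page1 = state.nextPass(true);
        boolean[] page2 = state.nextPass(false);
        System.out.println("page 1: isFirst=" + page1[0] + " isLast=" + page1[1]);
        System.out.println("page 2: isFirst=" + page2[0] + " isLast=" + page2[1]);
    }
}
```

If a fresh state object were created for every pass, each pass would report a first area again, which is the symptom described above.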

> I remember that some time ago I had problems with these flags, and it was
> because of some LM that did not set / propagate the correct values when
> creating LayoutContexts for children LMs.


Jeremias Maerki


Re: FAQ'ish questions (was: Re: XHTML 2 PDF)

Jeremias Maerki
In reply to this post by J.Pietschmann
Actually, it's still accurate, although that will change as soon as the
migration to SVN is done. ...which reminds me.... :-)

On 24.05.2005 01:51:21 Victor Mote wrote:

> J.Pietschmann wrote:
>
> > - What's the current publishing process? Wasn't there a Wiki
> >    page about this?
>
> It is on the web site, under the "Development" tab, Deploy/Doc Mgmt menu:
> http://xml.apache.org/fop/dev/doc.html
> It may have started as a Wiki -- I don't remember. I don't know whether it
> is up-to-date or not.
>
> Victor Mote



Jeremias Maerki


Re: Markers: Determining the last generated area for a LM

Jeremias Maerki
In reply to this post by Jeremias Maerki
After a lot of thinking and experimenting I finally resolved to take up
the idea below again. When I started distinguishing between Positions
that indirectly generate areas and those that do not, I was suddenly able
to create a relatively easy and (hopefully) stable mechanism to
determine the first and last areas of a LayoutManager. It already works
on my machine for flow, block and block-container (markers6b passes).
Now I'm trying to add marker support for tables, which is a bit special
since we don't have the rigid hierarchy of LMs like before. But I'm
pretty sure this is also doable without too much effort.

There's a downside with all this. There was the idea earlier of not
nesting Positions anymore, but with the above approach I need an integer
member variable on Position. That means we'll have to stick with the
nesting if no one comes up with a better idea. The LM only needs two
integers: one recording the first index ever passed through to an
addArea() method, and one that has the double function of serving
as a running counter of Positions (it seeds Position.setIndex(int)) and
of helping determine if a Position is the last. At least this way the
nested Positions have more of a reason to exist and take up memory.
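
A rough sketch of the two-integer bookkeeping described above, not the code that will be committed. Only Position.setIndex(int) is taken from the description; IndexedPositionSketch, LmState, register(), isFirst() and isLast() are hypothetical names. The sketch assumes only area-generating Positions are registered and that the LM has finished generating Positions before isLast() is asked.

```java
/** Hypothetical sketch of index-based first/last detection. */
final class IndexedPositionSketch {

    /** Stand-in for a Position carrying an integer index. */
    static final class Position {
        private int index = -1;
        int getIndex() { return index; }
        void setIndex(int index) { this.index = index; }
    }

    /** The two integers an LM would keep in this scheme. */
    static final class LmState {
        private int firstIndexSeen = -1; // first index ever passed to area addition
        private int lastGenerated = -1;  // running counter, seeds setIndex(int)

        /** Numbers a newly created Position. */
        Position register(Position pos) {
            lastGenerated++;
            pos.setIndex(lastGenerated);
            return pos;
        }

        /** True for the first Position that ever reaches area addition. */
        boolean isFirst(Position pos) {
            if (firstIndexSeen < 0) {
                firstIndexSeen = pos.getIndex();
            }
            return pos.getIndex() == firstIndexSeen;
        }

        /** True if no Position with a higher index was generated. */
        boolean isLast(Position pos) {
            return pos.getIndex() == lastGenerated;
        }
    }

    public static void main(String[] args) {
        LmState lm = new LmState();
        Position[] p = new Position[5];
        for (int i = 0; i < p.length; i++) {
            p[i] = lm.register(new Position());
        }
        // Suppose p[0] and p[1] were dropped from the element list,
        // page 1 receives p[2] and p[3], page 2 receives p[4].
        System.out.println("p[2] first: " + lm.isFirst(p[2]));
        System.out.println("p[3] last:  " + lm.isLast(p[3]));
        System.out.println("p[4] first: " + lm.isFirst(p[4]));
        System.out.println("p[4] last:  " + lm.isLast(p[4]));
    }
}
```

Because the first index is recorded when a Position actually arrives at area addition, elements removed from the beginning of the list no longer take the "first" information with them.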

I think this approach should be pretty stable against the
getChangedKnuthElements() stage, though I could be wrong. This stage is
a topic I'm still not 100% familiar with, yet.

I'll wait a bit before I commit, so you'll have a chance to veto if
anyone sees a serious problem with this. After all, I still have to deal
with markers on tables first.

On 23.05.2005 09:35:48 Jeremias Maerki wrote:
> The other
> (counting Position instances) failed because the element list may be
> modified after the initial generation thus throwing off counters. I
> discarded this mainly because I didn't want to make the code more
> complicated just to get the indices right again.


Jeremias Maerki


Re: Markers: Determining the last generated area for a LM

gmazza
In reply to this post by Jeremias Maerki
Jeremias, I think we do something like this for ID's already -- I wonder
if we can use a similar approach here.

We already have a PSLM.getFirstPVWithID() method, which due to the
(Map/List) data structure that contains this information in
AreaTreeHandler, can probably be easily converted to a
PSLM.getLastPVWithID().  Note that with this method, when we add PVs
having a given ID, we don't need to send "is first" or "is last"
indications; that is easily determined from the List once it is
complete for that property ID.

Can we do a similar thing for markers?  I.e., feed a data structure
without needing to give first/last indications, and rely on the state of
that structure to subsequently find out what is first/last?
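
Roughly what I have in mind, as a sketch only. PageRegistrySketch and its methods are made-up names, page numbers stand in for PageViewports, and this is not the actual AreaTreeHandler code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; page numbers stand in for PageViewports. */
final class PageRegistrySketch {

    /** Key (an ID or a marker class name) to the pages it appeared on, in order. */
    private final Map<String, List<Integer>> pagesByKey = new HashMap<>();

    /** Records an occurrence without any first/last indication. */
    void register(String key, int pageNumber) {
        pagesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(pageNumber);
    }

    /** First page for a key; assumes the key has been registered. */
    int firstPage(String key) {
        return pagesByKey.get(key).get(0);
    }

    /** Last page registered so far; final only once the list is complete. */
    int lastPageSoFar(String key) {
        List<Integer> pages = pagesByKey.get(key);
        return pages.get(pages.size() - 1);
    }

    public static void main(String[] args) {
        PageRegistrySketch registry = new PageRegistrySketch();
        registry.register("chapter-title", 3);
        registry.register("chapter-title", 4);
        System.out.println("first: " + registry.firstPage("chapter-title"));
        System.out.println("last so far: " + registry.lastPageSoFar("chapter-title"));
    }
}
```

The first entry is known as soon as it is added, but the last one is only reliable after every page for that key has been registered.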

Thanks,
Glen




Re: Markers: Determining the last generated area for a LM

gmazza
Also, one more point--I think it may be a good idea for us to abstract
out AreaTreeModel from PSLM and encapsulate it back into AreaTreeHandler
(i.e. RootLayoutManager), including moving resolveRetrieveMarker()
there.  IIRC I was the guilty party who moved ATM into PSLM to begin
with, quite erroneously thinking that ATH might be proven superfluous
over time, and so trying to make direct ATM<-->PSLM linkages.  ATH is
here to stay, though, and resolveRetrieveMarker() is something that
cycles through the results of several PSLM instances so it seems more
natural/intuitive to have it in the higher, root-level processing class
here.  Thoughts?

Thanks,
Glen


Glen Mazza wrote:

> Jeremias, I think we do something like this for ID's already -- I
> wonder if we can use a similar approach here.
>
> We already have a PSLM.getFirstPVWithID() method, which due to the
> (Map/List) data structure that contains this information in
> AreaTreeHandler, can probably be easily converted to a
> PSLM.getLastPVWithID().  Note that with this method, when we add PV's
> having a given ID, we don't bother needing to send "is first" or "is
> last" indications, that is easily determinable by the List when it is
> complete for that property ID.
>
> Can we do a similar thing for markers?  I.e., feed a data structure
> without needing to give first/last indications, and rely on the state
> of that structure to subsequently find out what is first/last?
> Thanks,
> Glen
>
>


Re: Markers: Determining the last generated area for a LM

Jeremias Maerki
In reply to this post by gmazza
Sadly, that won't work. You'd have to make FOP a two-pass system to use
that approach, where side regions are laid out in the second pass. With
your idea getLastPVWithID() will only return a correct value after an
FO is fully distributed to PageViewports. That would, for example, kill
the ability to do out-of-line rendering of pages that can immediately be
fully resolved. The approach I'm currently working on takes very little
additional processing power and just a little bit more memory per
Position instance. Additional processing is needed only for special
cases. I'm currently trying to get markers on table-body, for which
there is no separate LM anymore. Cases like this make the whole thing a
little more complicated but there's room for optimization, i.e. the
additional processing can be skipped if a table-body has no markers
(which is probably a common case anyway).

On 28.05.2005 07:13:20 Glen Mazza wrote:

> Jeremias, I think we do something like this for ID's already -- I wonder
> if we can use a similar approach here.
>
> We already have a PSLM.getFirstPVWithID() method, which due to the
> (Map/List) data structure that contains this information in
> AreaTreeHandler, can probably be easily converted to a
> PSLM.getLastPVWithID().  Note that with this method, when we add PV's
> having a given ID, we don't bother needing to send "is first" or "is
> last" indications, that is easily determinable by the List when it is
> complete for that property ID.
>
> Can we do a similar thing for markers?  I.e., feed a data structure
> without needing to give first/last indications, and rely on the state of
> that structure to subsequently find out what is first/last?
>
> Thanks,
> Glen
>
>



Jeremias Maerki