FOP2.0 taking more time to format complex script documents

sripathi
Hi All,

 Initially we used FOP 0.20.5 in our application; we have now migrated to FOP 2.0. However, FOP 2.0 takes more time than FOP 0.20.5 to format complex-script (Chinese, Japanese, etc.) documents.

Could you please help me with this issue?


Thanks,
Sri

Re: FOP2.0 taking more time format complex script documents

Glenn Adams-2
Yes, you are right. It also uses more memory. These are unavoidable side effects of the ability to process complex scripts and font features.

If you don't need any complex script features, then you can disable that processing as described in the documentation [1].


On Tue, Jun 23, 2015 at 6:48 AM, sripathi <[hidden email]> wrote:
> Hi All,
>
> Initially we used FOP 0.20.5 in our application; we have now migrated to
> FOP 2.0. However, FOP 2.0 takes more time than FOP 0.20.5 to format
> complex-script (Chinese, Japanese, etc.) documents.
>
> Could you please help me with this issue?
>
> Thanks,
> Sripathi



Re: FOP2.0 taking more time format complex script documents

sripathi
Hi Glenn,

 Thanks for your reply. We have already disabled the complex scripts feature. Actually, the problem here is with Chinese and Japanese fonts. I also tried FOP 1.1; FOP 2.0 is 4 times slower than 1.1 when processing these fonts.
Could you please help me with this performance issue?

Thanks,
Sripathi.

Re: FOP2.0 taking more time format complex script documents

Chris Bowditch
Hi Sripathi,

Not much we can do to help without at least the following:

1. XSL-FO file. Please attach your file. Please don't send XSLT/XML
2. fop.xconf. Please send the configuration file you are using.
3. The version of Java and Operating System you are using.
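
For item 3, a small sketch like the following can collect the details in one line. It uses only standard JVM system properties; the class name `EnvInfo` is made up for illustration and has nothing to do with FOP.

```java
public class EnvInfo {

    /** Builds a one-line summary of the running JVM and operating system. */
    static String describe() {
        return "Java " + System.getProperty("java.version")
                + " (" + System.getProperty("java.vendor") + "), "
                + System.getProperty("os.name") + " "
                + System.getProperty("os.version") + " "
                + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        // Run this with the same JVM that runs FOP, so the report matches the real setup.
        System.out.println(describe());
    }
}
```

Running `java -version` on the command line gives the Java part as well; the sketch is mainly handy when FOP is embedded in an application server and the JVM in use is not obvious.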

Thanks,

Chris

On 24/06/2015 12:04, sripathi wrote:

> Hi Glenn,
>
> Thanks for your reply. We have already disabled the complex scripts
> feature. Actually, the problem here is with Chinese and Japanese fonts. I also
> tried FOP 1.1; FOP 2.0 is 4 times slower than 1.1 when processing these
> fonts.
> Could you please help me with this performance issue?
>
> Thanks,
> Sripathi.

Re: FOP2.0 taking more time format complex script documents

sripathi
This post has NOT been accepted by the mailing list yet.
Attachments: chinese.fo, fop.xconf

Hi Chris,
 
   Thanks for your reply. I have attached the XSL-FO file and the FOP configuration file. We are using Java JDK 1.6 on Windows 7.

Thanks,
Sripathi.

Re: FOP2.0 taking more time format complex script documents

dvineshkumar@gmail.com
Hi,

After analysis, I found a bug in the MultiByteFont::findGlyphIndex() method.
In FOP 2.0, the for loop in MultiByteFont::findGlyphIndex() keeps iterating even after the glyph has been found. I updated findGlyphIndex() to terminate the loop once the glyph is found, and performance improved considerably. The existing and updated methods are shown below.

Existing:

 public int findGlyphIndex(int c) {
        int idx = c;
        int retIdx = SingleByteEncoding.NOT_FOUND_CODE_POINT;

        // for most users the most likely glyphs are in the first cmap segments (meaning the one with
        // the lowest unicode start values)
        if (idx < NUM_MOST_LIKELY_GLYPHS && mostLikelyGlyphs[idx] != 0) {
            return mostLikelyGlyphs[idx];
        }
        for (CMapSegment i : cmap) {
            if (retIdx == 0
                    && i.getUnicodeStart() <= idx
                    && i.getUnicodeEnd() >= idx) {
                retIdx = i.getGlyphStartIndex()
                    + idx
                    - i.getUnicodeStart();
                if (idx < NUM_MOST_LIKELY_GLYPHS) {
                    mostLikelyGlyphs[idx] = retIdx;
                }
            }
        }
        return retIdx;
    }

Updated:

public int findGlyphIndex(int c) {
    int idx = c;
    int retIdx = SingleByteEncoding.NOT_FOUND_CODE_POINT;

    // for most users the most likely glyphs are in the first cmap segments (meaning the one with
    // the lowest unicode start values)
    if (idx < NUM_MOST_LIKELY_GLYPHS && mostLikelyGlyphs[idx] != 0) {
        return mostLikelyGlyphs[idx];
    }

    for (int i = 0; (i < cmap.size()) && retIdx == 0; i++) {
        if (cmap.get(i).getUnicodeStart() <= idx
                && cmap.get(i).getUnicodeEnd() >= idx) {
            retIdx = cmap.get(i).getGlyphStartIndex()
                + idx
                - cmap.get(i).getUnicodeStart();
            if (idx < NUM_MOST_LIKELY_GLYPHS) {
                mostLikelyGlyphs[idx] = retIdx;
            }
        }
    }
    return retIdx;
}
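
To make the difference between the two loops concrete, here is a minimal, self-contained sketch. It is not FOP code: `Segment`, `demoCmap`, the `visited` counter, and the segment layout are all made up for illustration, and `NOT_FOUND = 0` is an assumption that simply mirrors the `retIdx == 0` check above. It counts how many segments each variant examines for the same lookup.

```java
import java.util.ArrayList;
import java.util.List;

public class GlyphLookupSketch {

    /** Hypothetical stand-in for a cmap segment: a Unicode range mapped to consecutive glyph indices. */
    static final class Segment {
        final int unicodeStart;
        final int unicodeEnd;
        final int glyphStartIndex;

        Segment(int unicodeStart, int unicodeEnd, int glyphStartIndex) {
            this.unicodeStart = unicodeStart;
            this.unicodeEnd = unicodeEnd;
            this.glyphStartIndex = glyphStartIndex;
        }
    }

    /** Assumption: 0 means "not found", mirroring the retIdx == 0 check in the posted code. */
    static final int NOT_FOUND = 0;

    /** Segments examined by the most recent lookup; instrumentation for this demo only. */
    static int visited;

    /** Same shape as the "existing" loop: every segment is visited, even after a match. */
    static int findFullScan(List<Segment> cmap, int c) {
        int retIdx = NOT_FOUND;
        visited = 0;
        for (Segment s : cmap) {
            visited++;
            if (retIdx == NOT_FOUND && s.unicodeStart <= c && s.unicodeEnd >= c) {
                retIdx = s.glyphStartIndex + c - s.unicodeStart;
            }
        }
        return retIdx;
    }

    /** Early-exit variant: returns as soon as the containing segment is found. */
    static int findEarlyExit(List<Segment> cmap, int c) {
        visited = 0;
        for (Segment s : cmap) {
            visited++;
            if (s.unicodeStart <= c && s.unicodeEnd >= c) {
                return s.glyphStartIndex + c - s.unicodeStart;
            }
        }
        return NOT_FOUND;
    }

    /** Builds n made-up, contiguous segments of 16 code points each, starting at U+0100. */
    static List<Segment> demoCmap(int n) {
        List<Segment> cmap = new ArrayList<Segment>();
        for (int i = 0; i < n; i++) {
            int start = 0x100 + i * 16;
            cmap.add(new Segment(start, start + 15, 1 + i * 16));
        }
        return cmap;
    }

    public static void main(String[] args) {
        List<Segment> cmap = demoCmap(1000);
        int c = 0x100 + 500 * 16 + 3; // a code point inside the 501st segment

        int full = findFullScan(cmap, c);
        int fullVisited = visited;
        int early = findEarlyExit(cmap, c);
        int earlyVisited = visited;

        System.out.println("full scan:  glyph " + full + ", segments visited " + fullVisited);
        System.out.println("early exit: glyph " + early + ", segments visited " + earlyVisited);
    }
}
```

Both variants return the same glyph index; only the amount of work differs. Note that the early exit helps only when the code point is actually found, and a code point that sits in a late segment still costs a long linear scan.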

Regards,
Vinesh Kumar. D

Re: FOP2.0 taking more time format complex script documents

Pascal Sancho-2
Hi,

Could you please file a Jira entry, attaching all materials (test
case, patch, etc.)?


2015-08-11 16:36 GMT+02:00 [hidden email] <[hidden email]>:

> Hi,
>
> After analysis, found a bug in MultiByteFont::findGlyphIndex() method.
> [...]



--
pascal


Re: FOP2.0 taking more time format complex script documents

Klaus Malorny
On 12.08.2015 08:38, Pascal Sancho wrote:

> Hi,
>
> Could you please file a Jira entry, attaching all materials (test
> case, patch, etc.)?

Just out of curiosity: are breaks and returns within loops forbidden in your coding
conventions? ;-)

By the way, if this is really a performance bottleneck and the number of
segments is typically large (say, >= 10), I would sort the segments by
their starts, convert the three values into arrays (during object
construction), and perform a binary search on the starts, then test against the
end and finally calculate the index.
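
A minimal sketch of that idea, under stated assumptions: segments are given as `{unicodeStart, unicodeEnd, glyphStartIndex}` rows, they do not overlap, and 0 means "not found" (mirroring the `retIdx == 0` check in the posted code). The class and field names are made up for illustration; this is not what FOP does internally.

```java
import java.util.Arrays;

public class SegmentBinarySearchSketch {

    /** Assumption: 0 means "not found", as in the posted findGlyphIndex(). */
    static final int NOT_FOUND = 0;

    // Three parallel arrays, sorted by segment start, built once at construction time.
    private final int[] starts;
    private final int[] ends;
    private final int[] glyphStarts;

    /** Each row is {unicodeStart, unicodeEnd, glyphStartIndex}; rows must not overlap. */
    SegmentBinarySearchSketch(int[][] segments) {
        int[][] sorted = segments.clone(); // shallow copy, so the caller's row order is untouched
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));
        starts = new int[sorted.length];
        ends = new int[sorted.length];
        glyphStarts = new int[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            starts[i] = sorted[i][0];
            ends[i] = sorted[i][1];
            glyphStarts[i] = sorted[i][2];
        }
    }

    int findGlyphIndex(int c) {
        // Locate the last segment whose start is <= c.
        int pos = Arrays.binarySearch(starts, c);
        if (pos < 0) {
            // Not an exact hit: binarySearch returns -(insertionPoint) - 1,
            // and the candidate segment is the one just before the insertion point.
            pos = -pos - 2;
        }
        // Either c lies before the first segment, or in a gap after the candidate's end.
        if (pos < 0 || c > ends[pos]) {
            return NOT_FOUND;
        }
        return glyphStarts[pos] + c - starts[pos];
    }

    public static void main(String[] args) {
        // Made-up segments, deliberately given out of order.
        SegmentBinarySearchSketch cmap = new SegmentBinarySearchSketch(new int[][] {
            {0x4E00, 0x4E0F, 100},
            {0x0020, 0x007E, 1},
            {0x3040, 0x309F, 50},
        });
        System.out.println(cmap.findGlyphIndex(0x0041)); // inside the first sorted segment
        System.out.println(cmap.findGlyphIndex(0x4E0D)); // inside the last sorted segment
        System.out.println(cmap.findGlyphIndex(0x4E10)); // just past the last segment: not found
    }
}
```

The lookup cost becomes logarithmic in the number of segments regardless of where the code point sits, at the price of a one-time sort during construction.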

Regards,
Klaus




Re: FOP2.0 taking more time format complex script documents

Pascal Sancho-2
Hi,

AFAIK, there are no rules that prevent such usage.
As a starting point, you can follow this:
http://xmlgraphics.apache.org/fop/dev/conventions.html

2015-08-13 10:15 GMT+02:00 Klaus Malorny <[hidden email]>:

> Just out of curiosity: are breaks and returns within loops forbidden in your
> coding conventions? ;-)
>
> By the way, if this is really a performance bottleneck and the number of
> segments is typically large (say, >= 10), I would sort the segments by
> their starts, convert the three values into arrays (during object
> construction), and perform a binary search on the starts, then test against
> the end and finally calculate the index.
>
> Regards,
> Klaus



--
pascal


Re: FOP2.0 taking more time format complex script documents

Matthias Reischenbacher
In reply to this post by dvineshkumar@gmail.com
Hi,

thanks for your analysis. I've committed a fix as part of
https://issues.apache.org/jira/browse/FOP-2530

Best regards,
Matthias

On 11.08.2015 11:36, [hidden email] wrote:

> Hi,
>
> After analysis, found a bug in MultiByteFont::findGlyphIndex() method.
> [...]