Export Pebble Blog Entry to PDF Plugin

In one of my previous posts, I described how I implemented a plugin for the Pebble blogging software that exports blog entries to PDF.

Today, I modified the plugin to load font files at run time during PDF generation, which adds support for non-Latin languages. It is now possible to export non-Latin characters from blog entries to PDF.
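As a rough illustration of the run-time part, here is a minimal sketch that collects the font files found in a directory, using only the standard library. The class name FontFinder, the method findFontFiles, and the idea of keeping fonts in one directory are assumptions made for this example, not the plugin's actual code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: not part of Pebble, iText, or Flying Saucer.
final class FontFinder {
    private FontFinder() {}

    /** Returns the .ttf and .otf files located directly inside fontDir. */
    static List<Path> findFontFiles(Path fontDir) throws IOException {
        // Files.list must be closed, hence the try-with-resources block.
        try (Stream<Path> entries = Files.list(fontDir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(FontFinder::hasFontExtension)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Compares the extension case-insensitively so "Font.TTF" is accepted too.
    private static boolean hasFontExtension(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".ttf") || name.endsWith(".otf");
    }
}
```

Each returned path could then be registered with the PDF renderer before the document is laid out. With Flying Saucer that would presumably go through the addFont method of the renderer's font resolver.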

Below, I’ve added some content in different languages. To support non-Latin languages, I am using the font file cyberbit.ttf by Bitstream. You can test the plugin’s multilingual support by generating a PDF from the current post.

I think it comes out quite nicely. The only hitch at the moment is that Flying Saucer does not yet support right-to-left text in PDF output.
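Until right-to-left rendering is supported, a plugin could at least detect entries that contain such text and warn about them. The following is a minimal sketch of that check using the standard Character.getDirectionality method. The class RtlDetector and its method name are made up for this example.

```java
// Hypothetical helper: flags text that contains right-to-left characters.
final class RtlDetector {
    private RtlDetector() {}

    /** Returns true if text has at least one strongly right-to-left character. */
    static boolean containsRightToLeft(String text) {
        // Iterate over code points rather than chars so that characters
        // outside the Basic Multilingual Plane are classified correctly.
        return text.codePoints().anyMatch(cp -> {
            byte direction = Character.getDirectionality(cp);
            return direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
        });
    }
}
```

Of the samples below, only the Arabic and Hebrew lines should trigger this check. The other scripts are written left to right.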

Japanese Kanji:
五輪代表

Japanese Hiragana:
こんにちは、これは真実を決定するためにはテストテキストです。

Japanese Katakana:
ラドクリフ、マラソン

Japanese Kokuji:
和製漢字

Chinese Simplified:
您好,这是一个测试文本,以确定事实真相

Chinese Traditional:
您好,這是一個測試文本,以確定事實真相

Korean:
안녕하세요,이 사실을 확인하는 테스트 텍스트입니다

Arabic:
مرحبا ، هذا هو اختبار لتحديد نص الحقيقة

Hebrew:
שלום, זוהי בדיקה טקסט כדי לקבוע את האמת

Russian:
Привет, это тест текста, чтобы определить истину

Greek:
Γεια σας, αυτό το κείμενο είναι μια δοκιμασία για τον προσδιορισμό της αλήθειας

Thai:
สวัสดีนี่คือการทดสอบข้อความเพื่อตรวจสอบความจริง

Vietnamese:
Xin chào, đây là một bài kiểm tra văn bản để xác định sự thật

Turkish:
Merhaba, bu gerçeği belirlemek için bir test metin

Export to PDF using iText and Flying Saucer

In my previous post, I attempted to generate PDFs on the fly using the iText library. My goal was to convert an HTML snippet into PDF. Unfortunately, as I discovered, iText alone is not powerful enough as an HTML parser, and it is not flexible enough when it comes to CSS. That’s understandable, since iText’s main job is PDF generation, not HTML parsing.

While trying to work around iText’s limitations, I came across the Flying Saucer Java library. Flying Saucer is an XML/XHTML/CSS 2.1 renderer that uses iText and can render XHTML with CSS stylesheets, either static or generated, directly to PDF.
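Because Flying Saucer works on XML, the HTML snippet of a blog entry has to end up as a well-formed XHTML document before it can be rendered. Here is a minimal sketch of that preparation step using only the standard library. The class XhtmlWrapper, its method names, and the font family name in the usage note are assumptions for illustration, not the plugin's real code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: prepares a blog entry body for an XHTML renderer.
final class XhtmlWrapper {
    private XhtmlWrapper() {}

    /** Wraps an XHTML body fragment in a minimal document with an inline stylesheet. */
    static String wrap(String title, String css, String bodyFragment) {
        return "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>" + escape(title) + "</title>"
                + "<style type=\"text/css\">" + escape(css) + "</style></head>"
                + "<body>" + bodyFragment + "</body></html>";
    }

    /** Returns true if the given string parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler ignores warnings quietly and rethrows fatal errors.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return false;
        }
    }

    // Escapes the characters that are not allowed as-is in XML text content.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

A call such as XhtmlWrapper.wrap("My post", "body { font-family: \"Bitstream Cyberbit\"; }", entryBody) yields a string that can be checked with isWellFormed. If I read the Flying Saucer API correctly, the string can then be handed to ITextRenderer through setDocumentFromString, followed by layout() and createPDF(OutputStream). A body containing an unclosed tag such as <br> fails the check and would need cleaning up first.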

I have to say that Flying Saucer does a beautiful job. You can check this by exporting the current post to PDF :)

Joshua Marinacci, the Flying Saucer project lead, wrote a nice tutorial that explains how to generate PDFs using Flying Saucer.

Export to PDF Using iText Java-PDF Library

I had some time this weekend, so I used iText, a free Java PDF library, to write a plugin for the Pebble blogging software. The plugin exports blog entries to a PDF document.

I liked this library, except for one thing: converting HTML snippets to PDF. The library does allow you to set styles for HTML tags during export.

The conversion is done with the help of the HTMLWorker class. It is also possible to assign different styles to the tags supported by HTMLWorker:

ol ul li a pre font span br p div body table td th tr i b u sub sup em
strong s strike h1 h2 h3 h4 h5 h6 img
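Since only these tags are handled, it can help to scan a blog entry for anything outside the list before converting it. The sketch below does a rough regex-based scan using only the standard library. The class TagChecker is hypothetical, and the pattern is a simplification that does not parse HTML properly (a stray "<" followed by a letter in plain text would be mistaken for a tag).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports tags that are not in the supported list above.
final class TagChecker {
    private TagChecker() {}

    // Tag names copied from the list above.
    private static final Set<String> SUPPORTED = new LinkedHashSet<>(Arrays.asList(
            "ol", "ul", "li", "a", "pre", "font", "span", "br", "p", "div", "body",
            "table", "td", "th", "tr", "i", "b", "u", "sub", "sup", "em",
            "strong", "s", "strike", "h1", "h2", "h3", "h4", "h5", "h6", "img"));

    // Captures the name of an opening or closing tag, e.g. "p" in "<p" or "</p".
    private static final Pattern TAG = Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)");

    /** Returns the lower-cased names of unsupported tags, in order of first appearance. */
    static Set<String> unsupportedTags(String html) {
        Set<String> unsupported = new LinkedHashSet<>();
        Matcher matcher = TAG.matcher(html);
        while (matcher.find()) {
            String name = matcher.group(1).toLowerCase(Locale.ROOT);
            if (!SUPPORTED.contains(name)) {
                unsupported.add(name);
            }
        }
        return unsupported;
    }
}
```

An empty result means every tag in the entry appears in the list. It does not guarantee that the output will look right, only that no tag is outside the supported set.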

Unfortunately, there isn’t much documentation on what you can do with styles. Even after poking through the source code and going through the iText mailing lists for examples, my results were a bit disappointing.

The PDF export works fine, except when a blog entry contains images. In that case, the images are exported to the PDF with text overlaid on top of them.

I am hoping that some of the people who have done a lot of work with iText in the past will be able to share their experience.

Recent update:
In my later post, I talk about the Flying Saucer Java library, an XML/XHTML/CSS 2.1 renderer that uses iText and can render XHTML with CSS stylesheets, either static or generated, directly to PDF.