Generate PDF with Flying Saucer and iText by Adding Font Files

In this post, I would like to show how to load font files and format specific sections of a document at run time when generating a PDF document with the Flying Saucer and iText libraries.

I will create a String that has the structure of a valid XHTML file. The content of this String is what will be rendered as a PDF document, since Flying Saucer knows how to render XHTML. Before generating the PDF, I will load a font file for later formatting. Once the font file is loaded, I will retrieve its font family name and apply it as a formatting style to a selected paragraph of the PDF document.

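import java.io.*;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import com.lowagie.text.pdf.*;
import org.xhtmlrenderer.pdf.*;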

public class TestFont {

	public static void main(String[] args) {
	try  {
		ITextRenderer renderer =  new ITextRenderer();
		File fontDir = new File(SOME_ABSOLUTE_PATH_TO_YOUR_FONT_DIR);

		//Build valid XHTML source for parsing
		StringBuffer buf = new StringBuffer();
		buf.append("<html>");
		buf.append("<head>");
		buf.append("</head>");
		buf.append("<body>");

		String body = "This is formatted paragraph";

		//Gets TTF or OTF font file from
		//the font directory
		if (fontDir.isDirectory()) {

			//Only add fonts with specific extensions
			File[] files = fontDir.listFiles( new FilenameFilter() {
			public boolean accept(File dir, String name) {
				String lower = name.toLowerCase();
				//Load TTF or OTF files
				return lower.endsWith(".otf") || lower.endsWith(".ttf");
				}
			});

				//listFiles() returns null if the directory cannot be read
				if (files != null && files.length > 0) {
					String fontFamilyName = "";
					//You should always embed TrueType fonts.
					renderer.getFontResolver().addFont(files[0].getAbsolutePath(),
BaseFont.IDENTITY_H, BaseFont.EMBEDDED);

					//Get font family name from the BaseFont object.
					//All this work just to get font family name
					BaseFont font = BaseFont.createFont(files[0].getAbsolutePath(),
BaseFont.IDENTITY_H , BaseFont.NOT_EMBEDDED);
					fontFamilyName = TrueTypeUtil.
					getFamilyName(font);

					if (!fontFamilyName.equals("")) {
					//Wrap DIV with font family name around the content
						body = "<div style=\"font-family: " + fontFamilyName + ";\">" + body + "</div>";
					}
				}
		}

		buf.append("<p>This paragraph is unformatted</p>");
		buf.append("<p>" + body + "</p>");
		buf.append("</body>");
		buf.append("</html>");

		byte[] bytes = buf.toString().getBytes("UTF-8");

		ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
		DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
		InputSource is = new InputSource(bais);
		Document doc = builder.parse(is);

		renderer.setDocument(doc, null);
		renderer.layout();

		String filename = "document.pdf";
		BufferedOutputStream bufferedOutput = new BufferedOutputStream(new
FileOutputStream(filename));

		renderer.createPDF(bufferedOutput);
		bufferedOutput.flush();
		bufferedOutput.close();
		}
		catch (Exception e)
		{
			System.out.println(e.getMessage());
		}
	}
}

I hope the source code makes it clear how to add fonts and retrieve the font family name.
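One detail in the listing above that is easy to get wrong is the quoting of the style attribute. As a rough sketch only, the wrapping step could be pulled out into a small helper. The class name FontStyleWrapper, the method wrapWithFontFamily and the sample family name are hypothetical and are not part of Flying Saucer or iText. I am also assuming that a family name quoted in single quotes (standard CSS) is acceptable to the renderer.

```java
/** Hypothetical helper, not part of Flying Saucer or iText. */
final class FontStyleWrapper {

    /** Wraps the body markup in a DIV styled with the given font family name. */
    static String wrapWithFontFamily(String body, String fontFamilyName) {
        if (fontFamilyName == null) {
            return body; // no family known: leave the markup untouched
        }
        // Drop characters that could break out of the CSS string or the attribute
        String safeName = fontFamilyName.replaceAll("[\"'<>&\\\\]", "").trim();
        if (safeName.isEmpty()) {
            return body;
        }
        // Single quotes delimit the CSS string, double quotes delimit the attribute
        return "<div style=\"font-family: '" + safeName + "';\">" + body + "</div>";
    }

    public static void main(String[] args) {
        // "Bitstream Cyberbit" is only a sample value for illustration
        System.out.println(wrapWithFontFamily("This is formatted paragraph", "Bitstream Cyberbit"));
    }
}
```

Quoting the family name keeps names that contain spaces in one piece, and escaping the double quotes in Java keeps the string literal valid.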

Regards,
Alex

Redefining Web Applications with AJAX, Servlets and JSON

In this article I would like to show how JSON (JavaScript Object Notation) and a Java servlet can be used together in a small AJAX (Asynchronous JavaScript and XML) application.

A brief description for those who are not closely familiar with JSON:

JSON is a lightweight syntax for representing data, which makes it more pleasant to work with than XML and can make AJAX applications faster. Also, when working with JSON, there is no need for XML parsing on the client.

In the following example, I am going to create a callback servlet that fetches and parses an RSS feed. The parsed feed data is then passed to the client side in the form of JSON, where it is formatted and presented to the user. The client uses an AJAX call to query the servlet.

For this application, I used three third-party libraries:

  1. The JSON library provided by JSON.org and extended by JSON-RPC-Java, which makes it easy to create and parse JSON data in Java code. This library can run in a Servlet container such as Tomcat, JBoss and other J2EE application servers.
  2. Project ROME
    ROME is a set of open source Java tools for parsing, generating and publishing RSS and Atom feeds.
  3. JDOM XML parser
    JDOM is a Java-based “document object model” for XML files. JDOM serves the same purpose as DOM, but is easier to use.

The libraries are included in the source code which accompanies this article. This application example is also included as a WAR archive, ready to be deployed on Tomcat.

The following is my servlet implementation. The servlet fetches and parses the feed data. The JSON library mentioned previously lets me easily create and populate a JSON object.

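import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;

import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.json.JSONArray;
import org.json.JSONObject;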

import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.fetcher.FeedFetcher;
import com.sun.syndication.fetcher.FetcherException;
import com.sun.syndication.fetcher.impl.FeedFetcherCache;
import com.sun.syndication.fetcher.impl.HashMapFeedInfoCache;
import com.sun.syndication.fetcher.impl.HttpURLFeedFetcher;
import com.sun.syndication.io.FeedException;

/**
* @author Alexander Zagniotov (http://javabeans.asia)
*/
public class JsonServlet extends HttpServlet {

	private static final long serialVersionUID = 1L;
	private static final String BLOG_URL = "http://javabeans.asia/rss.xml";
	private static final String CONTENT_TYPE = "application/json";
	private FeedFetcherCache feedInfoCache = null;
	private FeedFetcher feedFetcher = null;

	public void init(ServletConfig config) throws ServletException {
		super.init(config);
		feedInfoCache = HashMapFeedInfoCache.getInstance();
		feedFetcher = new HttpURLFeedFetcher(feedInfoCache);
	}

	public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {

		SyndFeed feed = this.fetchFeed(BLOG_URL);
		if (feed != null) {
			String json = this.feedToJSON(feed);
			response.setContentType(CONTENT_TYPE);
			response.setHeader("Cache-Control", "no-cache");
			response.getWriter().write(json);
		}
	}

	private SyndFeed fetchFeed(String url) {
		SyndFeed feed = null;
		try {
			//Fetch the URL passed in, not the constant
			feed = feedFetcher.retrieveFeed(new URL(url));
		} catch (IllegalArgumentException e) {
			e.printStackTrace();
		} catch (FeedException e) {
			e.printStackTrace();
		} catch (FetcherException e) {
			e.printStackTrace();
		} catch (MalformedURLException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		return feed;
	}

	private String feedToJSON(SyndFeed feed) {

		JSONObject jsonObj = new JSONObject();
		JSONArray jsonEntryTitles = new JSONArray();
		jsonObj.put("blogtitle", feed.getTitle());
		jsonObj.put("blogdescription", feed.getDescription());
		jsonObj.put("bloglanguage", feed.getLanguage());
		jsonObj.put("bloglink", feed.getLink());
		jsonObj.put("author", feed.getCopyright());

		List<?> feedEntries = feed.getEntries();

		for (Object c : feedEntries) {
			SyndEntry syndicateEntry = (SyndEntry) c;
			jsonEntryTitles.put(syndicateEntry.getTitle());
		}

		jsonObj.put("blogentrytitles", jsonEntryTitles);
		return jsonObj.toString();
	}
}

As you can see, it is very easy to construct JSON objects and arrays on the server side and pass them to the client. For the purpose of this example I am getting my data from an RSS feed, but the data could just as well come from a database or any other source.
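To make the payload concrete, here is a minimal sketch of the kind of JSON text the servlet hands to the client. It deliberately avoids the JSON library and builds the string by hand, so the class FeedJsonSketch and its methods are hypothetical. The escaping covers only the most common cases and the sample values are made up; in the real servlet the library does this work.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the library calls; shows only the shape of the payload. */
final class FeedJsonSketch {

    /** Quotes a string and escapes the most common special characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds an object with two string members and one array of entry titles. */
    static String toJson(String title, String link, List<String> entryTitles) {
        StringBuilder sb = new StringBuilder("{");
        sb.append(quote("blogtitle")).append(':').append(quote(title)).append(',');
        sb.append(quote("bloglink")).append(':').append(quote(link)).append(',');
        sb.append(quote("blogentrytitles")).append(":[");
        for (int i = 0; i < entryTitles.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(entryTitles.get(i)));
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        // Sample values only; they do not come from a real feed
        System.out.println(toJson("My Blog", "http://example.com", Arrays.asList("First", "Second")));
    }
}
```

The member names match the ones the servlet uses, so the client can read them as jsonObject.blogtitle, jsonObject.bloglink and jsonObject.blogentrytitles.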

The following is my client implementation. The client queries the servlet with an AJAX call. When the call returns the servlet’s response in the form of a JSON object, the object data is formatted and information about the RSS feed is presented to the user:

<html>
    <head>
    <title>Java Beans dot Asia</title>

    <script language="JavaScript" type="text/javascript">

    var httpRequest = null;

    function getDescriptionAsJSON() {
        var description = document.getElementById('description');
        description.innerHTML = "Loading, please wait ...";

        var url = "http://localhost:8080/json/json";
        if (window.XMLHttpRequest) {
            httpRequest = new XMLHttpRequest();
        } else if (window.ActiveXObject) {
            httpRequest = new ActiveXObject("Microsoft.XMLHTTP");
        }

        httpRequest.open("GET", url, true);
        httpRequest.onreadystatechange = handler;
        httpRequest.send(null);
    }

    function handler() {
        if (httpRequest.readyState == 4) {
            if (httpRequest.status == 200) {
                processJSON(httpRequest.responseText);
            }
        }
    }

    function processJSON(jsonObjectString) {

        var description = document.getElementById('description');

        //Since JSON is a subset of JavaScript, I am using
        //JavaScript's own compiler to parse JSON in one line!

        var jsonObject = eval('(' + jsonObjectString + ')');
        var text = "";

        text += "Author: " + jsonObject.author + "<br />";
        text += "Blog Name: " + jsonObject.blogtitle + "<br />";
        text += "Blog URL: " + jsonObject.bloglink + "<br />";
        text += "Blog Description: " + jsonObject.blogdescription + "<br />";
        text += "Blog Language: " + jsonObject.bloglanguage + "<br />";

        description.innerHTML = text;
        var entries = "Last " + jsonObject.blogentrytitles.length + " blog entries are:\n\n";

        for (var index = 0; index < jsonObject.blogentrytitles.length; index++) {
            entries += (index + 1) + ": " + jsonObject.blogentrytitles[index] + "\n";
        }
        alert(entries);
    }

    </script>

    </head>
    <body>
        <img src="images/javabeansmugshot_120x120.jpg" border="1" /><br /><br />
        <div id="description"></div><br /><br />
        <a href="javascript:void(0)" onclick="return getDescriptionAsJSON();">Click to get description!</a>
    </body>
</html>

As you can see, JSON data can be easily parsed on the client side with the help of the JavaScript eval() function. As a reminder, JSON is a subset of JavaScript, so eval() will produce a valid object.

Keep in mind that extra care is needed when using eval. The problem is that eval will compile and execute whatever JavaScript code comes back in the response. This is a security risk if the response data comes from an untrusted source.

That’s it. I hope this example was clear and helpful :)


Please note that I have tested this example and it works fine. The source code, as mentioned previously, is included as an Eclipse project. You can simply create a new Java project from the existing Ant build.xml file.

Comments/flames are appreciated :)

Cheers

Rule Engine Stress Testing

I came across a blog by a company called Illation. They compare the performance of several rule engines available on the market: Drools, ILog, OPSJ and Jess.

The stress tests cover different aspects, for example:

  • Rules firing time
  • Data load time
  • Memory usage
  • Pre-run memory used
  • Post-run memory used

The test results are publicly available on their blog. The team also makes the business rules, object model and datasets used in the stress tests available for download, in case someone wishes to repeat the tests. Some of the results look very interesting.

Brainteaser: Overridable methods

Consider the following case of inheritance:

public class  Parent {
   public Parent()  {
	getValue();
   }
   public void getValue()  {

   }
}

public class  Child extends Parent {
   private final Integer integer;
   public Child()  {
	integer = new Integer(888);
   }

   @Override
   public void getValue()  {
	System.out.println(integer);
   }
}

Question: What would the following program print, and why?

public class  Test {
   public static void main(String[] args)  {
	Child child = new Child();
	child.getValue();
   }
}

Let’s assume that the getValue() implementation in the Child class was changed to:

@Override
public void getValue()  {
     System.out.println(integer.toString());
}

Question: What would the output of the Test class be now, and why?

Online PDF Generator

An online PDF generator is now available. This simple but useful tool lets you generate a PDF document online from HTML snippets. The tool uses the Flying Saucer library.

The PDF is generated with full compression and includes metadata in the PDF properties. Keep in mind that for successful PDF generation your HTML markup has to validate as XHTML 1.0 Transitional, which means, for example, that <img> becomes <img /> and <p> must be terminated by the matching </p>.
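If you want to check a snippet before submitting it, something along the following lines may help. This is only a sketch: it uses the XML parser that ships with the JDK and tests well-formedness only (every tag closed, properly nested, a single root element), which is weaker than validating against XHTML 1.0 Transitional. The class and method names are hypothetical, and I am not claiming the online tool performs this exact check.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical pre-check: is the markup well-formed XML? */
final class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // unclosed or badly nested tags end up here
        } catch (Exception e) {
            throw new IllegalStateException("XML parser unavailable or I/O failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>Hello <img src=\"a.png\" /></p>"));
        System.out.println(isWellFormed("<p>Hello <img src=\"a.png\"></p>"));
    }
}
```

Note that a snippet using HTML entities such as &nbsp; without a DOCTYPE would also be rejected by this check, because a plain XML parser does not know them.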

You can provide a URL to your own CSS file with styles for your HTML snippet, or put a “style” attribute on the HTML elements. If neither a URL nor a “style” attribute is specified, default system styles will be applied to your HTML snippet.

I hope this helps.

Cheers

Export Pebble Blog Entry to PDF Plugin

In one of my previous posts, I described how I implemented a plugin for the Pebble blogging software that allows blog entries to be exported to PDF.

Today, I modified the plugin to load font files at run time during PDF generation, which provides support for additional non-Latin languages. So now it is possible to export non-Latin characters from blog entries to PDF.

Below, I’ve added some content in different languages. To support non-Latin languages, I am using the font file cyberbit.ttf by Bitstream. You can test the plugin’s multilingual support by generating a PDF from the current post.

I think it comes out quite nicely. The only hitch at the moment is that Flying Saucer does not yet support right-to-left text in PDF.
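Since right-to-left text is the one case that does not render properly, it can be useful to know up front whether an entry contains any. The following is a small sketch using only the JDK. The class RtlDetector is hypothetical and is not part of the plugin; it merely shows one way such a check could be written.

```java
/** Hypothetical helper: detects characters with right-to-left directionality. */
final class RtlDetector {

    static boolean containsRightToLeft(String text) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            byte direction = Character.getDirectionality(codePoint);
            // Hebrew letters report RIGHT_TO_LEFT, Arabic letters RIGHT_TO_LEFT_ARABIC
            if (direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            i += Character.charCount(codePoint);
        }
        return false;
    }

    public static void main(String[] args) {
        // Escapes are used so the result does not depend on the source file encoding
        System.out.println(containsRightToLeft("\u05E9\u05DC\u05D5\u05DD")); // a Hebrew word
        System.out.println(containsRightToLeft("Hello"));
    }
}
```

A plugin could, for instance, use such a check to warn the reader before exporting, but that is a design idea rather than something the current plugin does.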

Japanese Kanji:
五輪代表

Japanese Hiragana:
こんにちは、これは真実を決定するためにはテストテキストです。

Japanese Katakana:
ラドクリフ、マラソン

Japanese Kokuji:
和製漢字

Chinese Simplified:
您好,这是一个测试文本,以确定事实真相

Chinese Traditional:
您好,這是一個測試文本,以確定事實真相

Korean:
안녕하세요,이 사실을 확인하는 테스트 텍스트입니다

Arabic:
مرحبا ، هذا هو اختبار لتحديد نص الحقيقة

Hebrew:
שלום, זוהי בדיקה טקסט כדי לקבוע את האמת

Russian:
Привет, это тест текста, чтобы определить истину

Greek:
Γεια σας, αυτό το κείμενο είναι μια δοκιμασία για τον προσδιορισμό της αλήθειας

Thai:
สวัสดีนี่คือการทดสอบข้อความเพื่อตรวจสอบความจริง

Vietnamese:
Xin chào, đây là một bài kiểm tra văn bản để xác định sự thật

Turkish:
Merhaba, bu gerçeği belirlemek için bir test metin

Brainteaser: Broken Case of Inheritance

Consider the following case of inheritance:

import java.util.Collection;
import java.util.HashSet;

public class ExtendingHashSet<E> extends HashSet<E>  {
   private int counter = 0;

   public ExtendingHashSet() {

   }

   @Override
   public boolean add(E e)  {
      counter++;
      return super.add(e);
   }

   @Override
   public boolean addAll(Collection<? extends E> c)  {
      counter += c.size();
      return super.addAll(c);
   }

   public int getCounter()  {
      return counter;
   }
}

Created instance:

ExtendingHashSet<String> s = new ExtendingHashSet<String>();
s.addAll(Arrays.asList("one", "two", "three"));

Question: What value would the s.getCounter() method return at this point, and why?

Looking forward to your answers, dear readers.

How to Prevent iFrame Breakaway

A few days ago I was searching for a solution to a problem I’d encountered: I needed to prevent a third-party page from breaking out of an iframe inside a page of my web application. For people who are not closely familiar with JavaScript, the following JS snippet makes it clearer how a page can break out of an iframe:

if (top.location.href != self.location.href)  {
    top.location.href = self.location.href;
}

If the current page is not the parent window – become the parent window.

I needed to implement something on my end that would block the above script, or anything similar to it, from executing. I’ve spent several hours browsing the Net, talking to people on IRC and simply experimenting by trial and error.

After some time, I understood that I won’t be able to find a solution to my problem, simply because unfortunately there is none. Having said that, I have some findings to share:

  1. There is an iframe security attribute that works only in IE. Setting it to security="restricted" prevents the framed page from breaking out. It’s always “nice” to see that MS has a few tricks up its sleeve :) . Also, on one of the forums, someone mentioned that the same attribute works under Opera as well as under IE. I haven’t tested it under Opera myself; I can only say that it works in IE and not in FF.
  2. Use the window.onbeforeunload event to prompt the user with a dialog asking whether he agrees to navigate away from the current page. If the user disagrees (clicks “cancel”), he remains on the current page, so in a sense the iframe breakaway is canceled. By the way, there is no way to suppress the dialog prompt and make “cancel” the default.
  3. Grab the content of the third-party page using the PHP cURL library and create your own placeholder page for that content. The placeholder page can then be put inside the iframe. The grabbed content will not attempt to break out, but any request submitted from the placeholder page (a hyperlink or button click in the grabbed content) will cause the page to unload.

Also, while researching, I came across this post that talks about preventing iframe breakaway and click jacking with the help of 204 header response code.
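I will not try to reproduce that post here, but for readers wondering what a 204 response looks like on the server side, here is a tiny sketch. It uses the HTTP server bundled with the JDK rather than a servlet; the class name NoContentServer, the /204 path and the port are my own assumptions. A browser that receives 204 No Content in answer to a navigation stays on the page it is already showing, which is the property such techniques rely on.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;

/** Hypothetical endpoint that always answers with 204 No Content. */
final class NoContentServer {

    static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
            server.createContext("/204", exchange -> {
                // A length of -1 tells the server that no response body follows
                exchange.sendResponseHeaders(204, -1);
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = start(8080); // the port is an arbitrary choice
        System.out.println("Serving http://127.0.0.1:" + server.getAddress().getPort() + "/204");
    }
}
```

This only demonstrates the response itself; how the parent page makes use of it is described in the linked post, not here.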

After all that, my conclusion is:
If the page inside the iframe is not yours, in other words it is hosted under another domain, it’s not actually possible to stop it from unloading. Allowing something like that would let malicious sites “trap” a user indefinitely.

I would love to hear any other suggestions regarding iframe breakout you may have, dear readers.

Cheers

Export to PDF using iText and Flying Saucer

In my previous post I attempted to generate a PDF on the fly using the iText library. My goal was to parse an HTML snippet into a PDF. Unfortunately, as I discovered, iText alone is not powerful enough as an HTML parser, and it is not flexible enough in handling CSS. That’s understandable, since iText’s main functionality is PDF generation, not HTML parsing.

While trying to find a workaround for iText’s limitations, I came across the Flying Saucer Java library. Flying Saucer is an XML/XHTML/CSS 2.1 renderer that uses iText and can render XHTML with CSS stylesheets, either static or generated, directly to PDF.

I want to say that Flying Saucer does a beautiful job. You can check this out by trying to export current post to PDF :)

Joshua Marinacci, the Flying Saucer project lead, wrote a nice tutorial that explains how to generate PDFs using Flying Saucer.