Monday, June 6, 2011

The quest for the ultimate feed parser in Android

Last week we started a new project at NASA Trained Monkeys that requires parsing feeds from well-known sources such as CNN or the BBC. Piece of cake, you say to yourself: I'll do a quick search for the best pure Java RSS/Atom feed parser and get rolling in seconds. Well... not so fast. It turns out not to be that straightforward.

The problem
Android uses Dalvik as its virtual machine, and its core library is a subset of the Apache Harmony Java implementation. As a result, any software that relies on, for example, AWT or Swing is not going to run under Dalvik. That includes one of the most popular feed parsers for Java: rome.

For a full list of the J2SE packages that are not supported go here.
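
A quick way to see whether a given class is present on the runtime you are targeting is to probe for it by name. This is only a minimal sketch: ClassProbe and isAvailable are hypothetical names made up for illustration, and the class names in main are just examples of what you might check.

```java
final class ClassProbe {
    /** Returns true if the named class can be loaded by the current runtime. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only check that the class exists, do not run static initializers
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class is there but cannot be linked, which is just as unusable
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"java.lang.String", "java.awt.Color", "java.beans.Introspector"};
        for (String name : candidates) {
            System.out.println(name + " available: " + isAvailable(name));
        }
    }
}
```

Running something like this on a device or emulator before committing to a library can save some debugging later, although it only tells you about the classes you thought to ask for.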

The options
  • android-rss
It's a lightweight Android library hosted on GitHub that can easily be integrated into your project via Maven or the standard library management. The syntax is really nice and simple.

The main drawback is that it only supports RSS 2.0, so it doesn't suit our current needs.

Cool project. Unlike the other parsers, you have to register a listener to hook into the different parsing stages. It is fully customizable.

The downside: it used to be in the sandbox of the Apache Commons project, but it has been moved to the Dormant category. This means that there is no active development and that the project is unlikely to be continued anytime soon. The projects in this category have to be built from source.
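
To give an idea of what the listener style looks like in practice, here is a rough sketch built on the standard SAX parser. It is not the API of the library described above: ListenerFeedSketch and EntryListener are hypothetical names invented for this example, and it only understands RSS item, title and link elements.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ListenerFeedSketch {
    /** Hypothetical callback interface; not taken from any real library. */
    interface EntryListener {
        void onEntry(String title, String link);
    }

    /** Parses an RSS document and calls the listener once per <item>. */
    static void parse(String rssXml, final EntryListener listener) {
        DefaultHandler handler = new DefaultHandler() {
            private boolean inItem;
            private String title;
            private String link;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for the new element
                if ("item".equals(qName)) {
                    inItem = true;
                    title = null;
                    link = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (!inItem) {
                    return; // ignore channel-level elements such as the site title
                }
                if ("title".equals(qName)) {
                    title = text.toString().trim();
                } else if ("link".equals(qName)) {
                    link = text.toString().trim();
                } else if ("item".equals(qName)) {
                    inItem = false;
                    listener.onEntry(title, link); // hand the finished entry to the caller
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(rssXml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }
}
```

The caller never sees a feed object. It reacts to entries as they are found, which is what makes this style easy to customize but a bit more verbose to use.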

  • Build your own
There is a nice article covering the basics of working with XML in Android and feed parsers here. If you plan on building your own feed parser you should consider forking the android-rss project and adding support for the missing formats instead.
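
For a sense of how small the core of a home-grown parser can be, and how much it leaves out, here is a minimal sketch that pulls entry titles out of either an RSS 2.0 or an Atom document using the standard DOM API. TinyFeedParser and entryTitles are hypothetical names, and the format detection (checking only the root element name) is a simplifying assumption. Dates, encodings, namespaces and malformed feeds are all ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TinyFeedParser {
    /** Returns the entry titles of an RSS 2.0 (<item>) or Atom (<entry>) document. */
    static List<String> entryTitles(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumption: an Atom document has a <feed> root, anything else is treated as RSS
            String root = doc.getDocumentElement().getTagName();
            String entryTag = "feed".equals(root) ? "entry" : "item";

            NodeList entries = doc.getElementsByTagName(entryTag);
            List<String> titles = new ArrayList<String>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                NodeList title = entry.getElementsByTagName("title");
                // Entries without a title are reported as an empty string
                titles.add(title.getLength() > 0 ? title.item(0).getTextContent().trim() : "");
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }
}
```

Loading the whole document into memory is fine for a sketch, but for large feeds on a phone a streaming parser is probably a better fit.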

You can't use the rome feed reader as-is because it makes heavy use of java.beans classes, which are not present in Dalvik. This project is a repackaging of rome and jdom so that they work properly on Android. The only thing I don't like about rome is the lack of support for generics and all the other goodies introduced in Java 5, as it targets a lower version of the JDK.

The syntax is really straightforward:

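import java.net.URL;
import java.util.List;
// SyndFeedInput, SyndFeed, SyndEntry and XmlReader come from rome; their
// package prefix depends on the build you use, so check your jar.

URL feedUrl = new URL(url);

// Fetch the feed and let rome work out its format and encoding
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new XmlReader(feedUrl));

// rome predates generics, so getEntries() returns a raw List
List entries = feed.getEntries();
for (Object item : entries) {
    SyndEntry entry = (SyndEntry) item;
    // Do stuff with each entry
}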
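
If the raw List and the casts bother you, one possible workaround is a tiny helper that copies a raw list into a typed one. This is just a sketch and not part of rome: TypedLists and checkedCopy are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class TypedLists {
    /** Copies a raw list into a typed one, failing fast on an unexpected element type. */
    static <T> List<T> checkedCopy(List<?> raw, Class<T> type) {
        List<T> typed = new ArrayList<T>(raw.size());
        for (Object item : raw) {
            typed.add(type.cast(item)); // throws ClassCastException on a mismatch
        }
        return typed;
    }
}
```

With that in place, the loop above could start from something like List<SyndEntry> entries = TypedLists.checkedCopy(feed.getEntries(), SyndEntry.class); and drop the cast inside the loop.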

Conclusion
We are sticking to the repackaged rome option for our current project. It would be great to contribute to android-rss to make it fulfill our requirements, as it seems to be the only option that is aimed specifically at Android.

What other tools are you guys using? I'll be glad to hear about some alternatives.