Sunday, April 08, 2007

What's Wrong With JBed

In my last post I was fairly blunt in my dislike for JBed, Esmertec's pseudo-JVM for low-end devices such as most of LG's range (including the Chocolate).

Why Is It Different?
When you install content onto a JBed device, after the content has arrived it must go through a compilation stage where it is compiled into native machine code - it then runs natively, rather than as interpreted bytecode in a sandboxed JVM. The native code still retains the garbage collection, threading and permissions models of a MIDP/CLDC JVM, so code can in theory run just as it would traditionally. I have absolutely no idea whether this process introduces any security holes; it could be a fruitful area for a security researcher, but I currently have no experience that suggests this is a risk. Array bounds checking and the like appear to still be implemented correctly.

Esmertec claim a performance increase of "up to 20x" from this process. That suggests the underlying hardware this JVM usually runs on must be really poor, because performance is pretty typical for low-end devices with equivalent speed in the non-Java UI. I'm sure the manufacturer gains some advantage by slightly reducing the electronics cost of the device.

Why Is It Bad?
The addition of a compilation stage is not, per se, bad - in fact if it really does transparently improve performance without affecting the way the code runs, it's great. Note the qualifiers there though: the reason I dislike it is that sometimes, arbitrarily, it fails to compile the code - code which a few small source changes ago compiled fine - just like the early Symbian MontyThread bugs where a small change would kill your code for the 6600 v3.4 firmware. This sort of unpredictability has been greatly reduced from the early days, when the MIDP1 LGs would just arbitrarily fail things all over the place, but it still exists.

Unpredictable builds are a nightmare, greatly increasing QA time for every tiny change. Do you single out these devices for early and repeated testing so you can trap failures immediately and pin down exactly what change triggered the problem, or is it better to just leave them to the end and accept days of pain if the build happens to fail, tweaking every line until it works? Whenever possible I now opt for the latter as failures can come and go as they please, and I make it clear that these problem devices are troublesome and will only be supported if possible, but this doesn't help when you have to patch code already out in the wild and the problems suddenly surface from nowhere.

I'll repeat: failures at compile time have become much less frequent on more recent MIDP2 devices like the Chocolate, so this is becoming less of an issue - but the unpredictability remains in other areas. Don't try talking to Esmertec about the flaws though; you will be greeted with complete silence and a total lack of useful advice.

ZLib
The ZLib deflate/inflate compression system underpins two major parts of JavaME: the Jar file format used to package applications (which is really just a special case of zip), and the mandatory PNG file format in which almost every image you will use is encoded.

The JBed JVM as implemented on, for example, the Chocolate has problems with both of these file types (it presumably uses the same flawed inflate algorithm for both). With conventional tools you rarely hit problems, but Ken Silverman's extremely useful tools (KZip and PNGOut) use clever heuristics out on the edges of the deflate spec to improve compression. The result is deflate blocks that the normal ZLib will never produce, and which therefore presumably never appear in the test suites for Esmertec's inflate algorithms (I am assuming that they have test suites).

If you use KZip to create smaller Jars, sometimes they will fail - some files just won't decompress, so either the install will fail or some resources will not load correctly at runtime. This seems to go away if you use the conventional Jar tool, so no big problem - your users will pay a little extra for those spare kilobytes, but it won't be more than 1-2% more.
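If your build already produces a KZip-optimised Jar, one way to get a conventionally compressed variant for the problem devices is to repack it with the standard java.util.zip classes on the build machine. This is only a minimal sketch under my own assumptions: the class name JarRepacker is made up for illustration, it keeps entries in their original order, and it has not been verified against any JBed handset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical build-side helper: rewrites a zip/jar using the stock deflater. */
final class JarRepacker {
    private JarRepacker() {}

    /** Reads every entry of the given archive and re-compresses it conventionally. */
    static byte[] repack(byte[] zipBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes));
             ZipOutputStream zout = new ZipOutputStream(out)) {
            byte[] buf = new byte[4096];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // A fresh ZipEntry defaults to DEFLATED with the standard zlib encoder.
                zout.putNextEntry(new ZipEntry(entry.getName()));
                int n;
                while ((n = zin.read(buf)) != -1) {
                    zout.write(buf, 0, n);
                }
                zout.closeEntry();
            }
        }
        // Both streams are closed here, so the central directory has been written.
        return out.toByteArray();
    }
}
```

In practice just running the normal Jar tool does the same job; the sketch is only meant to show that the contents are unchanged and only the deflate encoding differs.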

If you use PNGOut you have larger problems. On many devices, including S40 MIDP2 handsets like the 6230, maybe 0.5% of files run through PNGOut will fail and you just have to resave them in a more conventional tool and they'll work again. These failures can actually be predicted at build time so it's not the end of the world - you aren't going to kill the QA team by making them check every PNG in every game.
On the Chocolate, however, you see more like 10% of the PNGOut-compressed images failing to load, for a wide range of unpredictable reasons (unpredictable in that, without the source code for their inflate algorithm, I really can't be bothered to find the reasons, which seem completely arbitrary and random). So, basically, you can't use PNGOut for images intended for the Chocolate - which means the whole device (and others on this platform) has to be treated as a special case with different resources. Because you can't predict when an image will fail, you still have to thoroughly test that every image has loaded every time the resources change, because it may be possible that a PNG created through a normal tool will also occasionally break the loader. I've enough experience with other devices to say this won't happen, but with JBed devices you just can't take the chance.
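Checking that every image still loads after each resource change can at least be automated in a test build. The following is a rough sketch, not something taken from a real project: ResourceLoader and ResourceSelfCheck are hypothetical names, and on a handset the loader would wrap javax.microedition.lcdui.Image.createImage(String). It avoids collections classes so the same code should compile for CLDC.

```java
/**
 * Hypothetical loader abstraction. On a MIDP device the implementation
 * would call javax.microedition.lcdui.Image.createImage(name).
 */
interface ResourceLoader {
    void load(String name) throws Exception;
}

/** Hypothetical self-check run once at startup of a test build. */
final class ResourceSelfCheck {
    private ResourceSelfCheck() {}

    /** Tries to load every named resource and returns the names that failed. */
    static String[] findFailures(String[] names, ResourceLoader loader) {
        boolean[] failed = new boolean[names.length];
        int failCount = 0;
        for (int i = 0; i < names.length; i++) {
            try {
                loader.load(names[i]);
            } catch (Throwable t) {
                // Catch Throwable, not just IOException: a broken decoder
                // may fail with a RuntimeException or an Error instead.
                failed[i] = true;
                failCount++;
            }
        }
        // Second pass: collect the failing names in their original order.
        String[] result = new String[failCount];
        int out = 0;
        for (int i = 0; i < names.length; i++) {
            if (failed[i]) {
                result[out++] = names[i];
            }
        }
        return result;
    }
}
```

Showing the returned names on screen in a test build gives QA one thing to look at per device rather than every PNG in every game, though it only proves that an image decoded, not that it decoded correctly.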

Not Just ZLib
I've had certain resource files refuse to load completely, even when I deliberately make sure they are not compressed at all inside the jar. They just throw an exception and you end up with nothing. It's impossible to say what causes this, but if you adjust the first few bytes they will suddenly magically load (whatever the compression level of the file). So maybe the file is accidentally colliding with the magic bytes at the front of a JBed-compiled class - because you get a similar error if you try to open a class file. Or maybe it's something completely different, because you just don't know what is happening to the contents of your jar inside this opaque compilation step.
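Since adjusting the first few bytes seems to make such files load, one possible workaround is to prepend a fixed prefix to your own data files at build time and skip it again when loading. This is a sketch under assumptions: the class name PaddedResource and the 4-byte prefix value are invented for illustration, and nothing here confirms that this particular prefix avoids whatever the JBed loader is tripping over.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: wraps resource data so it never starts with its original bytes. */
final class PaddedResource {
    // Arbitrary, made-up prefix; the only goal is to change the leading bytes.
    static final byte[] PREFIX = { 'R', 'E', 'S', '0' };

    private PaddedResource() {}

    /** Build-side: returns PREFIX followed by the payload. */
    static byte[] wrap(byte[] payload) {
        byte[] out = new byte[PREFIX.length + payload.length];
        System.arraycopy(PREFIX, 0, out, 0, PREFIX.length);
        System.arraycopy(payload, 0, out, PREFIX.length, payload.length);
        return out;
    }

    /** Device-side: checks and skips the prefix, then reads the rest of the stream. */
    static byte[] unwrap(InputStream in) throws IOException {
        for (int i = 0; i < PREFIX.length; i++) {
            int b = in.read(); // -1 at end of stream also counts as a mismatch
            if (b != (PREFIX[i] & 0xFF)) {
                throw new IOException("missing resource prefix");
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

On the device, the stream passed to unwrap would come from getResourceAsStream. This only helps for files you parse yourself; it does nothing for the compilation step or for files the platform opens directly.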

These are my key problems with the device. There are other bugs, but you accept that they exist and work around them just like with any other device on any other JVM. But it's very hard to live with completely unpredictable failure of some builds, sometimes. It annoys the hell out of your testers and your developers, and it makes managing big complicated builds a nightmare. The absence of support from Esmertec just rubs salt into the wounds, and it's not a huge consolation to know that it's the same salt Nokia et al use when refusing to acknowledge any bug in their forums, because the JBed wounds are that much bigger and more common.

6 Comments:

Blogger Guilhem said...

Hello,

I just quickly read through your two recent articles on JBed and on Java developer support by top5 OEMs (or rather the lack thereof by the Koreans)... and something strikes me:
1- you dislike JBed on LG handsets
2- you seem to somewhat like SonyEricsson's Java support

Did you know that both brands use JBed ?
Could it "just" be that LG has done a poor job at integrating JSRs and Java app user experience, whereas SEMC has done an excellent one ?

10:17 AM

 
Blogger raddedas said...

Hmmm - I have to say I don't believe SE do run on Jbed, full argument in my latest post: http://techype.blogspot.com/2007/04/sony-ericsson-in-jbed-with-esmertec.html

1:46 PM

 
Blogger Ed said...

Hi raddedas,
Thanks for the insightful blog entry. My own experience of the LG chocolate closely corresponds to what you found.
One of your comments about PNGOUT interests me though:
"0.5% of files run through PNGOut will fail ... These failures can actually be predicted at build time.."
Can you say how exactly you predict at build time?
because I am having the same problems with PNGOUT compressed images on a Motorola V3x.
(By the way I found that the /kp option fixes some problems but not all)

7:25 PM

 
Blogger Wex said...

Hi, I'm currently porting to devices with JBed (specifically Panasonic X400), and am having all the problems that you've mentioned. I'm wondering if you might be able to give any deeper details about the issues?

Such as what 'header' bytes have caused failures? And you also mention 'other bugs' towards the end of the post. Are these anything JBed-orientated?

I'm basically stuck, and searching for any sort of light at the end of the tunnel...

12:51 PM

 
Blogger mika said...

very insightful article... i share your pain with JBED and in particular LG devices.

8:59 PM

 
Blogger Brian said...

Sony-Ericsson definitely uses JBed. We just hit a pre-compilation problem in which an OutOfCodeSpace exception was returned during pre-compilation. The exception began with com.jbed...

4:52 PM

 
