Fun with manually diffing Java bytecode instructions

I recently found myself writing code against a simple library that was distributed in the form of a .class file. For a few reasons (laziness being one of them), I decompiled the class file using jd-gui and just added it to the source path of my project. Decompiling binary Java code and using it in a project is fairly routine and I’ve never had a problem with it before. This time, however, I noticed slightly anomalous behavior from the code.

Swapping it out with the binary version of the class fixed the problem, which was good, but left me wondering – why did the decompiled version behave slightly differently? Decompiling Java code is usually pretty safe, and if it had messed up, I’d expect a more immediate problem like a compilation failure. Intrigued, I decided to spend a few minutes figuring out just why decompiling a class and using the source to compile with had led to subtly different behavior.

Java Bytecode

One of the interesting things about Java is that the compiler (javac) doesn’t do very much optimization at all – optimization happens at runtime in the JVM. The resulting bytecode – assuming no deliberate obfuscation steps were taken – can easily be decompiled back into Java source, and it is generally quite easy to read through manually to figure out what is going on. So, it seems like a reasonable place to start if we want to figure out why two pieces of nearly-identical code behave differently.

Bytecode Diffing

I’ll save you the tedium of poring through a few hundred lines of bytecode, and just show the interesting part. Here is the result from running javap -c on the original .class file:


108: iload 16
110: i2d
111: iload 17
113: i2d
114: ddiv
115: dstore 18

This is pretty simple – it pushes the integer in variable slot #16 onto the stack, converts that value to a double, and leaves the result on the stack. It does the same for a second integer (slot #17), then performs double division and stores the double result in variable slot #18. We can imagine the original Java code looked something like this:


double x = (double)y/(double)z;

What does it then do with x?


117: dload 5
119: dload 18
121: dcmpg
122: ifge 136

Again, fairly simple code – it is comparing two doubles and branching based on the result. We can imagine the original code looked something like this:


if(a < x){
....
}

Moving on to the code that has been decompiled (by jd-gui), recompiled with javac, then examined with javap -c, the problem is easy to find:


108: iload 16
110: iload 17
112: idiv
113: i2d
114: dstore 18

This might appear fairly similar, but there is an important difference. Here is what this block of code does: push an int (#16) onto the stack, push another int (#17) onto the stack, perform integer division (which truncates the result toward zero), convert the integer result to a double (i2d), then store the double in #18. The decompiled Java code looked something like this:


double x = y/z;

For any non-trivial piece of code, you'd have to get fairly lucky to spot this problem. For one thing, when we see this block of code, we don't actually know the types of y and z - and there could be a lot of trivial operations like this. For another, there are cases where the truncation is perfectly valid.

This small bug caused the comparison to behave incorrectly in some cases (e.g., if a = 2.0 and the value that should be 2.5 gets truncated to 2.0, the two compare as equal), which led to statistical anomalies in the output.
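To see the difference with some made-up values (y and z are the names used above; the numbers are just for illustration):


int y = 5, z = 2;
double original   = (double) y / (double) z;   // i2d, i2d, ddiv - result is 2.5
double decompiled = y / z;                     // idiv first, then i2d - result is 2.0
System.out.println(original + " vs " + decompiled);   // prints "2.5 vs 2.0"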

Decompiler Bugs

So, this is obviously a pretty simple decompiler bug. How did it happen? Well, remember when I said that javac creates bytecode that pretty closely mirrors the Java code? One of the small things it does is automatically insert primitive widening conversions - that is, it inserts bytecodes (such as i2d) to convert from one primitive type to another when it can guarantee that no loss of precision will occur. Converting an int to a double is one of those cases, and you can see in our recompiled example how it automatically inserted an i2d instruction.
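For instance, in a snippet like this (the variable names are mine, just for illustration), no cast appears in the source, but javac still emits the conversion:


int count = 7;
double d = count;   // javac inserts an i2d instruction even though the source has no cast
long big = count;   // likewise, an implicit i2l widening conversion

The i2d at offset 113 in the recompiled bytecode is exactly this kind of compiler-inserted conversion - it widens the already-truncated idiv result before the dstore.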

My guess is that some decompilers assume that all widening conversions (such as i2d) were automatically inserted by javac and can safely be elided from the decompiled source - probably to reduce noise. However, it is quite clear that not all widening conversions are safe to ignore - naively, it seems like there should be a set of rules for determining when it is safe to drop them and when it isn't, but I'm not convinced you could ever be 100% correct.

Conclusion

While there probably isn't a ton of useful technical information in this post, I had a lot of fun tracking the problem down - chasing weird, seemingly impossible problems can be an enjoyable experience, and having some notion of what bytecode is and how it works comes in handy occasionally. It has, however, made me slightly more careful about using Java source code that has been decompiled from a class file.

No, In-App-Purchases are not a good alternative to paid upgrades

Recently, there has been a lot of discussion about making app development more sustainable – these discussions were intensified by Sparrow’s acquisition by Google last week. This acquisition was bad for users, since it meant there would be no further development on the Sparrow Mail app. One theory being floated is that if app development were more sustainably profitable, fewer teams would be tempted to sell in situations that would harm the long-term future of the app.

One idea for improving the long-term profitability of an app – and thus the amount of effort that goes into continually improving it and adding new features – has been the idea of upgrades. On HackerNews and other forums, however, many folks have claimed that such an option already exists – “upgrades” via the In-App-Purchase (IAP) functionality that exists on all stores today. I wanted to share why I think this is a really poor approach – for both the user and the app developer – and why upgrades are a good way to financially motivate developers to deliver useful, significant upgrades to their existing applications.

The Idea

Proponents of IAP as an upgrade argue that developers should continue to develop the core of an application and sell new features as IAP items. The argument is that users only pay when they derive new value from the application, and only purchase the functionality they want. Admittedly, there is probably some subset of users who would enjoy customizing their software to the max and saving a few bucks in the process, but I think the vast majority of users would find this to be a nuisance.

Nickel-And-Dime Your Users

The main problem with IAP-as-an-upgrade is the fact that your users are going to feel like they are getting nickel-and-dimed to death. Imagine this scenario: You see someone using a cool app and think “Hey, that would be useful to me!” You purchase the app, only to find that it only does about half the things you saw the other person’s app doing – to get the same functionality they had, you are forced to purchase a bunch of “upgrades”, possibly doubling the price of the app in the process. This is not going to delight your users – it will instead confuse them as they try to figure out what combination of features they need to purchase to actually use your app in a way that is meaningful to them.

The other problem with this approach is that it misses one benefit of paid upgrades – usually, the upgrade price is lower than the full price, as a reward to your existing customers. That can’t be accomplished with IAP.

Create headaches for yourself

IAP-as-an-upgrade also creates headaches for developers. Improvements to your application come in many forms – new functionality, improvements to existing functionality, performance work, UI polish, and more. Rarely do you come up with a single feature that is worth paying for by itself. This approach forces you to think in terms of individually sellable items, which gets really complicated when there are feature dependencies. It encourages you to focus on discrete functionality that can be sold, rather than potentially big general improvements to the application. Some proponents have even suggested including multiple code bases in a single app – even if Apple or the other app store maintainers would allow that, I cringe thinking about how ugly a solution it would be.

Upgrades are a good feature

Upgrades have a lot of advantages:
 

  • They align your interests with those of your users – Users want applications they depend on to be maintained and improved for long periods of time, and you want the ability to derive a stable long-term income from that work.
  • By making the upgrade price less than the full price, you reward your existing users.
  • Application-specific settings are maintained in the event of an upgrade.
  • Users don’t end up with a mess of different versions of your application, as happens when developers decide to simply sell a new product as a new version.
  • Upgrades are a simple, well-understood system for delivering new versions of software.

Drawbacks

The main problem with upgrades is transaction fees. You would expect an upgrade to cost some fraction of the original price of the application, but with many applications charging $0.99, that doesn’t leave a lot of room for discounting – Apple’s fixed cost to run a transaction is rumored to be in the neighborhood of $0.15, which doesn’t leave them much profit at price points below $0.99 given their 30% cut of the retail price (on a $0.49 upgrade, for example, their 30% is roughly $0.15 – almost entirely eaten by the transaction cost). I think there are a few ways to deal with this:
 

  • $0.99 app upgrades could be discounted less – say, the minimum cost of the upgrade is $0.75.
  • Only apps $1.99 and up are eligible for upgrades.
  • Apple takes a bigger cut of upgrade fees.

IAP is a great feature and is useful in a lot of scenarios, but it doesn’t obviate the need for upgrade functionality. As a user and a (really small-time) app developer, I hope app store maintainers seriously consider offering upgrade functionality in the future, and I hope even more that we don’t see a bunch of developers trying to implement them as IAPs.

Graphical HTTP Client 1.0.7 released

Graphical HTTP Client 1.0.7 was approved on the App Store this week. Here are some of the changes that came with it:

1. Save to File… functionality fixed

2. Support for PATCH requests

3. New cookie functionality

This included a bunch of changes, so I’ll talk a little more about it. In previous versions, cookies were automatically added to the request from the persistent cookie store on your machine. While this was desirable in most cases, it wasn’t very useful when you were trying to test cookie-related functionality. I’ve tried to make changes that preserve the automatic behavior when it is useful to you, while making it possible to customize. Here are the changes:

a. If you add a ‘Cookie’ header, the use of the persistent cookie store is disabled.

b. You can also click on the ‘Cookies’ button to pull up this dialog:

[Screenshot of the Cookies dialog]

This lets you see all the cookies for a given URL/path that would be sent with the request. You can do the following here:

a. Turn off use of the persistent cookie store for this request.

b. Add cookies to or delete them from the persistent cookie store.

c. Manually select cookies you want to include in the request, and then click ‘Use selected cookies as request header’ to have them put into a Cookie header, which is automatically added to your request.

Hopefully this remains easy to use, but allows more flexibility with how you send cookies. If you have any problems or suggestions to make it easier to use, please let me know.

Next Version

I’ve already made a few small changes for 1.0.8, which should be released later this month. A small but useful change will be to use UTF-8 for decoding response bodies when the response doesn’t declare an explicit encoding. This should make things a lot better for users who work with services that return non-Latin text without specifying an encoding in the response.
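The idea is simple – and to be clear, this is just an illustrative sketch in Java, not the app’s actual code: look for a charset parameter on the Content-Type header and fall back to UTF-8 when there isn’t one.


import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResponseDecoding {
    // Pull the charset out of a Content-Type header, falling back to UTF-8 when none is declared.
    static String decodeBody(byte[] body, String contentType) {
        String charset = "UTF-8";
        if (contentType != null) {
            Matcher m = Pattern.compile("charset=\"?([^;\\s\"]+)", Pattern.CASE_INSENSITIVE).matcher(contentType);
            if (m.find()) {
                charset = m.group(1);
            }
        }
        return new String(body, Charset.forName(charset));
    }
}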

Windows Version

I’ve had a handful of requests for a Windows version over the past year or so, and have been considering it a little more seriously lately. If you or someone you work with is interested, please fill out this short survey so I can get a better sense of the level of interest.

State of Authentication

I’ve been working on some new websites the past month or so, and one thing that has me second-guessing myself and generally wasting a bunch of time is authentication. Ever since I read Jeff Atwood’s post on this issue, I’ve been trying to figure out the ideal authentication setup for websites. Currently, I see 4 options:

Classic username/password

Pros:

  • Fairly straightforward to implement.
  • Everyone is used to it.

Cons:

  • A fair amount of work (e.g., a signup page, login page, password reset page, session handling, etc.)
  • Forces your users to remember (or more likely, re-use) a password.
  • You put your users at risk if you don’t use good practices to manage passwords (a rough sketch of the usual salted-hash approach follows this list).
  • Since most users re-use login info across multiple sites, if they get compromised on one of those sites, an unauthorized user can access their account on your site and cause problems.
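As mentioned in the list above, the usual mitigation for the password-storage risk is to store a salted, deliberately slow hash instead of the password itself. Here is a minimal sketch in Java, assuming PBKDF2 fits your needs (the class name and parameter choices are just for illustration):


import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

class PasswordHashing {
    // Derive a slow, salted hash of the password - store the salt and hash, never the password itself.
    static byte[] hash(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 10000, 256);   // 10,000 iterations, 256-bit output
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1").generateSecret(spec).getEncoded();
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);                              // unique random salt per user
        byte[] stored = hash("example-password".toCharArray(), salt);
        System.out.println("Stored hash is " + stored.length + " bytes"); // 32 bytes for a 256-bit hash
    }
}

This doesn't eliminate the other cons, but it at least limits the damage if your user database is ever leaked.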

Facebook Authentication

Pros:

  • Easy to implement.
  • Easier for your users to gain access to your website.
  • A very large percentage of the internet has a Facebook account.
  • No risk of compromising your users.

Cons:

  • Not everyone has a Facebook account.
  • Those who do might not want to associate it with sites they use.
  • Frighteningly large numbers of Facebook accounts are compromised daily, meaning you can still face unauthorized use of your site.

Basic OpenID

Pros:

  • Doesn’t require your users to sign up on your site.
  • Doesn’t compromise users if your database is compromised.

Cons:

  • Slightly more difficult to implement.
  • Can be unfriendly to non-technical users.

JanRain Engage (or similar service)

Pros:

  • Very easy to implement (I’ve implemented JanRain in half a dozen languages/platforms now, and it is incredibly easy).
  • Gives your users lots of options (e.g., if you offer Google, Facebook, Twitter, Yahoo, LinkedIn, and Windows Live as options, chances are pretty high that any given user can use at least one of those).
  • Doesn’t compromise your users if your database is compromised.
  • No requirement for a user to sign up on your site, but you can still get a lot of useful data in some cases (although relying on it probably isn’t a good idea).

Cons:

  • Expensive if you create a lot of sites with low revenue per user (JanRain starts at $10/month).
  • Giving your users a lot of options is nice, but for those who have accounts with several of the providers you offer, it can be tricky to remember which one they used to sign up for which site.
  • Relying on a third-party for authentication can make your website unusable if there is a technical outage, they terminate their relationship with you, or they go out of business. There are ways to mitigate the last two, but they require a lot of work.

Conclusion

I’m still not sure of the best option when it comes to authentication. It is clear to me that requiring a unique (or pseudo-unique, since users tend to use the same login info for every site) login for each site on the internet is a broken paradigm, but the other solutions have drawbacks of their own. I used JanRain on my last 2 sites, and am generally very happy with it, but I am still fairly uneasy about relying on a third-party service to handle such a critical part of my websites.

ScalaCareers is back up

Just wanted to post a quick (and somewhat belated) note about some work I’ve been doing the past few weeks, putting my ScalaCareers.com website back up and adding some new features.

Several years ago, I built it as an experiment with Lift. Then, some months ago, someone (there’s a really good chance it could have been me…) mistakenly shut down the VPS it was running on without realizing what was on it. I tried to get the old codebase running again, but since it depended on an old snapshot version of Lift, I couldn’t fix it quickly, so I ended up deciding to rewrite it from scratch when I got some time.

This time around, I decided to build the site with Play 2.0, which was a fun experience. I’ll try to find some time this week to post some of my experiences from that.

Scala career outlook

I think the outlook for Scala-related employment is much better now than it was when I first built the site. While it may never replace Java, it certainly has enough traction that it is going to stick around and be supported for a while, and more employers are starting to use it and look for developers who are familiar with it.

New Authentication mechanism

I’ve begun to look for ways to avoid forcing users to create a username and password for my sites. Building all the functionality around that is annoying for me, and it isn’t great for users either: they have to remember yet another username and password, and they’re exposed to additional risk that a password they likely reuse across many sites could be compromised (even though I’ve always used pretty good password management practices). I’m also not a fan of forcing users to log in with Facebook, like a lot of sites do these days – not everybody has a Facebook account, and not everyone who does wants it tied to everything they do on the internet.

So, I’m experimenting with JanRain, which lets users choose from a handful of identity providers. In my case, I’m allowing OpenID, Google, LinkedIn, Twitter, Facebook, and Yahoo. I realize that still probably doesn’t cover 100% of internet users, but I hope it is close enough that it doesn’t scare people away. This isn’t perfect, but I think a long-term migration path away from people having a separate login for every site is worth some work and short-term pain. Any feedback or thoughts on this are appreciated.

Developer profiles

I know there are a number of folks in the Scala community who want to find contract or full-time Scala development work, but may not be aware of opportunities. So, I added a developer profile section where developers looking for work can post information about themselves and links that demonstrate their abilities (for example, there are sections for your GitHub and StackOverflow accounts). This is pretty basic right now, but there is some potentially interesting functionality I want to work on if it looks like people will use it.

Feedback and suggestions are welcomed. Hopefully this site adds some value to the employment situation in the Scala community. If not, at least it was a good excuse to have some fun with Scala, Play 2.0, and MongoDB.