New site – CashBack Optimizer

I’m pleased to announce a site I’ve been working really hard on the past month or so – CashBack Optimizer.

If you’re like me and like to maximize the cashback you get from your cashback credit cards by using the most optimal one for a given store/category, you know that keeping track of which card to use where can be a little tricky. So, I built a simple website that shows you the best card for any given category – it automatically tracks rotating bonus categories that Discover, Chase, and others use.

You can even signup (free) and create a wallet, which lists all of your cards and shows you the best card you own in every category. Finally, the site has reviews from a number of users of each card to help you learn more about any cards you may be interested in.

Cashback credit cards can be a great way to earn a little extra money every month (provided you are careful and never carry a balance), so I hope this site is useful to people.

Check it out and let me know what you think!

Books I’ve read in 2015

I read a bunch of books this year – a lot of them really good, some of them less so. Here are some of the more notable ones in a few categories, and a short paragraph about my thoughts on them. I already have a bunch of good ones lined up to read in 2016, but I’d love to hear about any you read (and liked) this year!

History

Dead Wake: The Last Crossing of the Lusitania
by Erik Larson

This was a really interesting book about the sinking of the Lusitania. It approaches the story from three angles: 1) The personal lives of many of the passengers and crew of the ill-fated vessel, 2) all of the things that had to go wrong in order for it to be in position to be sunk, and 3) how engaged Winston Churchill and the Admiralty were in getting the United States to enter World War I, and how disengaged Woodrow Wilson may have been (being more focused on a love interest). Ultimately, there were a number of missed opportunities and a myriad of different ways the Lusitania could have made its voyage safely, and the author seems to question whether the British government at least allowed – if not helped it – to be sunk. The book is gripping and well-researched. 5/5


The Bully Pulpit: Theodore Roosevelt, William Howard Taft, and the Golden Age of Journalism
by Doris Kearns Goodwin

Team of Rivals – written by the same author – is one of my favorite books, so I was excited to read Bully Pulpit this year. This book is interesting, but it is a long read that felt tedious at times. The journalism parts were interesting and were a good introduction to 20th century muckraking, but also made the book less focused. The interactions between Taft and Roosevelt were fun to read and some there are some parallels in the politics of Roosevelt fracturing the Republican Party (Tea Party, anyone?) to ultimately make this worth reading, but this 750 page (admittedly well-researched) behemoth felt unfocused and slow to me. 3/5

In the Kingdom of Ice: The Grand and Terrible Polar Voyage of the USS Jeannette
by Hampton Sides

In the Kingdom of Ice is a fascinating tale of a young and competent captain, eager to be the first person to visit the North Pole for his young country, an ill-fated voyage that is doomed to fail due to poor understanding of the geography of the time but wouldn’t have ended so disastrously if it had not been for a few small events, and the survival of men pushed to limits we can only imagine. This book definitely starts out a little slow with the backstory, but eventually pulls you in as you travel with the men of the Jeannette and eagerly read to find their fate. 4/5


The Box: How the Shipping Container Made the World Smaller and the World Economy Bigger
by Marc Levinson

I never realized how huge container ships were until I was walking near the Oakland-San Francisco bay bridge and saw one of them heading through the bay. This book does a decent job of covering what at first might seem like a dry topic, but was really an interesting dive into a somewhat hidden aspect of our daily lives. It does a good job of showing how the container became dominant, how that affected global trade, and how the system works in general. Some downsides: it got bogged down in a lot of figures that would have communicated the point much better as graphs, it ignored some technical details of how things worked, and it introduces a central character (McLean), but doesn’t get enough detail of his life to satisfy you as a reader. Still, a really good (and relatively quick) read. 4/5

Business

Predictive Analytics, Revised and Updated: The Power to Predict Who Will Click, Buy, Lie, or Die
by Eric Siegel

The latter half of 2015 at my job was spent building a system for analytics, something I already have experience in but wouldn’t consider myself an expert. This book was a reasonably good introduction into the types of things analytics can tell us or help us figure out. It doesn’t really get into the “how” very much or give you really any direction as to where you should turn to find out more about the “how”, but it does at least whet your appetite a bit and give you some ideas. It is a good book for a really high-level look at analytics. 3/5


The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
by Ben Horowitz

This is largely based on a bunch of blog posts by Ben, and the organization suffers from it. It still, however, is a good book that offers some valuable insights into running a startup and leadership in general. Ben is pretty candid about when and how he screwed up, and is open about the hard parts of running a business. It is also a fun look back at the early days of Netscape and the internet in general, which I’m always a sucker for. The rap lyrics to start every chapter were a bit weird. 4/5

Technology

Countdown to Zero Day: Stuxnet and the Launch of the World’s First Digital Weapon
by Kim Zetter

Reverse-engineering, Capture-the-Flag competitions, and computer security in general are big hobbies of mine, so I was really excited to read this book. It covers the Stuxnet virus, which was a clever hack (all but certainly built by the United States and Israel) to cause Iran’s nuclear fuel production equipment to fail, causing setbacks to the program. This is a well-researched book that was interesting enough to keep me up late several nights trying to get to the end of it. This also raises a number of questions – how safe is our infrastructure from attacks like this? What are the ethics of virtual weapons? Highly recommended. 5/5


Turing’s Cathedral: The Origins of the Digital Universe
by George Dyson

The title is somewhat misleading – this isn’t a book about Alan Turing, but rather John von Neumann and the development of the computer in the first few decades after World War II. The history in this thoroughly researched book is amazing, and I really enjoyed learning more about von Neumann and other pioneers of the computing industry. It is, however, somewhat of a dry read. Software developers will probably still find the many stories about the early days of our profession to be fascinating. I’d recommend this book to computer enthusiasts and developers, but probably people who don’t have an inherent interest in the field. 3/5

Hadoop: The Definitive Guide
by Tom White

Hadoop, Sqoop, Spark, YARN, HDFS, HBase, Pig, ZooKeeper… So many weird names, so many things to learn. As someone who has been trying to come up to speed on the Hadoop ecosystem this year, this has been an incredible resource. It isn’t something you are likely to read straight-through (except for the first handful of chapters), but it has been an invaluable resource for the core parts of the Hadoop ecosystem and a good high-level overview of other parts of it. 5/5

Fiction

All the Light We Cannot See
by Anthony Doerr

I don’t read enough fiction, but my wife got me this book that I’d had my eye on for a while. It is about two young children on opposite sides of World War II and how their world’s intersect. One is a blind French girl, the other is a curious and gifted, but naive German boy. The book is beautifully written, if not a little wordy at times. It is a good story and the author is talented, but I found the back-and-forth short chapters to be jarring. There were some historical inaccuracies that were distracting, and the ending felt rushed and, frankly, unsatisfying. Still a good read, and it has won a Pulitzer Prize. 4/5

Data Twister 1.1 released

Data Twister 1.1 has been released to the Mac App Store. It has a few small changes:

1) Fixed an issue where the input/output boxes didn’t scroll correctly in some cases when the text overflowed.

2) Added the ability to load a file to use as the input. This makes it handy for importing large amounts of data and also doing quick hash checks on files.

Get it here.

Data Twister 1.0 released

I’m happy to announce the release of my latest developer-focused Mac OS App – Data Twister. Data Twister is a small, but handy, utility for converting data between various representations. You can input data in plain text (UTF-8, or ASCII), Base64, Hex, and others. You can decrypt/encrypt (only AES in ECB mode is supported right now, but I’m working on others) the data, and then output it in various ways (UTF-8, Base64, hashes like MD5 and SHA, etc). Here is a sample screen:

dt-ss2

 

 

 

 

 

There are lots of web-based tools to do each of these conversions, but having them all in one simple app has been a real timesaver for me as I’ve been using it. It is available today on the Mac App Store for $3.99. Some future enhancements I’m already at work on:

  • More encryption algorithms and modes
  • More input/output types (URL encoded, different text encodings, etc)
  • The ability to have file input/output

I hope you’ll check it out here and let me know what features will be useful for you!

Fun with manually diffing Java bytecode instructions

I recently found myself writing code against a simple library that was distributed in the form of a .class file. For a few reasons (laziness being one of them), I decompiled the class file using jd-gui and just added it to the source path of my project. Decompiling binary Java code and using it in a project is fairly routine and I’ve never had a problem with it before. This time, however, I noticed slightly anomalous behavior from the code.

Swapping it out with the binary version of the class fixed the problem, which was good, but left me wondering – why did the decompiled version behave slightly differently? Decompiling Java code is usually pretty safe, and if it had messed up, I’d expect a more immediate problem like a compilation failure. Intrigued, I decided to spend a few minutes figuring out just why decompiling a class and using the source to compile with had led to subtly different behavior.

Java Bytecode

One of the interesting things about Java is that the compiler (javac) doesn’t do very much optimization at all – code optimization occurs at runtime. The resulting bytecode – if deliberate obfuscation steps aren’t performed – can be easily decompiled and reassembled into Java code, and it is generally quite easy to manually read through to figure out what is going on. So, it seems like a reasonable place to start if we want to figure out why two pieces of nearly-identical code are behaving differently.

Bytecode Diffing

I’ll save you the tedium of poring through a few hundred lines of bytecode, and just show the interesting part. Here is the result from running javap -c on the original .class file:


108: iload 16
110: i2d
111: iload 17
113: i2d
114: ddiv
115: dstore 18

This is pretty simple – it takes the integer located in variable slot #16, putting it on the stack, then converting that value to a double, and placing it back on the stack. It does so for a second integer, then does double division and assigns the double result to variable slot #18. We can imagine the original Java code looked something like this:


double x = (double)y/(double)z;

What does it then do with x?


117: dload 5
119: dload 18
121: dcmpg
122: ifge 136

Again, fairly simple code – it is comparing two doubles and branching based on the result. We can imagine the original code looked something like this:


if(a < x){ .... }

Moving on to the code that has been decompiled (by jd-gui), recompiled with javac, then examined with javap -c, the problem is easy to find:


108: iload 16
110: iload 17
112: idiv
113: i2d
114: dstore 18

This might appear to be fairly similar, but there is an important difference. Here is what this block of code does: Push an int (#16) on to the stack. Push another int (#17) onto the stack. Perform integer division (which rounds the result to the integer closest to 0). Convert the integer to a double (i2d), then store the double in #18. The decompiled Java code looked something like this:


double x = y/z;

For any non-trivial piece of code, you'd have to get fairly lucky to spot this problem. For one, when we see this block of code, we don't actually know the types of y and z - and there could be a lot of trivial operations like this. Second, there are cases where rounding is perfectly valid.

This small bug led to the comparison behaving incorrectly in some cases (ie, if a = 2.0 and we compare what should be 2.5, they'll actually be equal due to rounding), which led to statistical anomalies in the output.

Decompiler Bugs

So, this is obviously a pretty simple decompiler bug. How did it happen? Well, remember when I told you that javac created bytecode that pretty closely mirrored the Java code? One of the small things it does is automatically insert primitive widening conversions - that is, it inserts bytecodes (such as i2d) to convert from one primitive type to another when it can guarantee that no loss of precision will occur. Integers to doubles are one of these cases, and you can see in our decompiled example how it automatically inserted an i2d call.

My guess is that some decompilers assume that all widening conversions (such as i2d) are automatically inserted by javac and can be safely elided from the decompiled code - probably to reduce the amount of noise in the code. However, it is quite clear that not all widening conversions are safe to ignore - thinking about it naively, it seems like there would be a set of rules you could follow to determine when it would be safe to ignore them and when it isn't, but I'm not convinced you could ever be 100% correct.

Conclusion

While there probably isn't a ton of useful technical information in this post, I had a lot of fun tracking the problem down - tracking down weird, seemingly impossible problems can be an enjoyable experience, and having some notion of what bytecode is and how it works can come in handy occasionally. This has, however, made me slightly more careful when using Java source code that has been decompiled from its class file format.