Debugging Difficult Things
One of the unfortunate realities of any application is that it requires maintenance. As technology continues to evolve and operating systems undergo improvements, applications must be updated to continue running smoothly. However, installing updates is rarely as simple as hitting the “update” button. In this post, I will walk through our recent updating and debugging adventures on a Ruby on Rails application.
We recently needed to update both Ruby and Rails in our Rivals project to deal with deprecations and with features that newer versions no longer support. The problem was that Ruby and Rails had to be updated at the same time because of the way their dependencies cross. There was no way to do one and then the other.
When you attempt an update of this complexity, there are bound to be some hiccups. In our case, the update led to a lot of interesting debugging.
Debugging is the process of removing errors from computer hardware or software. This is a critical skill for developing software. No matter what software you work on, you will inevitably find yourself in a situation where debugging is needed, and quickly.
Let’s look at a few examples of debugging situations in Ruby and Rails:
A Simple Example
The following image is a stack trace from Rails:
As you can see at the top of the notification, we have a RuntimeError with the message Totally legit bug. The top of the stack trace points us to ContentsController, line 37.
If you go to line 37 in the code, you can see that it’s raising Totally legit bug for us:
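For readers following along without the screenshots, here is a minimal sketch of the same situation in plain Ruby. The ContentsController class and its show method below are hypothetical stand-ins, not our application's code. The sketch only illustrates that calling raise with a string produces a RuntimeError and that the first backtrace frame names the line that raised it.

```ruby
# Hypothetical stand-in for the controller in the screenshot.
class ContentsController
  def show
    # Calling raise with only a String raises a RuntimeError.
    raise "Totally legit bug"
  end
end

# Runs the action and returns what a stack trace would tell us.
def describe_failure
  ContentsController.new.show
  nil
rescue StandardError => error
  {
    class_name: error.class.name,
    message: error.message,
    top_frame: error.backtrace.first # "file:line:in ..." for the raising line
  }
end

failure = describe_failure
puts "#{failure[:class_name]}: #{failure[:message]}"
puts "top of stack trace: #{failure[:top_frame]}"
```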
That’s a very simple example. Most of the errors you see in day-to-day debugging are a bit more complicated.
Deeper Down the Rabbit Hole
In the example below, we once again have a stack trace. However, unlike the previous example, this stack trace shows multiple ContentsController callouts:
In this example, how do you know where the error actually occurred?
This is the issue with debugging as it gets more complicated: It’s not always clear where the issue actually lies. In this case, if we start by looking at the very top of the stack trace, ContentsController, line 214, we discover that there’s no bug in line 214:
Line 157 is the actual cause of the error:
In this example, line 157 passes nil as the maximum number of loaded articles, so line 214 raises an error when it receives the unexpected nil value.
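The same pattern can be reproduced in a few lines. This is a sketch with hypothetical method names (build_page, load_articles) and a hypothetical settings key, not the real controller code. The caller hands over nil, the callee is where the exception surfaces, and the frame that actually needs fixing sits one level down in the backtrace.

```ruby
# Where the error surfaces (the "line 214" role).
def load_articles(articles, limit)
  last_index = limit - 1 # NoMethodError here when limit is nil
  articles[0..last_index]
end

# Where the bug actually lives (the "line 157" role).
def build_page(settings, articles)
  limit = settings[:max_loaded_articles] # nil when the key is missing
  load_articles(articles, limit)
end

# Returns the backtrace produced by a page built without the setting.
def trace_for_missing_setting
  build_page({}, %w[first second third])
  []
rescue NoMethodError => error
  error.backtrace
end

frames = trace_for_missing_setting
puts "surfaces in: #{frames[0]}"
puts "caused by:   #{frames[1]}"
```

Reading past the first frame to its callers is usually the quickest way to find out where a bad value came from.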
When the top of the stack trace isn’t directly the source of the error, it makes it harder to track down where the issue actually lies.
This is a case that I directly ran into during the Ruby and Rails upgrade:
As you can see, the erroring line is actually completely fine, and nothing else in the stack trace points to what is wrong.
Line 10 contains a single call to asset URL, and that call passes exactly one argument. Yet the error we receive complains about the argument count: wrong number of arguments (given 3, expected 1..2).
Why do we have a mismatch between what the code looks like and the error we are receiving? This is where it gets complicated.
To find the solution to this debugging mystery, I took the following steps:
Step 1. Google & Stack Overflow: I searched Google and Stack Overflow for someone who had the same issue. There’s no reason to reinvent the wheel if someone else already found a solution. However, as I discovered, if you can’t find someone else who had a similar problem, then you have to get more adventurous and strike out on your own.
Step 2. Add Debugging Code to Dependencies: Adding debugging code into the dependencies can give you a better understanding of what’s going on. This technique isn’t available in every language; it works in Ruby because installed gems are plain source files that you can open and edit. In the case above, this helped me eliminate some dependencies that weren’t the issue, but it didn’t identify the root cause of the issue.
Step 3. Downgrade Whack-A-Mole: The problem began during the upgrades, so it stood to reason that one of the upgrades may have caused the issue. I had some idea about which section of the code was problematic, so I started downgrading the gems that directly linked to that section of code.
In this case, the haml gem ended up being the issue. The upgrade introduced a bug that kept the haml gem from playing nicely with the other dependencies.
Thankfully, this haml gem wasn’t necessary for upgrading, so I was able to downgrade it and get everything working properly.
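To make Step 2 more concrete, here is a sketch of how a call site that passes one argument can still produce a "given 3" error, and how Ruby's reflection can show which definition is actually being called. Everything in it is an assumption for illustration: the Helpers and OutdatedPlugin modules, the View class, and the URL are hypothetical, and this is not a reconstruction of the actual haml bug.

```ruby
# Hypothetical stand-in for framework code: accepts 1..2 arguments.
module Helpers
  def asset_url(source, options = {})
    "https://cdn.example/#{source}"
  end
end

# Hypothetical dependency written against an older, wider signature.
module OutdatedPlugin
  def asset_url(source, options = {})
    super(source, options, :legacy_flag) # forwards 3 arguments
  end
end

class View
  include Helpers
  include OutdatedPlugin # included last, so it is found first

  def render
    asset_url("logo.png") # the call site passes one argument
  end
end

# Returns the message of the ArgumentError raised while rendering.
def render_error_message
  View.new.render
  nil
rescue ArgumentError => error
  error.message
end

puts render_error_message

# Ask Ruby which definition the call resolves to and where it lives.
resolved = View.instance_method(:asset_url)
puts "resolved in #{resolved.owner} at #{resolved.source_location.join(':')}"
puts "next in line accepts: #{View.new.method(:asset_url).super_method.parameters.inspect}"
```

Printing owner and source_location from inside a dependency is a lightweight form of the debugging code described in Step 2. It shows whether the method being called is the one you think it is.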
Let’s discuss the internals of Ruby and another error I encountered during our upgrades. The following error was thrown by the test suite, rather than the application directly:
The error reports an invalid domain: example.com. That is strange, because when you look at how the test is set up, example.com is explicitly allowed. We allow it because the test suite runs on non-application domains:
I began Googling the issue and found someone who was having the exact same issue as us. The recommended solution was to upgrade the “cgi” gem to version 0.3.6. However, we don’t use the cgi gem in the Rails application.
Ruby applications that use Bundler rely on two files to keep track of their dependencies:
- Gemfile: Lists the dependencies you explicitly declare.
- Gemfile.lock: Records the Gemfile dependencies PLUS all of the required sub-dependencies.
Our application doesn’t have any trace of “cgi” in either of these files. So how do we upgrade cgi when we’re not using cgi? It turns out that Ruby has certain “default” gems that are the core part of its install and cgi is one of them. So, while we weren’t technically depending on cgi, it was being implicitly used by our application.
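If you want to see which default gems your own Ruby install ships with, RubyGems can list them. The sketch below assumes a standard Ruby with RubyGems enabled; the default_gem_names helper is a name made up for this example. The list varies between Ruby versions, so cgi may or may not appear on yours.

```ruby
require "rubygems"

# Names of the gems that ship as part of this Ruby install.
def default_gem_names
  Gem::Specification.default_stubs.map(&:name).uniq.sort
end

names = default_gem_names
puts "This Ruby ships with #{names.size} default gems"
puts "cgi is a default gem here: #{names.include?('cgi')}"
```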
We were able to solve the error by explicitly adding “cgi” version 0.3.6 to the Gemfile, just as the other user suggested:
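In Gemfile terms, the change amounts to a single line. The excerpt below is a sketch of that line rather than a copy of our actual Gemfile, and the comments are added here for explanation.

```ruby
# Gemfile (excerpt)
# cgi ships with Ruby as a default gem. Declaring it explicitly makes
# Bundler resolve and lock this version instead of the one Ruby ships with.
gem "cgi", "0.3.6"
```

After adding the line, running bundle install records the pinned version in Gemfile.lock.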
As you can see from these examples, the debugging process sometimes goes in really weird directions, and you never know what you’re going to get. Though this project’s upgrade has been fraught with debugging issues, I hope that our solutions will prove useful to other people who encounter the same problems in the future.