Tuesday, December 9, 2008

The Part Where The Java Developer Uses PHP

Any illusions I may have had about my PHP prowess were all but shattered over the past couple of days. What initially seemed like a fantastically successful migration disintegrated into a day of vigorous LAMP-stration as I struggled with what seemed like a virtually impossible issue. Almost everything was working - all read operations were just fine, and many inserts and updates were working as well. After about half an hour of poking around, I realized that we couldn't submit any form that included text with a single quote in it. Shouldn't be a problem, right? I know the issue! Let's fix it. Not so fast, my friend...

Here is the tale of my woe:

1) My first thought was "Google, single quotes php mysql". That's exactly what I did, and I immediately found a reference to 'magic_quotes'. After reading the PHP documentation, I was pretty sure this was it: a feature built into PHP to save unsafe programmers from the evils of SQL injection by escaping special characters that are submitted as part of a form. That certainly includes single quotes. A-ha! This'll be quick. I verified that this option was turned on both on the old server and locally by running php -i to get the PHP info. Here's what I saw on both of those servers:

magic_quotes_gpc => On => On

So I assumed that when I ran php -i on the new server, I would see

magic_quotes_gpc => Off => Off

Imagine my surprise when I ran php -i | grep magic, and saw

magic_quotes_gpc => On => On

How could that be? This was a definite crossroads in my debugging process, and I chose the wrong road. I could have verified that this was actually the case, but instead what I did was this:

2) Figure out how to escape strings in PHP prior to inserting them into the database. Of course this application isn't using prepared statements - that would be too simple. The option is available, but no dice. Prepared statements would have solved everything, because the values would be handled as part of the persistence logic instead of being pasted into the SQL by hand. I played around with mysql_real_escape_string, which would work, but since this codebase has no persistence layer, I would have to change every single PHP file that writes to the database, and that just isn't practical for a product that we hope not to touch too much. So... the next option was to override the query() function in mysqli and add some logic to escape the characters that need escaping. I started trying to do this and wasn't getting anywhere, because it was pretty difficult to debug on the production server and confirm that the new query logic was even being called. This was made especially difficult by the sheer amount of cruft in the php_error_log.
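To make the single-quote problem concrete, here is a minimal sketch of the three situations described above: a raw value pasted into SQL, the same value after the backslash escaping that magic_quotes_gpc applies (the PHP manual describes it as identical to addslashes()), and a placeholder. The table and column names are made up. The application itself uses mysqli, where prepare() and bind_param() play the same role; the sketch swaps in PDO with an in-memory SQLite database purely so it can run without a MySQL server, and assumes the pdo_sqlite extension is available.

```php
<?php
// Hypothetical form value and table/column names, for illustration only.
$name = "O'Brien";

// 1. Naive concatenation: the quote inside the value ends the SQL string early.
$broken = "INSERT INTO people (name) VALUES ('" . $name . "')";
echo $broken, "\n";   // INSERT INTO people (name) VALUES ('O'Brien')

// 2. What magic_quotes_gpc does to incoming form data: the equivalent of addslashes().
$slashed = "INSERT INTO people (name) VALUES ('" . addslashes($name) . "')";
echo $slashed, "\n";  // INSERT INTO people (name) VALUES ('O\'Brien')

// 3. A placeholder: the value is bound separately and never pasted into the SQL text.
//    Assumption: PDO + in-memory SQLite stands in for mysqli + MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE people (name TEXT)');

$insert = $db->prepare('INSERT INTO people (name) VALUES (?)');
$insert->execute(array($name));

$stored = $db->query('SELECT name FROM people')->fetchColumn();
echo $stored, "\n";   // O'Brien
```

The first string is the kind of query that was failing on the new server. The second is what the old server was effectively sending, which is why nobody had noticed that the code never escaped anything itself.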

Aside: Broken Windows

The Broken Windows theory is something that I heard a lot about during my time at Blackboard. The idea is that if you clean up the trash on a sidewalk, people are much less likely to litter. If you paint over all the graffiti, people won't tag that wall. It was discussed in Freakonomics, and it is a central theme in the story of New York City's recovery from its crime epidemic of the 70s and 80s. At Blackboard, we had gotten pretty lax about what ended up in the log files. This sucks for a number of reasons. Under heavy use, it fills up the log files pretty quickly, using up disk space and CPU cycles to write the files. Under any usage level, it certainly makes it harder to see what the problems are in your software. The log is a place for things that you want to be there, like errors, warnings, and deliberate informational statements. It's not a good place for "Made it here at line 262" and "id coming in is: 12". That doesn't help anyone, and hasn't helped anyone since the person who put it in the log in the first place. My old manager was very astute in pointing this out and making sure we did something about it. We made it a point at Blackboard to edit any code that did unnecessary print statements, and it really made a difference. Furthermore, nobody wanted to be the one who left gratuitous logging in the code.

Back to the story.

Once I decided that it couldn't be a PHP code thing, and definitely wasn't a MySQL thing, I moved on to sheer despair and anger. This is probably not the best way to go about things, but it's where I was. A walk to the coffee shop with my boss led me to re-examine the php.ini file. See, I had compiled and tried to install PHP myself to include some packages that weren't installed initially, and it was a bad, bad idea. Now there are two versions of PHP on this machine. One is running under Apache for the site - it's PHP 5.1.6. The PHP I installed was 5.2.6. So I checked the PHP options again. Same thing. No way, I thought. I finally did what I should have done a while ago and read the php.ini file. Guess what:

magic_quotes_gpc = Off

How about that. I fixed that, and the problem is now solved. Voila.
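With two PHP installs on one box, the php on the command line and the PHP that Apache loads are not necessarily the same binary reading the same php.ini. A small script along these lines, loaded through the web server and also run from the command line, is one way to compare them. This is only a sketch: the file name whichphp.php and the function describe_php_runtime() are invented for illustration, while PHP_VERSION, php_sapi_name(), get_cfg_var('cfg_file_path') and ini_get() are standard PHP.

```php
<?php
// Hypothetical diagnostic page, e.g. saved as whichphp.php under the web root.
// Load it in a browser AND run it with `php whichphp.php`, then compare the two.

function describe_php_runtime()
{
    return array(
        'version'          => PHP_VERSION,
        // "cli" on the command line; something like "apache2handler" under Apache.
        'sapi'             => php_sapi_name(),
        // Path of the php.ini this process actually loaded, or false if none.
        'php.ini'          => get_cfg_var('cfg_file_path'),
        // Raw setting as this process sees it. ini_get() returns false when the
        // directive does not exist, which is the case on current PHP releases
        // where magic quotes have been removed.
        'magic_quotes_gpc' => ini_get('magic_quotes_gpc'),
    );
}

foreach (describe_php_runtime() as $name => $value) {
    echo $name, ' => ', var_export($value, true), "\n";
}
```

If the version, the php.ini path, or the magic_quotes_gpc value differ between the browser and the command line, then php -i is describing a different PHP than the one serving the site.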

The Moral of the Story

When I was doing interviews at Blackboard, I usually asked the candidates a few standard Java questions to make sure they had a basic level of competency. Then I dove into problem-solving techniques. So many people simply failed to even start any sort of critical thinking. I think that next time I am interviewing, I am going to use a version of this issue as my problem-solving question.

Lessons learned - don't install PHP when it's already there as part of the OS. If you do, make sure that you are using the right version of PHP. Once you've done that, verify that what php -i reports is actually correct by looking at php.ini. Biggest lesson learned - if your gut tells you something and you just know it's right, stick with it and follow up on it. It's usually the right path.