Moving from Apache to Nginx

I’ve been having problems with excessive memory usage on the 256 MB slice I use to serve my web pages. I was using Apache and mod_php, and the Apache processes grew so large that I could only run 4 of them at once, which killed any hope of decent concurrency. I decided to switch to PHP over FastCGI and, after some research, went with nginx. The switch has now been made: page generation is faster, the server can handle greater concurrency, and memory usage is under control.

I ended up using nginx, PHP, PHP-FPM, and xcache. I’m running 8 instances of PHP and 2 nginx worker processes. I got started with the step-by-step instructions I found on someone’s blog. After the initial setup, there was still a lot of configuration to do, some of which was not trivial. Below are some notes from my experience:

  1. When setting up FastCGI processes for PHP, make sure the firewall isn't blocking access to the port the FastCGI servers are listening on. If your firewall is filtering packets to the port, you'll get a "504 Gateway Time-out" page from nginx whenever you try to access a PHP page. To open up the port, if you're using iptables, add something to your iptables startup script like this:

    iptables -A INPUT -p tcp -s localhost --dport 9000 -j ACCEPT
    
  2. Somewhere I saw an example of a redirect that strips the www from URLs, which led me to add this to the config file:

    server {
        listen          80;
        server_name     www.definingterms.com;
        rewrite         ^/(.*) http://definingterms.com/%1 permanent;
    }
    
    server {
        listen          80 default;
        server_name     definingterms.com;
        [...]
    }
    

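    I'm not sure why I thought I needed this, but it was messing up the loading of static images on this blog. In hindsight, the %1 is probably the culprit: nginx writes capture groups as $1, so the rule above redirects every www request to the literal path /%1. The right thing to do is just list multiple names in server_name for the host, and let WordPress handle the redirect: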

    server {
        listen          80 default;
        server_name     definingterms.com www.definingterms.com;
        [...]
    }
    
  3. Moving most of my sites from Apache to nginx was easy, but there was the problem of rewrite rules. A lot of the software I use (e.g. WordPress, MediaWiki, Gallery) needs rewrite rules. If you're using Apache, there's always an example .htaccess file, either in the documentation or in some user forum. However, nginx rewrite rules work differently, and you can't always find an example config, so it took some time and brain power to get them all right. Actually, this is true for all parts of the nginx configuration: the English documentation is sparse and it's sometimes hard to find examples of what you want to do.

  4. nginx doesn't have CGI support. If you've got those one or two sites that use non-PHP CGI, you'll have to set up a FastCGI instance for each language you want to support. For example, I had a copy of ViewVC running, which is written in Python. With Apache, it was just using CGI. With nginx, I'd have to constantly run a whole Python FastCGI process for the rare event when someone wants to browse ViewVC.
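For the firewall problem in note 1, a quick way to tell whether the FastCGI port is reachable at all is to try a TCP connection from the machine running nginx. Here is a minimal Python sketch of that check. The function name is made up for this example, and the defaults (127.0.0.1, port 9000) are assumed from the iptables rule above.

```python
import socket


def fastcgi_port_reachable(host="127.0.0.1", port=9000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused connection usually means nothing is listening on the port;
    a timeout usually means packets are being dropped (e.g. by a firewall).
    Both cases return False here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling fastcgi_port_reachable() with no arguments checks the port used in note 1. It only tells you that something accepts connections there, not that the listener is actually PHP.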

In the end, I think the move to nginx was worth it purely for the performance gains, since it keeps me from having to pay for a heftier VM every month. But it did take a fair amount of work, and if I'd had slightly more complex needs, it might not have been a good fit for me.

Secure Passwords Not Allowed

For quite a while now, I’ve been using a tiered password system for all of the websites where I have accounts. I knew this was bad practice, but it was easy. Recently there have been a number of stories about websites’ databases being leaked, which made me seriously consider doing something better. Password managers have never impressed me much, both because of the security issues of storing all of your passwords in a central location and because of the danger of losing the database and not being able to reconstruct it. But when I came across PasswordMaker, I liked what I saw.

Instead of storing the passwords, they’re generated on the fly using a cryptographically secure method. The software is all open source, so you know what’s going on and can reconstruct it if you need to. Finally, there’s also a very convenient browser plugin for Firefox that I can install everywhere. So, I finally bit the bullet and got away from my tiered password system by moving all of my online accounts to using passwords generated by PasswordMaker.
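To illustrate the general idea of generating passwords on the fly (this is not PasswordMaker's actual algorithm), here is a minimal Python sketch that derives a site password from a master password with HMAC-SHA256. The function name, the two character sets, and the default length are all assumptions made for this example.

```python
import hashlib
import hmac
import string

# Hypothetical character sets for this sketch, not PasswordMaker's defaults.
FULL_CHARSET = string.ascii_letters + string.digits + string.punctuation
ALNUM_CHARSET = string.ascii_letters + string.digits


def derive_password(master, site, length=12, charset=FULL_CHARSET):
    """Derive a repeatable password for `site` from `master`.

    The same inputs always give the same output, so nothing is stored.
    """
    if not 1 <= length <= 32:
        # A 256-bit digest only holds enough entropy for a limited length.
        raise ValueError("length must be between 1 and 32")
    digest = hmac.new(master.encode("utf-8"), site.encode("utf-8"),
                      hashlib.sha256).digest()
    number = int.from_bytes(digest, "big")
    chars = []
    for _ in range(length):
        # Peel off one character at a time; this has a slight modulo bias,
        # which is acceptable for a sketch.
        number, index = divmod(number, len(charset))
        chars.append(charset[index])
    return "".join(chars)
```

Passing a different charset, such as ALNUM_CHARSET, gives a password without punctuation for the same master password and site.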

If I’m going to go through all of that trouble, I might as well use a secure password, right? So I decided to use passwords with letters, numbers, and punctuation. That shouldn’t be anything special; it’s just the standard recommendation for password security. Some systems even have that as a minimum requirement.

What surprised me was that a huge number of the websites I use don’t allow passwords with punctuation: about 20%, in fact. Out of the 97 sites where I tried to update my account, 19 would not allow it. These ranged from hip web 2.0 sites like digg.com to big, corporate sites like geico.com. Here’s the full list:

So now I have to use a different set of characters for my passwords at these sites. Fortunately, PasswordMaker lets me configure different profiles for different sites, so I can set it up once and forget about it. But why should I have to do that, especially when it makes my account less secure?

It seems like it just requires more work from the website makers to restrict certain characters, and I can’t think of any good reason to do so. It might make sense to restrict passwords to ASCII characters if their system doesn’t fully support Unicode. But disallowing all punctuation just doesn’t make any sense. The programmers might be worried about allowing escape characters in passwords, but it seems like it would be just as much work to protect the system against them internally as making an additional demand on the user.
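As an illustration of handling this internally, here is a minimal Python sketch, assuming the site stores a salted hash rather than the password itself. Once the password is hashed, its characters never reach a query or a shell command, so punctuation needs no special treatment. The function names and the iteration count are assumptions for this example, not a description of how any of these sites work.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest) for a password of arbitrary characters."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Check a login attempt against the stored salt and digest."""
    _, digest = hash_password(password, salt)
    # Constant-time comparison to avoid leaking how many bytes matched.
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and digest would be stored, and both are plain bytes regardless of what the user typed.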

If we’re going to expect users to have secure passwords, we need to allow them to do so. I’d like to see the above sites change their password policies, and I want any new sites to allow long, complex passwords.

Dinosaur Remix

I like Dinosaur Comics. Since the drawings and panels are the same in every comic, I had the idea to mix and match panels from different comics to see what comes out. It turns out Ryan North had already done something similar, but I decided to implement my idea anyway. Here’s the result:

Dinosaur Remix

Dinosaur Remix lets you randomly mix together panels, but also lock in certain panels to make a comic you like. Then you can add clever alt text and save it or send it to a dinosaur-loving friend!

It’s written in Python, PHP, and JavaScript. It was my first chance to really use jQuery, which is just about as awesome as everyone says it is. All of the code is available on GitHub.

Translation Hacks

Not so long ago, the only help available when translating text from one language to another was a dictionary and a grammar book. That’s how it was when I started learning German. Now, there are a number of tools online to help you in your bilingual quest, but you have to be clever to get the most use out of them. My examples will be in German, but these techniques can be applied to most languages that have a strong online presence.

  • A Good Dictionary - This is still your first line of defense against the hostile hordes of foreign words. For German, the best is clearly LEO. The reason it’s better than your trusty Langenscheidt’s or Oxford-Duden is that it’s user-created and user-maintained, so you get idioms, slang, and current events.

  • Google for Grammar - If you don’t know which of two or three possible variants is the correct one, Google all of them and see if one has many more hits than the rest. Important here: put the phrase in quotes. This is especially good for things like which preposition to use with more common words. Let’s say you want to translate “I’m going to Chicago.” Which preposition do you use? Try searching for a similar phrase (in this case, altered for geographic relevance) and see if one stands out. Go to google.de and click the “Seiten auf Deutsch” (Pages in German) button. Then try three reasonable guesses:

    1. "gehe zu Berlin": 7 hits
    2. "gehe nach Berlin": 2,890 hits
    3. "gehe bei Berlin": 1 hit

    It looks like we have a clear winner. You can be pretty confident translating your sentence as “Ich gehe nach Chicago”. However, this isn’t always foolproof. If you had searched for "gehe in Berlin" you would have gotten 576 hits and the results wouldn’t have been quite as clear. But, after reading the first few hits, you’d have realized that it’s not what you’re looking for.

  • Wikipedia - This is especially good for technical terms. Suppose you want to talk about the famous Quicksort algorithm in German. LEO won’t help you. So, go to the English Wikipedia page for Quicksort. Then, look down on the left side of the page in the “languages” box. Click “Deutsch” (if it exists) and you’ll be taken to the equivalent German page. In this case, you find out that Quicksort is the same in English and German, so now you can (not) translate it with confidence.

  • Names - You read a foreign name and you’re not sure what gender it is. Sometimes that doesn’t matter, but if you want to talk about this person, it sure is useful to be able to use pronouns. So, to figure it out, do a Google image search, preferably at the Google site for the country, and look at the results. Example: Johannes vs. Johanna. (I read this somewhere online, but I’m not sure where. It’s especially helpful for Asian names.)

  • Machine Translation - I list this one last because it’s generally the least helpful. The two main free sites are Google Translate and Babelfish. These sites are slowly getting better, but right now they’re still of limited value. They’re good if you want to read something in a language you don’t know at all, or if it would take you a long time to do the translation yourself and you just want to get the gist of the text.

Simple Canvas Example

Here is a simple example of drawing pixels (rather than paths) to a canvas element with JavaScript. The algorithm sweeps across the canvas from left to right and, for each pixel column, draws two blue pixels: one for the top half of the circle and one for the bottom half. Because this leaves the far left and right edges of the circle sparsely drawn, it then does a second pass from top to bottom, doing the same thing row by row with red pixels. The result is a circle whose colors blend from one to the other, showing which pass drew more of each part of the circle.

It’s been tested in Firefox 2 and 3, and I think it should work in recent versions of Safari and Opera.
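The two-pass idea can be sketched independently of the canvas API. The following is a Python sketch of the pixel selection only, not the JavaScript from the example. The function name and the use of rounding to pick the nearest pixel are assumptions.

```python
import math


def circle_pixels(cx, cy, r):
    """Return (blue, red) sets of (x, y) pixels for a circle of radius r.

    blue: one pass over columns, two pixels per column (top and bottom).
    red:  one pass over rows, two pixels per row (left and right).
    """
    blue = set()
    for x in range(cx - r, cx + r + 1):
        # Vertical offset of the circle's edge in this column.
        dy = round(math.sqrt(r * r - (x - cx) ** 2))
        blue.add((x, cy - dy))
        blue.add((x, cy + dy))

    red = set()
    for y in range(cy - r, cy + r + 1):
        # Horizontal offset of the circle's edge in this row.
        dx = round(math.sqrt(r * r - (y - cy) ** 2))
        red.add((cx - dx, y))
        red.add((cx + dx, y))

    return blue, red
```

Near the left and right edges, the column pass moves several pixels vertically between neighboring columns, which is why it looks sparse there. The row pass fills in exactly those gaps, and the reverse holds near the top and bottom.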