The Basics to Mod_Rewrite

May 8th, 2009 admin

Mod_rewrite. It’s a term many webmasters hear with an awful sense of dread. But why? What makes this term strike fear in the hearts of webmasters everywhere? Misunderstanding. Yep, that’s it. People are afraid of what they don’t understand, and rightfully so when it comes to anything .htaccess: if you don’t know what you’re doing, you can waste hours on something that should have taken 5 minutes.

So now that I’ve scared all of you, how can I help? Well, I’m going to go through the bare-bones basics of what mod_rewrite does and why one could (and probably should) use it on a production-quality site.

Let’s begin with what it is and what people use it for. Mod_rewrite (as the prefix implies) is an Apache module. It rewrites requested URLs on the server before they are handled, which lets links to pages that don’t exist as files (say, PAGE.html) be answered by a script as if they were static pages. Let me explain. Many PHP developers like to use http://YOURDOMAIN.com/?p=PAGE, which is fine. In fact, it’s probably one of the most convenient ways to serve multiple pages from one document. However, query-string URLs like this are generally considered less friendly to search engines (and to people) than descriptive, static-looking ones, which brings us to our next point. Mod_rewrite is usually used for SEO purposes (though it obviously has many other uses): the programming side stays as http://YOURDOMAIN.com/?p=PAGE, while users (and search engines) see http://YOURDOMAIN.com/PAGE.html (or whatever extension you like).
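To make that concrete, here is a minimal sketch of the kind of rule involved. It assumes the script that answers ?p=PAGE is called index.php and that page names are made of lowercase letters, digits and hyphens; both are assumptions for illustration, so adjust them to your site.

```apache
# Sketch only: index.php and the allowed characters are assumptions.
RewriteEngine On

# A visitor asks for PAGE.html; the server quietly runs index.php?p=PAGE
RewriteRule ^([a-z0-9-]+)\.html$ index.php?p=$1 [L]
```

The address bar keeps showing PAGE.html; only the server knows about ?p=PAGE.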

Now, how does one use mod_rewrite? The code snippets I provide here will ALWAYS go in a .htaccess file on an Apache server (some other servers also read .htaccess files, so the rules may be valid on those as well). Since this is not a tutorial about the .htaccess file, I won’t go into any more detail on that. Just realize that if you’re using an FTP client, you will have to show hidden files to see the .htaccess file, because on Linux any file or folder whose name starts with a “.” is hidden unless you ask to see it.

Now that we have that clear let’s begin with the most basic thing that we need. We must turn the mod on, and to do so we write this:

RewriteEngine On

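Like I said, this turns the rewrite engine on for the directory the .htaccess file lives in (the module itself must already be loaded by the server).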
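If you’re not sure whether your host has the module loaded, a common precaution is to wrap the rewrite directives in an IfModule block. This is only a sketch of that idiom, not something the rest of this tutorial depends on:

```apache
# Only read the directives inside if mod_rewrite is actually loaded
<IfModule mod_rewrite.c>
    RewriteEngine On
    # ... your RewriteRule lines go here ...
</IfModule>
```

The trade-off: without the wrapper, a server that lacks mod_rewrite answers with a 500 error; with it, the rules are silently skipped and your friendly URLs simply won’t be found. While you’re still testing, the loud failure can actually be the more helpful one.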

Now, to do this efficiently (unless you would like to type in all the pages manually), a knowledge of regex (regular expressions) is necessary. Regex is by no means easy, but it is well worth learning if you’re serious about programming in any language. A very good place to start is Regular-Expressions.info; after that, use Google for help with more specific issues.
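To see why regex matters, this is roughly what typing the pages in manually would look like. It is a sketch with one hand-written rule per page; the page names and ?p= targets are assumed here purely for illustration.

```apache
# Sketch only: one literal rule per page, no regex tricks beyond
# the anchors (^ and $) and an escaped dot (\.)
RewriteEngine On

RewriteRule ^page-1\.html$ ?p=1 [L]
RewriteRule ^page-2\.html$ ?p=2 [L]
RewriteRule ^page-3\.html$ ?p=3 [L]
# ...and another line for every page you ever add
```

It works, but every new page means another line in the file. A single rule with a regex pattern can cover all of them at once.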

Now I’m going to go through a little example (I will provide the source code at the end of this tutorial) which will explain how to create SEO-friendly URLs of the form page-NUMBER.html.

First, we need to create the page which is going to process the information. Below is a simple PHP document in which ?p=NUM selects the page to show.

<?php
/**
 * .htaccess Tutorial by Dennis M.
 *
 * This is just the example script where we will
 * be redirected to and will do all our work! :)
 *
 */

// Fall back to the default page when no ?p= value was given
$page = isset($_GET['p']) ? $_GET['p'] : 'default';

// Define our pages:
switch ($page) {
    case '1':
        print "This is page one, congratulations! If you’re accessing this through ?p=1 then you’re not using SEO friendly
URLs. If you’re using page-1.html then you have successfully created an SEO friendly rewrite!";
        break;
    case '2':
        print "This is page two, congratulations! If you’re accessing this through ?p=2 then you’re not using SEO friendly
URLs. If you’re using page-2.html then you have successfully created an SEO friendly rewrite!";
        break;
    // Default page if unknown page, etc.
    default:
        print "Default page!<br /><br />
Try:<br />
<a href='page-1.html'>Page1</a><br />
<a href='page-2.html'>Page2</a><br /><br />
or <a href='page-3.html'>Page3</a> (Redirects back here! :D )";
        break;
}

?>

Now, below is the .htaccess file which will properly rewrite the URLs. The pages can still be accessed by ?p=NUM, but now page-NUM.html (or just .htm for that matter) will also work!

#### mod_rewrite Tutorial by Dennis M. ####
#                                         #
# mod_rewrite .htaccess file. This is a   #
# working redirect sample! A proper SEO   #
# friendly rewrite.                       #
#                                         #
###########################################

# Enable mod_rewrite
RewriteEngine On

# Our rule - using regex to find the page number!
# The regex matches any page-NUM, with NUM being anywhere from 1
# to 99 digits in total length
#
# The regex more deeply explained here:
# http://microsonic.org/2009/05/08/the-basics-to-mod_rewritethe-basics-to-mod_rewrite
#
RewriteRule ^page-([0-9]{1,99})\.htm(l)?$ ?p=$1 [QSA,L]

Now let’s pull this apart. RewriteRule is exactly what it sounds like: the rule for what we’re going to be changing. It takes a pattern, a substitution, and optional flags. The ^ and $ symbols are standard regex anchors that mark the beginning (^) and the end ($) of the requested path. In a .htaccess file the path is matched without its leading slash, which is why the pattern starts with page- and not /page-.

After the static page- comes our first real piece of regex. The parentheses ( and ) set this part apart as a group, so we can reuse whatever it matched later on. Inside the group, the brackets [ and ] define a character class, here [0-9], which matches any single digit. The braces { and } say how many times that digit may repeat: {1,99} means at least 1 and at most 99 digits. So this matches 0 (1 digit long), 1000 (4 digits long), and anything up to a number 99 digits long.

The next part is the \. The backslash (\) escapes our . so that it reads as a literal dot and part of the static text; unescaped, a . is a regex symbol that matches any character. The htm(l)? makes the htm necessary and the l character optional (allowing .htm or .html): the ? after the group (l) means “zero or one of these.”

Now, the ?p=$1 is where the request actually goes. ?p=NUM is the format our script already uses to access pages, so we keep that, and $1 takes the value of the first group in the pattern. In this case that is ([0-9]{1,99}), so it grabs the number. Finally, [QSA,L] are flags, not part of the regex. QSA (query string append) keeps any query string from the original request, so page-1.html?foo=bar is rewritten to ?p=1&foo=bar. L (last) tells Apache to stop processing further rewrite rules in this pass once this one matches. I recommend always using [L] on rules like this, so that a later rule can’t rewrite the URL again by accident.
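To check your understanding, here is the same rule again with a few sample requests traced through it in the comments. The lang=en parameter is made up for the example; everything else is taken from the rule as written.

```apache
RewriteEngine On

# Request                 Result
# page-7.html          -> ?p=7
# page-42.htm          -> ?p=42          (the l is optional)
# page-7.html?lang=en  -> ?p=7&lang=en   (QSA appends the original query string)
# page-.html           -> no match: at least one digit is required
# page-7.htmlx         -> no match: $ anchors the end
# my-page-7.html       -> no match: ^ anchors the start
RewriteRule ^page-([0-9]{1,99})\.htm(l)?$ ?p=$1 [QSA,L]
```

Requests that don’t match are left alone, so the rest of your site keeps working exactly as before.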

So now you all understand what misery truly is. You’ve been introduced to .htaccess and mod_rewrite, one of the most sensitive things a webmaster has to deal with: a single mistake in the file and you will receive a nasty “500 Internal Server Error.” So now the only thing to do is practice, practice, practice until you become as comfortable as can be with .htaccess, mod_rewrite, and regex. Good luck!

Mod_rewrite Tutorial

Regards,
Dennis M.