Bryan Bedard's Blog
My adventures in software development, mistakes and all.

Using the IIS URL Rewrite Module to Provide Canonical URLs for your Site

Posted by Bryan Bedard - 11/24/2016

In my article about Getting Your Site to Play Nice with Search Engines and Social Networks, I discussed the importance of having a canonical URL for each page on your site. When the same page is reachable through multiple URLs, its reputation with search engines is divided across those versions. While you can provide a <link rel="canonical" …> tag to address this, it’s also good practice to configure your site to redirect the various versions of a URL to the canonical one.

If you are using IIS as your web server, you can implement rules to redirect to canonical URLs with the IIS URL Rewrite Module 2.0. To use this module, download and install it on your web server. Once installed, you will see a URL Rewrite option in Internet Information Services (IIS) Manager when viewing the properties of your site.

Screen shot of URL rewrite

The user interface for adding and modifying rules is straightforward. When you create a new rule, you are presented with a variety of templates for common rewrite cases. When a request is received, the rules are evaluated against the URL in the order in which they are defined, and every rule that matches is applied. You can adjust the order of the rules. You can also set a property on some rule types to indicate that, when the rule matches, processing should stop rather than move on to the remaining rules.
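As a rough sketch of how ordering and the stop-processing setting look in web.config: rules are listed in evaluation order, and stopProcessing="true" on a rule means the rules after it are skipped when it matches. The rule names, patterns, and target URL below are hypothetical and are not rules from our site.

```xml
<rewrite>
  <rules>
    <!-- Evaluated first. When it matches, stopProcessing="true" means
         the rules below are skipped for this request. -->
    <rule name="Hypothetical Leave Alone Rule" stopProcessing="true">
      <match url="^status/.*" />
      <action type="None" />
    </rule>
    <!-- Evaluated second, and only reached when the rule above did not match. -->
    <rule name="Hypothetical Redirect Rule">
      <match url="^old-page$" />
      <action type="Redirect" url="/new-page" />
    </rule>
  </rules>
</rewrite>
```

Note that the url pattern of a rule is matched against the path without its leading slash, which is why the patterns above start with ^status and ^old-page rather than ^/status and ^/old-page.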

Microsoft has documentation on Using URL Rewrite Module 2.0 and the invaluable URL Rewrite Module v2.0 Configuration Reference.

The user interface in IIS Manager saves the settings to the web.config file. I will now go through each of the rewrite rules implemented on the Highway North site with a screen shot of the settings in IIS Manager. At the end I will include the full code for the settings from the web.config file.

Matching patterns in rules can be specified as regular expressions (using ECMAScript, i.e. JavaScript, syntax) or as wildcards.
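To illustrate the difference, here is a minimal sketch of the same hypothetical redirect written once with each syntax. The patternSyntax attribute on the rule selects the syntax; the rule names and the articles/blog paths are made up for this example.

```xml
<rules>
  <!-- The two rules are shown side by side for comparison; you would only need one. -->

  <!-- Regular expression syntax (the default, so patternSyntax could be omitted).
       {R:1} refers to the first parenthesized capture group. -->
  <rule name="Hypothetical Regex Rule" patternSyntax="ECMAScript">
    <match url="^articles/(.*)$" />
    <action type="Redirect" url="/blog/{R:1}" />
  </rule>

  <!-- Wildcard syntax: each * captures a segment, available as {R:1}, {R:2}, and so on. -->
  <rule name="Hypothetical Wildcard Rule" patternSyntax="Wildcard">
    <match url="articles/*" />
    <action type="Redirect" url="/blog/{R:1}" />
  </rule>
</rules>
```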

Ignored Paths Rule

There are a few paths on our site that we don’t want to redirect. For example, our existing Android applications expect certain URLs to be available and cannot handle being redirected from HTTP to HTTPS. The URL Rewrite Module makes available a {PATH_INFO} server variable, which contains the path portion of the URL: everything after the protocol prefix and domain name, starting with the leading forward slash. It’s easy to match the ignored paths against {PATH_INFO}, set an action type of None (i.e. leave the URL alone), and stop processing further rules.

Screen shot of ignored paths
Name: Ignored Paths Rule
Match URL: Matches the regular expression (.*)
Conditions: {PATH_INFO} matches any of these regular expressions:
    /AppInfo/.*
    /Mountie/Token
Action Type: None
Stop processing of subsequent rules: Yes
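One detail worth keeping in mind: a regular expression condition matches when the pattern is found anywhere in the input, so a pattern such as /Mountie/Token also matches a longer path that merely contains it. If that matters for your site, the conditions can be anchored with ^ and $. The anchored variant below is an assumption for illustration, not the configuration shown above.

```xml
<rule name="Ignored Paths Rule" stopProcessing="true">
  <match url="(.*)" />
  <conditions logicalGrouping="MatchAny" trackAllCaptures="false">
    <!-- ^ anchors the start of the path; with no trailing $ this
         still matches anything underneath /AppInfo/ -->
    <add input="{PATH_INFO}" pattern="^/AppInfo/.*" />
    <!-- ^ and $ together match this exact path only -->
    <add input="{PATH_INFO}" pattern="^/Mountie/Token$" />
  </conditions>
  <action type="None" />
</rule>
```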

HTTP to HTTPS Redirect Rule

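We want to force users of our site to connect over HTTPS (HTTP Secure). The URL Rewrite Module makes available an {HTTPS} server variable, set to either ON or OFF, that tells us whether the request was made over HTTPS. Pattern matching is case-insensitive by default, so the case of the value does not matter here.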

Screen shot of HTTP to HTTPS
Name: HTTP to HTTPS Redirect Rule
Match URL: Matches the regular expression (.*)
Conditions: {HTTPS} matches the regular expression ^OFF$
Action Type: Redirect
Redirect URL: https://{HTTP_HOST}{PATH_INFO}
Append query string: Yes
Redirect Type: Permanent (301)
Stop processing of subsequent rules: No
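For reference, the settings listed above map onto rule attributes roughly as follows. This sketch writes out appendQueryString, redirectType, and stopProcessing explicitly to show where each UI setting lands; all three are shown with their default values, so IIS Manager normally leaves them out of the generated web.config.

```xml
<rule name="HTTP to HTTPS Redirect Rule" stopProcessing="false">
  <match url="(.*)" />
  <conditions>
    <!-- Matching is case-insensitive unless ignoreCase="false" is set -->
    <add input="{HTTPS}" pattern="^OFF$" />
  </conditions>
  <!-- "Append query string: Yes" and "Redirect Type: Permanent (301)" -->
  <action type="Redirect"
          url="https://{HTTP_HOST}{PATH_INFO}"
          appendQueryString="true"
          redirectType="Permanent" />
</rule>
```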

Canonical Host Name Rule

Our site is accessible via both its bare domain name and a ‘www’ name, i.e. https://highwaynorth.com and https://www.highwaynorth.com. We chose the ‘www’ version as our preferred version and redirect to it. We can look at the {HTTP_HOST} server variable to determine whether the canonical host name was used.

Screen shot of canonical host name
Name: Canonical Host Name Rule
Match URL: Matches the regular expression (.*)
Conditions: {HTTP_HOST} does not match the regular expression ^www\.highwaynorth\.com$
Action Type: Redirect
Redirect URL: https://www.highwaynorth.com{PATH_INFO}
Append query string: Yes
Redirect Type: Permanent (301)
Stop processing of subsequent rules: No

Remove Trailing Slash Rule

Whether or not your URLs end with a trailing slash is largely a matter of preference. However, search engines treat a URL with a trailing slash and the same URL without one as two separate URLs, so you should choose the form you prefer and make it canonical. You can add a rule to either add or remove the trailing slash. For our site, we prefer to remove it. This can be achieved by matching URLs that end with a slash and redirecting to the captured portion before it. The conditions exclude requests that map to a physical directory or file.

Screen shot of remove trailing slash
Name: Remove Trailing Slash Rule
Match URL: Matches the regular expression (.*)/$
Conditions:
    {REQUEST_FILENAME} is not a directory
    {REQUEST_FILENAME} is not a file
Action Type: Redirect
Redirect URL: {R:1}
Append query string: Yes
Redirect Type: Permanent (301)
Stop processing of subsequent rules: No
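If you prefer the opposite convention, a rule that adds the trailing slash might look like the sketch below. We don’t use this rule on our site; it is only here to show the mirror image of the rule above, and the rule name is arbitrary.

```xml
<rule name="Add Trailing Slash Rule">
  <!-- Match any URL whose last character is not a slash -->
  <match url="(.*[^/])$" />
  <conditions logicalGrouping="MatchAll">
    <!-- Skip physical directories and files, e.g. /images/logo.png -->
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
  </conditions>
  <!-- Redirect to the same URL with a slash appended -->
  <action type="Redirect" url="{R:1}/" />
</rule>
```

Use one rule or the other, never both, or the two will redirect to each other indefinitely.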

Lower Case Rule

Many sites redirect to an all lower case URL, which matters because search engines treat URLs that differ only in case as different URLs. However, our site had too many existing mixed case links, and a lower case rule would cause a lot of redirects. We also prefer the readability and look of mixed case URLs when users navigate our site or share links. We decided to accept the risk of our links being shared with different casing, which we feel will be rare enough not to worry about, so we didn’t implement a rule to redirect to all lower case.
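For readers who do want lower case URLs, such a rule typically looks something like the sketch below. To be clear, this is not part of our configuration; it is a minimal example that relies on the module’s built-in ToLower function.

```xml
<rule name="Lower Case Rule">
  <!-- ignoreCase="false" so the pattern only matches when the URL
       actually contains an upper case letter -->
  <match url="[A-Z]" ignoreCase="false" />
  <!-- {ToLower:...} is a built-in URL Rewrite string function;
       {URL} is the requested path -->
  <action type="Redirect" url="{ToLower:{URL}}" />
</rule>
```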

Source Code of Rules

Here is the full source code of the above rules as they appear in the web.config file:

<!-- xdt:Transform is a config transform attribute (used in files such as
     Web.Release.config); leave it off when pasting directly into web.config. -->
<rewrite xdt:Transform="Insert">
    <rules>
        <clear />
        <!-- Leave these paths untouched and skip all remaining rules -->
        <rule name="Ignored Paths Rule" stopProcessing="true">
            <match url="(.*)" />
            <conditions logicalGrouping="MatchAny" trackAllCaptures="false">
                <add input="{PATH_INFO}" pattern="/AppInfo/.*" />
                <add input="{PATH_INFO}" pattern="/Mountie/Token" />
                <add input="{PATH_INFO}" pattern="/Mountie/api/.*" />
                <add input="{PATH_INFO}" pattern="/Sasquatch/api/.*" />
                <add input="{PATH_INFO}" pattern="/Tundra/api/.*" />
            </conditions>
            <action type="None" />
        </rule>
        <!-- Redirect plain HTTP requests to HTTPS -->
        <rule name="HTTP to HTTPS Redirect Rule">
            <match url="(.*)" negate="false" />
            <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                <add input="{HTTPS}" pattern="^OFF$" />
            </conditions>
            <action type="Redirect" url="https://{HTTP_HOST}{PATH_INFO}" />
        </rule>
        <!-- Redirect any other host name to the canonical www host -->
        <rule name="Canonical Host Name Rule">
            <match url="(.*)" />
            <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                <add input="{HTTP_HOST}" pattern="^www\.highwaynorth\.com$" negate="true" />
            </conditions>
            <action type="Redirect" url="https://www.highwaynorth.com{PATH_INFO}" />
        </rule>
        <!-- Strip a trailing slash unless the request maps to a real directory or file -->
        <rule name="Remove Trailing Slash Rule">
            <match url="(.*)/$" />
            <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
                <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
            </conditions>
            <action type="Redirect" url="{R:1}" />
        </rule>
    </rules>
</rewrite>

You don’t have to use the IIS Manager interface to configure your rules; you can also edit the web.config file directly. Personally, I use the UI to configure the rules against a local instance of IIS, then copy and paste the generated code from that web.config into our site’s web.config.
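If you are pasting rules in by hand, it helps to know where they belong. The rewrite section lives under system.webServer, so a bare-bones web.config would be shaped roughly like this (a skeleton only; your file will almost certainly contain other sections as well):

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Paste the <rule> elements generated by IIS Manager here,
             in the order you want them evaluated -->
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```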
