Multiple Uses for the ASP.NET Web Sitemap (web.sitemap) File

December 17, 2009
Sean Cooper

The ASP.NET Web Sitemap (web.sitemap) file is the default source of navigation information for the ASP.NET navigation controls. While this file is extremely useful for mapping out your site and controlling navigation, we have found other, complementary uses for the Web.sitemap.

Managing Search Engine Optimization
Creating the Google Sitemap
Search Using LINQ to XML
Serializing Content via the Web.Sitemap

Managing Search Engine Optimization

The Web.sitemap is a great place to put SEO information if, like this site, you’re not sourcing it out of a database. At its heart, the web.sitemap file is just an XML file. Microsoft has created the SiteMapNode class to expose each node in the XML file through specific properties as well as an attributes collection. This allows the creative developer to add custom attributes to an XML node and read them later in code.

In our case, we’re going to add a “keywords” attribute to each siteMapNode and populate it with the keywords we want for our page. Here is a live sample taken from this site:


<siteMapNode 
   url="Object-to-XML-lab.aspx"
   title="Ideosity - Labs - Self-Serializing Custom Objects"
   description="Learn how to create custom objects that will serialize
      themselves into XML using LINQ and Reflection."
   keywords="XML, serialize, serialization, Reflection,
      System.Reflection, LINQ, custom objects" />

We’ve used the “description” property to fill in the description for our meta description tag and filled in the keywords that we’ll use for our meta keywords tag. The next step is to put in a bit of code to make use of those attributes.

We are using a base class for our Master page so we’ll put our code in there. If you’re not using a Master page, using a base class from which your pages will inherit will do the same thing.


Protected Sub Page_Load(ByVal sender As Object, _
        ByVal e As System.EventArgs) Handles Me.Load
       
   If Not SiteMap.CurrentNode Is Nothing Then
      Page.Title = SiteMap.CurrentNode.Title
     
      If (Not SiteMap.CurrentNode("keywords") Is Nothing) Then
          Dim meta As New HtmlMeta
          meta.Name = "keywords"
          meta.Content = SiteMap.CurrentNode("keywords")
          Page.Header.Controls.AddAt(1, meta)
      End If
     
      If Not SiteMap.CurrentNode.Description Is Nothing Then
         Dim meta As New HtmlMeta
         meta.Name = "description"
         meta.Content = SiteMap.CurrentNode.Description
         Page.Header.Controls.AddAt(2, meta)
      End If
   End If
End Sub

As you can see, we’re setting the page title, meta description and meta keywords by reading attributes/properties of the relevant siteMapNode. And, of course, we check to see if we have a node that corresponds to the page we’re on.

While title, description, and keywords are not the last word in Search Engine Optimization, they do play a part and the Web.sitemap gives you a great place to manage them from.


Creating the Google Sitemap from the Web.Sitemap

Another topic related to SEO is the creation of the sitemap.xml file for Google.

We went the route of creating a handler, GoogleSiteMap.ashx, and pointing Google towards that. However, we also have the handler create the sitemap.xml just in case someone else goes looking for it.

To tell Google when each page was last updated, we added another attribute to our siteMapNodes: modifiedDate.


<siteMapNode 
   url="Object-to-XML-lab.aspx"
   title="Ideosity - Labs - Self-Serializing Custom Objects"
   description="Learn how to create custom objects that will serialize
      themselves into XML using LINQ and Reflection."
   keywords="XML, serialize, serialization, Reflection,
      System.Reflection, LINQ, custom objects"
   modifiedDate="5/11/2009" />

We could look at the physical modified date for each file but, for a small site, this seemed much more manageable.
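If you did want to go the file-date route, a minimal sketch might look like the following. LastModFor is a hypothetical helper, not part of our handler, and it assumes you already have the physical path of the page (in a handler that would typically come from Server.MapPath).

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module LastModSketch

   'hypothetical helper: formats a file's last write time the way
   'the sitemap's lastmod element expects (yyyy-MM-dd)
   Function LastModFor(ByVal physicalPath As String) As String
      Dim stamp As Date = File.GetLastWriteTime(physicalPath)
      Return stamp.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
   End Function

   Sub Main()
      'a temp file stands in for a real page; in a handler the path
      'would come from something like context.Server.MapPath(node.Url)
      Dim tempFile As String = Path.GetTempFileName()
      Try
         Console.WriteLine(LastModFor(tempFile))
      Finally
         File.Delete(tempFile)
      End Try
   End Sub
End Module
```

The trade-off is that a file's timestamp changes on every deployment or trivial edit, whereas the modifiedDate attribute only changes when you decide the content has changed.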

Here’s the code for our sitemap creator.
Full disclosure: this is based on code and techniques from Bertrand Le Roy over at "Tales from the Evil Empire"


Imports System.Net.Mail
Imports System.Text
Imports System.Web
Imports System.Xml

Public Class GoogleSiteMap : Implements IHttpHandler
   
   Private Const MaxDepth As Integer = 6
   Private Const domain As String = "http://www.ideosity.com"
   Public Sub ProcessRequest(ByVal context As HttpContext) _
      Implements IHttpHandler.ProcessRequest

      Try
         Dim request As HttpRequest = context.Request
         Dim myPath As String = request.MapPath("sitemap.xml")
         Using writer As New XmlTextWriter(myPath, Encoding.UTF8)
         
            writer.Formatting = Formatting.Indented
            writer.WriteStartDocument()
            writer.WriteStartElement("urlset")
            writer.WriteAttributeString( _
               "xmlns", "http://www.google.com/schemas/sitemap/0.84")
            writer.WriteAttributeString( _
               "xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance")
            writer.WriteAttributeString("xsi:schemaLocation", _
               "http://www.google.com/schemas/sitemap/0.84/sitemap.xsd")
            Dim root As SiteMapNode = SiteMap.RootNode
            WriteNode(root, writer, 0)
            writer.WriteEndElement()
            writer.WriteEndDocument()
         End Using
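      Catch ex As Exception
         'writing the file is best effort; if it fails we still
         'stream the sitemap back in the response below
      End Try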
       
      Dim myResponse As HttpResponse = context.Response
      myResponse.ContentType = "text/xml"
      Using writer As New XmlTextWriter( _
         myResponse.OutputStream, Encoding.UTF8)
         
         writer.Formatting = Formatting.Indented
         writer.WriteStartDocument()
         writer.WriteStartElement("urlset")
         writer.WriteAttributeString("xmlns", _
            "http://www.google.com/schemas/sitemap/0.84")
         writer.WriteAttributeString("xmlns:xsi", _
            "http://www.w3.org/2001/XMLSchema-instance")
         writer.WriteAttributeString("xsi:schemaLocation", _
            "http://www.google.com/schemas/sitemap/0.84/sitemap.xsd")
         Dim root As SiteMapNode = SiteMap.RootNode
         WriteNode(root, writer, 0)
         writer.WriteEndElement()
         writer.WriteEndDocument()
      End Using
   End Sub
 
   Public ReadOnly Property IsReusable() As Boolean _
      Implements IHttpHandler.IsReusable
      Get
         Return True
      End Get
   End Property
       
   Private Sub WriteNode(ByVal node As SiteMapNode, _
      ByRef writer As XmlTextWriter, _
      ByVal depth As Integer)
               
      If depth > MaxDepth Then Exit Sub
     
      writer.WriteStartElement("url")
      writer.WriteElementString("loc", domain & node.Url)
      writer.WriteElementString("lastmod", _
         Format(CDate(node("modifiedDate")), "yyyy-MM-dd"))
      writer.WriteElementString("changefreq", "monthly")
      writer.WriteEndElement()
         
      Dim subNodeDepth As Integer = depth + 1
      For Each n As SiteMapNode In node.ChildNodes
         WriteNode(n, writer, subNodeDepth)
      Next
   End Sub
End Class

The heart of this is the recursive WriteNode method, which writes a url element for the current node and then calls itself for each child node. I’m sure at least one of you has noticed that I duplicate my code. I simply did this so that, on the off chance that writing the sitemap.xml file failed, the handler would still return a sitemap to the Googlebot.
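The duplication can be trimmed without giving up that fallback. Here is a minimal sketch, assuming a hypothetical WriteUrlSet helper that writes the whole document to whichever writer it is given. Neither WriteUrlSet nor the class name GoogleSiteMapSketch is part of the handler above; everything else mirrors it.

```vbnet
Imports System.Text
Imports System.Web
Imports System.Xml

Public Class GoogleSiteMapSketch : Implements IHttpHandler

   Private Const MaxDepth As Integer = 6
   Private Const domain As String = "http://www.ideosity.com"

   Public Sub ProcessRequest(ByVal context As HttpContext) _
      Implements IHttpHandler.ProcessRequest

      'best effort: refresh the sitemap.xml file on disk
      Try
         Using writer As New XmlTextWriter( _
            context.Request.MapPath("sitemap.xml"), Encoding.UTF8)
            WriteUrlSet(writer)
         End Using
      Catch ex As Exception
         'ignored on purpose so the response below is still sent
      End Try

      'always stream the sitemap back to the caller
      context.Response.ContentType = "text/xml"
      Using writer As New XmlTextWriter( _
         context.Response.OutputStream, Encoding.UTF8)
         WriteUrlSet(writer)
      End Using
   End Sub

   Public ReadOnly Property IsReusable() As Boolean _
      Implements IHttpHandler.IsReusable
      Get
         Return True
      End Get
   End Property

   'hypothetical helper: writes the complete urlset document
   Private Sub WriteUrlSet(ByVal writer As XmlTextWriter)
      writer.Formatting = Formatting.Indented
      writer.WriteStartDocument()
      writer.WriteStartElement("urlset")
      writer.WriteAttributeString("xmlns", _
         "http://www.google.com/schemas/sitemap/0.84")
      writer.WriteAttributeString("xmlns:xsi", _
         "http://www.w3.org/2001/XMLSchema-instance")
      writer.WriteAttributeString("xsi:schemaLocation", _
         "http://www.google.com/schemas/sitemap/0.84/sitemap.xsd")
      WriteNode(SiteMap.RootNode, writer, 0)
      writer.WriteEndElement()
      writer.WriteEndDocument()
   End Sub

   'same recursive walk as in the handler above
   Private Sub WriteNode(ByVal node As SiteMapNode, _
      ByVal writer As XmlTextWriter, _
      ByVal depth As Integer)

      If depth > MaxDepth Then Exit Sub

      writer.WriteStartElement("url")
      writer.WriteElementString("loc", domain & node.Url)
      writer.WriteElementString("lastmod", _
         Format(CDate(node("modifiedDate")), "yyyy-MM-dd"))
      writer.WriteElementString("changefreq", "monthly")
      writer.WriteEndElement()

      For Each n As SiteMapNode In node.ChildNodes
         WriteNode(n, writer, depth + 1)
      Next
   End Sub
End Class
```

The behavior is the same as before: a failure while writing the file is swallowed, and the response is generated regardless.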

Not shown here, I also wrote code to notify us every time the sitemap is requested so we can get an idea of how frequently Google or Yahoo crawls the site.


Build a Custom Search with LINQ to XML

As I’ve noted above, this is not a data-driven site. (I’ll wait a moment while you gasp in amazement.) Implementing a usable search function for this site becomes a bit of a challenge. We could certainly use Google to search the site but we’d have very little control over the way the results were displayed. We could write some code that would scan through the text of each page, but that would be slow and cumbersome. Once again, we turn to the sitemap to power a keyword-based search.

We're going to use some of the features in .NET 3.5 to search through the elements in the sitemap, checking the keywords for matches against the search term, then build out a custom result set. In performing the search, we'll use LINQ to XML and lambda expressions to keep our code simple.

For our search, we're going to pass the search term, in session, to our search page. After verifying that we're not posting back, we load the web.sitemap file into an XElement. We immediately run a LINQ query on our initial XElement object to take all of its descendants. We use the Descendants method because it recurses through the document tree. This query yields an enumerable of XElement containing every element of our sitemap, flattened to the same level. We'll then convert that enumerable to a List(Of XElement).

Once we've flattened our collection, it's time to loop through it. For each element, we're going to loop through each word in our search term, checking to see how many times that word matches in the current element's keywords attribute. We'll then add that value to a new attribute we've created called Hits.

Note: We could have skipped converting the enumerable to a list. However, because LINQ queries aren't run until the results are evaluated, the query behind e1 would be run again every time e1 is enumerated. When there are only a few elements, the overhead isn't too noticeable. But, when you get into larger numbers of items, the evaluation time can kill your performance. I prefer to only take the hit once, thank you.
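To see deferred execution in action, here is a small stand-alone sketch (a console module, not code from the search page). The CountAndPass function and the two-node sitemap fragment are made up purely to count how often the query's filter actually runs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module DeferredExecutionSketch

   Private runs As Integer = 0

   'hypothetical filter that records every time it is invoked
   Private Function CountAndPass(ByVal el As XElement) As Boolean
      runs += 1
      Return True
   End Function

   Sub Main()
      Dim root As XElement = XElement.Parse( _
         "<siteMap><siteMapNode url='a.aspx' />" & _
         "<siteMapNode url='b.aspx' /></siteMap>")

      'building the query does no work yet
      Dim query As IEnumerable(Of XElement) = _
         From el In root.Descendants() Where CountAndPass(el) Select el
      Console.WriteLine(runs)   'prints 0

      'every enumeration runs the query again
      Dim firstCount As Integer = query.Count()
      Dim secondCount As Integer = query.Count()
      Console.WriteLine(runs)   'prints 4

      'a list is a snapshot; reading it does not re-run the query
      Dim snapshot As List(Of XElement) = query.ToList()
      Dim thirdCount As Integer = snapshot.Count
      Dim fourthCount As Integer = snapshot.Count
      Console.WriteLine(runs)   'prints 6
   End Sub
End Module
```

With two nodes the extra work is trivial, but the filter's run count grows with every pass over the query, while the list pays the cost once.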

Once we have modified all the elements in our list by adding the "Hits" attribute, it's time to do some filtering. Once again, LINQ comes in very handy. We'll select only those items in our list that actually have the "Hits" attribute and order by the value of that attribute. Notice, once again, we cast our results to a List(of XElement) because we're going to be iterating through the collection.

Finally, we loop through the items remaining in our collection, building out an entry in HTML for each one of our results. At the very last, we set the Text property of a Literal on the page to the contents of our StringBuilder.


Protected Sub Page_Load(ByVal sender As Object, _
   ByVal e As System.EventArgs) Handles Me.Load
   Dim key As String = ""
   
   If Not Session("searchTerm") Is Nothing Then
      key = CStr(Session("searchTerm")).ToLower
   End If
   
   'initialize our label showing what was searched for
   'lblSearch.Text = "You searched for: " & key
   
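   If Page.IsPostBack = False AndAlso key <> "" Then
      'split on spaces, dropping empty entries left by repeated spaces
      Dim keys() As String = key.Split(New Char() {" "c}, _
         StringSplitOptions.RemoveEmptyEntries)
      Dim sb As StringBuilder = New StringBuilder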
     
      'read sitemap into an xelement to allow for linq operations
      Dim element As XElement = _
         XElement.Load(Server.MapPath("~/web.sitemap"))
     
      Dim e1 As IEnumerable(Of XElement) = _
         From e2 In element.Descendants Select e2
     
      Dim howmany As Integer = 0
      Dim elementList As List(of XElement)=e1.Cast(of XElement).ToList
     
      For Each e2 As XElement In elementList
         'skip elements that have no keywords attribute
         Dim kw As XAttribute = e2.Attribute("keywords")
         If kw Is Nothing Then Continue For

         For Each k As String In keys
            Dim rex As Regex = New Regex( _
               Regex.Escape(k), RegexOptions.IgnoreCase)

            'use the regex to find matches within the
            'keywords attribute of the node
            howmany = rex.Matches(kw.Value).Count
            If howmany > 0 Then

               'add a new attribute called "Hits"
               'to track how many times we matched
               If IsNothing(e2.Attribute("Hits")) Then
                  e2.Add(New XAttribute("Hits", howmany))
               Else
                  e2.Attribute("Hits").Value = _
                     (CInt(e2.Attribute("Hits").Value) + howmany).ToString
               End If
            End If
         Next
      Next
     
      'use linq to get rid of elements that didn't match
      e1 = From e2 In elementList _
         Where e2.Attribute("Hits") IsNot Nothing _
         Order By CInt(e2.Attribute("Hits").Value) Descending

      elementList = e1.Cast(Of XElement).ToList

      Dim title As String = ""
      Dim descr As String = ""

      For Each e2 As XElement In elementList
         title = e2.Attribute("title").Value
         descr = e2.Attribute("description").Value

         'highlight the search terms in the description; "$0" is the
         'matched text, so wrap it in markup here to bold or colorize it
         For i As Integer = 0 To UBound(keys)
            descr = Regex.Replace(descr, Regex.Escape(keys(i)), "$0", _
               RegexOptions.IgnoreCase)
         Next

         sb.Append("<div><a class='listLink' href='./" & _
            e2.Attribute("url").Value & "'>" & title & "</a><br />")
         sb.Append(descr & "</div><br />")
      Next

      If elementList.Count < 1 Then
         litSearchResults.Text = "There were no results for your search."
      Else
         litSearchResults.Text = sb.ToString
      End If
   End If
   
   If key = "" Then
      litSearchResults.Text = "You must enter at least one search term."
   End If
End Sub


Serializing Content Across Your Site With the Web Sitemap

You can also use the web.sitemap to serialize content to your viewers. On the front page of the Ideosphere, the articles we show you are controlled by attributes in the sitemap. With the addition of two more attributes we can control which articles show up in our feed. First, our siteMapNode with a new attribute: shortTitle


<siteMapNode 
   url="Object-to-XML-lab.aspx"
   title="Ideosity - Labs - Self-Serializing Custom Objects"
   description="Learn how to create custom objects that will serialize
      themselves into XML using LINQ and Reflection."
   keywords="XML, serialize, serialization, Reflection,
      System.Reflection, LINQ, custom objects"
   modifiedDate="5/11/2009"
   shortTitle="Labs: Self-Serializing Custom Objects" />

Also, we add the "sectionName" attribute, with a meaningful value, to the parent node of each group of nodes we're interested in.


<siteMapNode url="development-articles.aspx" title="Ideosity Development Main Page" 
    description="" keywords="" modifiedDate="6/14/2009"
    shortTitle="" sectionName="Dev">
          <siteMapNode url="Object-to-XML-lab.aspx" title="" description=""
            keywords="" modifiedDate="5/3/2009" shortTitle="Self-Serializing Custom Objects"/>
        ...
</siteMapNode>

Next we create our control. It's pretty simple. We've got a placeholder to put our content in and some code running in the background.

To make this control usable across the site, we add the "SectionName" and "Count" properties. SectionName lets our control figure out what part of the sitemap to run against. The Count property determines how many entries we will display.

Again, we load the elements into an XElement so we can use LINQ to XML. Our query finds all elements directly below our desired parent element that have a shortTitle attribute, ordered by the date the page was modified, newest first. As an aside, if we don't want a page to show up in our feed, we can leave the shortTitle attribute off.

Once we have the elements we are looking for, we simply loop through them, building a simple HTML unordered list, then add the list to the Controls collection of our placeholder.


Private _sectionName As String
    Private _count As Integer = 1000

    Public Property SectionName() As String
        Get
            Return _sectionName
        End Get
        Set(ByVal value As String)
            _sectionName = value
        End Set
    End Property

    Public Property Count() As Integer
        Get
            Return _count
        End Get
        Set(ByVal value As Integer)
            _count = value
        End Set
    End Property

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) _
        Handles Me.Load
        Try
            Dim element As XElement = XElement.Load(Server.MapPath("~/web.sitemap"))

            'children of the section's parent node that carry a shortTitle,
            'newest first (sort on the parsed date, not the raw string)
            Dim e1 As IEnumerable(Of XElement) = From e2 In element.Descendants _
                Where e2.Parent.Attribute("sectionName") IsNot Nothing _
                AndAlso e2.Parent.Attribute("sectionName").Value = SectionName _
                AndAlso e2.Attribute("shortTitle") IsNot Nothing _
                Order By CDate(e2.Attribute("modifiedDate").Value) Descending _
                Take Count

            Dim elementList As List(Of XElement) = e1.Cast(Of XElement).ToList
            If elementList.Count > 0 Then
                phContent.Controls.Add(New LiteralControl("<ul>"))
                For Each e2 As XElement In elementList
                    phContent.Controls.Add(New LiteralControl("<li>"))
                    Dim myLink As New HyperLink
                    With myLink
                        .Text = e2.Attribute("shortTitle").Value
                        .NavigateUrl = "~/" & e2.Attribute("url").Value
                        .CssClass = "listLink"
                    End With
                    phContent.Controls.Add(myLink)
                    phContent.Controls.Add(New LiteralControl(" - "))
                    phContent.Controls.Add(New LiteralControl( _
                        CDate(e2.Attribute("modifiedDate").Value).ToShortDateString))
                    phContent.Controls.Add(New LiteralControl("<br/>"))
                    phContent.Controls.Add(New LiteralControl( _
                        e2.Attribute("description").Value))
                    phContent.Controls.Add(New LiteralControl("</li>"))                  
                Next
                phContent.Controls.Add(New LiteralControl("</ul>"))
            Else
                phContent.Controls.Add(New LiteralControl("Coming soon..."))
            End If

        Catch ex As Exception
            'swallow errors so a bad sitemap entry can't break the page;
            'consider logging ex here
        End Try
    End Sub