Karl Stoney

Generating a valid sitemap automatically with .NET

How to generate a sitemap automatically in .NET

Overview

Jambr is still a baby, and as such its content and structure are changing.

It originally existed on two URLs (www and non-www), and Google was indexing both of them. To add to that, not long ago I changed the URL structure for Articles to be more SEO friendly.

All of these changes confuse search engine indexers, and one way to help them out is to provide them with a Sitemap.  My rough list of requirements was:

  • To comply fully with the Sitemap protocol
  • To generate automatically, when /sitemap.xml was called
  • To be able to decorate fixed controller actions with an attribute which would include them in the map.
  • To provide a simple way of adding the dynamic content
  • To cache the output for a period of time

Implementation

First things first, we need to create an XML document which matches the Sitemap protocol.  So we create a new XmlDocument, and from there we add the root element "urlset" in the Sitemap protocol's namespace (its xmlns):

        ''' <summary>
        ''' The schema namespace we add to the document
        ''' </summary>
        Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"

        ''' <summary>
        ''' The full URL to your website, for example http://www.jambr.co.uk
        ''' </summary>
        Private Property FullyQualifiedUrl As String

        Private _document As XmlDocument
        ''' <summary>
        ''' Returns the XML document
        ''' </summary>
        Private ReadOnly Property Document As XmlDocument
            Get
                Return _document
            End Get
        End Property

        ''' <summary>
        ''' Create a new instance of the SiteMapGenerator, initialise the XML document
        ''' and add the required namespaces
        ''' </summary>
        ''' <param name="FullyQualifiedUrl">The full URL to your website, for example http://www.jambr.co.uk</param>
        Public Sub New(ByVal FullyQualifiedUrl As String)

            Me.FullyQualifiedUrl = FullyQualifiedUrl.Replace("\", "/")

            _document = New XmlDocument
            Document.AppendChild(Document.CreateNode(XmlNodeType.XmlDeclaration, Nothing, Nothing))

            'Create the root element and add the sitemap namespace
            Dim rootelement = Document.CreateElement("urlset", SiteMapSchemaURL)
            Document.AppendChild(rootelement)

        End Sub

Next I wanted a flexible method for adding new URLs, one that accepts all the valid child elements of url on an optional basis and only adds the ones that are passed:

        ''' <summary>
        ''' Adds a URL to the site map
        ''' </summary>
        ''' <param name="Location">The URL to the page, will check for your domain and add if required.</param>
        ''' <param name="LastModified">Optional: The date the URL was last modified</param>
        ''' <param name="ChangeFrequency">Optional: The expected change frequency of the URL</param>
        ''' <param name="Priority">Optional: The priority of the page, ranging from 0.0 to 1.0, default is 0.5</param>
        Public Sub AddUrl(ByVal Location As String,
                          Optional ByVal ChangeFrequency As ChangeFrequency = Nothing,
                          Optional ByVal Priority As Decimal = Nothing,
                          Optional LastModified As DateTime = Nothing)

            'sanitise the url
            Location = Location.Replace("\", "/")
            If Not Location.ToLower.Contains(FullyQualifiedUrl.ToLower) Then
                Location = FullyQualifiedUrl & If(Left(Location, 1) = "/", Location, "/" & Location)
            End If

            'check we haven't added it already in a stored list of urls we've added
            If AddedUrls.Contains(Location) Then Exit Sub
            AddedUrls.Add(Location)

            'Required elements
            Dim newUrl = Document.CreateElement("url", SiteMapSchemaURL)
            newUrl.AppendChild(CreateTextElement("loc", Location))

            'Optional Elements
            If Not LastModified = Nothing Then
                newUrl.AppendChild(CreateTextElement("lastmod", LastModified.ToW3C))
            End If

            If Not ChangeFrequency = Nothing Then
                newUrl.AppendChild(CreateTextElement("changefreq", ChangeFrequency.ToString))
            End If

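            If Not Priority = Nothing Then
                'Format with the invariant culture so the separator is always "." (0.7, never 0,7)
                newUrl.AppendChild(CreateTextElement("priority", Priority.ToString(CultureInfo.InvariantCulture)))
            End If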

            Document.DocumentElement.AppendChild(newUrl)

        End Sub
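
AddUrl leans on two helpers that the listing doesn't show: CreateTextElement and the ToW3C extension on DateTime. The real versions are in the downloadable source; the following is only a rough sketch of what they might look like. The module name SiteMapHelpers, the explicit doc parameter and the exact date format are my assumptions, not the article's code.

```vbnet
Imports System.Globalization
Imports System.Runtime.CompilerServices
Imports System.Xml

'Hypothetical helpers: names and signatures are assumptions, not the article's actual source
Public Module SiteMapHelpers

    Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"

    ''' <summary>
    ''' Creates an element in the sitemap namespace containing the given text
    ''' </summary>
    Public Function CreateTextElement(ByVal doc As XmlDocument,
                                      ByVal name As String,
                                      ByVal value As String) As XmlElement
        Dim element = doc.CreateElement(name, SiteMapSchemaURL)
        element.InnerText = value
        Return element
    End Function

    ''' <summary>
    ''' Formats a date as a W3C datetime in UTC, e.g. 2013-05-01T09:30:00Z
    ''' </summary>
    <Extension()>
    Public Function ToW3C(ByVal value As DateTime) As String
        Return value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture)
    End Function

End Module
```

Creating every child element in the same namespace as the root matters: an element created without a namespace under a namespaced parent is serialised with an empty xmlns attribute, which no longer matches the protocol.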

Reflection

I mentioned previously that I wanted an easy way to add URLs; I didn't want to create a class which needed me to call AddUrl() over and over for all my pages.  I decided to go down the route of creating a custom SettingAttribute that I could just stick at the top of the controller actions I wanted to map, like this:

    <SiteMap(ChangeFrequency:=ChangeFrequency.daily, Priority:=0.7)>
    Function Index() As ActionResult
        Return View(New HomeViewModel)
    End Function

Neat, huh?  You've probably realised that this only works for static URLs; dynamic actions that require parameters can't be mapped this way.  In the context of Jambr I have two controllers which serve dynamic content, Articles and News.  I decided to go down the route of creating an interface, which gives each controller a subroutine that the site map generator can call, like this:

    ''' <summary>
    ''' Populate the site map with the dynamic data
    ''' </summary>
    ''' <param name="generator">the generator object that gets passed</param>
    Public Sub PopulateSiteMap(ByRef generator As SiteMapGenerator) Implements ISiteMap.PopulateSiteMap

        'We need to initialise the UrlHelper because of the way we've invoked this method
        Url = New UrlHelper(System.Web.HttpContext.Current.Request.RequestContext)

        Using db As New JambrDBContext

            'Lets add dynamic data, starting with my articles
            Dim articles = (db.
                           ArticlePosts.
                           Where(Function(w) w.IsLive = True).
                           OrderByDescending(Function(o) o.LastUpdated).
                           Select(Function(s) New With {.SEOUrl = s.SEOUrl,
                                                        .LastUpdated = s.LastUpdated})).tolist

            'Add my root element, with a last modified date of the latest article
            generator.AddUrl(Url.Action("Index", "Article"), ChangeFrequency.daily, Nothing, articles.First.LastUpdated)
            'Add the RSS feed, as it has the same last updated date
            generator.AddUrl(Url.Action("RSS", "Article"), ChangeFrequency.daily, Nothing, articles.First.LastUpdated)

            'Add my other elements
            For Each post In articles
                generator.AddUrl(Url.Action("View", "Article", New With {.SEOUrl = post.SEOUrl}), Nothing, Nothing, post.LastUpdated)
            Next
            articles = Nothing

        End Using

    End Sub

Using reflection, we look for either the SiteMapAttribute or an implementation of ISiteMap, and get the associated details like so:

    ''' <summary>
    ''' When called, the site map generator will attempt to load any action methods
    ''' that are decorated with the SiteMapAttribute from your controller classes and
    ''' add a url for them based on it
    ''' </summary>
    ''' <remarks></remarks>
    Public Sub LoadFromAttribute()

        'Get all the controllers in the project
        Dim controllers = Assembly.
                          GetExecutingAssembly.
                          GetTypes().
                          Where(Function(t) GetType(System.Web.Mvc.ControllerBase).IsAssignableFrom(t))

        'First we want to get all controllers that implement the ISiteMap interface and fire the method
        For Each c In controllers.Where(Function(t) GetType(ISiteMap).IsAssignableFrom(t))

            'Create an instance
            Dim obj As ISiteMap = Activator.CreateInstance(c, True)
            obj.PopulateSiteMap(Me)

        Next

        'Now get all the methods which are decorated with the SiteMapAttribute
        Dim objs = (From c In controllers
                   From act In c.GetMembers
                   Where act.GetCustomAttributes(True).OfType(Of SiteMapAttribute)().Count > 0
                   Select New With {.controller = c,
                                    .action = act,
                                    .actionnameattribute = act.GetCustomAttributes(True).OfType(Of ActionNameAttribute)().FirstOrDefault,
                                    .sitemapattribute = act.GetCustomAttributes(True).OfType(Of SiteMapAttribute)().FirstOrDefault}).ToList

        'We need a url helper to help us generate the url path
        Dim UrlHelper = New UrlHelper(HttpContext.Current.Request.RequestContext)

        For Each p In objs
            'Now we have the objects, we need to build the url.  We need to look out for the ActionNameAttribute in case people are using it
            'to name their action methods, we also need to remove Controller from the name of the controller
            Dim url As String = UrlHelper.Action(If(p.actionnameattribute Is Nothing, p.action.Name, p.actionnameattribute.Name),
                                                 p.controller.Name.Replace("Controller", ""))

            'Add the object
            AddUrl(url,
                   p.sitemapattribute.ChangeFrequency,
                   p.sitemapattribute.Priority,
                   If(p.sitemapattribute.LastModified Is Nothing,
                      Nothing,
                      DateTime.Parse(p.sitemapattribute.LastModified, (New CultureInfo("en-us")))
                      )
                   )
        Next

    End Sub
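
The SiteMapAttribute that LoadFromAttribute reads isn't listed in the article. Going by how it is used (named arguments on the action, a LastModified that is checked against Nothing and then parsed), a minimal version might look something like the sketch below. The member types, the "none" enum value and the choice of Attribute as the base class are assumptions on my part; the real class is in the source download.

```vbnet
Imports System

'Hypothetical sketch: member names are inferred from how the article uses them
Public Enum ChangeFrequency
    none = 0   'assumed "not set" value, so that comparing against Nothing skips the element
    always
    hourly
    daily
    weekly
    monthly
    yearly
    never
End Enum

<AttributeUsage(AttributeTargets.Method, AllowMultiple:=False)>
Public Class SiteMapAttribute
    Inherits Attribute

    Public Property ChangeFrequency As ChangeFrequency

    'Double rather than Decimal: Decimal is not a valid attribute argument type
    Public Property Priority As Double

    'A string rather than DateTime for the same reason; parsed later with DateTime.Parse
    Public Property LastModified As String

End Class
```

The enum members are lower case so that ChangeFrequency.ToString produces the values the Sitemap protocol expects for changefreq. Attribute arguments are limited to simple constant types, which would explain why LoadFromAttribute has to parse LastModified from a string.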

Now add a route for sitemap.xml in your RouteConfig.vb (remember, this article is based around ASP.NET MVC):

        'This is to overwrite the sitemap request
        routes.MapRoute( _
            name:="SiteMap", _
            url:="sitemap.xml", _
            defaults:=New With {.controller = "SiteMap", .action = "Index"})

Set the controller and action to wherever you're going to put your method; I decided to put mine in a new controller.  Finally, create your action method.  I've decorated mine with the OutputCache attribute and set it to 6 hours (21600 seconds), with the ability to force a fresh copy by passing a new value in the query string parameter ClearCache:

    ''' <summary>
    ''' Returns the site map
    ''' </summary>
    <OutputCache(Duration:=21600, VaryByParam:="ClearCache", Location:=OutputCacheLocation.Server)>
    Function Index() As ActionResult

        'Create our site map
        Dim p As New SiteMapGenerator("http://www.jambr.co.uk")

        'Load any methods which are tagged with the attribute
        p.LoadFromAttribute()

        'Return the content
        Return Content(p.ToString, "text/xml")

    End Function

Something to note here is that I have created a ToString method which takes the XmlDocument and outputs it as a UTF-8 string.  UTF-8 matters because the Sitemap protocol requires the file to be UTF-8 encoded, so there is another class in the source code which creates a UTF-8 based string writer.
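
For anyone who doesn't want to dig through the download, the usual way to get a UTF-8 string out of an XmlDocument is a small StringWriter subclass that overrides Encoding. The sketch below shows that general technique; the names Utf8StringWriter, XmlDocumentOutput and ToUtf8String are mine, and the class in the source code may well differ.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml

'Hypothetical sketch: class and method names are assumptions
Public Class Utf8StringWriter
    Inherits StringWriter

    'StringWriter reports UTF-16 by default; XmlDocument.Save copies this into the XML declaration
    Public Overrides ReadOnly Property Encoding As Encoding
        Get
            Return System.Text.Encoding.UTF8
        End Get
    End Property

End Class

Public Module XmlDocumentOutput

    ''' <summary>
    ''' Serialises the document to a string whose declaration reports UTF-8
    ''' </summary>
    Public Function ToUtf8String(ByVal doc As XmlDocument) As String
        Using writer As New Utf8StringWriter()
            doc.Save(writer)
            Return writer.ToString()
        End Using
    End Function

End Module
```

Inside the generator, an overridden ToString could then simply return ToUtf8String(Document).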

Conclusion

I hope this article has shown you a clean way to implement a dynamic site map in ASP.NET MVC using flexible attributes.  The full source code can be downloaded from Here, and if you want to see my sitemap, check it Here.  As usual, if you have any questions please drop me a comment!

 

Comments » Generating a valid sitemap automatically with .NET

I was asked about the interface; the code is:

    Public Interface ISiteMap
        Sub PopulateSiteMap(ByRef generator As SiteMapGenerator)
    End Interface
