SPWakeUp (SPWakeUp3) v1.1.0 Released

I’ve implemented something in SPWakeUp that I’ve wanted to add for a long time: the ability to wake additional URLs.

Version 1.1.0 adds the ‘-Include:’ command-line parameter, which specifies additional URLs to wake once the detected site collections and subsites have been traversed.
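To illustrate the ordering described above, here is a minimal sketch in Python. It is not SPWakeUp's actual code: the function names (`wake_all`, `http_get_status`) and the example URLs are hypothetical, and authentication, which a real SharePoint farm would normally require, is ignored. It only shows the idea that the extra URLs are requested after the detected ones.

```python
from typing import Callable, Iterable, List, Tuple, Union
from urllib.request import urlopen


def http_get_status(url: str) -> int:
    """Request a URL and return its HTTP status code (stand-in for a 'wake' request)."""
    with urlopen(url, timeout=60) as response:
        return response.status


def wake_all(
    detected_urls: Iterable[str],
    include_urls: Iterable[str],
    fetch: Callable[[str], int] = http_get_status,
) -> List[Tuple[str, Union[int, str]]]:
    """Wake the detected sites first, then the additional URLs, in the order given."""
    results: List[Tuple[str, Union[int, str]]] = []
    for url in list(detected_urls) + list(include_urls):
        try:
            results.append((url, fetch(url)))
        except Exception as exc:
            # One unreachable URL should not stop the rest from being woken.
            results.append((url, f"failed: {exc}"))
    return results


if __name__ == "__main__":
    # Hypothetical URLs; a fake fetcher is used so the sketch runs without a farm.
    detected = ["https://intranet.example.com/", "https://intranet.example.com/sites/hr"]
    extra = ["https://intranet.example.com/_layouts/15/viewlsts.aspx"]
    for url, outcome in wake_all(detected, extra, fetch=lambda u: 200):
        print(url, outcome)
```

Passing the fetch function as a parameter keeps the ordering logic separate from the HTTP call, so the sketch can be tried without touching a live server.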

SPWakeUp (SPWakeUp3): Wake Up On-Premises SharePoint and WSS Instances

Since I started working with on-premises SharePoint instances, one of the solutions that I’ve used to wake up (pre-compile) the site collections and sub-sites contained within the web applications hosted by the farm is SPWakeUp.

SPWakeUp was originally hosted on CodePlex, with binaries for SharePoint 2007 and later SharePoint 2010 (the archive containing those can still be downloaded from the CodePlex Archive). I created compiled binaries for SharePoint 2013 and SharePoint 2016 and made those available as well.

I recently needed to use SPWakeUp on SharePoint 2019, so I decided to produce a compiled binary for that version. As SPWakeUp doesn’t seem to have an active home anymore, I thought it might be worthwhile putting the code and compiled versions on GitHub in case anyone else wants to use them! Note that if anyone objects to this, let me know and I’ll pull it down.

At the moment the repository hosts the original source code, the source code upgraded for use with Visual Studio 2019, compiled versions of SPWakeUp for SharePoint 2013, SharePoint 2016 and SharePoint 2019, and instructions on how to compile the source yourself using Visual Studio Community 2019.

I hope it’s useful to someone!

SharePoint Crawl Rules Appear to Ignore Some URL Protocols

I recently came across an issue relating to crawling people information in SharePoint and the use of crawl rules to exclude certain content.

The issue revolved around a requirement to exclude content contained within people’s MySites, but include user profile information so that people searches could still be conducted. The following crawl rule had been configured and was successfully excluding MySite content, but was also excluding the user profile data (crawled using the sps3s:// protocol):

URL                            Exclude or Include
https://mysite.domain.com/*    Exclude

The crawl rule test facility indicated that while SharePoint treats http:// and https:// differently, https:// and sps3s:// appear to be treated the same as far as crawl rules are concerned. With the above crawl rule in place, items in the MySite root site collection will not be crawled whether they have an https:// or an sps3s:// prefix, so user profile data and people search will not be available:

Crawl rule test

[Screenshot from a lab SharePoint 2010 system; the same tests have been performed against SharePoint 2013 and 2016 with the same results.]

What is actually happening is that the sps3s:// prefix tells SharePoint which connector to use. In the case of people search, this is translated into a call to a web service at the specified host, i.e. https://mysite.domain.com/_vti_bin/spscrawl.asmx. The final call is therefore made to an https:// URL, which matches the exclusion rule, and that is why the people data is not crawled.

Replacing the above crawl rule with the following rule corrects the issue, allowing people data stored in the MySite root site collection to be indexed and therefore to be available for users to search:

URL                                     Exclude or Include
https://mysite.domain.com/personal/*    Exclude
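
The difference between the two rules can be modelled with a small Python sketch. This is a simplified illustration of the behaviour described above, not SharePoint's actual rule engine: the function names are hypothetical, the wildcard matching uses Python's `fnmatch` as a stand-in, and the mapping of the non-SSL sps3:// scheme to http:// is an assumption added for symmetry.

```python
from fnmatch import fnmatchcase
from typing import Iterable
from urllib.parse import urlsplit, urlunsplit

# Assumed mapping from people-crawl schemes to the transport the connector ends up using.
CONNECTOR_TRANSPORT = {"sps3": "http", "sps3s": "https"}

# Web service path called for people crawls, as given in the post.
PEOPLE_CRAWL_SERVICE = "/_vti_bin/spscrawl.asmx"


def effective_url(start_address: str) -> str:
    """Return the URL that is actually requested for a given start address."""
    parts = urlsplit(start_address)
    scheme = parts.scheme.lower()
    if scheme in CONNECTOR_TRANSPORT:
        transport = CONNECTOR_TRANSPORT[scheme]
        return urlunsplit((transport, parts.netloc, PEOPLE_CRAWL_SERVICE, "", ""))
    return start_address


def is_excluded(start_address: str, exclude_patterns: Iterable[str]) -> bool:
    """Check the effective URL against wildcard exclusion patterns, ignoring case."""
    url = effective_url(start_address).lower()
    return any(fnmatchcase(url, pattern.lower()) for pattern in exclude_patterns)


if __name__ == "__main__":
    people = "sps3s://mysite.domain.com"
    personal = "https://mysite.domain.com/personal/jsmith/default.aspx"  # hypothetical MySite page

    broad_rule = ["https://mysite.domain.com/*"]
    narrow_rule = ["https://mysite.domain.com/personal/*"]

    print(effective_url(people))
    print("broad rule excludes people data:", is_excluded(people, broad_rule))
    print("narrow rule excludes people data:", is_excluded(people, narrow_rule))
    print("narrow rule excludes personal site:", is_excluded(personal, narrow_rule))
```

In this model the broad rule catches the translated web service URL, while the narrower /personal/* rule still excludes individual MySites but leaves the people crawl untouched.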