1st Article of the Series – SharePoint 2010 for Administrators
Aggregated search and indexing is one of the most important features of Microsoft SharePoint Server 2010. Today I’m going to show you how to create a search service application and configure the crawl process.
The heart of the search process is the crawler. The crawler gathers information from the content sources and stores it in the index. After the index is built, users execute queries against the index and receive results. The administrator has to configure the crawler with content sources, security and timing rules.
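The crawl, index and query flow described above can be pictured with a small Python sketch. This is only a conceptual illustration, not how SharePoint implements its index; the CONTENT dictionary, its URLs and all function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical in-memory "content source": URL -> document text.
CONTENT = {
    "http://intranet/docs/a": "quarterly sales report",
    "http://intranet/docs/b": "sales team contact list",
    "http://intranet/docs/c": "holiday calendar",
}

def crawl(content):
    """Gather (url, text) pairs from the content source."""
    for url, text in content.items():
        yield url, text

def build_index(pairs):
    """Map each lower-cased word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pairs:
        for word in text.lower().split():
            index[word].add(url)
    return index

def query(index, terms):
    """Return the sorted URLs that contain every word of the query."""
    words = terms.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Crawl first, then answer queries from the index only.
index = build_index(crawl(CONTENT))
```

Calling query(index, "sales report") looks only at the index, never at the content itself, which is why new content does not show up in results until the next crawl.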
There is no Shared Service Provider (SSP) in the SharePoint 2010 architecture any more. All services are managed separately, and different services are started and managed in different ways. The search service is started as part of creating the search service application and is managed through the Search Service Application topology.
Creating a Search Service Application
Like all service applications, the search service application is created from the Manage Service Applications page of Central Administration.
From the New menu on that page you can add a new Search Service Application. [A Search Service Application will already be installed with the default installation of SP 2010.]
After creating it, select the Search Service Application (not the proxy) and click the Properties button on the ribbon to see more configuration options. The initial topology of your new search service application will have all components on one application server and all databases on one database server. This topology can be changed later using the Modify Topology link located on the Farm-Wide Search Administration page or on the Search Service management page.
After you name and create the application, the properties page of the Search Service Application looks like the one given below.
Leave the FAST service application set to None. Next you must specify the search service account, which must be a managed service account and will be the same for all search services in the farm. Just like other managed service accounts, it can be changed from the Configure Service Accounts page under the General Security section of the Security page in Central Administration.
Then set the application pools for the two web services: first the Search Admin Web Service, and second the Search Query and Site Settings Web Service. These application pools can be shared with other applications or can be unique. The security account used as an application pool's identity must be a managed account. If you will have search administrators who are not farm administrators, you need to give them permission to manage the search service application. This is done from the Manage Service Applications page, as shown below, by selecting the Administrators button on the ribbon.
Now select the Search Service Application and click the Manage button on the ribbon to open the Search Administration page, which shows a dashboard with System Status, Crawl History and Search Application Topology.
Several configuration changes can be made from the System Status section. You can change the default content access account, the contact email address, the proxy server and so on.
Creating content sources is the first administrative task in building a search and indexing topology. A content source is a collection of start addresses that are accessed with the same type of connection and managed collectively. Put simply, a start address is the URL location where the crawler starts the process. The crawl settings define the depth and, potentially, the width of the crawl process.
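To make the idea of start addresses and crawl depth concrete, here is a minimal Python sketch of a depth-limited crawl over a made-up link graph. It is not SharePoint code; LINKS, the URLs and the crawl function are hypothetical, and only depth (not width) is modelled.

```python
from collections import deque

# Hypothetical link graph: page URL -> links found on that page.
LINKS = {
    "http://intranet/": ["http://intranet/hr", "http://intranet/it"],
    "http://intranet/hr": ["http://intranet/hr/policies"],
    "http://intranet/it": ["http://intranet/"],
    "http://intranet/hr/policies": [],
}

def crawl(start_addresses, max_depth, links=LINKS):
    """Breadth-first walk from the start addresses.

    Follows at most max_depth link hops away from a start address
    and returns the pages in the order they were visited.
    """
    starts = list(dict.fromkeys(start_addresses))  # drop duplicate starts
    seen = set(starts)
    queue = deque((url, 0) for url in starts)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # depth limit reached, do not follow links from here
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited
```

With a depth of 0 only the start address itself is visited; raising the depth pulls in pages that are further away from it.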
Content Source Types
- SharePoint sites
- Web sites
- File shares
- Exchange public folders
- Line of business data
- Custom repositories defined by custom connectors
From here I can create new content sources and, depending on the content source type, set the start addresses (locations) they cover. Crawl schedules and crawl priority can be configured from here as well.
Crawl rules allow you to include or exclude content and to specify the security context used when crawling content sources. Adding crawl rules is a separate topic which I will not cover in today’s article.
Crawl rule paths can be entered in a few different ways.
- Web application: http://www.sharepoint.com
- Web application path: http://www.sharepoint.com/path
- All inclusive: http://*
- Scheme independent: *://www.sharepoint.com
- Domain: http://*.sharepoint.com
- Crawl rules can include regular expressions too
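The wildcard paths listed above can be illustrated with a small Python sketch. It treats "*" as "any run of characters" and applies the first rule that matches. The rule list, the /private/ path, the function names and the choice to crawl URLs that match no rule are all assumptions made for the example, not a description of SharePoint's actual matching engine.

```python
import re

# Hypothetical rules, checked top to bottom: (path, include?).
RULES = [
    ("http://www.sharepoint.com/private/*", False),
    ("*://www.sharepoint.com/*", True),
    ("http://*.sharepoint.com/*", True),
]

def rule_to_regex(path):
    """Turn a rule path into a regex where '*' matches any characters."""
    literal_parts = [re.escape(part) for part in path.split("*")]
    return re.compile(".*".join(literal_parts), re.IGNORECASE)

def first_matching_rule(url, rules=RULES):
    """Return the first (path, include) rule whose path matches the whole URL."""
    for path, include in rules:
        if rule_to_regex(path).fullmatch(url):
            return path, include
    return None

def should_crawl(url, rules=RULES, default=True):
    """Apply the first matching rule; fall back to default (an assumption)."""
    rule = first_matching_rule(url, rules)
    return default if rule is None else rule[1]
```

Because the first match wins in this sketch, the narrower exclusion for /private/ has to sit above the broader include rules, otherwise it would never be reached.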
Server Name Mappings
There can be situations where crawling has to be performed on URLs other than those given in the alternate access mappings. You can create server name mappings to override how the URLs are shown in search results and correct the name displayed to the user. This is mainly needed when you have to crawl content over HTTP while users access it over HTTPS, and when you have to crawl with Windows authentication because the normal authentication method, such as smart-card authentication, is not supported by the crawler.
The configuration you have done up to now is enough for search to work. But there is a lot more that can be configured, such as security permissions and database settings, depending on the farm topology used.
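The effect of a server name mapping can be sketched in a few lines of Python: the address the crawler used is swapped for the address users should see in the results. The host names and the display_url function are hypothetical, and the sketch only shows the idea of the rewrite, not SharePoint's implementation.

```python
# Hypothetical mapping: address used by the crawler -> address shown in results.
MAPPINGS = {
    "http://crawl.contoso.local": "https://portal.contoso.com",
}

def display_url(crawled_url, mappings=MAPPINGS):
    """Rewrite a crawled URL to the address users should see."""
    for crawled_prefix, shown_prefix in mappings.items():
        if crawled_url.startswith(crawled_prefix):
            rest = crawled_url[len(crawled_prefix):]
            # Only rewrite when the prefix ends at a host boundary.
            if rest == "" or rest.startswith("/"):
                return shown_prefix + rest
    return crawled_url  # no mapping applies, show the URL as crawled
```

For example, a page crawled as http://crawl.contoso.local/sites/hr/page.aspx would be shown as https://portal.contoso.com/sites/hr/page.aspx, while URLs on other hosts are left untouched.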