A New Internet Library: Add Your Website/Blog or Suggest A Website/Blog to our Free Web Directory http://anil.myfunda.net.

Its very simple, free and SEO Friendly.
Submit Now....

Wednesday, February 13, 2008

Full-Text Search Terminology

Here is a list of terms and components that you need to be familiar with when using Full-Text Search.

Full-text index
Stores information about significant words and their location within a given column. This information is used to quickly compute full-text queries that search for rows with particular words or combinations of words. For more information, see Full-Text Indexes.
Full-text catalog
A full-text catalog contains zero or more full-text indexes. Full-text catalogs must reside on a local hard drive associated with the instance of SQL Server. Each catalog can serve the indexing needs of one or more tables within a database. Full-text catalogs cannot be stored on removable drives, floppy disks, or network drives, except when you attached a read-only database that contains a full-text catalog.
Word breaker
For a given language, a word breaker tokenizes text based on the lexical rules of the language. For more information, see Word Breakers and Stemmers.
Token
Is a word or a character string identified by the word breaker.
Stemmer
For a given language, a stemmer generates inflectional forms of a particular word based on the rules of that language. Stemmers are language specific. For more information, see Word Breakers and Stemmers.
Filter
Given a specified file type, for example .doc, filters extract text from a file stored in a varbinary(max) or image column. For more information, see Filters.
Population or Crawl
Is the process of creating and maintaining a full-text index. For more information, see Full-Text Index Structure.
Noise words
Are frequently occurring words that do not help the search. For example, for the English locale words such as "a", "and", "is", and "the" are considered noise words. These words are ignored to prevent the full-text index from becoming bloated. For more information, see Noise Words.


 CREATE FULLTEXT INDEX (Transact-SQL)


Updated: 1 February 2007

Creates a full-text index on one or more columns of a table or indexed view in a database. Only one full-text index is allowed per table or indexed view. By default, a database is enabled for full-text search when the database is created.

 Transact-SQL Syntax Conventions

Syntax





CREATE FULLTEXT INDEX ON table_name      [(column_name [TYPE COLUMN type_column_name]            [LANGUAGE language_term] [,...n])]      KEY INDEX index_name           [ON fulltext_catalog_name]      [WITH            {CHANGE_TRACKING {MANUAL | AUTO | OFF [, NO POPULATION]}}      ]

Arguments



table_name
Is the name of the table or indexed view that contains the column or columns included in the full-text index.
column_name
Is the name of the column or columns included in the full-text index. Only columns of type char, varchar, nchar, nvarchar, text, ntext, image, xml, and varbinary can be indexed for full-text search.
TYPE COLUMN type_column_name
Is the name of the column in table_name that holds the document type of column_name.
type_column_name must be of type char, nchar, varchar, or nvarchar. Specify type_column_name only if the column or columns in column_name are of type varbinary(max) or image; otherwise, SQL Server returns an error.
For more information, see Filters.
LANGUAGE language_term
Is the language of the data stored in column_name.
language_term is optional and can be specified as a string, integer, or hexadecimal value corresponding to the locale identifier (LCID) of a language. If language_term is specified, the language it represents will be used to index data stored in char, nchar, varchar, nvarchar, text, and ntext columns. This language is the default language used at query time if language_term is not specified as part of a full-text predicate against the column. If no value is specified, the default language of the SQL Server instance is used.
If the language is unknown, or there are multiple languages that are to be stored in column_name (and the column is not of BLOB or XML data type), use the neutral (0x0) language. For more information, see International Considerations for Full-Text Search.
Use the sp_configure stored procedure to access information about the default full-text language of the Microsoft SQL Server instance. For more information, see sp_configure (Transact-SQL) and default full-text language Option.
When specified as a string, language_term corresponds to the alias column value in the syslanguages system table. The string must be enclosed in single quotation marks, as in 'language_term'. When specified as an integer, language_term is the actual LCID that identifies the language. When specified as a hexadecimal value, language_term is 0x followed by the hex value of the LCID. The hex value must not exceed eight digits, including leading zeros.
If the value is in double-byte character set (DBCS) format, SQL Server will convert it to Unicode.
Resources, such as word breakers and stemmers, must be enabled for the language specified as language_term. If such resources do not support the specified language, SQL Server returns an error.
For non-BLOB and non-XML columns containing text data in multiple languages, or for cases when the language of the text stored in the column is unknown, use the neutral (0x0) language resource. For documents stored in XML- or BLOB-type columns, the language encoding within the document will be used at indexing time. For example, in XML columns, the xml:lang attribute in XML documents will identify the language. At query time, the value previously specified in language_term becomes the default language used for full-text queries unless language_term is specified as part of a full-text query.
,... n
Indicates that multiple columns can be specified to be full-text indexed. When multiple columns are specified, separate each column with a comma.
KEY INDEX index_name
Is the name of the unique key index on table_name. The KEY INDEX must be a unique, single-key, non-nullable column. Select the smallest unique key index for the full-text unique key. For best performance, a CLUSTERED index is recommended.
ON fulltext_catalog_name
Is the full-text catalog used for the full-text index. The catalog must already exist in the database. This clause is optional. If it is not specified, a default catalog is used. If no default catalog exists, SQL Server returns an error.
The filegroup of fulltext_catalog_name or, if no catalog is specified, the default full-text catalog must not be READONLY or OFFLINE. Otherwise, SQL Server returns an error.
WITH CHANGE_TRACKING {MANUAL | AUTO | OFF [ , NO POPULATION]}
Specifies whether or not SQL Server maintains a list of all changes to the indexed data. Data changes through WRITETEXT and UPDATETEXT are not picked up with change tracking.
MANUAL
Specifies that the change-tracking log will be propagated either on a schedule using SQL Server Agent, or manually by the user.
AUTO
Specifies that SQL Server automatically updates the full-text index as the data is modified in the associated tables. AUTO is the default.
Note   Although changes can be propagated automatically, these changes may not be reflected immediately in the full-text index. For more information about properties used for monitoring the full-text indexing process and for diagnosing any problems with full-text indexing, see Obtaining Full-Text Property Values Using Transact-SQL Functions.
OFF [ , NO POPULATION]
Specifies that SQL Server does not keep a list of changes to the indexed data.
The NO POPULATION option can be used only when CHANGE_TRACKING is OFF. When NO POPULATION is specified, SQL Server does not populate an index after it is created. The index is only populated after the user executes the ALTER FULLTEXT INDEX command with the START FULL, or INCREMENTAL POPULATION clause. When NO POPULATION is not specified, SQL Server populates the index fully after it is created.

permissions



User must have REFERENCES permission on the full-text catalog and have ALTER permission on the table of view, or be a member of the sysadmin fixed server role, or db_owner, or db_ddladmin fixed database roles.

Examples



The following example creates a full-text index on the HumanResources.JobCandidate table.

 Copy Code

USE AdventureWorks; GO CREATE UNIQUE INDEX ui_ukJobCand ON HumanResources.JobCandidate(JobCandidateID); CREATE FULLTEXT CATALOG ft AS DEFAULT; CREATE FULLTEXT INDEX ON HumanResources.JobCandidate(Resume) KEY INDEX ui_ukJobCand; GO

 


 ALTER FULLTEXT INDEX (Transact-SQL)


Updated: 1 February 2007

Changes the properties of a full-text index.

 Transact-SQL Syntax Conventions

Syntax




ALTER FULLTEXT INDEX ON table_name    { ENABLE     | DISABLE    | SET CHANGE_TRACKING { MANUAL | AUTO | OFF }    | ADD ( column_name       [ TYPE COLUMN type_column_name ]       [ LANGUAGE language_term ] [,...n] )      [ WITH NO POPULATION ]    | DROP ( column_name [,...n] )      [WITH NO POPULATION ]     | START { FULL | INCREMENTAL | UPDATE } POPULATION    | { STOP | PAUSE | RESUME } POPULATION     }

Arguments



 table_name
Is the name of the table that contains the column or columns included in the full-text index. Specifying database and table owner names is optional.
ENABLE | DISABLE
Tells Microsoft SQL Server whether or not to gather full-text index data for table_name. ENABLE activates the full-text index; DISABLE turns off the full-text index.
When the full-text index is disabled, the full-text index metadata remains in the system tables. If CHANGE_TRACKING is in the enabled state (automatic or manual update) when the full-text index is disabled, the state of the index freezes, any ongoing crawl stops, and new changes to the table data are not tracked or propagated to the index. Full-text index on table_name can be reactivated with ENABLE.
SET CHANGE_TRACKING {MANUAL | AUTO | OFF}
Specifies whether or not SQL Server maintains a list of all changes to the indexed data in the full-text catalog. Data changes through WRITETEXT and UPDATETEXT are not reflected in the full-text index, and are not picked up with change tracking.
To modify the CHANGE_TRACKING state of a full-text index, the FILEGROUP in which the full-text indexed table resides must not be set to OFFLINE or READONLY. Otherwise, the command fails and SQL Server returns an error.
MANUAL
Specifies that the change-tracking log will be propagated either on a schedule using SQL Server Agent, or manually by the user.
AUTO
Specifies that SQL Server will automatically update the full-text index as data is modified in the associated tables. AUTO is the default.
OFF
Specifies that SQL Server will not keep a list of changes to the indexed data.
ADD | DROP column_name
Specifies the columns to be added or deleted from a full-text index. The column or columns must be of type char, varchar, nchar, nvarchar, text, ntext, image, xml, or varbinary(max).
Use the DROP clause only on columns that have been enabled previously for full-text indexing.
Use TYPE COLUMN and LANGUAGE with the ADD clause to set these properties on the column_name. When a column is added, the full-text index on the table must be repopulated in order for full-text queries against this column to work.
Unless WITH NO POPULATION is specified, SQL Server automatically starts a full population after a column is added or dropped from a full-text index.
TYPE COLUMN type_column_name
Is the name of the column in table_name that holds the document type of column_name. Specify type_column_name only if the column or columns of column_name are of type varbinary(max) or image
TYPE COLUMN may be of data type char, nchar, varchar or nvarchar. For more information about full-text supported file formats and indexing data stored in varbinary(max) columns, see Filters.
LANGUAGE language_term
Is the language of the data stored in column_name.
language_term is optional and can be specified as a string, integer, or hexadecimal value corresponding to the locale identifier (LCID) of a language. If language_term is specified, the language it represents will be applied to all elements of the search condition. If no value is specified, the default full-text language of the SQL Server instance is used.
Use the sp_configure stored procedure to access information about the default full-text language of the SQL Server instance. For more information, see default full-text language Option.
When specified as a string, language_term corresponds to the alias column value in the syslanguages system table. The string must be enclosed in single quotation marks, as in 'language_term'. When specified as an integer, language_term is the actual LCID that identifies the language. When specified as a hexadecimal value, language_term is 0x followed by the hex value of the LCID. The hex value must not exceed eight digits, including leading zeros.
If the value is in double-byte character set (DBCS) format, SQL Server will convert it to Unicode.
Resources, such as wordbreakers and stemmers, must be enabled for the language specified as language_term. If such resources do not support the specified language, SQL Server returns an error.
For non-BLOB and non-XML columns containing text data in multiple languages, or for cases when the language of the text stored in the column is unknown, use the neutral (0x0) language resource. For documents stored in XML- or BLOB-type columns, the language encoding within the document will be used at indexing time. For example, in XML columns, the xml:lang attribute in XML documents will identify the language. At query time, the value previously specified in language_term becomes the default language used for full-text queries unless language_term is specified as part of a full-text query.
[ ,...n]
Indicates that multiple columns may be specified for the ADD or DROP clauses. When multiple columns are specified, separate these columns with commas.
WITH NO POPULATION
Specifies that the full-text index will not be populated after the ADD or DROP column operation. The index will only be populated if the user executes a START...POPULATION command.
If CHANGE_TRACKING is enabled and WITH NO POPULATION is specified, SQL Server returns an error. If CHANGE_TRACKING is enabled and WITH NO POPULATION is not specified, SQL Server performs a full population on the index after it is created.
The NO POPULATION option can be used only when CHANGE_TRACKING is OFF. When NO POPULATION is specified, SQL Server does not populate an index after it is created. The index is populated only after the user gives an ALTER FULLTEXT INDEX...START POPULATION command. When NO POPULATION is not specified, SQL Server populates the index after it is created.
START {FULL|INCREMENTAL|UPDATE} POPULATION
Tells SQL Server to begin population of the full-text index of table_name. If a full-text index population is already in progress, SQL Server returns a warning and does not start a new population.
FULL
Specifies that every row of the table be retrieved for full-text indexing even if the rows have already been indexed.
INCREMENTAL
Specifies that only the modified rows since the last population be retrieved for full-text indexing. INCREMENTAL can be applied only if the table has a column of the type timestamp. If a table in the full-text catalog does not contain a column of the type timestamp, the table undergoes a FULL population.
UPDATE
Specifies the processing of all insertions, updates, or deletions since the last time the change-tracking index was updated. Change-tracking population must be enabled on a table, but the background update index or the auto change tracking should not be turned on.
{STOP | PAUSE | RESUME } POPULATION
Stops, or pauses any population in progress; or stops or resumes any paused population.
STOP POPULATION does not stop auto change tracking or background update index. To stop change tracking, use SET CHANGE_TRACKING OFF.
PAUSE POPULATION and RESUME POPULATION can only be used for full populations. They are not relevant to other population types because the other populations resume crawls from where the crawl stopped.

Remarks



When instances of xml data type data are indexed for full-text search, the attributes and element names are not indexed and cannot be queried; however, element values can be indexed and queried in each instance. Well-formed XML documents and fragments containing multiple languages are supported.

Permissions



The user must have ALTER permission on the table or view, or be a member of the sysadmin fixed server role, or the db_ddladmin or db_owner fixed database roles.

Examples



The following example modifies the full-text index in the JobCandidate table of the AdventureWorks database.

Copy Code

USE AdventureWorks; GO ALTER FULLTEXT INDEX ON HumanResources.JobCandidate ENABLE; GO


DROP FULLTEXT INDEX (Transact-SQL)


Removes a full-text index from a specified table.

DROP FULLTEXT INDEX ON table_name

Arguments

table_name

Is the name of the table containing the full-text index to be removed.


You do not need to drop all columns from the full-text index before using the DROP FULLTEXT INDEX command.

The user must have ALTER permission on the table or view, or be a member of the sysadmin fixed server role, or db_owner or db_ddladmin fixed database roles.

The following example drops the full-text index that exists on the JobCandidate table.

USE AdventureWorks; GO DROP FULLTEXT INDEX ON HumanResources.JobCandidate; GO

No comments:

Post a Comment

Post your comments here:

Dotnet-Interviews