Digging into SQL Server 2012 columnstore index


The SQL Server 11.0 release (code named “Denali”) introduces a new data warehouse query acceleration feature based on a new type of index called the columnstore. Columnstore indexing is officially introduced in SQL Server 2012. It is built on xVelocity memory-optimised technology and it improves data warehouse query performance significantly. Because data warehousing, decision support systems and business intelligence applications are growing very quickly, we need to be able to read and process very large data sets quickly and accurately into useful information and knowledge. Columnstore index technology is especially appropriate for data warehousing data sets. It improves the performance of common data warehousing queries significantly.

A columnstore index stores the data for each column separately and joins all the columns to complete the index. There are many advantages of using columnstore indexing in comparison with traditional rowstore indexing. The term “rowstore” is used to describe either a heap or a B-tree that contains multiple rows per page. As columnstore indexing is fairly new, it has some restrictions and limitations. So, you should be aware of those limitations when you are planning to implement a columnstore index in your data warehouse. In this article we will discuss the following topics:

§  How a columnstore index works

§  Benefits of using columnstore indexes

§  Restrictions of columnstore indexes

§  How to create a SQL Server columnstore index

§  Planning for creating a columnstore index

§  Choosing columns for a columnstore index

While rowstore indexing stores multiple rows per page, a columnstore index stores each column in separate disk pages. The following image illustrates the difference between columnstore and rowstore indexing from a storage perspective:

clip_image001[14]

As you can see, C1, C2…C6 are stored in different pages, so:

·         only the columns needed in a query are fetched from the disk

·         because of the redundancy of data within a column, the data is easier to compress

·         because of the data compression, frequently accessed parts of commonly used columns remain in memory, so the buffer hit rate is improved.

As mentioned, columnstore is based on xVelocity technology, which it shares with the SQL Server Analysis Services Tabular Model as well as PowerPivot. This does not mean that columnstore indexes have to fit in memory; however, they can use available server memory effectively to move portions of columns in and out of memory on demand. As columnstore indexes store the data for separate columns in separate pages, using columnstore indexes improves I/O scan performance significantly.
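To make the storage difference concrete, here is a minimal sketch of the kind of query that benefits. The table dbo.FactSales and its columns are hypothetical names used only for illustration. Assuming the table has a columnstore index that covers these columns, only the pages belonging to the three referenced columns need to be read, however many other columns the table has.

```sql
-- Hypothetical fact table:
--   dbo.FactSales (DateKey, ProductKey, StoreKey, Quantity, SalesAmount, ...)
-- Only DateKey, ProductKey and SalesAmount are referenced, so a columnstore
-- index can answer the query without touching the other columns' pages.
SELECT  DateKey,
        ProductKey,
        SUM(SalesAmount) AS TotalSales
FROM    dbo.FactSales
WHERE   DateKey BETWEEN 20120101 AND 20121231
GROUP BY DateKey, ProductKey;
```

With a rowstore index, the same scan would read whole rows, including columns the query never uses.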

There are several benefits of using columnstore indexes in comparison with rowstore indexes:

·         As mentioned, only the columns needed in a query are fetched from the disk, so common data warehouse queries run much faster

·         As data is highly compressed using xVelocity technology, the disk space used is reduced effectively

·         As the pages are significantly compressed, the pages containing the most frequently accessed columns remain in memory

·         As batch mode processing, an advanced query execution technology that processes chunks of columns, is used, CPU usage is reduced.

Columnstore indexing is a new technology, so you should be aware of its restrictions if you are planning to implement columnstore indexes. The following restrictions should be considered:

·         Columnstore indexes are available only in the SQL Server Enterprise, Developer and Evaluation editions, so you will get the following error message if you try to use a columnstore index in other editions of SQL Server 2012: “CREATE INDEX statement failed because a columnstore index cannot be created in this edition of SQL Server.”

·         Tables containing columnstore indexes cannot be updated. This restriction might be removed in the next releases of SQL Server. So, how do you insert, update or delete data in a table that contains a columnstore index? There are three solutions for this purpose; however, the first solution seems more straightforward than the others.

1.       Drop the columnstore index, carry out any INSERT, UPDATE, DELETE or MERGE operations, and recreate the columnstore index.

2.      Partition the table and switch partitions. For a bulk insert:

§  insert data into a staging table

§  build a columnstore index on the staging table

§  switch the staging table into an empty partition

For other updates:

§  switch a partition out of the main table into a staging table

§  disable or drop the columnstore index on the staging table

§  perform the update operations

§  rebuild or re-create the columnstore index on the staging table

§  switch the staging table back into the main table.

3.      Place static data into a main table with a columnstore index, and put new data and recent data likely to change into a separate table with the same schema that does not have a columnstore index. Apply updates to the table with the most recent data. To query the data, rewrite the query as two queries, one against each table, and then combine the two result sets with UNION ALL. The sub-query against the large main table will benefit from the columnstore index. If the updateable table is much smaller, the lack of the columnstore index will have less effect on performance. While it is also possible to query a view that is the UNION ALL of the two tables, you may not see a clear performance advantage. The performance will depend on the query plan, which will depend on the query, the data, and cardinality estimations. The advantage of using a view is that an INSTEAD OF trigger on the view can divert updates to the table that does not have a columnstore index, and the view mechanism would be transparent to the user and to applications. If you use either of these approaches with UNION ALL, test the performance on typical queries and decide whether the convenience of using this approach outweighs any loss of performance benefit.
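The first and third options above can be sketched as follows. This is only an illustration under stated assumptions: the tables dbo.FactSales (static, with a columnstore index) and dbo.FactSales_Recent (small, updateable, no columnstore index), the index name NCCI_FactSales and all column names are hypothetical.

```sql
-- All object names below are hypothetical.

-- Option 1: drop the columnstore index, modify the data, recreate the index.
DROP INDEX NCCI_FactSales ON dbo.FactSales;

UPDATE dbo.FactSales
SET    SalesAmount = SalesAmount * 1.1
WHERE  DateKey = 20121231;

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (DateKey, ProductKey, SalesAmount);
GO

-- Option 3: one query against each table, combined with UNION ALL.
-- Each branch aggregates its own table first, so the branch over the large
-- static table can still use the columnstore index.
SELECT  ProductKey,
        SUM(TotalSales) AS TotalSales
FROM (
        SELECT ProductKey, SUM(SalesAmount) AS TotalSales
        FROM   dbo.FactSales            -- static data, columnstore index
        GROUP BY ProductKey
        UNION ALL
        SELECT ProductKey, SUM(SalesAmount) AS TotalSales
        FROM   dbo.FactSales_Recent     -- recent data, updateable
        GROUP BY ProductKey
     ) AS Combined
GROUP BY ProductKey;
```

Recreating the index in option 1 rebuilds it over the whole table, so on large tables this is usually something to schedule in a load window rather than run ad hoc.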

Note: As we discussed, tables containing a columnstore index cannot be updated. However, it does not seem to be a good idea to use a columnstore index just to make a table read-only, because the columnstore index is not designed for this particular purpose and it is possible that Microsoft removes this restriction in the next releases of SQL Server.
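The partition-switching option for a bulk insert, described above, might look like the sketch below. Every object name is hypothetical, and the sketch assumes that dbo.FactSales is partitioned on DateKey, that partition 5 covers DateKey 20120501 to 20120531 and is empty, and that dbo.FactSales_Staging has the same columns, sits on the same filegroup as that partition and carries a CHECK constraint restricting DateKey (NOT NULL) to the same range.

```sql
-- All object names and the partition number are hypothetical assumptions.

-- 1. Load the new rows into the staging table (no columnstore index yet).
INSERT INTO dbo.FactSales_Staging (DateKey, ProductKey, SalesAmount)
SELECT DateKey, ProductKey, SalesAmount
FROM   dbo.SalesImport
WHERE  DateKey BETWEEN 20120501 AND 20120531;

-- 2. Build a columnstore index that matches the one on the main table.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales_Staging
    ON dbo.FactSales_Staging (DateKey, ProductKey, SalesAmount);

-- 3. Switch the staging table into the empty partition of the main table.
ALTER TABLE dbo.FactSales_Staging
    SWITCH TO dbo.FactSales PARTITION 5;
```

The switch only succeeds when the staging table's structure, indexes and constraints line up with the target partition, so it is worth testing the whole sequence on a copy of the schema first.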

·         Columnstore indexes do not support more than 1024 columns

·         Only nonclustered columnstore indexes are available (there is no clustered columnstore index)

·         A columnstore index cannot be a unique index

·         Creating columnstore indexes on a view or indexed view is not supported

·         Columnstore indexes cannot include a sparse column (an ordinary column that has optimized storage for null values)

·         Columnstore indexes cannot act as primary keys or foreign keys (remember that a columnstore index cannot be a unique index)

·         The definition of a columnstore index cannot be changed using the “ALTER INDEX” statement; “ALTER INDEX” can only be used to disable and rebuild a columnstore index. So the only way to modify a columnstore index is to drop and recreate it.

·         The keyword “INCLUDE” is not supported when creating a columnstore index

·         Sorting is not allowed in a columnstore index, so the “ASC” and “DESC” keywords are not supported. Columnstore indexes are ordered according to the compression algorithm. Values selected from a columnstore index might be sorted by the search algorithm, but you must use the ORDER BY clause to guarantee sorting of a result set.

·         A columnstore index does not use or keep statistics as a rowstore index does

·         A columnstore index does not support the FILESTREAM attribute, so only the columns in the table that are not used in the columnstore index can have the FILESTREAM attribute.

·         As the columnstore index is optimized for in-memory processing, server memory limitations should be considered

·         Columnstore indexes do not support SEEK, so if the table hint FORCESEEK is used, the optimizer will not consider the columnstore index.

·         Columnstore indexes cannot be combined with page and row compression, as columnstore indexes are already compressed in a different format.

·         Replication is not supported for tables containing a columnstore index

·         Change tracking and change data capture are not supported

·         Filestream is not supported

·         The following data types cannot be included in a columnstore index:

1.       binary and varbinary

2.      ntext, text, and image

3.      varchar(max) and nvarchar(max)

4.      uniqueidentifier

5.      rowversion (and timestamp)

6.      sql_variant

7.      decimal (and numeric) with precision greater than 18 digits

8.     datetimeoffset with scale greater than 2

9.      CLR types (hierarchyid and spatial types)

10.  xml
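As a rough aid for checking a table against the data type, sparse column and FILESTREAM restrictions listed above, a catalog query along these lines could be used. It is a sketch, not an exhaustive validator: dbo.FactSales is a hypothetical table name, and columns declared with alias (user-defined) types are matched by the alias name rather than the underlying base type.

```sql
-- Lists columns of the hypothetical table dbo.FactSales that cannot be
-- included in a SQL Server 2012 columnstore index according to the list above.
SELECT  c.name AS ColumnName,
        t.name AS TypeName,
        c.max_length,
        c.precision,
        c.scale
FROM    sys.columns AS c
JOIN    sys.types   AS t
        ON t.user_type_id = c.user_type_id
WHERE   c.object_id = OBJECT_ID(N'dbo.FactSales')
  AND ( t.name IN (N'binary', N'varbinary', N'ntext', N'text', N'image',
                   N'uniqueidentifier', N'timestamp', N'sql_variant', N'xml')
     OR (t.name IN (N'varchar', N'nvarchar') AND c.max_length = -1)  -- (max) types
     OR (t.name IN (N'decimal', N'numeric')  AND c.precision > 18)
     OR (t.name = N'datetimeoffset'          AND c.scale > 2)
     OR t.is_assembly_type = 1   -- CLR types such as hierarchyid and spatial types
     OR c.is_sparse = 1
     OR c.is_filestream = 1 );
```

Note that rowversion columns appear under the type name timestamp in sys.types, which is why that name is used in the filter.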

Creating a columnstore index is just like creating any other index. Generally, there are two ways to create a columnstore index: using T-SQL statements or using SSMS (SQL Server Management Studio).

Creating a columnstore index using T-SQL

In a query editor window execute the following statement:

 

CREATE NONCLUSTERED COLUMNSTORE INDEX IndexName

    ON TableName (Column1, Column2, …)
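As an illustration of the template above, the following sketch creates a columnstore index on a hypothetical table and then checks the catalog to confirm it exists. The table dbo.FactSales, its columns and the index name NCCI_FactSales are assumptions made for the example.

```sql
-- Hypothetical table, column and index names.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, Quantity, SalesAmount);

-- Confirm the index exists; type_desc reads NONCLUSTERED COLUMNSTORE.
SELECT  name, type_desc, is_disabled
FROM    sys.indexes
WHERE   object_id = OBJECT_ID(N'dbo.FactSales')
  AND   type_desc = N'NONCLUSTERED COLUMNSTORE';
```

Because a table can have only one columnstore index in SQL Server 2012, it is common to list every eligible column that queries are likely to touch.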

Creating a columnstore index using SSMS

Open SQL Server Management Studio (SSMS) and connect to a SQL Server database engine. Remember that columnstore indexes are available only in the SQL Server 2012 Enterprise, Developer and Evaluation editions.

1.       From “Object Explorer”-> expand the instance-> expand the databases-> expand the database-> expand the table-> right click on “Indexes”-> New Index-> Non-Clustered Columnstore Index

clip_image002[10]

2.      In the “New Index” window-> Index Name (type a name)-> Add-> select the columns-> OK-> OK

clip_image004[8]

Now the columnstore index is created and you can see it under “Indexes” in Object Explorer.

clip_image005[8]

As the columnstore index is a new technology, it has many limitations and restrictions. Although all of the columnstore index restrictions should be considered, one of the most basic and important is that it is NOT available in all editions of SQL Server 2012. So, it is really important to know which edition of SQL Server is going to be used in the production environment. If your organisation is not going to use SQL Server 2012 Enterprise edition in production, you cannot use columnstore indexes at all, and you have to plan to create rowstore indexes in your data warehouse.
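One way to check which edition an instance is running before planning for columnstore indexes is to query SERVERPROPERTY. The sketch below relies on the documented behaviour that an EngineEdition value of 3 is returned for the Enterprise, Developer and Evaluation editions; the output column names are arbitrary.

```sql
-- Reports the edition and version of the connected instance.
-- EngineEdition = 3 covers the Enterprise, Developer and Evaluation editions.
SELECT  SERVERPROPERTY('Edition')        AS Edition,
        SERVERPROPERTY('ProductVersion') AS ProductVersion,
        CASE WHEN CAST(SERVERPROPERTY('EngineEdition') AS int) = 3
             THEN 'Enterprise, Developer or Evaluation'
             ELSE 'Other edition'
        END                              AS EngineEditionGroup;
```

Running this against the intended production instance, not just a development machine, avoids designing around a feature that will not be available after deployment.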

Because indexing is closely tied to the queries that run against the data, it should be investigated on a case by case basis. Although columnstore indexing generally improves query performance, in some cases it will cause poorer query performance.
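A simple way to investigate a particular query is to run it twice, once normally and once with the IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX query hint, and compare the reported time and I/O. The sketch below assumes a hypothetical table dbo.FactSales that already has a columnstore index; the query itself is only a placeholder for one of your typical queries.

```sql
-- Hypothetical table; replace the query with one of your typical queries.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Run 1: the optimizer is free to use the columnstore index.
SELECT  ProductKey, SUM(SalesAmount) AS TotalSales
FROM    dbo.FactSales
GROUP BY ProductKey;

-- Run 2: same query, but the columnstore index is ignored.
SELECT  ProductKey, SUM(SalesAmount) AS TotalSales
FROM    dbo.FactSales
GROUP BY ProductKey
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the two runs on representative data shows whether the columnstore index helps or hurts that specific query.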

Some of the performance benefit of a columnstore index is derived from the compression techniques that reduce the number of data pages that must be read and manipulated to process the query. Compression works best on character or numeric columns that have large amounts of duplicated values. For example, dimension tables might have columns for postal codes, cities, and sales regions. If many postal codes are located in each city, and if many cities are located in each sales region, then the sales region column would be the most compressed, the city column would have somewhat less compression, and the postal code would have the least compression. Although all columns are good candidates for a columnstore index, adding the sales region code column to the columnstore index will achieve the greatest benefit from columnstore compression, and the postal code will achieve the least.

References: SQL Server 2012 Books Online; SQL Server Technical Article: Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0, November 2010
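To get a feel for how much duplication each candidate column has, you can compare distinct value counts with the total row count. The sketch below uses a hypothetical dimension table dbo.DimGeography with SalesRegion, City and PostalCode columns, mirroring the example above.

```sql
-- Hypothetical dimension table: dbo.DimGeography (SalesRegion, City, PostalCode).
-- Fewer distinct values relative to TotalRows suggests better compression.
SELECT  COUNT(*)                    AS TotalRows,
        COUNT(DISTINCT SalesRegion) AS DistinctRegions,
        COUNT(DISTINCT City)        AS DistinctCities,
        COUNT(DISTINCT PostalCode)  AS DistinctPostalCodes
FROM    dbo.DimGeography;
```

In the scenario described above, you would expect DistinctRegions to be the smallest number and DistinctPostalCodes the largest.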

 


