Wednesday, October 31, 2012
Basic Restore commands all in one
Restore with different file names and/or file locations
RESTORE DATABASE Adventureworks
FROM DISK = 'D:\AdventureWorks.bak'
WITH MOVE 'Adventureworks_data' TO 'C:\Adventureworks_data.mdf',
MOVE 'Adventureworks_log' TO 'C:\Adventureworks_log.ldf'
Restore with different database name:
RESTORE DATABASE Adventureworks1
FROM DISK = 'D:\AdventureWorks.bak'
WITH MOVE 'Adventureworks_data' TO 'C:\Adventureworks1_data.mdf',
MOVE 'Adventureworks_log' TO 'C:\Adventureworks1_log.ldf',
REPLACE -- MOVE is needed so the restore does not collide with the original database's files
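Before restoring with MOVE, it helps to confirm the logical file names stored in the backup. A quick check against the same backup file:
RESTORE FILELISTONLY
FROM DISK = 'D:\AdventureWorks.bak'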
SQL Server Indexes
What is an index?
In English, an index is defined as something that serves to guide, point out, or otherwise facilitate reference, and the same concept applies to SQL Server. To understand this concept, let's take the example of a telephone directory. Assume that the entries are written into the directory in the order people applied for a connection, with no sorting by name or anything else. Just imagine how much effort it would take to find one name in that directory: to find the name 'Roli' you would have to go through every record until you reach it. Now, in the next scenario, if the data is arranged alphabetically, how will you search? You will open the pages at R, then Ro, then Rol, and then you can go through the records to find the name Roli.
Why use indexes?
An index helps in faster retrieval of data and reduces the number of I/O operations on the server.
Types of indexes in SQL Server
There are two types of indexes in SQL Server:
1. Clustered
2. Non-clustered
Architecture of Clustered Index
Properties
- The defining property of a clustered index is that its leaf pages contain the data pages themselves.
- Only one clustered index is possible on any table in SQL Server. The database engine allows the creation of a single clustered index per table because the data is physically sorted by the clustered index key, and data can only be physically sorted one way. Let's go back to our example of the telephone directory: if the data is sorted by first name, you cannot also sort it by surname.
- A clustered index is built by default for each table for which you define a primary key using the primary key constraint. Each clustered index is also unique by default, that is, each data value can appear only once in the column for which the clustered index is defined. If a clustered index is built on a non-unique column, the database engine enforces uniqueness by adding a 4-byte identifier (a 'uniqueifier') to the rows that have duplicate values.
- A clustered index allows very fast access in cases where a query searches for a range of values.
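A minimal sketch of creating a clustered index, using a hypothetical TelephoneDirectory table from the analogy above (all table and column names here are illustrative assumptions):
CREATE TABLE TelephoneDirectory
(FirstName nvarchar(50), Surname nvarchar(50), PhoneNumber varchar(20))
-- Physically sorts the table's data pages by FirstName
CREATE CLUSTERED INDEX CIX_TelephoneDirectory_FirstName
ON TelephoneDirectory (FirstName)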
Architecture of Non-Clustered Index
Properties
- A non-clustered index does not change the physical order of the rows in the table.
- The leaf pages of a non-clustered index consist of an index key plus a bookmark.
- For each non-clustered index, the SQL Server engine creates an additional index structure that is sorted in index pages.
Significant differences between Clustered & Non-Clustered Indexes
- A clustered index physically sorts the table's rows and its leaf pages are the data pages themselves; a non-clustered index does not change the physical order of the rows.
- The leaf pages of a non-clustered index consist of an index key plus a bookmark, whereas the leaf pages of a clustered index contain the data itself.
- Only one clustered index is possible per table, while many non-clustered indexes can be created on the same table.
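Continuing the same hypothetical sketch, a non-clustered index on Surname leaves the rows physically ordered by FirstName and builds a separate, sorted structure of index keys plus bookmarks:
-- Data pages stay ordered by FirstName; this index stores (Surname, bookmark) pairs
CREATE NONCLUSTERED INDEX IX_TelephoneDirectory_Surname
ON TelephoneDirectory (Surname)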
Tuesday, October 9, 2012
Partitioning of a Table in SQL Server
If you have a very large database, you can partition its tables to optimise the performance of your database. The concept of partitioning was introduced with SQL Server 2005. The benefit is that partitioning allows you to spread data across different physical disks, leveraging the concurrent performance of those disks to optimize query performance.
Partitioning a SQL Server database table is a 3-step process:
1. Creating a Partition Function
2. Creating a Partition Scheme
3. Partitioning a Table
Let's have a closer look at each of these steps.
Step 1: Creating a Partition Function
The partition function defines how you want SQL Server to partition the data. Here you need to define the boundaries of each partition. For example, suppose we have a Salesorderdetail table that contains information about sales orders, identified by a unique salesorderID, and holding records ranging from 1 to 400. We decided to partition the table into four equally spaced partitions.
Following is the syntax to create the partition function:
CREATE PARTITION FUNCTION RolSALES_PARTFUNC (int)
AS RANGE RIGHT FOR VALUES (100, 200, 300)
These boundaries define four partitions. The first contains all values less than 100. The second contains values between 100 and 199. The third contains values between 200 and 299. All values greater than or equal to 300 go in the fourth partition.
I used the "RANGE RIGHT" clause in this example.
This indicates that the boundary value itself should go in the partition on the
right side. Alternatively, if I had used "RANGE LEFT", the first
partition would have included all values less than or equal to 100; the second
partition would have included values between 101 and 200 and so on.
Step 2: Creating a Partition Scheme
Once you have a partition function describing how you want to split your data, you need to create a partition scheme defining where you want to store each partition. Creating a partition scheme links the partition function to filegroups. For example, if I created four filegroups named FG_example1 to FG_example4, the following statement would create the partition scheme:
CREATE PARTITION SCHEME RolSALES_partscheme
AS PARTITION RolSALES_PARTFUNC
TO (FG_example1, FG_example2, FG_example3, FG_example4)
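The statement above assumes the four filegroups already exist. A minimal sketch of creating one of them, assuming a hypothetical database named SalesDB and an arbitrary file path:
ALTER DATABASE SalesDB ADD FILEGROUP FG_example1
ALTER DATABASE SalesDB
ADD FILE (NAME = 'FG_example1_file', FILENAME = 'D:\Data\FG_example1.ndf')
TO FILEGROUP FG_example1
-- repeat for FG_example2 through FG_example4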
Did you notice that we have now linked a partition function to the partition scheme, but we still haven't linked the partition scheme to any specific database table? We could use this partition scheme on any number of database tables. That's where the power of reusability comes into the picture.
Step 3: Partitioning a Table
After defining a partition scheme, you’re now ready to
create a partitioned table. This is the simplest step of the process. You
simply add an "ON" clause to the table creation statement specifying
the partition scheme and the table column to which it applies. You don’t need
to specify the partition function because the partition scheme already
identifies it.
For example, to create a sales order table using our partition scheme, you would use the following Transact-SQL statement:
CREATE TABLE SalesOrderUpdate (ProductName nvarchar(40),
ProductID nvarchar(40), SalesOrderCode int)
ON RolSALES_partscheme (SalesOrderCode)
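To verify where rows will land, the $PARTITION function evaluates the partition function for a given value, and sys.partitions shows row counts per partition (table and function names as created above):
-- Value 150 falls in partition 2 (values 100-199)
SELECT $PARTITION.RolSALES_PARTFUNC(150) AS PartitionNumber
-- Row counts per partition for the new table
SELECT partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('SalesOrderUpdate')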
Once grasped, this is a very useful concept of SQL Server :)
Sunday, September 30, 2012
So how does recovery know what to do?
All recovery processes depend on the fact that each log record is stamped with a log sequence number (LSN). A log sequence number is an ever-increasing, three-part number that uniquely defines the position of a log record within the transaction log. Each log record in a transaction is stored in sequential order within the transaction log and contains the transaction ID and the LSN of the previous log record for the transaction. In other words, each operation that is recorded as part of the transaction has a "link" back to the operation that immediately preceded it.
For the simple case of a single transaction being rolled back, the recovery mechanism can easily and quickly follow the chain of logged operations from the most recent operation back to the first operation and undo the effects of the operations in the opposite order from which they occurred. The database pages that were affected by the transaction are either still in the buffer pool or on disk. In either case, the image of the page that is available is guaranteed to be one where the effect of the transaction is reflected on the page and must be undone.
During crash recovery, the mechanism is more complicated. The fact that database pages are not written to disk when a transaction commits means that there is no guarantee that the set of database pages on disk accurately reflects the set of changes described in the transaction log—either for committed or uncommitted transactions. However, there is one final piece of the puzzle that I haven't mentioned yet—all database pages have a field in their page header (a 96-byte portion of the 8192-byte page that contains metadata about the page) that contains the LSN of the last log record that affected the page. This allows the recovery system to decide what to do about a particular log record that it must recover:
- For a log record from a committed transaction where the database page has an LSN equal to or greater than the LSN of the log record, nothing needs to be done. The effect of the log record has already been persisted on the page on disk.
- For a log record from a committed transaction where the database page has an LSN less than the LSN of the log record, the log record must be redone to ensure the transaction effects are persisted.
- For a log record from an uncommitted transaction where the database page has an LSN equal to or greater than the LSN of the log record, the log record must be undone to ensure the transaction effects are not persisted.
- For a log record from an uncommitted transaction where the database page has an LSN less than the LSN of the log record, nothing needs to be done. The effect of the log record was not persisted on the page on disk and as such does not need to be undone.
Crash recovery reads through the transaction log and ensures that all effects of all committed transactions are persisted in the database, and all effects of all uncommitted transactions are not persisted in the database—the REDO and UNDO phases, respectively. Once crash recovery completes, the database is transactionally consistent and available for use.
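To see LSNs in practice, the undocumented fn_dblog function exposes the log records of the current database, including each record's LSN and transaction ID. A sketch for exploration only (fn_dblog is undocumented and unsupported, so not for production use):
SELECT TOP (10) [Current LSN], Operation, [Transaction ID]
FROM fn_dblog(NULL, NULL)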
Suspect Database
The suspect_pages table is used for maintaining information about suspect pages, and is relevant in helping to decide whether a restore is necessary.
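The table lives in msdb; a quick look at what it records:
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages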
Saturday, September 29, 2012
Configuring Transactional Replication - SQL Server 2008
There are 3 steps involved in setting up Transactional Replication:
1. Configuring the Distribution database.
2. Creating the Publisher.
3. Creating the Subscriber.
Scenario:
Here I am taking two SQL Server instances (Roli-w7\rolit & Roli-w7\roliSQL) installed on the same server. I will create the Distribution database on the 'rolit' instance and the Publisher on the 'roliSQL' instance. A Contact table belonging to the Person schema in the Adventureworks database is replicated to the rollsubs database on the same server.
1. Configuring the Distribution database
Step1. Connect to the Microsoft SQL Server 2008 R2 Management Studio.
Step2. Right Click on the Replication node and Select Configure Distribution.
Step3. A new window appears which holds only information about the wizard. Click Next.
Step4. A new window appears where you choose whether the distribution database will reside on this server or whether some other server will host the Distributor.
Note: The distribution database can reside on the same server as the Publisher, but it is always recommended to keep it on a separate server for better performance.
Step5. In the next window, select the first option, i.e. Yes, configure the SQL Server Agent service to start automatically, and click the Next> button.
Step6. In the next screen, you are asked for the location of the Snapshot folder.
Step7. The next screen displays the distribution database name and the location where the data and log files will reside. Make any required changes and click Next.
Note: here you get the option of adding Publishers.
Step8. Click Next.
Step9. Choose to configure the distribution database.
Step10. Click Finish.
Step11. The Snapshot folder is created & the distribution database is configured.
After this, a new database named distribution gets created. You can confirm this by expanding the System Databases node, where you will see the distribution database.
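As a quick sanity check outside the GUI, you can confirm the Distributor configuration from T-SQL (run on the Distributor instance):
-- Returns the distributor name, distribution database, and related settings
EXEC sp_helpdistributor
SELECT name FROM sys.databases WHERE name = N'distribution'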
2. Creating the Publisher
Step1: Right Click on Local Publications and select New Publication.
Step2: New Publication window will appear. Click Next.

Step3: Choose the Publication Database and Click Next.
Step4: Choose 'Transactional publication' as the publication type.
Step5: Here you choose what you want to publish.
Step6: I have chosen only one table for replication.
Step7: If you want to filter the data, you can add a filter here; I did not add any filter in this example. Click Next.
Step8: Choose when you want to run the Snapshot Agent.
Step9: Click Security Settings.
Step10: Here, specify the domain or machine account under which the Snapshot Agent process will run.
Step11: Check the 'Create the publication' checkbox and proceed to the next screen.
Step12: Give the publication a name. I have named my publication RollRep.
The publication is created, and now we will move ahead to create the subscriber.
3. Creating the Subscriber
Step1: Open SSMS and go to Replication. Click New Subscriptions.
Step2: A new Subscription wizard will appear; click Next.
Step3: This step allows you to choose the publication for which you want to create subscriptions. You can create one or more subscriptions.
Step4: Now choose the type of subscription you want to create, Push or Pull. After choosing the type, click Next.
Step5: Choose the Subscriber and click Next.

Step6: Choose connections to the Distributor and Subscriber.
Step7: Specify the Distribution Agent security and the mode of connection to the Distributor and Subscriber.
Step9: Initialize the subscriptions.

Step10: Choose Create Subscription and click Next.
Step11: The new subscription is successfully created.

To ensure that your replication is working fine, keep a close eye on Replication Monitor.
Replication Monitor is used for measuring latency and validating connections for transactional replication. Transactional replication provides the tracer token feature, which offers a convenient way to measure latency in transactional replication topologies and to validate the connections between the Publisher, Distributor and Subscribers. A token (a small amount of data) is written to the transaction log of the publication database, marked as though it were a typical replicated transaction, and sent through the system, allowing a calculation of:
- How much time elapses between a transaction being committed at the Publisher and the corresponding command being inserted in the distribution database at the Distributor.
- How much time elapses between a command being inserted in the distribution database and the corresponding transaction being committed at a Subscriber.
- Which Subscribers take the longest to receive a change from the Publisher?
- Of the Subscribers expected to receive the tracer token, which, if any, have not received it?
Based on the measured latency, Replication Monitor reports one of the following performance ratings:
- Excellent
- Good
- Fair
- Poor
- Critical (transactional replication only)
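Tracer tokens can also be posted and examined from T-SQL instead of Replication Monitor. A sketch, run at the Publisher on the publication database and assuming the RollRep publication from this example:
DECLARE @token_id int
EXEC sys.sp_posttracertoken @publication = N'RollRep',
@tracer_token_id = @token_id OUTPUT
-- After the token has flowed through the topology, examine the latency figures
EXEC sys.sp_helptracertokens @publication = N'RollRep'
EXEC sys.sp_helptracertokenhistory @publication = N'RollRep', @tracer_id = @token_id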
Sunday, September 23, 2012
Transactional Replication - Deep dive
Transactional replication is the most widely used form of SQL Server replication. It's more full-featured than snapshot replication but much easier to set up and manage than merge replication. There is a common belief that merge replication is the only replication type that supports bidirectional data replication. That's not the case, as both snapshot and transactional replication offer immediate and queued updating subscriptions.
SQL Server implements transactional replication via the Snapshot Agent, the Log Reader Agent, and the Distribution Agent. The Snapshot Agent prepares the initial snapshot of a transactional publication. The Log Reader Agent scans the T-log on the Publisher, detects the changes made to the data after the snapshot has been taken, and records them in the distribution database. The Distribution Agent reads the changes recorded in the distribution database and applies them to the Subscriber.
Each modification to a published table causes the Log Reader Agent to write at least one row to MSrepl_commands. Unlike snapshot replication, the data in MSrepl_commands is not stored in human-readable form; you have to use sp_browsereplcmds, a stored procedure that returns a readable version of the replicated commands stored in the distribution database.
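A quick sketch of viewing those commands, run on the Distributor in the distribution database (on a busy system you would filter by article or transaction rather than dumping everything):
-- Returns a readable version of the commands stored in MSrepl_commands
EXEC distribution.dbo.sp_browsereplcmds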
The Log Reader Agent uses the extended procedure sp_replcmds (implemented internally by SQL Server) to retrieve log records produced by DML statements in the published database's T-log. Each publisher database participating in transactional replication will have only one Log Reader Agent, regardless of how many transactional publications it contains. Each database published using transactional replication has its own Log Reader Agent that runs on the Distributor and connects to the Publisher.
If one publication calls sp_replcmds for a database, it is considered the log reader for that database until it disconnects. Other publications attempting to run sp_replcmds before the first disconnects will get an error stating that 'another log reader is replicating the database'. One other reason why only one Log Reader Agent is permitted per database is that scanning the log for changes can impact performance. Each time the Log Reader Agent invokes sp_replcmds, it causes log reader code within the SQL Server process to scan the published database's transaction log for changes that need to be replicated. When it does this, the Log Reader Agent changes the typically sequential method SQL Server uses to access the log into something more random. While the server writes new entries to the end of the T-log as changes are made to the database, the Log Reader Agent may be reading a different section of the log in order to write replication commands to MSrepl_commands and MSrepl_Transactions. This can cause resource contention for the T-log and impact the performance of the server.
SQL Server maintains a global cache of article metadata known as the article cache. This cache stores metadata from sysarticles and syscolumns for the replicated articles. SQL Server consults this cache when it requires metadata for a particular article.
Once the Log Reader Agent finishes calling sp_replcmds and writing new entries to MSrepl_commands and MSrepl_Transactions, it calls sp_repldone to indicate that the specified log records have been successfully replicated. This allows SQL Server to purge log records as necessary. Log records for articles cannot be purged until they have been replicated to the Distributor.