Saturday, 6 December 2014

exam 70-461 summary

Chapter 1
Dated : 18-Oct-2014
T-SQL is based on a strong mathematical foundation.
Set theory + Predicate logic => Relational Model => SQL => T-SQL.
The SQL standard is maintained by the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI).
Standard SQL is based on the relational model, which is a mathematical model for data management and manipulation
A relation in the relational model is what SQL calls a table. A relation has a heading and a body: the heading is a set of attributes, and the body is a set of tuples. The mathematical branches that the relational model is based on are:

·          Set theory
·          Predicate logic


“By a ‘set’ we mean any collection M into a whole of definite, distinct objects m (which are called the ‘elements’ of M) of our perception or of our thought.” (Georg Cantor's definition of a set)

The other branch of mathematics that the relational model is based on is called predicate logic.
 A predicate is an expression that when attributed to some object, makes a proposition either true or false. For example, “salary greater than $50,000” is a predicate

The relational model uses predicates as one of its core elements. You can enforce data
integrity by using predicates. You can filter data by using predicates


Name two aspects in which T-SQL deviates from the relational model.
A relation has a body with a distinct set of tuples, whereas a T-SQL table doesn't have to have a key and therefore can contain duplicate rows.
T-SQL allows referring to ordinal positions of columns in the ORDER BY clause.

Why are the terms “field” and “record” incorrect when referring to column and
row?
Because “field” and “record” describe physical things, whereas columns and
rows are logical elements of a table.
Why is the term “NULL value” incorrect?
Because NULL isn’t a value; rather, it’s a mark for a missing value


Make the Nonrelational Query Relational
·         Define an attribute name by assigning an alias to the expression YEAR(orderdate).
·         Add a DISTINCT clause to remove duplicates.
·         Also, remove the ORDER BY clause to return a relational result.
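Putting the three fixes together, a sketch against the sample Sales.Orders table used elsewhere in these notes:

```sql
-- Relational result: named attribute, no duplicates, no ORDER BY
SELECT DISTINCT YEAR(orderdate) AS orderyear
FROM Sales.Orders;
```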

Lesson-2 - Logical query processing

T-SQL has both logical and physical sides to it.
The logical side is the conceptual interpretation of the query that explains what the correct result of the query is.
The physical side is the processing of the query by the database engine.
Physical processing must produce the result defined by logical query processing.
Logical query processing - the conceptual interpretation of the query that defines the correct result.
The logical processing of a query must first know which table is being queried
before it can know which attributes can be returned from that table.
Therefore, contrary to the keyed-in order of the previous query, logical query processing has to be as follows.
FROM Sales.Shippers
SELECT shipperid, phone, companyname

Logical query processing follows this sequence:

1.     Evaluate the FROM Clause
2.     Evaluate the Where Clause
3.     Evaluate the Group by Clause
4.     Evaluate the Having Clause
5.     Evaluate the Select Clause
6.     Evaluate the Order by Clause


What is the difference between the WHERE and HAVING clauses?
The WHERE clause is evaluated before rows are grouped, and therefore is evaluated
per row. The HAVING clause is evaluated after rows are grouped, and therefore
is evaluated per group.
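A minimal sketch against Sales.Orders illustrating the two clauses (the date and the threshold are arbitrary):

```sql
SELECT empid, COUNT(*) AS numorders
FROM Sales.Orders
WHERE orderdate >= '20070101'  -- per-row filter, applied before grouping
GROUP BY empid
HAVING COUNT(*) > 100;         -- per-group filter, applied after grouping
```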

Dated : 19-Oct-2014
Chapter -2
Delimiting Identifiers
The rules say that the first character must be a letter in the range A through Z (lowercase or uppercase), underscore (_), at sign (@), or number sign (#).
Subsequent characters can include letters, decimal numbers, the at sign, dollar sign ($), number sign, or underscore. The identifier cannot be a reserved keyword in T-SQL, cannot have embedded spaces, and must not include supplementary characters.
An identifier that doesn't comply with these rules must be delimited; for example, 2006 must be delimited as "2006" or [2006].
Choosing a Data Type for Keys

Understanding the built-in tools T-SQL provides for generating surrogate keys like the
sequence object, identity column property, and the NEWID and NEWSEQUENTIALID
functions, and their impact on performance, is an important skill for the exam.
20-Oct-14
FLOAT is an approximate-number data type, which means that not all values in the data type range can be represented exactly.
DECIMAL/NUMERIC is a fixed-precision data type, which means that all values in the data type range can be represented exactly with the specified precision and scale.

Converting from Decimal or Numeric to float can cause some loss of precision

Use char when the sizes of the column data entries are consistent.
Use varchar when the sizes of the column data entries vary considerably.
Use varchar(max) when the sizes of the column data entries vary considerably, and the size might exceed 8,000 bytes.


What is the difference between the below conversion functions in SQL Server 2012?
1).Cast
2).Convert
3).Parse
4).Try_Cast
5).Try_Convert
6).Try_Parse.


TRY_PARSE ( string_value AS data_type [ USING culture ] )
This function works similarly to the well-known CAST, but returns NULL if the string_value does not represent a valid formatted value of the given data type.
PARSE ( string_value AS data_type [ USING culture ] )
This works pretty much the same as TRY_PARSE, but this function throws an error if the string_value passed to it is not a valid representation of the data type. The same parsings as above:
SELECT PARSE('13-12-2011' AS DATETIME USING 'da-DK') AS Result; -- succeeds
SELECT PARSE('13-12-2011' AS DATETIME USING 'en-US') AS Result; -- fails: there is no month 13 in en-US
TRY_CONVERT ( data_type [ ( length ) ], expression [, style ] )
This is pretty much the same as the well known CONVERT, but again with the NULL output for invalid input strings.
But what if there were millions of rows in the table, and multiple non-integer values? It would not be pleasant to try to identify them all. With TRY_PARSE, we can now identify the rows like this:

SELECT * FROM MyBadTableDesign
WHERE TRY_PARSE(val AS INT) IS NULL;

The following code fails with a conversion error:
SELECT CAST('abc' AS INT);
Conversely, the following code returns a NULL:
SELECT TRY_CAST('abc' AS INT);
24-Oct-14
String Length
LEN and DATALENGTH
LEN >> returns the number of characters (ignoring trailing spaces)
DATALENGTH >> returns the number of bytes
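The difference shows clearly with Unicode strings, where each character takes two bytes:

```sql
SELECT
  LEN(N'abc ')        AS num_chars,  -- 3: trailing space not counted
  DATALENGTH(N'abc ') AS num_bytes;  -- 8: 4 characters x 2 bytes each
```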

Would you use the type FLOAT to represent a product unit price?
No, because FLOAT is an approximate data type and cannot represent all values
precisely.

What is the difference between NEWID and NEWSEQUENTIALID?
The NEWID function generates GUID values in random order, whereas the
NEWSEQUENTIALID function generates GUIDs that increase in sequential
order.
When concatenating character strings, what is the difference between the plus
(+) operator and the CONCAT function?
The + operator by default yields a NULL result on NULL input, whereas the
CONCAT function treats NULLs as empty strings.
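A quick sketch of the difference:

```sql
SELECT
  'Smith' + ', ' + NULL       AS plus_result,   -- NULL
  CONCAT('Smith', ', ', NULL) AS concat_result; -- 'Smith, '
```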



differences between COALESCE and ISNULL
COALESCE accepts a list of inputs and returns the first one that is not NULL.
ISNULL is more limited than COALESCE: it accepts only two inputs, and the data type of its result is determined by the first input, whereas COALESCE determines the result type from all inputs by data type precedence.
DECLARE
@x AS VARCHAR(3) = NULL,
@y AS VARCHAR(10) = 'unknows';
SELECT COALESCE(@x, @y) AS [COALESCE], ISNULL(@x, @y) AS [ISNULL];

OUTPUT will be

COALESCE        ISNULL
----------        ---------
unknows           unk

NULLIF function. This function accepts two input expressions,
returns NULL if they are equal, and returns the first input if they are not. For example,
consider the expression NULLIF(col1, col2). If col1 is equal to col2, the function returns a
NULL; otherwise, it returns the col1 value.
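For example:

```sql
SELECT
  NULLIF(10, 10) AS equal_inputs,     -- NULL
  NULLIF(10, 20) AS different_inputs; -- 10
```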

IIF and CHOOSE are nonstandard T-SQL functions that were added to simplify
migrations from Microsoft Access.

IIF(<predicate>, <true_result>, <false_or_unknown_result>)
This expression is equivalent to the following.
CASE WHEN <predicate> THEN <true_result> ELSE <false_or_unknown_result> END
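A minimal sketch of both functions (CHOOSE returns the element at the given 1-based index):

```sql
SELECT
  IIF(2 > 1, 'yes', 'no')        AS iif_result,    -- 'yes'
  CHOOSE(2, 'Jan', 'Feb', 'Mar') AS choose_result; -- 'Feb'
```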

DATETIME data type uses 8 bytes of storage. SQL Server 2012 supports the DATE
data type, which uses 3 bytes of storage

new functions
EOMONTH , FORMAT , DATEFROMPARTS
SWITCHOFFSET
and TODATETIMEOFFSET
25-Oct-14
TRY_CAST returns a NULL instead of failing

The right form for a Unicode character string literal is to prefix the literal with a capital N and delimit the
literal with single quotation marks; for example, N'literal'

WHERE lastname = N'Davis'; best practice to use the proper form

pattern matching
'_D%': string where second character is D,
'[AC]%': string where first character is A or C,
'[0-9]%': string where first character is a digit
'[^0-9]%': string where first character is not a digit
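For example, against the sample HR.Employees table:

```sql
-- Last names whose second character is D
SELECT lastname
FROM HR.Employees
WHERE lastname LIKE N'_D%';
```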

The term three-valued logic refers to the three possible truth values of a predicate in T-SQL: true, false, and unknown.

Date and time literals can be language-dependent; for the DATETIME data type, prefer a language-neutral format such as 'YYYYMMDD'.

SQL predicates are used as search arguments, for example:
companyname LIKE 'A%'
orderdate >= '20120212' AND orderdate < '20120213'
A predicate is an expression that evaluates to TRUE, FALSE, or UNKNOWN. Predicates are used in the search condition of WHERE clauses and HAVING clauses, the join conditions of FROM clauses, and other constructs where a Boolean value is required.


To guarantee the order of the rows in the result of a query, use the ORDER BY clause.

You want result rows to be sorted by orderdate descending, and then by orderid,
descending.
ORDER BY orderdate DESC, orderid DESC.


WITH TIES returns, in addition to the TOP rows, all other rows that have the same value in the ORDER BY column as the last qualifying row.

SELECT TOP (1) WITH TIES * FROM #MyTable ORDER BY age DESC;

If several rows share the top age (say 20), all of them are returned.

Filtering Data with OFFSET-FETCH
query defines ordering based on order date descending, followed by order ID
descending; it then skips 50 rows and fetches the next 25 rows.
SELECT orderid, orderdate, custid, empid
FROM Sales.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 50 ROWS FETCH NEXT 25 ROWS ONLY

26-oct-2014
What’s the difference between the ON and the WHERE clauses?
In an inner join, ON and WHERE play the same filtering role. In an outer join they differ: ON determines which rows from the nonpreserved side are matched, and rows from the preserved side without a match are still returned with NULLs in the nonpreserved side’s attributes; WHERE filters the final result after the outer rows have been added.

If the OUTER keyword is omitted from LEFT OUTER JOIN, RIGHT OUTER
JOIN, and FULL OUTER JOIN, the meaning is retained.
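The ON versus WHERE distinction can be sketched with a left outer join against the sample tables:

```sql
-- Customers that have no orders at all
SELECT C.custid, O.orderid
FROM Sales.Customers AS C
  LEFT OUTER JOIN Sales.Orders AS O
    ON C.custid = O.custid   -- matching: customers without orders are preserved
WHERE O.orderid IS NULL;     -- filtering: keeps only the unmatched outer rows
```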
27-OCT-2014

With subqueries, you can nest queries within queries. You can use self-contained
subqueries as well as correlated ones. You can use subqueries that return single-valued
results, multi-valued results, and table-valued results.

T-SQL supports four kinds of table expressions, which are named query expressions.
Derived tables and CTEs are types of table expressions that are visible only in the scope
of the statement that defined them. Views and inline table-valued functions are reusable
table expressions whose definitions are stored as objects in the database. Views
do not support input parameters, whereas inline table-valued functions do

What are the benefits of using a CTE over derived tables?

·         If you want to refer to one derived table from another, you need to nest
them. With CTEs, you separate those by commas, so the code is more modular and
easier to follow.

·         Because the CTE name is defined before the outer query that uses it, the
outer query is allowed to refer to multiple instances of the same CTE name.
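Both benefits show up in a sketch like the following, which refers to the same CTE twice to compare each year with the previous one (the names are illustrative):

```sql
WITH YearlyCounts AS
(
  SELECT YEAR(orderdate) AS orderyear, COUNT(*) AS numorders
  FROM Sales.Orders
  GROUP BY YEAR(orderdate)
)
SELECT Cur.orderyear, Cur.numorders, Prv.numorders AS prevyearnumorders
FROM YearlyCounts AS Cur
  LEFT OUTER JOIN YearlyCounts AS Prv
    ON Cur.orderyear = Prv.orderyear + 1;
```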

The APPLY operator has two
versions; the CROSS APPLY version doesn’t return left rows that get an empty set back
from the right side. The OUTER APPLY operator preserves the left side, and therefore,
does return left rows when the right side returns an empty set.


What is the difference between the result of T1 CROSS APPLY T2 and T1 CROSS JOIN
T2 (the right table expression isn’t correlated to the left)?

·         Both return the same result when there’s no correlation because CROSS
APPLY applies all rows from T2 to each row from T1
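A correlated example, where APPLY does something a join cannot express as directly: returning the three most recent orders per customer.

```sql
SELECT C.custid, A.orderid, A.orderdate
FROM Sales.Customers AS C
  CROSS APPLY (SELECT TOP (3) orderid, orderdate
               FROM Sales.Orders AS O
               WHERE O.custid = C.custid        -- correlation to the left row
               ORDER BY orderdate DESC, orderid DESC) AS A;
```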



1-Nov-2014

What are the clauses that you can use to define multiple grouping sets in the
same query?
GROUPING SETS, CUBE, and ROLLUP

Pivoting Data
Pivoting is a technique that groups and aggregates data, transitioning it from a state of rows
to a state of columns.

The PIVOT operator determines the grouping element by elimination—
it’s what’s left besides the spreading and aggregation elements.

You cannot specify a computation as input to the aggregate function, only a name of a column from the input table.
You cannot specify a computation as the spreading element, only a name of a column from the input table.
You cannot specify a subquery in the IN clause, only a static list.
You cannot specify multiple aggregate functions, only one.
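These restrictions explain why computations are usually prepared in a derived table first, as in this sketch against the sample Sales.OrderValues view:

```sql
SELECT custid, [2006], [2007], [2008]
FROM (SELECT custid, YEAR(orderdate) AS orderyear, val
      FROM Sales.OrderValues) AS D        -- computation done here, not in PIVOT
  PIVOT(SUM(val) FOR orderyear IN ([2006], [2007], [2008])) AS P;
```

Here the grouping element is custid, determined by elimination.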


WINDOW FUNCTIONS

Like grouped functions, window functions enable you to perform data analysis computations.
With window functions, you define the set of rows per function, and the function returns one result value for each underlying row. You define the set of rows for the function to work with using a clause called OVER.

window aggregate functions(SUM,AVG,COUNT) are applied to a window of rows defined by the OVER clause.
For example, the expression SUM(val) OVER() represents the grand total of all rows in the underlying query.
For example, the expression SUM(val) OVER(PARTITION BY custid) represents the current customer’s total.

SELECT custid, orderid,
val,
SUM(val) OVER(PARTITION BY custid) AS custtotal,
SUM(val) OVER() AS grandtotal
FROM Sales.OrderValues


Indicate the delimiters as one of three options:
UNBOUNDED PRECEDING or FOLLOWING, meaning the beginning or end of the partition,
respectively
CURRENT ROW, obviously representing the current row
<n> ROWS PRECEDING or FOLLOWING, meaning n rows before or after the current,
respectively


WITH RunningTotals AS
(
SELECT custid, orderid, orderdate, val,
SUM(val) OVER(PARTITION BY custid
ORDER BY orderdate, orderid
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) AS runningtotal
FROM Sales.OrderValues
)
SELECT *
FROM RunningTotals
WHERE runningtotal < 1000.00;


ROW_NUMBER, RANK, DENSE_RANK, and NTILE

The ROW_NUMBER function computes a unique sequential integer starting with 1 within
the window partition based on the window ordering.

RANK and DENSE_RANK differ from ROW_NUMBER in the sense that they assign the same
ranking value to all rows that share the same ordering value.

What are the clauses that the different types of window functions support?
Partitioning, ordering, and framing clauses.
What do the delimiters UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING
represent?
The beginning and end of the partition, respectively.
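All four ranking functions side by side; ties in val make RANK and DENSE_RANK differ from ROW_NUMBER:

```sql
SELECT orderid, val,
  ROW_NUMBER() OVER(ORDER BY val) AS rownum,
  RANK()       OVER(ORDER BY val) AS rnk,
  DENSE_RANK() OVER(ORDER BY val) AS densernk,
  NTILE(10)    OVER(ORDER BY val) AS ntile10  -- 10 equally sized tiles
FROM Sales.OrderValues;
```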


Window Offset Functions
Window offset functions return an element from a single row that is at a given offset from the current row in the window partition, or from the first or last row in the window frame.

LAG, LEAD, FIRST_VALUE, and LAST_VALUE

SELECT custid, orderid, orderdate, val,
FIRST_VALUE(val) OVER(PARTITION BY custid
ORDER BY orderdate, orderid
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) AS first_val,
LAST_VALUE(val) OVER(PARTITION BY custid
ORDER BY orderdate, orderid
ROWS BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING) AS last_val
FROM Sales.OrderValues;
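LAG and LEAD follow the same pattern but need no frame; they return NULL where no previous or next row exists:

```sql
SELECT custid, orderid, orderdate, val,
  LAG(val)  OVER(PARTITION BY custid
                 ORDER BY orderdate, orderid) AS prevval,
  LEAD(val) OVER(PARTITION BY custid
                 ORDER BY orderdate, orderid) AS nextval
FROM Sales.OrderValues;
```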

What is the default frame for window functions that support framing?
RANGE UNBOUNDED PRECEDING, which is equivalent to
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

What do the RANK and DENSE_RANK functions compute.
The RANK function returns one more than the number of rows that have a lower
ordering value than the current; the DENSE_RANK function returns one more than
the number of distinct ordering values that are lower than the current.
So, we find that RANK() skips ranking values after ties, whereas DENSE_RANK() keeps the ranking values consecutive.

Why are window functions allowed only in the SELECT and ORDER BY clauses of a
Query ?
Because they are supposed to operate on the underlying query’s result, which is
achieved when logical query processing gets to the SELECT phase.

03-NOV-2014

Full-text queries can search not only for words you provide in a query; they can search
for synonyms as well. SQL Server finds synonyms in thesaurus files. Each language has an
associated XML thesaurus file.

To create synonyms for the words searched, you can edit the thesaurus file.

SQL Server supports two very powerful predicates for limiting the result set of a query by using
full-text indexes. These two predicates are the CONTAINS and FREETEXT predicates.


04-NOV-2014

FOR XML can be used only in an outermost SELECT statement; as a subquery, it can also be used in INSERT, UPDATE, and DELETE statements, as well as in assignment statements.
In a FOR XML clause you can specify these modes.
1.     AUTO
2.     RAW
3.     EXPLICIT
4.     PATH


·         The AUTO mode with FOR XML generates nesting in the resulting XML based on the way the SELECT statement is specified.
·         The RAW mode with FOR XML generates a single <row> element per row returned by the SELECT statement.
·         The EXPLICIT mode gives more control on the shape of the XML that we need to generate. We can get attributes and elements together in deciding the shape of the XML. It requires a specific format for the resulting rowset that is generated because of query execution. This rowset format is then mapped into XML shape.
·         The PATH mode together with the nested FOR XML query capability provides the flexibility of the EXPLICIT mode in a simpler manner.


Examples: Players table

Id   FirstName      LastName    CITY
1    Sachin         Tendulkar   Mumbai
2    Saurav         Ganguli     Kolkata
3    Mahendrasinh   Dhoni       Chennai

SELECT * FROM Players FOR XML AUTO
<Players Id="1" FirstName="Sachin" LastName="Tendulkar" CITY="Mumbai" />
<Players Id="2" FirstName="Saurav" LastName="Ganguli" CITY="Kolkata" />
<Players Id="3" FirstName="Mahendrasinh" LastName="Dhoni" CITY="Chennai" />

SELECT * FROM Players FOR XML RAW('Cricketers')
<Cricketers Id="1" FirstName="Sachin" LastName="Tendulkar" CITY="Mumbai" />
<Cricketers Id="2" FirstName="Saurav" LastName="Ganguli" CITY="Kolkata" />
<Cricketers Id="3" FirstName="Mahendrasinh" LastName="Dhoni" CITY="Chennai" />

SELECT * FROM Players FOR XML PATH('Player')
<Player>
  <Id>1</Id>
  <FirstName>Sachin</FirstName>
  <LastName>Tendulkar</LastName>
  <CITY>Mumbai</CITY>
</Player>
<Player>
  <Id>2</Id>
  <FirstName>Saurav</FirstName>
  <LastName>Ganguli</LastName>
  <CITY>Kolkata</CITY>
</Player>
<Player>
  <Id>3</Id>
  <FirstName>Mahendrasinh</FirstName>
  <LastName>Dhoni</LastName>
  <CITY>Chennai</CITY>
</Player>


SELECT 
        1 AS Tag,
        NULL AS Parent,
        ID AS 'Player!1!Id',
        NULL AS 'Detail!2!FirstName',
        NULL AS 'Detail!2!LastName',
        NULL AS 'Detail!2!City'
FROM Players WHERE Id = 1
UNION ALL
SELECT 
        2 AS Tag,
        1 AS Parent,
        NULL,
        FirstName,
        LastName,
        City
FROM Players WHERE Id = 1
FOR XML EXPLICIT

<Player Id="1">
  <Detail FirstName="Sachin" LastName="Tendulkar" City="Mumbai" />
</Player>





The FOR XML AUTO option gives you nice XML documents with nested elements, and it is not
complicated to use. In AUTO and RAW modes, you can use the keyword ELEMENTS to produce
element-centric XML. The WITH NAMESPACES clause, preceding the SELECT part of the
query, defines namespaces and aliases in the returned XML.

A colon is used in XML to separate the namespace from the element name.

WITH XMLNAMESPACES('TK461-CustomersOrders' AS co)
SELECT [co:Customer].custid AS [co:custid],
       [co:Customer].companyname AS [co:companyname],
       [co:Order].orderid AS [co:orderid],
       [co:Order].orderdate AS [co:orderdate]
FROM Sales.Customers AS [co:Customer]
  INNER JOIN Sales.Orders AS [co:Order]
    ON [co:Customer].custid = [co:Order].custid
ORDER BY [co:Customer].custid, [co:Order].orderid
FOR XML AUTO, ELEMENTS, ROOT('CustomersOrders');

The FOR XML clause comes after the ORDER BY clause in a query.

In RAW and AUTO mode, you can also return the XSD schema of the document you are
creating. This schema is included inside the XML that is returned, before the actual XML data;
therefore, it is called inline schema. You return XSD with the XMLSCHEMA directive. This directive
accepts a parameter that defines a target namespace.

SELECT [Customer].custid AS [custid],[Customer].companyname AS [companyname],
[Order].orderid AS [orderid],[Order].orderdate AS [orderdate]
FROM Sales.Customers AS [Customer]
INNER JOIN Sales.Orders AS [Order]
ON [Customer].custid = [Order].custid
WHERE 1 = 2
FOR XML AUTO, ELEMENTS,
XMLSCHEMA('TK461-CustomersOrders');

XMLSCHEMA directive in the FOR XML clause

Format Return XML

Automatic format:
·         FOR XML AUTO automatically formats the XML returned.
·         FOR XML RAW automatically formats the XML returned.
Manually Format:
·          FOR XML PATH allows you to manually format the XML returned.
·         FOR XML EXPLICIT allows you to manually format the XML returned.



Shredding XML to Tables (convert xml to table)

The OPENXML function provides a rowset over in-memory XML documents by using
Document Object Model (DOM) presentation.

The OPENXML function accepts an optional third parameter, called flags, which allows you
to specify the mapping used between the XML data and the relational rowset.
1 means attribute-centric mapping: OPENXML (@DocHandle, '/CustomersOrders/Customer', 1)
2 means element-centric mapping: OPENXML (@DocHandle, '/CustomersOrders/Customer', 2)
3 (that is, 1+2) means both: OPENXML (@DocHandle, '/CustomersOrders/Customer', 3)


Flag value 8 can be combined with values 1 and 2 (1+2+8)

SELECT * FROM OPENXML (@DocHandle, '/CustomersOrders/Customer',11)
WITH (custid INT, companyname NVARCHAR(40));
-- Remove the DOM
EXEC sys.sp_xml_removedocument @DocHandle;

XQuery is a standard language for browsing XML instances and returning XML.


The real power of XQuery lies in its so-called FLWOR expressions. FLWOR is the acronym for
for, let, where, order by, and return. A FLWOR expression is essentially a for-each loop.

Which conditional expression is supported in XQuery? Answer: if..then..else.

The easiest way to create XML schemas is to create relational tables first, and then use the
XMLSCHEMA option of the FOR XML clause.
Store the resulting XML value (the schema) in a variable, and provide the variable as input to the CREATE XML SCHEMA COLLECTION statement.



What are the XML data type methods?
nodes(), exist(), value(), query(), and modify() are the XML data type methods.


What kind of XML indexes can you create?
1.     You create a PRIMARY XML index before any other XML indexes.
2.     A PATH XML index is especially useful if your queries specify path expressions.


Which XML data type method do you use to shred XML data to tabular format?
You use the nodes() method to shred XML data.
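A small sketch of exist(), nodes(), and value() together (the XML shape here is made up for illustration):

```sql
DECLARE @x AS XML = N'
<CustomersOrders>
  <Customer custid="1">
    <Order orderid="10692" />
    <Order orderid="10702" />
  </Customer>
</CustomersOrders>';

-- exist() returns 1 if the path matches at least one node
SELECT @x.exist('/CustomersOrders/Customer[@custid = 1]') AS customer_exists;

-- nodes() shreds the orders to rows; value() extracts a scalar from each node
SELECT T.c.value('@orderid', 'INT') AS orderid
FROM @x.nodes('/CustomersOrders/Customer/Order') AS T(c);
```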

07-NOV-2014

Database Schema and Table Schema

A database schema is a database-wide container of objects. A table schema is the definition of a table that
includes the CREATE TABLE statement with all the column definitions

·         The DATE, TIME, and DATETIME2 data types can store data more efficiently and with
better precision than DATETIME and SMALLDATETIME.
·         Use VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) instead of the deprecated
TEXT, NTEXT, and IMAGE data types.
·         Use ROWVERSION instead of the deprecated TIMESTAMP.
·         DECIMAL and NUMERIC are the same data type, but generally people prefer DECIMAL
because the name is a bit more descriptive. Use DECIMAL and NUMERIC instead
of FLOAT or REAL data types unless you really need floating-point precision and are
familiar with possible rounding issues.


Computed Columns:
CREATE TABLE Sales.OrderDetails
(
orderid INT NOT NULL,
unitprice MONEY NOT NULL, -- added so the computed column has its inputs
qty INT NOT NULL,
Initialcost AS unitprice * qty -- computed column
);

What types of table compression are available? You can use either page or row compression on a table. Page compression
includes row compression.


Default constraints are most useful when you have a column in a table that does not allow
NULL, but you don't want to prevent an INSERT from succeeding if it does not specify a value
for the column
CREATE TABLE Production.Products -- table name assumed from the constraint names
(
supplierid INT NOT NULL,
categoryid INT NOT NULL,
unitprice MONEY NOT NULL
  CONSTRAINT DFT_Products_unitprice DEFAULT(0),
discontinued BIT NOT NULL
  CONSTRAINT DFT_Products_discontinued DEFAULT(0)
);


What metadata tables give you a list of constraints in a database?
sys.key_constraints , sys.default_constraints , sys.foreign_keys



Views

CREATE VIEW Sales.OrderTotalsByYear
WITH SCHEMABINDING
AS
SELECT YEAR(O.orderdate) AS orderyear, SUM(OD.qty) AS qty
FROM Sales.Orders AS O
  INNER JOIN Sales.OrderDetails AS OD
    ON O.orderid = OD.orderid
GROUP BY YEAR(O.orderdate);
-- (FROM and GROUP BY completed so the fragment is a valid view definition)

WITH ENCRYPTION (with encryption of view code)
WITH SCHEMABINDING (which guarantees that the underlying table structures cannot be altered without dropping the view)
WITH VIEW_METADATA (returns the metadata of the view instead of the base table.)


If the tables are spread across multiple SQL Server instances, the view is
called a distributed partitioned view

Inline functions can be used to simulate parameterized views. T-SQL views cannot take
parameters. However, an inline table-valued function can return the same data as a
view and can accept parameters that can filter the results.
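A sketch of such a "parameterized view" (the function name is made up):

```sql
CREATE FUNCTION Sales.GetCustOrders (@custid AS INT) RETURNS TABLE
AS
RETURN
  SELECT orderid, orderdate
  FROM Sales.Orders
  WHERE custid = @custid;  -- the parameter filters in a way a view cannot
GO

SELECT orderid, orderdate
FROM Sales.GetCustOrders(1) AS O;
```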


A view can contain an ORDER BY clause only if TOP or OFFSET-FETCH is also used,
but even then no actual sorting of the outer query's results is guaranteed.


A synonym is a name that refers to another database object such as a table, view, function,
or stored procedure.

Does a synonym store T-SQL or any data? No, a synonym is just a name. All that is stored with a synonym is the object it
refers to.
Can synonyms be altered? No, to change a synonym, you must drop and recreate it.

CREATE SYNONYM Reports.TotalCustQtyByMonth FOR Sales.CustOrders;

CREATE SYNONYM Reports.TotalCustQtyByMonth FOR TSQL2012.Sales.CustOrders;
GO
SELECT custid, ordermonth, qty FROM Reports.TotalCustQtyByMonth; -- Succeeds
GO

Synonyms can be used to provide an abstraction layer to the user by presenting different
names for database objects.

Synonyms can refer to stored procedures.
Synonyms can refer to temporary tables.

Synonyms cannot refer to indexes; indexes are not database objects
that are scoped by schema names.
Database users are not database objects that are scoped by schema
names.

You can create a synonym that refers to a nonexistent object. In order to
use the synonym, however, you must ensure that the object exists.

08-NOV-2014

INSERT VALUES, INSERT SELECT, INSERT EXEC, SELECT INTO
Use the INSERT EXEC statement to insert the result of queries in a dynamic
batch or a stored procedure into the target table.

If the column has an IDENTITY property, you must normally skip it in
the INSERT statement and let the property assign the value. To provide your own
value, you need to turn on the IDENTITY_INSERT option, but that’s not what happens
normally.

SELECT INTO doesn’t copy indexes
SELECT INTO doesn’t copy constraints.
SELECT INTO doesn’t copy triggers.


SELECT INTO locks both data and metadata, and therefore can cause
blocking related to both.
If the CREATE TABLE and INSERT SELECT are executed in
different transactions, you hold locks on metadata only for a very short period.

An UPDATE with a variable can both modify a column value and collect
the result into a variable using one visit to the row

The join can be used to filter the updated rows.

The join gives you access to information in other tables that can be used
in the source expressions for the assignments

You cannot update multiple tables in one UPDATE statement.

With an UPDATE based on table expressions, you can invoke a window
function in the inner query’s SELECT list. You can then refer to the alias you assigned
to the result column in the outer UPDATE statement’s SET clause

An UPDATE based on a join cannot refer to window functions in the SET
Clause

Difference between DELETE and TRUNCATE

The DELETE statement writes significantly more to the transaction log compared to the
TRUNCATE statement. For DELETE, SQL Server records in the log the actual data that
was deleted. For TRUNCATE, SQL Server records information only about which pages
were deallocated. As a result, the TRUNCATE statement tends to be substantially faster.


The DELETE statement doesn’t attempt to reset an identity property

The DELETE statement is supported if there’s a foreign key pointing to the table in
question as long as there are no related rows in the referencing table. TRUNCATE is not
allowed if a foreign key is pointing to the table—even if there are no related rows in
the referencing table, and even if the foreign key is disabled.

The DELETE statement is allowed against a table involved in an indexed view. A TRUNCATE
statement is disallowed in such a case.

The DELETE statement requires DELETE permissions on the target table. The TRUNCATE
statement requires ALTER permissions on the target table


The TRUNCATE statement resets an IDENTITY property.
The TRUNCATE statement is disallowed when a foreign key pointing to
the table exists.
The TRUNCATE statement is disallowed when an indexed view based on
the table exists.


Using the IDENTITY Column Property
IDENTITY is a property of a column in a table. The property automatically assigns a value to
the column upon insertion. You can define it for columns with any numeric type that has a
scale of 0. This means all integer types, but also NUMERIC/DECIMAL with a scale of 0. When
defining the property, you can optionally specify a seed and an increment.


SQL Server provides two features to help you generate a sequence of keys: the IDENTITY
column property and the sequence object

To get the newly generated identity value, you can query these functions:
SCOPE_IDENTITY() returns the last identity value generated by your session and scope.
@@IDENTITY returns the last identity value generated by your session, in any scope.
IDENT_CURRENT('table') returns the last identity value generated in the input table, by any session.

Sequence object available from SQL Server 2012.

advantages of using a sequence object instead of IDENTITY

1.     One of the advantages of using a sequence object instead of IDENTITY is
that you can attach a DEFAULT constraint that has a call to the NEXT VALUE FOR
function to an existing column, or remove such a constraint from a column

2.     You can generate a new sequence value before using it by assigning the
value to a variable and later using the variable in an INSERT statement. This cannot
be done with IDENTITY.

3.     You can specify your own value for a column that has an IDENTITY
property, but this requires turning on the session option IDENTITY_INSERT, which
in turn requires special permissions. The sequence object is more flexible. You can
insert your own values into a column that normally gets its value from a sequence
object. And that’s without needing to turn on any special options and without
needing special permissions.
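A sketch of points 1 and 2 (the sequence name is made up):

```sql
CREATE SEQUENCE dbo.SeqOrderIDs AS INT
  START WITH 1
  INCREMENT BY 1;

-- Generate the value before using it (impossible with IDENTITY)
DECLARE @neworderid AS INT = NEXT VALUE FOR dbo.SeqOrderIDs;
SELECT @neworderid AS neworderid;
```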

What is the difference between a primary key and a unique key?

Ans: A primary key does not allow NULLs, but a unique key allows one NULL.
A primary key creates a clustered index on the column by default, whereas a unique key creates a nonclustered index by default.



Three Statements
1.     Sequence Object and Identity Columns
2.     MERGE
3.     OUTPUT Statement for Inserted and Deleted

With the MERGE statement, you can merge data from a source table or table expression into a target table.

Simple Merge Statement with Single Table
MERGE INTO Sales.MyOrders WITH (HOLDLOCK) AS TGT
USING (VALUES(@orderid, @custid, @empid, @orderdate))
       AS SRC(orderid, custid, empid, orderdate)
  ON SRC.orderid = TGT.orderid
WHEN MATCHED THEN
  UPDATE SET TGT.custid = SRC.custid,
             TGT.empid = SRC.empid,
             TGT.orderdate = SRC.orderdate
WHEN NOT MATCHED THEN
  INSERT VALUES(SRC.orderid, SRC.custid, SRC.empid, SRC.orderdate);

Simple Merge Statement with Two Tables one is Source and One is Target
MERGE INTO Sales.MyOrders AS TargetTable
USING Sales.Orders AS SourceTable
  ON SourceTable.orderid = TargetTable.orderid
     AND SourceTable.shipcountry = N'Norway'
WHEN MATCHED AND (TargetTable.custid <> SourceTable.custid
                  OR TargetTable.empid <> SourceTable.empid
                  OR TargetTable.orderdate <> SourceTable.orderdate) THEN
  UPDATE SET TargetTable.custid = SourceTable.custid,
             TargetTable.empid = SourceTable.empid,
             TargetTable.orderdate = SourceTable.orderdate
WHEN NOT MATCHED THEN
  INSERT VALUES(SourceTable.orderid, SourceTable.custid, SourceTable.empid, SourceTable.orderdate);

MERGE is very useful, especially when it comes to loading data warehouse tables, which can be very large and require specific actions to be taken when rows are or are not present.

CREATE TABLE dbo.tbl_Source (id INT, name NVARCHAR(100), qty INT);
CREATE TABLE dbo.tbl_Target (id INT, name NVARCHAR(100), qty INT);
 
--Synchronize source data with target
MERGE INTO dbo.tbl_Target AS t
    USING dbo.tbl_Source AS s    
        ON t.id = s.id
    WHEN MATCHED AND (t.name != s.name OR t.qty!= s.qty) THEN
        --Row exists and data is different
        UPDATE SET t.name = s.name, t.qty = s.qty
    WHEN NOT MATCHED THEN 
        --Row exists in source but not in target
        INSERT VALUES (s.id, s.name, s.qty) 
    WHEN NOT MATCHED BY SOURCE THEN 
        --Row exists in target but not in source
        DELETE
    OUTPUT $action, inserted.id, deleted.id;

The OUTPUT clause can be used with INSERT, UPDATE, DELETE, or MERGE to return the actual rows affected by these statements.
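A short sketch of OUTPUT outside of MERGE, reusing the tbl_Target table from the example above (the filter values are illustrative):

```sql
-- DELETE: the deleted virtual table exposes the removed rows.
DELETE FROM dbo.tbl_Target
OUTPUT deleted.id, deleted.name, deleted.qty
WHERE qty = 0;

-- INSERT: the inserted virtual table exposes the new rows.
INSERT INTO dbo.tbl_Target(id, name, qty)
OUTPUT inserted.id
VALUES (1, N'Widget', 10);
```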

Using EXEC() instead of sp_executesql
When using dynamic SQL, you have two choices: EXEC() / EXECUTE(), or sp_executesql. Early in my career, I used EXEC() a lot because it was much easier. I've since learned that sp_executesql is a lot better, for these main reasons:
1.         sp_executesql allows statements to be parameterized, so you keep passing strongly typed variables through the whole chain.
o    Therefore it is more secure than EXEC() in terms of SQL injection.
2.         sp_executesql can leverage cached query plans.
o    The T-SQL string is built only once; after that, every time the same query is called with sp_executesql, SQL Server retrieves the query plan from the cache and reuses it.
3.         Temp tables created inside EXEC() cannot use the temp-table caching mechanism.
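A hedged sketch of the parameterized form (the query and variable names are illustrative): the parameter stays strongly typed end to end, with no string concatenation of user input.

```sql
DECLARE @custid INT = 7;
DECLARE @sql NVARCHAR(200) =
    N'SELECT orderid, orderdate FROM Sales.Orders WHERE custid = @custid;';

-- The same string is reused for every @custid value,
-- so the cached plan can be reused as well.
EXEC sys.sp_executesql
    @stmt   = @sql,
    @params = N'@custid INT',
    @custid = @custid;
```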
What is an Index
An index is a database object that can be created on one or more columns (a maximum of 16 columns per key). When the index is created, SQL Server reads the column(s) and forms a data structure that minimizes the number of data comparisons. An index improves the performance of data retrieval but adds overhead to data modifications such as inserts, deletes, and updates. So whether an index pays off depends on how much data retrieval is performed against the table versus how many DML (INSERT, DELETE, and UPDATE) operations run against it.
SQL Server internally organizes data in a data file in pages.
A page is an 8 KB unit and belongs to a single object; for example, to a table or an index.
A page is the smallest unit of reading and writing. Pages are further organized into extents.
An extent consists of eight consecutive pages.

Pages are physical structures.
SQL Server organizes data in pages in logical structures.
SQL Server organizes tables as heaps or as balanced trees.
A table organized as a balanced tree is also known as a clustered table or a clustered index
Indexes are always organized as balanced trees.
Other indexes, which do not contain all of the data and instead serve as pointers to table rows for quick seeks, are called nonclustered indexes.

Heaps
A heap is quite a simple structure: data in a heap is not organized in any logical order.
A heap is just a bunch of pages and extents.

Index Allocation Map (IAM) pages
SQL Server traces which pages and extents belong to an object through special system
pages called Index Allocation Map (IAM) pages. Every table or index has at least one IAM
page, called first IAM. A single IAM page can point to approximately 4 GB of space. Large
objects can have more than one IAM page. IAM pages for an object are organized as a doubly
linked list; each page has a pointer to the previous and the next page.
SQL Server stores pointers to the first IAM pages in its own internal system tables. SQL Server uses IAM pages to scan heaps in physical (allocation) order.

If you do not create a clustered index explicitly or implicitly through primary key or unique
constraints, then a table is organized as a heap. SQL Server does not allocate any pages for
a table when you create it. It allocates the first page, and also the first IAM page, when you
insert the first row in the table. You can find general information about tables and indexes in
the sys.indexes catalog view.
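To see how a table is organized, you can query sys.indexes as mentioned above; a quick sketch (the table name is illustrative):

```sql
-- index_id 0 = heap, 1 = clustered index, > 1 = nonclustered index.
SELECT OBJECT_NAME(object_id) AS table_name,
       name                   AS index_name,
       type_desc
FROM sys.indexes
WHERE OBJECT_NAME(object_id) = N'MyOrders';
```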
Clustered Index
You organize a table as a balanced tree when you create a clustered index.
The structure resembles an inverted tree; it is called balanced because all leaf pages are at the same depth.
Every balanced tree has a single root page and at least one or more leaf pages.
In addition, it can have zero or more intermediate levels.
All data in a clustered table is stored in leaf pages.
Data is stored in logical order of the clustering key.
A clustering key can consist of a single column, or of multiple columns.
If the key consists of multiple columns, then this is a composite key.
SQL Server still uses IAM pages to follow the physical allocation.

The primary key created on the StudId column creates a clustered index on that column. A table can have only one clustered index. When creating the clustered index, SQL Server reads the StudId column and builds a tree on it (illustrated here as a simple binary search tree). This tree information is then stored separately on disk. Expand the table Student and then expand Indexes; you will see the index that was created for you when the primary key was defined. With the tree in place, a search for a student by StudId needs far fewer comparisons. Let us assume that you had entered some sample data in the Student table.
The index forms the binary tree specified below. Note that a given parent has at most two children. The left side always has a lesser value and the right side always has a greater value compared to the parent. (The tree can also be constructed the reverse way: left side higher, right side lower.)
Non Clustered Index
Nonclustered indexes are balanced tree structures that start from a root index node and include intermediate index nodes and leaf nodes.

Nonclustered indexes have a structure very similar to clustered ones: the root and the intermediate levels look the same as in a clustered index.
The leaf level is different because it does not hold all of the data; it contains the index keys and row locators that point into the base table, which is itself organized either as a heap or as a balanced tree.
You can have up to 999 nonclustered indexes on a single table.
You can have up to 16 columns in an index key.

A non-clustered index is useful for columns that have some repeated values. Say for example, AccountType column of a bank database may have 10 million rows. But, the distinct values of account type may be 10-15. A clustered index is automatically created when we create the primary key for the table. We need to take care of the creation of the non-clustered index.
Follow the steps below to create a Non-clustered index on our table Student based on the column class.
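If you prefer T-SQL over the SSMS steps, an equivalent sketch (the table and column names follow the Student example; the dbo schema is assumed):

```sql
-- Create a nonclustered index on the class column of Student.
CREATE NONCLUSTERED INDEX IX_Student_Class
    ON dbo.Student (class);
```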

A table can have more than one nonclustered index, but only one clustered index, which works on the balanced-tree concept. On a clustered table, every nonclustered index depends on the clustered index: its row locators are the clustering keys.

This can be easily explained with the concept of a book and its index pages at the end. Assume that you go to a bookshop and find a big 1,500-page C# book that claims to cover all of C#. At a glance, the book has beautiful color pages and shiny paper. But that is not the only mark of a good book, right? Once you are impressed, you want to see how your favorite topic, regular expressions, is explained in the book. What do you do? I peeped at you from behind and recorded what you did, as below:

You went to the index pages (25 pages in total). They are already sorted, so you easily found Regular Expressions, listed on index page 17.
Next, you noted down the numbers displayed next to it: 407, 816, 1200-1220.
The index pages act like a nonclustered index, and the page numbers act like a clustered index arranged in a tree. See how quickly you reached page 407: your mind effectively traversed the tree left and right to get there.


RID lookup

In order to seek a row, SQL Server needs to traverse the index to the leaf level, and
then read the appropriate page from the heap and retrieve the row from the page.
The operation of retrieving the row from the heap is called a RID lookup.

Because there is an index on the column in the WHERE clause, the SQL Server query execution engine uses the nonclustered index to retrieve data from the table.

However, the columns used in the SELECT clause are still not part of the index, and to display those columns, the engine will have to go to the base table again and retrieve those columns.
This particular behavior is known as bookmark lookup or key lookup.

The clustering key should be short and unique because it appears in all nonclustered indexes.
SQL Server 2012 can store index data column by column in what's called a columnstore index.
Columnstore indexes can speed up data warehousing queries by a large factor, from 10 to even 100 times.


The Key Lookup operator is used when SQL Server performs a nonclustered index seek to find a row,
but then also needs data from the underlying table, which is organized as a clustered index.

If a query requires sorting, the data needs to be sorted in memory, or must be spilled to tempdb if it does not fit in memory.


SQL Server processes a set-based solution, even if internally the execution plan for the query involves iterations.
As a result, if you know how to tune queries, you are often able to achieve much better performance compared to using iterative solutions.

It should be noted that there are exceptional cases where iterative solutions perform better than set-based ones—even with all of the extra overhead for the row-by-row operations.

What are the commands that are required to work with a cursor?
DECLARE, OPEN, FETCH in a loop, CLOSE, and DEALLOCATE.
What does the FAST_FORWARD cursor option mean?
That the cursor is read-only and forward-only.
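The five commands above can be sketched as a minimal skeleton (the query and variable names are illustrative):

```sql
DECLARE @orderid INT;

DECLARE C CURSOR FAST_FORWARD FOR    -- read-only, forward-only
    SELECT orderid FROM Sales.Orders;

OPEN C;
FETCH NEXT FROM C INTO @orderid;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @orderid;                  -- per-row work goes here
    FETCH NEXT FROM C INTO @orderid;
END;

CLOSE C;
DEALLOCATE C;
```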

Covering index
A covering index is a non-clustered index which includes all columns referenced in the query.
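A sketch of a covering index built with INCLUDE (the key and included columns are illustrative); the query below can then be answered from the index alone, with no key lookup:

```sql
CREATE NONCLUSTERED INDEX IX_Orders_custid_covering
    ON Sales.Orders (custid)
    INCLUDE (orderdate, empid);

-- All referenced columns are in the index, so it covers this query.
SELECT custid, orderdate, empid
FROM Sales.Orders
WHERE custid = 7;
```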



Set Based Solution
Set-based solutions are based on principles from the relational model,
and this model is the foundation of SQL (the standard language) and T-SQL (the
dialect in SQL Server)
Because set-based solutions are declarative and iterative solutions are
imperative, set-based solutions tend to involve less code


Alternative to a Cursor
The approach with the TOP option does give you a correct alternative to
a cursor. However, keep in mind that it is more I/O-intensive.
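One way to sketch the TOP-based alternative (the table and key names are illustrative): repeatedly fetch the next key greater than the last one processed.

```sql
DECLARE @orderid INT = 0;

WHILE 1 = 1
BEGIN
    -- Fetch the next row in key order.
    SELECT TOP (1) @orderid = orderid
    FROM Sales.Orders
    WHERE orderid > @orderid
    ORDER BY orderid;

    IF @@ROWCOUNT = 0 BREAK;  -- no more rows

    PRINT @orderid;           -- per-row work goes here
END;
```

Each iteration issues its own seek, which is why this approach tends to be more I/O-intensive than a cursor.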

Table Variable.
Table variables are declared, as opposed to being created. They are named with the at sign
(@) as a prefix; for example, @T1. They are visible only to the batch that declared them and
are destroyed automatically at the end of the batch. They are not visible across batches in the
same level, and are also not visible to inner levels in the call stack.


Temporary tables are created in tempdb in the dbo schema.

Avoid named constraints on temporary tables; for example, CONSTRAINT PK_#T1 PRIMARY KEY(col1) can cause a name collision in tempdb when two sessions create the same temporary table.
Instead, define the constraints without names: PRIMARY KEY(col1), UNIQUE(col2), and so on.
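A sketch of a temporary table with unnamed constraints (the column names are illustrative), which avoids constraint-name collisions in tempdb across sessions:

```sql
CREATE TABLE #T1
(
    col1 INT NOT NULL PRIMARY KEY,  -- unnamed primary key constraint
    col2 INT NOT NULL UNIQUE        -- unnamed unique constraint
);
```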

A common misconception is that only temporary tables have a physical representation
in tempdb and that table variables reside only in memory. This isn't true: both temporary
tables and table variables have a physical representation in tempdb.

Interaction with transactions

Like a permanent table, changes applied to a temporary table are undone if the transaction rolls back; changes applied to a table variable are not undone by a rollback.

Difference between Temporary Table and Table Variable
SQL Server maintains distribution statistics (histograms) for temporary
tables but not for table variables.
Temporary tables and table variables differ in a number of ways, including scope, DDL
and indexing, interaction with transactions, and distribution statistics.


The following code demonstrates that table variables are not visible even across batches in
the same level.
DECLARE @T1 AS TABLE
(
col1 INT NOT NULL
);
INSERT INTO @T1(col1) VALUES(10);
GO
SELECT col1 FROM @T1;
GO

The SELECT fails with an error: GO starts a new batch, and the table variable declared in the previous batch is out of scope.


Difference between SET based and Iterative processing

Iterative refers to a processing methodology which focuses on one record at a time whereas set based refers to working with the entire record set at a time.

Examples of Iterative processing are CURSORS and WHILE loops.
Whereas examples of SET based are straightforward JOINs.
Iterative processing is quite common outside of a relational database system; think arrays and loops.

Nonrelational software systems typically have no set-based functionality, hence the widespread adoption of relational systems such as SQL Server, Oracle, and DB2. And I think that this is where the problem comes in.



26-Nov-14 ---------------------------------------------------------------------------


If a table is organized as a heap, then the only access method available to SQL Server is a table scan.
The scan is performed in no specific logical order; SQL Server uses Index Allocation
Map (IAM) pages to do the scan in physical allocation order.
SQL Server can use the allocation order scan when a table is clustered as well.
An allocation order scan is faster if a table is less physically fragmented; the scan is slower if the physical fragmentation is higher.
Allocation order scans are not affected by the logical fragmentation.
When SQL Server scans a clustered index, it can also scan in the logical order of the index by using the index order scan.
SQL Server uses the index leaf–level’s linked list to perform an index
order scan.


Join Algorithms
When performing joins, SQL Server uses different algorithms.
SQL Server supports three basic algorithms: nested loops, merge joins, and hash joins.
A hash join can be further optimized by using bitmap filtering;
a bitmap filtered hash join can be treated either as a fourth algorithm or as an enhancement of the third, the hash algorithm.


The nested loops algorithm is a very simple and, in many cases, efficient algorithm. SQL
Server uses one table for the outer loop. For each row in this outer input, SQL Server seeks for matching rows in the second table
example
SELECT o.OrderDate
FROM  Sales.SalesOrderHeader AS O
INNER JOIN Sales.SalesOrderDetail AS OD
ON O.SalesOrderID = OD.SalesOrderID
WHERE OD.SalesOrderID=48980
Only a nested loops join algorithm supports non-equijoins


Merge join is a very efficient join algorithm. However, it has its own limitations. It needs at
least one equijoin predicate and sorted inputs from both sides.

SELECT O.custid, O.orderdate, OD.orderid, OD.productid, OD.qty
FROM Sales.Orders AS O
INNER JOIN Sales.OrderDetails AS OD
ON O.orderid = OD.orderid;


The hash join algorithm uses a searching structure named a hash table.
This is not a searching structure you can build yourself, like the balanced trees used for indexes;
SQL Server builds the hash table internally.
It uses a hash function to split the rows from the smaller input into buckets.
After the hash table is built, SQL Server applies the hash function on each of the rows from the other input.
It checks to see into which bucket the row fits. Then it scans through all rows from the bucket.
This phase is called the probe phase.

In single-threaded mode, a hash join is usually slower than merge and nested loops joins that are supported by
existing indexes.

SQL Server can split rows from the probe input in advance, and perform partial joins in multiple threads.
The hash join is actually very scalable. This kind of optimization of a hash join is called a bitmap filtered hash join.

SQL Server uses two different algorithms for calculating aggregations.
If an input is ordered by the columns used in the GROUP BY clause, then SQL Server uses the stream aggregation
algorithm, which is implemented in the Stream Aggregate operator.
Stream aggregation is very efficient.
SQL Server might even decide to sort the input before performing the aggregation in order to make it possible to use the Stream Aggregate operator.

Stream aggregation might be faster than the hash aggregation.


Index Scan:
Since a scan touches every row in the table whether or not it qualifies, the cost is proportional to the total number of rows in the table. Thus, a scan is an efficient strategy if the table is small or if most of the rows qualify for the predicate.

Index Seek:
Since a seek only touches rows that qualify and pages that contain these qualifying rows, the cost is proportional to the number of qualifying rows and pages rather than to the total number of rows in the table.

In general, the query optimizer tries to use an index seek, which means the optimizer has found a useful index for retrieving the record set.







