Chapter 1
Dated : 18-Oct-2014
T-SQL is based on strong mathematical foundation.
Set theory + Predicate logic => Relational Model => SQL => T-SQL.
SQL Standard =>> International Organization for Standards (ISO) and the American National Standards Institute (ANSI).
T-SQL is based on strong mathematical foundation.
Set theory + Predicate logic => Relational Model => SQL => T-SQL.
SQL Standard =>> International Organization for Standards (ISO) and the American National Standards Institute (ANSI).
Standard SQL
is based on the relational model, which is a mathematical model for data management and manipulation
A relation
in the relational model is what SQL calls a table. a relation has a heading and a body heading of a relation is a set of
attributes, and the body a set of tuples mathematical
branches that the relational model is based on
·
Set theory
·
Predicate logic
By a “set” we mean any collection M into a whole of definite,
distinct objects m (which are called the “elements” of M) of our perception or
of our thought by
The other
branch of mathematics that the relational model is based on is called predicate
logic.
A predicate
is an expression that when attributed to some
object, makes a proposition either true or false. For example, “salary greater
than $50,000” is a predicate
The
relational model uses predicates as one of its core elements. You can enforce
data
integrity by
using predicates. You can filter data by using predicates
Name
two aspects in which T-SQL deviates from the relational model
A
relation has a body with a distinct set of tuples.
A
table doesn’t have to have a key.
T-SQL
allows referring to ordinal positions of columns in the ORDER BY
clause.
Why
are the terms “field” and “record” incorrect when referring to column and
row?
Because
“field” and “record” describe physical things, whereas columns and
rows
are logical elements of a table.
Why
is the term “NULL value” incorrect?
Because
NULL isn’t a value; rather, it’s a mark for a missing value
Make
the Non relational Query Relational
·
Define an attribute name by assigning
an alias to the expression YEAR(orderdate).
·
Add a DISTINCT clause to remove
duplicates.
·
Also, remove the ORDER BY clause to
return a relational result.
Lesson-2 -
Logical query processing
T-SQL has
both logical and physical sides to it.
The logical
side is the conceptual interpretation of the query that explains what the
correct result of the query is.
The physical
side is the processing of the query by the database engine.
Physical
processing must produce the result defined by logical query processing.
Logical query processing - the conceptual
interpretation of the query that defines the correct result.
the logical
processing of a query must first know which table is being queried
before it
can know which attributes can be returned from that table
Therefore,
contrary to the keyed-in order of the previous query, the logical query
processing has to be as follows.
FROM
Sales.Shippers
SELECT
shipperid, phone, companyname
Logical
query processing follows following sequence,
1. Evaluate the FROM Clause
2. Evaluate the Where Clause
3. Evaluate the Group by Clause
4. Evaluate the Having Clause
5. Evaluate the Select Clause
6. Evaluate the Order by Clause
What
is the difference between the WHERE and HAVING clauses?
The
WHERE clause is evaluated before rows are grouped, and therefore is evaluated
per
row. The HAVING clause is evaluated after rows are grouped, and therefore
is
evaluated per group.
Dated : 19-Oct-2014
Chapter -2
Delimiting
Identifiers
The rules
say that the first character must be
a letter in
the range A through Z (lower or uppercase), underscore (_), at sign (@), or
number
sign (#).
Subsequent
characters can include letters, decimal numbers, at sign, dollar sign ($),
number sign,
or underscore. The identifier cannot be a reserved keyword in T-SQL, cannot
have
embedded spaces, and must not include supplementary characters.
An
identifier that doesn’t comply with these rules must be delimited.
must be
delimited as "2006" or [2006].
Choosing
a Data Type for Keys
Understanding
the built-in tools T-SQL provides for generating surrogate keys like the
sequence
object, identity column property, and the NEWID and NEWSEQUENTIALID
functions,
and their impact on performance, is an important skill for the exam.
20-Oct-14
Float
is Approximate-number data type, which means that not all values in the data
type range can be represented exactly.
Decimal/Numeric
is Fixed-Precision data type, which means that all the values in the data type
reane can be represented exactly with precision and scale.
Converting
from Decimal or Numeric to float can cause some loss of precision
Use
char when the sizes of the column data entries are consistent.
Use
varchar when the sizes of the column data entries vary considerably.
Use
varchar(max) when the sizes of the column data entries vary considerably, and
the size might exceed 8,000 bytes.
what
is difference between below conversion function in sql server 2012?
1).Cast
2).Convert
3).Parse
4).Try_Cast
5).Try_Convert
6).Try_Parse.
1).Cast
2).Convert
3).Parse
4).Try_Cast
5).Try_Convert
6).Try_Parse.
TRY_PARSE ( string_value AS
data_type [ USING culture ] )
This function works
similar to the well known CAST, but will return NULL if the string_value does
not represent a valid formatted value of the given data type.
PARSE ( string_value AS
data_type [ USING culture ] )
This works pretty much
the same as TRY_PARSE, but this function will throw an error if the
string_value passed to the function is not a valid representation of the data
type. Let me show you the similar parsings as above:
1.
SELECT PARSE('13-12-2011' AS datetime USING 'da-DK') AS Result
2.
SELECT PARSE('13-12-2011' AS datetime USING 'en-US') AS Result
TRY_CONVERT ( data_type [ ( length ) ], expression [, style ] )
This is
pretty much the same as the well known CONVERT, but again with the NULL output
for invalid input strings.
But what if there were
million of rows in the table, and multiple non-integer values? It would not be
pleasant to try to identify them all. But with TRY_PARSE we can now do this to
identify the rows:
1.
SELECT * FROM MyBadTableDesign
2.
WHERE TRY_PARSE(val AS INT) IS NULL
3.
SELECT CAST('abc' AS INT);
4. Conversely, the following code returns a NULL.
5.
SELECT TRY_CAST('abc' AS INT);
24-Oct-14
String
Length
LEN and
DATALENGTH
Len
>> return number charater
DataLength>>
return number of bytes
Would
you use the type FLOAT to represent a product unit price
No,
because FLOAT is an approximate data type and cannot represent all values
Precisely.
What
is the difference between NEWID and NEWSEQUENTIALID
The
NEWID function generates GUID values in random order, whereas the
NEWSEQUENTIAL
ID function generates GUIDs that increase in a sequential
order.
When concatenating character strings,
what is the difference between the plus
(+) operator and the CONCAT function?
The
+ operator by default yields a NULL result on NULL input, whereas the
CONCAT
function treats NULLs as empty strings.
differences
between COALESCE and ISNULL
COALESCE, it
returns the first input that is not NULL.
ISNULL is actually
more limited than COALESCE
DECLARE
@x AS
VARCHAR(3) = NULL,
@y AS
VARCHAR(10) = 'unknows';
SELECT COALESCE(@x, @y) AS [COALESCE], ISNULL(@x, @y) AS [ISNULL];
OUTPUT will be
COALESCE ISNULL
---------- ---------
unknows unk
NULLIF
function. This function accepts two input expressions,
returns NULL
if they are equal, and returns the first input if they are not. For example,
consider the
expression NULLIF(col1, col2). If col1 is equal to col2, the function returns a
NULL;
otherwise, it returns the col1 value.
IIF and
CHOOSE, these are nonstandard T-SQL functions that were added to simplify
migrations
from Microsoft Access platforms.
IIF(<predicate>,
<true_result>, <false_or_unknown_result>)
This
expression is equivalent to the following.
CASE
WHEN <predicate> THEN <true_result> ELSE <false_or_unknown_result>
END
DATETIME
data type uses 8 bytes of storage. SQL Server 2012 supports the DATE
data type,
which uses 3 bytes of storage
new
functions
EOMONTH ,
FORMAT , DATEFROMPARTS
SWITCHOFFSET
and
TODATETIMEOFFSET
25-Oct-14
TRY_CAST
returns a NULL instead of failing
The right
form for a Unicode character string literal is to prefix the literal with a
capital N and delimit the
literal with
single quotation marks; for example, N'literal'
WHERE
lastname = N'Davis'; best practice to use the proper form
pattern
matching
'_D%':
string where second character is D,
'[AC]%':
string where first character is A or C,
'[0-9]%':
string where first character is a digit
'[^0-9]%':
string where first character is not a digit
term
three-valued logic refer to in T-SQL as true, false, and unknown
literals are
language-dependent and language-neutral for the DATETIME data type
SQL
predicates are search arguments
companyname
LIKE 'A%'
orderdate
> = '20120212' AND orderdate < '20120213'
Is an
expression that evaluates to TRUE, FALSE, or UNKNOWN. Predicates are used in
the search condition of WHERE clauses and HAVING clauses, the join conditions of FROMclauses,
and other constructs where a Boolean value is required
guarantee
the order of the rows in the result of a query is used of order by Clause.
You want
result rows to be sorted by orderdate descending, and then by orderid,
descending.
ORDER BY
orderdate DESC, orderid DESC.
WITH
TIES grabs all other records that matches the value of “order by”
column.
SELECT TOP(1) WITH
TIES * FROM #MyTable ORDER BY age desc
All record
have age 20 shows
Filtering
Data with OFFSET-FETCH
query
defines ordering based on order date descending, followed by order ID
descending;
it then skips 50 rows and fetches the next 25 rows.
SELECT
orderid, orderdate, custid, empid
FROM
Sales.Orders
ORDER
BY orderdate DESC, orderid DESC
OFFSET
50 ROWS FETCH NEXT 25 ROWS ONLY
26-oct-2014
What’s the
difference between the ON and the WHERE clauses
If the OUTER
keyword is omitted from LEFT OUTER JOIN, RIGHT OUTER
JOIN, and
FULL OUTER JOIN, the meaning is retained.
27-OCT-2014
With
subqueries, you can nest queries within queries. You can use self-contained
subqueries
as well as correlated ones. You can use subqueries that return single-valued
results,
multi-valued results, and table-valued results.
T-SQL
supports four kinds of table expressions, which are named query expressions.
Derived
tables and CTEs are types of table expressions that are visible only in the
scope
of the
statement that defined them. Views and inline table-valued functions are reusable
table
expressions whose definitions are stored as objects in the database. Views
do not
support input parameters, whereas inline table-valued functions do
What are the
benefits of using a CTE over derived tables?
·
If you want to refer to one derived table
from another, you need to nest
them. With CTEs, you separate those by
commas, so the code is more modular and
easier to follow.
·
Because the CTE name is defined before
the outer query that uses it, the
outer query is allowed to refer to multiple instances of the
same CTE name.
The APPLY
operator has two
versions;
the CROSS APPLY version doesn’t return left rows that get an empty set back
from the
right side. The OUTER APPLY operator preserves the left side, and therefore,
does return
left rows when the right side returns an empty set.
What is the
difference between the result of T1 CROSS APPLY T2 and T1 CROSS JOIN
T2 (the
right table expression isn’t correlated to the left)?
·
Both return the same result when there’s no
correlation because CROSS
APPLY applies all rows from T2 to each row from T1
1-Nov-2014
What
are the clauses that you can use to define multiple grouping sets in the
same
query?
GROUPING
SETS, CUBE, and ROLLUP
Pivoting
Data
Pivoting is
a technique that groups and aggregates data, transitioning it from a state of
rows
to a state
of columns.
The PIVOT
operator determines the grouping element by elimination—
it’s what’s
left besides the spreading and aggregation elements.
You
cannot specify a computation as input to the aggregate function,rather just a
name of a column from the input table.
You
cannot specify a computation as the spreading element, rather just a name of a
column from the input table.
You
cannot specify a subquery in the IN clause, rather just a static list.
You
cannot specify multiple aggregate functions, rather just one.
WINDOW
FUNCTIONS
Like group
functions, window functions also enable you to perform data analysis
computations
window
functions, you define the set of rows per function—and then return one result
value per each underlying row and function. You define the set of rows for the
function to work with using a clause called OVER.
window
aggregate functions(SUM,AVG,COUNT) are applied to a window of rows defined by
the OVER clause.
For example,
the expression SUM(val) OVER() represents the grand total of all rows in the
underlying query.
For example,
the expression SUM(val) OVER(PARTITION BY custid) represents the current
customer’s total.
SELECT
custid, orderid,
val,
SUM(val)
OVER(PARTITION BY custid) AS custtotal,
SUM(val)
OVER() AS grandtotal
FROM
Sales.OrderValues
Indicate
the delimiters as one of three options:
UNBOUNDED
PRECEDING or FOLLOWING, meaning the beginning or end of the partition,
respectively
CURRENT
ROW, obviously representing the current row
<n> ROWS
PRECEDING or FOLLOWING, meaning n rows before or after the current,
respectively
WITH
RunningTotals AS
(
SELECT
custid, orderid, orderdate, val,
SUM(val)
OVER(PARTITION BY custid
ORDER
BY orderdate, orderid
ROWS
BETWEEN UNBOUNDED PRECEDING
AND
CURRENT ROW) AS runningtotal
FROM
Sales.OrderValues
)
SELECT
*
FROM
RunningTotals
WHERE
runningtotal < 1000.00;
ROW_NUMBER,
RANK, DENSE_RANK, and NTILE
The
ROW_NUMBER function computes a unique sequential integer starting with 1 within
the window
partition based on the window ordering.
RANK and
DENSE_RANK differ from ROW_NUMBER in the sense that they assign the same
ranking
value to all rows that share the same ordering value.
What
are the clauses that the different types of window functions support.
Partitioning,
ordering, and framing clauses.
What
do the delimiters UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING
represent?
The
beginning and end of the partition, respectively.
Window
Offset Functions, Window offset functions return an element from a
single row that is in a given offset from the current row in the window
partition, or from the first or last row in the window frame
LAG, LEAD,
FIRST_VALUE, and LAST_VALUE
SELECT
custid, orderid, orderdate, val,
FIRST_VALUE(val)
OVER(PARTITION BY custid
ORDER
BY orderdate, orderid
ROWS
BETWEEN UNBOUNDED PRECEDING
AND
CURRENT ROW) AS first_val,
LAST_VALUE(val)
OVER(PARTITION BY custid
ORDER
BY orderdate, orderid
ROWS
BETWEEN CURRENT ROW
AND
UNBOUNDED FOLLOWING) AS last_val
FROM
Sales.OrderValues;
What default frame window functions
RANGE UNBOUNDED PRECEDING
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
What
do the RANK and DENSE_RANK functions compute.
The RANK function returns one more than the number
of rows that have a lower
ordering value than the current; the DENSE_RANK
function returns one more than
the number of distinct ordering values that are
lower than the current.
So, we find that Rank() skips the ranking number
when it gets same OrdCount but Dense_Rank() maintains ranking order.
Why
are window functions allowed only in the SELECT and ORDER BY clauses of a
Query
?
Because they are supposed to operate on the
underlying query’s result, which is
achieved when logical query processing gets to the
SELECT phase.
03-NOV-2014
Full-text
queries can search not only for words you provide in a query; they can search
for synonyms
as well. SQL Server finds synonyms in thesaurus
files. Each language has an
associated
XML thesaurus file.
For create
synonyms for the words searched, You can edit the thesaurus file
SQL Server
supports two very powerful predicates for limiting the result set of a query by
using
full-text
indexes. These two predicates are the CONTAINS and FREETEXT predicates.
04-NOV-2014
FOR XML is
used only in SELECT if it is not a Sub Query.
In
Sub-query, it can be used for INSERT, UPDATE, DELETE as well as in assignment
statements.
In a FOR XML
clause you can specify these modes.
1. AUTO
2. RAW
3. EXPLICIT
4. PATH
·
The AUTO mode with FOR XML generates nesting in the
resulting XML on the way the SELECT statement is specified
·
The RAW mode with FOR XML generates a single tuple
<row> element per row in the table that is returned by the SELECT
statement.
·
The EXPLICIT mode gives more control on the shape
of the XML that we need to generate. We can get attributes and elements
together in deciding the shape of the XML. It requires a specific format for
the resulting rowset that is generated because of query execution. This rowset
format is then mapped into XML shape.
·
The PATH mode together with the nested FOR XML
query capability provides the flexibility of the EXPLICIT mode in a simpler
manner.
Examples:
Player Table
Id
|
FirstName
|
LastName
|
CITY
|
1
|
Sachin
|
Tendulkar
|
Mumbai
|
2
|
Saurav
|
Ganguli
|
Kolkata
|
3
|
Mahendrasinh
|
Dhoni
|
Chennai
|
SELECT * FROM Players FOR XML AUTO
<Players Id="1" FirstName="Sachin" LastName="Tendulkar" CITY="Mumbai" />
<Players Id="2" FirstName="Saurav" LastName="Ganguli" CITY="Kolkata" />
<Players Id="3" FirstName="Mahendrasinh" LastName="Dhoni" CITY="Chennai" />
SELECT * FROM Players FOR XML RAW(‘Cricketers‘)
<Cricketers Id="1" FirstName="Sachin" LastName="Tendulkar" CITY="Mumbai" />
<Cricketers Id="2" FirstName="Saurav" LastName="Ganguli" CITY="Kolkata" />
<Cricketers Id="3" FirstName="Mahendrasinh" LastName="Dhoni" CITY="Chennai" />
SELECT * FROM Players FOR XML PATH(‘Player‘)
<Player>
<Id>1</Id>
<FirstName>Sachin</FirstName>
<LastName>Tendulkar</LastName>
<CITY>Mumbai</CITY>
</Player>
<Player>
<Id>2</Id>
<FirstName>Saurav</FirstName>
<LastName>Ganguli</LastName>
<CITY>Kolkata</CITY>
</Player>
<Player>
<Id>3</Id>
<FirstName>Mahendrasinh</FirstName>
<LastName>Dhoni</LastName>
<CITY>Chennai</CITY>
</Player>
SELECT
1 AS Tag,
NULL AS Parent,
ID AS ‘Player!1!Id‘,
NULL AS ‘Detail!2!FirstName‘,
NULL AS ‘Detail!2!LastName‘,
NULL AS ‘Detail!2!City‘
FROM Players WHERE Id = 1
UNION ALL
SELECT
2 AS Tag,
1 AS Parent,
NULL,
FirstName,
LastName,
CIty
FROM Players WHERE Id = 1
FOR XML EXPLICIT
<Player Id="1">
<Detail FirstName="Sachin" LastName="Tendulkar" City="Mumbai" />
</Player>
The FOR XML
AUTO option gives you nice XML documents with nested elements, and it is not
complicated
to use. In AUTO and RAW modes, you can use the keyword ELEMENTS to produce
element-centric
XML. The WITH NAMESPACES clause, preceding the SELECT part of the
query,
defines namespaces and aliases in the returned XML.
A colon is
used in XML to separate the namespace from the element name.
WITH
XMLNAMESPACES('TK461-CustomersOrders' AS co)
SELECT
[co:Customer].custid AS [co:custid],[co:Customer].companyname AS
[co:companyname],[co:Order].orderid AS [co:orderid],
[co:Order].orderdate
AS [co:orderdate] FROM Sales.Customers AS [co:Customer]
INNER
JOIN Sales.Orders AS [co:Order] ON [co:Customer].custid = [co:Order].custid
ORDER
BY [co:Customer].custid, [co:Order].orderid
FOR
XML AUTO, ELEMENTS, ROOT('CustomersOrders');
The
FOR XML clause comes after the ORDER BY clause in a query.
In RAW and
AUTO mode, you can also return the XSD
schema of the document you are
creating.
This schema is included inside the XML that is returned, before the actual XML
data;
therefore,
it is called inline schema. You
return XSD with the XMLSCHEMA directive. This directive
accepts a parameter
that defines a target namespace.
SELECT
[Customer].custid AS [custid],[Customer].companyname AS [companyname],
[Order].orderid
AS [orderid],[Order].orderdate AS [orderdate]
FROM
Sales.Customers AS [Customer]
INNER
JOIN Sales.Orders AS [Order]
ON
[Customer].custid = [Order].custid
WHERE
1 = 2
FOR
XML AUTO, ELEMENTS,
XMLSCHEMA('TK461-CustomersOrders');
XMLSCHEMA
directive in the FOR XML clause
Format
Return XML
Automatic
format:
·
FOR XML AUTO automatically formats the
XML retuned.
·
FOR XML RAW automatically formats the
XML retuned.
Manually
Format:
·
FOR XML PATH allows you to manually
format the XML returned.
·
FOR XML EXPLICIT allows you to manually
format the XML returned.
Shredding
XML to Tables (convert xml to table)
The OPENXML
function provides a rowset over in-memory XML documents by using
Document Object Model (DOM)
presentation.
The OPENXML
function accepts an optional third parameter, called flags, which allows you
to specify
the mapping used between the XML data and the relational rowset.
1 means
attribute-centric mapping, OPENXML (@DocHandle,
'/CustomersOrders/Customer',1)
2 means
element-centric, and OPENXML (@DocHandle,
'/CustomersOrders/Customer',2)
3 means both
OPENXML (@DocHandle,
'/CustomersOrders/Customer', 1+2+8)
Flag value 8
can be combined with values 1 and 2 (1+2+8)
SELECT
* FROM OPENXML (@DocHandle, '/CustomersOrders/Customer',11)
WITH
(custid INT, companyname NVARCHAR(40));
--
Remove the DOM
EXEC
sys.sp_xml_removedocument @DocHandle;
XQuery is a standard language for browsing XML instances
and returning XML.
The real
power of XQuery lies in its so-called FLWOR
expressions. FLWOR is the acronym for
for, let, where, order by, and return. A FLWOR expression is actually a for each loop
Which
conditional expression is supported in XQuery? Anwer (if..then..else)
The easiest
way to create XML schemas is to create relational tables first, and then use
the
XMLSCHEMA
option of the FOR XML clause.
Store the
resulting XML value (the schema) in a variable, and provide the variable as
input to the CREATE XML SCHEMA COLLECTION statement.
an XML data
type method ?
nodes() , exist() , value() is an
XML data type method.
What kind of
XML indexes can you create?
1. You create a PRIMARY XML index before any other XML indexes.
2. A PATH XML index is especially useful if your queries
specify path expressions.
Which XML
data type method do you use to shred XML data to tabular format?
You use the nodes() method to shred XML data.
07-NOV-2014
Database Schema and Table Schema
A
database schema is a database-wide container of objects. A table schema is the
definition of a table that
includes
the CREATE TABLE statement with all the column definitions
·
The DATE, TIME, and DATETIME2 data
types can store data more efficiently and with
better precision than DATETIME and
SMALLDATETIME.
·
Use VARCHAR(MAX), NVARCHAR(MAX), and
VARBINARY(MAX) instead of the deprecated
TEXT, NTEXT, and IMAGE data types.
·
Use ROWVERSION instead of the deprecated TIMESTAMP.
·
DECIMAL and NUMERIC are the same data
type, but generally people prefer DECIMAL
because the name is a bit more
descriptive. Use DECIMAL and NUMERIC instead
of FLOAT or REAL data types unless you
really need floating-point precision and are
familiar with possible rounding issues.
Computed
Columns: CREATE TABLE
Sales.OrderDetails(orderid INT NOT NULL,Initialcost AS unitprice * qty) -- computed column
What
types of table compression are available? You can use either page or row
compression on a table. Page compression
includes
row compression.
Default constraints
are most useful when you have a column in a table that does not allow
NULL, but
you don't want to prevent an INSERT from succeeding if it does not specify a
value
for the
column
(supplierid INT NOT NULL,
categoryid
INT NOT NULL,
unitprice
MONEY NOT NULL
CONSTRAINT
DFT_Products_unitprice DEFAULT(0),
discontinued
BIT NOT NULL
CONSTRAINT
DFT_Products_discontinued DEFAULT(0),)
What
metadata tables give you a list of constraints in a database.
sys.key_constraints
, sys.default_constraints , sys.foreign_keys
Views
CREATE
VIEW Sales.OrderTotalsByYear
WITH
SCHEMABINDING AS SELECT
YEAR(O.orderdate)
AS orderyear, SUM(OD.qty) AS qty
WITH
ENCRYPTION (with encryption of view code)
WITH
SCHEMABINDING (which guarantees that the underlying table structures cannot be
altered without dropping the view)
WITH
VIEW_METADATA (returns the metadata of the view instead of the base table.)
If the
tables are spread across multiple SQL Server instances, the view is
called a
distributed partitioned view
Inline functions can be used
to simulate parameterized views. T-SQL views cannot take
parameters.
However, an inline table-valued function can return the same data as a
view and can
accept parameters that can filter the results.
A view can
contain an ORDER BY if the SELECT TOP clause is used,
but no
actual sorting of the results is guaranteed.
A synonym is
a name that refers to another database object such as a table, view, function,
or stored
procedure.
Does
a synonym store T-SQL or any data? No, a synonym is just a name. All that is
stored with a synonym is the object it
refers
to.
Can
synonyms be altered? No, to change a synonym, you must drop and recreate it.
CREATE
SYNONYM Reports.TotalCustQtyByMonth FOR Sales.CustOrders;
CREATE
SYNONYM Reports.TotalCustQtyByMonth FOR TSQL2012.Sales.CustOrders;
GO
SELECT
custid, ordermonth, qty FROM Reports.TotalCustQtyByMonth; -- Succeeds
GO
Synonyms can
be used to provide an abstraction layer to the user by presenting different
names for
database objects.
Synonyms
can refer to stored procedures.
Synonyms
can refer to temporary tables.
Synonyms
cannot refer to indexes; indexes are not database objects
that
are scoped by schema names.
Database
users are not database objects that are scoped by schema
names.
You can
create a synonym that refers to a nonexistent object. In order to
use the
synonym, however, you must ensure that the object exists.
08-NOV-2014
INSERT
VALUES, INSERT SELECT, INSERT EXEC, SELECT INTO
INSERT EXEC
statement to insert the result of queries in a dynamic
batch or a
stored procedure into the target table
If the
column has an IDENTITY property, you must normally skip it in
the INSERT
statement and let the property assign the value. To provide your own
value, you
need to turn on the IDENTITY_INSERT option, but that’s not what happens
normally.
SELECT INTO
doesn’t copy indexes
SELECT INTO
doesn’t copy constraints.
SELECT INTO
doesn’t copy triggers.
SELECT INTO
locks both data and metadata, and therefore can cause
blocking
related to both.
If the
CREATE TABLE and INSERT SELECT are executed in
different
transactions, you hold locks on metadata only for a very short period.
An UPDATE
with a variable can both modify a column value and collect
the result
into a variable using one visit to the row
The join can
be used to filter the updated rows.
The join
gives you access to information in other tables that can be used
in the
source expressions for the assignments
You cannot
update multiple tables in one UPDATE statement.
With an
UPDATE based on table expressions, you can invoke a window
function in
the inner query’s SELECT list. You can then refer to the alias you assigned
to the
result column in the outer UPDATE statement’s SET clause
An UPDATE
based on a join cannot refer to window functions in the SET
Clause
Difference between DELETE and TRUNCATE
The DELETE
statement writes significantly more to the transaction log compared to the
TRUNCATE
statement. For DELETE, SQL Server records in the log the actual data that
was deleted.
For TRUNCATE, SQL Server records information only about which pages
were
deallocated. As a result, the TRUNCATE statement tends to be substantially
faster.
The DELETE
statement doesn’t attempt to reset an identity property
The DELETE
statement is supported if there’s a foreign key pointing to the table in
question as
long as there are no related rows in the referencing table. TRUNCATE is not
allowed if a
foreign key is pointing to the table—even if there are no related rows in
the
referencing table, and even if the foreign key is disabled.
The DELETE
statement is allowed against a table involved in an indexed view. A TRUNCATE
statement is
disallowed in such a case.
The DELETE
statement requires DELETE permissions on the target table. The TRUNCATE
statement
requires ALTER permissions on the target table
The
TRUNCATE statement resets an IDENTITY property.
The
TRUNCATE statement is disallowed when a foreign key pointing to
the
table exists.
The
TRUNCATE statement is disallowed when an indexed view based on
the
table exists.
Using
the IDENTITY Column Property
IDENTITY is
a property of a column in a table. The property automatically assigns a value
to
the column
upon insertion. You can define it for columns with any numeric type that has a
scale of 0.
This means all integer types, but also NUMERIC/DECIMAL with a scale of 0. When
defining the
property, you can optionally specify a seed and an increment.
SQL Server
provides two features to help you generate a sequence of keys: the IDENTITY
column
property and the sequence object
To get the
newly generated identity value, you can query the functions
SCOPE_IDENTITY=
( the last identity value generated by your session and scope.)
@@IDENTITY= ( the last identity value generated by
your session.)
IDENT_CURRENT=
( the last identity value generated in the input table.)
Sequence object
available from SQL Server 2012.
advantages of using a sequence object instead of IDENTITY
1. One of the advantages of using a sequence object instead of IDENTITY is
that you can attach a DEFAULT constraint that has a
call to the NEXT VALUE FOR
function to an existing column, or remove such a constraint from a
column
2. You can generate a new sequence value before using it by assigning the
value to a variable and later using the variable in
an INSERT statement. This cannot
be done with IDENTITY.
3. You can specify your own value for a column that has an IDENTITY
property, but this requires turning on the session
option IDENTITY_INSERT, which
in turn requires special permissions. The sequence
object is more flexible. You can
insert your own values into a column that normally
gets its value from a sequence
object. And that’s without needing to turn on any
special options and without
needing special permissions.
What is the difference between
Primary key and unique key?
Ans: Primary key does not allow the null values but unique key allows
one null value.
Primary key will create
clustered index on column but unique key will create non-clustered index by
defaul
Three
Statements
1. Sequence Object and Identity Columns
2. MERGE
3. OUTPUT Statement for Inserted and Deleted
With the
MERGE statement, you can merge data from a source table or table expression into
a target table.
Simple Merge
Statement with Single Table
MERGE INTO Sales.MyOrders WITH (HOLDLOCK)
AS TGT
USING
(VALUES(@orderid, @custid, @empid, @orderdate)) AS SRC( orderid, custid, empid,
orderdate) ON
SRC.orderid = TGT.orderid
WHEN MATCHED THEN UPDATE SET TGT.custid =
SRC.custid, TGT.empid = SRC.empid, TGT.orderdate = SRC.orderdate
WHEN NOT MATCHED THEN INSERT VALUES(SRC.orderid,
SRC.custid, SRC.empid, SRC.orderdate);
Simple Merge
Statement with Two Tables one is Source and One is Target
MERGE INTO Sales.MyOrders AS TargetTable
USING
Sales.Orders AS SourceTable ON
SourceTable.orderid = TargetTable.orderid AND shipcountry = N'Norway'
WHEN MATCHED AND (TargetTable.custid <>
SourceTable.custid OR TargetTable.empid <> SourceTable.empid OR
TargetTable.orderdate <> SourceTable.orderdate)
THEN
UPDATE SET TargetTable.custid = SourceTable.custid,
TargetTable.empid
= SourceTable.empid,
TargetTable.orderdate
= SourceTable.orderdate
WHEN NOT MATCHED THEN INSERT VALUES(SourceTable.orderid,
SourceTable.custid, SourceTable.empid, SourceTable.orderdate);
MERGE is
very useful, especially when it comes to loading data warehouse tables, which
can be very large and require specific actions to be taken when rows are or are
not present.
CREATE TABLE dbo.tbl_Source (id INT, name NVARCHAR(100), qty INT);
CREATE TABLE dbo.tbl_Target (id INT, name NVARCHAR(100), qty INT);
--Synchronize source data with target
MERGE INTO dbo.tbl_Target AS t
USING dbo.tbl_Source AS s
ON t.id = s.id
WHEN MATCHED AND (t.name != s.name OR t.qty!= s.qty) THEN
--Row exists and data is different
UPDATE SET t.name = s.name, t.qty = s.qty
WHEN NOT MATCHED THEN
--Row exists in source but not in target
INSERT VALUES (s.id, s.name, s.qty)
WHEN SOURCE NOT MATCHED THEN
--Row exists in target but not in source
DELETE OUTPUT$action, inserted.id, deleted.id
OUTPUT
clause can be used with INSERT, UPDATE, or DELETE to identify the actual rows
affected by these statements.
Using
EXEC() instead of sp_executesql
When using dynamic SQL, you have two choices: EXEC() / EXECUTE(), or sp_executesql. Early on my career, I used EXEC() a lot, because it was much easier. I've since learned that sp_executesql is a lot better, for two main reasons:
When using dynamic SQL, you have two choices: EXEC() / EXECUTE(), or sp_executesql. Early on my career, I used EXEC() a lot, because it was much easier. I've since learned that sp_executesql is a lot better, for two main reasons:
1. You
substantially reduce the risk of SQL injection by continuing to pass strongly-typed
variables throughout the whole chain. Which do you find more questionable?
1.
sp_executesql allows
for statements to be parameterized
o Therefore It’s more secure than EXEC in terms
of SQL injection
2.
sp_executesql can
leverage cached query plans.
o The TSQL string is built only one time, after
that every time same query is called with sp_executesql, SQL Server retrieves
the query plan from cache and reuses it
3.
Temp tables created in
EXEC can not use temp table caching mechanism
What is an Index
Index is a database
object, which can be created on one or more columns (16 Max column
combination). When creating the index will read the column(s) and forms a
relevant data structure to minimize the number of data comparisons. The index
will improve the performance of data retrieval and adds some overhead on data
modification such as create, delete and modify. So it depends on how much data
retrieval can be performed on table versus how much of DML (Insert, Delete and
Update) operations.
In this article, we will
see creating the Index. The below two sections are taken from my previous
article as it is required here. If your database has changes for the next two
sections, you can directly go to section 5.
SQL Server
internally organizes data in a data file in pages.
A page is an
8 KB unit and belongs to a single object; for example, to a table or an index.
A page is
the smallest unit of reading and writing. Pages are further organized into
extents.
An extent consists of eight consecutive pages.
Pages are
physical structures.
SQL Server
organizes data in pages in logical structures.
SQL Server
organizes tables as heaps or as balanced
trees.
A table
organized as a balanced tree is also known as a clustered table or a clustered
index
Indexes are
always organized as balanced trees.
Other
indexes, such as indexes
that do not contain all of the data and serve as pointers to table rows for
quick seeks, are called non clustered indexes.
Heaps
A heap is a quite simple structure. Data in a heap is not
organized in any logical order.
A heap is just a bunch of pages and extents
Index Allocation Map (IAM) pages
SQL Server
traces which pages and extents belong to an object through special system
pages called
Index Allocation Map (IAM) pages. Every table or index has at least one IAM
page, called
first IAM. A single
IAM page can point to approximately 4 GB of space. Large
objects can
have more than one IAM page. IAM pages for an object are organized as a doubly
linked list; each page has a pointer to
its descendant and antecedent.
SQL Server
stores pointers to first IAM pages in its own internal system tables SQL Server
uses Index Allocation Map(IAM) Pages to scan heaps in physical order, or allocation order.
If you do
not create a clustered index explicitly or implicitly through primary key or
unique
constraints,
then a table is organized as a heap. SQL Server does not allocate any pages for
a table when
you create it. It allocates the first page, and also the first IAM page, when
you
insert the
first row in the table. You can find general information about tables and
indexes in
the
sys.indexes catalog view.
Clustered Index
You organize a table as a
balanced tree when you create a clustered index.
The structure is called a balanced tree because it resembles an inverse tree.
Every balanced tree has a single root page and at least one or more leaf pages.
In addition, it can have zero or more intermediate levels.
The structure is called a balanced tree because it resembles an inverse tree.
Every balanced tree has a single root page and at least one or more leaf pages.
In addition, it can have zero or more intermediate levels.
All data in
a clustered table is stored in leaf pages.
Data is
stored in logical order of the clustering key.
A clustering key can consist of a
single column, or of multiple columns.
If the key
consists of multiple columns, then this is a composite
key.
SQL Server
still uses IAM pages to follow the physical allocation.
The primary key created
for the StudId column will create a clustered index for the Studid column. A
table can have only one clustered index on it.When creating the clustered
index, SQL server 2005 reads the Studid column and forms a Binary tree on it.
This binary tree information is then stored separately in the disc. Expand the
table Student and then expand the Indexes. You will see the following index
created for you when the primary key is created:With the use of the binary
tree, now the search for the student based on the studid decreases the number
of comparisons to a large amount. Let us assume that you had entered the
following data in the table student
The index will form the
below specified binary tree. Note that for a given parent, there are only one
or two Childs. The left side will always have a lesser value and the right side
will always have a greater value when compared to parent. The tree can be
constructed in the reverse way also. That is, left side higher and right side
lower.
Non
Clustered Index
nonclustered
indexes are balanced tree structures that start from a root index node, and
include intermediate index nodes and leaf nodes.
Nonclustered
indexes have a very similar structure to clustered ones.
the root and
the intermediate levels look the same as in a clustered index.
The leaf
level is different because it does not hold all of the data.
it is
orgnized as a heap or as a balanced tree.
You can have
up to 999
nonclustered indexes on a single table
The leaf
level of a nonclustered index contains the index keys and row locators.
You can have
up to 16 columns in a key.
A
non-clustered index is useful for columns that have some repeated values. Say
for example, AccountType column of a bank database may have 10 million rows.
But, the distinct values of account type may be 10-15. A clustered index is
automatically created when we create the primary key for the table. We need to
take care of the creation of the non-clustered index.
Follow the
steps below to create a Non-clustered index on our table Student based on the
column class.
A table can
have more than one Non-Clustered index. But, it should have only one clustered
index that works based on the Binary tree concept. Non-Clustered column always
depends on the Clustered column on the database.
This can be
easily explained with the concept of a book and its index page at the end. Let
us assume that you are going to a bookshop and found a big 1500 pages of C#
book that says all about C#. When you glanced at the book, it has all beautiful
color pages and shiny papers. But, that is not only the eligibility for a good
book right? One you are impressed, you want to see your favorite topic of
Regular Expressions and how it is explained in the book. What will you do? I
just peeped at you from behind and recorded what you did as below:
You went to
the Index page (it has total 25 pages). It is already sorted and hence you
easily picked up Regular Expression that comes on page Number 17.
Next, you
noted down the number displayed next to it which is 407, 816, 1200-1220.
the
Index page is Non-Clustered index and the page numbers are clustered index
arranged in a binary tree. See how you came to the page 407 very quickly. Your
mind actually traversed the binary tree way left and right to reach the page
407 quickly.
RID
lookup
In order to
seek for a row, SQL Server needs to traverse the index to the leaf level, and
then read
the appropriate page from the heap and retrieve the row from the page.
The
operation of retrieving the row from the heap is called RID lookup.
The
operation of retrieving the row from the heap is called RID lookup.
As we have
an index on the WHERE clause, the SQL Server query execution engine uses the
non-clustered index to retrieve data from the table.
However, the
columns used in the SELECT clause are still not part of the index, and to display
those columns, the engine will have to go to the base table again and retrieve
those columns.
This
particular behavior is known as bookmark lookup or key lookup.
The
clustering key should be short and unique because it appears in all non clustered
indexes.
SQL Server
2012 can store index data column by column in what’s called a column store index.
Column store
indexes can speed up data warehousing queries by a large factor, from 10 to even 100 times.
Key lookup is operator
is used when SQL Server performs a non-clustered index seek to find a row,
but then
also needs data from the underlying table, which is organized as a clustered
index.
The data needs to be sorted in memory or must be
spilled to [tempdb] if it does not fit in memory.
SQL Server
processes a set-based solution, even if internally the execution plan for the
query involves iterations.
As a result,
if you know how to tune queries, you are often able to achieve much better
performance compared to using iterative solutions.
It should be
noted that there are exceptional cases where iterative solutions perform better
than set-based ones—even with all of the extra overhead for the row-by-row
operations.
What
are the commands that are required to work with a cursor?
DECLARE,
OPEN, FETCH in a loop, CLOSE, and DEALLOCATE
That
the cursor is read-only, forward-only.
Covering index
A covering index is a
non-clustered index which includes all columns referenced in the query.
Set Based Solution
Set-based
solutions are based on principles from the relational model,
and this
model is the foundation of SQL (the standard language) and T-SQL (the
dialect in
SQL Server)
Because
set-based solutions are declarative and iterative solutions are
imperative,
set-based solutions tend to involve less code
Alternative to Cursor
This
approach with the TOP option does give you a correct alternative to
a cursor.
However, you need to think about the fact that it is more I/O-intensive
Table Variable.
Table
variables are declared, as opposed to being created. They are named with the at
sign
(@) as a
prefix; for example, @T1. They are visible
only to the batch that declared them and
are
destroyed automatically at the end of the batch. They are not visible across
batches in the
same level,
and are also not visible to inner levels in the call stack.
Temporary
tables are created in tempdb in the dbo schema.
CONSTRAINT
PK_#T1 PRIMARY KEY(col1) not possible with name constraint in Primary Key
You
can define as PRIMARY KEY(col1) ,
Unique(col2) etc………………….
only
temporary tables have a physical representation
in tempdb
and that table variables reside only in memory. This isn’t true. Both temporary
tables and
table variables have a physical representation in tempdb.
Difference with Physical table
Changes
applied to a temporary table are undone if the transaction rolls back
Difference
between Temporary Table and Table Variable
SQL Server
maintains distribution statistics (histograms) for temporary
tables but
not for table variables.
Temporary
tables and table variables differ in a number of ways, including scope, DDL
and
indexing, interaction with transactions, and distribution statistics.
The
following code demonstrates that table variables are not visible even across
batches in
the same
level.
DECLARE
@T1 AS TABLE
(
col1
INT NOT NULL
);
INSERT
INTO @T1(col1) VALUES(10);
GO
SELECT
col1 FROM @T1;
GO
Give error
after GO new batch of code started .
Difference between SET
based and Iterative processing
Iterative refers to a processing methodology which focuses on one
record at a time whereas set based refers to working with the entire record set
at a time.
Examples of Iterative
processing are CURSORS and WHILE loops.
Whereas examples of SET based are straightforward JOINs.
Iterative processing is quite common outside of a relational
database system; think arrays and loops.
Non relational software systems typically have no SET based
functionality - hence the widespread integration of SQL Server, Oracle, DB2
etc. systems. And I think that this is where the problem comes in.
26-Nov-14
---------------------------------------------------------------------------
If a table
is organized as a heap, then the only access method available to SQL Server is
a table scan.
The scan is
performed in no specific logical order; SQL Server uses Index Allocation
Map (IAM)
pages to do the scan in physical allocation
order.
SQL Server
can use the allocation order scan when a table is clustered as well.
An
allocation order scan is faster if a table is less physically fragmented; the
scan is slower if the physical fragmentation is higher.
Allocation
order scans are not affected by the logical fragmentation.
When SQL
Server scans a clustered index, it can also scan in the logical order of the
index by using the index order scan.
SQL Server
uses the index leaf–level’s linked list to perform an index
order scan.
Join
Algorithms
When
performing joins, SQL Server uses different algorithms.
SQL Server
supports three basic algorithms: nested loops, merge
joins, and hash
joins.
A hash join
can be furher optimized by using bitmap filtering;
a bitmap
filtered hash join could be treated as the fourth algorithm, or as an
enhancement of the third, the hash algorithm.
The nested loops algorithm is a very simple and, in many
cases, efficient algorithm. SQL
Server uses
one table for the outer loop. For each row in this outer input, SQL Server
seeks for matching rows in the second table
example
SELECT o.OrderDate
FROM Sales.SalesOrderHeader
AS O
INNER JOIN Sales.SalesOrderDetail
AS OD
ON O.SalesOrderID = OD.SalesOrderID
WHERE OD.SalesOrderID=48980
Only
a nested loops join algorithm supports non-equijoins
Merge join is a very efficient join algorithm. However, it has
its own limitations. It needs at
least one
equijoin predicate and sorted inputs from both sides.
SELECT
O.custid, O.orderdate, OD.orderid, OD.productid, OD.qty
FROM
Sales.Orders AS O
INNER
JOIN Sales.OrderDetails AS OD
ON
O.orderid = OD.orderid;
It uses a
searching structure named a hash table.
This is not
a searching structure you can build, like a balanced tree used for indexes.
SQL Server
builds the hash table internally.
It uses a
hash function to split the rows from the smaller input into buckets.
After the
hash table is built, SQL Server applies the hash function on each of the rows
from the other input.
It checks to
see into which bucket the row fits. Then it scans through all rows from the
bucket.
This phase
is called the probe
phase.
In a
single-thread mode Hash Table is usually slower than merge and nested loops
join algorithms that are supported by
existing
indexes.
SQL Server
can split rows from the probe input in advance, and perform partial joins in
multiple threads.
The hash
join is actually very scalable. This kind of optimization of a hash join is
called a bitmap
filtered hash join.
SQL Server
uses two different algorithms for calculating aggregations.
If an input
is ordered by the columns used in the GROUP BY clause, then SQL Server uses the
stream aggregation
algorithm,
which is implemented in the Stream Aggregate operator.
Stream
aggregation is very efficient.
SQL Server
might even decide to sort the input before performing the aggregation in order
to make it possible to use the Stream Aggregate operator.
Stream
aggregation might be faster than the hash aggregation.
Index Scan:
Since a scan touches every row in the table whether or not it qualifies, the cost is proportional to the total number of rows in the table. Thus, a scan is an efficient strategy if the table is small or if most of the rows qualify for the predicate.
Since a scan touches every row in the table whether or not it qualifies, the cost is proportional to the total number of rows in the table. Thus, a scan is an efficient strategy if the table is small or if most of the rows qualify for the predicate.
Index Seek:
Since a seek only touches rows that qualify and pages that contain these qualifying rows, the cost is proportional to the number of qualifying rows and pages rather than to the total number of rows in the table.
Since a seek only touches rows that qualify and pages that contain these qualifying rows, the cost is proportional to the number of qualifying rows and pages rather than to the total number of rows in the table.
In general
query optimizer tries to use an Index Seek which means that optimizer has found
a useful index to retrieve record set.
No comments:
Post a Comment