Labels

admin (1) aix (1) alert (1) always-on (2) Architecture (1) aws (3) Azure (1) backup (3) BI-DWH (10) Binary (3) Boolean (1) C# (1) cache (1) casting (3) cdc (1) certificate (1) checks (1) cloud (3) cluster (1) cmd (7) collation (1) columns (1) compilation (1) configurations (7) Connection-String (2) connections (6) constraint (6) copypaste (2) cpu (2) csv (3) CTE (1) data-types (1) datetime (23) db (547) DB2 (1) deadlock (2) Denali (7) device (6) dotNet (5) dynamicSQL (11) email (5) encoding (1) encryption (4) errors (124) excel (1) ExecutionPlan (10) extended events (1) files (7) FIPS (1) foreign key (1) fragmentation (1) functions (1) GCP (2) gMSA (2) google (2) HADR (1) hashing (3) in-memory (1) index (3) indexedViews (2) insert (3) install (10) IO (1) isql (6) javascript (1) jobs (11) join (2) LDAP (2) LinkedServers (8) Linux (15) log (6) login (1) maintenance (3) mariadb (1) memory (4) merge (3) monitoring (4) MSA (2) mssql (444) mssql2005 (5) mssql2008R2 (20) mssql2012 (2) mysql (36) MySQL Shell (5) network (1) NoSQL (1) null (2) numeric (9) object-oriented (1) offline (1) openssl (1) Operating System (4) oracle (7) ORDBMS (1) ordering (2) Outer Apply (1) Outlook (1) page (1) parameters (2) partition (1) password (1) Performance (103) permissions (10) pivot (3) PLE (1) port (4) PostgreSQL (14) profiler (1) RDS (3) read (1) Replication (12) restore (4) root (1) RPO (1) RTO (1) SAP ASE (48) SAP RS (20) SCC (4) scema (1) script (8) security (10) segment (1) server (1) service broker (2) services (4) settings (75) SQL (74) SSAS (1) SSIS (19) SSL (8) SSMS (4) SSRS (6) storage (1) String (35) sybase (57) telnet (2) tempdb (1) Theory (2) tips (120) tools (3) training (1) transaction (6) trigger (2) Tuple (2) TVP (1) unix (8) users (3) vb.net (4) versioning (1) windows (14) xml (10) XSD (1) zip (1)

The XML source does not validate the data in the XML file against the XSD

Important to know:

The SSIS "XML source" element check validation of the input XML file,
but – check only the structure, not the data!
That mean not valid data will pass this validation check!

Is it good or not?
Depend on the case.

But don't forget it…

Get plans of a Stored Procedure

Select  PS.total_logical_reads/NullIf(DateDiff(Minute,PS.cached_time,GetDate()),0) N,
        Coalesce(DB_Name(PS.database_id),DB_Name(PSh.dbid),DB_Name(PSph.dbid))
        DBName,
        Object_Schema_Name(Coalesce(PS.object_id,PSh.objectid,PSph.objectid),Coalesce(PS.database_id,PSh.dbid,PSph.dbid))
        SchemaName,
        Coalesce(Object_Name(PS.object_id,PS.database_id),Object_Name(PSh.objectid,PSh.dbid),Object_Name(PSph.objectid,PSph.dbid))
        ObjectName,
        PS.*,
        PSh.text,
        PSph.query_plan
From    (Select Object_Name(object_id), *
        From    sys.dm_exec_procedure_stats (NoLock)
        Where   database_id=DB_ID('DatabaseName')
                And Object_Schema_Name(object_id)='SchemaName'
                And Object_Name(object_id)='SpName'
        Order By total_logical_reads/NullIf(DateDiff(Minute,cached_time,GetDate()),0) Desc) PS
Outer Apply sys.dm_exec_sql_text(PS.sql_handle) PSh
Outer Apply sys.dm_exec_query_plan(PS.plan_handle) PSph;

Pivot Table With Multiple Column Aggregates

CREATE TABLE #DoublePivot
(      [ID] INT IDENTITY(1,1) NOT NULL,
       [UserID] INT NOT NULL,
       [SomeType] INT NOT NULL,
       [NumberA] INT NOT NULL,
       [NumberB] INT NOT NULL,
       [Created] DATETIME NOT NULL DEFAULT(getdate())
)

INSERT INTO #DoublePivot
       (      UserID, SomeType, NumberA, NumberB )
       VALUES
       (      1, 1, 20, 10 ),
       (      1, 2, 50, 20 ),
       (      1, 1, 10, 0),
       (      2, 1, 10, 10),
       (      2, 2, 10, 40),
       (      1, 1, 20, 20),
       (      2, 2, 20, 70);

SELECT * FROM #DoublePivot order by UserID, SomeType

SELECT UserID, SomeType, SUM(NumberA) as SumNumberA, SUM(NumberB) as SumNumberB FROM #DoublePivot Group by UserID, SomeType order by UserID, SomeType
                                 
       SELECT UserID,
                     MAX([1]) AS SomeType1_NumberA,
                     MAX([-1]) AS SomeType1_NumberB,
                     MAX([2]) AS SomeType1_NumberB ,
                     MAX([-2]) AS SomeType2_NumberB
       FROM
                     (      SELECT UserID,
                                         SomeType,
                                         (-* SomeType) as SomeTypeB,
                                         SUM(NumberA) AS NumberA,
                                         SUM(NumberB) AS NumberB
                           FROM #DoublePivot
                           GROUP BY UserID, SomeType
                     )  BaseTable
                     PIVOT
                     (      SUM(NumberA)
                           FOR SomeType IN ([1], [2])
                     ) AS PivotA
                     PIVOT
                     (      SUM(NumberB)
                           FOR SomeTypeB IN ([-1], [-2])
                     ) AS PivotB
       GROUP BY UserID

DROP TABLE #DoublePivot

*  (-1 * SomeType) - in order to set a new "base" column to the second pivot summary.



"Automatic" extended events - system_health

After SQL Server installation, A default extended events session - system health – is created.
System health runs without any noticeable performance effects, and collect some important events:
  • Deadlocks
  • Sessions that have waited on locks, waits, etc.
  • More….
See all in:


I use it to catch deadlocks graph, and it was very helpful!

You can also filters events.
If you don’t remember the exact event name, you can type part of it:


After you get the results and see the event you need, you can specify the filter by it:





SSIS error: The SQL command requires a parameter named, which is not found in the parameter mapping

SSIS error message:
The SQL command requires a parameter named "@aaaa", which is not found in the parameter mapping.

Cause and Solutions:
1. The parameter is really not exists. Solution: declare it.
2. Space - " " - in the mapping pf the parameter - see the picture below. Solution: delete the space.


OUTPUT Clause for the MERGE Statements

MERGE ....
USING...
WHEN ...
WHEN ...
...
OUTPUT
   $action,
   inserted.*,
   deleted.*;

Get the number of rows in tables

In order to get the number of rows in tables you can simply run count(*).
But:
1. You need to run it for each table in the database.
2. Fot big tables - it takes time.

So. use thos query in order to get the number of rows in tables, or in one table - set WHERE clause on the table name or ID.

; WITH rCnt AS
(
       SELECT p.object_id, SUM(p.rows) as RowsCount
       FROM sys.partitions p
       WHERE p.index_id IN (0,1) -- heap or clustered index
       GROUP BY p.object_id, OBJECT_NAME(p.object_id)
)
SELECT SCHEMA_NAME(t.schema_id) AS SchemaName,
       t.name as TableName,
       rCnt.RowsCount
FROM   rCnt
JOIN   sys.tables t  ON t.object_id = rCnt.object_id

One less column in the query - and the results were changed dramatically

What is the case - short description:
A select query, that when I remove one column (of windows function) - the return data was changed dramatically.

What is the case - a little bit more detailed description:
A table that is updated by incoming events had a column that stored an EventId of the last event that affect it.
I had to update the EventId in a table - not according to the last event but according to the last event that was executed.
I checked the data with a query, that give me the last 2 dates (created and executed), and before running the update query I "remove" one of the columns from the query - and - surprisingly - the results were changed and the number of the records that I had to update was changed dramatically.


What cause this change?

In this post:
I created a demo that simulate the case I had, to described what I checked and how I found the real and interesting cause to the big different results - after remove a column from a SELECT.

In the demo:
Events and SubEvents tables.
LastSubEventID in Events table is the subevent of the last CreateDate of the SubEvents of the event.
We want to update LastSubEventID to be according to the the subevent of the last ExecDate of the SubEvents of the event.

-- 1. Create the demo tables:
CREATE TABLE dbo.Events
(
       EventID int,
       Notes nvarchar(50),
       LastSubEventID datetime
)
GO
CREATE TABLE dbo.SubEvents -- in the real life - join of 3-4 tables
(
       SubEventID int,
       EventId int,
       CreateDate datetime,
       ExecDate datetime
)
GO
INSERT INTO dbo.Events (EventID, Notes)
       VALUES (1, 'Notes 1'), (2, 'Notes 2'), (3, 'Notes 3'), (4, 'Notes 4'), (5, 'Notes 5')
GO
INSERT INTO dbo.SubEvents (SubEventID, EventID, CreateDate, ExecDate)
       VALUES
       (1, 1, '2017-01-01', '2017-01-02'),
       (2, 1, '2017-01-03', '2017-01-05'),
       (3, 1, '2017-01-02', '2017-01-02'),
       (4, 1, '2017-01-10', '2017-01-11'),
       (5, 1, '2017-01-07', '2017-01-12'),

       (6, 2, '2017-01-01', '2017-01-01'),
       (7, 2, '2017-01-02', '2017-01-02'),
       (8, 2, '2017-01-03', '2017-01-03'),
       (9, 2, '2017-01-04', '2017-01-04'),

       (10, 3, '2017-01-01', '2017-01-06'),
       (11, 3, '2017-01-02', '2017-01-15'),
       (12, 3, '2017-01-06', '2017-01-11'),
       (13, 3, '2017-01-08', '2017-01-12'),

       (14, 4, '2017-01-05', '2017-01-11'),
       (15, 4, '2017-01-03', '2017-01-11'),
       (16, 4, '2017-01-22', '2017-01-11'),
       (17, 4, '2017-01-01', '2017-01-11'),
       (18, 4, '2017-01-10', '2017-01-11'),
      
       (19, 5, '2017-01-11', '2017-01-21'),
       (20, 5, '2017-01-01', '2017-01-11'),
       (21, 5, '2017-01-22', '2017-01-23'),
       (22, 5, '2017-01-03', '2017-01-03'),
       (23, 5, '2017-01-02', '2017-01-07')
GO

Starting point: LastSubEventID is the subevent of the last CreateDate of the SubEvents of the event.

UPDATE dbo.Events SET LastSubEventID = 4 WHERE EventID = 1
UPDATE dbo.Events SET LastSubEventID = 9 WHERE EventID = 2
UPDATE dbo.Events SET LastSubEventID = 13 WHERE EventID = 3
UPDATE dbo.Events SET LastSubEventID = 16 WHERE EventID = 4
UPDATE dbo.Events SET LastSubEventID = 21 WHERE EventID = 5

SELECT * FROM dbo.Events
SELECT * FROM dbo.SubEvents



Now we want to update LastSubEventID to be according to the the subevent of the last ExecDate of the SubEvents of the event.

-- 2. Get the last sub-events according to the 2 options:
SELECT DISTINCT
              e.EventID,
              FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
              FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
FROM   dbo.Events e
JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId


-- 3. Now join them to the event table and select only the required results (= what we need to change):
; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT *
FROM   dbo.Events e
JOIN   maxes  ON     e.EventID = maxes.EventID
WHERE e.LastSubEventID <> maxes.maxEventId_ExecDate


-- 4. but, actually we don't need maxEventId_CreateDate, so let's remove it from the query and help the SQL to run better abd faster:
; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     --FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT *
FROM   dbo.Events e
JOIN   maxes  ON     e.EventID = maxes.EventID
WHERE e.LastSubEventID <> maxes.maxEventId_ExecDate



Why the results are different?????

In the "real world", I had more then 1,800,000 records,
and the results count was 40,000 instead of 4,000.....

So I tried to understand why the numbers of the queries are so differents.

-- 5. First check: what are the results of the CTEs?
; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT count(*) FROM maxes

; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     --FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT count(*) FROM maxes

-- Same number (5).... Next...


-- 6. Second check: check the name of the columns in the CTEs, maybe a ',' or something is missing

-- No... Next...


-- 7. Third check: check estimated execution plan
; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT *
FROM   dbo.Events e
JOIN   maxes  ON     e.EventID = maxes.EventID
WHERE e.LastSubEventID <> maxes.maxEventId_ExecDate

; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     --FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT *
FROM   dbo.Events e
JOIN   maxes  ON     e.EventID = maxes.EventID
WHERE e.LastSubEventID <> maxes.maxEventId_ExecDate


looks pretty similar (except the one more FIRST_VALUE componenets)
BUT: look at the estimated numbers in the left side of the plan - they are not the same...


And in my real case:



Finally - we have a clue...!

-- 8. Let's check actual execution plan

Plus-minus the same as the estimated....



-- 9. Forth check: Start look at the data

40,000 records, but I had no other choice...

-- let's take the CTEs to temp tables and check them:
       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       INTO #tmp2first_value
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId

       SELECT DISTINCT
                     e.EventID,
                     --FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc) AS [maxEventId_ExecDate]
       INTO #tmp1first_value
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId

-- run selects:     
select count(*) from #tmp2first_value --5 (in my "real" case there were 1,183,399...)
select count(*) from #tmp1first_value --5 (in my "real" case there were 1,183,399...)

select * from #tmp2first_value f2 join #tmp1first_value f1 -- 4 (in my "real" case there were 1,146,427...)
       on f2.EventID = f1.EventID and f2.[maxEventId_ExecDate] = f1.[maxEventId_ExecDate]

select EventID ,[maxEventId_ExecDate] from #tmp2first_value
except -- 1 (in my "real" case there were 36,972...)
select EventID ,[maxEventId_ExecDate] from #tmp1first_value

-- now I took one example:

select * from #tmp2first_value where EventID = 4
select * from #tmp1first_value where EventID = 4

--> DIFFERENT [maxEventId_ExecDate] !!!!!!!!!!!!!!!!!!!!


We have a reason to the different results, now we need to find:
  1. An explanation
  2. Which one is the correct one?
  3. A solution (to be sure that all of the 40000 will be correct).

SELECT * FROM dbo.SubEvents WHERE EventId = 4 ORDER BY CreateDate DESC



CAN YOU SEE THAT??

ExecDate is the same for all of the records.
           --> Any one of them can be the result of the query of the "max" of them, so the result can be different!
                  --> 1. We have an explanation!

2. which one is the correct one?
         --> For the query - all of them...
The answer for that it's a logic/bisiness/app answer: can be by the CreateDate, SubEventID, ..... etc.

3. A solution (to be sure that all of the 40000 will be correct)

What we have?
A case that all of the ExecDate were the same, so in 2 selects we had different ordering --> different results!

The solution is to specify an exact ordering (acording to the logic/bisiness/app answer that we've got).
Here it looks logicaly to add the CreateDate or the ID (SubEventID) to the ordering.
I'll take the ID (desc) - in order to do the example more clear
 * ID can be better if the CreateDate can also have duplications!
 * Of course it will be smart to do a specific ordering in all FIRST_VALUE.

       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc, SubEventID desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc, SubEventID desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId


; WITH maxes2first_value AS
(
       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc, SubEventID desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc, SubEventID desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
, maxes1first_value AS
(
       SELECT DISTINCT
                     e.EventID,
                     --FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.CreateDate desc, SubEventID desc) AS [maxEventId_CreateDate],
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc, SubEventID desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
select EventID ,[maxEventId_ExecDate] from maxes2first_value
except --0 records (also in my "real" case)
select EventID ,[maxEventId_ExecDate] from maxes1first_value

-- and - what we REALY need to update?

-- now join them to the event table and select only the required results:
; WITH maxes AS
(
       SELECT DISTINCT
                     e.EventID,
                     FIRST_VALUE(s.SubEventID) OVER (PARTITION BY s.EventId ORDER BY s.ExecDate desc, SubEventID desc) AS [maxEventId_ExecDate]
       FROM   dbo.Events e
       JOIN   dbo.SubEvents s      ON     e.EventID = s.EventId
)
SELECT *
FROM   dbo.Events e
JOIN   maxes  ON     e.EventID = maxes.EventID
WHERE e.LastSubEventID <> maxes.maxEventId_ExecDate

Summary, Conclusions and Insights
 Actually, in my "real" case, I select the spare FIRST_VALUE just to "get a feeling" of the data.
In the end - it was helpful for me to catch this case... (and, actualy, to prevent errors...).

Why the ordering was different?
Because in each table the table was scaned by a different column: ID of CreatedDate that was sorted for the FIRST_VALUE

Conclusions:
1. Ordering must be exact and untrusted by fate
            a. Wrong data
            b. Missing or surplus data
            c. Logicaly is more correct
2. Test data before run an update script! (supposed to be obvious, but you know..... ;) )


--     Cleanup
DROP TABLE #tmp2first_value
DROP TABLE #tmp1first_value

DROP TABLE dbo.Events
DROP TABLE dbo.SubEvents