Philip Hendry's Blog

March 8, 2011

Randomising data in a table using T-sql

Filed under: SQL Server, T-SQL — philiphendry @ 8:48 am

I needed to obfuscate some data for a demo server but rather than re-invent all the data I simply wanted to randomly swap data so associations could not be made between a store and pay rates for example. After a bit of head scratching I came to the conclusion that I simply needed to be able to correlate one set of data with a random set. The simplest way I thought of was to use a row_number() to correlate one list of data with another where one set of row numbers were ordered by a primary key whilst the other used a little trick of ordering by newid() which is recalculated for each row and therefore randomly distributes the data. Here’s the code :

select *
--update s1 set s1.StoreName = s2.StoreName, s1.StoreAddress1 = s2.StoreAddress1, s1.County = s2.County, s1.Town = s2.Town, s1.PostCode = s2.PostCode, s1.Telephone = s2.Telephone
from (select row_number() over(order by StoreID) as RowNumber, StoreId, StoreName, StoreAddress1, County, Town, PostCode, Telephone from Stores s1) as s1 
join (select row_number() over(order by newid()) as RowNumber, StoreId, StoreName, StoreAddress1, County, Town, PostCode, Telephone from Stores s2) as s2 on s1.RowNumber = s2.RowNumber 

October 12, 2010

Persisting Object Graphs/Collections to one or more SQL Server Tables in one Stored Procedure Call

Filed under: SQL Server, T-SQL — philiphendry @ 7:52 pm

One of the methods I’ve used on a number of occasions to solve performance issues in an application is to reduce a number of calls to the database into one call. Using the classic order header and order detail example, rather than looping through each order and call the database for each item, I would prefer to pass the entire set of order detail to the database and persist it to the database with one T-Sql call. Taking it a step further, and possibly unnecessarily in this example, it would also be possible to pass the order header in the same call.

Now before I continue with the solution below which works in SQL Server 2005 and above, there is a new feature introduced in SQL Server 2008 allowing for table valued parameters to be added to stored procedures which accomplish similar solution.

A Solution

In my particular data object I have a method, SaveDetail, that is passed a graph of objects which is then converted into XML.

   public static void SaveDetail(ScheduleDto schedule)
   {
       StoredProcedureDAL.LM_Schedule_SaveXML(ConvertToXml(schedule).ToString(SaveOptions.DisableFormatting));
   }

   private static XElement ConvertToXml(ScheduleDto schedule)
   {
       return
           new XElement("Schedule",
               new XAttribute("Date", schedule.Date),
               from jobArea in schedule.JobAreas
               from employee in jobArea.Employees
               select new XElement("Emp",
                   new XAttribute("No", employee.Number),
                   employee.Schedule.Select((item, index) =>
                       new XElement("W", // WorkTpe
                           new XAttribute("N", item),
                           new XAttribute("T", ConvertFloatTo24HrTimeString(index * schedule.StepSize)))
                       )
               )
           );
   }

The call to StoredProcedureDAL.LM_Schedule_SaveXML is not shown here and is automatically generated from the stored procedure definitions in the database. The ScheduleDto is a hierarchical object model which, in part, is converted to xml by the ConvertToXml method using Linq to Xml. Notice that a lot of the element names are very short and attributes are used heavily since the xml will be much shorter and it’s important to reduce the size of the data being posted to the database server.

The stored procedure that is called above is shown below although it isn’t complete but highlights the details required to ‘shred’ the xml into a recordset. The key to ‘shredding’ the data is taking the xml typed parameter and using the nodes method to return relational data. The parameter to nodes is XQuery and it’s best to drill down to the deepest elements in the XML since it’s easier to use parent paths to access data higher in the hierarchy… and that neatly leads on to accessing the data which is shown in the select below. The T(c) syntax provisions Table and Columns names which are then used in the Select. The T.c.value method accepts another XQuery (and here I’ve used parent paths, attributes and indexes) and the type to convert the xml into. I used to join directly to @xml.nodes() but as the comment in the code indicates, I’ve found it’s quicker to use an inner query or, if you prefer, a CTE – I’m afraid I’ve not tried to get to the bottom of why this is so slow.

CREATE PROCEDURE [dbo].[LM_Schedule_SaveXML]
    @xml XML
AS
begin
    insert into [LM_ScheduleDetails]( [Date], [EmployeeID], [Time], [WorkType] )
    select
        @date,
        e.EmployeeId,
        x.WorkTypeTime,
        spwt.Value
    from (
        -- Fetch the data from the xml in the inner select. This seems to
        -- be significantly quicker than trying to do everything in one select.    
        select
            T.c.value('../@No[1]', 'int') as EmployeeNumber,
            T.c.value('@T[1]', 'datetime') as WorkTypeTime,
            T.c.value('@N[1]', 'nvarchar(1)') as WorkTypeName
        from @xml.nodes('/Schedule/Emp/W') as T(c)
    ) x
    join LM_Employees e on e.PayrollNo = x.EmployeeNumber
    join LM_ShiftPatternsWorkTypes spwt on spwt.Name = x.WorkTypeName
end
GO

And that’s it. In the particular scenario I used this very effectively to provide near instant saving of data for something that would have taken a lot, lot longer to save with multiple calls. I like this solution too since the technology needed to facilitate this is simple to understand and very compact and therefore quick and easy to maintain.

September 2, 2010

Calling SQL Server Stored Procedure Synonym fails with parameter error

Filed under: Dev Problem, SQL Server, T-SQL — philiphendry @ 1:38 pm

I’ve just tried to call a stored procedure in another database via a synonym but received an error ‘Procedure or function ‘LM_Employees_HasNewStoreTempsInStore’ expect parameter ‘@OrganisationLevelValueID’, which was not supplied.

Here’s how I defined the synonym and called it :

CREATE SYNONYM [dbo].[LM_Employees_HasNewStoreTempsInStore] 
FOR [PHENDRY].[LabourManager_DEV].[dbo].[LM_Employees_HasNewStoreTempsInStore]
GO

exec [dbo].[LM_Employees_HasNewStoreTempsInStore] 1119
GO

As you can see, I’m definitely passing the parameter!!

The problem appears to be how the synonym is defined when the two database exist on the same server. The machine name, here ‘[PHENDRY].’ can be removed completely and everything works fine!

December 1, 2009

Dynamic Order By in T-SQL

Filed under: SQL Server, T-SQL — philiphendry @ 2:33 pm

I’ve been looking at some code that looked like this:

CREATE PROCEDURE [dbo]., 
    @DraftStatus int,
    @Surname nvarchar(50),
    @PayrollNo nvarchar(50),
    @StartRow int,
    @PageSize int,
    @OrderBy varchar(50),
    @OrgLevelValueID int = null
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @sql nvarchar(4000)
    set @sql = 'SELECT * FROM '+
            '(SELECT *, ROW_NUMBER() OVER (ORDER BY ' + @OrderBy + ') AS RowNum '+
            'FROM LMv_EmployeeDrafts '+
            'WHERE DraftStatus = ' + CONVERT(nvarchar(10), @DraftStatus) +
                ' AND Surname LIKE ''%'+@Surname+'%'' '+
                ' AND CONVERT(nvarchar(10), PayrollNo) LIKE ''%'+@PayrollNo+'%'') AS tmp '+
        'WHERE RowNum-1 BETWEEN ' + CONVERT(nvarchar(10), @StartRow)+
            ' AND (' + CONVERT(nvarchar(10), @StartRow) + ' + ' + CONVERT(nvarchar(10), @PageSize) + ') - 1'
        --IF NOT(@OrgLevelValueID IS NULL) SET @sql = @sql + ' AND OrganisationLevelValueID = ' + CONVERT(varchar(max), @OrgLevelValueID)
        --IF NOT (@OrgLevelValueID IS NULL) SET @sql = @sql + ' AND (' + CONVERT(varchar(max), @OrgLevelValueID) + ' IN ' +
        SET @sql = @sql + ' AND (' + CONVERT(varchar(max), @OrgLevelValueID) + ' IN ' +
        '(SELECT OrgLvlID FROM RMR_GetPreOrgLvlValues(OrganisationLevelValueID)))'

    EXEC sp_executesql @sql
    PRINT @sql
END

Eeeek… not the most user-friendly (and I’ve certainly written dynamic SQL like this before) but it was written like this to allow for different page ordering. There’s also a bug in this code where the ROW_NUMBER() needs to be conditionally partitioned before being filtered.

The solution isn’t necessarily the prettiest either. However, it’s now not a sql string being built up so the execution planner can pre-compile it and it is easier to read! The one disadvantage is if a new order by is introduced the SQL needs to be updated – however, I think this is a reasonable exchange:

SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY
            CASE WHEN @OrgLevelValueID is not null THEN OrganisationLevelValueID END
        ORDER BY 
            CASE WHEN @OrderBy = 'DateAdded Desc' THEN DateAdded END DESC,
            CASE WHEN @OrderBy = 'Title Desc' THEN Title END DESC,
            CASE WHEN @OrderBy = 'Forename Desc' THEN Forename END DESC,
            CASE WHEN @OrderBy = 'Surname Desc' THEN Surname END DESC,
            CASE WHEN @OrderBy = 'PayrollNo Desc' THEN PayrollNo END DESC,
            CASE WHEN @OrderBy = 'DateAdded' THEN DateAdded END,
            CASE WHEN @OrderBy = 'Title' THEN Title END,
            CASE WHEN @OrderBy = 'Forename' THEN Forename END,
            CASE WHEN @OrderBy = 'Surname' THEN Surname END,
            CASE WHEN @OrderBy = 'PayrollNo' THEN PayrollNo END        
        ) AS RowNum  
    FROM LMv_EmployeeDrafts  
    WHERE 
        DraftStatus = @DraftStatus
        AND Surname LIKE '%' + @Surname + '%'
        AND PayrollNo LIKE '%' + @PayrollNo + '%' 
) AS tmp  
WHERE 
    RowNum - 1 BETWEEN @StartRow AND @StartRow + @PageSize - 1 
    AND (@OrgLevelValueID is null OR @OrgLevelValueID 
        IN (SELECT OrgLvlID FROM RMR_GetPreOrgLvlValues(OrganisationLevelValueID)))

A couple of final caveats:

  • I’ve not checked the final query in the execution planner to make sure it’s efficient!
  • The surname and payroll filtering are being performed using a prefixed ‘%’ which means a table scan – pretty inefficient!

September 29, 2009

Generating a list of numbers in T-SQL

Filed under: SQL Server, T-SQL, Tip — philiphendry @ 8:33 am

I wanted to create a list of numbers so I could cross join them to another set of data to generate some test data quickly. There are ways to generate numbers using a CTE (Common Table Expression) or if you’re in a real hurry you can make us of a fairly undocumented table called master..spt_values. Seeing as this table isn’t well documented don’t go using it for production code since it might change but for quick one-off tasks it seems fine.

The following SQL creates a list of numbers from 0 to 2047 :

select [number]
from master..spt_values
where type = 'P'

September 21, 2009

Sproc Compile Locks

Filed under: MSDN Article Summary, SQL Server, Tip — philiphendry @ 7:28 pm

I came across an article on Microsofts support site which has raised a point which I’d never considered before. It basically says that if the user that executes a stored procedure is not the owner of the procedure then locks are acquired during the process to find a cached plan that may lead to blocking. It also states this could be avoided by fully qualifying the sproc name (e.g. dbo.mysproc) when calling it to avoid the extra lookups. This has been a problem for me in the past since the projects I’ve worked on had been setup with an account running as dbowner, however, if I was setting up a database from scratch I’d set up a specific user(s) for accessing the database and give it just the permissions it requires (sproc execute permissions for example.) I think, therefore, this would fall into these problem. However, I would also make use of the schema prefix (e.g. Sales.Order table) rather than leaving it as dbo and therefore avoid this problem altogether.

September 15, 2009

Importing from Excel into SQL Server

Filed under: SQL Server, T-SQL — philiphendry @ 10:54 am

I was going through a mountain of pain trying to get an import working from an Excel spreadsheet working into SQL Server 2005 but I finally managed it and here’s how…

Here’s the SQL I’m using (slightly abridged!) :

set transaction isolation level read committed

declare @sql nvarchar(1024)
set @sql = 'select * into ##Temp from opendatasource(''Microsoft.Jet.OLEDB.4.0'', ''Extended Properties=Excel 8.0;Data Source=' + @ExcelFileToImport + ''')...' + @ExcelWorkSheet
exec sp_executesql @sql

Points to note :

  • I’m using dynamic sql because the opendatasource() doesn’t like being passed variables… so be sure to validate the contents of @ExcelFileToImport and @ExcelWorkSheet.
  • Because I was forced to use dynamic sql I had to use a global temporary table. I wanted to load the data into a temporary table so that the columns were generated on-the-fly and I could check the correct columns were supplied by examining tempdb.sys.columns. It therefore had to be a global temporary because a standard one would have been thrown away after sp_executesql() had finished! For my situation the global temporary table is ok… but you’ll have to consider your own situation and whether concurrency allows.
  • I kept getting the error below on the subsequent re-tries of uploading but worked around this by adding ‘set transaction isolation level read committed’ in the sproc.

    The requested operation could not be performed because OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "(null)" does not support the required transaction interface.

I was having trouble executing within a transaction but with the changes made above the sql could be called with the following :

using (TransactionScope scope = new TransactionScope())
{
    UploadExcelIntoDatabase(importTicket, importFilename);
    scope.Complete();
}

September 11, 2009

Conditional Aggregate Totals in SQL

Filed under: SQL Server, T-SQL, Tip — philiphendry @ 8:03 am

I’ve been working on a SQL Server Reporting Services report which had to display several counts on each row. Each row represented a project and each project could have several tasks that were either ordinary tasks or milestones and could also be escalated. The report need to show something like this :

Project Name Milestone Counts Task Count Escalated Task Count
1st Week 2nd Week 3rd Week >=4th Week Total 1st Week 2nd Week 3rd Week >=4th Week Total

As you can see a task initially has to break down into two major colours of milestone and ordinary task counts but then is broken down into counts in the 1st, 2nd, 3rd and greater than 4th week totals including the total itself.

Initially this seemed tricky when thinking in terms of using count() but there’s an incredibly simple technique that solves this but which isn’t immediately obvious. The Sum() aggregate function can contain an expression and in this example that expression could compare the task date against late week dates and return either a 1 or 0 :

select ProjectName,
   sum(case when IsMilestone = 1 
      and TaskDate < @LateDate 
      and Taskdate >@OneWeekLateDate               
      then 1 else 0 end) as Week1LateMilestoneCount
from Project
group by ProjectName

I’ve only included one example there but I think it’s enough to get the idea.

September 2, 2009

Selecting a result set from a stored procedure in SQL Server

Filed under: SQL Server, T-SQL, Tip — philiphendry @ 10:55 am

Selecting data from a stored procedure in SQL Server is already a documented feature and here’s an example:

insert into #systables exec sp_executesql N'select * from Northwind.sys.tables'

…problem is this example doesn’t run without first creating the temp table and therefore knowing all the column definitions. When I’m running quick queries this isn’t exactly convenient. I’ve seen blogs posts using a linked server but there’s another way :

select * into #systables 
from openrowset(
   'sqlncli', 
   'server=.;trusted_connection=Yes', 
   'sp_executesql N''select * from Northwind.sys.tables'''
)

I wouldn’t necessarily use this as a day-to-day process on a production environment but for administration or scripting installations I think it fits the bill.

July 31, 2009

String Concatenation in T-SQL and Importing from Excel

Filed under: SQL Server, T-SQL — philiphendry @ 10:34 am

I keep forgetting how simple this little trick is so it’s time to blog it.

I wanted to write some T-SQL that would confirm that the columns just created by an import from an Excel spreadsheet into a temporary table contained the expected columns irrespective of order or other columns that exist in the file. I also wanted to be able to inform the user of the problem instead of just fail with a horrible error. I chose a fairly simple method, I would concatenate the columns from the temporary table (#tempImport in this case) and compare that with an expected string. If they don’t match I can report back to the user that the expected columns do not exist (in this implementation I’m not interested reporting which columns are missing.) The implementation I chose was this :

select * into #tempImport
from openrowset('Microsoft.Jet.OLEDB.4.0',
   'Excel 8.0;Database=e:\temp\aSpreadsheet.xls', Sheet1$)

declare @columns varchar(512)
set @columns = ''

select @columns = @columns + ',' + [name]
from tempdb.sys.columns
where [object_id] = object_id('tempdb..#tempImport')
    and [name] in ('Store','Store Name','PSA1','PSA2','Mtrg')
order by [name]

if (@columns <> ',Mtrg,PSA1,PSA2,Store,Store Name')
    print 'The expected columns in the spreadsheet do not exist'

This is a brute-force type of implementation but I like it because it’s very simple and keeps everything in one place. There will be better ways to implement this but for the moment this fits the bill.

Update:

http://support.microsoft.com/kb/326548 is a useful guide to using the ISAM provider.

Update (20/08/2010)

http://www.projectdmx.com/tsql/rowconcatenate.aspx discusses concatenating rows in quite some detail.

Older Posts »

Theme: Shocking Blue Green. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.