SQL: How to parse Microsoft Transact-SQL Statements in C# and to match the column aliases of a view with its related (dependent) tables.
This is a example on how to use the Microsoft T-SQL parser in C# which is part of the Feature Pack.
Microsoft® SQL Server® 2014 Feature Pack:
https://www.microsoft.com/en-us/download/details.aspx?id=42295
Microsoft® SQL Server® 2014 Transact-SQL Language Service:
The SQL Server Transact-SQL Language Service is a component based on the .NET
Framework which provides parsing validation and IntelliSense services for Transact-SQL for SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008, and SQL Server 2005. Filename: X86 and x64 Package(tsqllanguageservice.msi)
Add the following reference for 32 bit in your C# project:
C:\Program Files (x86)\Microsoft SQL Server\120\SDK\Assemblies\Microsoft.SqlServer.TransactSql.ScriptDom.dll
If you have a previous version of the language service, then the directory name will be slightly different; replace 120 with the installed version as below:
80 = SQL Server 2000
90 = SQL Server 2005
100 = SQL Server 2008
110 = SQL Server 2012
120 = SQL Server 2014
Documentation on MSDN:
https://msdn.microsoft.com/en-us/library/microsoft.sqlserver.transactsql.scriptdom.tsql120parser(v=sql.120).aspx
For each SQL server version there is a corresponding parser class following the same mapping as above: TSql80Parser, TSql90Parser,TSql100Parser, TSql110Parser, TSql120Parser.
In our example we will use the TSQL parser to parse a View statement in order to match its column aliases with the columns in the tables used by the view. This information is not available in the meta data of the sql server. A column from a view can be dependent on multiple columns and tables, but we only focus on the case where we have a one to one match.
Using the metadata available in sql database we can get information about the columns of a view and information about on what tables and columns the view depends on, but not a mapping between the two pieces of information.
This is our view:
CREATE VIEW [dbo].[View_1] AS SELECT AreaId = A.mcw_areaId, SurrogateKey = A.AreaKey, Code = S.statecode, Name = S.statename FROM CRM.dim_Area as A INNER JOIN CRM.dim_AreaState as S ON A.statecode = S.statecode
Using the following query we can get the dependencies for our view:
SELECT t.object_id as Table_ObjectID, c.column_id as Column_ObjectID, c.name as Column_Name, t.Name as Table_Name FROM sys.views v JOIN sys.sql_dependencies d ON d.object_id = v.object_id JOIN sys.objects t ON t.object_id = d.referenced_major_id JOIN sys.columns c ON c.object_id = d.referenced_major_id AND c.column_id = d.referenced_minor_id WHERE d.class < 2 AND v.name = 'View_1' AND v.schema_id = SCHEMA_ID('Dbo') ORDER BY COLUMN_ObjectID;
As alternative to the query above we can use Information schema as follows:
select * from INFORMATION_SCHEMA.VIEW_COLUMN_USAGE where VIEW_NAME='View_1' and VIEW_SCHEMA='Dbo'
However, a word of warning on using the Information_Schema from Microsoft:
“Do not use INFORMATION_SCHEMA views to determine the schema of an object. The only reliable way to find the schema of a object is to query the sys.objects catalog view.”
More information on this topic can be found here:
http://sqlblog.com/blogs/aaron_bertrand/archive/2011/11/03/the-case-against-information-schema-views.aspx
Using the following query we can get the list of column aliases of the view:
select object_id, column_id, name from sys.columns where object_id = object_id('Dbo.View_1');
As mentioned before, there is no way to link and match these two results using the metadata found in the sql database.
Therefore the only solution is to parse the sql statements of the view and construct ourselves the mapping:
The microsoft sql parser can be used as follows:
using Microsoft.SqlServer.TransactSql.ScriptDom; .... TSql120Parser SqlParser = new TSql120Parser(false); IList<ParseError> parseErrors; TSqlFragment result = SqlParser.Parse(new StringReader(SqlTextBox.Text), out parseErrors); TSqlScript SqlScript = result as TSqlScript; foreach (TSqlBatch sqlBatch in SqlScript.Batches) { foreach (TSqlStatement sqlStatement in sqlBatch.Statements) { ProcessViewStatementBody(sqlStatement); } }
When using the parser be prepared to use recursion a lot. Also when testing for a type, use an exact match as follows:
(sqlStatement.GetType() == typeof (AlterViewStatement)
Using * as placeholder for columns in the view is not supported in this parsing example, because this will require querying the sql database for metadata and this is outside the scope for this project. Another limitation of this example is that if a view is using another view, we do not recursively process that view to get to the base table and column.
The project source code can be downloaded here:
Andriescu SQL Wpf Parse Sql View