Data Manipulation Language (DML for short) includes operations such as INSERT
, UPDATE
and DELETE
:
-- Create a table HelloWorld
CREATE TABLE HelloWorld (
Id INT IDENTITY,
Description VARCHAR(1000)
)
-- DML Operation INSERT, inserting a row into the table
INSERT INTO HelloWorld (Description) VALUES ('Hello World')
-- DML Operation SELECT, displaying the table
SELECT * FROM HelloWorld
-- Select a specific column from table
SELECT Description FROM HelloWorld
-- Display number of records in the table
SELECT Count(*) FROM HelloWorld
-- DML Operation UPDATE, updating a specific row in the table
UPDATE HelloWorld SET Description = 'Hello, World!' WHERE Id = 1
-- Selecting rows from the table (see how the Description has changed after the update?)
SELECT * FROM HelloWorld
-- DML Operation - DELETE, deleting a row from the table
DELETE FROM HelloWorld WHERE Id = 1
-- Selecting the table. See table content after DELETE operation
SELECT * FROM HelloWorld
In this script we're creating a table to demonstrate some basic queries.
The following examples are showing how to query tables:
USE Northwind;
GO
SELECT TOP 10 * FROM Customers
ORDER BY CompanyName
will select the first 10 records of the Customer
table, ordered by the column CompanyName
from the database Northwind
(which is one of Microsoft's sample databases, it can be downloaded from here):
Note that Use Northwind;
changes the default database for all subsequent queries.
You can still reference the database by using the fully qualified syntax in the form of [Database].[Schema].[Table]:
SELECT TOP 10 * FROM Northwind.dbo.Customers
ORDER BY CompanyName
SELECT TOP 10 * FROM Pubs.dbo.Authors
ORDER BY City
This is useful if you're querying data from different databases. Note that dbo
, specified "in between" is called a schema and needs to be specified while using the fully qualified syntax. You can think of it as a folder within your database. dbo
is the default schema. The default schema may be omitted. All other user defined schemas need to be specified.
If the database table contains columns which are named like reserved words, e.g. Date
, you need to enclose the column name in brackets, like this:
-- descending order
SELECT TOP 10 [Date] FROM dbo.MyLogTable
ORDER BY [Date] DESC
The same applies if the column name contains spaces in its name (which is not recommended). An alternative syntax is to use double quotes instead of square brackets, e.g.:
-- descending order
SELECT top 10 "Date" from dbo.MyLogTable
order by "Date" desc
is equivalent but not so commonly used. Notice the difference between double quotes and single quotes: Single quotes are used for strings, i.e.
-- descending order
SELECT top 10 "Date" from dbo.MyLogTable
where UserId='johndoe'
order by "Date" desc
is a valid syntax. Notice that T-SQL has a N
prefix for NChar and NVarchar data types, e.g.
SELECT TOP 10 * FROM Northwind.dbo.Customers
WHERE CompanyName LIKE N'AL%'
ORDER BY CompanyName
returns all companies having a company name starting with AL
(%
is a wild card, use it as you would use the asterisk in a DOS command line, e.g. DIR AL*
). For LIKE
, there are a couple of wildcards available, look here to find out more details.
Joins are useful if you want to query fields which don't exist in one single table, but in multiple tables. For example: You want to query all columns from the Region
table in the Northwind
database. But you notice that you require also the RegionDescription
, which is stored in a different table, Region
. However, there is a common key, RgionID
which you can use to combine this information in a single query as follows (Top 5
just returns the first 5 rows, omit it to get all rows):
SELECT TOP 5 Territories.*,
Regions.RegionDescription
FROM Territories
INNER JOIN Region
ON Territories.RegionID=Region.RegionID
ORDER BY TerritoryDescription
will show all columns from Territories
plus the RegionDescription
column from Region
. The result is ordered by TerritoryDescription
.
When your query requires a reference to two or more tables, you may find it useful to use a Table Alias. Table aliases are shorthand references to tables that can be used in place of a full table name, and can reduce typing and editing. The syntax for using an alias is:
<TableName> [as] <alias>
Where as
is an optional keyword. For example, the previous query can be rewritten as:
SELECT TOP 5 t.*,
r.RegionDescription
FROM Territories t
INNER JOIN Region r
ON t.RegionID = r.RegionID
ORDER BY TerritoryDescription
Aliases must be unique for all tables in a query, even if you use the same table twice. For example, if your Employee table included a SupervisorId field, you can use this query to return an employee and his supervisor's name:
SELECT e.*,
s.Name as SupervisorName -- Rename the field for output
FROM Employee e
INNER JOIN Employee s
ON e.SupervisorId = s.EmployeeId
WHERE e.EmployeeId = 111
As we have seen before, a Join adds columns from different table sources. But what if you want to combine rows from different sources? In this case you can use a UNION. Suppose you're planning a party and want to invite not only employees but also the customers. Then you could run this query to do it:
SELECT FirstName+' '+LastName as ContactName, Address, City FROM Employees
UNION
SELECT ContactName, Address, City FROM Customers
It will return names, addresses and cities from the employees and customers in one single table. Note that duplicate rows (if there should be any) are automatically eliminated (if you don't want this, use a UNION ALL
instead). The column number, column names, order and data type must match across all the select statements that are part of the union - this is why the first SELECT combines FirstName
and LastName
from Employee into ContactName
.
It can be useful, if you need to deal with temporary data (especially in a stored procedure), to use table variables: The difference between a "real" table and a table variable is that it just exists in memory for temporary processing.
Example:
DECLARE @Region TABLE
(
RegionID int,
RegionDescription NChar(50)
)
creates a table in memory. In this case the @
prefix is mandatory because it is a variable. You can perform all DML operations mentioned above to insert, delete and select rows, e.g.
INSERT INTO @Region values(3,'Northern')
INSERT INTO @Region values(4,'Southern')
But normally, you would populate it based on a real table like
INSERT INTO @Region
SELECT * FROM dbo.Region WHERE RegionID>2;
which would read the filtered values from the real table dbo.Region
and insert it into the memory table @Region
- where it can be used for further processing. For example, you could use it in a join like
SELECT * FROM Territories t
JOIN @Region r on t.RegionID=r.RegionID
which would in this case return all Northern
and Southern
territories.
More detailed information can be found here. Temporary tables are discussed here, if you are interested to read more about that topic.
NOTE: Microsoft only recommends the use of table variables if the number of rows of data in the table variable are less than 100. If you will be working with larger amounts of data, use a temporary table, or temp table, instead.