SQL Tips for Beginners

Here I’ve complied some simple SQL tips and queries to help new programmers or anyone that needs a refresher. This is not intended to be all incompassing but just contains the things I use reguarly when viewing and filtering data in SQL. All of this information is for Microsoft SQL Server Management Studio on a PC.

Select Database

About 60% of the time that I open SQL, I go to run a query and get the following error: “Invalid object name”. After about 30 seconds of panic in which I double check my query for any typos, I realize I forgot to add that essential line at the top of my script: “USE database”. In order to use any tables or stored procedures etc. in a query that are saved in a database, you need to let your SQL server know which database you’re using within the server. You only need to run that line of code once and then you can access all tables, stored procedures etc. in that database.

Running a Query

In Microsoft SQL, to run a SQL query, you can use the Excute button with the green arrow button above your query. If you just click the button, your entire query will run. If want to run certain lines of of your SQL query, you can select those lines under the shift key or highlighting them with your mouse then click execute. On your keyboard, the F5 button also works to execute your query.

Building a Query

SQL queries are essentially built on the use of five verbs: SELECT, FROM, WHERE, GROUP BY AND HAVING and are built in that order:

From

The “FROM” function identifies which table/view/object you want to pull data from. To reference a table/object outside of your database write “FROM SCHEMA_NAME.TABLE_NAME”.

Select

SELECT indicates what you want returned in the query. If a field or calculation is not listed in the SELECT statement, it will not appear in the result of the query. If you want to return all columns, use an asterick sign ("SELECT *") as a shortcut to listing out all the column names. In the following example, the query selects two fields from the table dbo.retail_sales and outputs two other columns that are the product of the fields outputted:

select  Quantity,
        UnitPrice,
        Quantity * UnitPrice,
        Quantity * UnitPrice as Revenue
from dbo.retail_sales
Table 1: Displaying records 1 - 10
QuantityUnitPriceRevenue
62.5515.3015.30
63.3920.3420.34
82.7522.0022.00
63.3920.3420.34
63.3920.3420.34
27.6515.3015.30
64.2525.5025.50
61.8511.1011.10
61.8511.1011.10
321.6954.0854.08

The final two fields outputted above are the exact same calculation but the final field just is renamed for a cleaner and easier to follow output. Next, let’s say you just want to glipse at a couple rows in your output because even here you can see this data goes on for almost 1,000 more rows. Tod do this, just add the word “TOP” after your SELECT statement to indicate you want to show the top X number of rows in your output (the default is 10). In the example below, the query outputs the first five rows of the table. If you delete the “5” after the word TOP,it would automatically default to returning the first ten rows of data.

SELECT TOP 5 * 
FROM dbo.retail_sales
Table 2: 5 records
InvoiceNoStockCodeDescriptionQuantityInvoiceDateUnitPriceCustomerIDCountry
53636585123AWHITE HANGING HEART T-LIGHT HOLDER62010-12-01 08:26:00.00000002.5517850United Kingdom
53636571053WHITE METAL LANTERN62010-12-01 08:26:00.00000003.3917850United Kingdom
53636584406BCREAM CUPID HEARTS COAT HANGER82010-12-01 08:26:00.00000002.7517850United Kingdom
53636584029GKNITTED UNION FLAG HOT WATER BOTTLE62010-12-01 08:26:00.00000003.3917850United Kingdom
53636584029ERED WOOLLY HOTTIE WHITE HEART.62010-12-01 08:26:00.00000003.3917850United Kingdom

You can also use the key word “DISTINCT” that will return the distinct records for the fields in your select statment (ex: SELECT DISTINCT * would return the distinct rows of all fields). This function works similarly to the remove duplicates functionality in Excel.

Where

Okay here is where things get fun y’all. The WHERE statement enables you to filter your data. Similar to filters in other programming languages, you can apply conditional filtering using “AND” and “OR”. Viewing data is great but filtering is really when you get to play around. The key to filtering in SQL (and honestly most other programming languages) is remembering PEMDAS and making sure your parentheses are not only placed in the right location but that you’re not missing any closed paranetheses. This is one of the most common mistakes I make when querying so if you’re getting an error make sure to double check that everything is in the right place and closed.

In the two examples below, you can see that the query intends to focus on two invoice numbers and filter to only show records where the unit price is less than 2.00. However, the second example does not use the parentheses and in its output, you can see two records where the unit price is greater than 2 because the calculation moves from left to right so the only filter on these two records is InvoiceNo = 537446.

SELECT top 5 * 
FROM dbo.retail_sales
where unitPrice < 2.00 and (invoiceno = '536365' OR InvoiceNo = '537446')
Table 3: 5 records
InvoiceNoStockCodeDescriptionQuantityInvoiceDateUnitPriceCustomerIDCountry
53744621135VICTORIAN METAL POSTCARD SPRING82010-12-07 09:13:00.00000001.6918055United Kingdom
53744621397BLUE POLKADOT EGG CUP122010-12-07 09:13:00.00000001.2518055United Kingdom
53744622948METAL DECORATION NAUGHTY CHILDREN242010-12-07 09:13:00.00000000.8518055United Kingdom
53744646000MPOLYESTER FILLER PAD 45x45cm22010-12-07 09:13:00.00000001.5518055United Kingdom
53744672741GRAND CHOCOLATECANDLE182010-12-07 09:13:00.00000001.4518055United Kingdom
SELECT top 5 * 
FROM dbo.retail_sales
where unitPrice < 2.00 and invoiceno = '536365' OR InvoiceNo = '537446'
Table 4: 5 records
InvoiceNoStockCodeDescriptionQuantityInvoiceDateUnitPriceCustomerIDCountry
53744620782CAMOUFLAGE EAR MUFF HEADPHONES42010-12-07 09:13:00.00000005.4918055United Kingdom
53744621135VICTORIAN METAL POSTCARD SPRING82010-12-07 09:13:00.00000001.6918055United Kingdom
53744621175GIN + TONIC DIET METAL SIGN122010-12-07 09:13:00.00000002.1018055United Kingdom
53744621397BLUE POLKADOT EGG CUP122010-12-07 09:13:00.00000001.2518055United Kingdom
53744621411GINGHAM HEART DOORSTOP RED32010-12-07 09:13:00.00000004.2518055United Kingdom

Group By

The group by statement allows you to group the data by at least one field and then output calculations based off that grouping. The most common calculations that I use regularly are max(), min() and count(). Min and max both return the smallest or largest value respectively of hte field provided by the fields grouped by. For example, in our sample data, grouping by the InvoiceNo and returning the max(UnitPrice) would return the largest unit price of an item sold in each invoiceNo. Let’s look at some examples:


SELECT InvoiceNo , InvoiceDate, max(unitprice) as highest_price
FROM dbo.retail_sales
GROUP BY InvoiceNo , InvoiceDate
Table 5: Displaying records 1 - 10
InvoiceNoInvoiceDatehighest_price
5615212011-07-27 17:06:00.00000004.65
5445822011-02-21 14:38:00.00000005.75
5694932011-10-04 13:35:00.000000012.50
C5514632011-04-28 16:13:00.000000010.40
5534662011-05-17 11:10:00.00000000.72
5610602011-07-24 13:30:00.00000004.15
5653282011-09-02 13:50:00.00000008.50
C5778152011-11-22 09:16:00.00000004.95
5518882011-05-05 09:29:00.000000012.75
C5713362011-10-17 11:43:00.00000003.75

A useful trick is using the count(*) or count(distinct FIELD_NAME) functions as the aggregate in a group by query to get the row count for each group:


SELECT InvoiceNo , count(distinct stockcode) as product_ct
FROM dbo.retail_sales
GROUP BY InvoiceNo 
Table 6: Displaying records 1 - 10
InvoiceNoproduct_ct
5363657
5363662
53636712
5363684
5363691
53637020
5363711
5363722
53637316
5363741

Having

The final verb, HAVING, is similar to a where clause in that it filters data, but HAVING is only utilized when you have a group by statement. If you have a grouped query and want to filter it, you can use the HAVING statement to filter the rows of your output.

For example, if we add on to our previous example where we calculated the number of distinct products in each invoice to only return invoices with more than 10 products we would do the following:


SELECT InvoiceNo ,  COUNT(DISTINCT stockcode) as product_ct
FROM dbo.retail_sales
GROUP BY InvoiceNo 
HAVING COUNT(DISTINCT stockcode) > 10
Table 7: Displaying records 1 - 10
InvoiceNoproduct_ct
53636712
53637020
53637316
53637516
53637819
53638134
53638212
53638413
53638814
53638914

This has hopfeully been a useful quick overview of how to build queries in SQL and how to get started. Next up, I’m going to write up how to do simple joins in SQL so you can link your tables together.

Next
Previous