Another Introduction to SQL
This presentation discusses querying a database. It is presented in the style of a workshop, and gives practice in SQL DML (Data Manipulation Language). It also includes exercises.
In this workshop, we will focus on retrieving data from an SQL database. That is, we will NOT go very deeply into the creation/modification of databases OR proper database design. However, we will briefly go over creating/modifying tables so that you have the ability to create something to query by the end of this workshop.
Databases are ubiquitous in digital business. Any website or internal business uses databases to neatly store huge amounts of data that would be unmanageable in a spreadsheet. SQL is a query language that lets us generates useful reports using data from databases, and it lets us do it quickly.
Knowing SQL is a valuable skill for anyone interested in working with data, and data is everywhere! The majority of businesses rely on databases to store their data. Whether it's for big business or just a small side project you are interested in creating, having databases and being able to query them using SQL is often highly valuable.
SQL has a great amount of depth, like any language. To look at the language in-depth would take many, many workshops. However, by the end of this workshop, you should be able to write simple database queries that will get the job done for most small apps you might want to create.
From here, you'll simply need to install database management software to create databases with.
If you wish to create an application that interacts with your database, you will need to learn specific libraries/APIs you can use to have your program interact with it (e.g. the java.sql package for Java or ADO.NET for C#).
What do I need to know before learning SQL?
Nothing. However, experience with a programming language such as C, C++, or Java will make understanding datatypes, comparisons, and functions easier.
What is SQL?
- Structured Query Language
- As a query language, it's almost completely focused on retrieving data and generating reports
- Used to access and manipulate databases
Databases store information in the form of entries (or rows) in tables.
For example, a table of customers might look something like this:
|1||John Smith||555 Apple Way||Orlando||32816||FL|
|2||Jane Doe||222 Orange Rd||Longwood||32779||FL|
|3||Smith Johnson||123 Peach Street||Chapel Hill||27514||NC|
Before we start executing any SQL code, we need a database to work with. Let's go ahead and take a look at our tables.
In another tab, open up W3Schools's Try it Yourself Editor. Here we see there a database we can play with. On the right side of the interface, we see the list of tables we have.
Execute each of the following commands by writing them in the "SQL Statement" textbox and then hitting the "Run SQL" button.
SELECT * FROM Customers
SELECT * FROM Orders
SELECT * FROM Products
Now that we have a database to play with, let's learn how to write some basic queries.
Querying a database
SELECT is used in conjunction with other keywords and clauses to retrieve and view information. You can think of the SELECT statements as a way of posing a question to a database. Many techniques used in writing SELECT statements are also used in creating UPDATE, INSERT, and DELETE statements.
Basic SELECT statements follow this format:
SELECT column, column, column FROM table WHERE column operator value
SELECT - required, used to specify the columns (separated by commas) you want in the result set of the query.
FROM - required, specifies the table(s) from which the columns are drawn from.
WHERE - optional, used to filter rows returned by the FROM clause. We'll learn more about this later once we've looked at data types and operators.
Simple SELECT examples
- Get the ID numbers of names of all customers.
SELECT CustomerID, CustomerName FROM Customers
- Get all information from the Customer table.
SELECT * FROM Customers
- Which products cost $50 or more?
SELECT ProductName, Price FROM Products WHERE Price >= 50
- Which customers reside in Buenos Aires?
SELECT * FROM Customers WHERE City = "Buenos Aires"
DISTINCT keyword removes duplicate rows returned by a query. Syntax:
SELECT DISTINCT <columns> FROM <tables>
Example: Get a list of distinct cities that the customers reside in
SELECT DISTINCT City FROM Customers
ORDER BY keyword allows you to sort information returned by a query. Syntax:
SELECT Statement ORDER BY *column* ASC|DESC
Example: Get a list of distinct cities in ascending order.
SELECT DISTINCT City FROM Customers ORDER BY City ASC
More with WHERE
Now that we know more about expressions, we can be more confident using the
You might remember this query we saw earlier.
SELECT ProductName, Price FROM Products WHERE Price >= 50
Just like in this example, using the
WHERE keyword to filter data is fairly intuitive in most cases, but we will still need to know our comparison operators. I'll throw in the arithmetic operators while we are at it.
Arithmetic and Comparison Operators
These are the standard operators. Note that each RDBMS will have more operators in addition to these.
|+, -, *, /||Standard Arithmetic|
|!=, \<>||Not equal|
|>=, !<||Greater than or equal to|
|<=, !>||Less than or equal to|
Notes on Comparison Operators
- Less Than and Greater Than operators can be applied to not only numbers but character strings and dates/times.
Essentially, we use arithmetic and comparison operators the same way we do in a language like
Java. At this point, you may be looking for a logical and operator && and a logical or operator ||. Well, we have those and quite a few more logical operators available to us in SQL.
Listed in arbitrary order:
||TRUE if both epressions are true||
||TRUE if either boolean expression is true||
||Reverses the value of any boolean operator||
||TRUE if operand is within specified range||
||TRUE if the operand matches a pattern||
||TRUE if the operand is equal to one of a list of expressions.||
||TRUE if subquery contains any rows|
||TRUE if a set of comparisons are TRUE|
||TRUE if one in a set of comparisons is TRUE|
AND to specify multiple search criteria.
Example: Select all customers from Berlin, Germany.
SELECT * FROM Customers WHERE Country = "Germany" AND City = "Berlin";
OR is used to retrieve all records to match at least one of two criteria.
Example: Select all customers from either the UK or USA.
SELECT * FROM Customers WHERE Country = "USA" OR Country = "UK"
NOT is used to negate an expression
Example: Select all customers that are NOT from the USA.
SELECT * FROM Customers WHERE NOT (Country = "USA")
Another solution is
SELECT * FROM Customers WHERE Country != "USA"
BETWEEN selects values within a range.
<column> BETWEEN <A> and <B> is essentially a shorthand of
<column> >= <A> AND <column> <B>
Example: Select all products with a price in between
SELECT * FROM Products WHERE Price BETWEEN 15 AND 20
Not that both bounds are included in the range.
LIKE is a little more complicated, especially if you haven't seen pattern matching before.
LIKE selects all values that match a specified pattern. Let's start with an example.
Example: Select all customers whose name begins with the letter
SELECT * FROM Customers WHERE CustomerName LIKE "A%"
% is a wildcard that essentially represents a substitute for zero or more characters.
Another example: Select all customers whose name ends with the letter
SELECT * FROM Customers WHERE CustomerName LIKE "%e"
IN selects all elements whose value is found within a provided set of values.
Example: Select all customers that reside in either the UK or the USA.
SELECT * FROM Customers WHERE Country IN ("UK", "USA")
EXISTS is a condition that is met when a sub-query returns at least one row.
EXISTS is a bit more complicated than the previous logical operators we've seen, but it is fairly powerful.
Example: Select all customers who have at least one order currently placed
SELECT * FROM Customers WHERE EXISTS (SELECT * FROM orders WHERE Customers.CustomerID = Orders.CustomerID)
This example is important because it shows off the idea of using a query within another query. The embedded query we see here is called a sub-query.
ALL allows you to compare one value to every value within a set.
Simple but admittedly useless example: Select all orders whose price is greater than 15, 30, and 45.
SELECT * FROM Order WHERE Price > ALL (15, 30, 45)
Example: Select all products that cost either 15, 20, or 25
SELECT * FROM Products WHERE PRICE = ANY (15, 20, 25)
Unfortunately, the Try It Editor does not appear to support the
Like in any programming language, SQL comes with built-in functions, of which there are many. Each DBMS also adds its own functions. Since there are so many functions, we will only cover ones that are used very frequently.
Some examples of scalar functions
CURDATE()returns the current time and date
ABS(num)returns the absolute value of
strwith all characters changed to uppercase
While there are many, many scalar functions, to learn about, you can simply look them up whenever need them. Aggregate functions are special and are used very often.
Unlike with scalar functions, we argue entire columns as parameters rather than a single value.
MIN(col)returns the smallest value found in
MAX(col)returns the largest value found in
SUM(col)returns the sum of all numeric values in
AVG(col)returns the average value of
COUNT(col)returns the total number of values in
Example: Find the average price the products.
SELECT AVG(Price) FROM Products
Example: How many products are there?
SELECT COUNT(ProductID) From Products
Example: What is the most expensive product?
SELECT MAX(Price) From Products
Let's try an example: Count how many customers there are in each country.
We might try something like this:
SELECT Country, COUNT(Country) FROM Customers
However, all we get back is
Which is the last country in the Customers table and the total number of customers (the
COUNT function counted the number of rows in the entire table).
Right now, we can't do very much with aggregate functions. In the way we've used them before, they calculate a value across all rows in a table. We need a way to group data in order to get any sort of subtotal.
GROUP BY groups a selected set of rows into a set of summary rows by the values of one or more columns. Let's try including a
GROUP BY statement in our statement.
SELECT Country, COUNT(Country) FROM Customers GROUP BY Country
We see that this works. Yay!
While there is so much more to SQL, we actually have all the basic tools necessary to retrieve information from a database. Try some exercises make sure you are understanding everything so far.
- Select all products with a price anywhere in between
30. Order them by price in descending order.
- What's the most expensive product available? What is it's price?
- What's the average number of products sold in any given order?
- Select all products that have a price greater than the average price of all products. Order them by price in descending order.
- Count the number of products that have a price greater than than the average price of all products.
Solutions (note that each problem has multiple valid solutions):
SELECT * FROM Products WHERE Price BETWEEN 15 AND 30 ORDER BY Price DESC
SELECT ProductName, MAX(Price) FROM Products
SELECT AVG(Quantity) FROM OrderDetails
SELECT * FROM Products WHERE Price > (SELECT AVG(Price) FROM Products) ORDER BY Price DESC
SELECT COUNT(*) FROM Products WHERE Price > (SELECT AVG(Price) FROM Products)
Working with multiple tables using joins
Right now, we don't have many ways to work with multiple tables at once. There are many problems we still can't solve, like
- Generate a table of orders containing the order ID and the name of the customer that made that order.
- Get all customers and any orders they might have.
In many situations, you will need to link multiple tables together to get the data you need. For example, each pair of things listed here would probably be stored using two tables:
- Customers and their orders
- Bowlers and the games they've played
- Students and the classes they took
The INNER JOIN
The most common type of JOIN is the
INNER JOIN selects all rows from both tables as long as there is a match between the specified columns in both tables.
INNER JOIN is the most commonly used type of join
SELECT column(s) FROM table1 INNER JOIN table2 ON table1.column = table2.column
Looking at the
Customers table and the
Orders table, we see that there is a link between them. That is, the
Products table has a
CustomerID column, which is also found in the
Customers table. In the
OrderID is the called the primary key, and
CustomerID is called a foreign key. Foreign keys are what establish the links between tables.
Example 1: Generate a table of
SELECT Orders.OrderID, Customers.CustomerName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Orders.CustomerID
Example 2: Generate a report orders consisting of their ID, the name of the customer that made the order, and the shipper of that order.
SELECT Orders.OrderID, Customers.CustomerName, Shippers.ShipperName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Orders.CustomerID INNER JOIN Shippers ON Orders.ShipperID = Shippers.ShipperID
Example 3: Do the same as in example 2, but include the
OrderDate. Also, only include orders placed before October 1996. Order by
OrderDate, with the earliest order being shown first. These customers have been waiting a long time.
SELECT Orders.OrderID, Customers.CustomerName, Shippers.ShipperName, Orders.OrderDate FROM Orders INNER JOIN Customers ON Orders.CustomerID = Orders.CustomerID INNER JOIN Shippers ON Orders.ShipperID = Shippers.ShipperID WHERE Orders.OrderDate < "1996-08-00" ORDER BY Orders.OrderDate ASC
Side note: NULL values
NULL value represents missing data. For example, if you insert a new customer into the
Customers table without specifying a name, that customer would show
NULL in the
CustomerName column (though in practice, you might not allow customers have to have a
We do have an operator to check if a value is
LEFT JOIN returns all rows from the left table (table1), even if there are no matches on the right table. If there is no match, the field(s) corresponding to the right table will be
Example: Generate a report of every customer and an order they have placed if any.
SELECT Customers.CustomerName, Orders.OrderID FROM Customers LEFT JOIN Orders ON Customers.CustomerID=Orders.CustomerID ORDER BY Customers.CustomerName;
RIGHT JOIN works in a similar way to
RIGHT JOIN returns all rows from the right table (table2), even if there are no matches on the left table.
FULL OUTER JOIN
FULL OUTER JOIN returns all rows from both the left and right table (essentially combining the result of both
LEFT JOIN and
Example: Generate a report of all customers and all orders.
SELECT Customers.CustomerName, Orders.OrderID FROM Customers FULL OUTER JOIN Orders ON Customers.CustomerID=Orders.CustomerID ORDER BY Customers.CustomerName;
UNION operator combines the result-set of two or more
Example: Get a list of all the distinct cities that either a resident lives in or a supplier is based in.
SELECT City FROM Customers UNION SELECT City FROM Suppliers ORDER BY City
At this point, we can write some very informing queries. Try these exercises to see what you know and help solidify what you have learned up to this point.
1. Count the number of suppliers in each country. Order results by the number of suppliers, and then by the name of country.
2. Get a list of orders shipped by Speedy Express.
3. Count the number of products in each category.
4. (Difficult) Select all categories, listed in order of which categories are the most ordered.
5. (Difficult) Select all products that a customer in the UK has ordered.
Try solving these on your own before looking at my solutions
SELECT Country, Count(Country) AS CountryCount FROM Suppliers GROUP BY Country ORDER BY CountryCount, Country
SELECT * FROM Orders INNER JOIN Shippers ON Orders.ShipperID = Shippers.ShipperID WHERE ShipperName = "Speedy Express"
SELECT Categories.CategoryName, COUNT(Products.ProductID) AS TotalCount FROM Categories INNER JOIN Products ON Categories.CategoryID = Products.CategoryID GROUP BY Categories.CategoryName ORDER BY TotalCount
SELECT Categories.CategoryName, COUNT(Products.ProductID) AS TotalCount FROM Categories INNER JOIN Products ON Categories.CategoryID = Products.CategoryID GROUP BY Categories.CategoryName ORDER BY TotalCount DESC
SELECT Products.ProductID, Products.ProductName, Customers.CustomerName, Customers.Country FROM Products INNER JOIN OrderDetails ON Products.ProductID = OrderDetails.ProductID INNER JOIN Orders On OrderDetails.OrderID = Orders.OrderID INNER JOIN Customers On Orders.CustomerID = Customers.CustomerID WHERE Customers.Country = "UK" ORDER BY Products.ProductName ASC
Before making our own tables, we'll need to learn more about datatypes.
Take a look at the data types available in most versions of SQL. A data type simply defines what kind of values a variable can take. For example, we don't want to be able to assign someone's birthdate to be the string "Andrew".
Typecasting using CAST
Normally, we can't add
"2". This would be like adding a number and a word, which doesn't make sense. If we could somehow convert the string
"2" to the number
2, we could add with
We use the
CAST function to change the data type of an expression.
CAST (*expression* AS *data type*)
Go ahead and try it out.
SELECT (1 + CAST("2" AS INTEGER));
One interesting thing to note is that we can add/subtract date/time values. However, how this is done varies between different database management systems (i.e. the software that manages the database).
Creating a Table
At this point, I will very briefly introduce the keywords used to insert, update, and delete records in a database. Before doing that, let's create a table of our very own.
CREATE TABLE t_name (col_name1data_type(size), col_name2data_type(size), ...)
Example: Create a table of students consisting of their name, their standing, their major, and their enrollment date.
CREATE TABLE Students ( StudentName varchar(30), AcademicStanding varchar(20), Major varchar(20), EnrollmentDate date )
Now let's add some students to the database using the
INSERT INTO statement.
INSERT INTO t_name VALUES (value1, value2, ...)
INSERT INTO t_name (column1, column2, ...) VALUES (value1,value2, ... )
INSERT INTO Students (StudentName, AcademicStanding, Major, EnrollmentDate) VALUES ("Andrew", "Junior", "Computer Science", "2014-08-00")
UPDATE is used to modify existing records in a table.
UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition
Example: Update Andrew's
AcademicStanding value to
UPDATE Students SET AcademicStanding = "Senior" WHERE StudentName = "Andrew"
DELETE is used to delete existing records in a table.
Example: Remove Andrew from the
DELETE FROM Students WHERE StudentName = "Andrew Kolos"
Database Management Systems
While SQL is an ANSI standard, there are many versions of it. While they all support major commands, they have their own unique features. Examples include
- Microsoft's SQL Server
- Oracle SQL
- IBM's DB2
Fortunately, pretty much everything we learned here is standard among all DBMSes.
Source: Andrew Kolos, http://techknights.org/workshops/sql/
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.