In the previous lessons, you learned how to insert data into a table and how to select data from one table and insert it into another. Now, let’s focus on how to retrieve data from a table.
Querying data in PostgreSQL involves using SQL to extract specific information from a database by selecting columns and rows from one or more tables, applying filters based on conditions, and potentially joining tables together.
The term “query” refers to the process of requesting specific rows from a database. To construct a query, we typically need to specify which columns we want to retrieve from which tables and apply conditions to filter the results.
For example, we might want to retrieve all users who haven’t interacted with our service in the last year, or retrieve data on products that exceed a certain sales threshold.
Therefore, the first step in constructing a query is to identify the data source we’re working with, followed by determining the specific columns contained in the relevant tables.
Querying a Table in PostgreSQL
To browse the names of columns within a table, we can use the \d
statement followed by the table name in the geeksveda2 database. For example, to browse the columns of the “books” table, we would use the following command.
\d books
This will display the table’s columns, their data types, and whether they allow null values or have default values. From this information, we can construct a query to retrieve specific data from the table.
For instance, suppose we want to retrieve the id, title, and price of all books in the “books” table. We can use the following SELECT statement syntax:
SELECT id, title, price FROM books;
This will return a table with 6 rows (assuming the “books” table has the same data as before), each containing the id, title, and price of a book. Note that PostgreSQL statements must end with a semicolon to indicate the end of the query.
However, when dealing with large tables, it’s often not practical to display all the data at once. In such cases, we can use filtering to limit the amount of data returned. For example, we might only want to see books with a price greater than $20:
SELECT id, title, price FROM books WHERE price > 20;
This will return a smaller table with only the books that meet the specified condition. By using filtering and other query techniques, we can retrieve exactly the data we need from a table.
Limit Querying Table Data in PostgreSQL
When retrieving data from a database, it’s important to consider the amount of data being fetched and limit it as necessary. In some cases, fetching a large number of rows can be time-consuming and inefficient.
For example, Facebook allows users to fetch a limited number of user data at a time, rather than fetching all the data in one query. This is also common practice on other websites, such as YouTube.
To limit the number of rows returned in a query, we can use the LIMIT
directive followed by the maximum number of rows to be displayed. For example, to limit the number of rows to 2, we can use the following query:
SELECT id, title, price FROM books LIMIT 2;
Note that this query may return duplicate rows with the same ID, which we will address in later lessons.
Limiting the number of rows returned can improve query performance and make it easier to manage large datasets.
Ordering the Data in PostgreSQL
When fetching data from a table, the default order of the lines is based on the order in which they were inserted into the table. However, we can order the results based on the values in one or more specific columns using the ORDER BY
clause. In Postgres, we can order the results in ascending (ASC) or descending (DESC) order for both text and numeric data.
To order the data based on a specific column, we add the ORDER BY
clause to the end of the query followed by the name of the column. For example, to order the books by their id in ascending order, we can use the following query:
SELECT id, title, price FROM books ORDER BY id ASC;
The ORDER BY
returns the results in ascending order by default. If we want to sort the results in descending order, we can add the DESC
keyword after the column name.
For example, to order the books by their id in descending order, we can use the following query:
SELECT id, title, price FROM books ORDER BY price ASC, title DESC;
We can also order the results based on multiple columns by specifying the column names separated by commas. For example, to order the books by price in ascending order, and then by title in descending order, we can use the following query:
SELECT id, title, price FROM books ORDER BY price ASC, title DESC;
This will first order the books by price in ascending order, and then order the books with the same price by title in descending order.
In addition to ordering the results, we can also limit the number of rows returned using the LIMIT
clause. For example, to get the top 5 books ordered by their id in descending order, we can use the following query:
SELECT id, title, price FROM books ORDER BY id DESC LIMIT 5;
This will return the top 5 books ordered by their id in descending order.
PostgreSQL Aggregate Functions
In PostgreSQL, an aggregate function is a type of SQL function that performs a calculation on a set of values and returns a single value. These functions are commonly used in queries that group data and produce summary reports.
PostgreSQL provides a wide range of built-in aggregate functions, including:
- The
SUM()
function computes the total value obtained by adding up all the numbers in a given set. - To find the average value of a group of numbers, you can use the
AVG()
function. - When you need to know how many records are in a table or query result, the
COUNT()
function can provide that information. MAX()
is a function that returns the maximum value in a set of numbers.MIN()
is a function that retrieves the minimum value in a group of numbers.
Aggregate functions can be used with the GROUP BY
keyword to group the result set by one or more columns. This allows us to produce summary reports that provide insights into our data.
For example, we can get the average prices of the books in the books table, the number of books present, or the largest or smallest price, as follows:
SELECT MAX(price),MIN(price),AVG(price),SUM(price),COUNT (price) FROM books;
Notice how the outputs now have more meaningful column names.
MIN
– The MIN function returns the minimum value in a column.MAX
– The MAX function returns the maximum value in a column.COUNT
– Returns the number of rows that contain a non-null value.
The following example is what we aim at:
SELECT COUNT(*), COUNT(title), COUNT(price), COUNT(created_at) FROM books;
Notice how calling COUNT
on the *
returns the total count of all rows in the table regardless of the column’s values. However, did you notice the column names in the output? They are all count, which is not very useful. We can use the AS
keyword to assign more meaningful column names to the outputs.
Rename Columns with an Alias
The AS
directive is used to rename columns in the output table, making it more suitable for the purpose of the query. In the previous example, we performed three COUNT
operations, but the output table had all columns named “count” without distinction.
To address this, we used the AS
directive to rename the columns, as shown below:
SELECT COUNT(*) AS TOTAL, COUNT(title) AS TOTAL_TITLES, COUNT(price) AS TOTAL_PRICE FROM books;
The output is now more readable and understandable. Note that the AS
directive is optional, and we can name the output tables without using it, like this:
SELECT COUNT(*) AS TOTAL, COUNT(title) TOTAL_TITLES, COUNT(price) TOTAL_PRICE FROM books;
The GROUP BY
clause is used to group rows with the same values in a particular column. It’s commonly used with aggregate functions to calculate summary statistics for each group.
For example, the following query groups the books by title and calculates the average price for each title:
SELECT title, AvG(price) FROM books GROUP BY title;
The HAVING
keyword is used to filter the results of a GROUP BY
query based on a condition. For instance, the following query groups the books by title and filters the results to only include books with an id greater than 2:
SELECT title, AVG(price) FROM books GROUP BY title HAVING AVG(id) > 2;
Overall, these features of SQL help us to organize and analyze data in a more structured and efficient manner.
Conclusion
In this lesson, we learned how to retrieve data from a table using SQL queries in PostgreSQL. We started by browsing the columns within a table and constructing a query to retrieve specific data from the table.
We then learned how to use filtering to limit the amount of data returned, how to limit the number of rows displayed, and how to order the data based on specific columns.
Finally, we learned about aggregate functions, which perform calculations on a set of values and return a single value, and how to use them to produce summary reports. By using these techniques, we can retrieve and analyze data in a more efficient and effective way.