LIKE in PostgreSQL explained with examples
Databases form the backbone of modern applications, serving as the central repository for data that powers everything from websites to mobile apps. A critical component of working with databases is the ability to efficiently search and retrieve relevant information, which becomes especially significant in massive datasets. One of the cornerstones of this retrieval process is pattern matching, an indispensable tool for querying data based on specific patterns or substrings.
Introduction
PostgreSQL, commonly known as Postgres, is a powerful, open-source relational database system. With more than 15 years of active development and a proven architecture, it has earned a strong reputation for reliability, data integrity, and correctness. When working with vast amounts of data in Postgres, the significance of effective querying and searching cannot be overstated. It allows developers and database administrators to sift through tons of information quickly, ensuring that applications remain responsive and users get the data they need.
What is the LIKE Operator?
The LIKE
operator in SQL is designed for pattern matching within strings. It is a way to search for a specified pattern in a column. Unlike other search methods in SQL that look for exact matches, LIKE
is used to find a match based on a specific pattern, offering a more flexible approach to querying.
Basics of LIKE Operator
When you’re dealing with the LIKE
operator, the process revolves around comparing a value to a specified pattern. Patterns are defined using a combination of regular characters and wildcard characters, which we’ll discuss shortly.
Wildcard Characters
In the context of the LIKE
operator in SQL, especially within PostgreSQL, there are two primary wildcard characters: %
and _
.
%: Zero or More Characters
The %
wildcard represents zero, one, or multiple characters. This means that when you use %
in your pattern, it can stand for no characters, a single character, or several characters. For instance, if you’re looking for any word starting with ‘a’, you’d use the pattern a%
.
_: Single Character
The _
wildcard, on the other hand, represents a single character. So, if you’re trying to find a five-letter word where the third letter is ‘r’, you would use the pattern __r__
.
Examples of LIKE Operator Usage
Basic Examples
Searching for a Specific Prefix
To find all entries that start with “pre”, the query would look something like:
SELECT column_name FROM table_name WHERE column_name LIKE 'pre%';
Searching for a Specific Suffix
If you’re searching for entries that end with “fix”, the appropriate query is:
SELECT column_name FROM table_name WHERE column_name LIKE '%fix';
Searching for a Pattern Anywhere in the String
To locate entries with the substring “mid” anywhere within them:
SELECT column_name FROM table_name WHERE column_name LIKE '%mid%';
Searching for a Specific Character at a Specific Position
For a string where the second character is “z”:
SELECT column_name FROM table_name WHERE column_name LIKE '_z%';
Advanced Examples
Combining Multiple Wildcards
To find entries that start with “a”, followed by any character, and then “c”:
SELECT column_name FROM table_name WHERE column_name LIKE 'a_c%';
Using LIKE with Other SQL Clauses
The LIKE
operator can seamlessly integrate with other SQL clauses. For instance, when combined with WHERE
and ORDER BY
:
SELECT column_name FROM table_name WHERE column_name LIKE '%pattern%' ORDER BY column_name DESC;
This would return results matching the pattern, ordered in descending order based on the specified column.
NOT LIKE
The NOT LIKE
operator in PostgreSQL provides a way to filter out records that do not match a certain pattern. It’s the inverse of the LIKE
operator. When you want to exclude rows based on a particular pattern, NOT LIKE
is your tool.
For instance, if you have a table of books and you want to find all titles that don’t start with ‘The’, you would use:
SELECT title FROM books WHERE title NOT LIKE 'The%';
The %
character acts as a wildcard, representing any number of characters.
ILIKE: Case-Insensitive Search
While LIKE
is case-sensitive, PostgreSQL offers a case-insensitive counterpart named ILIKE
. This means, using ILIKE
, the pattern ‘HELLO’ would match both ‘hello’ and ‘HELLO’.
This is particularly useful when the case of the input data is uncertain or when user input can be in varying cases. The main advantage is the simplicity it offers in such scenarios.
Examples for ILIKE
Consider a table of customers and you want to find all customers whose name contains ‘john’, regardless of case:
SELECT name FROM customers WHERE name ILIKE '%john%';
This would match ‘John’, ‘JOHN’, ‘john’, ‘JoHn’, and so forth.
Performance Considerations
As with any database operation, performance considerations are crucial when using pattern matching. Unoptimized pattern searches can lead to slow queries.
Importance of Indexes
Indexes are essential in improving the speed of search operations. By default, using LIKE
or ILIKE
will result in a full table scan unless the query can utilize an index. Proper indexing can turn a potentially slow operation into a quick one.
Trigram Indexes with pg_trgm
PostgreSQL offers the pg_trgm
extension, which provides support for trigram-based searches. A trigram is a set of three consecutive characters taken from a string. Using trigram indexes can speed up LIKE
and ILIKE
operations, especially when the search pattern has a leading wildcard.
To use this, you’d first need to enable the extension:
CREATE EXTENSION pg_trgm;
Then, create an index on your desired column:
CREATE INDEX idx_name_trgm ON customers USING gin(name gin_trgm_ops);
Potential Performance Pitfalls
- Leading wildcards (
%
) in patterns can prevent the use of indexes, making searches slower. - Overusing pattern matching where simpler string operations might suffice.
- Not considering the case-sensitivity of
LIKE
can lead to missed matches and might makeILIKE
a better choice.
Using LIKE with Arrays and JSONB
Searching Arrays with LIKE
PostgreSQL allows the use of arrays as a datatype. To search within arrays using LIKE
:
SELECT * FROM table_name WHERE ANY(array_column) LIKE 'pattern%';
Using LIKE with JSONB Fields
For JSONB
fields, you can use the jsonb_array_elements_text
function combined with LIKE
:
SELECT * FROM table_name WHERE jsonb_array_elements_text(jsonb_column) LIKE 'pattern%';
Alternatives to LIKE in PostgreSQL
Regular Expressions in PostgreSQL
PostgreSQL supports powerful regular expressions using operators like ~
(case-sensitive), ~*
(case-insensitive), !~
(case-sensitive not match), and !~*
(case-insensitive not match).
Example:
SELECT name FROM customers WHERE name ~* 'jo.hn';
SIMILAR TO Operator
SIMILAR TO
provides SQL:1999-standard regular expression matching. It’s another way to pattern-match, but be aware of the different syntax and capabilities compared to POSIX-style regex.
SELECT name FROM customers WHERE name SIMILAR TO 'Jo(n|h)n';
Other Extensions and Tools
PostgreSQL has several extensions like fuzzystrmatch
for more advanced pattern matching and string comparison functionalities.
Tips and Best Practices
Use Wildcards Appropriately
Wildcards (%
and _
) can be powerful, but use them judiciously. Too many can lead to slower searches.
Be Wary of Leading Wildcards
As mentioned earlier, leading wildcards can prevent the utilization of indexes, leading to slower performance.
Effective Use of Indexes
Always consider the potential of indexing columns that are frequently searched using pattern matching.
Text Normalization Considerations
Normalizing text (e.g., converting to lowercase) before storing and searching can often lead to more consistent and predictable results.
Limitations of LIKE
LIKE
is limited to simple pattern matching. For more complex patterns, regular expressions might be more suitable.LIKE
andILIKE
are not full-text search tools. PostgreSQL offers specialized tools for such needs.
Conclusion
Pattern matching, using tools like LIKE
and its variants, is a powerful feature in PostgreSQL. However, it’s essential to understand its nuances, performance implications, and best practices. Here at codedamn, we encourage continuous learning and optimization to get the best out of your database operations.
Sharing is caring
Did you like what Rishabh Rao wrote? Thank them for their work by sharing it on social media.