How to Validate SQL Query With Regular Expression?
Generally, a row of data will consist of email address, phone numbers, alpha/alphanumeric/numeric values, etc., Usually, front end validations are there to validate an email address/phone number, etc., In this article, we will see how it can be validated by using Regular expressions while writing SQL queries. Because always it is good to pass relevant data in the backend. Whenever there is an error in the data, Regular expression quickly validates that and informs the user whether is it a success or failure.
Step 1: Create a database
Query:
SELECT * FROM sys.databases WHERE name = 'GEEKSFORGEEKS'
BEGIN
CREATE DATABASE [GEEKSFORGEEKS]
END
Step 2: Create “GeekAuthors” table under the “GEEKSFORGEEKS” database
Query:
use GEEKSFORGEEKS
CREATE TABLE GeekAuthors (
ID INT IDENTITY NOT NULL PRIMARY KEY,
AuthorName VARCHAR(255) NOT NULL,
AuthorEmail VARCHAR(255) NOT NULL,
Age INT,
Skillsets VARCHAR(255),
NumberOfPosts INT
);
GO
As AuthorEmail is a varchar column, a wrongly invalid email address got entered into the table “GeekAuthors”. Our task is to find out the rows that got an invalid email address.
Step 3: To achieve this, let us put sample data in each row
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts)
VALUES ('Rachel','rachel@gmail.com',25,'Java,Python,.Net',5);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts)
VALUES ('Phoebe','phoebegmailcom',22,'Android,Python,.Net',15);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts)
VALUES ('Monica','monica@gmailcom',23,'IOS,GO,R',10);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts)
VALUES ('Joey','joey@.com',24,'Java,Python,GO',5);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts)
VALUES ('Chandler','chandler@gmail',23,'IOS,GO,R',10);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts)
VALUES ('Ross','ross@gmail.com',24,'Java,Python,GO',5);
Step 4: Now we validate the email address.
We need to validate whether the entered email address is correct or not. For that, we can easily achieve that
Query:
-- Get all email address
SELECT AuthorEmail FROM GeekAuthors
-- Get ValidEmail Address
SELECT AuthorEmail AS ValidEmail
FROM GeekAuthors
WHERE AuthorEmail LIKE '%_@__%.__%'
AND PATINDEX('%[^a-z,0-9,@,.,_,\-]%', AuthorEmail) = 0
GO
-- use NOT condition in the WHERE clause and select all the invalid emails as well.
SELECT AuthorEmail AS NotValidEmail
FROM GeekAuthors
WHERE NOT AuthorEmail LIKE '%_@__%.__%'
AND PATINDEX('%[^a-z,0-9,@,.,_,\-]%', AuthorEmail) = 0
GO
Output:
Step 5: For locating numeric value from the given value using Regular expression. It can be found out by using two ways
Our input expression may consist of alphanumeric values. In order to find out the position of the numeric value in the data, we can use the below format.
Query:
--pattern to check is: Not of A-Z (both A-Z and a-z(search pattern))
-- in the input string and
--finding the position of the non-matching pattern
-- As we are checking numeric values, it checks for numeric value position
--and it is displaying position of the character
Way 1 :
SELECT 'GeekPremierLeague2022' as ContestName,
PATINDEX('%[^A-Za-z]%', 'GeekPremierLeague2022') as
NumericCharacterPosition;
Way 2 :
SELECT 'GeekPremierLeague2022' as ContestName,
PATINDEX('%[0-9]%', 'GeekPremierLeague2022') as
NumericCharacterPosition;
Output:
In case numeric is not available, we will get 0 as the answer
Output:
Step 6: In a few scenarios, we may need to extract only alphabets (a-z) from the input string.
It can be possible by combining 2 functions namely PATINDEX and STUFF.
- A regular expression has to be applied on PATINDEX. Using the above query we will get the numeric position
- By using the STUFF function, we can replace all the numeric values.
Query:
-- At the 18th position, one character is replaced.
We need to remove numeric value.
-- Hence STUFF function is used
1st Param -- Input string
2nd Param -- Start location . As numeric is present
at 18th location, here it is given
3rd Param -- Number of characters to be replaced
4th Param - Replacing value
SELECT STUFF('GeekPremierLeague2022', 18, 1, '' );
-- One character is replaced at the index 18
-- If we want to remove all the numeric above
SELECT STUFF('GeekPremierLeague2022', 18, 4, '' );
-- Four characters are replaced starting from the index 18
Output:
Hence by applying both PATINDEX and STUFF, we can remove the numeric value in the given string:
Query:
-- Below code will produce only alpha text only
-- Scenario: A column is created in a table to accept
only alphabets. As there is no direct way available,
-- we can use this approach and extract the alphabets alone
DECLARE @inputData NVARCHAR(MAX) = 'GEEKS PREMIER LEAGUE 2022'
--input string
DECLARE @intPosition INT
--get the position of the integer from the input string
SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData)
-- We can use PATINDEX('%[0-9]%') also
print @intPosition
--run loop until no integer is found in the input string
WHILE @intPosition > 0
BEGIN
--remove the integer from that position
SET @inputData = STUFF(@inputData, @intPosition, 1, '' )
--PRINT @inputData
SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData )
--Again get the position of the next integer in the input string
--PRINT @intPosition
END
SELECT 'GEEKS PREMIER LEAGUE 2022' as InputString, @inputData AS AlphabetsOnly
Output:
Step 7: To get the numeric values alone suppose when the data feed is occurring that consists of employee name, phone number, address, and email id, we can use the below approach to retrieve the phone number (i.e. numeric value) in the below way.
Query:
DECLARE @inputString VARCHAR(MAX)
DECLARE @validchars VARCHAR(MAX)SET @inputString = 'GeekAuthor1 123456789 Address1 geek@gmail.com'
--We are going to take continuous set of numbers
and it should not have any spaces also in between
SET @validchars = '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
DECLARE @idx INT
SET @idx = PATINDEX('%'+ @validchars +'%',@inputString )
IF @idx > 0 AND
(@idx = LEN(@inputString )-8
OR PATINDEX(SUBSTRING(@inputString ,@idx+9,1),'[0-9]')=0)
SET @inputString =SUBSTRING(@inputString,
PATINDEX('%'+ @validchars +'%',@inputString ), 9)
ELSE SET @inputString = ''
SELECT @inputString as NumericAlone
Hence Regular expressions are used in many places for validating an email, address, validating alpha alone, or validating numeric alone.