Шаблон для номера телефона sql

Here are some examples of formatting phone numbers in SQL Server.

This includes examples of formatting numbers in E.164 format (for international numbers), prepending the country code and area code, as well as omitting leading zero from the country code when required.

Numeric Phone Numbers

If the phone number is stored as a numeric value (which it shouldn’t be), you can use the FORMAT() function to format it as phone number.

Example:

SELECT FORMAT(0234567890, '000-000-0000');

Result:

023-456-7890

The first argument is the phone number and the second argument is the format string. In this example I’m using a custom format string. You can adjust the format string to suit the desired phone number format:

SELECT FORMAT(0234567890, '(000) 000-0000');

Result:

(023) 456-7890

It’s important to know what the format strings actually do. When using zeros, you’ll need to make sure that the phone number actually has digits in every place where there’s a zero format specifier (otherwise you might accidentally add zeros to the number).

You’ll also need to ensure that there’s a format specifier to match each digit (otherwise you’ll delete digits from the phone number).

Another way to express the format string is with the # format specifier. However, this will result in any leading zeros being removed from the phone number.

Here’s an example to illustrate what I mean:

SELECT 
    FORMAT(0234567890, '000-000-0000') AS "000-000-0000",
    FORMAT(0234567890, '###-###-####') AS "###-###-####";

Result:

+----------------+----------------+
| 000-000-0000   | ###-###-####   |
|----------------+----------------|
| 023-456-7890   | 23-456-7890    |
+----------------+----------------+

The FORMAT() function only accepts numeric types and datetime values. If the input phone number isn’t actually a numeric type, then you’ll probably get an error, something like this:

SELECT FORMAT('0234567890', '000-000-0000');

Result:

Msg 8116, Level 16, State 1, Line 1
Argument data type varchar is invalid for argument 1 of format function.

In such cases, it’s easy enough to convert the value to a numeric type:

SELECT FORMAT(CAST('0234567890' AS int), '000-000-0000');

Result:

023-456-7890

But phone numbers shouldn’t be stored as numeric types anyway.

Numeric values can be rounded up or down, have calculations performed on them, have insignificant zeros automatically removed, etc.

Phone numbers are a fixed value. Every digit is significant (including leading zeros). We don’t want leading zeros to go missing unless we explicitly require this (for a country code for example). And we don’t want our phone numbers to be inadvertently rounded up or down. And it’s unlikely you’ll ever need to perform calculations on your phone numbers.

So it therefore makes more sense to store phone numbers as a string. Converting them to a numeric type before formatting them (like in the above example) could still result in unexpected changes being done to the number.

If the phone number is already a string, then try the following method.

Phone Numbers Stored as Strings

If the phone number is stored as a string, you can use the STUFF() function to insert the appropriate strings into the phone number at the relevant places.

Examples:

SELECT
    STUFF(STUFF('0234567890', 7, 0, '-'), 4, 0, '-') AS "Format 1",
    STUFF(STUFF(STUFF('0234567890', 7, 0, '-'), 4, 0, ') '), 1, 0, '(') AS "Format 2";

Result:

+--------------+----------------+
| Format 1     | Format 2       |
|--------------+----------------|
| 023-456-7890 | (023) 456-7890 |
+--------------+----------------+

The first argument is the original string (in this case, the phone number), and the fourth argument is the string to insert. The second argument specifies where to insert the fourth argument.

The third argument specifies how many characters to delete from the original string (in case you want to replace certain characters with the new string). In our case, we don’t want to delete any characters, and so we use 0.

Depending on the format of the original phone number, another way to do it is to use the REPLACE() function. An example of where this could be useful is when the phone number is already formatted with a separator, but it needs to be replaced with another separator:

SELECT REPLACE('023 456 7890', ' ', '-');

Result:

023-456-7890

International Numbers

E.164 is an international standard that defines the format for international telephone numbers.

E.164 numbers are formatted [+][country code][area code][local phone number] and can have a maximum of fifteen digits.

Here’s an example that uses two methods to concatenate the country code, the area code, and the phone number:

SELECT 
    CONCAT('+', '1', '415', '4567890') AS 'CONCAT() Function',
    '+' + '1' + '415' + '4567890' AS 'Concatenation Operator';

Result:

+---------------------+--------------------------+
| CONCAT() Function   | Concatenation Operator   |
|---------------------+--------------------------|
| +14154567890        | +14154567890             |
+---------------------+--------------------------+

The first method uses the CONCAT() function, and the second uses the concatenation operator (+).

That example formats a US based number. In many countries, the area code has a leading zero that needs to be dropped when using the E.164 format.

One way to suppress any leading zeros is to convert the area code to a numeric value and back again.

Here’s an example of using that technique on a UK based number:

SELECT CONCAT(
        '+', 
        '44', 
        CAST(CAST('020' AS int) AS varchar(3)), 
        '34567890'
        );

Result:

+442034567890

In this case the leading zero was dropped.

Here’s the same code run against the previous US based number (which doesn’t use a leading zero in the country code):

SELECT CONCAT(
        '+', 
        '1', 
        CAST(CAST('415' AS int) AS varchar(3)), 
        '4567890'
        );

Result:

+14154567890

This time the country code remained at three digits.

I have table in the database with a phone number column. The numbers look like this:

123456789

I want to format that to look like this:

123-456-789

the Tin Man's user avatar

the Tin Man

158k41 gold badges214 silver badges301 bronze badges

asked Sep 15, 2009 at 11:02

avnic's user avatar

2

This should do it:

UPDATE TheTable
SET PhoneNumber = SUBSTRING(PhoneNumber, 1, 3) + '-' + 
                  SUBSTRING(PhoneNumber, 4, 3) + '-' + 
                  SUBSTRING(PhoneNumber, 7, 4)

Incorporated Kane’s suggestion, you can compute the phone number’s formatting at runtime. One possible approach would be to use scalar functions for this purpose (works in SQL Server):

CREATE FUNCTION FormatPhoneNumber(@phoneNumber VARCHAR(10))
RETURNS VARCHAR(12)
BEGIN
    RETURN SUBSTRING(@phoneNumber, 1, 3) + '-' + 
           SUBSTRING(@phoneNumber, 4, 3) + '-' + 
           SUBSTRING(@phoneNumber, 7, 4)
END

answered Sep 15, 2009 at 11:11

David Andres's user avatar

David AndresDavid Andres

31.2k7 gold badges45 silver badges36 bronze badges

3

I’d generally recommend you leave the formatting up to your front-end code and just return the data as-is from SQL. However, to do it in SQL, I’d recommend you create a user-defined function to format it. Something like this:

CREATE FUNCTION [dbo].[fnFormatPhoneNumber](@PhoneNo VARCHAR(20))
RETURNS VARCHAR(25)
AS
BEGIN
DECLARE @Formatted VARCHAR(25)

IF (LEN(@PhoneNo) <> 10)
    SET @Formatted = @PhoneNo
ELSE
    SET @Formatted = LEFT(@PhoneNo, 3) + '-' + SUBSTRING(@PhoneNo, 4, 3) + '-' + SUBSTRING(@PhoneNo, 7, 4)

RETURN @Formatted
END
GO

Which you can then use like this:

SELECT [dbo].[fnFormatPhoneNumber](PhoneNumber) AS PhoneNumber
FROM SomeTable

It has a safeguard in, in case the phone number stored isn’t the expected number of digits long, is blank, null etc — it won’t error.

EDIT: Just clocked on you want to update your existing data. The main bit that’s relevant from my answer then is that you need to protect against «dodgy»/incomplete data (i.e. what if some existing values are only 5 characters long)

Brian Dishaw's user avatar

answered Sep 15, 2009 at 11:13

AdaTheDev's user avatar

AdaTheDevAdaTheDev

141k28 gold badges204 silver badges198 bronze badges

1

Above users mentioned, those solutions are very basic and they won’t work if the database has different phone formats like:

(123)123-4564
123-456-4564
1234567989
etc

Here is a more complex solution that will work with ANY input given:

CREATE FUNCTION [dbo].[ufn_FormatPhone] (@PhoneNumber VARCHAR(32))
RETURNS VARCHAR(32)
AS
BEGIN
    DECLARE @Phone CHAR(32)

    SET @Phone = @PhoneNumber

    -- cleanse phone number string
    WHILE PATINDEX('%[^0-9]%', @PhoneNumber) > 0
        SET @PhoneNumber = REPLACE(@PhoneNumber, SUBSTRING(@PhoneNumber, PATINDEX('%[^0-9]%', @PhoneNumber), 1), '')

    -- skip foreign phones
    IF (
            SUBSTRING(@PhoneNumber, 1, 1) = '1'
            OR SUBSTRING(@PhoneNumber, 1, 1) = '+'
            OR SUBSTRING(@PhoneNumber, 1, 1) = '0'
            )
        AND LEN(@PhoneNumber) > 11
        RETURN @Phone

    -- build US standard phone number
    SET @Phone = @PhoneNumber
    SET @PhoneNumber = '(' + SUBSTRING(@PhoneNumber, 1, 3) + ') ' + SUBSTRING(@PhoneNumber, 4, 3) + '-' + SUBSTRING(@PhoneNumber, 7, 4)

    IF LEN(@Phone) - 10 > 1
        SET @PhoneNumber = @PhoneNumber + ' X' + SUBSTRING(@Phone, 11, LEN(@Phone) - 10)

    RETURN @PhoneNumber
END

Suraj Kumar's user avatar

Suraj Kumar

5,5398 gold badges21 silver badges41 bronze badges

answered Jul 12, 2014 at 19:48

Hiram's user avatar

HiramHiram

2,6391 gold badge16 silver badges13 bronze badges

I do not recommend keeping bad data in the database and then only correcting it on the output. We have a database where phone numbers are entered in variously as :

  • (555) 555-5555
  • 555+555+5555
  • 555.555.5555
  • (555)555-5555
  • 5555555555

Different people in an organization may write various retrieval functions and updates to the database, and therefore it would be harder to set in place formatting and retrieval rules. I am therefore correcting the data in the database first and foremost and then setting in place rules and form validations that protect the integrity of this database going forward.

I see no justification for keeping bad data unless as suggested a duplicate column be added with corrected formatting and the original data kept around for redundancy and reference, and YES I consider badly formatted data as BAD data.

j0k's user avatar

j0k

22.5k28 gold badges80 silver badges89 bronze badges

answered Oct 25, 2012 at 17:44

Daniel Byrne's user avatar

Daniel ByrneDaniel Byrne

1,39114 silver badges12 bronze badges

2

Solutions that use SUBSTRING and concatenation + are nearly independent of RDBMS. Here is a short solution that is specific to SQL Server:

declare @x int = 123456789
select stuff(stuff(@x, 4, 0, '-'), 8, 0, '-')

answered Dec 18, 2015 at 13:13

Sergey Kalinichenko's user avatar

You can also try this:

CREATE  function [dbo].[fn_FormatPhone](@Phone varchar(30)) 
returns varchar(30)
As
Begin
declare @FormattedPhone varchar(30)

set     @Phone = replace(@Phone, '.', '-') --alot of entries use periods instead of dashes
set @FormattedPhone =
    Case
      When isNumeric(@Phone) = 1 Then
        case
          when len(@Phone) = 10 then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 4, 3)+ '-' +substring(@Phone, 7, 4)
          when len(@Phone) = 7  then substring(@Phone, 1, 3)+ '-' +substring(@Phone, 4, 4)
          else @Phone
        end
      When @phone like '[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9][0-9]' Then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 5, 3)+ '-' +substring(@Phone, 8, 4)
      When @phone like '[0-9][0-9][0-9] [0-9][0-9][0-9] [0-9][0-9][0-9][0-9]' Then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 5, 3)+ '-' +substring(@Phone, 9, 4)
      When @phone like '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]' Then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 5, 3)+ '-' +substring(@Phone, 9, 4)
      Else @Phone
    End
return  @FormattedPhone

end

use on it select

(SELECT [dbo].[fn_FormatPhone](f.coffphone)) as 'Phone'

Output will be

enter image description here

answered Mar 15, 2016 at 8:24

Mohammad Atiour Islam's user avatar

You Can Use FORMAT if you column is a number Syntax like
FORMAT ( value, format [, culture ] ) In use like
FORMAT ( @d, 'D', 'en-US' ) or FORMAT(123456789,'###-##-####') (But This works for only SQL SERVER 2012 And After)

In Use Like
UPDATE TABLE_NAME SET COLUMN_NAME = FORMAT(COLUMN_NAME ,'###-##-####')

And

if your column is Varchar Or Nvarchar use do like this
CONCAT(SUBSTRING(CELLPHONE,0,4),' ',SUBSTRING(CELLPHONE,4,3),' ',SUBSTRING(CELLPHONE,7,2) ,' ',SUBSTRING(CELLPHONE,9,2) )

You can always get help from

https://msdn.microsoft.com/en-us/library/hh213505.aspx

answered Jan 7, 2017 at 19:16

Toprak's user avatar

ToprakToprak

2831 gold badge4 silver badges11 bronze badges

2

I found that this works if wanting in a (123) — 456-7890 format.

UPDATE table 
SET Phone_number =  '(' +  
                    SUBSTRING(Phone_number, 1, 3) 
                    + ') ' 
                    + '- ' +
                    SUBSTRING(Phone_number, 4, 3) 
                    + '-' +
                    SUBSTRING(Phone_number, 7, 4) 

Rizier123's user avatar

Rizier123

58.6k16 gold badges97 silver badges154 bronze badges

answered Feb 9, 2015 at 22:53

nation161r's user avatar

nation161rnation161r

511 gold badge3 silver badges14 bronze badges

Updated @sqiller’s function for my purposes

CREATE FUNCTION [toolbox].[FormatPhoneNumber] (
    @PhoneNumber VARCHAR(50),
    @DefaultIfUnknown VARCHAR(50)
)
RETURNS VARCHAR(50)
AS
BEGIN
    -- remove any extension
    IF CHARINDEX('x', @PhoneNumber, 1) > 0
        SET @PhoneNumber = SUBSTRING(@PhoneNumber, 1, CHARINDEX('x', @PhoneNumber, 1) - 1)

    -- cleanse phone number string
    WHILE PATINDEX('%[^0-9]%',@PhoneNumber) > 0
        SET @PhoneNumber = REPLACE(@PhoneNumber,
                SUBSTRING(@PhoneNumber,PATINDEX('%[^0-9]%',@PhoneNumber),1),'')

    -- Remove US international code if exists, i.e. 12345678900
    IF SUBSTRING(@PhoneNumber,1,1) = '1' AND LEN(@PhoneNumber) = 11
        SET @PhoneNumber = SUBSTRING(@PhoneNumber, 2, 10)

    -- any phone numbers without 10 characters are set to default
    IF LEN(@PhoneNumber) <> 10
        RETURN @DefaultIfUnknown

    -- build US standard phone number
    SET @PhoneNumber = '(' + SUBSTRING(@PhoneNumber,1,3) + ') ' +
                SUBSTRING(@PhoneNumber,4,3) + '-' + SUBSTRING(@PhoneNumber,7,4)

    RETURN @PhoneNumber
END

answered Nov 10, 2015 at 21:11

Josh Jay's user avatar

Josh JayJosh Jay

1,2402 gold badges15 silver badges26 bronze badges

If you want to just format the output no need to create a new table or a function. In this scenario the area code was on a separate fields. I use field1, field2 just to illustrate you can select other fields in the same query:

area  phone
213   8962102

Select statement:

Select field1, field2,areacode,phone,SUBSTR(tablename.areacode,1,3) + '-' + SUBSTR(tablename.phone,1,3) + '-' + SUBSTR(tablename.areacode,4,4) as Formatted Phone from tablename

Sample OUTPUT:

columns: FIELD1, FIELD2, AREA, PHONE, FORMATTED PHONE
data:    Field1, Field2, 213,  8962102,  213-896-2102

DᴀʀᴛʜVᴀᴅᴇʀ's user avatar

DᴀʀᴛʜVᴀᴅᴇʀ

7,39116 gold badges70 silver badges118 bronze badges

answered Aug 24, 2016 at 18:13

Noe's user avatar

When dealing with SQL databases, users will have to interact with different data types and schemas depending on the project’s requirements. Since databases are used in many different applications and support multiple platforms, devices, and users, there may be differences in how data is entered, even in a simple field like a phone number.

For example, one set of users will enter the phone number with the country code, while the others may simply ignore the country code. Therefore, it is essential for database users to know how to clean data in a SQL database. In this post, we will see how to utilize SQL string operations on phone number fields to properly clean, format, and extract data.

Why Do We Care About Data Cleaning?

Data comes in all forms and sizes. Most of the time, it will be raw data, without proper formatting. Data cleaning allows users to filter, sort, and format this raw data to extract meaningful information from it. Data cleaning also makes data analysis a much more efficient process as an additional benefit.

Let’s assume we have a text field that captures user information and asks users to enter valuable information like phone numbers. We need to clean this text field to make the phone numbers usable. How would we go about doing that? To begin with, we need to understand SQL string operations.

What are SQL String Operations?

String operations are functions that can be used to manipulate strings. These string operators (also called string functions) take a string as input, process it according to the specified function, and return a string as the output. The availability of these functions will differ depending on the database. For example, the following is a list of commonly used string functions in MySQL:

Common Types of String Operations

Function Usage
LEFT/RIGHT Extract a specified number of characters from a string (starting from left or right)
TRIM Remove leading or trailing spaces
POSITION / STRPOS Analogous to the Location function, where a numerical value of the position of the specified substring is provided
SUBSTR Extract the specified substring from a specific location to match the specified length
CONCAT Combine two or more strings to make a single string
SUBSTR Extract the specified substring from a specific location to match the specified length
COALESCE Replace null values with a specified value
FORMAT Formats the value to the specified format
REGEXP Matches string to a regular expression
REGEXP_SUBSTR Extracts value for the matching regular expression
STR Convert numerical value to string
REVERSE Reverses the specified string
REPLACE Replaces the specified string with the given value

The full list of available string functions and operators is available in the official documentation.

How to use String Operations with Phone Numbers

Now we have a basic understanding of string operations and related functions in SQL! So, let’s look at how exactly we can utilize some of these functions on phone numbers to extract and format data. For all the examples provided here, we will be using the Arctype SQL client with a MySQL database in a Windows environment with the following dataset.

If you don’t have Arctype installed, you can use the link below to download it and follow along:

The fast and easy-to-use
SQL client for developers and teams

Properly Formatting a Phone Number

We can utilize the SUBSTR command to format phone numbers. It will extract parts of the number and use the CONCAT to join the specific parts together. In the following example, we have broken down the phone number into three distinct sections and combined them together with the formatting as a new field called num_formatted:

SELECT
    phone_num,
    CONCAT(
        '(',
        SUBSTR(phone_num, 1, 3),
        ') ',
        SUBSTR(phone_num, 4, 3),
        '-',
        SUBSTR(phone_num, 7)
    ) AS num_formatted
FROM
    EmpDetails
Formatting a phone number in SQL.

The result of the operation is show below:

Cleaning phone numbers - output of example.

Cleaning phone numbers in Arctype.

You can create a new column to save the formatted number using the ALTER TABLE command. It will create a new column and update it with the formatted numbers.

ALTER TABLE EmpDetails
ADD formatted_number VARCHAR(255);
    
UPDATE
    EmpDetails
SET
    formatted_number = (
        SELECT
            CONCAT(
                '(',
                SUBSTR(phone_num, 1, 3),
                ') ',
                SUBSTR(phone_num, 4, 3),
                '-',
                SUBSTR(phone_num, 7)
            ) AS num_formatted
    )
Creating a new column to save the formatted data.

The result of the above operation is shown below:

Screeshot of a query in Arctype.

Updating the query to save the formatted data in a new column using Arctype.
Screenshot of data from a query to convert phone numbers to formatted data.
Resulting data from the above query.

We can modify our SQL command to include the REPLACE function if there are additional characters in the phone number field, such as dashes and brackets. This function can be used to remove unnecessary characters and whitespaces in a string. Now let’s further format the formmated_number field to include the country code.

SELECT
    formatted_number,
    CONCAT(
        '+1 '
        '(',
        SUBSTR(num_cleansed, 1, 3),
        ') ',
        SUBSTR(num_cleansed, 5, 3),
        '-',
        SUBSTR(num_cleansed, 7)
    ) AS num_formatted
FROM
    (
        SELECT
            formatted_number,
            REPLACE(REPLACE(REPLACE(REPLACE(formatted_number, '(', ''), ')', ''), '-', ''), ' ','') as num_cleansed
        FROM
            EmpDetails
    ) FormattedPhoneNumbers
Adding the country code, using REPLACE.

In the above statement, the REPLACE function is used to remove the brackets, dashes, and whitespaces from the formatted_number field before undergoing the formatting process.

Screenshot of query results.

Results of the query.

We can use the LEFT and RIGHT functions to specify and extract different parts of the phone number. In the following example, we will extract the area code of the phone number using the LEFT function and the last four digits using the RIGHT function based on the num_formatted field.

SELECT
    REPLACE(LEFT(formatted_number, 4), '(', '') As 'Area Code',
    RIGHT(formatted_number, 4) AS 'Last Four Digits'
FROM
    EmpDetails
Using the LEFT and RIGHT functions to specify different parts of a phone number.

Here, the REPLACE function is used to remove the bracket from the selected number block. The result of the query is shown below.

Query result in Arctype.

Query result in Arctype.

One of the most common tasks in the world of data cleaning is extracting phone numbers from a text block. The complexity and feasibility of this task will mostly depend on the composition of the text.

The easiest way to extract phone numbers is to utilize regular expressions to target the specific phone number formats. Extracting data has become far simpler with the introduction of functions like REGEXP_SUBSTR in MySQL 8.0. We will be populating the details column with some phone numbers in different formats, as shown below.

Arctype showing query results.

Results of the query in Arctype

Identifying Rows with Matching Phone Numbers

First, we need to figure out which rows consist of data matching our requirements. In this case, the following regular expressions will be used on the details field.

  • Any consecutive 10 digits
  • Formatted number (XXX-XXX-XXXX)
SELECT
    *
FROM
    EmpDetails
WHERE
    # Any 10 Digits
    details REGEXP '[[:digit:]]{10}'
    # Formatted Number (XXX-XXX-XXXX)
    OR details REGEXP '[0-9]{3}-[0-9]{3}-[0-9]{4}';

Using regular expressions to clean the data.

Results of the query can be seen below:

Regexp query results screenshot.

Arctype showing the query and results of a Regexp approach.

Since we have identified the rows, the next step is to extract the phone numbers. It can be done using the REGEXP_SUBSTR function to extract the substring which matches the given regular expression. As we need to query two different regular expressions, we will be using the CONCAT_WS function to combine the results of both expressions into a single column.

SELECT
    emp_id,
    name,
    email,
    CONCAT_WS(
        '',
        REGEXP_SUBSTR(details, '^[0-9]+$', 1, 1, 'm'),
        REGEXP_SUBSTR(details, '[0-9]{3}-[0-9]{3}-[0-9]{4}', 1, 1, 'm')
    ) AS extracted_phone_numbers
FROM
    (
        SELECT
            *
        FROM
            EmpDetails
        WHERE
            details REGEXP '[[:digit:]]{10}'
            OR details REGEXP '[0-9]{3}-[0-9]{3}-[0-9]{4}'
) ExtractedDetails
Extracting the phone numbers.

The result of this operation can be seen below:

Screenshot of phone number extraction.

Extracting the phone numbers in Arctype, with the results shown below the query.

Handling Multiple Phone Numbers in a Single Field

To query results from a single field with multiple numbers, we need to create a stored procedure that loops through the desired field to capture all matching regex patterns. For instance, let’s see how to extract multiple phone numbers from the details field of emp_id 1702 (Dan).

DELIMITER $$
CREATE PROCEDURE get_number_of_matches_full()
BEGIN
  DECLARE regex_match INT DEFAULT 1;
  DECLARE current_number INT DEFAULT 1;
  WHILE regex_match >= 1 DO
    CREATE TABLE IF NOT EXISTS extracted_numbers_table (
        `emp_id` int NOT NULL,
        `name` varchar(255) DEFAULT NULL,
        `email` varchar(255) DEFAULT NULL,
        `extracted_phone_num` varchar(255) DEFAULT NULL
    );
    INSERT INTO extracted_numbers_table (emp_id, name, email, extracted_phone_num)
    SELECT emp_id, name, email, REGEXP_SUBSTR(details, '[0-9]{3}-[0-9]{3}-[0-9]{4}', 1, current_number, 'm') FROM EmpDetails WHERE emp_id = 1702;
    SET current_number = current_number + 1;
    IF ((SELECT REGEXP_SUBSTR(details, '[0-9]{3}-[0-9]{3}-[0-9]{4}', 1, current_number, 'm') FROM EmpDetails WHERE emp_id = 1702) IS NULL) THEN
        SET regex_match = 0;
    END IF;
  END WHILE;
END $$
    
DELIMITER;
Using a stored procedure to loop through thee desired field and find all matching Regex patterns. 

The result of this operation is shown below.

Query result in Arctype.

Query result in Arctype.

In the above code block, we have created a stored procedure called get_number_of_matches_full, which loops through the details field until all the regex matches are found on the specified row of the EmpDetails table. We are using the REGEXP_SUBSTR function with the position argument to extract different matches. It updates by one at each loop run and saves the results on a newly created extracted_numbers_table. Finally, the loop exits when a NULL result is found.

We can call this procedure and view the results by querying the extracted_numbers_table as shown below.

CALL get_number_of_matches_full;

SELECT * FROM extracted_numbers_table;
Finding the extracted numbers.

And, shown again in Arctype:

Query results in Arctype.

Query results as seen in Arctype.

Conclusion

String operations in SQL are vital functions to clean and format data within a database. Moreover, string operations are the core component when dealing with valuable information such as phone numbers as they allow users to extract or manipulate phone numbers according to their requirements. However, it’s important to remember that the exact functions available will change depending on the database type and version. Therefore, always remember to refer to the documentation to find out the available string operations and implement them to your heart’s content.

Ready to get started with string operations in SQL? If so, be sure to download Arctype and give it a try. You can use the link below to install Arctype for free and follow along with the examples seen here. Happy querying!

JOIN the Arctype Newsletter

Programming stories, tutorials, and database tips every 2 weeks

Introduction

In my first article, Telephone Numbers in SQL Server 2005: Part 1 – The Data Type, I discussed various ways of persisting a simple telephone number in SQL Server. Simply displaying the raw data stored in SQL Server would not be suitable for human consumption, so some form of data formatting must be done. Data formatting is usually performed in the user interface layer, for example, by the ASPX page or the WinForms application. Often, it is convenient to have SQL Server format the data in a view to be passed into another document, like a report that might have a difficult time formatting the number correctly. In this article, I will discuss a couple methods that can be used to format a phone number for presentation to the user on SQL Server.

The UDF

SQL Server 2000 introduced the ability to create User Defined Functions. Using a custom formatting function for telephone numbers is a perfect example of where to use UDFs. The code below can be used to create a telephone number formatting function in TSQL:






CREATE FUNCTION [dbo].[FORMATPHONENUMBER] 
(
    @Number money
)
RETURNS varchar(25)
AS
BEGIN

    
    DECLARE @Formatted varchar(25)  
    DECLARE @CharNum varchar(18)    
    DECLARE @Extension int         
    DECLARE @Numerator bigint         

    IF @Number IS NULL 
    BEGIN
        
        RETURN NULL
    END

    
    

    
    SET @Numerator = CAST(@Number * 10000 AS bigint)
    
    SET @Extension = CAST(RIGHT(@Numerator, 4) AS int)
    
    SET @CharNum = CAST(LEFT(@Numerator , LEN(@Numerator) - 4) 
        AS varchar(18))

    IF LEN(@CharNum) = 10    
      BEGIN
                
        SET @Formatted = '(' + LEFT(@CharNum, 3) + ') ' + 
            SUBSTRING(@CharNum,4,3) + '-' + RIGHT(@CharNum, 4)

        IF @Extension > 0    
        BEGIN
            SET @Formatted = @Formatted +  ' ext '+ 
                             CAST(@Extension AS varchar(4))
        END

        RETURN @Formatted
      END

    IF LEN(@CharNum) = 7    
      BEGIN
        SET @Formatted = LEFT(@CharNum, 3) + '-' + RIGHT(@CharNum, 4)
        IF @Extension > 0    
        BEGIN
            SET @Formatted = @Formatted +  ' ext '+ 
                             CAST(@Extension AS varchar(6))
        END

        RETURN @Formatted
      END

    IF LEN(@CharNum) = 11
    
    
      BEGIN
                
        SET @Formatted = LEFT(@CharNum, 1) + ' (' + SUBSTRING(@CharNum, 2, 3) + ') ' + 
                         SUBSTRING(@CharNum,4,3) + '-' + RIGHT(@CharNum, 4)

        IF @Extension > 0    
        BEGIN
            SET @Formatted = @Formatted +  ' ext '+ CAST(@Extension AS varchar(4))
        END

        RETURN @Formatted
      END

    
    
    SET @Formatted = @CharNum
    IF @Extension > 0    
      BEGIN
        SET @Formatted = @Formatted +  ' ext '+ CAST(@Extension AS varchar(4))
        RETURN 'ext '+ CAST(@Extension AS varchar(4))
        
      END

    RETURN @Formatted

END

The CLR Version

SQL Server 2005 added the ability to write your own DLLs in the .NET programming language of your choice, so I also wrote the equivalent function in C#. To do this in Visual Studio 2005, start a new SQL Server project, and add a new User-Defined Function. Paste this code into it:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction(
               IsDeterministic=true, IsPrecise=true)]
    public static SqlString FormatNAPhoneNumber(SqlMoney Number)
    {
        
        if (Number.IsNull)
            return SqlString.Null;

        string phoneNumber = Number.ToString();
        string [] phone = phoneNumber.Split(new Char [] {'.'});
        string charnum = phoneNumber;   
  
        int extension = int.Parse(phone[1]);
        
        switch (phone[0].Length)
        {
            case 10:
            
            
                {

                    if (extension > 0)
                    {
                        charnum = string.Format("({0}) {1}-{2} ext{3}",
                                phone[0].Substring(0, 3),
                                phone[0].Substring(3, 3),
                                phone[0].Substring(6),
                                extension.ToString());

        
        
        
        
        

                        return new SqlString(charnum);
                    }
                    else
                    {
                        charnum = string.Format("({0}) {1}-{2}",
                                phone[0].Substring(0, 3),
                                phone[0].Substring(3, 3),
                                phone[0].Substring(6));

        
        
        
        
                        return new SqlString(charnum);
                        
                    }
                    break;
                }
            case 7:  
                { 
                    if (extension > 0)
                    {
                        charnum = string.Format("{0}-{1} ext{2}",
                                phone[0].Substring(0, 3),
                                phone[0].Substring(3),
                                extension.ToString());
                        return new SqlString(charnum);
                        break;
                    }
                    else
                    {
                        charnum = string.Format("{0}-{1}",
                                phone[0].Substring(0, 3),
                                phone[0].Substring(3));
                        return new SqlString(charnum);
                        break;
                    }
                }
            case 11:
            
                {
                    
                    if (extension > 0)
                    {
                        charnum = string.Format("{0} ({1}) {2}-{3} ext{4}",
                            phone[0].Substring(0, 1),
                            phone[0].Substring(1, 3),
                                phone[0].Substring(4, 3),
                                phone[0].Substring(7),
                                extension.ToString());
                        return new SqlString(charnum);
                    }
                    else
                    {
                        charnum = string.Format("{0} ({1}) {2}-{3}",
                            phone[0].Substring(0, 1),
                            phone[0].Substring(1, 3),
                                phone[0].Substring(4, 3),
                                phone[0].Substring(7));
                        return new SqlString(charnum);
                    }
                }
            default:    
                {
                    return new SqlString(charnum); 
                    break;
                }
            }
        }
}

I did some quick benchmarking, to see how it affected performance, against a simple table containing the phone number data type. The table contains a small set of 580 rows, and the SQL SELECT command was executed 1,000 times, which generated 580,000 calls to the function. The SELECT command was executed without bringing the dataset back to the client so that the network time was minimized. Execution times are in milliseconds. Your numbers may vary, but use these numbers as a relative comparison of the different methods.

Method

Time

Function Overhead

No phone number formatting

1950 ms

TSQL FORMATPHONENUMBER

7450 ms

5500 ms

C# function using String.Format

4750 ms

2800 ms

C# function using RegEx

7187 ms

5237 ms

The Suprise!

To my surprise, the .NET function call ran twice as fast as the native TSQL function! My first guess would be that the native TSQL functions would certainly run faster by eliminating the overhead of the .NET engine and interface. It seems Microsoft has done their homework on .NET integration. This is another good example of where testing shows results that are often counter to what you would expect.

The C# version also has various lines commented out if you choose to use the RegEx library or the simple string concatenate operator for formatting the telephone number. One way to handle international phone numbers would be to store the RegEx pattern and evaluator strings in a table keyed by country. This table could then be linked to the phone number and the appropriate formatting codes passed to the function. I didn’t go this far with the design, but I did try using the RegEx method of formatting to test the performance over using the String.Format method. As you can see, it performed about the same as the TSQL version, so the overhead of the RegEx object was significant.

Conclusion

Using TSQL to write user defined functions can be awkward if it requires a fair amount of string manipulation or complicated logic. Using the .NET CLR integration allows you to easily use the full power of the .NET library and probably get better performance than with TSQL.

In my first article, it was stated that the small details make all the difference. By paying attention to proper design and doing a little bit of testing, we have cut the data footprint of storing a phone number in half as well as cut the CPU requirements for formatting the number in half. Small changes like these can make all the difference when it comes to how well your applications can scale.

Синтаксис

  • Wild Card с%: SELECT * FROM [table] WHERE [имя_столбца] Как «% Value%»

    Wild Card with _: SELECT * FROM [table] WHERE [column_name] Как ‘V_n%’

    Wild Card с [charlist]: SELECT * FROM [table] WHERE [column_name] Как ‘V [abc] n%’

замечания

Условие LIKE в предложении WHERE используется для поиска значений столбцов, соответствующих данному шаблону. Шаблоны формируются с использованием следующих двух подстановочных знаков

  • % (Процентный символ) — Используется для представления ноль или более символов
  • _ (Underscore) — используется для представления одного символа

Матч открытого шаблона

Подстановочный знак % добавленный к началу или концу (или обеим) строки, будет содержать 0 или более символов до начала или после окончания шаблона.

Использование «%» в середине позволит совместить 0 или более символов между двумя частями шаблона.

Мы собираемся использовать эту таблицу сотрудников:

Я бы FName LName Номер телефона ManagerID DepartmentID Оплата труда Дата приема на работу
1 Джон Джонсон 2468101214 1 1 400 23-03-2005
2 Софи Amudsen 2479100211 1 1 400 11-01-2010
3 Ronny кузнец 2462544026 2 1 600 06-08-2015
4 Джон Sanchez 2454124602 1 1 400 23-03-2005
5 Хильде сук 2468021911 2 1 800 01-01-2000

Следующие утверждения соответствуют всем записям, содержащим FName, содержащие строку «on» из таблицы Employees.

SELECT * FROM Employees WHERE FName LIKE '%on%';
Я бы FName LName Номер телефона ManagerID DepartmentID Оплата труда Дата приема на работу
3 R на ny кузнец 2462544026 2 1 600 06-08-2015
4 J on Sanchez 2454124602 1 1 400 23-03-2005

Следующий оператор соответствует всем записям, имеющим PhoneNumber, начиная со строки «246» от Employees.

SELECT * FROM Employees WHERE PhoneNumber LIKE '246%';
Я бы FName LName Номер телефона ManagerID DepartmentID Оплата труда Дата приема на работу
1 Джон Джонсон 246 8101214 1 1 400 23-03-2005
3 Ronny кузнец 246 2544026 2 1 600 06-08-2015
5 Хильде сук 246 8021911 2 1 800 01-01-2000

Следующий оператор соответствует всем записям с номером PhoneNumber, заканчивающимся на строку «11» от Employees.

SELECT * FROM Employees WHERE PhoneNumber LIKE '%11'
Я бы FName LName Номер телефона ManagerID DepartmentID Оплата труда Дата приема на работу
2 Софи Amudsen 24791002 11 1 1 400 11-01-2010
5 Хильде сук 24680219 11 2 1 800 01-01-2000

Все записи, где 3-й символ Fname — «n» от сотрудников.

SELECT * FROM Employees WHERE FName LIKE '__n%';

(два символа подчеркивания используются до «n», чтобы пропустить первые 2 символа)

Я бы FName LName Номер телефона ManagerID DepartmentID Оплата труда Дата приема на работу
3 Ronny кузнец 2462544026 2 1 600 06-08-2015
4 Джон Sanchez 2454124602 1 1 400 23-03-2005

Совпадение одного символа

Чтобы расширить выбор оператора структурированного запроса (SQL-SELECT), можно использовать подстановочные знаки, знак процента (%) и подчеркивание (_).

Символ _ (подчеркивание) может использоваться в качестве подстановочного знака для любого отдельного символа в совпадении с шаблоном.

Найдите всех сотрудников, чье Fname начинается с «j» и заканчивается на «n» и имеет ровно 3 символа в Fname.

SELECT * FROM Employees WHERE FName LIKE 'j_n'

_ (подчеркивание) также может использоваться более одного раза в качестве дикой карты для соответствия шаблонам.

Например, этот шаблон будет соответствовать «jon», «jan», «jen» и т. Д.

Эти имена не будут отображаться «jn», «john», «jordan», «justin», «jason», «julian», «jillian», «joann», потому что в нашем запросе используется один знак подчеркивания, и он может пропустить точно один символ, поэтому результат должен иметь 3 символа Fname.

Например, этот шаблон будет соответствовать «LaSt», «LoSt», «HaLt» и т. Д.

SELECT * FROM Employees WHERE FName LIKE '_A_T'

Соответствие диапазону или набору

Сопоставьте любой отдельный символ в указанном диапазоне (например: [af] ) или установите (например: [abcdef] ).

Этот шаблон диапазона будет соответствовать «gary», но не «mary»:

SELECT * FROM Employees WHERE FName LIKE '[a-g]ary'

Этот шаблон будет соответствовать «mary», но не «gary»:

SELECT * FROM Employees WHERE Fname LIKE '[lmnop]ary'

Диапазон или набор можно также отменить, добавив ^ каретку перед диапазоном или установить:

Этот шаблон диапазона не будет соответствовать «gary», но будет соответствовать «mary»:

SELECT * FROM Employees WHERE FName LIKE '[^a-g]ary'

Этот шаблон набора не будет соответствовать «mary», но будет соответствовать «gary»:

SELECT * FROM Employees WHERE Fname LIKE '[^lmnop]ary'

Матч ЛЮБОЙ против ВСЕХ

Совпадение:
Необходимо совместить хотя бы одну строку. В этом примере тип продукта должен быть «электроникой», «книгами» или «видео».

SELECT *
FROM   purchase_table
WHERE  product_type LIKE ANY ('electronics', 'books', 'video');

Все совпадение (должно соответствовать всем требованиям).
В этом примере должны быть согласованы как «объединенное королевство», так и «лондон» и «восточная дорога» (включая вариации).

SELECT *
FROM   customer_table
WHERE  full_address LIKE ALL ('%united kingdom%', '%london%', '%eastern road%');

Отрицательный выбор:
Используйте ВСЕ, чтобы исключить все элементы.
В этом примере приводятся все результаты, когда тип продукта не является «электроникой», а не «книгами», а не «видео».

SELECT *
FROM   customer_table
WHERE  product_type NOT LIKE ALL ('electronics', 'books', 'video');

Поиск диапазона символов

Следующий оператор соответствует всем записям, имеющим FName, которое начинается с буквы от A до F из таблицы Employees .

SELECT * FROM Employees WHERE FName LIKE '[A-F]%'

Выписка ESCAPE в LIKE-запросе

Если вы реализуете текстовый поиск как LIKE -query, вы обычно делаете это так:

SELECT * 
FROM T_Whatever 
WHERE SomeField LIKE CONCAT('%', @in_SearchText, '%') 

Однако (помимо того, что вы не должны использовать LIKE когда вы можете использовать полнотекстовый поиск), это создает проблему, когда кто-то вводит текст типа «50%» или «a_b».

Таким образом (вместо перехода на полнотекстовый поиск) вы можете решить эту проблему с помощью инструкции LIKE -escape:

SELECT * 
FROM T_Whatever 
WHERE SomeField LIKE CONCAT('%', @in_SearchText, '%') ESCAPE ''

Это означает, что теперь будет рассматриваться как символ ESCAPE. Это означает, что теперь вы можете просто добавить к каждому символу в строке, которую вы ищете, и результаты будут корректными, даже если пользователь вводит специальный символ, например % или _ .

например

string stringToSearch = "abc_def 50%";
string newString = "";
foreach(char c in stringToSearch) 
     newString += @"" + c;
 
sqlCmd.Parameters.Add("@in_SearchText", newString); 
// instead of sqlCmd.Parameters.Add("@in_SearchText", stringToSearch);

Примечание. Вышеупомянутый алгоритм предназначен только для демонстрационных целей. Он не будет работать в случаях, когда 1 графема состоит из нескольких символов (utf-8). например string stringToSearch = "Les Miseu0301rables"; Вам нужно будет сделать это для каждой графемы, а не для каждого персонажа. Вы не должны использовать вышеуказанный алгоритм, если имеете дело с азиатскими / восточно-азиатскими / южноазиатскими языками. Вернее, если вам нужен правильный код для начала, вы должны просто сделать это для каждого graphemeCluster.

См. Также ReverseString, вопрос интервью с C #

Подстановочные знаки

подстановочные символы используются с оператором SQL LIKE. Подстановочные знаки SQL используются для поиска данных в таблице.

Подстановочные знаки в SQL:%, _, [charlist], [^ charlist]

% — заменить ноль или более символов

   Eg:  //selects all customers with a City starting with "Lo"
        SELECT * FROM Customers
        WHERE City LIKE 'Lo%';

       //selects all customers with a City containing the pattern "es"
      SELECT * FROM Customers
       WHERE City LIKE '%es%';

_ — заменить один символ

Eg://selects all customers with a City starting with any character, followed by "erlin"
SELECT * FROM Customers
WHERE City LIKE '_erlin';

[charlist] — устанавливает и диапазоны символов для соответствия

Eg://selects all customers with a City starting with "a", "d", or "l"
SELECT * FROM Customers
WHERE City LIKE '[adl]%';

//selects all customers with a City starting with "a", "d", or "l"
SELECT * FROM Customers
WHERE City LIKE '[a-c]%';

[^ charlist] — Соответствует только символу, не указанному в скобках

Eg://selects all customers with a City starting with a character that is not "a", "p", or "l"
SELECT * FROM Customers
WHERE City LIKE '[^apl]%';

or

SELECT * FROM Customers
WHERE City NOT LIKE '[apl]%' and city like '_%';

Понравилась статья? Поделить с друзьями: