Here are some examples of formatting phone numbers in SQL Server.
This includes examples of formatting numbers in E.164 format (for international numbers), prepending the country code and area code, as well as omitting leading zero from the country code when required.
Numeric Phone Numbers
If the phone number is stored as a numeric value (which it shouldn’t be), you can use the FORMAT()
function to format it as phone number.
Example:
SELECT FORMAT(0234567890, '000-000-0000');
Result:
023-456-7890
The first argument is the phone number and the second argument is the format string. In this example I’m using a custom format string. You can adjust the format string to suit the desired phone number format:
SELECT FORMAT(0234567890, '(000) 000-0000');
Result:
(023) 456-7890
It’s important to know what the format strings actually do. When using zeros, you’ll need to make sure that the phone number actually has digits in every place where there’s a zero format specifier (otherwise you might accidentally add zeros to the number).
You’ll also need to ensure that there’s a format specifier to match each digit (otherwise you’ll delete digits from the phone number).
Another way to express the format string is with the #
format specifier. However, this will result in any leading zeros being removed from the phone number.
Here’s an example to illustrate what I mean:
SELECT
FORMAT(0234567890, '000-000-0000') AS "000-000-0000",
FORMAT(0234567890, '###-###-####') AS "###-###-####";
Result:
+----------------+----------------+ | 000-000-0000 | ###-###-#### | |----------------+----------------| | 023-456-7890 | 23-456-7890 | +----------------+----------------+
The FORMAT()
function only accepts numeric types and datetime values. If the input phone number isn’t actually a numeric type, then you’ll probably get an error, something like this:
SELECT FORMAT('0234567890', '000-000-0000');
Result:
Msg 8116, Level 16, State 1, Line 1 Argument data type varchar is invalid for argument 1 of format function.
In such cases, it’s easy enough to convert the value to a numeric type:
SELECT FORMAT(CAST('0234567890' AS int), '000-000-0000');
Result:
023-456-7890
But phone numbers shouldn’t be stored as numeric types anyway.
Numeric values can be rounded up or down, have calculations performed on them, have insignificant zeros automatically removed, etc.
Phone numbers are a fixed value. Every digit is significant (including leading zeros). We don’t want leading zeros to go missing unless we explicitly require this (for a country code for example). And we don’t want our phone numbers to be inadvertently rounded up or down. And it’s unlikely you’ll ever need to perform calculations on your phone numbers.
So it therefore makes more sense to store phone numbers as a string. Converting them to a numeric type before formatting them (like in the above example) could still result in unexpected changes being done to the number.
If the phone number is already a string, then try the following method.
Phone Numbers Stored as Strings
If the phone number is stored as a string, you can use the STUFF()
function to insert the appropriate strings into the phone number at the relevant places.
Examples:
SELECT
STUFF(STUFF('0234567890', 7, 0, '-'), 4, 0, '-') AS "Format 1",
STUFF(STUFF(STUFF('0234567890', 7, 0, '-'), 4, 0, ') '), 1, 0, '(') AS "Format 2";
Result:
+--------------+----------------+ | Format 1 | Format 2 | |--------------+----------------| | 023-456-7890 | (023) 456-7890 | +--------------+----------------+
The first argument is the original string (in this case, the phone number), and the fourth argument is the string to insert. The second argument specifies where to insert the fourth argument.
The third argument specifies how many characters to delete from the original string (in case you want to replace certain characters with the new string). In our case, we don’t want to delete any characters, and so we use 0
.
Depending on the format of the original phone number, another way to do it is to use the REPLACE()
function. An example of where this could be useful is when the phone number is already formatted with a separator, but it needs to be replaced with another separator:
SELECT REPLACE('023 456 7890', ' ', '-');
Result:
023-456-7890
International Numbers
E.164 is an international standard that defines the format for international telephone numbers.
E.164 numbers are formatted [+][country code][area code][local phone number]
and can have a maximum of fifteen digits.
Here’s an example that uses two methods to concatenate the country code, the area code, and the phone number:
SELECT
CONCAT('+', '1', '415', '4567890') AS 'CONCAT() Function',
'+' + '1' + '415' + '4567890' AS 'Concatenation Operator';
Result:
+---------------------+--------------------------+ | CONCAT() Function | Concatenation Operator | |---------------------+--------------------------| | +14154567890 | +14154567890 | +---------------------+--------------------------+
The first method uses the CONCAT()
function, and the second uses the concatenation operator (+
).
That example formats a US based number. In many countries, the area code has a leading zero that needs to be dropped when using the E.164 format.
One way to suppress any leading zeros is to convert the area code to a numeric value and back again.
Here’s an example of using that technique on a UK based number:
SELECT CONCAT(
'+',
'44',
CAST(CAST('020' AS int) AS varchar(3)),
'34567890'
);
Result:
+442034567890
In this case the leading zero was dropped.
Here’s the same code run against the previous US based number (which doesn’t use a leading zero in the country code):
SELECT CONCAT(
'+',
'1',
CAST(CAST('415' AS int) AS varchar(3)),
'4567890'
);
Result:
+14154567890
This time the country code remained at three digits.
I have table in the database with a phone number column. The numbers look like this:
123456789
I want to format that to look like this:
123-456-789
the Tin Man
158k41 gold badges214 silver badges301 bronze badges
asked Sep 15, 2009 at 11:02
2
This should do it:
UPDATE TheTable
SET PhoneNumber = SUBSTRING(PhoneNumber, 1, 3) + '-' +
SUBSTRING(PhoneNumber, 4, 3) + '-' +
SUBSTRING(PhoneNumber, 7, 4)
Incorporated Kane’s suggestion, you can compute the phone number’s formatting at runtime. One possible approach would be to use scalar functions for this purpose (works in SQL Server):
CREATE FUNCTION FormatPhoneNumber(@phoneNumber VARCHAR(10))
RETURNS VARCHAR(12)
BEGIN
RETURN SUBSTRING(@phoneNumber, 1, 3) + '-' +
SUBSTRING(@phoneNumber, 4, 3) + '-' +
SUBSTRING(@phoneNumber, 7, 4)
END
answered Sep 15, 2009 at 11:11
David AndresDavid Andres
31.2k7 gold badges45 silver badges36 bronze badges
3
I’d generally recommend you leave the formatting up to your front-end code and just return the data as-is from SQL. However, to do it in SQL, I’d recommend you create a user-defined function to format it. Something like this:
CREATE FUNCTION [dbo].[fnFormatPhoneNumber](@PhoneNo VARCHAR(20))
RETURNS VARCHAR(25)
AS
BEGIN
DECLARE @Formatted VARCHAR(25)
IF (LEN(@PhoneNo) <> 10)
SET @Formatted = @PhoneNo
ELSE
SET @Formatted = LEFT(@PhoneNo, 3) + '-' + SUBSTRING(@PhoneNo, 4, 3) + '-' + SUBSTRING(@PhoneNo, 7, 4)
RETURN @Formatted
END
GO
Which you can then use like this:
SELECT [dbo].[fnFormatPhoneNumber](PhoneNumber) AS PhoneNumber
FROM SomeTable
It has a safeguard in, in case the phone number stored isn’t the expected number of digits long, is blank, null etc — it won’t error.
EDIT: Just clocked on you want to update your existing data. The main bit that’s relevant from my answer then is that you need to protect against «dodgy»/incomplete data (i.e. what if some existing values are only 5 characters long)
answered Sep 15, 2009 at 11:13
AdaTheDevAdaTheDev
141k28 gold badges204 silver badges198 bronze badges
1
Above users mentioned, those solutions are very basic and they won’t work if the database has different phone formats like:
(123)123-4564
123-456-4564
1234567989
etc
Here is a more complex solution that will work with ANY input given:
CREATE FUNCTION [dbo].[ufn_FormatPhone] (@PhoneNumber VARCHAR(32))
RETURNS VARCHAR(32)
AS
BEGIN
DECLARE @Phone CHAR(32)
SET @Phone = @PhoneNumber
-- cleanse phone number string
WHILE PATINDEX('%[^0-9]%', @PhoneNumber) > 0
SET @PhoneNumber = REPLACE(@PhoneNumber, SUBSTRING(@PhoneNumber, PATINDEX('%[^0-9]%', @PhoneNumber), 1), '')
-- skip foreign phones
IF (
SUBSTRING(@PhoneNumber, 1, 1) = '1'
OR SUBSTRING(@PhoneNumber, 1, 1) = '+'
OR SUBSTRING(@PhoneNumber, 1, 1) = '0'
)
AND LEN(@PhoneNumber) > 11
RETURN @Phone
-- build US standard phone number
SET @Phone = @PhoneNumber
SET @PhoneNumber = '(' + SUBSTRING(@PhoneNumber, 1, 3) + ') ' + SUBSTRING(@PhoneNumber, 4, 3) + '-' + SUBSTRING(@PhoneNumber, 7, 4)
IF LEN(@Phone) - 10 > 1
SET @PhoneNumber = @PhoneNumber + ' X' + SUBSTRING(@Phone, 11, LEN(@Phone) - 10)
RETURN @PhoneNumber
END
Suraj Kumar
5,5398 gold badges21 silver badges41 bronze badges
answered Jul 12, 2014 at 19:48
HiramHiram
2,6391 gold badge16 silver badges13 bronze badges
I do not recommend keeping bad data in the database and then only correcting it on the output. We have a database where phone numbers are entered in variously as :
- (555) 555-5555
- 555+555+5555
- 555.555.5555
- (555)555-5555
- 5555555555
Different people in an organization may write various retrieval functions and updates to the database, and therefore it would be harder to set in place formatting and retrieval rules. I am therefore correcting the data in the database first and foremost and then setting in place rules and form validations that protect the integrity of this database going forward.
I see no justification for keeping bad data unless as suggested a duplicate column be added with corrected formatting and the original data kept around for redundancy and reference, and YES I consider badly formatted data as BAD data.
j0k
22.5k28 gold badges80 silver badges89 bronze badges
answered Oct 25, 2012 at 17:44
Daniel ByrneDaniel Byrne
1,39114 silver badges12 bronze badges
2
Solutions that use SUBSTRING
and concatenation +
are nearly independent of RDBMS. Here is a short solution that is specific to SQL Server:
declare @x int = 123456789
select stuff(stuff(@x, 4, 0, '-'), 8, 0, '-')
answered Dec 18, 2015 at 13:13
You can also try this:
CREATE function [dbo].[fn_FormatPhone](@Phone varchar(30))
returns varchar(30)
As
Begin
declare @FormattedPhone varchar(30)
set @Phone = replace(@Phone, '.', '-') --alot of entries use periods instead of dashes
set @FormattedPhone =
Case
When isNumeric(@Phone) = 1 Then
case
when len(@Phone) = 10 then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 4, 3)+ '-' +substring(@Phone, 7, 4)
when len(@Phone) = 7 then substring(@Phone, 1, 3)+ '-' +substring(@Phone, 4, 4)
else @Phone
end
When @phone like '[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9][0-9]' Then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 5, 3)+ '-' +substring(@Phone, 8, 4)
When @phone like '[0-9][0-9][0-9] [0-9][0-9][0-9] [0-9][0-9][0-9][0-9]' Then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 5, 3)+ '-' +substring(@Phone, 9, 4)
When @phone like '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]' Then '('+substring(@Phone, 1, 3)+')'+ ' ' +substring(@Phone, 5, 3)+ '-' +substring(@Phone, 9, 4)
Else @Phone
End
return @FormattedPhone
end
use on it select
(SELECT [dbo].[fn_FormatPhone](f.coffphone)) as 'Phone'
Output will be
answered Mar 15, 2016 at 8:24
You Can Use FORMAT if you column is a number Syntax like
FORMAT ( value, format [, culture ] ) In use like
FORMAT ( @d, 'D', 'en-US' )
or FORMAT(123456789,'###-##-####')
(But This works for only SQL SERVER 2012 And After)
In Use Like
UPDATE TABLE_NAME SET COLUMN_NAME = FORMAT(COLUMN_NAME ,'###-##-####')
And
if your column is Varchar Or Nvarchar use do like this
CONCAT(SUBSTRING(CELLPHONE,0,4),' ',SUBSTRING(CELLPHONE,4,3),'
',SUBSTRING(CELLPHONE,7,2) ,' ',SUBSTRING(CELLPHONE,9,2) )
You can always get help from
https://msdn.microsoft.com/en-us/library/hh213505.aspx
answered Jan 7, 2017 at 19:16
ToprakToprak
2831 gold badge4 silver badges11 bronze badges
2
I found that this works if wanting in a (123) — 456-7890 format.
UPDATE table
SET Phone_number = '(' +
SUBSTRING(Phone_number, 1, 3)
+ ') '
+ '- ' +
SUBSTRING(Phone_number, 4, 3)
+ '-' +
SUBSTRING(Phone_number, 7, 4)
Rizier123
58.6k16 gold badges97 silver badges154 bronze badges
answered Feb 9, 2015 at 22:53
nation161rnation161r
511 gold badge3 silver badges14 bronze badges
Updated @sqiller’s function for my purposes
CREATE FUNCTION [toolbox].[FormatPhoneNumber] (
@PhoneNumber VARCHAR(50),
@DefaultIfUnknown VARCHAR(50)
)
RETURNS VARCHAR(50)
AS
BEGIN
-- remove any extension
IF CHARINDEX('x', @PhoneNumber, 1) > 0
SET @PhoneNumber = SUBSTRING(@PhoneNumber, 1, CHARINDEX('x', @PhoneNumber, 1) - 1)
-- cleanse phone number string
WHILE PATINDEX('%[^0-9]%',@PhoneNumber) > 0
SET @PhoneNumber = REPLACE(@PhoneNumber,
SUBSTRING(@PhoneNumber,PATINDEX('%[^0-9]%',@PhoneNumber),1),'')
-- Remove US international code if exists, i.e. 12345678900
IF SUBSTRING(@PhoneNumber,1,1) = '1' AND LEN(@PhoneNumber) = 11
SET @PhoneNumber = SUBSTRING(@PhoneNumber, 2, 10)
-- any phone numbers without 10 characters are set to default
IF LEN(@PhoneNumber) <> 10
RETURN @DefaultIfUnknown
-- build US standard phone number
SET @PhoneNumber = '(' + SUBSTRING(@PhoneNumber,1,3) + ') ' +
SUBSTRING(@PhoneNumber,4,3) + '-' + SUBSTRING(@PhoneNumber,7,4)
RETURN @PhoneNumber
END
answered Nov 10, 2015 at 21:11
Josh JayJosh Jay
1,2402 gold badges15 silver badges26 bronze badges
If you want to just format the output no need to create a new table or a function. In this scenario the area code was on a separate fields. I use field1
, field2
just to illustrate you can select other fields in the same query:
area phone
213 8962102
Select statement:
Select field1, field2,areacode,phone,SUBSTR(tablename.areacode,1,3) + '-' + SUBSTR(tablename.phone,1,3) + '-' + SUBSTR(tablename.areacode,4,4) as Formatted Phone from tablename
Sample OUTPUT:
columns: FIELD1, FIELD2, AREA, PHONE, FORMATTED PHONE
data: Field1, Field2, 213, 8962102, 213-896-2102
DᴀʀᴛʜVᴀᴅᴇʀ
7,39116 gold badges70 silver badges118 bronze badges
answered Aug 24, 2016 at 18:13
When dealing with SQL databases, users will have to interact with different data types and schemas depending on the project’s requirements. Since databases are used in many different applications and support multiple platforms, devices, and users, there may be differences in how data is entered, even in a simple field like a phone number.
For example, one set of users will enter the phone number with the country code, while the others may simply ignore the country code. Therefore, it is essential for database users to know how to clean data in a SQL database. In this post, we will see how to utilize SQL string operations on phone number fields to properly clean, format, and extract data.
Why Do We Care About Data Cleaning?
Data comes in all forms and sizes. Most of the time, it will be raw data, without proper formatting. Data cleaning allows users to filter, sort, and format this raw data to extract meaningful information from it. Data cleaning also makes data analysis a much more efficient process as an additional benefit.
Let’s assume we have a text field that captures user information and asks users to enter valuable information like phone numbers. We need to clean this text field to make the phone numbers usable. How would we go about doing that? To begin with, we need to understand SQL string operations.
What are SQL String Operations?
String operations are functions that can be used to manipulate strings. These string operators (also called string functions) take a string as input, process it according to the specified function, and return a string as the output. The availability of these functions will differ depending on the database. For example, the following is a list of commonly used string functions in MySQL:
Common Types of String Operations
Function | Usage |
---|---|
LEFT/RIGHT | Extract a specified number of characters from a string (starting from left or right) |
TRIM | Remove leading or trailing spaces |
POSITION / STRPOS | Analogous to the Location function, where a numerical value of the position of the specified substring is provided |
SUBSTR | Extract the specified substring from a specific location to match the specified length |
CONCAT | Combine two or more strings to make a single string |
SUBSTR | Extract the specified substring from a specific location to match the specified length |
COALESCE | Replace null values with a specified value |
FORMAT | Formats the value to the specified format |
REGEXP | Matches string to a regular expression |
REGEXP_SUBSTR | Extracts value for the matching regular expression |
STR | Convert numerical value to string |
REVERSE | Reverses the specified string |
REPLACE | Replaces the specified string with the given value |
The full list of available string functions and operators is available in the official documentation.
How to use String Operations with Phone Numbers
Now we have a basic understanding of string operations and related functions in SQL! So, let’s look at how exactly we can utilize some of these functions on phone numbers to extract and format data. For all the examples provided here, we will be using the Arctype SQL client with a MySQL database in a Windows environment with the following dataset.
If you don’t have Arctype installed, you can use the link below to download it and follow along:
The fast and easy-to-use
SQL client for developers and teams
Properly Formatting a Phone Number
We can utilize the SUBSTR
command to format phone numbers. It will extract parts of the number and use the CONCAT
to join the specific parts together. In the following example, we have broken down the phone number into three distinct sections and combined them together with the formatting as a new field called num_formatted
:
SELECT
phone_num,
CONCAT(
'(',
SUBSTR(phone_num, 1, 3),
') ',
SUBSTR(phone_num, 4, 3),
'-',
SUBSTR(phone_num, 7)
) AS num_formatted
FROM
EmpDetails
The result of the operation is show below:
You can create a new column to save the formatted number using the ALTER TABLE
command. It will create a new column and update it with the formatted numbers.
ALTER TABLE EmpDetails
ADD formatted_number VARCHAR(255);
UPDATE
EmpDetails
SET
formatted_number = (
SELECT
CONCAT(
'(',
SUBSTR(phone_num, 1, 3),
') ',
SUBSTR(phone_num, 4, 3),
'-',
SUBSTR(phone_num, 7)
) AS num_formatted
)
The result of the above operation is shown below:
We can modify our SQL command to include the REPLACE
function if there are additional characters in the phone number field, such as dashes and brackets. This function can be used to remove unnecessary characters and whitespaces in a string. Now let’s further format the formmated_number
field to include the country code.
SELECT
formatted_number,
CONCAT(
'+1 '
'(',
SUBSTR(num_cleansed, 1, 3),
') ',
SUBSTR(num_cleansed, 5, 3),
'-',
SUBSTR(num_cleansed, 7)
) AS num_formatted
FROM
(
SELECT
formatted_number,
REPLACE(REPLACE(REPLACE(REPLACE(formatted_number, '(', ''), ')', ''), '-', ''), ' ','') as num_cleansed
FROM
EmpDetails
) FormattedPhoneNumbers
In the above statement, the REPLACE
function is used to remove the brackets, dashes, and whitespaces from the formatted_number
field before undergoing the formatting process.
We can use the LEFT
and RIGHT
functions to specify and extract different parts of the phone number. In the following example, we will extract the area code of the phone number using the LEFT
function and the last four digits using the RIGHT
function based on the num_formatted
field.
SELECT
REPLACE(LEFT(formatted_number, 4), '(', '') As 'Area Code',
RIGHT(formatted_number, 4) AS 'Last Four Digits'
FROM
EmpDetails
LEFT
and RIGHT
functions to specify different parts of a phone number.Here, the REPLACE
function is used to remove the bracket from the selected number block. The result of the query is shown below.
One of the most common tasks in the world of data cleaning is extracting phone numbers from a text block. The complexity and feasibility of this task will mostly depend on the composition of the text.
The easiest way to extract phone numbers is to utilize regular expressions to target the specific phone number formats. Extracting data has become far simpler with the introduction of functions like REGEXP_SUBSTR
in MySQL 8.0. We will be populating the details column with some phone numbers in different formats, as shown below.
Identifying Rows with Matching Phone Numbers
First, we need to figure out which rows consist of data matching our requirements. In this case, the following regular expressions will be used on the details field.
- Any consecutive 10 digits
- Formatted number (XXX-XXX-XXXX)
SELECT
*
FROM
EmpDetails
WHERE
# Any 10 Digits
details REGEXP '[[:digit:]]{10}'
# Formatted Number (XXX-XXX-XXXX)
OR details REGEXP '[0-9]{3}-[0-9]{3}-[0-9]{4}';
Results of the query can be seen below:
Since we have identified the rows, the next step is to extract the phone numbers. It can be done using the REGEXP_SUBSTR function to extract the substring which matches the given regular expression. As we need to query two different regular expressions, we will be using the CONCAT_WS function to combine the results of both expressions into a single column.
SELECT
emp_id,
name,
email,
CONCAT_WS(
'',
REGEXP_SUBSTR(details, '^[0-9]+$', 1, 1, 'm'),
REGEXP_SUBSTR(details, '[0-9]{3}-[0-9]{3}-[0-9]{4}', 1, 1, 'm')
) AS extracted_phone_numbers
FROM
(
SELECT
*
FROM
EmpDetails
WHERE
details REGEXP '[[:digit:]]{10}'
OR details REGEXP '[0-9]{3}-[0-9]{3}-[0-9]{4}'
) ExtractedDetails
The result of this operation can be seen below:
Handling Multiple Phone Numbers in a Single Field
To query results from a single field with multiple numbers, we need to create a stored procedure that loops through the desired field to capture all matching regex patterns. For instance, let’s see how to extract multiple phone numbers from the details field of emp_id 1702 (Dan).
DELIMITER $$
CREATE PROCEDURE get_number_of_matches_full()
BEGIN
DECLARE regex_match INT DEFAULT 1;
DECLARE current_number INT DEFAULT 1;
WHILE regex_match >= 1 DO
CREATE TABLE IF NOT EXISTS extracted_numbers_table (
`emp_id` int NOT NULL,
`name` varchar(255) DEFAULT NULL,
`email` varchar(255) DEFAULT NULL,
`extracted_phone_num` varchar(255) DEFAULT NULL
);
INSERT INTO extracted_numbers_table (emp_id, name, email, extracted_phone_num)
SELECT emp_id, name, email, REGEXP_SUBSTR(details, '[0-9]{3}-[0-9]{3}-[0-9]{4}', 1, current_number, 'm') FROM EmpDetails WHERE emp_id = 1702;
SET current_number = current_number + 1;
IF ((SELECT REGEXP_SUBSTR(details, '[0-9]{3}-[0-9]{3}-[0-9]{4}', 1, current_number, 'm') FROM EmpDetails WHERE emp_id = 1702) IS NULL) THEN
SET regex_match = 0;
END IF;
END WHILE;
END $$
DELIMITER;
The result of this operation is shown below.
In the above code block, we have created a stored procedure called get_number_of_matches_full
, which loops through the details field until all the regex matches are found on the specified row of the EmpDetails table. We are using the REGEXP_SUBSTR
function with the position argument to extract different matches. It updates by one at each loop run and saves the results on a newly created extracted_numbers_table
. Finally, the loop exits when a NULL
result is found.
We can call this procedure and view the results by querying the extracted_numbers_table
as shown below.
CALL get_number_of_matches_full;
SELECT * FROM extracted_numbers_table;
And, shown again in Arctype:
Conclusion
String operations in SQL are vital functions to clean and format data within a database. Moreover, string operations are the core component when dealing with valuable information such as phone numbers as they allow users to extract or manipulate phone numbers according to their requirements. However, it’s important to remember that the exact functions available will change depending on the database type and version. Therefore, always remember to refer to the documentation to find out the available string operations and implement them to your heart’s content.
Ready to get started with string operations in SQL? If so, be sure to download Arctype and give it a try. You can use the link below to install Arctype for free and follow along with the examples seen here. Happy querying!
JOIN the Arctype Newsletter
Programming stories, tutorials, and database tips every 2 weeks
Introduction
In my first article, Telephone Numbers in SQL Server 2005: Part 1 – The Data Type, I discussed various ways of persisting a simple telephone number in SQL Server. Simply displaying the raw data stored in SQL Server would not be suitable for human consumption, so some form of data formatting must be done. Data formatting is usually performed in the user interface layer, for example, by the ASPX page or the WinForms application. Often, it is convenient to have SQL Server format the data in a view to be passed into another document, like a report that might have a difficult time formatting the number correctly. In this article, I will discuss a couple methods that can be used to format a phone number for presentation to the user on SQL Server.
The UDF
SQL Server 2000 introduced the ability to create User Defined Functions. Using a custom formatting function for telephone numbers is a perfect example of where to use UDFs. The code below can be used to create a telephone number formatting function in TSQL:
CREATE FUNCTION [dbo].[FORMATPHONENUMBER] ( @Number money ) RETURNS varchar(25) AS BEGIN DECLARE @Formatted varchar(25) DECLARE @CharNum varchar(18) DECLARE @Extension int DECLARE @Numerator bigint IF @Number IS NULL BEGIN RETURN NULL END SET @Numerator = CAST(@Number * 10000 AS bigint) SET @Extension = CAST(RIGHT(@Numerator, 4) AS int) SET @CharNum = CAST(LEFT(@Numerator , LEN(@Numerator) - 4) AS varchar(18)) IF LEN(@CharNum) = 10 BEGIN SET @Formatted = '(' + LEFT(@CharNum, 3) + ') ' + SUBSTRING(@CharNum,4,3) + '-' + RIGHT(@CharNum, 4) IF @Extension > 0 BEGIN SET @Formatted = @Formatted + ' ext '+ CAST(@Extension AS varchar(4)) END RETURN @Formatted END IF LEN(@CharNum) = 7 BEGIN SET @Formatted = LEFT(@CharNum, 3) + '-' + RIGHT(@CharNum, 4) IF @Extension > 0 BEGIN SET @Formatted = @Formatted + ' ext '+ CAST(@Extension AS varchar(6)) END RETURN @Formatted END IF LEN(@CharNum) = 11 BEGIN SET @Formatted = LEFT(@CharNum, 1) + ' (' + SUBSTRING(@CharNum, 2, 3) + ') ' + SUBSTRING(@CharNum,4,3) + '-' + RIGHT(@CharNum, 4) IF @Extension > 0 BEGIN SET @Formatted = @Formatted + ' ext '+ CAST(@Extension AS varchar(4)) END RETURN @Formatted END SET @Formatted = @CharNum IF @Extension > 0 BEGIN SET @Formatted = @Formatted + ' ext '+ CAST(@Extension AS varchar(4)) RETURN 'ext '+ CAST(@Extension AS varchar(4)) END RETURN @Formatted END
The CLR Version
SQL Server 2005 added the ability to write your own DLLs in the .NET programming language of your choice, so I also wrote the equivalent function in C#. To do this in Visual Studio 2005, start a new SQL Server project, and add a new User-Defined Function. Paste this code into it:
using System; using System.Data; using System.Data.SqlClient; using System.Data.SqlTypes; using Microsoft.SqlServer.Server; using System.Text.RegularExpressions; public partial class UserDefinedFunctions { [Microsoft.SqlServer.Server.SqlFunction( IsDeterministic=true, IsPrecise=true)] public static SqlString FormatNAPhoneNumber(SqlMoney Number) { if (Number.IsNull) return SqlString.Null; string phoneNumber = Number.ToString(); string [] phone = phoneNumber.Split(new Char [] {'.'}); string charnum = phoneNumber; int extension = int.Parse(phone[1]); switch (phone[0].Length) { case 10: { if (extension > 0) { charnum = string.Format("({0}) {1}-{2} ext{3}", phone[0].Substring(0, 3), phone[0].Substring(3, 3), phone[0].Substring(6), extension.ToString()); return new SqlString(charnum); } else { charnum = string.Format("({0}) {1}-{2}", phone[0].Substring(0, 3), phone[0].Substring(3, 3), phone[0].Substring(6)); return new SqlString(charnum); } break; } case 7: { if (extension > 0) { charnum = string.Format("{0}-{1} ext{2}", phone[0].Substring(0, 3), phone[0].Substring(3), extension.ToString()); return new SqlString(charnum); break; } else { charnum = string.Format("{0}-{1}", phone[0].Substring(0, 3), phone[0].Substring(3)); return new SqlString(charnum); break; } } case 11: { if (extension > 0) { charnum = string.Format("{0} ({1}) {2}-{3} ext{4}", phone[0].Substring(0, 1), phone[0].Substring(1, 3), phone[0].Substring(4, 3), phone[0].Substring(7), extension.ToString()); return new SqlString(charnum); } else { charnum = string.Format("{0} ({1}) {2}-{3}", phone[0].Substring(0, 1), phone[0].Substring(1, 3), phone[0].Substring(4, 3), phone[0].Substring(7)); return new SqlString(charnum); } } default: { return new SqlString(charnum); break; } } } }
I did some quick benchmarking, to see how it affected performance, against a simple table containing the phone number data type. The table contains a small set of 580 rows, and the SQL SELECT
command was executed 1,000 times, which generated 580,000 calls to the function. The SELECT
command was executed without bringing the dataset back to the client so that the network time was minimized. Execution times are in milliseconds. Your numbers may vary, but use these numbers as a relative comparison of the different methods.
Method |
Time |
Function Overhead |
No phone number formatting |
1950 ms |
|
TSQL |
7450 ms |
5500 ms |
C# function using |
4750 ms |
2800 ms |
C# function using RegEx |
7187 ms |
5237 ms |
The Suprise!
To my surprise, the .NET function call ran twice as fast as the native TSQL function! My first guess would be that the native TSQL functions would certainly run faster by eliminating the overhead of the .NET engine and interface. It seems Microsoft has done their homework on .NET integration. This is another good example of where testing shows results that are often counter to what you would expect.
The C# version also has various lines commented out if you choose to use the RegEx library or the simple string concatenate operator for formatting the telephone number. One way to handle international phone numbers would be to store the RegEx pattern and evaluator strings in a table keyed by country. This table could then be linked to the phone number and the appropriate formatting codes passed to the function. I didn’t go this far with the design, but I did try using the RegEx method of formatting to test the performance over using the String.Format
method. As you can see, it performed about the same as the TSQL version, so the overhead of the RegEx object was significant.
Conclusion
Using TSQL to write user defined functions can be awkward if it requires a fair amount of string manipulation or complicated logic. Using the .NET CLR integration allows you to easily use the full power of the .NET library and probably get better performance than with TSQL.
In my first article, it was stated that the small details make all the difference. By paying attention to proper design and doing a little bit of testing, we have cut the data footprint of storing a phone number in half as well as cut the CPU requirements for formatting the number in half. Small changes like these can make all the difference when it comes to how well your applications can scale.
Синтаксис
-
Wild Card с%: SELECT * FROM [table] WHERE [имя_столбца] Как «% Value%»
Wild Card with _: SELECT * FROM [table] WHERE [column_name] Как ‘V_n%’
Wild Card с [charlist]: SELECT * FROM [table] WHERE [column_name] Как ‘V [abc] n%’
замечания
Условие LIKE в предложении WHERE используется для поиска значений столбцов, соответствующих данному шаблону. Шаблоны формируются с использованием следующих двух подстановочных знаков
- % (Процентный символ) — Используется для представления ноль или более символов
- _ (Underscore) — используется для представления одного символа
Матч открытого шаблона
Подстановочный знак %
добавленный к началу или концу (или обеим) строки, будет содержать 0 или более символов до начала или после окончания шаблона.
Использование «%» в середине позволит совместить 0 или более символов между двумя частями шаблона.
Мы собираемся использовать эту таблицу сотрудников:
Я бы | FName | LName | Номер телефона | ManagerID | DepartmentID | Оплата труда | Дата приема на работу |
---|---|---|---|---|---|---|---|
1 | Джон | Джонсон | 2468101214 | 1 | 1 | 400 | 23-03-2005 |
2 | Софи | Amudsen | 2479100211 | 1 | 1 | 400 | 11-01-2010 |
3 | Ronny | кузнец | 2462544026 | 2 | 1 | 600 | 06-08-2015 |
4 | Джон | Sanchez | 2454124602 | 1 | 1 | 400 | 23-03-2005 |
5 | Хильде | сук | 2468021911 | 2 | 1 | 800 | 01-01-2000 |
Следующие утверждения соответствуют всем записям, содержащим FName, содержащие строку «on» из таблицы Employees.
SELECT * FROM Employees WHERE FName LIKE '%on%';
Я бы | FName | LName | Номер телефона | ManagerID | DepartmentID | Оплата труда | Дата приема на работу |
---|---|---|---|---|---|---|---|
3 | R на ny | кузнец | 2462544026 | 2 | 1 | 600 | 06-08-2015 |
4 | J on | Sanchez | 2454124602 | 1 | 1 | 400 | 23-03-2005 |
Следующий оператор соответствует всем записям, имеющим PhoneNumber, начиная со строки «246» от Employees.
SELECT * FROM Employees WHERE PhoneNumber LIKE '246%';
Я бы | FName | LName | Номер телефона | ManagerID | DepartmentID | Оплата труда | Дата приема на работу |
---|---|---|---|---|---|---|---|
1 | Джон | Джонсон | 246 8101214 | 1 | 1 | 400 | 23-03-2005 |
3 | Ronny | кузнец | 246 2544026 | 2 | 1 | 600 | 06-08-2015 |
5 | Хильде | сук | 246 8021911 | 2 | 1 | 800 | 01-01-2000 |
Следующий оператор соответствует всем записям с номером PhoneNumber, заканчивающимся на строку «11» от Employees.
SELECT * FROM Employees WHERE PhoneNumber LIKE '%11'
Я бы | FName | LName | Номер телефона | ManagerID | DepartmentID | Оплата труда | Дата приема на работу |
---|---|---|---|---|---|---|---|
2 | Софи | Amudsen | 24791002 11 | 1 | 1 | 400 | 11-01-2010 |
5 | Хильде | сук | 24680219 11 | 2 | 1 | 800 | 01-01-2000 |
Все записи, где 3-й символ Fname — «n» от сотрудников.
SELECT * FROM Employees WHERE FName LIKE '__n%';
(два символа подчеркивания используются до «n», чтобы пропустить первые 2 символа)
Я бы | FName | LName | Номер телефона | ManagerID | DepartmentID | Оплата труда | Дата приема на работу |
---|---|---|---|---|---|---|---|
3 | Ronny | кузнец | 2462544026 | 2 | 1 | 600 | 06-08-2015 |
4 | Джон | Sanchez | 2454124602 | 1 | 1 | 400 | 23-03-2005 |
Совпадение одного символа
Чтобы расширить выбор оператора структурированного запроса (SQL-SELECT), можно использовать подстановочные знаки, знак процента (%) и подчеркивание (_).
Символ _
(подчеркивание) может использоваться в качестве подстановочного знака для любого отдельного символа в совпадении с шаблоном.
Найдите всех сотрудников, чье Fname начинается с «j» и заканчивается на «n» и имеет ровно 3 символа в Fname.
SELECT * FROM Employees WHERE FName LIKE 'j_n'
_
(подчеркивание) также может использоваться более одного раза в качестве дикой карты для соответствия шаблонам.
Например, этот шаблон будет соответствовать «jon», «jan», «jen» и т. Д.
Эти имена не будут отображаться «jn», «john», «jordan», «justin», «jason», «julian», «jillian», «joann», потому что в нашем запросе используется один знак подчеркивания, и он может пропустить точно один символ, поэтому результат должен иметь 3 символа Fname.
Например, этот шаблон будет соответствовать «LaSt», «LoSt», «HaLt» и т. Д.
SELECT * FROM Employees WHERE FName LIKE '_A_T'
Соответствие диапазону или набору
Сопоставьте любой отдельный символ в указанном диапазоне (например: [af]
) или установите (например: [abcdef]
).
Этот шаблон диапазона будет соответствовать «gary», но не «mary»:
SELECT * FROM Employees WHERE FName LIKE '[a-g]ary'
Этот шаблон будет соответствовать «mary», но не «gary»:
SELECT * FROM Employees WHERE Fname LIKE '[lmnop]ary'
Диапазон или набор можно также отменить, добавив ^
каретку перед диапазоном или установить:
Этот шаблон диапазона не будет соответствовать «gary», но будет соответствовать «mary»:
SELECT * FROM Employees WHERE FName LIKE '[^a-g]ary'
Этот шаблон набора не будет соответствовать «mary», но будет соответствовать «gary»:
SELECT * FROM Employees WHERE Fname LIKE '[^lmnop]ary'
Матч ЛЮБОЙ против ВСЕХ
Совпадение:
Необходимо совместить хотя бы одну строку. В этом примере тип продукта должен быть «электроникой», «книгами» или «видео».
SELECT *
FROM purchase_table
WHERE product_type LIKE ANY ('electronics', 'books', 'video');
Все совпадение (должно соответствовать всем требованиям).
В этом примере должны быть согласованы как «объединенное королевство», так и «лондон» и «восточная дорога» (включая вариации).
SELECT *
FROM customer_table
WHERE full_address LIKE ALL ('%united kingdom%', '%london%', '%eastern road%');
Отрицательный выбор:
Используйте ВСЕ, чтобы исключить все элементы.
В этом примере приводятся все результаты, когда тип продукта не является «электроникой», а не «книгами», а не «видео».
SELECT *
FROM customer_table
WHERE product_type NOT LIKE ALL ('electronics', 'books', 'video');
Поиск диапазона символов
Следующий оператор соответствует всем записям, имеющим FName, которое начинается с буквы от A до F из таблицы Employees .
SELECT * FROM Employees WHERE FName LIKE '[A-F]%'
Выписка ESCAPE в LIKE-запросе
Если вы реализуете текстовый поиск как LIKE
-query, вы обычно делаете это так:
SELECT *
FROM T_Whatever
WHERE SomeField LIKE CONCAT('%', @in_SearchText, '%')
Однако (помимо того, что вы не должны использовать LIKE
когда вы можете использовать полнотекстовый поиск), это создает проблему, когда кто-то вводит текст типа «50%» или «a_b».
Таким образом (вместо перехода на полнотекстовый поиск) вы можете решить эту проблему с помощью инструкции LIKE
-escape:
SELECT *
FROM T_Whatever
WHERE SomeField LIKE CONCAT('%', @in_SearchText, '%') ESCAPE ''
Это означает, что теперь будет рассматриваться как символ ESCAPE. Это означает, что теперь вы можете просто добавить
к каждому символу в строке, которую вы ищете, и результаты будут корректными, даже если пользователь вводит специальный символ, например
%
или _
.
например
string stringToSearch = "abc_def 50%";
string newString = "";
foreach(char c in stringToSearch)
newString += @"" + c;
sqlCmd.Parameters.Add("@in_SearchText", newString);
// instead of sqlCmd.Parameters.Add("@in_SearchText", stringToSearch);
Примечание. Вышеупомянутый алгоритм предназначен только для демонстрационных целей. Он не будет работать в случаях, когда 1 графема состоит из нескольких символов (utf-8). например string stringToSearch = "Les Miseu0301rables";
Вам нужно будет сделать это для каждой графемы, а не для каждого персонажа. Вы не должны использовать вышеуказанный алгоритм, если имеете дело с азиатскими / восточно-азиатскими / южноазиатскими языками. Вернее, если вам нужен правильный код для начала, вы должны просто сделать это для каждого graphemeCluster.
См. Также ReverseString, вопрос интервью с C #
Подстановочные знаки
подстановочные символы используются с оператором SQL LIKE. Подстановочные знаки SQL используются для поиска данных в таблице.
Подстановочные знаки в SQL:%, _, [charlist], [^ charlist]
% — заменить ноль или более символов
Eg: //selects all customers with a City starting with "Lo"
SELECT * FROM Customers
WHERE City LIKE 'Lo%';
//selects all customers with a City containing the pattern "es"
SELECT * FROM Customers
WHERE City LIKE '%es%';
_ — заменить один символ
Eg://selects all customers with a City starting with any character, followed by "erlin"
SELECT * FROM Customers
WHERE City LIKE '_erlin';
[charlist] — устанавливает и диапазоны символов для соответствия
Eg://selects all customers with a City starting with "a", "d", or "l"
SELECT * FROM Customers
WHERE City LIKE '[adl]%';
//selects all customers with a City starting with "a", "d", or "l"
SELECT * FROM Customers
WHERE City LIKE '[a-c]%';
[^ charlist] — Соответствует только символу, не указанному в скобках
Eg://selects all customers with a City starting with a character that is not "a", "p", or "l"
SELECT * FROM Customers
WHERE City LIKE '[^apl]%';
or
SELECT * FROM Customers
WHERE City NOT LIKE '[apl]%' and city like '_%';