In this short article I run through some examples of using Excel’s text functions to manipulate or format data. This follows on from my other Excel related articles.
Quite often I use Excel to change data to the format that is required in a database and text functions come in very handy in all sorts of situations. I also use Excel to format data into HTML code for use in web pages and I’ll give an example of that.
In the descriptions below I refer to the values within the function as text, character, string or values and show them as text or other values in the examples but they could easily be cell references such as A12 or a named field such as First_Name.
UPPER, LOWER and PROPER
These functions are used to reformat some text. As you might imagine, UPPER converts the text characters to their uppercase equivalent so “jAmEs o’conNER” becomes “JAMES O’CONNER”. The same example using the LOWER function would produce “james o’conner” and PROPER, which stands for proper-case is useful for names as it would give “James O’Conner”. All text characters are converted using these functions so if you use PROPER on a sentence you would get the each word being treated as a name with its first character in upper case such as “The Moon Over Water”.
LEN and TRIM
These functions are different but I’ve grouped them together basically as they look at the length of a string of text and return something. There are other functions that could also be grouped with these but I wanted to cover these two first. LEN returns the number of characters in a string including all white space so a string of text with four spaces before and after it such as ” popular games ” would return 21. TRIM of the same example will remove the extra spaces from the beginning and end but not the space between the text. The LEN of the TRIM in the example would return 13.
LEFT, RIGHT and MID
These three functions return a proportion of the string of characters. You specify the text string and then the number of characters to return for both LEFT and RIGHT and it will return that number of characters from either the start in the case of LEFT or the end in the case of RIGHT. So taking the first 3 charactaers of “Patrick howe” using LEFT would show “pat” and the last 3 characters using
RIGHT would show “owe”. For LEFT and RIGHT if you omit the number of characters it assumes you mean 1 so RIGHT(“patrick howe”) would return “e”. For MID you have to specify the starting position from the LEFT and the number of characters so MID(“patrick howe”,1,3) would be the same as LEFT(“patrick howe”,3) and return “pat”. If we want to include the 3 charactaers from somewhere in the middle such as starting with the seventh character “k” then MID(“patrick howe”,7,3) would return “k h”.
FIND and SEARCH
Both of these functions work in exactly the same way and both return the position of a character or characters with a text string. The difference is that FIND is case-sensitive and SEARCH is not. So if you want to find the word “cow” in “the brown Cow was grazing with the other cows”, FIND would return 42 (the position of the “c” from the left) and SEARCH would return 11 (the position of the “C” from the left). You can also optionally tell the function where in the string of text to start trying to match and it would ignore all text before that position so starting at position 12 using the SEARCH function for “cow” would ignore “Cow” as it starts in position 11 and would return 42.
CONCATENATE and REPT
These functions join text together in different ways. CONCATENATE joins text together in the order that it is specified with each item separated by a comma, i.e. CONCATENATE(item1,item2,item3). There is another way to concatenate and that is to join all the text strings with an ampersand, i.e. “some”&” “&”text” would produce “some text”. The REPT function is used to repeat text or a character a certain number of times such as REPT(“a”,4) would return “aaaa”. One thing that I find useful is when a database requires a fixed length string but my data doesn’t match that. In the following example I need to have a name field that is exactly 10 characters but the name “BOB” is only 3 characters and “MARY” is 4 characters. You get the picture, the number of characters is variable so what I need to do is create a string with BOB and 7 spaces or MARY with 6 spaces after it. In this example I’ve repeated the period mark even though I would normally use a space as I wanted it to show up on the screenshot. We need to find the length of the text using the LEN function and take that away from the 10 characters required and then add that to the original text. CONCATENATE(“MARY”,REPT(“.”,10-LEN(“MARY”))) would produce “MARY……”.
CLEAN, REPLACE and SUBSTITUTE
These functions change text in different ways. If you know the position and length of the text to replace you can use the REPLACE function. It takes four arguments, the original text, the start position and length of the text to replace and the replacement text. If you do not know the position and length of the text to replace in the original text then you can use the SUBSTITUTE function which works as a combination of SEARCH and REPLACE. By default SUBSTITUTE replaces the all the instances it finds but you can specify which instance by using the optional instance parameter. The CLEAN function removes any non-printable characters so in the example, I have some text that spans two rows but CLEAN concatenates both rows into a single row.
CODE, CHAR, UNICODE and UNICHAR
All text is made up of code in either the standard character set or the Unicode character set. The CODE and UNICODE functions take in text, look at the first character and return the reference number of it. CODE(“a”) would return 97 as would CODE(“abc”). The CHAR and UNICHAR fucntions take in a number and return the character from the character set that represents it. CHAR(98) would return “b”.
VALUE, NUMBERVALUE, T and TEXT
These functions are all to do with formatting. VALUE converts text to a value format such as a number or date. In the example, VALUE(“09/09/2014”) would return 41891 which is the date-time equivalent of September 9, 2014. If I had formatted the cell t a date format then it would have shown correctly. NUMBERVALUE does something similar but allows you to specify the decimal point and thousands separator. In the example I have used NUMBERVALUE(“1,234.99″,”.”,”,”) but as the decimal and separator are the same as my default settings I could have used VALUE(“1,234.99”) to get the same result. The T function converts a value to text and as you can see in the example the result is left aligned in the same way as other text in this General formatted cell. The TEXT function allows you to specify a format and in the example I have specified a date format.
I hope that this article on text functions has been useful. If you want to download the example used here then click the link here Excel Text Functions.