Transform Strings

Formatting Strings

Python supports multiple ways to format strings. By formatting I mean building new strings out of different kinds of values. The most common are the old-school %-formatting, the .format() method, and f-strings, which were introduced in Python 3.6. In this manual, we will look into the latest way, given its minimal syntax and high flexibility.
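
For comparison, here is a minimal sketch of the three approaches producing the same string. The variable names and values are illustrative only:

```python
# Illustrative values, not taken from the manual's own examples
name = 'Roberto'
age = 30

# old-school %-formatting
old_style = 'My name is %s and I am %d' % (name, age)

# the .format() method
format_method = 'My name is {} and I am {}'.format(name, age)

# f-string: the values are referenced directly inside the braces
f_string = f'My name is {name} and I am {age}'

print(old_style)
print(format_method)
print(f_string)
# each line prints: My name is Roberto and I am 30
```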

Formatting a string with an f-string is as easy as writing a standard string literal with an 'f' prefix. Look at this example:

myName = 'Roberto'
f'Hi! My name is {myName}'
# Hi! My name is Roberto

Moreover, f-strings allow you to include full Python expressions:

a = 10
b = 20
f'The mid value is {a+(b-a)*.5}'
# The mid value is 15.0

Don't forget to prefix the f-string with an 'f' or 'F'; otherwise it is treated as a regular string. Look:

a = 10
b = 20
'The mid value is {a+(b-a)*.5}'
# The mid value is {a+(b-a)*.5}

As you may have already noticed, the expression inside the f-string must be surrounded by braces:

f'The mid value is a+(b-a)*.5'
# The mid value is a+(b-a)*.5

Otherwise, the expression is not evaluated.

f-strings and .format() share the same format specifier mini-language, which can be summarized in a simplified form as follows:

f'{[expression]:[width][type]}'

This mini-language allows us to specify precisely how the data is formatted in the string by adding some extra information after a colon. For example:

# we import Euler's number from the math module
from math import e
# then we print its value with two digits after the period
print(f'euler: {e:.2f}')         # euler: 2.72

This step is optional: if we omit the extra instructions, Python uses the default conversion for the value.

# we import Euler's number from the math module
from math import e
# then we print the constant value
print(f'euler: {e}') # euler: 2.718281828459045
# note the different amount of digits after the period
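
Since the mini-language is shared, the same specifier should give the same result with an f-string, with the .format() method, and with the built-in format() function. A small sketch (the variable names are illustrative):

```python
from math import e

# the same '.2f' specifier used in three different ways
with_fstring = f'{e:.2f}'
with_method = '{:.2f}'.format(e)
with_builtin = format(e, '.2f')

print(with_fstring, with_method, with_builtin)
# 2.72 2.72 2.72
```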

[width] controls padding. For example, it can append extra characters to the right:

message = 'hello'
f'{message:+<10}'
# hello+++++

or to the left:

message = 'hello'
f'{message:>10}'
#      hello

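In the first case, '+' characters are appended until the string is 10 characters long. In the second case no fill character is given, so white spaces are added in front of the text until the length of 10 characters is reached.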

[width] also allows you to center a string within a given number of characters:

print(f"{'a':^10}")
print(f"{'bcd':^10}")
print(f"{'efghi':^10}")
print(f"{'jklmnop':^10}")
print(f"{'qrstu':^10}")
print(f"{'vwx':^10}")
print(f"{'y':^10}")

#    a
#   bcd
#  efghi
# jklmnop
#  qrstu
#   vwx
#    y

You can easily define which character is used to fill the extra space by placing it before the alignment symbol:

print(f'{"hello":@<10}')
# hello@@@@@

print(f'{"hello":_>10}')
# _____hello

print(f'{"hello":-^10}')
# --hello---
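
Combining fill, alignment, and width is handy for lining up columns of text. The following is a minimal sketch with made-up data (the `rows` list and its values are hypothetical):

```python
# hypothetical data: a label and an amount per row
rows = [('apple', 3), ('kiwi', 12), ('banana', 104)]

lines = []
for label, amount in rows:
    # label: left-aligned, padded with dots to 10 characters
    # amount: right-aligned, padded with spaces to 5 characters
    lines.append(f'{label:.<10}{amount:>5}')

print('\n'.join(lines))
# apple.....    3
# kiwi......   12
# banana....  104
```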

If no [type] is specified, Python uses the default string representation of the value provided. For example, integers are represented in base 10, but you can conveniently choose a different base by adding [type]. What follows is a list of the possible conversion options for integer values.

Type	Meaning	Output
`'b'`	Binary format	Outputs the number in base 2
`'c'`	Character	Converts the integer to the corresponding Unicode character before printing
`'d'`	Decimal Integer	Outputs the number in base 10
`'o'`	Octal format	Outputs the number in base 8
`'x'`	Hex format	Outputs the number in base 16, using lowercase letters for the digits above 9
`'X'`	Hex format	Outputs the number in base 16, using uppercase letters for the digits above 9
`'n'`	Number	This is the same as `'d'`, except that it uses the current locale setting to insert the appropriate number separator characters

Consider the following examples:

value = 242
f'{value:.^12b}'
# '..11110010..'

242 is converted to 11110010. The binary representation is then centered in a string of length 12, using '.' as the fill character.

value = 65
f'H{value:c}H'
# 'HAH'

In the Unicode mapping, 65 points to the uppercase 'A', which is then joined to the string literal.

value = 200
f'{value:>+6d}'
# '  +200'

The plus in front of the string’s width forces the interpreter to put a sign in front of the decimal integer even if positive. Note that the plus can be applied to any numerical data conversion. 200 is displayed in base 10, and some white spaces are put in front of the sign until the string reaches length 6.

value = 379
f'U+{value:0>4X}'
# 'U+017B'

This converts 379₁₀ to 17B₁₆, using uppercase letters. Then it adds an extra 0 character in front of the hexadecimal representation to reach length 4. Finally, the result is joined to the string literal 'U+'.
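
The conversion types from the table can also be compared side by side on a single value. A small sketch, reusing 242 from the first example (the variable names are illustrative):

```python
value = 242

binary = f'{value:b}'     # base 2
octal = f'{value:o}'      # base 8
decimal = f'{value:d}'    # base 10
hex_lower = f'{value:x}'  # base 16, lowercase digits
hex_upper = f'{value:X}'  # base 16, uppercase digits

print(binary, octal, decimal, hex_lower, hex_upper)
# 11110010 362 242 f2 F2

# int() with an explicit base converts each representation back
print(int(binary, 2), int(octal, 8), int(hex_lower, 16))
# 242 242 242
```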

Now, let’s look at a selection of options for floating-point numbers:

Type	Meaning	Output
`'f'`	Fixed point	Displays the number as a fixed-point number. The default precision is 6.
`'n'`	Number	It uses the current locale setting to insert the appropriate number separator characters. If the number is too large, it switches to scientific notation.
`'%'`	Percentage	Multiplies the number by 100 and displays in fixed (`'f'`) format, followed by a percent sign.

Consider the following examples:

value = 12.2345
f'{value:.2f}' # 12.23

'f' in combination with '.2' outputs a fixed-point representation whose precision is limited to 2 digits after the decimal point.

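# US standard (without setlocale, the default 'C' locale adds no separators)
import locale
locale.setlocale(locale.LC_ALL, 'en_US')
value = 2345.67
f'{value:n}'
# 2,345.67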

# Italian standard
import locale
locale.setlocale(locale.LC_ALL, 'it_IT')
f'{value:n}'
# 2345,67

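This conversion type is useful when typesetting text in languages other than English. The locale names and the separators they produce depend on the locale data installed on your system, so the exact output may vary across platforms. Check the documentation of the locale module in the Python standard library.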

value = .45
f'{value:.0%}' # 45%

The '%' conversion type converts the floating-point number to a percentage. '.0' rounds the percentage to zero decimal digits (45% instead of 45.000000%).
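
The default precision of 6 mentioned in the table is easy to verify. A minimal sketch, using 0.5 as an arbitrary value (the variable names are illustrative):

```python
value = 0.5

# no precision given: 'f' and '%' both default to 6 decimal digits
default_fixed = f'{value:f}'
default_percent = f'{value:%}'

# an explicit precision overrides the default
one_digit_percent = f'{value:.1%}'

# width and precision can be combined; numbers are right-aligned by default
padded = f'{value:8.3f}'

print(default_fixed)      # 0.500000
print(default_percent)    # 50.000000%
print(one_digit_percent)  # 50.0%
print(repr(padded))       # '   0.500'
```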

Useful String Methods

Python provides a number of specific methods to transform text data. Remember that strings are immutable, so they are not manipulated “in place”. The following methods generate a brand new string that you have to assign to an identifier if you need to use their output afterwards.

Method	Input	Output
`s.capitalize()`	`hello`	`Hello`
`s.lower()`	`HELLO`	`hello`
`s.swapcase()`	`Hello`	`hELLO`
`s.title()`	`hello world`	`Hello World`
`s.upper()`	`hello`	`HELLO`
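
A short sketch of what immutability means in practice, assuming an arbitrary example string: the method returns a new string and the original stays untouched until you reassign the identifier.

```python
text = 'hello world'

# upper() returns a new string; text itself is not modified
shouted = text.upper()
print(text)     # hello world
print(shouted)  # HELLO WORLD

# to "change" text, assign the result back to the same identifier
text = text.title()
print(text)     # Hello World
```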

Python also provides a number of methods to inspect text data. These methods return a boolean value, and their names describe the condition they test:

Method	Behaviour
`s.islower()`	return `True` if all the cased characters in `s` are lowercase
`s.istitle()`	return `True` if `s` is title cased (`'Hello World'`)
`s.isupper()`	return `True` if all the cased characters in `s` are uppercase
`s.startswith(s2)`	return `True` if `s` starts with `s2`
`s.endswith(s2)`	return `True` if `s` ends with `s2`
`s.isalnum()`	return `True` if `s` is not empty and is alphanumeric (for example A-Z, a-z, 0-9; no white spaces)
`s.isalpha()`	return `True` if `s` is not empty and is alphabetic (for example A-Z, a-z; no white spaces)
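
These checks are typically used inside conditions. A minimal sketch with a hypothetical file name (the variable names are illustrative):

```python
# hypothetical file name used only for illustration
filename = 'Report2023.pdf'

is_pdf = filename.endswith('.pdf')
# startswith() is case-sensitive, so 'report' does not match 'Report'
starts_exact = filename.startswith('report')
starts_any_case = filename.lower().startswith('report')

print(is_pdf, starts_exact, starts_any_case)
# True False True

# a white space or a period makes isalnum() fail
print('Report2023'.isalnum())   # True
print('Report 2023'.isalnum())  # False
print('Report'.isalpha())       # True
print('Report'.istitle())       # True
```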

You can find occurrences of a substring within a string using the following methods:

Method	Behaviour
`s.index(s2, i, j)`	Return lowest index of `s2` in `s` within the slice `s[i:j]`; `i` and `j` are optional (raise `ValueError` if not found)
`s.rindex(s2)`	Return highest index of `s2` in `s` (raise `ValueError` if not found)
`s.find(s2)`	Return lowest index of `s2` in `s` (return `-1` if not found)
`s.rfind(s2)`	Return highest index of `s2` in `s` (return `-1` if not found)
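
The main practical difference between the two families is how a missing substring is reported. A small sketch with an arbitrary sentence (the variable names are illustrative):

```python
sentence = 'the cat and the hat'

first_the = sentence.find('the')    # lowest index
last_the = sentence.rfind('the')    # highest index
first_at = sentence.index('at')     # lowest index
last_at = sentence.rindex('at')     # highest index
print(first_the, last_the, first_at, last_at)
# 0 12 5 17

# find() signals a missing substring with -1
missing = sentence.find('dog')
print(missing)
# -1

# index() raises ValueError instead
try:
    sentence.index('dog')
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)
# True
```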

Or you can generate new strings (or lists of strings) using the following common methods:

Method	Behaviour
`s.join('123')`	Return the items of the iterable `'123'` joined by `s`; if `s` is `'hello'` → `'1hello2hello3'`
`s.split(sep, maxsplit)`	Return a list of the substrings of `s` separated by `sep`. Optionally, `maxsplit` sets the maximum number of splits to perform
`s.splitlines()`	Return a list of the lines in `s`; if `s` is `'hello\nworld'` → `['hello', 'world']`
`s.replace(s2, s3, count)`	Return a copy of `s` with `s2` replaced by `s3`, at most `count` times (`count` is optional)
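
A minimal sketch of these methods at work, using made-up strings (the variable names are illustrative):

```python
colors = 'red,green,blue,black'

# split on every comma
words = colors.split(',')
print(words)
# ['red', 'green', 'blue', 'black']

# split only once: the rest of the string stays together
first_and_rest = colors.split(',', 1)
print(first_and_rest)
# ['red', 'green,blue,black']

# join is called on the separator, not on the list
joined = ' / '.join(words)
print(joined)
# red / green / blue / black

lines = 'hello\nworld'.splitlines()
print(lines)
# ['hello', 'world']

# replace only the first two occurrences
replaced = 'a-b-c-d'.replace('-', '+', 2)
print(replaced)
# a+b+c-d
```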
