Python for Designers

by Roberto Arista

Transform Strings

Formatting Strings

Python supports multiple ways to format strings. By formatting I mean building new strings out of different kinds of values. The most common are the old school %-formatting, the .format() method and f-strings, introduced in Python 3.6. In this manual, we will look into the latest one, given its minimal syntax and high flexibility.
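
As a quick comparison, the sketch below builds the same sentence with the three approaches mentioned above. The names `item` and `amount` and their values are assumptions made up for this illustration.

```python
# hypothetical values, used only for illustration
item = 'glyphs'
amount = 256

# old school %-formatting
old_style = 'The font contains %d %s' % (amount, item)

# .format() method
format_method = 'The font contains {} {}'.format(amount, item)

# f-string
f_string = f'The font contains {amount} {item}'

# the three approaches build the same string
print(f_string)                                # The font contains 256 glyphs
print(old_style == format_method == f_string)  # True
```

The f-string version keeps each value right where it appears in the sentence, which is why it tends to be the easiest to read.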

Formatting strings with f-strings is as easy as defining a standard string literal. Look at this example:

myName = 'Roberto'
f'Hi! My name is {myName}'
# Hi! My name is Roberto

Moreover, f-strings allow you to include full Python expressions like:

a = 10
b = 20
f'The mid value is {a+(b-a)*.5}'
# The mid value is 15.0

Don't forget to prefix the f-string with an 'f' or 'F', otherwise it will be treated as a regular string. Look:

a = 10
b = 20
'The mid value is {a+(b-a)*.5}'
# The mid value is {a+(b-a)*.5}

As you may have already noticed, an expression inside an f-string must be surrounded by braces:

f'The mid value is a+(b-a)*.5'
# The mid value is a+(b-a)*.5

Otherwise, the expression is not evaluated.

f-strings and .format() share the same format specifier mini-language, which can be summarized as follows:

f'{[expression]:[width][type]}'

This mini-language allows us to instruct Python precisely on how to format the data in the string, by adding some extra information after a colon. For example:

# we import Euler's number from the math module
from math import e
# then we print its value with two digits after the period
print(f'euler: {e:.2f}')         # euler: 2.72

This step is optional, so if we omit any extra instruction, Python will use a default conversion.

# we import Euler's number from the math module
from math import e
# then we print the constant value
print(f'euler: {e}') # euler: 2.718281828459045
# note the different amount of digits after the period
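
Since the mini-language is shared, the same specifier should give the same result wherever it is used. The following is a minimal sketch comparing an f-string, the .format() method and the built-in format() function. The variable names are assumptions chosen for this example.

```python
from math import e

# the same format specifier '.2f' used in three different places
with_f_string = f'{e:.2f}'
with_format_method = '{:.2f}'.format(e)
with_builtin = format(e, '.2f')

print(with_f_string, with_format_method, with_builtin)
# 2.72 2.72 2.72
```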

[width] controls padding. Combined with an optional fill character and an alignment symbol (<, > or ^), it allows extra characters to be appended to the right

message = 'hello'
f'{message:+<10}'
# hello+++++

or to the left

message = 'hello'
f'{message:>10}'
#      hello

In the cases above, fill characters are added until the length of 10 characters is reached: '+' in the first example and white spaces, the default, in the second.

[width] also allows you to center a string within a certain number of characters:

print(f"{'a':^10}")
print(f"{'bcd':^10}")
print(f"{'efghi':^10}")
print(f"{'jklmnop':^10}")
print(f"{'qrstu':^10}")
print(f"{'vwx':^10}")
print(f"{'y':^10}")

#    a
#   bcd
#  efghi
# jklmnop
#  qrstu
#   vwx
#    y

You can easily define which fill character should be used by the interpreter:

print(f'{"hello":@<10}')
# hello@@@@@
print(f'{"hello":_>10}')
# _____hello
print(f'{"hello":-^10}')
# --hello---
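
Padding becomes handy when several lines have to line up in columns. Here is a small sketch under made-up data: the glyph names and advance widths in `glyphs` are hypothetical values, not taken from a real font.

```python
# hypothetical data: glyph names with their advance widths
glyphs = [('A', 680), ('space', 250), ('ampersand', 712)]

lines = []
for name, advance in glyphs:
    # name padded to the right with dots, number aligned to the right
    lines.append(f'{name:.<12}{advance:>5}')

for line in lines:
    print(line)
# A...........  680
# space.......  250
# ampersand...  712
```

Each line ends up 17 characters long, so the numbers share the same right edge when set in a monospaced font.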

If no instruction for [type] is defined, Python will use the basic string representation for the value provided. For example, integers will be represented using base 10 notation, but it is possible to specify a different base by adding [type]. What follows is a list of the possible conversion options for integer values.

Type  Meaning          Output
'b'   Binary format    Outputs the number in base 2
'c'   Character        Converts the integer to the corresponding Unicode character before printing
'd'   Decimal integer  Outputs the number in base 10
'o'   Octal format     Outputs the number in base 8
'x'   Hex format       Outputs the number in base 16, using lowercase letters for the digits above 9
'X'   Hex format       Outputs the number in base 16, using uppercase letters for the digits above 9
'n'   Number           This is the same as 'd', except that it uses the current locale setting to insert the appropriate number separator characters
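
To get a feeling for these options, the sketch below formats a single integer with most of the conversion types listed in the table. The value 242 is an arbitrary choice, and the `results` dictionary is only a convenient container assumed for this example.

```python
value = 242

# one entry per conversion type from the table ('n' is left out
# because its output depends on the current locale)
results = {
    'b': f'{value:b}',
    'c': f'{value:c}',
    'd': f'{value:d}',
    'o': f'{value:o}',
    'x': f'{value:x}',
    'X': f'{value:X}',
}

for type_code, text in results.items():
    print(type_code, text)
# b 11110010
# c ò
# d 242
# o 362
# x f2
# X F2
```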

Consider the following examples:

value = 242
f'{value:.^12b}'
# '..11110010..'

242 is converted to 11110010. The binary representation is then centered in a string of length 12 using '.' as the fill character.

value = 65
f'H{value:c}H'
# 'HAH'

65 in the Unicode mapping points to the uppercase 'A', which is then joined to the string literal.

value = 200
f'{value:>+6d}'
# '  +200'

The plus in front of the string’s width forces the interpreter to put a sign in front of the decimal integer even if positive. Note that the plus can be applied to any numerical data conversion. 200 is displayed in base 10, and some white spaces are put in front of the sign until the string reaches length 6.

value = 379
f'U+{value:0>4X}'
# 'U+017B'

This converts 379₁₀ to 17B₁₆, using uppercase letters. Then it adds an extra 0 character in front of the hexadecimal representation to reach length 4. Finally, it is joined to the string literal 'U+'.
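
The same specifier can be wrapped in a small function to print the code point of any character. This is only a sketch: `unicode_notation` is a hypothetical helper name, and it relies on the built-in ord() function to obtain the integer value of a character.

```python
def unicode_notation(character):
    # hypothetical helper: ord() returns the integer code point of a character
    return f'U+{ord(character):0>4X}'

print(unicode_notation('A'))  # U+0041
print(unicode_notation('Ż'))  # U+017B
```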

Now, let’s look at a selection of options for floating-point numbers:

Type  Meaning      Output
'f'   Fixed point  Displays the number as a fixed-point number. The default precision is 6.
'n'   Number       It uses the current locale setting to insert the appropriate number separator characters. If the number is too large, it switches to scientific notation.
'%'   Percentage   Multiplies the number by 100 and displays in fixed ('f') format, followed by a percent sign.

Consider the following examples:

value = 12.2345
f'{value:.2f}' # 12.23

'f' in combination with '.2' will output a floating-point representation whose precision is limited to 2 digits after the dot.

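# US standard
import locale
# the default 'C' locale inserts no separators, so we set the locale first
# (locale names such as 'en_US' depend on the operating system)
locale.setlocale(locale.LC_ALL, 'en_US')
value = 2345.67
f'{value:n}'
# 2,345.67

# Italian standard
locale.setlocale(locale.LC_ALL, 'it_IT')
f'{value:n}'
# 2345,67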

This conversion method can be useful when typesetting languages other than English. Check the documentation of the locale module in the Python standard library.

value = .45
f'{value:.0%}' # 45%

The '%' conversion type converts the floating-point number to a percentage. '.0' rounds the percentage to zero decimal places (45% instead of 45.000000%).
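
The effect of the precision on a percentage is easier to see with a value that does not divide evenly. In this sketch, `drawn` and `planned` are hypothetical numbers invented for the example.

```python
# hypothetical values: glyphs drawn so far out of a planned character set
drawn = 87
planned = 256
ratio = drawn / planned   # 0.33984375

no_decimals = f'{ratio:.0%}'
one_decimal = f'{ratio:.1%}'
default_precision = f'{ratio:%}'

print(no_decimals)        # 34%
print(one_decimal)        # 34.0%
print(default_precision)  # 33.984375%
```

Note that 33.984375 becomes 34, not 33: the value is rounded to the requested precision.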

Useful String Methods

Python provides a number of specific methods to transform text data. Remember that strings are immutable, so they are not manipulated “in place”. The following methods generate a brand new string that you have to assign to an identifier if you need to use their output afterwards.

Method          Input        Output
s.capitalize()  hello        Hello
s.lower()       HELLO        hello
s.swapcase()    Hello        hELLO
s.title()       hello world  Hello World
s.upper()       hello        HELLO
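
Because strings are immutable, calling one of these methods without storing the result has no visible effect. The following sketch, with arbitrary variable names, shows the difference.

```python
word = 'hello'

word.upper()          # the new string is discarded, word is untouched
print(word)           # hello

shout = word.upper()  # assign the new string to an identifier to keep it
print(shout)          # HELLO
print(word)           # hello
```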

Python also provides a number of methods to inspect text data. These methods return a boolean value; their identifiers describe the condition they test.

Method            Behaviour
s.islower()       Return True if s is lowercase
s.istitle()       Return True if s is title cased ('Hello World')
s.isupper()       Return True if s is uppercase
s.startswith(s2)  Return True if s starts with s2
s.endswith(s2)    Return True if s ends with s2
s.isalnum()       Return True if s is alphanumeric (A-Z, a-z, 0-9, no white spaces)
s.isalpha()       Return True if s is alphabetic (A-Z, a-z, no white spaces)
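
Since these methods return a boolean, they fit naturally into an if statement. Here is a minimal sketch that filters a list of file names. The names in `file_names` are hypothetical.

```python
# hypothetical file names
file_names = ['Regular.ufo', 'notes.txt', 'Bold.ufo', 'README']

sources = []
for name in file_names:
    # keep only the names ending with the '.ufo' extension
    if name.endswith('.ufo'):
        sources.append(name)

print(sources)                  # ['Regular.ufo', 'Bold.ufo']
print('Hello World'.istitle())  # True
print('abc123'.isalnum())       # True
print('abc 123'.isalnum())      # False
print('abc123'.isalpha())       # False
```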

You can locate a substring within a string using the following methods:

Method             Behaviour
s.index(s2, i, j)  Return the index of the first occurrence of s2 in s after index i and before index j (raise ValueError if not found)
s.rindex(s2)       Return the highest index of s2 in s (raise ValueError if not found)
s.find(s2)         Return the lowest index of s2 in s (return -1 if not found)
s.rfind(s2)        Return the highest index of s2 in s (return -1 if not found)
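
The main difference between the two families shows up when the substring is missing. The sketch below uses an arbitrary sentence to compare them.

```python
sentence = 'type design and graphic design'

first = sentence.find('design')
last = sentence.rfind('design')
after_six = sentence.index('design', 6)
missing = sentence.find('python')

print(first)      # 5
print(last)       # 24
print(after_six)  # 24
print(missing)    # -1

# index() raises an error instead of returning -1
try:
    sentence.index('python')
except ValueError:
    print('substring not found')
```

find() is convenient when a missing substring is an expected case, while index() makes the problem impossible to ignore.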

Or, you can generate new strings (or lists of strings) using the following common methods:

Method                    Behaviour
s.join('123')             Return the items of the iterable '123' joined with s as separator. If s is 'hello': '1hello2hello3'
s.split(sep, maxsplit)    Return a list of the substrings of s split at the separator sep. Optionally, maxsplit defines the maximum number of splits to perform
s.splitlines()            Return a list of the lines in s. If s is 'hello\nworld': ['hello', 'world']
s.replace(s2, s3, count)  Return a copy of s with s2 replaced by s3 at most count times
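
These methods are often combined: a string is split into pieces, the pieces are processed and then joined back together. The following sketch uses an arbitrary sentence to show each method at work.

```python
sentence = 'the quick brown fox'

words = sentence.split(' ')
first_and_rest = sentence.split(' ', 1)
hyphenated = '-'.join(words)
replaced_once = sentence.replace('o', '0', 1)
lines = 'hello\nworld'.splitlines()

print(words)           # ['the', 'quick', 'brown', 'fox']
print(first_and_rest)  # ['the', 'quick brown fox']
print(hyphenated)      # the-quick-brown-fox
print(replaced_once)   # the quick br0wn fox
print(lines)           # ['hello', 'world']
```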