Basic Data Types
What is data? Data is a Latin word currently used in English. It is the plural of datum. We can consider a datum as an atom of a bigger pool of information: the smallest piece, the minimum value we can use to build a bigger collection of quantities and relations. Data are our bricks in the act of coding: the size of a rectangle, the number of people in a country, an email address, the position of the mouse pointer on the screen, the location of a file on a server. Since there are different kinds of data, Python provides a series of data types: types that indicate quantities (int, float), whether a condition is true or not (bool), pieces of text (str), ordered collections of other data (list), the absence of data (None), unsorted collections (dict) and so on.
Python is a dynamically typed language, which means that the programmer does not have to declare in advance which data type is associated with an identifier. An identifier can be associated with any data type and then reassigned to another one. If you have experience with Processing/Java, you have probably noticed that this is not the case there, because Java is statically typed. In the Python world, values are linked to a specific type; identifiers are not.
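As a minimal sketch of dynamic typing, the snippet below binds one identifier first to an integer and then to a string. The names value, firstType and secondType are only illustrative; type() is the built-in function that reports the type of the object currently bound to a name.

```python
# The same identifier can be bound to values of different types over time.
value = 20                 # bound to an integer
firstType = type(value)

value = 'twenty'           # now bound to a string, no declaration needed
secondType = type(value)

print(firstType)
print(secondType)
```

The type travels with the value, so asking for the type of the same name gives a different answer after the reassignment.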
Before our journey into data types, we have to introduce the notion of mutability and its opposite, immutability. A data type is immutable if it has a fixed value that cannot be changed after its creation. On the contrary, a mutable object can be updated along the way. Think of a mutable object as a vase made of clay before being fired in the oven. You can still change its shape. An immutable object instead is already fired and therefore it is fixed in a permanent shape. If you want a different shape, you have to create a new vase.
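The clay vase metaphor can be sketched with two built-in containers. This example assumes a list as the mutable object and a tuple (a type not introduced above) as the immutable one; all the names are illustrative.

```python
# A list is mutable: it can be changed in place after its creation.
sizes = [10, 20]
sizes.append(30)              # same object, new content

# A tuple is immutable: any attempt to change it raises a TypeError.
point = (10, 20)
try:
    point[0] = 50
    tupleChanged = True
except TypeError:
    tupleChanged = False

# "Changing" an immutable value really means creating a new object.
movedPoint = (50, point[1])
print(sizes, point, movedPoint, tupleChanged)
```

The original tuple is left untouched, just like the fired vase: the different shape lives in a new object.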
Programming languages have the crucial ability to change the values assigned to variables during the execution of a program. Take into account that an identifier cannot be assigned to two different values at the same time. Consider parsing a spreadsheet. A spreadsheet is a table filled with numbers and text. It is a very simple database and you will use it quite often. If you need to read the data within the spreadsheet, you will probably start line by line. During the iteration of the lines of the table it would make sense to have an identifier being reassigned each time to only one line in order to load and then visualize data.
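The spreadsheet scenario could look roughly like the sketch below. It assumes the table is available as comma-separated text; the content, the column names and the identifiers are all hypothetical, and the standard library modules csv and io are used here only to keep the example self-contained.

```python
import csv
import io

# Hypothetical spreadsheet content, kept in memory instead of a file.
spreadsheet = io.StringIO('name,width,height\nsmall,10,20\nbig,40,80\n')

areas = []
reader = csv.reader(spreadsheet)
header = next(reader)          # the first line holds the column names
for row in reader:
    # "row" is reassigned to a different line at every iteration
    name, width, height = row
    areas.append((name, int(width) * int(height)))

print(header)
print(areas)
```

At any moment the identifier row refers to exactly one line of the table; when the loop ends, it is still bound to the last line that was read.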
A programmer can establish an alias by assigning another identifier to an existing object.
    newPage(100, 100)
    rectWidth = 20
    rectHeight = rectWidth
    rect(10, 10, rectWidth, rectHeight)
This means that both names refer to the same object and they can be used to access the object. If the object supports behaviors that can affect its state, meaning it is mutable, both names will reflect these changes. However, if one of the names is reassigned to a new value using a subsequent assignment statement, this will not affect the aliased object, it will only break the link with the alias. Consider this example:
    newPage(100, 100)
    rectWidth = 20
    rectHeight = rectWidth
    rect(10, 10, rectWidth, rectHeight)
    # here I reassign the height to a new value
    rectHeight = rectHeight + 10
    rect(50, 10, rectWidth, rectHeight)
Let’s observe line 6. The execution of this command begins with the evaluation of the expression on the right side of the assignment operator =. This expression is evaluated on the basis of the existing association of the name rectHeight. Since rectHeight is 20, rectHeight + 10 is 30. Integer values are immutable, so a new value (30) is created and associated, according to the assignment operator, to the name rectHeight. The name rectWidth is still associated with 20. We will see an alias example involving a mutable object in a few paragraphs. We are now ready to dive into Python data types.
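The same reasoning can be checked without any drawing command. The sketch below uses the is operator, which tells whether two names refer to the very same object; the names sameBefore and sameAfter are illustrative additions.

```python
rectWidth = 20
rectHeight = rectWidth                  # alias: two names, one object
sameBefore = rectHeight is rectWidth

rectHeight = rectHeight + 10            # a new integer object is created
sameAfter = rectHeight is rectWidth

print(rectWidth, rectHeight)
print(sameBefore, sameAfter)
```

After the reassignment the two names no longer share an object, and rectWidth keeps its original value.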
Boolean data types are used to manipulate logical values, and the only two possible instances they can refer to are True and False. These are literals, not strings! In fact, they are part of the reserved keywords list. Python provides a built-in function to create a boolean value starting from a non-boolean one: bool().
- A number is converted to False if it is equal to zero, True if it is different from zero.
- Sequences or other containers are converted to False if empty, to True if they have objects inside.
The boolean type is mostly used to describe a condition: is it black? is it a digit? did I reach the door? do I still have space on the page? and so on. This condition is then used in combination with control structures.
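The conversion rules can be tried directly with bool(). The second half of this sketch is a hypothetical page-layout check; pageHeight, cursorY, lineHeight and the messages are invented for illustration.

```python
# Numbers: zero is False, anything else is True.
print(bool(0), bool(0.0))
print(bool(7), bool(-0.5))

# Containers: empty is False, non-empty is True.
print(bool(''), bool([]), bool({}))
print(bool('text'), bool([0]), bool({'a': 1}))

# Hypothetical check: do I still have space on the page?
pageHeight = 100
cursorY = 95
lineHeight = 12
hasSpaceLeft = cursorY + lineHeight <= pageHeight
if hasSpaceLeft:
    message = 'keep writing'
else:
    message = 'start a new page'
print(hasSpaceLeft, message)
```

Note that a list containing a single zero is still True: what matters for a container is whether it is empty, not what it holds.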
Numbers (Integers and Floating-point)
Python provides two main types for describing quantities: integers and floating-point numbers. Aren’t numbers all the same? Well, yes and no. In programming there are several situations where you need precision, for example while scaling a drawing. In other cases, it makes no sense to describe a quantity using a fraction of a whole number, for example when you want to describe an iterative process. Can you repeat an action three and a half times? It could make some sense in natural language, because of the level of ambiguity allowed, but it certainly does not make sense to our Python interpreter.
In Python, an integer object is designated to represent whole numbers. The literal declaration of such a value is a sequence of digits with an optional polarity sign, for example 7, +7 or -7.
Sometimes it can be handy to describe an integer using a different base, such as binary, octal or hexadecimal. This can be done using 0 as a prefix followed by a character representing the base: 0b for binary, 0o for octal and 0x for hexadecimal. A literal written with one of these prefixes represents the same quantity as its base 10 counterpart, only with a different notational system.
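As an illustration, here is one quantity written in four notations. The number 42 and the identifiers are arbitrary choices made for this sketch; bin(), oct() and hex() are the built-in functions that go the other way and return a string.

```python
decimalValue = 42           # standard base 10 integer
binaryValue = 0b101010      # base 2, prefix 0b
octalValue = 0o52           # base 8, prefix 0o
hexValue = 0x2A             # base 16, prefix 0x

# All four names are bound to the same quantity.
allEqual = decimalValue == binaryValue == octalValue == hexValue
print(allEqual)

# From an integer back to its textual representation in each base.
print(bin(42), oct(42), hex(42))
```

Whatever notation is used in the source code, the interpreter stores a plain integer, so printing any of these names shows the base 10 form.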
Python provides a function to convert a value into an integer: int(). These are the behaviours you should expect:
- if a floating-point number is provided as argument, the interpreter will truncate the point and any following digit. For instance, 3.1 will become 3
- if a string is provided as argument, the interpreter will try to parse it and transform it to an integer value. The output of int('-3') is -3. Something like int('hello') will raise a ValueError
By default, int() uses base 10 for the conversion of a string. If you need a different base, you can pass it as a second optional argument.
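The behaviours of int() listed above can be sketched as follows. The sample values are arbitrary, and parseError is an illustrative name used to keep the error message.

```python
print(int(3.1))             # the fractional part is discarded
print(int(-3.9))            # truncation goes toward zero, not downward
print(int('-3'))            # a string of digits is parsed

# A string that cannot be parsed raises a ValueError.
try:
    int('hello')
except ValueError as error:
    parseError = str(error)
    print(parseError)

# Optional second argument: the base used to read the string.
print(int('101010', 2), int('52', 8), int('2A', 16))
```

The second argument only makes sense when the first one is a string, since a number has no notation left to interpret.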
How would you represent 30₁₀ using base 2, 8 and 16? Try to use the same visual conversion method from the diagrams above.
A float object in Python is used to represent a positive or negative decimal number. Its literal declaration is made of an optional polarity sign, digits, and a point optionally followed by other digits. So, both 3.0 and 3. are floating-point numbers.
Another way to declare a floating-point number is to use scientific notation. In Python the letter e separates the coefficient from the power of ten, so a number of the form a × 10^b is written aeb.
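A short sketch of scientific notation in Python follows; the two quantities and their names are arbitrary examples.

```python
speed = 3e8            # 3 × 10^8
tiny = 2.5e-3          # 2.5 × 10^-3

print(speed, tiny)

# A literal in scientific notation is a float, even for a whole quantity.
speedType = type(speed)
print(speedType)
```

If a whole number is needed, the result can be passed through int().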
Python provides a function to obtain a floating-point value from an existing object: float(). These are the behaviours you should expect:
- if the argument is an integer it will be converted to floating-point
- if the argument is a string, the interpreter will try to parse it and transform it to a floating-point value. You can expect float('-3.14') to be transformed to -3.14. Otherwise, if the interpreter encounters float('world'), it will raise a ValueError
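The conversions described in the list can be tried like this. The name conversionError is illustrative, and the third call is an extra case showing that strings in scientific notation are accepted too.

```python
print(float(3))               # an integer becomes a floating-point value
print(float('-3.14'))         # a numeric string is parsed
print(float('1e3'))           # scientific notation is accepted in strings

# A string that is not a number raises a ValueError.
try:
    float('world')
except ValueError as error:
    conversionError = str(error)
    print(conversionError)
```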
In Python the reserved keyword None expresses the absence of data. It can be used in various situations:
- creating an identifier without assigning any specific data to it. Something like saying “I am going to use this, I just do not know how yet”
- untying an identifier from a value without assigning it to anything else
- using None in a conditional statement to check whether an identifier is tied to a value or not.
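The three situations can be sketched in a few lines. The names selectedShape and status are hypothetical; comparing with is and is not is the usual idiom for checks against None.

```python
selectedShape = None                 # "I will use this, I just don't know how yet"

if selectedShape is None:
    status = 'nothing selected'
print(status)

selectedShape = 'rectangle'          # now tied to a value
if selectedShape is not None:
    status = 'selected: ' + selectedShape
print(status)

selectedShape = None                 # untied again
print(selectedShape)
```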
Note that if the call of a non-fruitful function (meaning that its body includes neither return nor yield statements) is assigned to an identifier, the Python interpreter will not raise any error; instead, it will assign the None value to the identifier. For example:
    def interpolateValue(a, b, factor):
        value = a + (b-a)*factor

    myValue = interpolateValue(10, 20, .5)
    print(myValue)
The value associated with myValue will be None, because interpolateValue has no return statement.
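For comparison, here is a sketch of a fruitful variant. It assumes the intent is to hand the interpolated number back to the caller, which the example above deliberately does not do.

```python
def interpolateValue(a, b, factor):
    value = a + (b - a) * factor
    return value              # hand the result back to the caller

myValue = interpolateValue(10, 20, .5)
print(myValue)
```

With the return statement in place, myValue is associated with the computed number instead of None.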