python data types

numeric, sequence, set, mapping, boolean

A data type is the category to which an object belongs. You can check the type of an object with the type() or isinstance() functions.
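As a small illustration of the two functions (the variable names here are made up for the example):

```python
# Inspect the type of an object in two ways.
value = 3.14
kind = type(value)                           # the class the object belongs to
is_float = isinstance(value, float)          # True when value is a float
is_number = isinstance(value, (int, float))  # a tuple checks several types at once

print(kind, is_float, is_number)
```

isinstance() is usually the more flexible check, since it also accepts a tuple of types.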

A data structure is a collection of values, possibly of different data types; while a data type holds a single value, a data structure holds a collection of data. Lists, sets, and dictionaries are the most common data structures. A string can be thought of as 'an array of characters'; a range holds a sequence of values and can be turned into a list, which is a data structure.

String

Any kind of text data; "" is an empty string, and type("") returns <class 'str'>
Convert an object to a string with the str() function
Strings are enclosed in quotation marks
Either single quotes (') or double quotes (") can be used to open and close a string, as long as both ends match
PEP 8 recommends a maximum of 79 characters per line of code
Use '\' at the end of a line to continue a statement on the next line; there cannot be any character or space after the backslash
Use '\n' to insert a new line; anything written after '\n' will be on a new line
Use '\t' to insert a tab
Choose the separator between printed strings with sep; e.g. print('string', 'string1', 'string2', sep='\t')
Choose how print() ends its output with end, default is '\n'; e.g. print('string', 'string1', 'string2', sep='\t', end=' ')
Raw string - backslashes are taken at face value, so '\n' or '\t' stay part of the actual string; e.g. a file name would be written as print(r'C:\some\name')
Escape character - '\'; used to insert characters that would otherwise be illegal or have a special meaning within a string; put the backslash right before the character; e.g. print("This is an \"escape\" so quotes are taken at face value") or print('C:\\some\\name')
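The escape, raw-string, and print() options above can be tried out with a short sketch like this one (the strings and variable names are only examples):

```python
# Newline and tab escapes inside ordinary strings.
two_lines = "first line\nsecond line"
tabbed = "name\tvalue"

# A raw string keeps backslashes exactly as they are written.
raw_path = r"C:\some\name"
# The same path written with escaped backslashes.
escaped_path = "C:\\some\\name"

# Escaped double quotes inside a double-quoted string.
quoted = "This is an \"escape\" so the quotes are kept."

print(two_lines)
print(tabbed)
print(raw_path == escaped_path)
print(quoted)

# sep sets what goes between the arguments, end sets what follows the last one.
print("string", "string1", "string2", sep="\t", end=" ")
print("same line")
```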

String formatting - the "%" operator is used to format a set of values enclosed in a tuple, together with a format string
- %s - insert a string
- %d - insert an integer (digit)
- %f - insert a floating-point number (6 digits after the decimal point by default)
- %.nf - insert a floating-point number with exactly n digits to the right of the decimal point
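A minimal sketch of the format codes listed above, using made-up values:

```python
name = "Ada"       # hypothetical values, for illustration only
age = 36
height = 1.6789

line = "%s is %d years old" % (name, age)   # values are passed as a tuple
default_float = "%f" % height               # six digits after the decimal point
two_places = "%.2f" % height                # rounded to two digits

print(line)
print(default_float)
print(two_places)
```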

Built-in Methods:

Numeric

The three distinct numeric types are int (integer), float (floating-point number), and complex (complex number)
- integer - a number that can be written without a fractional component
- float - a number that has a decimal component
- complex - a number that can be expressed in the form a+bi, where a and b are real numbers and i is the imaginary unit; Python writes the imaginary unit as j, e.g. 3+4j
When you assign an integer value to a variable, the variable refers to an object of type int
When you assign a decimal value to a variable, the variable refers to an object of type float
When running an arithmetic operation with an int and a float, the result will be a float
To convert an object to an integer or float, use the int() or float() functions
The power operator in Python is the double asterisk **, not ^; ^ is the bitwise XOR operator
Refer to the operator chart here
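The numeric rules above can be checked with a short sketch (variable names are illustrative):

```python
whole = 7                  # int
decimal = 2.5              # float
mixed = whole + decimal    # int + float gives a float

power = 2 ** 3             # exponentiation
xor = 2 ^ 3                # bitwise XOR, not a power
z = complex(3, 4)          # the same number as 3 + 4j

print(type(whole), type(decimal), type(mixed))
print(power, xor)
print(z.real, z.imag, abs(z))

# int() and float() convert other objects; int() drops the fractional part.
print(int("42"), float("3.5"), int(9.99))
```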

Popular Math Functions:

Boolean

A Boolean is one of the two truth values True or False
Case sensitive; TRUE, true, FALSE, false are not boolean values
Built-in comparison operators (==, !=, …) can return Boolean
Convert an object to a boolean value with the bool() function
Conditions can be combined using “and”, “or”, and “not”:
- a and b: return True if both a and b are True
- a or b: return True if at least one is True
- not a: return True if a is False
Boolean expressions with "and" and "or" have short-circuit behavior, meaning the second value is evaluated only when the first one is not enough to decide the result
- a and b: if ‘a’ is false, then the value is returned without looking at ‘b’, but if ‘a’ is true, then the program will evaluate ‘b’.
- a or b: if ‘a’ is true, then the value is returned without looking at ‘b’, but if ‘a’ is false, then the program will evaluate ‘b’
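One way to see short-circuiting is to record which operands actually run. The check() helper below is hypothetical and exists only for this demonstration:

```python
calls = []  # records which operands were actually evaluated

def check(label, result):
    """Hypothetical helper: note that it ran, then return result."""
    calls.append(label)
    return result

and_result = check("a", False) and check("b", True)   # "b" is skipped
or_result = check("c", True) or check("d", False)     # "d" is skipped
both = check("e", True) and check("f", True)          # "f" has to be evaluated

print(calls)
```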

Values that are empty, None, or zero are treated as False; in addition, True == 1 is True and False == 0 is True
Anything non-empty or non-zero is treated as True; e.g., bool(10) is True but bool([]) (an empty list) is False
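A quick sketch of these truthiness rules, using sample values chosen for the example:

```python
falsy_values = [0, 0.0, "", [], (), {}, set(), None]
truthy_values = [10, -1, "text", [0], (None,), {"k": 0}]

all_falsy = not any(bool(v) for v in falsy_values)   # every sample converts to False
all_truthy = all(bool(v) for v in truthy_values)     # every sample converts to True

print(all_falsy, all_truthy)
print(True == 1, False == 0)
```

Note that [0] and (None,) are truthy: a container counts as non-empty even when its only element is falsy.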

List

Lists are mutable, ordered collections of values; a list is an iterable, which is a Python object capable of returning its members one at a time
Lists are enclosed in brackets [a, b, c]
Lists can hold elements of different data types, and duplicates are allowed; e.g. ["star", 5.2, True, "star"]
Lists can be concatenated, e.g. ['apple'] + ['tree'] gives ['apple', 'tree']; but adding two strings is different, e.g. 'apple' + 'tree' gives 'appletree'
Elements within a list can be identified by their index [i]; indexing starts at 0
- list[1] from list [a, b, c] would pull 'b'
- list[2] from a nested list [[1, 2, 3], [4, 5, 6], [7, 8, 9]] would pull [7, 8, 9]
- list[2][-1] from a nested list [[1, 2, 3], [4, 5, 6], [7, 8, 9]] would pull 9
- list[0][0] from a list ["Today", "is", "warm"] will pull the first letter of the first word, 'T'
Slicing is multi-indexing, used to reference a certain range of elements (the stop index is excluded)
- list[1:4] from list ['a', 'b', 'c', 'd', 'e'] would pull a range ['b', 'c', 'd']
- list[2:] from list ['a', 'b', 'c', 'd', 'e'] would pull the list starting from [2], so ['c', 'd', 'e']
- list[:-1] from list ['a', 'p', 'p', 'l', 'e'] would pull everything but the last, so ['a', 'p', 'p', 'l']
- the third value inside the brackets is the step size [start:stop:step]; default is 1
  list[-2::-1] from list ['a', 'b', 'c', 'd'] would show ['c', 'b', 'a']
Replace or update elements in a list using the index; e.g. if list = ['a', 'b', 'c'] was updated with list[1] = 'c', the new list would be ['a', 'c', 'c']
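The indexing and slicing rules above, as a runnable sketch. The variables are named letters and grid here rather than list, so the built-in list is not shadowed:

```python
letters = ["a", "b", "c", "d", "e"]
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

second = letters[1]            # 'b'
middle = letters[1:4]          # ['b', 'c', 'd'], the stop index is excluded
tail = letters[2:]             # ['c', 'd', 'e']
all_but_last = letters[:-1]    # ['a', 'b', 'c', 'd']
backwards = letters[-2::-1]    # ['d', 'c', 'b', 'a']
last_of_row = grid[2][-1]      # 9

letters[1] = "z"               # lists are mutable, so an element can be replaced
print(letters)
```

Slices create new lists, so middle and the others are unaffected by the later assignment to letters[1].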

A nested list is a list within a list; e.g. [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
- You can use nested lists to create a 2D array (or 2D list). A for loop statement can be used to extract elements.
E.g., this type of an array can be used when recording several different temperatures each day
Day 1 - 31, 45, 49, 37
Day 2 - 42, 44, 51, 41
Day 3 - 35, 46, 50, 47
The position of an element is represented by two indices [row][column]
Writing only one index will output the entire row
Insert values with the .insert() method
- numlist.insert(1, [10, 11, 12]) would insert a new list [10, 11, 12] as the second row (index 1)
Update values by reassigning
- numlist[3] = [10, 11, 12] would replace the 4th row (index 3) with the new list [10, 11, 12]
- numlist[1][2] = 5 would replace the 3rd element (index 2) of the 2nd row (index 1) with 5
Delete values with the del statement
- del numlist[3] would delete the 4th row (index 3)
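A sketch of the 2D-list operations above, using the temperature readings from the example. The inserted row [10, 11, 12, 13] is made up for the demonstration:

```python
temps = [
    [31, 45, 49, 37],  # day 1
    [42, 44, 51, 41],  # day 2
    [35, 46, 50, 47],  # day 3
]

day_two = temps[1]      # one index returns the whole row
reading = temps[2][1]   # [row][column]: third row, second column

temps.insert(1, [10, 11, 12, 13])  # new row at index 1 (made-up readings)
temps[1][2] = 5                    # update one element of that new row
del temps[3]                       # delete the row now at index 3

# Nested for loops visit every element, row by row.
for day in temps:
    for value in day:
        print(value, end=" ")
    print()
```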

Zip is a function that takes two or more iterables, such as lists or tuples, and combines them into an iterator of tuples, each holding the corresponding elements from every input; pass the result to list() to get a list of tuples
List comprehension offers a concise syntax and can perform the combined operation of filter() and map().
- syntax is [<expression> for <element> in <list> if <boolean>]
- See examples under For Loop
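As a rough illustration of zip() and of a list comprehension next to its map()/filter() equivalent (the data is hypothetical):

```python
names = ["apple", "orange", "banana"]   # hypothetical data
prices = [1.2, 0.8, 0.5]

pairs = list(zip(names, prices))        # list of (name, price) tuples

# Comprehension: [<expression> for <element> in <list> if <boolean>]
cheap = [name.upper() for name, price in pairs if price < 1.0]

# The same result built from filter() and map().
cheap_alt = list(
    map(lambda pair: pair[0].upper(),
        filter(lambda pair: pair[1] < 1.0, pairs))
)

print(pairs)
print(cheap)
```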

List Functions and Methods:

Tuple

A tuple is an immutable sequence, typically used to store collections of heterogeneous data; duplicate values are allowed
Tuples are enclosed in parentheses, although this is optional; an empty tuple is ( ), and a one-element tuple needs a trailing comma, e.g. (1,)
Tuples provide a safeguard against accidental tampering of data
Like a list, you can access the elements within the tuple using indexing
Like a list, you can have tuples inside tuples (nested tuple)
Unlike a list, a tuple can be used as a dictionary key, provided that all of its elements are immutable
You cannot modify a tuple, but you can delete it entirely using the del keyword
If there is a list (mutable) inside a tuple (immutable), you can modify that inner-list
Packing refers to assigning multiple values into a tuple
Unpacking refers to assigning the elements of a tuple to multiple variables, or passing them to a function as multiple arguments
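The tuple behaviour above in one short sketch. The add() function and all values are made up for illustration:

```python
point = 3, 4                 # packing; the parentheses are optional
x, y = point                 # unpacking into variables

def add(a, b):
    """Hypothetical two-argument function."""
    return a + b

total = add(*point)          # unpacking into function arguments

record = ("id-1", [1, 2])    # a tuple holding a mutable list
record[1].append(3)          # allowed: the inner list changes, the tuple does not

try:
    point[0] = 10            # not allowed: tuples are immutable
except TypeError as err:
    error_message = str(err)

lookup = {(0, 0): "origin"}  # a tuple of immutable elements works as a dict key
print(total, record, error_message)
```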

Range

Range is an immutable sequence of numbers
range() is the function that creates the sequence of integers
Printing a range shows only its definition, e.g. range(0, 10); to view the numbers, convert it with list()
This data type is memory efficient, e.g. instructing a program to iterate x many times without having to store every element in memory (i.e. a for loop)
Similar to slicing, range takes three arguments: start, stop, step (the stop value is excluded)
- range(10) would include all the numbers from 0 to 9
- list(range(10)) converts the output to a list type [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
- range(0, 10, 2) returns every second number from the range, so 0, 2, 4, 6, 8
- range(15, 10, -1) goes backwards through the range, so 15, 14, 13, 12, 11
- sum(range(100)) is the sum of every number from 0 to 99 (4950)
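The range calls above can be verified with a few lines:

```python
r = range(10)
first_ten = list(r)                   # starts at 0, stops before 10
evens = list(range(0, 10, 2))         # start, stop, step
countdown = list(range(15, 10, -1))   # a negative step counts down
total = sum(range(100))               # no list is built in memory

print(r)            # prints the range object itself, not the numbers
print(first_ten)
print(evens, countdown, total)
```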

Set

A set is a mutable object used to store unique elements in no specific order

A frozenset is an immutable set
While the set itself is mutable, the elements inside a set have to be immutable; e.g. cannot add list to a set
Cannot use indexing to identify elements (like we can for lists and tuples)
Supports non-mutating list operations as long as they don’t depend on order
Example: a bowl of fruit; cannot say give me the first and last fruit, but you can say give me the next fruit
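A set is enclosed in curly brackets, e.g. {1, 2, 3}, or you can use the set() function; an empty set must be written set(), because { } creates an empty dictionary
If a list is [1, 1, 2, 3, 3, 3] then the set would be {1, 2, 3}
A set cannot contain another set, because sets are mutable; a frozenset can be an element of a set, which gives a nested set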
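A short sketch of the set rules above (variable names are only for illustration):

```python
numbers = [1, 1, 2, 3, 3, 3]
unique = set(numbers)              # duplicates are dropped

unique.add(4)                      # the set itself is mutable
has_two = 2 in unique              # membership tests do not depend on order

try:
    unique.add([5, 6])             # a list is mutable, so it cannot be an element
except TypeError:
    list_rejected = True

frozen = frozenset({1, 2})
nested = {frozen, frozenset({3})}  # frozensets can be elements of a set

try:
    unique[0]                      # sets do not support indexing
except TypeError:
    indexing_rejected = True

print(unique, has_two)
```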

Dictionaries

Mutable collection of data values stored as key-value pairs; since Python 3.7 a dictionary keeps its keys in insertion order, but values are accessed by key rather than by position; if you want to access a specific piece of information in the dictionary, just refer to the key (think of an actual dictionary where key = word and value = definition)
This is the only built-in mapping type; it maps (connects) hashable values, the keys, to arbitrary objects
Looking up via dictionary is quicker than a list because dictionaries use hash values (integers computed from immutable objects) to find a key directly; in contrast, looking up 4 in a list of [1, 2, 3, 4, 5, 6] will take longer since this is a linear search
Dictionaries are enclosed in curly brackets {key:value, key:value …}
Keys have to be immutable (e.g., int, str, float, tuple <elements inside tuple must also be immutable>); values can be any object
Keys must be unique; a duplicate key will overwrite the value of the earlier pair
Use the dict() function to create a dictionary; e.g., dict((('apple', 123), ('orange', 456), ('banana', 789))) will return {'apple': 123, 'orange': 456, 'banana': 789}
Dictionary can be within another dictionary, also known as Nested Dictionary
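A minimal sketch of the dictionary rules above. The name stock and its contents are made up for the example:

```python
stock = dict((("apple", 123), ("orange", 456), ("banana", 789)))

apple_count = stock["apple"]          # look up a value by its key
stock["orange"] = 500                 # assigning to an existing key overwrites it
stock[("aisle", 3)] = "tuple key"     # a tuple of immutables is a valid key

try:
    stock[["a", "list"]] = 1          # a list is not hashable, so not a valid key
except TypeError:
    list_key_rejected = True

duplicate = {"a": 1, "a": 2}          # the later pair wins
nested = {"fruit": {"apple": 123}}    # a dictionary inside a dictionary
keys_seen = [key for key in stock]    # iterating over a dictionary yields its keys

print(stock)
print(duplicate, nested["fruit"]["apple"])
```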

Functions and methods:

When iterating over a dictionary, the output will be the keys; e.g. for i in market: print(i) will output 'fruit', 'price', and 'inventory'

Under the module collections, there is a class called defaultdict which acts like a regular dictionary except that looking up a missing key does not raise a KeyError; instead, it will provide a default value
- all default values are produced by the same function, so they start out with the same data type
- for example, defaultdict(int) will pretend that any possible key exists and has a value of zero
- the argument in defaultdict() must be either None or a function that can be called with no arguments (e.g., def default(): return 'value is missing'); with None, a missing key raises a KeyError as in a regular dictionary
- you can also set the argument to int or float, in which case the default will be zero; or the argument can be list, in which case the default will be an empty list
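The defaultdict options above, sketched with made-up keys and a hypothetical factory function named missing():

```python
from collections import defaultdict

counts = defaultdict(int)        # missing keys start at 0
for letter in "banana":
    counts[letter] += 1

groups = defaultdict(list)       # missing keys start as an empty list
groups["fruit"].append("apple")

def missing():
    """Hypothetical factory: called with no arguments for each missing key."""
    return "value is missing"

labels = defaultdict(missing)
unknown = labels["anything"]     # the lookup also stores the default under that key

plain = defaultdict(None)        # no factory: behaves like a regular dict
try:
    plain["nope"]
except KeyError:
    raised_key_error = True

print(dict(counts), dict(groups), unknown)
```

Because a lookup stores the default, checking membership with "key in d" is the safer choice when you do not want to add keys by accident.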
