DSA/notes.md
2019-03-21 15:53:57 +01:00


<!-- vim: set ts=2 sw=2 et tw=80: -->
# Complexity
A general way to describe the efficiency of algorithms (e.g. linear vs
exponential), independent of the computer's architecture/speed.
## The RAM - random-access machine
Model of computer used in this course.
Has random-access memory.
### Basic types and basic operations
Has basic types (like int, float, 64bit words). A basic step is an operation on
a basic type (load, store, add, sub, ...). A branch is a basic step. Invoking a
function and returning is a basic step as well, but the entire execution takes
longer.
Complexity is not measured by the input value but by the input size in bits.
`Fibonacci(n)` is linear in `n` (the size of the value) but exponential in `l`
(the number of bits of `n`, i.e. the size of the input).
By default, WORST-CASE complexity is considered.
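A small sketch (not from the lecture) illustrating the value-vs-bits
distinction with an iterative Fibonacci:

```python
def fib(n):
    # Iterative Fibonacci: Theta(n) basic steps, where n is the VALUE.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

n = 100
l = n.bit_length()  # input SIZE in bits: here l = 7
# Roughly Theta(n) = Theta(2^l) steps: linear in the value n,
# but exponential in the input size l.
```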
## Donald Knuth's A-notation
A(c) indicates a quantity that is absolutely at most c
Antonio's weight = (pronounced "is") A(100)
## (big-) O-notation
*Definition:* if f(n) is such that f(n) = k * A(g(n)) for all _n_ sufficiently
large and for some constant k > 0, then we say that f(n) = O(g(n)).
# Complexity notations (lecture 2019-02-26)
## Characterizing unknown functions
pi(n) = number of primes less than n
## First approximation
*Upper bound:* linear function
pi(n) = O(n)
*Lower bound:* constant function
pi(n) = omega(1)
*Non-trivial tight bound*:
pi(n) = theta(n/log n)
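A quick numerical check (not part of the notes) that pi(n) grows like
n/log n, counting primes with a simple sieve of Eratosthenes:

```python
from math import log

def prime_count(n):
    # Sieve of Eratosthenes: counts the primes less than n.
    sieve = [True] * n
    sieve[:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return sum(sieve)

# The ratio pi(n) / (n / log n) slowly approaches 1 as n grows.
for n in (10**3, 10**5):
    print(n, prime_count(n) / (n / log(n)))
```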
## Theta notation
Given a function g(n), we define the __family__ of functions theta(g(n)): f(n)
is in theta(g(n)) if there exist constants c_1, c_2 > 0 and an n_0 such that,
for all n >= n_0, f(n) is sandwiched between c_1\*g(n) and c_2\*g(n)
## Big omega notation
Omega(g(n)) is a family of functions: f(n) is in Omega(g(n)) if there exist a
c > 0 and an n_0 such that for all n >= n_0 f(n) dominates c\*g(n)
## Big "oh" notation
O(g(n)) is a family of functions: f(n) is in O(g(n)) if there exist a c > 0
and an n_0 such that for all n >= n_0 f(n) is dominated by c\*g(n)
## Small "oh" notation
o(g(n)) is the family of functions O(g(n)) excluding all the functions in
theta(g(n))
## Small omega notation
omega(g(n)) is the family of functions Omega(g(n)) excluding all the functions
in theta(g(n))
## Recap
*asymptotically* = <=> theta(g(n))
*asymptotically* < <=> o(g(n))
*asymptotically* > <=> omega(g(n))
*asymptotically* <= <=> O(g(n))
*asymptotically* >= <=> Omega(g(n))
# Insertion sort
## Complexity
- *Best case:* Linear (theta(n))
- *Worst case:* Number of swaps = 1 + 2 + ... + n-1 = (n-1)n/2 = theta(n^2)
- *Average case:* Number of swaps half of worst case = n(n-1)/4 = theta(n^2)
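A minimal insertion-sort sketch (names and the move counter are mine, for
illustration) that makes the swap count above observable:

```python
def insertion_sort(a):
    """Sort the list a in place; return the number of element moves."""
    moves = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot right to open a place for key.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            moves += 1
        a[j + 1] = key
    return moves
```

On an already-sorted input the inner loop never runs (best case, theta(n));
on a reversed input of length n it performs (n-1)n/2 moves (worst case).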
## Correctness
The proof is, essentially, by induction.
An algorithm is correct if given an input the output satisfies the conditions
stated. The algorithm must terminate.
### The loop invariant
An invariant condition that makes a loop equivalent to a straight-line path in
the execution graph.
# Heaps and Heapsort
A data structure is a way to structure data.
A binary heap is like an array and can be of two types: max heap and min heap.
## Interface of a heap
- `Build_max_heap(A)` rearranges `A` into a max-heap;
- `Heap_insert(H, key)` inserts `key` in the heap;
- `Heap_extract_max(H)` extracts the maximum `key`;
- `H.heap_size` returns the size of the heap.
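A sketch (not from the lecture) of this interface on top of Python's `heapq`
module, which is a min-heap, so keys are stored negated to get max-heap
behaviour; the function names follow the interface above:

```python
import heapq

def build_max_heap(a):
    # Store negated keys so the smallest negated key is the largest key.
    h = [-x for x in a]
    heapq.heapify(h)          # Theta(n)
    return h

def heap_insert(h, key):
    heapq.heappush(h, -key)   # O(log n)

def heap_extract_max(h):
    return -heapq.heappop(h)  # O(log n)
```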
A binary heap is like a binary tree mapped on an array:
```
1
/ \
/ \
2 3
/ \ / \
4 5 6 7
=> [1, 2, 3, 4, 5, 6, 7]
```
The parent position of `n` is the integer division of `n` by 2:
```python
def parent(x):
    return x // 2
```
The left of `n` is `n` times 2, and the right is `n` times 2 plus 1:
```python
def left(x):
    return x * 2

def right(x):
    return x * 2 + 1
```
**Max heap property**: for all i > 1, A[parent(i)] >= A[i].
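The property can be checked directly with the 1-indexed convention used above
(this checker is mine, not from the notes; index 0 is unused padding):

```python
def parent(i):
    return i // 2

def is_max_heap(a):
    # a is 1-indexed as in the notes; a[0] is unused padding.
    n = len(a) - 1
    return all(a[parent(i)] >= a[i] for i in range(2, n + 1))
```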