Hashing and Data Structures

### Problem 1

  1. Universal hash function: for any fixed pair of distinct keys, at most a 1/m fraction of parameter choices cause a collision
  2. Universal hash functions can have different “effectiveness”
    • 1 parameter: bits
    • 2 parameters: bits
  3. A linear hash function is easy to break
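
As a concrete illustration of the 1/m collision bound, here is a minimal sketch of the standard Carter–Wegman family h(x) = ((ax + b) mod p) mod m; the prime, bucket count, and key pair below are arbitrary choices for the demo:

```python
import random

# Carter–Wegman universal family: h_{a,b}(x) = ((a*x + b) mod p) mod m.
# For any fixed pair of distinct keys x != y, the fraction of (a, b)
# choices that collide is at most 1/m.
P = 2_147_483_647  # a prime larger than any key used below (2^31 - 1)
M = 100            # number of buckets

def make_hash():
    """Draw one random member of the family."""
    a = random.randrange(1, P)
    b = random.randrange(0, P)
    return lambda x: ((a * x + b) % P) % M

# Empirically estimate the collision rate for one fixed pair of keys.
x, y = 12345, 67890
trials = 20_000
collisions = sum(1 for _ in range(trials) if (h := make_hash())(x) == h(y))
print(collisions / trials)  # close to (or below) 1/M = 0.01
```

Redrawing (a, b) picks a fresh function from the family, which is what makes the bound an average over parameters rather than over inputs.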

### Problem 2

  • Trade space for time
  • Maintain an invariant; the invariant then implies the correctness of the algorithm

### Problem 3

  • Use a data structure as a component of an online algorithm
  • Use additional information stored at each node to determine whether the answer can possibly be in the subtree
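
A minimal sketch of the “extra information in the subtree” idea, using a hypothetical BST whose nodes cache their subtree maximum so a query can prune whole subtrees (all names below are illustrative, not from the problem itself):

```python
# Hypothetical sketch: a plain BST where each node also stores the maximum
# key anywhere in its subtree. A query can skip an entire subtree when the
# cached maximum rules out a match.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.subtree_max = key  # extra info maintained on insert

def insert(root, key):
    if root is None:
        return Node(key)
    root.subtree_max = max(root.subtree_max, key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def contains_at_least(root, threshold):
    """Is there any key >= threshold? Prune using subtree_max."""
    if root is None or root.subtree_max < threshold:
        return False  # the answer cannot be in this subtree
    return (root.key >= threshold
            or contains_at_least(root.left, threshold)
            or contains_at_least(root.right, threshold))

root = None
for k in [5, 2, 8, 1, 9]:
    root = insert(root, k)
print(contains_at_least(root, 9))   # True
print(contains_at_least(root, 10))  # False
```

The same pattern (interval trees, order-statistic trees, segment trees) differs only in what summary each node caches.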

### Problem 4

  • Any fixed memory allocation scheme can be made to perform badly, because it cannot predict the future
  • We can only choose a scheme based on assumptions or observations about the user
  • We can aim for a competitive-ratio guarantee (stay within a bounded factor of the optimal offline scheme)

Greedy Algorithms

##### Entropy is the optimal rate (the best we can do) for information compression

#### About greedy algorithms

  • You don’t have to show that your greedy algorithm beats OPT; you just need to show that it’s at least as good as OPT
  • Techniques for proving a greedy algorithm correct:
    • Induction
    • Extremal/exchange argument

### Problem 1

To disprove a claim, just try to find counterexamples

### Problem 2

A greedy algorithm is not always the solution; when it fails, try other algorithms

### Problem 3

When swapping your choice into OPT (exchange argument), make sure the resulting solution is at least as good as OPT

### Problem 4

  • Shannon–Fano coding is not good enough on its own; you need to modify it to make it actually work
  • If encoding symbols individually is not good, try encoding them in blocks
  • Proof technique: use contradiction when the constraints are hard to quantify
  • Lempel–Ziv is a good compression algorithm

### Problem 5

  • Huffman coding is guaranteed to come within 1 bit of the entropy
  • For information with low entropy, combine Huffman coding with run-length encoding
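
The Huffman construction behind that guarantee can be sketched with Python’s heapq; this is an illustrative toy (repeatedly merge the two least frequent subtrees), not a production encoder:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code (symbol -> bitstring) for one string."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tiebreak, {symbol: code-so-far}).
    # The tiebreak keeps dicts from ever being compared.
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

code = huffman_code("aaaabbc")
print(sorted(code.items()))  # more frequent symbols get shorter codewords
```

Since merging always extends codewords by one bit, the resulting code is prefix-free by construction.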

ML: Week 2

Multivariate linear regression

#### Improving Gradient Descent

  • Feature scaling
    1. Make sure features are on a similar scale
    2. Get every feature into approximately the range −1 ≤ xᵢ ≤ 1
  • Mean normalization
    1. Make features have approximately zero mean (excluding x₀)
    2. xᵢ := (xᵢ − μᵢ) / sᵢ, where:
      • μᵢ is the average value of xᵢ
      • sᵢ is the range (max − min) of xᵢ, or its standard deviation
  • Playing around with the learning rate α

    1. Plot the cost J(θ) as a function of the number of iterations
    2. If α is too small: slow convergence
    3. If α is too large: J(θ) may not decrease on every iteration or may not converge at all (slow convergence is also possible)
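
The scaling steps above can be sketched in a few lines of NumPy (the housing-style numbers are made up for the demo):

```python
import numpy as np

# Mean-normalize each feature column: x := (x - mu) / s, where mu is the
# column mean and s its range (max - min). The bias column x0 of ones is
# assumed to be added afterwards and is not scaled.
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [1534.0, 3.0],
              [852.0,  2.0]])

mu = X.mean(axis=0)
s = X.max(axis=0) - X.min(axis=0)
X_norm = (X - mu) / s

print(X_norm.mean(axis=0))                      # approximately zero per column
print(X_norm.max(axis=0) - X_norm.min(axis=0))  # each column's range is 1
```

Dividing by the standard deviation instead of the range changes the numbers but serves the same purpose.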

#### Normal Equation

  • Solves for θ analytically: θ = (XᵀX)⁻¹Xᵀy
  • Doesn’t work well if n is too big (in the millions)

    Taking the inverse of a matrix costs roughly the cube of its dimension

  • Suitable for linear regression
  • Use pinv in Octave
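
A minimal NumPy sketch of the normal equation, with np.linalg.pinv playing the role of Octave’s pinv (the synthetic data and the redundant third column are illustrative):

```python
import numpy as np

# Normal equation: theta = pinv(X'X) X'y solves least squares in one shot.
# Synthetic data: y is approximately 3 + 2x plus small noise.
rng = np.random.default_rng(0)
m = 50
x = rng.uniform(0, 10, size=m)
y = 3.0 + 2.0 * x + rng.normal(0, 0.1, size=m)

X = np.column_stack([np.ones(m), x])       # bias column x0 = 1
theta = np.linalg.pinv(X.T @ X) @ X.T @ y  # normal equation
print(theta)  # close to [3.0, 2.0]

# A redundant (linearly dependent) extra column makes X'X singular;
# np.linalg.inv would fail or be numerically meaningless there, but pinv
# still yields a solution with the same predictions.
X_red = np.column_stack([X, 2.0 * x])
theta_red = np.linalg.pinv(X_red.T @ X_red) @ X_red.T @ y
print(np.allclose(X_red @ theta_red, X @ theta))  # True
```

The second half is exactly the non-invertibility situation discussed next: pinv returns the minimum-norm least-squares solution instead of failing.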

#### Invertibility

Why XᵀX can be non-invertible:

  • Redundant features (linearly dependent)

    x₁ = size in feet²

    x₂ = size in m²

  • Too many features

    e.g. fewer training examples than features (m ≤ n)

    Fixes:

    • Delete some features
    • Use regularization

DoCP: Lesson 1

##### Notes

  • Test-driven development
  • Check extreme values
  • Lexicographic ordering: use the comparator/key function, not the item itself
  • Computing v.s. doing
    • computing: returns a result; pure functions
    • doing: doesn’t return a result; impure functions or subroutines

    It's easy to test pure functions

##### Python features

  • The same function can return different types (dynamic typing)
  • The use of "--23456789TJQKA".index(r) to turn a card-rank character into a number
  • The use of list.count
  • list: iterating in reverse with reversed
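
A quick demo of these features, assuming the rank-string trick from the poker example (the two padding dashes make '2' land at index 2):

```python
# str.index as a lookup table: '2' -> 2, ..., 'T' -> 10, ..., 'A' -> 14.
def card_rank(r):
    return "--23456789TJQKA".index(r)

print(card_rank("2"), card_rank("T"), card_rank("A"))  # 2 10 14

# list.count: how many cards share a rank (useful for pairs, trips, ...).
ranks = [14, 14, 9, 9, 9]
print(ranks.count(9))  # 3

# reversed: iterate backwards without copying into a new sorted structure.
print(list(reversed(sorted(ranks))))  # [14, 14, 9, 9, 9]
```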

##### Refactoring

  • DRY: don’t repeat yourself

##### Lessons Learned

  • understand
  • define pieces
  • reuse
  • test
  • explore the design space
    • correctness
    • efficiency
    • elegance
    • features

Note: Case Study

### Leader Election in a Ring

#### Takeaways

  1. Using libraries

    These are standard notions, so you already understand them, and reusing them reduces the risk of mistakes.
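
For reference, ring leader election can be sketched as the Le Lann–Chang–Roberts (LCR) scheme, here simulated synchronously with distinct ids (a toy model, not necessarily the lesson’s exact formulation):

```python
# LCR leader election on a synchronous unidirectional ring: each process
# sends its id clockwise; on receipt it forwards ids larger than its own,
# discards smaller ones, and declares itself leader when its own id comes
# back around. Assumes ids are distinct and no messages are lost.
def lcr_leader(ids):
    n = len(ids)
    pending = list(ids)          # the message in flight from each position
    leader = None
    for _ in range(n):           # the max id returns home after n hops
        nxt = [None] * n
        for i in range(n):
            msg = pending[i]
            if msg is None:
                continue
            j = (i + 1) % n      # clockwise neighbor
            if msg == ids[j]:
                leader = msg     # own id returned: this process wins
            elif msg > ids[j]:
                nxt[j] = msg     # forward the larger id
            # smaller ids are silently dropped
        pending = nxt
    return leader

print(lcr_leader([3, 7, 2, 9, 5]))  # 9
```

The questions below about dropped messages and concurrent sends amount to relaxing the reliable, synchronous assumptions baked into this toy.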

#### Questions

  1. How can the model be modified to allow messages to be dropped?
  2. Does each step allow only one process to send its id? (concurrency)

    Dealt with via fair traces

  3. A step may send the process’s own id out again.
  4. The discussion in Section 4

### Memory

#### Takeaways

  1. Abstraction function

#### Questions

  1. Can different addresses hold the same data?
  2. Abstraction function (α) v.s. abstraction predicate