Matrix Multiplication
Product of matrix
Matrices and vectors are stored in sparse form:
- Matrices:
for non-zero entries - Vectors:
for elements of the vector
Vector Fits in Memory
If the vector fits in memory, but the matrix does not:
- Each map task gets the entire vector and some values from the matrix
- For each value
, it outputs
- For each value
- Each reduce task is given one or more rows
, and calculates the sum
Vector Doesn’t Fit in Memory
- Matrix and vector both split into stripes (matrix split into columns, vector into rows)
- For each partial row of the matrix, the partial sum of products is calculated
- For each row, the reduce task can calculate the total sum from multiple partial sums
Matrix-Matrix Multiplication
Repeat matrix-vector multiplication across each column in the second matrix.
Cloud Computing
A platform: a collection of integrated and networked hardware, software and internet infrastructure.
Tiers:
- Software as a Service (service value nets): user-facing value e.g. SalesForce CRM
- Platform as a Service: e.g. Google App Platform, Heroku
- Infrastructure as a Service: renting storage, compute resources e.g. Linode, AWS
Virtualization
Multiple VMs running on a single physical machines: sandboxes user code to ensure it is completely isolated from any other user code.
Cloud advantages:
- Much faster and greater scaling possible
- Ease of use: high-level services available
- No dedicated personnel required to manage hardware
Disadvantages:
- Local processing may be faster than transferring data over the network
- No control when it fails
- Privacy concerns