22. Final Lecture

Big data viewed from two directions:

Exam topics:

Knowledge: awareness

Analysis: pros and cons, comparisons to others, is this an appropriate solution to this scenario?

Application (skill):

CodeBox environment.

Have access to course notes, sample solutions to labs, and a locked-down version of Colab. Bring phone for 2FA.

10-15 multiple choice questions

2 hours

Code questions on Spark

Analysis type questions on MPI

Multiple choice on GPU programming, message passing, threads, locks and atomics, work queues, schedulers

Algorithms:

Communication cost: Amdahl’s Law, Gustafson’s Law

Weak vs strong scalability
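
As a revision aid, the two laws can be compared numerically. This is a minimal sketch using the standard textbook forms: Amdahl's Law for a fixed problem size (strong scaling) and Gustafson's Law for a problem that grows with the number of workers (weak scaling). The function names and the parallel fraction p = 0.95 are assumptions chosen for illustration, not values from the course.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Strong scaling (fixed problem size): S = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def gustafson_speedup(parallel_fraction: float, workers: int) -> float:
    """Weak scaling (problem grows with workers): S = (1 - p) + p * n."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * workers


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, for illustration only
    for n in (1, 8, 64, 1024):
        # Columns: workers, Amdahl speedup, Gustafson speedup
        print(n, round(amdahl_speedup(p, n), 2), round(gustafson_speedup(p, n), 2))
```

Under Amdahl's Law the speedup is capped at 1 / (1 - p) no matter how many workers are added, while under Gustafson's Law it keeps growing roughly linearly because the parallel part of the work scales with the machine.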

SAMPLE QUESTION

Caching (spatial/temporal locality)

Time-sensitive operations: logging, DDoS detection

sc.textFile().flatMap(lambda line: [el.strip() for el in line.split(" ") if len(el.strip()) and next((True for char in el.strip() if not char.isalpha()), None) is None]).

Check whether a word contains a non-alphabetic character: next((True for char in word if not char.isalpha()), None) is not None
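
The sample answer can be tried without a Spark cluster by pulling the per-line logic into plain Python functions. This is a sketch under one assumption: the goal is to keep only the non-empty, purely alphabetic words on each line. The names `has_non_alpha` and `alpha_words` are hypothetical helpers, not part of the course material or of Spark.

```python
from typing import List


def has_non_alpha(word: str) -> bool:
    """True if any character in word is not alphabetic (the check from the notes)."""
    return next((True for char in word if not char.isalpha()), None) is not None


def alpha_words(line: str) -> List[str]:
    """Split a line on spaces and keep the non-empty, purely alphabetic tokens."""
    stripped = (el.strip() for el in line.split(" "))
    # Assumption: words containing digits or punctuation are dropped.
    return [word for word in stripped if word and not has_non_alpha(word)]


if __name__ == "__main__":
    print(alpha_words("hello w0rld  foo, bar"))
```

In PySpark, a function like this could be passed directly to the transformation, for example `rdd.flatMap(alpha_words)` on an RDD of lines. The `next(...)` expression behaves the same as `any(not char.isalpha() for char in word)`, which is the shorter way to write it.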