01. Introduction
Course Weighting
- Assignment 1 (25%)
- Mid-semester test (20%)
- Wed 24 March, 7pm, E8
- Assignment 2 (25%)
- Compulsory lab in final week
- Final Exam (30%)
The Basics
Modern web applications should:
- Consume services from another system
- Provide services to another system
- Be modular
- Be able to respond to multiple asynchronous requests
- Make changes persistent
- Allow/restrict user access
- Synchronize information across views
- Be responsive (both speed and design)
Reference Model
DB <--- SQL queries ---> HTTP server (Node.js/Express) | API <--- HTTP/REST ---> API | Browser
HTTP server --- static resources ---> Browser
URIs and URLs
URI: Uniform Resource Identifier. String which identifies a resource.
URL: Uniform Resource Locator: URI with an access mechanism specified.
scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]
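As a quick illustration of the generic syntax above, the built-in `URL` class (available in browsers and Node.js) splits a URL into these components. The URL used here is made up for the example.

```javascript
// Hypothetical URL exercising every component of the generic syntax
const url = new URL("https://alice:secret@example.com:8080/users/42?fields=name#profile");

const parts = {
  scheme: url.protocol,  // includes the trailing colon
  user: url.username,
  password: url.password,
  host: url.hostname,
  port: url.port,
  path: url.pathname,
  query: url.search,     // includes the leading "?"
  fragment: url.hash,    // includes the leading "#"
};
console.log(parts);
```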
APIs
API can accept parameter information via multiple methods:
- URL query parameters
- URL path
- HTTP headers
- Request contents
APIs will usually use a combination of these.
REST: Representational State Transfer
A way for developers to use HTTP methods explicitly and consistently in accordance with HTTP protocol definitions (counter-example: GET /path/user?id=1&action=delete; GET requests should never modify data)
CRUD
- Create: POST
- Retrieve: GET
- Update: PUT
- Delete: DELETE
02. Javascript Continued
Asynchronous
JS Event Loop
JS is a single threaded and single concurrent language and hence has a single call stack, heap, queue etc.
Call stack maintains record of function calls. Calling a function pushes onto the stack, returning pops off.
Heap: memory allocation to variables/objects.
Queue: list of messages (and their associated callback function) that need to be processed. When the call stack is empty, the callback function for the oldest message gets pushed onto the stack.
setTimeout(() => console.log("1"), 0);
console.log("2");
// Outputs 2, then 1
In the browser, there are queues for the DOM, network requests and timers; these are part of the JavaScript Web APIs, not the language itself. In Node.js, these are either unavailable or reimplemented.
Callback Hell/The Pyramid of Doom
async1(result => {
  async2(result => {
    async3(result => {
      async4(result => {
        async5(result => {
          async6(result => {
            ...
          });
        });
      });
    });
  });
});
This may occur if the API under-fetches data: the client needs to call one API to get enough information to call another.
Promises have tried to solve this.
Promises
An object with three possible states:
- Pending
- Fulfilled; operation completed successfully
- Rejected; operation failed
somePromise
.then(someResult => {
...
return someResult;
})
.then(newResult => ...)
.catch(err => ...)
.finally(() => ...)
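A minimal sketch of how chaining flattens the pyramid: each `.then` returns a new promise, so dependent steps sit at the same indentation level. The three step functions are hypothetical stand-ins for real asynchronous calls.

```javascript
// Hypothetical async steps: each resolves with a value derived from its input
const getUserId = (name) => Promise.resolve(name.length);
const getFriendIds = (userId) => Promise.resolve([userId + 1, userId + 2]);
const getFirstFriend = (ids) =>
  ids.length > 0 ? Promise.resolve({ id: ids[0] }) : Promise.reject(new Error("no friends"));

// Each .then returns a promise, so the chain stays flat instead of nesting
const result = getUserId("alice")
  .then(getFriendIds)
  .then(getFirstFriend)
  .catch((err) => ({ error: err.message })) // handles a rejection from any earlier step
  .finally(() => console.log("done"));      // runs whether the chain fulfilled or rejected

result.then((value) => console.log(value));
```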
Async/await
Syntactic sugar over promises. Promises were standardized in ES6 (ES2015); async/await followed in ES2017.
By adding the async modifier to the function, the function will return a promise.
async function f() {
  return 1;
}
// Or:
let g = async () => 1;
// Without async/await:
let h = () => new Promise((resolve, reject) => resolve(1));
By adding the await keyword before a promise or expression inside an async function (it does not work at the top level of a classic script, although ES modules allow top-level await), the function pauses until the promise settles. try/catch can be used as normal; if the promise rejects, the contents of the catch block will run.
(async () => {
  const i = await f();
  const j = await g();
  const k = await h();
})();
// Without async/await:
f()
  .then(i => g())
  .then(j => h())
  .then(k => {});
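The try/catch behaviour described above can be sketched as follows. `half` is a hypothetical operation that rejects on negative input; it exists only for this example.

```javascript
// Hypothetical operation that rejects when given a negative number
const half = async (n) => {
  if (n < 0) throw new Error("negative input");
  return n / 2;
};

async function safeHalf(n) {
  try {
    return await half(n); // resumes here once the promise fulfils
  } catch (err) {
    return `failed: ${err.message}`; // runs if the awaited promise rejects
  }
}

safeHalf(10).then(console.log);
safeHalf(-1).then(console.log);
```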
Modules and Dependencies
CommonJS
- One specification used for managing module dependencies
- Adopted by NodeJS
- The de-facto standard
- Browserify (or similar) required to use CommonJS in the front-end
Use either module.exports.* or exports.* to expose module’s public interface: the public interface is whatever value is assigned to module.exports.
Import the module using require().
npm
A Node-specific package manager.
Use npm install --save module_name to install a package and add it to the project’s package.json file.
03. Data Persistence - SQL and NoSQL, Memory Stores, GraphDB
POJO and JSON
POJO: Plain Old JS Object. Essentially a struct.
JSON is very similar to a POJO:
- Data-interchange format
- No versioning
- For serializing data
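A small sketch of where a POJO and its JSON form differ: JSON carries data only, so methods and undefined values are dropped and dates become strings. The object below is made up for the example.

```javascript
const pojo = {
  name: "Ada",
  born: new Date(Date.UTC(1815, 11, 10)),
  nickname: undefined,
  greet() { return "hi"; },
};

const json = JSON.stringify(pojo);
const restored = JSON.parse(json);

console.log(json);
console.log(typeof restored.born);   // the Date came back as a string
console.log("greet" in restored);    // methods are not serialized
console.log("nickname" in restored); // neither are undefined values
```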
SQL
ACID:
- Atomic: transaction is either fully completed or fully fails
- Consistency: DB is in a valid state before and after the transaction (invariants preserved)
- Isolated: transactions should be isolated from other in-progress transactions
- Durable: once transactions are committed, they should not be lost
CAP Theorem: if you have a distributed system, pick two:
- Consistency: every read receives the most recent data
- Availability: every read receives a response
- Partition tolerance: system can function even if network goes down
BASE: give up consistency and get:
- Basic availability (through replication)
- Soft state: the state can change over time - this happens due to:
- Eventual consistency: the data will be consistent eventually (if the data is not changed frequently)
A BASE transaction may act like multiple transactions in ACID.
Key-Value Stores
Don’t bother defining schemas, just plonk a key and value together.
- Upsides: fast, simple, flexible, scalable
- Downsides:
- No validation at all
- Consistency checking offloaded from DB to application
- No relationships
- No aggregate operations
- No search operations (other than via key)
In-memory data store (e.g. redis):
- Stores whole DB in memory
- Useful for caching common responses
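To illustrate the caching use case, here is a minimal in-memory key-value cache with per-entry expiry. `TtlCache` is a hypothetical class for this sketch and says nothing about how redis works internally; the clock is injectable so expiry can be shown without waiting.

```javascript
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }
  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // stale: evict and report a miss
      return undefined;
    }
    return entry.value;
  }
}

let clock = 0;
const cache = new TtlCache(1000, () => clock);
cache.set("GET /users", "[...cached body...]");
console.log(cache.get("GET /users")); // hit
clock = 1500;
console.log(cache.get("GET /users")); // miss: the entry has expired
```

Note the downsides listed above still apply: lookups are by key only, and nothing validates the stored value.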
Document Database
- Bunch of JSON/XML files indexed and stored in DB
- Lots of duplicated data
- Risk of inconsistent document structures
Graph Database
- Node: entity
- Edges: relationship between entities
- Can be uni/bi-directional
- Weights map to relationship properties
- Properties: describe attributes of a node/edge
- Hypergraph: one edge joining multiple nodes
Semantic web:
- Subject-predicate-object (nodes are nouns, edges are verbs, attributes are adjectives/adverbs respectively)
- Nodes are URIs/values
- Represented as Resource Description Framework (RDF) triple stores
- Linked data on the web
Labelled property graph:
- Nodes and edges have an internal structure
- More efficient to store than RDF
Graph databases give measures of centrality such as:
- Degree: indegree/outdegree
- Closeness: average length of all shortest paths
- Betweenness: number of times a node acts as a bridge along the shortest paths
- Eigenvector: influence of a node in the network
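Degree is the simplest of these measures to compute by hand. A sketch over a hypothetical directed graph stored as an edge list (a graph database would compute this for you):

```javascript
// Hypothetical directed graph as an edge list: [from, to]
const edges = [["a", "b"], ["a", "c"], ["b", "c"], ["d", "c"]];

function degrees(edgeList) {
  const result = new Map();
  const entry = (node) => {
    if (!result.has(node)) result.set(node, { indegree: 0, outdegree: 0 });
    return result.get(node);
  };
  for (const [from, to] of edgeList) {
    entry(from).outdegree += 1;
    entry(to).indegree += 1;
  }
  return result;
}

const d = degrees(edges);
console.log(d.get("c")); // { indegree: 3, outdegree: 0 }
console.log(d.get("a")); // { indegree: 0, outdegree: 2 }
```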
Query languages are available for graph databases (e.g. Cypher for Neo4j).
04. HTTP Servers, REST and GraphQL
Databases, JS behavior, REST versioning
REST
A REST service is a platform and language independent service that runs on top of HTTP(S).
REST services use the same HTTP verbs used by web browsers and hence, they can be thought of as a subset of web apps.
HTTP/REST resources usually have a unique identifier (e.g. integer, UUID). Often, the structure of the URL mirrors the structure of the data.
Misc:
- PUT is a replace operation; PATCH is a partial change
  - How would you represent the deletion of a property? Null value?
- REST has no security, encryption, session management or QoS guarantees
- REST, being built on top of HTTP, is stateless
- The entire resource is returned
- Underfetching/overfetching due to request-response architecture
URLs:
- Hide all implementation details
- e.g. file extensions related to the server software (php) so that server language can change
- Be consistent with singularity/plurality of resource names
- Keep everything lowercase
- Replace spaces with hyphens
- Always provide default page/resource as a response
State timescales (shortest to longest):
- Individual HTTP request: stateless
- Business transaction: e.g. multiple GET requests to get information, then a POST request
- Session state: required for web applications
- Preferences: state stored in DB
- Record state
Session state:
- Include session ID as query parameter in GET requests: easy to see and copy
- Cookies: random identifier sent with each HTTP request to the server
  - Session/persistent: deleted/stored when browser closed/session ended
  - Secure: HTTPS only
  - HttpOnly: not accessible via JS
  - Set-Cookie: key=value; Expires=date
  - Cookies identify the combination of user account/browser/device
HTTP
Stateless protocol; each request/response self-contained.
- GET: retrieve named resource
- HEAD: GET, but return only headers
- POST: create new resource
- PUT: change specified resource
- OPTIONS: get supported methods for the given resource
- PATCH: modify existing resource
- DELETE: delete specified resource
Safe methods (nullipotency): methods that do not modify resources. GET and HEAD are safe.
Idempotent methods: intended state is the same regardless of how many times the method is called. POST and PATCH are not idempotent.
The lack of side effects helps in making the API more fault tolerant
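A small sketch of why idempotency matters for fault tolerance: if a client retries after a timeout, a repeated PUT leaves the same state, while a repeated POST creates a duplicate. The in-memory store and the `post`/`put` functions are hypothetical and stand in for real request handlers.

```javascript
// Hypothetical in-memory resource store
const store = new Map();
let nextId = 1;

// POST creates a new resource each time it is called: not idempotent
function post(body) {
  const id = nextId++;
  store.set(id, body);
  return id;
}

// PUT replaces the resource at a known ID: repeating it changes nothing further
function put(id, body) {
  store.set(id, body);
}

post({ name: "a" });
post({ name: "a" });    // a retried POST leaves a duplicate
put(10, { name: "b" });
put(10, { name: "b" }); // a retried PUT leaves the same state
console.log(store.size); // 3: two POSTed copies plus one PUT resource
```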
CAP Theorem
- Consistency: read will always return most recent write
- Availability: non-failing node will return reasonable response within a reasonable amount of time
- Partition tolerance: system will continue to function when it becomes fragmented
In theory, you can have two of the three. However, networks fail (which you have no control over) so you must tolerate partitioning.
API Versioning
Ensure compatibility between the API provider and users. Adding new features may not require new versions, but removing or amending features will require a new version.
Versions may be useful for:
- A/B testing
- System testing; dev, test, prod
- For different subsets of clients e.g. business vs consumers, geographical/political regions
- Rollback of unacceptable APIs
From a client perspective, an API is backwards compatible if the client can continue to function through a service change (e.g. adding new features) and forwards compatible if the client can be changed without needing a service change.
Semantic Versioning
MAJOR.MINOR.PATCH:
- Major: incompatible API changes
- Minor: functionality added in backwards-compatible manner
- Patch: bug fixes
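The rules above can be turned into a rough compatibility check. This is a sketch under simplifying assumptions: versions are plain MAJOR.MINOR.PATCH strings, and pre-release tags and the special treatment of 0.x versions are ignored. Both function names are made up for the example.

```javascript
function parseVersion(version) {
  const [major, minor, patch] = version.split(".").map(Number);
  return { major, minor, patch };
}

// A client built against `required` should keep working with `provided`
// if the major versions match and `provided` is not older.
function isCompatible(required, provided) {
  const r = parseVersion(required);
  const p = parseVersion(provided);
  if (r.major !== p.major) return false;
  if (p.minor !== r.minor) return p.minor > r.minor;
  return p.patch >= r.patch;
}

console.log(isCompatible("1.2.3", "1.4.0")); // true: features added compatibly
console.log(isCompatible("1.2.3", "2.0.0")); // false: incompatible API change
console.log(isCompatible("1.2.3", "1.2.1")); // false: provided version is older
```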
Encoding Versioning
- Query parameter (e.g. ?version=xx.xx)
- URL (e.g. api/v1)
  - Semantically messy: looks like you are selecting version of a resource
- Header
  - Hard to test
Publishing
- Credibility: trust that the API will do what it says on the tin
- Support: support the API
- Success: sharing success stories
- Access: no SDK is the best SDK (e.g. cURL requests)
- Documentation should be current, accurate and hopefully have guides/tutorials
- SDKs/samples in preferred languages
- Free/freemium use
- Instant API keys
- Sandboxes
05. GraphQL
RESTful API Limitations
Endpoints are often designed to match a client. But clients get updated while endpoints may not; this may lead to over-fetching or under-fetching.
Over-fetching: download more data than you need. e.g. need list of usernames, but get list of users and all available fields
Under-fetching: download less data than you need. e.g. want information about friends; first fetch list of friend IDs, then make request to get user data about each friend.
GraphQL
A specification for how you specify data (c.f. strong typing) and how you query it.
Example:
type TypeName {
  nonNullString: String!,
  nullArray: [String]
}
# String is a scalar type; a base type
All GraphQL queries return a 200 response code (unless there is a network error). Errors are returned in user-defined fields.
GraphQL:
- Returns 200 for all queries (unless there is a network error). Errors are returned in user-defined fields
- Uses GETs and POSTs; the former encodes the query in the URL query parameters while the latter encodes it in the body as JSON
- Is a language specification for composing queries to a server
- Still requires some pre-defined data/queries on the server-side (what is allowable etc.)
Queries are in a JSON-like structure e.g.
{
  # `first` is an assumed argument name; arguments must be named
  assets(first: 10) {
    id,
    url,
    nestedObject {
      prop1,
      prop2
    }
  }
}
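The idea of a response shaped exactly like the query can be sketched without a GraphQL library. `pick` is a hypothetical helper for illustration only; a real GraphQL server resolves fields through a schema and resolvers rather than by copying properties.

```javascript
// `selection` mirrors the shape of a query: true selects a scalar field,
// a nested object selects fields of a nested object.
function pick(source, selection) {
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    result[field] = sub === true ? source[field] : pick(source[field], sub);
  }
  return result;
}

// Hypothetical record with more fields than the client needs
const asset = {
  id: 1,
  url: "/a.png",
  sizeBytes: 2048,
  nestedObject: { prop1: "x", prop2: "y", prop3: "z" },
};

const selected = pick(asset, { id: true, url: true, nestedObject: { prop1: true, prop2: true } });
console.log(selected); // sizeBytes and prop3 are not returned
```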
Automated Testing
Mocha and Chai
- Runs test files in order given to it by the filesystem
- Within each test, tests are independent and run asynchronously
- Can set up pre- and post-conditions with before(), afterEach() etc.
- Call done at the end of a test
- Can test HTTP requests using Chai-HTTP
- Ensure erroneous queries are tested as well
06. Security and Intro to Web Clients
Security
OWASP Top 10
- Injection
- Broken authentication (sessions handled incorrectly, assuming another user’s identity)
- Unprotected sensitive data (data stored in plain text at rest etc.)
- XML external entities being evaluated
- Broken access control (roles - permissions for authorized users)
- Security misconfiguration (insecure defaults, verbose error messages)
- XSS
- Insecure de-serialization (replay/injection attacks, privilege escalation, remote code execution)
- Vulnerable components (libraries not updated etc.)
- Insufficient logging/monitoring
Injection
Any time an application uses an interpreter of any type there is a danger of introducing an injection vulnerability.
When a web app passes information from an HTTP request as part of an external request, the input must be sanitized. All injection attacks are input-validation errors.
All external input is a threat; text inputs, check boxes, cookies, HTTP headers etc… Never rely on client-side validation.
Command injection: when unsanitized user input is part of a shell command.
Authentication
Authentication: establish claimed identity
Authorization: establish permission to act
Authentication always comes before authorization.
Three factors:
- Something you know
- Something you have
- Something you are
HTTP
HTTP is a stateless protocol; credentials must be sent with every request. Hence, SSL should be used for everything requiring authentication.
That is, every single request needs to be authenticated and then authorized before completing any request.
Session management: some session ID cookie is needed as HTTP is stateless; it is often exposed on the network
Side-doors: change password, forgot my password, secret questions etc.
Mitigation
Architecture:
- Authentication should be simple, centralized and standardized
- Use the standard session ID used by your container
- Be sure SSL protects the credentials and session ID at all times
Implementation:
- Check SSL certificate
- Examine all auth-related functions
- Verify that log off actually destroys the session
XSS
Attacker gets malicious script into a web page that stores data on the server; this script now has access to everything on the page, including DOM and cookies.
e.g. intercepting login requests and redirecting to an attacker-controlled website.
DOM-based XSS Injection:
- Untrusted data should only ever be displayed as text
- JS encode and delimit untrusted data with quoted strings; just looking for quotes isn’t enough (backticks, nice browsers correcting ‘sloppy’ markup, escape sequences etc.)
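A minimal sketch of treating untrusted data as text: escape the characters that are significant in HTML before inserting the value into markup. `escapeHtml` is a hypothetical helper; it is only appropriate for HTML text and quoted attribute values, not for data placed inside scripts, URLs or CSS, and in the browser assigning to `textContent` avoids the problem altogether.

```javascript
// Replace the characters that are significant in HTML text and quoted attribute values
function escapeHtml(untrusted) {
  return String(untrusted)
    .replace(/&/g, "&amp;") // must come first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml(`<img src=x onerror="alert('xss')">`));
```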
07. Design Patterns
Model View Something
Data and views of data: the data model and visualization of the data should be separated.
There may also be derived data and hence the data cannot be used directly; a bridge that connects the model and views (both of which can be reused) must be used.
There are several variations on this:
- Model View Controller
- Model View Adapter
- Model View Presenter
- Model View ViewModel
Which model to use? Whatever the framework gives you.
As the data may be updated, it is useful to have a data-binding that updates the view when the data is updated. The binding may be one-way or two-way.
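A one-way binding (model to view) can be sketched with a `Proxy` that notifies a listener on every write. `observable` is a hypothetical helper, and the `view` object is a stand-in for DOM updates so that the sketch runs outside a browser; frameworks implement this with considerably more machinery.

```javascript
// Wrap a model so that every property write notifies a listener
function observable(model, onChange) {
  return new Proxy(model, {
    set(target, key, value) {
      target[key] = value;
      onChange(key, value);
      return true; // signal that the assignment succeeded
    },
  });
}

// Stand-in for a view: in a browser this would update a DOM element instead
const view = {};
const model = observable({ count: 0 }, (key, value) => {
  view[key] = `Rendered ${key}: ${value}`;
});

model.count = 1;
model.count = 2;
console.log(view.count); // "Rendered count: 2"
```

A two-way binding would additionally write user input from the view back into the model.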
Record state: source-of-truth, possibly a server.
Session state: state in the client; the model. Data is bound to the view (presentation).
Model-View-Controller
- Model updates view
- User input on the view updates controller
- Controller updates model
- The view is stateless and has little logic
Benefits:
- Supports multiple, synchronized views
- Views and controllers are pluggable
Downsides:
- Complexity in large applications
- Couples controller and view
- Mixes platform-dependent/independent code within controller and view
Model-View-Presenter
Presenter is bridge between model and view - no direct communication between the model and view.
e.g. if there is invalid user input, the presenter will put an error message on the view, but not update the model.
Model-View-ViewModel
- ViewModel contains only the data required by the view and mediates communication between the Model and View: this is called the binder
- Model contains business logic and data
- Could be on the server
- Data binding between the View and ViewModel is two-way
- No communication between the View and Model
This is used by Vue.
08. Communicating with Servers
XHR/AJAX
Raw JS XMLHttpRequest or an abstraction (e.g. JQuery AJAX - Asynchronous JavaScript and XML requests).
const req = new XMLHttpRequest();
// The handler receives an event; the response is read from the request object
req.addEventListener("load", () => console.log(req.responseText));
req.open("GET", "http://example.com");
req.send();
How are multiple requests handled? The event loop pattern (concurrency) is used: instead of executing all processes at once, JS does things one at a time and reacts to events as they happen.
(Parallelism = doing lots of things at once; concurrency = dealing with lots of things at once)
Once the response is received, it is pushed to the event queue and if there is an opening in the JS call stack, the event handler is called.
Fetch
Introduced 2017, supports promises/async.
fetch("http://example.com")
.then(res => res.json())
.then(obj => console.log(obj))
.catch(err => console.log(err));
CORS: Cross Origin Resource Sharing
By default, XHR requests can only be made to the same origin that the script is hosted on.
The server of the requested content can optionally send some headers to allow this, which the browser must support and enforce.
Origin: protocol (HTTP/HTTPS) + domain + port
Forbidden request-headers are those that cannot be modified by XHR. These include:
- Proxy-*
- Sec-*
- Access-Control-Request-Headers
- Access-Control-Request-Method
- Origin
- Host
- Content-Length
- Cookie
- Cookie2
Forbidden response-headers cannot be read by XHR:
- Set-Cookie
- Set-Cookie2
Before a cross-origin XHR request is sent (unless it qualifies as a 'simple' request), the browser makes a preflight request using the HTTP OPTIONS method, to which the server should respond with some headers:
- Access-Control-Allow-Origin: origin that can request the resource. Can be *, although this disables cookies
- Access-Control-Allow-Methods: comma-separated list of allowed HTTP methods
- Access-Control-Max-Age: number of seconds the response of a preflight request can be cached
- Access-Control-Allow-Credentials: true: cookies sent if true
- Access-Control-Allow-Headers: request headers the JS can modify
  - Required if browser sends Access-Control-Request-Headers with the headers the JS is trying to modify
  - Accept, Accept-Language, Content-Language, and Content-Type are safe headers and always allowed
- Access-Control-Expose-Headers: response headers the JS can read
  - Safe headers: Cache-Control, Content-Language, Content-Length, Content-Type, Expires, Last-Modified and Pragma
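On the server side, answering a preflight amounts to building these headers from a policy. A sketch, where the `policy` object, its values and `preflightHeaders` are all hypothetical; in an Express app this would normally be handled by middleware.

```javascript
// Hypothetical server policy
const policy = {
  origins: ["https://app.example.com"],
  methods: ["GET", "POST", "DELETE"],
  headers: ["X-Authorization"],
  maxAgeSeconds: 600,
};

// Returns the headers to send in response to a preflight OPTIONS request,
// or an empty object if the origin is not allowed (the browser then blocks the request).
function preflightHeaders(requestOrigin, { origins, methods, headers, maxAgeSeconds }) {
  if (!origins.includes(requestOrigin)) return {};
  return {
    // Echo the specific origin rather than "*" so that cookies can still be sent
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Methods": methods.join(", "),
    "Access-Control-Allow-Headers": headers.join(", "),
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Max-Age": String(maxAgeSeconds),
  };
}

console.log(preflightHeaders("https://app.example.com", policy));
console.log(preflightHeaders("https://evil.example.net", policy)); // {}
```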
An old workaround: JSON-P (JSON with padding). Request a <script> from another origin, then a function/variable defined in the script can be called.
WebSockets
Persistent 2-way communication.
Client sends GET request to specific endpoint with a few headers (e.g. Connection: Upgrade, Upgrade: websocket, Sec-WebSocket-Key/-Protocol/-Version) to which the server responds with a 101 (switching protocols) and a few more headers. After this handshake, the connection switches to the WebSocket protocol.
Some NPM packages fall back to HTTP if it is not supported. With the ws package:
const WebSocket = require("ws");
const ws = new WebSocket("ws://example.com/path");
ws.on("open", () => ws.send("A"));
ws.on("message", msg => console.log(msg));
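One of the 'few more headers' in the 101 response is Sec-WebSocket-Accept, which the server derives from the client's Sec-WebSocket-Key. A sketch of that step as defined in RFC 6455, using Node's built-in crypto module; `acceptValue` is a name made up for the example, and libraries such as ws do this for you.

```javascript
const crypto = require("crypto");

// Fixed GUID defined by RFC 6455
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Sec-WebSocket-Accept = base64(SHA-1(key + GUID))
function acceptValue(secWebSocketKey) {
  return crypto
    .createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Sample key from RFC 6455
console.log(acceptValue("dGhlIHNhbXBsZSBub25jZQ==")); // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This proves to the client that the server understood the WebSocket handshake; it is not an authentication mechanism.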
09. Modules, DOM and Performance
Modules and the Development/Deploy Pipeline
SPAs consist of many JS and HTML templates which isn’t ideal in terms of resources and latency:
- Downloading many small files has overhead
- Data that is not needed at runtime is included (e.g. un-minified code)
- May not be written in JS (e.g. TypeScript, Dart)
Bundling is the process of taking the source code and optimizing it into a format that is better for the browser to consume.
In its simplest form, bundling simply concatenates all JS files into one big file (and the same thing for CSS) - tools such as Gulp follow this pattern.
Minification
The build process will also often minify (or uglify) the JS by removing unused functions and whitespace, shortening function/variable names (global function names cannot be renamed) etc.
To aid in debugging, sourcemaps can be made so that the dev tools can present the code as it originally was.
Webpack
Current best practice. To configure webpack, simply point it to the source directory and the app starting point; it will then recursively follow imports so that only code that is actually reachable is bundled (dead code is removed).
The output is something like main.${hash}.js.
Webpack supports TypeScript and other transpiled languages.
Lazy Loading
Breaks the SPA into logical chunks so that the entire JS code does not need to be downloaded at the start.
Framework and bundler need to agree on how to split the import graph - usually with bundler plugins.
To do it natively in Webpack:
// Instead of
import Component from "./Component";
// Use
const Component = () => import("./Component");
Compression
To reduce network transfer (but not the cost of parsing the JS), use a compression algorithm such as gzip. There is a CPU cost to decompress, but this is negligible on modern machines.
In the HTTP request, the browser adds a header indicating the encodings it supports e.g. accept-encoding: gzip, deflate, br. Then, the server can optionally encode it in one of the given options, indicating this using content-encoding: gzip.
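The effect of gzip on repetitive text such as source code can be seen with Node's built-in zlib module. The payload below is made up; a real server would normally let its framework or a reverse proxy apply the encoding.

```javascript
const zlib = require("zlib");

// Repetitive text, like most source code, compresses well
const body = Buffer.from("function add(a, b) { return a + b; }\n".repeat(200));

const compressed = zlib.gzipSync(body);       // what would be sent with content-encoding: gzip
const restored = zlib.gunzipSync(compressed); // what the browser does on receipt

console.log(`original: ${body.length} bytes, gzipped: ${compressed.length} bytes`);
console.log(restored.equals(body)); // true: compression is lossless
```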
Caching
Can add HTTP headers to allow the browser to cache content.
expires:
- Expires on specific date:
expires: DoW, DD MMM YYYY HH:MM:SS GMT
cache-control:
- no-store: don’t cache (unless already cached)
- max-age=${time_in_seconds}: expires after some duration
- must-revalidate, max-age=${time_in_seconds}: revalidate with server after max-age, use cached version otherwise
- no-cache: cache, but always revalidate with server
  - Server sends ETag: ${some_long_number} on first request
  - Browser sends If-None-Match: ${etag} for subsequent requests
    - If 304 Not Modified, use cached value
  - Alternatively, Last-Modified and If-Modified-Since
ETag caching
On first request, server responds with ETag: ${some_long_number} and cache-control: no-cache.
On subsequent requests, browser adds If-None-Match: ${etag}; if the ETag matches the current version, the server returns a 304 not modified.
This requires a round-trip and the server to be available, but allows for changes to instantly propagate to all clients.
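The server side of this exchange can be sketched as a pure function. Deriving the ETag from a hash of the content is one common choice and an assumption here, as are the names `etagFor` and `respond`; the notes only require that the tag changes when the resource does.

```javascript
const crypto = require("crypto");

// Derive an ETag from the content; any change to the body changes the tag
function etagFor(body) {
  return `"${crypto.createHash("sha256").update(body).digest("hex").slice(0, 16)}"`;
}

// `ifNoneMatch` is the value of the request's If-None-Match header, if any
function respond(body, ifNoneMatch) {
  const etag = etagFor(body);
  const headers = { ETag: etag, "Cache-Control": "no-cache" };
  if (ifNoneMatch === etag) {
    return { status: 304, headers, body: null }; // client's copy is still current
  }
  return { status: 200, headers, body };
}

const first = respond("<h1>v1</h1>");
const repeat = respond("<h1>v1</h1>", first.headers.ETag);
const afterDeploy = respond("<h1>v2</h1>", first.headers.ETag);
console.log(first.status, repeat.status, afterDeploy.status); // 200 304 200
```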
Caching in a SPA
index.html: use a cache that suits the release cadence. Probably an ETag.
All other file names contain their hash, so you can set cache-control: max-age=31536000 (one year).
Modules
Originally, JS functions existed in the global namespace and had no way of partitioning code. Hence, there was no way to protect module-internal symbols.
Modules solve the problem but… it was solved multiple times:
- ES2015 (now the official standard)
- CommonJS
- AMD (Asynchronous Module Definition)
- JS module pattern (an immediately-invoked function): (function(…) { /* your entire module here */ })()
ES2015
ES2015 is:
- Part of the official ECMAScript2015 standard
- Well-supported natively or with polyfills
// lib.js
const export0 = "default value";
export default export0;
export const export1 = 1;
export const export2 = 2;
// main.js
// Import specific components
import theDefaultExport, { export1, export2 as alias } from "./lib.js";
// To import everything
import * as lib from "./lib.js";
// Run global code but don't import anything
import "./lib.js";
CommonJS
CommonJS was adopted early on by NodeJS:
// lib.js
exports.export1 = "bla";
// main.js
const lib = require("./lib.js");
The DOM
JS can dynamically change any element in the DOM as well as their properties, including their CSS styles.
The DOM can become excessively large in some applications, with performance being hurt when a large number of nodes need to be modified.
Virtual DOM
The virtual DOM, popularized by React, is an abstraction of the DOM that exists in the JS. The virtual DOM is first modified, and the real DOM is updated only if required. Repainting is a very expensive operation, so batching changes can lead to increased performance.
Hence, when using libraries such as React and Vue, you do not work directly with the DOM.
In order to update the virtual DOM, it needs to know when the state has changed. To do this it has two main methods:
- Dirty checking: poll the data at regular intervals to recursively check the data structure
- Observables: use observable data objects so that it is notified on changes
When the state is updated, it needs to compute the diff so that only the required DOM nodes are updated.
DOM-based XSS Injection
XSS: when a malicious script is run on a trusted site, getting full access to the DOM and cookies.
Hence, when manipulating the DOM, care must be taken to ensure that untrusted data is only ever treated as displayable text, not executed as code. This can be done by avoiding dangerous functions such as eval, innerHTML. Sanitizing the input data is difficult as some payloads can take advantage of browsers helpfully ‘fixing’ malformed HTML that gets through the sanitizer.
Performance
Response Times: 3 Important Limits
100 milliseconds: the limit for having the user feel the system is reacting instantaneously
1 second: the limit for the user’s flow of thought staying uninterrupted
10 seconds: the limit for keeping the user’s attention focused on the program
Yahoo Performance Rules
- Minimize HTTP requests; CSS sprites (NB: horizontal over vertical layout reduces file size), image maps etc.
- Use CDNs; closer is better
- Cache resources
- Use Gzip
- Put stylesheets in the head; this makes it seem faster
- Defer script loading - they can block parallel downloads from the same origin
- Make JS/CSS external
- Reduce DNS lookups; the number of unique hostnames used
- Minify JS/CSS
- Avoid redirects; each adds another round-trip
- Remove duplicate scripts
- Use ETags
- Cache and optimize AJAX requests
- Flush the buffer early; start sending before the full page loads (e.g. flush once after the head)
- Use GET over POST for AJAX requests; with POST, headers and data are sent separately
- Post-load non-critical components
- Preload components - make use of idle time
- Reduce the number of DOM elements
- Reduce the number of iframes
- No 404s - they still download the body
- Reduce cookie size
- Minimize DOM access - create detached trees first before adding it to the DOM
- Reduce the number of event handlers - events bubble up so attach it to a parent element, then alter behavior depending on the target
- Optimize images; don’t scale in HTML
- Avoid empty <img src=""> - some browsers may make a request to the page or its directory
Lazy Loading with Intersection Observer
Intersection Observer API: register an element and a callback is executed when the element enters/exits another element, the viewport, or when the amount of intersection changes enough.
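// Called with a list of entries whenever a target's visibility crosses the threshold
const callback = entries => entries.forEach(entry => console.log(entry.isIntersecting));
const observer = new IntersectionObserver(callback, {
  root: document.querySelector("#scrollContainer"), // ancestor to test against (hypothetical ID); null means the viewport
  rootMargin: "0px 0px 0px 0px", // margin surrounding the root element to grow/shrink its bounding box
  threshold: 0.5 // 0.0 means if a single pixel is visible, the target is visible. 1.0 means the entire element must be visible
});
observer.observe(document.querySelector("#elementToObserve"));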
Progressive Web Apps
Web apps that appear to be ‘installed’ like native applications.
Service workers allow notifications and background sync (offline cache). Service workers:
- Act as proxy servers sitting between the app and the network
- Run on their own thread
- Are headless: cannot access the DOM
- Require HTTPS
- Associated with a specific server/website
10. Web Storage and PWAs
Web Storage and IndexedDB
Local and Session Storage
Cookies; ~4 kB max, sent to server on every request. HTML5 storage supports ~5 MB max.
Storage is key-value string pairs accessible to JS on a specific origin (protocol + domain + port).
Two types of storage:
- Local storage: permanent data for site; key-value string pairs
- Session storage: cleared after end of session
IndexedDB
An indexed NoSQL DB:
- Multiple object stores
- Primary keys
- Indexes
- Asynchronous CRUD requests
CacheStorage
Accessible to service workers and hence requires HTTPS.
Stores pairs of request and response objects; for caching web resources.
Can be few hundred MBs.
self.addEventListener('install', event => {
  event.waitUntil(
    caches.open("my-cache").then(async cache => {
      // Fetch, then add to cache
      await cache.addAll([
        '/css/bootstrap.css',
        '/css/main.css',
        '/js/bootstrap.min.js',
        '/js/jquery.min.js',
        '/offline.html'
      ]);
      // addAll is shorthand for calling add on each path:
      await cache.add(new Request("some-path"));
      // Can also put arbitrary data into the cache with put
      await cache.put("some-path", new Response("some-data"));
    })
  );
});
self.addEventListener('fetch', event => event.respondWith(
  // Keep `cache` in scope for both the lookup and the later put
  caches.open("my-cache").then(async cache => {
    const cached = await cache.match(event.request);
    if (cached !== undefined) return cached;
    // Not in cache: fetch from the network
    const response = await fetch(event.request);
    // Response/request body can only be read once, so use `clone`
    cache.put(event.request, response.clone());
    return response;
  })
));
Progressive Web Apps
Appear to be ‘installed’ like native apps: begin life in a browser tab and can be ‘installed’.
A PWA must:
- Originate from a secure origin
- Load while offline
- Reference a Web App Manifest: JSON file with properties such as name, start URL and icon
- Linked in the HTML page:
<link rel="manifest" href="./manifest.json">
It should:
- Act like an app; mobile friendly and fluid animations
- Load quickly: < 5s before service worker installed, < 2s after installation
Service Workers
Proxy ‘servers’ between the web app and network. Runs headless in its own thread, and must use HTTPS.
Service workers are used for notifications and background sync.
Service workers must be started by the web page:
if ("serviceWorker" in navigator) window.addEventListener("load", () => {
navigator.serviceWorker.register("./worker.js").then(registration => {
console.log(`Scope: ${registration.scope}`);
}).catch(err => console.error(err));
});
After installation, it can be in a few states:
- Activated
- Idle
- Terminated
- Fetch/message
Using service workers and CacheStorage:
const cacheName = "name";
const precacheResources = ["/", "/index.html", "/main.css"];
self.addEventListener("install", event => {
  // waitUntil must receive the whole promise chain, including addAll
  event.waitUntil(caches.open(cacheName).then(cache => cache.addAll(precacheResources)));
});
self.addEventListener("activate", () => console.log("Activated!"));
self.addEventListener("fetch", event => {
console.log(`Intercepting fetch to ${event.request.url}`);
event.respondWith(
caches.match(event.request).then(cachedResponse => {
if (cachedResponse) return cachedResponse;
// Fallback to making a network request if not in cache
return fetch(event.request);
})
)
});
WebAssembly
Binary code that is pre-compiled to wasm bytecode - supported languages include C, C++ and Rust.
Ahead-of-time or just-in-time (interpreted) compilation of wasm.
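A minimal sketch of loading wasm from JS. Normally the bytes come from compiling C, C++ or Rust; here they are a tiny hand-assembled module exporting one function, `add`, so that the example is self-contained.

```javascript
// Hand-assembled wasm module exporting add(i32, i32) -> i32
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic number and version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: function 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: function 0 as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is fine for a module this small
const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule);
console.log(instance.exports.add(2, 3)); // 5
```

In a browser, larger modules are usually loaded asynchronously instead, e.g. with WebAssembly.instantiate.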
SENG365 Exam Notes
GraphQL
REST: weakly-typed, multiple round-trips; bad especially in mobile networks.
Prevents over-/under-fetching by fetching exactly what data you need, all without having to update the backend and frontend in sync.
All responses return 200; errors returned in user-defined fields.
Security
OWASP Top 10:
- Injection: eval, SQL injection, log injection (e.g. newline in username)
- Broken auth: MITM attack with session information exposed in plain-text, un-hashed/weak passwords, no rate limits, no session invalidation
- Sensitive data exposure: plain text or weak encryption while in transit, passwords not encrypted, other flaw used to extract information after decryption
- XML external entities: XML parser accessing local files etc.
- Broken access control: roles/user permissions not validated
- Security misconfiguration: insecure defaults, unnecessary features enabled, no hardening, default accounts/passwords used, unpatched, stack traces in error messages
- XSS:
- Persistent XSS: page with unsanitized user input viewed by victim
- Reflected/DOM-based XSS: victim sent URL with attacker-controlled JS in query parameter
- Insecure de-serialization: attacker modifies serialized data sent to server (e.g. changing own user permissions, directory path, replay attacks)
- Vulnerable components: outdated/unpatched libraries, databases, OSes etc.
- Insufficient logging
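Both XSS variants come down to untrusted input being interpreted as markup. A minimal sketch of output escaping, assuming a hypothetical `escapeHtml` helper (in practice a framework's templating would usually do this for you):

```javascript
// Hypothetical helper: escape the characters that are significant in HTML
function escapeHtml(untrusted) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(untrusted).replace(/[&<>"']/g, ch => replacements[ch]);
}

const comment = '<script>alert("xss")</script>';
console.log(escapeHtml(comment));
// &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```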
Authentication: establish claimed identity
Authorization: establish permission to act
HTTP stateless: authenticate and authorize every request.
SPAs
SPAs: can redraw any part of the page without requiring a full reload.
Model-View-*:
- Presentation: specific to UI
- Session state: data bound to the view
- Record state: source of truth, possibly server
Model-View-Controller:
- Model: responsible for data and business rules
- Updates the view
- View: a representation of the model
- Sends events to controller
- Controller: responds to user events
- Updates the model
- Controller and view have mix of platform dependent/independent code
model user event
Model ---------> View -------> Controller --------> Model
update event handler
Model-View-Presenter:
- Three-tier: view and model completely isolated
- Presentation layer handles model updates and updates view if necessary
model update
Model <--> Presenter <---> View
user event
Model-View-ViewModel:
- ViewModel is view of the model; model transformed if needed and containing only what the view needs
- Two way binding between view and view model
- Business logic and data in the model
Used by Vue.
model update two-way
Model <---------------> ViewModel <---------> View
event handler binding
Communication
XMLHttpRequest (or JQuery’s AJAX - async JS and XML):
const req = new XMLHttpRequest();
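// The load handler receives an event; the response itself is read from the request object
req.addEventListener("load", () => console.log(req.responseText));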
req.open("GET", "http://example.com");
req.send();
Asynchronous, via callbacks. JS's concurrency model uses an event loop: it is single-threaded, so it is concurrent (it can interleave multiple tasks) but not parallel (it never runs multiple things simultaneously). When the call stack is empty, the callback for the oldest completed item in the queue is pushed onto the call stack.
Fetch is a newer API which uses promises:
fetch("http://example.com")
.then(res => res.json())
.then(obj => console.log(obj))
.catch(err => console.log(err));
CORS
Browser disallows XHR requests to different origins (protocol + domain + port).
Server can optionally allow this. For requests that are not 'simple' (e.g. methods other than GET/HEAD/POST, or custom headers), the browser first sends an HTTP OPTIONS (preflight) request. The server replies with headers such as:
- Access-Control-Allow-Origin: origin that can request the resource. Can be *, although this disables cookies
- Access-Control-Allow-Methods: comma-separated list of allowed HTTP methods
- Access-Control-Max-Age: number of seconds the response of a preflight request can be cached
- Access-Control-Allow-Credentials: true: cookies sent if true
- Access-Control-Allow-Headers: request headers the JS can modify
  - Accept, Accept-Language, Content-Language, and Content-Type are safe headers and always allowed
- Access-Control-Expose-Headers: response headers the JS can read (safe headers always allowed)
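A rough sketch of how a server might build its preflight response. The allow-list, the origin `https://app.example.com`, the chosen methods and headers, and the `preflightHeaders` function are all illustrative assumptions; request header names are assumed to be lower-cased, as Node does:

```javascript
// Hypothetical allow-list of origins permitted to call this API
const allowedOrigins = new Set(["https://app.example.com"]);

// Returns the CORS headers to attach to the response for an OPTIONS preflight
function preflightHeaders(requestHeaders) {
  const origin = requestHeaders["origin"];
  // No CORS headers for unknown origins: the browser will then block the request
  if (!allowedOrigins.has(origin)) return {};
  return {
    // Echo the specific origin rather than * so that cookies remain usable
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE",
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Max-Age": "600",
  };
}

console.log(preflightHeaders({ origin: "https://evil.example.com" })); // {}
```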
WebSockets
Persistent two-way communication.
Client sends a GET request to a specific endpoint with a few headers (e.g. Connection: Upgrade, Upgrade: websocket, Sec-WebSocket-Key/-Protocol/-Version), to which the server responds with a 101 (switching protocols) and a few more headers. After this handshake, the connection switches to the WebSocket protocol.
const WebSocket = require("ws");
const ws = new WebSocket("ws://example.com/path");
ws.on("open", () => ws.send("A"));
ws.on("message", msg => console.log(msg));
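One of the headers in the server's 101 response is Sec-WebSocket-Accept, which RFC 6455 derives from the client's Sec-WebSocket-Key. A small sketch of that derivation using Node's built-in `crypto` module; `acceptKey` is a hypothetical function name, and the sample key is the one used in the RFC:

```javascript
const crypto = require("crypto");

// Fixed GUID defined by RFC 6455
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Sec-WebSocket-Accept = base64(SHA-1(Sec-WebSocket-Key + GUID))
function acceptKey(secWebSocketKey) {
  return crypto.createHash("sha1").update(secWebSocketKey + WS_GUID).digest("base64");
}

console.log(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Libraries such as ws perform this step internally, so application code rarely needs it.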
Performance
Bundling: bundles code from multiple files into one, possibly minifying it.
Webpack: configured with app entry point; recursively follows imports, so only code that is used is outputted. Output file name contains hash.
Lazy loading: download only the chunks you need. Requires integration with framework.
Compression: accept-encoding: gzip, ... request header, server may respond with content-encoding: gzip.
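The negotiation could be sketched as follows with Node's built-in `zlib`. `compressIfAccepted` is a hypothetical helper, request header names are assumed to be lower-cased, and q-values in accept-encoding are deliberately ignored to keep the sketch short:

```javascript
const zlib = require("zlib");

// Hypothetical helper: gzip the body only if the client advertised support for it
function compressIfAccepted(requestHeaders, body) {
  const accepted = (requestHeaders["accept-encoding"] || "")
    .split(",")
    .map(encoding => encoding.trim().split(";")[0]); // drop any ;q=... suffix
  if (accepted.includes("gzip")) {
    return { headers: { "Content-Encoding": "gzip" }, body: zlib.gzipSync(body) };
  }
  // Client did not ask for gzip: send the body uncompressed
  return { headers: {}, body: Buffer.from(body) };
}

const response = compressIfAccepted({ "accept-encoding": "gzip, deflate" }, "hello ".repeat(100));
console.log(response.headers); // { 'Content-Encoding': 'gzip' }
```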
Caching
expires:
- Expires on specific date: expires: DoW, DD MMM YYYY HH:MM:SS GMT
cache-control:
- no-store: don't cache (unless already cached)
- max-age=${time_in_seconds}: expires after some duration
- must-revalidate, max-age=${time_in_seconds}: revalidate with server after max-age, use cached version otherwise
- no-cache: cache, but always revalidate with server
  - Server sends ETag: ${some_long_number} in the first response
  - Client sends If-None-Match: ${etag} on subsequent requests
    - If 304 Not Modified, use cached value
  - Alternatively, Last-Modified and If-Modified-Since
Weak caching requires round-trip but ensures all clients have the up-to-date version of the resource.
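The ETag revalidation flow can be sketched without a real server. `respondWithEtag` is a hypothetical handler that returns a plain object instead of writing to a socket; deriving the ETag from a content hash is one common choice, assumed here rather than taken from the notes:

```javascript
const crypto = require("crypto");

// Hypothetical handler: request header names are assumed to be lower-cased
function respondWithEtag(requestHeaders, body) {
  // Derive the ETag from the content so it changes whenever the body does
  const hash = crypto.createHash("sha256").update(body).digest("hex").slice(0, 16);
  const etag = `"${hash}"`;
  if (requestHeaders["if-none-match"] === etag) {
    // Client's cached copy is still current: no body needs to be sent
    return { status: 304, headers: { ETag: etag }, body: "" };
  }
  return { status: 200, headers: { ETag: etag, "Cache-Control": "no-cache" }, body };
}

const first = respondWithEtag({}, "hello");
const second = respondWithEtag({ "if-none-match": first.headers.ETag }, "hello");
console.log(first.status, second.status); // 200 304
```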
SPAs:
- index.html: probably an ETag
- All other files have a hash in their name, so use cache-control: max-age=31536000 (one year)
Modules
// ES2015: ECMAScript Standard
export const bla = 10;
export default bla;
import defaultExport, { bla, bla as alias } from "lib";
import * as lib from "lib";
// CommonJS: used in many NodeJS libraries
exports.bla = 10;
const lib = require("./lib.js")
Virtual DOM
Copy of the DOM in JS:
- Observable changes (or dirty checking - data structure polled)
- Compute diff
- Update portion of DOM that changed
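The compute-diff step might look like the sketch below. The virtual node shape (`tag`, `text`, `children`) and the patch format are assumptions made for illustration; real frameworks use keyed, more optimised algorithms:

```javascript
// Hypothetical virtual node shape: { tag, text, children }
function diff(oldNode, newNode, path = "root") {
  if (!oldNode) return [{ op: "create", path, node: newNode }];
  if (!newNode) return [{ op: "remove", path }];
  // Different element type: replace the whole subtree
  if (oldNode.tag !== newNode.tag) return [{ op: "replace", path, node: newNode }];
  const patches = [];
  if (oldNode.text !== newNode.text) patches.push({ op: "text", path, text: newNode.text });
  const oldChildren = oldNode.children || [];
  const newChildren = newNode.children || [];
  const length = Math.max(oldChildren.length, newChildren.length);
  // Compare children position by position, recording only what changed
  for (let i = 0; i < length; i++) {
    patches.push(...diff(oldChildren[i], newChildren[i], `${path}/${i}`));
  }
  return patches;
}

const before = { tag: "ul", children: [{ tag: "li", text: "a" }, { tag: "li", text: "b" }] };
const after = { tag: "ul", children: [{ tag: "li", text: "a" }, { tag: "li", text: "c" }] };
console.log(diff(before, after));
// [ { op: 'text', path: 'root/1', text: 'c' } ]
```

Only the patches would then be applied to the real DOM, which is the third step in the list.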
Web Storage
Cookies: ~4 kB max sent in headers on every request.
Local/session storage: ~5 MB max of key-value string pairs. Specific to origin.
IndexedDB:
- Indexed, NoSQL DB
- Multiple object stores
- Primary keys
- Async CRUD
CacheStorage:
- Cache pairs of request/response objects
Progressive Web Apps
Web apps that appear to be ‘installed’ like native applications.
PWAs:
- Come from secure origin
- Load offline
- Reference a web app manifest in the HTML head
Service workers allow notifications and background sync (offline cache). Service workers:
- Act as proxy servers sitting between the app and the network
- Run on their own thread
- Are headless: cannot access the DOM
- Require HTTPS
- Are associated with a specific server/website
Testing
Automated tests of the backend server can fail because the server's and the tests' interpretation and/or implementation of the API differ:
- Bad assumptions
- Race conditions (manual vs automated timescales)
- Server environment different from dev
Test for both successful and failure cases; note that some tests may pass for the wrong reasons.
Tests should be independent where possible.
Use pre- and post-conditions for set-up and tidy-up.
Chai-HTTP:
- Throws errors for 400/500 status codes
W3C WebDriver:
- Interface to manipulate DOM and control behavior of the user agent
- Platform- and language-agnostic
Selenium-WebDriver:
- Supports automation of dynamic web pages (e.g. SPAs)
- Can be used without Selenium Server:
- Browsers and tests run on the same machine
- If server used:
- Selenium-Grid distributes tests over multiple VMs
- Can connect to machines with a particular browser version
const webdriver = require("selenium-webdriver"),
  By = webdriver.By,
  Key = webdriver.Key,
  until = webdriver.until;
const driver = new webdriver.Builder().forBrowser("firefox").build();
driver.get("https://google.com")
  .then(() =>
    // Type the query into the search box and press Enter
    driver.findElement(By.name("q"))
      .sendKeys("reddit", Key.RETURN)
  ).then(() =>
    // Return the promise so that quit() waits for the title check
    driver.wait(until.titleIs("reddit - Google Search"), 1000)
  ).then(() => driver.quit());
Vue test utils: arrange, act, assert
Misc
"use strict"; // First statement inside a script or function
// this defaults to undefined
// variables must be declared
// errors thrown instead of tolerating some bad code
// `with` statements and legacy octal notation (0123) rejected (use 0o123)
// `eval` and other keywords cannot be assigned
// ES6 modules always in strict mode
Semester 1 Test Notes
- URI: Uniform Resource Identifier
- URL: Uniform Resource Locator; URI + access mechanism (e.g. HTTP)
- scheme://user:password@host:port/path?query#fragment
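The components can be checked with the standard `URL` class, available in browsers and Node. The URL itself is made up for illustration:

```javascript
// Illustrative URL containing every component from the template above
const url = new URL("https://user:secret@example.com:8080/api/users?sort=asc#top");

console.log(url.protocol); // https:
console.log(url.username, url.password); // user secret
console.log(url.hostname, url.port); // example.com 8080
console.log(url.pathname); // /api/users
console.log(url.searchParams.get("sort")); // asc
console.log(url.hash); // #top
```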
// REQUEST
{METHOD_NAME} {URL} HTTP/{VERSION}
//RESPONSE
HTTP/{VERSION} {STATUS_CODE} {STATUS_DESCRIPTION}
// HEADERS
{HEADER_NAME}: {HEADER_VALUE}
// EXAMPLE
Set-cookie: {NAME}={VALUE}; Expires={DATE}
{BODY}
- 1xx: info
- 2xx: success
- 200: okay
- 201: created
- 3xx: redirection
- 301: permanent redirect
- 307: temporary redirect
- 4xx: client error
- 400: bad request
- 401: unauthorized
- 403: forbidden
- 404: not found
- 5xx: server error
Body:
- Known length: Content-Type and Content-Length headers
- Unknown length (HTTP/1.1): Transfer-Encoding: chunked (data sent in chunks)
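A sketch of what a chunked body looks like on the wire: each chunk is its byte length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk terminates the body. `encodeChunked` is a hypothetical helper; it omits optional trailers and chunk extensions:

```javascript
// Hypothetical helper: encode an array of strings as an HTTP/1.1 chunked body
function encodeChunked(chunks) {
  const encoded = chunks
    .filter(chunk => chunk.length > 0) // an empty chunk would end the body early
    .map(chunk => {
      const size = Buffer.byteLength(chunk, "utf8").toString(16);
      return `${size}\r\n${chunk}\r\n`;
    });
  // The zero-length chunk marks the end of the body
  return encoded.join("") + "0\r\n\r\n";
}

console.log(JSON.stringify(encodeChunked(["Hello, ", "world"])));
// "7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n"
```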
REST:
Identify resources by ID (integer/UUID).
Safe: read-only
Idempotent: calling the method multiple times leaves the server in the same state as calling it once
| HTTP Method | REST Type | Safe | Idempotent |
|---|---|---|---|
| GET | Read | Yes | Yes |
| HEAD | N/A | Yes | Yes |
| PUT | Update/override | No | Yes |
| DELETE | Delete | No | Yes |
| POST | Create | No | No |
| PATCH | Partial update | No | No |
e.g. PATCH could request to increment the value of some property
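The difference can be seen with a toy in-memory store. The store contents and the `put` and `patchIncrementStock` functions are hypothetical stand-ins for real request handlers:

```javascript
// Hypothetical in-memory resource store
const store = new Map([[1, { name: "widget", stock: 5 }]]);

// PUT overrides the whole resource: repeating it leaves the same state
function put(id, resource) {
  store.set(id, { ...resource });
}

// A PATCH that increments is not idempotent: each call changes the state again
function patchIncrementStock(id) {
  store.get(id).stock += 1;
}

put(1, { name: "widget", stock: 10 });
put(1, { name: "widget", stock: 10 });
console.log(store.get(1).stock); // 10

patchIncrementStock(1);
patchIncrementStock(1);
console.log(store.get(1).stock); // 12
```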
HTTP/REST are stateless: request must include all parameters and response must return entire resource. Underfetching/overfetching increases latency and data transfer.
Naming: no server-side extensions, all lowercase, spaces replaced with hyphens (s/ /-/g), consistent singular/plural naming
Versioning:
- For A/B testing, test/dev/prod separation, partitioning clients via type (business/consumer, region)
- Semantic versioning: incompatible changes/backwards-compatible features/backwards-compatible bug fixes
- Version in query parameter, URL or header
- Backwards compatible if the client does not need to be changed when the API is updated
- Forwards compatible if the client can be changed without the API being updated
State:
- HTTP request: session parameter stored in URL (e.g. query parameter)
- Cookie:
  - May be persistent or be deleted when the session ends (browser closed)
  - Secure: HTTPS only
  - HTTPOnly: not accessible to JS
  - Identifies browser on a specific account on a specific device
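A sketch of how these attributes appear in a Set-Cookie header value. `buildSetCookie` is a hypothetical helper; it uses the standard Max-Age attribute as the persistence mechanism, which is an alternative to Expires:

```javascript
// Hypothetical helper that serialises a cookie with the attributes described above
function buildSetCookie(name, value, { maxAgeSeconds, secure = true, httpOnly = true } = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  // Without Max-Age (or Expires) this is a session cookie, dropped when the browser closes
  if (maxAgeSeconds !== undefined) parts.push(`Max-Age=${maxAgeSeconds}`);
  if (secure) parts.push("Secure"); // only sent over HTTPS
  if (httpOnly) parts.push("HttpOnly"); // hidden from document.cookie
  return parts.join("; ");
}

console.log(buildSetCookie("session", "abc123", { maxAgeSeconds: 3600 }));
// session=abc123; Max-Age=3600; Secure; HttpOnly
```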
JS:
var variableName = function functionName() {}
// Immediately invoked function expressions
var IIFE = (function() { return val })(); // succeeds
// function() { return val }(); // SyntaxError: parsed as a function declaration, which cannot be immediately invoked
// Succeeds: a leading unary operator forces the function to be parsed as an expression
+function() { do_stuff }(); // + undefined equal to NaN
!function() { do_stuff }();
(() => {
// var x; // Variable is hoisted to top of function
if (true) {
var x = 1;
const y = 1;
}
console.log(x); // Succeeds
console.log(y); // Fails
})();
// Closures: uses variables in-scope at time of the definition
this; // window in browser, global in node
"use strict"; // First statement inside a script or function
// this defaults to undefined
// variables must be declared
// errors thrown instead of tolerating some bad code
// `with` statements and legacy octal notation rejected
// `eval` and other keywords cannot be assigned
// ES6 modules always in strict mode
// Call stack: stack of function calls
// Heap: allocated memory
// Queue: queue of events. One for each of DOM, network, timers
setTimeout(() => console.log(0), 0);
console.log(1);
// 1 prints first
// JS waits for call stack to clear before running the oldest event
// CommonJS; one specification for managing module dependencies. Used by node
// Use ONE OF:
exports.name = val;
module.exports.name = val;
const val = require("val");
// JSON: lightweight data interchange
// No versioning
ACID:
- Atomic: all or nothing
- Consistency: invariants preserved
- Isolation: effects of in-progress transactions invisible to others
- Durability: committed transactions are persistent
CAP: choose two of:
- Consistency: always read newest value
- Availability: always get a response
- Partition tolerance: works with partitioned/offline network
- Networks will fail so you will have to deal with this
BASE:
- Basic Availability
- Soft state: state can change over time
- Eventual consistency: wait long enough, then data will be consistent
BASE may split an ACID transaction into multiple ‘transactions’; invariants may not be preserved.
KV Databases:
- Simple; fast; flexible; easy to scale; high availability possible
- No schema or validation; no invariants/consistency checks; no relationships; no aggregate operations; no search (apart from via PK)
Document Databases:
- Store bunch of files and metadata
- Structured JSON, XML
- Binary PDFs
- Build index from content and metadata
- No schema migrations; schema is stored with the documents
- Risk of inconsistent/obsolete document structures
- Redundant information stored
Graph DB:
- Nodes represent entities
- Edges can be uni- or bi- directional
- Hypergraph: edge that joins multiple nodes
- Properties for nodes/edges; usually stored as key-value set
- Can use this to have nodes/edges of different types
- Semantic web (RDF): subject-predicate-object (nodes are nouns, edges are verbs, attributes are adjectives/adverbs respectively) (better information interchange)
- Labelled property graph: nodes/edges have internal structure (more efficient storage)
- Useful for centrality:
- Degree: Number of incoming/outgoing connections
- Closeness: average length of shortest paths
- Betweenness: number of times a node is in the shortest paths
- Eigenvector: influence of node on network
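Degree and closeness are simple enough to compute by hand on a small graph. The sketch below assumes a made-up, connected, undirected graph stored as an adjacency list, and uses the notes' definition of closeness (average shortest-path length, so lower means more central); the function names are hypothetical:

```javascript
// Hypothetical undirected graph as an adjacency list
const graph = {
  a: ["b", "c"],
  b: ["a", "c"],
  c: ["a", "b", "d"],
  d: ["c"],
};

// Degree: number of connections
function degree(node) {
  return graph[node].length;
}

// Closeness: average shortest-path length to every other node, found by breadth-first search
function averageShortestPath(start) {
  const distance = { [start]: 0 };
  const queue = [start];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const neighbour of graph[current]) {
      if (!(neighbour in distance)) {
        distance[neighbour] = distance[current] + 1;
        queue.push(neighbour);
      }
    }
  }
  const others = Object.keys(distance).filter(node => node !== start);
  const total = others.reduce((sum, node) => sum + distance[node], 0);
  return total / others.length;
}

console.log(degree("c")); // 3
console.log(averageShortestPath("c")); // 1
console.log(averageShortestPath("d")); // 1.6666666666666667
```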