01. Principles
Security is a process, not a product
Bruce Schneier
Course
SENG406: brand-new course.
- Developing secure software
- Ensure that people using the software cannot break it (easily)
Assessment items:
| Description | Marks | Due date |
|---|---|---|
| OWASP threat model (small groups) | 10% | Week 4 |
| Secure coding (improved codebase from lab) | 20% | Week 6 |
| Literature review (individual) | 20% | Week 8 |
| Security audit (open source software) | 20% | Week 12 |
| Final exam (in-person) | 30% | N/A |
The exam mainly covers content from class; additional resources discussed in class are also examinable.
Lecture content:
- Principles of software security
- Threat modeling, secure development lifecycle
- Attack tactics
- Common (web) vulnerabilities
- Secure design
- Security protocols, cryptography
- Logging, auditing
- Security testing
- Security evaluation
- Privacy, governance
Weeks 11, 12: time for final assignment.
Labs:
- Week 2/3: threat modeling (tutorial)
- Week 4: Web security tools (Burp, Wireshark)
- Week 5/6: secure design (inc. static analysis)
- Week 7: literature review (tutorial)
- Can start work on assignment during the tutorial
- Week 8/9: fuzzing (inc. metasploit, sqlmap)
- Week 10: privacy and governance (tutorial)
Software security is different from software engineering in general in that new attacks, threat vectors, and actors are coming out all the time, making it critical that we stay up to date. We must be proactive and ensure we are constantly monitoring the systems.
This course is about:
- Modeling security threats (both software and organizational)
- Evaluating software systems for security issues
- Putting measures in place to:
- Minimize the risk of security breaches
- Restore systems after an attack (or after it just crashes)
- Understanding data privacy and governance issues
It is not about:
- Technical knowledge of secure networks
- Advanced cryptography topics
- Hardware-related security devices
Log4j:
- Gave remote access to any server which logged a user-controlled string
- Logs supported template strings
- Supported JNDI, which downloads and executes remote code
- Root issue: user input was not being sanitized
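The same root cause appears in any logging API that interprets its first argument as a template. A minimal Python analogue (logger name and strings are illustrative, not from Log4j itself):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("app")

user_input = "hello %s %n"  # attacker-controlled string

# Vulnerable pattern: the user string becomes the format template,
# so any %-directives it contains are interpreted by the logger.
# log.info(user_input)

# Safe pattern: the template is a constant and user data is passed
# as a parameter, never interpreted as formatting instructions.
log.info("user said: %r", user_input)
```

Log4Shell went further because JNDI lookups in the template could fetch and execute remote code, but the fix is the same: never let user input drive the template.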
Group discussion
Scenario:
- Security engineer for Uber Eats
- Three subsystems: end-user, restaurant, driver
- All users submit personal details, inc. address, bank details
- Drivers geolocated in real time, can do multiple hops before a delivery
- Users cannot see hops for other users’ delivery
Vulnerability discovered in GPS library (fix ready, assume similar to log4j):
- What information do you need to evaluate the impact of the vulnerability?
- How can you get access to that information?
- What action plan would you put in place?
- How do you communicate the issue with customers/dev team?
- How do you roll out the patch?
- How do you force users to update (if necessary)?
- Postmortem
- What did you learn, and how will you minimize the risk of a similar incident?
- What is needed for a postmortem? Who should be involved?
Step 1, first response:
- What versions are affected?
- Vulnerable versions of the app, the library
- Affected subsystems
- Trace paths lead to/from the library
- Who was possibly affected?
- All active users during the time?
- Some subset of users?
- Do we know if anyone’s data was actually leaked?
- Has anyone actually exploited the vulnerability?
- Do the logs contain any traces?
- Is the vulnerability public knowledge?
- If known, copycat attackers will likely appear quickly, including pre-made scripts that exploit the vulnerability, lowering the technical knowledge required by the attacker
- Contact GPS library developers
Step 2, communication:
- Communicate quickly and honestly
- In-app notification, email, and texts to everyone who may have been affected
- Plus public announcement: press release, Twitter etc.
- Additional communication for those actually impacted
- For both drivers and users
- From class: response should depend on
- If vulnerability is publicly known
- If there is a fix in place
Step 3, postmortem:
- Why was the GPS library being used?
- Does it follow secure coding practices?
- Did the team validate it before adding it into the app?
- External security audit
Security Engineering
Six goals:
- Authentication: people and data are who/what they claim to be
- Authorization: access control (both physical and digital) to restricted resources (both computing and data)
- Confidentiality: non-public data is not freely accessible, including when in transit
- Integrity: software, hardware and people remain unaltered, unless authorized
- Accountability: trace of past transactions and actions, also for non-repudiation purposes
- Legal recourse against insider threats
- Availability: resources are available for access, including resistance to attack
Terminology:
- Asset:
- Anything that needs protection: hardware, software, information, communication
- Policy:
- How the assets are secured
- Specifies what needs to be protected, including user permissions
- Theory v. practice
- Exhaustive list of secure and non-secure states, versus practical guidelines
- Attack and agent
- Deliberate sequence of steps to exploit vulnerabilities, carried out by an adversary
- Threat
- Combination of an agent and vulnerability to corrupt an asset (through an attack vector)
- If there’s a server that no one is using that an attacker can crash, it doesn’t matter
- Controls and countermeasures
- Technical, operational and management security mechanisms
Security violations and attacks:
- Security violations: policy-violating actions that put the asset in an insecure state
- Attacks: attack by some threat agent targeting some asset using a vulnerability
Risk assessment:
- Risk assessment is complicated
- Risks are context dependent
- Cost-benefit analysis:
- Estimated frequencies of occurrence drive security decisions
02. Threat Modeling
‘Classic’ plan-based process:
- Sequential process:
- Requirements
- Design
- Test plans
- Coding
- Testing
- Deployment/maintenance
- 90s: the internet becomes a thing. Hacking also became a thing.
- Security-related activities jammed into the process
- User risk analysis
- Design risk analysis
- Static code analysis
- Developer training
- Coding standards development
- Security metrics development
- Penetration tests
- Breaking into real systems
Secure development lifecycle (SDLC):
- Requirements: security and privacy assessment, project plan
- Architecture: policy assessment, architecture choices
- Design: security test plan, threat modeling
- Development: code quality assurance (e.g. static analysis, fuzzing)
- Release: policy-compliance analysis, vulnerability scan, penetration testing
It is wrong to assume that if you can’t measure it, you can’t manage it - a costly myth
W. Edwards Deming
e.g. how well-trained are users in detecting phishing attempts? Even if you can’t measure it, you can still implement training.
Security assessment:
- Early-stage planning
- Discover and document relevant security aspects and turn them into functional requirements:
- Identify key persons, laws, standards, infrastructure, third-parties
- Plan security-related processes (e.g. coding, testing, review, certification)
- Define reporting and monitoring objectives
- React in real-time to anomalous conditions (e.g. system overload, DDOS attacks)
- Privacy impact assessment
- Exhaustive list of collected data
- Understanding of legislation
- Education on data retention, manipulation
- Least-privilege as a core driver
Architecture:
- Policy assessment and scoping:
- Define security objectives (e.g. expected level of security, accepted risk)
- Refine definitions of assets (e.g. new entry point, dependencies)
- Decompose software in subsystems: for all subsystems, identify threats
- Threat space:
- STRIDE
- Spoofing
- Accessing/using another user’s credentials
- Tampering
- Maliciously changing or modifying persistent data or data in-transit
- Repudiation
- Denying having performed an action, with the system unable to prove otherwise
- Information disclosure
- Accessing data that they should not have access to
- Denial of service
- Denying access to valid users
- Elevation of privilege
- Gaining access to privileged resources to access information or compromise a system
Modelling data flows:

                   Trusted space
                         |
 ---------   data   -----------  sanitized data  ------------
 | Actor | -------> | Process | ---------------> | Database |
 ---------          -----------                  ------------
                         |
| Element | Spoof. | Tamper. | Repudiat. | Info. Discl. | DoS | Priv. Elev. |
|---|---|---|---|---|---|---|
| Data flow | | x | | x | x | |
| Store | | x | | x | x | |
| Process | x | x | x | x | x | x |
| Actor | x | | x | | | |
- Need to specify the semantics of the data flow: is it payment data? the weather?
- The diagram should model process and data flow, not sub-systems
Software detailed design:
- Security testing plan:
- Dedicated test scripts for security aspects (i.e. test STRIDE threats)
- Identify key stakeholders and resources (including knowledge gaps)
- ‘Misuse cases’: how can a hacker misuse something (i.e. hacker’s user stories with ACs)
- e.g. signing up for an account with someone else’s email: confirmation emails required to prevent spoofing
- Microsoft DREAD:
- D: How big is the damage
- R: How easy is it to reproduce
- E: How easy is it to exploit? How much time, effort, and expertise is needed?
- A: How many users are affected
- D: How easy is it to discover?
- Define black-box testing strategy (e.g. penetration testing)
- Sending random requests and seeing what comes back (always return uniform error responses!)
- Review your threat model
- Is new design bringing new threats or assets?
- Is the design still in line with legislation and policies?
- Can we ensure a fail-safe state?
- What level of risk can you tolerate?
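The five DREAD ratings above are commonly combined into a single score; a plain average on a 0-10 scale is one option (the equal weighting here is an assumption; teams often tune it):

```python
def dread_score(damage, reproducibility, exploitability,
                affected_users, discoverability):
    """Toy DREAD risk score: average of five 0-10 ratings."""
    ratings = [damage, reproducibility, exploitability,
               affected_users, discoverability]
    assert all(0 <= r <= 10 for r in ratings), "ratings are on a 0-10 scale"
    return sum(ratings) / len(ratings)

# e.g. an easy-to-find, easy-to-exploit bug with moderate damage
print(dread_score(5, 8, 8, 9, 10))  # → 8.0
```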
Development and coding:
- Methods and tools:
- Linters, static code analysis
- Dynamic code analysis (execution monitoring)
- Secure coding practices (e.g. sanitization, encapsulation, least-privilege)
- Fuzzing
- Code reviews (with checklists)
- Detailed data flow analysis
Release and deployment:
- Policy compliance:
- Specify, report on results of security activities
- Update, review deployment infrastructure
- Discuss with stakeholders and document
- Vulnerability scanning, continuous testing
- Tool-supported monitoring
- Chaos engineering: test fail-safe states
- Reporting infrastructure
- Reporting of issues must be part of security policy
- Reporting workflow must be clear and communicated
- System must be monitored
OWASP Threat Modeling Process:
- Decompose the application: identify use cases, users, external dependencies etc.
- Determine, rank threats
- Determine countermeasures or mitigations
- Accept, eliminate or mitigate risks
- Entry points act as trust boundaries
- For each entry point, note down the different trust levels (e.g. anonymous, logged in) that can access it
- Exit points: can be used in XSS, information extraction (e.g. SQL injection)
- Assets:
- For each asset, note down trust levels (e.g. admin, web server process, DB user)
- Assets could be things such as login sessions (e.g. cookies)
- Threat tree:
- Root is attack on an asset
- Children are possible attack vectors (e.g. user did not log out)
- Grandchildren are possible mitigations for that vector
Uber Eats Scenario:
- End-user app:
- Contains private details, contact details, address, and bank / PayPal account
- Order food, review restaurants, pay via Uber Eats, tip, and review the driver
- Restaurant app:
- Contains bank account, and business details
- Food safety information on menu items (e.g., allergen)
- Negotiated margin, and service fee (private to a restaurant)
- Receive orders from end-users, authenticate, match driver and ordered food
- Driver app:
- Receive delivery requests, can combine multiple from same restaurant
- Can communicate to the end-users through the app
03. Attack Tactics
Adversaries:
- All are bad, some have ‘good’ reasons:
- Government-funded agencies
- e.g. NSA, GCHQ, GRU
- Some of the most sophisticated adversaries
- Botnets (i.e. infected machines)
- Malware developers
- Cashout (selling data, access to machines) and ransomware gangs
- Locked-in products (e.g. printer cartridges, BMW heated seats)
- GM: expects to make more profit from software than cars
- Everyday aggressions (e.g. hate campaigns, cyberbullying, abuse)
Targeting individuals is harder than targeting everyone:
- Targeting individuals may require hardware access
- Target everyone and filter (e.g. PRISM, Tempora programs)
Psychology Aspects
The User
Education:
- Ability to recognize an attack (e.g. phishing, infected email)
- Ability to install and maintain security tools (on a limited budget)
- Ability to make informed choices (with limited knowledge)
Ability to detect deception:
- Anchoring effect: the first details have a disproportionate impact on people’s judgment
- Availability heuristic: inference is based on examples we know of
- Origin of danger: skepticism about things we have heard vs. seen
- Biased representation: bigger consequences lead to increased fear
- Experience vs. education: lived experiences have a bigger impact
Behavioral Economics
Present bias and hyperbolic discounting:
- From ‘update now’ interruptions to ‘pick a time’ reminders
- Some people would always click on ‘not now’ or ‘cancel’ and not install important security updates
- Privacy paradox: people claim to care about their privacy, but at the same time give up a lot of information online to access free services
- In the physical world: signing up for membership cards and giving personal details to get discount
Defaults and ‘choice architecture’:
- Opt-in vs opt-out
- e.g. becoming an organ donor in NZ is an opt-out process
- Default settings to your benefit
- Granularity of configurations to overwhelm users
- Facebook privacy settings
- GDPR cookie opt-out settings
Privacy control settings give people more rope to hang themselves
George Loewenstein
Intentionality and cognition
- Attribution error: attributing error to personality, rather than context (but vice-versa for yourself)
- Affect heuristic: emotions taking over decision-making
- Halo effect: impersonating trusted and renowned brands
- Cognitive dissonance: people do not wish to admit that they have been tricked, even in the face of evidence
- Recency effect: so much information that you ignore most of it
- e.g. warnings/alerts with too many false-positives: people start ignoring them to the point that true-positives get ignored
- Risk thermostat: increased safety pushes more risk elsewhere
- e.g. seatbelts leading to people driving faster and causing more accidents
Education must be fit for audience:
- How effective is it to teach users to identify phishing by URL?
- Can you teach that to your grandparents?
- Curiosity and thrill of risky behavior
- e.g. UK teenagers in DDoS attacks
Deception Techniques
Common sales techniques:
- Reciprocity: the need to return favors
- Social proof: the need to belong to a group; the smaller, the better
- Authority: more likely to obey (purported) authority figures
- Scarcity: fear of missing out
Stajano and Wilson’s 7 principles of scam (2011):
- Distraction: like with magic shows, misdirect the audience’s attention
- Social compliance: not questioning authority figures as much
- Herd principles: follow the flow
- Dishonesty: the deal is good because it is borderline unethical/illegal
- Kindness: linked to reciprocity sale techniques
- Need & greed: find what you want and make you dream of it
- Ask questions about what the victim wants rather than trying to directly sell the product
- Time pressure: e.g. ‘only 2 more seats left’
User Credentials
Passwords
Passwords:
- The most common authentication mechanisms
- Rules for ‘good’ passwords
- Its transfer and persistence must be carefully looked at
- Passwords can sometimes be accidentally stored in log files
- Generated passwords and password managers lower the chance of ‘cross-hacking’
Advanced tools to safely reuse accounts:
- Single sign-on
- Intrusion alarms for stolen credentials (e.g. haveibeenpwned)
Password recovery is not just sending a ‘magic link’:
- Scope is important (e.g. website with non-sensitive data, versus a bank)
- What happens to your security questions if an account is pwned?
Good password practices:
- From local systems to openly accessible ones
- UNIX used to store encrypted passwords in /etc/passwd
- The choice is often between choosing something ‘easy’ or writing it down
- Password managers may sometimes not be possible (e.g. organizational accounts)
- What is ‘easy’ for you (and not for a guesser?)
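In contrast with the historical /etc/passwd scheme, modern systems store a salted, slow hash so a stolen password file cannot be cracked cheaply. A minimal sketch using PBKDF2 from the standard library (the iteration count is an assumption; set it per current guidance):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Random per-user salt defeats precomputed (rainbow-table) attacks;
    # a high iteration count slows offline brute force.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(candidate, digest)  # constant-time compare

salt, digest = hash_password("correct-horse-battery-staple")
assert verify_password("correct-horse-battery-staple", salt, digest)
assert not verify_password("wrong-guess", salt, digest)
```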
Memorability (Yan et al., ‘Password memorability and security, empirical results’, 2004):
- Asked participants to generate passwords:
- Select your own guidelines (e.g. character classes, length)
- Passphrase-based (e.g. ‘correct-horse-battery-staple’)
- 8 character randomly chosen (with a week to memorize)
- No significant difference in memorability between the three techniques
- Found the passphrase was the most secure
Guidelines and real life:
- NIST previously recommended frequent password changes
- Still widely followed by auditing organizations, despite updated guidance
- 40% of new passwords were guessable from prior ones (e.g. adding a month/year or an incrementing number to the end)
- Organizational threats:
- Envelopes stuck to machines/under the keyboard
- Non-resettable default passwords
- Or forgetting to change the defaults
- Non-encrypted passwords
- Countermeasures to social-engineering threats
- No more links (e.g. emails from banks); just ask users to visit the website
- Customer education and phishing warnings (must be explicit)
Non-phishing Attacks
(Automated) systems to get illegitimate access to a particular account:
- Brute forcing guesses
- Potentially informed by prior data leaks
(Automated) systems to get details of all accounts:
- Attempts to penetrate a server and steal a password file and key
- The adversary can then crack the encrypted passwords offline
(Automated) systems to block accounts:
- The service may have systems to block accounts or trigger a login timeout after some number of failed attempts
- Denial of service attack
If your encryption, OS and network security mechanism are trusted, it comes down to two factors:
- Password entropy
- User psychology
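Password entropy for a uniformly random password can be estimated as length × log2(alphabet size). A toy calculation comparing a random 8-character password with a 4-word passphrase (word-list size assumes a Diceware-style list):

```python
import math

def password_entropy_bits(length, alphabet_size):
    """Upper-bound entropy for a uniformly random password."""
    return length * math.log2(alphabet_size)

# 8 random characters from ~94 printable ASCII symbols
print(round(password_entropy_bits(8, 94), 1))    # → 52.4 bits

# 4 random words from a 7776-word Diceware list
print(round(password_entropy_bits(4, 7776), 1))  # → 51.7 bits
```

The two are roughly equivalent in strength, which matches the memorability finding above: passphrases need not cost entropy.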
Security and Organization
Security players:
- Red team:
- Offensive team
- Penetration/black box testing
- Social engineering
- Blue team:
- Defensive team
- Damage control
- Incident response
- Operational security
- Purple team:
- Improvement facilitation
- Data analysis
Types of malware:
- Virus: spreads via user interaction
- Worms: spreads automatically
- Trojan: malware disguised as legitimate software
- Rootkit: malware that can gain root access, often implemented very low in the software stack
- Spyware: monitors a user’s activity
- Blended threats: malware using multiple attack types
- Remote access: gaining access to TeamViewer/AnyDesk/VNC/etc. instances
- VNC: attackers replaced executable on official distribution website with version containing malware
- Adware: malware which maliciously feeds ads
- Exploit kit: tools to automatically deploy and manage exploits
Knowledge Bases
- Directory of potential attack techniques, per type of operating system (e.g. MITRE ATT&CK)
- Grouped into 14 categories (e.g. reconnaissance, privilege escalation)
- Each technique contains:
- Examples
- Potential mitigation
- Detection methods
Tactics:
- Reconnaissance: gather knowledge about the target system (e.g. IP/port scanning)
- Resource development: build the capability to launch attacks (e.g. create fake accounts)
- Initial access: techniques used to gain first access to a system
- Execution: running the payload on the compromised system
- Persistence: ensuring the payload stays on the compromised machine
- Privilege escalation: gaining higher-level permissions on the compromised machine
- Defense evasion: evading detection by security software/staff (e.g. renaming, spoofing of parent process ID)
- Credential access: gaining credentials of legitimate users (e.g. MITM, brute force, phishing)
- Discovery: gathering information about the inner workings after gaining access to a system
- Lateral movement: gaining access to other systems
- Collection: collecting data to exfiltrate (e.g. passwords, keys, financial data)
- Command and control: communicating with compromised machines to direct them
- Exfiltration: transferring collected data/knowledge from the infected systems to the adversary
- Impact: interrupt, manipulate, or destroy infected systems (e.g. encrypt data, wipe, DoS)
Assignment
- 3/4 assignments are in small groups of 2-3
- Do not work with people you have already worked with in previous assignments (in this course)
- Group registration closes one week before delivery date
- Two weeks for the assignment 4
- If you miss the registration deadline, you cannot submit the assignment and will get zero marks
- All assignments have a grace period of one week: can submit up to one week after the submission date with no penalty
- No way to get an additional extension except under special circumstances
- Must apply before the official submission date
04. Web Communications and Vulnerabilities
News of the week
Feature: connecting to people whose email and phone number you know.
Flaw allowed association of anonymous accounts with emails and phone numbers.
Introduced June 2021, disclosed after 6 months by security researcher, announced August 2022.
Web Communication
Encryption works. Properly implemented strong crypto systems are one of the few things that you can rely on. Unfortunately, endpoint security is so terrifically weak that NSA can frequently find ways around it.
Edward Snowden
Encryption works. The problem is everything else.
OSI model protocols:
- Application: HTTP, FTP
- Presentation: SSL
- Session: SSH, NFS
- Transport: TCP, UDP
- Network: IP, NAT
- Data link: ARP, PPP (Point to Point Protocol)
- Physical: IEEE 802.3 (Ethernet), USB, 802.11 (Wi-Fi)
URL format:

https://foo.bar.example.com:443/some/path/to/a/file?query=cat

- Scheme: https
- Unqualified hostname: foo
- Subdomain: foo.bar
- Second-level domain: example.com
- Port: 443
- Path: /some/path/to/a/file
- Query string: query=cat
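Python's urllib.parse splits such a URL into its components:

```python
from urllib.parse import urlsplit

parts = urlsplit("https://foo.bar.example.com:443/some/path/to/a/file?query=cat")
print(parts.scheme)    # → https
print(parts.hostname)  # → foo.bar.example.com
print(parts.port)      # → 443
print(parts.path)      # → /some/path/to/a/file
print(parts.query)     # → query=cat
```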
OWASP Top 10
Open Web Application Security Project
A07:2021 – Identification and Authentication Failures
- Automated attacks: credential stuffing, brute force
- Weak/default passwords
- Insecure password storage
- No/ineffective MFA
- Session identifier exposed in URL
- Reuse of session identifier
- Session IDs not validated
Insecure Design
Security flaws caused by:
- Inappropriate or reckless design decisions that lead to potential risk
- Lack of knowledge, time pressure, wrong definition of deployment constraints
- Careless exposure or transfer of sensitive data
- e.g. passwords, phone numbers being sent in logs
Secure design lifecycle as a driver:
- Careful and continuous threat modeling
- Active monitoring of vulnerabilities in third-party dependencies
- Apply least-privilege principles
05. Secure Coding Principles
OWASP Top 10 (Class)
- Broken Access Control:
- Users can act outside their intended permissions
- Causes:
- Not following principle of least privilege
- Lack of access control on some endpoints
- Account manipulation: manipulating URLs, cookies, unique IDs
- CORS misconfiguration
- Countermeasures:
- Deny by default
- Only allow resource owners, not all users, to CRUD
- Re-use authentication systems (e.g. middleware)
- Follow standards (e.g. JWT)
- Example:
- MV720 GPS tracker: unsecured HTTP endpoint
- Cryptographic failure:
- Sensitive data exposed due to weak or non-existent cryptographic algorithms
- Causes
- Plain-text communication
- Old/insecure algorithms
- Hardcoded passwords
- Insufficient randomness
- Countermeasures
- Store only required data
- Don’t cache sensitive data
- Key rotation, proper IV generation etc.
- Example:
- Solana Slope wallet: seed phrases sent in plaintext to central server
- Injection attack:
- Executing user-provided code through an app to another system
- Most frequent: SQL Injection
- Causes:
- Unsanitized user input
- Dynamic queries with no content escaping
- Countermeasures:
- Prepared statements, parameterized queries
- Escape user input
- Example:
- SonicWall (~500K customers): SQL injection issue
- Insecure Design
- Some service with insecurities caused by its design and architecture, not the implementation
- Security Misconfiguration:
- Very broad: security not set up, set up poorly, or misconfigured
- Causes:
- Unnecessary features enabled:
- Ports, services accounts
- Default/unused applications may have vulnerabilities, be outdated
- Revealing too much information through logs/error messages
- Poor coding practices, default passwords, forgetting to re-enable firewall after testing
- Unnecessary features enabled:
- Countermeasures:
- Update configuration policies regularly
- Minimize attack surface
- Automatic configuration deployment (e.g. containers): make redeployment easy
- Example:
- Jira: visibility of new projects was set to public by default; any adversary with the link had access to all data in the project
- Vulnerable Components
- Part of system/application which extends the functionality
- Causes:
- The component may be unsupported, out-of-date, or otherwise vulnerable
- An attacker may be able to exploit that vulnerability to attack other systems
- Countermeasures:
- Remove unused dependencies
- Check they are being maintained
- Keep a centralized list of dependencies and have a process in place to maintain it
- Obtain dependencies from trusted sources
- Don’t broadcast your dependencies
- Authentication Failure
- Bad handling of user identity, authentication, session management
- Causes:
- Brute-force attacks, credential stuffing
- Default/well-known passwords
- No/bad 2FA
- Mishandling session identifiers (e.g. session ID in URL, re-use of identifiers after login)
- ‘Forgot password’ flaws
- Weak password storage
- Countermeasures:
- Good 2FA
- Secure admin credentials, not default
- Check password against requirements, bad password lists
- Example:
- CVE-2022-26138: Atlassian Confluence contains an account with a hard-coded user (disabledsystemuser) and a password defined in a plaintext config file
- UC library room booking: HTTP
- Security Logging/Monitoring Failures
- Insufficient logging
- Exposure of sensitive information in log files
- Causes:
- Relevant information not logged
- Logs not examined
- Logs not backed up in case of server failure or breach
- Logging systems not tested
- Logging levels not used appropriately, exposing sensitive data
- Countermeasures:
- Ensure logs include sufficient context (and are encoded correctly to prevent injection)
- Ensure logs are kept in more than one place
- Regular system monitoring
- Examples
- ColdFusion 8: failed authentication attempts not logged; brute-force attacks possible
- NetProxy 4.03: did not log requests that did not have ‘http://’ in the URL
- Server-side request forgery
- Application fetches data from a user-supplied URL; the attacker can point it at their own server or at an internal path (e.g. a file containing credentials)
- Causes:
- Trusting user-supplied URLs
- Not following principle of least privilege
- Countermeasures:
- Use allowlists for IP addresses, hosts that the application should be able to access
- Block requests to private IPs
- Verify response bodies
- Block non-HTTP/HTTPS protocols
- Example:
- UC Scrumboard: users can specify a GitLab URL; the server checks whether it gets a response, and whether it is a valid Git response
- If you pass in localhost:$port, an attacker can iterate through ports to determine which are in use
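The parameterized-query countermeasure listed under injection attacks can be demonstrated with sqlite3 (the table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

malicious = "x' OR '1'='1"

# Vulnerable: string concatenation lets the input rewrite the query,
# so the OR clause matches every row.
rows_bad = conn.execute(
    "SELECT secret FROM users WHERE name = '" + malicious + "'"
).fetchall()
print(len(rows_bad))  # → 1 (all rows leaked)

# Safe: a parameterized query treats the input purely as data.
rows_ok = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()
print(len(rows_ok))   # → 0 (no user has that literal name)
```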
Unsafe Constructs
See OWASP Secure Coding Practices
TOCTOU Race
When there is a time lapse between time-of-check to time-of-use:
- Two elements (e.g. processes, threads) concurrently making use of the same shared resource
- System interruptions can allow an adversary to substitute files
- May often be exploited with temporary files, especially those with a known/predictable path
- Example:
- Process checks if the user can access the file
- Attacker replaces the file with a symlink/hardlink (e.g. to /etc/passwd)
- Process allows the user to read the restricted file
- Countermeasures:
- Use file descriptors instead (i.e. open once)
- Spawn child processes with restricted permissions
- Disallow creation of files by root (e.g. C compilation)
- Lots of temporary files with well-known file paths?
- If process reuses existing files instead of re-creating, attacker could inject malicious code?
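The 'open once, use the file descriptor' countermeasure can be sketched in Python (POSIX-only; O_NOFOLLOW makes the open itself fail if a symlink was swapped in, closing the check/use gap):

```python
import os

def read_no_symlink(path):
    # Open exactly once. O_NOFOLLOW makes open() fail if the final
    # path component is a symlink, so a link swapped in between a
    # separate "check" and this "use" cannot redirect the read.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd, "rb") as f:  # all later operations use the fd
        return f.read()
```

Any permission check should then be done on the open descriptor (e.g. os.fstat(fd)), not on the path.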
Overflow Issues (Mostly C/C++)
This may cause the system to crash or read garbage data; worse, an attacker may be able to exploit it.
Weak type safety: C silently converts integers by keeping the least significant bits. Incrementing a signed integer past its maximum typically wraps around to the most negative value (formally, signed overflow is undefined behavior in C).
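Python integers are arbitrary-precision and never overflow, but ctypes can mimic C's 32-bit truncation for illustration:

```python
import ctypes

INT_MAX = 2**31 - 1

# A C int32 keeps only the least-significant 32 bits: one past
# INT_MAX wraps to the most negative representable value.
print(ctypes.c_int32(INT_MAX + 1).value)      # → -2147483648

# Assigning a wider value also keeps only the low 32 bits.
print(ctypes.c_int32(0x1_0000_0005).value)    # → 5
```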
Pointer arithmetic:
- C uses pointer + offset
- Array element size defines the step in bytes, with index for the offset
- No bounds-checks on array accesses by default
- If index is user-controlled and not sanitized, it may allow an attacker to read/write to arbitrary memory
Buffer overflow and NOP slide:
- For each function call, the stack contains function arguments, local variables, return address
- Buffer overflow of local variables may allow attacker to overwrite the return address, allowing them to jump to any address
- No-operation (NOP) slide:
- Hard to predict where exactly the attacker-controlled payload is
- Fill memory before the payload with NOPs: processor will simply go through the NOP instructions until it gets to the payload
- The NOP sled means that the attacker does not need to know the exact memory address of the payload
Coding practices:
- Type-safety, static code analysis:
- Use languages with type-safe constructs/libraries where possible
- Bounds-checking e.g. overflow flags, unchecked memory overwrites
- Non-executable stacks or heaps (although it prevents just-in-time interpretation/execution)
- OS-level protection: address space layout randomization (ASLR)
- Least privilege, zero-trust knowledge:
- Require authorization by default: public access should be the exception
- Use securely-generated tokens, HTTP-only flags in cookies (to prevent client-side JS (e.g. malicious extensions) from reading it)
- Treat other components with care, segment internal networks
06. Cryptography 101
Anyone who tries to create his or her own cryptographic primitive is either a genius or a fool. Given the genius/fool ratio of our species, the odds aren’t very good.
Bruce Schneier
Current Events: Experian
Experian: US credit score service.
Past month:
- Accounts were being hijacked
- Hypothesis: if you create a new account with an existing user’s details (e.g. email), you can gain access to their account
- Experian denied allegations, even after researchers hacked their own accounts
- Class action lawsuit filed
Principles
Security protocols are more than passwords.
At the core, security protocols are about preventing malicious people from doing bad things.
Security protocols exist outside of software:
- Accessing a building with a card
- Making sure you get the wine you paid for, not some cheap substitute
- Accessing your car or house with a key
Eavesdropping risks:
- Lurked PINs on your credit card/phone
- Amplified then stolen encrypted car key codes
- Vaccine pass QR codes
Simple Authentication Principle
Notation:

T → G : T, {T, N}_K

Where:
- T is a token
- G is an access gateway (e.g. garage door)
- N is a nonce - a single-use, unique number used to prevent replay attacks
- K is an encryption key
- {T, N}_K is the encrypted value of T and N with key K
- The LHS of the colon denotes communication between entities (sender/receiver)
The nonce is used to prevent replay attacks.
Challenge and Response
Often used by car transponders:

E → T : N
T → E : {T, N}_K

Where:
- E is the engine controller
- T is the car key transponder (e.g. RFID, radio)
- N is a nonce
- {T, N}_K is the encrypted value of T and N with key K
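A minimal sketch of this challenge-response exchange, using an HMAC in place of the encryption step (a common concrete choice; the shared key is illustrative):

```python
import hashlib
import hmac
import secrets

KEY = b"shared-secret"  # illustrative pre-shared key between E and T


def challenge():
    """Engine controller E sends a fresh nonce N."""
    return secrets.token_bytes(16)


def respond(nonce):
    """Transponder T proves knowledge of K by MACing the nonce."""
    return hmac.new(KEY, nonce, hashlib.sha256).digest()


def verify(nonce, response):
    """E recomputes the expected response and compares in constant time."""
    expected = hmac.new(KEY, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


n = challenge()
r = respond(n)
```

Because each challenge uses a fresh nonce, a response recorded by an eavesdropper is useless against the next challenge.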
Early 2FA
S → U : N
U → P : N, PIN
P → U : {N, PIN}_K
U → S : {N, PIN}_K

Where:
- S is the server
- U is the user
- P is the password generator
- N is the nonce/random challenge generated by the server
- PIN is the user’s PIN/password
- K is the key stored on the password generator and server
- {N, PIN}_K is the encrypted value of the nonce and PIN with key K
Physical 2FA Devices
A physical device is used to generate authentication numbers:
- Chip Authentication Program (CAP): challenge-response with key and mask
- One-time password (OTP): pseudo-random password generated on a device, or via SMS
Generation algorithm (protocol):
- Requires clock synchronization between the device and remote server
- Previous values (i.e. sequence to avoid replay attack) or challenge
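The clock-synchronized variant is the widely used TOTP construction (RFC 6238); device and server run the same derivation over a shared secret:

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, t=None, step: int = 30, digits: int = 6) -> str:
    """Derive a one-time code from the shared secret and the current time window."""
    counter = int((time.time() if t is None else t) // step)  # clock sync matters here
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 10 ** digits:0{digits}d}"
```

The server additionally remembers the last accepted counter so a code cannot be replayed within its window.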
Reflection Attack
Adversary finds a legitimate ‘password’ generator and then performs a MITM attack.
Where:
- S is the server
- U is the user
- A is the adversary
Failures
Failures are often in the protocol:
- CAP button overloaded: allowed repeat transactions with amount = 0
- Attackers could social engineer the code from the users (who believe it is safe), then use the code to perform non-zero amount transactions
- OTP only checked against previous passcode
- If you had two cards, you could simply switch between them?
- SSL/TLS encrypt data, but endpoints/metadata can leak data
- Key fob cloning (repeat attack):
- Some car keys would broadcast the same key continuously, allowing an attacker in range of the signal to later replay the same signal to unlock the car
- Simple solution: use a counter; value must be strictly increasing
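The counter fix can be sketched as a receiver that accepts only strictly increasing counter values (a simplified rolling-code scheme; real fobs also authenticate the counter):

```python
class FobReceiver:
    """Accepts an unlock message only if its counter is strictly increasing."""

    def __init__(self):
        self.last_counter = -1

    def accept(self, counter: int) -> bool:
        if counter <= self.last_counter:
            return False          # replayed or stale signal: reject
        self.last_counter = counter
        return True


rx = FobReceiver()
```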
Reducing the number of failures:
- Use the right math for the right purpose
- Ensure encryption keys are kept secret
- Ensure keys can be revoked
General Encryption Principles
C = E_K(P)    P = D_{K'}(C)

Where:
- P is the plain text and C is the encrypted text
- E is the encryption function and D the decryption function
- K is the encryption key and K' is the decryption key

Examples:
- K = K': symmetric keys (e.g. AES)
- K ≠ K': particular derived keys (e.g. flashed on micro-controllers)
Cipher Examples
Caesar cipher:
- c = (p + k) mod 26, where p is the plaintext character, k is the encryption key and c is the encrypted character
- Weakness: frequency analysis
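A sketch of the shift in code (uppercase letters only; decryption is the same function with the negated key):

```python
def caesar(text: str, k: int) -> str:
    """Shift each letter by k positions, wrapping around the alphabet."""
    return "".join(
        chr((ord(c) - ord("A") + k) % 26 + ord("A")) if c.isalpha() else c
        for c in text.upper()
    )
```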
Vernam cipher:
- Bit-by-bit symmetric encryption with a key that is as long as the plaintext
- c_i = p_i ⊕ k_i, where c_i is the encrypted bit and p_i is the plaintext bit
- Theoretically unbreakable if the keystream is truly random and only used once
- But does not ensure integrity: attacker can flip bits. If they know the data structure, they may be able to flip specific bits for harmful effects
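The bit-flipping weakness can be demonstrated directly: XOR decrypts correctly, yet an attacker who knows the message layout can change a digit without knowing the key (the message format here is illustrative):

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


msg = b"PAY $0100"
key = secrets.token_bytes(len(msg))    # keystream as long as the plaintext
ct = xor_bytes(msg, key)

# Attacker flips ciphertext bits so the amount digit decrypts differently
tampered = bytearray(ct)
tampered[5] ^= ord("0") ^ ord("9")     # position of the first amount digit
forged = xor_bytes(bytes(tampered), key)
```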
Playfair block cipher:
- Simple shift-based block cipher
- Frequency analysis can be performed using repeating blocks
- Hence, transformations are applied to the plaintext to prevent repeated blocks
Feistel cipher:
- Ladder structure with multiple rounds applied to each half of the plaintext
- Round function F applied to the RHS
- Result XORed with the LHS
- Swap left and right, then repeat
- Round keys usually derived from one master key
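A toy Feistel network following the steps above (the round function is an arbitrary hash-based stand-in; omitting the final swap means decryption is the same routine with the round keys reversed):

```python
import hashlib


def round_fn(half: bytes, round_key: bytes) -> bytes:
    """Stand-in round function F: any keyed pseudo-random function works."""
    return hashlib.sha256(round_key + half).digest()[: len(half)]


def feistel(block: bytes, round_keys) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for k in round_keys:
        # New right = old left XOR F(old right, k); halves then swap
        left, right = right, bytes(a ^ b for a, b in zip(left, round_fn(right, k)))
    return right + left   # undo the last swap: decryption = same code, keys reversed


keys = [b"k1", b"k2", b"k3", b"k4"]   # round keys derived from a master key
ct = feistel(b"EIGHTBYT", keys)
pt = feistel(ct, list(reversed(keys)))
```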
Hash functions and control keys:
- Used to check the integrity of messages
- Initially used with wired telegraph payments using a codebook
- e.g. SHA-1, SHA-256
Key Management
User-defined keys are relatively weak:
- Not enough entropy: key needs to be larger
- Using proof of a common secret: vulnerable to replay attacks
- Public key encryption: how can you trust the public keys you receive?
Public key infrastructure:
- Have a chain of trust, hard-coding trust for one or more root certificates
- Trust models for CAs:
- Separate domains: one root CA
- Cross-certification/mesh: each CA issues cross-certificates to each other
- Cross certification: each CA issues certificate to the other, allowing devices which trust only one of them to trust certificates signed by the other
- Requires n(n-1) certificates for n CAs
- Bridge-CA model: one central bridge CA which cross-certifies with each CA
07. Access Control and Policies
Restricting access to the system.
Christchurch hot pools: stored proof of residence (driver’s license, passports) in system; had vulnerable plugin which allowed hacker to access this data. NZ privacy laws: data was not needed after initial verification, so they should have destroyed the data instead of storing it.
Early Memory Access Model
Processes are isolated from each other:
- Supervisor: the sole program allowed to access the descriptor registers and load programs
- Descriptor register: memory addresses that each program can use
- Memory descriptor: address and size of memory allocations
- If one process is given access to another’s memory (for data sharing), their allocations must be contiguous
Relies on a privileged bit to control access to the descriptor register. It must be stored in read-only memory.
Limitations:
- All or nothing: no way of sharing memory between processes
- Printers etc. require low-level access to I/O
- Sub-routines make the system complex with contiguous memory access
- Cannot restrict the access to sub-routines in a fine-grained way
- Addresses of processes are linked to the hardware
Multics
- Need for access permissions (e.g. read/write/execute)
- Flexibility between hardware addresses and process memory
- Ability to dynamically link sub-routines
Each process has an array of segment descriptors:
- Pointer to physical start address
- Segment length
- Access control bits
- R/Read, W/write, X/execute
- M: supervisor-only flag
This allowed the creation of an access control matrix:
- Subjects: users, groups
- Objects: files, programs, external services etc.
- Row: capability list (C-list): all objects the subject can access
- Column: access control list (ACL): the subjects that can access the given object
- Cells: the permission level (e.g. RWX)
- An instance of the matrix creates the access control policy
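The matrix, C-lists and ACLs can be sketched as views over the same data (subjects, objects, and permissions here are illustrative):

```python
# Access control matrix: (subject, object) -> permissions
matrix = {
    ("alice", "report.txt"): "rw",
    ("bob",   "report.txt"): "r",
    ("alice", "payroll.db"): "rwx",
}


def capability_list(subject):
    """Row of the matrix: every object this subject can access."""
    return {obj: p for (s, obj), p in matrix.items() if s == subject}


def access_control_list(obj):
    """Column of the matrix: every subject that can access this object."""
    return {s: p for (s, o), p in matrix.items() if o == obj}
```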
Exercise: Assignment 2
- List all subjects in rows:
- User, superuser, not logged in
- Project member (i.e. user groups)?
- User, superuser, not logged in
- List all objects in columns:
- What are objects? Whole pages? Concepts (e.g. user profiles)? Features?
- Own profile, other public profile info, other private profile info
- Project
- Billing
Unix - Discretionary Access Control
Everything is a file.
| Is directory? | user | group | other |
|---|---|---|---|
| d | rwx | rwx | rwx |
| | setuid (s) | setgid (s) | t-bit (t) |

Special bits replace the execute bit
- A file belongs to an owner, who can define the permissions for all other users
- The group is inherited from a parent directory by default
- Permissions are strictly checked in order: user, group, then other
- If the owner has no read access (but others do), the owner cannot read
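The strict check order can be sketched as follows; the owner case is decided first and alone, so a mode like 044 locks the owner out even though 'other' can read (uids/gids are illustrative):

```python
def allowed(uid, gid, file_uid, file_gid, mode, want):
    """Check classes strictly in order: user, then group, then other."""
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if uid == file_uid:
        cls = (mode >> 6) & 7      # owner: only the user bits apply
    elif gid == file_gid:
        cls = (mode >> 3) & 7      # group member: only the group bits apply
    else:
        cls = mode & 7             # everyone else
    return bool(cls & bit)
```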
Mandatory Access Control
Security policies are not under the user (or even admin’s) control. In comparison, discretionary access control gives owners (e.g. creator of the file) full control.
This was:
- Created under the multi-level security initiative of the US government
- Defined security clearance levels for subjects/objects
- Top secret, secret, confidential, controlled unclassified, unclassified
Influenced other access control mechanisms:
- Digital Rights Management (DRM): prevents sharing of network-accessible resources
- Trusted Platform Module (TPM): monitors and locks the boot process with hashes
- e.g. Secure Enclave on iOS, Titan-M on Pixel, SELinux on Android
- Can be used to lock customers to some hardware or from installing different OSes
Rings of protection:
- Ring 0 (inner-most ring) is the most privileged
- Backed by hardware
- Access across rings can only happen when:
- An inner ring gives access to a process through its program segment
- and only at pre-authorized entry points (e.g. phone permissions)
Windows
Access control appeared in Windows NT (NB: UC’s domain is UOCNT):
- Specify groups and users, inspired by UNIX
- New permission attributes:
- Change ownership
- Change permissions
- Delete
- Attributes are not binary (e.g. `AccessDenied`, `AccessAllowed`, `SystemAudit`)
- Containers of objects (since Windows 8) with inheritance of permissions
- Allows more flexibility in installing printer drivers
Can create domains of users:
- Trust between domains can be uni or bidirectional
- Permissions are managed in the registry
- Users are remotely managed by Active Directory
- User profiles and TLS certificates can override permissions
Lots of users, lots of permissions, lots of programs: a nightmare for admins, and incorrect permissions being assigned (and possibly even just giving admin access to everyone).
Take two:
Attack surface hardened with a closed kernel, TPM added, and most drivers were removed from the kernel:
- User account control: all apps run under standard user rights
- Additional permissions explicitly asked (i.e. elevated privilege pop-up at run-time)
Cleaner abstractions with principals and objects:
- Security principals are groups, users, processes etc. with access rights
- Each principal has a security identifier
- Objects can be files, resources (e.g. printers), registry keys
- Dynamic access control with contexts added to Active Directory (e.g. work vs. home)
Web Browsers
Reign of cookies:
- Web servers are usually stateless
Security measures:
- Secured transmission (HTTPS/TCP)
- Anti-cross site request forgery: a token generated by the web server, often in a hidden input field
- Prevents a user visiting an attacker-controlled website from unknowingly making a request to a victim site
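The anti-CSRF token flow in miniature (server-side session storage assumed; names are illustrative):

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Generate a per-session token; the server embeds it in a hidden input field."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def check_csrf_token(session: dict, submitted: str) -> bool:
    """An attacker's site cannot read the victim's token, so it cannot submit it."""
    return hmac.compare_digest(session.get("csrf_token", ""), submitted)


session = {}
token = issue_csrf_token(session)
```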
Cross-origin:
- Some libraries/style sheets passed at request time
- https://threatpost.com/amazon-alexa-one-click-attack-can-divulge-personal-data/158297/
08. Monitoring and Detecting Intrusions
Simplicity is the ultimate sophistication
Intrusion detection systems: IDS
In recent news: Microsoft Teams GIFShell Attack
Convince user to install a stager. Once done:
- Command execution:
- Attacker sends GIF with embedded commands
- Teams logs the message in publicly accessible logs
- Stager can then extract commands from GIF and execute them
- Exfiltration:
- Teams survey card filename has no length limit
- Stager submits card to the attacker’s public webhook
- Initial infection:
- Sharepoint link generated for any files that are uploaded
- When message sent, contains a POST request with SharePoint link to that file (e.g. image)
- Attacker can replace that URL and Teams will display it as the original file type
- e.g. send an executable disguised as an image: when the user clicks it, it is downloaded
- Can also use deep links to Excel etc., which may have vulnerabilities that allow RCE
- By default, Teams messages can be received from people outside the organization
Hardware
Hardware-level Protection Mechanisms
Intel requires that privilege level can only be changed by kernel processes:
- However, the instruction set was not designed for virtualization, requiring workarounds
- Software Guard eXtension (SGX) creates enclave - encrypted, trusted zone
ARM uses TrustZone to:
- Support hardware-level cryptographic functions
- Lock phones to networks
- Run licensed/critical code (e.g. fingerprint, SIM operations)
Hardware sandboxing with CHERI (ARM):
- Allows fine-grained support to isolate processes at the CPU level
- Enables sandboxed memory allocations
- e.g. each browser tab runs in a separate process
Issues with enclaves:
- Puts a very high level of trust on the manufacturer
- DRM
- TODO:
- IN THE EXAM!!!
- In relation to protection rings:
- Protecting access to hardware via software access control (e.g. kernel/userspace barrier)
- Or running completely separate (specialized?) hardware
- e.g. bank apps on Android can use Secure Environment
- Downsides: any vulnerabilities in the apps/endpoints using them can allow very low-level access to the hardware (e.g. cryptographic keys used for device boot)
- Intel SGX:
- Untrusted section can create one or more enclaves in encrypted memory
- Enclaves cannot be modified after they are built
- Untrusted sections can later call functions in the enclave
- SGX deprecated in 11th/12th-gen Core processors
- 4K Blu-rays require it; users won’t be able to view content they bought at the highest quality in the future
- Intel SGX Explained
- Runs at ring 3 only (-1/hypervisor, 0/kernel, 1-2/drivers (not really used), 3/application)
- Untrusted section can create one or more enclaves in encrypted memory
- ARM TrustZone
- ARM: has multiple processor modes (e.g. user, supervisor, system, …, hypervisor)
- TrustZone: secure and non-secure states (i.e. orthogonal to rings)
- Can partition SoC peripherals (e.g. areas of RAM only used by secure mode)
Mobile Platforms
OS:
- iOS: simplified BSD with separate secure enclave
- Android: simplified Linux with SELinux features
App management:
- iOS: walled garden, with all apps being reviewed by Apple
- Android: signed by developers, with some being ‘Play Protect Verified’
Permissions:
- Android:
- ‘Dangerous’ permissions must be accepted by users at run-time (post-Android 8)
- Previously, would be shown during install and people would just click yes
- Now users allow/deny one by one: allows them to make more informed decisions
- Vendors can define their own permissions: can lead to fragmentation
- SoK: Lessons Learned from Android Security Research for Appified Software Platforms:
- Users cannot associate privacy risks with permissions; may underestimate or overestimate risks
- Insecure IPC (exposed activities?): other apps can use this for privilege escalation
- Web views: web to app/app to web for privilege escalation and data leakage
- Permissions:
- Over-privileged applications
- Ad and other libraries running in same process and inheriting same privileges
- Only shown on install/first use, not whenever the permissions are actually used
- No mandatory access control (until SELinux)
- APIs
- Lack of a good secure remote code loading API led to unsigned implementations
- MITM attacks in 95% of cases where developers customised TLS certificate validation
Monitoring and Response
MAPE-K control loop
(Monitor, Analyze, Plan, Execute), Knowledge.
Circa 2003, need for autonomic managers overseeing the functioning of running systems:
- Self-configuring: system can deploy nodes on-demand
- Self-healing: the system can handle failing components
- Self-optimizing: able to manage the workload dynamically
- Important for cloud workloads where you pay by CPU hours etc.
Using a knowledge source (log files, system events):
- Monitor: collect ‘interesting’ events; pass problematic ones on to the next step
- Analyze the collected data (and predictions) and evaluate the issue
- Plan a change in accordance with policies
- Execute the plan (multiple actions/steps)
Exercise: MAPE-K on Assignment 2 Codebase
- What aspects of the system should be logged?
- System load (e.g. requests/minute, CPU usage)
- Actions taken to the system (to allow rollbacks)
- What data do you need to be captured? Why?
- For each request, IP addresses, user agents etc.
- POST requests, uploads etc.: non-sensitive data
- Usernames probably not sensitive, although users may accidentally type their passwords into their username fields
- Access to admin panels
- Request response times
- CPU, RAM, disk usage
- Database queries, number of rows returned, processing time
- Non-standard requests (e.g. unused ports, unsupported protocols)
- What safety measures can you apply automatically?
- Append-only logs or cloned logs
- Notify admin (e.g. through email) when anomalous events occur
- IP throttling/bans (e.g. `fail2ban`)
- reCAPTCHA
- Disabling pings, unused protocols etc. (or isolating them onto a different machine)
- Alert-worthy events: admin panel login/password change, low disk, high CPU, long response times
Quality attributes:
- How can you achieve self-configuration
- What part of the system should self-heal?
- Is managing the workload simply a matter of increasing resources?
- Or is this a symptom of an issue?
Base Rate Fallacy
Assuming that ‘interesting’ events are uncommon:
- The base rate is the ratio of ‘interesting’ events to total events
- A small false positive rate is large given a large population
- This may lead to a vast majority of identified events being false positives
People cannot go through a thousand events to find the one true positive.
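Worked numbers (illustrative): even a 1% false-positive rate swamps the true alerts when attacks are rare:

```python
events = 1_000_000       # total events observed
attacks = 100            # 'interesting' events: base rate of 0.01%
false_positive_rate = 0.01
detection_rate = 1.0     # assume a perfect detector for the real attacks

false_alarms = (events - attacks) * false_positive_rate   # ~10,000 alerts
true_alerts = attacks * detection_rate                    # 100 alerts
precision = true_alerts / (true_alerts + false_alarms)    # ~1% of alerts are real
```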
Intrusion Detection Systems
These can be categorized into three main techniques:
- Signature-based:
- Identify specific patterns that match known-bad patterns
- Fast and low false-positive rate
- Can only detect known attacks
- Need examples of malicious traffic
- Rules are defined at a low level
- The more rules defined, the more resources are required to monitor traffic in real time
- Specification-based:
- Events deviate from per-application specifications of legitimate actions
- Model the normal trends; anything outside that raises an alarm
- Uses manually-developed, system- or protocol-specific specifications
- Deep understanding of the system required
- Behavioral model of the system required
- Needs a working implementation, possibly in active use, or a similar application to model this
- Can detect new attacks
- Not flexible to changes to the system - requires security-oriented testing during development
- Anomaly-based:
- Machine learning used to record the steady state: requires recording of current system behavior
- Cannot detect anomalies that were present during the training stage
- Can detect new attacks
- May have a higher false-positive rate compared to other techniques
- If expected behavior wasn’t captured during training
- Changes to the system will require re-training
- Difficult to model what is normal behavior - what if it is currently being attacked?
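The anomaly-based idea in miniature: learn the steady state from a training window, then flag values far outside it (a three-sigma rule over requests/minute; the numbers are illustrative):

```python
import statistics

baseline = [120, 118, 125, 119, 121, 117, 123]   # requests/min recorded during training
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)


def anomalous(value: float, k: float = 3.0) -> bool:
    """Flag anything more than k standard deviations from the learned mean."""
    return abs(value - mean) > k * stdev
```

Anything already present in the training window, including an in-progress attack, becomes part of "normal": the limitation noted above.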
Factors to consider:
- Initial setup and deployment
- Types of detected events
- Flexibility to new events
- Extensibility by practitioners
Low-level IDS pipeline (e.g. Snort):
- Packet sent to decoder:
- Determines packet protocol (e.g. IP, TCP)
- Checks for malformed packets, anomalies in the header
- Preprocessors:
- Checks for IPs that are banned etc.
- Handles IP fragment reassembly etc.
- Processes and normalizes the data into a standardized format
- Detection:
- Snort rules, custom detectors applied
- Log and verdict:
- Discard and log the packet, or send it to the downstream machine
All IDSes have a pipe-and-filter architecture, with the fastest, most basic rules being applied first to remove the most obvious bad packets.
Networking
LANs:
- Hubs: broadcast packets to all connected devices
- An infected device can monitor all traffic
- Switches: send packets only to the intended receiver
- Isolates traffic and may allow firewall rules to be applied
Ethernet:
- Packet sniffing
- Hardware-supported sniffing
- Port mirror: switch duplicates traffic to another device
- Test access port (TAP): port which reads the traffic going through the device
TCP:
- Three-way handshake
- SYN, SYN/ACK, ACK
- SYN: send initial sequence number
- ACK: acknowledge receipt of the sequence number
- Each packet increments the sequence number
DDoS:
- TCP doesn’t validate the sender’s IP address: can use this to make servers flood the victim with packets
IDSs contain rules to detect suspicious activities:
- Ingress: filter packets entering the network/machine
- Egress: filter packets leaving the machine/network
DNS poisoning:
- Inject fake DNS entry to DNS server
- DNS servers synchronize with each other, so the poisoned entry spreads
- https://www.eweek.com/cloud/dns-poisoning-suspected-cause-of-huge-internet-outage-in-china/
09. Data Privacy and Sovereignty
Current Events: GitLab RCE
Any user with a login could remotely execute code through the GitHub import feature.
Patched in 15.3.1/15.2.3/15.1.5
Communication:
- None until patch was released
- Workarounds released in case organizations do not/cannot upgrade
- No individual communication with organizations
- UC did not know about the issue
Current Events: Optus Hack
2nd largest telecommunications company in Australia. 5 million drivers’ license/passports stolen.
Broken access control: faulty API which allowed the attacker to dump a large amount of data.
Initially released 10K entries as proof and requested a million-dollar ransom; apparently changed their mind and deleted the data.
Data Privacy
NZ Privacy Act 2020:
- Defines 13 principles - guidelines, not rules
- Principles 1, 2, 4: collection of data
- Principle 3: privacy statements
- Principles 5, 9: data storage security and duration
- Principles 6, 7, 8: how data can be verified/amended
- Principles 10, 11, 12: usage of data in/outside NZ
- Principle 13: ID numbers and how (not) to use them
NZ Google Street View Wi-Fi collection (2010):
- Network information (SSID, name, signal strength)
- Payload information from unsecured networks
- No stated reason for collecting the data, and it was never used for anything
EU GDPR:
- General Data Protection Regulation
- Applies to organizations offering goods/services to the EU
- 99 articles in 11 chapters
- Processors must give full information
- Right to erasure (to be forgotten)
- Right to object to processing of data
- More explicit rights to lodge complaints
- Transfer of data outside the territory is conditional on safeguards
- Strong data sovereignty - requires mutual agreement
- Explicit articles on liability and penalties
While the GDPR imposes additional obligations on agencies, and provides additional privacy rights to EU residents, an agency is likely to comply with most of its obligations under the GDPR if it complies with the Privacy Act.
No…
- Web usability and privacy standard for NZ public service websites
- Must be accessible: readable by a screen reader and printable
- Must identify as being affiliated to a government organization
- Link to the main `govt.nz` website
- Have a copyright and privacy statement
- Cookie usage
- Collation/usage of personal data
- Rights to access/amend/delete information
ISO 27000:
- Closed standards - costs money to read them
- 27002 information security:
- Human resource and asset management
- Access control, inc. devices and procedures
- 27004 monitoring:
- What to monitor and measure
- Define, maintain, evaluate monitoring process
- Domain-specific series:
- 27033 (7 parts) network security
- 27034 (7 parts) application security
- 27035 (4 parts) incident management
NZ Information Security Manual:
- Mostly focused on government-led services/agencies
- Practitioner’s manual covering:
- Governance: management, roles
- Security of personnel and facilities
- Monitoring and incident response
- Logging/forensics
- Cryptography requirements
- Uses multi-level security (top-secret -> unclassified)
OWASP Secure Code Review Guide V2:
- Vulnerability areas:
- Data Validation
- Authentication
- Session Management
- Authorization
- Cryptography
- Error Handling
- Logging
- Security Configuration
- Network Architecture
- Source code scanners:
- Useful for identifying particular flaws, but cannot deal with flaws related to business logic
- Have a high false-positive rate and hence require manual review
- STRIDE:
- Spoofing another user’s identity or permissions
- Tampering: with HTTP requests made by clients
- Repudiation: being unable to match activity to a user
- Information disclosure: private information breached
- Denial of service
- Elevation of privilege
- DREAD:
- Damage: consequences of success
- Reproducibility: ease of attack; automation
- Exploitability: resources (and initial access required) for the attack
- Affected users: number of affected users, permission levels
- Discoverability: ease of attackers in finding the vulnerability
- Mitigation:
- Not mitigated
- Partially mitigated: DREAD reduced
- Fully mitigated
- Cyclomatic complexity: fixing high CC code has a greater chance of introducing new errors
- Top 10 (2017):
- Injection
- Blind SQLi: boolean (valid/invalid query) or timing-based
- Broken auth/session management
- Limit brute force
- Protect forgot password
- Require HTTPS
- Out of band comms (e.g. 2FA)
- Session hijacking:
- Use HTTP-only cookies; no JS access
- Don’t send IDs in URLs
- Change session ID on elevation (e.g. log in)
- As well as periodically (or enforce periodic logouts)
- XSS
- Escape user input when rendering a page
- Insecure direct object reference
- Verifying that user has access to view/edit some object (e.g. ID for a different account)
- Data binding: attacker passes in additional parameters that the endpoint is not expecting but is automatically bound
- Security misconfiguration:
- URL rewriting: double URI encode; bypass initial security control layer, but next layer also runs decoding without security checks
- Sensitive data exposure
- Proper encryption at rest and in transit
- Use of standardized/validated implementations
- Avoid wildcard certificates
- Reusing salts, IV, insecure entropy source
- Secure key storage
- Missing function-level access control
- Response checks auth, but request already executed
- CSRF
- User on malicious site executes request to target site
- Components with known vulnerabilities
- Unvalidated redirects/forwards
OWASP Secure Coding Practices Quick Reference Guide V2:
- Potential targets:
- Software/associated information
- OSes of the servers
- Backend DB
- Other applications (if running in a shared environment)
- User systems
- Other software the user interacts with
- Checklist:
- Input validation:
- Backend validation (trusted system)
- Partition data as trusted/untrusted
- Have a centralized input validation routine
- Specify character set, canonicalize strings
- Reject on validation failure
- Validate all data: URL parameters, HTTP headers, cookies etc.
- Ensure only ASCII
- Validate redirects (attacker sends request to redirect target)
- Validate data range, length, type
- Use allowlists of characters where possible
- Output:
- Encode on trusted system
- Use standard, validated routine for encoding
- Sanitize untrusted data
- Authentication:
- Specifically mark publicly accessible pages rather than the other way round
- Use well-tested authentication services/libraries
- Centralize authentication code/services
- Separate authentication logic from business logic
- Authentication controls should fail securely
- Validate authentication data only when all inputs received
- Return minimal information about authentication failures
- Store secrets securely (not in source code)
- Error handling
- Do not log session identifiers, account information etc.
- The application, not the server, should handle application errors
- Log both successes and failures
- Log4j
- Use a unified logging system
- Log all:
- Input validation failures
- Authentication attempts
- Access control failures
- Tampering attempts
- Requests using invalid/expired session tokens
- System exceptions
- All admin functions
- TLS failures
- Crypto failures
- Use a HMAC to validate log integrity
- Data protection
- Least privilege
- Purge sensitive data from caches after they are no longer required
- Remove unnecessary information about system details from documentation and comments in user-accessible code
- Do not include sensitive information in GET requests
- Disable client-side caching when required
- Implement access control for data stored on servers (e.g. cache directory only accessible by specific users)
- Misc
- Disable unnecessary library/framework/system functionality
- File uploads: check headers/magic bytes, not just the extension
- Never send absolute paths to clients
- Make application files read-only where possible
- Raise/drop privileges as late/soon as possible
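An allowlist validation routine in miniature (the username policy is an illustrative example):

```python
import re

# Allowlist: define exactly what is acceptable, rather than denylisting bad input
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(raw: str) -> str:
    """Reject on validation failure instead of trying to repair the input."""
    if not USERNAME_RE.fullmatch(raw):
        raise ValueError("invalid username")
    return raw
```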
Secure Code Review Best Practices:
- Understand the developer’s approaches to security
- Use multiple automated tools
- Do not assess the level of risk: let the development team do this
- Focus on the big picture, not individual lines of code
- Pick a small subset of weaknesses to manually investigate
- Follow up with development team
- Secure code review != pen testing
Penetration testing on live systems:
- May cause crashes or heavy load, impacting legitimate users
- Bug bounties: run on sandboxed systems and have a legal framework
- Disclosure: what happens if the company does not fix it?
Māori Data Sovereignty:
- Te Mana Raraunga network:
- Founded in 2016
- Individuality vs collective ownership
- All data collected from Māori should belong to Māori
- Data should be analyzed in context with help from Māori
- Six principles:
- Rangatiratanga: authority and self-determination
- Whakapapa: data in their relationships
- Whanaungatanga: obligations and accountability
- Kotahitanga: collective benefit and capacity
- Not just for a private benefit
- Manaakitanga: reciprocity, respect and consent
- Ask for consent
- Kaitiakitanga: guardianship and ownership
- Build knowledge within the Māori community
Patriot Act (9/11), CLOUD Act (2018):
- Government can request access to any and all data held in US servers or US-led companies
- Lots of companies based in Ireland (for tax purposes): the US government wanted access to data held there and was initially blocked by the courts
- The fix? Write a new law to allow it
China National Intelligence Law:
- Applies to all Chinese-led companies or those registered in China
- Allows all means necessary to carry out intelligence work
Local legal agreements can prevent data transfer:
- Auditing of data sometimes requires formal approval (e.g. unions in France)
- GDPR picky on data storage location (mutual exchange agreement)
- GDPR and CLOUD can conflict
- US requests must comply with GDPR
VPNs vs Tor:
- Tor:
- Access to Tor via rendezvous points
- Data goes through intermediaries: proxies
- Data encrypted in layers, once per relay (three relays)
- Only the entry node knows your IP address
- Tor exit nodes may be known
- Tor provides anonymity
- VPNs:
- VPNs provide privacy
- Data sent to server to mask your IP
- Server must be trusted: it knows your identity and the sites being visited
- VPN servers may be known