02. Principles

Controversial topic debates: defending extreme viewpoints is difficult and requires research to justify the viewpoint.

Software engineering principles not black and white:

Contradicting principles
Dependent on context, priorities, constraints, requirements
Software is abstract; unlike other engineering areas, there are no laws of physics; we create the rules

Technical Debt

Design decisions made in the past under circumstances that are no longer relevant Conscious, un-ideal decisions made in the past that must be corrected

Ward Cunningham, 1992: quick and easy approach that comes with interest - additional work that must be done in the future: the longer you wait, the more code relies on the debt - interest grows over time.

Design stamina hypothesis (Martin Fowler): in a time-functionality graph:

Good design is linear
Bad/no design starts of faster, but drops off over time
Where the lines meet is the design payoff line

Types of Technical Debt

Two axes: deliberate/inadvertent, and reckless/prudent

Deliberate-reckless: don’t have time to design
Deliberate-prudent: must ship now and deal with the consequences
- The ‘best’ type of technical debt
Inadvertent-reckless: not understanding the technical debt in the project
- The worst type: don’t even know you are accruing technical debt
Inadvertent-prudent: understanding the accrued technical debt only after writing it

Reasons for Debt

Time: deadlines
- Faster time to market may lead to increased short-/long-term budget
- Prototypes: usually end up being part of the shipping product even though they should be thrown out once they are done
Money: budget constraints
- Interest must be paid eventually though
Knowledge/experience

Caused by:

Change in business decisions
Market changes
Scope changes/creep
- The scrum master should:
  - Be the interface between the development team and the rest of the world: marketing, management, etc.
  - Never be part of the development team; they should be part of the management so that they have the authority to protect the team and say ‘no’ to the needs/wants of others
    - In reality, it is cheaper to have them part of the team
Resourcing changes
Poor management
Inexperienced team

Measuring Technical Debt

302: a lot of debt by the end of the year.

Measure how much technical debt there is by:

Checking how much refactoring is being done
Measuring sprint velocity
- This must be done from the start

Types of debt:

Deliberate technical debt can easily be measured by documenting the debt how much time it would take to pay it off
Inadvertent technical debt is more difficult to measure
Debt from third-party libraries is inadvertent, reckless technical debt (although we can get a rough estimate of debt using metrics like the number of open issues)

Interest rates:

If the interest rate is high, it will only be used in extreme circumstances
- e.g. preparing for demo for VC funding
- For critical applications (e.g. planes, banks), technical debt is extremely expensive
Sometimes, the debt never needs to be paid off:
- Prototypes
- When you won’t build on top of the debted code
- Programs with short lifespans - e.g. for short advertising campaign
- Firmware - will rarely be updated
  - But when a new version of the hardware is released, it will likely be reused

Positive/negative value, visible/invisible attributes:

Visible, positive: feature
Visible, negative: bug
Invisible, positive: architecture
Invisible, negative: technical debt

Pick a process/framework (Scrum/Kanban/Waterfall): Which part is devoted to Technical Debt correction/payment?

Fan-in vs Fan-out

Fan-in:
- Number of direct dependents
- Utility functions should have a high fan-in
- the larger the fan in, the more stuff breaks when the module breaks
Fan-out:
- Number of direct dependencies
- Initialization function will likely have a very high fan-out
- Any dependency breaking may break the entire module

Refactor vs Re-engineering

Refactor: rewrite that does not change the module’s external interface (‘refactor’ in IDEs change method names and hence signatures, so it isn’t actually a refactor)
Re-engineering: rewrite that changes the interface and hence requires dependents to be updated

Hence, refactorings should be done as-you-go while re-engineerings should be done infrequently and only after careful planning.

Reuse vs KISS

Object-oriented programming built on:

Code reuse
- Opportunistic: create reusable modules/methods as you go
  - Internal/external: when you use external libraries, you take on their technical debt as well
- Planned/strategic: create modules/methods in preparation for plans
Modeling the real world

But reuse didn’t work - requirements for each program and the abstractions required differ.

Reuse is big design up front:

Waterfall: objects and entities must be designed
‘Just in case’ planning
Generalized utility functions
More planning/analyses = cost savings (if you get it right)
Bugs easier to find and fix

Unfortunately, determining the ‘correct’ design is impossible until implementation.

Situations when reuse does work:

Design patterns (not code)
Libraries (module with methods that you can call)
Frameworks (frameworks call your code)

Sapir-Whorf hypothesis/linguistic relativity: the structure of a language influences how you think. In programming terms: the programming paradigms we are used to influence our mindset and how we solve problems.

Reuse requires generic and abstract code/thinking:

Abstraction: extracting commonalities between similar classes/objects to create more generic class
Extensibility:
- Open-closed principle
- Minimize impacts of future changes on existing classes
- Generic/abstract classes/packages
- Planning for future ‘maybes’
  - Sometimes may be over-engineering
  - Future needs may change

KISS:

You Ain’t Gonna Need It (YAGNI)
- Do not implement until needed
- Do not try and predict the future
- Do not over-engineer
- Leads to:
  - Reduced cognitive load
    - 7 ± 2: experts could hold much more as they would ‘chunk’ multiple items together
  - Fewer bugs
  - Dead code: code whose results are not used (but may cause exceptions and hence still have some use)
  - Unreachable code: branches that will never be executed
- Is technical debt good?
The simplest solution may be the best solution
- Less code, fewer bugs
- Reduced complexity
- Possibly faster
- Makes testing easier - fewer inputs and branches
- Possibly faster to develop - however, finding the simple/elegant solution to a problem is often difficult
Refactoring:
- Keep it simple; continually refactor and extend the code as required

Design Principles

Encapsulation vs Information Hiding

Encapsulation is a tool to draw a border around a module.

Information hiding is a principle where you hide internal details from the outside world. This can be done using encapsulation.

This is used to hide what varies; anything that could be changed should be hidden (e.g. algorithm used for sorting).

Hence, argument and return types should be as high/generic as possible (eg. return Collection instead of ArrayList).

If a property or method is private, the type doesn’t matter as the type is encapsulated anyway.

Visibility, Access Levels, Modifiers

‘Never use public properties; use getters and setters instead’.

Getters and setters; two extreme viewpoints:

Getters should never be used:
- Tell, don’t ask: the class should have methods that modify the properties; other classes should never modify the property directly
- e.g. Bread class should have a toast method instead of a Toaster class toasting the bread, with the Bread class and its sub-classes implementing a Toastable interface
Getters should always be used:
- Everything done inside the class should also always go through the getters/setters
  - This allows the property type to be changed without affecting the rest of the class: a secondary level of encapsulation
- NB: if you simply return an object, the object could be modified and hence, removing the point of the getter
  - Hence, either return a copy or wrap it around an unmodifiable element (e.g. List.unmodifiableList in Java).

Coupling & Cohesion

Coupling: the extend to which two modules depend on each other.

Cohesion: how well the methods and properties within a module belong with each other.

Aim for high cohesion, low coupling.

Principle: keep data and behavior together (i.e. high cohesion).

The principle of separation of concerns separates data and behavior, but puts the related behaviors together.

The SOLID Principles

Single Responsibility Principle (SRP)

Each thing should only be in charge of one thing.

A responsibility = a reason for the module to change.

The SRP conflicts with the modeling of the real world, where objects usually do more than two things:

e.g. a modem does multiple things: it dials/hangups, and sends/receives data
Having both of these roles within a single interface violates the SRP

In addition, applying the SRP mindlessly can lead to:

Increased coupling and needless complexity
- Getting all the data you need may require it to pass through multiple middlemen if the data is spread too thin
  - But at the same time, having all the data together can lead to a god class
Difficulty in on-boarding new team members or understanding how to architect the program
Code fragmentation and broken/leaky encapsulation

Figuring out what the Single Responsibility should be can often be difficult?

Robert Martin’s thoughts on SRP:

…This principle is about people.

When you write a software module, you want to make sure that when changes are requested, those changes can only originate from a single person, or rather, a single tightly coupled group of people representing a single narrowly defined business function.

Imagine you took your car to a mechanic in order to fix a broken electric window. He calls you the next day saying it’s all fixed. When you pick up your car, you find the window works fine; but the car won’t start. It’s not likely you will return to that mechanic because he’s clearly an idiot.

Open/Closed Principle (OCP)

Modules should be open for extension, but closed for modification.

That is, you should be able to extend the behavior of an existing program without modifying it.

Interfaces are useful because they are an agreement that you will follow some defined behavior (for all public methods/properties); that is, Design-by-contract:

Pre-conditions: entry conditions that the client must ensure are met
Post-conditions: obligations by the service that must be true when the service method exits
Invariants: properties that are guaranteed to be maintained
All children must abide by their parent’s contract: they can loosen pre-conditions and tighten post-conditions, but not vice-versa
- e.g. Java Collection interface’s add method returns a boolean: whether or not the collection has been modified as a result of the operation

The open/closed principle forces abstractions and loose coupling and often requires dependency inversion.

Libraries and plug-in architectures are often good examples of OCP.

Can a program be fully closed? Probably not as this requires big design up-front.

Protected Variation: anything that is likely to change should be hidden and pushed downwards, with stable interfaces above/around them.

Liskov-Substitution Principle (LSP)

You should be able to change the subclass of an object without changing the behavior of the program i.e. design-by-contract: children adhere to their parent’s contract.

The LSP is not easy to implement and has no immediate benefits; rather, it gives long-term trust in modules.

Interface Segregation Principle (ISP)

Clients should not be forced to depend on interfaces/methods they will not use:

Many specific interfaces over one generic one
Avoids interface pollution
Classes should not be forced to implement irrelevant methods

Martin Fowler’s original article.

Dependency Inversion Principle (DIP)

High-level modules should not depend on low-level modules: both should depend on abstractions/interfaces.

From this, the following follows:

Abstractions should never depend on details
Code should depend on things that are at the same or higher level of abstraction
High-level policy should not depend on low-level details
- Low-level details can change
Low-level dependencies should be captured in low-level abstractions

Mostly taken for granted by the newer generation of programmers learning OO languages.

Common Closure Principle (CCP)

SRP at the package level: classes in a package should be closed together against the same kind of changes.

Highly-coupled classes should be grouped together in a package
- With the end result of increasing cohesion at the class level
What affects one affects all

Common Reuse Principle (CRP)

Classes in a package are reused together: if you reuse one class, reuse all of them.

Classes being reused within the same context should be part of the same package.

e.g. Util package in Java.

Abstract Factory (AKA Kit Pattern)

Dependency inversion: client no longer needs to care about the specifics of the implementations.

Factories define an interface to instantiate new instances of a specific implementation of a class/interface, removing the need for a client to know the exact type being instantiated.

Hence, this is an example of dependency inversion as the client uses an interface to distance itself from the specific class and constructor being called.

An abstract factory takes this further by giving the factory interface methods to instantiate multiple related (and possibly dependent) objects.

The abstract factory keeps behavior, not data together.

Factory methods give looser coupling; details are (how the objects are instantiated) brought down to concrete classes, while interfaces are given to the higher layers (abstract classes)

The abstract factory is an example of parallel hierarchy: multiple hierarchies following the same structure. e.g.:

      Operator              Vehicle
   ______|______           ____|____
  ▽            ▽           ▽       ▽
Pilot        Cyclist     Plane    Bike

The factory method ensures the right operator is assigned to the vehicle. But what if you already have a specific operator you want to assign to the vehicle?

If you have a setOperator(Operator) method on the Vehicle interface, it defeats the point of the factory method. Rather, the concrete classes (Plane, Bike) must have setPilot(Pilot) and setCyclist(Cyclist) methods.

That is, go as high as you can in your hierarchy, but no further - there is no point raising it to the top if it means it fails to meet your requirements.

Stable Dependencies Principle (SDP)

Want stability; lack of changes, at the top of the hierarchy. See: hide what varies, contracts.

A module should depend on modules that are more stable than itself.

Maximum stability: if environment changes, module can’t change. Additionally requires big design up-front.

Should stability/instability be distributed across the entire program? No; some parts of the program will need to change frequently.

Stable Abstractions Principle (SAP)

A module should be as abstract as it is stable:

Concepts should be stable and abstract; real-world objects should be more concrete
Unstable classes should be concrete
Changes should be made on concrete classes
Maintainability: TODO
Extendability: open-closed principle

Tell, Don’t ask

Law of Demeter

If you have method M in object O, then M can call the methods of:

O
O’s direct component objects
- Two-dot properties/methods (e.g. a.b.c()) increases complexity/difficulty in understanding the code
- Try add method to a which calls b.c() if possible
- This message-chaining is a code smell
- Confidence:
  - Confident about yourself
  - Less confident about your friend
  - Low confidence about your friend’s friends
M’s parameters
Objects instantiated inside M

#noestimate

What does it mean?

Estimating is a waste of time; just do the work
Estimates aren’t useful; tasks take as long as they will take
- Clients get unhappy when estimates are not met
Complexity of work means estimates are unreliable

Standard agile estimates story points to determine the number of stories done in the sprint and calculate their velocity.

#noestimate instead just completes tasks by priority and uses the tasks completed to calculate velocity. As the tasks are sliced vertically, the client gets a tangible end result at the end of each sprint.

So why estimate? The process (e.g. planning poker, discussion) is useful even if the estimates themselves are not.

Vertically slicing means:

Even if the project is stopped prematurely, there is something to deliver
Requirements can change; if customer realizes they did not want what they asked for, or the situation has changed, you can change future stories
Stories may get quite large:
- MVP: do the minimum required to get the story working
  - Downside: re-engineering may be required in the future
- Alternatively, talk to the PO and do things ‘right’ if you are certain you will need certain functionality later

Story mapping:

Epics: giant stories
Split the epic into stories
Prioritize stories such that useful functionality is delivered each sprint
Walking skeleton: when multiple epics are done simultaneously such that the skeleton of the epics slowly comes together each sprint

Class Debates

Always/Never Write Documentation

Always:

On-boarding/knowledge transfer
Justifying design decisions
Large codebases, makes navigability easier (usability)

Never:

Documentation always lags behind; old documentation can be counter-productive
Reading the code can be more useful than reading documentation
Documentation can be an excuse for bad/complex code

Always, counterpoints:

Complexity of code matches complexity of the problem space: code describes how, not why
- Documentation: high-level overview
Technical debt: code will not be perfect; need documentation to explain what needs to be changed
Documentation should be written before the code (like TDD) - in this case, documentation will always be updated (like TDD)
On-boarding new technologies

Never, counterpoints:

Bob Martin: “A comment is a failure to express yourself in code. If you fail, then write a comment; but try not to fail”
Documentation not being updated still remains an additional risk
There should always be an obvious solution
Code too complicated to be understandable: can always be simplified to a point where the code is self-explanatory
Grady Booch: “Clean code is simple and direct. Clean code reads like well-written prose. Clean code never obscures the designer’s intent but rather is full of crisp abstractions and straightforward lines of control.”

Collective vs Individual Code Ownership

Collective:

High bus factor: person exiting the company will not
Allows a cross-functional team
Reduces knowledge siloing
Improves review process: more people familiar with the code base
Reduces level of pre-planning required
Stops the blame game; share the blame and reward; the process, not the individual, had issues.
When person leaves the project, who remains responsible for the code?
Less communication overhead - do not need to talk to specific person in order to make improvements/bug fixes
http://www.extremeprogramming.org/rules/collective.html

Individual:

Does not mean siloing; means accountability
- Ensures there is always an expert for any part of the code base
Allows specialization; more efficient distribution of labor
Code reviews: fresh set of eyes better for finding issues, bugs
Higher standard of quality when your name is attached to your code

Individual, counterpoints:

Code ownership required in small company; there needed to be a domain expert??
Collective ownership means completely anonymous code and no accountability
Allows more even split over workload
Reduces risk of merge conflicts as there should only be one person working in each area
Too many cooks
Code creator will always have better understanding of the code - know who to talk to

Collective, counterpoints:

Code reviews a form of collective ownership
Responsibility and ownership are different: you are still responsible for quality, tests, etc. of the code you wrote
Individual ownership often becomes ‘your code is broken, fix it’
If owner on vacation, you get stuck
If your own code and no one else can work on it, why bother documenting?
Merge requests: even if there is individual ownership there are still changes that require multiple modules to be updated in tandem