Controversial topic debates: defending extreme viewpoints is difficult and requires research to justify the viewpoint.
Software engineering principles not black and white:
- Contradicting principles
- Dependent on context, priorities, constraints, requirements
- Software is abstract; unlike other engineering areas, there are no laws of physics; we create the rules
Technical Debt
- Design decisions made in the past under circumstances that are no longer relevant
- Conscious, un-ideal decisions made in the past that must be corrected
Ward Cunningham, 1992: quick and easy approach that comes with interest - additional work that must be done in the future: the longer you wait, the more code relies on the debt - interest grows over time.
Design stamina hypothesis (Martin Fowler): in a time-functionality graph:
- Good design is linear
- Bad/no design starts off faster, but drops off over time
- Where the lines meet is the design payoff line
Types of Technical Debt
Two axes: deliberate/inadvertent, and reckless/prudent
- Deliberate-reckless: don’t have time to design
- Deliberate-prudent: must ship now and deal with the consequences
- The ‘best’ type of technical debt
- Inadvertent-reckless: not understanding the technical debt in the project
- The worst type: don’t even know you are accruing technical debt
- Inadvertent-prudent: understanding the accrued technical debt only after writing it
Reasons for Debt
- Time: deadlines
- Faster time to market may lead to increased short-/long-term budget
- Prototypes: usually end up being part of the shipping product even though they should be thrown out once they are done
- Money: budget constraints
- Interest must be paid eventually though
- Knowledge/experience
Caused by:
- Change in business decisions
- Market changes
- Scope changes/creep
- The scrum master should:
- Be the interface between the development team and the rest of the world: marketing, management, etc.
- Never be part of the development team; they should be part of the management so that they have the authority to protect the team and say ‘no’ to the needs/wants of others
- In reality, it is cheaper to have them part of the team
- Resourcing changes
- Poor management
- Inexperienced team
Measuring Technical Debt
302: a lot of debt by the end of the year.
Measure how much technical debt there is by:
- Checking how much refactoring is being done
- Measuring sprint velocity
- This must be done from the start
Types of debt:
- Deliberate technical debt can easily be measured by documenting the debt and how much time it would take to pay it off
- Inadvertent technical debt is more difficult to measure
- Debt from third-party libraries is inadvertent, reckless technical debt (although we can get a rough estimate of debt using metrics like the number of open issues)
Interest rates:
- If the interest rate is high, it will only be used in extreme circumstances
- e.g. preparing for demo for VC funding
- For critical applications (e.g. planes, banks), technical debt is extremely expensive
- Sometimes, the debt never needs to be paid off:
- Prototypes
- When you won’t build on top of the debted code
- Programs with short lifespans - e.g. for short advertising campaign
- Firmware - will rarely be updated
- But when a new version of the hardware is released, it will likely be reused
Positive/negative value, visible/invisible attributes:
- Visible, positive: feature
- Visible, negative: bug
- Invisible, positive: architecture
- Invisible, negative: technical debt
Pick a process/framework (Scrum/Kanban/Waterfall): Which part is devoted to Technical Debt correction/payment?
Fan-in vs Fan-out
- Fan-in:
- Number of direct dependents
- Utility functions should have a high fan-in
- The larger the fan-in, the more dependents break when the module breaks
- Fan-out:
- Number of direct dependencies
- Initialization function will likely have a very high fan-out
- Any dependency breaking may break the entire module
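As a rough illustration of the two metrics, the following Java sketch counts fan-in and fan-out over a small dependency graph. The `FanMetrics` class and the module names (`Main`, `Parser`, `Logger`, `StringUtils`) are hypothetical, made up only for this example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: computes fan-in/fan-out from a map of
// module name -> names of the modules it depends on directly.
class FanMetrics {
    private final Map<String, List<String>> dependencies;

    FanMetrics(Map<String, List<String>> dependencies) {
        this.dependencies = dependencies;
    }

    // Fan-out: number of direct dependencies of the module.
    int fanOut(String module) {
        return dependencies.getOrDefault(module, List.of()).size();
    }

    // Fan-in: number of modules that depend directly on the module.
    int fanIn(String module) {
        int count = 0;
        for (List<String> deps : dependencies.values()) {
            if (deps.contains(module)) {
                count++;
            }
        }
        return count;
    }

    // Made-up graph: an initialization module and a utility module.
    static FanMetrics example() {
        return new FanMetrics(Map.of(
            "Main", List.of("Parser", "Logger", "StringUtils"),
            "Parser", List.of("Logger", "StringUtils"),
            "Logger", List.of("StringUtils"),
            "StringUtils", List.of()));
    }

    public static void main(String[] args) {
        FanMetrics metrics = example();
        System.out.println("StringUtils fan-in: " + metrics.fanIn("StringUtils"));
        System.out.println("Main fan-out: " + metrics.fanOut("Main"));
    }
}
```

In this made-up graph the utility module has the highest fan-in and the initialization module has the highest fan-out, matching the two bullet points above.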
Refactor vs Re-engineering
- Refactor: rewrite that does not change the module’s external interface (the ‘refactor’ tools in IDEs change method names and hence signatures, so by this definition they are not actually refactors)
- Re-engineering: rewrite that changes the interface and hence requires dependents to be updated
Hence, refactorings should be done as-you-go while re-engineerings should be done infrequently and only after careful planning.
Reuse vs KISS
Object-oriented programming built on:
- Code reuse
- Opportunistic: create reusable modules/methods as you go
- Internal/external: when you use external libraries, you take on their technical debt as well
- Planned/strategic: create modules/methods in preparation for plans
- Modeling the real world
But reuse didn’t work - requirements for each program and the abstractions required differ.
Reuse is big design up front:
- Waterfall: objects and entities must be designed
- ‘Just in case’ planning
- Generalized utility functions
- More planning/analyses = cost savings (if you get it right)
- Bugs easier to find and fix
Unfortunately, determining the ‘correct’ design is impossible until implementation.
Situations when reuse does work:
- Design patterns (not code)
- Libraries (module with methods that you can call)
- Frameworks (frameworks call your code)
Sapir-Whorf hypothesis/linguistic relativity: the structure of a language influences how you think. In programming terms: the programming paradigms we are used to influence our mindset and how we solve problems.
Reuse requires generic and abstract code/thinking:
- Abstraction: extracting commonalities between similar classes/objects to create a more generic class
- Extensibility:
- Open-closed principle
- Minimize impacts of future changes on existing classes
- Generic/abstract classes/packages
- Planning for future ‘maybes’
- Sometimes may be over-engineering
- Future needs may change
KISS:
- You Ain’t Gonna Need It (YAGNI)
- Do not implement until needed
- Do not try and predict the future
- Do not over-engineer
- Leads to:
- Reduced cognitive load
- 7 ± 2: experts could hold much more as they would ‘chunk’ multiple items together
- Fewer bugs
- Dead code: code whose results are not used (but may cause exceptions and hence still have some use)
- Unreachable code: branches that will never be executed
- Is technical debt good?
- The simplest solution may be the best solution
- Less code, fewer bugs
- Reduced complexity
- Possibly faster
- Makes testing easier - fewer inputs and branches
- Possibly faster to develop - however, finding the simple/elegant solution to a problem is often difficult
- Refactoring:
- Keep it simple; continually refactor and extend the code as required
Design Principles
Encapsulation vs Information Hiding
Encapsulation is a tool to draw a border around a module.
Information hiding is a principle where you hide internal details from the outside world. This can be done using encapsulation.
This is used to hide what varies; anything that could be changed should be hidden (e.g. algorithm used for sorting).
Hence, argument and return types should be as high/generic as possible (eg. return Collection instead of ArrayList).
If a property or method is private, the type doesn’t matter as the type is encapsulated anyway.
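A minimal sketch of hiding what varies, assuming a hypothetical `Playlist` class (not from the lecture). Callers only ever see `Collection`, so the private backing type could be swapped without touching them.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical class used only to illustrate generic argument/return types.
class Playlist {
    // Private detail: could become a LinkedList or a Set later.
    // Its exact type does not matter to callers because it is encapsulated.
    private final List<String> tracks = new ArrayList<>();

    // Argument type is as generic as possible: any Collection is accepted.
    void addAll(Collection<String> titles) {
        tracks.addAll(titles);
    }

    // Return type is the general interface rather than ArrayList.
    // A copy is returned so callers cannot modify the internal list.
    Collection<String> getTracks() {
        return new ArrayList<>(tracks);
    }
}
```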
Visibility, Access Levels, Modifiers
‘Never use public properties; use getters and setters instead’.
Getters and setters; two extreme viewpoints:
- Getters should never be used:
- Tell, don’t ask: the class should have methods that modify the properties; other classes should never modify the property directly
- e.g. a Bread class should have a toast method instead of a Toaster class toasting the bread, with the Bread class and its sub-classes implementing a Toastable interface
- Getters should always be used:
- Everything done inside the class should also always go through the getters/setters
- This allows the property type to be changed without affecting the rest of the class: a secondary level of encapsulation
- NB: if you simply return an object, the object could be modified by the caller, defeating the point of the getter
- Hence, either return a copy or wrap it in an unmodifiable view (e.g. Collections.unmodifiableList in Java)
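The two ways of protecting a returned object can be sketched as follows. `Basket` is a hypothetical class; `java.util.Collections.unmodifiableList` is the standard JDK wrapper that returns a read-only view of a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class illustrating safe getters.
class Basket {
    private final List<String> items = new ArrayList<>();

    void addItem(String item) {
        items.add(item);
    }

    // Option 1: read-only view. Mutating calls on the returned list
    // throw UnsupportedOperationException.
    List<String> getItems() {
        return Collections.unmodifiableList(items);
    }

    // Option 2: defensive copy. The caller may change the copy,
    // but the basket's own list is unaffected.
    List<String> copyOfItems() {
        return new ArrayList<>(items);
    }
}
```

The view is cheaper but reflects later changes to the basket; the copy is a snapshot. Which one fits depends on what the caller needs.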
Coupling & Cohesion
Coupling: the extent to which two modules depend on each other.
Cohesion: how well the methods and properties within a module belong with each other.
Aim for high cohesion, low coupling.
Principle: keep data and behavior together (i.e. high cohesion).
The principle of separation of concerns separates data and behavior, but puts the related behaviors together.
The SOLID Principles
Single Responsibility Principle (SRP)
Each module should only be in charge of one thing.
A responsibility = a reason for the module to change.
The SRP conflicts with the modeling of the real world, where objects usually do more than one thing:
- e.g. a modem does multiple things: it dials/hangs up, and sends/receives data
- Having both of these roles within a single interface violates the SRP
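One possible way to separate the modem's two roles is sketched below. The interface and class names (`Connection`, `DataChannel`, `SimpleModem`) are assumptions for illustration, and the loopback behavior exists only to keep the example runnable.

```java
// Role 1: connection management.
interface Connection {
    void dial(String number);
    void hangUp();
}

// Role 2: data transfer.
interface DataChannel {
    void send(String data);
    String receive();
}

// A single concrete class may still implement both roles;
// clients depend only on the interface they actually need.
class SimpleModem implements Connection, DataChannel {
    private boolean connected;
    private String buffer = "";

    public void dial(String number) {
        connected = true;
    }

    public void hangUp() {
        connected = false;
    }

    public void send(String data) {
        if (!connected) {
            throw new IllegalStateException("not connected");
        }
        buffer = data; // loopback, purely for demonstration
    }

    public String receive() {
        return buffer;
    }

    boolean isConnected() {
        return connected;
    }
}
```

A change to how calls are dialed now affects only clients of `Connection`, and a change to the data protocol affects only clients of `DataChannel`.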
In addition, applying the SRP mindlessly can lead to:
- Increased coupling and needless complexity
- Getting all the data you need may require it to pass through multiple middlemen if the data is spread too thin
- But at the same time, having all the data together can lead to a god class
- Difficulty in on-boarding new team members or understanding how to architect the program
- Code fragmentation and broken/leaky encapsulation
Figuring out what the single responsibility should be can often be difficult.
Robert Martin’s thoughts on SRP:
…This principle is about people.
When you write a software module, you want to make sure that when changes are requested, those changes can only originate from a single person, or rather, a single tightly coupled group of people representing a single narrowly defined business function.
Imagine you took your car to a mechanic in order to fix a broken electric window. He calls you the next day saying it’s all fixed. When you pick up your car, you find the window works fine; but the car won’t start. It’s not likely you will return to that mechanic because he’s clearly an idiot.
Open/Closed Principle (OCP)
Modules should be open for extension, but closed for modification.
That is, you should be able to extend the behavior of an existing program without modifying it.
Interfaces are useful because they are an agreement that you will follow some defined behavior (for all public methods/properties); that is, Design-by-contract:
- Pre-conditions: entry conditions that the client must ensure are met
- Post-conditions: obligations by the service that must be true when the service method exits
- Invariants: properties that are guaranteed to be maintained
- All children must abide by their parent’s contract: they can loosen pre-conditions and tighten post-conditions, but not vice-versa
- e.g. Java Collection interface’s add method returns a boolean: whether or not the collection has been modified as a result of the operation
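The `Collection.add` contract can be seen with two standard implementations. The `AddContractDemo` class and its `countChanges` helper are hypothetical; `ArrayList` and `HashSet` are the real JDK classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

class AddContractDemo {
    // Works for any Collection because every implementation honours the
    // contract of add: the return value says whether the collection changed.
    static int countChanges(Collection<String> target, String... values) {
        int changes = 0;
        for (String value : values) {
            if (target.add(value)) {
                changes++;
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // A list accepts duplicates, so both adds change it.
        System.out.println(countChanges(new ArrayList<>(), "a", "a")); // prints 2
        // A set rejects the duplicate, so only the first add changes it.
        System.out.println(countChanges(new HashSet<>(), "a", "a"));   // prints 1
    }
}
```

The client code never checks which implementation it was given; it relies only on the post-condition stated by the interface.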
The open/closed principle forces abstractions and loose coupling and often requires dependency inversion.
Libraries and plug-in architectures are often good examples of OCP.
Can a program be fully closed? Probably not as this requires big design up-front.
Protected Variation: anything that is likely to change should be hidden and pushed downwards, with stable interfaces above/around them.
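A small plug-in style sketch of the open/closed principle follows. All names (`TextPlugin`, `UpperCasePlugin`, `BracketPlugin`, `Pipeline`) are hypothetical. New behavior is added by writing a new `TextPlugin`, while `Pipeline` itself is never edited.

```java
import java.util.ArrayList;
import java.util.List;

// Stable abstraction: the extension point.
interface TextPlugin {
    String apply(String text);
}

// Extensions: each new behavior is a new class, not an edit to Pipeline.
class UpperCasePlugin implements TextPlugin {
    public String apply(String text) {
        return text.toUpperCase();
    }
}

class BracketPlugin implements TextPlugin {
    public String apply(String text) {
        return "[" + text + "]";
    }
}

// Closed for modification: depends only on the TextPlugin interface.
class Pipeline {
    private final List<TextPlugin> plugins = new ArrayList<>();

    Pipeline register(TextPlugin plugin) {
        plugins.add(plugin);
        return this;
    }

    // Applies every registered plug-in in registration order.
    String run(String input) {
        String result = input;
        for (TextPlugin plugin : plugins) {
            result = plugin.apply(result);
        }
        return result;
    }
}
```

For example, `new Pipeline().register(new UpperCasePlugin()).register(new BracketPlugin()).run("ok")` returns `"[OK]"`.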
Liskov Substitution Principle (LSP)
You should be able to swap an object for an instance of any of its subclasses without changing the behavior of the program, i.e. design-by-contract: children adhere to their parent’s contract.
The LSP is not easy to implement and has no immediate benefits; rather, it gives long-term trust in modules.
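The classic rectangle/square example (a common textbook illustration, not taken from these notes) shows what breaking the parent's contract looks like. The client method `resizeAndMeasure` is hypothetical and assumes that setting one side leaves the other unchanged.

```java
class Rectangle {
    protected int width;
    protected int height;

    void setWidth(int width) {
        this.width = width;
    }

    void setHeight(int height) {
        this.height = height;
    }

    int area() {
        return width * height;
    }
}

// Violates the parent's implied contract: setting one side
// silently changes the other as well.
class Square extends Rectangle {
    @Override
    void setWidth(int width) {
        this.width = width;
        this.height = width;
    }

    @Override
    void setHeight(int height) {
        this.width = height;
        this.height = height;
    }
}

class LspDemo {
    // Client written against Rectangle: expects a 5 x 4 result.
    static int resizeAndMeasure(Rectangle r) {
        r.setWidth(5);
        r.setHeight(4);
        return r.area();
    }
}
```

Passing a `Rectangle` yields 20, while passing a `Square` yields 16, so substituting the subclass changes the program's behavior.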
Interface Segregation Principle (ISP)
Clients should not be forced to depend on interfaces/methods they will not use:
- Many specific interfaces over one generic one
- Avoids interface pollution
- Classes should not be forced to implement irrelevant methods
Martin Fowler’s original article.
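A minimal sketch of interface segregation, using hypothetical names (`Printer`, `DocumentScanner`, `BasicPrinter`, `OfficeMachine`, `PrintJob`). Instead of one wide interface, each capability gets its own small interface.

```java
// Two specific interfaces instead of one generic "machine" interface.
interface Printer {
    String print(String document);
}

interface DocumentScanner {
    String scan();
}

// Only prints, so it is not forced to implement an irrelevant scan method.
class BasicPrinter implements Printer {
    public String print(String document) {
        return "printed: " + document;
    }
}

// A richer device simply implements both interfaces.
class OfficeMachine implements Printer, DocumentScanner {
    public String print(String document) {
        return "printed: " + document;
    }

    public String scan() {
        return "scanned page";
    }
}

// This client depends only on the one method it uses.
class PrintJob {
    static String run(Printer printer, String document) {
        return printer.print(document);
    }
}
```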
Dependency Inversion Principle (DIP)
High-level modules should not depend on low-level modules: both should depend on abstractions/interfaces.
From this, the following follows:
- Abstractions should never depend on details
- Code should depend on things that are at the same or higher level of abstraction
- High-level policy should not depend on low-level details
- Low-level details can change
- Low-level dependencies should be captured in low-level abstractions
Mostly taken for granted by the newer generation of programmers learning OO languages.
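Dependency inversion can be sketched as below, under the assumption of a hypothetical notification feature (`MessageStore`, `InMemoryStore`, `Notifier` are made-up names). The high-level `Notifier` and the low-level `InMemoryStore` both depend on the `MessageStore` abstraction rather than on each other.

```java
import java.util.ArrayList;
import java.util.List;

// Abstraction that both levels depend on.
interface MessageStore {
    void save(String message);
    int count();
}

// Low-level detail: can be replaced (file, database, ...) without
// touching Notifier.
class InMemoryStore implements MessageStore {
    private final List<String> messages = new ArrayList<>();

    public void save(String message) {
        messages.add(message);
    }

    public int count() {
        return messages.size();
    }
}

// High-level policy: knows only the interface, which is passed in.
class Notifier {
    private final MessageStore store;

    Notifier(MessageStore store) {
        this.store = store;
    }

    void notifyUser(String user) {
        store.save("Hello, " + user);
    }
}
```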
Common Closure Principle (CCP)
SRP at the package level: classes in a package should be closed together against the same kind of changes.
- Highly-coupled classes should be grouped together in a package
- With the end result of increasing cohesion at the class level
- What affects one affects all
Common Reuse Principle (CRP)
Classes in a package are reused together: if you reuse one class, reuse all of them.
Classes being reused within the same context should be part of the same package.
e.g. Util package in Java.
Abstract Factory (AKA Kit Pattern)
Dependency inversion: client no longer needs to care about the specifics of the implementations.
Factories define an interface to instantiate new instances of a specific implementation of a class/interface, removing the need for a client to know the exact type being instantiated.
Hence, this is an example of dependency inversion as the client uses an interface to distance itself from the specific class and constructor being called.
An abstract factory takes this further by giving the factory interface methods to instantiate multiple related (and possibly dependent) objects.
The abstract factory keeps behavior, not data, together.
Factory methods give looser coupling: details (how the objects are instantiated) are pushed down to concrete classes, while interfaces are given to the higher layers (abstract classes).
The abstract factory is an example of parallel hierarchy: multiple hierarchies following the same structure. e.g.:
Operator Vehicle
______|______ ____|____
▽ ▽ ▽ ▽
Pilot Cyclist Plane Bike
The factory method ensures the right operator is assigned to the vehicle. But what if you already have a specific operator you want to assign to the vehicle?
If you have a setOperator(Operator) method on the Vehicle interface, it defeats the point of the factory method. Rather, the concrete classes (Plane, Bike) must have setPilot(Pilot) and setCyclist(Cyclist) methods.
That is, go as high as you can in your hierarchy, but no further - there is no point raising it to the top if it means it fails to meet your requirements.
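The Operator/Vehicle parallel hierarchy might look like the following sketch. The factory names (`TransportFactory`, `AirTransportFactory`, `RoadTransportFactory`) and method bodies are assumptions; only `Operator`, `Vehicle`, `Pilot`, `Cyclist`, `Plane`, `Bike`, `setPilot` and `setCyclist` come from the notes.

```java
interface Operator {
    String name();
}

class Pilot implements Operator {
    public String name() { return "pilot"; }
}

class Cyclist implements Operator {
    public String name() { return "cyclist"; }
}

// No setOperator(Operator) here: that would allow a Cyclist on a Plane.
interface Vehicle {
    Operator getOperator();
}

class Plane implements Vehicle {
    private Pilot pilot;
    // Typed setter on the concrete class keeps the pairing safe.
    void setPilot(Pilot pilot) { this.pilot = pilot; }
    public Operator getOperator() { return pilot; }
}

class Bike implements Vehicle {
    private Cyclist cyclist;
    void setCyclist(Cyclist cyclist) { this.cyclist = cyclist; }
    public Operator getOperator() { return cyclist; }
}

// Abstract factory: creates related objects that belong together.
interface TransportFactory {
    Operator createOperator();
    Vehicle createVehicle();
}

class AirTransportFactory implements TransportFactory {
    // Covariant return type lets the factory use its own product safely.
    public Pilot createOperator() { return new Pilot(); }
    public Vehicle createVehicle() {
        Plane plane = new Plane();
        plane.setPilot(createOperator());
        return plane;
    }
}

class RoadTransportFactory implements TransportFactory {
    public Cyclist createOperator() { return new Cyclist(); }
    public Vehicle createVehicle() {
        Bike bike = new Bike();
        bike.setCyclist(createOperator());
        return bike;
    }
}
```

Client code holds only a `TransportFactory` and a `Vehicle`; the compiler prevents a `Cyclist` from ever being assigned to a `Plane`.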
Stable Dependencies Principle (SDP)
We want stability (a lack of change) at the top of the hierarchy. See: hide what varies, contracts.
A module should depend on modules that are more stable than itself.
Maximum stability: if environment changes, module can’t change. Additionally requires big design up-front.
Should stability/instability be distributed across the entire program? No; some parts of the program will need to change frequently.
Stable Abstractions Principle (SAP)
A module should be as abstract as it is stable:
- Concepts should be stable and abstract; real-world objects should be more concrete
- Unstable classes should be concrete
- Changes should be made on concrete classes
- Maintainability: TODO
- Extendability: open-closed principle
Tell, Don’t Ask
Law of Demeter
If you have method M in object O, then M can call the methods of:
- O
- O’s direct component objects
- Two-dot properties/methods (e.g. a.b.c()) increase complexity/difficulty in understanding the code
- Try adding a method to a which calls b.c() if possible
- This message-chaining is a code smell
- Confidence:
- Confident about yourself
- Less confident about your friend
- Low confidence about your friend’s friends
- M’s parameters
- Objects instantiated inside M
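A short sketch of avoiding a two-dot call, using hypothetical `Engine`, `Car` and `Driver` classes. Rather than exposing its engine so callers can write a chained call, `Car` offers a delegating method.

```java
class Engine {
    private boolean running;

    void start() {
        running = true;
    }

    boolean isRunning() {
        return running;
    }
}

class Car {
    // Direct component object: Car's own methods may call it.
    private final Engine engine = new Engine();

    // Delegating method: callers write car.start() and never
    // reach through the car to its engine.
    void start() {
        engine.start();
    }

    boolean isRunning() {
        return engine.isRunning();
    }
}

class Driver {
    // The method only talks to its parameter, one of the allowed targets.
    boolean drive(Car car) {
        car.start();
        return car.isRunning();
    }
}
```

If `Car` later swaps `Engine` for something else, `Driver` does not need to change, because it only ever knew about its direct friend.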
#noestimate
What does it mean?
- Estimating is a waste of time; just do the work
- Estimates aren’t useful; tasks take as long as they will take
- Clients get unhappy when estimates are not met
- Complexity of work means estimates are unreliable
Standard agile teams estimate stories in story points, then use the points completed in each sprint to calculate their velocity.
#noestimate instead just completes tasks by priority and uses the tasks completed to calculate velocity. As the tasks are sliced vertically, the client gets a tangible end result at the end of each sprint.
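As a simple illustration of velocity without story points, the sketch below averages the number of tasks completed per sprint. The `Velocity` class and the sample numbers are hypothetical.

```java
class Velocity {
    // Average number of tasks completed per sprint; no estimates involved.
    static double average(int[] tasksCompletedPerSprint) {
        if (tasksCompletedPerSprint.length == 0) {
            return 0.0;
        }
        int total = 0;
        for (int completed : tasksCompletedPerSprint) {
            total += completed;
        }
        return (double) total / tasksCompletedPerSprint.length;
    }

    public static void main(String[] args) {
        // Three sprints with 4, 6 and 5 tasks completed.
        System.out.println(average(new int[] {4, 6, 5})); // prints 5.0
    }
}
```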
So why estimate? The process (e.g. planning poker, discussion) is useful even if the estimates themselves are not.
Vertically slicing means:
- Even if the project is stopped prematurely, there is something to deliver
- Requirements can change; if customer realizes they did not want what they asked for, or the situation has changed, you can change future stories
- Stories may get quite large:
- MVP: do the minimum required to get the story working
- Downside: re-engineering may be required in the future
- Alternatively, talk to the PO and do things ‘right’ if you are certain you will need certain functionality later
Story mapping:
- Epics: giant stories
- Split the epic into stories
- Prioritize stories such that useful functionality is delivered each sprint
- Walking skeleton: when multiple epics are done simultaneously such that the skeleton of the epics slowly comes together each sprint
Class Debates
Always/Never Write Documentation
Always:
- On-boarding/knowledge transfer
- Justifying design decisions
- Large codebases, makes navigability easier (usability)
Never:
- Documentation always lags behind; old documentation can be counter-productive
- Reading the code can be more useful than reading documentation
- Documentation can be an excuse for bad/complex code
Always, counterpoints:
- Complexity of code matches complexity of the problem space: code describes how, not why
- Documentation: high-level overview
- Technical debt: code will not be perfect; need documentation to explain what needs to be changed
- Documentation should be written before the code (like TDD) - in this case, documentation will always be updated (like TDD)
- On-boarding new technologies
Never, counterpoints:
- Bob Martin: “A comment is a failure to express yourself in code. If you fail, then write a comment; but try not to fail”
- Documentation not being updated still remains an additional risk
- There should always be an obvious solution
- Code too complicated to be understandable: can always be simplified to a point where the code is self-explanatory
- Grady Booch: “Clean code is simple and direct. Clean code reads like well-written prose. Clean code never obscures the designer’s intent but rather is full of crisp abstractions and straightforward lines of control.”
Collective vs Individual Code Ownership
Collective:
- High bus factor: person exiting the company will not
- Allows a cross-functional team
- Reduces knowledge siloing
- Improves review process: more people familiar with the code base
- Reduces level of pre-planning required
- Stops the blame game; share the blame and reward; the process, not the individual, had issues.
- When person leaves the project, who remains responsible for the code?
- Less communication overhead - do not need to talk to specific person in order to make improvements/bug fixes
- http://www.extremeprogramming.org/rules/collective.html
Individual:
- Does not mean siloing; means accountability
- Ensures there is always an expert for any part of the code base
- Allows specialization; more efficient distribution of labor
- Code reviews: fresh set of eyes better for finding issues, bugs
- Higher standard of quality when your name is attached to your code
Individual, counterpoints:
- Code ownership required in small company; there needed to be a domain expert??
- Collective ownership means completely anonymous code and no accountability
- Allows more even split over workload
- Reduces risk of merge conflicts as there should only be one person working in each area
- Too many cooks
- Code creator will always have better understanding of the code - know who to talk to
Collective, counterpoints:
- Code reviews a form of collective ownership
- Responsibility and ownership are different: you are still responsible for quality, tests, etc. of the code you wrote
- Individual ownership often becomes ‘your code is broken, fix it’
- If owner on vacation, you get stuck
- If your own code and no one else can work on it, why bother documenting?
- Merge requests: even if there is individual ownership there are still changes that require multiple modules to be updated in tandem