Process
- Sniff
- Prioritise and evaluate
- Refactor:
- Split
- Join
- Move
- Extract
- Rename
- etc.
How large is large?
- Smells are inherently subjective
- Easier to detect in code we know
- Can be informed by measurements - software metrics
- Define quantities that represent code qualities we want to understand
- Gather data
- Analyze results, statistics
- Percentiles, min/max, outliers etc.
Morphology
Fan-out:
- How many other methods/functions does the method call
- Is it too ‘big’?
Fan-in:
- How many others call the method
- How reusable is it?
Method Length
Metrics:
- Lines of code
- Number of statements
- Comments included?
- Declarations included?
- Whitespace included?
- Amount of logic: cyclomatic complexity
- Number of branches
- Amount of nesting
Long methods a problem because:
- Method may be doing too many things
- Single responsibility principle
- Cohesion
How long is too long? Use:
- Metrics
- Counting rules
- Distribution analysis
- e.g. statement count vs cyclomatic complexity; find and refactor outliers
Object-Oriented Metrics
Chidamber & Kemerer suite commonly used:
- Weighted Methods per Class (WMC): class size
- How many methods are there? Takes into account constructors, overloads etc.
- Number of Children (NOC): structure
- Depth in Inheritance Tree (DIT): structure
- Coupling Between Objects (CBO): dependencies
- Response For Class (RFC): message passing
- Lack of Cohesion Of Methods (LCOM): property/method interactions
Large Class Smell
Knows too much, does too much; violating single responsibility principle
i.e. a God class
Break into smaller cohesive classes: extract class, extract interface
Long Parameter Lists
- Interface segregation
- If the parameter list is long, what does it tell us about the method?
- Single responsibility
- God method?
- How complex is it?
Solutions:
- Introduce a parameter object
- e.g. replacing start/end date with date range object
- Preserve the whole object
- Instead of dissecting and sending particular properties of an object, just send the whole object
- Replace the parameter with a method call
- Caller should let the method get the data itself, rather than doing it for it
Duplicated Code
- ‘Once and only once’
- Same problem may be solved in multiple ways (different programmers)
Want a single place of truth.
Solution: extract it into a method
Message Chains
bla.getSubProperty().getSubProperty()...
Bad because:
- Long message stack
- Complexity
- Cause dependency between classes in the chain
- Changes in any relationship causes cascading changes
Law of Demeter: only talk to immediate friends.
Method m() of object o should only invoke methods of:
oitself- Parameters of
m - Objects created within
m - Properties (attributes, direct components) of
o
Dead/Unreachable/Deactivated/Commented Code
Dead code:
- Source code that might be executed but the result of which is never used
- Unreferenced variables/functions not dead code; automatically removed by compiler/linker
Unreachable code:
- Code that can never be reached (e.g. switch statement, after return statement)
- No control flow path to the code; harder to read, takes up more memory, cache
Deactivated code:
- Code that can’t be executed now e.g.
#if os(iOS)
Commented code:
- Misleading, difficult to read and maintain
- Why was it commented out? Can it be deleted?
- What code can we trust?
Switch Statements
Large switch/if statements:
- What does it mean?
- Adds conditional complexity
- May call methods all over the codebase
- OOP should rarely use switch statements; use polymorphism
Solutions:
- Switch on types? Use polymorphism
- Switch on type code? Replace type code with subclasses
- Checking against null? Introduce null objects
Can leave it alone if it just performs some simple actions
Comments
- Integrity: are comments up to date? Can they be trusted
- Why are comments needed? Is it unreadable? Doing too many things?
- Comments vs Javadoc vs Git commits
Solution:
- If expression is difficult to understand, extract it to a variable
- If the code is difficult to understand, extract it to a method
- Rename method to make it more precise
Names
Type Embedded in Name
Found mostly in old code. e.g. strFirstName
What happens if type is changed? Decisions should be hidden.
Uncommunicative Names
Should be descriptive, succinct, and have consistent names.
Speculative Generality
When you make general solutions because you speculate/anticipate what you might need in the future: do not speculate about tomorrow’s problems.
YAGNI, so don’t over-engineer your solution.
But at the same time, need to balance this with planning or extensibility.
Solutions:
- Collapse hierarchy (e.g. if you have a bunch of classes that don’t currently do anything)
- Inline class (opposite of extracting class)
Inappropriate Intimacy
How much does a class need to know about another? Ideally little; low coupling is preferable.
Solutions:
- Move functions; keep data nad methods together
- Change bi-directional association to unidirectional
- Does a key need to know about the key chain it’s on?
- Replace superclass with delegate (composition over inheritance)
Indecent Exposure
- Every variable and method should be private unless it has to be public
- Hide your decisions
- Don’t worry about efficiency: only move to direct access if informed by performance monitoring
Feature Envy
Method making extensive use of another class (e.g. envious of their methods and which they had them).
Cohesive elements should be in the same module/class.
Shotgun Surgery
When making a change requires splattering lots of small changes across a large swath of the system; changes should be localized.
This probably means the single responsibility principal has been violated.
Solutions:
- Move methods/data
- Create a new class
Test Smells
Hard to test code:
- Highly coupled
- Asynchronous/multi-threaded
- GUI
- Buggy tests
Obscure tests:
- Difficult to understand
- Too much/too little information
- Eager tests; testing too much functionality
- Irrelevant information
- Hard-coded test data
- Indirect tests
Production bugs:
- Too many bugs getting to production
- Is there test coverage? Are the tests good?
- Are we checking or testing?
- Humans test, machine check
- Are tests covering all possibilities
- Are tests buggy?
High maintenance:
- Tests need to be modified often
- Are tests too complex, obscure?
- Single responsibility principal
- Test duplication?
Fragile tests:
- Interface too sensitive (e.g. GUI tests sensitive to resizing)
- Context too sensitive
- Pass condition sensitive to minor changes
- Pass condition sensitive to date/time, server state
Erratic tests:
- Failing for no reason
- Works on only some environments
- Be wary of conditions in tests