21-22. Code Smells

Process

Sniff
Prioritise and evaluate
Refactor:
- Split
- Join
- Move
- Extract
- Rename
- etc.

How large is large?

Smells are inherently subjective
Easier to detect in code we know
Can be informed by measurements - software metrics
- Define quantities that represent code qualities we want to understand
- Gather data
- Analyze results, statistics
  - Percentiles, min/max, outliers etc.

Morphology

Fan-out:

How many other methods/functions does the method call
Is it too ‘big’?

Fan-in:

How many others call the method
How reusable is it?

Method Length

Metrics:

Lines of code
- Number of statements
- Comments included?
- Declarations included?
- Whitespace included?
Amount of logic: cyclomatic complexity
- Number of branches
Amount of nesting

Long methods a problem because:

Method may be doing too many things
Single responsibility principle
Cohesion

How long is too long? Use:

Metrics
Counting rules
Distribution analysis
- e.g. statement count vs cyclomatic complexity; find and refactor outliers

Object-Oriented Metrics

Chidamber & Kemerer suite commonly used:

Weighted Methods per Class (WMC): class size
- How many methods are there? Takes into account constructors, overloads etc.
Number of Children (NOC): structure
Depth in Inheritance Tree (DIT): structure
Coupling Between Objects (CBO): dependencies
Response For Class (RFC): message passing
Lack of Cohesion Of Methods (LCOM): property/method interactions

Large Class Smell

Knows too much, does too much; violating single responsibility principle

i.e. a God class

Break into smaller cohesive classes: extract class, extract interface

Long Parameter Lists

Interface segregation
If the parameter list is long, what does it tell us about the method?
- Single responsibility
- God method?
How complex is it?

Solutions:

Introduce a parameter object
- e.g. replacing start/end date with date range object
Preserve the whole object
- Instead of dissecting and sending particular properties of an object, just send the whole object
Replace the parameter with a method call
- Caller should let the method get the data itself, rather than doing it for it

Duplicated Code

‘Once and only once’
Same problem may be solved in multiple ways (different programmers)

Want a single place of truth.

Solution: extract it into a method

Message Chains

bla.getSubProperty().getSubProperty()...

Bad because:

Long message stack
Complexity
Cause dependency between classes in the chain
- Changes in any relationship causes cascading changes

Law of Demeter: only talk to immediate friends.

Method m() of object o should only invoke methods of:

o itself
Parameters of m
Objects created within m
Properties (attributes, direct components) of o

Dead/Unreachable/Deactivated/Commented Code

Dead code:

Source code that might be executed but the result of which is never used
Unreferenced variables/functions not dead code; automatically removed by compiler/linker

Unreachable code:

Code that can never be reached (e.g. switch statement, after return statement)
No control flow path to the code; harder to read, takes up more memory, cache

Deactivated code:

Code that can’t be executed now e.g. #if os(iOS)

Commented code:

Misleading, difficult to read and maintain
Why was it commented out? Can it be deleted?
What code can we trust?

Switch Statements

Large switch/if statements:

What does it mean?
Adds conditional complexity
May call methods all over the codebase
OOP should rarely use switch statements; use polymorphism

Solutions:

Switch on types? Use polymorphism
Switch on type code? Replace type code with subclasses
Checking against null? Introduce null objects

Can leave it alone if it just performs some simple actions

Comments

Integrity: are comments up to date? Can they be trusted
Why are comments needed? Is it unreadable? Doing too many things?
Comments vs Javadoc vs Git commits

Solution:

If expression is difficult to understand, extract it to a variable
If the code is difficult to understand, extract it to a method
Rename method to make it more precise

Names

Type Embedded in Name

Found mostly in old code. e.g. strFirstName

What happens if type is changed? Decisions should be hidden.

Uncommunicative Names

Should be descriptive, succinct, and have consistent names.

Speculative Generality

When you make general solutions because you speculate/anticipate what you might need in the future: do not speculate about tomorrow’s problems.

YAGNI, so don’t over-engineer your solution.

But at the same time, need to balance this with planning or extensibility.

Solutions:

Collapse hierarchy (e.g. if you have a bunch of classes that don’t currently do anything)
Inline class (opposite of extracting class)

Inappropriate Intimacy

How much does a class need to know about another? Ideally little; low coupling is preferable.

Solutions:

Move functions; keep data nad methods together
Change bi-directional association to unidirectional
- Does a key need to know about the key chain it’s on?
Replace superclass with delegate (composition over inheritance)

Indecent Exposure

Every variable and method should be private unless it has to be public
- Hide your decisions
Don’t worry about efficiency: only move to direct access if informed by performance monitoring

Feature Envy

Method making extensive use of another class (e.g. envious of their methods and which they had them).

Cohesive elements should be in the same module/class.

Shotgun Surgery

When making a change requires splattering lots of small changes across a large swath of the system; changes should be localized.

This probably means the single responsibility principal has been violated.

Solutions:

Move methods/data
Create a new class

Test Smells

Hard to test code:

Highly coupled
Asynchronous/multi-threaded
GUI
Buggy tests

Obscure tests:

Difficult to understand
Too much/too little information
- Eager tests; testing too much functionality
- Irrelevant information
- Hard-coded test data
- Indirect tests

Production bugs:

Too many bugs getting to production
Is there test coverage? Are the tests good?
Are we checking or testing?
- Humans test, machine check
Are tests covering all possibilities
Are tests buggy?

High maintenance:

Tests need to be modified often
Are tests too complex, obscure?
Single responsibility principal
Test duplication?

Fragile tests:

Interface too sensitive (e.g. GUI tests sensitive to resizing)
Context too sensitive
- Pass condition sensitive to minor changes
- Pass condition sensitive to date/time, server state

Erratic tests:

Failing for no reason
Works on only some environments
Be wary of conditions in tests