Object-Oriented Software Engineering

Refactoring

Or, Object-Oriented Program Algebra and the Art of Elegant Programming
— use program transformations (the algebraic equivalence laws of OOP) to produce cleaner, better code.

The "bible" of refactoring is Fowler's book, Refactoring (click on this link for access to the eBook for JHU students — there are limited licenses so it may not work when you try). It is a very well-written book which I would have made a required text for the course if it didn't cost $50.

Note that Fowler didn't come up with the idea of refactoring, he just popularized it.

Also see refactoring.guru for a very nice interactive description of the individual refactorings.

Refactoring by Example

We will spend a day refactoring the example following Refactoring Chapter 1, also showing how IntelliJ can be used to automate refactorings.

A modernized example of the video rental application he uses for his example can be downloaded here;
an example of the sources as they appear by the end of lecture may be downloaded here.

IntelliJ and Refactoring

IntelliJ (and some other IDEs) have a Refactor menu — use it!!
It will automatically perform many of Fowler's refactoring operations for you
It includes refactorings not mentioned by Fowler which are also very useful
The IDE automatically checks many of the preconditions that a refactoring needs in fact hold (some it can't easily check so its not airtight).
We will demo the IntelliJ refactoring menu in lecture.

Foundations of Refactoring

Why Refactor?

To improve the quality of code (or design)
.. which makes software easier to understand
.. and in turn helps in finding bugs
.. which overall allows you to complete a project more quickly and reliably.

How to refactor

Make many minimal changes in series, each one being fast and quickly testable.
The bulk of the refactoring book is a common of common small transformations

When to Refactor?

In order to elegantly incorporate new functionality
"First make the change easy (by doing a series of refactorings); then make the easy change."
When you run into an anti-pattern aka bad smell (a topic we cover below)
The rule of three: if you do three things the same time in an implementation, its time to refactor to avoid more repetition in the future
Around a consistently buggy area: bugs are a sign that code is bad, and refactoring will improve that code
At code review time in the project lifecycle
(A code review is a meeting at which the current state of the code is reviewed in detail.)

Everyone has a different coding methodology that requires more or less refactoring.

Beethoven was a serious refactorer: he wrote, threw out, redid, patched etc his symphonies
Mozart wrote great musical "code" right off the top of his head.

Limitations of Refactoring

If you are using a external library, those can't be refactored, limiting the improvements that can be made. (Note however that with shared code you can refactor your team-mates code; this is one of the main strengths of having a shared codebase)
Interfaces that are published should change as little as possible, but refactoring often wants to change those interfaces. This is a fundamental tension in software engineering.
If the design is awful, its not worth trying to refactor—throw it out!

Testing and Refactoring

Testing and refactoring mutually support each other
If you don't have enough tests you could introduce bugs in the process of refactoring
No tests → afraid to refactor because of fear of bug introduction → little refactoring → codebase is bad

Here is the sequence of events to follow to test in the context of a refactoring:

Make sure you have unit tests covering the code you are about to change
Make sure you are 100% compliant with tests
Perform one step of refactoring
Re-run tests and fix any fails to get back to 100%

Fowler's library of refactorings

The bulk of the refactoring book is a library of useful refactorings; we will review the following ones in lecture (these are also the only ones we will ask about on HW). Links below are to the refactoring on refactoring.guru.

Extract Method — for too-long methods doing multiple independent things
Inline Method — for too-short methods doing almost nothing
Inline Temp — code can be easier to read if you in-place perform the calculation
Replace Temp With Query — again getting rid of temps; also this one can set up other refactorings since temps get in the way
Split Temporary Variable — don't use a temporary for multiple independent purposes
Replace Method With Method Object — primarily useful to support breaking apart a too-large method
Extract Class — one class doing the job of two, an SRP violation
Move Method — method is in wrong class
Move Field — same as previous but for field
Inline Class — one class has very few responsibilities
Collapse Hierarchy — a variation on the previous for under-utilized subclasses
Replace Inheritance With Delegation — do this if you are violating the Liskov Substitution Principle (LSP)
Replace Delegation With Inheritance — inverse of above; inheritance is more expressive (because of is-a, LSP), delegation is more flexible (no need to require LSP etc)
Encapsulate Field — aka "use getters and setters", to limit accessibility and support change
Replace Conditional With Polymorphism — replace switch on your own data with dynamic dispatch
Replace Type Code with State/Strategy — a refinement of previous for the case your data can change
Introduce Assertion — one tool to help program defensively
Replace Parameter With Explicit Methods — don't switch on a parameter, turn into individual methods
Replace Parameter With Method Call — don't pass things the callee can get on their own
Introduce Parameter Object — If a certain group of parameters is always being passed together, make an object in their own right
Form Template Method — how and why to apply the Template Method design pattern

Code Smells aka anti-patterns

A code smell/anti-pattern is a pattern of coding that smells (and is) wrong.
See the Wikipedia anti-pattern list and Fowler's smells at refactoring.guru.

General Anti-Patterns

We will review some of the general anti-patterns from the above Wikipedia page.

Analysis Paralysis — going round and round on planning (overthinking it) without building anything.
Bike Shedding — wasting too much time on trivial aspects of an important project
Big Ball of Mud — code with poor separation of concerns, commonly produced by coders only putting out the next fire and not well-trained in good design. Beware, there is a MASS of mud out there.
Input Kludge — lack of validation of input to make sure it is proper.
BaseBean — don't inherit when relation is not is-a, delegate instead. Related to LSP and replace inheritance with delegation above.
Call Super — don't require subclass overrider to call super method, its a bad interface boundary. Note this one is contentious, its a trade-off.
Circle-Ellipse Problem — we covered this as square/rectange under the Liskov Substitution Principle (LSP).
Sequential Couping — methods with an implicit constraint on which one must be called first. Occasionally this is required, if so make the contract clear in API docs; or, code defensively (raise exception if violated).

Some of Fowler's bad smells with refactorings to fix

The names below link to the refactoring.guru page on the various smells.

Duplicate Code AKA DRY
— extract out the common bits into their own method (extract method) if code is in same class
— if two classes duplicate code, consider extract class to create a new class to hold the shared functionality.
Long Method
— extract method!
Long Parameter List
— replace parameter with method call: don't pass data the receiver can get on its own accord — receiver explicitly asks for data itself. You may need to pass an object as argument that the callee can then fetch all the goodies from.
OR: if the parameters are a natural, self-contained grouping, introduce parameter object.
Example: day month, year, hour minute second parameters ==> date parameter
— both of these changes are turning passive objects into more active objects, a Good Thing.
Divergent Change (a.k.a. SRP violation)
If you have a fixed class that does distinctly different things consider separating out the varying code into varying classes (extract class) that either subclass or are contained by the non-varying class. This smell is closely related to the Single Responsibility Principle.
Shotgun Surgery (more details)
The smell: adding a new feature requires very similar code to be added to a bunch of places, and copy/paste is likely involved.
— try to move method and move field to put the common bits into one place so no copy/paste needed.
Feature Envy
Method in one class uses lots of pieces from another class.
— move method to move it to the other class.
Data Clumps
Data that's always hanging with each other (e.g. name/street/zip). If its clumping, its a concept
— extract class to make a new class just for that data. Will also help trim argument lists too since name street zip now passed as one address object. Closely related to long parameter list above.
Switch (case) statements
Yes, any switch can potentially smell, and especially instanceOf ones.
— Use inheritance and polymorphism instead (replace conditional with polymorphism or replace type code with state/strategy)
— we covered an example of this in Fowler Chapter 1; one of the more difficult refactorings
Lazy Class
Class doesn't seem to be doing anything. Get rid of it!
- collapse heirarchy if a subclass is nearly vacuous.
- inline class (stick the class' methods and fields in the class that was using it and get rid of original class).
Speculative generality
Class designed to do something in the future but never ends up doing it. Thinking too far ahead. Or you though you needed this generality but you didn't.
— like above, collapse hierarchy or inline class
Message chains
Say you want to send a message to object D in class A but you have to go through B to get C and C to get D.
— In other words, you were writing something like a.getB().getC().getD().messageToD()
— These are strong violations of object encapsulation / active objects philosophy
— find a sensible spot in the middle where a logical interface can be inserted.
— e.g. use hide delegate to hide C and D in B, and add a method to B that does what A wanted to do with D: a.getB().letDKnowThisMessage().
Inappropriate Intimacy (aka Object Orgy)
directly getting in and munging with the internals of another class.
— Signifies lack of good interface abstractions (recall design principle, "code to interfaces not implementations")
— To fix this, move methods, etc, to consolidate the intimate bits.
Data Class aka god class aka Data-centric design
We have already covered this in the design princinples lecture: in data-centric design, there are some data classes which are pretty much structs: no interesting methods, and one god class that is like the C main program with all the functions in it.
— look who uses the data and how they use it and move some of that code to the data class via a combination of extract method and move method
— the Fowler chapter 1 example had several examples of this since it was initially a completely data-centric design
Comments
Comments in the middle of methods are often deodorant. Refactor so each comment block is its own method — extract method.

OOSE