Refactoring
Or, Object-Oriented Program Algebra and the Art of Elegant Programming
use program transformations (the algebraic equivalence laws of OOP) to produce
cleaner, better code.
The "bible" of refactoring is Fowler's book, Refactoring (click on this link for access to the eBook for JHU students there are limited licenses so it may not work when you try). It is a very well-written book which I would have made a required text for the course if it didn't cost $50.
Note that Fowler didn't come up with the idea of refactoring, he just popularized it.
Also see refactoring.guru for a very nice interactive description of the individual refactorings.
Refactoring by Example
We will spend a day refactoring the example following Refactoring Chapter 1, also showing how IntelliJ can be used to automate refactorings.
- A modernized example of the video rental application he uses for his example can be downloaded here;
- an example of the sources as they appear by the end of lecture may be downloaded here.
IntelliJ and Refactoring
- IntelliJ (and some other IDEs) have a
Refactor
menu use it!! - It will automatically perform many of Fowler's refactoring operations for you
- It includes refactorings not mentioned by Fowler which are also very useful
- The IDE automatically checks many of the preconditions that a refactoring needs in fact hold (some it can't easily check so its not airtight).
- We will demo the IntelliJ refactoring menu in lecture.
Foundations of Refactoring
Why Refactor?
- To improve the quality of code (or design)
- .. which makes software easier to understand
- .. and in turn helps in finding bugs
- .. which overall allows you to complete a project more quickly and reliably.
- Make many minimal changes in series, each one being fast and quickly testable.
- The bulk of the refactoring book is a common of common small transformations
- In order to elegantly incorporate
new functionality
"First make the change easy (by doing a series of refactorings); then make the easy change." - When you run into an anti-pattern aka bad smell (a topic we cover below)
- The rule of three: if you do three things the same time in an implementation, its time to refactor to avoid more repetition in the future
- Around a consistently buggy area: bugs are a sign that code is bad, and refactoring will improve that code
- At code review time in the project lifecycle
(A code review is a meeting at which the current state of the code is reviewed in detail.)
- Beethoven was a serious refactorer: he wrote, threw out, redid, patched etc his symphonies
- Mozart wrote great musical "code" right off the top of his head.
- If you are using a external library, those can't be refactored, limiting the improvements that can be made. (Note however that with shared code you can refactor your team-mates code; this is one of the main strengths of having a shared codebase)
- Interfaces that are published should change as little as possible, but refactoring often wants to change those interfaces. This is a fundamental tension in software engineering.
- If the design is awful, its not worth trying to refactorthrow it out!
Testing and Refactoring
- Testing and refactoring mutually support each other
- If you don't have enough tests you could introduce bugs in the process of refactoring
- No tests → afraid to refactor because of fear of bug introduction → little refactoring → codebase is bad
- Make sure you have unit tests covering the code you are about to change
- Make sure you are 100% compliant with tests
- Perform one step of refactoring
- Re-run tests and fix any fails to get back to 100%
Fowler's library of refactorings
The bulk of the refactoring book is a library of useful refactorings; we will review the following ones in lecture (these are also the only ones we will ask about on HW). Links below are to the refactoring on refactoring.guru.
- Extract Method for too-long methods doing multiple independent things
- Inline Method for too-short methods doing almost nothing
- Inline Temp code can be easier to read if you in-place perform the calculation
- Replace Temp With Query again getting rid of temps; also this one can set up other refactorings since temps get in the way
- Split Temporary Variable don't use a temporary for multiple independent purposes
- Replace Method With Method Object primarily useful to support breaking apart a too-large method
- Extract Class one class doing the job of two, an SRP violation
- Move Method method is in wrong class
- Move Field same as previous but for field
- Inline Class one class has very few responsibilities
- Collapse Hierarchy a variation on the previous for under-utilized subclasses
- Replace Inheritance With Delegation do this if you are violating the Liskov Substitution Principle (LSP)
- Replace Delegation With Inheritance inverse of above; inheritance is more expressive (because of is-a, LSP), delegation is more flexible (no need to require LSP etc)
- Encapsulate Field aka "use getters and setters", to limit accessibility and support change
- Replace Conditional With Polymorphism replace switch on your own data with dynamic dispatch
- Replace Type Code with State/Strategy a refinement of previous for the case your data can change
- Introduce Assertion one tool to help program defensively
- Replace Parameter With Explicit Methods don't
switch
on a parameter, turn into individual methods - Replace Parameter With Method Call don't pass things the callee can get on their own
- Introduce Parameter Object If a certain group of parameters is always being passed together, make an object in their own right
- Form Template Method how and why to apply the Template Method design pattern
Code Smells aka anti-patterns
- A code smell/anti-pattern is a pattern of coding that smells (and is) wrong.
- See the Wikipedia anti-pattern list and Fowler's smells at refactoring.guru.
General Anti-Patterns
We will review some of the general anti-patterns from the above Wikipedia page.
- Analysis Paralysis going round and round on planning (overthinking it) without building anything.
- Bike Shedding wasting too much time on trivial aspects of an important project
- Big Ball of Mud code with poor separation of concerns, commonly produced by coders only putting out the next fire and not well-trained in good design. Beware, there is a MASS of mud out there.
- Input Kludge lack of validation of input to make sure it is proper.
- BaseBean don't inherit when relation is not is-a, delegate instead. Related to LSP and replace inheritance with delegation above.
- Call Super don't require subclass overrider to call super method, its a bad interface boundary. Note this one is contentious, its a trade-off.
- Circle-Ellipse Problem we covered this as square/rectange under the Liskov Substitution Principle (LSP).
- Sequential Couping methods with an implicit constraint on which one must be called first. Occasionally this is required, if so make the contract clear in API docs; or, code defensively (raise exception if violated).
Some of Fowler's bad smells with refactorings to fix
The names below link to the refactoring.guru page on the various smells.
- Duplicate Code AKA DRY
extract out the common bits into their own method (extract method) if code is in same class
if two classes duplicate code, consider extract class to create a new class to hold the shared functionality. - Long Method
extract method! - Long Parameter List
replace parameter with method call: don't pass data the receiver can get on its own accord receiver explicitly asks for data itself. You may need to pass an object as argument that the callee can then fetch all the goodies from.
OR: if the parameters are a natural, self-contained grouping, introduce parameter object.
Example: day month, year, hour minute second parameters ==> date parameter
both of these changes are turning passive objects into more active objects, a Good Thing. - Divergent Change (a.k.a. SRP violation)
If you have a fixed class that does distinctly different things consider separating out the varying code into varying classes (extract class) that either subclass or are contained by the non-varying class. This smell is closely related to the Single Responsibility Principle. - Shotgun Surgery (more details)
The smell: adding a new feature requires very similar code to be added to a bunch of places, and copy/paste is likely involved.
try to move method and move field to put the common bits into one place so no copy/paste needed.
- Feature Envy
Method in one class uses lots of pieces from another class.
move method to move it to the other class. - Data Clumps
Data that's always hanging with each other (e.g. name/street/zip). If its clumping, its a concept
extract class to make a new class just for that data. Will also help trim argument lists too since name street zip now passed as one address object. Closely related to long parameter list above. - Switch (case) statements
Yes, any switch can potentially smell, and especially instanceOf ones.
Use inheritance and polymorphism instead (replace conditional with polymorphism or replace type code with state/strategy)
we covered an example of this in Fowler Chapter 1; one of the more difficult refactorings - Lazy Class
Class doesn't seem to be doing anything. Get rid of it!- collapse heirarchy if a subclass is nearly vacuous.
- inline class (stick the class' methods and fields in the class that was using it and get rid of original class).
- Speculative generality
Class designed to do something in the future but never ends up doing it. Thinking too far ahead. Or you though you needed this generality but you didn't.
like above, collapse hierarchy or inline class - Message chains
Say you want to send a message to object D in class A but you have to go through B to get C and C to get D.
In other words, you were writing something likea.getB().getC().getD().messageToD()
These are strong violations of object encapsulation / active objects philosophy
find a sensible spot in the middle where a logical interface can be inserted.
e.g. use hide delegate to hide C and D in B, and add a method to B that does what A wanted to do with D:a.getB().letDKnowThisMessage()
. - Inappropriate Intimacy (aka Object Orgy)
directly getting in and munging with the internals of another class.
Signifies lack of good interface abstractions (recall design principle, "code to interfaces not implementations")
To fix this, move methods, etc, to consolidate the intimate bits. - Data Class aka god class aka Data-centric design
We have already covered this in the design princinples lecture: in data-centric design, there are some data classes which are pretty muchstruct
s: no interesting methods, and one god class that is like the C main program with all the functions in it.
look who uses the data and how they use it and move some of that code to the data class via a combination of extract method and move method
the Fowler chapter 1 example had several examples of this since it was initially a completely data-centric design - Comments
Comments in the middle of methods are often deodorant. Refactor so each comment block is its own method extract method.