C++ Refactoring walk through


Example

Here's a program which might benefit from refactoring. It's a simple program using C++11 which is intended to calculate and print all prime numbers from 1 to 100 and is based on a program that was posted on CodeReview for review.

#include <iostream>
#include <vector>
#include <cmath>

int main()
{
    int l = 100;
    bool isprime;
    std::vector<int> primes;
    primes.push_back(2);
    for (int no = 3; no < l; no += 2) {
        isprime = true;
        for (int primecount=0; primes[primecount] <= std::sqrt(no); ++primecount) {
            if (no % primes[primecount] == 0) {
                isprime = false;
                break;
            } else if (primes[primecount] * primes[primecount] > no) {
                std::cout << no << "\n";
                break;
            }
        }
        if (isprime) {
            std::cout << no << " ";
            primes.push_back(no);
        }
    }
    std::cout << "\n";
}

The output from this program looks like this:

3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

The first thing we notice is that the program fails to print 2 which is a prime number. We could simply add a line of code to simply print that one constant without modifying the rest of the program, but it might be neater to refactor the program to split it into two parts - one that creates the prime number list an another that prints them. Here's how that might look:

#include <iostream>
#include <vector>
#include <cmath>

std::vector<int> prime_list(int limit)
{
    bool isprime;
    std::vector<int> primes;
    primes.push_back(2);
    for (int no = 3; no < limit; no += 2) {
        isprime = true;
        for (int primecount=0; primes[primecount] <= std::sqrt(no); ++primecount) {
            if (no % primes[primecount] == 0) {
                isprime = false;
                break;
            } else if (primes[primecount] * primes[primecount] > no) {
                break;
            }
        }
        if (isprime) {
            primes.push_back(no);
        }
    }
    return primes;
}

int main() 
{
    std::vector<int> primes = prime_list(100);
    for (std::size_t i = 0; i < primes.size(); ++i) {
        std::cout << primes[i] << ' ';
    }
    std::cout << '\n';
}

Trying this version, we see that it does indeed work correctly now:

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

The next step is to notice that the second if clause is not really needed. The logic in the loop looks for prime factors of each given number up to the square root of that number. This works because if there are any prime factors of a number at least one of them must be less than or equal to the square root of that number. Reworking just that function (the rest of the program remains the same) we get this result:

std::vector<int> prime_list(int limit)
{
    bool isprime;
    std::vector<int> primes;
    primes.push_back(2);
    for (int no = 3; no < limit; no += 2) {
        isprime = true;
        for (int primecount=0; primes[primecount] <= std::sqrt(no); ++primecount) {
            if (no % primes[primecount] == 0) {
                isprime = false;
                break;
            }
        }
        if (isprime) {
            primes.push_back(no);
        }
    }
    return primes;
}

We can go further, changing variable names to be a bit more descriptive. For example primecount isn't really a count of primes. Instead it's an index variable into the vector of known primes. Also, while no is sometimes used as an abbreviation for "number", in mathematical writing, it's more common to use n. We can also make some modifications by eliminating the break, and by declaring variables closer to where they're used.

std::vector<int> prime_list(int limit)
{
    std::vector<int> primes{2};
    for (int n = 3; n < limit; n += 2) {
        bool isprime = true;
        for (int i=0; isprime && primes[i] <= std::sqrt(n); ++i) {
            isprime &= (n % primes[i] != 0);
        }
        if (isprime) {
            primes.push_back(n);
        }
    }
    return primes;
}

We can also refactor main to use a "range-for" to make it a bit neater:

int main() 
{
    std::vector<int> primes = prime_list(100);
    for (auto p : primes) {
        std::cout << p << ' ';
    }
    std::cout << '\n';
}

This is just one way refactoring might be done. Others might make different choices. However, the purpose for refactoring remains the same, which is to improve the readability and possibly the performance of the code without necessarily adding features.