Finding duplicate code with CCFinder

If you're attempting to refactor a large code base, give CCFinder a try.

I checked out the following tools, suggested by the Wikipedia entry on duplicate code:
- Simian
- CCFinder
- PMD Target

Simian was pretty easy to download, install and run. It seemed to do a pretty good job of listing duplicate code sections in the code, but it only gave a text list. I could have saved the output and sliced and diced it a bit in Excel or something, but that would still leave much to be desired.

After checking out the screenshots of CCFinder, I decided to give it a try. It's free, and *do* I appreciate that, but the registration process was really pretty cumbersome. Its CAPTCHA is by far the worst I've seen -- I failed it several times. The zip file is also password-protected, and there's a license you have to install.

It's also unclear whether you need Silverlight 2.x+, Python, .NET, etc. I think the installer has been improving, but the whole experience needs some clarity.

I had some problems with my Java path (admittedly my problem), but I finally got the thing installed. I think it was worth the effort.

I was analyzing a PHP code base. Unfortunately, PHP wasn't explicitly supported, so I had to rename the files to .cpp (close enough) and remove the "preprocess" option from the analyzer. Then I pointed it at my parent directory and let it go.
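If you'd rather not rename files in your working tree, the same trick can be done on a copy. Here's a minimal sketch in Python, assuming you want a scratch directory that mirrors the original layout; the function name `copy_php_as_cpp` and the directory names in the usage note are made up for illustration and have nothing to do with CCFinder itself.

```python
import shutil
from pathlib import Path


def copy_php_as_cpp(src_root, dest_root):
    """Copy every .php file under src_root into dest_root, renamed to .cpp.

    The directory layout is preserved and the originals are left untouched.
    Returns the list of files that were written.
    """
    src_root = Path(src_root)
    dest_root = Path(dest_root)
    written = []
    for php_file in sorted(src_root.rglob("*.php")):
        # Same relative path, but under dest_root and with a .cpp suffix.
        target = (dest_root / php_file.relative_to(src_root)).with_suffix(".cpp")
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(php_file, target)
        written.append(target)
    return written
```

Something like `copy_php_as_cpp("myproject", "myproject_as_cpp")` would then give you a directory to point the analyzer at. Line numbers in the copies match the originals, so clone locations map straight back to the PHP files.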

You get a nice visual showing where blocks of duplicate code are. This page gives a good overview of how to navigate and discover things:

I found it most useful to look at the Clone-set Table sorted with the biggest clones at the top. The source code on the right was helpful, but I often went to a diff tool (e.g. WinMerge) for further analysis.

I guess it's obvious, but CCFinder also found intra-file clones -- something you can't find with a file-to-file diff tool like WinMerge.
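To make "intra-file clone" concrete, here's a naive sketch in Python that reports runs of identical lines repeated within a single file. This is only an illustration and not how CCFinder works (as far as I know it compares token sequences rather than raw lines); the function name `find_intra_file_clones` and the `window` parameter are my own.

```python
from collections import defaultdict


def find_intra_file_clones(text, window=3):
    """Return groups of 1-based line numbers where the same run of
    `window` non-blank lines (compared after stripping whitespace)
    starts more than once in the given text."""
    # Keep the original line numbers, but skip blank lines.
    lines = [
        (num, line.strip())
        for num, line in enumerate(text.splitlines(), start=1)
        if line.strip()
    ]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = tuple(content for _, content in lines[i:i + window])
        seen[chunk].append(lines[i][0])
    # Only runs that start in more than one place count as clones.
    return [starts for starts in seen.values() if len(starts) > 1]
```

Calling it on the contents of one file returns a list of groups, each group being the line numbers where a repeated run begins. A diff tool needs two files to compare, which is why this kind of duplication slips past it.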

Anyway, one thumb down for CCFinder's install experience, but two thumbs up for an otherwise nice, free tool.


There are 3 Comments

Could you say a little more about your Java path problem and its solution? I'm having a similar problem.

I am trying to install and run CCFinderX on WinXP.
The installation doc at the site ( asks to unzip the file with a password, and then run .
However, the downloaded zip ( from did not ask for a password, neither does the zip have setup.exe.
When I tried to execute or , it gave an error saying that the Python interpreter path is not defined.

It looks like I do not have the correct ccfx zip file. Please let me know where to download the correct file, or where to find the correct installation instructions.

Warm regards
Ravindra Naik

It may be expecting Python 2.6. Set its path in gemx.bat if it isn't installed in the default location.