Tuesday, September 4, 2012

HCL: a new Color Space for a pack of lies.


HCL: a new Color Space for a more Effective
Content-based Image Retrieval
M. Sarifuddin and Rokia Missaoui
RESEARCH REPORT
Département d’informatique et d’ingénierie, Université du Québec en Outaouais,
C.P. 1250, Succ. B, Gatineau (Qc), Canada, J8X 3X7.

I've conducted an analysis of many dominant color spaces, including the pack of lies called HCL. I use color spaces rather frequently and have implementations of all the major ones. Due to their modular nature, they are rather easy to collect. So, seeing a reference to HCL (by M. Sarifuddin and Rokia Missaoui), I decided to give it a try. I managed to find a good implementation in Perl (Copyright (C) 2007, Mattia Barbon) and ported it over to Java. After all, the pretty pictures in the paper made it seem really effective.


Look how obviously (i) is better than something silly like DeltaE LAB. That's so superior! But wait, some of those figures don't make sense. DeltaE is a Euclidean distance. Sure, you can toss it at the tristimulus values in RGB and LAB, but LCH clearly has H. Hue. An angle. You can't apply it there. And Delta E94 is a modification made for LAB colors. Why are you applying it to things which aren't LAB? And a cylindrical distance on HSV? That's typically viewed as a cone, but I suppose a cylinder would work too. But, moreover, why does (i) get to have so much more yellow? I mean, am I to suppose that DeltaE LAB just thinks those greens are much closer? Or is there a major flaw in methodology here?

So, checking the methodology, we find that it randomly chooses crimped RGB colors: 0 <= R,G,B <= 255, step 15. So there are 17 different steps in any of the particular RGB values, and 4913 (17³) different colors available. So, loading up my trusty Photoshop eyedropper (under the auspices of finding the same first index color of yellow for a test), I came across an oddity. These values are not multiples of 15. Either the paper can't really have used this methodology, or some color rounding issue caused it to fail, or the conversion of the colors into a .pdf did some color quantization. In a paper about color.

Checking figure 3i, we find that the rounding theory cannot be justified at all. The index yellow is 253,254,31. The next three boxes differ only in their blue component: 34, 39, 45. These colors are not permitted by the methodology. SHENANIGANS! Not only do you have a heck of a lot of yellow, you have yellows you're not even *allowed* to have.
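The check itself is trivial to sketch. The class and method names below are mine, and I'm assuming "step 15" means every channel has to be a multiple of 15 counting up from 0. The three follow-up yellows are written out on the assumption that only blue changes, as described above.

```java
final class GridCheck {
    /** True when every channel is a multiple of step, i.e. reachable from 0 in whole steps. */
    static boolean isOnGrid(int r, int g, int b, int step) {
        return r % step == 0 && g % step == 0 && b % step == 0;
    }

    public static void main(String[] args) {
        // Index yellow plus the three neighbours sampled from figure 3i.
        int[][] sampled = {
            {253, 254, 31}, {253, 254, 34}, {253, 254, 39}, {253, 254, 45}
        };
        for (int[] c : sampled) {
            System.out.println(c[0] + "," + c[1] + "," + c[2]
                    + " on grid: " + isOnGrid(c[0], c[1], c[2], 15));
        }
    }
}
```

Even the one with blue at 45 fails, because 253 and 254 are not multiples of 15 either.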

Also, why are these "randomly" varying? "Each one of them is compared to a collection of randomly generated colors using each one of the proposed similarity measures." No. That's not fair. If your distance criterion is unforgiving enough, you simply win. If the color is within 3 values between RGB, go ahead and give that a distance of 1. For *everything else*, return infinity. Then it just keeps randomly drawing new colors until it finds ones that my threshold function allows, namely pretty much identical colors.
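To make that concrete, here is a minimal sketch of such a cheating metric paired with a keep-drawing-until-accepted loop. Everything in it is hypothetical: the names are mine, I'm reading "within 3" as per channel, and I'm guessing at how a random comparison would be wired up. It is not the paper's procedure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class ThresholdDistance {
    /** Degenerate metric: 1 when every channel is within 3 of the index color, infinity otherwise. */
    static double distance(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (Math.abs(a[i] - b[i]) > 3) {
                return Double.POSITIVE_INFINITY;
            }
        }
        return 1.0;
    }

    /** Draws random RGB colors until count of them have a finite distance to the index color. */
    static List<int[]> sampleAccepted(int[] index, int count, Random rng) {
        List<int[]> accepted = new ArrayList<>();
        while (accepted.size() < count) {
            int[] candidate = { rng.nextInt(256), rng.nextInt(256), rng.nextInt(256) };
            if (Double.isFinite(distance(index, candidate))) {
                accepted.add(candidate);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        int[] index = {253, 254, 31};
        for (int[] c : sampleAccepted(index, 5, new Random(42))) {
            System.out.println(c[0] + "," + c[1] + "," + c[2]);
        }
    }
}
```

Every color this prints is a near-copy of the index yellow, which would look fantastic in a figure and says nothing about the metric.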

Let's try this again, with something proper. 48 squares: the index color and the first 47 colors sorted from these 4913. No random. No threshold. The closest colors, without repeats, under each specific criterion.
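The procedure is roughly the following. This is a sketch rather than my actual test harness: the names are made up for illustration, the palette is assumed to already contain each color exactly once, and plain RGB Euclidean distance stands in for whichever metric is under test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

final class NearestColors {
    /** Plain Euclidean distance on 8-bit RGB triples; a stand-in for the metric under test. */
    static double rgbEuclidean(int[] a, int[] b) {
        double dr = a[0] - b[0], dg = a[1] - b[1], db = a[2] - b[2];
        return Math.sqrt(dr * dr + dg * dg + db * db);
    }

    /** Returns the count palette entries closest to index, nearest first. */
    static List<int[]> nearest(int[] index, List<int[]> palette, int count,
                               ToDoubleBiFunction<int[], int[]> metric) {
        List<int[]> sorted = new ArrayList<>(palette); // leave the caller's list untouched
        sorted.sort(Comparator.comparingDouble((int[] c) -> metric.applyAsDouble(index, c)));
        return new ArrayList<>(sorted.subList(0, Math.min(count, sorted.size())));
    }

    public static void main(String[] args) {
        int[] index = {253, 254, 31};
        // Tiny hand-picked palette for illustration; the real run sorts the whole candidate set.
        List<int[]> palette = Arrays.asList(
            new int[] {0, 0, 0}, new int[] {255, 255, 255},
            new int[] {250, 250, 40}, new int[] {255, 240, 30});
        for (int[] c : nearest(index, palette, 3, NearestColors::rgbEuclidean)) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```

Swapping the metric is just a matter of passing a different function, so every color space gets exactly the same candidates and the same ranking rule.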

Lab Delta 2000






 Lab Delta 1994







Lab Delta Euclidean







Hunter Lab







Luv








RGB Delta Euclidean







Redmean







And Finally....


Drumroll please


HCL!







Oh, did I mention the hue formula is wrong?

        if (rg >= 0 && gb >= 0) {
            H = 2 * H / 3;
        } else if (rg >= 0 && gb < 0) {
            H = 4 * H / 3;
        } else if (rg < 0 && gb >= 0) {
            H = 180 + 4 * H / 3;
        } else if (rg < 0 && gb < 0) {
            H = 2 * H / 3 - 180;
        }
It treats rg (R - G) and gb (G - B) as complementary colors. These aren't complementary but tristimulus. It corrects for this by tweaking the ranges of the hue: what was 90 and 90 becomes 60 and 120, nudging the hue to where it would be if they were complementary colors.
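To see the 90/90 to 60/120 stretch in numbers, here is a minimal sketch that wraps the four branches quoted above in a method. The class and method names are mine, and the inputs are raw angles picked to hit each branch's extreme, not hues computed from real colors.

```java
final class HclHueBranches {
    /** Applies the four-branch adjustment quoted above to a raw arctangent hue in degrees. */
    static double adjust(double h, double rg, double gb) {
        if (rg >= 0 && gb >= 0) {
            return 2 * h / 3;           // a 90 degree span shrinks to 60
        } else if (rg >= 0) {           // gb < 0
            return 4 * h / 3;           // a 90 degree span grows to 120
        } else if (gb >= 0) {           // rg < 0
            return 180 + 4 * h / 3;
        } else {                        // rg < 0 and gb < 0
            return 2 * h / 3 - 180;
        }
    }

    public static void main(String[] args) {
        System.out.println(adjust(90, 1, 1));    // 2/3 branch
        System.out.println(adjust(-90, 1, -1));  // 4/3 branch
        System.out.println(adjust(-90, -1, 1));  // 4/3 branch, shifted by +180
        System.out.println(adjust(90, -1, -1));  // 2/3 branch, shifted by -180
    }
}
```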

Since it uses Arctan(rg/gb), which gives a hue ranging from -90° to +90°, it needs to shift two of the sections over. The paper wrongly chooses -+ and --, when really it should use +- and --, shifting over quadrants II and IV rather than III and IV. The angle 0 is +Y, not +X. The ranges being utilized are quadrant I and quadrant III, the +Y bits. When the Y (in this case gb) is negative, the hue needs to be rotated by 180. The paper does this for the +X sections instead, leaving nothing in quadrant I and two overlapping color areas in quadrant III.


+180 and -180 are the same thing. It doesn't matter which way you turn when you turn around. You end up around. So really it should be:

if (gb < 0) H += 180;

rather than,

if (rg < 0) H += 180;

Though in the equations this is done to preserve the sign.


Properly, you need to invert the hue and roll it over to the other side.
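As a sanity check on the half-turn rule, here is a small sketch (my names, not the paper's) showing that a single-argument arctangent plus a 180 rotation whenever the denominator is negative lands on the same angle as Java's two-argument Math.atan2. With Arctan(rg/gb), the denominator would be gb. This only demonstrates the quadrant bookkeeping; it does not include the 2/3 and 4/3 range stretching.

```java
final class FullCircleAngle {
    /** Wraps an angle in degrees into [0, 360). */
    static double wrap(double deg) {
        return ((deg % 360) + 360) % 360;
    }

    /** Single-argument arctangent plus a manual half-turn when the denominator is negative. */
    static double manual(double num, double den) {
        if (num == 0 && den == 0) {
            return 0; // direction undefined for a gray; 0 chosen arbitrarily
        }
        double deg = Math.toDegrees(Math.atan(num / den)); // covers only the den >= 0 half
        if (den < 0) {
            deg += 180; // turn around; -180 would wrap to the same place
        }
        return wrap(deg);
    }

    /** Same angle via the library's quadrant-aware arctangent. */
    static double viaAtan2(double num, double den) {
        return wrap(Math.toDegrees(Math.atan2(num, den)));
    }

    public static void main(String[] args) {
        double[][] samples = { {1, 1}, {1, -1}, {-1, -1}, {-1, 1}, {0, -1}, {1, 0} };
        for (double[] s : samples) {
            System.out.println(manual(s[0], s[1]) + " vs " + viaAtan2(s[0], s[1]));
        }
    }
}
```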








This turns out to be a lot of work for something that isn't really that great. The colors are read left to right, top to bottom, and the chart *should* list the closest colors to the index color first. So the fact that there's a color in the third row that looks a lot like the index color, certainly more than the browns and greens ahead of it, counts against it. Compare this to something rather nice like LAB DeltaE2000:









While there are some colors that seem a bit closer (though the viewing area can make a significant difference, with background colors etc.), it really does seem to keep the best colors right up top.

But the paper also includes its own distance formula. Rather than using cylindrical distance, we can use DistanceHCL, which gives us:








Is this an improvement? Yes. Is this an improvement over CIEDE2000 (Lab DeltaE2000)? No. Not remotely.
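For reference, the plain cylindrical distance used in the earlier HCL chart can be sketched like this. It is the generic law-of-cosines form for any lightness/chroma/hue triple, with names of my own choosing. It is not the paper's DistanceHCL, whose extra weighting I'm not reproducing here.

```java
final class CylindricalDistance {
    /**
     * Distance between two points given as (lightness, chroma, hue in degrees):
     * lightness is the cylinder's height, chroma the radius, hue the angle.
     */
    static double distance(double l1, double c1, double hDeg1,
                           double l2, double c2, double hDeg2) {
        double dl = l1 - l2;
        double dh = Math.toRadians(hDeg1 - hDeg2);
        // Law of cosines in the chroma plane; cos() makes the hue wrap-around harmless.
        double planar = c1 * c1 + c2 * c2 - 2 * c1 * c2 * Math.cos(dh);
        return Math.sqrt(dl * dl + Math.max(0.0, planar)); // clamp tiny negative rounding error
    }

    public static void main(String[] args) {
        System.out.println(distance(50, 40, 350, 50, 40, 10)); // hues 20 degrees apart across 0
        System.out.println(distance(50, 40, 10, 50, 40, 30));  // hues 20 degrees apart, no wrap
    }
}
```

The two printed values should agree up to rounding, since only the hue difference matters, not where on the circle it falls.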

One of the things that should be noted is that there's a big difference between color distance routines which are simply different foldings of RGB space and things like LAB, which actually push and pull various hue ranges with regard to human eyes. It apparently makes a big difference, because regardless of how well you can tweak the space into a different shape, if there's no regard for the color of that shape in some specific area, you will always be hampered by the non-linear nature of RGB. We see greens more clearly than blues, and blues better than reds (although blue makes less of a contribution to our perception of gamma).
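A quick way to see both the non-linearity and the uneven channel contributions is to compute relative luminance. This sketch assumes the RGB values are sRGB and uses the standard sRGB transfer curve with the Rec. 709 luminance weights. That is my choice for illustration and not something any of the distance routines above does.

```java
final class RelativeLuminance {
    /** Undoes the sRGB transfer curve for one 8-bit channel; result is linear light in [0, 1]. */
    static double linearize(int channel) {
        double c = channel / 255.0;
        return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
    }

    /** Relative luminance of an sRGB color using the Rec. 709 weights. */
    static double luminance(int r, int g, int b) {
        return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
    }

    public static void main(String[] args) {
        System.out.println("mid-gray channel, linear: " + linearize(128));
        System.out.println("pure red:   " + luminance(255, 0, 0));
        System.out.println("pure green: " + luminance(0, 255, 0));
        System.out.println("pure blue:  " + luminance(0, 0, 255));
    }
}
```

A channel value of 128 is nowhere near half the light, and pure green carries most of the luminance while pure blue carries very little. A metric that treats the three RGB axes as equal and linear has no way of knowing that.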
