Get list of distinct pixel RGB values
#1
Hi all,

I'm attempting to write a Python script that palette-swaps an un-indexed image, as Sample Colorize doesn't seem to have a way to force zero interpolation between colours.

I've got code that iterates across a pixel region pixel-by-pixel using the RowIterator from the colorxhtml.py script, finds the unique colours, arranges them by brightness, then uses gimp_image_select_color and gimp_edit_bucket_fill to replace the colours from drawable 1 with the colours from drawable 2, exactly, in order. All good.
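For readers following along, the "in order" pairing described above can be sketched in plain Python. This is only an illustration, not the script itself: `build_swap_map` is a hypothetical helper name, colours are assumed to be (R, G, B) tuples of ints, and "brightness" is assumed to mean the plain R+G+B sum.

```python
def build_swap_map(source_palette, target_palette):
    """Pair colours from two palettes by brightness rank.

    Both arguments are lists of (R, G, B) tuples. Brightness is
    approximated here by R+G+B, which is an assumption of this sketch.
    """
    if len(source_palette) != len(target_palette):
        raise ValueError("Palettes must contain the same number of colours")
    # Darkest source colour maps to darkest target colour, and so on.
    return dict(zip(sorted(source_palette, key=sum),
                    sorted(target_palette, key=sum)))


swap = build_swap_map([(200, 200, 200), (10, 10, 10)],
                      [(255, 0, 0), (0, 0, 40)])
print(swap[(10, 10, 10)])  # the darkest target colour
```

Each key/value pair would then drive one select-by-colour and fill step on the real drawable.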

The problem is iterating across the drawables takes way longer than it should - about 30 seconds for a ~250x250px layer. That's a bit much. Is there a way to get a list of unique pixel RGB values easily (ignoring transparency)? The histogram functions seem to work on individual channels.

I guess I could do a nested histogram in 1 channel, select areas for each of the values it took, then histogram each of those sub-areas in the next channel, then repeat again, but that seems like it'd be even less efficient!

Edit: Argh, sorry, I meant to post this in scripting! I don't seem to have the rights to delete it and repost, apologies.
#2
I would like to see your code, because even though the colorxhtml script isn't that optimized, it cannot be that bad.

This code:
Code:
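from collections import defaultdict

# Run from the Python-Fu console, where the gimp module is already loaded.
image=gimp.image_list()[0]
layer=image.active_layer
# This covers a 400x400 area; the layer must be at least that big.
region=layer.get_pixel_rgn(0, 0, 400,400)

# Map each distinct pixel value (a raw byte string) to its pixel count.
colors=defaultdict(int)
for x in range(400):
   for y in range(400):
       colors[region[x,y]]+=1
len(colors)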

runs in 5 seconds, and reports the same number of colors as the color cube analysis.

   

It runs even faster (a couple of seconds) with an image reduced to 256 colors...

The colors you obtain are a 3-byte string such as
Code:
> region[0,0]
'x\x89|'

which means:
  • Red is 0x78 = 120 ("x" in ASCII)
  • Green is 0x89 = 137
  • Blue is 0x7C = 124 ("|" in ASCII)
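If you'd rather work with numbers than raw strings, the decoding above can be sketched like this. `decode_pixel` is just an illustrative name, and `bytearray` is used because it yields ints in both Python 2 (which GIMP 2.x ships) and Python 3.

```python
def decode_pixel(pixel_bytes):
    """Turn a raw pixel string into a tuple of channel values (ints)."""
    # bytearray iterates as ints in both Python 2 and Python 3.
    return tuple(bytearray(pixel_bytes))


red, green, blue = decode_pixel(b'x\x89|')
print(red, green, blue)
```

For a layer with an alpha channel the string has a fourth byte, so the tuple would have four entries.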
   

(To get a representative number of colors, my test layer was a gray (0x80) fill to which I added some low-level RGB noise.)

You can of course write into the pixel region and return it as a new layer.

If you want really fast processing you can use numpy (so iterations are done by C code), but of course it isn't part of the regular Python runtime, and adding it to the GIMP Python runtime on Windows may be an ordeal for your prospective users.
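Short of numpy, a possible middle ground is to pull the pixel data out as one byte string and slice it in plain Python, which avoids one region lookup per pixel. The sketch below is untimed and makes assumptions: `count_colours` is a hypothetical helper, and it expects a flat buffer of `bpp` bytes per pixel, such as a region slice over the whole layer should return.

```python
from collections import Counter


def count_colours(raw, bpp):
    """Count distinct RGB triples in a flat pixel buffer.

    raw is a byte string holding bpp bytes per pixel; only the first
    three bytes of each pixel (R, G, B) are used as the key.
    """
    return Counter(raw[offset:offset + 3]
                   for offset in range(0, len(raw), bpp))


# In GIMP this buffer would come from something like (assumption):
#   raw = region[0:layer.width, 0:layer.height]
raw = b'\x01\x02\x03\xff' * 3 + b'\x04\x05\x06\x00'
print(len(count_colours(raw, 4)))  # number of distinct colours
```

Transparency is ignored here; filtering on the fourth byte would need an extra check inside the loop.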
#3
Sorry for the very late reply - managed to completely forget about this after I fixed it :/. The code's on GitHub, and it's been revised quite a bit since I posted this. Originally, I was calculating the sum RGB for each pixel and then using that as the dict key, but that was clearly a daft move when I could just do the summing afterwards.

The code I ended up with is below, with changes to account for 1. some palettes having multiple colours of the same sum RGB, and 2. some of the images having 1-2 stray pixels out of the palette. Fixing those meant switching to using pixel RGB as key like you, which then solved the performance issues.
Code:
def extract_sorted_palette(
    layer, include_transparent, count_threshold,
    current_progress, progress_fraction,
):
    """
    Extracts a palette from an image, by finding the discrete RGB values
    and then sorting them by total R+G+B value.
    """
    palette_counts = {}    
    progress_step = progress_fraction / layer.height

    region = layer.get_pixel_rgn(
        0, 0, layer.width, layer.height
    )

    for index_row in range(0, layer.height):
        for pixel in RowIterator(region[0:layer.width, index_row], layer.bpp):
            colour_rgb = pixel[0:3]

            if layer.has_alpha and pixel[3] == 0 and not include_transparent:
                continue

            elif colour_rgb not in palette_counts:
                palette_counts[colour_rgb] = 1

            else:
                palette_counts[colour_rgb] += 1

        gimp.progress_update(current_progress + progress_step * index_row)

    # Now we've counted all the pixel colours, discard outliers and sort
    palette = {}
    for colour_rgb, colour_count in palette_counts.items():
        colour_sum = sum(colour_rgb)

        if colour_count > count_threshold:
            if colour_sum in palette:
                if colour_rgb != palette[colour_sum]:
                    colour_duplicate = palette[colour_sum]
                    raise KeyError(
                        "Multiple colours in layer with same total RGB values: " + \
                        str(colour_rgb) + "(" + str(colour_count) + " pixels) and " + \
                        str(colour_duplicate) + "(" + str(palette_counts[colour_duplicate]) + " pixels). "
                        "Cannot automatically sort colours by brightness. " + \
                        "Try increasing the 'ignore colours with less than this many pixels' setting " + \
                        "to drop stray pixels."
                    )
            else:
                palette[colour_sum] = colour_rgb

    sorted_palette = [
        palette[key] for key in sorted(list(palette.keys()))
    ]
    return sorted_palette
Though, looking at your example, I can't believe I forgot about defaultdict; I'll fix that. I should also be using actual perceived brightness rather than RGB intensity.
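A rough sketch of what sorting by perceived brightness might look like, for anyone curious. The Rec. 601 luma weights (0.299, 0.587, 0.114) are one common choice and an assumption here, not what the script uses; `luma` and `sort_by_brightness` are illustrative names.

```python
def luma(colour_rgb):
    """Approximate perceived brightness with Rec. 601 weights (assumption)."""
    r, g, b = colour_rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def sort_by_brightness(palette_counts, count_threshold=0):
    """Drop colours at or below the pixel threshold, then sort dark to light."""
    kept = [colour for colour, count in palette_counts.items()
            if count > count_threshold]
    # The colour itself is a tie-breaker so the order is deterministic.
    return sorted(kept, key=lambda colour: (luma(colour), colour))


counts = {(255, 0, 0): 10, (0, 255, 0): 10, (0, 0, 255): 10, (9, 9, 9): 1}
print(sort_by_brightness(counts, count_threshold=1))
```

Pure red, green and blue all share the same R+G+B sum, so they would collide under sum-based sorting, whereas weighted luma tells them apart.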
#4
(03-23-2023, 01:28 PM)thetalkietoaster Wrote: Sorry for the very late reply - managed to completely forget about this after I fixed it :/. The code's on GitHub, and it's been revised quite a bit since I posted this. Originally, I was calculating the sum RGB for each pixel and then using that as the dict key, but that was clearly a daft move when I could just do the summing afterwards.

I just skimmed the extract_sorted_palette function that you pasted into your post and compared it to your recent GitHub version. Did you mean to change the logic when reversing the if statement?

From:

if layer.has_alpha and pixel[3] == 0 and not include_transparent:
    continue

To:

if include_transparent or layer.has_alpha and pixel[3] > 0:
    up the count

On a brief look, it doesn't seem right when include_transparent is False: for a layer without an alpha channel the condition can never be true, so no pixels would be counted. Would you be wanting something more like this:

if include_transparent or not layer.has_alpha or pixel[3] > 0:
    up the count
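A quick way to check this is to enumerate the cases. The sketch below models each condition as a small function (the names are made up for illustration) and uses `alpha=None` to stand in for a layer with no alpha channel.

```python
def original_counts(include_transparent, has_alpha, alpha):
    # The pasted version skips the pixel when this inner test is true,
    # so "count the pixel" is its negation.
    return not (has_alpha and alpha == 0 and not include_transparent)


def github_counts(include_transparent, has_alpha, alpha):
    return bool(include_transparent or has_alpha and alpha > 0)


def suggested_counts(include_transparent, has_alpha, alpha):
    return bool(include_transparent or not has_alpha or alpha > 0)


def mismatches(candidate):
    """List the cases where candidate disagrees with the pasted version."""
    cases = [(include, has_alpha, alpha)
             for include in (False, True)
             for has_alpha, alpha in ((False, None), (True, 0), (True, 255))]
    return [case for case in cases
            if candidate(*case) != original_counts(*case)]


print(mismatches(github_counts))     # cases where the GitHub version differs
print(mismatches(suggested_counts))  # cases where the suggestion differs
```

The only disagreement for the GitHub condition is the no-alpha layer with include_transparent set to False; the suggested condition matches the pasted version in every case.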