A Tale of Two Bresenhams

...or How To Draw Thick Bresenham Lines in Rust

I decided last post to construct rivers basically using a bunch of circles. One reason for that is because it would make it easy to have the middle of my rivers be deeper than the edges. Another reason is because properly drawing thick lines has been frustrating me a bit lately and I was eager to just move on. They were both pretty bad reasons. Bad because the zillion circle idea turned out to be really slow (partly because I kept a ring buffer of previous tiles so I could avoid where my circles overlapped), and also bad because where's the fun in being lazy and not figuring something out?

I tried lots of algorithms for thick lines. Lots. Some of them just didn't work, possibly my fault. Some of them were just plain wrong and the line ends weren't perpendicular. I suck at math, so it took me a bit to wrap my head around the problem. At first, I'm thinking to myself: A thick line is just a rectangle...all I need are a couple perpendicular lines to set the corners to, and fill in all the pixels one line at a time. How hard can it be?

Every friggin' site on the internet says that to draw a perpendicular line, just invert the slope. Problem solved! (I mean, except for zero and infinity, but that's easy because at that point we're dealing with vertical or horizontal lines, and even I can add to X or add to Y to get a perpendicular point). But alas, the internet lies. Just blindly inverting a slope will give you a perpendicular slope only some of the time.

I finally found a promising link here: https://github.com/ArminJo/Arduino-BlueDisplay/blob/master/src/LocalGUI/ThickLine.hpp. At first I recoiled in horror at the camelCase variable names, the same exact way I hated Rust's snake_case names at first. Rust got this one right though. I ended up making 3 different versions, all tailored for my specific use, which is why I'm not making (or contributing to) a crate.

get_thick_line_unchecked() will make a proper fat line, even if it overlaps canvas boundaries (which are still only square). I call it unchecked because if any parts of the line go out of bounds, there will be duplicate points. get_thick_line() will also make a proper fat line, account for out of bounds, but also remove duplicates. Option #3 is the scary one, and I'll write about it in a bit. Both of the unchecked versions will produce perfect lines with zero duplicates if no parts of the line rectangle go out of bounds. Here's the code for the first version:

#[inline]
// ported from https://github.com/ArminJo/Arduino-BlueDisplay/blob/master/src/LocalGUI/ThickLine.hpp
pub fn get_thick_line_unchecked(
    from: (usize, usize),
    to: (usize, usize),
    line_width: usize,
    thick_mode: ThicknessMode,
    map_width: usize,
) -> Vec<(usize, usize)> {
    let mut line: Vec<(usize, usize)> = vec![];
    let mut x1 = from.0 as isize;
    let mut x2 = to.0 as isize;
    let mut y1 = from.1 as isize;
    let mut y2 = to.1 as isize;
    let mut dx: isize;
    let mut dy: isize;
    let dx2: isize;
    let dy2: isize;
    let mut err: isize;
    let mut step_x: isize;
    let mut step_y: isize;
    let max_width = (map_width - 1) as isize;
    if x1 > max_width {
        x1 = max_width;
    }
    if x2 > max_width {
        x2 = max_width;
    }
    if y1 > max_width {
        y1 = max_width;
    }
    if y2 > max_width {
        y2 = max_width;
    }
    if line_width <= 1 {
        return get_line_overlap((x1, y1), (x2, y2), LINE_OVERLAP_NONE, map_width);
    }
    dy = x2 - x1;
    dx = y2 - y1;
    let mut swap = true;
    if dx < 0 {
        dx = -dx;
        step_x = -1;
        swap = !swap;
    } else {
        step_x = 1;
    }
    if dy < 0 {
        dy = -dy;
        step_y = -1;
        swap = !swap;
    } else {
        step_y = 1;
    }
    dx2 = dx << 1;
    dy2 = dy << 1;
    let mut overlap: usize;
    let mut draw_start_adjust_count = line_width / 2;
    if thick_mode == ThicknessMode::LineThicknessDrawCounterclockwise {
        draw_start_adjust_count = line_width - 1;
    } else if thick_mode == ThicknessMode::LineThicknessDrawClockwise {
        draw_start_adjust_count = 0;
    }
    if dx >= dy {
        if swap {
            draw_start_adjust_count = (line_width - 1) - draw_start_adjust_count;
            step_y = -step_y;
        } else {
            step_x = -step_x;
        }
        err = dy2 - dx;
        for _ in (0..draw_start_adjust_count).rev() {
            x1 -= step_x;
            x2 -= step_x;
            if err >= 0 {
                y1 -= step_y;
                y2 -= step_y;
                err -= dx2;
            }
            err += dy2;
        }
        line.append(&mut get_line_overlap(
            (x1, y1),
            (x2, y2),
            LINE_OVERLAP_NONE,
            map_width,
        ));
        err = dy2 - dx;
        for _ in (1..line_width).rev() {
            x1 += step_x;
            x2 += step_x;
            overlap = LINE_OVERLAP_NONE;
            if err >= 0 {
                y1 += step_y;
                y2 += step_y;
                err -= dx2;
                overlap = LINE_OVERLAP_MAJOR;
            }
            err += dy2;
            line.append(&mut get_line_overlap(
                (x1, y1),
                (x2, y2),
                overlap,
                map_width,
            ))
        }
    } else {
        if swap {
            step_x = -step_x;
        } else {
            draw_start_adjust_count = (line_width - 1) - draw_start_adjust_count;
            step_y = -step_y;
        }
        err = dx2 - dy;
        for _ in (0..draw_start_adjust_count).rev() {
            y1 -= step_y;
            y2 -= step_y;
            if err >= 0 {
                x1 -= step_x;
                x2 -= step_x;
                err -= dy2;
            }
            err += dx2;
        }
        line.append(&mut get_line_overlap(
            (x1, y1),
            (x2, y2),
            LINE_OVERLAP_NONE,
            map_width,
        ));
        err = dx2 - dy;
        for _ in (1..line_width).rev() {
            y1 += step_y;
            y2 += step_y;
            overlap = LINE_OVERLAP_NONE;
            if err >= 0 {
                x1 += step_x;
                x2 += step_x;
                err -= dy2;
                overlap = LINE_OVERLAP_MAJOR;
            }
            err += dx2;
            line.append(&mut get_line_overlap(
                (x1, y1),
                (x2, y2),
                overlap,
                map_width,
            ));
        }
    }
    // these are 2.5x slower than using get_thick_line() to remove dupes:
    //line.sort();
    //line.dedup();
    line
}

and....

const LINE_OVERLAP_NONE: usize = 0; // No line overlap, like in standard Bresenham
const LINE_OVERLAP_MAJOR: usize = 0x01; // Overlap - first go major then minor direction. Pixel is drawn as extension after actual line
const LINE_OVERLAP_MINOR: usize = 0x02; // Overlap - first go minor then major direction. Pixel is drawn as extension before next line
const LINE_OVERLAP_BOTH: usize = 0x03; // Overlap - both

#[derive(PartialEq)]
pub enum ThicknessMode {
    LineThicknessMiddle = 0,
    LineThicknessDrawClockwise = 1,
    LineThicknessDrawCounterclockwise = 2,
}

#[inline]
// ported from https://github.com/ArminJo/Arduino-BlueDisplay/blob/master/src/LocalGUI/ThickLine.hpp
fn get_line_overlap(
    from: (isize, isize),
    to: (isize, isize),
    overlap: usize,
    map_width: usize,
) -> Vec<(usize, usize)> {
    let mut x1 = from.0;
    let x2 = to.0;
    let mut y1 = from.1;
    let y2 = to.1;
    let mut line: Vec<(usize, usize)> = vec![];
    let mut dx: isize;
    let mut dy: isize;
    let dx2: isize;
    let dy2: isize;
    let mut err: isize;
    let step_x: isize;
    let step_y: isize;
    let max_width = (map_width - 1) as isize;
    dx = x2 - x1;
    dy = y2 - y1;
    if dx < 0 {
        dx = -dx;
        step_x = -1;
    } else {
        step_x = 1;
    }
    if dy < 0 {
        dy = -dy;
        step_y = -1;
    } else {
        step_y = 1;
    }
    dx2 = dx << 1;
    dy2 = dy << 1;
    line.push((
        x1.clamp(0, max_width) as usize,
        y1.clamp(0, max_width) as usize,
    ));
    if dx > dy {
        err = dy2 - dx;
        while x1 != x2 {
            x1 += step_x;
            if err >= 0 {
                if overlap & LINE_OVERLAP_MAJOR != 0 {
                    line.push((
                        x1.clamp(0, max_width) as usize,
                        y1.clamp(0, max_width) as usize,
                    ));
                }
                y1 += step_y;
                if overlap & LINE_OVERLAP_MINOR != 0 {
                    line.push((
                        (x1 - step_x).clamp(0, max_width) as usize,
                        y1.clamp(0, max_width) as usize,
                    ));
                }
                err -= dx2;
            }
            err += dy2;
            line.push((
                x1.clamp(0, max_width) as usize,
                y1.clamp(0, max_width) as usize,
            ));
        }
    } else {
        err = dx2 - dy;
        while y1 != y2 {
            y1 += step_y;
            if err >= 0 {
                if overlap & LINE_OVERLAP_MAJOR != 0 {
                    line.push((
                        x1.clamp(0, max_width) as usize,
                        y1.clamp(0, max_width) as usize,
                    ));
                }
                x1 += step_x;
                if overlap & LINE_OVERLAP_MINOR != 0 {
                    line.push((
                        x1.clamp(0, max_width) as usize,
                        (y1 - step_y).clamp(0, max_width) as usize,
                    ));
                }
                err -= dy2;
            }
            err += dx2;
            line.push((
                x1.clamp(0, max_width) as usize,
                y1.clamp(0, max_width) as usize,
            ));
        }
    }
    line
}

Okay, that's the first version. It works, but possibly includes duplicate points. If you're actually trying to rasterize a rectangle pixel by pixel in 2023, then dupes might not matter.

Here's the guaranteed dupe free version, with an option for an unsafe unchecked version:

#[inline]
// ported from https://github.com/ArminJo/Arduino-BlueDisplay/blob/master/src/LocalGUI/ThickLine.hpp
// i strongly discourage use of the unsafe_unchecked version unless you're 100% sure of what you're doing
pub fn get_thick_line(
    from: (usize, usize),
    to: (usize, usize),
    line_width: usize,
    thick_mode: ThicknessMode,
    map_width: usize,
    unsafe_unchecked: bool,
) -> Vec<(usize, usize)> {
    let mut line: Vec<(isize, isize)> = vec![];
    let mut x1 = from.0 as isize;
    let mut x2 = to.0 as isize;
    let mut y1 = from.1 as isize;
    let mut y2 = to.1 as isize;
    let mut dx: isize;
    let mut dy: isize;
    let dx2: isize;
    let dy2: isize;
    let mut err: isize;
    let mut step_x: isize;
    let mut step_y: isize;
    let max_width = (map_width - 1) as isize;
    if x1 > max_width {
        x1 = max_width;
    }
    if x2 > max_width {
        x2 = max_width;
    }
    if y1 > max_width {
        y1 = max_width;
    }
    if y2 > max_width {
        y2 = max_width;
    }
    fn convert(input: &Vec<(isize, isize)>, max_width: isize) -> Vec<(usize, usize)> {
        let mut retval: Vec<(usize, usize)> = Vec::with_capacity(input.len());
        for x in 0..input.len() {
            if input[x].0 >= 0
                && input[x].0 <= max_width
                && input[x].1 >= 0
                && input[x].1 <= max_width
            {
                retval.push((input[x].0 as usize, input[x].1 as usize));
            }
        }
        retval
    }
    unsafe fn fast_convert(input: Vec<(isize, isize)>) -> Vec<(usize, usize)> {
        let mut v = std::mem::ManuallyDrop::new(input);
        Vec::from_raw_parts(v.as_mut_ptr() as *mut (usize, usize), v.len(), v.capacity())
    }
    if line_width <= 1 {
        line = get_line_overlap2((x1, y1), (x2, y2), LINE_OVERLAP_NONE);
        return convert(&line, max_width);
    }
    dy = x2 - x1;
    dx = y2 - y1;
    let mut swap = true;
    if dx < 0 {
        dx = -dx;
        step_x = -1;
        swap = !swap;
    } else {
        step_x = 1;
    }
    if dy < 0 {
        dy = -dy;
        step_y = -1;
        swap = !swap;
    } else {
        step_y = 1;
    }
    dx2 = dx << 1;
    dy2 = dy << 1;
    let mut overlap: usize;
    let mut draw_start_adjust_count = line_width / 2;
    if thick_mode == ThicknessMode::LineThicknessDrawCounterclockwise {
        draw_start_adjust_count = line_width - 1;
    } else if thick_mode == ThicknessMode::LineThicknessDrawClockwise {
        draw_start_adjust_count = 0;
    }
    if dx >= dy {
        if swap {
            draw_start_adjust_count = (line_width - 1) - draw_start_adjust_count;
            step_y = -step_y;
        } else {
            step_x = -step_x;
        }
        err = dy2 - dx;
        for _ in (0..draw_start_adjust_count).rev() {
            x1 -= step_x;
            x2 -= step_x;
            if err >= 0 {
                y1 -= step_y;
                y2 -= step_y;
                err -= dx2;
            }
            err += dy2;
        }
        line.append(&mut get_line_overlap2(
            (x1, y1),
            (x2, y2),
            LINE_OVERLAP_NONE,
        ));
        err = dy2 - dx;
        for _ in (1..line_width).rev() {
            x1 += step_x;
            x2 += step_x;
            overlap = LINE_OVERLAP_NONE;
            if err >= 0 {
                y1 += step_y;
                y2 += step_y;
                err -= dx2;
                overlap = LINE_OVERLAP_MAJOR;
            }
            err += dy2;
            line.append(&mut get_line_overlap2((x1, y1), (x2, y2), overlap));
        }
    } else {
        if swap {
            step_x = -step_x;
        } else {
            draw_start_adjust_count = (line_width - 1) - draw_start_adjust_count;
            step_y = -step_y;
        }
        err = dx2 - dy;
        for _ in (0..draw_start_adjust_count).rev() {
            y1 -= step_y;
            y2 -= step_y;
            if err >= 0 {
                x1 -= step_x;
                x2 -= step_x;
                err -= dy2;
            }
            err += dx2;
        }
        line.append(&mut get_line_overlap2(
            (x1, y1),
            (x2, y2),
            LINE_OVERLAP_NONE,
        ));
        err = dx2 - dy;
        for _ in (1..line_width).rev() {
            y1 += step_y;
            y2 += step_y;
            overlap = LINE_OVERLAP_NONE;
            if err >= 0 {
                x1 += step_x;
                x2 += step_x;
                err -= dy2;
                overlap = LINE_OVERLAP_MAJOR;
            }
            err += dx2;
            line.append(&mut get_line_overlap2((x1, y1), (x2, y2), overlap));
        }
    }
    if unsafe_unchecked {
        unsafe { fast_convert(line) }
    } else {
        convert(&line, max_width)
    }
}

and...

#[inline]
// ported from https://github.com/ArminJo/Arduino-BlueDisplay/blob/master/src/LocalGUI/ThickLine.hpp
fn get_line_overlap2(
    from: (isize, isize),
    to: (isize, isize),
    overlap: usize,
) -> Vec<(isize, isize)> {
    let mut x1 = from.0;
    let x2 = to.0;
    let mut y1 = from.1;
    let y2 = to.1;
    let mut line: Vec<(isize, isize)> = vec![];
    let mut dx: isize;
    let mut dy: isize;
    let dx2: isize;
    let dy2: isize;
    let mut err: isize;
    let step_x: isize;
    let step_y: isize;
    dx = x2 - x1;
    dy = y2 - y1;
    if dx < 0 {
        dx = -dx;
        step_x = -1;
    } else {
        step_x = 1;
    }
    if dy < 0 {
        dy = -dy;
        step_y = -1;
    } else {
        step_y = 1;
    }
    dx2 = dx << 1;
    dy2 = dy << 1;
    line.push((x1, y1));
    if dx > dy {
        err = dy2 - dx;
        while x1 != x2 {
            x1 += step_x;
            if err >= 0 {
                if overlap & LINE_OVERLAP_MAJOR != 0 {
                    line.push((x1, y1));
                }
                y1 += step_y;
                if overlap & LINE_OVERLAP_MINOR != 0 {
                    line.push((x1 - step_x, y1));
                }
                err -= dx2;
            }
            err += dy2;
            line.push((x1, y1));
        }
    } else {
        err = dx2 - dy;
        while y1 != y2 {
            y1 += step_y;
            if err >= 0 {
                if overlap & LINE_OVERLAP_MAJOR != 0 {
                    line.push((x1, y1));
                }
                x1 += step_x;
                if overlap & LINE_OVERLAP_MINOR != 0 {
                    line.push((x1, y1 - step_y));
                }
                err -= dy2;
            }
            err += dx2;
            line.push((x1, y1));
        }
    }
    line
}

Above I talked about scary Option #3. That's calling get_thick_line() with unsafe_unchecked = true. The unsafe version just directly casts an (isize, isize) vec into a (usize, usize) vec. If you are 100% damn sure that the entirety of your thick line lies within the canvas boundaries, you'll get a thick line with no duplicates and no crashing.

If, however, you cast negative signed coordinates into unsigned ones, and actually attempt to use them without checking them...Bad Things Will Happen™ (and if you have overflow checks enabled, the app will crash before you can even attempt to crash the app in a different way)

Here are some example images produced using the code, showing proper handling of out of bounds coordinates:

fat_line_0

fat_line_1

fat_line_2

fat_line_3

fat_line_4

fat_line_5

I'll have a crude little benchmark further below. First though, a little tidbit about a problem I had testing the code originally. Thick lines in the middle of the canvas were like the ones above, perfectly fine. But as parts of the line got further out of bounds, the line would take on a trapezoid shape, or sometimes sort of "bounce" off the boundary of the canvas.

Here's an example of the trapezoid anomaly (notice the starting points in yellow are off-center as well):

fat_line_weird

I eventually tracked it down to the beginning of the drawLineOverlap function here. So I snipped out my version of this code:

    if (aXStart >= LOCAL_DISPLAY_WIDTH) {
        aXStart = LOCAL_DISPLAY_WIDTH - 1;
    }

    if (aXEnd >= LOCAL_DISPLAY_WIDTH) {
        aXEnd = LOCAL_DISPLAY_WIDTH - 1;
    }

    if (aYStart >= LOCAL_DISPLAY_HEIGHT) {
        aYStart = LOCAL_DISPLAY_HEIGHT - 1;
    }

    if (aYEnd >= LOCAL_DISPLAY_HEIGHT) {
        aYEnd = LOCAL_DISPLAY_HEIGHT - 1;
    }

...and the issue disappeared. I'm not sure if it's actually a bug in the implementation I copied, or if it was my fault in how I ported the code. All I know is that it's fixed now in my code =)

Crude benchmark code:

fn test_fat_lines(tests: usize, map_width: usize, line_width: usize) {
    println!("----------------------- [ test begin ] -----------------------");
    let mut rng = nanorand::tls_rng();
    let mut points1: Vec<(usize, usize)> = vec![];
    let mut points2: Vec<(usize, usize)> = vec![];
    let mut line: Vec<(usize, usize)>;
    let mut len: usize = 0;
    for _ in 0..tests {
        points1.push((rng.generate_range(0..map_width), rng.generate_range(0..map_width)));
        points2.push((rng.generate_range(0..map_width), rng.generate_range(0..map_width)));
    }

    let start = Instant::now();
    for x in 0..tests {
        line = utils::get_thick_line_unchecked(
            points1[x],
            points2[x],
            line_width,
            utils::ThicknessMode::LineThicknessMiddle,
            map_width,
        );
        len += line.len();
    }
    let duration = start.elapsed();
    println!(
        "{} iterations for get_thick_line_unchecked() took: {:?}",
        tests, duration
    );
    println!(
        "total pixels = {}, line width = {}, canvas size = {}x{}\n",
        len, line_width, map_width, map_width
    );


    len = 0;
    let start = Instant::now();
    for x in 0..tests {
        line = utils::get_thick_line(
            points1[x],
            points2[x],
            line_width,
            utils::ThicknessMode::LineThicknessMiddle,
            map_width,
            false,
        );
        len += line.len();
    }
    let duration = start.elapsed();
    println!(
        "{} iterations for get_thick_line() took: {:?}",
        tests, duration
    );
    println!(
        "total pixels = {}, line width = {}, canvas size = {}x{}\n",
        len, line_width, map_width, map_width
    );

    len = 0;
    let start = Instant::now();
    for x in 0..tests {
        line = utils::get_thick_line(
            points1[x],
            points2[x],
            line_width,
            utils::ThicknessMode::LineThicknessMiddle,
            map_width,
            true,
        );
        len += line.len();
    }
    let duration = start.elapsed();
    println!(
        "{} iterations for get_thick_line(unsafe unchecked) took: {:?}",
        tests, duration
    );
    println!(
        "total pixels = {}, line width = {}, canvas size = {}x{}\n",
        len, line_width, map_width, map_width
    );
}

And the results of a few tests with different parameters:

----------------------- [ test begin ] -----------------------
5000 iterations for get_thick_line_unchecked() took: 16.8303775s
total pixels = 1470765496, line width = 500, canvas size = 1000x1000

5000 iterations for get_thick_line() took: 22.3055861s
total pixels = 1348434370, line width = 500, canvas size = 1000x1000

5000 iterations for get_thick_line(unsafe unchecked) took: 16.1727516s
total pixels = 1470765496, line width = 500, canvas size = 1000x1000

----------------------- [ test begin ] -----------------------
5000 iterations for get_thick_line_unchecked() took: 5.7484512s
total pixels = 589019150, line width = 200, canvas size = 1000x1000

5000 iterations for get_thick_line() took: 7.6234731s
total pixels = 578641650, line width = 200, canvas size = 1000x1000

5000 iterations for get_thick_line(unsafe unchecked) took: 5.1384081s
total pixels = 589019150, line width = 200, canvas size = 1000x1000

----------------------- [ test begin ] -----------------------
10000 iterations for get_thick_line_unchecked() took: 1.8566417s
total pixels = 294929612, line width = 50, canvas size = 1000x1000

10000 iterations for get_thick_line() took: 2.2578805s
total pixels = 294517820, line width = 50, canvas size = 1000x1000

10000 iterations for get_thick_line(unsafe unchecked) took: 1.6420243s
total pixels = 294929612, line width = 50, canvas size = 1000x1000

----------------------- [ test begin ] -----------------------
50000 iterations for get_thick_line_unchecked() took: 1.5864814s
total pixels = 288522185, line width = 10, canvas size = 1000x1000

50000 iterations for get_thick_line() took: 1.854512s
total pixels = 288501000, line width = 10, canvas size = 1000x1000

50000 iterations for get_thick_line(unsafe unchecked) took: 1.4808243s
total pixels = 288522185, line width = 10, canvas size = 1000x1000

The unsafe code is always fastest of course, but it's not that far behind the version with potential duplicates. In the pixel counts you can see how many pixels were removed by the version of the function which removes dupes. get_thick_line(unsafe unchecked) will ONLY have duplicates if it has been abused (like in my above tests).

That's all for today, code for this post and the previous one can be found here: https://github.com/0xDEADFED5/dwarf_dreams/tree/main/thick_bresenham.

Kudos and respect to ArminJo, without their code I'd probably still be scratching my head.

quick edit:

I ran the tests again, this time with coordinates firmly within the canvas, but with the same parameters as the previous tests. As expected, each function call returned the same number of pixels, but with differing performance. Here are the results:

----------------------- [ test begin ] -----------------------
5000 iterations for get_thick_line_unchecked() took: 22.2698107s
total pixels = 2500000000, line width = 500, canvas size = 1000x1000

5000 iterations for get_thick_line() took: 30.3468048s
total pixels = 2500000000, line width = 500, canvas size = 1000x1000

5000 iterations for get_thick_line(unsafe unchecked) took: 19.8017243s
total pixels = 2500000000, line width = 500, canvas size = 1000x1000

----------------------- [ test begin ] -----------------------
5000 iterations for get_thick_line_unchecked() took: 8.2815981s
total pixels = 1000000000, line width = 200, canvas size = 1000x1000

5000 iterations for get_thick_line() took: 11.9877238s
total pixels = 1000000000, line width = 200, canvas size = 1000x1000

5000 iterations for get_thick_line(unsafe unchecked) took: 7.7751189s
total pixels = 1000000000, line width = 200, canvas size = 1000x1000

----------------------- [ test begin ] -----------------------
10000 iterations for get_thick_line_unchecked() took: 3.6693278s
total pixels = 500000000, line width = 50, canvas size = 1000x1000

10000 iterations for get_thick_line() took: 6.2115697s
total pixels = 500000000, line width = 50, canvas size = 1000x1000

10000 iterations for get_thick_line(unsafe unchecked) took: 3.3808899s
total pixels = 500000000, line width = 50, canvas size = 1000x1000

----------------------- [ test begin ] -----------------------
50000 iterations for get_thick_line_unchecked() took: 1.749504s
total pixels = 500000000, line width = 10, canvas size = 1000x1000

50000 iterations for get_thick_line() took: 2.2008861s
total pixels = 500000000, line width = 10, canvas size = 1000x1000

50000 iterations for get_thick_line(unsafe unchecked) took: 1.4769065s
total pixels = 500000000, line width = 10, canvas size = 1000x1000

Bresenham, splines, and nipples, oh my!

Site update: added Yandex and Twitter login support to the comment system.

Well, I've been testing lots of different hydraulic erosion simulation code on my noise based terrain. I started off with a pretty complex sim I put together that modeled water sources, water quantity, water velocity, amount of sediment carried in the water, soil deposition rates, etc. I tested this on 8000x8000 maps of f32 data.

Even though my code is fully threaded (current machine: 8 core/16 thread), my simulations were slow. Incredibly slow. Like possibly approaching the heat death of the universe for my simulation to end slow. After several iterations of simplifying the simulation I had a realization. I don't need some crazy full map erosion process that takes forever. I mostly want mountains that don't suck and realistic looking rivers. After I get that nailed I can get to work implementing terrain for specific biomes.

My new simplified river plan is this, for each river:

  • Starting point is a mountain peak which receives precipitation
  • Chart a path downhill
  • Widen river as conditions are met
  • End the river at an existing body of water, a depression to form a lake, or the edge of map
  • Also, erosion only at this time
  • When water physics is added in I'll add evaporation and take another look at sediment deposits
  • A modest speed increase: I was sharing one map between 16 threads and using reader/writer locking semantics. I didn't measure contention, but I suspected a fair amount. So now I create 16 copies of the map, give each thread their own version to work on, and merge the maps back together at the end. I got around a 10% performance boost. RAM is meant to be used right? ....right?

I'm not going to traverse every single tile/pixel downhill. First off, it's friggin' slow. Secondly, it produces some weird shaped rivers. That can be fixed by running a gazillion iterations, but that's one of the things I'm trying to avoid. So instead I'm going to search for the lowest point in a set radius, draw a line to it, and keep doing that until I have a river of attached line segments.

First I need to be able to draw a line from x1, y1 to x2, y2, preferably as quickly as possible. Luckily for me, some guy named Jack Elton Bresenham came up with a fast algorithm called Bresenham's line algorithm back in 1962.

This page has some cool examples of Bresenham's algorithm in C. (for some reason uBlock hates this page now, but as of this blog post date that page is safe and ad-free)

Here's my Rust implementation based on the code from that page:

// adapted from http://members.chello.at/%7Eeasyfilter/bresenham.html
#[inline]
pub fn get_line_unchecked(from: (usize, usize), to: (usize, usize)) -> Vec<(usize, usize)> {
    let mut line: Vec<(usize, usize)> = vec![];
    let mut ix0 = from.0 as isize;
    let mut iy0 = from.1 as isize;
    let ix1 = to.0 as isize;
    let iy1 = to.1 as isize;
    let dx: isize = ix1.abs_diff(ix0) as isize;
    let mut sx: isize = 1;
    if ix0 > ix1 {
        sx = -1;
    }
    let dy: isize = -(iy1.abs_diff(iy0) as isize);
    let mut sy: isize = 1;
    if iy0 > iy1 {
        sy = -1
    }
    let mut err = dx + dy;
    let mut e2: isize = 0;
    loop {
        line.push((ix0 as usize, iy0 as usize));
        if ix0 == ix1 && iy0 == iy1 {
            break;
        }
        e2 = 2 * err;
        if e2 >= dy {
            err += dy;
            ix0 += sx;
        }
        if e2 <= dx {
            err += dx;
            iy0 += sy;
        }
    }
    line
}

I called it get_line_unchecked because it's naive. I intend to draw lines on a specific canvas/map size within only the positive coordinate plane. This function doesn't know that. It just gives me a vector of points from one point to the other. Since I intend to use each of these points as a direct memory index into my map I want to make sure I return valid points. If a line extends off my map and I try to use that as a map index I'm going to experience 1 of 2 possible errors: Best case scenario, I index the wrong location on the map and modify something where I didn't intend to. Worst case scenario is I attempt to index beyond my maximum map size, which results in an array out of bounds error, or memory access exception, or whatever happens in Rust...and then the application crashes. Also, this function uses signed integers (to support the entire coordinate plane), so if I get a negative coordinate and cast it to an unsigned integer for my map index one of those 2 possible errors I just wrote about is going to happen. It's definitely more likely to be the crashy kind of error though.

So I updated the code a bit. I still use that previous code when I know it's safe and I want extra speed. Otherwise I use this one:

// adapted from http://members.chello.at/%7Eeasyfilter/bresenham.html
/// if remove_out_of_bounds = false, points outside of map will be placed on edge of map
/// if remove_out_of_bounds = true, points outside of map will be ignored
#[inline]
pub fn get_line(
    from: (usize, usize),
    to: (usize, usize),
    map_width: usize,
    remove_out_of_bounds: bool,
) -> Vec<(usize, usize)> {
    let mut line: Vec<(usize, usize)> = vec![];
    let mut ix0 = from.0 as isize;
    let mut iy0 = from.1 as isize;
    let ix1 = to.0 as isize;
    let iy1 = to.1 as isize;
    let dx: isize = ix1.abs_diff(ix0) as isize;
    let mut sx: isize = 1;
    let imap_width = map_width as isize;
    let mut clamped: bool = false;
    let mut try_clamp = |coord: isize| -> usize {
        if coord > imap_width - 1 {
            clamped = true;
            map_width - 1
        } else if coord < 0 {
            clamped = true;
            0
        } else {
            coord as usize
        }
    };

    if ix0 > ix1 {
        sx = -1;
    }
    let dy: isize = -(iy1.abs_diff(iy0) as isize);
    let mut sy: isize = 1;
    if iy0 > iy1 {
        sy = -1
    }
    let mut err = dx + dy;
    let mut e2: isize = 0;
    let mut skip_point: bool;
    loop {
        skip_point = false;
        if remove_out_of_bounds
            && (ix0 > imap_width - 1 || ix0 < 0 || iy0 > imap_width - 1 || iy0 < 0)
        {
            skip_point = true;
        }
        if !skip_point {
            line.push((try_clamp(ix0), try_clamp(iy0)))
        };
        if ix0 == ix1 && iy0 == iy1 {
            break;
        }
        e2 = 2 * err;
        if e2 >= dy {
            err += dy;
            ix0 += sx;
        }
        if e2 <= dx {
            err += dx;
            iy0 += sy;
        }
    }
    if !remove_out_of_bounds && clamped {
        line.sort();
        line.dedup();
    }
    line
}

My points are usize tuples because for better or worse, Rust forces slice indexes to be usize... and I use my points as indexes. I only have a width parameter because my game maps are square. I use the try_clamp closure so that I can keep track if any points have extended beyond the map. I keep track because if I'm not removing out of bounds map points and placing them on the map edge I could end up with duplicate points. The whole reason I want these points is so I can manipulate those parts of the map. I don't want to perform those manipulations multiple times at the same location on accident. Therefore, if any point locations have been clamped, I know that I need to sort my vector and remove possible duplicates.

Here are some totally random lines representing a sort of rudimentary river:

line_grid_scaled

Okay, next thing I need to do is figure out how to select a circle of pixels/tiles. For every point along my "river", depending on how wide the river is, I want to select a circle of tiles, and erode earth from that circle, eroding less earth the further out from the center of the circle that I go. I sort of picture it like dragging an ice cream cone shape along the path of the river, carving out the earth using that shape.

On that page I shared earlier there's a version of Bresenham's algorithm for circles. Here's my Rust version for a filled circle:

#[inline]
//adapted from http://members.chello.at/%7Eeasyfilter/bresenham.html
pub fn get_full_circle(xm: usize, ym: usize, mut r: isize, width: usize) -> Vec<(usize, usize)> {
    let xm = xm as isize;
    let ym = ym as isize;
    //let mut r = r as isize;
    let width2 = width as isize;

    let mut x = -r;
    let mut y = 0;
    let mut err: isize = 2 - 2 * r;
    let mut empty_circle: Vec<(usize, usize)> = vec![];
    let mut full_circle: Vec<(usize, usize)> = vec![];
    let mut clamped: bool = false;
    // need to know if clamp has ocurred, as this can create duplicate coordinates
    let mut try_clamp = |coord: isize| -> usize {
        if coord < 0 {
            clamped = true;
            0
        } else if coord >= width2 {
            clamped = true;
            width - 1
        } else {
            coord as usize
        }
    };
    while x < 0 {
        empty_circle.push((try_clamp(xm - x), try_clamp(ym + y)));
        empty_circle.push((try_clamp(xm - y), try_clamp(ym - x)));
        empty_circle.push((try_clamp(xm + x), try_clamp(ym - y)));
        empty_circle.push((try_clamp(xm + y), try_clamp(ym + x)));
        r = err;
        if r <= y {
            y += 1;
            err += y * 2 + 1;
        }
        if r > x || err > y {
            x += 1;
            err += x * 2 + 1;
        }
    }
    if empty_circle.len() == 0 {
        return empty_circle;
    }
    // sort by X axis. this allows easy dupe removal and quickly getting coords inside the circle
    empty_circle.sort();
    if clamped {
        empty_circle.dedup();
    }
    let mut lowest = usize::MAX;
    let mut highest = 0;
    let mut current = empty_circle.first().unwrap().0;
    //find lowest Y and highest Y for each X coordinate, and grab all the coords between (inclusive)
    for p in empty_circle {
        if p.0 != current {
            for y in lowest..=highest {
                full_circle.push((current, y));
            }
            lowest = usize::MAX;
            highest = 0;
            current = p.0;
        }
        if p.1 < lowest {
            lowest = p.1;
        }
        if p.1 > highest {
            highest = p.1;
        }
    }
    for y in lowest..=highest {
        full_circle.push((current, y));
    }
    full_circle
}

This code looks quite a bit different because I want a filled circle, not just a circle outline. Basically for each horizontal (X) step across the circle diameter I make, I fill in the whole range of vertical (Y) pixels. You could do it the other way around too, of course. I think it could be optimized a bit, but it works. Oh yeah, and like before I've got my clamping code in there in case I try to get a filled circle too close to the map edge.

Here's an example:

bresenham_filled_circle

I'm going to be drawing lots of circles though, and I wasn't sure if that was the fastest way to get a filled circle. So I wrote a more naive version. Basically, if I want a filled circle of radius R, first get the tiles/pixels in a square 2R X 2R centered where I want the circle. Then calculate the distance of each pixel from the center. If it's less than or equal to the circle radius I want, then it's inside the circle.

Here's my code for that version:

#[inline]
pub fn get_full_circle_naive(
    xm: usize,
    ym: usize,
    r: f32,
    width: usize,
    with_nipples: bool,
    ignore_center: bool,
) -> Vec<(usize, usize)> {
    let mut full_circle: Vec<(usize, usize)> = vec![];
    let mut nipples: Vec<(usize, usize)> = vec![];
    let startx: usize;
    let starty: usize;
    let endx: usize;
    let endy: usize;
    let r2 = r as usize;
    if xm >= r2 {
        startx = xm - r2;
    } else {
        startx = 0;
    }
    if ym >= r2 {
        starty = ym - r2;
    } else {
        starty = 0;
    }
    if width > xm + r2 {
        endx = xm + r2
    } else {
        endx = width - 1;
    }
    if width > ym + r2 {
        endy = ym + r2;
    } else {
        endy = width - 1;
    }
    let mut dist: f32;
    for x in startx..=endx {
        for y in starty..=endy {
            if ignore_center && x == xm && y == ym {
                continue;
            }
            dist = distance((x, y), (xm, ym));
            if dist < r {
                full_circle.push((x, y));
            } else if with_nipples && dist == r {
                nipples.push((x, y));
            }
        }
    }
    if with_nipples {
        full_circle.append(&mut nipples);
    }
    full_circle
}

Oh right, and my terribly complex distance function:

#[inline(always)]
pub fn distance(pos1: (usize, usize), pos2: (usize, usize)) -> f32 {
    let x1 = pos1.0 as f32;
    let x2 = pos2.0 as f32;
    let y1 = pos1.1 as f32;
    let y2 = pos2.1 as f32;
    ((x1 - x2).powi(2) + (y1 - y2).powi(2)).sqrt()
}

Here's what it looks like rendered:

naive_filled_circle_with_nipples

Well, that's what it looked like with the first version I made. I call it the nipple version because well...I mean look at it. I usually like nipples, but not those ones. So I modified the code a bit, version 2 is what you see above. And here's an example of calling the function + with_nipples = false:

naive_filled_circle

Much better. The Bresenham one still looks the best IMO though. What about performance?

Here's my janky test code (using nanorand crate). I didn't look at the assembly code that was emitted, but I tried to stop the compiler from cheating:

fn test_circles(test_num: usize, r: isize) {
    debug_assert!(r > 0);
    let rf = r as f32;
    let mut points: Vec<(usize, usize)> = vec![];
    let mut rng = nanorand::tls_rng();
    for _ in 0..test_num {
        points.push((rng.generate_range(0..10_000), rng.generate_range(0..10_000)));
    }
    let mut circle: Vec<(usize, usize)>;
    let mut len: usize = 0;
    let start = Instant::now();
    for x in 0..test_num {
        circle = utils::get_full_circle(points[x].0, points[x].1, r, 11000);
        len += circle.len();
    }
    let duration = start.elapsed();
    println!("len = {}", len);
    println!(
        "{} iterations for utils::get_full_circle() took: {:?}",
        test_num, duration
    );

    len = 0;
    let start = Instant::now();
    for x in 0..test_num {
        circle = utils::get_full_circle_naive(points[x].0, points[x].1, rf, 11000, true, false);
        len += circle.len();
    }
    let duration = start.elapsed();
    println!("len = {}", len);
    println!(
        "{} iterations for utils::get_full_circle_naive() took: {:?}",
        test_num, duration
    );
}

I tested with a large number of smallish circles, and a smaller number of larger circles, results were not terribly different. In debug mode with opt-level = 0, the naive filled circle code is just over 4x slower! Ouch. I tried integer math that I ported from C for sqrt and distance calculations, and it was freakishly slower yet. In release mode with opt-level = 3, the bresenham filled circle code is 28% faster with MSVC toolchain, and it's about 10% faster with GNU toolchain. Interestingly, the integer code is slower with GNU, and the floating point code is faster with GNU, explaining the narrowed performance difference.

This brings me back to the first picture I posted above. My sharp angled, not very river-like line. This thing:

line_grid_scaled

I'm no riverologist, but I don't think rivers usually have those sharp angles. So now I need to smooth em out. I did a bit of searching and it looks like I want some cubic splines. I used the first crate I found, cubic_spline and after running those same points through it, here's my result:

spline_grid_scaled

Much better. Of course, after I finished up that rendering code, I found another crate called splines. I haven't tested it yet, but I will, because I'm curious about any differences. Next post I'll hopefully have some mountainy mountains and rivery rivers to show off.

In other news, rust-analyzer started doing this thing where it would reindex everything every new line of code I typed. Sucked a lot. I trashed everything Rust related on the machine and started out clean, now my coding sanity is restored. Well, as much as it could be anyhow.