alxhill.dev


Rust Dev Log

Reverse chronological dev-log of my journey learning Rust in the hopes it’s useful to someone (me) someday (when I inevitably forget how I solved something a week earlier).


Final Ray Tracer

After compiling the ray tracer to WASM and adding some small additional JS, you can now run it live in any modern browser:

click to toggle rendering

Pretty pleased with the end result - works well enough and I started learning a new language by building a medium-complexity project.

The performance numbers are also decent - roughly 550ms per frame on my M2 Air and, eyeballing it, maybe 50% slower (about 800ms) on an iPhone 14 Pro. For unoptimised, single-threaded code running on a CPU, that seems reasonable.

Amusingly, the hardest part of getting it working on the web was not the Rust or WASM pieces, but browser API differences - Chrome and Safari behave slightly differently when re-rendering an ImageData whose underlying Uint8ClampedArray data has changed. Chrome allows re-using all the objects, but for Safari I had to re-create the wrappers around the data each time.

2023-03-25

2023-02-26

Briefly spent some time parallelising the renderer. As a naive approach, used a library called rayon to switch from iterating over one pixel at a time to a “parallel iterator” which can handle distributing work across threads, with work-stealing to avoid downtime.

There were two pain-points here:

  1. I couldn’t use mutable objects when sharing them across threads. This affected the sampler (which updates an index) and the output image view.
  2. The Scene object couldn’t safely be shared between threads as it didn’t implement Sync.

So my first attempt didn’t compile:

pub fn render_parallel<R: Renderable + Send + Sync, T: RenderTarget + Send + Sync>(
    view_plane: &mut ViewPlane,
    mut renderer: R,
    mut img: T,
) {
    ParallelIterator::for_each(view_plane.into_par_iter(), move |xy| {
        // error: img cannot be mutably borrowed
        img.set_pixel(&xy, &renderer.render_pixel(&xy));
    });
}

I solved the first issue by making the sampler store an AtomicU32 instead of a raw u32. This means the index can be mutated (safely) through a shared reference, without needing a mutable one. For the output image, I switched from a for_each to map + collect, and then wrote the results to the image buffer at the end.
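As a minimal sketch of the AtomicU32 idea (the Sampler type, its fields and draw_in_parallel are hypothetical stand-ins, not the real sampler from the project), the index can be advanced through &self, which is what lets several threads share one sampler:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical sampler: hands out values from a precomputed buffer.
pub struct Sampler {
    samples: Vec<f64>,
    // Interior mutability: the index can be advanced through a shared `&self`.
    next: AtomicU32,
}

impl Sampler {
    pub fn new(samples: Vec<f64>) -> Self {
        assert!(!samples.is_empty(), "sampler needs at least one sample");
        Sampler { samples, next: AtomicU32::new(0) }
    }

    /// Returns the next sample, wrapping around at the end of the buffer.
    /// Note the receiver is `&self`, not `&mut self`.
    pub fn next_sample(&self) -> f64 {
        // fetch_add returns the previous value, so each caller gets its own index.
        let i = self.next.fetch_add(1, Ordering::Relaxed) as usize;
        self.samples[i % self.samples.len()]
    }
}

/// Draws `per_thread` samples on each of `threads` threads and returns
/// the total number of samples handed out.
pub fn draw_in_parallel(sampler: &Sampler, threads: usize, per_thread: usize) -> u32 {
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    sampler.next_sample();
                }
            });
        }
    });
    sampler.next.load(Ordering::Relaxed)
}
```

Relaxed ordering is assumed to be enough here because the counter only picks an index and does not guard any other data. Every call still touches the same shared counter, which is one plausible source of contention when it is hit several times per pixel.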

The compiler helpfully informed me that the issue with Scene not implementing Sync was all the way down in my Object type:

   = help: the trait Sync is not implemented for (dyn Shadeable + 'static)
   = note: required for Arc<(dyn Shadeable + 'static)> to implement Sync
   = note: required because it appears within the type Option<Arc<(dyn Shadeable + 'static)>>
   = note: required because it appears within the type Object
   = note: required for Unique<Object> to implement Sync
   = note: required because it appears within the type alloc::raw_vec::RawVec<Object>
   = note: required because it appears within the type Vec<Object>
   = note: required because it appears within the type rust_raytracing::Scene
   = note: required because it appears within the type &rust_raytracing::Scene
   = note: required because it appears within the type RenderContext<'_, rust_raytracing::MultiJittered, rust_raytracing::PinholeCamera>
note: required by a bound in render_parallel

The core issue is that Object looks like this:

#[derive(Debug)]
pub struct Object {
    pub geometry: Geometry,
    pub material: Arc<dyn Shadeable>,
}

Specifically, the trait Shadeable can be implemented by any type - including ones that are not safe to send between threads. At first I thought this was inherent in how dyn worked, so I rewrote the Material system to be an enum instead of an Arc. This worked fine, but does mean that Materials have to take up much more space than they need (e.g. a Normal material has no fields, while a Phong has to leave space for two Lambertian BRDFs, a glossy BRDF and an optional PerfectlySpecular BRDF - a total of 5 doubles and 4 colors). At the scale of this project, that doesn’t really matter, but it seems preferable to share materials rather than copy them around with so much extra padding.
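The size cost of the enum approach can be checked with std::mem::size_of. This is a rough sketch under assumed layouts: Color, Normal, Phong and Material below are simplified stand-ins (5 doubles and 4 three-component colors for Phong), not the project's real definitions.

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical stand-ins; the real material types have different fields.
#[derive(Clone, Copy)]
pub struct Color {
    pub r: f64,
    pub g: f64,
    pub b: f64,
}

pub struct Normal; // no fields, so zero-sized

pub struct Phong {
    pub coefficients: [f64; 5],
    pub colors: [Color; 4],
}

// An enum must reserve room for its largest variant.
pub enum Material {
    Normal(Normal),
    Phong(Phong),
}

pub trait Shadeable {}
impl Shadeable for Normal {}
impl Shadeable for Phong {}

/// Returns the sizes in bytes of Normal, Phong, the Material enum,
/// and an Arc<dyn Shadeable> handle, in that order.
pub fn sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<Normal>(),
        size_of::<Phong>(),
        size_of::<Material>(),
        size_of::<Arc<dyn Shadeable>>(),
    )
}
```

With these assumed layouts every Material value is at least as large as a Phong, even when it holds a Normal, whereas an Arc<dyn Shadeable> is a fat pointer (data pointer plus vtable pointer) regardless of which material it points at.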

The alternative solution was much simpler - make Shadeable require Sync + Send. Because none of the Materials do any mutation, no other code needed to change:

pub trait Shadeable: Debug + Sync + Send {
    fn shade(&self, hit: Hit, scene: &Scene, depth: Depth) -> RGBColor;
}
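To illustrate why the supertrait bound is enough, here is a small sketch with a simplified Shadeable (it returns an f64 instead of an RGBColor, and Matte and shade_on_threads are hypothetical names). Once the trait requires Send + Sync, the trait object itself is Send + Sync, so an Arc<dyn Shadeable> can be cloned into spawned threads:

```rust
use std::fmt::Debug;
use std::sync::Arc;
use std::thread;

// Simplified stand-in for the real trait: returns a brightness value
// rather than an RGBColor, and takes no hit/scene/depth arguments.
pub trait Shadeable: Debug + Send + Sync {
    fn shade(&self) -> f64;
}

#[derive(Debug)]
pub struct Matte {
    pub albedo: f64,
}

impl Shadeable for Matte {
    fn shade(&self) -> f64 {
        self.albedo
    }
}

// A material holding e.g. an Rc or a RefCell would now be rejected at its
// `impl Shadeable`, because such a type is not both Send and Sync.

/// Shades the same material on `threads` separate threads.
pub fn shade_on_threads(material: Arc<dyn Shadeable>, threads: usize) -> Vec<f64> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let m = Arc::clone(&material);
            // Compiles only because `dyn Shadeable` is Send + Sync.
            thread::spawn(move || m.shade())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

The trade-off is that the restriction moves to implementors: any future material that wants interior mutability would need thread-safe primitives such as atomics or a Mutex.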

Now the code compiles and runs, with the following render_parallel implementation:

pub fn render_parallel<R: Renderable + Sync, T: RenderTarget>(
    view_plane: &ViewPlane,
    renderer: &R,
    img: &mut T,
) {
    let pixels: Vec<(ViewXY, RGBColor)> = ParallelIterator::map(view_plane.into_par_iter(), |xy| {
        (xy, renderer.render_pixel(&xy))
    })
    .collect();
    for pixel in pixels {
        img.set_pixel(&pixel.0, &pixel.1);
    }
}

So, is it faster? Not much!

Performance

Previously, I’d migrated from a dynamic sampling architecture (which generates values on the fly) to a static one, which precomputes a buffer of samples and moves through them. This is more flexible, allowing for samplers that generate multiple values in one go (e.g. MultiJittered, N-Rooks), but trading compute for memory reads has had a negative impact on performance - from about 130ms per frame to 250ms.

I was hoping that using Rayon would get us some much needed perf gains for each frame, but sadly while it did cut frame times by about 50ms, we’re still well above where they were before and far below where you’d want to be given 8x the computing power.

I haven’t tested yet, but I’d assume that single pixels are too small a unit of compute to make up for the overhead of using threads - or that the AtomicU32 is slowing things down (it does get used multiple times per-pixel at the moment).
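One way to test the "pixels are too small a unit" hypothesis would be to hand each thread a band of rows rather than single pixels. The following is only a sketch using std::thread::scope instead of rayon, with a hypothetical shade function and a flat u32 buffer standing in for render_pixel and the real image type:

```rust
use std::thread;

/// Hypothetical per-pixel function standing in for render_pixel.
fn shade(x: usize, y: usize) -> u32 {
    (x + y * 1000) as u32
}

/// Fills `buffer` (row-major, `width` pixels per row), giving each thread
/// a contiguous band of rows to work on.
pub fn render_rows(buffer: &mut [u32], width: usize, threads: usize) {
    assert!(width > 0 && threads > 0);
    assert_eq!(buffer.len() % width, 0, "buffer must hold whole rows");
    let height = buffer.len() / width;
    // Rows per band, rounded up so every row is covered.
    let rows_per_band = ((height + threads - 1) / threads).max(1);

    thread::scope(|s| {
        // chunks_mut yields disjoint mutable slices, so each thread can
        // write its own band without locks or atomics.
        for (band, chunk) in buffer.chunks_mut(rows_per_band * width).enumerate() {
            s.spawn(move || {
                let first_row = band * rows_per_band;
                for (i, px) in chunk.iter_mut().enumerate() {
                    let (x, y) = (i % width, first_row + i / width);
                    *px = shade(x, y);
                }
            });
        }
    });
}
```

Because the bands are disjoint mutable slices, this also sidesteps the original "img cannot be mutably borrowed" error without a collect step. Rayon's slice adaptors (par_chunks_mut) offer a similar shape while keeping work-stealing; whether either is actually faster here would need measuring.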

2023-02-17

Rendering to Canvas in JS is now working in the browser through WASM:

shiny spheres in a browser

Overall this was pretty smooth sailing. Barring some small self-inflicted bugs, it’s pretty easy to interoperate between the two languages (it’s particularly nice that JS’s ImageData supports memory buffers natively, meaning the whole thing is zero-copy). The part that’s surprised me most is this:

pub fn pixels(&self) -> *const Rgba {
    // buffer is a Vec<Rgba>
    self.target.buffer.as_ptr()
}

This directly returns a pointer to the underlying Vec memory, which seemed like it should be unsafe/discouraged given Rust’s focus on memory safety. However, it’s dereferencing the pointer that’s the unsafe action - sending it around is fine so long as it doesn’t get used. Then, when it’s sent to JS, it gets used like this:

let pixels_array = new Uint8ClampedArray(memory.buffer, scene.pixels(), scene.width() * scene.height() * 4);

So JavaScript effectively has full access to Rust’s memory, with the data sent between the two languages via raw pointers. I’m sure there are also typed ways to do this, so will investigate that in future.
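The split between creating a pointer (safe) and reading through it (unsafe) can be shown entirely on the Rust side. In this sketch, Rgba, Target and bytes are assumed shapes rather than the project's real types; bytes does roughly what the Uint8ClampedArray view does in JS:

```rust
use std::slice;

// Assumed layout: four u8 channels, repr(C) so the field order is fixed.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rgba {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub a: u8,
}

pub struct Target {
    pub buffer: Vec<Rgba>,
}

impl Target {
    /// Safe: this only produces an address, it never reads through it.
    pub fn pixels(&self) -> *const Rgba {
        self.buffer.as_ptr()
    }

    /// Views the pixels as raw bytes, 4 per pixel.
    pub fn bytes(&self) -> &[u8] {
        let ptr = self.pixels() as *const u8;
        // SAFETY: Rgba is repr(C) with four u8 fields (size 4, no padding),
        // the pointer comes from a live Vec that stays borrowed for the
        // lifetime of the returned slice, and the length covers exactly
        // the Vec's initialised elements.
        unsafe { slice::from_raw_parts(ptr, self.buffer.len() * 4) }
    }
}
```

The safety comment lists what the caller of unsafe has to guarantee by hand. On the JS side nothing checks those guarantees, so the view is only valid while the Vec has not been reallocated or dropped.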

When I first checked performance, it was running at ~800ms per frame compared to ~140ms per frame when running natively. Surprisingly, this was mostly due to the console being open - when I closed it, the frame time dropped to ~260ms (see screenshot).

2023-02-15

Decided to try compiling the current ray tracer to WASM, to see both how hard it is and what the perf is like.

2023-02-06

I spent some time trying to solve the lifetime issue with the allocator/scene. Here’s the outline of what I was trying to get working:

let scene_arena = Bump::new();
// the scene borrows the arena, so it cannot outlive it
let mut scene = Scene::new(&scene_arena);

let obj = scene_arena.alloc(Object::new(...));
// other allocations

scene.add_object(obj);

// error: scene_arena does not live long enough
pixel_canvas.render(move |image| {
    render_to(&scene, &mut CanvasTarget::new(image))
})

So the state of the world is:

This code fails to compile as scene_arena does not live long enough.

I spent a long time trying to fix this and tried a number of solutions that didn’t work.

Scene was defined as:

#[derive(Debug)]
pub struct Scene<'w> {
    pub objects: Vec<&'w Object<'w>>,
    pub lights: Vec<&'w Light>,
    pub bg_color: RGBColor,
}

So my first thought was that we could switch this to:

#[derive(Debug)]
pub struct Scene {
    pub objects: Vec<Object>,
    pub lights: Vec<Light>,
    pub bg_color: RGBColor,
}

Unfortunately, it’s not that simple - the definition of Object is:

#[derive(Debug)]
pub struct Object<'w> {
    pub geometry: Geometry,
    pub material: &'w dyn Shadeable,
}

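This means that Vec<Object> has the same problem as Scene - someone still needs to own the material that each Object references. Generics can’t help us here, as every element of a Vec<Object<T>> must share the same T, while the scene needs to mix different material types. And Rust doesn’t allow pub material: dyn Shadeable here, as a trait object has no size known at compile time, so an Object containing one directly couldn’t be stored in a Vec.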

Ultimately, I switched back to using Arc<dyn Shadeable> inside the object instead of a reference, and then was finally able to move the scene into the closure and run the render function on it. This also made me realise that the arena wasn’t doing anything, as at the end of the day it was owned entirely by the main() function. So, I removed it and switched to constructing Arcs for the materials and making Objects that get moved directly into the Scene.
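A stripped-down sketch of that final shape: Object owns an Arc<dyn Shadeable>, Scene owns its Objects, and so the whole scene can be moved into a closure with a 'static bound. All names here are simplified stand-ins, and run_frames is a hypothetical replacement for pixel_canvas.render that just calls the closure a few times:

```rust
use std::fmt::Debug;
use std::sync::Arc;

// Simplified trait: returns a brightness value instead of an RGBColor.
pub trait Shadeable: Debug {
    fn shade(&self) -> f64;
}

#[derive(Debug)]
pub struct Matte(pub f64);

impl Shadeable for Matte {
    fn shade(&self) -> f64 {
        self.0
    }
}

#[derive(Debug)]
pub struct Object {
    // Shared ownership: no lifetime parameter needed on Object.
    pub material: Arc<dyn Shadeable>,
}

#[derive(Debug, Default)]
pub struct Scene {
    pub objects: Vec<Object>,
}

/// Hypothetical stand-in for a render loop that demands a 'static callback.
pub fn run_frames<F: FnMut() -> f64 + 'static>(mut frame: F, n: usize) -> Vec<f64> {
    (0..n).map(|_| frame()).collect()
}

pub fn build_and_render() -> Vec<f64> {
    let shared: Arc<dyn Shadeable> = Arc::new(Matte(0.25));
    let mut scene = Scene::default();
    // Two objects share one material via reference counting.
    scene.objects.push(Object { material: Arc::clone(&shared) });
    scene.objects.push(Object { material: shared });

    // `scene` borrows nothing, so it can be moved into the 'static closure.
    run_frames(
        move || scene.objects.iter().map(|o| o.material.shade()).sum::<f64>(),
        2,
    )
}
```

The design choice is that Arc trades a reference count and a heap allocation per material for not having to thread a lifetime through Scene, Object and the render closure.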

Other solutions would be making Scene own the materials and providing a reference or borrow of a material to an object when it gets created. Still not sure how this would fit with generics though. Alternatively, could switch to using an enum implementing Shadeable for materials instead of dyn Shadeable, which is less efficient but avoids the Sized issue with dynamic traits.

2023-02-05

four shiny spheres with shadows

2023-02-04

2023-02-02

three spheres with materials

2023-01-22

two spheres and a plane

2023-01-17

2023-01-16

a red circle

2023-01-14

2023-01-13

Copilot generated hit function

2022-04-01

2022-??-??

Adding some writeups here a few days after actually trying to use Rust