Arena-Based Allocation
2019-10-08
Render Times
Let's start with a simple comparison between the original C++ code and the current Rust version. We are going to render a scene which needs some time to finish and results in this image:
Both versions render exactly the same image (pixel by pixel):
> imf_diff anim-bluespheres_cpp.exr anim-bluespheres_rust.exr
anim-bluespheres_cpp.exr anim-bluespheres_rust.exr: no differences.
== "anim-bluespheres_cpp.exr" and "anim-bluespheres_rust.exr" are identical
As a side note: The rs-pbrt renderer can be compiled to render OpenEXR images, but the default is to render Portable Network Graphics (PNG) files.
I opened an issue where I documented the render times on my laptop (a Linux machine with 4 processors).
C++
> time ~/builds/pbrt/release/pbrt anim-bluespheres.pbrt
14m58.087s
Rust
> time ~/git/self_hosted/Rust/pbrt/target/release/rs_pbrt -i anim-bluespheres.pbrt
24m7.235s
Suggestion
So, that's about 15 minutes vs. 24 minutes, which makes the Rust version roughly 1.6 times slower. My current assumption is that the C++ code uses arena-based allocation and the current Rust code does not, which might have a big influence on performance. So let's investigate that a bit ...
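To put a number on the gap, here is a small sketch that converts the two readings above into seconds and divides them. The function and constant names are made up for this example; only the two timings come from the runs shown above.

```cpp
#include <cassert>
#include <cstdio>

// Convert a `time`-style reading (minutes, seconds) into seconds.
constexpr double toSeconds(int minutes, double seconds) {
    return minutes * 60.0 + seconds;
}

// Readings taken from the two runs above.
constexpr double kCppSeconds = toSeconds(14, 58.087);   // 898.087 s
constexpr double kRustSeconds = toSeconds(24, 7.235);   // 1447.235 s

// How many times longer the Rust run took compared to the C++ run.
constexpr double slowdown() { return kRustSeconds / kCppSeconds; }

void printSlowdown() {
    assert(slowdown() > 1.0);  // the Rust run was the slower one
    std::printf("Rust/C++ = %.2f\n", slowdown());
}
```

Calling printSlowdown() from main() prints a ratio of about 1.61, so the Rust run takes roughly 60% longer on this scene.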
Debugger
Let's use a debugger to find some interesting places. First, the class MemoryArena is defined in the core/memory.h file:
class
...
MemoryArena {
    ...
    void *Alloc(size_t nBytes) {
        ...
    }
    template <typename T>
    T *Alloc(size_t n = 1, bool runConstructor = true) {
        ...
    }
    void Reset() {
        ...
    }
    ...
};
We are mainly interested in Alloc() and Reset() calls. The following define is relevant to find such calls:
#define ARENA_ALLOC(arena, Type) new ((arena).Alloc(sizeof(Type))) Type
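To make the macro less abstract, here is a minimal sketch of how it combines an arena with placement new. TinyArena and Sample are hypothetical stand-ins invented for this example: the arena is a single fixed buffer with a bump pointer, while pbrt's MemoryArena is more elaborate. Only the macro itself is copied from the define above, and the 16-byte alignment is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical, simplified stand-in for pbrt's MemoryArena:
// one fixed buffer and a bump pointer, no growth.
class TinyArena {
  public:
    // Hand out nBytes from the buffer, rounded up to a multiple of 16.
    void *Alloc(std::size_t nBytes) {
        const std::size_t align = 16;  // assumed alignment
        nBytes = (nBytes + align - 1) & ~(align - 1);
        if (pos_ + nBytes > sizeof(buffer_)) throw std::bad_alloc();
        void *p = buffer_ + pos_;
        pos_ += nBytes;
        return p;
    }
    // Forget all allocations at once; no destructors are run.
    void Reset() { pos_ = 0; }
    std::size_t Used() const { return pos_; }

  private:
    alignas(16) unsigned char buffer_[1024];
    std::size_t pos_ = 0;
};

// Same macro as in the text above.
#define ARENA_ALLOC(arena, Type) new ((arena).Alloc(sizeof(Type))) Type

// Hypothetical payload type, trivially destructible on purpose.
struct Sample {
    int id;
    float weight;
    Sample(int i, float w) : id(i), weight(w) {}
};

// Allocates one Sample, then resets; returns the bytes used before the reset.
std::size_t demoArena() {
    TinyArena arena;
    // Expands to: new ((arena).Alloc(sizeof(Sample))) Sample(7, 0.5f)
    Sample *s = ARENA_ALLOC(arena, Sample)(7, 0.5f);
    assert(s->id == 7);
    std::size_t used = arena.Used();
    arena.Reset();
    assert(arena.Used() == 0);
    return used;
}
```

The point of the pattern is that an allocation is just a pointer bump and freeing everything is a single assignment, which is why it only suits objects that do not need their destructors to run.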
Let's use ripgrep to find some interesting breakpoints:
# in the directory where the test scene resides
> rg Material anim-bluespheres.pbrt
19:Material "plastic" "texture Kd" "lines-tex"
45:Material "uber" "color Kr" [1 1 1] "color Kd" [.2 .2 .2]
47:Material "mirror"
# in the directory where we keep the C++ source code for PBRT
> rg -tcpp "b.df = ARENA_ALLOC" | grep plastic
materials/plastic.cpp: si->bsdf = ARENA_ALLOC(arena, BSDF)(*si);
> rg -tcpp "b.df = ARENA_ALLOC" | grep uber
materials/uber.cpp: si->bsdf = ARENA_ALLOC(arena, BSDF)(*si, 1.f);
materials/uber.cpp: si->bsdf = ARENA_ALLOC(arena, BSDF)(*si, e);
> rg -tcpp "b.df = ARENA_ALLOC" | grep mirror
materials/mirror.cpp: si->bsdf = ARENA_ALLOC(arena, BSDF)(*si);
# in the directory where the test scene resides
> gdb
The GNU Debugger
(gdb) file ~/builds/pbrt/debug/pbrt
(gdb) set args --nthreads 1 anim-bluespheres.pbrt
(gdb) b materials/plastic.cpp:50
Breakpoint 1 at 0x3fb034: file /home/jan/git/github/pbrt-v3/src/materials/plastic.cpp, line 50.
(gdb) b materials/uber.cpp:56
Breakpoint 2 at 0x3ff1b9: file /home/jan/git/github/pbrt-v3/src/materials/uber.cpp, line 56.
(gdb) b materials/mirror.cpp:51
Breakpoint 3 at 0x3fa1a7: file /home/jan/git/github/pbrt-v3/src/materials/mirror.cpp, line 51.
(gdb) run
...
Rendering: [ ]
Thread 1 "pbrt" hit Breakpoint 3, pbrt::MirrorMaterial::ComputeScatteringFunctions (this=0x55555ac2a210, si=0x7fffffffc550, arena=..., mode=pbrt::TransportMode::Radiance, allowMultipleLobes=true)
at /home/jan/git/github/pbrt-v3/src/materials/mirror.cpp:51
51 si->bsdf = ARENA_ALLOC(arena, BSDF)(*si);
(gdb) where
#0 pbrt::MirrorMaterial::ComputeScatteringFunctions (this=0x55555ac2a210, si=0x7fffffffc550, arena=..., mode=pbrt::TransportMode::Radiance, allowMultipleLobes=true)
at /home/jan/git/github/pbrt-v3/src/materials/mirror.cpp:51
#1 0x00005555558b1f91 in pbrt::GeometricPrimitive::ComputeScatteringFunctions (this=0x55555ac6a990, isect=0x7fffffffc550, arena=..., mode=pbrt::TransportMode::Radiance, allowMultipleLobes=true)
at /home/jan/git/github/pbrt-v3/src/core/primitive.cpp:145
#2 0x00005555559b8332 in pbrt::SurfaceInteraction::ComputeScatteringFunctions (this=0x7fffffffc550, ray=..., arena=..., allowMultipleLobes=true, mode=pbrt::TransportMode::Radiance)
at /home/jan/git/github/pbrt-v3/src/core/interaction.cpp:99
#3 0x000055555591e00f in pbrt::PathIntegrator::Li (this=0x5555560b7710, r=..., scene=..., sampler=..., arena=..., depth=0) at /home/jan/git/github/pbrt-v3/src/integrators/path.cpp:107
#4 0x00005555559b5a05 in pbrt::SamplerIntegrator::<lambda(pbrt::Point2i)>::operator()(pbrt::Point2i) const (__closure=0x5555566bf450, tile=...) at /home/jan/git/github/pbrt-v3/src/core/integrator.cpp:291
#5 0x00005555559b76d2 in std::_Function_handler<void(pbrt::Point2<int>), pbrt::SamplerIntegrator::Render(const pbrt::Scene&)::<lambda(pbrt::Point2i)> >::_M_invoke(const std::_Any_data &, <unknown type in /home/jan/builds/pbrt/debug/pbrt, CU 0xf8d5b0, DIE 0xfb6107>) (__functor=..., __args#0=<unknown type in /home/jan/builds/pbrt/debug/pbrt, CU 0xf8d5b0, DIE 0xfb6107>) at /usr/include/c++/7/bits/std_function.h:316
#6 0x0000555555885d61 in std::function<void (pbrt::Point2<int>)>::operator()(pbrt::Point2<int>) const (this=0x7fffffffd2d0, __args#0=...) at /usr/include/c++/7/bits/std_function.h:706
#7 0x0000555555884b5c in pbrt::ParallelFor2D(std::function<void (pbrt::Point2<int>)>, pbrt::Point2<int> const&) (func=..., count=...) at /home/jan/git/github/pbrt-v3/src/core/parallel.cpp:252
#8 0x00005555559b62e9 in pbrt::SamplerIntegrator::Render (this=0x5555560b7710, scene=...) at /home/jan/git/github/pbrt-v3/src/core/integrator.cpp:240
#9 0x000055555583f23f in pbrt::pbrtWorldEnd () at /home/jan/git/github/pbrt-v3/src/core/api.cpp:1623
#10 0x00005555558ac4e1 in pbrt::parse (t=std::unique_ptr<pbrt::Tokenizer> = {...}) at /home/jan/git/github/pbrt-v3/src/core/parser.cpp:1083
#11 0x00005555558ac8c7 in pbrt::pbrtParseFile (filename="anim-bluespheres.pbrt") at /home/jan/git/github/pbrt-v3/src/core/parser.cpp:1101
#12 0x0000555555833033 in main (argc=4, argv=0x7fffffffdfa8) at /home/jan/git/github/pbrt-v3/src/main/pbrt.cpp:169
So, how do we get from the Render() loop to one of the materials' ComputeScatteringFunctions() methods?
# from here
> rg -tcpp SamplerIntegrator::Render core/integrator.cpp
228:void SamplerIntegrator::Render(const Scene &scene) {
# to e.g.
> rg -tcpp ComputeScatteringFunctions materials/mirror.cpp
45:void MirrorMaterial::ComputeScatteringFunctions(SurfaceInteraction *si,
Here is a UML sequence diagram:
And bits and pieces from the C++ source code:
// integrator.cpp
void SamplerIntegrator::Render(const Scene &scene) {
    ...
    MemoryArena arena;
    ...
    if (rayWeight > 0) L = Li(ray, scene, *tileSampler, arena);
    ...
    arena.Reset();
    ...
}
// path.cpp
Spectrum PathIntegrator::Li(const RayDifferential &r, const Scene &scene,
                            Sampler &sampler, MemoryArena &arena,
                            int depth) const {
    ...
    // Intersect _ray_ with scene and store intersection in _isect_
    SurfaceInteraction isect;
    bool foundIntersection = scene.Intersect(ray, &isect);
    ...
    // Compute scattering functions and skip over medium boundaries
    isect.ComputeScatteringFunctions(ray, arena, true);
    ...
}
// interaction.cpp
void SurfaceInteraction::ComputeScatteringFunctions(const RayDifferential &ray,
                                                    MemoryArena &arena,
                                                    bool allowMultipleLobes,
                                                    TransportMode mode) {
    ComputeDifferentials(ray);
    primitive->ComputeScatteringFunctions(this, arena, mode,
                                          allowMultipleLobes);
}
// primitive.cpp
void GeometricPrimitive::ComputeScatteringFunctions(
    SurfaceInteraction *isect, MemoryArena &arena, TransportMode mode,
    bool allowMultipleLobes) const {
    ProfilePhase p(Prof::ComputeScatteringFuncs);
    if (material)
        material->ComputeScatteringFunctions(isect, arena, mode,
                                             allowMultipleLobes);
    CHECK_GE(Dot(isect->n, isect->shading.n), 0.);
}
// mirror.cpp
void MirrorMaterial::ComputeScatteringFunctions(SurfaceInteraction *si,
                                                MemoryArena &arena,
                                                TransportMode mode,
                                                bool allowMultipleLobes) const {
    // Perform bump mapping with _bumpMap_, if present
    if (bumpMap) Bump(bumpMap, si);
    si->bsdf = ARENA_ALLOC(arena, BSDF)(*si);
    Spectrum R = Kr->Evaluate(*si).Clamp();
    if (!R.IsBlack())
        si->bsdf->Add(ARENA_ALLOC(arena, SpecularReflection)(
            R, ARENA_ALLOC(arena, FresnelNoOp)()));
}
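The excerpts above share one pattern: the arena is created at the top, handed down by reference through every level, used only at the bottom, and reset once the result of Li() is no longer needed. The following sketch mimics that shape with three free functions. All names (ScratchArena, Lobe, Shading, computeScattering, li, renderLoop) are hypothetical and much simpler than their pbrt counterparts; the 16-byte alignment is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for MemoryArena, reduced to a fixed buffer.
class ScratchArena {
  public:
    void *Alloc(std::size_t nBytes) {
        nBytes = (nBytes + 15) & ~static_cast<std::size_t>(15);
        if (pos_ + nBytes > sizeof(buffer_)) throw std::bad_alloc();
        void *p = buffer_ + pos_;
        pos_ += nBytes;
        if (pos_ > highWater_) highWater_ = pos_;
        return p;
    }
    void Reset() { pos_ = 0; }
    // Largest number of bytes that was ever in use at the same time.
    std::size_t HighWater() const { return highWater_; }

  private:
    alignas(16) unsigned char buffer_[1024];
    std::size_t pos_ = 0;
    std::size_t highWater_ = 0;
};

// Stand-ins for the short-lived per-hit objects.
struct Lobe { float reflectance; };
struct Shading { Lobe *lobe; };

// Innermost level: the "material" allocates from the arena it was handed.
Shading *computeScattering(ScratchArena &arena, float r) {
    Shading *s = new (arena.Alloc(sizeof(Shading))) Shading{nullptr};
    s->lobe = new (arena.Alloc(sizeof(Lobe))) Lobe{r};
    return s;
}

// Middle level: the "integrator" only forwards the arena.
float li(ScratchArena &arena, float r) {
    return computeScattering(arena, r)->lobe->reflectance;
}

// Outer level: one arena, reset after every call to li().
float renderLoop(int samples, std::size_t *highWater) {
    ScratchArena arena;
    float sum = 0.f;
    for (int i = 0; i < samples; ++i) {
        sum += li(arena, 0.5f);
        arena.Reset();  // the same bytes are reused by the next sample
    }
    *highWater = arena.HighWater();
    return sum;
}
```

However many samples are taken, the high-water mark stays at what a single sample needs, and the general-purpose allocator is never called inside the loop.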
Back to debugging:
# first call to PathIntegrator::Li()
(gdb) b path.cpp:79
Breakpoint 1 at 0x3c9ac3: file /home/jan/git/github/pbrt-v3/src/integrators/path.cpp, line 79.
(gdb) run
Thread 1 "pbrt" hit Breakpoint 1, pbrt::PathIntegrator::Li (this=0x5555560b7710, r=..., scene=..., sampler=..., arena=..., depth=0) at /home/jan/git/github/pbrt-v3/src/integrators/path.cpp:79
79 Float etaScale = 1;
(gdb) b memory.h:83
Breakpoint 2 at 0x55555584184c: file /home/jan/git/github/pbrt-v3/src/core/memory.h, line 83.
(gdb) b memory.h:118
Breakpoint 3 at 0x5555558471e3: memory.h:118. (9 locations)
(gdb) b memory.h:123
Breakpoint 4 at 0x555555841a41: file /home/jan/git/github/pbrt-v3/src/core/memory.h, line 123.
(gdb) continue
Thread 1 "pbrt" hit Breakpoint 2, pbrt::MemoryArena::Alloc (this=0x7fffffffcd80, nBytes=120) at /home/jan/git/github/pbrt-v3/src/core/memory.h:83
83 nBytes = (nBytes + align - 1) & ~(align - 1);
(gdb) continue
Thread 1 "pbrt" hit Breakpoint 2, pbrt::MemoryArena::Alloc (this=0x7fffffffcd80, nBytes=8) at /home/jan/git/github/pbrt-v3/src/core/memory.h:83
83 nBytes = (nBytes + align - 1) & ~(align - 1);
(gdb) continue
Thread 1 "pbrt" hit Breakpoint 2, pbrt::MemoryArena::Alloc (this=0x7fffffffcd80, nBytes=32) at /home/jan/git/github/pbrt-v3/src/core/memory.h:83
83 nBytes = (nBytes + align - 1) & ~(align - 1);
(gdb) continue
Thread 1 "pbrt" hit Breakpoint 4, pbrt::MemoryArena::Reset (this=0x7fffffffcd80) at /home/jan/git/github/pbrt-v3/src/core/memory.h:123
123 currentBlockPos = 0;
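The statement at memory.h:83 in the output above rounds every request up to the next multiple of align, which only works when align is a power of two. As a worked example, assuming align is 16 (the value is not visible in the output above), the three requests would occupy the following sizes. The helper name roundUp is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Round nBytes up to the next multiple of align.
// align must be a power of two; 16 is an assumption for this example.
constexpr std::size_t roundUp(std::size_t nBytes, std::size_t align = 16) {
    return (nBytes + align - 1) & ~(align - 1);
}

// The three request sizes seen in the debugger session.
static_assert(roundUp(120) == 128, "120 bytes occupy 128");
static_assert(roundUp(8) == 16, "8 bytes occupy 16");
static_assert(roundUp(32) == 32, "32 is already a multiple of 16");

// Total bytes consumed in the arena between two Reset() calls.
constexpr std::size_t totalForMirrorHit() {
    return roundUp(120) + roundUp(8) + roundUp(32);
}
```

Under that assumption the 160 requested bytes take up 176 bytes of arena space before the next Reset().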
So, in this case we call pbrt::MemoryArena::Alloc() three times before releasing the memory again with pbrt::MemoryArena::Reset(). The allocated memory is 120, 8, and 32 bytes, most likely for instances of the classes BSDF, FresnelNoOp, and SpecularReflection:
class BSDF {
    ...
    // BSDF Private Data
    const Normal3f ns, ng;
    const Vector3f ss, ts;
    int nBxDFs = 0;
    static PBRT_CONSTEXPR int MaxBxDFs = 8;
    BxDF *bxdfs[MaxBxDFs];
    friend class MixMaterial;
};
class FresnelNoOp : public Fresnel {
    ...
};
class SpecularReflection : public BxDF {
    ...
  private:
    // SpecularReflection Private Data
    const Spectrum R;
    const Fresnel *fresnel;
};
class BxDF {
    ...
    // BxDF Public Data
    const BxDFType type;
};
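One way to sanity-check the guess about 120, 8, and 32 bytes is to rebuild the data layout with stand-in types and ask the compiler for their sizes. Everything below is hypothetical: the structs are not pbrt's definitions, and the sketch assumes Float is float, Spectrum is three floats (RGB), pointers are 64 bits wide, and BSDF carries one additional float (eta) in the part of the class that is elided in the excerpt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Stand-ins for Vector3f / Normal3f and an RGB Spectrum: 12 bytes each.
struct Vec3 { float x, y, z; };
struct Rgb { float c[3]; };

// Stand-in for BxDF: a vtable pointer plus the type tag.
struct BxdfLike {
    virtual ~BxdfLike() {}
    int type;
};

// Stand-in for BSDF: no virtual functions, only data.
struct BsdfLike {
    float eta;            // assumed member, not shown in the excerpt
    Vec3 ns, ng, ss, ts;  // 4 * 12 bytes
    int nBxDFs;
    BxdfLike *bxdfs[8];   // 8 pointers
};

// Stand-in for FresnelNoOp: nothing but a vtable pointer.
struct FresnelLike {
    virtual ~FresnelLike() {}
};

// Stand-in for SpecularReflection.
struct SpecularLike : BxdfLike {
    Rgb R;
    const FresnelLike *fresnel;
};

void printSizes() {
    std::printf("BsdfLike:     %zu\n", sizeof(BsdfLike));
    std::printf("FresnelLike:  %zu\n", sizeof(FresnelLike));
    std::printf("SpecularLike: %zu\n", sizeof(SpecularLike));
}
```

With 64-bit pointers the data-only BsdfLike adds up to 4 + 48 + 4 + 64 = 120 bytes, and FresnelLike is a single pointer. The size of SpecularLike depends on how the compiler's ABI lays out the base class; with GCC or Clang on 64-bit Linux I would expect 32, which would match the debugger output, but other toolchains may report a different value.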
Flamegraphs
Here is the difference. First, the flame graph for the C++ code:
And the Rust counterpart: