LibSR is my software rendering project.


  1. LibSR:Proof of concept - little more than a quick rush to get something working. It was slow, but it was a success.
  2. LibSR:Prototype - think of it as a version below 1.0.0. It is a prototype, but done properly rather than a hacked-together, rushed proof of concept.

Auxiliary stages

There was also a LibSR:Vectorisation experiment, which investigated the performance gains from aligning data and coaxing GCC into vectorising code.




Reasons for creation

I needed a way to experiment with OpenGL ES 2.0, but I couldn't get Mesa's implementation to work (without removing my other drivers first). Also, not all of my devices support a programmable pipeline. With this in mind, a software renderer (based on OpenGL ES 2.0) seemed the way to go.

The API is different (I didn't implement OpenGL ES 2.0); however, creating a 'bridge' between the two would be very easy.

Build types

There are 2 build types:

  1. Vectorised - this enables alignment of various types, as well as vectorised implementations of common operations (recommended)
  2. Fallback - Generic, designed for readable code
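As a rough illustration of the difference between the two build types, the sketch below switches a type's alignment on a build flag and leaves the arithmetic as a plain loop that a compiler may vectorise. The flag SKETCH_VECTORISED, the macro SKETCH_ALIGN16, the Vec4f layout and the add function are all assumptions made for this example, not LibSR's actual definitions.

```cpp
#include <cstddef>

// Hypothetical switch standing in for whatever selects the build type.
#ifndef SKETCH_VECTORISED
#define SKETCH_VECTORISED 1
#endif

#if SKETCH_VECTORISED
#define SKETCH_ALIGN16 alignas(16)  // vectorised build: force 16 byte alignment
#else
#define SKETCH_ALIGN16              // fallback build: natural alignment
#endif

// Assumed layout: four packed floats.
struct SKETCH_ALIGN16 Vec4f {
    float v[4];
};

// Plain component-wise loop; with aligned operands the compiler
// has a better chance of turning this into a single vector add.
inline Vec4f add(const Vec4f& a, const Vec4f& b) {
    Vec4f r;
    for (std::size_t i = 0; i < 4; ++i) {
        r.v[i] = a.v[i] + b.v[i];
    }
    return r;
}
```

With SKETCH_VECTORISED set to 0 the same source compiles to the generic version, which is the readable form a fallback build would keep.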

The build may be further configured for:

  • Debugging - enables many assertions that check all is well while LibSR is running. These can be configured (via the configuration file) to trap when an assertion fails, so that a debugger can step in and help analyse what is going on.
  • Bad luck - Sometimes things work as intended only by luck. A bad luck build should have minimal impact on performance and does not change the correctness of the program. An example of luck is alignment from malloc: on one platform malloc may happen to return 16 byte aligned pointers. A bad luck build[Note 1] aligns every allocation that does not request alignment to an odd byte, which should cause anything that depends on alignment to crash.

Release options

There are a few categories of options that affect the release build. These are:

  1. Framebuffers
  2. Buffers & arrays
  3. Shader programs
  4. Shader resources
  5. Inlinable operators


Framebuffers

  • LIBSR_STENCIL_BUFFER_BITS - default value 8 - OpenGL ES 2.0 requires the stencil buffer to be at least 8 bits
    • 8 is currently the only supported value. The stencil bits are packed in with the colour buffer, giving 4 bytes of data per pixel.
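One way to picture 8 stencil bits sharing a 4 byte pixel with the colour data is sketched below. The bit layout (stencil in the top byte, 8 bits each of red, green and blue below it) and every name here are assumptions for illustration; the text above only says the two are packed together.

```cpp
#include <cstdint>

// Assumed pixel format: 0xSSRRGGBB (stencil, red, green, blue).
using Pixel = std::uint32_t;

constexpr unsigned kStencilBits = 8;  // mirrors the LIBSR_STENCIL_BUFFER_BITS default

static_assert(sizeof(Pixel) == 4, "colour + stencil must fit in 4 bytes");

// Pack three 8 bit colour channels and an 8 bit stencil value into one pixel.
constexpr Pixel pack_pixel(std::uint8_t r, std::uint8_t g, std::uint8_t b,
                           std::uint8_t stencil) {
    return (Pixel{stencil} << 24) | (Pixel{r} << 16) | (Pixel{g} << 8) | Pixel{b};
}

// Read the stencil value back out of a packed pixel.
constexpr std::uint8_t stencil_of(Pixel p) {
    return static_cast<std::uint8_t>(p >> 24);
}

// Replace only the stencil bits, leaving the colour untouched.
constexpr Pixel with_stencil(Pixel p, std::uint8_t stencil) {
    return (p & 0x00FFFFFFu) | (Pixel{stencil} << 24);
}
```

Packing this way means a stencil test and a colour write touch the same 32 bit word, which is presumably why only the value 8 is supported for now.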

Buffers and arrays

  • LIBSR_MAX_NUMBER_OF_DATA_BUFFERS - default value 1024 - It is unknown how many buffers OpenGL ES 2.0 requires as a minimum.
    • The maximum number of data buffers the system will allow to be allocated.
  • LIBSR_MAX_NUMBER_OF_DATA_ARRAYS - default value 4096 - It is unknown how many arrays OpenGL ES 2.0 requires as a minimum.
    • Maximum number of arrays the system will allow to be used.

Shader programs

  • LIBSR_MAX_NUMBER_OF_VERTEX_SHADERS - default value is 64 - It is unknown how many vertex shaders OpenGL ES 2.0 requires as a minimum.
  • LIBSR_MAX_NUMBER_OF_FRAGMENT_SHADERS - default value is 128 - It is unknown how many fragment shaders OpenGL ES 2.0 requires as a minimum.
  • LIBSR_MAX_NUMBER_OF_SHADER_PROGRAMS - default is 128 - It is unknown how many shader programs OpenGL ES 2.0 requires at a minimum.

Shader resources

  • LIBSR_VERTEX_SHADER_UNIFORM_SLOTS - default is 512 - OpenGL ES 2.0 requires that 128 vectors be supported, 512 slots in LibSR parlance
  • LIBSR_FRAGMENT_SHADER_UNIFORM_SLOTS - default is 64 - OpenGL ES 2.0 requires that 16 vectors be supported, 64 slots in LibSR parlance
  • LIBSR_ATTRIBUTE_SLOTS_NUM - default is 32 - OpenGL ES 2.0 requires that 8 Vec4 attributes be available, 32 slots in LibSR parlance
  • LIBSR_VARYING_FLOAT_SLOTS_NUM - default is 32 - OpenGL ES 2.0 requires that 8 Vec4 varyings be available, 32 slots in LibSR parlance
    • There are currently no plans to implement double precision floating point slots. If they are implemented, however, they will use a separate buffer and will not count towards the 32 slot allowance for floats (they would have their own allowance).
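The slot counts above follow from four slots per 4-component vector. A small sketch of that arithmetic, with compile-time checks against the OpenGL ES 2.0 minimums quoted in the list, could look like the following. The helper slots_for_vectors and the constant kSlotsPerVec4 are names invented here; the macro names and default values are the ones listed above.

```cpp
// Assumption for this sketch: one slot holds one float component,
// so a Vec4 occupies four slots.
constexpr int kSlotsPerVec4 = 4;

constexpr int slots_for_vectors(int vec4_count) {
    return vec4_count * kSlotsPerVec4;
}

// Defaults as listed above; a real build would take these from its configuration.
#ifndef LIBSR_VERTEX_SHADER_UNIFORM_SLOTS
#define LIBSR_VERTEX_SHADER_UNIFORM_SLOTS 512
#endif
#ifndef LIBSR_FRAGMENT_SHADER_UNIFORM_SLOTS
#define LIBSR_FRAGMENT_SHADER_UNIFORM_SLOTS 64
#endif
#ifndef LIBSR_ATTRIBUTE_SLOTS_NUM
#define LIBSR_ATTRIBUTE_SLOTS_NUM 32
#endif
#ifndef LIBSR_VARYING_FLOAT_SLOTS_NUM
#define LIBSR_VARYING_FLOAT_SLOTS_NUM 32
#endif

// Fail the build if a configured value drops below the quoted minimum.
static_assert(LIBSR_VERTEX_SHADER_UNIFORM_SLOTS >= slots_for_vectors(128),
              "vertex uniforms: at least 128 vectors");
static_assert(LIBSR_FRAGMENT_SHADER_UNIFORM_SLOTS >= slots_for_vectors(16),
              "fragment uniforms: at least 16 vectors");
static_assert(LIBSR_ATTRIBUTE_SLOTS_NUM >= slots_for_vectors(8),
              "attributes: at least 8 Vec4");
static_assert(LIBSR_VARYING_FLOAT_SLOTS_NUM >= slots_for_vectors(8),
              "varyings: at least 8 Vec4");
```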

Inlinable operators

If any of the following are set to 1, an implementation of the function will be provided in the LibSR header, marked as inlinable. The implementation provided matches the build type (vectorised or fallback) of the entire build.

  • LIBSR_INLINABLE_VEC4F_BIN_OP_ADD - inlines Vec4f operator+(const Vec4f&, const Vec4f&)
  • LIBSR_INLINABLE_VEC4F_BIN_OP_SUB - inlines Vec4f operator-(const Vec4f&, const Vec4f&)
  • LIBSR_INLINABLE_MAT4F_BIN_OP_MUL - inlines Mat4f operator*(const Mat4f&, const Mat4f&)
  • LIBSR_INLINABLE_MAT4F_BIN_OP_ADD - inlines Mat4f operator+(const Mat4f&, const Mat4f&)
  • LIBSR_INLINABLE_MAT4F_BIN_OP_SUB - inlines Mat4f operator-(const Mat4f&, const Mat4f&)
  • LIBSR_INLINABLE_MAT4F_MUL_VEC4F - inlines Vec4f operator*(const Mat4f&, const Vec4f&)
    • Note: this needs to be investigated as it is highly likely to be worth inlining.

At the time of writing all of these are 0 and nothing is inlined. This keeps the header small enough for visual inspection.
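A header using one of these switches might be laid out roughly as below. This is only a sketch of the pattern: the Vec4f member names are assumed, and the flag is set to 1 here (rather than its current value of 0) purely so that the example is self-contained.

```cpp
// Set to 1 for this sketch only; the text above says these are currently all 0.
#ifndef LIBSR_INLINABLE_VEC4F_BIN_OP_ADD
#define LIBSR_INLINABLE_VEC4F_BIN_OP_ADD 1
#endif

// Assumed layout; the real Vec4f may differ.
struct Vec4f {
    float x, y, z, w;
};

#if LIBSR_INLINABLE_VEC4F_BIN_OP_ADD
// Definition visible in the header, so the compiler can inline it at call sites.
inline Vec4f operator+(const Vec4f& a, const Vec4f& b) {
    return Vec4f{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}
#else
// Declaration only; the definition would live inside the compiled library.
Vec4f operator+(const Vec4f& a, const Vec4f& b);
#endif
```

Call sites look the same either way (for example `Vec4f c = a + b;`), so flipping a switch should only change where the code is generated, not how the operator is used.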

Debug options

    • Turns on the LIBSR_BOUNDS_CHECK(index,capacity) and LIBSR_POINTER_BOUNDS_CHECK(calculation,base,capacity) macros (which have an empty implementation otherwise)
    • If LIBSR_DEBUG_TRAP_ON_BOUNDS_CHECK_FAILURE is on then (in addition to an error message on stderr) a SIGTRAP is emitted, allowing a debugger to reveal LibSR's state.
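The behaviour described for LIBSR_BOUNDS_CHECK could be sketched as follows. The helper sketch_bounds_ok and the switch SKETCH_DEBUG are hypothetical names introduced for this example, and the message format is invented; only the macro names and the stderr-plus-SIGTRAP behaviour come from the description above. LIBSR_POINTER_BOUNDS_CHECK is left out because its exact semantics are not given.

```cpp
#include <csignal>
#include <cstddef>
#include <cstdio>

// Hypothetical switch standing in for whatever enables a debug build.
#ifndef SKETCH_DEBUG
#define SKETCH_DEBUG 1
#endif

// Returns true when index is within [0, capacity). On failure it reports
// on stderr and, if trapping is configured, raises SIGTRAP for the debugger.
inline bool sketch_bounds_ok(std::size_t index, std::size_t capacity,
                             const char* file, int line) {
    if (index < capacity) {
        return true;
    }
    std::fprintf(stderr, "%s:%d: bounds check failed (index %zu, capacity %zu)\n",
                 file, line, index, capacity);
#if LIBSR_DEBUG_TRAP_ON_BOUNDS_CHECK_FAILURE && defined(SIGTRAP)
    std::raise(SIGTRAP);  // SIGTRAP is POSIX, so it is guarded here
#endif
    return false;
}

#if SKETCH_DEBUG
#define LIBSR_BOUNDS_CHECK(index, capacity) \
    ((void)sketch_bounds_ok((index), (capacity), __FILE__, __LINE__))
#else
// Empty implementation outside debug builds, as described above.
#define LIBSR_BOUNDS_CHECK(index, capacity) ((void)0)
#endif
```

A call such as `LIBSR_BOUNDS_CHECK(i, buffer_count);` placed before an indexed access then costs nothing in a release build.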

Bad luck options

    • Causes all pointers allocated without an alignment request to be aligned to odd byte boundaries. This is achieved by requesting an allocation of size+1 aligned to a 2 byte boundary and adding 1 to the result. When free is called on such an odd pointer, one is subtracted from the pointer's value and the resulting address is freed.
    • Causes all pointers allocated with alignment to ALWAYS be aligned to exactly the requested alignment and no larger. For example, if you request a 16 byte aligned pointer, the address will not be divisible by 32.
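The odd-pointer scheme can be sketched in a few lines. The function names below are hypothetical, and the sketch relies on std::malloc returning an even address, which holds on mainstream platforms and is checked with an assertion. The last helper only tests the "aligned to the request and no larger" property; how LibSR actually produces such pointers is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocate size bytes at an odd address: over-allocate by one byte and step past it.
inline void* bad_luck_alloc(std::size_t size) {
    unsigned char* base = static_cast<unsigned char*>(std::malloc(size + 1));
    if (base == nullptr) {
        return nullptr;
    }
    // Assumption: malloc returned an address that is at least 2 byte aligned.
    assert(reinterpret_cast<std::uintptr_t>(base) % 2 == 0);
    return base + 1;
}

// Undo the offset before handing the pointer back to free.
inline void bad_luck_free(void* p) {
    if (p == nullptr) {
        return;
    }
    std::free(static_cast<unsigned char*>(p) - 1);
}

// True when p is aligned to alignment but not to twice that alignment.
inline bool is_exactly_aligned(const void* p, std::size_t alignment) {
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);
    return address % alignment == 0 && address % (2 * alignment) != 0;
}
```

Any code that quietly assumed, say, 16 byte alignment from a plain allocation would receive an odd address from bad_luck_alloc and should fail quickly rather than work by luck.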