Skip to main content

Assembler/disassembler for the Xbox nv2a vertex shader

Project description

Trivial assembler for the nv2a vertex shader.

The Cg -> vp20 path performs various optimizations that sometimes make it hard to force unusual test conditions. This assembler performs no optimizations, and simply translates operands into machine code.

Instructions

  • ADD dst, src1, src2

    • Sum dst = src1 + src2
  • ARL a0, src

    • Load address register - a0 = src
  • DP3 dst, src1, src2

    • 3-component (xyz) dot product - dst = src1 dotproduct_xyz src2
  • DP4 dst, src1, src2

    • 4-component (xyzw) dot product - dst = src1 dotproduct_xyzw src2
  • DPH dst, src1, src2

    • Homogenous dot product - dst = src1.x * src2.x + src1.y * src2.y + src1.z * src2.z + src2.w
  • DST dst, src1, src2

    • Compute distance vector. src1.yz should be distance squared, src2.yw should be 1.0 / distance
    dst.x = 1.0
    dst.y = src1.y * src2.y
    dst.z = src1.z
    dst.w = src2.w
    
  • EXPP dst, src

    • Partial precision 2^x exponentiation.

      x_floor = floorf(src.x)
      dst.x = pow(2.0f, x_floor)
      dst.y = src.x - x_floor
      dst.z = pow(2.0f, src.x)
      dst.w = 1.0f
      
  • LIT dst, src

    • Calculate lighting coefficients The src vector should be initialized with:

      src.x = normal dotproduct direction_to_light

      src.y = normal dotproduct light_half_vector

      src.w = power

      kMax = 127.9961f
      
      dst.x = 1.0f
      dst.y = 0.0f
      dst.z = 0.0f
      dst.w = 1.0f
      
      power = clamp(src.w, -kMax, kMax)
      
      if (src.x > 0.0f) {
      
        dst.x = src.x
      
        if (src.y > 0.0f) {
          dst.z = pow(src.y, power)
        }
      
      }
      
  • LOGP dst, src

    • Partial precision base 2 logarithm
    dst.x = exponent
    dst.y = mantissa
    dst.z = log2(abs(src.x))
    
  • MAD dst, src1, src2, src3

    • Multiplies two sources, then adds a third. dst = src1 * src2 + src3
  • MAX dst, src1, src2

    • Returns the component-wise maximum between two vectors. dst.C = max(src1.C, src2.C); for C in {x, y, z, w}
  • MIN dst, src1, src2

    • Returns the component-wise minimum between two vectors. dst.C = min(src1.C, src2.C); for C in {x, y, z, w}
  • MOV dst, src

    • Assigns the value of one register to another. dst = src
  • MUL dst, src1, src2

    • Multiply dst = src1 * src2
  • RCC dst, src

    • Compute clamped reciprocal. dst.xyzw = 1 / src[.x] clamped to ~(5.42101e-020f, 1.884467e+019)
  • RCP dst, src

    • Compute reciprocal. dst.xyzw = 1 / src[.x]
  • RSQ dst, src

    • Compute reciprocal of the square root. dst.xyzw = 1 / sqrt(src[.x])
  • SGE dst, src1, src2

    • Per component greater than or equal comparison. dst.C = 1.0 if src1.C >= src2.C else 0.0; for C in {x, y, z, w}
  • SLT dst, src1, src2

    • Per component less than comparison. dst.C = 1.0 if src1.C < src2.C else 0.0; for C in {x, y, z, w}

Registers

Temporary registers

  • r0 - r11
  • r12 - read-only view of the oPos output register

Constant/uniform registers

  • c0 - c191

Constant registers may also use relative addressing via bracket syntax. E.g., c[a0 + 12]

Address register

  • a0

May only be set via the ARL instruction. E.g., ARL a0, v12

Inputs

  • v0, iPos
  • v1, iWeight
  • v2, iNormal
  • v3, iDiffuse
  • v4, iSpecular
  • v5, iFog
  • v6, iPts
  • v7, iBackDiffuse
  • v8, iBackSpecular
  • v9, iTex0
  • v10, iTex1
  • v11, iTex2
  • v12, iTex3
  • v13
  • v14
  • v15

Outputs

  • oB0, oBackDiffuse
  • oB1, oBackSpecular
  • oD0, oDiffuse
  • oD1, oSpecular
  • oFog
  • oPos
  • oPts
  • oTex0, oT0
  • oTex1, oT1
  • oTex2, oT2
  • oTex3, oT3

Swizzle/destination masks

  • xyzw
  • rgba

Uniform macros

A simple macro syntax is supported to allow symbolic naming for c-register access. Two types are currently implemented, vector and matrix4.

A vector type aliases a single c register. E.g., to give a symbolic name to c10: #uniform_name vector 10. The macro can then be used in subsequent instructions; e.g., mov r0, #uniform_name.

A matrix4 type aliases a contiguous set of 4 c registers. E.g., to give a symbolic name to a model matrix passed at 'c96': #my_model_matrix matrix4 96. To access the second row: mul r0, v0.y, #my_model_matrix[1].

Operation macros

Higher level operations are supported via macros that expand to multiple instructions.

matmul4x4 - Multiply a 4 element vector by a 4x4 matrix

%matmul4x4 <dst> <read_register> <#matrix4_uniform>

Expands to 4 commands.

%matmul4x4 r0 iPos #model_matrix
------------------------------------
dp4 r0.x, iPos, #model_matrix[0]
dp4 r0.y, iPos, #model_matrix[1]
dp4 r0.z, iPos, #model_matrix[2]
dp4 r0.w, iPos, #model_matrix[3]

norm3 - Normalize a 3-component vector.

%norm3 <dst> <read_register> <read_write_temp_register>

Expands to 3 commands and requires one sacrificial register.

%norm3 r2 iNormal r0
------------------------------------
dp3 r0.x, iNormal, iNormal  ; r0.x = len(normal)^2
rsq r0.w, r0.x  ; r0.w = 1 / len(normal)
mul r2.xyz, iNormal, r0.w  ; r2.xyz = normalized(normal)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nv2a_vsh-0.1.12.tar.gz (67.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nv2a_vsh-0.1.12-py3-none-any.whl (64.4 kB view details)

Uploaded Python 3

File details

Details for the file nv2a_vsh-0.1.12.tar.gz.

File metadata

  • Download URL: nv2a_vsh-0.1.12.tar.gz
  • Upload date:
  • Size: 67.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nv2a_vsh-0.1.12.tar.gz
Algorithm Hash digest
SHA256 99f90241a38b210b51ecbbaa4d1033bd0d439830b59613e7fb4ea3d9634c92ba
MD5 699a5e2f871ad784e67b2178ab47ed39
BLAKE2b-256 cc16a81b3099293e630bdf9c67884cd74b8183c2a49a3759673967397aa7b2b6

See more details on using hashes here.

File details

Details for the file nv2a_vsh-0.1.12-py3-none-any.whl.

File metadata

  • Download URL: nv2a_vsh-0.1.12-py3-none-any.whl
  • Upload date:
  • Size: 64.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nv2a_vsh-0.1.12-py3-none-any.whl
Algorithm Hash digest
SHA256 263e5a4d00ba603c542b1a0628b01e972225792086c852aee24ba817e957a22a
MD5 4d799fbd26b187ba4f63b955ef79fa61
BLAKE2b-256 c06c2da879ff90e50bcdcc36c4dc38111030322c3624119b4b539f67d3f8ac4b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page