diff --git a/docs/drivers/asahi.rst b/docs/drivers/asahi.rst
index 0c34a24a63b..36d4f1b8682 100644
--- a/docs/drivers/asahi.rst
+++ b/docs/drivers/asahi.rst
@@ -35,13 +35,13 @@ Vertex shader
 A vertex shader (running on the :term:`Unified Shader Cores`) outputs varyings with the
 ``st_var`` instruction. ``st_var`` takes a *vertex output index* and a 32-bit
 value. The maximum number of *vertex outputs* is specified as the "output count"
-of the shader in the "Bind Vertex Pipeline" packet. The value may be interpreted
+of the shader in the "VDM State Vertex Outputs" structure. The value may be interpreted
 consist of a single 32-bit value or an aligned 16-bit register pair, depending
 on whether interpolation should happen at 32-bit or 16-bit. Vertex outputs are
 indexed starting from 0, with the *vertex position* always coming first, the
 32-bit user varyings coming next with perspective, flat, and linear interpolated
 varyings grouped in that order, then 16-bit user varyings with the same groupings,
-and finally *point size* and *clip distances* at the end if present. Note that
+and finally *point size*, *layer/viewport*, and *clip distances* at the end if present. Note that
 *clip distances* are not accessible from the fragment shader; if the fragment
 shader needs to read the interpolated clip distance, the vertex shader must
 *also* write the clip distance values to a user varying for the fragment shader
@@ -94,12 +94,14 @@ lowered for APIs that require this (OpenGL but not Vulkan).
      - Packed pair of 16-bit linear varyings r
    * - 1
      - Point size
+   * - 1
+     - Layer/viewport
    * - 1
      - Clip distance for plane 0
    * -
      - ...
    * - 1
-     - Clip distance for plane 15
+     - Clip distance for plane 16
 
 Remapping
 `````````
@@ -145,7 +147,7 @@ from varying slots. This preloading appears to occur in fixed function hardware,
 a simplification from PowerVR which requires a specialized program for the
 programmable data sequencer to do the preload.
 
-The "Bind fragment pipeline" packet points to coefficient register bindings,
+The "Fragment Shader" structure points to coefficient register bindings,
 preceded by a header. The header contains the number of 32-bit varying slots. As
 the *W* slot is always present, this field is always nonzero. Slots whose index
 is below this count are treated as 32-bit. The remaining slots are treated as
@@ -165,6 +167,12 @@ bindings should be generated outside of the compiler. For simple APIs where the
 bindings are fixed and known at compile-time, the bindings could be generated
 within the compiler.
 
+Mathematically, the value of the coefficient register is a vector in
+:math:`\mathbb{R}^3`. The X and Y components are the screen-space partial
+derivatives of the varying with respect to X and Y. The Z component is the
+interpolated value of the varying at the upper-left pixel in the 32x32 tile that
+the pixel belongs to.
+
 Fragment shader
 ```````````````
 
@@ -179,6 +187,10 @@ coefficient register for W is passed as a second argument. As an example, if
 there's a single varying to interpolate, an instruction like ``iter r0, cf1,
 cf0`` is used.
 
+It is occasionally useful to manipulate the raw coefficient registers, for
+example to implement interpolation modes not natively supported by the hardware.
+``ldcf`` is used for this purpose.
+
 Iterator
 ````````