XeHP supports a 20:5 pointer format, so the offset can legitimately
be more than UINT16_MAX. Likewise, with 256B binding table mode on
Icelake/Tigerlake, we might have 18:8 pointers that exceed UINT16_MAX.
Thanks to Felix DeGrood for catching this!
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16538>
With this option enabled range of input values for fsin and fcos is
limited to [-2*pi : 2*pi] by calculating the reminder after 2*pi modulo
division. This helps to improve calculation precision for large input
arguments on Intel.
-v2: Add limit_trig_input_range option to prog_key to update shader
cache (Lionel)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16388>
We no longer emit STATE_BASE_ADDRESS in every batch on XeHP, so the
decoder might not know what the various base addresses are if it's only
looking at a single batch. Fortunately, they also never change, so we
can just emit them once here.
On earlier platforms, initializing them here should be harmless. We'll
emit STATE_BASE_ADDRESS if we change them, which will update these.
Thanks to Iván Briano for catching this.
Fixes: 8831cb38aa ("anv: Stop updating STATE_BASE_ADDRESS on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16287>
If the secondary command buffer executed used push constants on a
different set of stages than the primary is using, we may end up not
reallocating them for the primary, getting misrender artifacts at best,
or a nice GPU hang at worst.
Fixes the tests from a CTS from the future:
dEQP-VK.dynamic_rendering.random.*
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16439>
it's been flaking with "2022-05-05 16:29:49.055151: [0m[31mERROR - Failure
getting run results: parsing results: Reading from dEQP: timed out waiting
for fd to be ready (See \"//results/c32.r1.log\")" and a pile of missings
since the brief "whoops, HW CI failed to listen to the test exit code"
regression.
The only ways I know of to hit this case would be:
1) The deqp binary abruptly wedges on its own. This happens with NFS
failures sometimes, but the rest of the run went fine and we never got the
kernel complaining about NFS, so that seems unlikely.
2) The stderr pipe filled up before stdout was completed, and deqp got
wedged trying to output stderr (happens sometimes when you do like
NIR_DEBUG=print in your run).
Both of these seem unlikely, given that we've got a big .qpa file that
made it all the way to writing out test case durations at the end of the
run before abruptly terminating. Why didn't we have at least some of the
test results parsed?
The next deqp-runner release we integrate will solve #2, and cleans up
these error paths a bunch, so I'm hoping we get more information soon.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16350>
On Gfx7 we can only give the sample location for a given multisample
number. This means everytime the multisampling value changes, we have
to re-emit the locations. It's fine because it's also where
(3DSTATE_MULTISAMPLE) the number of samples is stored.
On Gfx8+ though, 3DSTATE_MULTISAMPLE only holds the number of samples
and all the sample locations for all number of samples are located in
3DSTATE_SAMPLE_PATTERN. So to be more effecient there, we need to
track the locations for all sample numbers and compare new values with
the relevant sample count when touching the dynamic state.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
Ken performed some tests with shader-db to evaluate the effects
```
Across all 145,848 shaders generated, the results were:
Total bytes compacted before: 3,326,224
Total bytes compacted after: 60,963,280
```
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15399>
Discrete platforms don't have LLC, but on those, we mmap our buffers
with WC. So we shouldn't need to clflush there.
Anv already had a boolean field on the physical device to know whether
we need to use clflush(), based off the memory heaps available. So use
that instead.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15780>
This changes the intel_device_info calculation to call an additional
DRM query requesting the geometry topology from the kernel, which may
differ from the result of the current topology query on XeHP+
platforms with compute-only and 3D-only DSSes. This seems more
reliable than the current guesswork done in intel_device_info.c trying
to figure out which DSSes are available for the render CS.
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14143>
This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.
All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
shadow2darray with NIR, since the shading languages don't expose txl of
those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>