Commit Graph

4177 Commits

Author SHA1 Message Date
Jason Ekstrand be7e9870d6 anv: Stop including POS in FS input limits
It is an input but it comes in as part of the shader payload and doesn't
count towards the limits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-02 18:56:51 -05:00
Eric Engestrom b80930a6fe anv: add support for VK_EXT_memory_budget
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-30 15:40:33 +00:00
Juan A. Suarez Romero 8d621e8ff7 anv: enable descriptor indexing capabilities
This enables the remaining capabilities in SPV_EXT_descriptor_indexing.

Fixes: 6e230d7607 "anv: Implement VK_EXT_descriptor_indexing"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-30 09:23:46 +02:00
Rafael Antognolli 9175c7058e intel/blorp: Make blorp update the clear color in gen11.
Hardware docs say that Gen11 requires the use of two MI_ATOMICs of size
QWORD when updating the clear color. The second MI_ATOMIC also needs CS
Stall and Return Data Control set.

v2: Remove include of srgb header (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-29 21:19:59 +00:00
Rafael Antognolli f8c3f408a6 intel/genxml: Update MI_ATOMIC genxml definition.
Change some of the single bit fields to booleans, and add an enum with
the definition of the ATOMIC_OPCODE.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-29 21:19:59 +00:00
Jordan Justen 38ffd7ce79 intel/genxml: Support base-16 in value & start fields in gen_sort_tags.py
With python's int(), if the optional second parameter is 0, then
python will support the 0x prefix for hex numbers.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-29 21:19:58 +00:00
Plamena Manolova 232c0f6489 isl: Set ClearColorConversionEnable.
The ClearColorConversionEnable bit needs to be set
for GEN11 when inderect clear colors are used.

Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-04-29 21:19:58 +00:00
Eric Engestrom 7ca8ba199f delete autotools .gitignore files
One special case, `src/util/xmlpool/.gitignore` is not entirely deleted,
as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-04-29 21:17:19 +00:00
Lionel Landwerlin 9628631a38 Revert "anv: limit URB reconfigurations when using blorp"
In commit 0d46e404 ("anv: limit URB reconfigurations when using
blorp") we tried to limit the number of URB reconfiguration by
checking if the last allocation is large enough to fit the blorp
dispatch.

We used the last bound pipeline to compare the allocation. The problem
with this is that the pipeline is bound but its commands might not
have been emitted into the command buffer yet.

Let's just revert commit 0d46e40467
since it didn't seem to yield any performance improvement.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0d46e404 ("anv: limit URB reconfigurations when using blorp")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110535
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-29 11:41:27 +00:00
Kenneth Graunke 9dcf90d7ba intel/fs: Don't emit empty ELSE blocks.
While we can clean this up later, it's trivial to not generate the
stupid code in the first place, which saves some optimization work.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-28 22:36:09 -07:00
Tapani Pälli 376c3e8f87 anv: expose VK_EXT_queue_family_foreign on Android
VK_ANDROID_external_memory_android_hardware_buffer requires this
extension. It is safe to enable it since currently aux usage is
disabled for ahw buffers.

Fixes following dEQP extension dependency test on Android:
   dEQP-VK.api.info.device#extensions

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-29 07:31:02 +03:00
Jason Ekstrand 934f178341 anv/descriptor_set: Don't fully destroy sets in pool destroy/reset
In 105002bd2d, we fixed a memory leak bug where we weren't properly
destroying descriptor when destroying/resetting a descriptor pool.
However, the only real leak that happened was that we we take a
reference to the descriptor set layout in the descriptor set and we
weren't dropping our reference.  Everything else in the descriptor set
is tied to the pool itself and doesn't need to be freed on a per-set
basis.  This commit changes the destroy/reset functions to only bother
walking the list of sets to unref the layouts and otherwise we just
assume that the whole-pool destroy/reset takes care of the rest.

Now that we're doing more non-trivial things with descriptor sets such
as allocating things with util_vma_heap, per-set destruction is starting
to show up on perf traces.  This takes reset back to where it's supposed
to be as a cheap whole-pool operation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-26 05:40:28 +00:00
Jason Ekstrand baf4802e3e anv: Better handle 32-byte alignment of descriptor set buffers
In c520f4dec9, we chose to align the sizes of descriptor set buffers to
32 bytes.  We have to align the descriptor set buffer to 32B so that
it's valid for using with push constants.  We align the size as well so
we don't leave lots of holes with util_vma_heap_alloc.  Unfortunately,
we were only aligning it for alloc and not for free so we were still
creating piles of holes when we delete descriptor sets.  This causes
terrible perf for the allocator once we've deleted piles of descriptor
sets.

This commit reworks the code so that we align the descriptor set buffer
size to 32B for both alloc and free.  The result is that it takes the
new crucible vkResetDescriptorPool from 104.567719 to 2.898354 seconds.

Fixes: c520f4dec9 "anv: Add a concept of a descriptor buffer"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110497
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-26 05:40:28 +00:00
Caio Marcelo de Oliveira Filho 055f6281d4 intel/fs: Don't handle texop_tex for shaders without implicit LOD
These will be lowered by nir_lower_tex() with the
lower_tex_when_implicit_lod_not_supported, so don't need the extra
handling here.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-25 12:13:06 -07:00
Topi Pohjolainen ff642fb0e6 intel/compiler/fs/icl: Use dummy masked urb write for tess eval
One cannot write the URB arbitrarily and therefore the message
has to be carefully constructed. The clever tricks originate
from Kenneth and Jason, I'm just writing the patch.

Fixes GPU hangs on ICL with Vulkan CTS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2019-04-25 22:00:43 +03:00
Juan A. Suarez Romero b06ae53606 Revert "intel/compiler: split is_partial_write() into two variants"
This reverts commit 40b3abb4d1.

It is not clear that this commit was entirely correct, and unfortunately
it was pushed by error.

CC: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-25 09:19:10 +02:00
Lionel Landwerlin f15409ee55 i965: fix icelake performance query enabling
This was a rebase issue which lost of change to a file moved from i965
to src/intel/perf.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 134e750e16 ("i965: extract performance query metrics")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-25 01:11:54 +00:00
Dave Airlie 3323cf08f0 intel/compiler: fix uninit non-static variable. (v2)
Pointed out by coverity.

v2: init nir_locals also.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-25 06:06:57 +10:00
Rafael Antognolli f2041d2a92 intel/isl: Resize clear color buffer to full cacheline
Fixes MCS fast clear gpu hangs with Vulkan CTS on ICL in CI.

v2 (Nanley): In the title s/Align/Resize/

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-04-24 08:56:42 +03:00
Jason Ekstrand 45957c05b0 anv/descriptor_set: Properly align descriptor buffer to a page
Instead of aligning and then taking inline uniforms into account, we
need to take inline uniforms into account and then align to a page.
Otherwise, we may not be aligned to a page and allocation may fail.

Fixes: 43f40dc7cb "anv: Implement VK_EXT_inline_uniform_block"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-24 05:40:27 +00:00
Jason Ekstrand 3d33c13eca anv/descriptor_set: Only vma_heap_finish if we have a descriptor buffer
Fixes: 7bb34ecff9 "anv: release memory allocated by bo_heap when..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-24 05:40:27 +00:00
Jason Ekstrand 0bc1942c9d anv/descriptor_set: Destroy sets before pool finalization
Fixes: 105002bd2d "anv: destroy descriptor sets when pool gets..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-24 05:40:27 +00:00
Jason Ekstrand 6be603edf7 anv/descriptor_set: Unlink sets from the pool in set_destroy
anv_descriptor_pool_free_set is called on the clean-up path of
anv_descriptor_set_create and the set may not have been added to the
pool's list of sets yet.  While we're here, we move adding it to that
list into set_create for symmetry.

Fixes: 105002bd2d "anv: destroy descriptor sets when pool gets..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-24 05:40:27 +00:00
Ian Romanick 21223acf7d intel/fs: Fix D to W conversion in opt_combine_constants
Found by GCC warning:

src/intel/compiler/brw_fs_combine_constants.cpp: In function ‘bool needs_negate(const fs_reg*, const imm*)’:
src/intel/compiler/brw_fs_combine_constants.cpp:306:34: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
       return ((reg->d & 0xffffu) < 0) != (imm->w < 0);
               ~~~~~~~~~~~~~~~~~~~^~~

The result of the bit-and is a 32-bit value with the top bits all zero.
This will never be < 0.  Instead of masking off the bits, just cast to
int16_t and let the compiler handle the actual conversion.

Fixes: e64be391dd ("intel/compiler: generalize the combine constants pass")
Cc: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-23 19:48:33 -07:00
Ian Romanick 26391cceaa intel/compiler: Lower ffma on Gen4 and Gen5
flrp32 is also a 3-source instruction, but there is another pending
series that handles that for Gen4 and Gen5.

v2: Rebase on "intel/compiler: Don't have sepearate, per-Gen
nir_options"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-23 17:50:28 -07:00
Ian Romanick fd1fa9afc7 intel/compiler: Don't have sepearate, per-Gen nir_options
Instead, just have separate scalar vs. vector nir_options and do
per-Gen "fix ups".

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-23 17:50:16 -07:00
Bas Nieuwenhuizen f2e0f5c3c4 vulkan/wsi: Add X11 adaptive sync support based on dri options.
The dri options are optional. When the dri options are not provided
the WSI will not  use adaptive sync.

FWIW I think for xf86-video-amdgpu this still requires an X11 config
option, so only people who opt in can get possible regressions from this.

So then the remaining question is: why do this in the WSI?

It has been suggested in another MR that the application sets this.
However, I disagree with that as I don't think we'll ever get a
reasonable set of applications setting it.

The next questions is whether this can be a layer. It definitely
can be as implemented now. However, I think this generally fits
well with the function of the WSI. Furthemore, for e.g. the DISPLAY
WSI this is much harder to do in a layer.

Of course, most of the WSI could almost be a layer, but I think
this still fits best in the WSI.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-23 23:49:39 +00:00
Lionel Landwerlin 0fb0058f18 anv: fix argument name for vkCmdEndQuery
Doesn't fix anything but it's not the right function prototype.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 673f33c77d ("anv: Implement CmdBegin/EndQueryIndexed")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-04-24 04:33:26 +08:00
Lionel Landwerlin b1ba7ffdbd intel: workaround VS fixed function issue on Gen9 GT1 parts
The issue is noticeable in the
dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d
test where a triangle goes missing when we use the maximum number of
URB entries as specified by the documentation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107505
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-23 13:41:20 +08:00
Matt Turner 4ec258ac3c intel/compiler: Improve fix_3src_operand()
Allow ATTR and IMM sources unconditionally (ATTR are just GRFs, IMM will
be handled by opt_combine_constants(). Both are already allowed by
opt_copy_propagation().

Also allow FIXED_GRF if the regioning is 8,8,1. Could also allow other
stride=1 regions (e.g., 4,4,1) and scalar regions but I don't think
those occur. This is sufficient to allow a pass added in a future commit
(fs_visitor::lower_linterp) to avoid emitting extra MOV instructions.

I removed the 'src.stride > 1' case because it seems wrong: 3-src
instructions on Gen6-9 are align16-only and can only do stride=1 or
stride=0. A run through Jenkins with an assert(src.stride <= 1) never
triggers, so it seems that it was dead code.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-04-22 16:54:31 -07:00
Matt Turner 8aae7a3998 intel/compiler: Add unit tests for sat prop for different exec sizes
The two new unit tests verify that propagating a saturate between
instructions of different exec sizes does not happen.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-04-22 16:54:21 -07:00
Matt Turner 54d4d34b96 intel/compiler: Use SIMD16 instructions in fs saturate prop unit test
Will allow us to test that propagation between instructions of different
exec sizes does not happen (in the next commit).

The stray-looking change in intervening_dest_write is to adjust the size
of the texture result to keep the test functioning identically when the
instructions' exec sizes are doubled. Without the change, the texture
does not overwrite the destination fully as the unit test intends.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-04-22 16:54:17 -07:00
Rafael Antognolli 70e03e220c intel/fs: Remove fs_generator::generate_linterp from gen11+.
We now have a lowering pass that will do this at the fs_visitor level,
so we can remove this code from gen11+.

v2: Reduce size of the "i" array from 4 to 2 (Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-22 16:54:00 -07:00
Rafael Antognolli 9ea90aae1e intel/fs: Add a lowering pass for linear interpolation.
On gen11, instead of using a PLN instruction, we convert
FS_OPCODE_LINTERP to 2 or 4 multiply adds. That is done in the
fs_generator code.

This patch adds a lowering pass that does the same thing at the
fs_visitor. It also drops the usage of NF types, since we don't need the
extra precision and it lets us skip the accumulator. With all that, some
optimizations will still be run on the generated code, and we should get
better scheduling.

v2: Update comment about saturation and conditional mod (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-22 16:54:00 -07:00
Rafael Antognolli c0504569ea intel/fs: Move the scalar-region conversion to the generator.
Move the scalar-region conversion from the IR to the generator, so it
doesn't affect the Gen11 path. We need the non-scalar regioning
for a later lowering pass that we are adding.

v2: Better commit message (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-22 16:54:00 -07:00
Rafael Antognolli 0778748eba intel/fs: Only propagate saturation if exec_size is the same.
Otherwise it could propagate the saturation from a SIMD16 instruction
into a SIMD8 instruction. With that, only part of the destination
register, which is the source of the move with saturation, would have
been updated.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-04-22 16:53:55 -07:00
Ian Romanick a6ccc4c0c8 intel/fs: Add support for float16 to the fsign optimizations
Commit ad98fbc217 ("intel/fs: Refactor code generation for nir_op_fsign
to its own function") criss-crossed with c2b8fb9a81 ("anv/device:
expose VK_KHR_shader_float16_int8 in gen8+"), and I was not paying
enough attention when I rebased.  This adds back the float16 changes and
enables the optimization.

v2: Incorporate more changes from 19cd2f5deb and a8d8b1a139 that I
missed in the previous version.

Fixes: ad98fbc217 ("intel/fs: Refactor code generation for nir_op_fsign to its own function")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110474
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
2019-04-20 20:49:34 -07:00
Jason Ekstrand 828ec41154 anv: Rework the descriptor set layout create loop
Previously, we were storing the per-binding create info pointer in the
immutable_samplers field temporarily so that we can switch the order in
which we walk the loop.  However, now that we have multiple arrays of
structs to walk, it makes more sense to store an index of some sort.
Because we want to leave immutable_samplers as NULL for undefined
bindings, we store index + 1 and then subtract one later.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 23:26:41 +00:00
Jason Ekstrand 2b388c3d04 anv: Ignore descriptor binding flags if bindingCount == 0
I missed this on the first go round.  The bindingCount field of
VkDescriptorSetLayoutBindingFlagsCreateInfoEXT is allowed to be zero
which means the flags array is ignored.

Fixes: d6c9bd6e01 "anv: Put binding flags in descriptor set layouts"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 23:26:41 +00:00
Jason Ekstrand 9ce7c29724 anv/nir: Add a central helper for figuring out SSBO address formats
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand 6e230d7607 anv: Implement VK_EXT_descriptor_indexing
Now that everything is in place to do bindless for all resource types
except input attachments and UBOs, VK_EXT_descriptor_indexing is
"trivial".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand d6c9bd6e01 anv: Put binding flags in descriptor set layouts
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand c0d9926df7 anv: Use bindless handles for images
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand 83af92e593 intel/fs: Add support for bindless image load/store/atomic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand e6803f6b6f anv: Use bindless textures and samplers
This commit changes anv to put bindless handles and sampler pointers
into the descriptor buffer and use those instead of bindful when we run
out of binding table space.  This "spilling" of descriptors allows to to
advertise an almost unbounded number of images and samplers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand bf61f057f7 anv: Pass the plane into lower_tex_deref
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand f16fcb9db7 anv: Use write_image_view to initialize immutable samplers
Instead of setting it manually, call the helper.  When setting
descriptor sets becomes more complicated than just setting some struct
values, this will keep immutable sampler handling correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand e612c3b9bf anv: Count the number of planes in each descriptor binding
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand 843286d324 intel/fs: Add support for bindless texture ops
We add two new texture sources for bindless surface and sampler handles.
Bindless surface handles are expected to be pre-shifted so that the
20-bit surface state table index is in the top 20 bits of the 32-bit
handle.  This lets us avoid any extra shifts in the shader.  Bindless
sampler handles are 32-byte aligned byte offsets from general state base
address.  We use 32-byte aligned instead of 16-byte aligned to avoid
having to use more indirect messages than needed.  It means we can't
tightly pack samplers but that's probably not a big deal.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand 2edf29b933 intel,nir: Lower TXD with a bindless sampler
When we have a bindless sampler, we need an instruction header.  Even in
SIMD8, this pushes the instruction over the sampler message size maximum
of 11 registers.  Instead, we have to lower TXD to TXL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00