Michel Hermier reported a libdrm segfault (and kernel crash) on nv40 using
gallium:
http://www.mail-archive.com/nouveau@lists.freedesktop.org/msg06563.html
It turns out these were caused by some missing WAIT_RING calls (or wrong
computation of the WAIT_RING sizes). Unlike all other libdrm_nouveau users,
nvfx gallium tried to use a minimum number of WAIT_RING calls: one WAIT_RING
could apply to many methods on different code paths, spread across several
functions. This made it too tricky to find out which WAIT_RING was missing
or wrong.
By restoring BEGIN_RING, we force one WAIT_RING per method, and it's much
easier to check if the free size required in the pushbuffer is correct. As
curro said, "let's keep it simple for the maintainers until the big
bottlenecks are gone".
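
The idea of one WAIT_RING per method can be sketched roughly as follows.
This is a toy model only: the toy_* names, the buffer size and the header
layout are assumptions made for illustration, not the libdrm_nouveau API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy pushbuffer; every name here is hypothetical. */
#define TOY_PUSHBUF_WORDS 16

struct toy_channel {
    uint32_t buf[TOY_PUSHBUF_WORDS];
    unsigned cur;      /* index of the next free word */
    unsigned flushes;  /* how many times the buffer was submitted */
};

/* Submit the buffer; the toy version just resets it and counts. */
static void toy_flush(struct toy_channel *chan)
{
    chan->cur = 0;
    chan->flushes++;
}

/* Ensure at least `words` words are free, flushing if they are not. */
static void toy_wait_ring(struct toy_channel *chan, unsigned words)
{
    assert(words <= TOY_PUSHBUF_WORDS);
    if (TOY_PUSHBUF_WORDS - chan->cur < words)
        toy_flush(chan);
}

/* Append one word; the caller must have reserved space beforehand. */
static void toy_out_ring(struct toy_channel *chan, uint32_t data)
{
    assert(chan->cur < TOY_PUSHBUF_WORDS);
    chan->buf[chan->cur++] = data;
}

/*
 * Reserve room for the header plus `size` payload words, then emit the
 * header. The reservation sits next to the method it covers, so its
 * size is easy to check. The header layout is illustrative only.
 */
static void toy_begin_ring(struct toy_channel *chan, uint32_t mthd,
                           unsigned size)
{
    toy_wait_ring(chan, size + 1);
    toy_out_ring(chan, ((uint32_t)size << 18) | mthd);
}
```

A caller would issue toy_begin_ring(chan, mthd, 3) followed by exactly
three toy_out_ring() calls, so a wrong count shows up in one place
instead of being spread across several functions.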
Benchmarked on nv35 with openarena, nexuiz and ut2004: no performance
regression.
The core of this patch was made with Coccinelle, with minor manual fixes
made on top.
Tested-by: Michel Hermier <hermier@frugalware.org>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
This is the hack for input interactivity of frontbuffer rendering
(like we do for backbuffer at intelDRI2Flush()) by waiting for the n-2
frame to complete before starting a new one. However, for an
application doing multiple contexts or regular rebinding of a single
context, this would end up lockstepping the CPU to the GPU because
every unbind was considered the end of a frame.
Improves WOW performance on my Ironlake by 48.8% (+/- 2.3%, n=5)
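
A minimal sketch of the "wait for frame n-2" throttle described above,
assuming frames are identified by increasing integers. The toy_* names
are hypothetical, and a real driver would block on a GPU object rather
than return an id.

```c
#include <assert.h>

/* Hypothetical throttle state; not the actual intel driver code. */
struct toy_throttle {
    int older;  /* id of frame n-2, or -1 if there is none yet */
    int newer;  /* id of frame n-1, or -1 if there is none yet */
};

static void toy_throttle_init(struct toy_throttle *t)
{
    t->older = -1;
    t->newer = -1;
}

/*
 * Called when frame `frame_id` is about to start. Returns the id of the
 * frame the CPU should wait for (frame n-2), or -1 if nothing needs to
 * be waited on yet.
 */
static int toy_begin_frame(struct toy_throttle *t, int frame_id)
{
    int wait_on = t->older;

    assert(frame_id >= 0);
    t->older = t->newer;
    t->newer = frame_id;
    return wait_on;
}
```

In this model the CPU can run up to two frames ahead of the GPU. If
toy_begin_frame() were called on every context unbind rather than once
per real frame, the waits would come far more often, which is the
lockstep behaviour the message describes.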
Given a dispatch slot, entry_get_public returns the address of the
corresponding public entry point. There may be more than one of them.
But since they are all equivalent, it is fine to return any one of them.
With entry_get_public, the address of any public entry point can be
calculated at runtime when an assembly dispatcher is used. There is no
need to have a mapping table in that case. This omits the unnecessary
relocations from the binary.
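
As an illustration of computing an entry point address at runtime, the
sketch below assumes the assembly stubs have a fixed size and are laid
out consecutively. The stub size, the slot count and the toy_* names are
assumptions; a byte array stands in for the real stub code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STUB_SIZE 32  /* assumed fixed size of one stub, in bytes */
#define TOY_NUM_SLOTS 8   /* assumed number of dispatch slots */

/* Stand-in for the contiguous block of assembly stubs. */
static unsigned char toy_entry_start[TOY_STUB_SIZE * TOY_NUM_SLOTS];

/*
 * Return the address of the public entry point for `slot`. The address
 * is pure arithmetic on the start of the stub block, so no per-entry
 * mapping table (and no relocation per entry) is needed.
 */
static const void *toy_entry_get_public(int slot)
{
    assert(slot >= 0 && slot < TOY_NUM_SLOTS);
    return toy_entry_start + (size_t)slot * TOY_STUB_SIZE;
}
```

With these assumed values, slot 3 resolves to an address 96 bytes past
the start of the stub block.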
Hidden entries are just like normal entries except that they are not
exported. Since it is not always possible to hide them, and two hidden
aliases can share the same entry, the names of hidden aliases are mangled
to '_dispatch_stub_<slot>'.
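
The mangling scheme can be sketched like this. The helper name is
hypothetical and only shows how a slot number maps to a mangled name; it
is not the generator's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write the mangled name for a hidden alias of `slot` into `buf`.
 * Returns the value of snprintf(), i.e. the length the full name needs.
 * Aliases that share a slot get the same mangled name, so they can
 * share one entry.
 */
static int toy_hidden_alias_name(char *buf, size_t len, int slot)
{
    assert(slot >= 0);
    return snprintf(buf, len, "_dispatch_stub_%d", slot);
}
```

For slot 42 this yields the string "_dispatch_stub_42", whatever the
alias was originally called.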
Split out function name generation from _c_decl to _c_function, and use
it everywhere. Add an optional 'export' argument to _c_decl. It is
prepended to the returned string.
Really no idea why I didn't see this before, but these values were opposite
the register spec.
This seems to fix rv530 HiZ on my laptop; will re-enable in the next commit.
Signed-off-by: Dave Airlie <airlied@redhat.com>