b16f06fd0593099aad74775a41cf74d4c09c3f6a
Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to llvm not recognizing it's all the same fetch, since it would have been possible some of the fetches getting replaced with zeros in case vector size exceeds remaining fetch count - the values of such fetches don't matter at all though). Also, for elts gathering, use vectorized code as well. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). v3: skip the fake index buffer, not needed due to the jit code never seeing the real index buffer in the first place. Fix a bug with mask expansion (needs SExt, not ZExt). Also, be really really careful to keep the behavior the same, even in cases where it looks wrong, and add comments why the code is doing the seemingly wrong stuff... Fortunately it's not actually more complex in the end... Also change function order slightly just to make the diff more readable. No piglit change. Passes some internal testing with another api too... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
File: docs/README.WIN32 Last updated: 21 June 2013 Quick Start ----- ----- Windows drivers are build with SCons. Makefiles or Visual Studio projects are no longer shipped or supported. Run scons libgl-gdi to build gallium based GDI driver. This will work both with MSVS or Mingw. Windows Drivers ------- ------- At this time, only the gallium GDI driver is known to work. Source code also exists in the tree for other drivers in src/mesa/drivers/windows, but the status of this code is unknown. Recipe ------ Building on windows requires several open-source packages. These are steps that work as of this writing. - install python 2.7 - install scons (latest) - install mingw, flex, and bison - install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs get pywin32-218.4.win-amd64-py2.7.exe - install git - download mesa from git see http://www.mesa3d.org/repository.html - run scons General ------- After building, you can copy the above DLL files to a place in your PATH such as $SystemRoot/SYSTEM32. If you don't like putting things in a system directory, place them in the same directory as the executable(s). Be careful about accidentially overwriting files of the same name in the SYSTEM32 directory. The DLL files are built so that the external entry points use the stdcall calling convention. Static LIB files are not built. The LIB files that are built with are the linker import files associated with the DLL files. The si-glu sources are used to build the GLU libs. This was done mainly to get the better tessellator code. If you have a Windows-related build problem or question, please post to the mesa-dev or mesa-users list.
Description
Languages
C
75.5%
C++
17.2%
Python
2.7%
Rust
1.8%
Assembly
1.5%
Other
1%