Just got back from a week in Hawaii with the family. It was great to finally take a break and come back refreshed instead of overwhelmed.
Two weeks ago the good news for GEM was getting ReadPixels to go fast. It should be fast under our architecture, since we can map the pages and read them cached. We just hadn't gotten around to making sure that was the case, and it turned out we had some traps. Conformance tests that write a pixel, read the value back, write a new pixel, and so on were taking hours instead of the minutes they should. The fix was to make the fallback spans code (which is how our ReadPixels is handled, though that's pretty pessimal) use pread instead of mapping the buffer and reading the contents out. This gave the kernel the information it needed to avoid clflushing the whole buffer when you only needed a little bit of it. That fixed the conformance tests. Once I combined that with a little cache of pread data so I wasn't making a syscall per pixel, I was getting faster large-scale ReadPixels out of GEM than we'd seen out of the aperture reads the driver has always used. And there's still significant room for improvement.
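To make the "little cache of pread data" idea concrete, here's a rough sketch of the shape of it. This is not the driver code: it uses plain POSIX pread() on an ordinary file descriptor as a stand-in for the driver's pread path, and the span_cache names, the syscalls counter, and the window size are all made up for illustration. The point is just that one pread fills a window, and subsequent per-pixel reads are served from that window until you step outside it.

```c
#define _XOPEN_SOURCE 700 /* expose pread() under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names throughout; a sketch, not the driver's actual code. */
struct span_cache {
    int fd;                 /* descriptor standing in for the buffer object */
    size_t window;          /* bytes requested per pread() */
    off_t start;            /* offset in the buffer of buf[0] */
    size_t valid;           /* number of valid bytes currently in buf */
    unsigned long syscalls; /* pread() calls made, for illustration only */
    uint8_t *buf;
};

static struct span_cache *span_cache_create(int fd, size_t window)
{
    assert(window >= sizeof(uint32_t));
    struct span_cache *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    c->buf = malloc(window);
    if (!c->buf) {
        free(c);
        return NULL;
    }
    c->fd = fd;
    c->window = window;
    c->start = 0;
    c->valid = 0;
    c->syscalls = 0;
    return c;
}

static void span_cache_destroy(struct span_cache *c)
{
    if (c) {
        free(c->buf);
        free(c);
    }
}

/* Call after anything writes to the buffer, so stale pixels aren't returned. */
static void span_cache_invalidate(struct span_cache *c)
{
    c->valid = 0;
}

/* Read one 32-bit pixel at byte offset `off`.
 * Returns 0 on success, -1 on error or if fewer than 4 bytes are available. */
static int span_cache_read_pixel(struct span_cache *c, off_t off, uint32_t *out)
{
    int hit = off >= c->start &&
              off + (off_t)sizeof(*out) <= c->start + (off_t)c->valid;

    if (!hit) {
        /* Miss: one syscall refills the whole window starting at `off`. */
        ssize_t n = pread(c->fd, c->buf, c->window, off);
        c->syscalls++;
        if (n < (ssize_t)sizeof(*out)) {
            c->valid = 0;
            return -1;
        }
        c->start = off;
        c->valid = (size_t)n;
    }
    memcpy(out, c->buf + (off - c->start), sizeof(*out));
    return 0;
}
```

With a 128-byte window, walking 64 consecutive 4-byte pixels costs two pread calls instead of 64. A write-then-read test pattern still has to call span_cache_invalidate() after each write, so the window size is a tradeoff between bytes copied per miss and syscalls saved on long spans.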
ReadPixels shouldn't be important. In theory apps aren't doing this, since ReadPixels is so slow on just about everyone's hardware that you should figure out some way to avoid using it. But it turned out that at least gl-117 was actually using it, and it was a performance bottleneck. That bottleneck is now almost gone (down to ~10% of the profile). So in this case conformance testing ended up forcing us to fix a real performance problem for an app.
I also fixed a couple of performance regressions seen on the 965 with openarena -- one was a failure in the drm-gem-merge branch, and the other was cache ping-ponging on vertex/index buffer uploads since we started using GTT mappings. It's now back up to 76 fps from 28 fps.
With our recent fixes plus the changes in response to lkml review, I'm ready to take another stab at getting into linux-next. That's the gating step for releasing libdrm, then the 2d driver, then making an appropriate mesa tarball, then getting all the new hotness into distros. It's been a long time coming...