Bringing performance to the next level
After a lot of profiling and thinking about performance, there is still quite some optimization possible in various subsystems.
General
Collect statistics on resource usage and work out how to optimize the most common cases
Chunk rendering
There are 2 stages:
Meshing
- reduce memory allocation to a minimum
- use an index buffer for triangle rendering (saves a ton of temporary memory on the cpu side and permanent memory on the gpu side)
- transform texture data first
- pre bake texture uv, shaderTextureId
- pre bake all models and index all faces. Just add the position and face index? Might be slower due to caching
- rewrite fluid renderer that is clean, testable and does not allocate memory on the heap
- mesher (good for block entities): if all neighbour blocks are full opaque, can the renderer be left out no matter what? -> yes?
- use shared buffer (create one buffer and reuse it for uploading to the gpu)
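The index buffer and shared buffer ideas above could be combined roughly as follows. This is only a sketch, not the actual mesher: the class name `IndexedQuadMesh`, the vertex layout of 5 floats (x, y, z, u, v) and the growth strategy are all assumptions made for illustration. Each quad is stored as 4 vertices plus 6 indices instead of 6 full vertices, and `reset()` keeps the backing arrays so the same builder can be reused for the next chunk without allocating.

```java
import java.util.Arrays;

/** Hypothetical mesh builder: 4 vertices + 6 indices per quad instead of 6 full vertices. */
final class IndexedQuadMesh {
    /** Assumed vertex layout: x, y, z, u, v. */
    static final int FLOATS_PER_VERTEX = 5;

    private float[] vertices = new float[4 * FLOATS_PER_VERTEX * 64];
    private int[] indices = new int[6 * 64];
    private int vertexFloats = 0;
    private int indexCount = 0;

    /** Adds one quad; corners holds 4 vertices in winding order. */
    void addQuad(float[] corners) {
        if (corners.length != 4 * FLOATS_PER_VERTEX) {
            throw new IllegalArgumentException("expected 4 vertices");
        }
        // Grow only when needed; the arrays survive reset(), so this is rare.
        if (vertexFloats + corners.length > vertices.length) {
            vertices = Arrays.copyOf(vertices, vertices.length * 2);
        }
        if (indexCount + 6 > indices.length) {
            indices = Arrays.copyOf(indices, indices.length * 2);
        }
        int base = vertexFloats / FLOATS_PER_VERTEX; // index of the quad's first vertex
        System.arraycopy(corners, 0, vertices, vertexFloats, corners.length);
        vertexFloats += corners.length;
        // Two triangles sharing the diagonal: (0, 1, 2) and (2, 3, 0).
        indices[indexCount++] = base;
        indices[indexCount++] = base + 1;
        indices[indexCount++] = base + 2;
        indices[indexCount++] = base + 2;
        indices[indexCount++] = base + 3;
        indices[indexCount++] = base;
    }

    /** Clears the mesh but keeps the backing arrays for reuse. */
    void reset() {
        vertexFloats = 0;
        indexCount = 0;
    }

    int vertexCount() { return vertexFloats / FLOATS_PER_VERTEX; }

    int indexCount() { return indexCount; }

    /** Copy of the used part of the index array (for uploading or inspection). */
    int[] indices() { return Arrays.copyOf(indices, indexCount); }

    /** Bytes for the given number of quads with 4-byte floats and 4-byte indices. */
    static int bytesIndexed(int quads) {
        return quads * (4 * FLOATS_PER_VERTEX * 4 + 6 * 4);
    }

    /** Bytes for the same quads when every triangle carries its own 3 vertices. */
    static int bytesUnindexed(int quads) {
        return quads * 6 * FLOATS_PER_VERTEX * 4;
    }
}
```

With this assumed layout the saving per quad is modest; it grows with the size of a vertex, since the index cost stays fixed at 6 integers per quad.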
Rendering
Rendering has 2 bottlenecks. One (a pretty small one) is meshing the chunks: frame stuttering is sometimes clearly visible when the frustum changes at a render distance of 64, probably because of updating the queues.
The other bottleneck is moving in general, which updates all queues.
The main bottleneck, however, is the gpu when drawing with a high render distance.
- optimize shader
- remove animation buffer #113 (closed)
- use mesh optimizer
- try loading meshes multi threaded? https://stackoverflow.com/questions/11097170/multithreaded-rendering-on-opengl (but that is not actually a bottleneck)
- use vulkan
- render transparent meshes together with opaque ones (not after) and disable alpha blending on them
- performance: transform block entities async and draw at the same time
- use z buffer of previous frame to calculate object occlusion and cull it at beginning of frame
- fog culling
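Fog culling from the list above could look roughly like this. It is a sketch under assumptions: sections are taken to be 16x16x16 blocks, fog is assumed to be fully opaque beyond a distance `fogEnd`, and the names `FogCulling` and `isHiddenByFog` are made up for illustration. A section is skipped when even its nearest point to the camera lies beyond the fog end.

```java
/** Hypothetical fog culling check for cubic chunk sections. */
final class FogCulling {
    /** Assumed section edge length in blocks. */
    static final int SECTION_SIZE = 16;

    private static double clamp(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    /**
     * Returns true if the section at section coordinates (sx, sy, sz) is
     * entirely farther away from the camera than fogEnd.
     */
    static boolean isHiddenByFog(double camX, double camY, double camZ,
                                 int sx, int sy, int sz, double fogEnd) {
        double minX = (double) sx * SECTION_SIZE;
        double minY = (double) sy * SECTION_SIZE;
        double minZ = (double) sz * SECTION_SIZE;

        // Nearest point of the section's bounding box to the camera.
        double nearestX = clamp(camX, minX, minX + SECTION_SIZE);
        double nearestY = clamp(camY, minY, minY + SECTION_SIZE);
        double nearestZ = clamp(camZ, minZ, minZ + SECTION_SIZE);

        double dx = camX - nearestX;
        double dy = camY - nearestY;
        double dz = camZ - nearestZ;

        // Compare squared distances to avoid a square root per section.
        return dx * dx + dy * dy + dz * dz > fogEnd * fogEnd;
    }
}
```

Using the nearest point rather than the section center is the conservative choice: a section that only partly reaches into the visible range is still drawn.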
Joining
- verify player data with yggdrasil on network thread (tab)
- light calculation
- downloading skins
- join while loading (but not sure if that's a good idea)
- prepare chunk mesh while waiting for position (but that needs the position for render distance etc)
- check if a (PixLyzer) block is waterlogged
- preload section occlusion for some chunks? they cause lag stuttering initially
Loading
- cache more #120
- speed up texture reading (maybe fork PNGDecoder)
- don't load mcmeta on same thread #113 (closed)
- speed up jackson (maybe with afterburner or dtos; especially loading JsonObject is slow)
- jar assets: don't use tar input stream (they are so bloated!)
- improve colormap reading
- improve rgba copying
- compiling shaders
- init phase: start everything async and then wait for all tasks to complete
- improve mbf reading
- block model baking is super slow
- javafx loading bar? It is eating resources
- use FluidState for caching fluids (improves physics aabb checking, and chunk prototype creation)
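The "start everything async and then wait for all tasks to complete" idea for the init phase can be sketched with the JDK's `CompletableFuture`. The class `AsyncInit` and the method `runAll` are hypothetical names, and a plain fixed thread pool stands in for whatever worker the project actually uses.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical init helper: submit every task first, then block until all are done. */
final class AsyncInit {
    static void runAll(List<Runnable> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Start everything before waiting on anything.
            CompletableFuture<?>[] futures = tasks.stream()
                    .map(task -> CompletableFuture.runAsync(task, pool))
                    .toArray(CompletableFuture[]::new);
            // join() returns once every task has completed and rethrows
            // (wrapped in a CompletionException) if any task failed.
            CompletableFuture.allOf(futures).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

This only helps for tasks that are independent of each other; tasks with dependencies (for example, model baking that needs loaded textures) would have to be chained instead of submitted all at once.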
Booting
- speed up javafx
- make TaskWorker work as soon as entries are added (auto work, kutil)
- use binary css (bss)
Physics
They actually slow down the entire system when having tons of entities
- optimize fluid physics (especially water)
Light
Calculating skylight is a massive slowdown
Network
- calculate occlusion per chunk (is on demand now)
- uncompressing chunks
- parsing packet, network in general
HUD
Let's not talk about it #119
To be updated...
Edited by Moritz Zwerger