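Notes from reading the Unity3D documentation. My takeaway: for 3D scenes, use layered distance culling. Put small objects on one layer so that they are culled at a fairly short distance, and put large objects on another layer so that they are culled only when very far away, or not culled at all. Medium-sized objects fall between the two.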

The documentation follows:

Good performance is critical to the success of many games. Below are some simple guidelines for maximizing the speed of your game’s rendering.

Locate high graphics impact

The graphical parts of your game primarily affect two systems of the computer: the GPU and the CPU. The first rule of any optimization is to find where the performance problem is, because strategies for optimizing for GPU vs. CPU are quite different (and can even be opposite - for example, it’s quite common to make the GPU do more work while optimizing for CPU, and vice versa).

Common bottlenecks and ways to check for them:

  • GPU is often limited by fillrate or memory bandwidth.
    • Lower the display resolution and run the game. If a lower display resolution makes the game run faster, you may be limited by fillrate on the GPU.
  • CPU is often limited by the number of batches that need to be rendered.
    • Check “batches” in the Rendering Statistics window. The more batches are being rendered, the higher the cost to the CPU.
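
The resolution test above can be written down as a rough rule of thumb. The following is a minimal sketch in plain Python, not part of Unity: the function name and the 15% threshold are assumptions chosen only for illustration, and the frame times are expected to be measured by hand.

```python
def likely_fillrate_bound(ms_full_res, ms_low_res, min_gain=0.15):
    """Return True if lowering the resolution sped the frame up noticeably.

    ms_full_res / ms_low_res are average frame times in milliseconds,
    measured at the original and at the reduced display resolution.
    min_gain is an assumed threshold (15% here), not a Unity constant.
    """
    if ms_full_res <= 0:
        raise ValueError("frame time must be positive")
    gain = (ms_full_res - ms_low_res) / ms_full_res
    return gain >= min_gain


# Hypothetical measurements: 33 ms at full resolution, 20 ms at half.
print(likely_fillrate_bound(33.0, 20.0))
```

If the frame time barely changes, the bottleneck is more likely on the CPU side (for example, the batch count) or elsewhere.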

Less-common bottlenecks:

  • The GPU has too many vertices to process. The number of vertices that is acceptable to ensure good performance depends on the GPU and the complexity of vertex shaders. Generally speaking, aim for no more than 100,000 vertices on mobile. A PC manages well even with several million vertices, but it is still good practice to keep this number as low as possible through optimization.
  • The CPU has too many vertices to process. This could be in skinned meshes, cloth simulation, particles, or other game objects and meshes. As above, it is generally good practice to keep this number as low as possible without compromising game quality. See the section on CPU optimization below for guidance on how to do this.
  • If rendering is not a problem on the GPU or the CPU, there may be an issue elsewhere - for example, in your script or physics. Use the Unity Profiler to locate the problem.

CPU optimization

To render objects on the screen, the CPU has a lot of processing work to do: working out which lights affect that object, setting up the shader and shader parameters, and sending drawing commands to the graphics driver, which then prepares the commands to be sent off to the graphics card.

All this “per object” CPU usage is resource-intensive, so if you have lots of visible objects, it can add up. For example, if you have a thousand triangles, it is much easier on the CPU if they are all in one mesh, rather than in one mesh per triangle (adding up to 1000 meshes). The cost of both scenarios on the GPU is very similar, but the work done by the CPU to render a thousand objects (instead of one) is significantly higher.
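
The thousand-triangle example can be put into numbers with a toy cost model. This is only a sketch: the per-object and per-triangle costs below are made-up figures, not measurements, and real costs depend on the hardware and driver.

```python
def frame_cost(meshes, triangles_per_mesh,
               per_object_us=50.0, per_triangle_us=0.01):
    """Return (cpu_us, gpu_us) for a toy model of rendering cost.

    per_object_us and per_triangle_us are hypothetical constants used
    only to illustrate that CPU cost scales with the object count while
    GPU cost scales with the triangle count.
    """
    cpu_us = meshes * per_object_us
    gpu_us = meshes * triangles_per_mesh * per_triangle_us
    return cpu_us, gpu_us


one_mesh = frame_cost(meshes=1, triangles_per_mesh=1000)
many_meshes = frame_cost(meshes=1000, triangles_per_mesh=1)
print(one_mesh, many_meshes)
```

In this model the GPU figure is identical in both cases, while the CPU figure grows by a factor of one thousand.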

Reduce the visible object count. To reduce the amount of work the CPU needs to do:

  • Combine close objects together, either manually or using Unity’s draw call batching.
  • Use fewer materials in your objects by putting separate textures into a larger texture atlas.
  • Use fewer things that cause objects to be rendered multiple times (such as reflections, shadows and per-pixel lights).

Combine objects together so that each mesh has at least several hundred triangles and uses only one Material for the entire mesh. Note that combining two objects which don’t share a material does not give you any performance increase at all. The most common reason for requiring multiple materials is that two meshes don’t share the same textures; to optimize CPU performance, ensure that any objects you combine share the same textures.
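
To see why sharing a material matters, the grouping step can be sketched as follows. This is a plain-Python model, not Unity's batching code; the object names, material names and triangle counts are invented for the example.

```python
from collections import defaultdict


def combine_by_material(objects):
    """Group (name, material, triangle_count) tuples by material.

    Returns {material: total_triangles}. Each key stands for one combined
    mesh that could be drawn with a single material.
    """
    combined = defaultdict(int)
    for _name, material, triangles in objects:
        combined[material] += triangles
    return dict(combined)


# Hypothetical scene: three objects share an atlas material, one does not.
scene = [
    ("crate_a", "wood_atlas", 120),
    ("crate_b", "wood_atlas", 120),
    ("barrel", "wood_atlas", 300),
    ("lamp", "metal", 80),
]
merged = combine_by_material(scene)
print(merged)
```

Four objects collapse into two combined meshes here. The lamp cannot join the others because it uses a different material, which is the case the paragraph above warns about.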

When using many pixel lights in the Forward rendering path, there are situations where combining objects may not make sense. See the Lighting performance section below to learn how to manage this.

GPU: Optimizing model geometry

There are two basic rules for optimizing the geometry of a model:

  • Don’t use any more triangles than necessary
  • Try to keep the number of UV mapping seams and hard edges (doubled-up vertices) as low as possible

Note that the actual number of vertices that graphics hardware has to process is usually not the same as the number reported by a 3D application. Modeling applications usually display the number of distinct corner points that make up a model (known as the geometric vertex count). For a graphics card, however, some geometric vertices need to be split into two or more logical vertices for rendering purposes. A vertex must be split if it has multiple normals, UV coordinates or vertex colors. Consequently, the vertex count in Unity is usually higher than the count given by the 3D application.
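
A cube with hard edges is a small worked example of this splitting. The sketch below (plain Python, written for this note) enumerates every face corner together with its face normal, then counts distinct positions versus distinct (position, normal) pairs.

```python
from itertools import product


def cube_corners():
    """Yield (position, face_normal) for every corner of every cube face."""
    for axis in range(3):
        for sign in (-1, 1):
            normal = tuple(sign if i == axis else 0 for i in range(3))
            others = [i for i in range(3) if i != axis]
            # Four corners per face: fixed coordinate on `axis`,
            # all +/-1 combinations on the other two axes.
            for a, b in product((-1, 1), repeat=2):
                pos = [0, 0, 0]
                pos[axis] = sign
                pos[others[0]], pos[others[1]] = a, b
                yield tuple(pos), normal


corners = list(cube_corners())
geometric_vertices = {pos for pos, _ in corners}  # what a modeling tool counts
render_vertices = set(corners)                    # what the GPU must process
print(len(geometric_vertices), len(render_vertices))
```

The cube has 8 geometric vertices, but because each corner carries three different normals it becomes 24 vertices for rendering. UV seams and vertex colors split vertices in the same way.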

While the amount of geometry in the models is mostly relevant for the GPU, some features in Unity also process models on the CPU (for example, mesh skinning).

Lighting performance

The fastest option is always to create lighting that doesn’t need to be computed at all. To do this, use Lightmapping to “bake” static lighting just once, instead of computing it each frame. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:

  • It runs a lot faster (2–3 times faster for two per-pixel lights)
  • It looks a lot better, as you can bake global illumination and the lightmapper can smooth the results

In many cases you can apply simple tricks instead of adding multiple extra lights. For example, instead of adding a light that shines straight into the camera to give a Rim Lighting effect, add a dedicated Rim Lighting computation directly into your shaders (see Surface Shader Examples to learn how to do this).

Lights in forward rendering

Also see: Forward rendering

Per-pixel dynamic lighting adds significant rendering work to every affected pixel, and can lead to objects being rendered in multiple passes. Avoid having more than one Pixel Light illuminating any single object on less powerful devices, like mobile or low-end PC GPUs, and use lightmaps to light static objects instead of calculating their lighting every frame. Per-vertex dynamic lighting can add significant work to vertex transformations, so try to avoid situations where multiple lights illuminate a single object.

Avoid combining meshes that are far enough apart to be affected by different sets of pixel lights. When you use pixel lighting, each mesh has to be rendered as many times as there are pixel lights illuminating it. If you combine two meshes that are very far apart, it increases the effective size of the combined object. All pixel lights that illuminate any part of this combined object are taken into account during rendering, so the number of rendering passes that need to be made could be increased. Generally, the number of passes that must be made to render the combined object is the sum of the number of passes for each of the separate objects, so nothing is gained by combining meshes.
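
The pass arithmetic can be checked with a small model. This sketch assumes one pass per pixel light and ignores any base pass; the mesh and light names are hypothetical.

```python
def separate_passes(lights_per_mesh):
    """Passes when each mesh is drawn on its own: one per pixel light."""
    return sum(len(lights) for lights in lights_per_mesh)


def combined_passes(lights_per_mesh):
    """Passes for the merged mesh: every light touching any part counts."""
    all_lights = set().union(*lights_per_mesh)
    return len(all_lights)


# Two meshes far apart, lit by disjoint sets of pixel lights.
house = {"porch_lamp", "window_glow"}
distant_shed = {"street_lamp"}
separate = separate_passes([house, distant_shed])
merged = combined_passes([house, distant_shed])
print(separate, merged)
```

Both counts come out the same, so merging saves no passes, and every pass over the merged mesh now processes the geometry of both objects.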

During rendering, Unity finds all lights surrounding a mesh and calculates which of those lights affect it most. The Quality Settings are used to modify how many of the lights end up as pixel lights, and how many as vertex lights. Each light calculates its importance based on how far away it is from the mesh and how intense its illumination is - and some lights are more important than others purely from the game context. For this reason, every light has a Render Mode setting which can be set to Important or Not Important; lights marked as Not Important have a lower rendering overhead.
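
The selection described above can be modeled roughly as follows. This is a sketch under stated assumptions, not Unity's implementation: the score (intensity divided by distance) is a stand-in for the engine's internal importance heuristic, "Auto" is how this sketch represents lights left for the engine to rank, and pixel_light_count stands for the pixel light limit from the Quality Settings.

```python
def classify_lights(lights, pixel_light_count):
    """Split light names into (per_pixel, per_vertex) lists.

    Each light is a dict with 'name', 'mode' ('Important',
    'Not Important' or 'Auto'), 'intensity' and 'distance'.
    The ranking score is hypothetical.
    """
    pixel = [l["name"] for l in lights if l["mode"] == "Important"]
    vertex = [l["name"] for l in lights if l["mode"] == "Not Important"]
    auto = [l for l in lights if l["mode"] == "Auto"]
    auto.sort(key=lambda l: l["intensity"] / (1.0 + l["distance"]),
              reverse=True)
    for light in auto:
        if len(pixel) < pixel_light_count:
            pixel.append(light["name"])
        else:
            vertex.append(light["name"])
    return pixel, vertex


lights = [
    {"name": "headlights", "mode": "Important", "intensity": 2.0, "distance": 1.0},
    {"name": "rear_lights", "mode": "Not Important", "intensity": 0.5, "distance": 30.0},
    {"name": "lamppost_near", "mode": "Auto", "intensity": 1.0, "distance": 5.0},
    {"name": "lamppost_far", "mode": "Auto", "intensity": 1.0, "distance": 80.0},
]
pixel, vertex = classify_lights(lights, pixel_light_count=2)
print(pixel, vertex)
```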

Example: Consider a driving game in which the player’s car is driving in the dark with headlights switched on. The headlights are probably the most visually significant light source in the game, so their Render Mode should be set to Important. There may be other lights in the game that are less important, like other cars’ rear lights or distant lampposts, and which don’t improve the visual effect much by being pixel lights. The Render Mode for such lights can safely be set to Not Important to avoid wasting rendering capacity in places where it has little benefit.

Optimizing per-pixel lighting saves both the CPU and GPU work: the CPU has fewer draw calls to do, and the GPU has fewer vertices to process and pixels to rasterize for all the additional object renders.

GPU: Texture compression and mipmaps

Use Compressed textures to decrease the size of your textures. This can result in faster load times, a smaller memory footprint, and dramatically increased rendering performance. Compressed textures only use a fraction of the memory bandwidth needed for uncompressed 32-bit RGBA textures.
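
The size difference is easy to work out. The sketch below assumes a 1024x1024 texture and a compressed format that stores 4 bits per pixel (as DXT1 does); other formats have different rates, so treat the numbers as an illustration.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of a single mip level at the given bit depth."""
    return width * height * bits_per_pixel // 8


rgba32 = texture_bytes(1024, 1024, 32)      # uncompressed 32-bit RGBA
compressed = texture_bytes(1024, 1024, 4)   # assumed 4 bits-per-pixel format
print(rgba32, compressed, rgba32 // compressed)
```

Under these assumptions the uncompressed texture takes 4 MiB and the compressed one 512 KiB, an eightfold reduction in the data the GPU has to fetch.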

Texture mipmaps

Always enable Generate mipmaps for textures used in a 3D scene. A mipmap texture enables the GPU to use a lower resolution texture for smaller triangles. This is similar to how texture compression can help limit the amount of texture data transferred when the GPU is rendering.

The only exception to this rule is when a texel (texture pixel) is known to map 1:1 to the rendered screen pixel, as with UI elements or in a 2D game.
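
For a sense of what mipmaps cost in memory, the mip chain of a texture can be enumerated. This is a small sketch of the standard halving scheme, assuming a 1024x1024 base level.

```python
def mip_chain(width, height):
    """Return the (width, height) of every mip level down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width, height = max(1, width // 2), max(1, height // 2)
        levels.append((width, height))
    return levels


levels = mip_chain(1024, 1024)
base_texels = 1024 * 1024
total_texels = sum(w * h for w, h in levels)
print(len(levels), total_texels / base_texels)
```

The full chain adds about one third to the texture's size, in exchange for letting the GPU read a much smaller level when the surface covers only a few pixels on screen.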

LOD and per-layer cull distances

Culling objects involves making objects invisible. This is an effective way to reduce both the CPU and GPU load.

In many games, a quick and effective way to do this without compromising the player experience is to cull small objects more aggressively than large ones. For example, small rocks and debris could be made invisible at long distances, while large buildings would still be visible.

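There are a number of ways you can achieve this:

  • Use the Level of Detail (LOD) system
  • Manually set per-layer culling distances on the camera
  • Put small objects into a separate layer and set up per-layer cull distances using the Camera.layerCullDistances script function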
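
One way to picture per-layer distance culling is the plain-Python model below. It is not Unity code: the layer names, distances and object positions are invented, and the convention that a distance of 0 falls back to the camera's far clip distance is an assumption of this sketch.

```python
import math


def visible_objects(objects, camera_pos, layer_cull_distances, far_clip):
    """Return names of objects within their layer's cull distance.

    objects: list of (name, layer, (x, y, z)).
    layer_cull_distances: {layer: distance}; a missing layer or a value
    of 0 falls back to far_clip.
    """
    result = []
    for name, layer, pos in objects:
        limit = layer_cull_distances.get(layer, 0) or far_clip
        if math.dist(camera_pos, pos) <= limit:
            result.append(name)
    return result


objects = [
    ("pebble", "small", (0, 0, 60)),
    ("crate", "medium", (0, 0, 60)),
    ("tower", "large", (0, 0, 60)),
    ("crate_far", "medium", (0, 0, 300)),
]
cull = {"small": 50, "medium": 200, "large": 0}
seen = visible_objects(objects, (0, 0, 0), cull, far_clip=1000)
print(seen)
```

At the same distance of 60 units the pebble is already culled while the crate and the tower are still drawn, which is the small/medium/large split described in the notes at the top.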

Realtime shadows

Realtime shadows are nice, but they can have a high impact on performance, both in terms of extra draw calls for the CPU and extra processing on the GPU. For further details, see the Light Performance page.

GPU: Tips for writing high-performance shaders

Different platforms have vastly different performance capabilities; a high-end PC GPU can handle much more in terms of graphics and shaders than a low-end mobile GPU. The same is true even on a single platform; a fast GPU is dozens of times faster than a slow integrated GPU.

GPU performance on mobile platforms and low-end PCs is likely to be much lower than on your development machine. It’s recommended that you manually optimize your shaders to reduce calculations and texture reads, in order to get good performance across low-end GPU machines. For example, some built-in Unity shaders have “mobile” equivalents that are much faster, but have some limitations or approximations.

Below are some guidelines for mobile and low-end PC graphics cards:

Complex mathematical operations

Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan) are quite resource-intensive, so avoid using them where possible. Consider using lookup textures as an alternative to complex math calculations if applicable.
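
The lookup idea can be illustrated outside of a shader. The sketch below is a CPU-side Python analogy, not shader code: it precomputes pow(x, 16) into a 256-entry table (the exponent and table size are arbitrary choices) and samples it with linear interpolation, much as a bilinear texture fetch would.

```python
def build_pow_lut(exponent, size=256):
    """Precompute x**exponent for `size` evenly spaced x in [0, 1]."""
    return [(i / (size - 1)) ** exponent for i in range(size)]


def sample_lut(lut, x):
    """Look up x in [0, 1] with linear interpolation between entries."""
    x = min(max(x, 0.0), 1.0)
    pos = x * (len(lut) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(lut) - 1)
    frac = pos - lo
    return lut[lo] * (1.0 - frac) + lut[hi] * frac


lut = build_pow_lut(16.0)
approx = sample_lut(lut, 0.9)
exact = 0.9 ** 16.0
print(approx, exact)
```

The trade-off is a small approximation error and one extra texture read in exchange for skipping the expensive function, so it only pays off when texture reads are cheaper than the math on the target GPU.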

Avoid writing your own operations (such as normalize, dot, inversesqrt). Unity’s built-in options ensure that the driver can generate much better code. Remember that the Alpha Test (discard) operation often makes your fragment shader slower.

Floating point precision

While the precision (float vs half vs fixed) of floating point variables is largely ignored on desktop GPUs, it is quite important for getting good performance on mobile GPUs. See the Shader Data Types and Precision page for details.
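
To get a feel for what lower precision gives up, the rounding error of 16-bit and 32-bit floats can be compared on the CPU. This sketch uses Python's struct module and assumes that half corresponds to IEEE 754 half precision; the fixed type is not modeled.

```python
import struct


def to_half(value):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value):
    """Round a Python float to IEEE 754 single precision and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


half_err = abs(to_half(0.1) - 0.1)
single_err = abs(to_single(0.1) - 0.1)
print(half_err, single_err, to_half(2049.0))
```

Half precision is usually sufficient for colors and unit-length vectors, but it cannot represent every integer above 2048, so values such as world-space positions or large texture coordinates are better kept in float.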

For further details about shader performance, see the Shader Performance page.

Simple checklist to make your game faster

  • Keep the vertex count per frame below a budget of roughly 200K to 3M when building for PC (the exact limit depends on the target GPU).
  • If you’re using built-in shaders, pick ones from the Mobile or Unlit categories. They work on non-mobile platforms as well, but are simplified and approximated versions of the more complex shaders.
  • Keep the number of different materials per scene low, and share as many materials between different objects as possible.
  • Set the Static property on a non-moving object to allow internal optimizations like static batching.
  • Only have a single (preferably directional) pixel light affecting your geometry, rather than multiples.
  • Bake lighting rather than using dynamic lighting.
  • Use compressed texture formats when possible, and use 16-bit textures over 32-bit textures.
  • Avoid using fog where possible.
  • Use Occlusion Culling to reduce the amount of visible geometry and draw-calls in cases of complex static scenes with lots of occlusion. Design your levels with occlusion culling in mind.
  • Use skyboxes to “fake” distant geometry.
  • Use pixel shaders or texture combiners to mix several textures instead of a multi-pass approach.
  • Use half precision variables where possible.
  • Minimize use of complex mathematical operations such as pow, sin and cos in pixel shaders.
  • Use fewer textures per fragment.
