Test commit

This commit is contained in:
2024-01-16 17:22:21 +08:00
parent 92862c0372
commit 73635fda01
654 changed files with 178015 additions and 2 deletions

docs/notes/checkpoints.rst
@@ -0,0 +1,149 @@
.. _3d_viz:

3D Checkpoint Visualization
===========================

.. image:: ../img/koala.jpg

Visualizing 3D inputs and outputs of your model during training is an
essential diagnostic tool. Kaolin provides a :ref:`simple API to checkpoint<writing checkpoints>` **batches of meshes, pointclouds and voxelgrids**, as well as **colors and
textures**, saving them in :ref:`the USD format<file format>`. These checkpoints can then be visualized locally using the :ref:`Kaolin Omniverse App<ov app>` or by launching :ref:`Kaolin Dash3D<dash 3d>` on the command line, allowing remote visualization through a web browser.
.. _writing checkpoints:

Writing Checkpoints:
--------------------

In a common scenario, model performance is visualized for a
small evaluation batch. Bootstrap 3D checkpoints in your Python training
code by configuring a :class:`~kaolin.visualize.Timelapse` object::

    import kaolin
    timelapse = kaolin.visualize.Timelapse(viz_log_dir)

The ``viz_log_dir`` is the directory where checkpoints will be saved. Timelapse will create files and subdirectories under this path, so providing
a dedicated ``viz_log_dir`` separate from your other logs and configs will help keep things clean. The :class:`~kaolin.visualize.Timelapse` API supports point clouds,
voxel grids and meshes, as well as colors and textures.
Saving Fixed Data
^^^^^^^^^^^^^^^^^

To save any iteration-independent data,
call ``timelapse`` before your training loop
without providing an ``iteration`` parameter, e.g.::

    timelapse.add_mesh_batch(category='ground_truth',
                             faces_list=face_list,
                             vertices_list=gt_vert_list)
    timelapse.add_pointcloud_batch(category='input',
                                   pointcloud_list=input_pt_clouds)

The ``category`` identifies the meaning of the data. In this toy example,
the model learns to turn the ``'input'`` pointcloud into the ``'output'`` mesh. Both the ``'ground_truth'`` mesh and the ``'input'`` pointcloud batches are only saved once for easy visual comparison.
Saving Time-varying Data
^^^^^^^^^^^^^^^^^^^^^^^^

To checkpoint time-varying data during training, simply call :meth:`~kaolin.visualize.Timelapse.add_mesh_batch`, :meth:`~kaolin.visualize.Timelapse.add_pointcloud_batch` or :meth:`~kaolin.visualize.Timelapse.add_voxelgrid_batch`, for example::

    if iteration % checkpoint_interval == 0:
        timelapse.add_mesh_batch(category='output',
                                 iteration=iteration,
                                 faces_list=face_list,
                                 vertices_list=out_vert_list)

.. Tip::

    For any data type, only time-varying data needs to be saved at every iteration. E.g., if your output mesh topology is fixed, only save ``faces_list`` once, and then call ``add_mesh_batch`` with only the predicted ``vertices_list``. This will cut down your checkpoint size. A minimal sketch of this pattern follows.
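
As a minimal illustrative sketch of that fixed-topology pattern (``num_iterations``, ``checkpoint_interval`` and ``predicted_vert_list`` are hypothetical names standing in for your own training variables)::

    # Topology is fixed: save the faces once, before the training loop.
    timelapse.add_mesh_batch(category='output', faces_list=face_list)

    for iteration in range(num_iterations):
        # ... training step producing predicted_vert_list ...
        if iteration % checkpoint_interval == 0:
            # Checkpoint only the time-varying part: the vertices.
            timelapse.add_mesh_batch(category='output',
                                     iteration=iteration,
                                     vertices_list=predicted_vert_list)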
Saving Colors and Appearance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We are working on adding support for colors and semantic ids to
point cloud and voxel grid checkpoints. The mesh API supports multiple time-varying materials,
specified as :class:`kaolin.io.PBRMaterial`. For an example
of using materials, see
`test_timelapse.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/tests/python/kaolin/visualize/test_timelapse.py>`_.
Sample Code
^^^^^^^^^^^

We provide a `script <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/visualize_main.py>`_ that writes mock checkpoints, which can be run as follows::

    python examples/tutorial/visualize_main.py \
        --test_objs=path/to/object1.obj,path/to/object2.obj \
        --output_dir=path/to/logdir

In addition, see the :ref:`diff_render` tutorial.
.. _file format:

Understanding the File Format:
------------------------------

Kaolin :class:`~kaolin.visualize.Timelapse` writes checkpoints using the Universal Scene Description (USD) file format (`documentation <https://graphics.pixar.com/usd/docs/index.html>`_), developed with wide support for use cases in visual effects, including time-varying data. This allows reducing redundancy in the written
data across time.

After checkpointing with :class:`~kaolin.visualize.Timelapse`, the input ``viz_log_dir`` will contain
a file structure similar to::

    ground_truth/mesh_0.usd
    ground_truth/mesh_1.usd
    ground_truth/mesh_...
    ground_truth/textures
    input/pointcloud_0.usd
    input/pointcloud_1.usd
    input/pointcloud_...
    output/mesh_0.usd
    output/mesh_1.usd
    output/mesh_...
    output/pointcloud_0.usd
    output/pointcloud_1.usd
    output/pointcloud_...
    output/textures

Here, the root folder names correspond to the ``category`` parameter
provided to :class:`~kaolin.visualize.Timelapse` functions. Each element
of the batch of every type is saved in its own numbered ``.usd`` file. Each USD file can be viewed on its
own using any USD viewer, such as `NVIDIA Omniverse View <https://www.nvidia.com/en-us/omniverse/apps/view/>`_, or the whole log directory can be visualized
using the tools below.
.. Caution::

    Timelapse is designed to save only one visualization batch for every category and type. To save multiple batches without interleaving the data, create a custom category for each batch, as sketched below.
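
For instance, the following hedged sketch checkpoints several evaluation batches side by side; ``eval_batches`` is a hypothetical list of point cloud batches::

    # Give each batch its own category so the checkpoints are not interleaved.
    for batch_idx, pointcloud_batch in enumerate(eval_batches):
        timelapse.add_pointcloud_batch(category='input_batch_%d' % batch_idx,
                                       pointcloud_list=pointcloud_batch)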
.. _ov app:

Visualizing with Kaolin Omniverse App:
--------------------------------------

.. image:: ../img/ov_viz.jpg

USD checkpoints can be visualized using the dedicated Omniverse Kaolin App `Training Visualizer <https://docs.omniverse.nvidia.com/app_kaolin/app_kaolin/user_manual.html#training-visualizer>`_.
This extension provides full-featured support and high-fidelity rendering
of all data types and materials that can be exported using :class:`~kaolin.visualize.Timelapse`, and allows creating custom visualization layouts and viewing meshes with multiple time-varying materials. `Download NVIDIA Omniverse <https://www.nvidia.com/en-us/omniverse/>`_ to get started!
.. _dash 3d:

Visualizing with Kaolin Dash3D:
-------------------------------

.. image:: ../img/dash3d_viz.jpg

The Omniverse app requires local access to a GPU and to the saved checkpoints, which is not always possible.
We are also developing a lightweight ``kaolin-dash3d`` visualizer,
which allows visualizing local and remote checkpoints without specialized
hardware or applications. This tool is bundled with the latest
builds as a command-line utility.

To start Dash3D on the machine that stores the checkpoints, run::

    kaolin-dash3d --logdir=$TIMELAPSE_DIR --port=8080

The ``logdir`` is the directory :class:`kaolin.visualize.Timelapse` was configured with. This command will launch a web server that streams
geometry to web clients. To connect, simply visit ``http://ip.of.machine:8080`` (or `localhost:8080 <http://localhost:8080/>`_ if connecting locally or with ssh port forwarding).
Try it now:
^^^^^^^^^^^

See Dash3D in action by running it on our test samples and visiting `localhost:8080 <http://localhost:8080/>`_::

    kaolin-dash3d --logdir=$KAOLIN_ROOT/tests/samples/timelapse/notexture/ --port=8080

.. Caution:: Dash3D is still an experimental feature under active development. It only supports **triangle meshes** and **pointclouds** and cannot yet visualize colors, ids or textures. The web client was tested most extensively on `Google Chrome <https://www.google.com/chrome/>`_. We welcome your early feedback on our `GitHub <https://github.com/NVIDIAGameWorks/kaolin/issues>`_!


@@ -0,0 +1,13 @@
.. _diff_render:

Differentiable Rendering
========================

.. image:: ../img/clock.gif

Differentiable rendering can be used to optimize the underlying 3D properties, like geometry and lighting, by backpropagating gradients from a loss in image space. We provide an end-to-end tutorial for using the :mod:`kaolin.render.mesh` API in a Jupyter notebook:
`examples/tutorial/dibr_tutorial.ipynb <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/dibr_tutorial.ipynb>`_

In addition to the rendering API, the tutorial uses the Omniverse Kaolin App `Data Generator <https://docs.omniverse.nvidia.com/app_kaolin/app_kaolin/user_manual.html#data-generator>`_ to create training data, :class:`kaolin.visualize.Timelapse` to write checkpoints, and
the Omniverse Kaolin App `Training Visualizer <https://docs.omniverse.nvidia.com/app_kaolin/app_kaolin/user_manual.html#training-visualizer>`_ to visualize them.

View File

@@ -0,0 +1,237 @@
Differentiable Camera
*********************
.. _differentiable_camera:
Camera class
============
.. _camera_class:
:class:`kaolin.render.camera.Camera` is a one-stop class for all camera related differentiable / non-differentiable transformations.
Camera objects are represented by *batched* instances of 2 submodules:
- :ref:`CameraExtrinsics <camera_extrinsics_class>`: The extrinsics properties of the camera (position, orientation).
These are usually embedded in the view matrix, used to transform vertices from world space to camera space.
- :ref:`CameraIntrinsics <camera_intrinsics_class>`: The intrinsics properties of the lens
(such as field of view / focal length in the case of pinhole cameras).
Intrinsics parameters vary between different lens type,
and therefore multiple CameraIntrinsics subclasses exist,
to support different types of cameras: pinhole / perspective, orthographic, fisheye, and so forth.
For pinehole and orthographic lens, the intrinsics are embedded in a projection matrix.
The intrinsics module can be used to transform vertices from camera space to Normalized Device Coordinates.
.. note::

    To avoid tedious invocation of camera functions through
    ``camera.extrinsics.someop()`` and ``camera.intrinsics.someop()``, kaolin overrides the ``__getattr__``
    function to forward any function calls of ``camera.someop()`` to
    the appropriate extrinsics / intrinsics submodule.
The entire pipeline of transformations can be summarized as (ignoring homogeneous coordinates)::

         World Space                                  Camera View Space
    V ---CameraExtrinsics.transform()---> V' ---CameraIntrinsics.transform()---
  Shape~(B, 3)     (view matrix)       Shape~(B, 3)                           |
                                                      (linear lens:           |
                                                       projection matrix      |
                                                       + homogeneous -> 3D)   |
                                                                              V
                                               Normalized Device Coordinates (NDC)
                                                          Shape~(B, 3)

When using view / projection matrices, conversion to homogeneous coordinates is required.
Alternatively, the :func:`transform()` function takes care of such projections under the hood when needed.
How to apply transformations with kaolin's Camera:

1. Linear camera types, such as the commonly used pinhole camera,
   support the :func:`view_projection_matrix()` method.
   The returned matrix can be used to transform vertices through PyTorch's matrix multiplication, or even be
   passed to shaders as a uniform.
2. All cameras are guaranteed to support a general :func:`transform()` function
   which maps coordinates from world space to Normalized Device Coordinates space.
   For some lens types which perform non-linear transformations,
   the :func:`view_projection_matrix()` is undefined.
   Therefore the camera transformation must be applied through
   a dedicated function. For linear cameras,
   :func:`transform()` may use matrices under the hood.
3. Camera parameters may also be queried directly.
   This is useful when implementing camera-parameter-aware code such as ray tracers.

A sketch of the first two options follows.
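
The sketch below is illustrative only: the camera is built with the flexible :func:`Camera.from_args()` constructor, and ``vertices`` is a hypothetical tensor of world-space points::

    import math
    import torch
    from kaolin.render.camera import Camera

    # Build a single pinhole camera looking at the origin.
    camera = Camera.from_args(eye=torch.tensor([4.0, 4.0, 4.0]),
                              at=torch.tensor([0.0, 0.0, 0.0]),
                              up=torch.tensor([0.0, 1.0, 0.0]),
                              fov=math.radians(45),  # pinhole intrinsics parameter
                              width=256, height=256)

    # Option 1 (linear lens types only): fetch an explicit matrix.
    view_proj = camera.view_projection_matrix()   # shape (num_cameras, 4, 4)

    # Option 2 (all lens types): map world coordinates to NDC directly.
    ndc_coords = camera.transform(vertices)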
How to control kaolin's Camera:

- :class:`CameraExtrinsics` is packed with useful methods for controlling the camera position and orientation:
  :func:`translate() <CameraExtrinsics.translate()>`,
  :func:`rotate() <CameraExtrinsics.rotate()>`,
  :func:`move_forward() <CameraExtrinsics.move_forward()>`,
  :func:`move_up() <CameraExtrinsics.move_up()>`,
  :func:`move_right() <CameraExtrinsics.move_right()>`,
  :func:`cam_pos() <CameraExtrinsics.cam_pos()>`,
  :func:`cam_up() <CameraExtrinsics.cam_up()>`,
  :func:`cam_forward() <CameraExtrinsics.cam_forward()>`.
- :class:`CameraIntrinsics` exposes a lens :func:`zoom() <CameraIntrinsics.zoom()>`
  operation. The exact functionality depends on the camera type.
How to optimize the Camera parameters:

- Both :class:`CameraExtrinsics` and :class:`CameraIntrinsics` maintain
  :class:`torch.Tensor` buffers of parameters which support PyTorch differentiable operations.
- Setting ``camera.requires_grad_(True)`` will turn on the optimization mode.
- The :func:`gradient_mask` function can be used to mask out gradients of specific Camera parameters.

A hedged optimization sketch follows.
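
This sketch builds on the camera above; ``vertices`` and ``target_ndc`` are hypothetical tensors, the loss is purely illustrative, and we assume the extrinsics parameter buffer is exposed via ``camera.extrinsics.parameters()``::

    import torch

    camera.requires_grad_(True)  # turn on optimization mode

    # Optimize the extrinsics parameter buffer (assumed accessor).
    optimizer = torch.optim.Adam([camera.extrinsics.parameters()], lr=1e-3)
    for _ in range(100):
        optimizer.zero_grad()
        ndc = camera.transform(vertices)          # differentiable w.r.t. camera params
        loss = ((ndc - target_ndc) ** 2).mean()   # toy image-space objective
        loss.backward()
        optimizer.step()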
.. note::

    :class:`CameraExtrinsics` supports multiple representations of camera parameters
    (see :func:`switch_backend <CameraExtrinsics.switch_backend()>`).
    Specific representations are better suited for optimization
    (e.g.: they maintain an orthogonal view matrix).
    Kaolin will automatically switch to using those representations when gradient flow is enabled.
    For non-differentiable uses, the default representation may provide better
    speed and numerical accuracy.
Other useful camera properties:

- Cameras follow PyTorch in part, and support arbitrary ``dtype`` and ``device`` types through the
  :func:`to()`, :func:`cpu()`, :func:`cuda()`, :func:`half()`, :func:`float()`, :func:`double()`
  methods and :func:`dtype`, :func:`device` properties.
- :class:`CameraExtrinsics` and :class:`CameraIntrinsics` individually support the :func:`requires_grad`
  property.
- Cameras support :func:`torch.allclose` for comparing camera parameters under controlled numerical accuracy.
  The operator ``==`` is reserved for comparison by reference.
- Cameras support batching, either through construction, or through the :func:`cat()` method.
.. note::

    Since kaolin's cameras are batched, the view/projection matrices are of shapes :math:`(\text{num_cameras}, 4, 4)`,
    and some operations, such as :func:`transform()`, may return values as shapes of :math:`(\text{num_cameras}, \text{num_vectors}, 3)`.
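
As a brief illustration of these properties (``camera_a`` and ``camera_b`` are assumed to be two compatible cameras built as in the earlier sketch)::

    # Move a camera across devices / precisions, PyTorch style.
    camera = camera.cuda().half()

    # Batch two cameras together; matrices now have shape (2, 4, 4).
    batched = Camera.cat((camera_a, camera_b))
    view_proj = batched.view_projection_matrix()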
Concluding remarks on coordinate systems and other confusing conventions:

- kaolin's Cameras assume column-major matrices; for example, the inverse view matrix (cam2world) is defined as:

  .. math::

      \begin{bmatrix}
          r1 & u1 & f1 & px \\
          r2 & u2 & f2 & py \\
          r3 & u3 & f3 & pz \\
          0 & 0 & 0 & 1
      \end{bmatrix}

  This sometimes causes confusion, as the view matrix (world2cam) uses a transposed 3x3 submatrix component,
  which despite this transposition is still column major (observed through the last ``t`` column):

  .. math::

      \begin{bmatrix}
          r1 & r2 & r3 & tx \\
          u1 & u2 & u3 & ty \\
          f1 & f2 & f3 & tz \\
          0 & 0 & 0 & 1
      \end{bmatrix}
- kaolin's cameras do not assume any specific coordinate system for the camera axes. By default, the
  right-handed Cartesian coordinate system is used. Other coordinate systems are supported through
  :func:`change_coordinate_system() <CameraExtrinsics.change_coordinate_system()>`
  and the ``coordinates.py`` module (a short sketch follows this list)::

         Y
         ^
         |
         |---------> X
        /
       Z

- kaolin's NDC space is assumed to be left-handed (depth goes inwards to the screen).
  The default range of values is [-1, 1].
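
As a hedged sketch of switching axis conventions, assuming ``blender_coords()`` is one of the presets shipped alongside ``coordinates.py``::

    from kaolin.render.camera import blender_coords

    # Permute the camera axes to Blender's coordinate convention.
    camera.change_coordinate_system(blender_coords())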
CameraExtrinsics class
======================

.. _camera_extrinsics_class:

:class:`kaolin.render.camera.CameraExtrinsics` holds the extrinsic parameters of a camera: position and orientation in space.
This class maintains the view matrix of the camera, used to transform points from world coordinates
to camera / eye / view space coordinates.
The view matrix maintained by this class is column-major, and can be described by the 4x4 block matrix:

.. math::

    \begin{bmatrix}
        R & t \\
        0 & 1
    \end{bmatrix}

where **R** is a 3x3 rotation matrix and **t** is a 3x1 translation vector, for the orientation and position
respectively.

This class is batched and may hold information from multiple cameras.
:class:`CameraExtrinsics` relies on a dynamic representation backend to manage the tradeoff between various choices
such as speed, or support for differentiable rigid transformations.
Parameters are stored as a single tensor of shape :math:`(\text{num_cameras}, K)`,
where K is a representation-specific number of parameters.
Transformations and matrices returned by this class support differentiable torch operations,
which in turn may update the extrinsic parameters of the camera::

                                convert_to_mat
    Backend                    ----------------->     Extrinsics
    Representation R           <-----------------     View Matrix M
    Shape (num_cameras, K)      convert_from_mat      Shape (num_cameras, 4, 4)
.. note::

    Unless specified manually with :func:`switch_backend`,
    kaolin will choose the optimal representation backend depending on the status of ``requires_grad``.

.. note::

    Users should be aware of, but not concerned about, the conversion from internal representations to view matrices.
    kaolin performs these conversions where and if needed.
Supported backends:

- **"matrix_se3"**: A flattened view matrix representation, containing the full information of
  special Euclidean transformations (translations and rotations).
  This representation is quickly converted to a view matrix, but differentiable ops may cause
  the view matrix to learn an incorrect, non-orthogonal transformation.
- **"matrix_6dof_rotation"**: A compact representation with 6 degrees of freedom, ensuring the view matrix
  remains orthogonal under optimization. The conversion to a matrix requires a single Gram-Schmidt step.

.. seealso::

    `On the Continuity of Rotation Representations in Neural Networks, Zhou et al. 2019
    <https://arxiv.org/abs/1812.07035>`_
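
A one-line sketch of pinning the backend manually, using the backend names listed above::

    # Force the 6DoF representation, e.g. ahead of a differentiable optimization.
    camera.extrinsics.switch_backend('matrix_6dof_rotation')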
Unless stated explicitly, the definition of the camera coordinate system used by this class is up to the
choice of the user.
Practitioners should be mindful of conventions when pairing the view matrix managed by this class with a projection
matrix.
CameraIntrinsics class
======================

.. _camera_intrinsics_class:

:class:`kaolin.render.camera.CameraIntrinsics` holds the intrinsic parameters of a camera:
how it should project from camera space to normalized screen / clip space.
The intrinsics are determined by the camera type, meaning parameters may differ according to the lens structure.
Typical computer graphics systems commonly assume the intrinsics of a pinhole camera (see the :class:`PinholeIntrinsics` class).
One implication is that some camera types do not use a linear projection (e.g., a fisheye lens).

There are therefore numerous ways to use CameraIntrinsics subclasses:

1. Access intrinsics parameters directly.
   This may typically benefit use cases such as ray generators.
2. The :func:`transform()` method is supported by all CameraIntrinsics subclasses,
   both for linear and non-linear transformations, to project vectors from camera space to normalized screen space.
   This method is implemented using differentiable PyTorch operations.
3. Certain CameraIntrinsics subclasses which perform linear projections may expose the transformation matrix
   via dedicated methods.
   For example, :class:`PinholeIntrinsics` exposes a :func:`projection_matrix()` method.
   This may typically be useful for rasterization-based rendering pipelines (e.g., OpenGL vertex shaders).

A short sketch of these three access patterns follows.
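
This sketch assumes the pinhole camera built earlier and a hypothetical ``cam_pts`` tensor of camera-space points::

    focal_x = camera.intrinsics.focal_x            # 1. direct parameter access (pinhole)
    ndc = camera.intrinsics.transform(cam_pts)     # 2. general projection to NDC
    proj = camera.intrinsics.projection_matrix()   # 3. explicit matrix (linear lenses)
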
This class is batched and may hold information from multiple cameras.
Parameters are stored as a single tensor of shape :math:`(\text{num_cameras}, K)` where K is the number of
intrinsic parameters.
Currently there are two subclasses of intrinsics: :class:`kaolin.render.camera.OrthographicIntrinsics` and
:class:`kaolin.render.camera.PinholeIntrinsics`.
API Documentation:
------------------

* Check all the camera classes and functions in the :ref:`API documentation<kaolin.render.camera>`.

docs/notes/installation.rst

@@ -0,0 +1,158 @@
:orphan:

.. _installation:

Installation
============

Most functions in Kaolin use PyTorch with custom high-performance code in C++ and CUDA. For this reason,
full Kaolin functionality is only available for systems with an NVIDIA GPU, supporting CUDA. While it is possible to install
Kaolin on other systems, only a fraction of operations will be available for a CPU-only install.

Requirements
------------

* Linux, Windows, or macOS (CPU-only)
* Python >= 3.8, <= 3.10
* `CUDA <https://developer.nvidia.com/cuda-toolkit>`_ >= 10.0 (with ``nvcc`` installed). See the `CUDA Toolkit Archive <https://developer.nvidia.com/cuda-toolkit-archive>`_ to install older versions.
* torch >= 1.8, <= 2.1.1
Quick Start (Linux, Windows)
----------------------------

| Make sure any of the supported CUDA and torch versions below are pre-installed.
| The latest version of Kaolin can be installed with pip:

.. code-block:: bash

    $ pip install kaolin==0.15.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-{TORCH_VER}_cu{CUDA_VER}.html

.. Note::

    Replace *TORCH_VER* and *CUDA_VER* with any of the compatible options below.

.. rst-class:: center-align-center-col

+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch / CUDA** | **cu113** | **cu116** | **cu117** | **cu118** | **cu121** |
+==================+===========+===========+===========+===========+===========+
| **torch-2.1.1** | | | | ✓ | ✓ |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-2.1.0** | | | | ✓ | ✓ |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-2.0.1** | | | ✓ | ✓ | |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-2.0.0** | | | ✓ | ✓ | |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-1.13.1** | | ✓ | ✓ | | |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-1.13.0** | | ✓ | ✓ | | |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-1.12.1** | ✓ | ✓ | | | |
+------------------+-----------+-----------+-----------+-----------+-----------+
| **torch-1.12.0** | ✓ | ✓ | | | |
+------------------+-----------+-----------+-----------+-----------+-----------+

For example, to install kaolin for torch 1.12.1 and CUDA 11.3:

.. code-block:: bash

    $ pip install kaolin==0.15.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.12.1_cu113.html

You can check https://nvidia-kaolin.s3.us-east-2.amazonaws.com/index.html to see all the available wheels.
Installation from source
------------------------

.. Note::

    We recommend installing Kaolin into a virtual environment. For instance, to create a new environment with `Anaconda <https://www.anaconda.com/>`_:

    .. code-block:: bash

        $ conda create --name kaolin python=3.8
        $ conda activate kaolin

1. Clone Repository
^^^^^^^^^^^^^^^^^^^

Clone and optionally check out an `official release <https://github.com/NVIDIAGameWorks/kaolin/tags>`_:

.. code-block:: bash

    $ git clone --recursive https://github.com/NVIDIAGameWorks/kaolin
    $ cd kaolin
    $ git checkout v0.15.0 # optional

2. Install dependencies
^^^^^^^^^^^^^^^^^^^^^^^

You can install the dependencies by running:

.. code-block:: bash

    $ pip install -r tools/build_requirements.txt -r tools/viz_requirements.txt -r tools/requirements.txt
3. Test CUDA
^^^^^^^^^^^^

You can verify that CUDA is properly installed at the desired version with nvcc by running the following:

.. code-block:: bash

    $ nvidia-smi
    $ nvcc --version

4. Install PyTorch
^^^^^^^^^^^^^^^^^^

Follow the `official instructions <https://pytorch.org>`_ to install PyTorch at a supported version.
Kaolin may be able to work with other PyTorch versions, but we only explicitly test within the version range 1.10.0 to 2.1.1.
See below for overriding the PyTorch version check during install.
Here is how to install the latest PyTorch version supported by Kaolin for CUDA 11.8:

.. code-block:: bash

    $ pip install torch==2.1.1 --extra-index-url https://download.pytorch.org/whl/cu118
5. Optional Environment Variables
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* If trying Kaolin with an unsupported PyTorch version, set: ``export IGNORE_TORCH_VER=1``
* If using a heterogeneous GPU setup, set the architectures for which to compile the CUDA code, e.g.: ``export TORCH_CUDA_ARCH_LIST="7.0 7.5"``
* In some setups, there may be a conflict between the cub bundled with CUDA installs > 11 and the ``third_party/cub`` that kaolin includes as a submodule. If a conflict occurs or cub is not found, point ``CUB_HOME`` at the CUDA one; on Linux this is typically: ``export CUB_HOME=/usr/local/cuda-*/include/``

6. Install Kaolin
^^^^^^^^^^^^^^^^^

.. code-block:: bash

    $ python setup.py develop

.. Note::

    Kaolin can be installed without a GPU; however, CPU support is limited and many CUDA-only functions will be missing.
Testing your installation
-------------------------

Run a quick test of your installation and version:

.. code-block:: bash

    $ python -c "import kaolin; print(kaolin.__version__)"

Running tests
^^^^^^^^^^^^^

For an exhaustive check, install testing dependencies and run tests as follows:

.. code-block:: bash

    $ pip install -r tools/ci_requirements.txt
    $ export CI='true' # on Linux
    $ set CI='true' # on Windows
    $ pytest --import-mode=importlib -s tests/python/

.. Note::

    These tests rely on CUDA operations and will fail if you installed on CPU only, where not all functionality is available.

docs/notes/overview.rst

@@ -0,0 +1,98 @@
:orphan:

.. _overview:

API Overview
============

Below is a summary of Kaolin functionality. Refer to :ref:`tutorial_index` for specific use cases, examples
and recipes that use these building blocks.

Operators for 3D Data:
^^^^^^^^^^^^^^^^^^^^^^

:ref:`kaolin/ops<kaolin.ops>` contains operators for efficient processing of batched 3D models and tensors. We provide conversions between 3D representations, primitives for batching heterogeneous data, and efficient mainstream functions on meshes and voxelgrids.
.. toctree::
   :maxdepth: 2

   ../modules/kaolin.ops

I/O:
^^^^

:ref:`kaolin/io<kaolin.io>` contains functionality to interact with files.
We provide importers and exporters for popular formats such as .obj and .usd, as well as utility functions and classes to preprocess and cache datasets with specific transforms.

.. toctree::
   :maxdepth: 2

   ../modules/kaolin.io

Metrics:
^^^^^^^^

:ref:`kaolin/metrics<kaolin.metrics>` contains functions to compute distances and losses, such as point_to_mesh distance, chamfer distance, IoU, or laplacian smoothing.

.. toctree::
   :maxdepth: 2

   ../modules/kaolin.metrics

Differentiable Rendering:
^^^^^^^^^^^^^^^^^^^^^^^^^

:ref:`kaolin/render<kaolin.render>` provides functions related to differentiable rendering, such as DIB-R rasterization, application of camera projection / translation / rotation, lighting, and textures.

.. toctree::
   :maxdepth: 2

   ../modules/kaolin.render

3D Checkpoints and Visualization:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:ref:`kaolin/visualize<kaolin.visualize>` contains utilities for writing 3D checkpoints for visualization. Currently we provide a Timelapse exporter that can be quickly picked up by the `Omniverse Kaolin App <https://docs.omniverse.nvidia.com/app_kaolin/app_kaolin/user_manual.html#training-visualizer>`_.

.. toctree::
   :maxdepth: 2

   ../modules/kaolin.visualize

Utilities:
^^^^^^^^^^

:ref:`kaolin/utils<kaolin.utils>` contains utility functions to help the development of applications and research scripts. We provide functions to display and check information about tensors, and features to fix random seeds.

.. toctree::
   :maxdepth: 2

   ../modules/kaolin.utils

Non Commercial
^^^^^^^^^^^^^^

:ref:`kaolin/non_commercial<kaolin.non_commercial>` contains features under the `NSCL license <https://github.com/NVIDIAGameWorks/kaolin/blob/master/LICENSE.NSCL>`_, restricted to non-commercial usage for research and evaluation purposes.

.. toctree::
   :maxdepth: 2

   ../modules/kaolin.non_commercial
Licenses
========

Most of Kaolin's repository is under the `Apache v2.0 license <https://github.com/NVIDIAGameWorks/kaolin/blob/master/LICENSE>`_, except :ref:`kaolin/non_commercial<kaolin.non_commercial>`, which is under the `NSCL license <https://github.com/NVIDIAGameWorks/kaolin/blob/master/LICENSE.NSCL>`_ restricted to non-commercial usage for research and evaluation purposes. For example, the FlexiCubes method is included under :ref:`non_commercial<kaolin.non_commercial>`.

The default ``kaolin`` import includes Apache-licensed components:

.. code-block:: python

    import kaolin

The non-commercial components need to be explicitly imported as:

.. code-block:: python

    import kaolin.non_commercial

docs/notes/spc_summary.rst

@@ -0,0 +1,279 @@
Structured Point Clouds (SPCs)
******************************

.. _spc:

A Structured Point Cloud (SPC) is a sparse octree-based representation that is useful to organize and
compress 3D geometrically sparse information.
SPCs are also known as sparse voxelgrids, quantized point clouds, and voxelized point clouds.

.. image:: ../img/mesh_to_spc.png

Kaolin supports a number of operations to work with SPCs,
including efficient ray-tracing and convolutions.

The SPC data structure is very general. In the SPC data structure, octrees provide a way to store and efficiently retrieve coordinates of points at different levels of the octree hierarchy. It is also possible to associate features to these coordinates using point ordering in memory. Below we detail the low-level representations that comprise SPCs and allow corresponding efficient operations. We also provide a :ref:`convenience container<kaolin.rep>` for these low-level attributes.

Some of the conventions are also defined in `Neural Geometric Level of Detail: Real-time Rendering with
Implicit 3D Surfaces <https://nv-tlabs.github.io/nglod/>`_, which uses SPC as an internal representation.
.. warning::

    The internal layout and structure of Structured Point Clouds is still experimental and may be modified in the future.
Octree
======

.. _spc_octree:

Core to SPC is the `octree <https://en.wikipedia.org/wiki/Octree>`_, a tree data
structure where each node has up to 8 children.
We use this structure for recursive three-dimensional space partitioning,
i.e., each node represents a :math:`(2, 2, 2)` partitioning of its 3D space.
The octree then contains the information necessary to find the sparse coordinates.

In SPC, a batch of octrees is represented as a tensor of bytes. Each bit in the byte array ``octrees`` represents
the binary occupancy of an octree node, sorted in `Morton order <https://en.wikipedia.org/wiki/Z-order_curve>`_.
The Morton order is a type of space-filling curve which gives a deterministic ordering of
integer coordinates on a 3D grid. That is, for a given non-negative 1D integer coordinate, there exists a
bijective mapping to 3D integer coordinates.
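
As a hedged illustration of this bijection, kaolin provides conversion helpers (we assume :func:`kaolin.ops.spc.points_to_morton` and :func:`kaolin.ops.spc.morton_to_points`, which operate on CUDA short tensors):

>>> import torch
>>> import kaolin
>>> points = torch.ShortTensor([[0, 0, 0], [1, 1, 1]]).cuda()
>>> morton = kaolin.ops.spc.points_to_morton(points.contiguous())
>>> recovered = kaolin.ops.spc.morton_to_points(morton)  # round-trips back to `points`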

Since a byte is a collection of 8 bits, a single byte ``octrees[i]``
represents an octree node where each bit indicates the binary occupancy of a child node / partition, as
depicted below:

.. image:: ../img/octants.png
   :width: 600

For each octree, the nodes / bytes follow breadth-first-search order (with Morton order for
the children), and the octree bytes are then :ref:`packed` to form ``octrees``. This ordering
allows efficient tree access without having to explicitly store indirection pointers.
.. figure:: ../img/octree.png
   :scale: 30 %
   :alt: An octree 3D partitioning

   Credit: https://en.wikipedia.org/wiki/Octree

The binary occupancy values in the bits of ``octrees`` implicitly encode position data due to the bijective
mapping from Morton codes to 3D integer coordinates. However, to provide users a more straightforward
interface to work with these octrees, SPC provides auxiliary information such as
``points``, which is a :ref:`packed` tensor of 3D coordinates. Refer to the :ref:`spc_attributes` section
for more details.

Currently, SPCs are primarily used to represent 3D surfaces,
and so all the leaves are at the same ``level`` (depth).
This allows very efficient processing on GPU, with custom CUDA kernels, for ray-tracing and convolution.

The structure contains finer details as you go deeper into the tree.
Below are levels 0 through 8 of an SPC teapot model:

.. image:: ../img/spcTeapot.png
Additional Feature Data
=======================

The nodes of the ``octrees`` can contain information beyond just the 3D coordinates of the nodes,
such as RGB color, normals, feature maps, or even differentiable activation maps processed by a
convolution.

We follow a `Structure of Arrays <https://en.wikipedia.org/wiki/AoS_and_SoA>`_ approach to store
additional data for maximum user extensibility.
Currently the features would be tensors of shape :math:`(\text{num_nodes}, \text{feature_dim})`,
with ``num_nodes`` being the number of nodes at a specific ``level`` of the ``octrees``,
and ``feature_dim`` the dimension of the feature set (for instance 3 for RGB color).

Users can freely define their own feature data to be stored alongside an SPC.
Conversions
===========

Structured point clouds can be derived from multiple sources.
We can construct ``octrees``
from unstructured point cloud data, from sparse voxelgrids,
or from the level set of an implicit function :math:`f(x, y, z)`.
.. _spc_attributes:

Related attributes
==================

.. note::

    If you just want to use structured point clouds without having to go through the low-level details, take a look at :ref:`the high level classes <kaolin.rep>`.

.. _spc_lengths:

``lengths:``
------------

Since ``octrees`` uses :ref:`packed` batching, we need ``lengths``, a 1D tensor of size ``batch_size`` that contains the size of each individual octree. Note that ``lengths.sum()`` should equal the size of ``octrees``. You can use :func:`kaolin.ops.batch.list_to_packed` to pack octrees and generate ``lengths``.
.. _spc_pyramids:

``pyramids:``
-------------

:class:`torch.IntTensor` of shape :math:`(\text{batch_size}, 2, \text{max_level} + 2)`. Contains layout information for each octree: ``pyramids[:, 0]`` represents the number of points in each level of the ``octrees``, and ``pyramids[:, 1]`` represents the starting index of each level.
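
For instance, a hedged sketch of slicing one level out of the packed point hierarchy, assuming the ``point_hierarchies`` tensor described below:

>>> num_points = pyramids[0, 0, level]   # point count at `level` (first octree)
>>> start = pyramids[0, 1, level]        # offset of `level` in the hierarchy
>>> points_at_level = point_hierarchies[start:start + num_points]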

.. _spc_exsum:

``exsum:``
----------

:class:`torch.IntTensor` of shape :math:`(\text{octrees_num_bytes} + \text{batch_size})`, the exclusive sum of the bit counts of each ``octrees`` byte.

.. note::

    To generate ``pyramids`` and ``exsum``, see :func:`kaolin.ops.spc.scan_octrees`.
.. _spc_points:

``point_hierarchies:``
----------------------

:class:`torch.ShortTensor` of shape :math:`(\text{num_nodes}, 3)`, corresponding to the sparse coordinates at all levels. We refer to this :ref:`packed` tensor as the **structured point hierarchies**.
The image below shows an analogous 2D example.

.. image:: ../img/spc_points.png
   :width: 400

The corresponding ``point_hierarchies`` would be:

>>> torch.ShortTensor([[0, 0], [1, 1],
...                    [1, 0], [2, 2],
...                    [2, 1], [3, 1], [5, 5]])

.. note::

    To generate ``point_hierarchies``, see :func:`kaolin.ops.spc.generate_points`.

.. note::

    The tensors ``pyramids``, ``exsum`` and ``point_hierarchies`` are used by many Structured Point Cloud functions; avoiding their recomputation will improve performance.
Convolutions
============

We provide several sparse convolution layers for structured point clouds.
Convolutions are characterized by the size of the input and output channels,
an array of ``kernel_vectors``, and possibly the number of levels to ``jump``, i.e.,
the difference in input and output levels.

.. _kernel-text:

An example of how to create a :math:`3 \times 3 \times 3` kernel follows:
>>> vectors = []
>>> for i in range(-1, 2):
...     for j in range(-1, 2):
...         for k in range(-1, 2):
...             vectors.append([i, j, k])
>>> Kvec = torch.tensor(vectors, dtype=torch.short, device=device)
>>> Kvec
tensor([[-1, -1, -1],
        [-1, -1,  0],
        [-1, -1,  1],
        ...
        [ 1,  1, -1],
        [ 1,  1,  0],
        [ 1,  1,  1]], device='cuda:0', dtype=torch.int16)
.. _neighborhood-text:

The kernel vectors determine the shape of the convolution kernel.
Each kernel vector is added to the position of a point to determine
the coordinates of points whose corresponding input data is needed for the operation.
We formalize this notion using the following neighbor function:

.. math::

    n(i,k) = \text{ID}\left(P_i+\overrightarrow{K}_k\right)

that returns the index of the point within the same level found by adding
kernel vector :math:`\overrightarrow{K}_k` to point :math:`P_i`.
Given the sparse nature of SPC data, it may be the case that no such point exists. In such cases, :math:`n(i,k)`
will return an invalid value, and data accesses will be treated like zero padding.

Transposed convolutions are defined by the transposed neighbor function

.. math::

    n^T(i,k) = \text{ID}\left(P_i-\overrightarrow{K}_k\right)

The value **jump** is used to indicate the difference in levels between the input features
and the output features. For convolutions, this is the number of levels to downsample; while
for transposed convolutions, **jump** is the number of levels to upsample. The value of **jump** must
be positive, and may not go beyond the highest level of the octree.
Examples
--------

You can create octrees from sparse ``feature_grids``
(of shape :math:`(\text{batch_size}, \text{feature_dim}, \text{height}, \text{width}, \text{depth})`):

>>> octrees, lengths, features = kaolin.ops.spc.feature_grids_to_spc(feature_grids)

or from a point cloud (of shape :math:`(\text{num_points}, 3)`):

>>> qpc = kaolin.ops.spc.quantize_points(pc, level)
>>> octree = kaolin.ops.spc.unbatched_points_to_octree(qpc, level)

To use convolutions, you can use the functional or the ``torch.nn.Module`` version, analogous to ``torch.nn.functional.conv3d`` and ``torch.nn.Conv3d``:

>>> max_level, pyramids, exsum = kaolin.ops.spc.scan_octrees(octrees, lengths)
>>> point_hierarchies = kaolin.ops.spc.generate_points(octrees, pyramids, exsum)
>>> kernel_vectors = torch.tensor([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
...                                [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]],
...                               dtype=torch.short, device='cuda')
>>> conv = kaolin.ops.spc.Conv3d(in_channels, out_channels, kernel_vectors, jump=1, bias=True).cuda()
>>> # With functional
>>> out_features, out_level = kaolin.ops.spc.conv3d(octrees, point_hierarchies, level, pyramids,
...                                                 exsum, coalescent_features, weight,
...                                                 kernel_vectors, jump, bias)
>>> # With nn.Module and container class
>>> input_spc = kaolin.rep.Spc(octrees, lengths)
>>> out_features, out_level = kaolin.ops.spc.conv_transpose3d(
...     **input_spc.to_dict(), input=out_features, level=level,
...     weight=weight, kernel_vectors=kernel_vectors, jump=jump, bias=bias)
For ray tracing we currently only support the non-batched version, for instance here with RGB values as per-point features:

>>> max_level, pyramids, exsum = kaolin.ops.spc.scan_octrees(
...     octree, torch.tensor([len(octree)], dtype=torch.int32, device='cuda'))
>>> point_hierarchy = kaolin.ops.spc.generate_points(octree, pyramids, exsum)
>>> ridx, pidx, depth = kaolin.render.spc.unbatched_raytrace(octree, point_hierarchy, pyramids[0], exsum,
...                                                          origin, direction, max_level)
>>> first_hits_mask = kaolin.render.spc.mark_pack_boundaries(ridx)
>>> first_hits_point = pidx[first_hits_mask]
>>> first_hits_rgb = rgb[first_hits_point - pyramids[max_level - 2]]
Going further with SPC:
=======================

Examples:
---------

See our Jupyter notebook for a walk-through of SPC features:

`examples/tutorial/understanding_spcs_tutorial.ipynb <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/understanding_spcs_tutorial.ipynb>`_

And also our recipes for simple examples of how to use SPC:

* `spc_basics.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/spc/spc_basics.py>`_: showing attributes of an SPC object
* `spc_dual_octree.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/spc/spc_dual_octree.py>`_: computing and explaining the dual of an SPC octree
* `spc_trilinear_interp.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/spc/spc_trilinear_interp.py>`_: computing trilinear interpolation of a point cloud on an SPC

SPC Documentation:
------------------

Functions useful for working with SPCs are available in the following modules:

* :ref:`kaolin.ops.spc<kaolin.ops.spc>` - general explanation and operations
* :ref:`kaolin.render.spc<kaolin.render.spc>` - rendering utilities
* :class:`kaolin.rep.Spc` - high-level wrapper


@@ -0,0 +1,101 @@
.. _tutorial_index:

Tutorial Index
==============

Kaolin provides tutorials as IPython notebooks, docs pages and simple scripts. Note that the links
point to master.
Detailed Tutorials
------------------

* `Camera and Rasterization <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/camera_and_rasterization.ipynb>`_: Rasterize a ShapeNet mesh with nvdiffrast and camera:

  * Load a ShapeNet mesh
  * Preprocess mesh and materials
  * Create a camera with the ``from_args()`` general constructor
  * Render a mesh with multiple materials with nvdiffrast
  * Move the camera and see the resulting rendering

* `Optimizing Diffuse Lighting <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/diffuse_lighting.ipynb>`_: Optimize lighting parameters with spherical gaussians and spherical harmonics:

  * Load an obj mesh with normals and materials
  * Rasterize the diffuse and specular albedo
  * Render and optimize diffuse lighting:

    * Spherical harmonics
    * Spherical gaussian with inner product implementation
    * Spherical gaussian with fitted approximation

* `Optimize Diffuse and Specular Lighting with Spherical Gaussians <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/sg_specular_lighting.ipynb>`_:

  * Load an obj mesh with normals and materials
  * Generate view rays from a camera
  * Rasterize the diffuse and specular albedo
  * Render and optimize diffuse and specular lighting with spherical gaussians

* `Working with Surface Meshes <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/working_with_meshes.ipynb>`_:

  * loading and constructing :class:`kaolin.rep.SurfaceMesh` objects
  * batching of meshes
  * auto-computing common attributes (like ``face_normals``)

* `Deep Marching Tetrahedra <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/dmtet_tutorial.ipynb>`_: reconstructs a tetrahedral mesh from a point cloud with `DMTet <https://nv-tlabs.github.io/DMTet/>`_, covering:

  * generating data with the Omniverse Kaolin App
  * loading point clouds from a ``.usd`` file
  * chamfer distance as a loss function
  * differentiable marching tetrahedra
  * using the Timelapse API for 3D checkpoints
  * visualizing 3D results of training

* `Understanding Structured Point Clouds (SPCs) <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/understanding_spcs_tutorial.ipynb>`_: walks through SPC features, covering:

  * under-the-hood explanation of SPC, why it's useful and key ops
  * loading a mesh
  * sampling a point cloud
  * converting a point cloud to SPC
  * setting up a camera
  * rendering SPC with ray tracing
  * storing features in an SPC

* `Differentiable Rendering <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/dibr_tutorial.ipynb>`_: optimizes a triangular mesh from images using the `DIB-R <https://github.com/nv-tlabs/DIB-R-Single-Image-3D-Reconstruction>`_ renderer, covering:

  * generating data with the Omniverse Kaolin App, and loading this synthetic data
  * loading a mesh
  * computing the mesh laplacian
  * DIB-R rasterization
  * differentiable texture mapping
  * computing mask intersection-over-union (IoU) loss
  * using the Timelapse API for 3D checkpoints
  * visualizing 3D results of training

* `Fitting a 3D Bounding Box <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/bbox_tutorial.ipynb>`_: fits a 3D bounding box around an object in images using the `DIB-R <https://github.com/nv-tlabs/DIB-R-Single-Image-3D-Reconstruction>`_ renderer, covering:

  * generating data with the Omniverse Kaolin App, and loading this synthetic data
  * loading a mesh
  * DIB-R rasterization
  * computing mask intersection-over-union (IoU) loss

* :ref:`3d_viz`: explains saving 3D checkpoints and visualizing them, covering:

  * using the Timelapse API for writing 3D checkpoints
  * understanding the output file format
  * visualizing 3D checkpoints using the Omniverse Kaolin App
  * visualizing 3D checkpoints using the bundled ``kaolin-dash3d`` command-line utility

* `Reconstructing Point Cloud with DMTet <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/dmtet_tutorial.ipynb>`_: trains an SDF estimator to reconstruct a mesh from a point cloud, covering:

  * using point cloud data generated with the Omniverse Kaolin App
  * loading point clouds from a USD file
  * defining losses and regularizers for a mesh with point cloud ground truth
  * applying marching tetrahedra
  * using the Timelapse API for 3D checkpoints
  * visualizing 3D checkpoints using ``kaolin-dash3d``
Simple Recipes
--------------

* I/O and Data Processing:

  * `usd_kitchenset.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/usd_kitchenset.py>`_: loading multiple meshes from a ``.usd`` file and saving them
  * `spc_from_pointcloud.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/dataload/spc_from_pointcloud.py>`_: converting a point cloud to an SPC object
  * `occupancy_sampling.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/preprocess/occupancy_sampling.py>`_: computing the occupancy function of points in a mesh using ``check_sign``
  * `spc_basics.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/spc/spc_basics.py>`_: showing attributes of an SPC object
  * `spc_dual_octree.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/spc/spc_dual_octree.py>`_: computing and explaining the dual of an SPC octree
  * `spc_trilinear_interp.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/spc/spc_trilinear_interp.py>`_: computing trilinear interpolation of a point cloud on an SPC

* Visualization:

  * `visualize_main.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/tutorial/visualize_main.py>`_: using the Timelapse API to write mock 3D checkpoints
  * `fast_mesh_sampling.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/preprocess/fast_mesh_sampling.py>`_: using CachedDataset to preprocess a ShapeNet dataset so that point clouds can be sampled efficiently at runtime

* Camera:

  * `cameras_differentiable.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/cameras_differentiable.py>`_: optimizing a camera position
  * `camera_transforms.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_transforms.py>`_: using the :func:`Camera.transform()` function
  * `camera_ray_tracing.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_ray_tracing.py>`_: how to design a ray-generating function using :class:`Camera` objects
  * `camera_properties.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_properties.py>`_: exposing some of the camera attributes and properties
  * `camera_opengl_shaders.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_opengl_shaders.py>`_: using the camera with glumpy
  * `camera_movement.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_movement.py>`_: manipulating a camera position and zoom
  * `camera_init_simple.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_init_simple.py>`_: making Camera objects with the flexible :func:`Camera.from_args()` constructor
  * `camera_init_explicit.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_init_explicit.py>`_: making :class:`CameraIntrinsics` and :class:`CameraExtrinsics` with all the different constructors available
  * `camera_coordinate_systems.py <https://github.com/NVIDIAGameWorks/kaolin/blob/master/examples/recipes/camera/camera_coordinate_systems.py>`_: changing the coordinate system in a :class:`Camera` object