138 lines
6.3 KiB
Markdown
138 lines
6.3 KiB
Markdown
---
|
|
name: python-tensor-typing
|
|
description: Use when working on tensor-heavy or numerical Python code in repositories that already use or are explicitly standardizing on jaxtyping and beartype. Apply shape and dtype annotations plus boundary-focused runtime validation without introducing these tools to unrelated code unless requested.
|
|
---
|
|
|
|
# Python Tensor Typing
|
|
|
|
Use this skill for tensor-heavy or numerical code that benefits from explicit shape and dtype contracts. It should not leak into unrelated Python work.
|
|
|
|
## Defaults
|
|
|
|
- Use `DType[ArrayType, "shape names"]`, for example `Float[np.ndarray, "batch channels"]`.
|
|
- Reuse axis names to express shared dimensions across arguments and returns.
|
|
- Prefer reusable aliases for common tensor shapes.
|
|
- Prefer concrete array types after normalization; use broader input types only at ingestion boundaries.
|
|
- Prefer `@jaxtyped(typechecker=beartype)` on stable boundaries and test-targeted helpers by default so tensor shape or dtype mismatches fail early.
|
|
- Avoid applying runtime checking blindly to hot inner loops.
|
|
- Only avoid `from __future__ import annotations` in modules that rely on runtime annotation inspection.
|
|
|
|
```python
|
|
import numpy as np
|
|
from beartype import beartype
|
|
from jaxtyping import Float, jaxtyped
|
|
|
|
Batch = Float[np.ndarray, "batch channels"]
|
|
|
|
|
|
@jaxtyped(typechecker=beartype)
|
|
def normalize(x: Batch) -> Batch:
|
|
...
|
|
```
|
|
|
|
## Integrated Jaxtyping Reference
|
|
|
|
Use this section as the built-in cheat sheet. Do not rely on a separate reference file.
|
|
|
|
### Mental Model
|
|
|
|
- `jaxtyping` provides shape and dtype annotations plus runtime type checking for JAX, PyTorch, NumPy, MLX, and TensorFlow arrays and tensors.
|
|
- The name is historical; it is not JAX-only.
|
|
- Static type checkers do not fully understand shape constraints. In practice `dtype[array, shape]` is mostly treated as just `array`, so use runtime checks to catch shape or dtype mistakes early.
|
|
- This skill defaults to `beartype` as the runtime checker.
|
|
|
|
### Core Syntax
|
|
|
|
- Annotate arrays as `DType[ArrayType, "shape"]`.
|
|
- Common dtype families:
|
|
- `Shaped` for any dtype
|
|
- `Bool`
|
|
- `Num` for numeric tensors
|
|
- `Real`, `Float`, `Complex`, `Int`, `UInt`
|
|
- precision-specific forms like `Float32`, `Int64`, `UInt8`
|
|
- Representative examples:
|
|
- `Float[np.ndarray, "batch channels"]`
|
|
- `Int[np.ndarray, "persons"]`
|
|
- `Shaped[ArrayLike, "batch time features"]`
|
|
- `Float[Tensor, "... channels"]`
|
|
|
|
### Shape Rules
|
|
|
|
- Use fixed integers for exact axes: `"3 3"`.
|
|
- Use named axes to enforce equality across values: `"batch time"` and `"time features"`.
|
|
- Use symbolic expressions when the return shape is derived from argument shapes: `"dim-1"`.
|
|
- Use `...` for anonymous zero-or-more axes. You may only use one variadic axis per annotation.
|
|
- Use `*name` for a named variadic axis.
|
|
- Use `#name` when broadcasting with size `1` should be accepted.
|
|
- Use `_name` or just `_` when an axis is documentation-only and should not be runtime-checked.
|
|
- Use `name=...` for documentation-only labels like `rows=4`.
|
|
- Use `""` for a scalar shape.
|
|
- Use `"..."` when you only want dtype checking and do not want to constrain shape.
|
|
- Prefer meaningful names like `batch`, `frames`, `persons`, `keypoints`, `dims`, `channels`.
|
|
|
|
### Array And Alias Guidance
|
|
|
|
- Prefer concrete normalized array types in core logic:
|
|
- `Float[np.ndarray, "..."]`
|
|
- `Float[torch.Tensor, "..."]`
|
|
- `Float[jax.Array, "..."]`
|
|
- `Float[tf.Tensor, "..."]`
|
|
- `Float[mx.array, "..."]`
|
|
- Use broader input types like `Shaped[ArrayLike, "..."]` only at ingestion boundaries.
|
|
- Keep repeated shapes in local aliases instead of rewriting them in every signature.
|
|
- You can nest existing `jaxtyping` aliases:
|
|
|
|
```python
|
|
from jaxtyping import Float
|
|
import jax
|
|
|
|
Image = Float[jax.Array, "channels height width"]
|
|
BatchImage = Float[Image, "batch"]
|
|
```
|
|
|
|
- Duck-typed arrays also work if they expose `.shape` and `.dtype`.
|
|
- `typing.Any`, unions, type aliases, and bounded `TypeVar`s can all be used as the array type parameter when needed.
|
|
|
|
### Runtime Checking Patterns
|
|
|
|
- Prefer `@jaxtyped(typechecker=beartype)` for function-level checks.
|
|
- Use it on adapters, public tensor utilities, deserializers, boundary normalization functions, and tests.
|
|
- Do not decorate the hottest inner loops unless the runtime cost is acceptable.
|
|
- Avoid stringized annotations and avoid `from __future__ import annotations` in code paths that depend on runtime inspection.
|
|
- If static typing and runtime truth diverge, validate at runtime first and then use a narrowly scoped commented `cast(...)`.
|
|
|
|
```python
|
|
import numpy as np
|
|
from beartype import beartype
|
|
from jaxtyping import Float, jaxtyped
|
|
|
|
|
|
@jaxtyped(typechecker=beartype)
|
|
def center(x: Float[np.ndarray, "batch dims"]) -> Float[np.ndarray, "batch dims"]:
|
|
return x - x.mean(axis=0)
|
|
```
|
|
|
|
- For package-wide or test-only enforcement, `jaxtyping.install_import_hook` exists.
|
|
- Pytest can enable it with `--jaxtyping-packages=foo,bar.baz,beartype.beartype` or equivalent `addopts`.
|
|
- In notebooks, you can load the IPython extension and set `%jaxtyping.typechecker beartype.beartype`.
|
|
- With `jax.jit`, runtime shape checks occur during tracing, so the compiled code does not keep the checking overhead.
|
|
|
|
### Advanced Notes
|
|
|
|
- `print_bindings` can help inspect axis bindings when debugging shape mismatches.
|
|
- `AbstractDtype` can define custom dtypes for duck-typed arrays when a project needs that level of control.
|
|
- For unusual edge cases or less common features, see the upstream docs: `https://github.com/patrick-kidger/jaxtyping/tree/main/docs`.
|
|
|
|
### Tooling Gotchas
|
|
|
|
- `pyright` and `mypy` usually treat `dtype[array, shape]` as just `array`; do not expect them to prove shape safety.
|
|
- Ruff or `flake8` may complain about shape strings in annotations. Disabling `F722` is the usual fix for multidimensional forms.
|
|
- For one-dimensional annotations that trigger undefined-name lint errors, prefixing the shape with a space can route the linter to `F722` instead.
|
|
- Dataclass fields with stringified annotations can be skipped or mis-checked at runtime. Avoid them.
|
|
|
|
## Anti-Goals
|
|
|
|
- Do not introduce `jaxtyping` or `beartype` into non-numerical work just because this skill is loaded.
|
|
- Do not annotate every local scratch tensor when the extra ceremony does not improve clarity.
|
|
- Do not add runtime checking to hot loops unless the cost is acceptable.
|