Insights: pytorch/pytorch
2 Pull requests merged by 2 people
-
Fix staging for CPU tensors in OSS DCP async_save
#145408 merged
Jan 23, 2025 -
Prevent legacy_load when weights_only=True (correctly)
#145111 merged
Jan 17, 2025
193 Pull requests opened by 122 people
-
Revert "Fix for MSVC problem on Windows Arm64 (#136765)"
#145076 opened
Jan 17, 2025 -
Remove FFT from stride incorrect ops
#145080 opened
Jan 17, 2025 -
partitioner: avoid inserting duplicates into heap
#145082 opened
Jan 17, 2025 -
cpp_wrapper: Move #includes to per-device header files
#145083 opened
Jan 17, 2025 -
`torch.distributions`: replace `numbers.Number` with `torch.types.Number`.
#145086 opened
Jan 17, 2025 -
[POC] Extend torch function support to ALL arguments, not just scalar type (but not insides of list)
#145089 opened
Jan 17, 2025 -
Test
#145090 opened
Jan 17, 2025 -
cpp_wrapper/aot_inductor: handle conjugation and negation dispatch keys
#145095 opened
Jan 17, 2025 -
Use STL string_view header
#145098 opened
Jan 17, 2025 -
Maintain multiple configs
#145103 opened
Jan 17, 2025 -
futher scheduler changes for invoke_quant: prologue low prec, (slightly) more aggressive fusion
#145104 opened
Jan 17, 2025 -
WIP remove -E workaround for nvcc
#145116 opened
Jan 17, 2025 -
[EXPERIMENTAL][dynamo] optimize `DictGetItemGuardAccessor`
#145117 opened
Jan 17, 2025 -
WIP sccache simplified
#145119 opened
Jan 17, 2025 -
[triton] Update triton pin to include warp specialization support
#145120 opened
Jan 17, 2025 -
inductor: Don't throw an internal error when a nn.module is missing a attribute
#145122 opened
Jan 17, 2025 -
Repro collective timeout and FR dump
#145125 opened
Jan 18, 2025 -
test trigger dispatch
#145126 opened
Jan 18, 2025 -
[executorch hash update] update the pinned executorch hash
#145128 opened
Jan 18, 2025 -
[cuBLAS][cuBLASLt] Unify `cuBLASLt` workspaces with `cuBLAS` workspaces
#145130 opened
Jan 18, 2025 -
[dynamo] Log guard latency
#145132 opened
Jan 18, 2025 -
[inductor] [bug fix] Fix `conv` on processing uint
#145136 opened
Jan 18, 2025 -
improve perf for layer_norm
#145146 opened
Jan 18, 2025 -
[BE][Easy] increase pip timeout for nightly tool: 15s -> 60s
#145147 opened
Jan 18, 2025 -
[BE][PYFMT] bump `ruff format` target version to py39: add parentheses around long `with`-statements
#145148 opened
Jan 18, 2025 -
[inductor] Simplify _inductor/utils.py slightly
#145150 opened
Jan 18, 2025 -
[BE]: Apply ruff PERF401 to torch
#145153 opened
Jan 18, 2025 -
Make `inductor_utils.requires_gpu` accept MPS
#145156 opened
Jan 18, 2025 -
[BE]: Update NCCL submodule to 2.24.3
#145167 opened
Jan 19, 2025 -
Added weight to MSELoss Criterion
#145169 opened
Jan 19, 2025 -
[BE]: Update CUTLASS submodule to 3.7.0
#145172 opened
Jan 19, 2025 -
[scan] scan dim handling in user-facing scan()
#145179 opened
Jan 19, 2025 -
Added torch check to ensure indices are not empty
#145180 opened
Jan 19, 2025 -
Add transpose support for CppMicroGemmFP32Vec
#145194 opened
Jan 20, 2025 -
Guard size oblivious within empty_tensor_restride_symint
#145196 opened
Jan 20, 2025 -
Use std::string_view in get_fully_qualified_type_name
#145197 opened
Jan 20, 2025 -
CI test: TestAutograd.test_gradcheck_nondeterministic
#145205 opened
Jan 20, 2025 -
Update slow tests
#145206 opened
Jan 20, 2025 -
Fix incorrect citation of authors in documentation
#145209 opened
Jan 20, 2025 -
solve apl dependency issue
#145215 opened
Jan 20, 2025 -
Refactoring Distributed test cases to be device agnostic [1/n]
#145222 opened
Jan 20, 2025 -
Raise MutationError if there are side effects when returning generator
#145223 opened
Jan 20, 2025 -
update sympy version 1.13.3 in setup.py (previously update only in requirement.txt)
#145224 opened
Jan 20, 2025 -
fix test_cublas_workspace_explicit_allocation for gfx12
#145227 opened
Jan 20, 2025 -
Expose the rendezvous keepalive arguments
#145228 opened
Jan 20, 2025 -
Improve typing in torch/types.py
#145237 opened
Jan 21, 2025 -
Improve typing in torch/__init__.py
#145238 opened
Jan 21, 2025 -
Improve typing in torch/_C/__init__.pyi.in
#145239 opened
Jan 21, 2025 -
add grad_output shape check for adaptive_avg_pool2d_backward
#145241 opened
Jan 21, 2025 -
[inductor] Make serialized inductor patterns path configurable instead of using …
#145243 opened
Jan 21, 2025 -
Improve typing by using bool and int
#145244 opened
Jan 21, 2025 -
[Quant][CPU] add a wrapper op in quantized_decomposed for _weight_int4pack_mm_for_cpu
#145245 opened
Jan 21, 2025 -
[Inductor UT] Set input tensors to corresponding device for test case in test_aot_indutor.py
#145248 opened
Jan 21, 2025 -
[Inductor][CPU] Add a lowering pass for _weight_int4pack_mm_for_cpu
#145250 opened
Jan 21, 2025 -
change the test wheel to release wheel when release wheel available
#145252 opened
Jan 21, 2025 -
[TEST] tmp storage with CONSTANTHANDLE
#145254 opened
Jan 21, 2025 -
[ARM] Add test_ops and test_memory_profiler to aarch64 tests
#145260 opened
Jan 21, 2025 -
Fix SEGFAULT when None arg was passed in GraphContext.op(..)
#145265 opened
Jan 21, 2025 -
Improve the caching allocator test for raw alloc
#145269 opened
Jan 21, 2025 -
Remove unnecessary HPUHooksInterface method
#145272 opened
Jan 21, 2025 -
Updates NCCL user buffer registration test for NCCL 2.24.3
#145285 opened
Jan 21, 2025 -
update get start xpu
#145286 opened
Jan 21, 2025 -
[ROCm] miopen benchmark behavior now better aligns with cudnn
#145294 opened
Jan 21, 2025 -
Add unique identifer to bmm thread_mm functions
#145303 opened
Jan 21, 2025 -
[BE] Remove test_ops from FIXME_inductor_dont_reset_dynamo
#145307 opened
Jan 21, 2025 -
[Utilization] post-test-process workflow
#145310 opened
Jan 21, 2025 -
windows builds with VS2022
#145319 opened
Jan 21, 2025 -
inductor: Explicitly test that torch.compile(option=...) does something
#145321 opened
Jan 21, 2025 -
fix a small typo in comments
#145323 opened
Jan 21, 2025 -
Add stft option to align window for center = false
#145324 opened
Jan 21, 2025 -
[utilization] pipeline to create clean db records
#145327 opened
Jan 22, 2025 -
[WIP] [AOTInductor] Use AtenTensorHandle as the constant map's holder.
#145331 opened
Jan 22, 2025 -
PEP585: Missed conversions
#145342 opened
Jan 22, 2025 -
[dtensor][cp] experiment: call flex_attention on DTensor
#145353 opened
Jan 22, 2025 -
ehnace logging statically known by adding size_oblivious(..)
#145354 opened
Jan 22, 2025 -
[WIP] Fix avg_pool crash with negative numbers
#145358 opened
Jan 22, 2025 -
removed check for ConvTranspose3D on MPS
#145366 opened
Jan 22, 2025 -
[ARM] Fix broken tests in test_tensor_creation_ops on AArch64
#145367 opened
Jan 22, 2025 -
Enable C++ API parity tests on AArch64
#145370 opened
Jan 22, 2025 -
[inductor][BE] Enable test_cpu_cpp_wrapper in fbcode
#145373 opened
Jan 22, 2025 -
[torchbench] Increase tolerance for amp only poolformer_m36
#145375 opened
Jan 22, 2025 -
Use AOTI as inductor backend when fullgraph_package is enabled.
#145381 opened
Jan 22, 2025 -
[DO NOT MERGE] pre-merge runs only on MI200 and post-merge runs on both MI300
#145389 opened
Jan 22, 2025 -
Update NJT linear_backward to return non-aliased tensor bias grad
#145399 opened
Jan 22, 2025 -
[auto_functionalized] Support `Tensor(a!)[]?`
#145400 opened
Jan 22, 2025 -
Update OSS nested tensor docs to focus on NJT
#145402 opened
Jan 22, 2025 -
Use guard_size_oblivious in debug tensor writer
#145403 opened
Jan 22, 2025 -
[distributions] Catch inf gradient in beta distribution
#145404 opened
Jan 22, 2025 -
[export][be] Clean up local imports from export [2/n]
#145406 opened
Jan 22, 2025 -
[dynamo][fbcode] Turn on inline_inbuilt_nn_modules
#145407 opened
Jan 22, 2025 -
Add Torchao docs link to Pytorch libraries
#145412 opened
Jan 22, 2025 -
[dynamo][not ready - just for CI] Remove all builtin skiplist
#145415 opened
Jan 22, 2025 -
TopK ROCm Tuning
#145416 opened
Jan 22, 2025 -
[dynamo][hop] test torch.compiling all user-facing HOPs
#145422 opened
Jan 22, 2025 -
Tag storages with offset in file when with FakeTensorMode
#145424 opened
Jan 22, 2025 -
Fix aot inductor intermediate debug printing
#145426 opened
Jan 22, 2025 -
add pt2 callbacks for backward pass
#145427 opened
Jan 23, 2025 -
[ca][hop] test CA on all HOPs
#145429 opened
Jan 23, 2025 -
[dynamo] Re-enable `test_torch_name_rule_map_updated`
#145431 opened
Jan 23, 2025 -
[dynamo] save/restore system random state more carefully
#145435 opened
Jan 23, 2025 -
Advance past fc window for stft center
#145437 opened
Jan 23, 2025 -
[draft_export] add LOC for data-dep error logging
#145443 opened
Jan 23, 2025 -
[Docs] Add clarification for target types in CrossEntropyLoss doc
#145444 opened
Jan 23, 2025 -
Record inputs at time of tracing, constrain to them for triton fn
#145448 opened
Jan 23, 2025 -
Fix incorrect type comparison
#145449 opened
Jan 23, 2025 -
Replace is_same with is_same_v for concise syntax
#145450 opened
Jan 23, 2025 -
Add check that envvar configs are boolean
#145454 opened
Jan 23, 2025 -
Update TorchBench commit to main
#145455 opened
Jan 23, 2025 -
[c10d] Flush file in file recorder
#145458 opened
Jan 23, 2025 -
[AOTInductor] Align behavior between CPU and GPU
#145459 opened
Jan 23, 2025 -
[TEST ONLY] Conv with `oc = 0`
#145462 opened
Jan 23, 2025 -
OpenReg: Refactor impl_registry
#145465 opened
Jan 23, 2025 -
[Intel GPU] Add TORCH_API macro to export symbol NestedTensor_to_mask for libtorch_xpu
#145467 opened
Jan 23, 2025 -
fix test_convolution error when use cudnn.flags
#145474 opened
Jan 23, 2025 -
[dynamo] added support to trace torch.cuda.is_current_stream_capturing
#145475 opened
Jan 23, 2025 -
Adapt Dynamo Tests to HPUs
#145476 opened
Jan 23, 2025 -
Modify enable logic of COLLECTIVE_COMM profiler activity type
#145478 opened
Jan 23, 2025 -
[Dynamo] Fix names collisions with foreach decomps
#145479 opened
Jan 23, 2025 -
simplify torch.utils.cpp_extension.include_paths; use it in cpp_builder
#145480 opened
Jan 23, 2025 -
feat: add SVE dispatch for non-FBGEMM qembeddingbag
#145486 opened
Jan 23, 2025 -
Remove unnecessary "special linking" for `BLAS_LIBRARIES`
#145487 opened
Jan 23, 2025 -
[torchbench] Add meta function for _cudnn_rnn_flatten_weight
#145488 opened
Jan 23, 2025 -
cpp_wrapper: Move #includes to per-device header files
#145490 opened
Jan 23, 2025 -
[BE]: Fix OrderedSet equality oversight
#145492 opened
Jan 23, 2025 -
[inductor] Make triton kernel autotune config defaults backward-compatible
#145494 opened
Jan 23, 2025 -
[compiled_autograd] Rename interface to pyinterface
#145495 opened
Jan 23, 2025 -
Remove truncated normal initialization for 16-bit (and lower) tensors
#145499 opened
Jan 23, 2025 -
[ROCm] Update workflow to use root user instead of jenkins user
#145504 opened
Jan 23, 2025 -
Bump AOTriton to 0.8.2b
#145508 opened
Jan 23, 2025 -
[dynamo][guards] Log guard latency to tlparse
#145509 opened
Jan 23, 2025 -
Add istft option to align window for center = false
#145510 opened
Jan 23, 2025 -
Add missing autoreleasepool around runUniqueGraph to prevent leaks
#145512 opened
Jan 23, 2025 -
[WIP][inductor][5/N] triton support post-#5512, fix 1 and None handling
#145515 opened
Jan 23, 2025 -
[ROCM MI300 skips for flaky unit tests
#145518 opened
Jan 23, 2025 -
Fix allow_mutation_on_saved_tensors for inplace foreach
#145520 opened
Jan 23, 2025 -
General Changes for multi accelerators
#145521 opened
Jan 23, 2025 -
inductor.config.descriptive_names = False is not actually supported
#145523 opened
Jan 23, 2025 -
Fix IdentationError of code example
#145525 opened
Jan 23, 2025 -
[MPS] Add bilineard2d_aa implementation
#145526 opened
Jan 23, 2025 -
fix intermediate debug information with cpp_wrapper
#145527 opened
Jan 23, 2025 -
[utils] add try_import method for importing optional modules
#145528 opened
Jan 23, 2025 -
Work around buggy use_const_ref_for_mutable_tensors
#145530 opened
Jan 23, 2025 -
Disable slow gradcheck for nn.Transformer ModuleInfo
#145531 opened
Jan 23, 2025 -
Make sure that benchmark_harness is set before running
#145532 opened
Jan 23, 2025 -
Remove det_singular OpInfo
#145533 opened
Jan 23, 2025 -
Increase the number of perf benchmark shards
#145534 opened
Jan 23, 2025 -
Removes threadfence from topk kernel to improve AMD performance
#145536 opened
Jan 23, 2025 -
[dynamo] Properly model torch profiler context objects
#145537 opened
Jan 23, 2025 -
Make sure not using cpp wrapper when setting nvtx training annotation
#145538 opened
Jan 23, 2025 -
Add accuracy issue support in AOTI Minifier
#145539 opened
Jan 23, 2025 -
[BE][hop] make it easier to use speculate_subgraph
#145540 opened
Jan 23, 2025 -
[BE] Type annotate wrapper_benchmark.py and cuda_combined_scheduling.py
#145542 opened
Jan 23, 2025 -
Testing #144594
#145546 opened
Jan 23, 2025 -
[dynamo][refactor] Move collections.namedtuple out of SkipFunctionVariable
#145547 opened
Jan 23, 2025 -
fix unbacked + view incorrectness
#145548 opened
Jan 23, 2025 -
[ca] add test_reset for 2.6 release validation
#145549 opened
Jan 23, 2025 -
If mypy fails it should report the error back to lintrunner
#145550 opened
Jan 23, 2025 -
Remove incorrect BuiltinVariable.call_hasattr()
#145551 opened
Jan 23, 2025 -
Turn on mypy for _dynamo/variables/builtin.py
#145552 opened
Jan 23, 2025 -
Fix call to create_load_global
#145553 opened
Jan 23, 2025 -
Fix dynamo use of `list[int]` in graph break
#145554 opened
Jan 23, 2025 -
Add CUDA 12.8 installation and Linux CD Docker images
#145557 opened
Jan 23, 2025 -
[dynamo][builtin-skipfile-cleanup] Support tuple.__new__
#145558 opened
Jan 23, 2025 -
[dynamo][builtin-skipfile-cleanup] Remove collections
#145559 opened
Jan 23, 2025 -
[Not for land] hacking up mx
#145562 opened
Jan 24, 2025 -
Refactor fuzzer and add support for Dynamo
#145565 opened
Jan 24, 2025 -
Advance docker release latest verison to cuda 12.4
#145566 opened
Jan 24, 2025 -
Add CUDA 12.8 installation and manylinux-cuda12.8
#145567 opened
Jan 24, 2025 -
[mps] Hoist erfinv logic out of the kernel in preparation for moving.
#145568 opened
Jan 24, 2025 -
[aotinductor] update unbacked symint runtime assertion msg
#145569 opened
Jan 24, 2025 -
[BE] mv test/inductor_skips/* to test/inductor_expected_failures/
#145572 opened
Jan 24, 2025 -
[inductor/profiler] add kernel kwargs instrumentation
#145573 opened
Jan 24, 2025 -
[inductor][3/N] triton support post-#5512, tt.divisibility format
#145575 opened
Jan 24, 2025 -
[Inductor][CPP] fix torch logit decomposition
#145576 opened
Jan 24, 2025 -
[inductor] Fix duplicate detection in _dynamic_scale_rblock
#145577 opened
Jan 24, 2025 -
Spruce up docs for emulate_precision_casts
#145579 opened
Jan 24, 2025 -
[MPS][BE] Implement bilineard2d as shader
#145581 opened
Jan 24, 2025 -
[inductor][4/N] triton support post-#5512, fix constexpr signatures
#145583 opened
Jan 24, 2025 -
[ROCm] Eliminate the need for divisions in layernorm for default vector size
#145584 opened
Jan 24, 2025 -
[Custom Ops] Add a new API to allow users to register an autocast for the custom op
#145588 opened
Jan 24, 2025 -
Replace decorators in UTs to cover additional devices
#145589 opened
Jan 24, 2025 -
[dynamo][benchmarks] Stop benchmarking compile time of dead code
#145590 opened
Jan 24, 2025 -
[CCA] remove TODO for hardware_destructive_interference_size
#145591 opened
Jan 24, 2025 -
Fix constants with non-functional operators
#145593 opened
Jan 24, 2025 -
[micro_pipeline_tp] add logging for all-gather-matmul fusion
#145594 opened
Jan 24, 2025 -
[micro_pipeline_tp] support pattern matching row-wise scaled_mm with sharded scale
#145595 opened
Jan 24, 2025 -
Add `torch._foreach_copy_` doc
#145597 opened
Jan 24, 2025 -
[NFC] Fix some minor typos.
#145599 opened
Jan 24, 2025 -
Add device support for chunk_cat, all_gather_copy_in, and split_with_…
#145600 opened
Jan 24, 2025 -
[ATen][CUDA][Transformers] Add Blackwell support to SDPA
#145602 opened
Jan 24, 2025 -
[dynamo] refactor dynamo__custom_eval_frame to C++, refactor SKIP_CODE[_RECURSIVE]
#145603 opened
Jan 24, 2025 -
WIP error_prop sc
#145605 opened
Jan 24, 2025 -
[BE][CI] bump ruff to 0.9.3
#145606 opened
Jan 24, 2025
148 Issues closed by 54 people
-
DISABLED test_arange_dynamic_cuda (__main__.TestInductorDynamicCUDA)
#127067 closed
Jan 24, 2025 -
torch.sin/cos/tan+torch.floor/round may bring wrong results with torch.compile
#145466 closed
Jan 24, 2025 -
torch crashes on ubuntu:24.04 during SDPA-CuDNN test
#145580 closed
Jan 24, 2025 -
Error: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate more than 1EB memory
#145369 closed
Jan 24, 2025 -
Inference super slow with torchvision model fasterrcnn_mobilenet_v3_large_fpn
#145032 closed
Jan 24, 2025 -
DISABLED test_bw_decoding_fails_float16 (__main__.TestFlexDecoding)
#141761 closed
Jan 24, 2025 -
lerp_ doesn't correctly type promote
#140601 closed
Jan 24, 2025 -
HOP input mutation analysis is not comprehensive
#137639 closed
Jan 23, 2025 -
CUDA memory leak in model_container::run_const_fold
#126059 closed
Jan 23, 2025 -
Flexattention: ValueError: Shape element 1 must be a power of 2
#133321 closed
Jan 23, 2025 -
DISABLED test_fn_grad_linalg_det_singular_cuda_complex128 (__main__.TestBwdGradientsCUDA)
#93044 closed
Jan 23, 2025 -
DISABLED test_forward_mode_AD_linalg_det_singular_cuda_complex128 (__main__.TestFwdGradientsCUDA)
#93045 closed
Jan 23, 2025 -
qmul.cpp:34:10: error: redefinition of 'xnn_binary_params' 34 | struct xnn_binary_params { | ^
#145497 closed
Jan 23, 2025 -
Async distributed checkpointing works incorrectly with tensors on CPU
#144657 closed
Jan 23, 2025 -
cannot pickle 'torch._C._aoti.AOTIModelPackageLoader' object
#145411 closed
Jan 23, 2025 -
DISABLED test_trigger_bisect_on_error (__main__.ExcTests)
#131303 closed
Jan 23, 2025 -
PR #89436 looks like it causes or enables a memory leak
#90464 closed
Jan 23, 2025 -
"Unknown builtin op" error during jit.load() of TorchScript module with @custom_op
#143773 closed
Jan 23, 2025 -
Memory Leak in MPS Backend During LSTM Iterations (Out of Memory Error)
#145374 closed
Jan 23, 2025 -
cloning third_party/kleidiai fails
#145273 closed
Jan 23, 2025 -
DISABLED test_unbacked_bindings_for_divisible_u_symint (__main__.TestExport)
#138586 closed
Jan 23, 2025 -
DISABLED test_unbacked_bindings_for_divisible_u_symint_non_strict (__main__.NonStrictExportTestExport)
#138585 closed
Jan 23, 2025 -
DISABLED test_unbacked_bindings_for_divisible_u_symint_retraceability (__main__.RetraceExportTestExport)
#138584 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv_retraceability_non_strict (__main__.RetraceExportNonStrictTestExport)
#138675 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv_non_strict (__main__.NonStrictExportTestExport)
#131088 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv_retraceability (__main__.RetraceExportTestExport)
#131083 closed
Jan 23, 2025 -
[inductor] [cuda] [silence] `F.gumbel_softmax` return inconsistent resutls compared with eager
#145470 closed
Jan 23, 2025 -
Failed to export the model to ONNX
#144750 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv_training_ir_to_decomp (__main__.TrainingIRToRunDecompExportTestExport)
#131082 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv_serdes_non_strict (__main__.SerDesExportNonStrictTestExport)
#138884 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv_serdes (__main__.SerDesExportTestExport)
#131119 closed
Jan 23, 2025 -
DISABLED test_slice_with_floordiv (__main__.TestExport)
#131101 closed
Jan 23, 2025 -
Link to `third_party/eigen` git submodule is broken
#145496 closed
Jan 23, 2025 -
[Tracking Issue] Mixed precision does not work with ignored modules
#90318 closed
Jan 23, 2025 -
torch.jit.trace wrong function mapping: > maps to aten::lt
#145485 closed
Jan 23, 2025 -
libtorch_python.dylib not getting symlinked correctly in OSX 13 with pytorch-cpu
#145469 closed
Jan 23, 2025 -
torch.backends.cudnn.flags use error when test
#145472 closed
Jan 23, 2025 -
Non_blocking copy behavior on non-cuda/non-privateuse1 accelerator might be unexpected
#143641 closed
Jan 23, 2025 -
DISABLED test_device_mode_ops_sparse_sampled_addmm_cpu_float32 (__main__.TestDeviceUtilsCPU)
#132720 closed
Jan 23, 2025 -
torch._neg_view correctness
#145428 closed
Jan 23, 2025 -
[RFC] Improve performance for softmax op for cuda in some specific size
#144645 closed
Jan 23, 2025 -
DISABLED test_autograd_function_backed_op (__main__.TestCustomOpWithCompiledAutograd)
#121342 closed
Jan 22, 2025 -
DISABLED test_aot_sequence_nr (__main__.DynamicShapesAotAutogradFallbackTests)
#106440 closed
Jan 22, 2025 -
DISABLED test_no_grad_copy (__main__.TestAutograd)
#139734 closed
Jan 22, 2025 -
Footgun: tracer.root.register_module( in HOPs
#140760 closed
Jan 22, 2025 -
DISABLED test_second_order_accurate (__main__.TestGradient)
#116746 closed
Jan 22, 2025 -
No period in docstring of torch.compiler.disable
#145365 closed
Jan 22, 2025 -
DISABLED test_large_weight_non_abi_compatible_cuda (__main__.AOTInductorTestNonABICompatibleCuda)
#127068 closed
Jan 22, 2025 -
DISABLED test_large_mmaped_weights_non_abi_compatible_cuda (__main__.AOTInductorTestNonABICompatibleCuda)
#127202 closed
Jan 22, 2025 -
DISABLED test_cpp_frontend_module_has_same_output_as_python (__main__.TestCppExtensionJIT)
#116105 closed
Jan 22, 2025 -
[aarch64] multiple inductor test failures related to vec128_bfloat16
#144818 closed
Jan 22, 2025 -
[Inductor][GPU] Input is padded with incorrect value when executing `torch.nn.functional.pad` on gpu
#144462 closed
Jan 22, 2025 -
[inductor][gpu] torch.fft.fft outputs incorrect results when `n>1`
#143719 closed
Jan 22, 2025 -
[inductor][gpu] torch.nn.functional.avg_pool1d outputs incorrect result when input.numel() is 1
#143720 closed
Jan 22, 2025 -
Release Pyotrch version 2.6.0 in pypi
#145142 closed
Jan 22, 2025 -
[Device] `ConvTranspose` bahaves differently on CPU and CUDA when `out_channels=0`
#142466 closed
Jan 22, 2025 -
DISABLED test_fs_preserve_sharing (__main__.TestMultiprocessing)
#91467 closed
Jan 22, 2025 -
DISABLED test_min_cut_partitioner_recomputable_ops (__main__.TestPartitioning)
#104327 closed
Jan 22, 2025 -
make latexpdf
#145221 closed
Jan 22, 2025 -
DISABLED test_cpp_extension_recommends_custom_ops_dynamic_shapes (__main__.DynamicShapesMiscTests)
#127813 closed
Jan 22, 2025 -
DISABLED test_fake_crossref_backward_no_amp_index_fill_cuda_float32 (__main__.TestFakeTensorCUDA)
#99126 closed
Jan 22, 2025 -
Dtype available for `torch.optim.Adam` and `torch.optim.AdamW` when `fused=True` is different from described
#145282 closed
Jan 22, 2025 -
Optimizer state cannot get offloaded to CPU
#144397 closed
Jan 22, 2025 -
When calling a custom function of a LlamaForCausalLM using FSDP causes RuntimeError
#145281 closed
Jan 22, 2025 -
XPU builds validations
#145290 closed
Jan 22, 2025 -
DISABLED test_open_device_registration (__main__.TestCppExtensionOpenRgistration)
#100152 closed
Jan 22, 2025 -
DISABLED test_basic (__main__.TestPythonDispatch)
#145096 closed
Jan 22, 2025 -
DISABLED test_max_autotune_remote_caching_dynamic_False (__main__.TestMaxAutotuneRemoteCache)
#145360 closed
Jan 22, 2025 -
DISABLED test_comprehensive_fft_ifft_cuda_float64 (__main__.TestInductorOpInfoCUDA)
#127344 closed
Jan 22, 2025 -
nn.Embedding backwards pass for nested tensors
#145257 closed
Jan 22, 2025 -
DISABLED test_aot_module_simplified_fake_tensor_gm_raises (__main__.TestAOTModuleSimplified)
#124590 closed
Jan 22, 2025 -
Dynamo graph break on PEP585 generic types
#145226 closed
Jan 22, 2025 -
trace.save_real_tensors segfaults on resnet
#143524 closed
Jan 21, 2025 -
Investigate potential cost savings for inductor workflows
#138476 closed
Jan 21, 2025 -
DISABLED test_identity_float32 (__main__.TestTemplatedSDPA)
#124659 closed
Jan 21, 2025 -
Accessing secrets variables in CI
#144853 closed
Jan 21, 2025 -
[CD] Nightly Release Linux Manywheel builds add size check
#137362 closed
Jan 21, 2025 -
Flakybot fails to fetch test ownership information
#144964 closed
Jan 21, 2025 -
Binaries Python 3.13t failing linux-aarch64-binary-manywheel and linux-binary-manywheel
#145234 closed
Jan 21, 2025 -
DISABLED test_alibi_causal_float32 (__main__.TestTemplatedSDPA)
#124588 closed
Jan 21, 2025 -
DISABLED test_alibi_bias_float32 (__main__.TestTemplatedSDPA)
#124526 closed
Jan 21, 2025 -
torch/_prims/executor.py #TODO : caching
#145171 closed
Jan 21, 2025 -
isin prevents dynamic shapes in modules
#142507 closed
Jan 21, 2025 -
DISABLED test_autograd_in_attr (__main__.TestPythonDispatch)
#145068 closed
Jan 21, 2025 -
DISABLED test_returning_symint (__main__.TestPythonRegistration)
#144920 closed
Jan 21, 2025 -
DISABLED test_register_functional_op_multiple_returns (__main__.TestPythonRegistration)
#142807 closed
Jan 21, 2025 -
DISABLED test_override_aten_ops_with_multiple_libraries (__main__.TestPythonRegistration)
#142460 closed
Jan 21, 2025 -
DISABLED test_register_fallthrough (__main__.TestPythonRegistration)
#142494 closed
Jan 21, 2025 -
DISABLED test_override_cuda_with_jiterator (__main__.TestPythonRegistration)
#142495 closed
Jan 21, 2025 -
DISABLED test_register_functional_op_with_optional (__main__.TestPythonRegistration)
#117871 closed
Jan 21, 2025 -
DISABLED test_register_functional_op_no_returns (__main__.TestPythonRegistration)
#117834 closed
Jan 21, 2025 -
Performance regression when using @torch.compile compared to no compilation
#144822 closed
Jan 21, 2025 -
F.scaled_dot_product_attention get query @ key
#145276 closed
Jan 21, 2025 -
`torch.compile` may produce wrong result with `BicubicInterp+Neg+Linear+Tan`.
#145264 closed
Jan 21, 2025 -
`torch.compile` may produce wrong result with `torch.nn.functional.interpolate`.
#145268 closed
Jan 21, 2025 -
Calculation Results Become NaN After Using `torch.compile` with `Matmul+Concat4+Mul+Linear+Tan`.
#145266 closed
Jan 21, 2025 -
A confusion about Bidirectional GRU
#145073 closed
Jan 21, 2025 -
Exporting a model with dynamic axes and dynamo fails with `TypeError: unhashable type: 'list'`
#144860 closed
Jan 21, 2025 -
Cannot build static windows libraries
#111905 closed
Jan 21, 2025 -
Investigate CUDA enabled build-time difference between MSVC and GCC+WSL
#91623 closed
Jan 21, 2025 -
DISABLED test_out_of_order_index_ds (__main__.TestOutOfOrderDataLoader)
#142343 closed
Jan 21, 2025 -
BUG: torch.exp for complex types on Linux chokes in some cases
#136063 closed
Jan 21, 2025 -
torch.asin returns incorrect value with complex input on cpu
#138327 closed
Jan 21, 2025 -
Floating point exception (core dumped) in `thnn_conv2d`
#143489 closed
Jan 21, 2025 -
[XPU] Nightly binary builds for XPU Linux and Windows are failing since 01.11.2025
#144967 closed
Jan 20, 2025 -
[Perf] Flash-Attn Bwd slow down w/ cutlass 3.6.0 in General
#144729 closed
Jan 20, 2025 -
When using `torch.jit.trace` with `Linear+MaxPool2d+BatchNorm2d`, different results are observed.
#145207 closed
Jan 20, 2025 -
Inconsistent results between CPU and GPU for many operators with complex inputs containing `Inf`
#141487 closed
Jan 20, 2025 -
torch.nn.functional.normalize producing nan values with a large p value and tensor of complex numbers
#135428 closed
Jan 20, 2025 -
`1/torch.inf` produce inconsistent results
#106845 closed
Jan 20, 2025 -
torch.sigmoid producing nan for tensor of negative complex numbers on cpu
#135777 closed
Jan 20, 2025 -
[complex] torch.{exp}: does not match numpy
#48010 closed
Jan 20, 2025 -
DISABLED test_mm_concat_cuda (__main__.FreezingGpuTests)
#145185 closed
Jan 20, 2025 -
torch.onnx.export failed with Process finished with exit code 136 (interrupted by signal 8:SIGFPE)
#144144 closed
Jan 20, 2025 -
[XPU] Keep going jobs of `ciflow/xpu` when case fist failed.
#145048 closed
Jan 20, 2025 -
DISABLED test_comprehensive_fft_fft_cuda_float64 (__main__.TestInductorOpInfoCUDA)
#122715 closed
Jan 20, 2025 -
unexpected behaviour of `torch.chunk`
#145026 closed
Jan 19, 2025 -
[DCP] BUG: FsspecWriter calls os.fsync on .finish(), therefore program crashes on checkpoint save
#144752 closed
Jan 19, 2025 -
massive number of runtime asserts can hamper compile times
#144792 closed
Jan 18, 2025 -
DISABLED test_integers_t1_uint8_np_longlong (__main__.TestArrayFromScalar)
#145135 closed
Jan 18, 2025 -
DISABLED test_dtype_passthrough_dtype_complex128 (__main__.TestDLPack)
#145134 closed
Jan 18, 2025 -
DISABLED test_flex_attention (__main__.TestCompiledAutograd)
#144912 closed
Jan 18, 2025 -
DISABLED test_register_functional_op_one_return (__main__.TestPythonRegistration)
#117816 closed
Jan 18, 2025 -
[Inductor] Test failure in test_comprehensive_nn_functional_max_pool2d_cuda
#131072 closed
Jan 17, 2025 -
[AOTI] AOTI doesn't work well with torch.select
#132360 closed
Jan 17, 2025 -
"index_cuda" not implemented for 'Float8_e4m3fn'
#133605 closed
Jan 17, 2025 -
eps in layernorm.cpp causes a numerical transformation
#140092 closed
Jan 17, 2025 -
[Compiled_autograd] running deepspeed Zero3 failed for torch.compile with compiled_autograd
#141646 closed
Jan 17, 2025 -
DISABLED test_autograd_cpp_node_data_dependent (__main__.TestCompiledAutograd)
#125579 closed
Jan 17, 2025 -
DISABLED test_autograd_cpp_node_saved (__main__.TestCompiledAutograd)
#131103 closed
Jan 17, 2025 -
DISABLED test_autograd_cpp_node_saved_float (__main__.TestCompiledAutograd)
#133197 closed
Jan 17, 2025 -
DISABLED test_autograd_cpp_node_saved_int (__main__.TestCompiledAutograd)
#133283 closed
Jan 17, 2025 -
DISABLED test_non_traceable_autograd_cpp_node (__main__.TestCompiledAutograd)
#134738 closed
Jan 17, 2025 -
DISABLED test_autograd_cpp_node_saved_dynamic (__main__.TestCompiledAutograd)
#135685 closed
Jan 17, 2025 -
`unbind_copy` gives unexpected results on 1-dimensional inputs, or 0-dimensional outputs
#130829 closed
Jan 17, 2025 -
[inductor][cpu]pyhpc_isoneutral_mixing performance regression in 2024-07-30 nightly release
#132281 closed
Jan 17, 2025 -
[XPU] unrecognized device for new_qtensor: xpu:0
#144848 closed
Jan 17, 2025 -
torch.select could not guard on data-dependent expression error
#143249 closed
Jan 17, 2025 -
> if graph capture is thread local
#137844 closed
Jan 17, 2025 -
non-strict export doesn't work with nn.Sequential slicing
#137455 closed
Jan 17, 2025
145 Issues opened by 85 people
-
`torch.ops.aten.copy` causes SIGSEGV when handling sparse CSR tensors with invalid metadata
#145604 opened
Jan 24, 2025 -
add scalar inputs with out causes error in torch.compile
#145598 opened
Jan 24, 2025 -
[Dynamo] compile torch.logit with different data types
#145596 opened
Jan 24, 2025 -
_pickle.UnpicklingError: invalid load key, ''.
#145592 opened
Jan 24, 2025 -
[dynamo] mark_dynamic not working as intended with input shapes
#145587 opened
Jan 24, 2025 -
[BE] Automate update stable_cuda version so that we can set it when introducing new cuda version
#145571 opened
Jan 24, 2025 -
Enable CUDA 12.8.0
#145570 opened
Jan 24, 2025 -
[dynamo] Dynamo doesn't prune dead input cell object
#145564 opened
Jan 24, 2025 -
Unable to build pytorch after #143806
#145563 opened
Jan 24, 2025 -
Confusing as_storage_and_layout(x, want_contiguous=True) behavior
#145561 opened
Jan 24, 2025 -
Docs fonts are bold on Mac, in 2.7
#145556 opened
Jan 23, 2025 -
need to document `FlopCounterMode`
#145555 opened
Jan 23, 2025 -
[RFC] Cuda support matrix for Release 2.7
#145544 opened
Jan 23, 2025 -
Module.to() fail in dynamo when swap_module_params_on_conversion is true
#145529 opened
Jan 23, 2025 -
use_const_ref_for_mutable_tensors doesn't work with out= overloads
#145522 opened
Jan 23, 2025 -
Make debugging flaky tests easier by having relevant logs in one place
#145516 opened
Jan 23, 2025 -
Can't properly implement backward method for custom op in C++ when the op takes List of tensors as argument
#145514 opened
Jan 23, 2025 -
Activation Checkpointing composability with split backward computation
#145511 opened
Jan 23, 2025 -
`torch._inductor.aoti_compile_and_package` fails when using dynamic shapes (PyTorch 2.6.0 RC)
#145500 opened
Jan 23, 2025 -
Unexpected behavior of `torch.nn.init.trunc_normal` with bf16 tensors
#145498 opened
Jan 23, 2025 -
Add a lint rule to avoid the word `interface` in C++
#145493 opened
Jan 23, 2025 -
Cannot print symbolic tensors from C++
#145491 opened
Jan 23, 2025 -
OrderedSet is backed by normal Dict, does not check ordering in equality
#145489 opened
Jan 23, 2025 -
[custom ops] [2.7 nightly] custom ops with typing.List breaks when importing annotations from future
#145481 opened
Jan 23, 2025 -
[XPU] torch 2.7.0.dev20250121+xpu Import Error
#145477 opened
Jan 23, 2025 -
torch.backends.cudnn.flags use error when test
#145473 opened
Jan 23, 2025 -
Is there a PyTorch version that can work properly on the Thor platform based on the Blackwell architecture?
#145471 opened
Jan 23, 2025 -
Mark Dynamic does not work for nn module constructor inputs
#145463 opened
Jan 23, 2025 -
Incomplete check of LR as a tensor in Optimizer
#145461 opened
Jan 23, 2025 -
Flex Attention not support score_mod with gradients
#145460 opened
Jan 23, 2025 -
DISABLED test_tensor_subclass_basic (__main__.TestCompiledAutograd)
#145457 opened
Jan 23, 2025 -
[CUDA] Illegal Memory Access with `AdaptiveMaxPool2d`
#145453 opened
Jan 23, 2025 -
[Dynamo]while_loop raise an exception
#145451 opened
Jan 23, 2025 -
[dynamo] fix graph break on random.random
#145446 opened
Jan 23, 2025 -
[dynamo] `random.Random` gives wrong result on second call
#145445 opened
Jan 23, 2025 -
seg fault in aot_inductor_package on arm GPU with 2.6.0 RC
#145441 opened
Jan 23, 2025 -
Crash in wrapper_benchmark.py with --profile enabled
#145434 opened
Jan 23, 2025 -
XPU - UserWarning: Failed to initialize XPU devices. when run on Machine without Intel GPU Driver
#145433 opened
Jan 23, 2025 -
aot inductor intermediate tensor debug printing (setting 2) not working
#145425 opened
Jan 22, 2025 -
torch.compile has different numerics for var_mean
#145401 opened
Jan 22, 2025 -
[EXPORT AOTI] `aoti_compile_and_package` custom_ops dependencies
#145394 opened
Jan 22, 2025 -
create DISABLED issues for specific runner labels
#145388 opened
Jan 22, 2025 -
DISABLED test_view_of_slice_cuda (__main__.TestUnbackedSymintsCUDA)
#145386 opened
Jan 22, 2025 -
Windows Pytorch compiler crash some version of cl.exe. Fix provided
#145383 opened
Jan 22, 2025 -
flaky test issues should close themselves if the test doesn't exist anymore
#145382 opened
Jan 22, 2025 -
torch.logit works incorrectly when input < eps after torch.compile
#145379 opened
Jan 22, 2025 -
Loading weights using `torch.distributed.checkpoint` leads to large loss values
#145378 opened
Jan 22, 2025 -
Inductor autograd raises an error in the second run may because of fx graph cache
#145377 opened
Jan 22, 2025 -
distributed.new_group with backend GLOO hangs when distributed.split_group was called before
#145376 opened
Jan 22, 2025 -
[XPU] torch.nn.functional.pad brings wrong results with torch.compile on Intel GPU
#145372 opened
Jan 22, 2025 -
Set `size` when `is_coalesced` is set in `torch.sparse_coo_tensor()`
#145371 opened
Jan 22, 2025 -
The possible error in the pytorch documentation of RNN.
#145368 opened
Jan 22, 2025 -
DISABLED test_cache_hot_load_device_cuda_bfloat16_dynamic_False (__main__.TestFxGraphCache)
#145364 opened
Jan 22, 2025 -
DISABLED test_max_autotune_remote_caching_dynamic_False (__main__.TestMaxAutotuneRemoteCache)
#145361 opened
Jan 22, 2025 -
DISABLED test_comprehensive_svd_lowrank_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#145362 opened
Jan 22, 2025 -
DISABLED test_linear_and_cel_max_autotune (__main__.InplacePaddingTest)
#145359 opened
Jan 22, 2025 -
[autograd] inconsistent jvp results
#145356 opened
Jan 22, 2025 -
Missing docs for `torch._foreach_copy_`
#145355 opened
Jan 22, 2025 -
DISABLED test_extern (__main__.NumBytesMetricTests)
#145352 opened
Jan 22, 2025 -
[CUDA] Illegal Memory Access with `ReplicationPad2D`
#145350 opened
Jan 22, 2025 -
[CUDA] Illegal Memory Access with `AdaptiveAvgPool2d`
#145349 opened
Jan 22, 2025 -
DISABLED test_graph_break_inside_ctx_with_side_effects (__main__.ContextlibContextManagerTests)
#145346 opened
Jan 22, 2025 -
DISABLED test_partitioning_with_view (__main__.MinCutPartitioningTests)
#145345 opened
Jan 22, 2025 -
DISABLED test_cat (__main__.NumBytesMetricTests)
#145344 opened
Jan 22, 2025 -
DISABLED test_partitioning_unremat_bw (__main__.MinCutPartitioningTests)
#145343 opened
Jan 22, 2025 -
internal compiler error: in extract_insn when compiling pytorch with xpu with gcc 12
#145340 opened
Jan 22, 2025 -
[libTorch] Model initialization on multi-device is slow. It seems to run sequentially in multi-thread
#145337 opened
Jan 22, 2025 -
DISABLED test_mm_plus_mm (__main__.TestPatternMatcher)
#145335 opened
Jan 22, 2025 -
DISABLED test_reorder_peak_memory (__main__.TestOperatorReorderForPeakMemory)
#145332 opened
Jan 22, 2025 -
DISABLED test_warn_on_invalid_torch_function_standalone_class (__main__.TestTorchFunctionWarning)
#145333 opened
Jan 22, 2025 -
DISABLED test_cache_hot_load_device_cuda_bfloat16_dynamic_False (__main__.AOTAutogradCacheTests)
#145334 opened
Jan 22, 2025 -
[dynamo] Save/restore system random state more carefully
#145329 opened
Jan 22, 2025 -
[inductor][triton] refactor ASTSource.make_ir integration
#145326 opened
Jan 21, 2025 -
Using torch.cond in a model intended for onnx.export(dynamo=True,...) has issues with the functions provided.
#145300 opened
Jan 21, 2025 -
[dynamo] `torch.compile` ICE on using a sourceless unspecialized NN module as branching condition
#145284 opened
Jan 21, 2025 -
Torch Compile edge case with != versus is not
#145277 opened
Jan 21, 2025 -
AttributeError: '_OpNamespace' 'aten' object has no attribute 'momentum'
#145274 opened
Jan 21, 2025 -
`torch.ops.aten.embedding_dense_backward` Crashes with Out-of-Bounds Indices On CPU
#145267 opened
Jan 21, 2025 -
[Pipelining] Problem using `torch.distributed.pipelining` on `Gemma2ForCausalLM`
#145263 opened
Jan 21, 2025 -
Missing create_graph arguments in torch.func apis
#145262 opened
Jan 21, 2025 -
Custom symbolic functions for ONNX export with None args causes SEGFAULT
#145261 opened
Jan 21, 2025 -
No Range Check for `storage_offset` in `as_strided` Function
#145259 opened
Jan 21, 2025 -
Missing Length Check for `reflection_pad3d` `padding` Argument
#145258 opened
Jan 21, 2025 -
PyObject preservation does not prevent weakrefs being cleared by Python garbage collector
#145253 opened
Jan 21, 2025 -
[Break XPU] device type in test_aot_inductor.py is not passed correctly to cpp_builder.
#145247 opened
Jan 21, 2025 -
Expose configurable path instead of using fixed path in the inductor module for serialized pattern generation
#145242 opened
Jan 21, 2025 -
Not using set_num_threads results in very slow .all()
#145233 opened
Jan 21, 2025 -
Flaky Dynamo test: TestAutograd.test_gradcheck_nondeterministic
#145231 opened
Jan 21, 2025 -
The `sympy` dependency spec for pytorch on PyPi wheel is still unchanged.
#145225 opened
Jan 20, 2025 -
Regression in the compilation of the torch.all operation in PyTorch version 2.6.0 compared to 2.5.1
#145220 opened
Jan 20, 2025 -
`torch.compile` may produce wrong result with `Linear+MaxPool2d+BatchNorm2d`.
#145219 opened
Jan 20, 2025 -
getting different results when adding `torch.Tensor` or python number to a DTensor - Is that expected?
#145218 opened
Jan 20, 2025 -
[ARM] - test_quantized_module.py test_lstm_api fails on Aarch64
#145216 opened
Jan 20, 2025 -
Nested tensor support for pointwise matrix multiplication of nested tensor and normal tensor
#145214 opened
Jan 20, 2025 -
Significant precision error from torch.compile
#145213 opened
Jan 20, 2025 -
DISABLED test_aoti (__main__.TestMemoryPlanning)
#145211 opened
Jan 20, 2025 -
DISABLED test_reorder_peak_memory_lpmf (__main__.TestOperatorReorderForPeakMemory)
#145210 opened
Jan 20, 2025 -
Some FlexAttention learned bias bugs/limitations
#145208 opened
Jan 20, 2025 -
Indexed ^= (XOR in-place) operation doesn't work as expected on MPS backend
#145203 opened
Jan 20, 2025 -
DISABLED test_reuse_kernel_cuda (__main__.AOTInductorTestABICompatibleGpu)
#145193 opened
Jan 20, 2025 -
DISABLED test_mixed_mm (__main__.TestPatternMatcher)
#145192 opened
Jan 20, 2025 -
DISABLED test_slice_scatter_reinplace_cuda (__main__.GPUTests)
#145189 opened
Jan 20, 2025 -
DISABLED test_sdpa_rewriter_12_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#145188 opened
Jan 20, 2025 -
DISABLED test_sdpa_rewriter_12_cuda (__main__.SDPAPatternRewriterCudaTests)
#145187 opened
Jan 20, 2025 -
DISABLED test_mm_concat_cuda (__main__.FreezingGpuTests)
#145186 opened
Jan 20, 2025 -
DISABLED test_aoti_eager_cache_hit_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#145184 opened
Jan 20, 2025 -
DISABLED test_reorder_peak_memory_dfs (__main__.TestOperatorReorderForPeakMemory)
#145183 opened
Jan 20, 2025 -
CUDA initialization error with vLLM 0.5.4 and PyTorch 2.4.0+cu121
#145170 opened
Jan 19, 2025 -
empty_cache does not work for CUDAPluggableAllocator + MemPool
#145168 opened
Jan 19, 2025 -
Pytorch matmul for nested 4D tensors in jagged layout doesn't work
#145158 opened
Jan 18, 2025 -
The latest PyTorch XPU wheel 2.7.0.dev20250117+xpu does not work on Windows
#145155 opened
Jan 18, 2025 -
Driver Allocated Memory grows unrestricted when using torch.unique on MPS device
#145151 opened
Jan 18, 2025 -
[RFC] Improve performance for layer_norm op for cuda with revectorized
#145145 opened
Jan 18, 2025 -
Please add fp16 to MPS devices.
#145144 opened
Jan 18, 2025 -
Bracket indexing not working
#145143 opened
Jan 18, 2025 -
Obey sm_carveout (limit on number of SMs) in inductor persistent kernel
#145115 opened
Jan 17, 2025 -
torch._C._IncludeDispatchKeyGuard is very broken?
#145108 opened
Jan 17, 2025 -
Tracking issue: Incorrect Meta Strides / Turn On PyDispatcher in FakeTensor Mode
#145094 opened
Jan 17, 2025 -
Inductor aten.clone lowering ignores Conjugate and Negative dispatch keys
#145093 opened
Jan 17, 2025 -
[torchbench] torch._dynamo.exc.Unsupported: Graph break due to unsupported builtin None.morphologyEx
#145088 opened
Jan 17, 2025 -
partitioner hangs for some long chains of ops with many users
#145081 opened
Jan 17, 2025 -
list comprehension in SkipFiles are always skipped with no way to override
#145079 opened
Jan 17, 2025 -
Negative values in stride causing error in `avg_pool2d` (on both CPU and CUDA)
#145077 opened
Jan 17, 2025 -
AssertionError: increase TRITON_MAX_BLOCK['X'] to 4096 Again!
#145074 opened
Jan 17, 2025 -
Segmentation fault when passing an empty tensor to `_local_scalar_dense`
#145072 opened
Jan 17, 2025 -
Illegal memory access and segmentation fault due to large `storage_offset` in `as_strided`
#145071 opened
Jan 17, 2025 -
Segment fault on CPU and IndexError on CUDA for `_adaptive_avg_pool2d_backward`
#145070 opened
Jan 17, 2025 -
DISABLED test_sparse_add_cuda_complex64 (__main__.TestSparseCSRCUDA)
#145069 opened
Jan 17, 2025 -
SIGSEGV error when passing a 0-sized tensor to `_local_scalar_dense`
#145066 opened
Jan 17, 2025 -
SIGFPE error when passing very large kernel_size to `avg_pool1d`
#145065 opened
Jan 17, 2025
430 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[inductor] Kernel memory analysis for use in heuristics
#142026 commented on
Jan 24, 2025 • 19 new comments -
[Inductor] Unify Low Precision FP Legalization for to_dtype_bitcast & constant
#144646 commented on
Jan 24, 2025 • 14 new comments -
[PGNCCL] Add an API to get the status/error code at the PG level
#144498 commented on
Jan 24, 2025 • 12 new comments -
Use `typing.IO[bytes]` instead of `io.BytesIO` in annotations
#144994 commented on
Jan 24, 2025 • 11 new comments -
Update test_c10d_object_collectives.py with DistributedTestBase class
#145056 commented on
Jan 24, 2025 • 10 new comments -
[inductor] Add type annotations to _inductor/utils.py
#144108 commented on
Jan 23, 2025 • 9 new comments -
[inductor] [cpp] Support vectorization for score and mask in FlexAttention CPU
#143638 commented on
Jan 24, 2025 • 8 new comments -
pickler for GraphModule
#141659 commented on
Jan 24, 2025 • 8 new comments -
Add noncontiguous OpInfo tests for MPS
#142202 commented on
Jan 24, 2025 • 6 new comments -
[dcp] Add ZStandard transformer
#143360 commented on
Jan 24, 2025 • 6 new comments -
serde unbacked bindings
#144894 commented on
Jan 23, 2025 • 6 new comments -
Parallelize epilogue/prologue benchmarking
#143408 commented on
Jan 21, 2025 • 5 new comments -
[torch.special] Adding betainc, betaincc, betaincinv, betainccinv, betaln and beta with backward operation
#132135 commented on
Jan 23, 2025 • 4 new comments -
Fix Throughputbenchmark issue
#144669 commented on
Jan 24, 2025 • 4 new comments -
Align CPU behavior with CUDA for `ConvTranspose` when `out_channels=0`
#142859 commented on
Jan 24, 2025 • 4 new comments -
Introduce the public API for all_gather_scaled_matmul
#141053 commented on
Jan 20, 2025 • 4 new comments -
[Intel GPU] qconv_pointwise.binary XPU support
#135189 commented on
Jan 23, 2025 • 4 new comments -
Add prepacking for linear weights
#139387 commented on
Jan 23, 2025 • 4 new comments -
[CI] enable operator benchmark on CPU
#143733 commented on
Jan 23, 2025 • 4 new comments -
[Intel GPU] qlinear at XPU backend
#133307 commented on
Jan 23, 2025 • 3 new comments -
Add test cases of fp8 datatypes in pt2e
#144388 commented on
Jan 24, 2025 • 3 new comments -
[Intel CPU] Fix issue #143483.
#144854 commented on
Jan 24, 2025 • 3 new comments -
[Dynamo][autograd.Function] Relax backward speculation strict mode
#142830 commented on
Jan 22, 2025 • 3 new comments -
[Intel CPU] Fix issue #143484.
#144950 commented on
Jan 24, 2025 • 2 new comments -
OpenReg: fix issue of pin_memory
#145046 commented on
Jan 24, 2025 • 2 new comments -
[Inductor] Fix starvation issue when threads attempt to acquire write…
#144460 commented on
Jan 23, 2025 • 2 new comments -
add fp8 scaled_mm for XPU
#140972 commented on
Jan 20, 2025 • 2 new comments -
Add overloads to diagonal docs
#144214 commented on
Jan 22, 2025 • 2 new comments -
[compiled autograd] Always proxy autograd.Function nodes; handle AOT backwards
#143405 commented on
Jan 24, 2025 • 2 new comments -
[Do NOT merge] Enable inductor-periodic testing for ROCm on MI300
#144594 commented on
Jan 24, 2025 • 2 new comments -
Add option to limit number of SMs used by matmul kernels
#144974 commented on
Jan 22, 2025 • 2 new comments -
[inductor] Fix an aten.squeeze stride computation issue
#143683 commented on
Jan 23, 2025 • 1 new comment -
Implement cuda graphs implementation of torch.cond and torch.while_loop
#140979 commented on
Jan 23, 2025 • 1 new comment -
inductor_config_logging: Don't drop keys
#144700 commented on
Jan 24, 2025 • 1 new comment -
Avoid data-dependent errors in NJT tests via capture_scalar_outputs=True
#144588 commented on
Jan 21, 2025 • 1 new comment -
[compiled autograd] support Tensor Subclasses in AOTBackward
#144115 commented on
Jan 24, 2025 • 1 new comment -
[BE] Add stride check in `torch.max_pool1d()`
#144023 commented on
Jan 23, 2025 • 1 new comment -
Exclude upsample_bilinear2d.vec from default core ATen decomposition table
#141791 commented on
Jan 18, 2025 • 1 new comment -
Fix a number of flexattention issues (cse, cudagraph, etc.)
#145059 commented on
Jan 22, 2025 • 1 new comment -
[Inductor changes] Invoke Quant
#139102 commented on
Jan 17, 2025 • 1 new comment -
[Intel CPU] Fix issue #143482.
#144760 commented on
Jan 24, 2025 • 1 new comment -
Fix fft jit ops cpu
#143894 commented on
Jan 21, 2025 • 1 new comment -
Enable CPP Extension Open Registration tests on Arm
#144774 commented on
Jan 21, 2025 • 1 new comment -
Update ck
#144799 commented on
Jan 24, 2025 • 1 new comment -
Support narrow() on batch dim for NJT
#142063 commented on
Jan 17, 2025 • 1 new comment -
Fix flash attention seed/offset overflow when seed/offset larger than int64
#144844 commented on
Jan 23, 2025 • 1 new comment -
Enable SVE ACLE implementation for tanH Aten op for FP32 dType.
#143741 commented on
Jan 21, 2025 • 1 new comment -
Support remaining *_like factory functions for NJT
#144889 commented on
Jan 22, 2025 • 1 new comment -
[Inductor UT] Refactor FlexAttention UT and add CPU tests
#144953 commented on
Jan 22, 2025 • 1 new comment -
[ATen][CUDA] Implement 128 bit vectorization
#141959 commented on
Jan 24, 2025 • 0 new comments -
Add AOT inductor support for _scaled_mm for CPU
#141961 commented on
Jan 21, 2025 • 0 new comments -
Permute test
#140261 commented on
Jan 19, 2025 • 0 new comments -
support condition branch in ao debug handler
#141302 commented on
Jan 21, 2025 • 0 new comments -
[sympy] Make solve of Mul for Eq int replacement friendly
#141347 commented on
Jan 23, 2025 • 0 new comments -
Save models in OCI registry
#141354 commented on
Jan 21, 2025 • 0 new comments -
Use std::string_view in torchgen
#141735 commented on
Jan 17, 2025 • 0 new comments -
[Don't merge] test only
#141468 commented on
Jan 24, 2025 • 0 new comments -
[Store log]Test log struct
#141439 commented on
Jan 23, 2025 • 0 new comments -
WIP delta graph logging
#141416 commented on
Jan 23, 2025 • 0 new comments -
[hop] fix unbacked_bindings meta for while_loop
#143559 commented on
Jan 24, 2025 • 0 new comments -
Fix FSDP hanging
#143540 commented on
Jan 23, 2025 • 0 new comments -
[hop][inductor] track the dependency on unbacked symbols correctly with constant_args for hops
#143456 commented on
Jan 24, 2025 • 0 new comments -
[compiled autograd] stop specializing on metadata during initial trace
#143417 commented on
Jan 24, 2025 • 0 new comments -
[compiled autograd] Proxy nodes for user-defined C++ torch::autograd::Function
#143387 commented on
Jan 24, 2025 • 0 new comments -
[EXPERIMENTAL][dynamo] Turn on `inline_inbuilt_nn_modules` for fbcode
#143313 commented on
Jan 17, 2025 • 0 new comments -
[compiled autograd] Proxy a node for CopyBackwards into the graph
#143304 commented on
Jan 24, 2025 • 0 new comments -
[compiled autograd] Proxy opaque nodes for built-in autograd nodes
#143296 commented on
Jan 24, 2025 • 0 new comments -
Set proper `LD_LIBRARY_PATH` on Linux in nightly venv in nightly pull tool
#143262 commented on
Jan 17, 2025 • 0 new comments -
[TorchGen] Simplify argumenttype_type
#143254 commented on
Jan 20, 2025 • 0 new comments -
[Testing only] Add python cycle detection
#143204 commented on
Jan 22, 2025 • 0 new comments -
Unify use of `enableCollectiveHashDebug_` and trivial updates
#142865 commented on
Jan 18, 2025 • 0 new comments -
Fix RMSNorm epsilon value type for BF16 or FP16
#142848 commented on
Jan 22, 2025 • 0 new comments -
Set `enable_faithful_generator_behavior` flag to True
#142513 commented on
Jan 23, 2025 • 0 new comments -
parallelize sort
#142391 commented on
Jan 23, 2025 • 0 new comments -
Fix type annotation of `Linear.bias`
#142326 commented on
Jan 23, 2025 • 0 new comments -
[inductor] Decide cooperative RSPLIT with same algorithm as split reductions
#142295 commented on
Jan 24, 2025 • 0 new comments -
[scan] Refactoring of input checking and dynamo invocation
#142125 commented on
Jan 23, 2025 • 0 new comments -
support condition branch in ao debug handler
#140256 commented on
Jan 19, 2025 • 0 new comments -
[Environment Variable][7/N] Use thread-safe getenv functions
#140211 commented on
Jan 21, 2025 • 0 new comments -
[Environment Variable][6/N] Use thread-safe getenv functions
#140200 commented on
Jan 19, 2025 • 0 new comments -
Cleanup stale Dynamo Feature Flags
#140147 commented on
Jan 18, 2025 • 0 new comments -
[export] Add custom op profiles and generate meta kernel
#140048 commented on
Jan 21, 2025 • 0 new comments -
[associative_scan] Lifted arguments
#140043 commented on
Jan 23, 2025 • 0 new comments -
Add torch._scaled_mm for CPU
#139975 commented on
Jan 22, 2025 • 0 new comments -
[Don't Review] Test CI
#139971 commented on
Jan 21, 2025 • 0 new comments -
[associative_scan] scan dim handling in user-facing associative_scan()
#139864 commented on
Jan 24, 2025 • 0 new comments -
Add Windows Arm64 Nightly Builds
#139760 commented on
Jan 23, 2025 • 0 new comments -
Add support for loading model `state_dict()`in C++ which are OrderedDicts
#139750 commented on
Jan 21, 2025 • 0 new comments -
[cuDNN] Add an option to force cuDNN usage (incl. SDPA)
#139699 commented on
Jan 23, 2025 • 0 new comments -
[export] Serialize draft export report
#139384 commented on
Jan 18, 2025 • 0 new comments -
Allow BUILD for classes with types built via functions/classes allowed for REDUCE (i.e. not GLOBALs in checkpoint)
#139302 commented on
Jan 17, 2025 • 0 new comments -
[do not review] saving things for NJT metadata cache
#139247 commented on
Jan 20, 2025 • 0 new comments -
Use the device interface for detecting Triton availability
#139171 commented on
Jan 22, 2025 • 0 new comments -
Unify shallow_copy_and_detach overloads by passing c10::VariableVersion
#138999 commented on
Jan 24, 2025 • 0 new comments -
[c10d] Remove ProcessGroupGloo + CUDA tests
#138998 commented on
Jan 23, 2025 • 0 new comments -
Tensor .cuda() very slow with specific array sizes
#138964 commented on
Jan 21, 2025 • 0 new comments -
Fix bug of torch.nn.functional.kl_div when broadcast happened
#138810 commented on
Jan 23, 2025 • 0 new comments -
[Docs] Optimize parameter description to declare allowed type (3/N)
#138798 commented on
Jan 19, 2025 • 0 new comments -
Switch back to the default checkout action
#138739 commented on
Jan 22, 2025 • 0 new comments -
inductor `full_like` decompositions give incorrect strides
#144699 commented on
Jan 21, 2025 • 0 new comments -
Expose torch.autograd.graph.is_backward_executing
#141276 commented on
Jan 21, 2025 • 0 new comments -
Fix InductorLower when attribute is shape ()
#141226 commented on
Jan 21, 2025 • 0 new comments -
[Inductor] be able to disable cache for test
#141195 commented on
Jan 24, 2025 • 0 new comments -
[CI] Reduce distributed test timeout to 60s
#141168 commented on
Jan 21, 2025 • 0 new comments -
Specific attribute for device DTensor RNG support indication.
#141141 commented on
Jan 20, 2025 • 0 new comments -
[WIP][Inductor XPU] Support mkldnn fusion in freezing for XPU.
#141096 commented on
Jan 19, 2025 • 0 new comments -
[pytree] Save namedtuple fields
#141084 commented on
Jan 19, 2025 • 0 new comments -
Support generators
#141055 commented on
Jan 23, 2025 • 0 new comments -
[EZ] Remove TODO because it already works
#141002 commented on
Jan 18, 2025 • 0 new comments -
dynamo: Support custom attributes in tensor subclasses
#140978 commented on
Jan 21, 2025 • 0 new comments -
Add option to split Linear gates for Quantizable LSTM into separate ops
#140868 commented on
Jan 21, 2025 • 0 new comments -
[aoti] Avoid DCE unbacked symint node
#140858 commented on
Jan 18, 2025 • 0 new comments -
Fix TORCH_CUDA_ARCH_LIST for SBSA+CUDA build
#140844 commented on
Jan 18, 2025 • 0 new comments -
Enable CUDA 12.6 OSS CI
#140793 commented on
Jan 24, 2025 • 0 new comments -
Enable C++ dynamic shape guards by default
#140756 commented on
Jan 23, 2025 • 0 new comments -
fix torchrec on inductor
#140747 commented on
Jan 18, 2025 • 0 new comments -
[Intel GPU] Enable fp64 GEMM
#140677 commented on
Jan 21, 2025 • 0 new comments -
Add boolean conversion support for SymNodeVariable
#140621 commented on
Jan 17, 2025 • 0 new comments -
add NHWC support to GroupNorm backward pass and optimize NHWC GroupNorm kernels
#140440 commented on
Jan 18, 2025 • 0 new comments -
Auto SAC - Automated SAC (Selective Activation Checkpointing) Policy Construction and Wrapping
#140410 commented on
Jan 22, 2025 • 0 new comments -
remove redundant assign
#140399 commented on
Jan 24, 2025 • 0 new comments -
Fix inconsistent results from integral linspace on MPS
#140371 commented on
Jan 18, 2025 • 0 new comments -
[WIP] SimpleFSDP prototype frontend changes
#140360 commented on
Jan 21, 2025 • 0 new comments -
[scan] Refactored testcases
#140321 commented on
Jan 18, 2025 • 0 new comments -
[Inductor] optimize welford reduction
#145061 commented on
Jan 24, 2025 • 0 new comments -
optimize the decomposition of aten.native_group_norm
#144733 commented on
Jan 21, 2025 • 0 new comments -
[Intel GPU] Support SparseCsrXPU codegen
#144722 commented on
Jan 18, 2025 • 0 new comments -
functional compiled autograd
#144707 commented on
Jan 24, 2025 • 0 new comments -
Output of nonzero is transposed, fix fake tensor
#144695 commented on
Jan 23, 2025 • 0 new comments -
Generalize poison fork logic for each device backend
#144664 commented on
Jan 17, 2025 • 0 new comments -
Fix torch.logsumexp dim description
#144661 commented on
Jan 22, 2025 • 0 new comments -
[MPS] lu factor ex implementation
#144651 commented on
Jan 21, 2025 • 0 new comments -
remove Windows XPU build workaround.
#144644 commented on
Jan 23, 2025 • 0 new comments -
[Not4Land] test `optree` version compatibility
#144642 commented on
Jan 17, 2025 • 0 new comments -
[inductor] Add features to docstring_linter (see #142496)
#144620 commented on
Jan 23, 2025 • 0 new comments -
Collect packages with importlib in collect_env
#144616 commented on
Jan 24, 2025 • 0 new comments -
[device_mesh] improve device selection logic
#144600 commented on
Jan 18, 2025 • 0 new comments -
Fix DTensorTestBase to barrier with device ids
#144599 commented on
Jan 18, 2025 • 0 new comments -
Implemented dropout usage for RNN with MIOpen backend
#144572 commented on
Jan 23, 2025 • 0 new comments -
[BE][CI] bump `ruff` to 0.9.0: string quote styles
#144569 commented on
Jan 24, 2025 • 0 new comments -
[BE][PYFMT] migrate PYFMT for `torch.{distributed,distributions}` to `ruff format`
#144547 commented on
Jan 18, 2025 • 0 new comments -
[BE][CI] bump `ruff` to 0.9.2: multiline `assert` statements
#144546 commented on
Jan 24, 2025 • 0 new comments -
Fix clang-tidy warnings of performance from uncovered files
#144542 commented on
Jan 21, 2025 • 0 new comments -
Save integral tensor data for ET
#144508 commented on
Jan 24, 2025 • 0 new comments -
blocked benchmarking to avoid queue limit
#144507 commented on
Jan 18, 2025 • 0 new comments -
better overlapping of sleep and memory warmup
#144505 commented on
Jan 18, 2025 • 0 new comments -
add most basic event packing
#144501 commented on
Jan 18, 2025 • 0 new comments -
patch for block-wise quantization + pt2e
#144492 commented on
Jan 17, 2025 • 0 new comments -
prov logging
#145047 commented on
Jan 20, 2025 • 0 new comments -
[Easy] Replace paper description with link to make a concise description.
#145031 commented on
Jan 23, 2025 • 0 new comments -
Made partitioning more(?) deterministic
#145024 commented on
Jan 23, 2025 • 0 new comments -
[CD] Annotate linux/arm64 cuda wheels with consistent nvidia dependencies
#145021 commented on
Jan 21, 2025 • 0 new comments -
[dynamo/export] call local_scalar_dense when full() value is scalar tensor
#144999 commented on
Jan 24, 2025 • 0 new comments -
Introduce new template heuristic for triton autotune configs
#144985 commented on
Jan 24, 2025 • 0 new comments -
[inductor] Fix for pattern file contains 'getitem' fails during impor…
#144980 commented on
Jan 23, 2025 • 0 new comments -
[test] fix unit test
#144977 commented on
Jan 17, 2025 • 0 new comments -
Let `tensor_a.new_tensor()` be on `tensor_a.device` by default
#144958 commented on
Jan 22, 2025 • 0 new comments -
[Dynamo] Allow `format()` to handle int
#144956 commented on
Jan 24, 2025 • 0 new comments -
Replacing explicit backend search with api call
#144944 commented on
Jan 24, 2025 • 0 new comments -
[ROCm][TunableOp] Improve selection criteria for fastest solution
#144942 commented on
Jan 23, 2025 • 0 new comments -
Binary upload checksum
#144887 commented on
Jan 23, 2025 • 0 new comments -
update guard_size_oblivious comment
#144880 commented on
Jan 22, 2025 • 0 new comments -
[64-bit] Int64 casting for UpSampleNearest3D
#144865 commented on
Jan 23, 2025 • 0 new comments -
WIP pp_cp test
#144834 commented on
Jan 18, 2025 • 0 new comments -
Added swizzle searching, disabled fp16 accum, and enabled ping-pong for cutlass
#144829 commented on
Jan 24, 2025 • 0 new comments -
Test of RST to MD
#144804 commented on
Jan 22, 2025 • 0 new comments -
[c10d][NCCL] Implement ncclCommInitRankScalable (merging #136789)
#144794 commented on
Jan 24, 2025 • 0 new comments -
[caffe2] Use the manifold cache backend as the default
#144773 commented on
Jan 24, 2025 • 0 new comments -
Unconditional dependency on setuptools
#144763 commented on
Jan 18, 2025 • 0 new comments -
[Reopen] [Intel GPU] Set higher tolerance for some models only on XPU Device
#144756 commented on
Jan 21, 2025 • 0 new comments -
[cherry-pick][dtensor] expose the __create_chunk_list__ in the doc (#144100)
#144741 commented on
Jan 18, 2025 • 0 new comments -
c10::string_view -> std::string_view in torchgen
#144177 commented on
Jan 18, 2025 • 0 new comments -
[Submodule] Turning flash-attention integration into 3rd party submod
#144120 commented on
Jan 23, 2025 • 0 new comments -
Avoid overflow in vector_norm for scalar input
#144073 commented on
Jan 23, 2025 • 0 new comments -
Native channel shuffle floating point exception
#144010 commented on
Jan 18, 2025 • 0 new comments -
cpp_wrapper: Precompile device-specific header files
#144002 commented on
Jan 22, 2025 • 0 new comments -
[poc][not-ready-for-review] visualize dynamic shapes shape env mutations over time
#143961 commented on
Jan 20, 2025 • 0 new comments -
Add ability to skip compute capability checks for Triton
#143956 commented on
Jan 17, 2025 • 0 new comments -
Fix an unnecessary CPU to GPU copy within flex_attention
#143928 commented on
Jan 23, 2025 • 0 new comments -
Using acc_t for log_softmax
#143896 commented on
Jan 21, 2025 • 0 new comments -
Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True
#143880 commented on
Jan 24, 2025 • 0 new comments -
Remove lexicographical sorting of storage keys in torch.save
#143879 commented on
Jan 24, 2025 • 0 new comments -
[FlexAttention] make bm creation cuda-graphable
#143872 commented on
Jan 23, 2025 • 0 new comments -
[1/N]Add Intel GPU Support to Torch Test Cases
#143833 commented on
Jan 17, 2025 • 0 new comments -
[inductor] Used fixed configs for contiguous reductions
#143812 commented on
Jan 24, 2025 • 0 new comments -
Enable clang-tidy on torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp
#143806 commented on
Jan 24, 2025 • 0 new comments -
[don't merge] use vs2022 build windows cpu wheel.
#143791 commented on
Jan 23, 2025 • 0 new comments -
[Intel GPU] Avoid copy when the input of Matmul is broadcasted
#143784 commented on
Jan 17, 2025 • 0 new comments -
Modify the tolerance level in TIMM benchmark for XPU PreCI
#143739 commented on
Jan 23, 2025 • 0 new comments -
nn.MultiheadAttention string representation
#143724 commented on
Jan 23, 2025 • 0 new comments -
Getattr access for subclasses in pre-dispatch
#143671 commented on
Jan 23, 2025 • 0 new comments -
Fix the build errors in ONEDNN+BLIS Path
#143642 commented on
Jan 17, 2025 • 0 new comments -
[triton pin 3.2] Cherry pick additional device context fix
#143622 commented on
Jan 23, 2025 • 0 new comments -
Add the max_autotune tests in the periodic jobs.
#143560 commented on
Jan 21, 2025 • 0 new comments -
Introduce cache clearing APIs for the lazy graph executor
#144489 commented on
Jan 18, 2025 • 0 new comments -
Support Swiglu for Module and functional
#144465 commented on
Jan 24, 2025 • 0 new comments -
improve WOQ first token performance on CPU
#144463 commented on
Jan 23, 2025 • 0 new comments -
Support negative values for fill with uint tensors
#144458 commented on
Jan 21, 2025 • 0 new comments -
[CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt
#144441 commented on
Jan 24, 2025 • 0 new comments -
Implement `generator.throw(exception)`
#144424 commented on
Jan 23, 2025 • 0 new comments -
Implement `generator.close()`
#144423 commented on
Jan 23, 2025 • 0 new comments -
Implement `generator.send(..)`
#144422 commented on
Jan 23, 2025 • 0 new comments -
Implement `generator.__iter__()`
#144421 commented on
Jan 23, 2025 • 0 new comments -
Add `CLEANUP_THROW` bytecode
#144420 commented on
Jan 23, 2025 • 0 new comments -
fix a bug for constant_pad_nd
#144394 commented on
Jan 23, 2025 • 0 new comments -
`torch.linalg.solve`: doc update on dealing with rank-deficient systems which admit a solution
#144390 commented on
Jan 19, 2025 • 0 new comments -
Fix lowering to inductor IR for triton CPU
#144389 commented on
Jan 24, 2025 • 0 new comments -
[Intel GPU] fix memory leak in deconv backward
#144385 commented on
Jan 17, 2025 • 0 new comments -
Filter out iGPU if dGPU is found on XPU
#144378 commented on
Jan 24, 2025 • 0 new comments -
[Don't Merge] Fix poison child process issue when calling getAccelerator()
#144368 commented on
Jan 22, 2025 • 0 new comments -
implement LazyInductorBenchmarker
#144365 commented on
Jan 18, 2025 • 0 new comments -
Improve torchrun documentation
#144354 commented on
Jan 24, 2025 • 0 new comments -
implement pruning for GroupedInductorBenchmarker
#144353 commented on
Jan 18, 2025 • 0 new comments -
codecache.py: Utilize precompiled headers for CPP python bindings
#144349 commented on
Jan 22, 2025 • 0 new comments -
codecache: Remove cpp_prefix.h duplication per build, then precompile it
#144293 commented on
Jan 22, 2025 • 0 new comments -
[inductor] Only call triton.compile in worker processes
#144288 commented on
Jan 22, 2025 • 0 new comments -
Cholesky mps implementation
#144193 commented on
Jan 23, 2025 • 0 new comments -
aot_inductor TIMM convit_base inference regression on dashboard
#144772 commented on
Jan 21, 2025 • 0 new comments -
TorchBench mobilenet_v2 cudagraphs_freezing inference regression
#144891 commented on
Jan 21, 2025 • 0 new comments -
TIMM cudagraphs_freezing inference regression
#144888 commented on
Jan 21, 2025 • 0 new comments -
torch.export fails for whisper tiny
#144906 commented on
Jan 21, 2025 • 0 new comments -
Dynamo is not thread safe
#118260 commented on
Jan 21, 2025 • 0 new comments -
torch.accelerator.is_available() raise RuntimeError if no available CUDA/XPU devices
#144567 commented on
Jan 21, 2025 • 0 new comments -
`torch.device(0)` makes CUDA init fail in subprocess since `2.5.0`
#144152 commented on
Jan 21, 2025 • 0 new comments -
Make flex_attention work if `score_mod`'s output doesn't require gradients at all
#145050 commented on
Jan 21, 2025 • 0 new comments -
[torchbench] stable_diffusion_unet compilation failure
#144991 commented on
Jan 21, 2025 • 0 new comments -
CUDAGraph outputs will be overwritten by a subsequent run?
#144961 commented on
Jan 21, 2025 • 0 new comments -
Inconsistency of `tensor.new_tensor(data)` between eager and dynamo
#144957 commented on
Jan 21, 2025 • 0 new comments -
UserWarning: cuDNN SDPA backward got grad_output.strides() != output.strides()
#144913 commented on
Jan 21, 2025 • 0 new comments -
Issue: Illegal Memory Access in Backward Pass of `scaled_dot_product_attention` with Custom Attention Mask
#145040 commented on
Jan 21, 2025 • 0 new comments -
[compiled autograd] It would be nice if the compiled autograd graph was actually runnable
#144982 commented on
Jan 21, 2025 • 0 new comments -
ONNX: Wrong output shape for ceil_mode Pooling
#71549 commented on
Jan 21, 2025 • 0 new comments -
Numerical error when using torch.nn.functional.pad with a large array on MPS device
#121961 commented on
Jan 21, 2025 • 0 new comments -
[inductor][cpu]float32 dynamic shape maml_omniglot performance regression in 2025-01-13 nightly release
#144937 commented on
Jan 21, 2025 • 0 new comments -
[inductor][cpu]amp fp16 llama dynamic shape cpp wrapper performance regression in 2025-01-07 nightly release
#144932 commented on
Jan 21, 2025 • 0 new comments -
[inductor][cpu] fused attention Inductor tests fails with an error " name 'getitem' is not defined "
#144674 commented on
Jan 21, 2025 • 0 new comments -
[Inductor] Unify the data type propagation between Triton and CPP Backend
#144246 commented on
Jan 21, 2025 • 0 new comments -
Add API to detect if activation checkpointing is enabled in the current region or not
#144928 commented on
Jan 21, 2025 • 0 new comments -
Torch compile cache
#144859 commented on
Jan 21, 2025 • 0 new comments -
_pickle.UnpicklingError: pickle data was truncated - Windows multiprocessing during training
#69611 commented on
Jan 21, 2025 • 0 new comments -
Multiple CPU processes using same GPU model for inference
#16943 commented on
Jan 21, 2025 • 0 new comments -
[RFC] Disable CMake find_library(libm) on Windows, and solve libm conflict to MSVC runtime lib(ucrt.lib).
#141946 commented on
Jan 21, 2025 • 0 new comments -
DISABLED test_method_overloading (__main__.TestScript)
#131104 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_mixed_mm_epi_works (__main__.TestPatternMatcher)
#126489 commented on
Jan 22, 2025 • 0 new comments -
DISABLED test_batch_linear_post_grad_fusion (__main__.TestPostGradBatchLinearFusion)
#120280 commented on
Jan 22, 2025 • 0 new comments -
custom gradient for int8
#129889 commented on
Jan 22, 2025 • 0 new comments -
torch.compile() within TorchDispatchMode always causes an unknown guard failure.
#144787 commented on
Jan 22, 2025 • 0 new comments -
DISABLED test_sdpa_mask_fp16_L6_S17_NH23_HS121 (__main__.TestSDPA)
#138905 commented on
Jan 22, 2025 • 0 new comments -
RFC: Dynamically Quantized 4 bit matmul API and usage
#143289 commented on
Jan 22, 2025 • 0 new comments -
Code fails with "Expected curr_block->next == nullptr to be true, but got false"
#140419 commented on
Jan 22, 2025 • 0 new comments -
DISABLED test_mixed_mm_bad_cases (__main__.TestPatternMatcher)
#128487 commented on
Jan 22, 2025 • 0 new comments -
ModuleNotFoundError: No module named 'torch.privateuseone'
#144955 commented on
Jan 22, 2025 • 0 new comments -
DISABLED test_comprehensive_argsort_cuda_float16 (__main__.TestInductorOpInfoCUDA)
#131158 commented on
Jan 22, 2025 • 0 new comments -
Adding Infiniband to RDZV Backend for optimal torch run training
#144779 commented on
Jan 21, 2025 • 0 new comments -
user-defined triton kernels + inductor stride re-ordering can lead to silent incorrectness
#130243 commented on
Jan 21, 2025 • 0 new comments -
asynchronous copies from accelerator to cpu: what should be the expected behaviour?
#140296 commented on
Jan 21, 2025 • 0 new comments -
`torch._foreach_mul` does not support autograd
#144580 commented on
Jan 21, 2025 • 0 new comments -
Connection Limitation in PyTorch Distributed (Vanilla) with c10d Rendezvous Backend
#144856 commented on
Jan 21, 2025 • 0 new comments -
CheckpointError with `torch.distributed.algorithms._checkpoint.checkpoint_wrapper` and `torch.compile`
#144637 commented on
Jan 21, 2025 • 0 new comments -
Better mergebot messages when reverting a PR
#139680 commented on
Jan 21, 2025 • 0 new comments -
[CI] Manywheel image should use hash based on `.ci/docker` directory
#142218 commented on
Jan 21, 2025 • 0 new comments -
ONNX: wrong operator for ceil_mode Pooling in case of skip the last window
#131272 commented on
Jan 21, 2025 • 0 new comments -
DataLoader hangs when object fails during pickling
#142884 commented on
Jan 21, 2025 • 0 new comments -
python-3.13t binaries are only available for Linux x86
#144357 commented on
Jan 21, 2025 • 0 new comments -
Bug when using reparameterized model evaluating with DDP
#145043 commented on
Jan 21, 2025 • 0 new comments -
torch.distributed hangs between Linux (X86) and Mac (M2 Pro)
#144851 commented on
Jan 21, 2025 • 0 new comments -
Consider making torch.cond return zero rather than None for the gradients of tensors that are in the not-taken branch of the if-else.
#141301 commented on
Jan 21, 2025 • 0 new comments -
Partitioner stores fp8 copy of all weights between fwd and bwd, causing OOM
#141881 commented on
Jan 21, 2025 • 0 new comments -
[PassRate] TorchBench training PassRate is less than 100
#143414 commented on
Jan 21, 2025 • 0 new comments -
torch.compile does not work with Flash attention 3
#144540 commented on
Jan 21, 2025 • 0 new comments -
DISABLED test_grad_scaling_autocast_cuda (__main__.TestTorchDeviceTypeCUDA)
#119154 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_tensor_subclasses (__main__.TestScript)
#119949 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_comprehensive_cross_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#140355 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_dunder_round_edgecases_val_2147483647_ndigits_-1 (__main__.TestNonarrayArgs)
#116121 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_is_isnot (__main__.TestScript)
#120694 commented on
Jan 17, 2025 • 0 new comments -
Is it possible to remove NCCL submodule and use only nccl binaries from pypi instead ?
#144768 commented on
Jan 17, 2025 • 0 new comments -
Cannot create and distribute array in torch.func.grad
#134462 commented on
Jan 17, 2025 • 0 new comments -
[Dynamo] Do an audit on skipfiles and mark more files as inline
#142395 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_profiler_mark_wrapper_call_dynamic_shapes_cuda (__main__.DynamicShapesGPUTests)
#145002 commented on
Jan 17, 2025 • 0 new comments -
DTensor RNG state for non CUDA backends
#138329 commented on
Jan 17, 2025 • 0 new comments -
amp.custom_fwd has incomplete support for library.custom_op
#137033 commented on
Jan 17, 2025 • 0 new comments -
ExpandableMemorySegments not working on H100s/A100s
#122057 commented on
Jan 17, 2025 • 0 new comments -
AMP doesn't gracefully handle optimizers for disabled regions
#47128 commented on
Jan 17, 2025 • 0 new comments -
Support loading and executing a ExportedProgram from torch.export in C++ environment
#144663 commented on
Jan 17, 2025 • 0 new comments -
[Pipelining] PP+DDP does not work for Zero Bubble
#144530 commented on
Jan 17, 2025 • 0 new comments -
`_pdist_forward` causes segmentation fault for 3D tensor with last dimension of size 0
#145064 commented on
Jan 17, 2025 • 0 new comments -
auto-grad graph replicate split_with_sizes(lengths) X times where X = len(lengths) affecting compile time
#140835 commented on
Jan 17, 2025 • 0 new comments -
`torch.compiler.disable()` on module hooks will disable `module.compile()`
#142358 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_serialize_export_scan_simple_cuda_float32 (__main__.TestHOPCUDA)
#139073 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_retrace_export_scan_simple_cuda_float32 (__main__.TestHOPCUDA)
#139074 commented on
Jan 17, 2025 • 0 new comments -
Support torch.func.grad for Flex Attention
#144810 commented on
Jan 17, 2025 • 0 new comments -
torch.stack for sequences
#144671 commented on
Jan 17, 2025 • 0 new comments -
nn.LSTM documentation
#139582 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_device_mode_ops_sparse_sampled_addmm_cpu_complex64 (__main__.TestDeviceUtilsCPU)
#132686 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_dtype_sympy_expr_dynamic_shapes_cpu (__main__.DynamicShapesCodegenCpuTests)
#135213 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_tmp_not_defined_issue2_dynamic_shapes_cpu (__main__.DynamicShapesCodegenCpuTests)
#135212 commented on
Jan 17, 2025 • 0 new comments -
LibTorch -> TorchScript -> PyTorch (Python) fails with `AttributeError: 'RecursiveScriptModule' object has no attribute 'forward'`
#68559 commented on
Jan 20, 2025 • 0 new comments -
TorchInductor CPU Performance Dashboard
#93531 commented on
Jan 20, 2025 • 0 new comments -
Set up Mac builds with clang >= 17 even though Xcode only has at most clang 16
#143913 commented on
Jan 20, 2025 • 0 new comments -
Python 3.13 support for PyTorch
#130249 commented on
Jan 20, 2025 • 0 new comments -
tts_angular: fail_to_run, torch._dynamo.exc.Unsupported: call_method NNModuleVariable() flatten_parameters [] {}
#105532 commented on
Jan 20, 2025 • 0 new comments -
Illegal Memory Access With `torch.compile`
#139628 commented on
Jan 20, 2025 • 0 new comments -
TypeError: Type parameter +RV without a default follows type parameter with a default in _inductor/utils.py
#140914 commented on
Jan 20, 2025 • 0 new comments -
DISABLED test_matmul_triton_kernel_benchmark (__main__.TestKernelBenchmark)
#115002 commented on
Jan 20, 2025 • 0 new comments -
Device assert throws a runtime error in cuda backend and results in a crash in xpu backend
#142135 commented on
Jan 20, 2025 • 0 new comments -
Matmul with int32 parameters on Intel GPU leads to errors
#144766 commented on
Jan 20, 2025 • 0 new comments -
[Inductor] [CPU] `GroupNorm` triggers inconsistency when using Inductor
#141541 commented on
Jan 20, 2025 • 0 new comments -
[RFC] Intel GPU distributed Backend integration in `torch-xpu-ops` and registration in PyTorch
#141741 commented on
Jan 20, 2025 • 0 new comments -
DISABLED test_profiler_mark_wrapper_call_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135294 commented on
Jan 20, 2025 • 0 new comments -
[torch.export] _insert_copy_for_mutations can't generate proper copy nodes for pure inplace ops
#144954 commented on
Jan 20, 2025 • 0 new comments -
ARM build failed with recent XNNPACK update: third_party/XNNPACK/src/reference/unary-elementwise.cc:125:14: error: invalid ‘static_cast’ from type ‘xnn_bfloat16’ to type ‘_Float16’
#141083 commented on
Jan 20, 2025 • 0 new comments -
DISABLED test_comprehensive_pca_lowrank_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#139828 commented on
Jan 20, 2025 • 0 new comments -
Torch profiler corrupted names with Python 3.11
#121219 commented on
Jan 19, 2025 • 0 new comments -
[pytree] Handling of `None` in torch.utils._pytree is inconsistent with JAX.
#119328 commented on
Jan 19, 2025 • 0 new comments -
Add BufferDict container
#37386 commented on
Jan 19, 2025 • 0 new comments -
Responses from `https://download.pytorch.org/whl/cpu` have `cache-control: no-cache` set in their headers
#130571 commented on
Jan 18, 2025 • 0 new comments -
MPS operator coverage tracking issue (2.6+ version)
#141287 commented on
Jan 18, 2025 • 0 new comments -
DISABLED test_input_mutation2_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135295 commented on
Jan 18, 2025 • 0 new comments -
Make streams used for NCCL operations configurable
#67158 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_device_mode_ops_sparse_sampled_addmm_cpu_float64 (__main__.TestDeviceUtilsCPU)
#132737 commented on
Jan 17, 2025 • 0 new comments -
Observing CUDA OOM errors in more recent versions of PyTorch nightly (post-`2.6.0.dev20241126`)
#141904 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_torch_to (__main__.TestTEFuserStatic)
#121876 commented on
Jan 17, 2025 • 0 new comments -
DISABLED test_torch_to (__main__.TestTEFuserDynamic)
#121875 commented on
Jan 17, 2025 • 0 new comments -
[dynamo] Fix constant propagation in builtins and UserClasses
#131354 commented on
Jan 21, 2025 • 0 new comments -
[inductor] enable bf32 for mkldnn linear pointwise/binary in inductor
#127294 commented on
Jan 20, 2025 • 0 new comments -
[inductor] enable bf32 test for mkldnn conv
#127293 commented on
Jan 21, 2025 • 0 new comments -
allow to use bf16 as fp32 internal precision for mkldnn conv backward
#126054 commented on
Jan 21, 2025 • 0 new comments -
allow to use bf16 as fp32 internal precision for mkldnn conv
#126050 commented on
Jan 21, 2025 • 0 new comments -
refine fp32 precision api
#125888 commented on
Jan 21, 2025 • 0 new comments -
[vision hash update] update the pinned vision hash
#125806 commented on
Jan 24, 2025 • 0 new comments -
Automated submodule update: FBGEMM
#115316 commented on
Jan 24, 2025 • 0 new comments -
Automated submodule update: kineto
#106149 commented on
Jan 22, 2025 • 0 new comments -
[Inductor test failure] torch:inductor/test_select_algorithm TestSelectAlgorithm.test_convolution1 with cuda 12.6.3
#143412 commented on
Jan 24, 2025 • 0 new comments -
[export] run_decompositions fails on `torch.ops.aten.index_put_`
#141336 commented on
Jan 24, 2025 • 0 new comments -
compile time regression 1/9
#144775 commented on
Jan 24, 2025 • 0 new comments -
[RFC] Add CPP INT8 SDPA Template for Inductor CPU
#144941 commented on
Jan 24, 2025 • 0 new comments -
DISABLED TCPStoreTest.testMultiTenantStores (__main__.TCPStoreTest)
#142030 commented on
Jan 24, 2025 • 0 new comments -
Performance regression in torch.compile
#136254 commented on
Jan 24, 2025 • 0 new comments -
assert size/strides for fallback kernel
#144717 commented on
Jan 24, 2025 • 0 new comments -
FFT half precision only let CUDA pass the check
#143112 commented on
Jan 24, 2025 • 0 new comments -
-fno-omit-frame-pointer by default in our builds
#51151 commented on
Jan 24, 2025 • 0 new comments -
TorchDispatchMode can't capture the operator whose name is aten::index_put_ impl_
#145041 commented on
Jan 24, 2025 • 0 new comments -
[inductor][cpu] With inductor_max_autotune, constants missing from frozen FxGraph.
#143144 commented on
Jan 24, 2025 • 0 new comments -
Support LayerNorm2d
#144223 commented on
Jan 24, 2025 • 0 new comments -
DISABLED test_aot_export_with_torch_cond (__main__.TestAOTExport)
#139998 commented on
Jan 24, 2025 • 0 new comments -
DISABLED test_flip_cpu (__main__.CpuTests)
#142863 commented on
Jan 24, 2025 • 0 new comments -
Allow generic python data structure input for torch.autograd.Function
#144159 commented on
Jan 23, 2025 • 0 new comments -
[ONNX][RFC] Migrate torchlib from onnxscript
#139301 commented on
Jan 23, 2025 • 0 new comments -
Release 2.6.0 validations checklist and cherry-picks
#144503 commented on
Jan 23, 2025 • 0 new comments -
DISABLED TCPStoreTest.testMultiTenantStoresUV (__main__.TCPStoreTest)
#139150 commented on
Jan 23, 2025 • 0 new comments -
Add overflow check for integer division
#138684 commented on
Jan 17, 2025 • 0 new comments -
[Docker] Create an independent dependencies layer
#138612 commented on
Jan 22, 2025 • 0 new comments -
Prototype Triton kernel for torch.bmm(NJT, T)
#138555 commented on
Jan 23, 2025 • 0 new comments -
Update test_function_base.py for Numpy 2.0 +
#138463 commented on
Jan 21, 2025 • 0 new comments -
Replace use of PyTorch 2.0 with torch.compile, and minor edits
#138436 commented on
Jan 23, 2025 • 0 new comments -
[pytree] add `treespec_{leaf,tuple,dict}` functions for args_spec modification
#138214 commented on
Jan 22, 2025 • 0 new comments -
[POC][FX][pytree] cleanup fx pytree implementation
#138202 commented on
Jan 23, 2025 • 0 new comments -
update _unsafe_set_version_counter to accept lists of tensors
#137921 commented on
Jan 24, 2025 • 0 new comments -
[Intel GPU] allow_tf32 context at XPU backend
#137570 commented on
Jan 20, 2025 • 0 new comments -
Add back DistributedDataParallel types that were lost when pyi was removed
#136835 commented on
Jan 20, 2025 • 0 new comments -
Avoid sqrt calculations with values less than zero
#136824 commented on
Jan 24, 2025 • 0 new comments -
[Intel GPU] qlinear.pointwise with mixed dtype support
#136753 commented on
Jan 23, 2025 • 0 new comments -
[Partitioner] Reduce time consuming of partitions merger
#136614 commented on
Jan 23, 2025 • 0 new comments -
[Partitioner] Remove unnecessary upstream nodes in dependency viewer
#136608 commented on
Jan 23, 2025 • 0 new comments -
add generalized pareto distribution (GPD)
#135968 commented on
Jan 24, 2025 • 0 new comments -
[Intel GPU] qconv.pointwise with mixed dtype XPU support
#135465 commented on
Jan 23, 2025 • 0 new comments -
add supports_coalescing property in c10d::Backend to determine whether backend supports coalescing
#135338 commented on
Jan 23, 2025 • 0 new comments -
[Intel GPU] qlinear_pointwise.binary[_tensor] XPU support
#135337 commented on
Jan 23, 2025 • 0 new comments -
Pass ideep:lowp_kind to matmul_forward::compute on cache misses
#135058 commented on
Jan 23, 2025 • 0 new comments -
Add decompositions for median and nonmedian
#134881 commented on
Jan 23, 2025 • 0 new comments -
[ROCm] Add support for SymmetricMemory and Intra Node Comm
#134817 commented on
Jan 17, 2025 • 0 new comments -
add ranking for grouped benchmarks
#133287 commented on
Jan 18, 2025 • 0 new comments -
Make IPC features extendable on third-party devices
#133222 commented on
Jan 23, 2025 • 0 new comments -
basic GroupedInductorBenchmarker
#133121 commented on
Jan 18, 2025 • 0 new comments -
xpu: support sycl with torch.utils.cpp_extension APIs
#132945 commented on
Jan 18, 2025 • 0 new comments -
Added dist utility API to get backend from a device object
#132735 commented on
Jan 24, 2025 • 0 new comments -
[xla hash update] update the pinned xla hash
#132021 commented on
Jan 20, 2025 • 0 new comments -
DISABLED test_cuda_event_created_outside_of_graph (__main__.CtxManagerTests)
#133828 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_cuda_event_created_outside_of_graph_dynamic_shapes (__main__.DynamicShapesCtxManagerTests)
#133837 commented on
Jan 23, 2025 • 0 new comments -
[Tensorboard] Problem with subfolders from SummaryWriter
#32651 commented on
Jan 23, 2025 • 0 new comments -
Runners, torchbench, & the future
#143215 commented on
Jan 23, 2025 • 0 new comments -
Need clarification on torch.nn.CrossEntropyLoss
#137188 commented on
Jan 23, 2025 • 0 new comments -
torch.library.opcheck generates gradients with strides of 0
#132857 commented on
Jan 23, 2025 • 0 new comments -
`torch.nn.function.one_hot` and `torch.Tensor.as_subclass` API not available under `torch.compile`
#129651 commented on
Jan 23, 2025 • 0 new comments -
ExportedModule default print of graph signature is unreadable
#141243 commented on
Jan 23, 2025 • 0 new comments -
TORCH_PYTHON_API contains breaking changes in same version 2.6.0a0
#144966 commented on
Jan 22, 2025 • 0 new comments -
int_mm seems broken due to Triton upgrade
#144705 commented on
Jan 22, 2025 • 0 new comments -
[feature request] Varlen indexing function for lookup and concat of varlen BPE tokens from a tensor vocab (i.e. `detokenize(...)` and arrays of strings)
#135704 commented on
Jan 22, 2025 • 0 new comments -
[triton 3.2] test_convolution_as_mm failure on A100
#141079 commented on
Jan 22, 2025 • 0 new comments -
`torch.ops.aten._local_scalar_dense` crashed on empty size tensor
#145063 commented on
Jan 22, 2025 • 0 new comments -
Method for loading a distributed checkpoint into a single state_dict is being deprecated without alternative, request to make it possible to keep that feature
#125777 commented on
Jan 22, 2025 • 0 new comments -
[DCP]Distributed checkpoint `set_optimizer_state_dict` cause optimizer step error when optimizer contains empty param group
#143828 commented on
Jan 22, 2025 • 0 new comments -
Should coordinator_rank in class _DistWrapper be the global_rank instead of local rank in its process group?
#141825 commented on
Jan 22, 2025 • 0 new comments -
Compile error for custom op with optional mutable tensor list argument
#144072 commented on
Jan 22, 2025 • 0 new comments -
Really slow compilation times for torch.compile causing distributed training errors
#108971 commented on
Jan 22, 2025 • 0 new comments -
On Kaggle : libcusparse.so.12: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12
#134929 commented on
Jan 22, 2025 • 0 new comments -
Support SDPA flash attention / memory-efficient attn on ROCm gfx908
#141958 commented on
Jan 22, 2025 • 0 new comments -
torch compile error with `torch.Tensor.unsqueeze_`
#129673 commented on
Jan 22, 2025 • 0 new comments -
Wrong meta function for constant_pad_nd
#144187 commented on
Jan 22, 2025 • 0 new comments -
Add ATen functions in native_functions.yaml to torch_in_graph_functions list automatically
#145014 commented on
Jan 22, 2025 • 0 new comments -
torch.nn.functional.scaled_dot_product_attention is_causal fails for kv-cache case (sequential and further parallel attention)
#144858 commented on
Jan 22, 2025 • 0 new comments -
TIMM Training cudagraphs poolformer_m36 regression
#144893 commented on
Jan 22, 2025 • 0 new comments -
PyTorch source code build failed on some Windows 11 environment caused by C++ protocol buffer compiler
#143795 commented on
Jan 22, 2025 • 0 new comments -
Fix `torch.stft` and `torch.istft` when using `center=False`, non-rectangular windows and `win_length==hop_length`
#134323 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_reentrant_parent_error_on_cpu_cuda (__main__.TestAutogradDeviceTypeCUDA)
#86735 commented on
Jan 23, 2025 • 0 new comments -
Meta implementations of FFT operators often have incorrect strides
#106623 commented on
Jan 23, 2025 • 0 new comments -
Adding Levenberg-marquardt optimizer in PyTorch
#83529 commented on
Jan 23, 2025 • 0 new comments -
Update TorchInductor to support removed AttrsDescriptor in upstream Triton
#144103 commented on
Jan 23, 2025 • 0 new comments -
[v.2.6.0] Release Tracker
#142814 commented on
Jan 23, 2025 • 0 new comments -
compiled autograd + dynamic shapes fails with constraint violation
#133575 commented on
Jan 23, 2025 • 0 new comments -
[feature request]: Update max onnx opset to 21 for onnxruntime==1.18 compatibility
#127167 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_allocation_id_uniqueness (__main__.TestTorchTidyProfiler)
#125021 commented on
Jan 23, 2025 • 0 new comments -
AWS A100 runners reliability issue
#140332 commented on
Jan 23, 2025 • 0 new comments -
[MPS] Indexing Returns 0 if OOB
#144824 commented on
Jan 23, 2025 • 0 new comments -
capture_dynamic_output_shape_ops=True changing expected output between eager and compiled versions
#130290 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_angle_cpu (__main__.CpuTritonTests)
#136124 commented on
Jan 23, 2025 • 0 new comments -
pytorch with xpu support fails to eval pre trained models
#143996 commented on
Jan 23, 2025 • 0 new comments -
FSDP does not work on GLOO backend
#74041 commented on
Jan 23, 2025 • 0 new comments -
AOTAutogradCache implementation
#128234 commented on
Jan 23, 2025 • 0 new comments -
FlexAttention + ROCM Issue Tracker
#140855 commented on
Jan 23, 2025 • 0 new comments -
MaxPool2D memory leakage on device MPS
#125217 commented on
Jan 23, 2025 • 0 new comments -
General MPS op coverage tracking issue
#77764 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_crash (__main__.TestCompileWorker)
#131064 commented on
Jan 23, 2025 • 0 new comments -
[torchbench] Missing meta function for aten::_cudnn_rnn_flatten_weight
#144989 commented on
Jan 23, 2025 • 0 new comments -
[Tracker] Nested tensor op coverage requests
#118107 commented on
Jan 23, 2025 • 0 new comments -
Issues linking to libtorch on M2 mac
#143571 commented on
Jan 23, 2025 • 0 new comments -
[RFC] PyTorch - PyPi PEP-759 proposal (wheel-next)
#139761 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_log_traced_frames (__main__.LoggingTests)
#137461 commented on
Jan 23, 2025 • 0 new comments -
Inconsistent computation of gradient in MaxUnPooling
#80827 commented on
Jan 23, 2025 • 0 new comments -
DISABLED test_new_spectral_norm_forward_swap_True (__main__.TestNNParametrization)
#131089 commented on
Jan 23, 2025 • 0 new comments