

Take the following steps:

1. Subclass Function and implement the forward() and backward() methods.
2. Call the proper methods on the ctx argument.
3. Declare whether your function supports double backward.
4. Validate whether your gradients are correct using gradcheck.

Step 1: After subclassing Function, you’ll need to define 2 methods:

forward() is the code that performs the operation. It can take as many arguments as you want, with some of them being optional, if you specify the default values. All kinds of Python objects are accepted here. Tensor arguments that track history (i.e., with requires_grad=True) will be converted to ones that don’t track history before the call, and their use will be registered in the graph. Note that this logic won’t traverse lists/dicts/any other data structures and will only consider tensors that are direct arguments to the call. You can return either a single Tensor output, or a tuple of tensors if there are multiple outputs. Also, please refer to the docs of Function to find descriptions of useful methods that can be called only from forward().

backward() defines the gradient formula. It will be given as many Tensor arguments as there were outputs, with each of them representing the gradient w.r.t. that output. It should return as many tensors as there were inputs, with each of them containing the gradient w.r.t. its corresponding input. If your inputs didn’t require gradient (needs_input_grad is a tuple of booleans indicating whether each input needs gradient computation), or were non-Tensor objects, you can return None. Also, if you have optional arguments to forward() you can return more gradients than there were inputs, as long as they’re all None.

Step 2: It is your responsibility to use the functions in the forward’s ctx properly in order to ensure that the new Function works properly with the autograd engine.
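Putting Steps 1 and 2 together, here is a sketch of a custom linear function modeled on the example in the PyTorch docs; the name LinearFunction and the tensor shapes are illustrative. It stashes its inputs with ctx.save_for_backward() in forward() and consults ctx.needs_input_grad in backward() to skip gradients that aren’t required:

```python
import torch
from torch.autograd import Function

class LinearFunction(Function):
    @staticmethod
    def forward(ctx, input, weight, bias=None):
        # Save tensors needed by backward() on ctx (Step 2).
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_weight = grad_bias = None
        # Only compute gradients for inputs that need them.
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0)
        # One return value per forward() input; None is fine for
        # inputs that didn't require gradient (e.g., an omitted bias).
        return grad_input, grad_weight, grad_bias
```

Note that the function is used by calling LinearFunction.apply(input, weight, bias), not by instantiating the class or calling forward() directly.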

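For Step 4, gradcheck compares the analytical gradients computed by backward() against finite-difference estimates. A small sketch using the LinearFunction defined above; double-precision inputs are recommended so the numerical estimates are accurate:

```python
import torch
from torch.autograd import gradcheck

inputs = (
    torch.randn(20, 20, dtype=torch.double, requires_grad=True),  # input
    torch.randn(30, 20, dtype=torch.double, requires_grad=True),  # weight
)
# Raises an error (or returns False with raise_exception=False)
# if the analytical and numerical gradients disagree.
print(gradcheck(LinearFunction.apply, inputs, eps=1e-6, atol=1e-4))
```

For Step 3, torch.autograd.gradgradcheck can be used in the same way to validate gradients of gradients if your function declares support for double backward.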