Add support for (basic) FFmpeg filters for faster video pre-processing. In particular, rescaling and changing the frame rate would be useful when feeding in-the-wild videos through a trained model.

I am working on a video loader to feed video frames to a model trained on the Kinetics 400 dataset and obtain predictions. The model is trained at a fixed resolution, on videos with a frame rate of 15 fps. To support making predictions on videos from various sources, I at least need to resample them to the correct resolution and frame rate. The current public API only supports decoding of video frames and trimming, but not any other pre-processing, so I need to do any such pre-processing in Python/PyTorch. Such an approach is visibly slower than an implementation based on ffmpeg-python, a wrapper around the command-line ffmpeg.

I would like to start a conversation on how best to bring such functionality to Torchvision. I imagine changing the resolution/fps is a common requirement for making predictions on videos, so I can see it as a useful feature of video I/O.

Is this something that you would want to add to torchvision.io.read_video? What about torchvision.io.VideoReader? More generally, is there a plan to add support for all FFmpeg filters in the future? What would that interface look like?

Additional context: Looking at the C++ code, there is already some support for requesting video frames of a certain resolution, but this functionality is only exposed in _reader.read_video_from_file, not the public API. I can’t find anything similar for requesting a certain frame rate. I’ve done some initial comparisons between torchvision.io.VideoReader + changing frame rate in Python + torch rescaling on batches of 16 frames versus an ffmpeg-python pipeline with scale and fps filters, on an MP4 input video of ~261 s.
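To make the "changing frame rate in Python" step concrete, here is a minimal sketch of one way to do it: pick, for each output timestamp, the source frame being displayed at that moment. The helper name `resample_indices` and its signature are hypothetical and not part of Torchvision or the code used in the comparison above. It only computes which decoded frames to keep (or repeat) and leaves decoding and rescaling to the caller.

```python
from fractions import Fraction


def resample_indices(num_frames, src_fps, dst_fps):
    """Return the source frame indices that approximate dst_fps.

    Hypothetical helper: output frame k is shown at time k / dst_fps,
    which falls on source frame floor(k * src_fps / dst_fps).
    """
    if src_fps <= 0 or dst_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Exact rational step avoids drift from accumulated float error.
    step = Fraction(src_fps) / Fraction(dst_fps)
    indices = []
    k = 0
    while True:
        idx = int(k * step)  # floor, since k * step is non-negative
        if idx >= num_frames:
            break
        indices.append(idx)
        k += 1
    return indices


# Downsampling a 30 fps clip to 15 fps keeps every second frame.
print(resample_indices(10, 30, 15))  # [0, 2, 4, 6, 8]
```

When the target rate is higher than the source rate, the same function repeats frames instead of dropping them. Selecting indices this way still requires decoding every source frame first, which may be part of why filtering inside FFmpeg is faster.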