In this research article we study the video classification with 3D Convolutional Neural Networks (3D CNNs) as Video classification plays a significant role in action recognition, indexing, and surveillance systems. The UCF101 dataset consisting of six distinct action classes was utilized to evaluate the performance of the model, utilizing UCF101 dataset for training and validation. Afterward the model’s performance was evaluated based on a custom dataset collected from online, streaming sources. 3D CNNs are proficient in recognizing the spatial and temporal information that is present in video sequences, resulting in improved video classification across various categories. The 3D CNN model is able to understand and capture complex interactions over time and space for video data. This paper outlines the model framework, preprocessing methods, and detailed experiment results. This study illustrates the efficiency of 3D CNN when it comes to video classification tasks and therefore provides valuable insights on improving accuracy and real time use in the future.