Step 1: Data Pre-processing
The frame-segmentation step is implemented in Python; a demo of the operation is attached. The input video is the segmented video as downloaded from the web link. The video file name encodes the set, sequence, activity, and related information, and an Excel file listing these fields is attached. An example row:
|Set|Sequence #|Activity Class ID|Starting Frame #|Ending Frame #|Main Actor (left = 0, right = 1, both = -1)|
For segmentation, in each video we select the main actor by drawing an ROI around them and confirming with Enter. The main actor's action is then tracked automatically, and each frame is saved to the destination directory. The main actor for each video is defined in labels.xlsx (built from the data on the website). See the attached video for a demonstration.
Take care of the actors in each video: a video may contain one or two main actors, and the actions of all of them must be segmented. Further manual analysis is then required to keep only the frames that contain the action and discard the remaining frames of the main actor. For example, in the table above the main action spans frames 32-89 and the activity class is 4, which is punching. Once all the action data is filtered and labeled, it can be used for the classification task, which is handled by a machine learning classifier.
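The manual filtering rule above amounts to keeping only the frames inside the labeled interval. A minimal sketch (field names are illustrative, not the exact labels.xlsx headers):

```python
def filter_action_frames(frame_numbers, start_frame, end_frame):
    """Return only the frame numbers that fall inside the labeled action interval."""
    return [n for n in frame_numbers if start_frame <= n <= end_frame]

# The example row above: the action spans frames 32-89, activity class 4 (punching).
kept = filter_action_frames(range(1, 121), 32, 89)
```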
Requisites to run main.py
- download the latest version of Python from this link: https://www.python.org/downloads/
- install OpenCV:
2.1 open the command prompt
2.2 type pip install opencv-python and press Enter
2.3 once that installation is finished, type pip install opencv-contrib-python and press Enter
How to run main.py
- open command prompt
- type the command in the form
python <directory of main.py> <input video location> <directory where images are to be saved>
python F://freelancer//aditiJah//Data_preprocessing//main.py F://freelancer//aditiJah//ut-interaction_segmented_set1//segmented_set1 F://freelancer//aditiJah//ut-interaction_segmented_set1//images
Step 2: How to run MATLAB code
once the data pre-processing is complete and labels.xlsx has been generated, run main.m after setting the proper paths in the code.
Note: We used this pipeline for the research work, and the data was manually selected. Human actions are classified by a two-level machine learning approach, which achieves an accuracy of up to 98.66%.
We use the UT-Interaction video dataset to validate the performance of the work. The dataset consists of 20 video sequences covering six categories of human-human interaction. The input video is segmented into frames using a Python script. Global features are computed with the Histogram of Oriented Gradients (HOG) and the Histogram of Optical Flow (HOF), yielding a feature vector for the frames of each video. The dimensionality of the HOG and HOF feature vectors is reduced with Principal Component Analysis (PCA). The reduced features are fed to an SVM classifier, which separates the six interactions into two classes: Class 1 contains the fighting interactions (kicking, punching, and pushing) and Class 2 contains the normal interactions (pointing, hugging, and hand shaking).
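The first classification level can be illustrated in Python, although the actual implementation is in MATLAB. This sketch uses a heavily simplified HOG-style descriptor (one global orientation histogram; real HOG uses local cells and block normalization), PCA for dimensionality reduction, and a binary SVM, all on synthetic images rather than UT-Interaction frames:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def hog_like(gray, bins=9):
    """Simplified HOG-style descriptor: a single gradient-orientation
    histogram weighted by gradient magnitude."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientations in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

rng = np.random.default_rng(0)

def stripes(vertical):
    # Synthetic stand-in for a frame: striped pattern plus a little noise.
    img = np.zeros((32, 32))
    if vertical:
        img[:, ::4] = 1.0
    else:
        img[::4, :] = 1.0
    return img + 0.05 * rng.standard_normal((32, 32))

X = np.array([hog_like(stripes(v)) for v in [True] * 40 + [False] * 40])
y = np.array([0] * 40 + [1] * 40)       # 0 = "Class 1", 1 = "Class 2"

X_red = PCA(n_components=4).fit_transform(X)   # reduce descriptor dimension
clf = SVC(kernel="linear").fit(X_red, y)       # level-1 SVM
acc = clf.score(X_red, y)
```

In the actual pipeline, HOF features computed from optical flow are concatenated alongside HOG before the PCA step.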
Features are then extracted again from the two classes of human interaction. Speeded-Up Robust Features (SURF) is used to detect motion features from the classified interaction categories. The detected features are represented as visual words using the Bag of Visual Words (BoVW) approach, which encodes the key points extracted from the segmented frames. The SURF-BoVW features are fed to a neural network classifier, which classifies the human-human interaction. The proposed two-level method provides better results than single-level classifiers such as SVM and KNN, reaching approximately 99% classification accuracy on the UT-Interaction dataset.