Activity recognition is an emerging field of research that grew out of the larger fields of ubiquitous computing, context-aware computing and multimedia. Recognizing everyday activities has recently become one of the challenges of pervasive computing. It is required in applications such as medical monitoring and rehabilitation, where accurate and detailed measurement of an individual's physical activity is a key requirement. Accelerometers have become a popular choice for measuring physical activity owing to their small size, low cost, convenience and ability to provide objective information about physical activity. In what follows, an accurate activity recognition system is developed to classify the performed activities into three classes: Standing, Running and Cycling. The accelerometer is worn at the user's right waist.
Data from the accelerometer first undergoes feature extraction, and the extracted features are then interpreted using machine learning algorithms. The accelerometer data is divided into short, non-overlapping windows of 10 samples each, i.e. 5 continuous seconds in our case because our sampling rate is 2 samples/second. Each window yields a 15-dimensional feature vector, which is used to train the machine learning model. The features considered and extracted for the task at hand are described below.
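The windowing step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `split_into_windows` is hypothetical, and dropping an incomplete trailing window is an assumption, since the text does not say how leftover samples are handled.

```python
from typing import List, Sequence

WINDOW_SIZE = 10        # samples per window (from the text)
SAMPLING_RATE_HZ = 2.0  # samples per second (from the text)


def split_into_windows(samples: Sequence[float],
                       size: int = WINDOW_SIZE) -> List[List[float]]:
    """Split a 1-D signal into consecutive, non-overlapping windows.

    A trailing remainder shorter than `size` is dropped (an assumption;
    the text does not specify how incomplete windows are treated).
    """
    n_full = len(samples) // size
    return [list(samples[i * size:(i + 1) * size]) for i in range(n_full)]


# Synthetic example: 25 samples give two full windows, 5 samples are dropped.
signal = [float(i) for i in range(25)]
windows = split_into_windows(signal)

# Duration covered by one window: 10 samples / 2 Hz = 5 seconds.
window_seconds = WINDOW_SIZE / SAMPLING_RATE_HZ
```

Each element of `windows` would then be passed to feature extraction independently.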
The motivation for our features is as follows:
1. Mean: The mean captures the central tendency of the data distribution, so it tells us which value the activity data is biased towards.
2. Standard deviation: This feature represents the amount of variation or dispersion of the data in the activity, so it can differentiate between activities with the same mean value.
3. Skewness: It represents the asymmetry of the data around its mean.
4. Kurtosis: It measures whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers, while data sets with low kurtosis tend to have light tails, or a lack of outliers.
5. Coefficient of Variation: It is a standardised measure of dispersion in an activity, defined as the ratio of the standard deviation to the mean. It is also known as relative standard deviation (RSD).
6. Zero Crossing Rate: It represents the rate at which the data changes sign, from positive to negative or back. Activities with both negative and positive values tend to have a high zero crossing rate.
7. Data below the 25th and 75th percentiles: This feature represents how the data is distributed across the quartiles.
8. Peak frequency component in the spectrum of Y-axis data below 2 Hz: This feature captures the dominant low-frequency component of the signal, which is important because real-life activities generally have low frequencies.
9. Number of peaks in the spectrum of Y-axis data below 2 Hz: It represents the slowly varying components in the Y-axis spectrum and is important for distinguishing between activities with and without motion in the Y direction.
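The features listed above can be sketched for a single window of one axis as follows, using only the Python standard library. This is an illustrative sketch under several assumptions, not the project's implementation: the function names, the use of population moments, the inclusive quartile method, removing the mean before the DFT, and the relative threshold for counting spectral peaks are all choices made here. The text also does not say how the listed quantities map onto the 15 dimensions of the feature vector, so no attempt is made to reproduce that layout.

```python
import cmath
import math
import statistics
from typing import Dict, List, Sequence, Tuple


def statistical_features(w: Sequence[float]) -> Dict[str, float]:
    """Time-domain features of one window, using population moments."""
    n = len(w)
    mean = sum(w) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in w) / n)
    # Standardized 3rd and 4th central moments (kurtosis is 3 for a normal
    # distribution); set to 0.0 for a constant window (assumption).
    skew = sum((x - mean) ** 3 for x in w) / n / std ** 3 if std else 0.0
    kurt = sum((x - mean) ** 4 for x in w) / n / std ** 4 if std else 0.0
    # Coefficient of variation; set to 0.0 when the mean is zero (assumption).
    cv = std / mean if mean else 0.0
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(w, w[1:]) if (a < 0) != (b < 0))
    # 25th and 75th percentile values (inclusive method is an assumption).
    q1, _, q3 = statistics.quantiles(w, n=4, method="inclusive")
    return {"mean": mean, "std": std, "skewness": skew, "kurtosis": kurt,
            "coeff_of_variation": cv, "zero_crossing_rate": crossings / (n - 1),
            "p25": q1, "p75": q3}


def magnitude_spectrum(w: Sequence[float],
                       fs: float) -> Tuple[List[float], List[float]]:
    """One-sided DFT magnitudes of a mean-removed window, with bin frequencies."""
    n = len(w)
    mean = sum(w) / n
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(w))
        freqs.append(k * fs / n)
        mags.append(abs(coeff))
    return freqs, mags


def spectral_features(w: Sequence[float], fs: float = 2.0,
                      cutoff_hz: float = 2.0,
                      rel_threshold: float = 0.1) -> Dict[str, float]:
    """Peak frequency and number of spectral peaks below `cutoff_hz`.

    With fs = 2 Hz the spectrum only extends to 1 Hz (Nyquist), so every
    bin already lies below the 2 Hz cutoff named in the text.
    """
    freqs, mags = magnitude_spectrum(w, fs)
    band = [(f, m) for f, m in zip(freqs, mags) if f < cutoff_hz]
    band_mags = [m for _, m in band]
    peak_index = max(range(len(band_mags)), key=band_mags.__getitem__)
    # Count interior local maxima that reach a fraction of the largest
    # magnitude, which suppresses floating-point noise (assumption).
    floor = rel_threshold * band_mags[peak_index]
    n_peaks = sum(1 for i in range(1, len(band_mags) - 1)
                  if band_mags[i - 1] < band_mags[i] > band_mags[i + 1]
                  and band_mags[i] >= floor)
    return {"peak_freq_hz": band[peak_index][0], "n_spectral_peaks": n_peaks}


# Synthetic windows (not project data): an alternating signal and a 0.4 Hz sine.
stats = statistical_features([1.0, -1.0] * 5)
sine = [math.sin(2 * math.pi * 0.4 * t / 2.0) for t in range(10)]
spec = spectral_features(sine, fs=2.0)
```

For the alternating window, every adjacent pair changes sign, so the zero crossing rate is 1; for the sine window, the dominant bin sits at 0.4 Hz. In practice the same quantities are often computed with NumPy and SciPy, which is a design choice left open here.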
Click here for all the required datasets and code.