Activity Detection using Accelerometer Data

Harsh Bhardwaj, Roll No.: 150102023, Branch: ECE;     Shubham Lohiya, Roll No.: 150102064, Branch: ECE;

   

Aniruddha Ghosh, Roll No.: 150102078, Branch: ECE;     Bhargav Vanamala, Roll No.: 150102082, Branch: ECE;

   
Abstract
Activity Recognition is an emerging field of research, born from the larger fields of ubiquitous computing, context-aware computing and multimedia. Recently, recognizing everyday life activities has become one of the challenges for pervasive computing. It is required in various applications such as medical monitoring and rehabilitation, where accurate and detailed measurement of an individual's physical activity is a key requirement. Accelerometers have become the popular choice for measuring physical activity owing to their small size, low cost, convenience and ability to provide objective information about physical activity. In what follows, an activity recognition system is developed to classify the activities performed into three classes, i.e. Standing, Running and Cycling. The accelerometer is worn at the right waist of the user.

Data from the accelerometer is subjected to feature extraction, and the extracted features are then interpreted using machine learning algorithms. The accelerometer data is divided into non-overlapping short windows of 10 samples, i.e. 5 continuous seconds in our case, because our sampling rate is 2 samples/second. Each window yields a 15-dimensional feature vector, which is used to train a machine learning model.

Motivation for our features is as follows :-

1. Mean: The mean captures the central tendency of the data distribution, so it tells us the value around which the activity data is centred.

2. Standard deviation: This feature represents the amount of variation or dispersion of the data present in the activity, so it can differentiate between activities with the same mean values.

3. Skewness: It represents the asymmetry of the data around its mean.

4. Kurtosis: It measures whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails, or a lack of outliers.

5. Coefficient of Variation: It is the standardised measure of deviation in an activity. It is also known as relative standard deviation (RSD).

6. Zero Crossing Rate: It represents the rate at which the data changes sign from positive to negative or back. Activities with both negative and positive values tend to have a high zero crossing rate.

7. Data below 25 and 75 percentile: This feature represents the distribution of the data in different quartiles.

8. Peak Frequency component in spectrum of Y-axis data below 2 Hz: This feature captures the dominant low-frequency component present in the signal and is important because real-life activities generally have low frequencies.

9. Number of peaks in spectrum of Y-axis data below 2 Hz: It represents the slowly varying components in the spectrum of the Y axis and is important in distinguishing between activities with and without motion in the Y direction.
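
Features 8 and 9 above can be illustrated with a small sketch. It is not the project's code: the function names, the naive DFT, the removal of the window mean, the restriction to interior spectrum bins and the `rel_floor` threshold that ignores negligible bumps are all assumptions made for illustration. Note that at 2 samples/second the spectrum only extends to 1 Hz, so the 2 Hz limit keeps every bin.

```python
import cmath
import math

def magnitude_spectrum(samples, fs):
    """One-sided DFT magnitudes of a real window, after removing its mean."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]  # assumption: drop the constant offset
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        freqs.append(k * fs / n)
        mags.append(abs(coeff))
    return freqs, mags

def spectral_features(samples, fs=2.0, f_max=2.0, rel_floor=0.1):
    """Return (peak frequency, number of spectral peaks) below f_max."""
    freqs, mags = magnitude_spectrum(samples, fs)
    band = [(f, m) for f, m in zip(freqs, mags) if f < f_max]
    peak_freq, peak_mag = max(band, key=lambda fm: fm[1])
    floor = rel_floor * peak_mag  # hypothetical threshold against numerical noise
    # A peak is an interior bin that exceeds the floor and both of its neighbours.
    n_peaks = sum(
        1 for i in range(1, len(band) - 1)
        if band[i][1] > floor
        and band[i][1] > band[i - 1][1]
        and band[i][1] > band[i + 1][1]
    )
    return peak_freq, n_peaks
```

Calling `spectral_features(window_y)` on one 10-sample Y-axis window returns both values; `window_y` is a placeholder name for such a window.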

1. Introduction
When we watch a person, it is easy for us to tell what activity they are performing even if we have never seen them in the past. This is because our brains are already trained to understand human activities. When viewing the activity, the brain compares it to thousands of activities it has memorized and pops out the one that matches. Similarly, a computer can identify the activity one is performing based on activities we have trained it to identify.
On a computer, a machine learning algorithm can be used to “learn” human activities and detect the activity being performed for the new data that is collected. A detection task such as this, which involves categorizing data into separate “classes” is called classification. Applying a classification algorithm to this task involves two steps: training and detection. The training step builds a model which maps training data to certain categories. The detection step maps new data to a category.
1.1 Introduction to Problem
Given accelerometer data from 3 activities namely: Standing, Running & Cycling, we need to classify/detect which activity is being performed from those corresponding data entries.
1.2 Figure

Figure: Representative image of our Project.
1.3 Literature Review
Some of the approaches, methodologies and ideas gathered from the surveyed papers :-

1. Each time series Ai, with i = {x, y, z}, can be filtered with a digital filter in order to separate the low-frequency and high-frequency components. The cut-off frequency has been set to 1 Hz, arbitrarily. In this way, we obtain for each time series three more time series Aij with j = {b, dc, ac}, where b, dc and ac represent respectively the unfiltered time series, the time series resulting from low-pass filtering and the time series resulting from high-pass filtering.

2. A successful technique for extracting features from sequential motion data has been demonstrated to be windowing with overlapping or continuous non-overlapping windows.

3. We can extract the following features: root mean squared value of the integration of acceleration in a window, mean value of Minmax sums, mean value, standard deviation, coefficient of variation, zero crossing rate, skewness, kurtosis, squared sum of data below 25 & 75 percentile, mean of mean of CWC, variance of CWC and correlation between each pair of accelerometer axes.
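
The band separation described in item 1 above could be sketched as follows. This is only an illustration: the surveyed paper's filter design is not given here, so a first-order low-pass filter is assumed, the high-pass series is taken as the remainder, and the function name `split_bands` and the sampling rate passed to it are hypothetical.

```python
import math

def split_bands(series, fs, cutoff_hz=1.0):
    """Return (b, dc, ac): the raw, low-passed and high-passed series of one axis."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor of a first-order low-pass filter
    dc = []
    level = series[0]       # assumption: start the filter at the first sample
    for x in series:
        level += alpha * (x - level)
        dc.append(level)
    # Assumption: the high-pass series is whatever the low-pass filter removed.
    ac = [x - low for x, low in zip(series, dc)]
    return list(series), dc, ac
```

Applying `split_bands` to each of the three axes gives the nine series Aij mentioned in item 1.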

#NOTE : The papers we went through during the literature review to collect ideas, approaches and methodologies are :-
1. A Study on Human Activity Recognition Using Accelerometer Data
2. Feature Selection and Activity Recognition
3. Physical Activity Recognition from Accelerometer

1.4 Proposed Approach
Step 1 : Accelerometer Data Collection and Data-Set Creation.
Step 2 : Feature Extraction.
Step 3 : Training the Classifier & testing and finding accuracy.
1.5 Report Organization
Section 1 describes the Motivation for our Project, Problem Statement and the Literature we reviewed.
Section 2 describes the detailed approach followed to extract features from the given dataset.
Section 3 shows our classification results and the data set used for verifying.
In section 4, a summary of our whole project work is presented.
2. Proposed Approach
Step 1 : Accelerometer Data Collection and Data-Set Creation:-

Acceleration data is collected using an ADXL335, a tri-axial accelerometer which gives acceleration along the X, Y and Z axes. Data for the activities of standing, running and cycling, performed by a number of people, is obtained from the accelerometer, which is attached to the waist of the person performing the activity. The accelerometer is interfaced with a Raspberry Pi module, which is configured accordingly and used to store the data. The accelerometer data is recorded for the activities performed in succession, and the data set for the activities is thus formed.


Step 2 : Feature Extraction:-

A 15-dimensional feature vector is extracted from each non-overlapping short window of 10 samples, i.e. 5 seconds, of the activity data obtained earlier.
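
The windowing step can be sketched as below. The function name and the choice to drop an incomplete final window are assumptions, since the report does not say how leftover samples are handled.

```python
def non_overlapping_windows(samples, window_len=10):
    """Split a sequence into consecutive windows of window_len samples.

    Assumption: samples left over after the last full window are discarded.
    """
    n_full = len(samples) // window_len
    return [samples[i * window_len:(i + 1) * window_len] for i in range(n_full)]
```

Each returned window is then passed to the feature extraction step on its own.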

Details of the features are as follows :-

1. Mean- The average of the acceleration data along all the three axes is calculated and RMS value is taken.

2. Standard Deviation- It is given by the square root of the mean of the squared deviations about the mean, i.e. the square root of the second order moment about the mean. The standard deviation is calculated for the data along all the three axes and RMS value is taken.

3. Coefficient of Variation- It is the ratio of the standard deviation to the mean. It indicates the relative variability. The coefficient of variation of the acceleration data along all the three axes is calculated and RMS value is taken.

4. Skewness- It is based on the third order moment about the mean. It measures the degree of asymmetry of the data. The skewness of the acceleration data along all the three axes is calculated and RMS value is taken.

5. Kurtosis- It is based on the fourth order moment about the mean. It measures the tailedness of the data. The kurtosis of the acceleration data along all the three axes is calculated and RMS value is taken.

6. Zero Crossing Rate(ZCR)- This is a measure of the rate of sign change by the signal from positive to negative or vice-versa. The ZCR of the acceleration data along all the three axes is calculated and RMS value is taken.

7. Data below 25 and 75 percentile- This gives a measure of the data below 25 and 75 percentile respectively. The squared sum of data below 25 percentile and 75 percentile are calculated and used as features.

8. Peak Frequency Component below 2 Hz- The peak frequency below 2 Hz in the spectrum of the data along the y-axis is calculated.

9. Number of peaks in y-axis data below 2 Hz- The number of peaks in the spectrum of data along the y-axis is calculated.

10. Lag One Auto-Correlation- This measures the auto-correlation with unity lag. The Lag One Autocorrelation for the data along all the axes is calculated and the RMS value of the same is taken.

11. Continuous Wavelet Transform (CWT)- It is used to decompose a continuous-time function into wavelets. We computed the mean of the means and the variance of each wavelet. The CWT mean of means and variance of the acceleration data along all the three axes are calculated and RMS value is taken.
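
Several of the features listed above can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's code: population formulas are used, skewness and kurtosis are standardised by the standard deviation, "RMS value is taken" is read as the root mean square of the three per-axis values, and all function names are hypothetical.

```python
import math

def central_moment(x, order):
    """Population central moment of the given order."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)

def axis_features(x):
    """Statistics of one axis over one window (needs at least two samples)."""
    n = len(x)
    mean = sum(x) / n
    var = central_moment(x, 2)
    std = math.sqrt(var)
    # Standardised moments are undefined for a constant window; 0.0 is a placeholder.
    skew = central_moment(x, 3) / std ** 3 if std > 0 else 0.0
    kurt = central_moment(x, 4) / std ** 4 if std > 0 else 0.0
    cv = std / mean if mean != 0 else 0.0
    # Fraction of consecutive sample pairs whose signs differ.
    zcr = sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0)) / (n - 1)
    # Lag-one autocorrelation of the mean-removed window.
    if var > 0:
        lag1 = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:])) / (n * var)
    else:
        lag1 = 0.0
    return {"mean": mean, "std": std, "cv": cv, "skewness": skew,
            "kurtosis": kurt, "zcr": zcr, "lag1_autocorr": lag1}

def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def window_features(ax, ay, az):
    """Combine the per-axis statistics of one window by RMS across the axes."""
    per_axis = [axis_features(axis) for axis in (ax, ay, az)]
    return {name: rms([f[name] for f in per_axis]) for name in per_axis[0]}
```

One consequence of combining by RMS is that the sign of a per-axis value (for example a negative skewness) is lost in the combined feature.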


Step 3 : Training the Classifier, testing and finding accuracy.

The classifier is trained using the features described above and then tested on a sequence of test data taken from the accelerometer.
3. Experiments & Results
3.1 Dataset Description
Data is collected from an ADXL335, a tri-axial accelerometer worn on the waist of the user. The data set is obtained by recording 9 sets of data performed by a number of individuals. Each set lasts 7.5 minutes, corresponding to 2.5 minutes for each activity. The activities are performed in succession, i.e. running, then cycling, followed by standing. The total duration of data for each activity is therefore [(2.5 minutes) * 9] = 22.5 minutes. The sampling rate used is 2 Hz, giving a total of [22.5 * (60 seconds) * (2 samples/second)] = 2700 samples for each activity.
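
The arithmetic above can be checked with a few lines. The constant names are illustrative; the values are the ones stated in this section.

```python
N_SETS = 9
MINUTES_PER_ACTIVITY_PER_SET = 2.5
N_ACTIVITIES = 3
SAMPLING_RATE_HZ = 2

# Duration of one recorded set: 3 activities of 2.5 minutes each.
set_duration_minutes = N_ACTIVITIES * MINUTES_PER_ACTIVITY_PER_SET
# Total recorded time per activity over all 9 sets.
minutes_per_activity = N_SETS * MINUTES_PER_ACTIVITY_PER_SET
# Samples per activity at 2 samples/second.
samples_per_activity = int(minutes_per_activity * 60 * SAMPLING_RATE_HZ)
```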
Acceleration data corresponding to the three axes are therefore collected. This data is segregated into different categories i.e. Running, Cycling and Standing and the data set is created.
3.2 Discussion
The following classifiers are used in our Project :-

1. Logistic Regression Model -

This is a classification technique which fits a linear model to the feature space and takes a probabilistic view of classification. The model involves a vector β in the d-dimensional feature space. Points in feature space are projected onto β, giving a real number that can lie anywhere on the real line. This real number is then mapped to a value in the range 0 to 1 using the standard logistic function. The prediction of this model can be treated as a probability of class membership. By applying a threshold to the probability, class assignment can be done. The threshold represents the decision boundary in the feature space. β is optimized to give the best possible reproduction of the training set labels. This is usually done by numerical approximation of maximum likelihood. On really large datasets, stochastic gradient descent may be used. This model has a number of advantages :-

1. Makes no assumptions about distributions of classes in feature space.
2. Easily extended to multiple classes (multinomial regression).
3. Natural probabilistic view of class predictions.
4. Quick to train.
5. Very fast at classifying unknown records.
6. Good accuracy for many simple data sets.
7. Resistant to overfitting.
8. Can interpret model coefficients as indicators of feature importance.
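
The prediction step of the logistic regression model described above can be sketched as follows. Only prediction is shown, not the maximum-likelihood fitting; the separate `bias` term, the default threshold of 0.5 and the function names are assumptions for illustration.

```python
import math

def sigmoid(z):
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(beta, bias, x):
    """Project the feature vector x onto beta and map the score to (0, 1)."""
    score = bias + sum(b * v for b, v in zip(beta, x))
    return sigmoid(score)

def predict(beta, bias, x, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return 1 if predict_proba(beta, bias, x) >= threshold else 0
```

With a threshold of 0.5 the decision boundary is the set of points whose projected score is exactly zero.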

When used on the data set at hand, it gives a training accuracy of 98.33% and a test accuracy of 97.41%.

2. Multinomial Regression -

Softmax regression (or multinomial logistic regression) is a generalization of logistic regression to the case where we want to handle multiple classes. In logistic regression the labels are binary, whereas here there are more than two labels (three in our case, i.e. Running, Cycling and Standing). In the softmax regression setting, the label can take on K different values (K being the number of classes), rather than only two. We want to estimate the probability of the class label taking on each of the K possible values. Thus, our hypothesis outputs a K-dimensional vector (whose elements sum to 1) giving us our K estimated probabilities. In the special case where K = 2, one can show that softmax regression reduces to logistic regression, which shows that softmax regression is a generalization of logistic regression.
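
A minimal sketch of the softmax hypothesis is given below. The weights and biases are placeholders that a trained model would supply, and the function names and label order are assumptions.

```python
import math

def softmax(scores):
    """Map K real-valued scores to K probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(weights, biases, x, labels=("Standing", "Running", "Cycling")):
    """Score x against one weight vector per class and pick the most probable."""
    scores = [b + sum(w * v for w, v in zip(row, x))
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs
```

For K = 2, `softmax([z, 0.0])[0]` equals the logistic function of `z`, which is the reduction to logistic regression mentioned above.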

When used on the data set at hand, it gives a training accuracy of 98.61% and a test accuracy of 98.06%.

3. Support Vector Machine -

Support Vector Machines fall under the category of supervised learning models that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic linear classifier (although methods such as Platt Scaling exist to use SVM in a probabilistic classification setting). An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall. The other SVM classifiers used are Polynomial SVM and Gaussian (RBF) SVM, which differ in that the kernels used are polynomial and Gaussian respectively.
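
The three kernels and the resulting decision score can be sketched as follows. The hyperparameters (`degree`, `gamma`, `coef0`) and function names are illustrative assumptions, since the report does not state the values used, and the support vectors and coefficients would come from training. The score shown is for a two-class problem; the report does not say which scheme was used to extend it to three classes.

```python
import math

def linear_kernel(a, b):
    """Plain dot product, as used by the linear SVM."""
    return sum(p * q for p, q in zip(a, b))

def polynomial_kernel(a, b, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel (gamma * <a, b> + coef0) ** degree."""
    return (gamma * linear_kernel(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def svm_decision(x, support_vectors, dual_coefs, bias, kernel):
    """Signed score of x; each dual coefficient is alpha_i * y_i.

    A positive score places x on one side of the gap, a negative score on the other.
    """
    return bias + sum(c * kernel(sv, x)
                      for sv, c in zip(support_vectors, dual_coefs))
```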

# Plots with different features mapped on 3 axes for Standing, Running and Cycling.

Standing corresponds to Red points.
Running corresponds to Green points.
Cycling corresponds to Black points.

In each case :-
1st Feature is mapped in X direction.
2nd Feature is mapped in Y direction.
3rd Feature is mapped in Z direction.

Figure: Plot of Mean, Standard Deviation and Coefficient of Variation.
Figure: Plot of Crest Factor, Average and Squared Sum of Data below 25 Percentile.
Figure: Plot of Squared Sum of Data below 25 Percentile, Peak Frequency and Number of peaks in Spectrum of Y-axis data below 2 Hz.
Figure: Plot of Lag One Auto-Correlation, Skewness and Kurtosis.
Figure: Plot of Zero Crossing Rate, Continuous Wavelet Transform Mean and Continuous Wavelet Transform Variance.

Results from various classifiers:-

#Logistic Regression Train Accuracy :: 0.983333333333
#Logistic Regression Test Accuracy :: 0.974110032362

#Multinomial Logistic Regression Train Accuracy :: 0.986111111111
#Multinomial Logistic Regression Test Accuracy :: 0.980582524272

#Linear SVM Train Accuracy :: 0.988888888889
#Linear SVM Test Accuracy :: 0.980582524272

#RBF SVM Train Accuracy :: 0.991666666667
#RBF SVM Test Accuracy :: 0.961165048544

#Polynomial SVM Train Accuracy :: 0.994444444444
#Polynomial SVM Test Accuracy :: 0.987055016181

All the classifiers have performed reasonably well and are showing a high level of accuracy. Polynomial SVM performed the best on both the training and testing dataset.
4. Conclusions
4.1 Summary
The classifiers that we have used are Logistic Regression, Multinomial Regression and SVM, trained on the features previously mentioned, i.e. Mean, Standard Deviation, Skewness, Kurtosis, Coefficient of Variation, Crest Factor, Zero Crossing Rate, 25 Percentile & 75 Percentile, Data below 25 and 75 percentile, Peak Frequency Component below 2 Hz, Number of peaks in y-axis data below 2 Hz and Lag One Autocorrelation. The Polynomial SVM classifier performed best on both the training and test data, while the other classifiers, i.e. Logistic Regression, Multinomial Regression and the Linear and RBF SVMs, performed reasonably well.

The methodology used classifies activities performed by different individuals at different speeds and in different styles with high accuracy. The system developed thus provides a foundation for a more robust system that will require minimal training of the users and produce the fewest errors due to orientation and positioning offsets of the accelerometer. It remains to extend the system to classify more activities performed in real time and to test it on more subjects. Given the high accuracy achieved, it also becomes worthwhile to deploy the system for real-time use.
4.2 Future Extensions
1. More efficient classification algorithms are to be investigated to obtain better accuracy. Being able to perform the classification in real time would be a great asset. Thus, appropriate feature vectors and algorithms have to be explored for segmenting the data and classifying these segments into different activities.

2. The sampling rate used in the present approach is quite low, which in itself complicates the extraction of features. Hence, as a future extension, it is worth investigating the ideal sampling rate so that the data obtained does not become redundant while making it easier to select and extract features.

3. Since an individual performs many more activities than just the three considered, it is worthwhile to investigate methods for classifying a larger number of activities.
4.3 References
1. A Study on Human Activity Recognition Using Accelerometer Data by Akram Bayat, Marc Pomplun, Duc A. Tran
2. Feature Selection and Activity Recognition by Piyush Gupta and Tim Dallas
3. Physical Activity Recognition from Accelerometer by Yonglei Zheng, Weng-Keen Wong, Xinze Guan