https://github.com/TwentyBN/sense
[sense_logo]

State-of-the-art Real-time Action Recognition

---------------------------------------------------------------------

Website * Blogpost * Getting Started * Build Your Own Classifier * iOS Deployment * Datasets * SDK License

Documentation | GitHub | GitHub release | Contributor Covenant

---------------------------------------------------------------------

sense is an inference engine to serve powerful neural networks for action recognition, with a low computational footprint. In this repository, we provide:

* Two models out-of-the-box, pre-trained on millions of videos of humans performing actions in front of, and interacting with, a camera. Both neural networks are small, efficient, and run smoothly in real time on a CPU.
* Demo applications showcasing the potential of our models: gesture recognition, fitness activity tracking, and live calorie estimation.
* A pipeline to record and annotate your own video dataset and train a custom classifier on top of our models, with an easy-to-use script to fine-tune our weights.

Gesture Recognition

[gesture_recognition_1] [gesture_recognition_2]

(full video can be found here)

Fitness Activity Tracker and Calorie Estimation

[fitness_tracking_1] [fitness_tracking_2]

(full video can be found here)

---------------------------------------------------------------------

Requirements and Installation

The following steps are confirmed to work on Linux (Ubuntu 18.04 LTS and 20.04 LTS) and macOS (Catalina 10.15.7).

Step 1: Clone the repository

To begin, clone this repository to a local directory of your choice:

git clone https://github.com/TwentyBN/sense.git
cd sense

Step 2: Install Dependencies

We recommend creating a new virtual environment to install our dependencies, using conda or virtualenv. The following instructions will help create a conda environment:

conda create -y -n sense python=3.6
conda activate sense

Install the Python dependencies:

pip install -r requirements.txt

Note: pip install -r requirements.txt only installs the CPU-only version of PyTorch. To run inference on your GPU, another version of PyTorch should be installed. For instance:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

See all available options here.

Step 3: Download the Pre-trained Weights

Pre-trained weights can be downloaded from here. Follow the instructions there to create an account and download the weights. Once downloaded, unzip the archive and move the folder named backbone into sense/resources. In the end, your resources folder structure should look like this:

resources
+-- backbone
|   +-- strided_inflated_efficientnet.ckpt
|   +-- strided_inflated_mobilenet.ckpt
+-- fitness_activity_recognition
|   +-- ...
+-- gesture_detection
|   +-- ...
+-- ...

Note: The remaining folders in resources/ already contain the necessary files -- only resources/backbone needs to be downloaded separately.
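Before moving on, a quick check like the one below can confirm that PyTorch is importable, whether a GPU is visible (see the note in Step 2), and that both backbone checkpoints are in place. This is a minimal convenience sketch, not part of the repository; the file names come from the tree above, and the path assumes you run it from the repository root.

```python
# check_setup.py -- hypothetical helper, not part of the repository.
from pathlib import Path

import torch

# Report which device PyTorch will be able to use (see the GPU note in Step 2).
print(f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")

# Checkpoint files expected under resources/backbone (names taken from the tree above);
# adjust the path if your checkout places the resources folder elsewhere.
backbone_dir = Path("resources/backbone")
for name in ("strided_inflated_efficientnet.ckpt", "strided_inflated_mobilenet.ckpt"):
    path = backbone_dir / name
    status = "OK" if path.is_file() else "MISSING"
    print(f"{status}: {path}")
```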
---------------------------------------------------------------------

Getting Started

To get started, try out the demos we've provided. Inside the sense/examples directory, you will find three Python scripts: run_gesture_recognition.py, run_fitness_tracker.py, and run_calorie_estimation.py. Launching each demo is as simple as running the script in a terminal, as described below.

Demo 1: Gesture Recognition

examples/run_gesture_recognition.py applies our pre-trained models to hand gesture recognition. 30 gestures are supported (see the full list here).

Usage:

PYTHONPATH=./ python examples/run_gesture_recognition.py

Demo 2: Fitness Activity Tracking

examples/run_fitness_tracker.py applies our pre-trained models to real-time fitness activity recognition and calorie estimation. In total, 80 different fitness exercises are recognized (see the full list here).

Usage:

PYTHONPATH=./ python examples/run_fitness_tracker.py --weight=65 --age=30 --height=170 --gender=female

Weight, age, and height should be given in kilograms, years, and centimeters, respectively. If not provided, default values will be used.

Some additional arguments can be used to change the streaming source:

  --camera_id=CAMERA_ID   ID of the camera to stream from
  --path_in=FILENAME      Video file to stream from. This assumes that the video was encoded at 16 fps.

It is also possible to save the display window to a video file using:

  --path_out=FILENAME     Video file to stream to

For the best performance, the following is recommended:

* Place your camera on the floor, angled upwards, with a small portion of the floor visible
* Ensure your body is fully visible (head to toe)
* Try to be in a simple environment with a clean background

Demo 3: Calorie Estimation

In order to estimate burned calories, we trained a neural net to convert activity features to the corresponding MET value. We then post-process these MET values (see the correction and aggregation steps performed here) and convert them to calories using the user's weight.

If you're only interested in the calorie estimation part, you might want to use examples/run_calorie_estimation.py, which has a slightly more detailed display (see the video here, which compares two videos produced by that script).

Usage:

PYTHONPATH=./ python examples/run_calorie_estimation.py --weight=65 --age=30 --height=170 --gender=female

The calorie estimates are roughly in the range produced by wearable devices, though they have not been verified for accuracy. From our experiments, the estimates correlate well with workout intensity (intense workouts burn more calories), so, regardless of absolute accuracy, it should be fair to use this metric to compare one workout to another.
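For intuition on the conversion step, the standard MET formula relates energy to body weight and duration: calories burned ≈ MET × weight (kg) × duration (hours). The sketch below illustrates that textbook relationship only; the repository applies additional correction and aggregation steps to the predicted MET values, so its output will differ.

```python
# Illustrative only: the standard MET-to-calories relationship,
# not the post-processing performed by the repository.
def calories_burned(met: float, weight_kg: float, duration_hours: float) -> float:
    """Approximate energy expenditure: 1 MET ~ 1 kcal per kg of body weight per hour."""
    return met * weight_kg * duration_hours

# Example: 10 minutes of an activity predicted at 6 METs for a 65 kg user (~65 kcal).
print(calories_burned(met=6.0, weight_kg=65.0, duration_hours=10 / 60))
```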
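More generally, the streaming options shown earlier (--camera_id, --path_in, --path_out) map onto the usual OpenCV capture-and-write pattern. The sketch below shows that generic pattern only and is not the repository's implementation; as noted above, input files are assumed to be encoded at 16 fps.

```python
# Generic OpenCV streaming loop (an illustration, not the repository's code).
import cv2

source = 0  # a camera ID such as 0, or the path to a video file encoded at 16 fps
cap = cv2.VideoCapture(source)
if not cap.isOpened():
    raise RuntimeError(f"Could not open video source: {source}")

# Optionally write the displayed frames to a file, similar to what --path_out does.
writer = cv2.VideoWriter(
    "output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 16.0,
    (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))),
)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # ... run inference on `frame` and draw the results here ...
    writer.write(frame)
    cv2.imshow("demo", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
writer.release()
cv2.destroyAllWindows()
```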
---------------------------------------------------------------------

Build Your Own Classifier

This section describes how you can build your own custom classifier on top of our models. Our models serve as a powerful feature extractor, which reduces the amount of data you need to build your project.

Step 1: Data preparation

First, run the tools/sense_studio/sense_studio.py script and open http://127.0.0.1:5000/ in your browser. There you can set up a new project in a location of your choice and specify the classes that you want to collect. The tool will prepare the following file structure for your project, so that you can place the recorded videos in the corresponding folders:

/path/to/your/dataset/
+-- videos_train
|   +-- class1
|   |   +-- video1.mp4
|   |   +-- video2.mp4
|   |   +-- ...
|   +-- class2
|   |   +-- video3.mp4
|   |   +-- video4.mp4
|   |   +-- ...
|   +-- ...
+-- videos_valid
|   +-- class1
|   |   +-- video5.mp4
|   |   +-- video6.mp4
|   |   +-- ...
|   +-- class2
|   |   +-- video7.mp4
|   |   +-- video8.mp4
|   |   +-- ...
|   +-- ...
+-- project_config.json

* Two top-level folders: one for the training data, one for the validation data.
* One sub-folder for each class, with as many videos as you want (but at least one!).
* Requirement: videos should have a frame rate of 16 fps or higher.

In some cases, as few as 2-5 videos per class have been enough to achieve excellent performance!
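Before training, a short script like the one below can verify that every class folder contains at least one video and that the frame rates meet the 16 fps requirement. It is a hypothetical helper, not part of the repository, and assumes OpenCV (opencv-python) is available.

```python
# check_dataset.py -- hypothetical helper, not part of the repository.
from pathlib import Path

import cv2  # assumes opencv-python is installed

DATASET = Path("/path/to/your/dataset")  # same path you will pass to --path_in

for split in ("videos_train", "videos_valid"):
    for class_dir in sorted((DATASET / split).iterdir()):
        if not class_dir.is_dir():
            continue
        videos = sorted(class_dir.glob("*.mp4"))
        print(f"{split}/{class_dir.name}: {len(videos)} video(s)")
        for video in videos:
            cap = cv2.VideoCapture(str(video))
            fps = cap.get(cv2.CAP_PROP_FPS)
            cap.release()
            if fps < 16:
                print(f"  WARNING: {video.name} is encoded at {fps:.1f} fps (below 16 fps)")
```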
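The training script in the next step accepts a --num_layers_to_finetune option, which controls how much of the pre-trained backbone is updated in addition to the new classification head. The following is only a generic PyTorch illustration of that idea, not the repository's training code; the backbone and layer granularity here are made up for the example.

```python
import torch
import torch.nn as nn

# Stand-in backbone: in practice this would be one of the pre-trained sense models.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)
num_layers_to_finetune = 2  # analogous in spirit to --num_layers_to_finetune

# Freeze everything, then unfreeze only the last few modules of the backbone.
for param in backbone.parameters():
    param.requires_grad = False
for module in list(backbone.children())[-num_layers_to_finetune:]:
    for param in module.parameters():
        param.requires_grad = True

# New classification head, trained from scratch on your custom classes.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 5))
model = nn.Sequential(backbone, head)

# Only the trainable parameters are handed to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```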
Step 2: Training

Once your data is prepared, run this command to train a customized classifier on top of one of our feature extractors:

PYTHONPATH=./ python tools/train_classifier.py --path_in=/path/to/your/dataset/ [--use_gpu] [--num_layers_to_finetune=9]

Step 3: Running your model

The training script should produce a checkpoint file called classifier.checkpoint at the root of the dataset folder. You can now run it live using the following script:

PYTHONPATH=./ python tools/run_custom_classifier.py --custom_classifier=/path/to/your/dataset/ [--use_gpu]

---------------------------------------------------------------------

Advanced Options

You can further improve your model's performance by training on top of temporally annotated data: frames that are individually tagged so that the event is localized within the video, rather than every frame carrying the same label. For instructions on how to prepare your data with temporal annotations, refer to this page.

After preparing your dataset with our temporal annotations tool, pass --temporal_training as an additional flag to the train_classifier.py script.

---------------------------------------------------------------------

iOS Deployment

If you're interested in mobile app development and want to run our models on iOS devices, please check out sense-iOS for step-by-step instructions on how to get our gesture demo running on an iOS device. One of the steps involves converting our PyTorch models to the TensorFlow Lite format.

Conversion to TensorFlow Lite

Our models can be converted to TensorFlow Lite using the following script:

python tools/conversion/convert_to_tflite.py --backbone=efficientnet --classifier=efficient_net_gesture_control --output_name=model

If you want to convert a custom classifier, set the classifier name to "custom_classifier" and provide the path to the dataset directory used to train the classifier via the --path_in argument:

python tools/conversion/convert_to_tflite.py --backbone=efficientnet --classifier=custom_classifier --path_in=/path/to/your/dataset/ --output_name=model

---------------------------------------------------------------------

Citation

We now have a blogpost you can cite:

@misc{sense2020blogpost,
    author = {Guillaume Berger and Antoine Mercier and Florian Letsch and Cornelius Boehm and
              Sunny Panchal and Nahua Kang and Mark Todorovich and Ingo Bax and Roland Memisevic},
    title = {Towards situated visual AI via end-to-end learning on video clips},
    howpublished = {\url{https://medium.com/twentybn/towards-situated-visual-ai-via-end-to-end-learning-on-video-clips-2832bd9d519f}},
    note = {online; accessed 23 October 2020},
    year = {2020},
}

---------------------------------------------------------------------

License

The code is copyright (c) 2020 Twenty Billion Neurons GmbH under an MIT License. See the file LICENSE for details. Note that this license only covers the source code of this repository; pre-trained weights come with a separate license available here.

The code makes use of these sounds from freesound:

* "countdown_sound.wav" from "milton." licensed under CC0 1.0
* "done_sound.wav" and "exit_sound.wav" from "paep3nguin" licensed under CC0 1.0