AVA: A Finely Labeled Video Dataset for Human Action Understanding

research.googleblog.com

Are you interested in multimodal applications but bored with image captioning? Then try fine-grained action understanding in video with Atomic Visual Actions (AVA). AVA is a new dataset that provides multiple action labels for each person in extended video sequences. It consists of URLs to publicly available YouTube videos, annotated with a vocabulary of 80 atomic actions (e.g., "walk", "kick (an object)", "shake hands"), for a total of 210k action labels.
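As a quick sketch of what "multiple action labels per person" looks like in practice, the snippet below groups labels by person and keyframe. It assumes an AVA-style annotation CSV with rows of the form (video_id, timestamp, box coordinates, action_id, person_id); the file name and exact column layout here are illustrative assumptions, not the official release format.

```python
import csv
from collections import defaultdict

# Hypothetical annotation file; the column layout below is an assumption
# about an AVA-style CSV (video_id, timestamp, x1, y1, x2, y2, action_id, person_id).
ANNOTATION_CSV = "ava_train.csv"


def load_labels_per_person(path):
    """Group atomic-action labels by (video_id, timestamp, person_id)."""
    labels = defaultdict(set)
    with open(path, newline="") as f:
        for row in csv.reader(f):
            video_id, timestamp, x1, y1, x2, y2, action_id, person_id = row
            key = (video_id, float(timestamp), int(person_id))
            labels[key].add(int(action_id))
    return labels


if __name__ == "__main__":
    labels = load_labels_per_person(ANNOTATION_CSV)
    # Each person in each keyframe may carry several atomic actions at once,
    # e.g. "walk" together with "talk to (a person)".
    for (video_id, ts, person_id), action_ids in list(labels.items())[:5]:
        print(video_id, ts, person_id, sorted(action_ids))
```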

Read more...