
Smart Surveillance for Smart City

EasyChair Preprint no. 3519, version 2

6 pages
Date: September 25, 2020


In recent years, video surveillance technology has become pervasive in every sphere. Manual description of videos requires significant time and labor, and important aspects of videos are sometimes overlooked in human summaries. The present work is an attempt toward automated description generation for surveillance video.
The proposed method consists of extracting key-frames from a surveillance video, detecting objects in the key-frames, generating natural language (English) descriptions of the key-frames, and finally summarizing the descriptions.
The key-frames are identified based on a mean square error ratio. Object detection in a key-frame is performed using a Region-based Convolutional Neural Network (R-CNN). We use Long Short-Term Memory (LSTM) networks to generate captions from the frames. Translation Error Rate (TER) is used to identify and remove duplicate event descriptions.
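The key-frame selection step can be sketched as a simple frame-differencing loop: a frame is kept when its mean square error against the previously selected key-frame is large enough. This is a minimal illustration only; the grayscale representation and the threshold value are assumptions, as the abstract does not specify the exact ratio criterion used.

```python
import numpy as np

def mse(frame_a, frame_b):
    """Mean squared error between two equal-sized grayscale frames."""
    diff = frame_a.astype(np.float64) - frame_b.astype(np.float64)
    return np.mean(diff ** 2)

def extract_key_frames(frames, threshold=100.0):
    """Keep a frame as a key-frame when its MSE relative to the last
    selected key-frame exceeds `threshold` (a hypothetical tuning
    parameter, not taken from the paper)."""
    if not frames:
        return []
    key_frames = [frames[0]]  # the first frame always starts a new scene
    for frame in frames[1:]:
        if mse(key_frames[-1], frame) > threshold:
            key_frames.append(frame)
    return key_frames
```

In practice the threshold would be tuned per camera, since lighting changes and sensor noise shift the MSE baseline between otherwise identical frames.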
TF-IDF is used to rank the event descriptions generated from a video, and the top-ranked description is returned as the system-generated summary of the video.
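The ranking step can be illustrated by scoring each description as the sum of the TF-IDF weights of its terms and returning descriptions in descending score order. This is a minimal sketch assuming whitespace tokenization and a standard log-IDF weighting; the paper's exact scheme is not specified in the abstract.

```python
import math
from collections import Counter

def tfidf_rank(descriptions):
    """Rank event descriptions by the sum of the TF-IDF weights of
    their terms (highest score first). A minimal sketch only."""
    docs = [d.lower().split() for d in descriptions]
    n = len(docs)
    # Document frequency: in how many descriptions each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scored = []
    for i, doc in enumerate(docs):
        tf = Counter(doc)
        score = sum((tf[t] / len(doc)) * math.log(n / df[t]) for t in tf)
        scored.append((score, i))
    scored.sort(reverse=True)
    return [descriptions[i] for _, i in scored]
```

Descriptions made of terms shared by every candidate score near zero, so the top-ranked summary is the one carrying the most distinctive event content.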
We evaluated our proposed approach on the MSVD dataset, and the system produces a Bilingual Evaluation Understudy (BLEU) score of 46.83.

Keyphrases: Content-Based Video Retrieval, image frame, key frame, key frame extraction, Microsoft Video Description, object detection, pattern recognition, real-time object detection, Smart City, Smart Surveillance, video description corpus, video summarization

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:3519,
  author = {Atanu Mandal and Amir Sinaeepourfard and Sudip Kumar Naskar},
  title = {Smart Surveillance for Smart City},
  howpublished = {EasyChair Preprint no. 3519},
  year = {EasyChair, 2020}}