Wednesday, April 3, 2019
Techniques for Understanding Human Walking Motion
Multimedia is a term that collectively describes a variety of media content available as text, speech, audio, still images, video, animation, graphics, 3D models and combinations of these, used to capture real-time moments. Over recent years, technological advances have enabled wide availability of and easy access to multimedia content, and much research has been dedicated to performing automated computational tasks for a wide spectrum of applications such as surveillance, crime investigation, fashion and design, traditional aerospace, publishing and advertising, medical applications and virtual reality, to name a few. The volume of multimedia information is now so large that improving the tasks of representing, analyzing, searching and retrieving it has become the need of the hour. Among all the available types of media, video is one of the most prominent forms and is widely used for analyzing multimedia content.

Several types of videos can be captured by various recording devices, but even the most suitable devices for acquiring videos have to deal with two major problems: the sensory gap and the semantic gap. The sensory gap is the difference between the real world and its representation. Smeulders et al. (2000) define it as the gap between the object in the world and the information in a (computational) description derived from a recording of that scene. The semantic gap is the difference between the behavior description given by humans and the computational model used by human activity/behavior analysis systems.
The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation (Smeulders et al., 2000). Many researchers have proposed building computational models of the human visual system that represent reality as closely as possible. A major development was the framework proposed by David Marr at MIT, who used a bottom-up approach to scene understanding (Marr, 1982). Various state-of-the-art methods have evolved since, but the technology that helps people integrate multimedia content for meaningful expression is still lagging behind.

Within the domain of multimedia content analysis, computer vision methods and algorithms have been used as a foundation, and the coupled relation between multimedia analysis and computer vision is a well-known challenge. Currently, one of the most active research topics is human movement analysis. The many types of activities performed by humans can be captured by various recording devices, and human movement analysis systems have been built with respect to the context of their applications. The aim of a human movement analysis system is to automatically take input video sequences and transform them into a semantic interpretation. The recognition of human activities has been studied in computer vision for quite some time, but it remains far behind the capabilities of human vision.
In the human visual system, when a person's movement is observed, the brain recognizes that person's action by analyzing the transitions between the postures adopted, or interprets behavior by tracking those transitions and noting the intent of the action. This analysis is complex for computer vision systems. Since the human body is non-rigid, deformable and articulated, a person can adopt a variety of postures over time. Work on human activity analysis has not yet provided satisfactory results.

To solve problems relating to human movement analysis using videos, the paradigm of data fusion is recommended. Multimedia data fusion is a way to integrate multiple media, their associated features, or intermediate decisions in order to perform an analysis task. According to Dasarathy (2001), data fusion is a formal framework in which are expressed the means and tools for combining data originating from different sources, for the exploitation of their synergy, in order to obtain information whose quality cannot be achieved otherwise. The existing literature contains several contributions on data fusion techniques for multisensory environments and on multimodal fusion, with the aim of fusing and aggregating data obtained from multiple sources. Video data is characteristically multimodal, and combining the information gathered from multiple modalities is a valid approach to increasing accuracy (Atrey et al., 2010). Multimedia fusion is useful for several tasks such as detection, recognition, identification and tracking, and for a wide range of applications.

This research work presents multimedia analysis in combination with computer vision and data fusion perspectives to understand human walking motion in video sequences. This kind of research is challenging.

Motivation

From the point of view of data fusion, this research work is motivated by the observation that all living organisms have the capability to use multiple senses to learn about the environment, after which the brain fuses all the information to perform a decision task. A human observer can easily and instantly recognize an action. However, the main limitations of the human visual sense are the limited range of visual perception and the limitations and compromises of the human brain. Automatic systems, in contrast, can work 24 hours a day, 7 days a week, allowing accurate detection, and they cost less to maintain.

On the other hand, from the point of view of computer vision, algorithms and techniques have yet to improve in performance for analyzing human motion in videos. Computer vision systems are far behind the capabilities of human vision and have to deal with two important problems: the sensory gap and the semantic gap.
The sensory gap is the difference between the real world and its representation, and the semantic gap is the difference between the behavior description given by human vision and the computational model used by human activity/behavior analysis systems.

A promising strategy is to integrate different data fusion and computer vision techniques in a unified framework, to enhance the performance of the tasks associated with analyzing human walking motion and to overcome these drawbacks.

1.3 The Goal

The aim of this research work is to conduct a detailed investigation of currently available tools and techniques for understanding human walking motion, and to develop a generic framework in which data fusion and computer vision perspectives are used to analyze human walking actions in the context of real-life applications. During the fusion process, correlations between activities and patterns of activities can be detected in order to predict intent. Finally, performance will be evaluated in terms of true positives, false positives and misclassifications.

Summary of contributions

Our work in the thesis is focused on the following significant contributions:
Design of a unified framework that combines data fusion and computer vision methodology to improve the performance of automatic analysis of human movements in videos.
Detection of moving humans, and related sub-problems, in video frames using unsupervised techniques.
An efficient technique for handling occlusion in the task of tracking walking humans.
A new strategy for correlation and prediction during the detection and tracking of humans.
Detection and interpretation of perspective changes in walking movements.

1.5 Outline

The thesis is organized as follows:
Chapter 2 presents background and a review of related literature on existing strategies and approaches in data fusion and computer vision, while providing motivation for the approaches proposed in this thesis.
Chapter 3 provides a detailed explanation of the unified framework. It shows how the framework helps accomplish the tasks of analysis in multimedia content for correlation and prediction, along with a comparison of the proposed framework with the JDL and Dasarathy data fusion models.
Chapter 4 presents an overview of state-of-the-art methods for the detection of humans in videos, the proposed novel work, the experiments and the evaluations.
Chapter 5 presents an overview of state-of-the-art methods for the tracking of humans in videos, the proposed novel work, the experiments and the evaluations.
Chapter 6 covers the automatic interpretation of stance changes in human walking.
Chapter 7 discusses conclusions, future directions and related open issues.

References

Smeulders, A. W. M., Worring, M., Santini, S., Gupta, A., and Jain, R. (2000). Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1349-1380.
Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. Freeman, San Francisco.
Dasarathy, B. V. (2001). Information fusion: what, where, why, when, and how? Information Fusion, 2, 75-76.
Atrey, P. K., Hossain, M. A., El Saddik, A., and Kankanhalli, M. S. (2010). Multimodal fusion for multimedia analysis: a survey. Multimedia Systems, 16(6), 345-379.