SEDE 2020: Papers with Abstracts

Papers
Cloud Applications Adoption: User Study Survey From Industry and Academia Jalal Kiswani, Sergiu Dascalu and Frederick Harris Abstract. Cloud computing has changed the ways software systems are developed, deployed, and managed. Even though cloud computing could reduce cost, time to market, and increase the scalability of applications and systems, there are some debates about the actual adoption of cloud computing in general and cloud applications in particular by organizations from both, industry and academia, and whether stakeholders of organizations look for them as opportunities or risks. To answer these debates, we conducted a user study that included professionals from industry and academia, with various experience levels and job-roles, to be able to get a common understanding of the real value of cloud computing. In particular, the study included questions about the preferred approach of developing green-field applications and the availability of resources and expertise. In addition, it included questions about the cost of development, maintainability, and operations of cloud applications. The results of the user study are presented and discussed, and we believe that it is promising and could be used as the base of future directions of research and technology innovation.
Controls and Novelty on Digital Platforms: Two Case Studies Onkar Malgonde and Alan Hevner Abstract. Software system projects face challenges to rapidly meet user requirements while adding novel values to the application domain. Value appropriation focuses on exploiting existing knowledge to develop software that meets market requirements. Value creation focuses on exploring the solution space to innovate and attract new customers. In this pilot research, we study the tension between software project controls and the goal of novelty in the software product. Two case studies provide preliminary evidence that a well-balanced portfolio of controls can result in the design and implementation of novel product features. We position the case studies in the context of digital platforms to bound our definitions of control mechanisms and novelty. Based on analysis of data collected from two case studies, we find that formal and informal control modes can positively influence novelty in software applications on digital platforms. We conclude with a discussion on implications for software development and future research directions.
Cognitive Informatics in Emergency Disaster Management Systems Ajay Bandi and Aziz Fellah Abstract. In this research, we investigated the dynamic assignment of resources in emergency and disaster management systems that assists rescuers and responding agencies in effective real-time coordination. We also proposed a communication framework architecture, a com- mon operating picture that keeps all communication activities among various stakeholders and agencies to manage emergency and disaster responses. This spectrum of activities is achieved through a comprehensive analytical emergency disaster management system that fetches locations using Google APIs geospatial data and infrastructures such as Google Maps. Also, we conducted usability testing by applying cognitive informatics principles to our emergency model. Such a model provides several services across several types of disasters and various locations such as monitoring emergencies and disasters, describing incidents, performing triage, accessing databases, analyzing the specific need of rescuers, and providing assistance within the professional role and jurisdiction status of every stake- holder member. We also found a few design issues in selecting the type of the stakeholder, limiting the data accesses to a few stakeholders, and filtering the historical data by triage codes.
Detection of unknown galaxy types in large databases of galaxy images Venkata Siva Kumar Margapuri, Basant Thapa and Lior Shamir Abstract. Modern digital sky surveys utilize robotic telescopes that collect extremely large multi- PB astronomical databases. While these databases can contain billions of galaxies, most of the galaxies are “regular” galaxies of known galaxy types. However, a small portion of the galaxies is rare “peculiar” galaxies that are not yet known. These unknown galaxies are of paramount scientific interest, but due to the enormous size of astronomical databases they are practically impossible to find without automation. Since these novelty galaxies are, by definition, not known, machine learning models cannot be trained to detect them. In this paper, an unsupervised machine learning method for automatic detection of novelty galaxies in large databases is proposed. The method is based on a large and comprehensive set of numerical image content descriptors weighted by their entropy, and the farthest neighbors are ranked-ordered to handle self-similar peculiar galaxies that are expected in the very large datasets. Experimental results using data from the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) show that the ability of the method to detect novelty galaxies outperforms other shallow learning methods such as one-class SVM, Local Outlier Factor, and K-Means, and also newer deep learning-based methods such as auto-encoders. The dataset used to evaluate the method is publicly available and can be used as a benchmark to test future algorithms for automatic detection of peculiar galaxies.
Preprocessing Method Comparisons For VGG16 Fast-RCNN Pistol Detection Jiahao Li, Charles Ablan, Rui Wu, Shanyue Guan and Jason Yao Abstract. In recent years, gun detection and threat surveillance became a popular issue as gun violence continued to threaten public safety. Convolution Neural Networks (CNN) has achieved impressive gun detection precision with the advancements in graphic processing units. While many articles have proposed beneficial complex architectures within the neural network, there has been little study on effective image preprocessing techniques that supplement neural networks. With the objective of increasing neural net precision using image processing techniques, this research analyzes three different approaches to image preprocessing using a VGG16 trained Fast Regional Convolutional Neural Network (F- RCNN) pistol detector. The base VGG16 was trained with transfer learning in MATLAB on a dataset of 1500 pistol images and tested on 500 more. The results of the original VGG16 detector are compared with the results of the other VGG16 detectors trained with various image processing techniques to determine the viability of each technique. The three image processing techniques are as follows, color contrast enhancement, principle component analysis (PCA), and a combined preprocessing method. After testing the detector trained with the three methods above, it was found that the color enhancement technique had the best success in raising precision with proper levels of color contrast adjustments. The PCA analysis proved to be incompatible for the neural net to learn features of images that has not underwent PCA processing and thus the method failed to produce beneficial results on the unmodified testing dataset. The combined method processing took both PCA and color contrast enhancement techniques and combined the results into a single training dataset. The combined preprocessing method proved to be ineffective in raising precision potentially due to conflicting features.
An O(nlogn/logw) Time Algorithm for Ridesharing Yijie Han and Chen Sun Abstract. In the ridesharing problem different people share private vehicles because they have similar itineraries. The objective of solving the ridesharing problem is to minimize the number of drivers needed to carry all load to the destination. The general case of ridesharing problem is NP-complete. For the special case where the network is a chain and the destination is the leftmost vertex of the chain, we present an O(nlogn/logw) time algorithm for the ridesharing problem, where w is the word length used in the algorithm and is at least logn. Previous achieved algorithm for this case requires O(nlogn) time.
A Blueprint for a Trustworthy Health Data Platform Encompassing IoT and Blockchain Technologies Dennis Przytarski, Christoph Stach, Clémentine Gritti and Bernhard Mitschang Abstract. eHealth provides great relief for patients and physicians. This means, patients au- tonomously monitor their condition via IoT medical devices and make these data available to physicians for analyses. This requires a data platform that takes care of data acquisition, management, and provisioning. As health data are highly sensitive, there are major concerns regarding data security with respect to confidentiality, integrity, and authenticity. To this end, we present a blueprint for constructing a trustworthy health data platform called SEAL. It provides a lightweight attribute-based authentication mechanism for IoT devices to validate all involved data sources, there is a fine-grained data provisioning system to enable data provision according to actual requirements, and a verification procedure ensures that data cannot be manipulated.
Techniques for Using Virtual Reality Simulations for Self-Defense Skill Development John Apo and Alexander Redei Abstract. KickVR is a training simulation designed to teach self-defense. With realistic training and a user-friendly guide, its aim is to help users learn how to better defend themselves in the real-world. Although the chances of being in a scenario where one may need self- defense are relatively low, self-defense is an important skill to have. The goal is to enable users to learn self-defense even if they do not have the time to attend in-person classes or do not have self-defense classes in their region. Virtual reality provides convenient access to self-defense training for users in their own homes. It also provides the feeling of realism by implementing the use of hand-tracking. The user’s real-world hands are rendered for the self-defense simulation rather than using Oculus hand controllers. This document will cover the current implemented features of KickVR and offer suggestions on areas of improvement based on an internal and external evaluation of software limitations.
Chaotic Creations Natalie Arnold, Leah Kramer, Christopher Lewis, Terra Williams, Sergiu Dascalu and Frederick Harris Abstract. Chaotic Creations is a Windows application aimed toward Dungeon Masters of the fifth edition of Dungeons and Dragons. It includes a dice roller visual, a database of random-selection tables, as well as tools for generating non-player characters, encounters, and terrains. The application is built using Windows Presentation Foundation (WPF) with C# being used for the back end and XAML used for the front end. SQLite was used for database creation and management. The motivation for this project is to help Dungeon Masters be able to improvise during gameplay more effectively.
A Project Tracking Tool for Scrum Projects with Machine Learning Support for Cost Estimation Kasi Periyasamy and Joshua Chianelli Abstract. Cost estimation in software development is very important because it not only gives an idea to all stakeholders on how long it takes to complete the product under development, but it also mandates tracking development activities so that the project does not overrun on time or budget. Several cost estimation models have been reported in the literature for software development using traditional life cycle models but there are only a few ad hoc methods for software projects that used agile methods. This paper describes the design and implementation of a project tracking tool for software projects that are developed using the agile method Scrum. The users of the tool can closely monitor the progress of user stories, sprint tasks and test cases inducted into a scrum board. The tool also supports cost estimation of the project based on user stories and sprint tasks. For every user story, the tool provides a measure of hardship to implement in terms of story points, and for every sprint task, it gives the anticipated completion time. The tool uses machine learning support for continuous monitoring of efforts based on sprint tasks. The effectiveness of the tool has been tested using three different graduate course projects.
Ameliorating Accuracy of a Map Navigation When Dealing with Different Altitude Traffics that Share Exact Geolocation Thitivatr Patanasakpinyo Abstract. Many users use a location-based application on a portable device to be a navigator when driving. However, there exists an incident that two roads are located on the same geolocation, i.e., same values of latitude and longitude but different altitude, for very long distance where one road is located on the ground level and another one is elevated. This incident mostly confuses a location-based application to precisely retrieve the actual road that a vehicle is currently on and, consequently, causes the application to either navigate incorrectly or suggest a route that is a detour. Calling an altitude from a GPS sensor might be a possible solution but it came with problems of accuracy, especially for mid-grade GPS sensors that equipped with most smartphone in today’s market. We proposed a concept of implementing a classification model that can classify whether a vehicle is on a ground road or an elevated road regardless of geolocation data. We trained and validated two models using a dataset that we had collected from actual driving on two roads in Thailand that fell under this condition. A data instance that we collected contained measurements related to driving or driving environment such as a real-time speed at any certain interval of time. We reported validation results of both models as well as other important statistics.
Multi-Objective Regression Test Selection Yizhen Chen and Mei-Hwa Chen Abstract. Regression testing is challenging, yet essential, for maintaining evolving complex soft- ware. Efficient regression testing that minimizes the regression testing time and maximizes the detection of the regression faults is in great demand for fast-paced software develop- ment. Many research studies have been proposed for selecting regression tests under a time constraint. This paper presents a new approach that first evaluates the fault detectability of each regression test based on the extent to which the test is impacted by the changes. Then, two optimization algorithms are proposed to optimize a multi-objective function that takes fault detectability and execution time of the test as inputs to select an optimal subset of the regression tests that can detect maximal regression faults under a given time constraint. The validity and efficacy of the approach were evaluated using two empirical studies on industrial systems. The promising results suggest that the proposed approach has great potential to ensure the quality of the fast-paced evolving systems.