Vol. 3, Issue 25, July 2017
Software Analytics - For Faster Software Development

Developing today's large and complex software systems is a challenging task, particularly under constraints on resources such as workforce and time. Software analytics is the application of machine learning techniques to software engineering artefacts to obtain more insightful and actionable information. It is gaining momentum as a result of empirical research on improving the quality and productivity of software engineering activities such as analysis, design, coding and testing. Artefacts such as requirement specifications, design specifications, source code, bug reports, change history and test cases contain a wealth of information. Classification, ranking and search-based techniques can be applied to the features mined from these artefacts to strengthen the understanding of software development and improve decision making.

For example, classification- and regression-based machine learning algorithms are widely used to build defect prediction models. The algorithms are trained on previous versions of a system to predict defects in the current version. Past bugs mined from bug reports, change history mined from version control systems and static code attributes mined from source code are used as features. Algorithms commonly used to build prediction models include the Naive Bayes classifier, Logistic Regression, Decision Trees and ensemble classifiers such as Random Forest. In software engineering, problems like defect prediction can also be seen as multi-objective optimization problems. In this case, the optimization maximizes defect coverage while minimizing effort. By applying evolutionary algorithms such as NSGA-II, a multi-objective genetic algorithm, one can search through the solution space to find near-optimal or "good-enough" solutions.
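
The multi-objective view of defect prediction can be made concrete with a small sketch of Pareto dominance, the comparison that algorithms such as NSGA-II rely on when ranking solutions. This is not NSGA-II itself: it only filters a fixed set of candidates down to the non-dominated ones. The `Candidate` structure, the candidate names and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One candidate solution (hypothetical structure for illustration)."""
    name: str
    defect_coverage: float  # fraction of defects covered; higher is better
    effort: float           # e.g. lines of code to inspect; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.defect_coverage >= b.defect_coverage and a.effort <= b.effort
    strictly_better = a.defect_coverage > b.defect_coverage or a.effort < b.effort
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values, not taken from any real project.
candidates = [
    Candidate("A", defect_coverage=0.90, effort=8000),
    Candidate("B", defect_coverage=0.70, effort=3000),
    Candidate("C", defect_coverage=0.65, effort=5000),  # worse than B on both objectives
    Candidate("D", defect_coverage=0.40, effort=1000),
]

front = pareto_front(candidates)
print([c.name for c in front])  # prints ['A', 'B', 'D']
```

Candidates A, B and D represent different trade-offs between coverage and effort, so none of them can be discarded without further preferences. An evolutionary algorithm would additionally generate and evolve new candidates rather than filter a fixed list.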

In academia and industry, there have been rigorous research efforts in defect prediction, bug localization and effort estimation that make use of historical data. The following are problem areas where advanced machine learning and search-based software engineering techniques would add a lot of value:

  • Software effort estimation: Effort estimation is a very difficult problem in fast-changing development environments. Data from past projects can be used to predict the effort required for future projects. Features such as application domain, project duration, project size, programming languages and tools, function points, geographical locations and the project team's experience in building similar systems can be extracted and used for effort prediction. The similarity or dissimilarity between the new project and past projects can then be computed, so that the most similar past projects can inform the estimate.

  • Learning systems for requirements engineering: When we build software systems for a particular application domain, empirical data from similar systems helps enormously in understanding and implementing them. The expected behaviours and exceptions of functional and non-functional requirements can be learned from similar requirement specifications by associating the requirements of the new system with them. Various machine learning techniques can be used to detect similar requirements in the historical data of previously developed systems.

  • Efficient classification of app reviews: App reviews are a wealth of information for app developers. A review may suggest a new feature, report a bug or offer a word of appreciation. When an incoming review is classified efficiently into one of these categories, potential crashes or bugs can be recognized immediately. Effective keywords can be identified and used as features for classifying reviews.
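
As a rough illustration of the app review item, the sketch below trains a small multinomial Naive Bayes classifier that uses the words of each review as features. Choosing Naive Bayes for this task is an assumption made for the example, and the label names (`feature_request`, `bug_report`, `praise`) and the six training reviews are invented; a real system would need far more labelled data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesReviewClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reviews, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(reviews, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def predict(self, text):
        total_reviews = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_reviews in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_reviews / total_reviews)  # log prior
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no evidence
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented training reviews, for illustration only.
training_reviews = [
    ("Please add a dark mode option", "feature_request"),
    ("It would be great to add export to PDF", "feature_request"),
    ("The app crashes when I open settings", "bug_report"),
    ("Login fails with an error after the update", "bug_report"),
    ("Love this app, works great", "praise"),
    ("Awesome app, thank you", "praise"),
]
texts, labels = zip(*training_reviews)
classifier = NaiveBayesReviewClassifier().fit(texts, labels)

print(classifier.predict("App crashes after the latest update"))  # prints bug_report
```

Words such as "crashes" and "update" act as the keyword features mentioned in the list: they occur only in the bug reports of the training set, so they pull an incoming review towards that category.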

Machine learning algorithms and growing computing power will enable fast learning and quick decision making on large datasets. Developers will benefit enormously by saving person-hours, which will allow faster, higher-quality releases of new software and software updates.

By - Muthukumaran Kasinathan, Associate Professor,
CURIN, Chitkara University, Himachal Pradesh.


About Technology Connect

The aim of this weekly newsletter is to share with students and faculty the latest developments, technologies and updates in the field of Electronics & Computer Science, thereby promoting knowledge sharing. All our readers are welcome to contribute content to Technology Connect. Just drop an email to the editor. The first volume of Technology Connect featured 21 issues published between June 2015 and December 2015. The second volume featured 46 issues published between January 2016 and December 2016. This is Volume 3.

Editorial Team

Chief Editor: Sagar Juneja
Members: Ms Sandhya Sharma, Gitesh Khurani
Arun Goyal, Ankush Gupta.

Disclaimer: The content of this newsletter is contributed by Chitkara University faculty and taken from resources that are believed to be reliable. The content is verified by the editorial team to the best of its ability, but the editorial team denies any ownership pertaining to validation of the source and accuracy of the content. The objective of the newsletter is limited to spreading awareness of technology among faculty and students, not to impose on or influence the decisions of individuals.