Publications

Export 54 results:

]

2014

MDSheet – Model-Driven Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João , Proceedings of the 1st Workshop on Software Engineering methods in Spreadsheets, Volume 1209, p.31–33, (2014) Abstractsems14-td.pdf

This paper showcases MDSheet, a framework aimed at improving the engineering of spreadsheets. This framework is model-driven, and has been fully integrated under a spreadsheet system. Also, its practical interest has been demonstrated by several empirical studies.

Model-Based Programming Environments for Spreadsheets, Cunha, Jácome, Mendes Jorge, Saraiva João, and Visser Joost , Journal of Science of Computer Programming (SCP), Volume 96, p.254–275, (2014) Abstractscp14.pdfWebsite

n/a

Refactoring meets Model-Driven Spreadsheet Evolution, Cunha, Jácome, Fernandes João Paulo, Martins Pedro, Pereira Rui, and Saraiva João , Proceedings of the 9th International Conference on the Quality of Information and Communications Technology, Quality in Model Driven Engineering Track, p.196–201, (2014) Abstractquatic14.pdf

Software refactoring is a well-known technique that provides transformations on software artifacts with the aim of improving their overall quality. In this paper we present a set of refactorings for ClassSheets, a modeling language that allows to specify the business logic of a spreadsheet in an object-oriented fashion. The set of refactorings that we propose allows us to improve the quality of these spreadsheet models. Moreover, it is implemented in a setting that guarantees that all model refactorings are automatically carried to all the corresponding (spreadsheet) instances, thus providing an automatic evolution of the data so it is always synchronized with the model.

Smelling Faults in Spreadsheets, Abreu, Rui, Cunha Jácome, Fernandes João Paulo, Martins Pedro, Perez Alexandre, and Saraiva João , Proceedings of the 30th IEEE International Conference on Software Maintenance and Evolution, Washington, DC, USA, p.111–120, (2014) Abstracticsme14.pdf

Despite being staggeringly error prone, spreadsheets are a highly flexible programming environment that is widely used in industry. In fact, spreadsheets are widely adopted for decision making, and decisions taken upon wrong (spreadsheet-based) assumptions may have serious economical impacts on businesses, among other consequences. This paper proposes a technique to automatically pinpoint potential faults in spreadsheets. It combines a catalog of spreadsheet smells that provide a first indication of a potential fault, with a generic spectrum-based fault localization strategy in order to improve (in terms of accuracy and false positive rate) on these initial results. Our technique has been implemented in a tool which helps users detecting faults. To validate the proposed technique, we consider a well-known and well-documented catalog of faulty spreadsheets. Our experiments yield two main results: we were able to distinguish between smells that can point to faulty cells from smells and those that are not capable of doing so; and we provide a technique capable of detecting a significant number of errors: two thirds of the cells labeled as faulty are in fact (documented) errors.

SSaaPP: SpreadSheets as a Programming Paradigm – Project's Final Report, Abreu, Rui, Alves Tiago, Belo Orlando, Campos José C., Cunha Jácome, Fernandes João Paulo, Martins Pedro, Mendes Jorge, Pacheco Hugo, Peixoto Christophe, Pereira Rui, Perez Alexandre, Ribeiro Hugo, Riboira André, Saraiva João, Silva André, Silva João Carlos, and Visser Joost , Number TR-HASLab:02:2014, (2014) Abstracttr_ssaapp.pdf

This technical report describes the research goals and results of the SpreadSheet as a Programming Paradigm research project. This was a project funded by Funda{\c c}ão para a Ciencia e Tecnologia – FCT: the Portuguese research foundation, under reference FCOMP-01-0124-FEDER-010048, that ran from May 2010 till July 2013. This report includes the complete document reporting the results achieved during the project execution, which was submitted to FCT for evaluation on October 2013. It describes the goals of the project, and the different research tasks presenting the deliver- ables of each of them. It also presents the management and result dissemination work performed during the project's execution. The document includes also a self assess- ment of the achieved results, and a complete list of scientific publications describing the contributions of the project. Finally, this document includes the FCT evaluation report.

2013

Complexity Metrics for Spreadsheet Models, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , The 13th International Conference on Computational Science and Its Applications, Volume 7972, p.459–474, (2013) Abstracticcsa-sq13.pdf

This paper proposes a set of metrics for the assessment of the complexity of models defining the business logic of spreadsheets. This set can be considered the first step in the direction of building a quality standard for spreadsheet models, that is still to be defined. The computation of concrete metric values has further been integrated under a well-established model-driven spreadsheet development environment, providing a framework for the analysis of spreadsheet models under spreadsheets themselves.

Querying Model-Driven Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João , Proceedings of the 2013 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.83–86, (2013) Abstractvlhcc2013-query.pdf

Spreadsheets are being used with many different purposes that range from toy applications to complete information systems. In any of these cases, they are often used as data repositories that can grow significantly. As the amount of data grows, it also becomes more difficult to extract concrete information out of them. This paper focuses on the problem of spreadsheet querying. In particular, we propose an expressive and composable technique where intuitive queries can be defined. Our approach builds on a model-driven spreadsheet development environment, and queries are expressed referencing entities in the model of a spreadsheet instead of in its actual data. Finally, the system that we have implemented relies on Google's query function for spreadsheets.

QuerySheet: A Bidirectional Query Environment for Model-Driven Spreadsheets, Belo, Orlando, Cunha Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João , Proceedings of the 2013 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.199–200, (2013) Abstractvlhcc2013-td.pdf

This paper presents a tool, named QUERYSHEET, to query spreadsheets. We defined a language to write the queries, which resembles SQL, the language to query databases. This allows to write queries which are more related to the spreadsheet content than with current approaches.

2012

A Bidirectional Model-driven Spreadsheet Environment (Poster/Abstract), Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , Proceedings of the 34rd International Conference on Software Engineering, p.1443–1444, (2012) Abstractabstract.pdfposter.pdf

n this extended abstract we present a bidirectional model-driven framework to develop spreadsheets. By being model driven, our approach allows to evolve a spreadsheet model and automatically have the data co-evolved. The bidirectional component achieves precisely the inverse, that is, to evolve the data and automatically obtain a new model to which the data conforms.

Bidirectional Transformation of Model-Driven Spreadsheets, Cunha, Jácome, Fernandes João P., Mendes Jorge, Pacheco Hugo, and Saraiva João , Theory and Practice of Model Transformations, Volume 7307, p.105–120, (2012) Abstracticmt12.pdf

Spreadsheets play an important role in software organizations. Indeed, in large software organizations, spreadsheets are not only used to define sheets containing data and formulas, but also to collect information from different systems, to adapt data coming from one system to the format required by another, to perform operations to enrich or simplify data, etc. In fact, over time many spreadsheets turn out to be used for storing and processing increasing amounts of data and supporting increasing numbers of users. Unfortunately, spreadsheet systems provide poor support for modularity, abstraction, and transformation, thus, making the maintenance, update and evolution of spreadsheets a very complex and error-prone task. We present techniques for model-driven spreadsheet engineering where we employ bidirectional transformations to maintain spreadsheet models and instances synchronized. In our setting, the business logic of spreadsheets is defined by ClassSheet models to which the spreadsheet data conforms, and spreadsheet users may evolve both the model and the data instances. Our techniques are implemented as part of the MDSheet framework: an extension for a traditional spreadsheet system.

Extension and Implementation of ClassSheet Models, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.19–22, (2012) Abstractvlhcc12.pdf

n this paper we explore the use of models in the context of spreadsheet engineering. We review a successful spreadsheet modeling language, whose semantics we further extend. With this extension we bring spreadsheet models closer to the business models of spreadsheets themselves. An addon for a widely used spreadsheet system, providing bidirectional model-driven spreadsheet development, was also improved to include the proposed model extension.

From Relational ClassSheets to UML+OCL, Cunha, Jácome, Fernandes João Paulo, and Saraiva João , Proceedings of the Software Engineering Track at the 27th Annual ACM Symposium On Applied Computing (SAC 2012), p.1151–1158, (2012) Abstractsac-se12.pdf

Spreadsheets are among the most popular programming languages in the world. Unfortunately, spreadsheet systems were not tailored from scratch with modern programming language features that guarantee, as much as possible, program correctness. As a consequence, spreadsheets are populated with unacceptable amounts of errors. In other programming language settings, model-based approaches have been proposed to increase productivity and program effectiveness. Within spreadsheets, this approach has also been followed, namely by ClassSheets. In this paper, we propose an extension to ClassSheets to allow the specification of spreadsheets that can be viewed as relational databases. Moreover, we present a transformation from ClassSheet models to UML class diagrams enriched with OCL constraints. This brings to the spreadsheet realm the entire paraphernalia of model validation techniques that are available for UML.

MDSheet: A Framework for Model-driven Spreadsheet Engineering, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , Proceedings of the 34rd International Conference on Software Engineering, p.1395–1398, (2012) Abstracticse12_tooldemo.pdf

n this paper, we present MDSHEET, a framework for the embedding, evolution and inference of spreadsheet models. This framework offers a model-driven software development mechanism for spreadsheet users.

Model-Based Programming Environments for Spreadsheets, Cunha, Jácome, Saraiva João, and Visser Joost , Programming Languages, Volume 7554, p.117–133, (2012) Abstractsblp12.pdf

Although spreadsheets can be seen as a flexible programming environment, they lack some of the concepts of regular programming languages, such as structured data types. This can lead the user to edit the spreadsheet in a wrong way and perhaps cause corrupt or redundant data. We devised a method for extraction of a relational model from a spreadsheet and the subsequent embedding of the model back into the spreadsheet to create a model-based spreadsheet programming environment. The extraction algorithm is specific for spreadsheets since it considers particularities such as layout and column arrangement. The extracted model is used to generate formulas and visual elements that are then embedded in the spreadsheet helping the user to edit data in a correct way. We present preliminary experimental results from applying our approach to a sample of spreadsheets from the EUSES Spreadsheet Corpus.

Model-based Spreadsheet Engineering: Using Relational Models to Improve Spreadsheets, Cunha, Jácome , (2012) Abstractbook.pdf

Spreadsheets can be viewed as programming languages for non-professional programmers. These so-called ``end-user'' programmers vastly outnumber professional programmers creating millions of new spreadsheets every year. As a programming language, spreadsheets lack support for abstraction, testing, encapsulation, or structured programming. As a result, and as numerous studies have shown, the high rate of production is accompanied by an alarming high rate of errors. Some studies report that up to 90% of real-world spreadsheets contain errors. After their initial creation, many spreadsheets turn out to be used for storing and processing increasing amounts of data and supporting increasing numbers of users over long periods of time, making them complicated systems. An emerging solution to handle the complex and evolving software systems is Model-driven Engineering (MDE). To consider models as first class entities and any software artifact as a model or a model element is one of the basic principles of MDE. We adopted some techniques from MDE to solve spreadsheet problems. Most spreadsheets (if not all) lack a proper specification or a model. Using reverse engineering techniques we are able to derive various models from legacy spreadsheets. We use functional dependencies (a formalism that allow us to define how some column values depend on other column values) as building blocks for these models. Models can be used for several spreadsheet improvements, namely refactoring, safe evolution, migration or even generation of edit assistance. The techniques presented in this work are available under the framework HAEXCEL that we developed. It is composed of online and batch tools, reusable HASKELL libraries and OpenOffice.org extensions. A study with several end-users was organized to survey the impact of the techniques we designed. The results of this study indicate that the models can bring great benefits to spreadsheet engineering helping users to commit fewer errors and to work faster.

A Quality Model for Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Peixoto Christophe, and Saraiva João , Proceedings of the 8th International Conference on the Quality of Information and Communications Technology, Quality in ICT Evolution Track, p.231–236, (2012) Abstractquatic2012.pdf

In this paper we present a quality model for spreadsheets, based on the ISO/IEC 9126 standard that defines a generic quality model for software. To each of the software characteristics defined in the ISO/IEC 9126, we associate an equivalent spreadsheet characteristic. Then, we propose a set of spreadsheet specific metrics to assess the quality of a spreadsheet in each of the defined characteristics. In order to obtain the normal distribution of expected values for a spreadsheet in each of the metrics that we propose, we have executed them against all spreadsheets in the large and widely used EUSES spreadsheet corpus. Then, we quantify each characteristic of our quality model after computing the values of our metrics, and we define quality scores for the different ranges of values. Finally, to automate the atribution of a quality score to a given spreadsheet, according to our quality model, we have integrated the computation of the metrics it includes in both a batch and a web-based tool.

SmellSheet Detective: A Tool for Detecting Bad Smells in Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Martins Pedro, and Saraiva João , Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.243–244, (2012) Abstractvlhcc12-td.pdf

This tool demo paper presents SmellSheet Detective: a tool for automatically detecting bad smells in spreadsheets. We have defined a catalog of bad smells in spreadsheet data which was fully implemented in a reusable library for the manipulation of spreadsheets. This library is the building block of the SmellSheet Detective tool, that has been used to detect smells in large, real-world spreadsheet within the EUSES corpus, in order to validate and evolve our bad smells catalog.

Towards a Catalog of Spreadsheet Smells, Cunha, Jácome, Fernandes João P., Ribeiro Hugo, and Saraiva João , Proceedings of the 12th International Conference on Computational Science and Its Applications - Volume Part IV, Berlin, Heidelberg, p.202–216, (2012) Abstracticcsa-sq12.pdf

Spreadsheets are considered to be the most widely used programming language in the world, and reports have shown that 90% of real-world spreadsheets contain errors. In this work, we try to identify spreadsheet smells, a concept adapted from software, which consists of a surface indication that usually corresponds to a deeper problem. Our smells have been integrated in a tool, and were computed for a large spreadsheet repository. Finally, the analysis of the results we obtained led to the refinement of our initial catalog.

Towards an Evaluation of Bidirectional Model-driven Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , User evaluation for Software Engineering Researchers, p.25–28, (2012) Abstractuser12.pdf

Spreadsheets are widely recognized as popular programming systems with a huge number of spreadsheets being created every day. Also, spreadsheets are often used in the decision processes of profit-oriented companies. While this illustrates their practical importance, studies have shown that up to 90% of real-world spreadsheets contain errors. In order to improve the productivity of spreadsheet end-users, the software engineering community has proposed to employ model-driven approaches to spreadsheet development. In this paper we describe the evaluation of a bidirectional model-driven spreadsheet environment. In this environment, models and data instances are kept in conformity, even after an update on any of these artifacts. We describe the issues of an empirical study we plan to conduct, based on our previous experience with end-user studies. Our goal is to assess if this model-driven spreadsheet development framework does in fact contribute to improve the productivity of spreadsheet users.

2011

HaExcel: A Model-Based Spreadsheet Evolution System (Poster), Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, September, (2011) Abstractposter.vlhcc11.png

n/a

End-users Productivity in Model-based Spreadsheets: An Empirical Study, Beckwith, Laura, Cunha Jácome, Fernandes João Paulo, and Saraiva João , Proceedings of the Third International Symposium on End-User Development, June, Heidelberg, p.282–288, (2011) Abstractiseud11.pdf

Spreadsheetsarewidelyusedandstudiesshowthatmostoftheexisting ones contain non-trivial errors. To improve end-users productivity, recent research proposes the use of a model-driven engineering approach to spreadsheets. In this paper we conduct the first empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets. We present and analyze here the results achieved.

An Empirical Study on End-users Productivity Using Model-based Spreadsheets, Beckwith, Laura, Cunha Jácome, Fernandes João Paulo, and Saraiva João , Proceedings of the European Spreadsheet Risks Interest Group, July, p.87–100, (2011) Abstracteusprig11.pdf

Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain non-trivial errors. To improve end-users productivity, recent research proposes the use of a model-driven engineering approach to spreadsheets. In this paper we conduct the first systematic empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets, and we present and analyze here the results achieved.

Embedding and Evolution of Spreadsheet Models in Spreadsheet Systems, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João , Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.186–201, (2011) Abstractvlhcc11.pdf

This paper describes the embedding of ClassSheet models in spreadsheet systems. ClassSheet models are well-known and describe the business logic of spreadsheet data. We embed this domain specific model representation on the (general purpose) spreadsheet system it models. By defining such an embedding, we provide end users a model-driven engineering spreadsheet developing environment. End users can interact with both the model and the spreadsheet data in the same environment. Moreover, we use advanced techniques to evolve spreadsheets and models and to have them synchronized. In this paper we present our work on extending a widely used spreadsheet system with such a model-driven spreadsheet engineering environment.

Type-Safe Evolution of Spreadsheets, Cunha, Jácome, Visser Joost, Alves Tiago, and Saraiva João , Proceedings of the 14th International Conference Fundamental Approaches to Software Engineering (FASE '11): Part of the Joint European Conferences on Theory and Practice of Software (ETAPS '11), Volume 6603, p.186–201, (2011) Abstractfase11.pdf

Spreadsheets are notoriously error-prone. To help avoid the introduction of errors when changing spreadsheets, models that capture the structure and interdependencies of spreadsheets at a conceptual level have been proposed. Thus, spreadsheet evolution can be made safe within the confines of a model. As in any other model/instance setting, evolution may not only require changes at the instance level but also at the model level. When model changes are required, the safety of instance evolution can not be guarded by the model alone. We have designed an appropriate representation of spreadsheet models, including the fundamental notions of formulæand references. For these models and their instances, we have designed coupled transformation rules that cover specific spreadsheet evolution steps, such as the insertion of columns in all occurrences of a repeated block of cells. Each model-level transformation rule is coupled with instance level migration rules from the source to the target model and vice versa. These coupled rules can be composed to create compound transformations at the model level inducing compound transformations at the instance level. This approach guarantees safe evolution of spreadsheets even when models change.

2010

Automatically Inferring ClassSheet Models from Spreadsheets, Cunha, Jácome, Erwig Martin, and Saraiva João , Proceedings of the 2010 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.93–100, (2010) Abstractvlhcc10.pdf

Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints. However, designing such models is time consuming and requires expertise beyond the knowledge to work with spreadsheets. Legacy spreadsheets pose a particular challenge to the approach of controlling spreadsheet evolution through higher-level models, because the need for a model might be overshadowed by two problems: (A) The benefit of creating a spreadsheet is lacking since the legacy spreadsheet already exists, and (B) existing data must be transferred into the new model-generated spreadsheet. To address these problems and to support the model-driven spreadsheet engineering approach, we have developed a tool that can automatically infer ClassSheet models from spreadsheets. To this end, we have adapted a method to infer entity/relationship models from relational database to the spreadsheets/ClassSheets realm. We have implemented our techniques in the HAEXCEL framework and integrated it with the ViTSL/Gencel spreadsheet generator, which allows the automatic generation of refactored spreadsheets from the inferred ClassSheet model. The resulting spreadsheet guides further changes and provably safeguards the spreadsheet against a large class of formula errors. The developed tool is a significant contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.

Jácome Cunha

jacome@fct.unl.pt

Publications