Publications

C
Model-Based Programming Environments for Spreadsheets, Cunha, Jácome, Mendes Jorge, Saraiva João, and Visser Joost, Journal of Science of Computer Programming (SCP), Volume 96, p.254–275, (2014)

Towards a Catalog of Spreadsheet Smells, Cunha, Jácome, Fernandes João P., Ribeiro Hugo, and Saraiva João, Proceedings of the 12th International Conference on Computational Science and Its Applications - Volume Part IV, Berlin, Heidelberg, p.202–216, (2012)

Spreadsheets are considered to be the most widely used programming language in the world, and reports have shown that 90% of real-world spreadsheets contain errors. In this work, we try to identify spreadsheet smells, a concept adapted from software, which consists of a surface indication that usually corresponds to a deeper problem. Our smells have been integrated in a tool, and were computed for a large spreadsheet repository. Finally, the analysis of the results we obtained led to the refinement of our initial catalog.

Automatically Inferring ClassSheet Models from Spreadsheets, Cunha, Jácome, Erwig Martin, and Saraiva João, Proceedings of the 2010 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.93–100, (2010)

Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints. However, designing such models is time consuming and requires expertise beyond the knowledge to work with spreadsheets. Legacy spreadsheets pose a particular challenge to the approach of controlling spreadsheet evolution through higher-level models, because the need for a model might be overshadowed by two problems: (A) The benefit of creating a spreadsheet is lacking since the legacy spreadsheet already exists, and (B) existing data must be transferred into the new model-generated spreadsheet. To address these problems and to support the model-driven spreadsheet engineering approach, we have developed a tool that can automatically infer ClassSheet models from spreadsheets. To this end, we have adapted a method to infer entity/relationship models from relational databases to the spreadsheets/ClassSheets realm. We have implemented our techniques in the HAEXCEL framework and integrated it with the ViTSL/Gencel spreadsheet generator, which allows the automatic generation of refactored spreadsheets from the inferred ClassSheet model. The resulting spreadsheet guides further changes and provably safeguards the spreadsheet against a large class of formula errors. The developed tool is a significant contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.

ES-SQL: Visually Querying Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João, Proceedings of the 2014 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.203–204, (2014)

This paper presents ES-SQL, an embedded tool for visually constructing queries over spreadsheets. This tool provides an expressive query environment which has knowledge of the business logic of spreadsheets and uses this knowledge to assist the user in defining the intended queries.

Model-Based Spreadsheet Engineering, Cunha, Jácome, March, (2011)

Spreadsheets can be viewed as programming languages for non-professional programmers. These so-called "end-user" programmers vastly outnumber professional programmers, creating millions of new spreadsheets every year. As a programming language, spreadsheets lack support for abstraction, testing, encapsulation, or structured programming. As a result, and as numerous studies have shown, the high rate of production is accompanied by an alarmingly high rate of errors. Some studies report that up to 90% of real-world spreadsheets contain errors. After their initial creation, many spreadsheets turn out to be used for storing and processing increasing amounts of data and supporting increasing numbers of users over long periods of time, making them complicated systems. An emerging solution to handle complex and evolving software systems is Model-driven Engineering (MDE). To consider models as first class entities and any software artifact as a model or a model element is one of the basic principles of MDE. We adopted some techniques from MDE to solve spreadsheet problems. Most spreadsheets (if not all) lack a proper specification or a model. Using reverse engineering techniques we are able to derive various models from legacy spreadsheets. We use functional dependencies (a formalism that allows us to define how some column values depend on other column values) as building blocks for these models. Models can be used for several spreadsheet improvements, namely refactoring, safe evolution, migration or even generation of edit assistance. The techniques presented in this work are available under the framework HAEXCEL that we developed. It is composed of online and batch tools, reusable HASKELL libraries and OpenOffice.org extensions. A study with several end-users was organized to survey the impact of the techniques we designed. The results of this study indicate that the models can bring great benefits to spreadsheet engineering, helping users to commit fewer errors and to work faster.

Graphical Querying of Model-Driven Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Pereira Rui, and Saraiva João, Human Interface and the Management of Information. Information and Knowledge Design and Evaluation, Volume 8521, p.419–430, (2014)

This paper presents a graphical interface to query model-driven spreadsheets, based on experience with previous work and empirical studies in querying systems, to simplify query construction for typical end-users with little to no knowledge of SQL. We briefly show our previous text-based model-driven querying system. Afterwards, we detail our graphical model-driven querying interface, explaining each part of the interface and showing an example. To validate our work, we executed an empirical study comparing our graphical querying approach to an alternative querying tool; the study produced positive results.

HaExcel: A Model-Based Spreadsheet Evolution System (Poster), Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João, 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, September, (2011)

A Quality Model for Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Peixoto Christophe, and Saraiva João, Proceedings of the 8th International Conference on the Quality of Information and Communications Technology, Quality in ICT Evolution Track, p.231–236, (2012)

In this paper we present a quality model for spreadsheets, based on the ISO/IEC 9126 standard that defines a generic quality model for software. To each of the software characteristics defined in the ISO/IEC 9126, we associate an equivalent spreadsheet characteristic. Then, we propose a set of spreadsheet specific metrics to assess the quality of a spreadsheet in each of the defined characteristics. In order to obtain the normal distribution of expected values for a spreadsheet in each of the metrics that we propose, we have executed them against all spreadsheets in the large and widely used EUSES spreadsheet corpus. Then, we quantify each characteristic of our quality model after computing the values of our metrics, and we define quality scores for the different ranges of values. Finally, to automate the attribution of a quality score to a given spreadsheet, according to our quality model, we have integrated the computation of the metrics it includes in both a batch and a web-based tool.

A Structured Approach to Document Spreadsheets (in preparation), Cunha, Jácome, and Canteiro Diogo, (Submitted)

Model-Based Programming Environments for Spreadsheets, Cunha, Jácome, Saraiva João, and Visser Joost, Programming Languages, Volume 7554, p.117–133, (2012)

Although spreadsheets can be seen as a flexible programming environment, they lack some of the concepts of regular programming languages, such as structured data types. This can lead the user to edit the spreadsheet in a wrong way and perhaps cause corrupt or redundant data. We devised a method for extraction of a relational model from a spreadsheet and the subsequent embedding of the model back into the spreadsheet to create a model-based spreadsheet programming environment. The extraction algorithm is specific for spreadsheets since it considers particularities such as layout and column arrangement. The extracted model is used to generate formulas and visual elements that are then embedded in the spreadsheet helping the user to edit data in a correct way. We present preliminary experimental results from applying our approach to a sample of spreadsheets from the EUSES Spreadsheet Corpus.

Embedding, Evolution, and Validation of Spreadsheet Models in Spreadsheet Systems, Cunha, Jácome, Fernandes João P., Mendes Jorge, and Saraiva João, Number TR-HASLab:01:2014, (2014)

This paper proposes and validates a model-driven software engineering technique for spreadsheets. The technique that we envision builds on the embedding of spreadsheet models under a widely used spreadsheet system, so that models and their conforming instances are developed under the same environment. In practice, this convenient environment enhances evolution steps at the model level while the corresponding instance is automatically co-evolved. Finally, we have designed and conducted an empirical study with human users in order to assess our technique in production environments. The results of this study are promising and suggest that productivity gains are realizable under our model-driven spreadsheet development setting.

Discovery-Based Edit Assistance for Spreadsheets, Cunha, Jácome, Saraiva João, and Visser Joost, Proceedings of the 2009 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), Washington, DC, USA, p.233–237, (2009)

Spreadsheets can be viewed as a highly flexible end-user programming environment which enjoys wide-spread adoption. But spreadsheets lack many of the structured programming concepts of regular programming paradigms. In particular, the lack of data structures in spreadsheets may lead spreadsheet users to cause redundancy, loss, or corruption of data during edit actions. In this paper, we demonstrate how implicit structural properties of spreadsheet data can be exploited to offer edit assistance to spreadsheet users. Our approach is based on the discovery of functional dependencies among data items which allow automatic reconstruction of a relational database schema. From this schema, new formulas and visual objects are embedded into the spreadsheet to offer features for auto-completion, guarded deletion, and controlled insertion. Schema discovery and spreadsheet enhancement are carried out automatically in the background and do not disturb normal user experience.

Embedding Model-Driven Spreadsheet Queries in Spreadsheet Systems, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João, Proceedings of the 2014 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.151–154, (2014)

Spreadsheets are widely used not only to define mathematical expressions, but also to store large and complex data. Querying such data is usually a difficult task, especially for end users. In this work we embed the textual query language in the model-driven spreadsheet environment as a spreadsheet itself. The result is an expressive and powerful query environment that has knowledge of the business logic defined by the spreadsheet data (the spreadsheet model) to guide end users in constructing correct queries.

SmellSheet Detective: A Tool for Detecting Bad Smells in Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Martins Pedro, and Saraiva João, Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.243–244, (2012)

This tool demo paper presents SmellSheet Detective: a tool for automatically detecting bad smells in spreadsheets. We have defined a catalog of bad smells in spreadsheet data which was fully implemented in a reusable library for the manipulation of spreadsheets. This library is the building block of the SmellSheet Detective tool, which has been used to detect smells in large, real-world spreadsheets within the EUSES corpus, in order to validate and evolve our bad smells catalog.

MDSheet: A Framework for Model-driven Spreadsheet Engineering, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João, Proceedings of the 34th International Conference on Software Engineering, p.1395–1398, (2012)

In this paper, we present MDSHEET, a framework for the embedding, evolution and inference of spreadsheet models. This framework offers a model-driven software development mechanism for spreadsheet users.

Towards an Evaluation of Bidirectional Model-driven Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João, User evaluation for Software Engineering Researchers, p.25–28, (2012)

Spreadsheets are widely recognized as popular programming systems with a huge number of spreadsheets being created every day. Also, spreadsheets are often used in the decision processes of profit-oriented companies. While this illustrates their practical importance, studies have shown that up to 90% of real-world spreadsheets contain errors. In order to improve the productivity of spreadsheet end-users, the software engineering community has proposed to employ model-driven approaches to spreadsheet development. In this paper we describe the evaluation of a bidirectional model-driven spreadsheet environment. In this environment, models and data instances are kept in conformity, even after an update on any of these artifacts. We describe the issues of an empirical study we plan to conduct, based on our previous experience with end-user studies. Our goal is to assess if this model-driven spreadsheet development framework does in fact contribute to improve the productivity of spreadsheet users.

Extension and Implementation of ClassSheet Models, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João, Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.19–22, (2012)

In this paper we explore the use of models in the context of spreadsheet engineering. We review a successful spreadsheet modeling language, whose semantics we further extend. With this extension we bring spreadsheet models closer to the business models of spreadsheets themselves. An addon for a widely used spreadsheet system, providing bidirectional model-driven spreadsheet development, was also improved to include the proposed model extension.

Evaluating Refactorings for Spreadsheet Models, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, Saraiva João Alexandre, and Martins Pedro, Journal of Systems and Software, Volume 118, p.234–250, (2016)

Software refactoring is a well-known technique that provides transformations on software artifacts with the aim of improving their overall quality.

In the past, we have proposed a catalog of refactorings for spreadsheet models expressed in the ClassSheets modeling language, which allows us to specify the business logic of a spreadsheet in an object-oriented fashion.

Reasoning about spreadsheets at the model level enhances a model-driven spreadsheet environment where a ClassSheet model and its conforming instance (the spreadsheet data) automatically co-evolve after a refactoring is applied at the model level. Our motivation for such research was to improve the model and its conforming instance: the spreadsheet data.

In this paper we define such refactorings using previously proposed evolution steps for models and instances.

We also present an empirical study we designed and conducted in order to confirm our original intuition that these refactorings have a positive impact on end-user productivity, both in terms of effectiveness and efficiency.

The results are presented not only in terms of productivity changes between refactored and non-refactored scenarios, but also in terms of overall user satisfaction, relevance, and experience.

In almost all cases the refactorings indeed improved end-user productivity. Moreover, in most cases users were more engaged with the refactored version of the spreadsheets they worked with.

Automatically Inferring Models from Spreadsheets, Cunha, Jácome, Erwig Martin, Mendes Jorge, and Saraiva João, Automated Software Engineering (ASE), Volume 23, Issue 3, p.361–392, (2016)

Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is difficult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: First, we have demonstrated its applicability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not fit spreadsheets well, as they cannot express layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.

From Spreadsheets to Relational Databases and Back, Cunha, Jácome, Saraiva João, and Visser Joost, Proceedings of the 2009 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, New York, NY, USA, p.179–188, (2009)

This paper presents techniques and tools to transform spreadsheets into relational databases and back. A set of data refinement rules is introduced to map a tabular datatype into a relational database schema. Having expressed the transformation of the two data models as data refinements, we obtain for free the functions that migrate the data. We use well-known relational database techniques to optimize and query the data. Because data refinements define bidirectional transformations, we can map such a database back to an optimized spreadsheet. We have implemented the data refinement rules and we have constructed tools to manipulate, optimize and refactor Excel-like spreadsheets.

Querying Model-Driven Spreadsheets, Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João, Proceedings of the 2013 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.83–86, (2013)

Spreadsheets are being used for many different purposes that range from toy applications to complete information systems. In any of these cases, they are often used as data repositories that can grow significantly. As the amount of data grows, it also becomes more difficult to extract concrete information out of them. This paper focuses on the problem of spreadsheet querying. In particular, we propose an expressive and composable technique where intuitive queries can be defined. Our approach builds on a model-driven spreadsheet development environment, and queries are expressed referencing entities in the model of a spreadsheet instead of in its actual data. Finally, the system that we have implemented relies on Google's query function for spreadsheets.

From Relational ClassSheets to UML+OCL, Cunha, Jácome, Fernandes João Paulo, and Saraiva João, Proceedings of the Software Engineering Track at the 27th Annual ACM Symposium On Applied Computing (SAC 2012), p.1151–1158, (2012)

Spreadsheets are among the most popular programming languages in the world. Unfortunately, spreadsheet systems were not tailored from scratch with modern programming language features that guarantee, as much as possible, program correctness. As a consequence, spreadsheets are populated with unacceptable amounts of errors. In other programming language settings, model-based approaches have been proposed to increase productivity and program effectiveness. Within spreadsheets, this approach has also been followed, namely by ClassSheets. In this paper, we propose an extension to ClassSheets to allow the specification of spreadsheets that can be viewed as relational databases. Moreover, we present a transformation from ClassSheet models to UML class diagrams enriched with OCL constraints. This brings to the spreadsheet realm the entire paraphernalia of model validation techniques that are available for UML.

Refactoring meets Model-Driven Spreadsheet Evolution, Cunha, Jácome, Fernandes João Paulo, Martins Pedro, Pereira Rui, and Saraiva João, Proceedings of the 9th International Conference on the Quality of Information and Communications Technology, Quality in Model Driven Engineering Track, p.196–201, (2014)

Software refactoring is a well-known technique that provides transformations on software artifacts with the aim of improving their overall quality. In this paper we present a set of refactorings for ClassSheets, a modeling language that allows one to specify the business logic of a spreadsheet in an object-oriented fashion. The set of refactorings that we propose allows us to improve the quality of these spreadsheet models. Moreover, it is implemented in a setting that guarantees that all model refactorings are automatically carried to all the corresponding (spreadsheet) instances, thus providing an automatic evolution of the data so it is always synchronized with the model.

Type-Safe Evolution of Spreadsheets, Cunha, Jácome, Visser Joost, Alves Tiago, and Saraiva João, Number DI-CCTC-10-09, (2010)

Spreadsheets are notoriously error-prone. To help avoid the introduction of errors when changing spreadsheets, models that capture the structure and inter-dependencies of spreadsheets at a conceptual level have been proposed. Thus, spreadsheet evolution can be made safe within the confines of a model. As in any other model/instance setting, evolution may not only require changes at the instance level but also at the model level. When model changes are required, the safety of instance evolution cannot be guarded by the model alone. Coupled transformations of models and instances are supported by the 2LT platform and have been applied for transformation of algebraic datatypes, XML schemas, and relational database models. We have extended 2LT to spreadsheet evolution. We have designed an appropriate representation of spreadsheet models, including the fundamental notions of formulæ, references, and blocks of cells. For these models and their instances, we have designed coupled transformation rules that cover specific spreadsheet evolution steps, such as extraction of a block of cells into a separate sheet or insertion of columns in all occurrences of a repeated block of cells. Each model-level transformation rule is coupled with instance level migration rules from the source to the target model and vice versa. These coupled rules can be composed to create compound transformations at the model level that induce compound transformations at the instance level. With this approach, spreadsheet evolution can be made safe, even when model changes are involved.

Embedding, Evolution, and Validation of Spreadsheet Models in Spreadsheet Systems, Cunha, Jácome, Fernandes João P., Mendes Jorge, and Saraiva João, IEEE Transactions on Software Engineering, Volume 41, Issue 3, p.241–263, (2014)

This paper proposes and validates a model-driven software engineering technique for spreadsheets. The technique that we envision builds on the embedding of spreadsheet models under a widely used spreadsheet system. This means that we enable the creation and evolution of spreadsheet models under a spreadsheet system. More precisely, we embed ClassSheets, a visual language with a syntax similar to the one offered by common spreadsheets, that was created with the aim of specifying spreadsheets. Our embedding allows models and their conforming instances to be developed under the same environment. In practice, this convenient environment enhances evolution steps at the model level while the corresponding instance is automatically co-evolved. Finally, we have designed and conducted an empirical study with human users in order to assess our technique in production environments. The results of this study are promising and suggest that productivity gains are realizable under our model-driven spreadsheet development setting.