Embedding, Evolution, and Validation of Spreadsheet Models in Spreadsheet Systems,
Cunha, Jácome, Fernandes João P., Mendes Jorge, and Saraiva João
, IEEE Transactions on Software Engineering, Volume 41, Issue 3, p.241-263, (2014)
Abstract: This paper proposes and validates a model-driven software engineering technique for spreadsheets. The technique that we envision builds on the embedding of spreadsheet models under a widely used spreadsheet system. This means that we enable the creation and evolution of spreadsheet models under a spreadsheet system. More precisely, we embed ClassSheets, a visual language with a syntax similar to the one offered by common spreadsheets, that was created with the aim of specifying spreadsheets. Our embedding allows models and their conforming instances to be developed under the same environment. In practice, this convenient environment enhances evolution steps at the model level while the corresponding instance is automatically co-evolved. Finally, we have designed and conducted an empirical study with human users in order to assess our technique in production environments. The results of this study are promising and suggest that productivity gains are realizable under our model-driven spreadsheet development setting.
Embedding and Evolution of Spreadsheet Models in Spreadsheet Systems,
Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João
, Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.186–201, (2011)
Abstract: This paper describes the embedding of ClassSheet models in spreadsheet systems. ClassSheet models are well-known and describe the business logic of spreadsheet data. We embed this domain-specific model representation in the (general purpose) spreadsheet system it models. By defining such an embedding, we provide end users with a model-driven spreadsheet development environment. End users can interact with both the model and the spreadsheet data in the same environment. Moreover, we use advanced techniques to evolve spreadsheets and models and to keep them synchronized. In this paper we present our work on extending a widely used spreadsheet system with such a model-driven spreadsheet engineering environment.
From Spreadsheets to Relational Databases and Back,
Cunha, Jácome, Saraiva João, and Visser Joost
, Proceedings of the 2009 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, New York, NY, USA, p.179–188, (2009)
Abstract: This paper presents techniques and tools to transform spreadsheets into relational databases and back. A set of data refinement rules is introduced to map a tabular datatype into a relational database schema. Having expressed the transformation of the two data models as data refinements, we obtain for free the functions that migrate the data. We use well-known relational database techniques to optimize and query the data. Because data refinements define bidirectional transformations, we can map such a database back to an optimized spreadsheet. We have implemented the data refinement rules and we have constructed tools to manipulate, optimize and refactor Excel-like spreadsheets.
Towards a Catalog of Spreadsheet Smells,
Cunha, Jácome, Fernandes João P., Ribeiro Hugo, and Saraiva João
, Proceedings of the 12th International Conference on Computational Science and Its Applications - Volume Part IV, Berlin, Heidelberg, p.202–216, (2012)
Abstract: Spreadsheets are considered to be the most widely used programming language in the world, and reports have shown that 90% of real-world spreadsheets contain errors. In this work, we try to identify spreadsheet smells, a concept adapted from software, which consists of a surface indication that usually corresponds to a deeper problem. Our smells have been integrated in a tool, and were computed for a large spreadsheet repository. Finally, the analysis of the results we obtained led to the refinement of our initial catalog.
From Relational ClassSheets to UML+OCL,
Cunha, Jácome, Fernandes João Paulo, and Saraiva João
, Proceedings of the Software Engineering Track at the 27th Annual ACM Symposium On Applied Computing (SAC 2012), p.1151–1158, (2012)
Abstract: Spreadsheets are among the most popular programming languages in the world. Unfortunately, spreadsheet systems were not tailored from scratch with modern programming language features that guarantee, as much as possible, program correctness. As a consequence, spreadsheets are populated with unacceptable amounts of errors. In other programming language settings, model-based approaches have been proposed to increase productivity and program effectiveness. Within spreadsheets, this approach has also been followed, namely by ClassSheets. In this paper, we propose an extension to ClassSheets to allow the specification of spreadsheets that can be viewed as relational databases. Moreover, we present a transformation from ClassSheet models to UML class diagrams enriched with OCL constraints. This brings to the spreadsheet realm the entire paraphernalia of model validation techniques that are available for UML.
Embedding Model-Driven Spreadsheet Queries in Spreadsheet Systems,
Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, Pereira Rui, and Saraiva João
, Proceedings of the 2014 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.151-154, (2014)
Abstract: Spreadsheets are widely used not only to define mathematical expressions, but also to store large and complex data. Querying such data is usually a difficult task, particularly for end users. In this work we embed the textual query language in the model-driven spreadsheet environment as a spreadsheet itself. The result is an expressive and powerful query environment that has knowledge of the business logic defined by the spreadsheet data (the spreadsheet model) to guide end users in constructing correct queries.
Type-Safe Evolution of Spreadsheets,
Cunha, Jácome, Visser Joost, Alves Tiago, and Saraiva João
, Number DI-CCTC-10-09, (2010)
Abstract: Spreadsheets are notoriously error-prone. To help avoid the introduction of errors when changing spreadsheets, models that capture the structure and inter-dependencies of spreadsheets at a conceptual level have been proposed. Thus, spreadsheet evolution can be made safe within the confines of a model. As in any other model/instance setting, evolution may not only require changes at the instance level but also at the model level. When model changes are required, the safety of instance evolution cannot be guarded by the model alone. Coupled transformations of models and instances are supported by the 2LT platform and have been applied to the transformation of algebraic datatypes, XML schemas, and relational database models. We have extended 2LT to spreadsheet evolution. We have designed an appropriate representation of spreadsheet models, including the fundamental notions of formulæ, references, and blocks of cells. For these models and their instances, we have designed coupled transformation rules that cover specific spreadsheet evolution steps, such as extraction of a block of cells into a separate sheet or insertion of columns in all occurrences of a repeated block of cells. Each model-level transformation rule is coupled with instance-level migration rules from the source to the target model and vice versa. These coupled rules can be composed to create compound transformations at the model level that induce compound transformations at the instance level. With this approach, spreadsheet evolution can be made safe, even when model changes are involved.
A Quality Model for Spreadsheets,
Cunha, Jácome, Fernandes João Paulo, Peixoto Christophe, and Saraiva João
, Proceedings of the 8th International Conference on the Quality of Information and Communications Technology, Quality in ICT Evolution Track, p.231–236, (2012)
Abstract: In this paper we present a quality model for spreadsheets, based on the ISO/IEC 9126 standard that defines a generic quality model for software. To each of the software characteristics defined in ISO/IEC 9126, we associate an equivalent spreadsheet characteristic. Then, we propose a set of spreadsheet-specific metrics to assess the quality of a spreadsheet in each of the defined characteristics. In order to obtain the normal distribution of expected values for a spreadsheet in each of the metrics that we propose, we have executed them against all spreadsheets in the large and widely used EUSES spreadsheet corpus. Then, we quantify each characteristic of our quality model after computing the values of our metrics, and we define quality scores for the different ranges of values. Finally, to automate the attribution of a quality score to a given spreadsheet, according to our quality model, we have integrated the computation of the metrics it includes in both a batch and a web-based tool.
Model-Based Programming Environments for Spreadsheets,
Cunha, Jácome, Saraiva João, and Visser Joost
, Programming Languages, Volume 7554, p.117–133, (2012)
Abstract: Although spreadsheets can be seen as a flexible programming environment, they lack some of the concepts of regular programming languages, such as structured data types. This can lead the user to edit the spreadsheet in a wrong way and perhaps cause corrupt or redundant data. We devised a method for extraction of a relational model from a spreadsheet and the subsequent embedding of the model back into the spreadsheet to create a model-based spreadsheet programming environment. The extraction algorithm is specific for spreadsheets since it considers particularities such as layout and column arrangement. The extracted model is used to generate formulas and visual elements that are then embedded in the spreadsheet helping the user to edit data in a correct way. We present preliminary experimental results from applying our approach to a sample of spreadsheets from the EUSES Spreadsheet Corpus.
Embedding, Evolution, and Validation of Spreadsheet Models in Spreadsheet Systems,
Cunha, Jácome, Fernandes João P., Mendes Jorge, and Saraiva João
, Number TR-HASLab:01:2014, (2014)
Abstract: This paper proposes and validates a model-driven software engineering technique for spreadsheets. The technique that we envision builds on the embedding of spreadsheet models under a widely used spreadsheet system, so that models and their conforming instances are developed under the same environment. In practice, this convenient environment enhances evolution steps at the model level while the corresponding instance is automatically co-evolved. Finally, we have designed and conducted an empirical study with human users in order to assess our technique in production environments. The results of this study are promising and suggest that productivity gains are realizable under our model-driven spreadsheet development setting.
Automatically Inferring ClassSheet Models from Spreadsheets,
Cunha, Jácome, Erwig Martin, and Saraiva João
, Proceedings of the 2010 IEEE Symposium on Visual Languages and Human-Centric Computing, Washington, DC, USA, p.93–100, (2010)
Abstract: Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints. However, designing such models is time consuming and requires expertise beyond the knowledge needed to work with spreadsheets. Legacy spreadsheets pose a particular challenge to the approach of controlling spreadsheet evolution through higher-level models, because the need for a model might be overshadowed by two problems: (A) the benefit of creating a spreadsheet is lacking since the legacy spreadsheet already exists, and (B) existing data must be transferred into the new model-generated spreadsheet. To address these problems and to support the model-driven spreadsheet engineering approach, we have developed a tool that can automatically infer ClassSheet models from spreadsheets. To this end, we have adapted a method to infer entity/relationship models from relational databases to the spreadsheets/ClassSheets realm. We have implemented our techniques in the HAEXCEL framework and integrated it with the ViTSL/Gencel spreadsheet generator, which allows the automatic generation of refactored spreadsheets from the inferred ClassSheet model. The resulting spreadsheet guides further changes and provably safeguards the spreadsheet against a large class of formula errors. The developed tool is a significant contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.
Refactoring meets Model-Driven Spreadsheet Evolution,
Cunha, Jácome, Fernandes João Paulo, Martins Pedro, Pereira Rui, and Saraiva João
, Proceedings of the 9th International Conference on the Quality of Information and Communications Technology, Quality in Model Driven Engineering Track, p.196–201, (2014)
Abstract: Software refactoring is a well-known technique that provides transformations on software artifacts with the aim of improving their overall quality. In this paper we present a set of refactorings for ClassSheets, a modeling language that allows the business logic of a spreadsheet to be specified in an object-oriented fashion. The set of refactorings that we propose allows us to improve the quality of these spreadsheet models. Moreover, it is implemented in a setting that guarantees that all model refactorings are automatically carried to all the corresponding (spreadsheet) instances, thus providing an automatic evolution of the data so it is always synchronized with the model.
MDSheet: A Framework for Model-driven Spreadsheet Engineering,
Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João
, Proceedings of the 34th International Conference on Software Engineering, p.1395–1398, (2012)
Abstract: In this paper, we present MDSHEET, a framework for the embedding, evolution and inference of spreadsheet models. This framework offers a model-driven software development mechanism for spreadsheet users.
Graphical Querying of Model-Driven Spreadsheets,
Cunha, Jácome, Fernandes João Paulo, Pereira Rui, and Saraiva João
, Human Interface and the Management of Information. Information and Knowledge Design and Evaluation, Volume 8521, p.419–430, (2014)
Abstract: This paper presents a graphical interface to query model-driven spreadsheets, based on experience with previous work and empirical studies in querying systems, to simplify query construction for typical end-users with little to no knowledge of SQL. We briefly show our previous text-based model-driven querying system. Afterwards, we detail our graphical model-driven querying interface, explaining each part of the interface and showing an example. To validate our work, we executed an empirical study, comparing our graphical querying approach to an alternative querying tool, which produced positive results.
Towards an Evaluation of Bidirectional Model-driven Spreadsheets,
Cunha, Jácome, Fernandes João Paulo, Mendes Jorge, and Saraiva João
, User evaluation for Software Engineering Researchers, p.25–28, (2012)
Abstract: Spreadsheets are widely recognized as popular programming systems with a huge number of spreadsheets being created every day. Also, spreadsheets are often used in the decision processes of profit-oriented companies. While this illustrates their practical importance, studies have shown that up to 90% of real-world spreadsheets contain errors. In order to improve the productivity of spreadsheet end-users, the software engineering community has proposed to employ model-driven approaches to spreadsheet development. In this paper we describe the evaluation of a bidirectional model-driven spreadsheet environment. In this environment, models and data instances are kept in conformity, even after an update on any of these artifacts. We describe the issues of an empirical study we plan to conduct, based on our previous experience with end-user studies. Our goal is to assess if this model-driven spreadsheet development framework does in fact contribute to improving the productivity of spreadsheet users.