Basic article information

Title: Possible particular abstract approach to validation.
Authors: Masar, Alojz ; Tanuska, Pavol ; Masarova, Renata et al.
Journal: Annals of DAAAM & Proceedings
Print ISSN: 1726-9679
Year of publication: 2009
Issue: January
Language: English
Publisher: DAAAM International Vienna
Abstract: Validation can be viewed from many points of view. Almost every computer program, computer system or software framework deals with validation in its own particular manner. The reason why validation is investigated determines the meaning of the words used to define it. Words such as sensibility, reasonability, correctness and accuracy are usually used to define validation, but there is no commonly accepted definition. Generally, validation is the process of ensuring that a system operates on clean, correct and useful data and that its operations are correct and useful too. Validation rules are implemented in the system to check the correctness and meaningfulness of the data input to the system, or the viability of operations. The rules may be implemented through automated facilities of a data dictionary, or by the inclusion of explicit (application program) validation logic.
Keywords: Software licensing

1. INTRODUCTION

Validation can be viewed from many points of view. Almost every computer program, computer system or software framework deals with validation in its own particular manner. The reason why validation is investigated determines the meaning of the words used to define it. Words such as sensibility, reasonability, correctness and accuracy are usually used to define validation, but there is no commonly accepted definition. Generally, validation is the process of ensuring that a system operates on clean, correct and useful data and that its operations are correct and useful too. Validation rules are implemented in the system to check the correctness and meaningfulness of the data input to the system, or the viability of operations. The rules may be implemented through automated facilities of a data dictionary, or by the inclusion of explicit (application program) validation logic.

A suitable formalism for UML models was introduced by the OMG. The Object Constraint Language (OCL) is a formal language used to describe expressions (OMG, 2006). It can be used to describe invariants, pre- and post-conditions, guards and constraints on UML models. Another attempt has been made by The Apache Software Foundation. Commons Validator is a framework that provides a configurable (typically XML) validation engine and reusable simple validation methods (ASF, 2006). The Java Community Process standardises data validation for Java in JSR-303 (JCP, 2009), whose proposed final draft has been published. A reference implementation of this standard is Hibernate Validator 4.0.0 (Hibernate, 2009).

These efforts solve the validation problem only partially and usually focus on data validation. There is no common background that would permit their orchestration. This paper deals with validation in systems based on computer programs. The following sections outline a possible approach to exploring validation in a particular but sufficiently abstract manner.

2. EXAMINATION

Validation is an automatic computer check to ensure that the data entered or the operations executed are sensible and reasonable according to the rules defined in the system and the data present in the system or reachable by it.

Conversely, validation does not check the accuracy of data or operations against rules defined outside the system or against data unreachable by the system. Validation can be designed into a system using several different approaches, e.g. user interface code, application code, or database constraints.

2.1 Data validation

First, it is necessary to say a few words about data types. This paper assumes that data has a type. In a broad sense, a data type defines a set of values and the allowable operations on those values. A sufficient overview of common types is published on Wikipedia. A more detailed description is given in (Wirth, 1985).

Basic data validation is based on rules concerned with the properties of the particular data item itself. There are two basic types of data validation.

* Range checking (in a broad sense): e.g. less than 3, in <2.5), isUpper. A special case of range checking is data type validation: e.g. isNumber, isChar, isTypeOf.

* Presence checking: e.g. obligatory (isRequired), existence of value (notNull).

Put simply, the evaluation of these rules is based only on the value of the particular data item; no other data is needed.
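
The two kinds of basic rule can be illustrated with a minimal Python sketch. The helper names (`less_than`, `in_half_open`, `is_type_of`, `not_null`, `check_all`) are hypothetical and chosen only for illustration; the one property the sketch is meant to show is that each rule receives nothing but the value being checked.

```python
from typing import Any, Callable, Iterable

# A basic rule looks only at the value itself (hypothetical helper names).
Rule = Callable[[Any], bool]


def less_than(limit: float) -> Rule:
    """Range check: the value must be strictly below `limit`."""
    return lambda value: value < limit


def in_half_open(low: float, high: float) -> Rule:
    """Range check: the value must satisfy low <= value < high."""
    return lambda value: low <= value < high


def is_type_of(expected: type) -> Rule:
    """Data type validation, treated here as a special case of range checking."""
    return lambda value: isinstance(value, expected)


def not_null(value: Any) -> bool:
    """Presence check: a value must exist."""
    return value is not None


def check_all(value: Any, rules: Iterable[Rule]) -> bool:
    """Evaluate the rules in order and stop at the first failure."""
    return all(rule(value) for rule in rules)


rules = [not_null, is_type_of(int), less_than(3)]
print(check_all(2, rules))     # True
print(check_all(None, rules))  # False, the presence check fails first
```

Ordering the presence and type checks before the range check is a design choice of this sketch: it keeps the comparison from being applied to a missing or wrongly typed value.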

Complex data validation is based on more complex rules that require another piece of data for their evaluation. Complex data validation can also be based on the result of another validation. The additional piece of data has to be obtainable at the moment of evaluation. There are two ways in which it can be obtained.

* It is derived from data already present in the system. Using the unmodified value (or values) of other data is a special case of this.

* It is obtained (and possibly derived) from outside the system. The obtained data has to be temporarily present in the system during evaluation. This sounds like the preceding point, but interaction outside the system is in fact a fundamentally different activity from processing internal data.
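
The difference between the two sources of additional data can be sketched as follows. The record fields `start` and `end` and the callable `fetch_limit` are assumptions made for this example; `fetch_limit` merely stands in for any interaction outside the system (another system, a user, a remote service).

```python
from typing import Callable, Mapping


def end_after_start(record: Mapping[str, int]) -> bool:
    """Complex rule that uses another value already present in the system."""
    return record["end"] > record["start"]


def make_external_rule(fetch_limit: Callable[[], int]) -> Callable[[int], bool]:
    """Complex rule whose additional data comes from outside the system.

    `fetch_limit` is a hypothetical stand-in for the external interaction.
    """
    def rule(value: int) -> bool:
        limit = fetch_limit()  # present in the system only during evaluation
        return value <= limit
    return rule


print(end_after_start({"start": 1, "end": 5}))  # True

within_limit = make_external_rule(lambda: 100)  # stubbed external source
print(within_limit(150))                        # False
```

Passing the external source in as a parameter keeps the outside interaction visibly separate from the processing of internal data, which is the distinction drawn in the second point above.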

2.2 Operation validation

There are two possible approaches. The first approach disallows adding or removing operations. In contrast to data, operations are a fixed part of the system. The system changes neither the set of its functions (represented by its operations) nor their nature. The validation of a particular operation is therefore reduced to detecting the operation's viability. The viability of an operation states whether the system may execute that operation at the given moment. It does not express whether the operation is runnable. The viability of an operation depends on the current set of data values at the moment of validation. Systems of this type are called static.

The second approach allows the set of operations to be modified. An operation can be added to the system or removed from it. In spite of this, the deterministic nature of operations guarantees that the function of a particular operation cannot be changed (operations are added or removed in their entirety). The viability of an operation depends on the current set of data values at the moment of validation and on the current set of operations of the system. This approach allows the presence of a particular operation to be examined. Systems of this type are called dynamic.
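
A small sketch may make the static and dynamic cases more concrete. The class `System`, the `withdraw` and `deposit` operations and the `balance` data item are hypothetical; the sketch only assumes that each operation carries a viability condition over the current data values.

```python
from typing import Callable, Dict

State = Dict[str, int]
Viability = Callable[[State], bool]


class System:
    """Toy system; `dynamic` decides whether the operation set may change."""

    def __init__(self, dynamic: bool) -> None:
        self.dynamic = dynamic
        self.data: State = {"balance": 0}
        # operation name -> viability condition on the current data values
        self.operations: Dict[str, Viability] = {
            "withdraw": lambda d: d["balance"] > 0,
        }

    def add_operation(self, name: str, viable: Viability) -> None:
        if not self.dynamic:
            raise RuntimeError("static system: the set of operations is fixed")
        self.operations[name] = viable  # added in its entirety

    def remove_operation(self, name: str) -> None:
        if not self.dynamic:
            raise RuntimeError("static system: the set of operations is fixed")
        self.operations.pop(name, None)  # removed in its entirety

    def is_viable(self, name: str) -> bool:
        viable = self.operations.get(name)
        if viable is None:
            return False  # the operation is not present in the system
        return viable(self.data)


static = System(dynamic=False)
print(static.is_viable("withdraw"))  # False, balance is 0
static.data["balance"] = 10
print(static.is_viable("withdraw"))  # True
```

In the static case `is_viable` depends on data values only; in the dynamic case the presence test in `is_viable` becomes meaningful because operations can appear and disappear.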

3. DEFINITION

3.1 Validation rule

A validation rule determines the validity condition of data (and their values) or operations in the system. The evaluation of the condition is called a check. A check is triggered by some event in the system or by an interaction of the system with something outside it (e.g. another system or a user), and determines whether the data or the operations (or both) of the system are valid.

Definition: Let $S$ be the set of all possible data and their values in the system or reachable by the system, together with all operations of the system. Let $D$ be a subset of $S$. Then a function $v: D \to \{0, 1\}$ is a primitive validation function.

The primitive validation function represents a simple condition at a particular moment. The evaluation of this function represents the check. The state of the system is determined by the data values and by the set of operations at that moment. It is evident that the state could be represented by $D$ at some moment.

Let $S'$ be a subset of $S$ consisting of those elements of $S$ that are present in the system or reachable by the system at a concrete moment. A reaction of the system to an event may change the system state and thereby change the set $S'$ after the event has been triggered. As a result, the domain (in the mathematical sense) of some primitive validation function need not be a subset of the current state, and the function becomes partially or wholly undefined for this state (i.e. for some or all elements of $S'$). To avoid this awkward situation we extend the codomain of the primitive validation function to the set $\{0, 1, \varepsilon\}$ and define the validation function for each element of the complement of $S' \cap D$ in $S'$. These elements are mapped to the value $\varepsilon$.

Definition: Let $S$ be the set of all possible data and their values in the system or reachable by the system, together with all operations of the system. Let $D$ and $S'$ be subsets of $S$. Let $v: D \to \{0, 1\}$ be a primitive validation function. Let ${}^{c}S = S' \setminus (S' \cap D)$. The function $\bar{v}: S' \to \{0, 1, \varepsilon\}$ defined by $\bar{v}(x) = v(x)$ for all $x \in S' \cap D$ and $\bar{v}(y) = \varepsilon$ for all $y \in {}^{c}S$ is the validation function in $S'$.

There is a second motivation for bringing $\varepsilon$ into play. The value $\varepsilon$ represents an exception in real systems. A well-designed system ought to deal with exceptions, and the value $\varepsilon$ simply respects this fact.
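
The extension of a primitive validation function to the three-valued validation function can be sketched in Python. Modelling the elements of S as `(identifier, value)` pairs, the enum `Check` and the function name `extend` are assumptions of this sketch and are not part of the definitions above.

```python
from enum import Enum
from typing import Callable, Hashable, Set


class Check(Enum):
    INVALID = 0
    VALID = 1
    EPSILON = "epsilon"  # the element lies outside the domain D


def extend(
    v: Callable[[Hashable], bool],
    D: Set[Hashable],
    S_prime: Set[Hashable],
) -> Callable[[Hashable], Check]:
    """Build the validation function on S' from a primitive function v on D."""
    def v_bar(x: Hashable) -> Check:
        if x not in S_prime:
            raise KeyError(f"{x!r} is not in S'")
        if x in D:  # x is in the intersection of S' and D
            return Check.VALID if v(x) else Check.INVALID
        return Check.EPSILON  # x is in the complement of that intersection in S'
    return v_bar


# Hypothetical elements: (identifier, value) pairs.
D = {("age", 17), ("age", 30)}
S_prime = {("age", 30), ("name", "Eva")}


def is_adult(element) -> bool:
    """Primitive validation function, meaningful only on D."""
    return element[1] >= 18


v_bar = extend(is_adult, D, S_prime)
print(v_bar(("age", 30)))     # Check.VALID
print(v_bar(("name", "Eva")))  # Check.EPSILON
```

Returning `Check.EPSILON` rather than raising keeps the function total on S', while a caller is still free to translate that value into an exception of its own.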

3.2 State

As mentioned earlier, the state consists of data, their values and operations. We have silently assumed that the system is "alive" and can actively deal with events coming to it or arising in it as a consequence of its own activity. This assumption permits us to deal with the dynamic behaviour of the system. The activity of the system may lead to an interaction outside the system or to a change of the system itself (or both). The only way the system can be changed is through a change of its state. However, the assumption of a live system is not necessary. The state can also be changed from outside the system. The system can use data of another system, activate functions of another system, or use operations prepared for it elsewhere (e.g. plugins). The system operates only in its own internal space. Therefore it must have some knowledge about external data or operations. There are two approaches to this fact. The stricter one considers this knowledge an integral part of the system and therefore does not allow "temporary" states; every state is simply a "regular" state. The less strict one pays no attention to the temporary states between an event and the reaction to it. Neither is better; which approach ought to be used depends on the specific conditions under which the system is examined. Either way, the state of the system consists of the sets of system data and their values, temporary data and their values, system operations and temporary operations.
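
The four components of the state named above can be written down as a simple data structure. This is only an illustrative sketch; the class name `SystemState`, its field names and the `reachable` helper are assumptions, and dictionaries and string names are used merely as one convenient representation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple


@dataclass
class SystemState:
    """State as four sets: system and temporary data, system and temporary operations."""

    system_data: Dict[str, Any] = field(default_factory=dict)
    temporary_data: Dict[str, Any] = field(default_factory=dict)
    system_operations: Set[str] = field(default_factory=set)
    temporary_operations: Set[str] = field(default_factory=set)

    def reachable(self) -> Set[Tuple[str, str]]:
        """Tagged names of everything a check may refer to at this moment."""
        data = {("data", name) for name in (*self.system_data, *self.temporary_data)}
        ops = {
            ("operation", name)
            for name in (*self.system_operations, *self.temporary_operations)
        }
        return data | ops


state = SystemState(
    system_data={"balance": 10},
    temporary_data={"exchange_rate": 1.1},  # obtained from another system
    system_operations={"withdraw"},
    temporary_operations={"export_plugin"},  # prepared elsewhere, e.g. a plugin
)
print(("data", "exchange_rate") in state.reachable())  # True
```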

3.3 Data

Data represents a piece of information in the system. Data must be identifiable and characterised by a type in order to be usable by the system. The type is a tuple T = [M, O], where M is the set of allowed values and O is the set of operations on these values. Each type must be recognizable by the system. The data is a tuple [identifier, T], where the identifier is unique in the system and recognizable by the system. Finally, the data value is a tuple [data, x], where x is an element of the set M. This notation is somewhat cumbersome, and the types used are usually well known. Therefore we sometimes use the notation identifier:T for data and identifier:T=x for its value. Where no confusion can arise, we use only identifier for data and identifier = x for its value.
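
The three tuples can be mirrored by small Python classes. This is a sketch under two assumptions that are not in the text: the set M is represented by a membership test (`contains`), because M may be infinite, and each type is given a `name` solely so that the short notation can be printed.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet


@dataclass(frozen=True)
class Type:
    """T = [M, O]."""

    name: str                        # assumption: used only for printing
    contains: Callable[[Any], bool]  # membership test standing in for M
    operations: FrozenSet[str]       # names of the operations in O


@dataclass(frozen=True)
class Data:
    """[identifier, T]; the identifier is assumed unique in the system."""

    identifier: str
    type: Type


@dataclass(frozen=True)
class DataValue:
    """[data, x], where x must be an element of M."""

    data: Data
    x: Any

    def __post_init__(self) -> None:
        if not self.data.type.contains(self.x):
            raise ValueError(f"{self.x!r} is not an allowed value of {self.data.type.name}")

    def __str__(self) -> str:
        return f"{self.data.identifier}:{self.data.type.name}={self.x}"


INT = Type("int", lambda x: isinstance(x, int), frozenset({"+", "-"}))
age = Data("age", INT)
print(DataValue(age, 42))  # age:int=42
```

Rejecting a value outside M at construction time is a choice made for this sketch; the text itself only requires that x be an element of M.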

It is not important how the system identifies the types, data or values. This assumption respects the fact that a running software program (in binary format) does not address data in the same way as a programmer does in the source form of the same program. Indeed, the programmer can create the program without this knowledge. This definition deals with primitive types, data and values from the system's point of view; that is, the system knows them and they are elementary.

4. CONCLUSION

The approach outlined here allows the validation problem to be mapped to well-known results of set theory and algebra. On the one hand, the degree of abstraction enables us to direct our attention to the principal characteristics of the system; on the other hand, it provides mechanisms to deal with details. The verbal mapping of this formalism to the (more or less) vague terms of software systems is a very important part of this approach. It enables the results to be interpreted in a more convenient form and simplifies practical use.

We have presented some fundamentals but have left many questions unanswered. Operations and events of the system have not been treated at all. Defining these terms, constructing more complicated validation functions and values, investigating the dynamics of the system and the interaction of checks and operations, and mapping the results to real problems could be tasks for further work.

5. REFERENCES

ASF (2006), Commons Validator version 1.3.0, http://commons.apache.org/validator/, Accessed: 2006-06-07

Hibernate (2009), Hibernate Validator 4.0.0 Beta1, http://www.hibernate.org/hib_docs/validator/, Accessed: 2009-05-05

JCP (2009), JSR 303: Bean Validation, http://jcp.org/en/jsr/detail?id=303, Accessed: 2006-04-12

OMG (2006), Object Constraint Language Version 2.0, http://www.omg.org/docs/formal/06-05-01.pdf, Accessed: 2006-06-07

Wirth, N. (1985), Algorithms and Data Structures, Prentice Hall, ISBN: 978-0130220059