It Might Be Valid, But It's Still Wrong Paul Maskens and Andy Kramek

Issue Date: FoxTalk July 2000

This month, Paul Maskens and Andy Kramek discuss the problems of validating data entry.

Paul: Okay, Andy, let's see what you make of this one. I have a simple data entry form for inserting a name, address, and phone number into a table. What I want to do is to make sure that the data is valid, so the question is: What sort of code should I be using, and where should I put it?

Andy: Well, I'd say that it all depends on what you mean by a "Form," what you mean by a "Table," and, most importantly of all, what you mean by "Valid."

Paul: That sort of cop-out won't do! This is a very real problem: just how should we set about handling validation of user input? In the good old FoxPro 2.6 days, I'd have simply used the VALID snippet of a control and put the code right in there.

Andy: But my answer wasn't a cop-out, Paul. The issue of what sort of code, and where to place it, does depend on those three things! For example, if the "Form" is a simple data entry form running inside a VFP application directly against a local VFP table, then there's no problem: just put the code into the Valid method of the controls. However, if your "Form" is actually being displayed in a Web browser, or is accessing remote data through ODBC, then you have a whole different set of problems.

Paul: And that is precisely my point: where should the validation go then? In the Database Container?

Andy: Well, that depends on what you mean by a "Table." Again, if we're talking a pure VFP solution, then it might be possible. But it's not going to work if you're using SQL Pass-Through to access a back-end data server; you don't actually have tables in a DBC in that situation.

Paul: Hmph! We don't seem to be getting very far. Maybe we should review the possibilities and see where that takes us?

Andy: Sounds good to me. Let's start with the design (now there's a novel concept <g>).
What are you actually trying to build, and on what sort of platform is it going to run?

Paul: I don't know! Actually, on reflection, perhaps it would be more accurate to say that I can't predict how this application is going to evolve. What I can be sure of is that I can't afford to tie myself to a "pure VFP" solution. I must at least plan for the possibility that this application will have to run against a remote data source and might even require multiple user interfaces.

Andy: Good, we have a starting point, then. What you're saying is that we'll need to use at least a three-tier model, and we should probably plan for a true n-tier architecture.

Paul: I've never really been happy about this distinction. When does a three-tier model become n-tier?

Andy: I don't think it works like that. The basis of the designs is completely different. But I agree that the terminology is certainly unclear. I think that the difference is best illustrated diagrammatically. Figure 1 is a layered three-tier model.

Figure 1: Layered three-tier architecture.

Paul: Yes, that looks right. Each tier comprises a number of layers, and there are always at least three: an "upward" and a "downward" interface, and a core, which may itself be made up of a number of layers. The basic rule here is that any layer may have knowledge of only the layer immediately below itself.

Andy: That's a very important rule too! In this diagram, the middle tier's UI Interface Layer doesn't need to know anything about what exists above itself, but must know how to address the Rules Layer below. Conversely, all that the User Interface needs to know about is the Public Interface of the Middle Tier.

Paul: Which, yet again, emphasizes the importance of programming to interface, not implementation. So for the n-tier model, what's different?

Andy: In practice, nothing. What we're really doing is dividing the tiers into separate functional components, as opposed to incorporating the functionality into layers within a single tier. An n-tier diagram might look something like Figure 2.

Figure 2: N-tier architecture.

Paul: Ah, I see. As we have the three-tier model drawn, the middle tier must include both the actual business rules and the necessary functionality to communicate with the data tier. So if more than one application wants to share the same data, we'd need to duplicate that code in a different middle-tier object.

Andy: Exactly: in the n-tier model, the task of communicating with the database is no longer part of the middle tier, but is separate in its own right.

Paul: That's very clear, Andy. A picture really is worth a thousand words in this case. I agree that the n-tier model is the way we should go, though the price to pay is obviously going to be a more complex set of interfaces and an increase in messaging. But at the risk of keeping to the point, where should our validation go then?

Andy: The diagram makes it pretty clear that it really has to go in either the Application Rules Tier, the Data Tier, or both.

Paul: I can see why you're saying the Application Rules: anything above that tier really has to be UI-specific and, since we're looking for data validation, that should be independent of the UI. I don't see why you'd have validation in both places, though.

Andy: That would depend on the implementation. But in general I'd say that business rules must be implemented only in the Rules Tiers and integrity rules only in the Data Tier.

Paul: I'm not sure that I follow your distinction here. We're not back into "what is data" again, are we?

Andy: Not quite <bg>. There's a distinction between rules that are required for enforcing the integrity of data and those that are purely business related.

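The separation of tiers discussed above can be sketched in code. The column itself contains no code, so what follows is only an illustration in Python rather than Visual FoxPro, and every class and method name in it is hypothetical. The point it tries to show is that each tier knows only the public interface of the tier below it: the UI calls the rules tier, and the rules tier calls the data access tier.

```python
from __future__ import annotations

from typing import Protocol


class DataAccess(Protocol):
    """Interface the rules tier programs against (hypothetical)."""

    def save_contact(self, contact: dict) -> None: ...


class InMemoryDataAccess:
    """Stand-in data access tier; a real one might talk to a remote server."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    def save_contact(self, contact: dict) -> None:
        self.rows.append(dict(contact))


class ContactRules:
    """Application rules tier: validates, then delegates to the tier below."""

    def __init__(self, data: DataAccess) -> None:
        self._data = data

    def validate(self, contact: dict) -> list[str]:
        errors = []
        if not contact.get("name", "").strip():
            errors.append("name must not be empty")
        return errors

    def save(self, contact: dict) -> list[str]:
        errors = self.validate(contact)
        if not errors:
            self._data.save_contact(contact)
        return errors


class ConsoleUI:
    """Presentation tier: knows only the public interface of ContactRules."""

    def __init__(self, rules: ContactRules) -> None:
        self._rules = rules

    def submit(self, contact: dict) -> str:
        errors = self._rules.save(contact)
        return "saved" if not errors else "rejected: " + "; ".join(errors)


store = InMemoryDataAccess()
ui = ConsoleUI(ContactRules(store))
print(ui.submit({"name": "Paul Maskens"}))
print(ui.submit({"name": "   "}))
```

Because the UI holds no validation logic of its own, a second presentation tier (a Web page, say) could reuse `ContactRules` unchanged, and the in-memory store could be swapped for a remote one without touching either of the tiers above it.
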
Paul: So what you're saying is that a rule like this:

"An entry to the Orders table must reference a valid entry in the Customer table."

is a data integrity rule, while

"All entries in the Customer table must have a telephone number."

is merely a business rule?

Andy: Exactly. Violating the first would break the referential integrity of your database, and so it must be implemented in, and by, the database through whatever mechanism it provides for enforcing referential integrity. The second, however, doesn't really matter to the database one way or the other. If a customer record is missing the telephone number field, it won't change my ability to access the database in any way (other than by searching via telephone number, of course).

Paul: I have no problem with the database enforcing referential integrity; that seems entirely proper. But you also seem to be saying that we should not be enforcing field- and, by extension, record-level validation in the database. If that were really the case, then why would databases have field- and record-level validation rules built into them?

Andy: Well, actually they don't. Built-in field/record-level validation is a feature of the Visual FoxPro DBC, but it's not a normal part of a SQL database. Such validation is enforced either through index constraints or in triggers and stored procedures. Some back-end databases will allow for default values to be specified in the table definition, but I don't know of one that permits field-level validation like Visual FoxPro does.

Paul: Ah! So it's Visual FoxPro that's out of line here, not other databases. I suppose this is a consequence of the way in which the DBC has to be implemented in a file-based database like Visual FoxPro.

Andy: That seems a reasonable assumption, but I wouldn't really know. So where are we now in terms of your original question?

Paul: Well, I think we're agreed that I need to implement an n-tier design and that the job of maintaining the referential integrity will be left to the database. Business rules will be enforced in their own tier, which will provide a standard interface so that different presentation tiers can access them. That leaves only the issue of "assistive validation" to be addressed, then.

Andy: Assistive validation? What does that mean?

Paul: I mean validation placed in the UI with the objective of assisting the user during data input. One thing that really annoys me is the kind of interface where I enter a whole lot of data, hit the Save key, and the damn thing then sneers at me and says something like "You may not leave the telephone number field blank!" Assistive validation means trapping this sort of thing immediately when it occurs, rather than waiting for the results of the submission process.

Andy: Oh boy, Paul, you really must hate the Web interfaces then <g>. Very few of them have this sort of validation, but I can see where you're coming from, and you do have a good point.

Paul: Of course, to implement it we should be making calls into the Application Tier rather than simply adding yet more code to the UI and duplicating functionality.

Andy: Good, that sounds very reasonable. Most importantly, it also provides for situations in which different business rules may apply to the same set of data in different circumstances.

Paul: Huh? How can two sets of rules apply to the same data? Surely that's not very sensible.

Andy: On the contrary, it can easily happen. Consider the case where you have a common table in which the addresses and phone numbers of all people with whom you deal are stored. It might be entirely reasonable to say that a "supplier" must have a telephone number, but surely an employee need not have one? If you were to enforce the "must have a telephone number" rule in the database, either you'd be unable to employ someone unless they had a telephone, or you'd need a special table for storing the addresses of employees.

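The distinction drawn above between integrity rules and business rules can be illustrated with a small sketch. It is written in Python, with SQLite standing in for whatever back end the application actually uses; the table layout, the role names, and the rule functions are all assumptions made for the example. The database rejects an order that references a missing customer, while the "must have a telephone number" rule lives outside the database and applies to suppliers only.

```python
import sqlite3
from typing import Callable, List, Optional

# --- Data tier: the database itself enforces referential integrity. ---
conn = sqlite3.connect(":memory:")
# In SQLite, foreign-key enforcement has to be switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
# A customer without a telephone number is perfectly acceptable to the database.
conn.execute("INSERT INTO customer (id, name, phone) VALUES (1, 'A. N. Other', NULL)")
conn.execute("INSERT INTO orders (customer_id) VALUES (1)")  # valid reference

try:
    conn.execute("INSERT INTO orders (customer_id) VALUES (99)")  # no such customer
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# --- Rules tier: business rules, selected by the role of the record. ---
# A rule returns an error message, or None when the record satisfies it.
Rule = Callable[[dict], Optional[str]]


def require_name(record: dict) -> Optional[str]:
    return None if (record.get("name") or "").strip() else "name is required"


def require_phone(record: dict) -> Optional[str]:
    return None if (record.get("phone") or "").strip() else "telephone number is required"


# Hypothetical rule sets: same data, different rules in different circumstances.
RULES_BY_ROLE = {
    "supplier": [require_name, require_phone],
    "employee": [require_name],
}


def check(record: dict, role: str) -> List[str]:
    """Apply every rule registered for the role and collect the failures."""
    messages = (rule(record) for rule in RULES_BY_ROLE[role])
    return [message for message in messages if message is not None]


person = {"name": "A. N. Other", "phone": ""}
print(orphan_rejected)
print(check(person, "supplier"))
print(check(person, "employee"))
```

A UI wanting assistive validation could call `check` as the user leaves a field instead of repeating the rule itself, which is the arrangement Paul describes.
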
Paul: Or, more likely, the users will just put some dummy or meaningless information into the telephone number field. Which brings us back to the specific problem I originally wanted to address: how should we validate user input fields?

Andy: I suppose that there are essentially four situations with which we have to deal, and each requires a different solution.

Paul: I think I see where you're going; let me guess. First is "Choose a Value from a List."

Andy: Absolutely! This one is easy to deal with because we can use a list or combo box and simply populate it with all the valid options, and only the valid options! If necessary, we can force a default by setting the ListIndex property to the appropriate value.

Paul: I'd go further and say that a default is always necessary when using a predefined list from which an option must be chosen.

Andy: I wouldn't disagree there. The second case is where there's a list of possible values, but new ones can be added by the user at runtime. You could just use a standard combo box to cover this.

Paul: No, I don't like that idea at all. The combo box will allow you to enter data for only one field directly. I'm much more likely to use a lookup table that includes a primary key, code, and description. I could add only one field in the combo, so I'd much prefer to have a proper data entry form called when a new entry to the list is required.

Andy: Good, we're agreed on that one too. So for both of these situations, the issue of validation has to be addressed when an item is added to the list rather than in the application at runtime.

Paul: Ah, but in the second case, the addition of the item is actually going to happen at runtime.

Andy: That could be true for either case. Don't you include pick-list maintenance screens in your application?

Paul: Of course, so what we're saying is that using list-based values merely shifts the problem of validation to the point at which the item is added to the list.

Andy: Yes, in reality we come down to only two situations. Either we're dealing with "formatted" or "unformatted" input.

Paul: What do you mean by "formatted"? You don't just mean using the Format or InputMask properties, do you?

Andy: No. Although they're useful in many circumstances, they're applicable only when we're actually using Visual FoxPro directly (either as DBC properties or as properties of native controls). I mean input where we can define a value's type, range, or both.

Paul: Specifying that a value must be a "date" would qualify it as formatted input, then?

Andy: Yes, because there are standard rules for checking dates. Similarly, specifying that a value must be a number between 0 and 10 would qualify as formatted.

Paul: But now you're implying that this should go into the User Interface by setting up a Visual FoxPro text box with a date value, for example.

Andy: Not at all! The fact that, in the Visual FoxPro-based UI, we can set up a text box to accept only date values is a bonus, but it doesn't relieve us of the necessity to ensure, in the Application Rules tier, that the value that's been supplied is actually appropriate.

Paul: Ah! You're talking about ensuring that the value supplied is actually valid for the purpose for which it's been entered! For example, if we want to specify a start and an end date for a reporting period, we can use a date text box control in the UI to ensure that the user can enter only dates. However, the check that the end date is later than the start date still belongs in the Application Rules.

Andy: I'd say that you need to check both that the value is really a date and that it conforms to the business rules. The same applies to any other formatted entry.

Paul: This implies that there are two stages to validating formatted inputs. The first is to ensure that the input supplied is of the correct type. This might be enforced in the UI (assuming it supports the necessary functionality) but should still be checked in the Application Rules. The second is to ensure that the input is valid in the context of the application, which must be done in the rules tier. That seems entirely reasonable, but what about unformatted input?

Andy: That's a more difficult problem. Since we're saying that the input isn't formatted, by definition we can't know what it's supposed to contain, and therefore the only approach we can take is to ensure that it is not invalid.
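
The two stages described above for formatted input can be sketched as follows, using the reporting-period example from the dialogue. This is a minimal Python illustration, not code from the column: the function names are hypothetical, and the ISO `YYYY-MM-DD` input format is an assumption chosen only to keep the example short.

```python
from datetime import date, datetime
from typing import Dict, List


def parse_date(text: str) -> date:
    """Stage 1: is the input a date at all? Raises ValueError if it is not."""
    # Assumed input format: ISO year-month-day, e.g. "2000-07-31".
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def validate_period(start_text: str, end_text: str) -> List[str]:
    """Run both stages in the rules tier and return any error messages."""
    errors: List[str] = []
    parsed: Dict[str, date] = {}

    # Stage 1: type check, repeated here even if the UI already restricts input.
    for label, text in (("start", start_text), ("end", end_text)):
        try:
            parsed[label] = parse_date(text)
        except ValueError:
            errors.append(f"{label} date is not a valid date: {text!r}")
    if errors:
        return errors

    # Stage 2: the value must make sense in the context of the application.
    if parsed["end"] <= parsed["start"]:
        errors.append("end date must be later than start date")
    return errors


print(validate_period("2000-07-01", "2000-07-31"))
print(validate_period("2000-07-31", "2000-07-01"))
print(validate_period("2000-02-30", "next week"))
```

Keeping stage 1 in its own function means the same check can be reused for assistive validation in the UI, while stage 2 stays in the rules tier.
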
Paul: That suggests that the first stage of validation is actually the same whether we're dealing with formatted or unformatted data, then?

Andy: And so it is. The second-stage validation is what differs.

Paul: So maybe first-stage validation could be implemented as a separate layer. That would help with the assistive validation too!

Andy: Nice one! I hadn't thought of that.

Paul: But, as always, as soon as we start saying things are the same, we should be thinking in terms of abstracting the implicit functionality. At least, that's what you keep telling me!

Andy: Okay, I just hadn't thought of it in those terms. So, when we're dealing with formatted data, we can apply positive rules because we know what the data must look like. Conversely, for unformatted data, all we can apply are negative rules because all we know is what it can't look like.

Paul: Coming back to my name, address, and telephone number example (which is where we started), I can see that I'm in trouble. There's no universal format for a name, for an address, or even for a telephone number!

Andy: I'm afraid not. Of course you might be able to define some local rules. Addresses and phone numbers within the UK do conform to some basic standards, and so do those in the USA, although the standard is different, of course.

Paul: Yes, I can see how to do that all right. A simple root class, specialized for different locations and implemented at runtime using a strategy based on locale, would handle it. But the actual validation still bothers me.

Andy: It will, because what you're really dealing with in this situation is unformatted data and, as we just said, all you can do is apply negative rules.

Paul: So the best I can do is to say that the telephone number must be a character string containing at least nine and not more than 13 digits, can't be empty, and might (or might not) contain a "+" or a "-" or parentheses or periods.

Andy: If those are the rules that you want to apply, then yes.
Of course, you could include a look-up into a list of valid area codes and even apply formatting rules to the numbers if appropriate.

Paul: This is my real problem. I was hoping that I could stop my users from doing things (like entering telephone numbers that don't exist) to bypass the validation and so have a greater degree of confidence that this data would be valid. But I can see that I can't.

Andy: Sorry to disagree with you there, Paul. You can always ensure that the data is valid; what you can't do is to ensure that it is right! This is the fundamental problem with all data entry. No matter how carefully you validate your data, there's no way to detect when an input is wrong if it meets the rules. After all, your software can't possibly know that the name "Paul Maskers" should really have been entered as "Paul Maskens."

Paul: Of course. So the conclusion is that, while we can check that data is entered according to rules, and we can even specify those rules at various levels, there's just no software solution to the problem of data that is "valid, but wrong."
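
The negative rules that Paul lists for a telephone number can be written down directly, and doing so makes the conclusion concrete. The following Python sketch is only an illustration: the function name is hypothetical, and allowing spaces between digits is an added assumption that the dialogue does not mention.

```python
import string
from typing import List

# Punctuation Paul allows: "+", "-", parentheses, and periods.
# The space is an extra assumption, not something stated in the dialogue.
ALLOWED_PUNCTUATION = set("+-(). ")


def phone_problems(text: str) -> List[str]:
    """Return the negative rules the input breaks; an empty list means 'not invalid'."""
    if not text.strip():
        return ["telephone number must not be empty"]

    problems: List[str] = []
    digits = [ch for ch in text if ch in string.digits]
    others = {ch for ch in text if ch not in string.digits}

    if not 9 <= len(digits) <= 13:
        problems.append("must contain between 9 and 13 digits")
    if others - ALLOWED_PUNCTUATION:
        problems.append("contains characters other than digits, + - ( ) . and space")
    return problems


print(phone_problems("+44 (0)1234 567890"))  # passes every rule
print(phone_problems("555-1234"))            # too few digits
print(phone_problems("000-000-0000"))        # dummy number, yet it passes every rule
```

The last call shows the limit the authors describe: a made-up number satisfies every rule, so the check can establish that an entry is valid but never that it is right.
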