**3. Document Description Scripts**

In the previous section we discussed the scope of DDLs. In this section we introduce a new concept: the Document Description Script, or DDS for short. A DDL is an instruction set, and its instructions cannot perform anything unless they are properly structured and given proper parameters.

For most computer languages, instructions are written in a file known as source code, which is then compiled to generate a computer program (or, in some cases, interpreted rather than compiled); such source code is sometimes also called a script. A DDS shares this concept: it contains a properly structured set of instructions written in a script, which we call a document. The document is interpreted by a document viewer, which determines how to draw it on a computer screen or how to print it.

For example, Fig. 1 shows part of a DDS as used in ODF, PostScript and PDF. The fragments lack many essential elements, but the aim is only to show the nature of each approach.

In Fig. 1(a), the text "This is a text document showing a DDL with a xml approach" is to be drawn on the page. The special tag office:body indicates that the body of the document begins, the special tag office:text indicates that the enclosed stream is the text of the document, and the special tag text:p text:style-name="Standard" indicates that the enclosed content is a paragraph with the style Standard (12 pt Times Roman font, normal weight). A document usually has several paragraphs and several styles, including user-defined ones (for example, bold 14 pt Arial), and this command sequence is the way to define which parts of the text are set in a given style.
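As an illustration of how a program might read the XML approach of Fig. 1(a), the following Python sketch parses a fragment of that shape and lists each paragraph with its style name. The namespace URIs are those defined by the OpenDocument specification. The wrapping root element with its namespace declarations and the helper name `paragraphs_with_style` are assumptions added here so that the sample parses on its own; this is not how any particular document viewer works.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the OpenDocument specification.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Fragment shaped like Fig. 1(a). The root element and its namespace
# declarations are an assumption, added so the sample is well-formed.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:body>
    <office:text>
      <text:p text:style-name="Standard">
        This is a text document showing a DDL with a xml approach
      </text:p>
    </office:text>
  </office:body>
</office:document-content>"""


def paragraphs_with_style(xml_source):
    """Return (style name, text) pairs for every text:p element."""
    root = ET.fromstring(xml_source)
    style_attr = "{%s}style-name" % NS["text"]
    result = []
    for p in root.iterfind(".//text:p", NS):
        # Collapse the whitespace introduced by indentation and line breaks.
        text = " ".join("".join(p.itertext()).split())
        result.append((p.get(style_attr), text))
    return result


if __name__ == "__main__":
    for style, text in paragraphs_with_style(SAMPLE):
        print(style, "->", text)
```

The style name is only a reference; the font and size it stands for are defined elsewhere in the document, which is why the sketch reports the name rather than the font properties.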

Fig. 1(b) illustrates the command sequence that draws the text "this is a text document showing a DDL with a PostScript approach". It shows how different DDLs approach the same task in different ways, not necessarily better, just different. In this slice of code one can identify a command that positions the text at a given point on the page ("100 50 moveto" places the beginning of the text at the point (100,50)), followed by the character stream, enclosed by the special delimiters "(" and ")", and finally the instruction "show", which draws the given stream on the page. Fig. 1(c) shows the corresponding script slice for PDF; it is almost the same as the PostScript version, which is not surprising since PDF is known to be an evolution of PostScript.
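To make the similarity between Fig. 1(b) and Fig. 1(c) concrete, the following Python sketch emits both fragments for the same position and string. The function names `escape_literal`, `ps_show` and `pdf_show` are hypothetical helpers introduced for illustration. In a full PDF content stream the operators would appear inside a text object (between BT and ET) after a font has been selected; that wrapper is omitted here, as it is in Fig. 1(c).

```python
def escape_literal(text):
    """Escape backslash and parentheses for a PostScript/PDF literal string."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def ps_show(x, y, text):
    """PostScript fragment: move to (x, y), then draw the string."""
    return "%g %g moveto\n(%s)\nshow" % (x, y, escape_literal(text))


def pdf_show(x, y, text):
    """PDF fragment: Td sets the text position, Tj draws the string."""
    return "%g %g Td\n(%s) Tj" % (x, y, escape_literal(text))


if __name__ == "__main__":
    print(ps_show(100, 50, "a DDL with a PostScript approach"))
    print(pdf_show(100, 50, "a DDL with a PDF approach"))
```

Escaping matters because "(" and ")" delimit the character stream in both languages, so a parenthesis inside the text must be preceded by a backslash to be drawn rather than read as a delimiter.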

We would like to emphasize that not all DDLs use the same instruction set for document description; in most cases DDLs differ greatly. In the remainder of this chapter we therefore focus on DDLs in which character metrics are available, so that an automated system can locate and process them. The illustrative examples use the PostScript DDL because it is better documented and easier to understand. Since PostScript is considered the basis of PDF, understanding PostScript makes it easier to understand the PDF internals; proceeding the other way is more difficult.

Finally, we wish to point out that a DDL is like any other computer language: it provides an instruction set, but those instructions must be properly structured. In the next section, a discussion on this subject is carried out.

```
<office:body>
 <office:text>
  <text:p text:style-name="Standard">
   This is a text document showing a DDL with a xml
   approach
  </text:p>
 </office:text>
</office:body>
</office:document-content>
                           ( a )
100 50 moveto
(this is a text document showing a DDL with a PostScript
approach)
show
                           ( b )
100 50 Td
(This is a text document showing a DDL with a PDF approach) Tj
                           ( c )
```
Fig. 1. Example of a DDS. One can notice how a language is used to describe the structure of an electronic document. The same text was written with a) ODF, b) the PostScript language and c) PDF.

A typical approach is depicted in Fig. 2. The most important parts of the script file are the header and the body. The former follows the structured-comment conventions also used by Encapsulated PostScript (EPS) files; it contains information about the version of the standard used in the document, together with other useful data such as the number of pages and the bounding box. The latter, the body, contains the whole contents of the document organized in pages; each page can be recognized easily by the special command showpage, which marks the end of a page and tells the document interpreter that the page must be drawn. In this example the actual contents of each page are not shown; a comment is shown instead. The first lines illustrate a header; then the marker %%Page: x x begins page x, and the command showpage marks the end of the page.

```
%!PS-Adobe-2.0
%%Pages: 2
%%Creator: Txt2Ps
%%Title: A Simple Document.
%%PageOrder: Ascend
%%BoundingBox: 0 0 615 792
%%CreationDate: Fri Jul 9 17:31:33 2010
%%BeginSetup
%%PaperSize: Letter
%%EndSetup
/Times-Roman findfont
12 scalefont setfont
%%Page: 1 1
%% %% Page Contents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
showpage
.
.
.
%%Page: N N
%% %% Page Contents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
showpage
```
Fig. 2. Example of a basic DDS of PostScript.

In the examples ahead, all this structure is omitted and only the contents are shown, in order to keep the examples small and to focus on the parts of the script that are processed.

Authentication of Script Format Documents Using Watermarking Techniques 243

**3.1 Character metrics**

In the last section, the basic concepts of DDSs and their role were described; in this section we go deeper into the internals of document description scripts. Let us first introduce the concept of character metrics.

A character metric is the distance between consecutive characters; equivalently, it is the distance that "the cursor" must be advanced to place the next character. A character has two metrics, called mx and my, which are the distances along the x-axis and the y-axis at which the next character must be placed (see Fig. 3). Since languages differ in writing style, the metrics must agree with it. We can thus have vertical documents, as in Japanese, in which mx=0 and my≠0; horizontal documents, as in English, in which mx≠0 and my=0; and the seldom-used diagonal documents, found mostly in graphic design. Although this class may seem to apply only to line shapes, here we consider any text in which both mx≠0 and my≠0 hold to be a diagonal document. Fig. 4 shows examples of each type of document.

More information on character metrics can be found in (Turner, 2000).

As mentioned above, the actual contents of a page are enclosed in special tags; for text documents, the text is organized in rows. Fig. 5 shows an example of a simple row definition. Firstly, the position of the row within the page is set at (52,742) by the command

Fig. 3. The character metrics.

Fig. 4. Types of documents. a) Horizontal document, b) vertical document and c) diagonal document.
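The character metrics introduced in Section 3.1 can be illustrated with a short Python sketch that classifies a text run from its metrics (mx, my) and advances "the cursor" character by character. The function names `classify` and `cursor_positions`, the "undefined" label for mx=my=0, and the metric values in the usage example are assumptions made for illustration; they are not taken from any real font.

```python
def classify(mx, my):
    """Classify a text run by its character metrics (mx, my)."""
    if mx != 0 and my == 0:
        return "horizontal"
    if mx == 0 and my != 0:
        return "vertical"
    if mx != 0 and my != 0:
        return "diagonal"
    # Assumption: mx == my == 0 is not covered by the text above.
    return "undefined"


def cursor_positions(start, metrics):
    """Return the origin of each character given per-character (mx, my)."""
    x, y = start
    positions = []
    for mx, my in metrics:
        positions.append((x, y))
        # Advance the cursor by the metrics of the character just placed.
        x, y = x + mx, y + my
    return positions


if __name__ == "__main__":
    # Hypothetical metrics for a three-character horizontal row.
    row = [(7, 0), (5, 0), (6, 0)]
    print(classify(*row[0]))
    print(cursor_positions((52, 742), row))
```

With these made-up metrics, a row starting at (52,742) places its three characters at (52,742), (59,742) and (64,742): only the x coordinate changes, which is what makes the row horizontal.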
