
Friday, February 13, 2009

What's up with the parens in the GeoPDF CTM?

In the georegistration worked example post, we calculated a CTM that maps page coordinates in points to projected coordinates in meters:

[35.28267 0 0 35.28267 205188.64 3207094.8]
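As a quick illustration of what that matrix does, here is a minimal Python sketch that applies it to a page coordinate. It assumes the usual PDF matrix convention for [a b c d e f], namely x' = a*x + c*y + e and y' = b*x + d*y + f. The function name apply_ctm and the sample points are ours, made up for illustration, and are not part of any GeoPDF tooling.

```python
def apply_ctm(ctm, x, y):
    """Map a page coordinate (points) to a projected coordinate (meters).

    Assumes the PDF matrix convention for [a b c d e f]:
        x' = a*x + c*y + e
        y' = b*x + d*y + f
    """
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


# The CTM from the worked example.
CTM = [35.28267, 0, 0, 35.28267, 205188.64, 3207094.8]

# The page origin lands on the translation terms of the matrix.
origin = apply_ctm(CTM, 0, 0)

# A hypothetical point one inch (72 points) up and to the right of the origin.
one_inch = apply_ctm(CTM, 72, 72)

print(origin)
print(one_inch)
```

Because the b and c terms are zero here, the matrix is a pure scale and translation: every point on the page moves 35.28267 meters on the ground.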

In the PostScript and PDF files, the CTM entries look like this:

[(35.28267) (0) (0) (35.28267) (205188.64) (3207094.8)]

The numerical values are the same, but the values in the files' CTMs have parentheses around them. What's up with that?

PostScript and PDF share a similar object system that provides a rich set of types. We'll go into the details of the PDF object system in a later post. For now, we'll talk about the types in the two versions of the CTM. There are three types present: numbers, strings, and an array. Arrays are delimited with square brackets []. Strings are delimited with parens (). There are actually two types of numbers: integer and real. These types were implemented in earlier versions of Acrobat as 32-bit objects. Real numbers used a fixed-point scheme whose range was too small to hold values often used in geodesy. More recent versions of Acrobat use IEEE 754 single-precision floats, the precision of which is not sufficient to store values for geodesic calculations. Rather than try to shoehorn in a syntactic extension that used native PDF numbers directly, we stashed the values into strings and extracted the values based on whether a string appeared in a numeric context.
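To make the precision problem concrete, here is a rough Python sketch that pulls the strings out of the CTM entry, converts them to double-precision floats, and compares each value with what survives a round trip through IEEE 754 single precision. The helper names are ours and purely illustrative. The regular expression is a simplification that assumes no escapes or nested parens inside the strings, and the 32767 limit assumes the old fixed-point reals were a 16.16 layout, which the text above does not spell out.

```python
import re
import struct


def parse_ctm_strings(entry):
    """Extract the parenthesized strings from a CTM entry and convert them to floats.

    Simplification: assumes the strings contain no escapes or nested parens.
    """
    return [float(s) for s in re.findall(r"\(([^()]*)\)", entry)]


def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Assumption: approximate range of a 16.16 fixed-point real.
FIXED_POINT_LIMIT = 32767

entry = "[(35.28267) (0) (0) (35.28267) (205188.64) (3207094.8)]"
ctm = parse_ctm_strings(entry)

for value in ctm:
    single = to_float32(value)
    error = abs(value - single)
    fits = abs(value) <= FIXED_POINT_LIMIT
    print(value, single, error, fits)
```

With these numbers, the northing 3207094.8 comes back from single precision as 3207094.75, about 5 centimeters off, and neither translation term fits under the assumed fixed-point limit at all. Kept as strings, the values reach the reader with every digit intact.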

Bonus link: David Goldberg's What Every Computer Scientist Should Know about Floating-Point Arithmetic [PostScript].
