PTF Draft Spec 2007-10-09 discussion

jwatte's picture

Here is the draft specification for the Paged Terrain Format file format.

I look forward to discussion in this thread, where it's easy to fold in suggestions.

jwatte's picture

Btw, there's an easier URL to get at the document (assuming you're registered):

http://www.interopworld.org/members/ptf

This (symbolic) URL will be updated to refer to the latest version, as it gets revised.

dragonmage's picture

Hi,

can you summarise what is good/better about PTF versus other terrain formats?
Do you have an analysis of the strength/weaknesses of other formats you are trying to address here?

jwatte's picture

Sure, let's get started!

In general, there are a few axes that terrain formats can be considered across. Some of them are:

Text-based vs. Binary-based
Interchange vs. Runtime
Local vs. Global
Descriptive vs. Visual
Fixed vs. Extensible

PTF is designed to be a binary-based, global, visual runtime format with support for descriptive features.

The problem with text-based formats such as X3D, COLLADA or STF is that you can't page them -- you have to examine the entire file to parse its grammatical structure. If you start out with a five gigabyte database of France, you can't just load Paris, say, without examining all of it. The workaround used before was to manually break the files into multiple pages, and load an entire page at a time. Then you need additional metadata, and you can't change your mind at runtime about what paging strategy to use. PTF solves this by putting all data into a spatial index, and letting the runtime software make paging decisions over larger or smaller scales.
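To make the paging idea concrete, here is a minimal sketch of a spatial index over a binary blob. It is not the PTF layout: the grid cell size, the `build` and `load_region` functions, and the in-memory "file" are all assumptions made for illustration. The point is only that a query touches the bytes of the cells it overlaps and nothing else.

```python
import io

CELL = 1000.0  # hypothetical grid cell size in meters, not taken from the PTF spec

def build(records):
    """records: list of (x, y, payload_bytes). Returns (index, blob)."""
    blob = io.BytesIO()
    index = {}  # (cell_x, cell_y) -> list of (offset, length) into the blob
    for x, y, payload in records:
        key = (int(x // CELL), int(y // CELL))
        index.setdefault(key, []).append((blob.tell(), len(payload)))
        blob.write(payload)
    return index, blob.getvalue()

def load_region(index, blob, min_x, min_y, max_x, max_y):
    """Read only the payloads stored in cells that overlap the query box."""
    f = io.BytesIO(blob)
    payloads, bytes_read = [], 0
    for cx in range(int(min_x // CELL), int(max_x // CELL) + 1):
        for cy in range(int(min_y // CELL), int(max_y // CELL) + 1):
            for offset, length in index.get((cx, cy), []):
                f.seek(offset)
                payloads.append(f.read(length))
                bytes_read += length
    return payloads, bytes_read

# Made-up data: two records near the origin, one far away.
index, blob = build([
    (500, 500, b"paris-centre"),
    (1500, 500, b"paris-east"),
    (250000, 400000, b"marseille"),
])
payloads, bytes_read = load_region(index, blob, 0, 0, 1999, 999)
```

Because the index is separate from the data, the caller can choose a larger or smaller query box at runtime without the file having been pre-split into pages.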

The problem with interchange formats like COLLADA or OpenFlight is that they are not efficient for runtime loading -- they require processing, sometimes significantly so, before actually being used by the runtime software. Many smaller simulations solve this by loading all necessary meshes before the simulation starts (I've seen load times of 10 minutes for a single "level" in some systems). That won't work for an interoperable virtual world where users will move around at will. The design of PTF is such that any display software is able to load and display the geometry without any pre-processing.

The problem with local formats, such as Quake MAP files, and most other game formats, is that you limit the size of the area covered. A typical limitation you'll see is "supports up to 10km x 10km areas" which means that everything is localized to a 32-bit floating point space, and most physical simulation software starts seeing instability about 5 km away from the origin (assuming meters for units) because of precision. PTF uses a 64-bit coordinate system for its overall structure, but stores geometry in chunks of 32-bit data (for storage space reasons), with enough metadata to allow you to "bind" together different parts of data to avoid rasterization "sparkles."
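The trade-off between 64-bit structure and 32-bit chunk data can be illustrated with a small sketch. The chunk size, the `split`/`join` helpers, and the sample coordinate are assumptions for illustration and do not come from the PTF spec. The sketch compares rounding a large world coordinate straight to 32 bits against storing a 64-bit chunk origin plus a 32-bit local offset.

```python
import struct

def to_f32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

CHUNK = 1024.0  # hypothetical chunk size in meters

def split(global_x):
    """Return (64-bit chunk origin, 32-bit offset within the chunk)."""
    origin = (global_x // CHUNK) * CHUNK   # kept at full 64-bit precision
    local = to_f32(global_x - origin)      # small value, stored in 32 bits
    return origin, local

def join(origin, local):
    """Rebuild the world coordinate in 64 bits."""
    return origin + local

x = 5_000_000.123456  # a point 5000 km from the origin, in meters

direct_err = abs(to_f32(x) - x)            # whole coordinate squeezed into 32 bits
origin, local = split(x)
chunked_err = abs(join(origin, local) - x) # 64-bit origin + 32-bit offset

print(f"direct 32-bit error:  {direct_err:.6f} m")
print(f"chunked 32-bit error: {chunked_err:.6f} m")
```

With these numbers the direct 32-bit value is off by more than a decimeter, while the chunk-relative value is accurate to well under a millimeter, even though the bulk data still costs only 32 bits per coordinate.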

Descriptive formats, such as C2DB, are developed for use by various "AI" agents (such as the various SAF systems out there). If you try to create display geometry from these formats, it doesn't look very good. The same problem often comes up with files such as SHP files, which don't contain enough information to do a good 3D rendering job.

Last, fixed formats, such as TerraPage, do not allow the user to add necessary data, such as collision query acceleration data, or mark-up needed by agents that act in the world. The PTF standard is designed as an open-ended database, that lets you store, index and page any kind of additional metadata, either as a published standard (that others can parse and understand), or as a vendor-specific extension, which other implementations can easily just ignore. In all honesty, TerraPage was the closest to what we wanted in a standard format, but the problems with that format were not just that it wasn't of an extensible design, but also that there's only a single tool that can write it well, and there is no file format specification -- just a set of C source code that can read and write a specific version of the format.
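One common way to get "extensions that other implementations can just ignore" is to give every record a type tag and a length, so a reader can skip what it doesn't understand. The sketch below assumes that approach; the header layout, the four-character tags (`MESH`, `TEXR`, `VNDX`) and the function names are invented for illustration and are not the PTF record format.

```python
import struct

# Hypothetical record header: 4-byte type tag, then 32-bit little-endian payload length.
HEADER = struct.Struct("<4sI")

def write_records(records):
    """records: list of (tag_bytes, payload_bytes). Returns one byte string."""
    out = bytearray()
    for tag, payload in records:
        out += HEADER.pack(tag, len(payload))
        out += payload
    return bytes(out)

def read_known(data, handlers):
    """Decode records whose tag has a handler; skip all others by length."""
    pos, results = 0, []
    while pos < len(data):
        tag, length = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        handler = handlers.get(tag)
        if handler is not None:
            results.append(handler(data[pos:pos + length]))
        pos += length  # unknown tags are stepped over without being parsed
    return results

data = write_records([
    (b"MESH", b"abc"),
    (b"VNDX", b"\x00\x01secret"),  # vendor-specific record this reader has never heard of
    (b"TEXR", b"xyz"),
])
decoded = read_known(data, {b"MESH": bytes.decode, b"TEXR": bytes.decode})
```

The reader never inspects the vendor payload, so adding new record types does not break older software.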

So, PTF can (and does, in implementation) store data that is traditionally auxiliary, such as textures or meshes for culturals. Thus, you can build a PTF file that is entirely self-contained. Download it, and that's all you need. To allow asset sharing between different files, PTF also allows cross-file references, and allows for data to be stored in separate files if you prefer that. However, that means you have more separate files to manage and download. Also, PTF supports splitting a database into many files, with an arbitrary number of index levels before you get to the real data. This allows you to store very large datasets, retaining the benefits of PTF for paging, extensibility, etc., while still keeping each file under some target file size. Some systems have problems with files over 2 GB, for example.

There are other design choices made to make it easy to process PTF files without having to know all the structure of the file. For example, all references are made using the concept of a "record reference," which can be seen as a database key. You can physically shuffle around these records in the file, as long as you update the key/index that points at these records, and the file will be logically consistent, even if you don't know what's in the records.
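The record-reference idea can be sketched as a key-to-location index sitting in front of opaque byte payloads. The names `pack`, `fetch` and `shuffle_file`, and the integer keys, are assumptions for illustration and not the PTF API. The tool that reorders the file never looks inside a record; it only rewrites the index.

```python
import random

def pack(records, order):
    """Lay out opaque records in the given key order. Returns (index, data)."""
    index, data = {}, bytearray()
    for key in order:
        payload = records[key]
        index[key] = (len(data), len(payload))  # key -> (offset, length)
        data += payload
    return index, bytes(data)

def fetch(index, data, key):
    """Resolve a record reference (key) to its payload bytes."""
    offset, length = index[key]
    return data[offset:offset + length]

def shuffle_file(index, data, seed=0):
    """Physically reorder records and rebuild the index, without parsing payloads."""
    keys = list(index)
    random.Random(seed).shuffle(keys)
    records = {key: fetch(index, data, key) for key in keys}
    return pack(records, keys)

# Made-up records; the payload contents are irrelevant to the shuffling tool.
records = {1: b"mesh-bytes", 2: b"texture-bytes", 3: b"vendor-extension"}
index, data = pack(records, [1, 2, 3])
new_index, new_data = shuffle_file(index, data, seed=1)
```

Any record that refers to another by key (say, a mesh naming texture 2) stays valid after the shuffle, because the key still resolves through the updated index.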

We've worked on this specification for over a year, gathering feedback from various people without necessarily making it obvious what it was for. We wanted to be pretty sure it would work, and it would solve the problems we were setting out to solve, before we came forward and proposed the standard. Now that we're seeing some promise out of the implementation, it's time to open it up to anyone interested for comment.

The documentation is extracted, using Doxygen, from the reference implementation header files, so we don't have as much control over the layout as we would like. The interesting overview parts of the documentation start at page 303 in the documentation. Someone just diving into it might drown in the minutiae of the reference implementation -- I suggest starting at page 303, reading to the end, and then going back and looking at the implementation for more details if needed.

dragonmage's picture

Thanks for the explanation. I have been late to reply because I was busy with a 3D campus effort. I think I support the general thrust of your intent with the PTF but would like to correct a few statements before discussing what I think is the main issue.

The premise that text formats can't be paged is not correct. As you noted yourself, they can be broken up into pages (in most cases called tiles). Breaking the terrain of a text format into tiles does not have to be done manually as you state: software tools can be used to generate the tiles. Both planet-earth and X3D Earth take that approach with X3D.

Further, I have to correct you on another point: X3D is not a text format as you state. It is a functional specification which has text and binary encodings.

I agree that X3D (I don't know enough about Collada atm) is not very good for intelligent paging of data / optimisations, even if broken into some tiling scheme. This seems to be the main issue and if PTF helps in that regard it sounds like a good approach.

I like the coordinate system approach: it is storage efficient while catering for a large continuous coordinate space.

Can indexes, with references to location of remote PTF files, be downloaded separately?

I think a good format is only half the answer: suitable protocols to query/retrieve/transmit the data are pretty necessary for modern networked applications. What protocol(s) are intended for accessing this data and how do they affect the format?

Len Bullard's picture

"Further, I have to correct you on another point: X3D is not a text format as you state. It is a functional specification which has text and binary encodings."

That is, in fact, correct. It is an ISO standard with a functional object model and multiple encodings (Classic VRML, XML, and binary).

Uptake is another issue that should be researched. According to the W3DC and evidence from products, the most widely deployed export format is VRML. How to measure actual usage is debatable.

Collada may have uptake but it may not meet the lifecycle requirements. As Tony Parisi points out, these have the rendering and behavioral poles that have to be accounted for. Consider that where runtimes reference external behavior engines in different languages (e.g., Java vs C#) and different scripting languages (say Javascript vs PHP), there are multiple failure points that have to be accounted for in procurement.

This is where the 'market picks the standard' philosophy tends to be flawed. The market is so ill-defined at this time that very few are really knowledgeable to the point of making good predictions. Where the W3DC with X3D has outpaced the market has been in creating both a community of use and a vendor community that demonstrates remarkable continuity of operations. In other words, and as simply as I can put it, when it comes to picking and fielding standards, they are better at this than most.

Today we really have a lot of different real-time 3D applications being lumped together under the 'virtual worlds' topic. This is where the nascent interop efforts are likely to fail. So before we get into deciding what the 'anti-standards' are (a buzzy but meaningless term), we should know what the products are that are being standardized. So far, PTF seems aimed toward the "World Map With Just In Time Bots and Services" market, e.g., military/industrial scenario sims.

That is likely not the biggest or most lucrative market, although it is one for whom some claim specific needs for specific standards. That need is not yet demonstrated.