Thursday, March 26, 2009

Data Acquisition and Manual Preprocessing

The following describes the process used to convert raw CAD files to the format used for graph construction, visualization, and other analysis tasks.


Data acquisition begins by collecting all CAD files for the building that will be incorporated into our database. For most recent buildings, these are “.dwg” CAD drawing files, which ArcGIS reads easily. Before the files can be inserted into our PostGIS database, they must undergo some manual preprocessing to ensure homogeneity and to remove errors and any data that is not needed.

Since these CAD files contain no spatial reference information, they must first be georeferenced to a base map so that they line up with our other datasets. For this step, I use the Spatial Adjustment tools in ArcGIS to define control points that link locations on the CAD building file to the same locations on the base map. These are usually building corners or other distinctive features that let the adjustment line up the data with minimal distortion. Using a building footprint file or a rectified orthophoto as the base map ties the drawing to a coordinate system (in our case NAD 1983) and “spatially enables” our data.
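To make the control-point idea concrete, here is a minimal sketch of one common adjustment, an affine transformation fitted by least squares. It is not the ArcGIS implementation, and the function names and sample coordinates are assumptions made for illustration only.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_affine(cad_points, map_points):
    """Fit X = a*x + b*y + c and Y = d*x + e*y + f to matched control points.

    Returns the parameters as a tuple (a, b, c, d, e, f).
    """
    if len(cad_points) < 3 or len(cad_points) != len(map_points):
        raise ValueError("need at least three matched control points")
    rows = [(x, y, 1.0) for x, y in cad_points]
    # Normal equations (A^T A) p = A^T t, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in (0, 1):
        atb = [sum(r[i] * t[axis] for r, t in zip(rows, map_points))
               for i in range(3)]
        params.extend(solve_linear(ata, atb))
    return tuple(params)


def apply_affine(params, point):
    """Move one CAD coordinate into map coordinates."""
    a, b, c, d, e, f = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical control points: four building corners in CAD units and
# the matching corners picked on the base map.
cad_corners = [(0, 0), (10, 0), (10, 5), (0, 5)]
map_corners = [(100, 200), (120, 200), (120, 210), (100, 210)]
transform = fit_affine(cad_corners, map_corners)
room_corner_on_map = apply_affine(transform, (5, 2.5))
```

With more than three control points the fit is overdetermined, so small errors in picking individual corners are averaged out rather than forced into the result.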

Once the files are georeferenced, we read them in ArcGIS in their native format and convert them to shapefiles, which are much easier to handle both in ArcGIS and in our database. The conversion is done with a simple ArcGIS tool for processing CAD files of this type. It produces four shapefiles per CAD file, one for each geometry type (point, line, multipatch, and polygon). The only one we are interested in is the polygon file, which contains the rooms, stairways, and elevators that are the main components of our system.

With the polygon file in hand, we “clean” it by removing all the extra data so that it contains only what we need. The attribute table of the shapefile usually has a tag that identifies the room, stairway, and elevator polygons. In our examples of Woodward Hall and Cameron Research Institute, these were labeled “RM$”, and a simple SQL query let us select them and export them to their own separate file. This is the file we use for graph construction, visualization, and other analysis tasks for the building.
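The selection step might look roughly like the sketch below, which uses an in-memory SQLite table as a stand-in for the shapefile attribute table. Only the “RM$” tag comes from our data; the column name "layer", the table name, and the other sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical attribute rows as they might look after conversion.
# Only the "RM$" tag is taken from the real files.
sample_rows = [
    (1, "RM$", "room 101"),
    (2, "A-WALL", "wall outline"),
    (3, "RM$", "stairway"),
    (4, "A-DOOR", "door swing"),
]


def select_room_polygons(records, tag="RM$"):
    """Return only the records whose layer column matches the tag."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE polygons (fid INTEGER PRIMARY KEY, layer TEXT, note TEXT)"
        )
        conn.executemany("INSERT INTO polygons VALUES (?, ?, ?)", records)
        # The tag is passed as a parameter, so "$" needs no escaping.
        cur = conn.execute(
            "SELECT fid, layer, note FROM polygons WHERE layer = ? ORDER BY fid",
            (tag,),
        )
        return cur.fetchall()
    finally:
        conn.close()


rooms = select_room_polygons(sample_rows)
```

Here `rooms` holds the two records tagged “RM$”; everything else is left behind, which is the whole of the cleaning step for these buildings.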

Once we have the “cleaned” files, they are uploaded to a central PostgreSQL database with the PostGIS extension. This database serves as the central data server for the mobile and desktop applications, so updates propagate to any device reading from it.
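As a rough illustration of what an upload involves, the sketch below turns one room polygon into WKT and pairs it with a parameterized INSERT that calls the PostGIS function ST_GeomFromText. The table name "rooms", its columns, the SRID value, and the "%s" placeholder style (as used by psycopg2) are all assumptions, not a description of our actual schema or loader.

```python
def polygon_to_wkt(ring):
    """Format one exterior ring as WKT, closing the ring if needed."""
    if len(ring) < 3:
        raise ValueError("a polygon ring needs at least three vertices")
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # WKT rings must end where they start
    coords = ", ".join(f"{x} {y}" for x, y in points)
    return f"POLYGON(({coords}))"


# Hypothetical table and columns; geometry is built server-side by PostGIS.
INSERT_SQL = (
    "INSERT INTO rooms (building, floor, geom) "
    "VALUES (%s, %s, ST_GeomFromText(%s, %s))"
)


def build_insert(building, floor, ring, srid):
    """Return the SQL text and its parameters, ready for cursor.execute()."""
    return INSERT_SQL, (building, floor, polygon_to_wkt(ring), srid)


# The SRID is supplied by the caller; 4269 is used here only as an example.
sql, params = build_insert(
    "Woodward Hall", 1, [(0, 0), (10, 0), (10, 5), (0, 5)], 4269
)
```

Keeping the values out of the SQL string and passing them as parameters leaves quoting to the database driver, which matters for tags and names containing characters such as “$”.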
