Large Dataset Visualization
Ubiquitous, persistent, global surveillance assets are producing multi-modal geospatial intelligence datasets that overwhelm the capacity of the human analyst workforce. Homeland security and defense applications now routinely produce prodigious volumes of image and video data that are aggregated into petabyte-sized archives. Optimistically, automatic feature extraction and automatic target recognition algorithms, combined with content-based information mining database systems, can assist in automatically processing 90 to 95% of the raw data volume. The remaining 5 to 10% is still a huge volume, and better visualization tools are needed to assist the analyst workforce in rapid visual scene interrogation, information extraction, and exploitation for multiple intelligence tradecrafts.
The multi-window visual spreadsheet paradigm developed in our research is a powerful approach for organizing, interrogating, and exploiting multi-modal image datasets. Figure 1 shows an example of a multi-window, multi-modal visualization tool called the Distributed Information SpreadSheet (DISS).
The DISS is an interactive geospatial visualization and analysis tool that provides a novel gridded multi-window interface for constructing, organizing, and intercomparing gigabyte-sized datasets. It provides high-performance networked storage access directly over HTTP and FTP with progressive transmission, combined with novel error-resilient data compression schemes and network-based data caching for low latency and efficient bandwidth utilization. Each frame of the DISS offers highly interactive visual browsing tools for arbitrary-sized multi-dimensional data, including multi-window synchronized animation, roam and zoom, stereoscopic display, navigated data probing, surface rendering, flightpaths, and volume visualization. These tools have been demonstrated to be highly successful for quickly inspecting thousands of separate datasets. A unique DISS capability allows an analyst to rapidly and smoothly zoom, roam, animate, and execute functions in synchrony across a large set of 2-D or 3-D data cells. Other software systems are beginning to adopt the multi-window organization of displays with synchronized control that was pioneered in the DISS.
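The synchronized control described above can be illustrated with a minimal sketch in which one shared view state is broadcast to every cell of the grid. This is not the DISS implementation; the names `Viewport` and `SyncGrid` and their methods are hypothetical, introduced only to show the idea of roam and zoom applied in synchrony.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    """View state of one cell (hypothetical structure, not from DISS)."""
    center_x: float = 0.0
    center_y: float = 0.0
    zoom: float = 1.0


class SyncGrid:
    """A rows x cols grid of cells that all follow one shared viewport."""

    def __init__(self, rows: int, cols: int) -> None:
        self.view = Viewport()
        self.cells = [[Viewport() for _ in range(cols)] for _ in range(rows)]

    def _broadcast(self) -> None:
        # Copy the shared view state into every cell so all cells stay aligned.
        for row in self.cells:
            for cell in row:
                cell.center_x = self.view.center_x
                cell.center_y = self.view.center_y
                cell.zoom = self.view.zoom

    def roam(self, dx: float, dy: float) -> None:
        """Pan every cell by the same offset in data coordinates."""
        self.view.center_x += dx
        self.view.center_y += dy
        self._broadcast()

    def zoom_by(self, factor: float) -> None:
        """Scale the zoom of every cell by the same positive factor."""
        if factor <= 0:
            raise ValueError("zoom factor must be positive")
        self.view.zoom *= factor
        self._broadcast()


grid = SyncGrid(rows=2, cols=3)
grid.roam(10.0, -5.0)
grid.zoom_by(2.0)
```

Keeping a single source of truth for the view and pushing it to the cells is one simple way to guarantee that no cell drifts out of step; a real tool would also need to trigger redraws and handle cells with different native resolutions.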
A recent survey points out that visualization spreadsheets provide a tabular layout, operators, and dependencies between image cells that together offer two key benefits: (i) direct manipulation interfaces for convenient viewing, navigation, and interaction with data and (ii) a flexible and easy-to-learn programming environment. Remote access to data within the DISS framework benefits the user by enabling convenient sharing of results and visualizations for collaboration, and by allowing transparent access to the latest near-real-time data. Users can generate a DISS header file in which data references are a combination of local files, NFS files, remote files specified by URLs (FTP or HTTP), Grid files (GridFTP), or database files (SQL queries). Sharing these results with other users, particularly users not local to the data, is easily done by sharing the text-only DISS header file. When all data references in the header are remote, data is exchanged only between the user and the remote sensing archive, since retrieval uses the embedded URLs.
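To make the mix of reference types concrete, the following sketch classifies reference strings by their form. The DISS header syntax is not given here, so this operates on bare strings rather than a real header; the function names, the category labels, the use of the `gsiftp` scheme for GridFTP, and the rule that a database reference starts with `SELECT` are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def classify_reference(ref: str) -> str:
    """Return a coarse category for one data reference string."""
    text = ref.strip()
    # Assumption: a database-backed reference is written as a SELECT query.
    if text.upper().startswith("SELECT "):
        return "database"
    scheme = urlparse(text).scheme.lower()
    if scheme in ("http", "https", "ftp"):
        return "remote-url"
    # Assumption: GridFTP references use the gsiftp:// URL scheme.
    if scheme == "gsiftp":
        return "grid"
    # Anything else is treated as a path on a local or NFS-mounted filesystem.
    return "local-or-nfs"


def all_remote(refs: list[str]) -> bool:
    """True when every reference is URL-style, so a header that lists
    only these references could be shared with users who are not local
    to the data. Database queries are not counted as remote here."""
    return all(classify_reference(r) in ("remote-url", "grid") for r in refs)


# Hypothetical example references (example.org is a placeholder host).
refs = [
    "http://example.org/archive/scene_001.img",
    "ftp://example.org/archive/scene_002.img",
    "gsiftp://grid.example.org/archive/scene_003.img",
]
shareable = all_remote(refs)
```

A check of this kind would let a user confirm, before sending a header to a collaborator, that no reference depends on a filesystem the collaborator cannot reach.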
We are presently developing a large data viewing tool that adds new capabilities to the multi-window approach, including a quad-bundle data structure for handling extremely large imagery on small-footprint desktop computers, on-the-fly vector migration for aligning map data with high-resolution imagery, software agent-based image classification, and visual-system-based image interpolation. An example video demonstrating the visualization software for massive 2-D imagery on standard PC hardware is shown in Figure 2. Note that the video resolution has been degraded to facilitate the file download. Since large datasets cannot fit entirely in main memory, these massive images must be duplicated at multiple resolutions and broken into smaller segments to be viewed efficiently. In addition to standard roam and zoom operations, the software supports an arbitrary number of simultaneously visible embedded layers, on-the-fly colormap lookup and histogram enhancement, and projection of images onto a spherical surface with elevation maps. We are also developing a zoomable video surveillance tool that can potentially be integrated with DISS and Kolam for working with high-resolution video surveillance imagery from UAVs.
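The idea of duplicating an image at multiple resolutions and breaking it into smaller segments can be sketched as a generic tiled image pyramid. This is a common technique and is not the quad-bundle data structure itself, whose details are not described here; the 512-pixel tile size, the halving between levels, and the function names are assumptions chosen for illustration.

```python
import math


def pyramid_levels(width: int, height: int, tile: int = 512):
    """List (level, width, height, tiles_x, tiles_y) tuples, starting at
    full resolution (level 0) and halving until one tile covers the image."""
    levels = []
    level, w, h = 0, width, height
    while True:
        tiles_x = math.ceil(w / tile)
        tiles_y = math.ceil(h / tile)
        levels.append((level, w, h, tiles_x, tiles_y))
        if tiles_x == 1 and tiles_y == 1:
            break
        # Halve each dimension, rounding up so no pixels are dropped.
        w = max(1, (w + 1) // 2)
        h = max(1, (h + 1) // 2)
        level += 1
    return levels


def visible_tiles(x0: int, y0: int, view_w: int, view_h: int, tile: int = 512):
    """Return (row, col) indices of the tiles that intersect a viewport
    whose top-left corner is (x0, y0) in that level's pixel coordinates."""
    first_col, first_row = x0 // tile, y0 // tile
    last_col = (x0 + view_w - 1) // tile
    last_row = (y0 + view_h - 1) // tile
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


levels = pyramid_levels(2048, 1024)
tiles = visible_tiles(500, 0, 100, 100)
```

With this layout a viewer only needs to load the few tiles that intersect the current viewport at the level closest to the current zoom, which is what allows images far larger than main memory to be roamed and zoomed interactively.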