The NLCD 2001 was created by partitioning the U.S. into mapping zones. A total of 66 mapping zones were delineated within the conterminous U.S. based on ecoregion and geographical characteristics, edge matching features and the size requirement of Landsat mosaics. Mapping zone 64 encompasses all or portions of several states, including New York, New Jersey, and Pennsylvania. Questions about NLCD mapping zone 64 can be directed to the NLCD 2001 land cover mapping team at the USGS/EROS, Sioux Falls, SD (605) 594-6151 or mrlc@usgs.gov.
(Errors of Omission/Commission)
0 Background (N/A)
2 Developed, High Intensity (65%/72.1%)
3 Developed, Medium Intensity (47%/53.8%)
4 Developed, Low Intensity (62%/55.7%)
5 Developed, Open Space (82%/86.5%)
6 Cultivated Crops (82%/76.5%)
7 Pasture/Hay (76%/86.5%)
8 Grassland/Herbaceous (76%/68.6%)
9 Deciduous Forest (86%/78.7%)
10 Evergreen Forest (80%/83.3%)
11 Mixed Forest (71%/73.8%)
12 Scrub/Shrub (71%/60.8%)
13 Palustrine Forested Wetland (62%/58.9%)
14 Palustrine Scrub/Shrub Wetland (40%/40.0%)
15 Palustrine Emergent Wetland (62%/77.9%)
16 Estuarine Forested Wetland (NA/NA)
17 Estuarine Scrub/Shrub Wetland (NA/NA)
18 Estuarine Emergent Wetland (NA/NA)
19 Unconsolidated Shore (71%/100.0%)
20 Bare Land (94%/89.6%)
21 Open Water (99%/88.1%)
22 Palustrine Aquatic Bed (17%/50.0%)
23 Estuarine Aquatic Bed (NA/NA)
24 Tundra (N/A)
25 Snow/Ice (N/A)
The validation points were both collected in the field and photo interpreted. The accuracy assessment selection methods were developed to minimize spatial autocorrelation between the training and accuracy assessment sites. The first pool of accuracy assessment sites came from field data and photo interpretation of black and white digital orthophotos and digital color infrared imagery (primarily Ikonos data). These sites were collected prior to initial mapping, at the same time as the training data. The sites were selected to capture the physical and spectral diversity of the land cover. After this first criterion was met, the accuracy assessment sites were buffered to see if they fell within 1000 meters of another accuracy assessment site of the same class or within 1000 meters of a training site of the same class. Those that fell within the 1000 meter buffer were eliminated. All sites were to be from a homogeneous 3x3 area. After an analysis of the point distribution, it became clear that there were not enough samples for every class. The remaining points were selected from the initial draft final classification and had to be from a homogeneous 3x3 area. Sampling was limited to areas where there was high resolution color infrared imagery. Classes with a proportionally larger land area had larger numbers of samples. When possible, we tried to identify 50 samples of each of the classes. Exceptions were the estuarine classes. While a small proportion of estuarine environments do exist within zone 64, the rarity of each did not allow for the collection of points. The rarity of bare land with respect to a random sample limited the number of accuracy points for that class to 21.
A total of 1521 accuracy assessment points were included, excluding urban classes. All classes have a minimum of 50 accuracy assessment points except for the classes mentioned above. These classes are limited in the study area and to some extent in the imagery that was available to sample from.
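The 1000 meter same-class buffer test described above can be sketched as follows. This is a minimal illustration, not the procedure actually run for zone 64: the function name `filter_sites`, the tuple layout, and the choice to examine candidates in listed order are all assumptions, and coordinates are assumed to be in projected meters.

```python
import math

BUFFER_M = 1000.0  # buffer distance stated in the text


def filter_sites(candidates, training, buffer_m=BUFFER_M):
    """Keep candidate sites that are not within buffer_m of a same-class site.

    Each site is a tuple (x, y, land_cover_class) in projected meters.
    Candidates are examined in listed order; this ordering is an assumption,
    since the source does not say which of two close sites was eliminated.
    """
    kept = []
    for x, y, cls in candidates:
        # Compare only against sites of the same class: previously kept
        # accuracy sites plus all training sites.
        neighbors = [s for s in kept + list(training) if s[2] == cls]
        too_close = any(
            math.hypot(x - nx, y - ny) < buffer_m for nx, ny, _ in neighbors
        )
        if not too_close:
            kept.append((x, y, cls))
    return kept


# Hypothetical sites for illustration only.
training_sites = [(0.0, 0.0, 9)]
candidate_sites = [
    (500.0, 0.0, 9),    # within 1000 m of a class 9 training site
    (500.0, 0.0, 10),   # same location, different class
    (5000.0, 0.0, 9),   # far from every class 9 site
    (5600.0, 0.0, 9),   # within 1000 m of the class 9 site kept just above
]
validation_sites = filter_sites(candidate_sites, training_sites)
print(validation_sites)
```

Only the class 10 site and the first distant class 9 site survive in this example; the other two fall inside a same-class buffer.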
Post-Processing Steps: None
Known Problems: None
Spatial Filters: None
Conceptually, the descriptive tree is a classification tree generated by using the final minimum-mapping-unit land cover product (1 acre) as training data, and Landsat and other ancillary data as predictors. The goal of the descriptive tree is to summarize the effects of boosted trees (10 sequential classification trees) into a single condensed decision tree that can be used as a diagnostic tool for the classification process. This descriptive tree can be used to assess the relative importance of each of the input data sets on each land cover class. Such information may also be useful to customize the minimum-mapping-unit classification to meet a user's specific needs through raster modeling. Descriptive trees usually capture 60 to 80% of the information from the original land cover data.
The leaf or terminal nodes of the descriptive tree are assigned sequential numbers (called node numbers) and mapped across the entire mapping zone on a pixel-by-pixel basis. These node numbers can then be matched with the conditional statements associated with each respective terminal node. This spatial layer appears similar to a cluster map, but is the result of a supervised classification, not an unsupervised clustering. This node map can potentially be used as input by users to customize NLCD land cover, by linking the spatial extent of an individual node with the rules of the conditional statement.
The Land Cover spatial classification confidence data layer is provided to users to help determine the per-pixel spatial confidence of the NLCD 2001 land cover prediction from the descriptive tree. The C5 algorithm produces an estimate (a value between 0% and 100%) that indicates the confidence of rule predictions at each node based on the training data. This spatial confidence map should be considered as only one indicator of relative reliability of the land cover classification, rather than a precise estimate. Users should be aware that this estimate is made based on only training data, and is derived from a generalized descriptive decision tree that reproduces the final land cover data. However, this layer provides valuable insight for a user to determine the risk or confidence they choose to place in each pixel of land cover.
A logic statement from a descriptive tree classification describes each classification rule for each classified pixel. An example of the logic statement follows:
IF tasseled-cap wetness > 140 and imperviousness = 0 and canopy density < 4, then classify as Water
This logic file can be used in combination with the spatial node map to identify classification logic and to allow modification of the classification based on the user's knowledge and/or additional data sets.
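As an illustration of how a logic statement might be paired with a node number, the example rule above can be written as a predicate and looked up per pixel. This is a sketch only: the node number 1, the dictionary keys (`tc_wetness`, `imperviousness`, `canopy_density`), and the `NODE_RULES` table are hypothetical and do not reflect the format of the delivered logic file.

```python
def water_rule(pixel):
    """Example rule from the text: wetness > 140, imperviousness = 0, canopy < 4."""
    return (
        pixel["tc_wetness"] > 140
        and pixel["imperviousness"] == 0
        and pixel["canopy_density"] < 4
    )


# Hypothetical node table: node number -> (rule predicate, land cover label).
NODE_RULES = {1: (water_rule, "Water")}


def classify(pixel, node_rules=NODE_RULES):
    """Return (node number, label) for the first matching rule, else (None, None)."""
    for node, (rule, label) in sorted(node_rules.items()):
        if rule(pixel):
            return node, label
    return None, None


# Hypothetical pixel values for illustration only.
wet_pixel = {"tc_wetness": 150, "imperviousness": 0, "canopy_density": 2}
paved_pixel = {"tc_wetness": 150, "imperviousness": 35, "canopy_density": 2}
print(classify(wet_pixel))
print(classify(paved_pixel))
```

A user wishing to modify the classification could, under these assumptions, replace the label attached to a node or add a further condition to its predicate and reapply it to the pixels mapped to that node.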
Additional information may be found at <http://www.mrlc.gov/mrlc2k_nlcd.asp>.
Summary: This section outlines the classification procedure for the zone 64 C-CAP Late Date map. The map is an update of an existing 2001 C-CAP map, which was based on a previous methodology and classification. The previous map covered approximately 45% of the mapping area. The three dates of imagery were first reviewed for image quality and shifts between image dates. Training points were used as the dependent variable in a CART (Classification and Regression Tree) approach. Derivative data layers (NDVI, Tasseled Cap) were calculated from the TM data and were used as additional independent variables in the analysis. There were 3 major versions of the map prior to a draft map: 1) the provisional map, 2) automated (incl. draft and final) and 3) final edited. The provisional map was developed prior to field work to give an idea of potential areas to sample. This rough map represented the first output from the CART classification routine. Ancillary data (DEMs, etc.) and spectral data were used in this and all other CART models. The final automated map had additional points from the field and photo interpretation added. Points were continually added until the best possible model was developed. This represented a fully automated product. This product was then altered by hand edits to refine the classification. The existing C-CAP product was incorporated into the automated map to improve the overall quality and maintain consistency. In addition, a percent impervious data layer developed from TM data using high resolution imagery was embedded into the classification to define the developed classes. This produced the final-with-edits version, which is the final version of the classification and is the one described here.
Pre-processing steps: Each Landsat TM scene was geo-referenced by USGS (United States Geological Survey)/EROS. The Sanborn staff reviewed the spectral and spatial quality of the imagery. Areas that were more than 1-2 pixels off were sent back to USGS for reprocessing. The data were geo-referenced to Albers Conical Equal Area, with a spheroid of GRS 1980 and a datum of NAD83. The data units are meters. The zone 64 TM data were delivered in the form of USGS zone mosaics. The data included three dates of TM: leaf-on, leaf-off, and spring. For each date of TM, spectral and tasseled cap data were received.
Acquisition dates of Landsat ETM+ (TM) scenes used for land cover classification in zone 64 are as follows:
SPRING-
Index 1 for Path 14/Row 29 on 05/07/01 = Scene_ID 7014029000112750
Index 1 for Path 14/Row 30 on 05/07/01 = Scene_ID 7014030000112750
Index 1 for Path 14/Row 31 on 05/07/01 = Scene_ID 7014031000112750
Index 2 for Path 14/Row 32 on 04/24/02 = Scene_ID 7014032000211450
Index 3 for Path 15/Row 29 on 04/28/01 = Scene_ID 7015029000111850
Index 3 for Path 15/Row 30 on 04/28/01 = Scene_ID 7015030000111850
Index 3 for Path 15/Row 31 on 04/28/01 = Scene_ID 7015031000111850
Index 3 for Path 15/Row 32 on 04/28/01 = Scene_ID 7015032000111850
LEAF-ON (Summer)-
Index 1 for Path 14/Row 30 on 06/08/01 = Scene_ID 7014030000115950
Index 1 for Path 14/Row 31 on 06/08/01 = Scene_ID 7014031000115950
Index 2 for Path 14/Row 32 on 07/05/02 = Scene_ID 5014032000218610
Index 3 for Path 15/Row 29 on 05/25/02 = Scene_ID 5015029000214510
Index 3 for Path 15/Row 30 on 05/25/02 = Scene_ID 5015030000214510
Index 3 for Path 15/Row 31 on 05/25/02 = Scene_ID 5015031000214510
LEAF-OFF (Fall)-
Index 1 for Path 14/Row 29 on 10/30/01 = Scene_ID 7014029000130350
Index 1 for Path 14/Row 30 on 10/30/01 = Scene_ID 7014030000130350
Index 1 for Path 14/Row 31 on 10/30/01 = Scene_ID 7014031000130350
Index 6 for Path 14/Row 31 on 09/23/99 = Scene_ID 7014031009926650
Index 2 for Path 14/Row 32 on 09/23/99 = Scene_ID 7014032009926650
Index 3 for Path 15/Row 29 on 10/02/00 = Scene_ID 7015029000027650
Index 5 for Path 15/Row 30 on 10/16/99 = Scene_ID 7015030009928950
Index 4 for Path 15/Row 31 on 10/05/01 = Scene_ID 7015031000127850
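The Scene_ID values listed above appear to encode the sensor, path, row, two-digit year, and day of year, and each listed acquisition date agrees with the year and day-of-year digits in its ID. The sketch below decodes an ID on that basis. The field positions are inferred from the listed IDs rather than taken from a format specification, the two-digit year pivot is an assumption, and the function name `parse_scene_id` is hypothetical.

```python
from datetime import date, timedelta


def parse_scene_id(scene_id):
    """Decode a 16-digit Scene_ID as listed in this record.

    Assumed layout (inferred from the listed IDs, not from a specification):
      [0]      sensor digit (7 or 5 in the listed scenes)
      [1:4]    path
      [4:7]    row
      [7:9]    not interpreted here
      [9:11]   two-digit year
      [11:14]  day of year
      [14:16]  not interpreted here
    """
    if len(scene_id) != 16 or not scene_id.isdigit():
        raise ValueError("expected a 16-digit Scene_ID")
    yy = int(scene_id[9:11])
    # Pivot for two-digit years is an assumption (70-99 -> 1900s, else 2000s).
    year = 1900 + yy if yy >= 70 else 2000 + yy
    day_of_year = int(scene_id[11:14])
    acquired = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    return {
        "sensor": int(scene_id[0]),
        "path": int(scene_id[1:4]),
        "row": int(scene_id[4:7]),
        "date": acquired,
    }


spring_scene = parse_scene_id("7014029000112750")
fall_scene = parse_scene_id("7014031009926650")
print(spring_scene)
print(fall_scene)
```

Decoding the IDs this way offers a simple consistency check between the Path/Row and date columns and the Scene_ID column.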
Field-Collected Data- The goals of the field data collection were to sample the diversity of the landscape within the classes and among image dates. Classes that would be more difficult to collect from air photos were targeted for field data collection. To meet these goals, Sanborn stratified the image into spectral clusters and located the field sites throughout the study area based on these. In addition to these pre-arranged sites, Sanborn collected points while driving between locations. Due to limited time and accessibility, not all polygons were assessed in the field. Those that were not visited on the ground were labeled using digital orthophotographs or Ikonos data where available. Training and validation points were collected together. See the accuracy assessment section for how the points were split into training and validation points. Sanborn used laptop computers and GPS (Global Positioning System) to correctly locate field points on the TM imagery. Software downloaded from the Minnesota Department of Natural Resources (DNR) was used to connect the Garmin GPS to the laptop computer and ESRI's ArcView software. Sanborn's programmer developed an ArcMap application that allowed entry of location and field notes with a click of the mouse. These data were stored in a shapefile. Items that were collected were: land cover characterization, special conditions and remarks, photograph number, date/time, and location.
The data and equipment used for the fieldwork are as follows:
Ancillary datasets:
TIGER 2000
NLCD - mosaicked into zones
State road map and Delorme state atlas www.delorme.com
Existing 2001 C-CAP dataset
Hardware:
Laptop computers with ArcView/ArcGIS and data
GARMIN GPS modules and external antennae, redundant data cables
Cameras
Devices (Floppy Drives, CD Burners, external HDD)
Extra batteries (lap-top and GPS)
Mobile phones
System backup CD's with data and software
Compass
Notebooks with instructions and road maps with pre-determined routes
Wetland and Vegetation Field Guides
Imagery:
Image data for each zone
Initial classifications
Classification:
After the field points for training were collected, they were combined with photo-interpreted points and used as the dependent variable in a CART classification approach. Many layers were tested as independent variables. They included three dates of spectral and tasseled cap imagery, DEM, slope, aspect, texture, band indices (NDVI, Moisture, NDVI-Green), and shape indices (fractal dimension, compactness, convexity, and form). Statistical analyses and visual inspection of the output were used to eliminate data that were redundant or not useful in the classification. Additional training points were added to help reduce some of the confusion between classes. The rough classification was created at the end of this process using only the CART discrete decision-tree software. A provisional classification was produced by applying spatial models using ancillary data to the rough classification. The final automated map was then hand edited using high resolution imagery as reference data. Independently of this process, impervious data layers for zone 64 were created. These were developed using a regression tree that used impervious classifications from IKONOS imagery to predict percent impervious at the TM pixel level. The continuous percent impervious data were thresholded to produce the developed categories and embedded into the final map.
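The thresholding of continuous percent impervious data into developed categories can be sketched as below. The break values (20, 50, and 80 percent), the lower bound of 1 percent for Open Space, and the treatment of 0 percent as not developed are assumptions chosen for illustration; this record does not state the thresholds that were applied. The class codes are those listed in the attribute table for this product.

```python
# (minimum percent impervious, class code), checked from highest to lowest.
# Break values are illustrative assumptions, not taken from this record.
DEVELOPED_BREAKS = [
    (80, 2),  # 2 High Intensity Developed
    (50, 3),  # 3 Medium Intensity Developed
    (20, 4),  # 4 Low Intensity Developed
    (1, 5),   # 5 Open Spaces Developed
]


def developed_class(percent_impervious):
    """Return the developed class code for a pixel, or None if not developed."""
    if not 0 <= percent_impervious <= 100:
        raise ValueError("percent impervious must be between 0 and 100")
    for minimum, code in DEVELOPED_BREAKS:
        if percent_impervious >= minimum:
            return code
    return None  # 0 percent: left to the non-developed classification


# Hypothetical pixel values for illustration only.
sample_percents = [95, 65, 20, 5, 0]
sample_classes = [developed_class(p) for p in sample_percents]
print(sample_classes)
```

Under these assumed breaks, pixels returning None would retain the land cover class assigned by the CART classification and hand edits.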
Attributes for this product are as follows:
0 Background
1 Unclassified (Cloud, Shadow, etc)
2 High Intensity Developed
3 Medium Intensity Developed
4 Low Intensity Developed
5 Open Spaces Developed
6 Cultivated Land
7 Pasture/Hay
8 Grassland
9 Deciduous Forest
10 Evergreen Forest
11 Mixed Forest
12 Scrub/Shrub
13 Palustrine Forested Wetland
14 Palustrine Scrub/Shrub Wetland
15 Palustrine Emergent Wetland
16 Estuarine Forested Wetland
17 Estuarine Scrub/Shrub Wetland
18 Estuarine Emergent Wetland
19 Unconsolidated Shore
20 Bare Land
21 Water
22 Palustrine Aquatic Bed
23 Estuarine Aquatic Bed
24 Tundra (N/A)
25 Snow/Ice (N/A)
Ancillary Datasets: Non-TM image datasets used are DEM (Digital Elevation Model), slope, aspect, positional index, NWI, NLCD, TIGER2000, field-collected points, photo-interpreted points, zone 64 GAP (Gap Analysis Program) reclassified by NOAA, Census data (housing and population density), ecoregions, and the 2001 C-CAP map.
QA/QC Process: There were several QA/QC steps involved in the creation of this product. First, there was an internal QA/QC. This was done by viewing the classification frame-by-frame along with the TM imagery and high resolution reference imagery. NOAA staff completed a similar review and provided both general and point comments.
The completed single pixel product was then generalized to a 1 acre (approximately a 5-pixel patch of 30 m ETM+ pixels) minimum mapping unit product using a "smart eliminate" algorithm. This aggregation program subsumes pixels from the single pixel level to a 5-pixel patch using a queen's (8-neighbor connectivity) algorithm at doubling intervals. The algorithm consults a weighting matrix to guide merging of cover types by similarity, resulting in a product that preserves land cover logic as much as possible.
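Five 30 m pixels cover 4,500 square meters, or about 1.11 acres, which is why a 5-pixel patch approximates the 1 acre minimum mapping unit. The sketch below shows only the first step such a generalization would need: finding same-class patches, joined under queen's (8-neighbor) connectivity, that fall below 5 pixels. It does not implement the "smart eliminate" merging, the doubling intervals, or the weighting matrix, and the function name `small_patches` is hypothetical.

```python
from collections import deque

# Queen's case: the eight surrounding cells, including diagonals.
QUEEN_OFFSETS = [
    (dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)
]


def small_patches(grid, min_pixels=5):
    """Return (class, pixel list) for each 8-connected patch below min_pixels."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    result = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            cls = grid[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            patch = []
            # Breadth-first search over same-class queen's-case neighbors.
            while queue:
                pr, pc = queue.popleft()
                patch.append((pr, pc))
                for dr, dc in QUEEN_OFFSETS:
                    nr, nc = pr + dr, pc + dc
                    if (
                        0 <= nr < rows
                        and 0 <= nc < cols
                        and not seen[nr][nc]
                        and grid[nr][nc] == cls
                    ):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(patch) < min_pixels:
                result.append((cls, patch))
    return result


# Hypothetical 4x4 raster: two diagonal Water (21) pixels inside Deciduous Forest (9).
grid = [
    [9, 9, 9, 9],
    [9, 21, 9, 9],
    [9, 9, 21, 9],
    [9, 9, 9, 9],
]
undersized = small_patches(grid)
print(undersized)
```

Because diagonal neighbors count under the queen's case, the two Water pixels form a single 2-pixel patch rather than two isolated pixels; in the real product such a patch would be a candidate for merging into a similar neighboring class.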