Earth Observation Data Integration Pilot Project 5 - Developing community and crowd-sourced validation of 'Living Maps'

Author(s): Stuart E. Newson, David J. Turvey, Samuel Neal, Simon Gillings

Published: March 2016

Issue No.: 682

Publisher: British Trust for Ornithology

Pages: 382pp


Earth Observation data offer great potential for a range of terrestrial surveillance and management issues. Living Maps – land cover maps with a focus on priority semi-natural habitats – are being developed using state-of-the-art data and remote sensing analyses. This report scopes out how volunteers could be engaged in the validation of the Norfolk Living Map, and how transferable the proposed techniques are to other regions of the UK.

Abstract

Earth Observation data offer great potential for a range of terrestrial surveillance and management issues. Living Maps – land cover maps with a focus on priority semi-natural habitats – are being developed using state-of-the-art data and remote sensing analyses.

The purpose of this report is to scope out how volunteers could be engaged in the validation of the Norfolk Living Map, and how transferable proposed techniques will be to other regions of the UK.

Many Norfolk habitats are very rare and sampling is likely to require a stratified approach.

Many Norfolk Living Map habitat classes are difficult to identify; some can be validated at the desk only if additional data layers are available; others can be validated in the field only at key times when botanical features are evident. A field trial is recommended to confirm ease of identification by volunteers, and to develop and test identification and training material.

Significant artefacts and ambiguous habitat classes require clarification before the data are fit for validation by volunteers.

The validation task could be divided into a desk-based component and a field-based component. For the former, volunteers would validate selected individual parcels; for the latter, volunteers should be directed to validate parcels within grid squares, both to make the best use of travel costs and to allow adjacent squares to be joined to provide continuous validation where needed.

The desk task should include validation of a sample of parcels for superabundant habitat classes (Gardens; Urban; possibly also Hedgerows and field margins); removing these habitats from the field task will significantly reduce the number of parcels needing to be checked and make 500-m squares a viable resolution for field-based validation.

Interviews and questionnaires were completed across a broad spectrum of groups spanning charities, councils, leisure groups, recorder networks and conservation agencies.

Interviews suggest that up to 3,500 volunteers may exist in Norfolk, many preferring to self-select their local area for validation (i.e. unstructured surveying).

Analysis of existing schemes suggests a structured scheme capacity of 0.5 volunteers per 10,000 residents, rising to 1–2 volunteers per 10,000 residents for unstructured schemes. Based on the current Norfolk population (878,000) we could expect 44 volunteers for a structured scheme or 88–176 for an unstructured scheme. These figures are significantly lower than those estimated from interviews, possibly owing to differences between national and local promotion and to appeals reaching potentially different communities.
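The capacity figures above follow from simple scaling of the per-10,000 rates to the county population. A minimal sketch of that arithmetic is given below; the function name is illustrative and not taken from the report.

```python
def expected_volunteers(population: int, rate_per_10k: float) -> int:
    """Scale a volunteers-per-10,000-residents rate to a whole population."""
    return round(population / 10_000 * rate_per_10k)


NORFOLK_POPULATION = 878_000  # population figure quoted in the text

structured = expected_volunteers(NORFOLK_POPULATION, 0.5)
unstructured_low = expected_volunteers(NORFOLK_POPULATION, 1.0)
unstructured_high = expected_volunteers(NORFOLK_POPULATION, 2.0)

print(structured, unstructured_low, unstructured_high)  # prints: 44 88 176
```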

Both field- and desk-based validation show potential but will require different optimisation.

We recommend a desk-based assessment of 200–400 parcels of each habitat type, with a focus on superabundant easily identified habitats and any rare habitats that can be identified remotely with the use of additional data layers.
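To give a feel for what 200–400 parcels per habitat type buys in precision, the sketch below computes the half-width of an approximate 95% confidence interval for a misclassification rate. It assumes simple random sampling of independent parcels and the normal approximation to the binomial; this is an illustration only and is not the report's own power analysis.

```python
import math


def ci_half_width(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion.

    p = 0.5 is the worst case (widest interval); z = 1.96 gives roughly 95%.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


for n in (200, 400):
    # Worst-case precision of the estimated error rate with n parcels
    print(n, round(ci_half_width(n), 3))
```

Under these assumptions, doubling the sample from 200 to 400 parcels narrows the worst-case interval from about ±7 to about ±5 percentage points.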

Structured field-based sampling will be required to ensure coverage of rare habitats, which will also achieve coverage of many other common and widespread habitats. In terms of grid resolution, using 500-m grid squares provides the best balance of sufficient parcels to warrant the travel without too many to make a survey impractical (provided Gardens and Urban have been dealt with at the desk).
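The grouping of parcels into 500-m squares, with Gardens and Urban handled at the desk, can be sketched as follows. The parcel records, identifiers and coordinates are hypothetical, and assigning a parcel by its centroid is an assumption made for illustration; parcels that straddle a square boundary would need a rule of their own.

```python
from collections import defaultdict

SQUARE_SIZE_M = 500  # grid resolution discussed in the text
DESK_ONLY = frozenset({"Gardens", "Urban"})  # classes validated at the desk


def square_id(easting, northing, size=SQUARE_SIZE_M):
    """Return the south-west corner of the grid square containing a point."""
    return (int(easting // size) * size, int(northing // size) * size)


def group_field_parcels(parcels, excluded=DESK_ONLY):
    """Map each grid square to the parcel ids a field volunteer would check."""
    squares = defaultdict(list)
    for parcel_id, habitat, easting, northing in parcels:
        if habitat in excluded:
            continue  # dealt with in the desk-based task
        squares[square_id(easting, northing)].append(parcel_id)
    return dict(squares)


# Hypothetical parcels: (id, habitat class, centroid easting, centroid northing)
parcels = [
    ("p1", "Hedgerows and field margins", 612_340.0, 308_120.0),
    ("p2", "Gardens", 612_400.0, 308_200.0),
    ("p3", "Hedgerows and field margins", 612_499.0, 308_499.0),
    ("p4", "Hedgerows and field margins", 612_500.0, 308_000.0),
]
print(group_field_parcels(parcels))
```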

As a rule of thumb, 50 squares per habitat are needed to derive a robust error estimate; more if few parcels co-occur in a square, and more if spatial autocorrelation of error is judged to be a serious issue.
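One common way to reason about why correlated errors within a square increase the number of squares needed is the design effect from cluster sampling, 1 + (m − 1)ρ, where m is the number of parcels per square and ρ is the within-square correlation of errors. The sketch below applies that standard formula with hypothetical values of m and ρ; it is not necessarily the approach used in the report's power analysis.

```python
def design_effect(parcels_per_square: float, rho: float) -> float:
    """Kish design effect for equal-sized clusters with intra-cluster correlation rho."""
    return 1.0 + (parcels_per_square - 1.0) * rho


def effective_sample_size(n_squares: int, parcels_per_square: float, rho: float) -> float:
    """Number of independent parcels the clustered sample is 'worth'."""
    total_parcels = n_squares * parcels_per_square
    return total_parcels / design_effect(parcels_per_square, rho)


# Hypothetical scenario: 50 squares, 4 parcels of the habitat in each square
for rho in (0.0, 0.5, 1.0):
    print(rho, effective_sample_size(50, 4, rho))
```

With no correlation the 200 parcels all count; with perfect correlation each square contributes the information of a single parcel, so only more squares can improve the estimate.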

Sufficiently precise countywide estimates of error could be produced with a sample of c630 squares selected for presence of rare habitats; this would achieve coverage of common habitats but their error estimates may be biased. Stratification by habitat is achievable at the county scale with a sample of c1700 500-m squares, which is at the upper end of volunteer capacity.

The power analysis provides a useful analytical framework for optimising the sampling strategy once clarity has been gained on the ease of habitat identification, ideally based on a field pilot.

Local communities should be encouraged to undertake unstructured validation of a network of self-selected 500-m squares to produce local maps, with the aim of providing qualitative information on commission errors.

It will be important to build in procedures for collecting information that will facilitate quality control of volunteer data. This should include using multiple volunteers to validate a sample of the same habitat parcels for both the field and desk-based components, and using control sites in desk-based validation where the habitat has been validated by experts.
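The two quality-control checks described above, repeat validation by multiple volunteers and comparison against expert-validated control sites, could be summarised along the following lines. This is a minimal sketch: the function names, data layout and habitat labels are hypothetical, and a real scheme would need to decide how to treat ties and uncertain records.

```python
from collections import Counter


def majority_label(labels):
    """Return the most frequent label for a parcel, or None if empty or tied."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority; flag the parcel for expert review
    return counts[0][0]


def control_accuracy(volunteer_labels, expert_labels):
    """Share of control parcels where a volunteer agrees with the expert label."""
    shared = [pid for pid in volunteer_labels if pid in expert_labels]
    if not shared:
        return None
    correct = sum(volunteer_labels[pid] == expert_labels[pid] for pid in shared)
    return correct / len(shared)


# Hypothetical records: three volunteers label the same parcel
print(majority_label(["Reedbed", "Reedbed", "Fen"]))

# Hypothetical control sites: one volunteer's labels versus expert labels
volunteer = {"c1": "Reedbed", "c2": "Fen", "c3": "Heathland"}
expert = {"c1": "Reedbed", "c2": "Reedbed", "c3": "Heathland"}
print(control_accuracy(volunteer, expert))
```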

Transferability of habitat-based stratification is dependent on the number of habitat classes in future maps and how often they co-occur; high and low values respectively will inflate required sample sizes and challenge volunteer capacity.

A questionnaire was formulated to quantify the operational, functional and distributional aspects of technological solutions offered by six vendors.

The gap between the project requirements and open-source products is significant and would require considerable systems development to achieve a solution. The gap between the project requirements and proprietary solutions is smaller, but would still require considerable development.

There are two potential routes to providing a solution: a) approach existing vendors that provide systems capability and work with them to extend their solution to meet the needs of the project, which may include tailoring of both software and infrastructure; or b) take existing open-source software and a vendor with the capability to extend it, as well as the infrastructure to implement the solution for the project, and commission a development and maintenance contract for the project. The resultant implementation may then be moved back into the public domain.

The cost of producing a solution is estimated to be upwards of £150,000, with a running cost of at least £2,000 per year.

Only two existing vendors identify their capacity to scale the project from the Norfolk Living Map to a larger solution.

A clear message on why the project is needed, why volunteers are required and how the resulting data will be used is critical.

Local promotion is important and should use a variety of media whilst not neglecting national volunteer groups with a local presence. Promotion should focus on making people aware of the project and how to sign up and request a square.

Volunteers should be provided with training and other material to maximise uptake and the quality of data collected. Volunteers should be supported with information and materials to make the process of taking part as simple and as engaging as possible, including information on interesting features to look for when visiting a location.

Support is critical for a project involving new methods and new volunteers. It will be important to provide engaging feedback during the project to motivate and encourage further volunteer uptake.

Upon completion, all data would be stored in the Collaboration Node and all users should be notified of any issues highlighted by the validation task. The latter would be easier if users had to register before getting their (free) copy of the map data.

There is willingness in the local community to assist with the validation task but the level and type of activity may differ from what is required for a statistically robust validation exercise.

The volunteer capacity for a small-scale but intensive desk-based validation component exists and the broad methods are well defined. Finer detail concerning which habitats to focus on requires more information, which will be best provided by pilot work.

The volunteer capacity for the larger field component, ideally utilising a habitat-based stratified sample of 500-m squares, is more problematic and will require significant effort to recruit and train volunteers. Diverting observers from covering large self-selected areas will be important.

Technology is at the heart of the desk-based solution and is easily defined but costly. With an expectation of only c10 desk-based volunteers per county, development costs will look very high unless a long view can be taken on the re-use of the technology in subsequent iterations of the map and in other regions. GIS-based solutions will be cheaper to develop but more costly to implement.

It is our view that a smartphone application is the most effective way to complete the field-based validation but a desktop version with printable maps may aid engagement.

Several actions are required before validation with volunteers can begin, including resolution of artefacts, clarity of purpose and future plans, pilot fieldwork and full specifications for technology tendering.