

Project Map

• Summary
• Introduction
• Invasiveness
• Data
    • Attributes
    • Acquisition
• Data Sets
    • Training Set
    • Prediction Set
• Methods
    • Initial Experiments
    • Neural Network
• Results
• Bibliography
Data Sets
         Time series plots of the cumulative number of counties with records versus year of record were created for about 30 exotic forbs known to have become successful invaders in the northwestern U.S. Examination of these plots revealed that most species took about 50 years to show rapid and extensive geographic expansion. The species selected to train and validate the neural network were therefore those with an earliest record in the northwestern U.S. before 1951, on the assumption that a species with the potential to be invasive would have exhibited it within this period. Eighty-nine species in the INVADERS database fit this criterion, but complete attribute data were available for only 61 of them. Those 61 species and a set of 62 exotic plants introduced during the same period, but never declared noxious, were therefore selected as examples for the machine learning models.
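The cumulative-county curves and the pre-1951 cutoff described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the record layout `(species, county, year)`, the function names, and the sample records are all assumptions made for the example.

```python
from collections import defaultdict


def cumulative_county_counts(records):
    """Return {species: [(year, cumulative_counties), ...]}, sorted by year.

    `records` is an iterable of (species, county, year) tuples (a hypothetical
    layout). Each county is counted once per species, in the year of its
    earliest record.
    """
    first_year = {}  # (species, county) -> earliest year of record
    for species, county, year in records:
        key = (species, county)
        if key not in first_year or year < first_year[key]:
            first_year[key] = year

    # Number of newly recorded counties per species and year.
    new_per_year = defaultdict(lambda: defaultdict(int))
    for (species, _county), year in first_year.items():
        new_per_year[species][year] += 1

    # Running total of counties for each species.
    series = {}
    for species, by_year in new_per_year.items():
        total = 0
        points = []
        for year in sorted(by_year):
            total += by_year[year]
            points.append((year, total))
        series[species] = points
    return series


def earliest_record_before(series, cutoff_year=1951):
    """Species whose earliest record falls before `cutoff_year`."""
    return sorted(s for s, points in series.items() if points[0][0] < cutoff_year)


# Made-up records, for illustration only.
records = [
    ("species_a", "County 1", 1900),
    ("species_a", "County 2", 1900),
    ("species_a", "County 1", 1925),  # repeat record, county already counted
    ("species_a", "County 3", 1950),
    ("species_b", "County 1", 1960),
]
series = cumulative_county_counts(records)
candidates = earliest_record_before(series)
```

Plotting each list in `series` as year against cumulative count would give the kind of curve examined above; `candidates` holds the species that pass the cutoff.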


Exotics Introduced Before 1951
        

View the attribute data for all exotics introduced to Idaho and Montana before 1951:



Training Set (Introduced 1875-1950)
         A subset of 123 species was selected for developing the models to predict potential noxious species. Of these, 61 species introduced before 1951 have already been declared noxious in one or more of the five northwest states. The other 62 species, introduced in the 1920s or earlier, have not been declared noxious. The model development data are divided into three subsets in a 2:1:1 ratio:
    • 62 training species
    • 31 cross verification species
    • 30 testing species
         Each subset consists of approximately half noxious species and half exotics not declared noxious. The training subset is used to search for the optimal neural network model. The cross verification subset is used during the iterative training to check that the network is able to generalize (i.e., that the model does not simply operate on a case-by-case basis). The test subset checks the predictive accuracy of the model.
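One way to obtain a 62/31/30 split in which each subset is roughly half noxious is to shuffle the two classes separately and deal them out subset by subset. The sketch below assumes a random assignment, which the text does not state; the function name, the seed, and the placeholder species names are hypothetical.

```python
import random


def stratified_split(labels, sizes=(62, 31, 30), seed=0):
    """Split species into subsets that are each about half noxious.

    `labels` maps species name -> True if declared noxious. Random assignment
    is an assumption of this sketch, not a documented detail of the project.
    """
    if sum(sizes) != len(labels):
        raise ValueError("subset sizes must add up to the number of species")

    rng = random.Random(seed)
    noxious = sorted(name for name, is_noxious in labels.items() if is_noxious)
    other = sorted(name for name, is_noxious in labels.items() if not is_noxious)
    rng.shuffle(noxious)
    rng.shuffle(other)

    subsets = []
    for i, size in enumerate(sizes):
        is_last = i == len(sizes) - 1
        # Half of each subset is noxious; the last subset takes what is left.
        n_noxious = len(noxious) if is_last else size // 2
        n_other = size - n_noxious
        if n_other < 0 or n_noxious > len(noxious) or n_other > len(other):
            raise ValueError("classes too unbalanced for a half-and-half split")
        subsets.append(noxious[:n_noxious] + other[:n_other])
        noxious, other = noxious[n_noxious:], other[n_other:]
    return subsets


# Placeholder names standing in for the 61 noxious and 62 other species.
labels = {f"noxious_{i}": True for i in range(61)}
labels.update({f"other_{i}": False for i in range(62)})

training, cross_verification, testing = stratified_split(labels)
```

With 61 noxious and 62 other species, this deals 31, 15, and 15 noxious species into the training, cross verification, and testing subsets respectively, and fills the remainder of each subset with species not declared noxious.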

View training set data by species:



Prediction Set (Introduced 1951-2000)
        The rest of the species (i.e., the post-1950 introductions) formed the set on which we ran predictions. The resulting data set also included 15 species introduced after 1950 that have already been declared noxious. These 15 species served as an indicator of the performance of the neural network model.
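A simple way to read the known noxious species as a performance indicator is to compute the fraction of them that the model flags as potentially noxious. The text does not say which measure was actually used, so the sketch below is an assumption, and the species names and predictions in it are invented placeholders rather than project results.

```python
def detection_rate(predictions, known_noxious):
    """Fraction of known noxious species that the model flagged as noxious.

    `predictions` maps species name -> True if the model predicts "noxious".
    Both arguments use a hypothetical layout chosen for this sketch.
    """
    if not known_noxious:
        raise ValueError("need at least one known noxious species")
    missing = [s for s in known_noxious if s not in predictions]
    if missing:
        raise KeyError(f"no prediction for: {missing}")
    hits = sum(1 for s in known_noxious if predictions[s])
    return hits / len(known_noxious)


# Invented placeholder predictions, not results from the project.
predictions = {
    "species_a": True,
    "species_b": True,
    "species_c": False,
    "species_d": True,
    "species_e": False,  # not among the known noxious species
}
known_noxious = ["species_a", "species_b", "species_c", "species_d"]
rate = detection_rate(predictions, known_noxious)
```

In this made-up case three of the four known noxious species are flagged, so `rate` is 0.75. Applied to the prediction set, `known_noxious` would hold the 15 post-1950 species already declared noxious.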

View prediction set data by species:


