"ML Mondays"

"ML Mondays"

  • Docs
  • Data
  • Models
  • API
  • Help
  • Blog
Project Logo

"ML Mondays"A weekly USGS-CDI course on image analysis using machine learning


ML Mondays is an intensive USGS course in image analysis using deep learning. It is supported by the USGS Community for Data Integration, in collaboration with the USGS Coastal Hazards Program.

Deep learning is a set of methods in machine learning that use very large neural networks to automatically extract features from imagery and then classify them. This course assumes you already know a little about python, that you have heard of deep learning and machine learning, and that you have identified these tools as ones you would like to gain practical experience using together.

Each Monday in October 2020, Dr Daniel Buscombe will introduce an applied image analysis topic and demonstrate a technique using data and python code specially curated for the course. Participants will be expected to take part fully by carrying out the same analysis, either on the provided data or on their own data.

The course will be conducted entirely online, using Microsoft Teams and USGS Cloud Hosting Solutions, a cloud computing environment built upon Amazon Web Services (AWS).

Course Leader: Dr Dan Buscombe

Dan has 16 years of professional experience with scientific programming, including 8 years using machine learning methods with imagery and geophysical data, for a variety of measurement purposes in coastal and river science. He has worked extensively with USGS researchers, using imagery to make measurements of sediment transport, benthic habitats, and geomorphic change.

Dan is currently a contractor for the USGS Pacific Coastal and Marine Science Center, operating through his company, Marda Science.

Other team members

Dr Phil Wernette has experience with machine learning methods and remote sensing and will be assisting with course planning and implementation. Phil is a Mendenhall Postdoctoral Fellow at the Pacific Coastal and Marine Science Center in Santa Cruz, CA.

Dr Leslie Hsu is the CDI coordinator and will serve as course facilitator and main contact person.

Dr Jonathan Warrick is a Research Geologist at the Pacific Coastal and Marine Science Center and will also be assisting with course planning and implementation.

Machine Learning training for USGS researchers

"ML Mondays" is a course designed to be taught live online to USGS scientists and researchers during Mondays in October 2020. It is designed to teach cutting-edge deep learning techniques to scientists whose work involves image analysis, in three main areas ...

  1. Image Segmentation, 2. Classification, and 3. Object Recognition

Classify images at the pixel level ("image segmentation"), whole image level ("image recognition"), and object-in-image detection/classification ("object detection")

Who is this course for? And when is it?

ML Mondays consists of 4 live, online/virtual classes, on Oct 5, Oct 13 (a day delayed, due to the Federal Holiday Columbus Day), Oct 19, and Oct 26. Each class starts at 10 am Pacific time (12 pm Central time, 1 pm Eastern time, 7 am Hawaii) and lasts for up to 3 hours.

Each class follows on from the last. Classes 1 and 4 form a pair, as do classes 2 and 3. Participants are therefore expected to attend the whole course. Optional homework assignments will be set for participants to carry out in their own time.

If you cannot guarantee blocking out 3 hrs on those days in Oct, then you do not have time for the course.

However, all course materials, including code, data, notebooks, this website, and videos, will be made available to the entire USGS in November, after the event. Full agenda to be announced in September.

This course is designed for USGS employees and contractors across all mission areas actively engaged in one or more of the following topics:

* satellite and aerial remote sensing

* image analysis

* geospatial analysis

* machine learning and software development



and some experience with:

* the python programming language (or extensive experience in another programming language, such as R, MATLAB, or C++)

* a command line interface such as a bash shell, Windows PowerShell, a Git Bash terminal, the AWS CLI, or another terminal.

Week 1: Image recognition

The image on the left shows an example of image "recognition", which is classification of the entire image, rather than individual pixels. It answers the question, "is this thing in this image?"

We get a measure of the likelihood that the image contains each class in a set of classes. The models we use to do this need to be trained using many examples of images and their associated labels. We require a powerful machine learning model, called a deep convolutional neural network, configured to extract features that predict the desired classes. Likelihoods are obtained by normalizing the network's multinomial logits with a softmax classifier.
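
As an illustration of the idea (not the course's specific model or dataset), the sketch below builds a whole-image classifier in TensorFlow/Keras with a pretrained feature extractor and a softmax output layer. The class count and image size are placeholder values.

```python
# Minimal sketch of whole-image classification with a deep convolutional
# neural network and a softmax classifier, using TensorFlow/Keras.
# Class count and image size are placeholders, not the course data.
import tensorflow as tf

NUM_CLASSES = 4  # hypothetical number of scene classes

# Transfer learning: a pretrained feature extractor plus a new softmax head
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, pooling="avg", weights="imagenet"
)
model = tf.keras.Sequential([
    base,
    # softmax turns the class logits into normalized per-class likelihoods
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_dataset, validation_data=val_dataset, epochs=10)
# probs = model.predict(images)  # per-class likelihoods summing to 1 per image
```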

This is useful for things like:

* classification for monitoring and cataloging - finding and enumerating specific things in the landscape

* presence/absence detection - for example, the model depicted by the image to the right is set up to detect the presence or otherwise of a coastal barrier breach.

Week 2: Object recognition

The image on the right shows an example of object recognition, which is the detection and localization of objects (in this case, people on a beach). Localization means the model can draw a rectangle or "bounding box" around each object of each class in each image. It answers the question, "where is this thing in this image?"
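
A trained detector typically returns, for each image, a set of bounding boxes with class labels and confidence scores. The minimal sketch below (with made-up numbers, not the course's detector or data) shows how such output can be filtered and counted, for example to count people on a beach.

```python
# Minimal sketch of handling object-detection output: bounding boxes,
# class ids, and confidence scores. The detector itself is assumed here.
import numpy as np

def count_people(boxes, classes, scores, person_class=1, min_score=0.5):
    """Count confident detections of one class and return their boxes.

    boxes   : (N, 4) array of [ymin, xmin, ymax, xmax] pixel coordinates
    classes : (N,) array of integer class ids
    scores  : (N,) array of detection confidences in [0, 1]
    """
    keep = (classes == person_class) & (scores >= min_score)
    return int(keep.sum()), boxes[keep]

# Hypothetical detector output for one beach image
boxes = np.array([[50, 60, 120, 100], [200, 220, 260, 250], [10, 10, 30, 25]])
classes = np.array([1, 1, 2])
scores = np.array([0.92, 0.45, 0.88])

n_people, people_boxes = count_people(boxes, classes, scores)
print(n_people)  # 1 -- only the first box is a confident "person" detection
```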

This is useful for things like:

* counting people on beaches, counting animals and birds in imagery from static cameras, and similar tasks

* quantifying the proximity of detected objects to other important objects in the same scene

Week 3: Image segmentation

The images on the left show some examples of image segmentation. The vegetation in the images in the far left column is segmented to form a binary mask, where white is vegetation and black is everything else (center column). The segmented images (right column) show the original images masked in this way.

We can also estimate multiple classes at once. The models we use to do this need to be trained using lots of examples of images and their associated label images. To deal with the large intra-class variability that is typical of natural land covers/uses, we require a powerful machine learning model to carry out the segmentation. We will use another type of deep convolutional neural network, this time configured as an encoder-decoder network based on the U-Net.
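
To make the encoder-decoder idea concrete, here is a deliberately tiny U-Net-style model sketched in TensorFlow/Keras for a single-class (binary) mask. The real course models are deeper and configured differently; the input size and single "vegetation" class are placeholders.

```python
# Minimal sketch of a U-Net-style encoder-decoder for binary segmentation.
import tensorflow as tf
from tensorflow.keras import layers

def tiny_unet(input_shape=(128, 128, 3)):
    inputs = tf.keras.Input(shape=input_shape)

    # Encoder: convolutions + downsampling, keeping features for skip connections
    c1 = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck
    b = layers.Conv2D(64, 3, activation="relu", padding="same")(p2)

    # Decoder: upsampling + concatenation with encoder features (skip connections)
    u2 = layers.UpSampling2D()(b)
    u2 = layers.Concatenate()([u2, c2])
    c3 = layers.Conv2D(32, 3, activation="relu", padding="same")(u2)
    u1 = layers.UpSampling2D()(c3)
    u1 = layers.Concatenate()([u1, c1])
    c4 = layers.Conv2D(16, 3, activation="relu", padding="same")(u1)

    # One sigmoid output channel: per-pixel probability of "vegetation"
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return tf.keras.Model(inputs, outputs)

model = tiny_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(images, masks, epochs=10)  # masks are per-pixel 0/1 label images
```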

This is useful for things like:

* quantifying the spatial extent of objects and features of interest

* quantifying everything in the scene as a unique class, with no features or objects unlabelled

Week 4: Semi-supervised image classification

In the last week we will cover some more advanced and emerging techniques for image classification. Many approaches in machine learning require a measure of distance between data points (Euclidean, city-block, cosine, etc.). Here we use a "weakly supervised" deep learning framework based on distance metrics, using the concept of maximizing the distance between classes in embedding space.

The image to the right shows the position of sample images (black dots) within an embedding space from a deep learning model trained to identify land use/land cover (LULC). This approach is designed to improve upon supervised image recognition (such as in week 1) in two ways:

1. such models potentially require less data to train, and

2. they provide a per-sample metric that can be used as a goodness-of-fit measure (see the sketch below)
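
For example, once images have been mapped into an embedding space by such a model, a new image can be assigned the class of its nearest class centroid, and the distance itself reported as a goodness-of-fit. The sketch below uses made-up 2-D embeddings and hypothetical LULC class names, not the course's model or data.

```python
# Minimal sketch of classifying images by distance in an embedding space.
# Embeddings would come from a model trained with a distance-based loss;
# here they are stand-in 2-D vectors.
import numpy as np

def classify_by_centroid(embedding, class_centroids):
    """Assign the nearest class centroid (Euclidean distance).

    Returns the predicted class and the distance, which can serve as a
    per-sample goodness-of-fit measure (smaller = closer fit).
    """
    names = list(class_centroids)
    dists = np.array([np.linalg.norm(embedding - class_centroids[k]) for k in names])
    i = int(np.argmin(dists))
    return names[i], float(dists[i])

# Hypothetical LULC class centroids in a 2-D embedding space
centroids = {"water": np.array([0.0, 1.0]),
             "vegetation": np.array([1.0, 0.0]),
             "developed": np.array([-1.0, -1.0])}

label, dist = classify_by_centroid(np.array([0.9, 0.2]), centroids)
print(label, round(dist, 3))  # "vegetation", with the distance as a fit measure
```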

Summary of topics, models and datasets

A concise summary of the various datasets and models (and some of their parameters) is provided here.

This software has been approved for release by the U.S. Geological Survey (USGS). Although the software has been subjected to rigorous review, the USGS reserves the right to update the software as needed pursuant to further analysis and review. No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty. Furthermore, the software is released on condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from its authorized or unauthorized use.

"ML Mondays"
Internal links
DocsDataHelp
Community
Stack OverflowUSGS Community for Data Integration (CDI)USGS Remote Sensing Coastal Change Projectwww.danielbuscombe.com
More
BlogGitHubStar
Follow @magic_walnut
Marda Science
Copyright © 2020 Marda Science, LLC