Working with the National Oceanic and Atmospheric Administration (NOAA), Jaxon created a generative AI model to map the seafloor habitats of the U.S. Caribbean. NOAA has no shortage of data, but manual annotation was so time-consuming that it bottlenecked the project. Jaxon cut annotation time from minutes to milliseconds per image, magnifying the experts’ contributions while minimizing the time they had to invest. Jaxon incorporated domain knowledge at every step, producing a highly accurate automation environment and saving the NOAA team months of manual effort.
The Problem: Characterizing & Mapping Seafloor Habitats
NOAA has over 10 TB of video data from remotely operated vehicles (ROVs). NOAA’s National Centers for Coastal Ocean Science (NCCOS) Habitat Mapping Team analyzes this data to characterize benthic habitats, including the distribution of biological cover (coral, seagrass, sponge) and geological substrate (sand, rock). Documenting these communities and their health informs conservation, management, and restoration efforts.
At the time, the biggest barriers to annotation of the ROV datasets were:
- NOAA data doesn’t align with off-the-shelf AI models.
- Manual annotation is especially time-consuming for this use case.
- Non-experts are unable to annotate the data at the required accuracy.
- Experts could contribute only a limited number of hours to the project.
The Solution: Jaxon Designed a Custom AI Training Pipeline
1. Enhance Images to Clarify Object and Scene Detection: The images had been extracted from murky video footage, so many were blurry or obscured, which decreased model accuracy. We therefore enhanced the images first. The enhancements were shown to improve the model’s capability and accuracy in object characterization (below). Experts then annotated fewer than 100 examples.
Native Underwater Image
Image Enhanced with Jaxon’s Modified Algorithm
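As a rough illustration of what an enhancement step can look like, the sketch below applies two generic techniques: gray-world white balancing followed by a percentile contrast stretch. This is not Jaxon’s modified algorithm, which the case study does not describe. The function names, the percentile cut-offs, and the synthetic "murky" frame are all assumptions made for the example.

```python
import numpy as np


def gray_world_balance(img: np.ndarray) -> np.ndarray:
    """Scale each color channel so its mean matches the mean over all channels.

    Generic gray-world white balance (an assumption for this sketch,
    not the algorithm used in the project).
    """
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    target = channel_means.mean()
    gains = target / np.maximum(channel_means, 1e-6)  # avoid division by zero
    return np.clip(img * gains, 0.0, 255.0)


def stretch_contrast(img: np.ndarray, low_pct: float = 1.0,
                     high_pct: float = 99.0) -> np.ndarray:
    """Linearly map the [low_pct, high_pct] percentile range onto [0, 255]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return img.copy()
    return np.clip((img - lo) / (hi - lo) * 255.0, 0.0, 255.0)


def enhance(img: np.ndarray) -> np.ndarray:
    """White-balance, then stretch contrast; returns a uint8 RGB image."""
    return stretch_contrast(gray_world_balance(img)).astype(np.uint8)


# Synthetic stand-in for a murky frame: low contrast, red channel attenuated.
rng = np.random.default_rng(0)
murky = rng.integers(60, 120, size=(64, 64, 3)).astype(np.float64)
murky[..., 0] *= 0.4  # red light is absorbed quickly underwater
murky = murky.astype(np.uint8)

enhanced = enhance(murky)
```

On real footage, the choice of enhancement would need to be validated the way the case study describes: by checking whether it actually improves downstream model accuracy.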
2. Augment Fewer Than 100 Examples into Tens of Thousands: Next, the annotated examples were augmented with several techniques, creating tens of thousands of new, labeled examples from the original group. We asked the experts which details helped them discern similar-looking items such as rock and coral. Their input let us augment the images in a way that expanded the image characteristics most important for categorizing the remaining data.
Fewer than 100 expert-annotated examples were amplified into a training dataset containing tens of thousands of usable examples. Given the tiny number of expert annotations compared to the much larger quantity of unlabeled data, accurately encoding expert input into the ML training pipeline was critical to ensuring that the training set was representative of all the data.
Expert-provided ground truth (coral)
Jaxon prediction (coral)
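The augmentation idea in step 2 can be sketched in a few lines. The case study does not say which techniques were used, so the transformations here (horizontal flips, 90-degree rotations, and a mild brightness change) are assumptions, as are the function name `augment`, the `copies` parameter, and the random seed images. In practice, the expert input described above would decide which transformations are safe: for instance, if color is what separates coral from rock, hue should not be altered.

```python
import numpy as np
from typing import List, Tuple

LabeledImage = Tuple[np.ndarray, str]


def augment(image: np.ndarray, label: str, rng: np.random.Generator,
            copies: int = 8) -> List[LabeledImage]:
    """Create `copies` randomly transformed variants that keep the same label."""
    variants = []
    for _ in range(copies):
        aug = image
        if rng.random() < 0.5:
            aug = np.fliplr(aug)                        # horizontal flip
        aug = np.rot90(aug, k=int(rng.integers(0, 4)))  # 0/90/180/270 degrees
        gain = rng.uniform(0.8, 1.2)                    # mild brightness change
        aug = np.clip(aug.astype(np.float64) * gain, 0, 255).astype(np.uint8)
        variants.append((aug, label))
    return variants


# Hypothetical seed set: three expert-labeled square tiles (random pixels here).
rng = np.random.default_rng(42)
seeds = [
    (rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8), "coral"),
    (rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8), "rock"),
    (rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8), "sand"),
]

training_set: List[LabeledImage] = []
for image, label in seeds:
    training_set.extend(augment(image, label, rng, copies=100))

print(len(training_set))  # 3 seeds x 100 copies each
```

Square tiles are used so that 90-degree rotations preserve the image shape; non-square frames would need cropping or padding first.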
3. Train Models to Annotate All Data: Finally, the trained model did the heavy lifting. It annotated terabytes of NOAA’s unlabeled data, saving months of work.
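Annotating a large archive typically means running the trained model over the unlabeled frames in batches. The sketch below shows that loop under stated assumptions: `dummy_predict_proba` is a stand-in for a real trained classifier, the class list simply reuses the categories named earlier in this case study, and `annotate_in_batches` and the batch size are hypothetical.

```python
import numpy as np
from typing import Callable, Iterator, Tuple

# Categories named in the case study; the real label set may differ.
CLASSES = ["coral", "seagrass", "sponge", "sand", "rock"]

_rng = np.random.default_rng(0)
_W = _rng.normal(size=(3, len(CLASSES)))  # random weights for the stand-in model


def dummy_predict_proba(batch: np.ndarray) -> np.ndarray:
    """Stand-in classifier: softmax over a random projection of mean color."""
    feats = batch.reshape(len(batch), -1, 3).mean(axis=1) / 255.0
    logits = feats @ _W
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def annotate_in_batches(
    frames: np.ndarray,
    predict_proba: Callable[[np.ndarray], np.ndarray],
    batch_size: int = 256,
) -> Iterator[Tuple[int, str, float]]:
    """Yield (frame index, predicted class, confidence) for every frame."""
    for start in range(0, len(frames), batch_size):
        probs = predict_proba(frames[start:start + batch_size])
        best = probs.argmax(axis=1)
        for offset, cls_idx in enumerate(best):
            yield start + offset, CLASSES[cls_idx], float(probs[offset, cls_idx])


# Hypothetical unlabeled data: 1,000 small random RGB frames.
frames = _rng.integers(0, 256, size=(1000, 8, 8, 3), dtype=np.uint8)
annotations = list(annotate_in_batches(frames, dummy_predict_proba))
```

Keeping the confidence alongside each label is a design choice for this sketch: it would let a team spot-check the least certain predictions against expert judgment.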
Conclusion: Mapping Seafloor Habitats
The Jaxon team created a custom AI training pipeline to help NOAA map the seafloor of the Caribbean. Jaxon provided an efficient, effective, and scalable solution that restarted the project of seafloor characterization and saved the NOAA scientists immense amounts of time, effort, and money. Models need to be retrained as new data becomes available, and designing repeatable training pipelines to annotate new data is Jaxon’s specialty. The system Jaxon created for NOAA easily accommodates new knowledge and will continue to save valuable resources in the future.