Synthetic Data Generator
Data isn’t always cheap: it can be time-consuming or costly to obtain. The Flask helps sidestep this problem by generating synthetic data from what you already have.
The Flask takes your real data and uses an array of heuristic and generative deep learning methods to create synthetic data. These generative methods can produce large amounts of diverse, realistic training data for your custom machine learning models.
The Flask’s training data helps improve the accuracy of your ML models. The time you save on data generation can go toward improving your model.
Free-Form Text Augmentation
LM (Language Model) Text Generation
Jaxon uses a modular generative AI system based on large language models to generate new examples in the style of an existing dataset.
Frequent Terms
Replaces a percentage of the words in each example, choosing them by how frequently they appear in the dataset. Frequency is scored with TF-IDF (term frequency–inverse document frequency).
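The idea can be sketched in a few lines of Python. This is a minimal illustration, not the Flask's implementation: the function names `tfidf_scores` and `replace_low_tfidf`, the `fraction` parameter, the choice to replace the lowest-scoring (least distinctive) words, and the choice to draw replacements uniformly from the dataset vocabulary are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def tfidf_scores(docs):
    """Return one {word: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


def replace_low_tfidf(docs, fraction=0.2, seed=0):
    """Replace a fraction of each document's lowest-scoring words.

    Assumption for this sketch: low TF-IDF words are the ones replaced,
    and replacements are drawn at random from the dataset vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    augmented = []
    for doc, score in zip(docs, tfidf_scores(docs)):
        n_replace = max(1, int(len(doc) * fraction))
        # Token positions ordered from least to most distinctive.
        positions = sorted(range(len(doc)), key=lambda i: score[doc[i]])
        new_doc = list(doc)
        for i in positions[:n_replace]:
            new_doc[i] = rng.choice(vocab)
        augmented.append(new_doc)
    return augmented


docs = [["the", "cat", "sat"], ["the", "dog", "ran"]]
print(replace_low_tfidf(docs, fraction=0.34))
```

Words that appear in every document score zero, so in this sketch they are the first to be swapped out, while words that distinguish one example from another are left alone.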
Synonyms
Replaces words in the original example with synonyms.
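As a rough illustration, synonym replacement can be sketched as a lookup against a synonym table. The `SYNONYMS` dictionary, the `replace_synonyms` function, and the `prob` parameter below are hypothetical; the source of synonyms that the Flask actually uses is not described here.

```python
import random

# Hypothetical synonym table, for illustration only.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "cheerful"],
    "big": ["large", "huge"],
}


def replace_synonyms(tokens, synonyms, prob=0.5, seed=0):
    """Swap each token that has known synonyms with probability `prob`."""
    rng = random.Random(seed)
    out = []
    for token in tokens:
        options = synonyms.get(token.lower())
        if options and rng.random() < prob:
            out.append(rng.choice(options))
        else:
            # No synonym known, or this token was not selected.
            out.append(token)
    return out


print(replace_synonyms(["the", "quick", "dog", "is", "happy"], SYNONYMS))
```

Tokens without an entry in the table pass through unchanged, so the augmented example keeps its original length and structure.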
Random Words
Replaces a word from the example with another word selected at random from the same corpus of text.
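A minimal sketch of this technique, assuming whitespace-tokenized text: the function name `replace_random_words` and the `n_replace` parameter are illustrative and not part of the Flask's interface.

```python
import random


def replace_random_words(tokens, corpus_vocab, n_replace=1, seed=0):
    """Replace `n_replace` randomly chosen tokens with random corpus words."""
    rng = random.Random(seed)
    out = list(tokens)
    n = min(n_replace, len(out))
    # Pick distinct positions, then overwrite each with a word from the corpus.
    for i in rng.sample(range(len(out)), n):
        out[i] = rng.choice(corpus_vocab)
    return out


corpus = ["the cat sat on the mat", "a dog ran in the park"]
vocab = sorted({word for line in corpus for word in line.split()})
print(replace_random_words(corpus[0].split(), vocab, n_replace=2))
```

Because the replacement is drawn from the same corpus, a word may occasionally be replaced with itself; a stricter variant could exclude the original word from the candidates.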
Tabular Data Augmentation
VAE (Variational Autoencoder)
Compresses existing data and expands it again using deep learning. The reconstruction is purposefully noisy, creating variation.
Gaussian STDEV (Standard Deviation)
Sets the standard deviation of the noise, which controls how far generated values may fall from the original value. The higher the standard deviation, the greater the allowable difference between the original and generated values.
Gaussian Noise
Changes numerical values, assuming that the distribution of existing values fits a bell curve.
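The relationship between Gaussian noise and its standard deviation can be sketched as follows. This is an illustration under stated assumptions, not the Flask's code: `add_gaussian_noise` and its `stdev` parameter are hypothetical names, the noise is zero-mean, and the sketch does not clip the result to any maximum distance.

```python
import random


def add_gaussian_noise(values, stdev, seed=0):
    """Add zero-mean Gaussian noise with the given standard deviation."""
    rng = random.Random(seed)
    return [value + rng.gauss(0.0, stdev) for value in values]


ages = [23.0, 35.0, 41.0, 58.0]
print(add_gaussian_noise(ages, stdev=0.5))  # stays close to the originals
print(add_gaussian_noise(ages, stdev=5.0))  # spreads further from the originals
```

With a small `stdev` the generated values stay near the originals; increasing it trades fidelity for variety.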
Categorical
Changes a categorical value to another value already used in the dataset (e.g., in a Countries column, changing “United States” to “Canada”).
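A short sketch of categorical swapping, assuming rows are stored as dictionaries: the function name `swap_category`, the `prob` parameter, and the uniform choice among the other observed values are assumptions made for the example.

```python
import random


def swap_category(rows, column, prob=0.3, seed=0):
    """With probability `prob`, swap a row's category for another observed one."""
    rng = random.Random(seed)
    # Only values that already occur in the dataset are candidates.
    categories = sorted({row[column] for row in rows})
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows are not modified
        alternatives = [c for c in categories if c != row[column]]
        if alternatives and rng.random() < prob:
            new_row[column] = rng.choice(alternatives)
        out.append(new_row)
    return out


rows = [
    {"name": "A", "country": "United States"},
    {"name": "B", "country": "Canada"},
    {"name": "C", "country": "Mexico"},
]
print(swap_category(rows, "country", prob=0.5))
```

Restricting replacements to values already present keeps the augmented column within the dataset's existing set of categories.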
More about Synthetic Data
Greater outputs with fewer inputs.
Jaxon rivaled state-of-the-art with over 1000x fewer manually labeled examples.
Greg Harman, CTO of Jaxon, shares his thoughts on synthetic data, tradeoffs between noise and quantity, and how to improve your models.
Learn how to augment data in Jaxon with the Flask, and read a breakdown of the techniques that the Flask uses to augment data.