Data Science actually IS ‘Science’

One thing that strikes me is that the science in data science is often overlooked. It truly is science, complete with hypotheses, experimentation, variables, and subject matter expertise. Jaxon gives users a set of knobs to turn (algorithmic choices) with “just enough” supervision (as little labeled data as possible), so they can iterate and get the model production-ready as quickly as possible.

Data Efficiency

Mature data science teams are focusing on the emerging concept of data efficiency. How long does it take to create reliably accurate models trained with as little data (labeled or otherwise) as possible? How much human time is needed for subject matter expertise (knowledge codified as heuristics), testing and validation of the output, and model tinkering? Data efficiency is also about knowing when to label, when to test, when to use AutoML, and when to get your PhDs engaged (reading academic papers and doing hardcore data science).
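To make "knowledge codified as heuristics" concrete, here is a minimal sketch in plain Python. It is not Jaxon's API or method. The spam/ham task, the three heuristic functions, and the majority-vote rule are all hypothetical and chosen only for illustration. Each heuristic encodes one rule of thumb from a subject matter expert, and their votes are combined into a provisional label without any hand labeling.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical labels for an illustrative spam-detection task.
SPAM, HAM = "spam", "ham"

# A heuristic returns a label, or None when it has no opinion.
Heuristic = Callable[[str], Optional[str]]


def mentions_prize(text: str) -> Optional[str]:
    # Expert rule of thumb (assumed): prize offers are usually spam.
    return SPAM if "prize" in text.lower() else None


def has_many_exclamations(text: str) -> Optional[str]:
    # Expert rule of thumb (assumed): three or more "!" suggests spam.
    return SPAM if text.count("!") >= 3 else None


def is_short_reply(text: str) -> Optional[str]:
    # Expert rule of thumb (assumed): very short messages are usually genuine.
    return HAM if len(text.split()) <= 4 else None


HEURISTICS: List[Heuristic] = [mentions_prize, has_many_exclamations, is_short_reply]


def weak_label(text: str, heuristics: List[Heuristic] = HEURISTICS) -> Optional[str]:
    """Combine heuristic votes by simple majority; abstain on no votes or a tie."""
    votes = [vote for vote in (h(text) for h in heuristics) if vote is not None]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # Conflicting heuristics: leave this one for a human.
    return ranked[0][0]
```

Examples on which the heuristics abstain or disagree are the ones worth sending to a human labeler, which is one way to decide "when to label."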

Data Origins

Where the data comes from and how it’s labeled also matter. Getting this right is very much a trial-and-error exercise that data scientists nearly always conduct at the outset of a project. If there’s an easy way to re-label the data (e.g. Jaxon!), data scientists can readily experiment with changes to the problem specification. Iteration coupled with transfer learning (using available assets, i.e. pre-trained models) is the name of the game, and Jaxon has an elegant way of operationalizing the process. Jaxon also has training schedules that let users get quite sophisticated in how the neural networks are trained. This is a part of the Jaxon platform that pure labeling efforts simply don’t touch. It is often work that data scientists custom-code in their notebooks (one-time prototype code that goes to waste).

Rapid Prototyping

A quote from one of our customers:

“Before Jaxon, we had to create a mini-Jaxon for each use case, cobbling together bespoke training pipelines. Those models often got wasted and it was typically just throwaway code. There was no tooling around automating this process until Jaxon.”

Jaxon’s rapid prototyping has proven its value in two ways: data scientists iterate quickly to winning models, and data science and the business units iterate with each other, because Jaxon can rapidly adjust a problem specification and re-label. This eliminates the need for data scientists to code models by hand until the project is on the final stretch and already has the right problem specification.

Jaxon is one of the first to offer a platform that lets analysts (via UI) and data scientists (via API) collaborate on ML model training. And since Jaxon drastically reduces the amount of labeled data needed, it’s compelling for companies to stop outsourcing/offshoring their data labeling (most data labeling is still done manually) and let their analysts do it. The models will be far better (a recent benchmark showed Jaxon-produced models have 33% less error), the cost of data labeling will drop sharply (75% or more), and the company will cut the time it takes to build custom models by over 90%. That means they can get the model into production that much faster and start making money with it or shaving costs (realizing ROI sooner).

– Scott Cohen, CEO