OctoML raises $15M to make optimizing ML models easier

OctoML, a startup founded by the team behind the Apache TVM machine learning compiler stack project, today announced that it has raised a $15 million Series A round led by Amplify Partners, with participation from Madrona Venture Group, which led its $3.9 million seed round. The core idea behind OctoML and TVM is to use machine learning to optimize machine learning models so they can run more efficiently on different types of hardware.

“There’s been quite a bit of progress in creating machine learning models,” OctoML CEO and University of Washington professor Luis Ceze told me. “But a lot of the pain has moved to once you have a model, how do you actually make good use of it in the edge and in the clouds?”

That’s where the TVM project, which was launched by Ceze and his collaborators at the University of Washington’s Paul G. Allen School of Computer Science & Engineering, comes in. It’s now an Apache incubating project, and because it has seen quite a bit of usage and support from major companies like AWS, ARM, Facebook, Google, Intel, Microsoft, Nvidia, Xilinx and others, the team decided to form a commercial venture around it, which became OctoML. Today, even Amazon Alexa’s wake word detection is powered by TVM.

Ceze described TVM as a modern operating system for machine learning models. “A machine learning model is not code, it doesn’t have instructions, it has numbers that describe its statistical modeling,” he said. “There’s quite a few challenges in making it run efficiently on a given hardware platform because there’s literally billions and billions of ways in which you can map a model to specific hardware targets. Picking the right one that performs well is a significant task that typically requires human intuition.”
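Ceze’s point about the size of the search space can be sketched with a toy auto-tuner. This is not TVM’s actual API — the function names here are purely illustrative — but it shows the core idea: several mathematically equivalent “schedules” for the same computation are benchmarked on the target machine, and the fastest one wins. TVM automates this kind of search across billions of candidates.

```python
import time

# Two equivalent "schedules" for the same matrix multiply: identical math,
# different loop orders, so different performance on a given CPU.

def matmul_ijk(a, b, n):
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c

def matmul_ikj(a, b, n):
    # Same result, but iterating k before j walks rows of b sequentially,
    # which tends to be friendlier to CPU caches.
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik, row_b, row_c = a[i][k], b[k], c[i]
            for j in range(n):
                row_c[j] += aik * row_b[j]
    return c

def bench(fn, a, b, n):
    start = time.perf_counter()
    fn(a, b, n)
    return time.perf_counter() - start

def autotune(candidates, a, b, n, trials=3):
    """Benchmark each candidate schedule; return the fastest and its time."""
    best, best_time = None, float("inf")
    for fn in candidates:
        t = min(bench(fn, a, b, n) for _ in range(trials))
        if t < best_time:
            best, best_time = fn, t
    return best, best_time

n = 64
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]
best, t = autotune([matmul_ijk, matmul_ikj], a, b, n)
print(best.__name__, f"{t * 1000:.1f} ms")
```

A real compiler like TVM searches over far richer choices — tiling sizes, vectorization, thread mapping, memory layout — and uses a learned cost model rather than exhaustively timing every candidate.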

And that’s where OctoML and its “Octomizer” SaaS product, which it also announced today, come in. Users can upload their model to the service and it will automatically optimize, benchmark and package it for the hardware they specify and in the format they want. For more advanced users, there’s also the option to add the service’s API to their CI/CD pipelines. These optimized models run significantly faster because they can now fully leverage the hardware they run on, but what many businesses may care about even more is that these more efficient models also cost them less to run in the cloud, or that they can use cheaper, less powerful hardware to get the same results. For some use cases, TVM already results in 80x performance gains.

Currently, the OctoML team consists of about 20 engineers. With this new funding, the company plans to expand its team. Those hires will mostly be engineers, but Ceze also stressed that he wants to hire an evangelist, which makes sense, given the company’s open-source heritage. He also noted that while the Octomizer is a good start, the real goal here is to build a more fully featured MLOps platform. “OctoML’s mission is to build the world’s best platform that automates MLOps,” he said.

ImmunityBio and Microsoft team up to precisely model how key COVID-19 protein leads to infection

An undertaking that combined massive amounts of graphics processing power could provide key leverage for researchers looking to develop potential cures and treatments for the novel coronavirus behind the current global pandemic. Immunotherapy startup ImmunityBio is working with Microsoft Azure to deliver a combined 24 petaflops of GPU computing capability for the purpose of modeling, in a very high degree of detail, the structure of the so-called “spike protein” that allows the SARS-CoV-2 virus that causes COVID-19 to enter human cells.

This new partnership means that the companies were able to produce a model of the spike protein within just days, instead of the months it would’ve taken previously. That time savings means the model can get into the virtual hands of researchers and scientists working on potential vaccines and treatments even faster, and that they’ll be able to gear their work toward a detailed replication of the very protein they’re trying to prevent from attaching to the ACE2 receptor on human cells, which is what sets up the viral infection process to begin with.

The main way that scientists working on treatments look to prevent or minimize the spread of the virus within the body is to block the attachment of the virus to these proteins, and the simplest way to do that is to ensure that the spike protein can’t connect with the receptor it targets. Naturally-occurring antibodies in patients who have recovered from the novel coronavirus do exactly that, and the vaccines under development are focused on doing the same thing pre-emptively, while many treatments are looking at lessening the ability of the virus to latch on to new cells as it replicates within the body.

In practical terms, the partnership between the two companies brought together 1,250 Nvidia V100 Tensor Core GPUs from a Microsoft Azure cluster, designed for use in machine learning applications, with ImmunityBio’s existing 320-GPU cluster that is tuned specifically for molecular modeling work. The results of the collaboration will now be made available to researchers working on COVID-19 mitigation and prevention therapies, in the hopes that it will enable them to work more quickly and effectively toward a solution.

cnvrg.io launches a free version of its data science platform

Data science platform cnvrg.io today announced the launch of a free community version of its product. Dubbed ‘CORE,’ this version includes most — but not all — of the standard features in cnvrg’s main commercial offering. It’s an end-to-end solution for building, managing and automating basic ML models; the limitations of the free version mostly center on the production capabilities of the paid premium version and on working with larger teams of data scientists.

As the company’s CEO Yochay Ettun told me, CORE users will be able to use the platform either on-premise or in the cloud, using Nvidia-optimized containers that run on a Kubernetes cluster. Because of this, it natively handles hybrid- and multi-cloud deployments that can automatically scale up and down as needed — and adding new AI frameworks is simply a matter of spinning up new containers, all of which are managed from the platform’s web-based dashboard.

Ettun describes CORE as a ‘lightweight version’ of the original platform, but one that still hews closely to the platform’s original mission. “As was our vision from the very start, cnvrg.io wants to help data scientists do what they do best – build high impact AI,” he said. “With the growing technical complexity of the AI field, the data science community has strayed from the core of what makes data science such a captivating profession — the algorithms. Today’s reality is that data scientists are spending 80 percent of their time on non-data science tasks, and 65 percent of models don’t make it to production. CORE is an opportunity to open its end-to-end solution to the community to help data scientists and engineers focus less on technical complexity and DevOps, and more on the core of data science — solving complex problems.”

This has very much been the company’s direction from the outset and as Ettun noted in a blog post from a few days ago, many data scientists today try to build their own stack by using open-source tools. They want to remain agile and able to customize their tools to their needs, after all. But he also argues that data scientists are usually hired to build machine learning models, not to build and manage data science platforms.

While some competing platforms are betting on open source and the flexibility that comes with it, cnvrg.io’s focus is squarely on ease of use. Unlike those tools, Jerusalem-based cnvrg.io, which has raised about $8 million so far, doesn’t have the advantage of the free marketing that comes with open source, so it makes sense for the company to now launch this free self-service version.

It’s worth noting that while cnvrg.io features plenty of graphical tools for managing data ingestion flows, models and clusters, it’s very much a code-first platform. Given that, Ettun tells me that the ideal user is a data scientist, data engineer or a student passionate about machine learning. “As a code-first platform, users with experience and savvy in the data science field will be able to leverage cnvrg CORE features to produce high impact models,” he said. “As our product is built around getting more models to production, users that are deploying their models to real-world applications will see the most value.”