How Voxel cut model retraining time by 80%
Voxel uses Sematic to increase productivity across their Machine Learning (ML) team and retrain Computer Vision models faster. Sematic enabled Voxel to reduce their model turnaround time from 2-4 weeks to 2-4 days, with a 20x cost saving.
Highlights
80% reduction in model retraining time
Sematic enabled Voxel to reduce the time it takes to retrain and deploy models from 2-4 weeks to 2-4 days.
20x cost reduction
Sematic freed up at least one full-time engineering resource, made the team 80% more productive, and enabled fine-grained control of compute resources, leading to hundreds of thousands of dollars in savings.
Strong production guarantees: traceability, reproducibility, observability
With increased visibility and observability of ML pipelines, Sematic made developing and debugging pipelines a lot easier and faster at Voxel.
About Voxel
Voxel leverages cutting-edge Computer Vision technology to elevate safety standards within industrial warehouse operations for customers such as Office Depot, Americold, and Dollar Tree. Made up of ten highly skilled ML Engineers, the Voxel ML team specializes in the development and implementation of advanced object detection, classification, and segmentation models. Voxel has been recognized by Fast Company as one of the top ten most innovative companies of 2023.
Before Sematic
Before adopting Sematic, Voxel had made substantial investments in the development of an internal ML Orchestration platform known as Symphony. However, as the company expanded, the upkeep of this internal orchestration tool became an increasingly demanding endeavor.
Why is an ML Orchestration strategy critical to the Voxel team?
Voxel runs long ML pipelines whose steps need different computational resources. These pipelines are developed and operated by several team members, which promotes transparency and reuse. With ML Orchestration, Voxel's ML team can execute these long pipelines across resources such as data processing clusters and GPU training clusters. This approach ensures traceability and reproducibility, and it speeds up both development and debugging. Because specific models must be retrained every two weeks, Voxel needed an ML Orchestration platform that could quickly automate intricate pipelines and improve the team's overall efficiency.
Results Post-Sematic
By integrating Sematic into their workflows, the Voxel team achieved an 80% reduction in the time it takes to retrain models and deploy them to their customers.
“Prior to Sematic it took our team 2-4 weeks to train models. Thanks to Sematic, our team has the ability to automate pipelines and parallelize the work. Today, this same model takes 2-4 days to train. This is an 80% improvement!” – Diksha Gohlyan, MLOps Lead at Voxel.
Sematic Integration with Ray
Voxel leverages the Sematic-Ray integration (detailed in this blog post) to run distributed compute and orchestration.
Voxel’s Decision to adopt Sematic over other tools
Voxel made the strategic decision to partner with Sematic, opting for its comprehensive ML Orchestration solution over the complexities of developing and maintaining an in-house platform. By choosing Sematic, Voxel forewent other available options such as Airflow, Kubeflow, or Buildkite, recognizing the unique value and competitive advantage that Sematic brings to their operations.
In addition to the extensive feature set, Voxel's decision to select Sematic was reinforced by the expertise and experience of the Sematic team, known for their work developing ML Infrastructure at Cruise, an industry leader in robotaxi technology. Together with the burden of maintaining their own orchestration tool, this prompted Voxel to choose Sematic with confidence. The team at Voxel stated:
“The Sematic team has been very helpful. In addition to deploying Sematic within our cloud infrastructure, they helped implement the initial set of ML pipelines and enabled our team to seamlessly migrate other pipelines to Sematic.”
The internal orchestration platform necessitated at least one full-time infrastructure engineer, solely responsible for its development and maintenance. However, through the implementation of Sematic, the Voxel team freed up valuable time and resources, allowing them to redirect their focus towards high-priority mission-critical initiatives. This strategic adoption of Sematic empowered the team to optimize their productivity and maximize their impact within the organization.
Alternative orchestration tools were not selected due to their lack of required features:
- Visibility of inputs/outputs
- Artifact management
- Ability to re-run pipeline steps/checkpointing capabilities
Sematic provided features that were not available in Buildkite, Airflow, or Kubeflow.
“Sematic stands out due to its exceptional features, including the ability to rerun pipelines, as well as its comprehensive visibility into input and output data, which greatly aids in troubleshooting and understanding the workflow.”
Implementing Sematic increased productivity for the Voxel ML Team
Voxel's ML workflows encompass many critical processes, including model training, data collection pipelines, labeling, and regression testing. With Sematic, Voxel unlocks a multitude of possibilities for ML and data workflows, as the platform seamlessly orchestrates arbitrary Python logic. Prior to the integration of Sematic, failures within these workflows would burden the Voxel team with extensive troubleshooting efforts, often requiring them to restart long-running pipelines from scratch, leading to substantial wastage of valuable human and compute resources. Thanks to Sematic, the Voxel team now enjoys the convenience of effortless debugging and seamless resumption from the exact point of interruption, resulting in substantial time, effort, and cost savings.
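To make the idea of resuming from the point of interruption concrete, here is a minimal sketch of step-level checkpointing in plain Python. It is not Sematic's API or implementation: `run_pipeline`, the step names, and the JSON files on disk are all hypothetical, chosen only to illustrate why a rerun after a failure does not need to repeat the steps that already succeeded.

```python
import json
import tempfile
from pathlib import Path


def run_pipeline(steps, checkpoint_dir, initial):
    """Run steps in order, saving each output; skip steps already saved.

    Hypothetical helper for illustration only, not part of Sematic.
    """
    checkpoint_dir = Path(checkpoint_dir)
    value = initial
    executed = []  # names of the steps that actually ran this time
    for name, fn in steps:
        path = checkpoint_dir / f"{name}.json"
        if path.exists():
            # Step finished in an earlier run: reuse its stored output.
            value = json.loads(path.read_text())
            continue
        value = fn(value)
        path.write_text(json.dumps(value))
        executed.append(name)
    return value, executed


calls = {"train": 0}


def load(_):
    return [1, 2, 3]


def train(data):
    # Fail on the first attempt to simulate an interrupted long-running step.
    calls["train"] += 1
    if calls["train"] == 1:
        raise RuntimeError("simulated failure during training")
    return sum(data)


def evaluate(score):
    return {"score": score}


steps = [("load", load), ("train", train), ("evaluate", evaluate)]
workdir = tempfile.mkdtemp()

try:
    run_pipeline(steps, workdir, None)  # first run fails inside "train"
except RuntimeError:
    pass

# Second run: "load" is restored from its checkpoint, only the rest executes.
result, executed = run_pipeline(steps, workdir, None)
print(result, executed)
```

In this sketch the second run skips `load` entirely, which is the property that saves compute when the early steps of a pipeline are expensive.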
“Sematic provides quick debugging capabilities for pipeline failures, allowing us to identify issues promptly. It also offers visibility into the entire pipeline, enabling us to understand what went into each step.”
Sematic integrated with the rest of Voxel’s ML stack seamlessly thanks to its Python-centric declarative SDK and API client. Diksha states:
“We use Python across the stack and the ease of working with decorators to convert functions into Sematic pipelines has made it extremely easy for us.”
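The decorator pattern mentioned in the quote can be illustrated with a small, self-contained sketch. The `tracked_step` decorator and `RUN_LOG` list below are hypothetical and do not reproduce Sematic's SDK; they only show how decorating ordinary Python functions can turn them into pipeline steps whose inputs and outputs are recorded for later inspection.

```python
import functools

# Hypothetical in-memory record of every step invocation (not Sematic's storage).
RUN_LOG = []


def tracked_step(fn):
    """Wrap a function so that each call records its inputs and output."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        output = fn(*args, **kwargs)
        RUN_LOG.append(
            {"step": fn.__name__, "args": args, "kwargs": kwargs, "output": output}
        )
        return output

    return wrapper


@tracked_step
def normalize(values):
    # Scale values so the largest becomes 1.0.
    peak = max(values)
    return [v / peak for v in values]


@tracked_step
def mean(values):
    return sum(values) / len(values)


@tracked_step
def pipeline(values):
    # A pipeline is itself a decorated function that composes other steps.
    return mean(normalize(values))


result = pipeline([2, 4, 8])

# Each entry shows what went into a step and what came out of it.
for entry in RUN_LOG:
    print(entry["step"], entry["args"], "->", entry["output"])
```

The functions stay ordinary Python, so they can still be called and tested on their own; the decorator only adds the bookkeeping around them.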
The Voxel team states that Sematic has been critical to the team’s productivity.
“Some of our favorite Sematic features include the ability to rerun pipelines from a specific point, as well as the visibility provided for both input and output at each stage of the workflow. Sematic has played a crucial role in our business-critical projects by enabling us to retrain models faster, resulting in increased customer satisfaction.”
As the Voxel team continues to grow, new employees will be able to leverage Sematic to run long jobs across their local machine and cloud infrastructure without requiring assistance from Infrastructure teams, thus boosting their productivity and reducing infrastructure limitations.
The Sematic team looks forward to continuing its work with Voxel’s machine learning team as they pursue their vision of ‘Working toward a world with zero injuries’ through their computer vision models.