Release Notes – 0.31.0

June 29, 2023

Josh Bauer

Founding Engineer

Another Sematic release is here, with lots of exciting new features, improvements, and bug fixes. Let’s take a look at some highlights!

GitHub integration

At Sematic, we believe testing is crucial to a fast development loop, whether that means unit testing "traditional" code, unit testing ML pipelines, or running full regression pipelines involving your data and models. Now, Sematic Enterprise users can connect Sematic to their repos to surface links to relevant Sematic pipeline runs directly on their GitHub PRs. You can even block GitHub PRs on the successful completion of Sematic pipelines.

If you want to learn more about this feature or how to set it up, you can check out our docs.

Real-time Metrics

Sematic has always allowed rich visualizations for the inputs and outputs of the Sematic functions that serve as the steps in your pipelines. But what if you want to visualize some data while your functions are executing? For example, you might watch a loss curve during a training job to see whether the model is converging, or check evaluation metrics during model eval for an early indication of whether the model is likely to be viable. Before this release, your only option was to integrate with a third-party tool like Weights & Biases or TensorBoard. You can still do that if you need more powerful visualization capabilities, but you can now get timeseries values directly in Sematic without any extra work.

To log a metric value, simply import and call log_metric:

Read the full documentation here.

New Type and Example for Language Modeling

Sematic aims to be a general tool for ML development, so it can be used for language model development out of the box. One illustration is one of our new examples, which uses an LLM to ingest and summarize articles from Hacker News and produces HTML showing the result. Many thanks to kaushil24 for the contribution.

Output of the Hacker News summary example pipeline.

You can view the example here.

We’ve also added our first custom visualization tailored specifically for usage with language models: PromptResponse. It allows easy visualization of prompts for language models combined with the response from those models. It shows a compact view for convenience, but can be used to handle long sections of text as well.

Visualization for the PromptResponse type.
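To make the idea concrete, here is a small sketch of a function that returns a prompt together with the model's response. The `PromptResponse` dataclass below is a local stand-in, and its `prompt` and `response` field names are assumptions rather than the confirmed Sematic API. `summarize` is hypothetical, and the LLM call is replaced by a placeholder.

```python
from dataclasses import dataclass


@dataclass
class PromptResponse:
    """Local stand-in: pairs a prompt with the model's response."""

    prompt: str
    response: str


def summarize(article: str) -> PromptResponse:
    """Build a prompt, call a (fake) model, and return both together."""
    prompt = f"Summarize in one sentence: {article}"
    # Placeholder for a real LLM call: keep only the first sentence.
    response = article.split(".")[0] + "."
    return PromptResponse(prompt=prompt, response=response)


result = summarize("Sematic has a new release. It adds real-time metrics.")
print(result.response)  # Sematic has a new release.
```

Returning the prompt and response as a single value keeps the two together wherever the output is inspected.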

Are you a language modeler interested in Sematic? Stay tuned, because we have some exciting things on the way in upcoming releases!

Native dependency packaging

Until now, Cloud Execution (i.e. submitting Sematic jobs to Kubernetes) was only officially supported with Bazel to build and push container images at runtime.

As of 0.31.0, users can leverage Cloud Execution purely with Docker. In true Sematic fashion, we focused on protecting the iteration loop for people writing and running pipelines. As with our Bazel-based workflow, there is no need to remember to rebuild and push your images every time you make a change and want to see it in the cloud. Just use the Sematic CLI, and it will take care of everything:

$ sematic run --build path/to/my/script.py

Upgrade

To upgrade, simply issue:

$ pip install sematic --upgrade

and to upgrade your Kubernetes deployment:

$ helm repo update
$ helm upgrade sematic-server sematic-ai/sematic-server -f /path/to/values.yaml

As always, if you hit any issues or have any questions, ask us on our Discord server.

July 18, 2023
