LLM Voice Assistant
Framework for running Ollama or GPT4All models with voice recognition and text-to-speech output. Designed to work completely offline and even on a Raspberry Pi.
STTTS (speech-to-text-to-speech) is a voice assistant framework for running large language models with voice recognition input and speech synthesis output. Several approaches come already implemented and only need to be plugged together using a single configuration file. The internal multi-processing interface is simple, based merely on bare text or raw PCM data, and scales down even to a Raspberry Pi.
As the use-case is to run completely offline, STTTS only involves projects that can work with (automatically) downloaded local models:
- STT speech-to-text recognizers: Vosk or OpenAI Whisper
- LLM large-language-model processors: GPT4All or Ollama
- TTS text-to-speech synthesizers: eSpeak or Coqui
- Audio I/O: ALSA, PulseAudio, or PyAudio
Mode of Operation
In full pipeline mode, STTTS manages three independent Python processes, one each for STT, LLM, and TTS. This decouples dependent library imports and leverages individual process scheduling for performance. As communication merely consists of prompt and token text – i.e., no audio – there is little IPC overhead through the pipe-backed queues. In single-process “CLI” mode, each part can also be directly invoked with an interactive prompt session.
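As a rough, purely illustrative sketch of this layout (standard multiprocessing, not the actual STTTS implementation or its class names), the three stages boil down to workers exchanging text over queues:

import multiprocessing as mp

def stt_worker(prompts):
    # Would capture audio, segment speech, and emit transcribed utterances as text.
    prompts.put("Hello, there!")

def llm_worker(prompts, tokens):
    # Would feed each prompt to the model and stream generated tokens back as text.
    prompt = prompts.get()
    for token in ("You", "said:", prompt):  # placeholder for streamed model output
        tokens.put(token)

def tts_worker(tokens):
    # Would segment tokens into sentences, synthesize PCM, and play it back.
    print(tokens.get(), tokens.get(), tokens.get())

if __name__ == "__main__":
    prompts, tokens = mp.Queue(), mp.Queue()  # pipe-backed, text-only IPC
    workers = [mp.Process(target=stt_worker, args=(prompts,)),
               mp.Process(target=llm_worker, args=(prompts, tokens)),
               mp.Process(target=tts_worker, args=(tokens,))]
    for w in workers:
        w.start()
    for w in workers:
        w.join()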
Audio data is internally handled mostly as 16-bit little-endian integer mono PCM buffers, which all involved I/O libraries seem to agree on. Certain processing steps are eased by or require a conversion to single-precision floating-point numpy arrays, though.
The sampling rate depends on the chosen recognizer or synthesizer, does not require resampling, and is typically 16000 or 22050 Hz.
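For example, converting such an S16LE buffer into the float arrays some libraries expect (and back) is plain numpy, independent of any STTTS API:

import numpy as np

pcm = b"\x00\x00\xff\x7f\x01\x80"              # three raw 16-bit little-endian mono samples
samples = np.frombuffer(pcm, dtype="<i2")       # -> int16 array [0, 32767, -32767]
floats = samples.astype(np.float32) / 32768.0   # normalize to [-1.0, 1.0)
back = (floats * 32768.0).clip(-32768, 32767).astype("<i2").tobytes()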
--+-------------------|-------+---------------------------+---+-------------------------+--
Q | v | | Q | | Q
+-+---[Start]--->AudioSource | [Token/Feedback]--->+-+-+----->State---[MSG]----->+-+
| | ^ | | | | v |
| [PCM] | | | | | OutputFilter |
| v | | | | | | | |
| SpeechSegmenter | | Processor | | [Utterance] | |
| | | | ^ | | v | |
| [PCM] [Stop] | | | | SentenceSegmenter | |
| v | | | | | | [PCM] |
| Recognizer | | [Prompt] | | [Sentence] | |
| | | | | | | v | |
| State-------+ | | | | Synthesizer | |
| | | | | | | v |
| [Keyword/Utterance]-->+-+-+--------->State | [PCM]---->AudioSink |
|STT | Q | LLM|TTS | |
+---------------------------+---+---------------------------+----------------------v------+
Usage
The full voice-activated pipeline or individual CLI prompts can be run by providing a configuration file in YAML format, see below for examples and further details.
usage: sttts [-h] --config YAML [--log-level LEVEL] [--cli MODE]
Framework for running Ollama or GPT4All models with voice recognition and text-to-speech output.
options:
--config YAML configuration file
--log-level LEVEL logging level (DEBUG, INFO, WARNING, or ERROR) (default: INFO)
--cli MODE run certain cli instead of full pipeline (stt, llm, or tts) (default: None)
For initial startup, tweaking, debugging, or simply as a quick LLM prompt interface, the single-process modes are recommended:
- stt: activates the configured audio source with speech recognition and continuously prints transcribed utterances.
- llm: runs the configured model from within an interactive prompt/reply session.
- tts: can play input text from a prompt by using the configured synthesizer and audio sink.
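Based on the --cli option shown above, the corresponding invocations are:

./venv/bin/sttts --config config.yaml --cli stt
./venv/bin/sttts --config config.yaml --cli llm
./venv/bin/sttts --config config.yaml --cli tts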
Note that at least audio settings and the local models to use must be provided. Convenient automatic model download is supported but opt-in. Part of the audio configuration is also voice activity detection, with thresholds that might need to be adjusted according to the ambient recording situation.
Interaction with the voice assistant is based on certain keywords, which can be adjusted at will for the language in use:
- start: Initial hotword, start listening and further voice recognition.
- reset: Cancel the current operation, discard the pending utterance, and wait for start again.
- commit: Finish the current prompt utterance, stop listening, and run the model with synthesized speech playback. Afterwards, start listening for hotwords again.
- stop: Hotword to initiate overall exit.
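For illustration, an explicit keywords section – simply assuming the defaults match the keyword names – would look like this; Example 2 below shows a German variant:

keywords:
  start: "start"     # assumed default
  reset: "reset"     # assumed default
  commit: "commit"   # assumed default
  stop: "stop"       # assumed default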
Example 1: Desktop
A decent Linux desktop with a dated graphics card (i5-12600K, 32GB DDR5, NVMe SSD, GTX 1050 Ti 4GB) should be able to run even the more advanced models.
make EXTRA=pulse,rosa,whisper,gpt4all,coqui clean install
./venv/bin/sttts --config config.yaml
Note that this might need corresponding native system libraries, too. The configuration file for this local “virtual environment” installation could look like:
---
source: pulse
sink: pulse
speech_segmenter: band
recognizer: whisper
processor: gpt4all
synthesizer: coqui
band_speech_segmenter:
  threshold: 1.0
whisper_recognizer:
  model_name: "small.en"
  download: true
gpt4all_processor:
  model_name: "Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf"
  download: true
coqui_synthesizer:
  model_name: "tts_models/en/ljspeech/tacotron2-DDC"
  download: true
For the first run that requires model downloads and possibly verifying audio settings, the single-process CLI modes are recommended. When running, after the signal tone, utterances are accepted by using the default hotwords:
“start” <pause> “Hello, there!” <pause> “commit” <wait> “stop” (or Ctrl+C)
In total, running these models occupies ~8GB of RAM plus GPU memory. The logged tokens-per-second rate of the LLM should not be critical as long as it outpaces the speech synthesis, which consumes only a few tokens per second. Ideally, however, the TTS process should be able to generate more than one second of audio per second.
Example 2: Raspberry Pi
In contrast to the first example, everything can also be scaled down to run even on a Raspberry Pi – with certain compromises. Also, this experiment will use German as a non-English language use-case. Ingredients:
- Raspberry Pi 4B 4GB
- Reasonable active or passive cooling solution
- Cheap USB microphone dongle
- 3.5mm jack speakers or headphones
- Ubuntu Noble 24.04 LTS (runs with some cleanup at ~100MB)
For supporting alsa and espeak on a fresh system, installing the corresponding libraries using apt-get might be required.
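Based on the packages mentioned later in this document, that could amount to something like:

sudo apt-get install libasound2-dev alsa-utils espeak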
A local Ollama binary and virtual environment for dependencies can be created by:
curl -L https://ollama.com/download/ollama-linux-arm64.tgz -o ollama-linux-arm64.tgz && tar xzf ollama-linux-arm64.tgz && ./bin/ollama serve --help
make EXTRA=alsa,vosk,ollama,espeak clean install
Again, using the individual CLI modes for model downloads and audio settings might be useful, otherwise start with:
./venv/bin/sttts --config config.yaml
A configuration file with StableLM2 for German, but “only” eSpeak could look like:
---
source: alsa
sink: alsa
alsa_source:
  device: "default:CARD=Device"
alsa_sink:
  device: "default:CARD=Headphones"
speech_segmenter: simple
recognizer: vosk
processor: ollama
synthesizer: espeak
simple_speech_segmenter:
  threshold: 0.02
vosk_recognizer:
  model_name: "vosk-model-small-de-0.15"
  download: true
ollama_processor:
  model_name: "stablelm2:1.6b"
  download: true
  device: cpu
  serve: true
  serve_exe: "./ollama"
  system_prompt: "Du bist ein hilfreicher Assistent."
  num_ctx: 512
espeak_synthesizer:
  model_name: "german"
keywords:
  start: "start"
  reset: "korrektur"
  commit: "los"
  stop: "ende"
This setup should run a bit too slowly but relatively robustly, with ~2GB memory usage overall.
Additionally enabling Coqui TTS with tts_models/de/css10/vits-neon seems pretty risky in this regard, while barely reaching 1 sec/sec synthesizer speed.
Example 3: Raspberry Pi
As a second example on a Raspberry Pi, when downgrading the model to tinyllama, there is enough headroom to experiment with nicer text-to-speech synthesis in English.
Then, during installation, the coqui extra should be added or used instead of espeak. (But see the Python version caveat below.)
---
source: alsa
sink: alsa
alsa_source:
  device: "default:CARD=Device"
alsa_sink:
  device: "default:CARD=Headphones"
speech_segmenter: simple
recognizer: vosk
processor: ollama
synthesizer: coqui
simple_speech_segmenter:
  threshold: 0.02
vosk_recognizer:
  model_name: "vosk-model-small-en-us-0.15"
  download: true
ollama_processor:
  model_name: "tinyllama"
  download: true
  device: cpu
  serve: true
  serve_exe: "./ollama"
coqui_synthesizer:
  model_name: "tts_models/en/ljspeech/glow-tts"
  download: true
  device: cpu
This setup should also run a bit too slowly but relatively robustly, CPU-bound, with ~2GB memory usage overall.
Excursus: Backporting Python in Ubuntu 24.04
At the time of writing, Coqui TTS requires a Python version below the most recent 3.12, which is what Ubuntu 24.04 ships with. The official repositories do not provide other major versions, but the deadsnakes PPA allows installing additional Python environments, which can co-exist.
echo 'deb https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu noble main' | \
sudo tee /etc/apt/sources.list.d/deadsnakes.list
curl -o - 'https://keyserver.ubuntu.com/pks/lookup?op=get&search=0xf23c5a6cf475977595c89f51ba6932366a755776' | \
sudo tee /etc/apt/trusted.gpg.d/deadsnakes.asc
sudo apt-get update
sudo apt-get install python3.11-minimal python3.11-venv python3.11-dev
For such cases, the Makefile accepts a Python interpreter argument to be used in the virtual environment later on.
make PYTHON=python3.11 EXTRA=alsa,vosk,ollama,coqui clean install
When running into SIGILL on a Raspberry Pi with torch 2.4.0, pinning its version to torch==2.3.1 in requirements.in might be needed.
Installation
In order to make the large amount of dependencies manageable, STTTS makes use of extras to opt into certain implementations. Available tags are:
- alsa, pulse, pyaudio, whisper, vosk, gpt4all, ollama, espeak, coqui, sphinx, sbd
- all: all of the above
- dev: for linting and docs
Either full or requirements-only installation inside a virtual environment is wrapped by the Makefile, which accepts PYTHON and EXTRA arguments. For example:
make EXTRA=pulse,vosk,ollama,coqui install # create and install in venv -> ./venv/bin/sttts
pip install .[pulse,vosk,ollama,coqui] # install system- or user-wide -> sttts or python3 -m sttts
make PYTHON=python3.10 EXTRA=dev,all deps # create and install dependencies in venv -> ./venv/bin/python3 -m sttts
Note that several packages also require native system libraries (as documented below), which need to be installed separately beforehand.
Further make targets apart from deps and install are clean, check, and docs.
Configuration Reference
A single configuration file in YAML format contains all relevant settings. Technically, the following fields are required:
- source: alsa, pulse, pyaudio, or wave
- sink: alsa, pulse, pyaudio, or wave
- recognizer: sphinx, vosk, or whisper
- processor: noop, ollama, or gpt4all
- synthesizer: espeak or coqui
Corresponding to these choices, there are per-class config sections of their own, e.g., for choosing and downloading the model to use. The following are similar, but optional, as defaults exist:
- speech_segmenter: simple, median, band, or sphinx, default simple (which might need adjusting its threshold)
- sentence_segmenter: split or sbd, default split
- feedback: noop, speech, or beep, default beep
Other global configuration objects possibly of interest are:
- keywords: hotwords for start, reset, commit, or stop
- logging
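Putting the required and optional top-level fields together, a minimal skeleton could look like this; the per-class sections shown are placeholders for whichever implementations are chosen:

---
source: alsa                # alsa, pulse, pyaudio, or wave
sink: alsa                  # alsa, pulse, pyaudio, or wave
recognizer: vosk            # sphinx, vosk, or whisper
processor: ollama           # noop, ollama, or gpt4all
synthesizer: espeak         # espeak or coqui
speech_segmenter: simple    # optional, default simple
sentence_segmenter: split   # optional, default split
feedback: beep              # optional, default beep
vosk_recognizer:
  model_name: "vosk-model-small-en-us-0.15"
  download: true
ollama_processor:
  model_name: "llama3"
  download: true
espeak_synthesizer:
  model_name: "default"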
A more structured, code-based, Read-the-Docs-style Sphinx HTML documentation can be generated with:
make EXTRA=dev clean docs
Audio I/O
Audio sources provide the input for speech recognition as the interface for interaction, typically a microphone.
Must be configured by source (alsa, pulse, pyaudio, wave) and the corresponding per-class objects alsa_source, pulse_source, pyaudio_source, or wave_source.
Sinks receive the text-to-speech audio data as generated by the synthesizer, typically speaker or headphone audio devices.
Must be configured by sink (alsa, pulse, pyaudio, wave) and the per-class alsa_sink, pulse_sink, pyaudio_sink, or wave_sink objects, respectively.
Alsa
ALSA listening source and playback sink, using PyAlsaAudio. This should be the most low-level sound system implementation available by default, for example even on a Raspberry Pi.
For installation, the corresponding native library and headers are needed, such as from the libasound2-dev package.
For configuration, the amixer, aplay, and arecord CLI commands from the alsa-utils package might be useful, too. Also note that the invoking user must typically be in the audio group.
Source: Alsa
- device (str): Capture PCM to use, as obtained by arecord -L, for example default:CARD=Device. Default default.
- buffer_length (float): Adjust read size in seconds, default 250ms.
- warmup (int): Skip the first reads, in case of microphone auto-gaining, default 4, thus 1 second.
- periods (int): ALSA periods.
- kwargs: Extra options passed to alsaaudio.PCM.
Sink: Alsa
- device (str): Playback PCM to use, as obtained by aplay -L, for example default:CARD=Headphones. Default default.
- buffer_length (float): Output buffer length in seconds, default 5. Generous to unblock the synthesizer running in parallel.
- period_size (int): ALSA period size in frames.
- kwargs: Extra options passed to alsaaudio.PCM.
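As a combined example, reusing the device names from the Raspberry Pi setup above, the ALSA-related sections could look like:

source: alsa
sink: alsa
alsa_source:
  device: "default:CARD=Device"
  buffer_length: 0.25   # read size in seconds (default)
  warmup: 4             # skip first reads while the microphone auto-gains (default)
alsa_sink:
  device: "default:CARD=Headphones"
  buffer_length: 5      # generous output buffer to unblock the synthesizer (default)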
Pulse
Listening source and playback sink for PulseAudio servers, using PaSimple, which in turn requires the native libpulse-simple.so.0 library.
This sound server implementation should be in use by default on various Linux desktop distributions.
Source: Pulse
- device (str): Recording device to use, none for default.
- buffer_length (float): Adjust read size in seconds, default 250ms.
- warmup (int): Skip the first reads, in case of microphone auto-gaining, default 4, thus 1 second.
- kwargs: Extra options passed to PaSimple.
Sink: Pulse
- device (str): Playback device to use, none for default.
- buffer_length (int): Output buffer length in seconds, default 5. Generous to unblock the synthesizer running in parallel.
- kwargs: Extra options passed to PaSimple.
PyAudio
Listening source and playback sink using PyAudio, which relies on the cross-platform PortAudio library.
Building requires the portaudio19-dev package or similar. At the time of writing, on Ubuntu 22.04, this conflicts with jackd, and the pre-built python3-pyaudio binary package 0.2.11 is broken for Python 3.10. Also, problems might arise for non-default microphone sampling rates when using ALSA.
This option provides the most high-level abstraction and compatibility if neither ALSA nor PulseAudio is supported.
Source: PyAudio
- device (str): Recording device to use, such as USB PnP Sound Device: Audio (hw:1,0), none for default. If invalid, errors out with a list of available devices.
- buffer_length (int): Read size in seconds, default 250ms.
- kwargs: Extra options passed to pyaudio.Stream.
Sink: PyAudio
- device (str): Playback device to use, none for default, for example bcm2835 Headphones: – (hw:0,0). If invalid, errors out with a list of available devices.
- buffer_length (int): Requested output buffer length in seconds, default 5. Note that the actually applied buffer size might be lower.
- kwargs: Extra options passed to pyaudio.Stream.
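A corresponding PyAudio configuration, using the example device names from above, might look like:

source: pyaudio
sink: pyaudio
pyaudio_source:
  device: "USB PnP Sound Device: Audio (hw:1,0)"
pyaudio_sink:
  device: "bcm2835 Headphones: – (hw:0,0)"
  buffer_length: 5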
Wave
Mostly for debugging purposes, audio can be directly read from or written to S16LE mono *.wav files, as configured by filename.
When reading, the sample rate must match the internally chosen one.
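For debugging, both ends can for instance be wired to files; the file names here are placeholders:

source: wave
sink: wave
wave_source:
  filename: "input.wav"    # must be S16LE mono at the internally chosen sample rate
wave_sink:
  filename: "output.wav"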
Speech Segmenters
Not all voice recognition implementations provide support for a fully streamed operation, i.e., are able to continuously receive audio frames, detect silence or activity, and transcribe speech on-the-fly. Thus, this explicit and exchangeable pre-processing step monitors input audio and yields buffers that contain a whole utterance, as separated by short breaks of silence.
As this particular aspect of the pipeline largely depends on environmental conditions, choosing an implementation and its config might need a trial-and-error approach.
Configured by speech_segmenter (simple as default, median, band, sphinx) and the corresponding per-class objects simple_speech_segmenter, median_speech_segmenter, band_speech_segmenter, or sphinx_speech_segmenter.
The speech_buffer_limit (30.0) and speech_pause_limit (30.0) configuration values limit the allowed utterance and silence durations. This should prevent excessive buffering in case of spurious speech activity detection or a mis-detected “start” keyword.
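For example, sticking with the default simple segmenter but lowering its threshold for a quiet microphone could look like this; note that the placement of the two limit values at the top level is an assumption here:

speech_segmenter: simple
simple_speech_segmenter:
  threshold: 0.02           # raise in noisy environments, lower for quiet microphones
speech_buffer_limit: 30.0   # defaults shown; top-level placement assumed
speech_pause_limit: 30.0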
Segmenter: Simple
Determine silence/speech audio by a simple absolute RMS/volume threshold, which can require tweaking and good recording environments.
- frames (float): Length of the sliding look-behind window in seconds (2.0).
- threshold (float): RMS threshold, smaller values will be considered silent. Default 0.2.
Segmenter: Median
Determine silence/speech audio by comparing the RMS with the median (percentile) energy. Idea: if the median is smaller than the average, there are peaks, as opposed to a flat noise/silence distribution.
This simple method should adapt itself to background noise and automatically detect volume outliers. The calculation is applied to a sliding window of past audio frames, with a change from speech to silence leading to returning the buffered utterance as a whole.
- frames (float): Length of the sliding look-behind window in seconds (2.0).
- percentile (int): Percentile that is compared with the RMS energy, for example 50 for median (default).
- threshold (float): Percentile-by-RMS factor, greater values will be considered as silence. Default 0.5.
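The underlying comparison can be sketched in a few lines of numpy; this is only an illustration of the percentile-vs-RMS idea (assuming the percentile is taken over absolute sample values), not the actual STTTS code:

import numpy as np

def looks_like_silence(window, percentile=50, threshold=0.5):
    # window: float32 mono samples of the sliding look-behind window
    rms = np.sqrt(np.mean(np.square(window)))
    pct = np.percentile(np.abs(window), percentile)
    # A flat noise/silence distribution keeps the percentile close to the RMS;
    # speech peaks push the RMS up, lowering the ratio below the threshold.
    return pct / max(rms, 1e-9) > threshold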
Segmenter: Band
Use the librosa STFT FFT implementation as a simple band-pass filter. The average contribution of typical voice frequencies is compared against other frequencies in the spectrum. This gives a voice-vs-noise estimate, with a configurable threshold.
- frames (float): Length of the sliding look-behind window in seconds (1.0). This also directly influences the possible FFT resolution.
- threshold (float): Average voice frequency contribution compared to other frequencies, default 1.0.
- freq_start (int): Lower band-pass bound, where the human voice typically starts (256).
- freq_end (int): Upper band-pass bound, where the human voice typically ends (4096).
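Similarly, the voice-vs-noise estimate can be illustrated with librosa; again a rough sketch under assumed parameter handling, not the actual implementation:

import numpy as np
import librosa

def voice_ratio(window, sr=16000, freq_start=256, freq_end=4096):
    # Magnitude spectrogram of the sliding window (float32 mono samples).
    mag = np.abs(librosa.stft(window, n_fft=2048))
    freqs = librosa.fft_frequencies(sr=sr, n_fft=2048)
    voice = (freqs >= freq_start) & (freqs <= freq_end)
    # Average energy of typical voice frequencies vs. the rest of the spectrum;
    # speech would be assumed when this ratio exceeds the configured threshold.
    return float(mag[voice].mean() / max(mag[~voice].mean(), 1e-9))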
Segmenter: Sphinx
Use the PocketSphinx Endpointer for VAD voice activity detection, similar to the basic Segmenter.
- mode (int): Aggressiveness of voice activity detection (0-3, loose-strict, default 0).
- window (float): Length in seconds of the window for decision (0.3).
- ratio (float): Fraction of the window that must be speech or non-speech to make a transition (0.9).
- kwargs: Extra options passed to pocketsphinx.Endpointer.
STT Recognizers
Voice recognizers transcribe speech from audio buffers.
Apart from generic prompts, utterances that only consist of a single keyword are detected.
By the keywords object, alternate hotwords for start, reset, commit, or stop can be configured.
Must be configured by recognizer (sphinx, vosk, whisper) and the corresponding per-class sphinx_recognizer, vosk_recognizer, or whisper_recognizer objects.
Recognizer: Sphinx
Use the Python bindings of the PocketSphinx speech recognizer package.
Direct pocketsphinx.Decoder access is possible without the oversimplified wrappers for audio or live speech, as the SphinxSegmenter implements a pocketsphinx.Endpointer beforehand.
Generic transcription capabilities seem to be rather poor by today’s standards, making it more suitable for specific speech detection on a limited dictionary.
More recent models than the en-us one that the package ships with might be available on the PocketSphinx project page or from SpeechRecognition.
- model_path (str): Common base path for the hmm, lm, and dct arguments. Defaults to using the en-us model that ships with pocketsphinx.
- hmm (str): Sub-path to the directory containing acoustic model files, such as acoustic-model.
- lm (str): Sub-path to the N-Gram language model, such as language-model.lm.bin.
- dct (str): Sub-path to the pronunciation dictionary, such as pronounciation-dictionary.dict.
- kwargs: Extra Config parameters passed to Decoder.
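Put together, a sphinx_recognizer section pointing at an external model could look like this, with the base path being a placeholder:

recognizer: sphinx
sphinx_recognizer:
  model_path: "/path/to/model"   # placeholder directory containing the files below
  hmm: "acoustic-model"
  lm: "language-model.lm.bin"
  dct: "pronounciation-dictionary.dict"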
Recognizer: Vosk
Use the Vosk speech recognition toolkit, with a wide range of models available. This implementation seems to provide good detection capabilities and also runs on low-end hardware.
- model_name (str): Model to use, for example vosk-model-small-en-us-0.15. Omit to list available models if download is enabled.
- download (bool): Opt-in model search and automatic download. Otherwise, ensure the model exists beforehand.
- sample_rate (int): Accepted input sampling rate, default 16000. Might be changed if 16K is not supported by the input recording device.
Recognizer: Whisper
Audio transcription using the OpenAI Whisper speech recognition models, which can be problematic on low-end hardware.
- model_name (str): Name of the model to use, for example base.en. Omit to list available ones.
- language (str): Indicate the input language, default en. Especially important for multilingual models.
- download (bool): Opt-in automatic downloading of models to ~/.cache/whisper/. Otherwise, ensure the model exists beforehand.
- device (str): torch device to use, default cuda if available, otherwise cpu.
- kwargs: Extra arguments passed to whisper.Whisper.transcribe().
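For instance, a German setup analogous to Example 2 might swap in Whisper like this (the multilingual small model is an assumption):

recognizer: whisper
whisper_recognizer:
  model_name: "small"   # multilingual variant assumed; use "small.en" for English only
  language: "de"
  download: true
  device: cpu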
LLM Processors
Processors are the core functionality, formed by LLMs, which receive transcribed prompts and yield tokens to be synthesized to speech output.
Must be configured by processor (noop, ollama, gpt4all) and the corresponding ollama_processor or gpt4all_processor objects, respectively.
Processor: GPT4All
Run language models through the GPT4All Python client around the Nomic and/or llama.cpp backends.
- model_name (str): Model to use, for example Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf. Omit to list remote models if download is enabled.
- model_path (str): Path for model files, default ~/.cache/gpt4all/.
- download (bool): Opt-in model search and automatic download.
- device (str): Explicit processing unit, such as cpu, automatic per default.
- max_tokens (int): The maximum number of tokens to generate (200).
- n_ctx (int): Maximum size of the context window (2048).
- system_prompt (str): Override the initial instruction for the model.
- kwargs: Extra options passed to GPT4All.generate().
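Extending the desktop example above with the optional generation settings might look like (the system prompt is just an example override):

processor: gpt4all
gpt4all_processor:
  model_name: "Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf"
  download: true
  max_tokens: 200
  n_ctx: 2048
  system_prompt: "You are a helpful assistant."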
Processor: Ollama
Run language models on a remote or locally started Ollama server, using the client provided by the ollama package.
If no installation as a system daemon is needed, the self-contained binary can simply be downloaded, for example from https://ollama.com/download/ollama-linux-amd64.tgz.
- model_name (str): Model to use, for example llama3. Omit to list locally available ones. Remotely available models can be browsed in the official model library.
- host (str): API host, default 127.0.0.1:11434.
- download (bool): Opt-in automatic model pull, usually to ~/.ollama/models/.
- serve (bool): Run ollama serve in an own subprocess.
- serve_exe (str): Path to the local binary when using internal serving instead of ollama.
- serve_env (dict): Extra environment variables when using internal serving, see ollama serve --help.
- device (str): Disable CUDA when using internal serving by setting to cpu.
- system_prompt (str): Override the system message from what is defined in the Modelfile.
- kwargs: Extra ollama.Options passed to ollama.Client.generate().
Sentence Segmenters
Not all synthesizers support a streaming operation, i.e., are able to continuously receive text/token input while yielding internally buffered chunks of audio. In its simplest form, sentence segmenters thus combine and flush tokens until certain boundaries are found, for example full stop periods. By this means, playback can start as soon as the first sentence is available, while further tokens and synthesized output is still generated in the background.
Configured by sentence_segmenter (split as default, sbd) and the per-class config objects split_sentence_segmenter or sbd_sentence_segmenter, respectively.
Segmenter: Split
Split streamed text into sentences by applying a simple expression that recognizes newlines or certain punctuation characters followed by space.
- delimiter_chars (str): Characters that end sentences if followed by a space, default .!!??::;.
Segmenter: Boundary
Use the pySBD module for sentence boundary disambiguation.
- language (str): Implementation to use, default en.
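Opting into pySBD instead of the default split segmenter would then look like (note the sbd extra during installation):

sentence_segmenter: sbd
sbd_sentence_segmenter:
  language: "en"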
Output Filters
In a post-processing step, output filters can opt to add either further text or PCM to be played.
For example, beep sounds can indicate readiness or end of output.
Configured by feedback (noop, speech, beep as default).
TTS Synthesizers
As actual text-to-speech implementation, synthesizers receive tokens/sentences and yield audio buffer streams.
Must be configured by synthesizer (espeak, coqui) and the per-class espeak_synthesizer or coqui_synthesizer objects, respectively.
Synthesizer: Espeak
Speech synthesizer using the eSpeak bindings from pyttsx3.
Only the actual C library wrappers are directly used, bypassing the provided loop and ffmpeg-based PCM output.
As a dependency, usually the espeak (or at least libespeak1) package needs to be installed beforehand; this usually also makes a wide range of languages (voices) available.
Results are understandable but typically sound more mechanical than natural by today’s standards. However, it is a viable alternative that runs even on low-end hardware.
- model_name (str): Voice name, for example default or english-us. Omit to list available ones.
- model_path (str): Directory which contains the espeak-data directory, omitted for the default location.
- buffer_length (float): Length in seconds of the sound buffers that are passed to the callback (0.25).
Synthesizer: Coqui
Use the TTS text-to-speech library from Coqui. Originally forked from Mozilla, both projects seem to be discontinued by now, though. A wide range of natural-sounding models is available; some examples can be found at Coqui-TTS Voice Samples.
Comes with lots of additional dependencies, such as espeak, ffmpeg, libav, or rustc.
- model_name (str): Model to use, in the format type/language/dataset/model with type tts_models, for example tts_models/en/ljspeech/tacotron2-DDC. Omit to list available models.
- download (bool): Opt-in automatic downloading of models to ~/.local/share/tts/. Otherwise, ensure the model exists beforehand.
- device (str): Device to use, default cuda if available, otherwise cpu.
- kwargs: Extra options passed to TTS.tts().
Logging
Some libraries show the bad habit of directly using print statements. STTTS tries to unify this to a certain extent by intercepting warnings and standard streams, forwarding them to the logging subsystem instead.
The overall log level is set by the command line.
For more complex scenarios, or simply for excluding too verbose loggers when using DEBUG, a logging config can be provided, which then gets applied by each process instead, such as:
logging:
  version: 1
  formatters:
    colored:
      (): colorlog.ColoredFormatter
      format: "%(asctime)s %(log_color)s%(levelname)-8s%(reset)s %(name)s: %(message)s"
      datefmt: '%Y-%m-%d %H:%M:%S'
  handlers:
    console:
      class: logging.StreamHandler
      formatter: colored
  loggers:
    root:
      handlers:
        - console
    asyncio:
      level: INFO
    numba:
      level: INFO
    torio:
      level: INFO