Kaldi Community Roadmap

A series of virtual meetings to set the path forward

Dates of the webinars

Thursday, Sept 17, 11:30-13:00 EDT (Agenda, Video, Meeting Summary)
Thursday, Sept 24, 11:30-13:00 EDT (Agenda, Video, Meeting Summary)
Thursday, Oct 1, 11:30-13:00 EDT (Agenda, Video, Meeting Summary)


Do you want to get involved?

Kaldi ASR: Research and Academic Users – Summary of the meeting

Introduction (Sanjeev Khudanpur)

The town hall was organized to discuss the future of Kaldi. Sanjeev Khudanpur gave an introduction to the evolution of the toolkit. Kaldi remains one of the leading tools for researchers and small companies, and part of its success rests on community contributions that keep it up to date. Kaldi is now preparing major revisions, such as PyTorch-ification of the core neural components, automatic differentiation through FSTs, and streamlined data ingestion. The main questions for continuing the project are what, how, and who, and the help of the community will be needed.

K2 (Dan Povey)

Dan Povey explained his vision of the next Kaldi. The overall plan is to have two main projects, K2 for sequence modeling and Lhotse for data preparation, while also making sure they work with other platforms. The structure of the recipes will be decided later.

The current Kaldi is based on C++ and bash scripts. Its neural network framework is not as powerful as newer frameworks such as PyTorch and TensorFlow, and it is becoming complex. The Kaldi contributors started a prototype Python wrapper, but it was difficult to maintain, so it was abandoned. The idea now is to start from scratch.

Design Considerations

Kaldi's new design will have separate packages for data preparation, training, and so on, that is, smaller and more maintainable projects. It will also work well with frameworks like PyTorch and TensorFlow, and it will avoid the mismatch between training and inference codebases.
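As a sketch of the framework-interoperability point, the snippet below shows a zero-copy tensor exchange through DLPack (listed below as one of K2's dependencies). It uses only standard PyTorch APIs and does not show any K2 code; it only illustrates the mechanism a C++/CUDA library could use to hand its buffers to PyTorch or TensorFlow without copying.

```python
# Minimal sketch of zero-copy tensor exchange via DLPack.
# Only standard PyTorch APIs are used; no K2 code appears here.
import torch
from torch.utils.dlpack import to_dlpack, from_dlpack

x = torch.arange(6, dtype=torch.float32)

# Export the tensor as a DLPack capsule (no data copy) ...
capsule = to_dlpack(x)

# ... and re-import it on the "other side". A library exposing DLPack
# can share its buffers with PyTorch (or TensorFlow) the same way.
y = from_dlpack(capsule)

assert y.data_ptr() == x.data_ptr()  # same underlying memory
```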

Goals

K2 implementation

K2 will be mostly C++/CUDA. The workhorse in K2 will be "RaggedTensor": an arbitrarily nested vector<vector<...>>. It will have dependencies on cub, DLPack, and pybind11.
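The following is a conceptual sketch of a ragged layout: a flat values array plus a row_splits array marking where each sublist starts and ends. This is a common way to store vector<vector<...>>-style data on a GPU; it illustrates the idea only and is not K2's actual API.

```python
# Conceptual sketch of a ragged tensor: flat values + row_splits.
# Illustrative only; not K2's actual implementation.
import numpy as np

# The ragged list [[1, 2], [], [3, 4, 5]] stored flat:
values = np.array([1, 2, 3, 4, 5])
row_splits = np.array([0, 2, 2, 5])  # sublist i is values[row_splits[i]:row_splits[i+1]]

def sublist(i):
    """Return the i-th sublist without any nested Python lists."""
    return values[row_splits[i]:row_splits[i + 1]]

print(sublist(0))  # [1 2]
print(sublist(1))  # []
print(sublist(2))  # [3 4 5]
```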

Why not Pytorch or Tensorflow

The new Kaldi will not be built directly on PyTorch or TensorFlow because algorithms such as composition and determinization cannot be implemented efficiently within those frameworks. TensorFlow's RaggedTensor codebase is too complex and does not integrate with PyTorch. Autograd will instead be implemented at the Python level, which keeps the core algorithms as operations on pure FSAs.
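To illustrate what Python-level autograd can look like, the sketch below wraps a stand-in for an FSA operation in a torch.autograd.Function: a hypothetical C++/CUDA core would return an output plus an arc map saying which input arc produced each output arc, and the Python wrapper routes gradients back through that map. Only the torch APIs shown are real; the operation itself is illustrative.

```python
# Sketch of autograd at the Python level around a non-differentiable core op.
# The "arc map" idea is illustrative; only the torch APIs used here are real.
import torch

class SelectArcScores(torch.autograd.Function):
    @staticmethod
    def forward(ctx, scores, arc_map):
        # 'scores' are per-arc scores of the input FSA; 'arc_map[j]' is the
        # index of the input arc that produced output arc j.
        ctx.save_for_backward(arc_map)
        ctx.num_input_arcs = scores.shape[0]
        return scores[arc_map]

    @staticmethod
    def backward(ctx, grad_out):
        (arc_map,) = ctx.saved_tensors
        grad_in = torch.zeros(ctx.num_input_arcs,
                              dtype=grad_out.dtype, device=grad_out.device)
        grad_in.index_add_(0, arc_map, grad_out)  # scatter gradients back
        return grad_in, None  # no gradient for the integer arc map

scores = torch.randn(5, requires_grad=True)
arc_map = torch.tensor([0, 2, 2, 4])        # as if produced by the core
out = SelectArcScores.apply(scores, arc_map)
out.sum().backward()
print(scores.grad)                          # tensor([1., 0., 2., 0., 1.])
```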

Code

The Kaldi contributors need to rethink algorithms for GPU implementation (because of loops). There will not be a lot of GPU-specific code; most of the C++ lambdas can also run on the CPU, so the reusable code is accessible to more developers. Operations such as pruned composition and computing expectations already run on the GPU; determinization will come later.
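As an illustration of rewriting loop-heavy code into flat, GPU-friendly operations, the sketch below computes per-sublist sums over the ragged layout from the earlier sketch, once with an explicit loop and once with flat array operations. NumPy stands in for GPU array code, and the layout is an assumption, not K2's.

```python
# "Rethink loops" illustration: per-sublist sums over a ragged layout,
# written as a loop and as flat operations that map naturally onto a GPU.
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
row_splits = np.array([0, 2, 2, 5])   # ragged list [[1, 2], [], [3, 4, 5]]

# Loop formulation: easy to read, but serializes poorly on a GPU.
loop_sums = [values[s:e].sum() for s, e in zip(row_splits[:-1], row_splits[1:])]

# Flat formulation: first compute, for every element, the sublist it belongs
# to (row_splits -> row_ids), then reduce all sublists in one operation.
row_ids = np.searchsorted(row_splits, np.arange(len(values)), side="right") - 1
flat_sums = np.bincount(row_ids, weights=values, minlength=len(row_splits) - 1)

print(loop_sums)   # [3.0, 0.0, 12.0]
print(flat_sums)   # [ 3.  0. 12.]
```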

Timeline

The plan is to release the code mid-to-late October. The initial release will include CTC and LF-MMI. By the end of the year, non-trivial uses such as multi-pass jointly optimized systems will be implemented. Later on, the contributors will focus on recipes (whose structure is still undecided), modeling, and Lhotse.

Lhotse (Piotr Zelasko)

Kaldi was designed with experts in mind and is not easy to use. Lhotse aims to make things accessible to a broader community.

Key aspects

Panel Discussion: Kaldi in education and research (Moderated by Jan “Yenda” Trmal)

Kaldi remarks/opinions

Research goals

Education goals

Commitments/plans

Some other ideas