Spearheading Open-Source Speech Enhancement
As more of our daily lives revolve around consuming content online, the bar for quality keeps rising. The demand for better audio grows by the day.
Speech enhancement, however, is a complex field of study. Take denoising: separating the speech we want from background noise requires training models that can distinguish the two under a wide range of conditions.
Yet much of this technology sits behind a paywall, even though all the components needed to open the floodgates for open-source innovation are already in place.
SoundsRight is dedicated to the research and development of non-proprietary speech enhancement models through daily fine-tuning competitions, powered by the decentralized Bittensor ecosystem.
The subnet’s miners upload models to HuggingFace, and validators benchmark them on a fresh dataset generated every day to find the miner with the best-performing model.
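Benchmarking a speech enhancement model generally means comparing its output against clean reference audio with an objective metric. As an illustration only (the subnet's actual scoring code and metric choice are not specified here), a common metric is scale-invariant SDR, which can be computed in a few lines of NumPy:

```python
import numpy as np

def si_sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference; the projection is the "target"
    # component, and everything left over counts as distortion.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    distortion = estimate - target
    return float(10 * np.log10(np.dot(target, target) / np.dot(distortion, distortion)))

# A lightly corrupted estimate scores higher than a heavily corrupted one.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16_000)               # stand-in for 1 s of 16 kHz speech
mild = clean + 0.1 * rng.standard_normal(16_000)
severe = clean + 0.8 * rng.standard_normal(16_000)
print(si_sdr(clean, mild), si_sdr(clean, severe))
```

A validator would run each miner's model over the day's dataset, average a metric like this across clips, and rank the results.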
Competitions for Multiple Use-Cases
The subnet hosts competitions for both denoising and dereverberation tasks at a 16 kHz sample rate, and will expand to 48 kHz competitions in upcoming updates. This gives the subnet's models a wide range of potential applications.
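Since 48 kHz is an integer multiple of 16 kHz, audio recorded at the higher rate can be downsampled cleanly for the 16 kHz competitions. A minimal sketch using SciPy's polyphase resampler (the tone and rates here are just for illustration):

```python
import numpy as np
from scipy.signal import resample_poly

# One second of a 440 Hz tone at 48 kHz, standing in for real audio.
sr_in, sr_out = 48_000, 16_000
t = np.arange(sr_in) / sr_in
audio_48k = np.sin(2 * np.pi * 440 * t)

# 48 kHz -> 16 kHz is an exact 1:3 ratio, so polyphase resampling is lossless
# in timing: one output sample for every three input samples.
audio_16k = resample_poly(audio_48k, up=1, down=3)
print(len(audio_16k))
```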
Winner-Takes-All Competitions
The winner-takes-all format has a few implications for the subnet. Combined with the validation mechanism, it deters miner factions by rendering model duplication unviable. Most importantly, miners must put their best foot forward to come out on top.
Continuous Dataset Generation
Subnet validators generate a new dataset for every competition, so miner models cannot simply overfit to a fixed benchmark. Miners can also generate fresh datasets of their own during fine-tuning, letting them perpetually train their models on new data.
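The core of generating a denoising dataset is pairing clean speech with a noisy version of itself, typically by mixing in noise at a controlled signal-to-noise ratio. A minimal sketch of that idea (not the subnet's actual pipeline; the function name and signals are illustrative):

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested SNR, then add it to `clean`."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that brings the noise to the target level relative to the speech.
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise

# Build one (noisy, clean) training pair at a randomly chosen SNR.
rng = np.random.default_rng(42)
speech = np.sin(2 * np.pi * 220 * np.arange(16_000) / 16_000)  # stand-in for real speech
noise = rng.standard_normal(16_000)
noisy = mix_at_snr(speech, noise, snr_db=rng.uniform(0, 20))
```

Varying the speech clips, noise sources, and SNRs on every run is what keeps each day's dataset fresh.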