Spearheading Open-Source Speech Enhancement
As more of our daily lives revolve around consuming content online, the bar for quality keeps rising. The demand for better audio grows by the day.
Speech enhancement, however, is a complex field of study. Take denoising: separating the speech we want from background noise requires training models that can distinguish the two under a wide range of conditions.
Yet much of this technology sits behind a paywall, even though all the components needed to open the floodgates for open-source innovation are already in place.
SoundsRight is dedicated to the research and development of non-proprietary speech enhancement models through daily fine-tuning competitions, powered by the decentralized Bittensor ecosystem.
The subnet’s miners upload models to HuggingFace, and validators benchmark them on a fresh dataset generated every day to find the miner with the best-performing model.
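Benchmarking a speech enhancement model generally means comparing its output against clean reference audio with an objective metric. As an illustration only (the subnet's actual scoring code and metric choice are not specified here), a common metric is scale-invariant SDR, which can be computed in a few lines of NumPy:

```python
import numpy as np

def si_sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference; the projection is the "target"
    # component, and everything left over counts as distortion.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    distortion = estimate - target
    return float(10 * np.log10(np.dot(target, target) / np.dot(distortion, distortion)))

# A lightly corrupted estimate scores higher than a heavily corrupted one.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16_000)               # stand-in for 1 s of 16 kHz speech
mild = clean + 0.1 * rng.standard_normal(16_000)
severe = clean + 0.8 * rng.standard_normal(16_000)
print(si_sdr(clean, mild), si_sdr(clean, severe))
```

A validator would run each miner's model over the day's dataset, average a metric like this across clips, and rank the results.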
Competitions for Multiple Use-Cases
The subnet hosts competitions for both denoising and dereverberation tasks at a 16 kHz sample rate, and will expand to 48 kHz competitions in upcoming updates. This gives the subnet's models a wide range of potential applications.
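Since 48 kHz is an integer multiple of 16 kHz, audio recorded at the higher rate can be downsampled cleanly for the 16 kHz competitions. A minimal sketch using SciPy's polyphase resampler (the tone and rates here are just for illustration):

```python
import numpy as np
from scipy.signal import resample_poly

# One second of a 440 Hz tone at 48 kHz, standing in for real audio.
sr_in, sr_out = 48_000, 16_000
t = np.arange(sr_in) / sr_in
audio_48k = np.sin(2 * np.pi * 440 * t)

# 48 kHz -> 16 kHz is an exact 1:3 ratio, so polyphase resampling is lossless
# in timing: one output sample for every three input samples.
audio_16k = resample_poly(audio_48k, up=1, down=3)
print(len(audio_16k))
```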
Winner-Takes-All Competitions
The winner-takes-all format has a few implications for the subnet. Combined with the validation mechanism, it deters miner factions by rendering model duplication unviable. Most importantly, miners must put their best foot forward to come out on top.
Continuous Dataset Generation
Subnet validators generate a new dataset for every competition, so miner models cannot simply overfit to a fixed benchmark. Miners can also generate fresh datasets of their own during fine-tuning, letting them perpetually train their models on new data.
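The core of generating a denoising dataset is pairing clean speech with a noisy version of itself, typically by mixing in noise at a controlled signal-to-noise ratio. A minimal sketch of that idea (not the subnet's actual pipeline; the function name and signals are illustrative):

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested SNR, then add it to `clean`."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that brings the noise to the target level relative to the speech.
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise

# Build one (noisy, clean) training pair at a randomly chosen SNR.
rng = np.random.default_rng(42)
speech = np.sin(2 * np.pi * 220 * np.arange(16_000) / 16_000)  # stand-in for real speech
noise = rng.standard_normal(16_000)
noisy = mix_at_snr(speech, noise, snr_db=rng.uniform(0, 20))
```

Varying the speech clips, noise sources, and SNRs on every run is what keeps each day's dataset fresh.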