Whisper

AI Speech Recognition

https://github.com/openai/whisper

Introduction

Whisper is a general-purpose speech recognition model developed by OpenAI. It is a Transformer sequence-to-sequence model trained on a large, diverse audio dataset across several tasks at once: multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, so a single model can replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
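To make the special-token idea concrete, here is a minimal sketch of how a task could be encoded as a token prefix for the decoder. The token spellings follow the Whisper tokenizer as commonly documented; the helper name build_decoder_prompt is hypothetical, and the sketch produces token strings rather than the integer IDs the real model consumes.

```python
from typing import List


def build_decoder_prompt(language: str = "en",
                         task: str = "transcribe",
                         timestamps: bool = False) -> List[str]:
    """Illustrative only: assemble the special-token prefix that tells the
    decoder which language and task to handle. Hypothetical helper, not
    part of the whisper package."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")

    tokens = [
        "<|startoftranscript|>",  # marks the start of decoding
        f"<|{language}|>",        # language tag (also the language-ID target)
        f"<|{task}|>",            # task specifier
    ]
    if not timestamps:
        # Ask the decoder for plain text without timestamp tokens.
        tokens.append("<|notimestamps|>")
    return tokens
```

For example, build_decoder_prompt("fr", "translate") yields a prefix asking for French audio to be translated without timestamps. Changing only these leading tokens switches the task while the model weights stay the same.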

How To Use

Whisper can be used from the command line or from Python. On the command line, pass one or more audio files and a model size to the whisper command. In Python, load a model and call its transcribe() method on an audio file.
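Both routes can be sketched in a few lines. This assumes the openai-whisper package and ffmpeg are installed; the wrapper function names, the file name "audio.mp3", and the choice of the "base" model are illustrative assumptions, not requirements.

```python
import subprocess
from typing import List


def build_cli_command(audio_path: str, model_name: str = "base") -> List[str]:
    """Argument list for the `whisper` command-line tool (hypothetical helper)."""
    return ["whisper", audio_path, "--model", model_name]


def transcribe_with_cli(audio_path: str, model_name: str = "base") -> None:
    """Run the CLI; it writes transcript files next to the current directory."""
    subprocess.run(build_cli_command(audio_path, model_name), check=True)


def transcribe_with_python(audio_path: str, model_name: str = "base") -> str:
    """Load a model and return the transcribed text."""
    # Imported lazily so this file can be read or tested without the package.
    import whisper  # assumes: pip install -U openai-whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)   # dict with "text", "segments", "language"
    return result["text"]
```

Calling transcribe_with_python("audio.mp3") would return the transcript as a string. Larger model sizes tend to be more accurate but slower and heavier on memory, so "base" is only a starting point.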

Pricing

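Whisper itself is free to use: the code and model weights are released under the MIT License. The tiers below are GitHub's plans for hosting repositories, not charges for Whisper.

Packages	Pricing	Features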
Free Edition	Free	Unlimited public repositories, limited private repositories
Team Edition	$4/user/month	Unlimited private repositories, basic features
Enterprise Edition	$21/user/month	Advanced security and auditing features
