Is OpenAI’s Whisper better than Dragon?
Thread poster: Hans Lenting
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Mar 12, 2023

Who will be the first to tell us?

https://youtu.be/8SQV-B83tPU

Hey Michael: did you read this/do you have plans for this Sunday?


 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 12:21
Member (2014)
Japanese to English
Interesting, but... Mar 12, 2023

Hans Lenting wrote:
Hey Michael: did you read this/do you have plans for this Sunday?

I agree, let's get Mr Beijer to do the hard work.
But isn't this online only? To put it another way, isn't all content recorded...? I think I read something about privacy concerns.

Dan


 
Tom in London
Tom in London
United Kingdom
Local time: 12:21
Member (2008)
Italian to English
Intolerable Mar 12, 2023

Hans Lenting wrote:

Who will be the first to tell us?

https://youtu.be/8SQV-B83tPU

Hey Michael: did you read this/do you have plans for this Sunday?


That guy in the video is I N T O L E R A B L E


 
Baran Keki
Baran Keki  Identity Verified
Türkiye
Local time: 14:21
Member
English to Turkish
Tolerable or not Mar 12, 2023

Tom in London wrote:
That guy in the video is I N T O L E R A B L E

Apparently his 2 million plus subscribers find him tolerable enough... He must be making shitloads more money from YouTube, by annoying people like you, than you'll ever make from translation.


 
Milan Condak
Milan Condak  Identity Verified
Local time: 13:21
English to Czech
I posted a short note in the Czech forum on January 16 Mar 12, 2023

Hans Lenting wrote:

Who will be the first to tell us?


Jan 16

4 Free Apps to Automatically Transcribe Audio and Video Files into Text or Subtitles Powered by Open AI's Whisper.

https://www.proz.com/forum/czech/360628-4_bezplatné_aplikace_pro_automatický_přepis_řeči_do_textu.html

---
I have been using Whisper for three months, mostly through the Buzz application (for two months, on my home Windows PC without a powerful GPU), on CPU only.
My son compared GPU and CPU: the GPU process is six times faster.
The quality is better than Google's automatic transcription on YouTube. Dragon and YouTube do not support transcribing Czech speech.
It works only for transcribing. For dictation there are other applications.

You can find another post in the Ukrainian forum:

https://www.proz.com/forum/ukrainian/361316-краще_по_для_переведення_аудіо_в_текст_українською_мовою.html

Milan


 
Tom in London
Tom in London
United Kingdom
Local time: 12:21
Member (2008)
Italian to English
OK Mar 12, 2023

Baran Keki wrote:

Tom in London wrote:
That guy in the video is I N T O L E R A B L E

Apparently his 2 million plus subscribers find him tolerable enough... He must be making shitloads more money from YouTube, by annoying people like you, than you'll ever make from translation.


I'm doing fine thanks.


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:21
Member (2009)
Dutch to English
+ ...
Hi Hans Mar 12, 2023

Hans Lenting wrote:

Who will be the first to tell us?

https://youtu.be/8SQV-B83tPU

Hey Michael: did you read this/do you have plans for this Sunday?


I have indeed already been investigating how to use Whisper to dictate text in Windows. Haven't gotten anywhere yet, but I might ask over at the knowbrainer forums, where someone might know more.

See e.g.:
https://www.knowbrainer.com/forums/forum/index.cfm
https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=36875&highlight_key=y&keyword1=whisper (‘Whisper vs Dragon’)

I am currently on the fence as to whether to upgrade to the new Dragon Professional v16, also because Voice Access in Win11 is just so good already. In my testing it seems to be pretty much as good as Dragon already at dictating flowing text.

I also recently discovered you can send Voice Access to sleep and wake it up with a keyboard shortcut (Alt+Shift+B, which I have changed to something easier with AutoHotkey). You can of course also do this by voice, by saying: ‘Voice access wake up’ & ‘Voice access sleep’.

Voice Access also works really well in memoQ. You can't select words, but you can dictate your target segment perfectly, and you can even do basic things like confirm segments (by saying ‘Press control enter’) or go to the top of the file (by saying ‘Press control home’).

Another good thing about Windows 11's Voice Access is that it runs entirely in the cloud. Leaving aside the privacy concerns this might raise for some people, it means that dictating text does not use any of your computer's CPU/GPU cycles, unlike Dragon and Whisper.

[Edited at 2023-03-12 14:16 GMT]


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:21
Member (2009)
Dutch to English
+ ...
interesting post @ knowbrainer.com/forums/… Mar 12, 2023

very interesting post by a person called rjwilmsi @ https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=36875&highlight_key=y&keyword1=whisper

01/30/2023 07:06

rjwilmsi
Power Member

Posts: 77
Joined: 08/24/2008

I've been playing with whisper more or less since it was first released on github https://github.com/openai/whisper/. It's freely available on github and after installation and download of the models it runs entirely offline. The installation is easy if you're a Linux user used to fiddling with python/utilities from github, probably a bit challenging if you are new to that sort of set up. I haven't tested the Windows installation of whisper, though; maybe that is packaged up and so easier. I use it in conjunction with a "whisper_mic" utility to get live dictation: https://github.com/mallorbc/whisper_mic

Compared to the last versions of DNS that I was using regularly (DNS 12 and DNS 13), for general speech and dictation the accuracy of whisper is much better. I don't know if DNS 15 is much more accurate than 12 or 13; if it's just incrementally better then I'd say whisper would be clearly better. Whisper has 5 different models trading time versus accuracy; on the fastest two models its error rate is much lower than I got with DNS (for live dictation of general speech in English). On the slower models (which are too slow on CPU for real time dictation) the accuracy on everything I've played with (youtube video audio etc.) has been so good that differences versus a transcript I'd do myself are nearly all differences about punctuation (how do you split sentences etc. from a speaker speaking off the cuff).

Because whisper is done by machine learning on big datasets of various audio sources it supports multiple languages, accents etc. so there is no concept of training it for your voice or specifying your accent. It means that it is very good at all sorts of accents and you don't have to cultivate your own custom trained profile etc. Also there is no big focus on your mic quality like with DNS.

The downside of whisper is that it's much more resource demanding than DNS 12: on CPU you need a modern 6+ core CPU and have to use the tiny or base model (the fastest two). For the larger 3 models you need a workstation CPU (16 threads etc.) or, better, a reasonable NVIDIA graphics card to run in CUDA mode (e.g. a GTX 1060 or better), and unless you have a high end GPU (RTX etc.) transcription would still be slower than real time on those larger models. Also there is no "training" you can do and no custom vocabulary, so if you have specific terms that it hasn't got in its model it may not work so well there, though there is a "prompt" option where you can pre-feed it words likely to be in the audio (so a sort-of custom vocab option). And of course its core is just speech recognition, so it doesn't of itself provide any computer automation / macro functionality.

So I think if you use DNS for general dictation / transcription and find DNS's accuracy lacking then whisper is very much worth looking at. If you use DNS tied into the environment of automation, macros and specific software integration then whisper doesn't cover that.
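
For anyone who wants to try the "prompt" idea from the post above, here is a minimal Python sketch, not rjwilmsi's own setup. It assumes the openai-whisper package from the GitHub repository linked above is installed (along with ffmpeg); the helper names pick_model, build_prompt and transcribe are made up for illustration, and the hardware thresholds in pick_model are only a loose reading of the rule of thumb in the post.

```python
from typing import Optional, Sequence

# The five model sizes mentioned in the post, fastest first.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def pick_model(has_cuda_gpu: bool, cpu_cores: int) -> str:
    """Hypothetical helper: loose reading of the post's hardware advice."""
    if has_cuda_gpu:
        return "medium"  # assumption: larger models want an NVIDIA card in CUDA mode
    if cpu_cores >= 6:
        return "base"    # "modern 6+ core CPU" with one of the fastest two models
    return "tiny"


def build_prompt(terms: Sequence[str]) -> str:
    """Join likely terms into one string to pre-feed as the prompt."""
    return ", ".join(t.strip() for t in terms if t.strip())


def transcribe(path: str, terms: Sequence[str] = (), model_name: str = "base",
               language: Optional[str] = None) -> str:
    """Transcribe an audio file, passing specialist terms as the initial prompt."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(
        path,
        language=language,  # None lets Whisper detect the language
        initial_prompt=build_prompt(terms) or None,
    )
    return result["text"]
```

A call might look like transcribe("dictation.mp3", terms=["memoQ", "CafeTran"], model_name=pick_model(False, 8)). The prompt only nudges recognition towards those words; it is not a trained vocabulary.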


 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 12:21
Member (2014)
Japanese to English
Yes, this Mar 12, 2023

Michael Beijer wrote:
very interesting post by a person called rjwilmsi

Thank you Michael, this is the thread I read. It's interesting and the technology certainly sounds promising. Hopefully the "prompt" function means that specialist terminology could be used.

Those comments have also persuaded me that my next PC should have a proper GPU. If the PC has Thunderbolt 4 ports one could use an eGPU, but that would not be a portable solution.

My current system is fairly puny and has only integrated graphics, which is why I haven't bothered trying Whisper.

Regards,
Dan


 
Gerard de Noord
Gerard de Noord  Identity Verified
France
Local time: 13:21
Member (2003)
English to Dutch
+ ...
Graphics processing unit Mar 12, 2023

Dan Lucas wrote:

Those comments have also persuaded me that my next PC should have a proper GPU.

Dan


That's the first thing I thought too. I hadn't thought twice about GPU before. Geeks like Milan, Hans and Michael can really guide us to stay ahead of the wave.

Cheers,
Gerard


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:21
Member (2006)
English to Afrikaans
+ ...
@Hans Mar 13, 2023

Hans Lenting wrote:
https://youtu.be/8SQV-B83tPU

I can't tell if it's better or not, but I used the online version (i.e. via Google Colab) to transcribe some Zoom meetings. I hit my free limit after transcribing about 7 hours of audio. Then I bought 100 credits for $12. Having paid gives me access to "Premium" GPU, and I can confirm that it is a bit faster than the free tier. One annoying thing is that there is a usage timeout (which can occur even while a transcription process is running), at which point all data is lost, so it's best not to transcribe more than 3 hours of audio in a single go. It eats about 13 credits per hour.
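
To put those numbers in perspective, here is a small Python sketch of the arithmetic, plus one way to group recordings so that no single run exceeds about 3 hours of audio. It assumes "13 credits per hour" means per hour of Colab runtime, which the post does not spell out; the function names and the sample durations are made up for illustration.

```python
from typing import List, Tuple

CREDITS_BOUGHT = 100
PRICE_USD = 12.0
CREDITS_PER_HOUR = 13   # assumption: per hour of Colab runtime
MAX_BATCH_HOURS = 3.0   # "not more than 3 hours of audio in a single go"


def runtime_hours(credits: float = CREDITS_BOUGHT,
                  rate: float = CREDITS_PER_HOUR) -> float:
    """Hours of runtime that a block of credits buys."""
    return credits / rate


def cost_per_hour(price: float = PRICE_USD, credits: float = CREDITS_BOUGHT,
                  rate: float = CREDITS_PER_HOUR) -> float:
    """Dollar cost of one hour of runtime."""
    return price / credits * rate


def plan_batches(files: List[Tuple[str, float]],
                 limit_hours: float = MAX_BATCH_HOURS) -> List[List[str]]:
    """Group (name, duration in hours) pairs, in order, under the limit.

    A single file longer than the limit still gets a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    total = 0.0
    for name, hours in files:
        if current and total + hours > limit_hours:
            batches.append(current)
            current, total = [], 0.0
        current.append(name)
        total += hours
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    print(f"{runtime_hours():.2f} hours of runtime per 100 credits")
    print(f"${cost_per_hour():.2f} per hour of runtime")
    print(plan_batches([("a.mp3", 1.5), ("b.mp3", 1.0), ("c.mp3", 1.0)]))
```

Under that assumption, 100 credits last a little under 8 hours of runtime, at roughly $1.56 per hour.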

resources

When transcribing Zoom meetings and webinars, obviously it's faster to upload only the audio, so to extract MP3 from MP4 I use Pazera's converter.
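
If you would rather script the MP4-to-MP3 step, a rough Python equivalent is sketched below. Note that it swaps in the ffmpeg command-line tool instead of Pazera's converter, so ffmpeg has to be installed and on your PATH; the function names and the quality setting are just illustrative choices.

```python
import subprocess
from pathlib import Path
from typing import List


def mp3_command(video: str) -> List[str]:
    """Build an ffmpeg command that keeps only the audio track as MP3."""
    out = str(Path(video).with_suffix(".mp3"))
    return [
        "ffmpeg",
        "-i", video,               # input video file
        "-vn",                     # drop the video stream
        "-codec:a", "libmp3lame",  # encode the audio as MP3
        "-q:a", "4",               # variable bitrate, medium quality
        out,
    ]


def extract_audio(video: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(mp3_command(video), check=True)
```

For example, extract_audio("meeting.mp4") would write meeting.mp3 next to the original file.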

I'm not sure how much better Whisper is at transcribing such files than e.g. YouTube. It's faster (well, with Whisper the results are available within an hour or so, compared to 3-4 days with YouTube).

[Edited at 2023-03-13 16:21 GMT]


 
Milan Condak
Milan Condak  Identity Verified
Local time: 13:21
English to Czech
Try Buzz - Free Transcription App Mar 13, 2023

Samuel Murray wrote:

Then I bought 100 credits for $12. Having paid gives me access to "Premium" GPU, and I can confirm that it is a bit faster than the free tier. One annoying thing is that there is a usage timeout (which can occur even while a transcription process is running), at which point all data is lost, so it's best not to transcribe more than 3 hours of audio in a single go.

It eats about 13 credits per hour.


I'm not sure how much better Whisper is at transcribing such files than e.g. YouTube. It's faster (well, with Whisper the results are available within an hour or so, compared to 3-4 days with YouTube).


Try Buzz: it is free and runs on your own PC.

https://www.youtube.com/playlist?list=PLG8jlFKr-RtchdAg069DGCFTpVBae6-3R

Buzz - Free Transcription App Tutorials, 3:11 min
David Mbugua
6. 1. 2023

Batch Transcription Using Buzz - How to Automatically Transcribe Multiple Files for Free Using Buzz
--
I downloaded it, unzipped it and ran it on my new PC. For example, I open ten MP4 files at once.
I select the language manually and I always use the Large Whisper model. Buzz converts MP4 to MP3 automatically.
The results are TXT, SRT (or VTT) files.
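
For readers curious what the SRT output involves, here is a minimal Python sketch of turning timed segments into SRT text. It is not Buzz's actual code. It assumes segments shaped like the ones the openai-whisper package returns from transcribe(), i.e. dictionaries with "start" and "end" in seconds and a "text" string; the function names are made up for illustration.

```python
from typing import Dict, Iterable


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Dict]) -> str:
    """Number each segment and join them as SRT subtitle blocks."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

A TXT export is then just the segment texts joined together, while SRT and VTT keep the timings so the text can be shown as subtitles.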

Milan


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:21
Member (2006)
English to Afrikaans
+ ...
Buzz Mar 13, 2023

Milan Condak wrote:
Try Buzz: it is free and runs on your own PC.

Malwarebytes is not happy with Buzz:
https://www.malwarebytes.com/blog/detections/malware-ai


 
Milan Condak
Milan Condak  Identity Verified
Local time: 13:21
English to Czech
Thanks for the info Mar 13, 2023

Samuel Murray wrote:

Malwarebytes is not happy with Buzz:



Samuel, thank you for the information. I'll inform the author if you haven't already.
"My" Buzz is not running on my main working PC. I have been using it for two months. Any program that creates new files tends to be treated as suspicious.

https://github.com/chidiwilliams/buzz/releases/tag/v0.7.2

Did you test the source code?
--
I replaced version 0.7.0 with version 0.7.2, again using "windows.tar.gz". After unzipping the file, Windows 11 reported "Unknown publisher". I still use Buzz.

Milan


[Edited at 2023-03-13 21:49 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:21
Member (2006)
English to Afrikaans
+ ...
@Milan Mar 14, 2023

Milan Condak wrote:
After unzipping the file, Windows 11 reported "Unknown publisher".

Yes, that happened to me too, but that by itself is not concerning. I have such faith in my anti-virus tools that I would install such a program anyway. But when I ran the program the first time, Malwarebytes complained and quarantined the program. Malwarebytes isn't always right, though -- I have a folder on my computer that is excluded from scanning from where I run programs that Malwarebytes doesn't like. The explanation given by Malwarebytes on their website about this particular threat is so generic that I'm tempted not to take it seriously.


 