I would like to add subtitles to some video conferences automatically, and then translate them into different languages. Which speech recognition and translation software would you recommend, please?
I have successfully used Google Cloud Speech-to-Text [1] (the Speech API) to transcribe phone calls in an Enterprise Intelligence application.
Their documentation is very helpful. We used Node.js to call their REST API.
You first have to upload the audio files to a Google Cloud Storage bucket, and then start a long-running operation, which transcribes the audio into JSON and reports back once it has finished.
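To give a rough idea of what that looks like for subtitles, here is a minimal sketch in Python (we used Node.js ourselves, so treat this as an illustration rather than our actual code). It assumes you request word time offsets and that the finished operation returns JSON with `results[].alternatives[].words[]`, where `startTime` and `endTime` are strings such as `"1.300s"`. The helper names (`build_request`, `response_to_srt`) and the 8-words-per-subtitle grouping are my own assumptions, and the HTTP call and authentication are left out.

```python
def build_request(gcs_uri, language_code="en-US"):
    """Body for POST https://speech.googleapis.com/v1/speech:longrunningrecognize.

    Assumption: the audio is already in a Cloud Storage bucket (gs://...).
    Depending on the audio format you may also need to set "encoding" and
    "sampleRateHertz" in the config.
    """
    return {
        "config": {
            "languageCode": language_code,
            "enableWordTimeOffsets": True,  # needed to time the subtitles
        },
        "audio": {"uri": gcs_uri},
    }


def to_seconds(offset):
    """Convert a duration string such as '1.300s' to a float number of seconds."""
    return float(offset.rstrip("s"))


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def response_to_srt(response, max_words=8):
    """Turn the transcription JSON into SRT text, max_words words per cue."""
    cues = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        # Use only the first (most likely) alternative.
        words = alternatives[0].get("words", [])
        for i in range(0, len(words), max_words):
            chunk = words[i:i + max_words]
            start = to_seconds(chunk[0]["startTime"])
            end = to_seconds(chunk[-1]["endTime"])
            text = " ".join(w["word"] for w in chunk)
            cues.append((start, end, text))

    blocks = []
    for number, (start, end, text) in enumerate(cues, 1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

Once you have the cues as plain text, each one can be passed through whichever translation service you choose while keeping the timestamps unchanged.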
I highly recommend checking out the following article on subtitle translation and subtitles in Chinese and English. https://www.aclang.com/blog/challenges_film_translation/