Converting audio and video content into text makes that content searchable, accessible, and easier to analyze. Transcription improves accessibility for people who are deaf or hard of hearing, and it gives content creators, analysts, Python programmers, and researchers a practical tool.
This guide shows how to transcribe audio and video files into text using the AssemblyAI Python SDK.
Prerequisites
Before moving toward the transcription process, make sure that you have all of the essential tools and libraries configured and ready to use.
Create Python Virtual Environment
First of all, open Command Prompt or the terminal of your IDE or code editor and create a new Python virtual environment named “venv”.
python -m venv venv
Activate Python Virtual Environment
The next step is to activate the newly created Python virtual environment on your operating system.
For Windows:
venv\Scripts\activate
For Linux or macOS:
source ./venv/bin/activate
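If you are unsure whether the activation worked, a quick check from Python can help. The following is a minimal sketch; the function name “in_virtual_environment” is our own and is not part of any library.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
```

Run it with the same terminal session in which you activated “venv”.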
Install AssemblyAI Python SDK
After activating the virtual environment, install the AssemblyAI Python SDK.
pip install assemblyai
To verify the installation, display the package details.
pip show assemblyai
In our case, the output shows that AssemblyAI Python SDK version “0.15.1” was installed successfully. Your version may differ.
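The same check can also be done from inside Python using the standard library. This is a small sketch; the helper name “installed_version” is hypothetical, and it simply returns None when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if the SDK is not installed
print("assemblyai version:", installed_version("assemblyai"))
```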
Get AssemblyAI API Key
Navigate to the official AssemblyAI website and click the sign-up button.
Specify your email address and hit the “Get your API key” button.
Then, copy your API key from the screen that appears and save it somewhere safe for later use.
Audio/Video Transcription Using the AssemblyAI Python SDK
After fulfilling all of the mentioned prerequisites, you are now ready to check out the procedure of transcribing audio and video using the AssemblyAI Python SDK.
1. Import Required Libraries
First, create a Python file with the “.py” extension and import the “assemblyai” library, which provides the functionality for interacting with the AssemblyAI API.
This library lets you transcribe audio and video content into text.
import assemblyai as aai
2. Specify the Audio/Video File URL
Next, specify the URL of the audio or video file that you want to transcribe. Here, “URL” simply refers to the location of the content you want to convert to text.
URL = "audio_or_video_file_link"
For instance, in our case, we have added a URL of a podcast episode.
URL = "https://talkpython.fm/episodes/download/356/tips-for-ml-ai-startups.mp3"
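Before sending a source to the API, it can be useful to check that it looks like a web URL or an existing local file (to our knowledge, the SDK accepts local file paths as well as URLs). The sketch below uses only the standard library; the helper name “describe_source” is our own.

```python
from pathlib import Path
from urllib.parse import urlparse


def describe_source(source: str) -> str:
    """Classify a transcription source as 'url', 'file', or 'unknown'."""
    parsed = urlparse(source)
    # A usable web URL needs an http(s) scheme and a host name
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Otherwise, treat it as a path and check that the file exists
    if Path(source).is_file():
        return "file"
    return "unknown"
```

For example, calling describe_source(URL) with the podcast link above returns "url", while the placeholder string "audio_or_video_file_link" returns "unknown" unless a file with that name exists in the working directory.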
3. Set the Output File (Optional)
Now, choose whether you want to save the transcript to a file or display it on the console. We recommend saving it to a file if you want to use it for future reference or analysis.
OUTPUT_FILENAME = "filename.txt"
Specify the filename in the double quotes as we did here.
OUTPUT_FILENAME = "example.txt"
4. Configure Transcription Settings
The AssemblyAI Python SDK permits configuring several settings of the transcription process. However, in our case, we are customizing the formatting and punctuation options of the transcript.
Therefore, we have set the values of “punctuate” and “format_text” to “True”.
More specifically, the “TranscriptionConfig” class is used for defining these settings.
config = aai.TranscriptionConfig(punctuate=True, format_text=True)
5. Initialize the AssemblyAI SDK
Before making API calls, you need to initialize the AssemblyAI SDK. This means setting your unique API key, which the SDK uses to authenticate your requests to the AssemblyAI service, and then creating a “Transcriber” object.
aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()
This step ensures authorized and secure access to the API.
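Hard-coding the key in a script makes it easy to leak by accident, so you may prefer to read it from an environment variable. The following is a minimal sketch; the helper “load_api_key” and the variable name “ASSEMBLYAI_API_KEY” are assumptions chosen for this example.

```python
import os


def load_api_key(env_var: str = "ASSEMBLYAI_API_KEY") -> str:
    """Read the API key from an environment variable, failing loudly if unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

You could then write aai.settings.api_key = load_api_key() instead of pasting the key into the source file.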
6. Call the AssemblyAI API
Now, start the actual transcription process by calling the “transcribe()” method of the instantiated “Transcriber” object, passing the defined URL and the configuration settings as arguments.
This method initiates the transcription of the given video or audio content.
transcript = transcriber.transcribe(URL, config)
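A transcription can fail, for example when the URL is unreachable. As far as we know, the returned transcript object exposes an “error” attribute in that case; the sketch below relies on that assumption and uses a hypothetical helper named “text_or_raise”.

```python
def text_or_raise(transcript) -> str:
    """Return the transcript text, or raise if the transcript reports an error."""
    # Assumption: a failed transcript carries a non-empty `error` message
    error = getattr(transcript, "error", None)
    if error:
        raise RuntimeError(f"Transcription failed: {error}")
    return transcript.text
```

With this helper, text = text_or_raise(transcript) either gives you the text or stops the script with a readable message.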
7. Write Transcription to File or Print to Console
You can either write the transcribed text to a file or print it to your console, depending on your project requirements. If you chose to write it to a file, the given code opens that file and writes the transcribed text to it.
Otherwise, if no output file has been specified, the transcribed text is shown on the console.
if OUTPUT_FILENAME:
    with open(OUTPUT_FILENAME, "w") as file:
        file.write(transcript.text)
else:
    print(transcript.text)
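Transcripts often contain non-ASCII characters, and the default file encoding differs between operating systems. The variant below is a small sketch that writes the file as UTF-8 explicitly; the function name “save_or_print” is our own.

```python
from pathlib import Path
from typing import Optional


def save_or_print(text: str, output_filename: Optional[str] = None) -> None:
    """Write text to a UTF-8 file, or print it when no filename is given."""
    if output_filename:
        # An explicit encoding avoids platform-dependent defaults
        Path(output_filename).write_text(text, encoding="utf-8")
    else:
        print(text)
```

In the script above, this would be called as save_or_print(transcript.text, OUTPUT_FILENAME).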
8. Complete Code for Transcription
Here is the complete code for transcribing audio or video files. Make sure to replace “YOUR_API_KEY” with your AssemblyAI API key.
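import assemblyai as aai

# Audio or video file to transcribe
URL = "https://talkpython.fm/episodes/download/356/tips-for-ml-ai-startups.mp3"
# Set to an empty string to print the transcript instead of saving it
OUTPUT_FILENAME = "example.txt"

# Enable punctuation and text formatting
config = aai.TranscriptionConfig(punctuate=True, format_text=True)

# Authenticate and create the transcriber
aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()

# Run the transcription
transcript = transcriber.transcribe(URL, config)

# Save the transcript to a file, or print it if no filename is set
if OUTPUT_FILENAME:
    with open(OUTPUT_FILENAME, "w") as file:
        file.write(transcript.text)
else:
    print(transcript.text)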
Why Transcribe Audio and Video Files into Text?
Transcribing audio and video files into text offers several benefits, such as:
- Accessibility – Ensuring content inclusivity for the hearing impaired.
- Searchability – Improving SEO and content discoverability.
- Analysis – Enabling detailed content analysis and data mining.
- Repurposing – Facilitating content adaptation into various formats.
- Translation – Allowing accurate content translation.
- Legal Compliance – Meeting legal requirements for records.
- Education – Enhancing learning materials and comprehension.
- Collaboration – Providing textual records for remote teamwork.
- Data Enrichment – Contributing to training speech recognition models.
That brings us to the end of today’s guide on transcription.
Conclusion
Transcribing audio and video files into text makes multimedia content far more useful. With the AssemblyAI Python SDK, you can add this capability to a Python project in just a few lines of code.
From SEO and accessibility to content analysis and beyond, transcription plays an essential role in unlocking different possibilities in your Python project. So, use the AssemblyAI Python SDK to get the most out of your multimedia content.
Want to learn more about Python? Check out our dedicated Python Tutorial Series!