How to Generate Captions and Transcripts for Video and Audio-only Files

If you plan to create video or audio media for learning or presentation materials, you will need to include captions and transcripts. Ohio State University adheres to the Minimum Digital Accessibility Standards (MDAS), which require captions for videos and transcripts for both video and audio media. These requirements ensure functional accessibility for individuals with disabilities and for anyone else who needs this accommodation.

Video Media

Captions can be helpful for anyone who views a video, especially those viewing videos in a second language, people watching videos in noisy environments, or anyone who needs additional help making sense of the material presented in the video. Ohio State supports three tools for generating and editing captions for video media. Each of these university-approved tools can also produce text transcripts to provide alongside your captioned videos:

  1. CarmenZoom
  2. Microsoft Stream
  3. Mediasite

You can record your media directly using any of the tools listed above, and in the case of Microsoft Stream and Mediasite, you also have the option to upload MP4 media files that have been created elsewhere.


CarmenZoom

Using CarmenZoom, you can record and caption video media. If you are unfamiliar with this process, you can consult our student-facing tutorial “CarmenZoom Video Capture” for detailed instructions. The instructions below focus on the key steps for generating and accessing captions and transcripts. NOTE: CarmenZoom recordings are best suited to content that is created or recreated on a regular basis, since CarmenZoom automatically deletes recordings after 120 days. If you’d like to retain long-term access to your CarmenZoom recordings, we recommend downloading them and re-uploading them to one of the other approved platforms listed here.
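
To plan around the 120-day limit, it can help to work out when a given recording is due to be removed. The sketch below is only an illustration: it assumes the 120 days are counted from the recording date, and the function names are hypothetical, not part of CarmenZoom.

```python
from datetime import date, timedelta

# Retention period stated in the note above.
RETENTION_DAYS = 120


def deletion_date(recorded_on: date, retention_days: int = RETENTION_DAYS) -> date:
    """Return the date on which a recording is expected to be deleted.

    Assumption: the retention period is counted from the recording date.
    """
    return recorded_on + timedelta(days=retention_days)


def days_remaining(recorded_on: date, today: date,
                   retention_days: int = RETENTION_DAYS) -> int:
    """Return how many days are left before deletion (never negative)."""
    remaining = (deletion_date(recorded_on, retention_days) - today).days
    return max(remaining, 0)


if __name__ == "__main__":
    recorded = date(2024, 1, 1)
    print(deletion_date(recorded))                     # 2024-04-30
    print(days_remaining(recorded, date(2024, 4, 20)))  # 10
```

A recording made on January 1, 2024 would, under this assumption, need to be downloaded and re-uploaded elsewhere before April 30, 2024.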

You can record captioned videos in CarmenZoom from your laptop or tablet. From a laptop, you can log in via a browser or the desktop application. If you are using a tablet, be sure to download the Zoom app to your device first. To record and auto-generate captions for your video media in CarmenZoom:

  1. Log in to CarmenZoom using your Ohio State credentials. NOTE: If you are using the app, you’ll need to select the SSO sign-in option and then enter your Ohio State credentials. (Fig. 1)
  2. Start a meeting in your personal meeting room. (Fig. 2)
  3. To turn on live transcription, click the Show Captions option in your Zoom controls. This will bring up a pop-up on screen. As you start speaking, captions will appear beneath your video or shared screen. (Fig. 3)
  4. You are now ready to start recording. If you need to pause the recording at any time, for example to switch screens, select Pause Recording and then Resume Recording. When you have finished, click the Stop Recording button followed by the End button to generate your recording. When it has finished processing, you will receive an email notification in your Ohio State email account. (Fig. 4)
  5. Once your recording is ready, click on the “For host only, click here to view your recording detail” link in your email to access the “Shared screen with speaker view” to make any necessary edits to the transcript. You’ll also have the option to trim the start and end time of your recording. (Fig. 5)
  6. From here, you can share a link to your captioned video by clicking the “Copy sharable link” button. You can also download and share a copy of the video transcript. Under the heading that reads “The recording includes the files listed below,” you’ll see an Audio Transcript option. Click on this to download the transcript as a .vtt file. This file type can be opened in a text editor and then saved as a PDF or Word document. (Fig. 6)
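
A downloaded .vtt file contains timing lines and cue numbers in addition to the spoken text. If you would rather start from clean text before saving to PDF or Word, a short script can strip the timing information. The following is a minimal sketch that assumes a standard WebVTT layout (a `WEBVTT` header followed by cues separated by blank lines); the function name and the file names in the usage note are hypothetical.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.000".
# The hours field is optional in WebVTT.
TIMING_LINE = re.compile(
    r"^\s*(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)


def vtt_to_text(vtt: str) -> str:
    """Return the caption text of a WebVTT document, one cue per line."""
    cleaned = vtt.lstrip("\ufeff").replace("\r\n", "\n").strip()
    cues = []
    # Blocks (header, notes, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        if lines[0].strip().startswith(("WEBVTT", "NOTE", "STYLE", "REGION")):
            continue  # skip the header and non-cue blocks
        for index, line in enumerate(lines):
            if TIMING_LINE.match(line):
                # Everything after the timing line is the cue text.
                text = " ".join(p.strip() for p in lines[index + 1:] if p.strip())
                text = re.sub(r"<[^>]+>", "", text)  # drop inline tags like <v Name>
                if text:
                    cues.append(text)
                break
    return "\n".join(cues)
```

To use it on a downloaded file, you could read the file with `pathlib.Path("recording.vtt").read_text(encoding="utf-8")`, pass the result to `vtt_to_text`, and write the output to a `.txt` file. Review the result against the video before sharing it, since autogenerated text may still contain errors.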

If you download all the files from your Zoom recording, you can also upload them to Microsoft Stream or Mediasite to keep them available for more than 120 days. See the step-by-step instructions for each of these options below to learn more.

NOTE: There are a total of six images in the gallery below. Hover over each image for a second to view the figure number and brief description.

Microsoft Stream

As part of the Microsoft suite of applications and products, Microsoft Stream is an option for recording or uploading video media to be captioned. Microsoft Stream can generate captions and transcripts for videos up to 4 GB in size. Recordings made in Stream are limited to 15 minutes per video to help you stay within this limit. Longer videos can be captured using Mediasite, which is described in the section below.
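
Before uploading a file or planning a long recording, you can check it against the two limits mentioned above. This is a rough sketch, not an official tool: it assumes "4 GB" means 4,000,000,000 bytes (the more conservative reading), and the function names are hypothetical.

```python
import math
from pathlib import Path

# Assumption: "4 GB" is read as 4 * 1000**3 bytes, the stricter interpretation.
MAX_UPLOAD_BYTES = 4 * 1000**3
# Per-recording time limit stated above.
MAX_RECORDING_MINUTES = 15


def fits_size_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a file of this size is within the caption size limit."""
    return 0 <= size_bytes <= limit


def file_fits_size_limit(path: str) -> bool:
    """Check the size of a file on disk against the limit."""
    return fits_size_limit(Path(path).stat().st_size)


def recordings_needed(total_minutes: float,
                      per_recording: int = MAX_RECORDING_MINUTES) -> int:
    """Return how many separate recordings a session of this length requires."""
    if total_minutes <= 0:
        return 0
    return math.ceil(total_minutes / per_recording)
```

For example, a 40-minute lecture would need three separate Stream recordings under the 15-minute limit, which may be a reason to use Mediasite instead.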

See the step-by-step instructions below for more information about how to record and generate captions and transcripts in Microsoft Stream.

To get started, log in to Microsoft 365 using your Ohio State credentials. From the left-hand navigation menu or by clicking the app grid icon in the top left corner, select Stream. (Fig. 1) If you are recording your video directly in Stream:

  1. Click the Screen recording, Camera recording, or Playlist options to begin recording your video. (Fig. 2)
    1. Screen recording: this will create a “cameo” view of you over a browser tab, presentation slides, or your full computer screen. You can disable the cameo view and record your screen only. NOTE: to share sound in your video recording, you must select the option to share a browser tab.
    2. Camera recording: this is a video recording of only you on camera. You can add backgrounds and effects using this option.
    3. Playlist: this gives you the option to record multiple videos in a series and then share the full playlist. This is very helpful for classroom recordings that you build on over the course of the semester, because you can grant access to a single playlist link instead of sharing multiple video links.
  2. Click the large circle at the bottom center of your browser window to begin recording when you are ready, and click it again to stop recording when you are finished. You’ll have the option to apply minor edits. (Fig. 3)
  3. When your edits are complete, click Finish in the bottom right-hand corner of the editor screen to begin processing your video. (Fig. 4)
  4. Once your video is finished processing, jump to the steps under “Once this is complete” below.

NOTE: while you can record audio-only files directly in Microsoft Stream, the ability to generate transcripts for these types of files is not yet available. You can, however, download the audio file and use one of the options listed under the Audio-only Files heading below to produce your transcripts. 

If you are uploading a pre-recorded video file:

  1. Click the Upload option under the Create new heading and select the file from your device. (Fig. 5)
  2. You can upload your file to OneDrive or select Change location to add the file to a specific folder or other location in your OneDrive or SharePoint sites. Click the Upload or Select button to begin uploading your file.

Once this is complete:

  1. Click on the video within Stream to open it in player view.
  2. Locate and click the Edit button in the top right-hand corner. (Fig. 6)
  3. Locate the Video Settings option and click on “Transcript and Captions” to open a small menu where you will see a blue button to Generate your transcript. NOTE: you can select a different language if the video you’re uploading is not in English (US).
  4. Click the blue Generate button in the pop-up window that appears. This will give you the approximate time that it takes to generate your transcript. (Fig. 7)
  5. You can navigate away from this window to do something else if you wish. When you return, refresh your browser to view the video with captions and with the transcript appearing on the right-hand side.
  6. You can edit the transcript text blocks as you need by hovering over them and selecting Edit. Remember to select Done when finished editing. You can also use the search bar to search for keywords and jump ahead to other transcript text blocks. (Fig. 8)
  7. You are now ready to share the link to your video in Stream. Be sure to adjust the file sharing permissions to enable appropriate viewing. Additionally, you can download the transcript as a Word document or as a .vtt file, which can be converted to PDF or uploaded to another platform.
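
Once you have downloaded the transcript as a .vtt file, you can also search it for keywords offline, similar to the search bar in the Stream player. The sketch below assumes a standard WebVTT layout with cues separated by blank lines; the function name is hypothetical and this is not part of Microsoft Stream.

```python
import re
from typing import List, Tuple

# Captures the start time of a WebVTT cue timing line,
# e.g. "00:00:04.500 --> 00:00:07.000" (the hours field is optional).
CUE_TIMING = re.compile(
    r"^\s*((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)


def find_keyword(vtt: str, keyword: str) -> List[Tuple[str, str]]:
    """Return (start_time, cue_text) pairs for cues containing the keyword.

    Matching is case-insensitive.
    """
    needle = keyword.casefold()
    hits = []
    cleaned = vtt.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = CUE_TIMING.match(line)
            if match:
                # Join the cue's text lines that follow the timing line.
                text = " ".join(p.strip() for p in lines[index + 1:] if p.strip())
                if needle in text.casefold():
                    hits.append((match.group(1), text))
                break
    return hits
```

Each result pairs the cue's start time with its text, so you can jump to that point in the video to check or correct the wording.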

NOTE: There are a total of eight images in the gallery below. Hover over each image for a second to view the figure number and brief description.

View a short walk-through video of how to upload, record, and edit your video, then generate and edit your transcript in Microsoft Stream. This video has no dialogue, but there is background music playing. It is meant as a demonstration only.


Mediasite

Mediasite is Ohio State’s approved lecture capture platform. The Whisper captioning tool within Mediasite can reliably autogenerate captions and transcripts for small to very large video files in several languages, though it can take some time to become familiar with the platform and to learn where its important settings and functions are located.

Start by logging in to Mediasite using your Ohio State credentials. If you are recording your video directly in Mediasite:

  1. Click Add Presentation. NOTE: you will first need to download the Mediasite Mosaic app to manage the recording. Click the link or button to download. (Fig. 1)
    1. If you have not downloaded the app, you’ll be prompted to follow Mediasite’s two-step process to Download then Install and Register with Ohio State’s Mediasite server. You’ll be given the option to download the app for macOS or Windows. (Fig. 2)
    2. Follow the instructions from Mediasite and enable the proper setting when prompted to use the app as designed, then return to your browser to complete the second step to Install and Register the Mediasite Mosaic app. You will know this process is complete when you see your Ohio State username appear in the top right-hand corner of the app. 
  2. Give the file a Title/Description and click the Create Presentation button. It may take a while for the file to finish processing, depending on the size of the file. 
  3. Make sure that the app is toggled to the Capture screen, then enable the camera and microphone that you’d like to use with the buttons at the top of the app window. When you are ready, click the round record button at the bottom of the app window. Click the square stop button when you are finished to begin processing your video in Mediasite. (Fig. 3)
  4. Once your video is finished processing, jump to the steps under “Once this is complete” below.

If you are uploading a pre-recorded video file:

  1. Click Add Presentation to upload a new video file.
  2. Select Choose File and locate your file on your device. 
  3. Give the file a Title/Description and click the Create Presentation button. It may take a while for the file to finish processing, depending on the size of the file. 

Once this is complete:

  1. Click Edit Details in the far-right options menu, near the top right corner of the screen. (Fig. 4)
  2. Then, below the video window, choose the Delivery tab and check the box next to Audio Transcriptions. (Fig. 5)
  3. Next, select “Choose a Provider for Captioning”. Click within the text area that says, “Select a Captioning Profile”. (Fig. 6)
  4. Select Automated Transcription (English) from the pop-up window (use Language Detect if the audio is in a language other than English). (Fig. 7)
  5. Click the blue Save button near the top-right corner of the screen to start the transcription process. You should receive an email when the autogenerated captions are complete. (Fig. 8)
  6. Once the captions have finished generating, select Edit Captions from the right navigation menu. From here, be sure to review and edit captions for accuracy and click the Save button at the top of the screen to save your edits. (Fig. 9)
  7. Finally, you can download the transcript file by selecting Downloads and then selecting Transcript. This will give you the transcript in the form of a .txt file, which you can then share or save as a PDF or Word document. (Figs. 10 & 11)

NOTE: There are a total of eleven images in the gallery below. Hover over each image for a second to view the figure number and brief description.

Audio-only Media

In developing or delivering your course, you may come across a situation in which you would like to incorporate audio-only materials that have not been recorded in CarmenZoom or that may come from other external sources (e.g. podcasts). Just as with video recordings, providing text transcripts to your students together with the media files greatly enhances access to the materials for a variety of learners. 

There are two university-supported tools that we recommend to instructors for generating, editing, and sharing text transcripts for audio-only recordings: 

  1. Microsoft Word (web browser application)
  2. Mediasite

To use either of these tools to generate transcripts, first make sure that you have access to the actual audio file (not simply a URL link), because you will need to upload the file before you can use the transcription services. Once you have the file on hand, determine which of the two supported tools best fits your needs, and then follow the step-by-step instructions below to generate, edit, and download your transcripts.
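
If you are collecting audio sources from several places, a quick check can separate web links from local audio files that are ready to upload. This is only a sketch: the list of extensions is an illustrative assumption rather than an official list of supported formats, and the function names are hypothetical. It checks the name only and does not confirm that the file exists.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumption: a few common audio extensions, for illustration only.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def is_url(source: str) -> bool:
    """Return True if the source is a web link rather than a local file path."""
    return urlparse(source).scheme.lower() in {"http", "https"}


def looks_like_audio_file(source: str) -> bool:
    """Return True if the source is a local path with a common audio extension."""
    if is_url(source):
        return False
    return Path(source).suffix.lower() in AUDIO_EXTENSIONS
```

A source such as `https://example.com/episode.mp3` is reported as a URL, which means the audio would need to be downloaded (where permitted) before it can be uploaded for transcription.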

Microsoft Word (Web browser application)

With Microsoft Word’s web browser application (not the desktop app), you can use the Dictate and Transcribe tools to either upload an existing audio file or record directly into the document. Once the recording has been added, you can then transcribe the audio directly from those recordings into text that can be automatically populated onto the page.

To transcribe audio using Microsoft Word’s web browser application:

  1. Log in to Microsoft 365 using your Ohio State credentials.
  2. From the left-hand navigation menu or by clicking the app grid icon in the top left corner, select Word. (Fig. 1)
  3. Select Blank document to open a new file. (Fig. 2)
  4. With the Home tab selected from the toolbar, click the dropdown next to the Dictate tool and choose Transcribe. (Fig. 3)
  5. Click Upload audio and locate the audio file on your device and select Open. This may take a few moments to process. (Fig. 4)
  6. Once the file has finished processing, you can edit the transcript for accuracy directly in the right-hand Transcribe pane. It is a good idea to update the Speaker lines to identify the individual speakers’ names, if possible.  (Fig. 5)
  7. Once you have finished editing the transcript, click the “Add to document” button near the bottom right corner of the page and choose “with speaker and timestamps” to add the full transcript to the Word doc. (Fig. 6)

You can now download a copy of this Word document to your device and save it as a PDF, or share it directly with others via OneDrive.

NOTE: There are a total of six images in the gallery below. Hover over each image for a second to view the figure number and brief description.



Mediasite

If you are already using Mediasite to store or record other media files for your class, it may also be a good option for generating transcripts for any audio-only files you wish to share with students.

To get started:

  1. Log in to Mediasite using your Ohio State credentials.
  2. Click Add Presentation to upload a new audio file. (Fig. 1)
  3. Select Choose File and locate your audio file on your device. (Fig. 2)
  4. Give the file a Title/Description and click the Create Presentation button. It may take a while for the file to finish processing, depending on the size of the file.
  5. Once the file has been added, click Edit Details in the far-right options menu, near the top right corner of the screen. (Fig. 3)
  6. Then, below the video window, choose the Delivery tab and check the box next to Audio Transcriptions. (Fig. 4)
  7. Then select “Choose a Provider for Captioning”. (Fig. 4)
  8. Click within the text area that says, “Select a Captioning Profile”. (Fig. 5)
  9. Select Automated Transcription (English) from the pop-up window (use Language Detect if the audio is in a language other than English). (Fig. 6)
  10. Click the blue Save button near the top-right corner of the screen to start the transcription process. You should receive an email when the autogenerated captions are complete. (Fig. 7)
  11. Once the captions have finished generating, select Edit Captions from the right navigation menu. From here, be sure to review and edit captions for accuracy and click the Save button at the top of the screen to save your edits. (Fig. 8)
  12. Finally, you can download the transcript file by selecting Downloads and then selecting Transcript. This will give you the transcript in the form of a .txt file, which you can then share or save as a PDF or Word document. (Figs. 9 & 10)

NOTE: There are a total of ten images in the gallery below. Hover over each image for a second to view the figure number and brief description.