Key Insights
Lyric Quality Limitations
Currently, lyrics generated by UniAudio 2.0 can sound like gibberish and lack clarity, even when the music itself sounds good.
Advantages of a Unified Model
UniAudio 2.0 is a 'unified audio language model,' meaning a single model handles tasks such as text-to-speech (TTS), sound effect (SFX) generation, and voice editing, so you do not need to switch tools.
Prompts
Prompt for Generating Sound Effects (SFX)
Target:
UniAudio 2.0
large crowd cheers and applauds
Prompt for Generating Music by Genre
Target:
UniAudio 2.0
energetic Punjabi folk song
Prompt for Generating Arabic Songs
Target:
UniAudio 2.0
Arabic dance song
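The three prompts above share one shape: a target model plus a short text description. A minimal Python sketch of keeping them in one place could look like the following. The `AudioPrompt` class, its field names, and the `sfx`/`music` task labels are hypothetical conveniences for this example and are not part of UniAudio 2.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioPrompt:
    """One prompt card from the list above (hypothetical container, not a UniAudio API)."""
    task: str    # label chosen for this sketch, e.g. "sfx" or "music"
    target: str  # model the prompt is written for
    text: str    # the description typed into the prompt box


PROMPTS = [
    AudioPrompt("sfx", "UniAudio 2.0", "large crowd cheers and applauds"),
    AudioPrompt("music", "UniAudio 2.0", "energetic Punjabi folk song"),
    AudioPrompt("music", "UniAudio 2.0", "Arabic dance song"),
]


def prompts_for(task: str) -> list[str]:
    """Return the prompt texts recorded under one task label."""
    return [p.text for p in PROMPTS if p.task == task]


if __name__ == "__main__":
    for prompt in PROMPTS:
        print(f"[{prompt.task}] {prompt.target}: {prompt.text}")
```

Keeping prompts as data makes it easy to reuse the same wording across runs and compare results.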
Step by Step
How to Install and Run UniAudio 2.0 Locally
- Visit the UniAudio 2.0 GitHub page using the link provided in the description.
- Scroll down until you find the 'Instructions' section.
- Follow the steps to download the required model files to your computer.
- Install the dependencies listed in the GitHub documentation file.
- Run the main script as described in the repository's terminal instructions to start the model.
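Before running the main script, it can help to confirm that the model files from the download step actually arrived. The sketch below uses only the Python standard library. The directory `./checkpoints` and the file names `model.ckpt` and `config.json` are placeholders invented for this example; take the real ones from the repository's instructions.

```python
from pathlib import Path


def missing_model_files(model_dir: str, expected: list[str]) -> list[str]:
    """Return the expected file names that are absent or empty in model_dir."""
    root = Path(model_dir)
    missing = []
    for name in expected:
        path = root / name
        # An empty file usually means an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder names: replace with the files the repository tells you to download.
    expected_files = ["model.ckpt", "config.json"]
    absent = missing_model_files("./checkpoints", expected_files)
    print("Missing:", absent if absent else "none")
```

If the list is not empty, repeat the download step before starting the model.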
Generating Sound Effects (SFX) and Music
- Select the 'Text-to-Audio' or 'Text-to-Music' mode on the program interface.
- Enter your desired sound description into the prompt input box (e.g., 'large crowd cheers').
- To generate a song with lyrics, enter the lyrics text in the provided input field.
- Press the 'Generate' button to start the audio synthesis process.
- Play the audio output and click 'Download' or 'Save' to save the file to your computer.
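The inputs collected in the steps above (a mode, a description, and optional lyrics) can be checked before you press 'Generate'. The following is a sketch only: `build_request` and the dictionary it returns are hypothetical and do not represent UniAudio 2.0's actual interface, and the rule that lyrics apply only to the music mode is an assumption made for this example.

```python
from typing import Optional

# Mode names taken from the interface labels in the steps above.
VALID_MODES = ("text-to-audio", "text-to-music")


def build_request(mode: str, prompt: str, lyrics: Optional[str] = None) -> dict:
    """Validate the generation inputs and bundle them into a plain dict."""
    mode = mode.lower()
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    request = {"mode": mode, "prompt": prompt}
    if lyrics is not None and lyrics.strip():
        # Assumption for this sketch: lyrics only make sense for songs.
        if mode != "text-to-music":
            raise ValueError("lyrics are only used with text-to-music in this sketch")
        request["lyrics"] = lyrics.strip()
    return request


if __name__ == "__main__":
    print(build_request("Text-to-Audio", "large crowd cheers"))
    print(build_request("Text-to-Music", "energetic Punjabi folk song"))
```

Catching an empty prompt or a mismatched mode early avoids waiting on a synthesis run that was set up incorrectly.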
Speech Style Editing
- Prepare a reference audio file (reference dialogue) whose style you want to change.
- Upload the reference audio file into the UniAudio system.
- Type your desired target voice style (e.g., 'whisper style') in the prompt field.
- Click the 'Edit' or 'Convert' button to convert the reference voice to the new style.
- Wait for the process to complete and listen to the edited voice result.
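When preparing the reference audio, a quick look at its basic properties can catch an empty or unexpectedly long file before you upload it. The sketch below uses Python's standard `wave` module and assumes the reference is an uncompressed WAV file, which the steps above do not specify. The function name `describe_wav` and the file name `reference.wav` are hypothetical.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read channel count, sample rate, and duration from a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


if __name__ == "__main__":
    # Placeholder path: point this at your own reference recording.
    info = describe_wav("reference.wav")
    print(f"{info['channels']} channel(s), {info['sample_rate']} Hz, "
          f"{info['duration_s']:.2f} s")
```

The same check can be run on the edited result after you save it, to confirm the output has a sensible length.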