Voice overs: technical considerations for vetting the network

Our Quality Control team uses the following criteria to review every recording submitted to the system:

Project Requirements:

Does the voice match the age, gender, accent, and language selected by our client?

Project Instructions:

Did the Pro follow the project instructions provided in the remarks section?
Did the Pro follow the reference sample or the external material attached by our client?
Did the Pro read the script exactly as provided?
If the Pro submitted multiple takes, are they complete and organized?

Technical Requirements:

There is no room echo.
The voice sounds at a proper distance from the mic.
There is no distortion or clipping.
The volume is normalized to around -3 dB peak.
There are no editing issues (e.g. clicks, pops, audible cuts).
There is no more silence than needed at the start or end (0.5 seconds is suggested).
The audio doesn't sound heavily processed (e.g. by a noise reduction plugin or EQ).
There is no heavy compression or limiting.
The voice sounds as if it was recorded with professional equipment.
There are no breath noises that are loud, undesired, and distracting.
There are no mouth clicks or mouth noises that are loud, undesired, and distracting.
There are no plosives.
There is no hiss or white noise.
There is no electrical noise or hum.
There are no background noises (e.g. cars, mouse clicks, fans, pages, people).
There are no sibilance issues.
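The silence item in the list above can be checked roughly before submitting. The following is a minimal sketch, not the check our Quality Control team runs: it assumes the recording has already been decoded into a list of signed 16-bit samples, and the function name and the -50 dB threshold for what counts as "silence" are illustrative assumptions.

```python
FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def edge_silence_seconds(samples, sample_rate=44100, threshold_db=-50.0):
    """Return (leading, trailing) silence in seconds.

    A sample counts as silent when its magnitude is at or below
    threshold_db relative to full scale. The -50 dB default is an
    assumption for illustration, not a Bunny Studio rule.
    """
    threshold = FULL_SCALE * 10 ** (threshold_db / 20)
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        # The whole file is below the threshold.
        total = len(samples) / sample_rate
        return total, total
    leading = loud[0] / sample_rate
    trailing = (len(samples) - 1 - loud[-1]) / sample_rate
    return leading, trailing
```

If either value is well above the suggested 0.5 seconds, trimming the start or end of the file before uploading is probably worthwhile.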

Performance Requirements:

Does the voice sound monotone/flat?
Does the voice sound robotic or computer-generated?
Are there pronunciation issues, or mumbled or unclear words?
Does the voice sound nasal/raspy?

Other requirements:

Did the Pro include any contact details or content not relevant to the project?
If the project is marked with the syncing option:
Is the deliverable synced as instructed by our client?
If the audio has background music, is the voice easy to understand?
Does the sample convey a complete idea? That is, are the sentences complete and is the demo clear?
If the sample has multiple voices, is it clear which is the Pro's voice?
If the sample has background music, is the voice easy to understand? Is the mix balanced?
Is the sample well-labeled by category and other relevant attributes? (Gender, age, purpose, language, accent).

Formats that audio files should be submitted in

You can use an audio recording program of your choice, but you must record your voice over in:

.wav
Mono
44.1 kHz sample frequency
16-bit depth
Normalized to -3 dB peak

Formats other than these will not be accepted by our website.

If the audio format is wrong, you will see a red error message telling you that the file you're uploading is in an incorrect format. Some audio software and DAWs default to 32-bit .wav files. If you are having trouble uploading, be sure to double-check the bit depth.
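If you would like to check a file before uploading, the format requirements listed above can be verified with a short script. This is a minimal sketch using Python's standard `wave` module; the function name and the wording of the messages are illustrative assumptions and do not reflect how our website validates uploads.

```python
import wave

# Target values taken from the format list in this article.
REQUIRED_CHANNELS = 1          # mono
REQUIRED_SAMPLE_RATE = 44100   # 44.1 kHz
REQUIRED_SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit


def check_wav_format(path):
    """Return a list of problems found; an empty list means the file matches."""
    problems = []
    try:
        with wave.open(path, "rb") as wav:
            channels = wav.getnchannels()
            rate = wav.getframerate()
            width = wav.getsampwidth()
    except wave.Error as exc:
        # Raised for files that are not PCM .wav (for example, some float exports).
        return [f"not a readable PCM .wav file: {exc}"]

    if channels != REQUIRED_CHANNELS:
        problems.append(f"expected mono, found {channels} channels")
    if rate != REQUIRED_SAMPLE_RATE:
        problems.append(f"expected 44100 Hz, found {rate} Hz")
    if width != REQUIRED_SAMPLE_WIDTH:
        problems.append(f"expected 16-bit, found {width * 8}-bit")
    return problems
```

Calling `check_wav_format("my_read.wav")` (a hypothetical file name) on a 32-bit export would return a single message about the bit depth, which is the most common cause of the red error message.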

Allocate yourself plenty of time for submission. Keep in mind that long .wav files can take a while to upload, depending on the speed of your Internet connection. Also, do make sure that the file has finished uploading before you click on the submit button. Should you try to click submit before the upload is complete, you will see an error message.

Terms & Definitions

Noise floor: The measured level of all noise sources and unwanted signals in a recording. Noise is defined here as any signal other than the one being monitored.
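One common way to estimate a noise floor is to measure the RMS level of a stretch of the recording where nobody is speaking. The sketch below assumes that stretch has already been decoded into a list of signed 16-bit samples; the function name is hypothetical, and this article does not define a pass/fail threshold, so none is shown.

```python
import math

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def rms_dbfs(samples):
    """Return the RMS level of the samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no measurable level
    return 20 * math.log10(math.sqrt(mean_square) / FULL_SCALE)
```

A lower (more negative) result means a quieter room and signal chain.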

De-esser: A process that is tuned to high-frequency sounds, such as the sound produced by the letter “s” (hence the name de-esser), and reduces their level.

Normalization: The application of a constant amount of gain to an audio recording to bring the average or peak amplitude to a target level (the norm). We ask that your audio is normalized to -3 dB peak.
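To make the arithmetic concrete: a peak of -3 dB corresponds to about 70.8% of full scale, since 10^(-3/20) ≈ 0.708. The sketch below applies one constant gain so that the loudest sample lands at that level. It assumes signed 16-bit samples in a plain list, and the function names are illustrative; most audio editors have a built-in normalize function that does the same thing.

```python
import math

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def peak_dbfs(samples):
    """Return the peak level in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    return 20 * math.log10(peak / FULL_SCALE) if peak else float("-inf")


def normalize_peak(samples, target_db=-3.0):
    """Scale all samples by one constant gain so the peak sits at target_db."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    gain = (FULL_SCALE * 10 ** (target_db / 20)) / peak
    # Round to whole samples and clamp to the valid 16-bit range.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]
```

Because the same gain is applied to every sample, the relative dynamics of the read are unchanged; only the overall level moves.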

Overmodulated sound: This occurs when a signal, whether from an acoustic source such as sound recorded into a microphone or from an electronic signal passing through a console, is too strong for its intended target to handle. The result is audio that sounds distorted. If your read is rejected because of this, your mic gain may be set too high, or the file may have been recorded at too high a level. Record your audio between -3 dB and 0 dB for the best results.

Clip (audio): This is a type of distortion that occurs at the loudest points of an audio file, where the signal exceeds the maximum level that can be represented. Sometimes it results in a pop, click, or cut at that point in the audio. Ensure you are recording and normalizing to around -3 dB.
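Clipping often shows up in a file as several consecutive samples stuck at the maximum value. The following sketch counts such runs in a list of signed 16-bit samples. The function name and the minimum run length of 3 samples are assumptions chosen for illustration, not a rule our Quality Control team applies.

```python
def count_clipped_runs(samples, min_run=3):
    """Count runs of at least min_run consecutive full-scale samples."""
    runs = 0
    current = 0
    for s in samples:
        if s >= 32767 or s <= -32768:   # sample sits at the 16-bit limit
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    if current >= min_run:              # a run that reaches the end of the file
        runs += 1
    return runs
```

A result above zero suggests re-recording with lower mic gain, since normalizing afterwards lowers the level but does not repair the flattened peaks.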

Plosive: This is a popping sound created by air hitting the microphone when certain sounds are produced, such as "p," "b," "t," and other "hard consonant" sounds. Adding a pop filter to your microphone, as well as improving your microphone technique, can reduce this effect. Bunny Studio Voice will not accept reads or samples containing these loud sounds.

Pop filter: This is a piece of foam or a "windscreen" that attaches to your microphone. It prevents plosive sound by reducing the amount of air that hits the microphone when you speak. These are very inexpensive and can be purchased at any music or online store.

Room Echo: This is the sound of a voice bouncing off walls, ceilings, or other hard surfaces. The sound is reflected back into the microphone, creating an echo effect that makes it seem as if the person is speaking in a large room or a bathroom. (The actual size of the room doesn't affect this.)

.wav: A .wav is a type of audio format that is required by Bunny Studio Voice. Audio data is saved in an uncompressed PCM format. We require that you record your .wav in mono, 44.1 kHz, 16-bit.

Deliverable: The final product that Pros submit to the system (article, voice over, logo, etc.)

Turnaround time: The average time that a Pro takes to review, accept, and submit an entry or a full deliverable. The faster, the better!

Pro: All the accepted writers, designers, creatives and artists that craft top-notch deliverables within the Bunny Studio platform.

Brief: A section of the project form that clients fill out to describe what they need. It includes information on how the client's deliverable should be crafted and what the content should communicate.

Rates: The range of prices that Pros charge to craft a complete project or submit an entry for a Contest.

Sibilance: A high-frequency hissing sound. It occurs most commonly when pronouncing consonants such as 's', 'sh', 'ch', 'z', 'zh', and similar sounds.

Robotic: A read that sounds flat, or one that doesn't convey any emotion. Robotic reads are the main reason why our clients reject projects.
