Tõlk.fm documentation
Help Center
In-depth guides for event organisers — from setup and audio to billing, privacy, and troubleshooting. For quick answers, see the FAQ.
First steps
Getting started
Everything you need to run your first event — account creation, event setup, and distributing your QR code.
Create an account
- Go to tolk.fm and click Sign up.
- Enter your email address. You will receive a magic link — no password required.
- Click the link in the email to confirm your address and land in the dashboard.
- Optional: add your organisation name in Settings → Profile so it appears in event pages and billing documents.
You must be at least 18 years old to create an account or purchase services.
Create your first event
- In the dashboard click New event.
- Enter an event title and choose an event mode: Live translation, Scheduled playback, or Hybrid. Live translation is the right choice for most first events.
- Select your source language — the language the speaker will use.
- Choose up to 8 target languages. Each active language channel is billed independently.
- Click Create event. You are taken to the event control room.
- From the control room, copy the join link or download the QR code to share with your audience.
You can edit most settings — title, languages, engine, speaker mode — after creation and even while the event is running.
Share the join link and QR code
Every event has a join link and a QR code. Audience members scan or tap to open the join page, pick their language, and start listening — no app or account required.
- Project it on screen — display the QR code on a slide at the start of your event. Include it in the title slide and on any language-change break slides.
- Print handouts — use the Download PDF button in the event control room to generate a print-ready A5 card with the QR code and join URL.
- Custom URL — set a custom join slug (e.g. tolk.fm/my-conference) to make the link easier to type and remember. See Create a custom join URL below.
- Messaging apps — paste the join link into the event programme, WhatsApp group, or email invitation.
Run a test event
Always rehearse before your real event. Test runs consume credits at the same rate as production events, but are essential for catching audio and connectivity issues.
- Create a test event with the same language channels you plan to use.
- Set up the same microphone, mixer, or audio interface you will use at the venue.
- Start the event and speak naturally for a few minutes while a colleague listens on the join page.
- Check latency (a few seconds is normal), translation accuracy, and audio quality across all target languages.
- Switch between OpenAI and Gemini to compare which engine suits your content.
- End the event when done. The test cost will be deducted from your wallet.
Microphone placement and room acoustics strongly affect translation quality. Test in the actual venue whenever possible.
Create a custom join URL
A custom join slug replaces the default code-based URL with a human-readable path:
tolk.fm/my-conference-2026
- Open the event from the dashboard and go to Event settings.
- Find the Custom join URL field and enter your preferred slug. Slugs must be unique — you will see an error if the slug is already taken.
- Save. The new URL becomes active immediately and the old code-based URL continues to work.
Slugs can contain letters, numbers, and hyphens. Spaces and special characters are not allowed.
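You can check a slug against the rules above before entering it. The following is a minimal sketch, not Tõlk.fm's own validation: the function name `is_valid_slug` is hypothetical, and it assumes "letters" means unaccented ASCII letters, which the rules do not state. It cannot check uniqueness, which only the service can do.

```python
import re

# Assumption: only ASCII letters, digits, and hyphens are accepted.
_SLUG_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_slug(slug: str) -> bool:
    """Return True if the slug uses only letters, digits, and hyphens."""
    return _SLUG_PATTERN.fullmatch(slug) is not None


print(is_valid_slug("my-conference-2026"))  # True
print(is_valid_slug("my conference"))       # False (contains a space)
```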
Event setup
Event modes
Tõlk.fm supports three event modes and three audience capture modes. Learn which combination fits your event.
Live translation mode
In live translation mode, source audio is captured from a microphone, mixer, or audio interface and streamed to the AI translation engine in real time. Translated audio is broadcast to listeners within a few seconds.
- Best for: conferences, keynotes, panels, lectures, and any event where content is not known in advance.
- Latency: a few seconds is normal. Let your audience know in advance.
- Languages: up to 8 target languages simultaneously. Each runs as a separate AI translation session.
- Engine: choose OpenAI (13 languages, synthetic voice) or Gemini (81 languages, preserves speaker voice). You can switch live.
Scheduled playback mode
In scheduled playback mode, you upload a pre-recorded audio or video file before the event. Tõlk.fm prepares translated audio tracks for all target languages. At the scheduled start time, each listener hears the translated version of the recording in perfect sync with everyone else on the same channel.
- Best for: pre-recorded keynotes, video screenings, or presentations where content is finalised in advance.
- No latency: prepared translations are synced exactly to the recording timeline.
- Upload: upload your source file from the event control room. Translation preparation begins automatically.
- Start time: set the broadcast start time. Listeners who join early will wait on a countdown screen.
Hybrid mode
Hybrid mode combines prepared translated tracks for pre-recorded segments with live AI translation for live segments such as Q&A, panel discussions, or speaker remarks.
Example: a pre-recorded keynote video plays with prepared translations, then a live Q&A session switches to real-time AI translation with open-mic audience participation.
Hybrid mode requires planning the handoff points between prepared and live segments. Test the transition with your team before the event.
Open floor and audience mic
Open floor is an audience capture mode that lets listeners request the microphone from their phone. One person speaks at a time; the organiser or speaker approves the request.
- Best for: guided tours, workshops, Q&A sessions, and any event where the audience needs to speak.
- How it works: a listener taps Request mic on the join page. The organiser sees the request in the control room and taps Approve. The listener's voice is then translated and broadcast like any other speaker.
- One at a time: only one person holds the floor. The previous speaker is automatically released when a new request is approved.
- Translation cost: open-mic speech is billed per channel-minute in the same way as main-speaker audio.
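The request-and-approve flow above can be pictured as a small state model. This is an illustrative sketch only: the class `OpenFloor` and its method names are hypothetical and do not describe how Tõlk.fm is implemented. It shows the two rules from the list: requests wait for approval, and approving a new speaker releases the previous one.

```python
from typing import List, Optional


class OpenFloor:
    """Toy model of the open-floor queue (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.pending: List[str] = []        # listeners waiting for approval
        self.current: Optional[str] = None  # listener holding the floor

    def request_mic(self, listener: str) -> None:
        """A listener taps Request mic; duplicate requests are ignored."""
        if listener != self.current and listener not in self.pending:
            self.pending.append(listener)

    def approve(self, listener: str) -> Optional[str]:
        """Give the floor to a pending listener; return whoever was released."""
        if listener not in self.pending:
            raise ValueError(f"{listener} has no pending request")
        self.pending.remove(listener)
        released, self.current = self.current, listener
        return released


floor = OpenFloor()
floor.request_mic("listener-1")
floor.request_mic("listener-2")
floor.approve("listener-1")
released = floor.approve("listener-2")
print(released, floor.current)  # listener-1 listener-2
```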
Speaker monitor mode
Speaker monitor mode gives the event speaker or moderator a dedicated interface showing live translations as they are generated. It also provides controls for the open-mic queue.
- In event settings, enable Speaker monitor.
- A separate join code is generated for the speaker monitor page. Share this with your speaker or moderator.
- The speaker opens the monitor page on their device. They see captions and translations in real time.
- Optionally, exclude one language from the speaker monitor to avoid hearing a translated version of their own voice (echo avoidance).
Add or remove languages during an event
You can change the active language channels without stopping the event.
- Add a language: in the runtime control room, click Add channel and select the target language. The new channel starts within a few seconds.
- Remove a language: click Stop next to an active channel. Billing for that channel stops immediately.
- Switch engine: you can switch between OpenAI and Gemini for the entire event from the runtime controls. Channels restart automatically with the new engine.
Audio
Audio & sound setup
Good audio in means good translation out. How to set up your microphone, mixer, and venue connection.
Microphone and audio input setup
Tõlk.fm captures audio from whichever input device your browser has access to. For best results:
- Use a directional microphone pointed at the speaker's mouth rather than an omnidirectional room microphone. This reduces background noise and reverberation.
- Set gain correctly — aim for a signal that peaks around –12 dBFS. Clipping or very low signal both hurt translation accuracy.
- USB audio interfaces (e.g. Focusrite Scarlett, RODE AI-1) give you more control over gain and connect reliably to laptops and tablets.
- Bluetooth microphones introduce additional latency and are not recommended. Use wired USB or 3.5 mm TRS connections where possible.
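To make the gain target concrete: peak level in dBFS is 20 × log10 of the largest absolute sample value relative to full scale. The sketch below assumes samples are already normalised to the range -1.0 to 1.0 and that the list is not empty; the function name `peak_dbfs` is illustrative and not part of Tõlk.fm.

```python
import math
from typing import List


def peak_dbfs(samples: List[float]) -> float:
    """Peak level in dBFS for non-empty samples normalised to -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak)


# A peak at a quarter of full scale sits close to -12 dBFS.
print(round(peak_dbfs([0.1, -0.25, 0.2]), 1))  # -12.0
```

A result near 0 dBFS means the signal is close to clipping; a result far below -12 dBFS suggests the gain is too low.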
Using a mixer or PA direct feed
Connecting a direct feed from the venue mixing desk produces the cleanest possible audio and is strongly recommended for large venues.
- Ask the venue sound engineer for an auxiliary send or direct output from the mixing desk.
- Connect this output to your laptop or tablet via a USB audio interface or a 3.5 mm TRS adapter.
- In the Tõlk.fm control room, select the correct audio input device before starting the event.
- Ask the sound engineer to set the send level at a consistent volume — avoid riding gain during the event, which causes translation quality to vary.
A PA feed from a live desk typically sounds better than a lavalier microphone placed near the event speaker, especially in large or reverberant spaces.
Understanding latency
Live AI translation introduces a delay of a few seconds between the speaker and the listener. This is inherent to the technology: the model must receive a phrase before it can translate it.
| Latency | What it means |
|---|---|
| 1–4 seconds | Normal. Expected for real-time AI translation. |
| 5–8 seconds | Acceptable. Check network conditions if this is consistent. |
| 10–15 seconds | High. Investigate network, provider load, or audio issues. |
| 15+ seconds | Problematic. Restart the channel and check connectivity. |
Tell your audience at the start of the event that translation trails the speaker slightly. This sets expectations and prevents confusion.
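If you log latency during a rehearsal, the bands in the table can be applied with a small helper. This is a sketch under one stated assumption: the table does not assign the values between 4 and 5 seconds or between 8 and 10 seconds, so the function below treats them as part of the lower band. The function name and labels are illustrative.

```python
def classify_latency(seconds: float) -> str:
    """Map a measured delay in seconds to the bands from the latency table."""
    if seconds < 5:       # table: 1-4 s (gap up to 5 s assumed "normal")
        return "normal"
    if seconds < 10:      # table: 5-8 s (gap up to 10 s assumed "acceptable")
        return "acceptable"
    if seconds <= 15:     # table: 10-15 s
        return "high"
    return "problematic"  # table: 15+ s


print(classify_latency(3))   # normal
print(classify_latency(12))  # high
```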
Listener headphone guidance
Headphones are strongly recommended for every listener. Without them:
- The translated audio from a listener's device can be picked up by the event microphone, creating a feedback loop that degrades translation quality for everyone in the room.
- Ambient translated audio from multiple devices creates distracting noise for other attendees.
Practical options: ask attendees to bring earphones; provide disposable earbuds at the registration desk; or use an induction loop system for accessibility-compliant deployments.
Network and bandwidth requirements
Tõlk.fm requires a stable, low-latency internet connection. Bandwidth requirements are modest; connection stability matters more than raw speed.
| Connection | Requirement |
|---|---|
| Organiser device (audio capture) | Stable uplink ≥1 Mbps, latency <100 ms preferred. Wired Ethernet strongly recommended for the organiser side. |
| Listener devices (audio playback) | Any Wi-Fi or 4G/5G mobile connection. Minimum ~200 kbps per channel. |
| Venue Wi-Fi | Separate SSID for the event from the public guest network. Ensure sufficient access points for the expected audience density. |
Avoid relying on shared public Wi-Fi for the organiser's audio capture device. If the venue connection drops, translation stops for all listeners.
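When discussing capacity with venue IT, a rough downlink figure helps. The sketch below multiplies the listener count by the ~200 kbps per-channel minimum from the table. It assumes each listener plays one channel and that everyone is on venue Wi-Fi; the `headroom` multiplier is our own assumption for a safety margin, not a Tõlk.fm figure.

```python
def venue_downlink_mbps(
    listeners: int,
    kbps_per_listener: float = 200.0,  # minimum per channel, from the table
    headroom: float = 1.5,             # assumed safety margin
) -> float:
    """Rough total downlink needed for listener playback, in Mbps."""
    return listeners * kbps_per_listener * headroom / 1000


print(venue_downlink_mbps(300))  # 90.0
```

Listeners on their own mobile data do not count against the venue connection, so the real figure is usually lower.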
Languages
Languages & translation engines
OpenAI and Gemini power Tõlk.fm. Choose the right engine for your language coverage and voice preferences.
Supported languages
The available target languages depend on the translation engine chosen for the event. You can run up to 8 target languages simultaneously per event.
OpenAI realtime translate — 13 target languages
- English: English
- Spanish: Español
- Portuguese: Português
- French: Français
- Japanese: 日本語
- Russian: Русский
- Chinese: 中文
- German: Deutsch
- Korean: 한국어
- Hindi: हिन्दी
- Indonesian: Bahasa Indonesia
- Vietnamese: Tiếng Việt
- Italian: Italiano
Google Gemini live translate — 81 target languages (includes all OpenAI languages)
- Afrikaans: Afrikaans
- Akan: Akan
- Albanian: Shqip
- Amharic: አማርኛ
- Arabic: العربية
- Armenian: Հայերեն
- Azerbaijani: Azərbaycan dili
- Basque: Euskara
- Belarusian: Беларуская
- Bengali: বাংলা
- Bulgarian: Български
- Burmese (Myanmar): မြန်မာစာ
- Catalan: Català
- Chinese: 中文
- Chinese (Simplified): 简体中文
- Chinese (Traditional): 繁體中文
- Croatian: Hrvatski
- Czech: Čeština
- Danish: Dansk
- Dutch: Nederlands
- English: English
- Estonian: Eesti
- Filipino: Filipino
- Finnish: Suomi
- French: Français
- Galician: Galego
- Georgian: ქართული
- German: Deutsch
- Greek: Ελληνικά
- Gujarati: ગુજરાતી
- Hausa: Hausa
- Hebrew: עברית
- Hindi: हिन्दी
- Hungarian: Magyar
- Icelandic: Íslenska
- Indonesian: Bahasa Indonesia
- Italian: Italiano
- Japanese: 日本語
- Javanese: Basa Jawa
- Kannada: ಕನ್ನಡ
- Kazakh: Қазақ тілі
- Khmer: ខ្មែរ
- Kinyarwanda: Ikinyarwanda
- Korean: 한국어
- Lao: ລາວ
- Latvian: Latviešu
- Lithuanian: Lietuvių
- Macedonian: Македонски
- Malay: Bahasa Melayu
- Malayalam: മലയാളം
- Marathi: मराठी
- Mongolian: Монгол
- Nepali: नेपाली
- Norwegian: Norsk
- Norwegian (Bokmål): Norsk bokmål
- Persian: فارسی
- Polish: Polski
- Portuguese: Português
- Portuguese (Brazil): Português (Brasil)
- Portuguese (Portugal): Português (Portugal)
- Punjabi: ਪੰਜਾਬੀ
- Romanian: Română
- Russian: Русский
- Serbian: Српски
- Sindhi: سنڌي
- Sinhala: සිංහල
- Slovak: Slovenčina
- Slovenian: Slovenščina
- Spanish: Español
- Sundanese: Basa Sunda
- Swahili: Kiswahili
- Swedish: Svenska
- Tamil: தமிழ்
- Telugu: తెలుగు
- Thai: ไทย
- Turkish: Türkçe
- Ukrainian: Українська
- Urdu: اردو
- Uzbek: Oʻzbekcha
- Vietnamese: Tiếng Việt
- Zulu: isiZulu
If a target language you need is not on the OpenAI list, switch to Gemini in event settings to access the full catalogue of 81 languages.
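The engine choice can be reduced to a simple check against the OpenAI list. The sketch below is illustrative: the function name `pick_engine` is hypothetical, the language names are the English names from the list above, and it assumes any language missing from the OpenAI list is available on Gemini, which you should confirm against the Gemini list.

```python
from typing import List

# The 13 OpenAI target languages listed in this guide.
OPENAI_LANGUAGES = {
    "English", "Spanish", "Portuguese", "French", "Japanese", "Russian",
    "Chinese", "German", "Korean", "Hindi", "Indonesian", "Vietnamese",
    "Italian",
}
MAX_TARGET_LANGUAGES = 8  # per-event limit stated in this guide


def pick_engine(targets: List[str]) -> str:
    """Suggest an engine for the given target languages."""
    if len(targets) > MAX_TARGET_LANGUAGES:
        raise ValueError("an event supports at most 8 target languages")
    missing = set(targets) - OPENAI_LANGUAGES
    # Assumption: anything not on the OpenAI list is on the Gemini list.
    return "Gemini" if missing else "OpenAI or Gemini"


print(pick_engine(["Spanish", "German"]))    # OpenAI or Gemini
print(pick_engine(["Spanish", "Estonian"]))  # Gemini
```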
OpenAI vs Gemini — which should I choose?
| Feature | OpenAI | Gemini |
|---|---|---|
| Languages | 13 | 81 |
| Voice style | Synthetic voice (selectable), adjustable speed | Preserves speaker's original voice and intonation |
| Transcription | Source-language captions | Target-language captions |
| Latency | Typically low | Slightly higher but comparable |
| Best for | Common languages, clean synthetic output, voice customisation | Rare or regional languages, natural voice fidelity |
You can switch engines live during an event. We recommend testing both before your event to see which produces better results for your speaker and content type.
Switching translation engines during an event
You can change the translation engine without ending the event. Active channels will restart with the new engine — listeners will experience a brief pause of a few seconds.
- In the event control room, open Runtime settings.
- Select the new engine (OpenAI or Gemini) from the provider dropdown.
- Confirm. All active channels will stop and restart using the new engine.
- Listeners are automatically reconnected to the restarted channels.
If one engine has a provider outage or unusually high latency, switching to the other is a fast recovery option.
Voice customisation (OpenAI engine)
When using the OpenAI engine, you can select from several synthetic voices and adjust the output speed.
- Voice selection: choose from available voice presets in event settings. Preview each voice before the event using the voice preview feature.
- Speed control: set playback speed from 0.25× (very slow) to 1.5× (fast). A slower speed gives listeners more time to follow along; a faster speed reduces the perceived lag between speaker and translation.
- Applies globally: the voice and speed settings apply to all language channels running on the OpenAI engine.
The Gemini engine does not have a synthetic voice selector — it preserves the speaker's original voice characteristics instead.
Translation accuracy and limitations
AI translation quality depends on audio clarity, speaking pace, accents, technical vocabulary, and background noise. For most conference and presentation contexts, the output is clear and understandable. Factors that reduce accuracy:
- Multiple speakers talking simultaneously
- Heavy accent or dialect not well-represented in the model's training data
- Fast speech with little pausing
- Technical, medical, or legal jargon
- Background music, applause, or crowd noise in the audio feed
Do not use Tõlk.fm as the sole source of translation for medical, legal, financial, safety, emergency, or other high-stakes decisions. Always provide human interpreter alternatives where accuracy is critical or legally required.
Billing
Billing & credits
Tõlk.fm uses a prepaid USD wallet. Top up before events, monitor usage live, and review charges in the Billing page.
How billing works
Tõlk.fm charges $0.09 per minute per active translation channel. A channel is one target language running during a live event. Charges are calculated at the end of each event.
| Example | Calculation | Cost |
|---|---|---|
| 30-min talk, 1 language | 30 × 1 × $0.09 | $15.00 (minimum applies) |
| 60-min talk, 2 languages | 60 × 2 × $0.09 | $15.00 (minimum applies) |
| 90-min talk, 3 languages | 90 × 3 × $0.09 | $24.30 |
| Half-day (4 hrs), 4 languages | 240 × 4 × $0.09 | $86.40 |
| Full day (8 hrs), 5 languages | 480 × 5 × $0.09 | $216.00 |
There is a $15.00 minimum charge per event, regardless of actual usage. If an event is very short or uses few channels, the minimum applies.
Top up your wallet
Go to the Billing page in the dashboard and choose a plan or enter a custom amount. Payments are processed by Stripe. Funds appear in your wallet immediately after payment.
| Plan | You pay | Wallet receives | Bonus |
|---|---|---|---|
| Starter | $12 | $12 | — |
| Pro | $108 | $120 | +$12 |
| Scale | $480 | $600 | +$120 |
| Custom | Any amount ≥$5 | Same amount | — |
The Pro and Scale plans include a volume bonus. For frequent events, these plans reduce your effective cost per channel-minute.
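To see what the bonus means per channel-minute, scale the $0.09 rate by the ratio of what you pay to what the wallet receives. The sketch below uses the plan figures from the table; the function name is illustrative, and it ignores the per-event minimum charge.

```python
BASE_RATE_USD = 0.09  # per channel-minute, from this guide


def effective_rate(paid_usd: float, credited_usd: float) -> float:
    """Real cost per channel-minute after the top-up bonus."""
    return round(BASE_RATE_USD * paid_usd / credited_usd, 4)


for plan, paid, credited in [("Starter", 12, 12), ("Pro", 108, 120), ("Scale", 480, 600)]:
    print(plan, effective_rate(paid, credited))
# Starter 0.09
# Pro 0.081
# Scale 0.072
```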
Estimate your event cost
Use this formula to estimate spend before an event:
Cost = max( duration_minutes × languages × $0.09, $15.00 )
Before starting, top up your wallet so that the balance covers at least the estimated cost. The event will stop translating if your wallet balance reaches zero mid-event.
- For a 2-hour event in 3 languages: 120 × 3 × $0.09 = $32.40
- For a full-day summit (7 hours) in 6 languages: 420 × 6 × $0.09 = $226.80
- For a 20-minute demo in 1 language: minimum applies = $15.00
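The formula above translates directly into a few lines of code. This sketch works in whole cents to avoid floating-point rounding, and it assumes every language channel runs for the full duration; the function name is illustrative.

```python
RATE_CENTS_PER_CHANNEL_MINUTE = 9  # $0.09
MINIMUM_CENTS = 1500               # $15.00 per-event minimum


def estimate_cost_usd(duration_minutes: int, languages: int) -> float:
    """Estimated event cost in USD, with the per-event minimum applied."""
    usage_cents = duration_minutes * languages * RATE_CENTS_PER_CHANNEL_MINUTE
    return max(usage_cents, MINIMUM_CENTS) / 100


print(estimate_cost_usd(120, 3))  # 32.4
print(estimate_cost_usd(420, 6))  # 226.8
print(estimate_cost_usd(20, 1))   # 15.0
```

If you plan to add or remove channels mid-event, estimate each channel's minutes separately and sum them before applying the minimum.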
View transaction history
The Billing page shows:
- Your current wallet balance
- All top-up transactions with date, amount paid, and amount credited
- Per-event charges with event name, date, duration, languages, and cost
- Bonus credits received from Pro and Scale plan top-ups
To download a receipt for any top-up, click the transaction row and use the Stripe receipt link.
Refunds and billing disputes
Refunds are provided where required by law or approved by us on a case-by-case basis. Because a portion of costs is passed directly to AI providers (OpenAI, Google), fees already consumed by those services may be non-refundable.
- EU consumers retain a 14-day withdrawal right for distance contracts, though this may not apply once digital services have been fully performed.
- If you believe you were charged incorrectly, email billing@tolk.fm with your event ID and a description of the issue.
- Disputes related to Stripe card charges should be raised with Stripe first, as they handle payment processing.
Privacy
Privacy & security
How Tõlk.fm handles event audio, organiser data, listener data, and GDPR obligations.
How long is event audio retained?
| Data | Default retention |
|---|---|
| Live audio streams | Processed transiently for real-time translation. Not stored by default. |
| Uploaded media (scheduled playback) | Retained until you delete it or close your workspace. |
| Generated translated audio (prepared) | Retained until you delete it or close your workspace. |
| Captions and transcripts | Retained if the feature is enabled; otherwise not stored. |
| Event metadata (title, languages, timestamps) | Retained for billing, troubleshooting, and event history. |
| Usage logs and billing records | Retained for the statutory accounting period required under Estonian and applicable law. |
If you need data deleted sooner, contact privacy@tolk.fm.
Data shared with OpenAI and Google
Audio and metadata are sent to the AI provider you select for translation. We use business-tier APIs with data-processing agreements — your data is not used to train their models.
| Provider | What is shared | Data-processing posture |
|---|---|---|
| OpenAI | Audio stream, language settings, session metadata, generated translations. | Processed under OpenAI API data processing terms. Prompts and outputs are not used to train OpenAI models when using the API. |
| Google Gemini | Audio stream, language settings, session metadata, generated translations. | Paid Gemini API terms state prompts and responses are not used to improve Google products. Processed under Google's data processing addendum. |
For full details, see the Privacy Policy — AI providers section.
What data do we collect from listeners?
Listeners join without creating an account or logging in. The data collected is minimal and limited to what is needed to provide the stream.
- Join code and event identifier
- Selected language channel
- Playback session status (connected, disconnected)
- Device and browser type (user agent)
- IP address and approximate location from network data
- Connection logs for troubleshooting
Listener data is not sold or shared with third parties beyond the infrastructure providers (Supabase, Vercel) that host the service.
Organiser GDPR obligations
When you use Tõlk.fm at your event, you are responsible for the lawful processing of your attendees' personal data. Key obligations:
- Inform attendees that audio is streamed to AI providers for real-time translation — verbally at the start of the event and in writing in the event programme or venue signage.
- Obtain required consents or provide a legitimate-interest assessment for recording or streaming audience audio, particularly for open-mic sessions.
- Do not capture children's voices without documented parental consent and local safeguarding compliance.
- If your event is in a jurisdiction that requires advance disclosure of AI-generated content, ensure notices are in place before the event starts.
- Provide human interpreter alternatives where required by local accessibility, employment, or venue regulations.
Local laws vary significantly. Consult legal counsel for events in regulated industries or jurisdictions with strict recording or AI-disclosure laws.
Support
Troubleshooting
Diagnose and fix the most common issues with live translation, audio, listener access, and billing.
Translation stopped or froze mid-event
- Check your internet connection and confirm the organiser device is still online.
- Check your wallet balance on the Billing page. Translation stops if the balance reaches zero.
- In the control room, look at the event error log at the bottom of the runtime view for any error messages.
- Try stopping and restarting the affected language channel.
- If the issue persists, switch to the other translation engine (OpenAI ↔ Gemini) — provider-side outages are rare but do occur.
- If nothing resolves it, email hello@tolk.fm with the event ID from the dashboard URL.
Choppy or cutting-out audio for listeners
Common causes and fixes:
| Cause | Fix |
|---|---|
| Weak listener Wi-Fi or mobile data | Ask listeners to move closer to a Wi-Fi access point or switch to mobile data. |
| Organiser device on weak Wi-Fi | Switch the organiser device to wired Ethernet or a dedicated mobile hotspot. |
| Congested venue Wi-Fi | Use a separate SSID for the event, or ask the venue IT team to allocate a dedicated channel. |
| Too many devices per access point | Coordinate with the venue to ensure adequate Wi-Fi coverage density for the expected audience size. |
Translation latency is very high
A few seconds of latency is normal. Latency above 10 seconds warrants investigation.
- Network latency: check the organiser device's connection speed and ping. A round-trip time above 150 ms to the media worker increases translation latency.
- AI provider load: rare, but if a provider is under heavy load globally, translation latency increases. Try switching engines.
- Audio buffer issues: if the audio feed has long silences or the speech is very slow, the model may buffer more aggressively. Ensure a consistent, well-paced audio input.
If latency is consistently above 10 seconds and switching engines does not help, contact support with the event ID.
Microphone not detected in browser
The browser requires explicit permission to access the microphone. If you accidentally denied it, reset it as follows:
- Chrome: click the lock icon in the address bar → Site settings → Microphone → Allow → reload the page.
- Firefox: click the lock icon → clear permissions → reload and re-grant when prompted.
- Safari: open Settings → Websites → Microphone → find tolk.fm → set to Allow.
- iOS Safari: open Settings → Safari → Camera & Microphone Access → enable for tolk.fm.
If using an external USB audio interface, make sure the correct device is selected in the audio input dropdown in the event control room, not the built-in microphone.
Event won't start — low wallet balance
Tõlk.fm requires a minimum wallet balance of $15.00 before a live event can be started. This covers the per-event minimum charge.
- Go to the Billing page.
- Top up your wallet with at least $15, or more if the expected event cost is higher.
- Return to the event control room and try starting again.
Use the cost estimator to add enough funds to cover the full expected duration. The event will stop translating mid-session if the balance runs out.
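A quick way to judge whether a balance is enough is to work out how many minutes it would last. The sketch below assumes the balance is drawn down at $0.09 per channel-minute while the event runs, which this guide implies but does not spell out. Function names are illustrative.

```python
RATE_CENTS = 9                  # $0.09 per channel-minute
MIN_START_BALANCE_CENTS = 1500  # $15.00 needed to start a live event


def can_start(balance_usd: float) -> bool:
    """True if the wallet meets the $15.00 minimum needed to start."""
    return round(balance_usd * 100) >= MIN_START_BALANCE_CENTS


def runway_minutes(balance_usd: float, languages: int) -> int:
    """Whole minutes the balance would last with this many channels."""
    if languages < 1:
        raise ValueError("at least one language channel is required")
    return round(balance_usd * 100) // (languages * RATE_CENTS)


print(can_start(12.00))          # False
print(runway_minutes(15.00, 3))  # 55
```

In this example a $15.00 balance would cover roughly 55 minutes of a three-language event, so a longer session needs a larger top-up.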
Find your event ID for support requests
Every event has a unique ID visible in the URL of the event control room:
tolk.fm/events/[event-id]
Include this ID in any support email so the team can locate your event quickly in server logs and billing records.
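If you handle many events, you can pull the ID out of a copied control-room URL with a short helper. This sketch relies only on the `tolk.fm/events/[event-id]` pattern shown above; the function name is illustrative and `abc123` is a made-up placeholder, not a real ID format.

```python
from typing import Optional
from urllib.parse import urlparse


def event_id_from_url(url: str) -> Optional[str]:
    """Return the path segment after /events/, or None if it is absent."""
    if "://" not in url:
        url = "https://" + url  # allow URLs pasted without a scheme
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) >= 2 and segments[0] == "events":
        return segments[1]
    return None


print(event_id_from_url("https://tolk.fm/events/abc123"))  # abc123
print(event_id_from_url("tolk.fm/my-conference"))          # None
```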