Archive for the ‘General Information’ Category

SDK 1.104 full version (dmg)

Wednesday, December 1st, 2010

We have built a new dmg that includes the previously released 1.104 patch (for iOS 4.2 simulator compatibility) and placed it in your download area.

Remember: it contains only the simulator libs. If you’ve bought the SDK, contact your sales manager in order to get the universal binary version.

La French Mobile (21/09)

Wednesday, September 29th, 2010

Antoine Kauffeisen, VP Marketing of Acapela Group, was at La French Mobile (Paris – 21/09/2010).

The slides of the short presentation of Acapela for iPhone and iPad are available here: Acapela TTS for iPhone and iPad “5 minutes” presentation (pdf – French)

Disruptive code presentation slides

Wednesday, September 22nd, 2010

Here is the link to my presentation at Disruptive Code (Stockholm) on the 21st of September 2010:

“Easy creation of talking mobile applications with text-to-speech”

http://www.acapela-for-iphone.com/documents/Acapela_Group_Disruptive_Code_100921.zip

The zip file contains the pdf version of my slides, along with the 4 sound files used: the Acapela Group presentation, and the 3 examples of TTS technology from 1970 to now (formant synthesis, diphone synthesis and unit selection).

Disruptive code – Stockholm

Thursday, September 16th, 2010

For those of you in Scandinavia: I will be speaking at Disruptive Code on the 21st of September 2010 (15.10-15.45 in The Auditorium).

“Easy creation of talking mobile applications with text-to-speech”

Don’t hesitate to contact me if you are going. We could arrange a short informal meeting between sessions.

Twitter hashtag for the conference: #dcode

Preview video of Prizmo for iPhone

Thursday, August 5th, 2010

Creaceed, the Belgian company specialized in Mac OS and iOS products (Vocalia, Prizmo), has released the second preview video of its upcoming application: Prizmo for iPhone.

The video shows the Acapela TTS for iPhone and iPad voices in action in Prizmo for iPhone.

Prizmo for iPhone will be available soon on the App Store.
It will be the first iOS OCR application to also integrate high-quality Text-to-Speech!

iOS 4 and iPhone 4 simulator compatibility update

Friday, July 16th, 2010

As I told you 3 weeks ago, there is an incompatibility issue between our libraries and the iPhone 4 simulator of the iOS 4 SDK.

However, the iPad Simulator (iPhone OS 3.2 SDK) and the device versions (iPhone 2/3G/3Gs/4 – iPod Touch V1/2/3 – iPad running iPhone OS 3.2 or iOS 4.0) were working fine.

I’m glad to announce that Acapela TTS for iPhone and iPad 1.101 is now fully compatible with the new iOS 4.1 SDK beta 1, available to registered iPhone developers since the 14th of July 2010.

We cannot share more information about this beta 1 SDK because it is still under NDA. However, as soon as you update to this beta and Xcode 3.2.4, you will be able to test our SDK with the iPhone 4 simulator.

PS: Don’t forget to update to version 1.101 of our SDK if you want to move to the iOS 4 SDK or later.

Presentation slides of webinars

Friday, July 9th, 2010

Thanks to everyone who took part in the first Acapela TTS for iPhone and iPad webinars (30th of June 2010 for the French webinar, 1st of July 2010 for the English one).

These webinars covered the basics of Text-to-Speech and how to quickly integrate Acapela TTS for iPhone and iPad into your application.

The topics were:

  • Introduction to Acapela Group
  • Introduction to Text to Speech Technology
  • Description of Acapela TTS for iPhone and iPad SDK
  • API quick overview and live demo of a simple application
  • Q&A

The slides are now available in the documentation section.

The live demo description can be found here: Quick Start: How to add TTS in your app (“HelloWorld TTS app”)

Other webinars are in preparation, covering more advanced topics such as audio management, iOS 4 features, In-App Purchase implementation, etc.

If you have other suggestions, don’t hesitate to add a comment.

Good App!

Jean-Michel

Reminder: Our First Webinars for iPhone/iPad developers

Monday, June 28th, 2010

Webinars for iPhone/iPad developers:
Create talkative apps and bring a smiling voice to your interface!

Reminder: Don’t forget to register for our first two webinars (one in French, followed by one in English) for iPhone and iPad developers.

Register to join one of our 30-minute webinars:
>> In French/en français on June 30th at 10 am CET: Click here to register
>> In English on July 1st at 3 pm CET: Click here to register

More information on our corporate website: http://www.acapela-group.com/webinars-for-iphone-ipad-developers-create-talkative-apps-and-bring-a-smiling-voice-to-your-interface-2200-speech-synthesis.html

Jean-Michel Reghem
Developer Solutions Product Manager

Text-to-Speech? What is that?

Saturday, June 12th, 2010

Text-to-speech is a technology that is becoming more and more mainstream. However, it is still a completely unknown concept for some people.

Here is a short article covering the most frequently asked questions about Text-to-Speech.

Text-to-What?

Text To Speech (abbreviation: TTS), also called “Speech synthesis”, is the artificial production of human speech. The computer system used to produce TTS is called a speech engine.

A Text to Speech engine transforms any text into speech in real time. It literally reads out loud any written information with a smooth and natural sounding voice. The automatic intonation reflects the meaning of the text, with respect to pauses, breath groups, punctuation and context.

The most important qualities of a speech synthesis system are naturalness and intelligibility. Naturalness describes how closely the output sounds like human speech, while intelligibility is the ease with which the output is understood. The ideal speech synthesizer is both natural and intelligible.
Acapela Text To Speech maximizes both characteristics.

So, you speak into a microphone and it recognizes what is said?

No. Definitely not. It is exactly the opposite.

Speech recognition (also known as automatic speech recognition (ASR) or dictation system) converts spoken words to text.

Text-to-Speech (TTS or Speech Synthesis) converts text to spoken words.

In a Human Machine Interface model, speech recognition is a way to transfer information FROM a human TO the computer (as a keyboard, a mouse or a touchscreen does).
Speech synthesis is a way to transfer information FROM the computer TO a human (as a screen, a braille terminal or a beep sound does).

Yes, OK, I understand now … But how does it sound?

Here is an example of our female American voice Heather, who speaks this text: “Hi, I’m Heather, the female American English speech synthesis voice from Acapela. Efficient, fast and a very high quality, why not try me out with your own words?”

You can find examples of our Text-to-Speech voices on the demo page of our corporate website.

You can also send free speech-powered e-cards from our sparkling laboratory, acapela.tv.

And how does it work?

Acapela TTS is mainly based on a technology called “Non Uniform Unit Selection”. This is sometimes called “third-generation Text-to-Speech” (the previous generations were formant synthesis and diphone concatenation).

This diagram shows the chain of processes behind Text to Speech.

Step 1) Creation of the voice

This step is done by the Acapela R&D and linguistic teams (the bottom part of the image).

In order to reproduce the natural sound of each language, a narrator records a series of texts (poetry, political news, sports results, stock exchange updates, etc.) which contain every possible sound in the chosen language.

These recordings are then sliced (automatically and manually) and organized into an acoustic database.

During database creation, all recorded speech is segmented into some or all of the following: diphones (most of the time), triphones, syllables, morphemes or even words, phrases and sentences in some cases.
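
To make the idea of a diphone more concrete, here is a minimal sketch in Python. It only shows how a phoneme sequence maps to diphone labels (a diphone runs from the middle of one phoneme to the middle of the next). The function name, the silence marker and the rough phoneme notation are assumptions made for illustration; this is not Acapela's segmentation tool or phone set.

```python
def to_diphones(phonemes, silence="_"):
    """Return the diphone labels covering a phoneme sequence.

    The sequence is padded with silence on both sides, so an utterance of
    n phonemes yields n + 1 diphones.
    """
    padded = [silence] + list(phonemes) + [silence]
    # Pair every phoneme with its right-hand neighbour.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


# "hello" in a rough, illustrative phoneme notation (hypothetical).
print(to_diphones(["h", "@", "l", "oU"]))
# prints ['_-h', 'h-@', '@-l', 'l-oU', 'oU-_']
```

Each label would then point to one or more recorded slices in the acoustic database.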

Step 2) Text-to-speech realtime process

The first step is done offline, by us, and its result is integrated into our product.
The voice and the linguistic data are then packaged into a product so that they can be used by an application.

It is inside this application that the Text-to-Speech process takes place (upper part of the picture).

The speech synthesis process is composed of 2 main parts: the linguistic analysis module and the synthesizer.

a) Linguistic analysis

The Linguistic analysis is done by the NLP (Natural Language Processing) module.

When a sentence is sent to the TTS, the NLP module begins by carrying out a sophisticated linguistic analysis that transposes written text into phonetic text.

  • A text preprocessor transforms every date, currency amount, email or postal address and phone number into a normalized sentence.
    For example, a sentence like “I have only $2.56 in my pocket, it is 12:45 AND I SHOULD EAT something in this 5 stars St. John restaurant” will become internally “I have only two dollars and fifty six cents in my pocket, it is twelve forty-five and i should eat something in this five stars saint john restaurant”.
  • A grammatical and syntactic analysis then enables the system to define how to pronounce each word in order to reconstruct the sense.
  • A phonetizer, together with a set of rules and lexicons, gives the phonetic transcription of each word, based on the context and on the results of the preprocessor and of the grammatical and syntactic analysis.
  • Finally, the system produces information associating the phonetic writing with the tone and required length of the pronunciation. We call this the prosody: it gives the rhythm and intonation of a sentence.
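
The preprocessing step can be sketched in a few lines of Python. This is only a toy normalizer written for the example sentence above: the function names, the regular expressions and the one-entry abbreviation lexicon are all assumptions, and a real preprocessor is far more complete and context dependent (for instance, “St.” can mean “saint” or “street”).

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical toy lexicon; a real one needs context to pick the expansion.
ABBREVIATIONS = {"St.": "saint"}


def number_to_words(n):
    """Spell out an integer between 0 and 99 (enough for this sketch)."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0 to 99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def plural(n, word):
    """Return 'one dollar' for 1 and 'two dollars' otherwise."""
    return f"{number_to_words(n)} {word}" + ("" if n == 1 else "s")


def normalize(text):
    """Expand amounts, times, abbreviations and small numbers into words."""
    # Currency: "$2.56" becomes "two dollars and fifty-six cents".
    text = re.sub(
        r"\$(\d{1,2})\.(\d{2})",
        lambda m: f"{plural(int(m.group(1)), 'dollar')} and "
                  f"{plural(int(m.group(2)), 'cent')}",
        text,
    )
    # Time: "12:45" becomes "twelve forty-five" (minutes below ten are
    # read naively, e.g. "12:05" gives "twelve five").
    text = re.sub(
        r"\b(\d{1,2}):(\d{2})\b",
        lambda m: f"{number_to_words(int(m.group(1)))} "
                  f"{number_to_words(int(m.group(2)))}",
        text,
    )
    # Abbreviations from the toy lexicon.
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Any remaining one- or two-digit number.
    text = re.sub(r"\b\d{1,2}\b", lambda m: number_to_words(int(m.group())), text)
    # Case carries no information for the later steps in this sketch.
    return text.lower()


print(normalize("I have only $2.56 in my pocket, it is 12:45 AND I SHOULD EAT "
                "something in this 5 stars St. John restaurant"))
```

Unlike the example above, this sketch lowercases the whole sentence and always hyphenates compound numbers. The order of the rules matters: amounts and times are expanded before the generic number rule, otherwise “12:45” would be read as two unrelated numbers.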

b) Synthesizer and sound output

The chain of analysis ends here and sound is generated by selecting the best units stored in the acoustic database.

The algorithm takes the results of the NLP module as input, and selects from the database the best chain of candidate units, the one that matches the desired prosody as closely as possible: fundamental frequency (pitch), tone, length …
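
Selecting “the best chain of candidate units” is commonly described in the unit selection literature as a shortest-path search: each candidate pays a target cost (how far it is from the desired prosody) and each pair of neighbours pays a join cost (how audible the concatenation would be). The sketch below shows that general idea with a small dynamic programming (Viterbi-style) search. It is not Acapela's actual algorithm: the Unit fields, both cost functions and the JOIN_WEIGHT value are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass

# Hypothetical weight: how much a smooth join matters versus a good target match.
JOIN_WEIGHT = 2.0


@dataclass(frozen=True)
class Unit:
    label: str       # unit name, e.g. a diphone such as "h-@"
    pitch: float     # mean fundamental frequency in Hz
    duration: float  # length in milliseconds


def target_cost(target, candidate):
    """Distance between the desired prosody and a recorded unit."""
    return abs(target.pitch - candidate.pitch) + abs(target.duration - candidate.duration)


def join_cost(left, right):
    """Penalty for a pitch jump between two consecutive units."""
    return JOIN_WEIGHT * abs(left.pitch - right.pitch)


def select_units(targets, database):
    """Return the cheapest chain of database units, one per target."""
    if not targets:
        return []
    # lattice[i][j] = (cost of the best chain ending in candidate j of
    #                  target i, index of its predecessor, the candidate)
    lattice = []
    for position, target in enumerate(targets):
        column = []
        for candidate in database[target.label]:
            cost = target_cost(target, candidate)
            back = None
            if position > 0:
                best_prev, back = min(
                    (prev_cost + join_cost(prev_unit, candidate), index)
                    for index, (prev_cost, _, prev_unit) in enumerate(lattice[-1])
                )
                cost += best_prev
            column.append((cost, back, candidate))
        lattice.append(column)
    # Start from the cheapest final candidate and follow the back pointers.
    index = min(range(len(lattice[-1])), key=lambda k: lattice[-1][k][0])
    chain = []
    for column in reversed(lattice):
        _, back, unit = column[index]
        chain.append(unit)
        index = back
    chain.reverse()
    return chain


# Desired prosody coming from the linguistic analysis (made-up values).
targets = [Unit("h-@", 120.0, 80.0), Unit("@-l", 125.0, 90.0)]
# Tiny made-up acoustic database: two recorded candidates per diphone.
database = {
    "h-@": [Unit("h-@", 118.0, 80.0), Unit("h-@", 150.0, 80.0)],
    "@-l": [Unit("@-l", 126.0, 90.0), Unit("@-l", 149.0, 88.0)],
}
for unit in select_units(targets, database):
    print(unit)
```

With these values the search keeps the 118 Hz and 126 Hz candidates, which are close to the requested prosody and also close to each other. Because of the join cost, a candidate that matches its target perfectly can still lose to one that connects more smoothly to its neighbours.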

The units are extracted from the database, decoded, concatenated (without signal processing, in order to stay as natural as possible) and sent to the output.

This output may be loudspeakers, a headset, a file, a telephony board, an audio stream …

OK, Thanks, now I understand.

You’re welcome :-)

Last day at WWDC. Don’t miss the blue shirt Acapela guys!

Friday, June 11th, 2010

This is the last day at WWDC …
We all enjoyed the WWDC Bash with OK Go yesterday.
But there are still a lot of sessions and labs today before flying back home (and enjoying the World Cup).

Take this last opportunity to talk with us about Text-to-Speech technology!

and some last photos from WWDC:

“OK Go” Singer, at the WWDC Bash

Bonus: New super secret Science Fiction feature in iOS 4 …

New SciFi project in iOS

And of course we found a lot of useful information on this board
